Skip to content

Conversation

gnufied
Copy link
Contributor

@gnufied gnufied commented Jul 23, 2018

Fixes #236

cc @saad-ali @jdef

csi.proto Outdated
// The number of inodes used by the volume.
// This field is OPTIONAL.
// The value of this field MUST NOT be negative.
int64 used_inodes = 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @ddebroy that we have to avoid leaking OS specific constructs in to the spec. Is there a similar concept in Windows, could we come up with something that would work for all OSes?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Windows/NTFS does not have a concept of a variable number of inodes. It just uses File IDs. As @garthy mentions below, a NTFS volume is divided into a variable number of clusters (typically 4K in size but can be configured during format) and the cluster usage might be useful on top of capacity usage but clusters and inodes are quite different.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think our options boil down to:

a. Either come up with a name that may or may not represent inode number in a OS agnostic way. It seems hard to do and could easily cause confusion.
b. Drop the inode information altogether from stat information.
d. Keep the inode capacity information as it is and make it clear that plugin authors need not set this value when running on a platform that does not support it. These numbers will not be available when using raw block devices on *nix too. So, Windows is not alone in this.
e. @msau42 suggested we can move such platform specific information in a opaque dictionary. It could work but I am not sure if it is very different from option#d.

@garthy
Copy link

garthy commented Jul 23, 2018 via email

spec.md Outdated
@@ -1913,6 +1916,78 @@ The CO MUST implement the specified error recovery behavior when it encounters t
| Operation pending for volume | 10 ABORTED | Indicates that there is a already an operation pending for the specified volume. In general the Cluster Orchestrator (CO) is responsible for ensuring that there is no more than one call "in-flight" per volume at a given time. However, in some circumstances, the CO MAY lose state (for example when the CO crashes and restarts), and MAY issue multiple calls simultaneously for the same volume. The Plugin, SHOULD handle this as gracefully as possible, and MAY return this error code to reject secondary calls. | Caller SHOULD ensure that there are no other calls pending for the specified volume, and then retry with exponential back off. |


#### `NodeGetVolumeStats`

A Node plugin MUST implement this RPC call.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then, this is a breaking change. We try to avoid that if possible.

Why not make it optional, and control by a plugin capability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

spec.md Outdated
// The path at which the volume was published. It MUST be an absolute
// path in the root filesystem of the process serving this request.
// This is a REQUIRED field.
string target_path = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why we need this field?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have renamed this field to volume_path and updated the comment to reflect as per our discussion in community call.

spec.md Outdated
// For volumes that share a file system with the host (e.g hostpath)
// this is the total size of underlying storage and it MAY
// not be equal to used_capacity + available_capacity because
// filesystem is shared.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand this. Can you give some examples?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have dropped these elaborate comments, because host paths that are shared with underlying storage may or may not be a strong use case. I expect plugin implementations will use their judgement to return right values from here.

csi.proto Outdated
string target_path = 2;
}

message NodeGetVolumeStatsResponse {
Copy link
Member

@jdef jdef Jul 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about consolidating this a bit w/ units that we can extend later?

message Usage {
  enum Unit {
    UNKNOWN = 0;
    INODE = 1;
    CLUSTER = 2;
    BYTES = 3;
  }

  int64 free = 1;
  int64 total = 2;
  Unit unit = 3;
} 

repeated Usage usage = 1;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@gnufied
Copy link
Contributor Author

gnufied commented Jul 26, 2018

@jdef @jieyu fixed or addressed most review comments. PTAL.

@gnufied gnufied force-pushed the volume_capacity_stats branch from d1fe31b to b5ddbe6 Compare July 27, 2018 14:17
@gnufied
Copy link
Contributor Author

gnufied commented Jul 30, 2018

@jdef @jieyu ping again. Can you guys please take another look?

Copy link
Member

@jieyu jieyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM! ping @jdef @saad-ali @julian-hj

csi.proto Outdated

message VolumeUsage {
enum Unit {
BYTES = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Most enums typically have a UNKNOWN = 0; Can we add one here too?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch @ddebroy!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks .. LGTM

@gnufied gnufied force-pushed the volume_capacity_stats branch from b5ddbe6 to c6674bb Compare July 30, 2018 22:42
@saad-ali
Copy link
Member

LGTM

spec.md Outdated
A Node plugin MUST implement this RPC call if it has GET_VOLUME_STATS node capability.
`NodeGetVolumeStats` RPC call returns the volume capacity stats available for the volume.

If the volume is being used in `BlockVolume` mode then `used_capacity` and `available_capacity` MAY be omitted from `NodeGetVolumeStatsResponse`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

used_capacity and available _capacity don't make sense here

spec.md Outdated
// The value of this field MUST NOT be negative.
int64 used = 3;

// The units that represent this value
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Units by which values are measured"?

spec.md Outdated

| Condition | gRPC Code | Description | Recovery Behavior |
|-----------|-----------|-------------|-------------------|
| Volume does not exist | 5 NOT_FOUND | Indicates that a volume corresponding to the specified `volume_id` does not exist. | Caller MUST verify that the `volume_id` is correct and that the volume is accessible and has not been deleted before retrying with exponential back off. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vol id AND volume_path, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the generic error we return when specified volume does not exist. See NodeUnpublishVolume for example.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a different case than node-unpublish-volume. For example, if the volume_id is already node-unpublished at target_path, then target_path may not exist (this is an expected possibility) - the call is deemed idempotent.

For this RPC, a missing vol staged/published at target_path is an error because of the nature of the operation: the RPC isn't changing the mount state of the volume at target_path, it's trying to query the state of the volume at target_path. And if volume_id exists, but it not present at target_path .. well, that's an error that we should bubble up to the CO.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One might argue that NOT_FOUND isn't appropriate for the case I outlined above and that we should use a different error code for such a case - I'm fine with that too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm I think the only other error code that makes sense perhaps is FAILED_PRECONDITION but in this case, it is not all that different from NOT_FOUND error because in both the cases the CO may retry.

I think I am okay with rewording the error to "Indicates that a volume corresponding to volume_id does not exist on specified volume_path"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the comment.

spec.md Outdated
#### `NodeGetVolumeStats`

A Node plugin MUST implement this RPC call if it has GET_VOLUME_STATS node capability.
`NodeGetVolumeStats` RPC call returns the volume capacity stats available for the volume.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/stats/statistics/

spec.md Outdated
// The ID of the volume. This field is REQUIRED.
string volume_id = 1;

// It can be any valid path where volume was previously published.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

staged or published, right?

Copy link
Member

@jdef jdef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, left a few more minor comments. otherwise LGTM

spec.md Outdated
// The value of this field MUST NOT be negative.
int64 used = 3;

// Units by which values are measured.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field is REQUIRED.

spec.md Outdated
}

message NodeGetVolumeStatsResponse {
repeated VolumeUsage usage = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this field is ... OPTIONAL?

spec.md Outdated
@@ -1968,6 +2032,10 @@ message NodeServiceCapability {
enum Type {
UNKNOWN = 0;
STAGE_UNSTAGE_VOLUME = 1;
// If Plugin implements GET_VOLUME_STATS capability
// then it MUST implement NodeGetVolumeStats RPC
// call for fetching volume stats
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/stats/statistics/ (... and end sentences with punctuation please).

@gnufied
Copy link
Contributor Author

gnufied commented Jul 31, 2018

@jdef done. thank you. I will squash when ready for merge.

Use repeated usage for representing volume stats
@gnufied gnufied force-pushed the volume_capacity_stats branch from 173553e to 1e78359 Compare July 31, 2018 18:21
@gnufied
Copy link
Contributor Author

gnufied commented Jul 31, 2018

@saad-ali @jdef squashed the commits. thank you!

@jieyu jieyu merged commit aeca98c into container-storage-interface:master Aug 1, 2018
@jdef
Copy link
Member

jdef commented Aug 1, 2018

Going forward, let's also hold ourselves to a higher standard w/ respect to descriptive PR summaries and/or commit messages. "... a design proposal ..." doesn't cut it here: this was an actual specification change not just a proposal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants