Skip to content

Conversation

gnufied
Copy link
Contributor

@gnufied gnufied commented Apr 19, 2018

Add proposal for volume expansion in CSI

Closes #212

@jdef
Copy link
Member

jdef commented Apr 19, 2018

please do not hand-edit the csi.proto file. instead, edit the spec.md file and run make to generate the proto and bindings.

@gnufied
Copy link
Contributor Author

gnufied commented Apr 19, 2018

@jdef cool. will do. also I think we should document that! I wasn't aware, sorry.

@gnufied gnufied changed the title Add a proposal for volume expansion [WIP] Add a proposal for volume expansion Apr 20, 2018
@jdef
Copy link
Member

jdef commented Apr 20, 2018 via email

csi.proto Outdated

// the current capacity of the volume in bytes.
// This field is OPTIONAL
int64 current_capacity_bytes = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the purpose of having the caller give the current capacity?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto, I don't understand why this is needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because in many cases (such as glusterfs) - the volume expansion api call requires difference(new_size-current). It is possible that the SP can figure out the current size on its own but I don't know if that will be always possible.

csi.proto Outdated

// Secrets required by the plugin for expanding the volume.
// This field is OPTIONAL
map<string,string> controller_volume_expand_secrets = 4;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any storage systems that require per volume credentials for resizing? If not, I'd rather start without them and add them in the future, if needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The credentials requirement of expand volume should fall inline with CreateVolume RPC call. For example - unless CSI driver runs with a pre-configured username and password for connecting to heketi, a glusterfs volume can not be expanded without secrets.

csi.proto Outdated

// Whether file system expansion is required for the volume
// REQUIRED
bool fs_resize_required = 3;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In what cases will this be true? Only file on block?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that is correct.

@saad-ali
Copy link
Member

Largely LGTM. Some nits, +1 on spec.md instead of proto.

Let's bring this up at the next CSI community meeting, and if there is not too much feedback, I'm ok moving forward with this.

@jieyu
Copy link
Member

jieyu commented Apr 25, 2018

Can you explain why expanding a volume requires both Controller and Node service RPC calls?

@gnufied
Copy link
Contributor Author

gnufied commented Apr 25, 2018

@jieyu yes because to finish expanding volume in many cases; we also have to resize the file system on the node. That is why we need both controller and node RPC calls.

Copy link
Member

@jdef jdef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR. I wonder if it's worth considering renaming the API calls in the context of "resize" vs "expand": if we ever want to support shrinking then it feels redundant to need to introduce a second set of APIs for that. I'm totally fine with the spec documenting a limit for a "resize" API that states CSI only supports expansion for now, and that attempts to shrink a volume will generate an error (and document the error code).

Also, please update spec.md with the proto changes, as well as a few paragraphs (like the other API calls have) that provide additional context for the API and error conditions.

csi.proto Outdated

// This field is REQUIRED. This allows CO to specify the
// capacity requirements of the volume after expansion
CapacityRange capacity_range = 3;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May I assume that this capacity_range should specify a greater capacity than that of the volume in its current state? Given that this is a range, does that imply that the "required" and "limit" sub-fields have a lower bound that's greater than the current capacity? If so, it may be worth documenting that here.

csi.proto Outdated

// the current capacity of the volume in bytes.
// This field is OPTIONAL
int64 current_capacity_bytes = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto, I don't understand why this is needed

csi.proto Outdated

// Whether file system expansion is required for the volume
// REQUIRED
bool fs_resize_required = 3;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

under what conditions would a CO not want an underlying filesystem to be automatically resized? is this field really necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For network file systems like - glusterfs, portworx, AzureFile etc there is no file system resize necessary.

csi.proto Outdated

// the current capacity of the volume in bytes.
// This field is OPTIONAL
int64 current_capacity_bytes = 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, this field doesn't seem to be needed

csi.proto Outdated

// This field is OPTIONAL. This allows CO to specify the
// capacity requirements of the volume after expansion
CapacityRange capacity_range = 3;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced that this is needed, really. The proposed controller API already provides this field.

Perhaps, instead, the ControllerExpandVolumeResponse could return an opaque "expansion_parameters" field that would be specified as part of this request, for use with any SP-specific tooling that required proprietary metadata in order to carry out follow-up operations on the node (like a filesystem resize)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added field with the expectation that - some file systems may require current and expected capacity for expanding the file system. Already for example xfs_growfs invocation looks like:

~> xfs_growfs /mount/point -D size

If we omit size from the argument then it is expanded to the end of volume. In many cases, I do agree that file system expansion in CSI driver could work without expected capacity but there can be cases where it needs the expected capacity too.

Also, if we were to rename this feature to resize from expand it will be absolutely needed.

I am not sure how ControllerExpandVolumeResponse will be accessible to node driver. I am assuming that most of CSI spec is designed to be stateless.

csi.proto Outdated
message NodeExpandVolumeResponse {
// the capacity of the volume in bytes.
// This field is OPTIONAL
int64 capacity_bytes = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the controller API already provides this information, this feels redundant

Copy link
Contributor Author

@gnufied gnufied Apr 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think this is redundant. Often size on disk(after formatting with file system) of a volume could be different from allocated size. How CO uses this field is entirely upto it but returning size on disk after volume expansion is important and it will let CO make more informed decision.

csi.proto Outdated
@@ -635,6 +671,25 @@ message NodeUnpublishVolumeRequest {
string target_path = 2;
}

message NodeExpandVolumeRequest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should NodeExpandVolumeRequest contain a volume_attributes field?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now, I don't think so. Most SP's will call resize on volume path on node and that is the only information they would need. Is there an use case you were thinking?

@gnufied
Copy link
Contributor Author

gnufied commented Apr 25, 2018

@jdef yes I will update this proposal with spec.md. I also need to add error codes etc.

csi.proto Outdated
@@ -635,6 +671,25 @@ message NodeUnpublishVolumeRequest {
string target_path = 2;
}

message NodeExpandVolumeRequest {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm writing the CSI plugin for NetApp, and I don't see how I could implement this as written. I'd like to see NodeExpandVolumeRequest include the targetPath where the volume is mounted (consistent with NodePublishVolumeRequest, NodeUnpublishVolumeRequest). Because the volume ID may be created by the controller on a different node, and because the volume ID is created before the volume is ever published, the volume ID cannot always be sufficient to identify the volume on the node. Ideally we would also get the PublishInfo map, but much of that may be obtained by working backwards from the targetPath.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the proposal to include a volume path. sorry about it.

@gpaul
Copy link

gpaul commented May 11, 2018

Related: #212

@gpaul
Copy link

gpaul commented May 11, 2018

Cross-posting from #212 as discussion seems to have moved to this PR.

For mount volumes we support specifying an fs_type. This field can be trivially used to format the volume with the given fs_type provided the necessary filesystem libraries are installed on the node. For example

mkfs -t ext4 /dev/sda
mkfs -t xfs /dev/sdb

However, there is no such standard utility for resizing filesystems:

resize2fs /dev/sda
xfs_growfs /dev/sdb

This means that CSI plugins that today support arbitrary filesystems will need to be modified to add individual logic for each filesystems they intend to support. In general saying "resize the filesystem on a volume" is more involved that growing the underlying partition - I don't know that we should be discussing it in a PR yet.

@gnufied gnufied force-pushed the resize-proposal branch from 4bca51c to b8bca1c Compare May 14, 2018 18:05
@gnufied gnufied force-pushed the resize-proposal branch from b8bca1c to 1f668ec Compare May 29, 2018 17:41
@gnufied gnufied changed the title [WIP] Add a proposal for volume expansion Add a proposal for volume expansion May 29, 2018
@gnufied
Copy link
Contributor Author

gnufied commented May 29, 2018

@gpaul that is true but is also out of scope for CSI. Do we really care how each Storage provider implements resizing? For some SP's such as glusterfs or Azure file, no file system resizing will be necessary.

@gnufied
Copy link
Contributor Author

gnufied commented May 29, 2018

@saad-ali @jdef addressed most review comments. This PR is ready for final round of review.

@gnufied gnufied force-pushed the resize-proposal branch from 1f668ec to 0bc7737 Compare May 29, 2018 17:54
@xing-yang
Copy link
Contributor

@gnufied So a plugin could support only ControllerExpandVolume or only NodeExpandVolume or both? Can you please clarify this somewhere in the spec? Can you also clarify in the spec that the NodeExpandVolume call is also idempotent? Thanks.

spec.md Outdated
This RPC allows the CO to expand size of underlying volume.
`ControllerExpandVolume` RPC call MUST be idempotent and if size of underlying
volume already meets requested capacity, the plugin MUST respond with
successfull response.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

successful

spec.md Outdated
int64 capacity_bytes = 1;

// Whether file system expansion is required for the volume.
// This field REQUIRED.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This field is REQUIRED.

@jdef
Copy link
Member

jdef commented May 30, 2018

@gnufied please stop squashing intermediate commits until this passes review. Otherwise it's hard for reviewers to see what's changed. Thanks!

@gpaul
Copy link

gpaul commented May 31, 2018

@gpaul that is true but is also out of scope for CSI. Do we really care how each Storage provider implements resizing? For some SP's such as glusterfs or Azure file, no file system resizing will be necessary.

@gnufied I don't see why it is out of scope? We have fs_type in the spec already. My objection is to having fs_type configurable on volume create, but then missing that information on resize.

@gnufied
Copy link
Contributor Author

gnufied commented Jun 5, 2018

@gpaul shouldn't the SP already have fstype information because flow wise resize will be only called after volume already has been formatted? having said that - I am open to adding it, if you think SP cant really determine fstype from volume itself.

@gpaul
Copy link

gpaul commented Jun 5, 2018

Determining the fstype of a given filesystem should be doable by using some combination of lsblk, file, etc. I do this for the csilvm plugin to ensure idempotency of the NodePublishVolume RPC, which formats a volume.) The question is really: what to do then?

As there is no standard resize command every plugin would have to bake in a set of fstypes for which it hardcodes the resize incantation.

Alternatively, a plugin would have to provide the operator with some way to specify a list of supported fstypes (I already do this for the csilvm plugin) including a resize command for each.

Something like:

# -fstype="<format cmd>:<resize command>", for example:
./csilvm -fstype="mkfs.xfs {{.VolumePath}}:xfs_growfs -D {{.SizeInBlocks}} {{.VolumePath}}"

Where SizeInBlocks could be calculated by dividing the target size by the logical block size of the volume (e.g., blockdev --getss <dev>).

I don't know how far a generic template-based approach would get you, but I hope we don't have to bake in support for specific filesystems into every plugin.

spec.md Outdated
#### `NodeExpandVolume`

A Node Plugin MUST implement this RPC call if it has `EXPAND_VOLUME` node capability.
This RPC call allows CO to expand volume on the node.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear to me when CO needs to call this. For instance, should the CO call this when the volume is in NODE_READY state (i.e., after ControllerPublishVolume is successful)? or in PUBLISHED state (i.e., NodePublishVolume is successful)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A follow-up question: If NodeExpandVolume should be called after NodeStageVolume or NodePublishVolume, then say if the volume is multi-node capable and has been staged or published on multiple nodes, can the effect of calling NodeExpandVolume on one node be seen on all other nodes?

Copy link
Contributor Author

@gnufied gnufied Jun 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jieyu Yeah I think some clarification is in order for when the CO should call this. I was thinking rather than prescribing when a CO can call this, if we made it clear that NodeExpandVolume does not implicitly call NodeStageVolume or NodePublishVolume and call to NodeExpandVolume expects volume to be available in specified volume_path then CO will have enough information to decide when to make this call.

For example - for something like EBS, the CO will call this after NodeStageVolume but most likely before NodePublishVolume. I do not feel good about being very specific in call order and I think as long as CO knows expectation of this function, he can decide.

Another thing here worth specifying is - this RPC call should only be called after ControllerExpandVolume has completed if volume supports ControllerExpandVolume .

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chhsia0 That depends on file system. For some file systems like gfs2, growing the file system on one node typically makes the space available on all nodes.

But I think that any fencing that needs to be implemented to support file system types that can be mounted as rw on multiple nodes has to be implemented by CO. In k8s for example, this could mean that annotating a PVC that is RW mounted on multiple nodes so as only one node can expand the file system.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jieyu @chhsia0 expanded docs with flow about when this can be called.

spec.md Outdated
#### `NodeExpandVolume`

A Node Plugin MUST implement this RPC call if it has `EXPAND_VOLUME` node capability.
This RPC call allows CO to expand volume on the node.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A follow-up question: If NodeExpandVolume should be called after NodeStageVolume or NodePublishVolume, then say if the volume is multi-node capable and has been staged or published on multiple nodes, can the effect of calling NodeExpandVolume on one node be seen on all other nodes?

@gnufied
Copy link
Contributor Author

gnufied commented Jun 8, 2018

@gpaul that is a great idea. So how does the CLI parameter with template handles multiple types? Something like ./csilvm -xfs="<format cmd>:<resize cmd>" -ext4="<format cmd>:<resize cmd>" ?

I am not opposed to having fstype as an option in NodeExpandVolumeRequest, if it makes things easier. But - how is having explicit type with the request is different from having the plugin figure out fstype using tools you mentioned?

@gpaul
Copy link

gpaul commented Jun 8, 2018

So how does the CLI parameter with template handles multiple types?

I would probably use a repeated flag, but yes, something like that:

./csilvm -fs="xfs:mkfs.xfs:xfs_growfs" -fs="ext4:mkfs.ext4:resize2fs"

I am not opposed to having fstype as an option in NodeExpandVolumeRequest, if it makes things easier. But - how is having explicit type with the request is different from having the plugin figure out fstype using tools you mentioned?

The more I think about this the more I think it won't work. It allows for the possibility that the fs_type fields differ between the two calls. Instead, the plugin should interrogate the volume as the single source of truth about what filesystem it has.

@gnufied
Copy link
Contributor Author

gnufied commented Oct 31, 2018

@jdef fixed.

Copy link
Member

@jdef jdef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thought a bit more about @chhsia0 's comment and left a suggestion. Also noticed the error code tables were a bit out of order now.

@@ -535,9 +540,55 @@ message PluginCapability {
Type type = 1;
}

message VolumeExpansion {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ONLINE and OFFLINE are mutually exclusive (a plugin cannot indicate that it supports both) then we should mention that restriction very clearly using RFC requirement language. For example: "A plugin SHALL NOT indicate that it supports both ONLINE and OFFLINE volume expansion."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you think ONLINE and OFFLINE are mutually exclusive? I would assume a plugin may support both. For example, a AWS volume can be controller expanded both offline and online.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jdef am I correct in assuming that this concern is resolved?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe? I guess I'm wondering about cases where the plugin supports both OFFLINE and ONLINE expansion and a user issues a call to the CO: "please expand volume X". The CO knows what the plugin capabilities are (or it will learn them) and ... it should know the last call that it issued to the plugin. Meaning that it should be able to reason about the current state of the volume. And because it should know the current state of the volume, it should know which mode(s) of volume expansion are appropriate to initiate.

I can't help but feel like I'm overlooking something here because this proposal is somewhat complex.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure what the problem is with that. Many portions of CSI rely on state machine like transitions. You can't call NodeStage before calling ControllerPublish or can't call NodePublish before calling NodeStage. The CO also has burden of verifying per-volume capability before making ControllerPublish call for example. So, CO has to check current state of the volume and its capability (or it will do in practice).

@gnufied
Copy link
Contributor Author

gnufied commented Nov 2, 2018

@jdef okay I think addressed remaining open issues. PTAL. I did not address OFFLINE and ONLINE being mutually exclusive but responded inline. thank you for reviewing!

@saad-ali saad-ali mentioned this pull request Nov 5, 2018
// This allows CO to specify the capacity requirements of the volume
// after expansion. If omitted the volume will be expanded to maximum
// available size. This field is OPTIONAL.
CapacityRange capacity_range = 3;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to remember why this field has value and the only thing I can come up with is this: if plugin doesn't support controller-level expansion, but DOES support node-level expansion then it could be useful to tell the node plugin what the desired target size is.

Is there another case that I'm missing?

Copy link
Contributor Author

@gnufied gnufied Nov 7, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of filesystem types that force you to specify a size for expansion. In those cases, this parameter may be needed. It could be argued that, plugin can probably find out available space and issue appropriate resize command but I think it will be better to be explicit whenever possible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds like an optimization. Plugins that only support node-level expansion (maybe they don't even have a controller service) need this, so we can leave it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should include a clarifying comment along the lines of:

If capacity_range is omitted then a plugin MAY inspect the file system of the volume to determine the maximum capacity to which the volume can be expanded. In such cases a plugin MAY expand the volume to its maximum capacity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have incorporated this suggestion.

@saad-ali
Copy link
Member

saad-ali commented Nov 7, 2018

@jdef wants to do one more pass on this to make sure we're not missing anything.

@gnufied
Copy link
Contributor Author

gnufied commented Nov 13, 2018

@jdef can you please clarify what you are thinking about this PR? If you have any concerns or you think something ought to be done differently, I(or someone else) can look into redesigning it.

Copy link
Member

@jdef jdef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Performed another full pass over the changes here. I think I'm OK with the semantics as proposed. I left additional (re-)wording suggestions. Also, with the push to CSI 1.0 the spec has shifted a bit so some of the comments here come as a result of that drift.

Thanks for your tenacity.

// This allows CO to specify the capacity requirements of the volume
// after expansion. If omitted the volume will be expanded to maximum
// available size. This field is OPTIONAL.
CapacityRange capacity_range = 3;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should include a clarifying comment along the lines of:

If capacity_range is omitted then a plugin MAY inspect the file system of the volume to determine the maximum capacity to which the volume can be expanded. In such cases a plugin MAY expand the volume to its maximum capacity.

@gnufied
Copy link
Contributor Author

gnufied commented Nov 14, 2018

I am going to close this PR in favour of #334. This PR was failing to load and sometimes timing out for me.

@gnufied gnufied closed this Nov 14, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants