Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flexvolume resize implementation #67851

Conversation

aniket-s-kulkarni
Copy link
Contributor

What this PR does / why we need it

This PR adds support for expanding Flex Persistent Volumes - kubernetes/enhancements#304

  • This PR allows the expansion code flow to execute on the kubelet for flex volumes rather than the controller as the flex plugins are not installed on the controller.
  • It allows the flex volume plugin to implement ExpandVolumeDevice method as well as a new ExpandFS method to allow for expanding in use Flex PersistentVolumes
  • Includes e2e tests
Flex volume plugins now support expandvolume (to increase underlying volume capacity) and expanfs (resize filesystem) commands that Flex plugin authors can implement to support expanding in use Flex PersistentVolumes

/sig storage
/assign @gnufied @chakri-nelluri

@k8s-ci-robot k8s-ci-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Aug 25, 2018
@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. sig/testing Categorizes an issue or PR as relevant to SIG Testing. labels Aug 25, 2018
Copy link
Member

@neolit123 neolit123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aniket-s-kulkarni
thank you for your PR!
added a couple of small comments about formatting.

pvcWithResizeRequest.CurrentSize)

if expandErr != nil {
detailedErr := fmt.Errorf("error expanding volume %q of plugin %s: %v", pvcWithResizeRequest.QualifiedName(), volumePlugin.GetPluginName(), expandErr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...of plugin %q...

if volumeSpec := volume.NewSpecFromPersistentVolume(pv, false); volumeSpec.PersistentVolume.Spec.FlexVolume == nil {
// A flex volume is expandable but since it resides on the kubelet not the controller
// and is dynamically probed plugin, FindExpandablePluginBySpec will never find it
// Therefore explicitly allow flex volumes to resize
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add punctuation in the comment.

...find it.
Therefore explicitly allow flex volumes to resize.

the comment seems to repeat multiple times in the diff.

// and is dynamically probed plugin, FindExpandablePluginBySpec will never find it
// Therefore explicitly allow flex volumes to resize
volumePlugin, err := expc.volumePluginMgr.FindExpandablePluginBySpec(volumeSpec)
glog.Warningf("plugin=%v err=%v fv=%v", volumePlugin != nil, err, volumeSpec.PersistentVolume.Spec.FlexVolume)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be without glog.V(x)?

call.Append(deviceMountPath)

_, err := call.Run()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please remove this empty line.

if err != nil {
return nil, err
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

best to trim some of the empty lines that follow a line that only has }

@aniket-s-kulkarni
Copy link
Contributor Author

@neolit123 I made the suggested changes.

@brahmaroutu brahmaroutu mentioned this pull request Aug 27, 2018
@brahmaroutu
Copy link
Contributor

Docs: kubernetes/website#10097
@zparnold

@dims
Copy link
Member

dims commented Aug 27, 2018

/ok-to-test

@k8s-ci-robot k8s-ci-robot removed the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Aug 27, 2018
@neolit123
Copy link
Member

/uncc

@aniket-s-kulkarni
Copy link
Contributor Author

/ok-to-test

@aniket-s-kulkarni
Copy link
Contributor Author

/retest

util.GetPersistentVolumeClaimQualifiedName(newPVC), newPVC.UID, err)
return
if volumeSpec := volume.NewSpecFromPersistentVolume(pv, false); volumeSpec.PersistentVolume.Spec.FlexVolume == nil {
// A flex volume is expandable but since it resides on the kubelet not the controller
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the volume driver shall be installed on master node.
According to https://github.com/kubernetes/community/blob/master/contributors/design-proposals/storage/grow-flexvolume-size.md#expandfs

If flex volume driver supports resizing, it shall implement expandvolume method and at least, the volume driver shall be installed on master node.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@linyouchong after discussions with @gnufied and @chakri-nelluri, installing the flex driver on the master didn't seem to be ideal. Maybe we need to update that point in the design document - will do so.

Meanwhile, I do realize that if the flex driver doesn't support expansion and it does return an error via ExpandVolumeDevice- the faked expansion status on the PV and the PVC condition isn't cleared. I will fix that up so that its in line with the design.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we finalize on that? I don't particularly like special casing here for flex. Can we not make it a hard requirement that for flex volume resizing the plugin must be at least installed on the controller so as controller is aware of plugin capabilities before allowing volume expansion?

Copy link
Contributor Author

@aniket-s-kulkarni aniket-s-kulkarni Aug 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gnufied I was under the impression there was discussion around not make that (installing flex plugin on master) a hard requirement. Maybe we need to revisit it?

If we do make it a hard requirement, it would lead to issues for certain uses cases. Esp. for "managed Kubernetes service" like Amazon EKS or IBM CKS or AKS - in each cases the control plane management is hidden from the end user making it tough to install the flex driver on the master.

How about an alternative -

  1. The volume-expand controller sets the PVC condition to ExternalResizing instead of Resizing for plugins (only Flex today) that don't reside on the master/controller
  2. The kubelet reconciler calls ExpandVolumeDevice of the plugin (Flex) for such volumes and then
  • if it succeeds the condition is cleared prompting the volume-expand controller to finish and update PV and move onto fs resizing as it does today
  • if it fails the condition.Reason set to Error and the .Message has the actual reason; this prompts the volume-expand to abort resizing with appropriate error

This kind of controller <-> kubelet feedback loop is necessary since only the controller is guaranteed to have privileges to update PV, the kubelet may not be able to do it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@saad-ali @chakri-nelluri what you guys think about this? I think that - controller to kubelet and then back to controller feedback loop is problematic. Expand Controller must reprocess such PVCs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@saad-ali @chakri-nelluri - @gnufied and I were discussing this on slack, and there maybe a simpler way.

This discussion got underway because the design mentioned the need to install Flex on the controller so that a Flex author can signal expansion is unsupported by returning an error from ExpandVolumeDevice.

Since only dynamically provisioned volumes with allowVolumeExpansion = true in their storage classes are allowed to be expanded, a Flex plugin that doesn't support expansion can simply set allowVolumeExpansion = false and we don't need to come up with a complicated flow to handle it.

So is that a good enough option?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aniket-s-kulkarni if the plugin is not installed on master, assume it to be true and mark the volume "RequiresFSResize" and let the Kubelet side code manage it. Let me know if you see any issues with it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chakri-nelluri you are right. That works - and that's how the PR is doing it. And with what @gnufied mentioned above - it covers all the cases. I will address your comment about abstracting the .FlexVolume != nil checks

// Therefore explicitly allow flex volumes to resize
volumePlugin, err := expc.volumePluginMgr.FindExpandablePluginBySpec(volumeSpec)
glog.V(4).Infof("plugin=%v err=%v fv=%v", volumePlugin != nil, err, volumeSpec.PersistentVolume.Spec.FlexVolume)
if err != nil || (volumePlugin == nil && volumeSpec.PersistentVolume.Spec.FlexVolume == nil) {
Copy link
Contributor

@linyouchong linyouchong Aug 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

volumeSpec.PersistentVolume.Spec.FlexVolume == nil always be true at this line, should delete it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sounds good.

@aniket-s-kulkarni
Copy link
Contributor Author

/retest

@@ -1265,26 +1281,54 @@ func (og *operationGenerator) GenerateExpandVolumeFunc(

volumePlugin, err := og.volumePluginMgr.FindExpandablePluginBySpec(volumeSpec)

if err != nil {
if err != nil && volumeSpec.PersistentVolume.Spec.FlexVolume == nil {
Copy link
Contributor

@chakri-nelluri chakri-nelluri Aug 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way we can abstract Flexvolume type here? We should not bring knowledge of Flexvolume into operationexcutor

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chakri-nelluri yes, maybe we can do some abstraction there. Let me look into it.

call := plugin.NewDriverCall(expandFSCmd)
call.AppendSpec(spec, plugin.host, nil)
call.Append(devicePath)
call.Append(deviceMountPath)
Copy link
Contributor

@chakri-nelluri chakri-nelluri Aug 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need the new size of the volume. Some plugins might need this information.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chakri-nelluri do we need the new size? The size has already been passed in the prior call (ExpandVolumeDevice) and has been acted on by the plugin.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aniket-s-kulkarni Yes, there can be plugins which can support FS resize, but have no corresponding block resize calls. They expand dynamically and can ignore previous device resize call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I can look up the size from spec.PersistentVolume.Status.Capacity and add that in here for the call.

call.AppendSpec(spec, plugin.host, nil)
call.Append(strconv.FormatInt(newSize.Value(), 10))
call.Append(strconv.FormatInt(oldSize.Value(), 10))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let us also add device path for this call.

Copy link
Contributor Author

@aniket-s-kulkarni aniket-s-kulkarni Sep 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do

if fsResizablePlugin, callExpandFs := expandableVolumePlugin.(volume.FSResizableVolumePlugin); callExpandFs {
// special case for flex volumes
if _, resizeErr = fsResizablePlugin.ExpandVolumeDevice(volumeToMount.VolumeSpec, pvSpecCap, pvcStatusCap); resizeErr == nil {
resizeErr = fsResizablePlugin.ExpandFS(volumeToMount.VolumeSpec, devicePath, deviceMountPath)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ExpandFS call should be extended to all volume plugins that require file system resize rather than special casing this for Flex.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gnufied so you would like to see all existing plugins have a default ExpandFS implementation that calls the existing resize FS ?

Copy link
Contributor

@chakri-nelluri chakri-nelluri Sep 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aniket-s-kulkarni Yes this might be required for all plugins going forward.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK so we go ahead and add this to all existing plugins.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah. Also all the existing plugins will simply call out to existing util resize function.

util.GetPersistentVolumeClaimQualifiedName(newPVC), newPVC.UID, err)
return
if volumeSpec := volume.NewSpecFromPersistentVolume(pv, false); volumeSpec.PersistentVolume.Spec.FlexVolume == nil {
// A flex volume is expandable but since it resides on the kubelet not the controller
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did we finalize on that? I don't particularly like special casing here for flex. Can we not make it a hard requirement that for flex volume resizing the plugin must be at least installed on the controller so as controller is aware of plugin capabilities before allowing volume expansion?

@chakri-nelluri
Copy link
Contributor

@aniket-s-kulkarni The code freeze is very close. Ping me on slack if you need a quick review.

@chakri-nelluri
Copy link
Contributor

/LGTM

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 24, 2018
return stKlass, err
}

var _ = utils.SIGDescribe("Mounted flexvolume volume expand [Slow] [Feature:ExpandInUsePersistentVolumes]", func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This e2e should not have the feature tag since this feature does not tie itself to ONLINE resizing.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 25, 2018
@chakri-nelluri
Copy link
Contributor

/LGTM

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 25, 2018
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 25, 2018
@gnufied
Copy link
Member

gnufied commented Oct 25, 2018

/approve

@gnufied
Copy link
Member

gnufied commented Oct 25, 2018

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 25, 2018
@aniket-s-kulkarni
Copy link
Contributor Author

/assign @saad-ali

@saad-ali assigning you as approver, since this has a test case change that needs approval.

@chakri-nelluri
Copy link
Contributor

/LGTM

@gnufied
Copy link
Member

gnufied commented Oct 25, 2018

Please don't forget to open follow up PR that adds e2e that does not depend on online resizing feature.

"strconv"
)

func (plugin *flexVolumePlugin) ExpandVolumeDevice(spec *volume.Spec, newSize resource.Quantity, oldSize resource.Quantity) (resource.Quantity, error) {
Copy link
Member

@gnufied gnufied Oct 25, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this implementation EVER get called? It looks like, even if flex volume plugin is installed on the controller, the code still returns a no-op version of flex plugin.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nvm. looks like it works as expected. thanks.

@gnufied
Copy link
Member

gnufied commented Oct 25, 2018

Putting the PR on hold for a bit.
/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 25, 2018
@gnufied
Copy link
Member

gnufied commented Oct 25, 2018

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 25, 2018
Copy link
Member

@saad-ali saad-ali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aniket-s-kulkarni, gnufied, saad-ali

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 2, 2018
@k8s-ci-robot k8s-ci-robot merged commit 6813ebb into kubernetes:master Nov 2, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/storage Categorizes an issue or PR as relevant to SIG Storage. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants