Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle CSI volume resize migration. #77994

Merged
merged 4 commits into from May 30, 2019

Conversation

@gnufied
Copy link
Member

commented May 16, 2019

Add code to handle csi volume resize migration.

Fixes #77417

/sig storage

Handle resize operation for volume plugins migrated to CSI
@leakingtapan

This comment has been minimized.

Copy link
Contributor

commented May 16, 2019

@k8s-ci-robot k8s-ci-robot requested review from davidz627 and ddebroy May 16, 2019

@@ -221,11 +248,31 @@ func (expc *expandController) syncHandler(key string) error {
return err
}

return expc.expand(pvc, pv)
if volumePlugin.IsMigratedToCSI() {

This comment has been minimized.

Copy link
@leakingtapan

leakingtapan May 16, 2019

Contributor

Even if the plugin support CSI migration, should also check for whether the migration is enabled and supported by kubelet using nodeUsingCSIPlugin

This comment has been minimized.

Copy link
@gnufied

gnufied May 17, 2019

Author Member

Thats a good point but I do not think we can check if plugin is installed on the node via nodeUsingCSIPlugin because resize operation does not have a relationship with the node.

Thinking some more I think even if CSI plugin was not installed on the node and control-plane expansion was done using CSI resizer, the file system expansion can still be performed by in-tree driver and everything should just work.

This comment has been minimized.

Copy link
@gnufied

gnufied May 17, 2019

Author Member

Also, this will be less of a concern once https://github.com/kubernetes/enhancements/blob/master/keps/sig-storage/20190408-volume-scheduling-limits.md is impelemted, because then if CSI migration has been enabled for a plugin and plugin is not installed on the node, then the node will not have pod scheduled to the node.

I am thinking we may be able to remove entire nodeUsingCSIPlugin logic, once volume limit KEP is implemented.

This comment has been minimized.

Copy link
@leakingtapan

leakingtapan May 17, 2019

Contributor

Node uses in-tree plugin while controller uses CSi driver is a combination we are trying to avoid in other cases. I remember is reason was it's hard to guarantee the compatibility between the two.

@davidz627 How do you think?

This comment has been minimized.

Copy link
@gnufied

gnufied May 20, 2019

Author Member

Sure - but:

a. We can't avoid it in this case.
b. Since nodes which doesn't have CSI plugin will not have migrated volumes scheduled on them, this will be a non-issue.

This comment has been minimized.

Copy link
@jsafrane

jsafrane May 27, 2019

Member

IsMigratedToCSI() checks if a particular plugin has necessary feature gates in controller-manager.

It can happen that a volume is migrated in the controller and is thus resized by CSI driver and then it is scheduled on a node that has disabled migration for this plugin. In-tree NodeExpand() will be called there. IMO, it should not be an issue, as both do just resize2fs (or something similar), but CSI driver authors must be aware of this and don't expect that NodeExpand() is always handled by the CSI driver.

@davidz627 @ddebroy: we should also document this somehow - do we have a guide with requirements to CSI driver authors of the migrated drivers?

This comment has been minimized.

Copy link
@davidz627

davidz627 May 28, 2019

Contributor

My preference would be to keep the same logic we use for attach/mount sync here - that we want to use one type of backend for both controller calls & node calls and therefore we should use nodeUsingCSIPlugin. I understand that theoretically there shouldn't be an issue but that is also the case for most of the plugins for attach/mount steps as well. The issue is that it is impossible to actually guarantee that the controller/node level operations are compatible between migrated/not-migrated since there can be driver specific quirks.

@jsafrane I am working on a guide for requirements to CSI Driver authors.

This comment has been minimized.

Copy link
@gnufied

gnufied May 28, 2019

Author Member

The problem is - contrary to attach/mount operation, no such relationship exists between node and control-plane resize operation.

The volume may not be in-use by any pod when resize was attempted.

This comment has been minimized.

Copy link
@jsafrane

jsafrane May 29, 2019

Member

There is an option for "delayed resizing" - wait until a pod is scheduled. But I don't like it, it breaks current behavior and looks ugly.

The issue is that it is impossible to actually guarantee that the controller/node level operations are compatible between migrated/not-migrated since there can be driver specific quirks.

IMO, there are no driver specific quirks for filesystem resize in-tree. We can require the same behavior for migrated drivers.

This comment has been minimized.

Copy link
@davidz627

davidz627 May 29, 2019

Contributor

ok, understood!

Show resolved Hide resolved pkg/controller/volume/expand/expand_controller.go
add some tests for expand controller
Update bazel file

@gnufied gnufied force-pushed the gnufied:csi-resize-migration branch from 58d3970 to 33b95eb May 20, 2019

@gnufied gnufied changed the title {WIP} Handle CSI volume resize migration. Handle CSI volume resize migration. May 20, 2019

@gnufied

This comment has been minimized.

Copy link
Member Author

commented May 20, 2019

/kind feature
/priority important-soon

@msau42

This comment has been minimized.

Copy link
Member

commented May 23, 2019

Show resolved Hide resolved pkg/controller/volume/expand/expand_controller.go Outdated
@@ -221,11 +248,31 @@ func (expc *expandController) syncHandler(key string) error {
return err
}

return expc.expand(pvc, pv)
if volumePlugin.IsMigratedToCSI() {

This comment has been minimized.

Copy link
@jsafrane

jsafrane May 27, 2019

Member

IsMigratedToCSI() checks if a particular plugin has necessary feature gates in controller-manager.

It can happen that a volume is migrated in the controller and is thus resized by CSI driver and then it is scheduled on a node that has disabled migration for this plugin. In-tree NodeExpand() will be called there. IMO, it should not be an issue, as both do just resize2fs (or something similar), but CSI driver authors must be aware of this and don't expect that NodeExpand() is always handled by the CSI driver.

@davidz627 @ddebroy: we should also document this somehow - do we have a guide with requirements to CSI driver authors of the migrated drivers?

Show resolved Hide resolved pkg/controller/volume/expand/expand_controller.go Outdated
Show resolved Hide resolved pkg/volume/util/operationexecutor/fakegenerator.go Outdated
newPVC := pvc.DeepCopy()
newPVC = MergeResizeConditionOnPVC(newPVC, conditions)
newPVC = setResizer(newPVC, resizerName)
return PatchPVCStatus(pvc /*oldPVC*/, newPVC, kubeClient)

This comment has been minimized.

Copy link
@jsafrane

jsafrane May 27, 2019

Member

Should there be a check (in PatchPVCStatus?) that old and new are really different, so we don't do unnecessary API server request?

This comment has been minimized.

Copy link
@gnufied

gnufied May 28, 2019

Author Member

may be - but I am not sure. I think such a check belongs may be to generic http client. In this case though - the "diff" byte array will be empty and hence http client can reject such a patch. BTW - I tried to look for a precedence of performing old vs new object (diff) before submitting patch request and I couldn't find one.

Show resolved Hide resolved pkg/controller/volume/expand/expand_controller.go
Show resolved Hide resolved pkg/volume/util/operationexecutor/fakegenerator.go Outdated
Show resolved Hide resolved pkg/controller/volume/expand/expand_controller_test.go
Show resolved Hide resolved pkg/controller/volume/expand/expand_controller_test.go
Show resolved Hide resolved pkg/controller/volume/expand/expand_controller_test.go

@gnufied gnufied force-pushed the gnufied:csi-resize-migration branch from 952ac5c to d119899 May 28, 2019

@gnufied

This comment has been minimized.

Copy link
Member Author

commented May 28, 2019

/retest

@davidz627
Copy link
Contributor

left a comment

mostly LGTM. One major comment in generated func. some other small questions.

Thanks Hemant!

csiResizerName, err := csitranslation.GetCSINameFromInTreeName(class.Provisioner)
if err != nil {
errorMsg := fmt.Sprintf("error getting CSI driver name for pvc %s, with error %v", util.ClaimToClaimKey(pvc), err)
expc.recorder.Event(pvc, v1.EventTypeWarning, events.ExternalExpanding, errorMsg)

This comment has been minimized.

Copy link
@davidz627

davidz627 May 28, 2019

Contributor

looks like we log an error event whenever there is an error. Do we instead want to log the event at a higher level so as to not have to do an explicit event recording before each error case?

This comment has been minimized.

Copy link
@gnufied

gnufied May 29, 2019

Author Member

but the logs will only be accessible to admin and I think we should surface any errors associated with resizing in PVC itself.

This comment has been minimized.

Copy link
@jsafrane

jsafrane May 29, 2019

Member

I think the idea was to return error to caller and the caller records event, instead of individual event creation in this function.

I can see there are paths that return err without creating event, not sure if they're worth creating an event or not.

This comment has been minimized.

Copy link
@davidz627

davidz627 May 29, 2019

Contributor

makes sense. I was under the impression any error was also logged as event. If that is not the case please ignore this comment

Show resolved Hide resolved pkg/controller/volume/expand/expand_controller_test.go

// fakeOGCounter is a simple OperationGenerator which counts number of times a function
// has been caled
type fakeOGCounter struct {

This comment has been minimized.

Copy link
@davidz627

davidz627 May 28, 2019

Contributor

dumb question: why can't this just be an extension to the existing fakeOperationGenerator

This comment has been minimized.

Copy link
@gnufied

gnufied May 29, 2019

Author Member

The existing operationGenerator fake is for internal testing because it allows us to pause and unpause operations etc. But the interface is not easy to use for external testing (IMO). I thought about extending existing functions but it seems ugly and I think we are better off having a fake operation generator that can be used by external controller tests more easily..

Show resolved Hide resolved pkg/volume/util/operationexecutor/operation_generator.go Outdated

@k8s-ci-robot k8s-ci-robot added size/XL and removed size/L labels May 29, 2019

og.volumePluginMgr.FindPluginBySpec(volumeToMount.VolumeSpec)
if err != nil || volumePlugin == nil {
return volumeToMount.GenerateError("VolumeFSResize.FindPluginBySpec failed", err)
}

This comment has been minimized.

Copy link
@cwdsuzhou

cwdsuzhou May 29, 2019

Member

If we need provide detailed error info when volumePlugin is nil?

This comment has been minimized.

Copy link
@gnufied

gnufied May 29, 2019

Author Member

You mean like code below? I think GenerateError works as well here. Also it automatically returns two error values and hence is easier to satisfy constraints of the function.

@jsafrane

This comment has been minimized.

Copy link
Member

commented May 29, 2019

/approve
from sig-storage perspective, just polishing the last details

I let the final lgtm to @davidz627

@davidz627

This comment has been minimized.

Copy link
Contributor

commented May 29, 2019

/retest
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label May 29, 2019

@gnufied

This comment has been minimized.

Copy link
Member Author

commented May 29, 2019

For controller-manager change.

/assign @derekwaynecarr

@gnufied

This comment has been minimized.

Copy link
Member Author

commented May 29, 2019

/assign @deads2k
/unassign @derekwaynecarr

@deads2k

This comment has been minimized.

Copy link
Contributor

commented May 29, 2019

/approve

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented May 29, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k, gnufied, jsafrane

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@davidz627

This comment has been minimized.

Copy link
Contributor

commented May 29, 2019

does this PR resolve: #77417

@gnufied

This comment has been minimized.

Copy link
Member Author

commented May 29, 2019

@davidz627 it does. I am going to close the issue.

@fejta-bot

This comment has been minimized.

Copy link

commented May 29, 2019

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit 05df640 into kubernetes:master May 30, 2019

21 checks passed

cla/linuxfoundation gnufied authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-conformance-image-test Skipped.
pull-kubernetes-cross Skipped.
pull-kubernetes-dependencies Job succeeded.
Details
pull-kubernetes-e2e-gce Job succeeded.
Details
pull-kubernetes-e2e-gce-100-performance Job succeeded.
Details
pull-kubernetes-e2e-gce-csi-serial Job succeeded.
Details
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-e2e-gce-storage-slow Job succeeded.
Details
pull-kubernetes-godeps Skipped.
pull-kubernetes-integration Job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
Details
pull-kubernetes-local-e2e Skipped.
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-node-e2e-containerd Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details
pull-publishing-bot-validate Skipped.
tide In merge pool.
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.