
Prevent plugin from returning in-use error for source volumes #297

Conversation

timoreimann (Contributor, Author)

What type of PR is this?

/kind bug

What this PR does / why we need it:

This PR addresses a regression in #261 causing plugins to return an in-use error (FAILED_PRECONDITION) when a sourcing resource (i.e., a snapshot or a volume) is deleted before the sourced volume is.

See #289 (comment) for details.

Does this PR introduce a user-facing change?:

Clean up sourced volumes prior to deleting sourcing resources

@apurv15 could you give this a try with your driver?
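
For illustration, here is a minimal, hypothetical sketch of the driver-side behavior that triggers this error: a plugin whose volumes and snapshots are dependent refuses DeleteVolume while snapshots exist, as the spec permits. The snapshotsByVolume field is made-up bookkeeping, not any real driver's code.

```go
package driver

import (
	"context"

	"github.com/container-storage-interface/spec/lib/go/csi"
	"google.golang.org/grpc/codes"
	"google.golang.org/grpc/status"
)

type driver struct {
	// snapshotsByVolume maps a volume ID to the IDs of snapshots taken
	// from it (illustrative bookkeeping only).
	snapshotsByVolume map[string][]string
}

// DeleteVolume refuses to delete a volume that still has snapshots. This is
// the FAILED_PRECONDITION error the sanity test's cleanup order was triggering.
func (d *driver) DeleteVolume(ctx context.Context, req *csi.DeleteVolumeRequest) (*csi.DeleteVolumeResponse, error) {
	if snaps := d.snapshotsByVolume[req.GetVolumeId()]; len(snaps) > 0 {
		return nil, status.Errorf(codes.FailedPrecondition,
			"volume %q has dependent snapshots %v", req.GetVolumeId(), snaps)
	}
	// ... delete the backing volume here ...
	return &csi.DeleteVolumeResponse{}, nil
}
```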

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Sep 29, 2020
@k8s-ci-robot (Contributor)

Hi @timoreimann. Thanks for your PR.

I'm waiting for a kubernetes-csi member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Sep 29, 2020
// in-use error.
By("cleaning up deleting the volume created from snapshot")
delVol2Req := MakeDeleteVolumeReq(sc, volume2.GetVolume().GetVolumeId())
_, err = r.DeleteVolume(context.Background(), delVol2Req)
Contributor

This looks like the cleanup code didn't delete volumes in the right order, correct?

I think we should fix that instead of adding custom cleanup code in the tests themselves. Suppose we add another check after creating the second volume and before deleting it: if that check then fails, test cleanup will fail like it does now.

Contributor

I haven't checked how volumes are tracked, but perhaps clean up with last-in-first-out order?

timoreimann (Contributor, Author)

We currently always delete snapshots first, then volumes. Deletion in LIFO order sounds like the right approach to me; I think it should work. (For this particular test case, it means that we'd delete volume 2 first, then the snapshot, then volume 1.)

I'll work on making the necessary updates to this PR.

timoreimann (Contributor, Author)

I made the discussed adjustments. I'll comment on some of the changes via GitHub review comments to add more clarity.

Please have a look and let me know what you think.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Oct 5, 2020
type VolumeInfo struct {
// managedResourceInfos is a map of managed resource infos keyed by resource
// IDs.
type resourceInfos map[string]resourceInfo
timoreimann (Contributor, Author)

Alternatively, this could also be typed as []resourceInfo which would make the LIFO property more natural/obvious. On the downside, it would also bring extra effort to remove items from the slice as we clean up resources (volumes, snapshots) during Cleanup.

I decided to go with a map and introduced an order field to keep track of when resources were added to the list.

Contributor

This uses the volume ID and snapshot ID as keys, right? Is it guaranteed that those never conflict?

it would also bring extra effort to remove items from the slice as we clean up resources (volumes, snapshots) during Cleanup

That extra effort is just finding the item and then creating a slice without it, right? That doesn't seem that hard.

Without having seen what it looks like, I suspect that []resourceInfo will be the better solution.

timoreimann (Contributor, Author)

Made the change. I think you were right, this looks much simpler.
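
For illustration, the slice-based bookkeeping could look roughly like this (a sketch with made-up names, not the PR's exact code):

```go
// resourceInfo describes one managed resource (volume or snapshot).
type resourceInfo struct {
	id   string      // volume or snapshot ID
	data interface{} // volume- or snapshot-specific details
}

// Resources tracks managed resources in creation order, so that cleanup
// can walk the slice backwards (LIFO).
type Resources struct {
	managedResourceInfos []resourceInfo
}

// registerResource appends a resource, preserving creation order.
func (r *Resources) registerResource(info resourceInfo) {
	r.managedResourceInfos = append(r.managedResourceInfos, info)
}

// unregisterResource drops the entry with the given ID. The "extra effort"
// mentioned above boils down to this in-place filter.
func (r *Resources) unregisterResource(id string) {
	kept := r.managedResourceInfos[:0]
	for _, info := range r.managedResourceInfos {
		if info.id != id {
			kept = append(kept, info)
		}
	}
	r.managedResourceInfos = kept
}
```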

}

// volumeInfo keeps track of the information needed to delete a volume.
type volumeInfo struct {
timoreimann (Contributor, Author)

I unexported the type since it's effectively an implementation detail. Strictly speaking, this could be considered a breaking change, though I doubt anyone uses it (or rather, I hope not).

If we're concerned about this, I'm also happy to undo the change and save it for when we have more things to change that justify a new major release.

// Node on which the volume was published, empty if none
// or publishing is not supported.
NodeID string

// Volume ID assigned by CreateVolume.
VolumeID string
timoreimann (Contributor, Author)

I removed VolumeID because it's now part of the key in resourceInfos. Same considerations as with the unexporting of [Vv]olumeInfo apply, though.

@apurv15 commented Oct 9, 2020

@timoreimann: We are seeing that a lot more test cases fail with this change. For example:
ListSnapshots [Controller Server]
should return snapshots that match the specified snapshot id
/root/csi-sanity-new/src/github.com/kubernetes-csi/csi-test/pkg/sanity/controller.go:1126
STEP: reusing connection to CSI driver at /var/lib/kubelet/plugins/org.veritas.infoscale/csi.sock
STEP: creating mount and staging directories
STEP: creating first unrelated snapshot
STEP: creating target snapshot
STEP: creating second unrelated snapshot
STEP: listing snapshots
cleanup volume: deleting vol_c0o3m11862x7xmlbt5o9
cleanup volume: DeleteVolume failed: rpc error: code = FailedPrecondition desc = Could not delete volume vol_c0o3m11862x7xmlbt5o9, as it has dependent snapshots ['snap_043b9dl9u20l60xj4f2i'] with sync in progress

@timoreimann (Contributor, Author)

@apurv15 I'm not exactly sure why things don't work for you. 🤔

I pushed another commit that outputs extra debug messages when you set the DEBUG_RESOURCE_MANAGER environment variable to something non-empty. Could you try this out and share the log output, especially the [DEBUG]-prefixed messages?

@pohly let me know if you think it makes sense to keep the poor man's debug logging approach in or not.

@pohly (Contributor) commented Oct 30, 2020

let me know if we think it makes sense to keep the poor man's debug logging approach in or not.

Why not just klog.V(4) or so?

@timoreimann (Contributor, Author)

This one fell off my radar. I'll make sure to address the last comment soon.

This addresses a regression in [1] causing plugins to return an in-use
error (FAILED_PRECONDITION) when a sourcing resource (i.e., a snapshot
or a volume) is deleted before the sourced volume is.

[1]: kubernetes-csi#261
@timoreimann (Contributor, Author)

@pohly

Why not just klog.V(4) or so?

We currently set a log.Logger to write into a GinkgoWriter, causing the output to be suppressed if all tests pass. Unless I'm missing something, klog only supports setting the output at the global package level. I'm a bit reluctant to do that as it'd mean that we could overwrite klog settings that csi-test library consumers may have set for their tests.

WDYT?

@pohly (Contributor) commented Jan 19, 2021

We currently set a log.Logger to write into a GinkgoWriter, causing the output to be suppressed if all tests pass.

We could do the same in the csi-sanity binary and enable the klog flags, which will have the same effect. We can't do it when used as lib, but then users who care can do something similar in their main function.

That should go a long way towards nicer logging without doing anything special (like a DEBUG_RESOURCE_MANAGER that most users won't know about).
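
A minimal sketch of what that wiring could look like in a csi-sanity main function, assuming klog/v2 and Ginkgo (not the PR's exact code):

```go
package main

import (
	"flag"

	"github.com/onsi/ginkgo"
	"k8s.io/klog/v2"
)

func main() {
	// Register klog's flags (-v, -logtostderr, ...) so users can raise
	// verbosity, e.g. with -v=4. Passing nil uses flag.CommandLine.
	klog.InitFlags(nil)
	flag.Parse()

	// Route klog output through Ginkgo so it is buffered per spec and only
	// shown for failing tests. Note that klog only honors SetOutput when
	// -logtostderr is false.
	klog.SetOutput(ginkgo.GinkgoWriter)

	// ... set up the sanity test config and run the suite ...
}
```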

func (cl *Resources) Cleanup() {
	cl.mutex.Lock()
	defer cl.mutex.Unlock()
	ctx := context.Background()

	cl.deleteSnapshots(ctx, 2)
	cl.deleteVolumes(ctx, 2)
	// Clean up resources in LIFO order to account for dependency order.
Contributor

PVC and VolumeSnapshot have independent life cycles in Kubernetes. If we fix the sanity test to make sure snapshots are always deleted before volumes are deleted, driver maintainers won't know it is possible for users to delete PVCs first, and they may run into problems later in production.

So I think we should actually have sanity tests that delete the snapshot both before and after the volume's deletion, to make sure a driver can handle this independent life cycle. Yes, this means a driver will have to implement logic to do soft deletion if volumes and snapshots are dependent on each other on the storage system in order to pass the sanity tests.

apurv15

I had commented on #289: the spec provides flexibility for a PVC and its snapshots to be dependent or not. https://github.com/container-storage-interface/spec/blob/master/spec.md says the following:

===
CSI plugins SHOULD treat volumes independent from their snapshots.

If the Controller Plugin supports deleting a volume without affecting its existing snapshots, then these snapshots MUST still be fully operational and acceptable as sources for new volumes as well as appear on ListSnapshot calls once the volume has been deleted.

When a Controller Plugin does not support deleting a volume without affecting its existing snapshots, then the volume MUST NOT be altered in any way by the request and the operation must return the FAILED_PRECONDITION error code and MAY include meaningful human-readable information in the status.message field.
===

So the test case should not validate just one behavior, as the spec allows volumes and snapshots to be dependent, and that is the case for some drivers.

timoreimann (Contributor, Author) commented Jan 20, 2021

With LIFO-ordered deletion implemented by this PR, I think we satisfy the minimum requirement that the spec mandates, which covers plugins that do not or cannot treat volumes and snapshots independently. That is, if a volume is created from a snapshot, we first delete the snapshot and then the volume. (I also assume the same logic should apply if a volume is sourced from another volume.)

However, @apurv15 mentioned in a previous comment that csi-test built from this change would cause more tests to fail. 🤔 I guess I should give the PR another hard look to check if I missed anything. (@apurv15, did you ever get around to running it with the enhanced debug log messages as I suggested?)

@xing-yang would it make sense to file a separate issue to track creating a new test for the missed scenario you mentioned (while taking into account the specifics of the spec as @apurv15 quoted above)? IMHO, this PR should focus on addressing the regression I introduced and returning to the original cleanup order.

xing-yang (Contributor) commented Jan 30, 2021

There are also storage systems that cannot delete snapshots before deleting PVCs. Our existing e2e tests also delete in a specific order, deleting snapshots first and then deleting PVCs. However, that caused some CSI drivers to fail because they require the opposite ordering of deletion: PVCs must be deleted before snapshots. There's an issue open to track and fix this in the e2e tests so that deletion of PVCs and snapshots is independent of each other:
kubernetes/kubernetes#96719

Our sanity test should be consistent with e2e test.

timoreimann (Contributor, Author)

@xing-yang how should csi-test (and the e2e tests) account for the two possible orderings?

Should we try one order, and if we receive a particular error, try the inverse?

Or is the expectation that the driver acknowledges delete requests and completes the cleanup asynchronously once both the snapshot and volume deletes have been issued?

csi-test currently does not check that deletes have completed, so we might naturally support the latter scenario but not the former.

Contributor

This doesn't need to be solved in this PR. It's more like an idea for some dedicated tests.

Contributor

Regarding cleanup order: if both cases are valid, then picking one for cleanup (LIFO at the moment) is correct, i.e., we can merge this PR.

Contributor

Regarding the comment "Clean up resources in LIFO order to account for dependency order.": we should remove the text "to account for dependency order", as we are not trying to make sure snapshots are deleted before volumes are deleted.

timoreimann (Contributor, Author) commented Feb 2, 2021

In its current form, the code cleans up resources "backwards": the resource (either volume or snapshot) that was created last is cleaned up first, followed by the resource that was created second-to-last, and so on. Given that sourcing volumes must exist before the snapshots that are spawned from them, this means that we first delete a snapshot, followed by its sourcing volume.

Doesn't that imply that we do make sure that snapshots are deleted before volumes are, @xing-yang?

Or are we talking about the scenario where a volume is being created from a snapshot and a possible dependency in that chain?

Contributor

We discussed this a bit further on Monday. @msau42 pointed out that it is valid for a CSI driver to refuse to delete PVCs that have snapshots. It is also valid to refuse to delete snapshots while the PVC still exists. In other words, a CSI driver only needs to support one order of deletions, not both.

This means that picking LIFO order will work for some drivers, but not all.

The right approach is to retry a deletion after it has failed once, i.e., try to delete the snapshot, delete the PVC, and then, if the snapshot deletion failed, try it again. I think this can be added relatively easily by iterating twice over all resources instead of just once (see the sketch below).

We also discussed writing tests that explicitly cover both deletion orders. One test can do "delete snapshot, delete PVC, delete snapshot again (if necessary)", the other "delete PVC, delete snapshot, delete PVC again (if necessary)". This then works with all drivers and ensures that both orders are tested. But this should be added in a separate PR.
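
A sketch of that two-pass idea, extending the illustrative slice bookkeeping from earlier (assumes context and fmt are imported); delete is a hypothetical helper that issues the appropriate DeleteVolume or DeleteSnapshot call:

```go
// cleanupWithRetry walks resources in LIFO order and deletes what it can;
// whatever fails (e.g. with FAILED_PRECONDITION because a dependent resource
// still exists) is retried in a second pass, by which time its counterpart
// has been deleted regardless of which ordering the driver requires.
func (r *Resources) cleanupWithRetry(ctx context.Context) []error {
	var errs []error
	failed := map[string]bool{}

	// First pass: LIFO over creation order.
	for i := len(r.managedResourceInfos) - 1; i >= 0; i-- {
		info := r.managedResourceInfos[i]
		if err := r.delete(ctx, info); err != nil {
			failed[info.id] = true
		}
	}

	// Second pass: retry only the failures; errors are now final.
	for i := len(r.managedResourceInfos) - 1; i >= 0; i-- {
		info := r.managedResourceInfos[i]
		if !failed[info.id] {
			continue
		}
		if err := r.delete(ctx, info); err != nil {
			errs = append(errs, fmt.Errorf("deleting %s: %w", info.id, err))
		}
	}
	return errs
}
```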

@timoreimann timoreimann force-pushed the prevent-plugin-from-returning-in-use-error-for-source-volumes branch from 82b7d90 to 537f237 Compare January 20, 2021 23:38
@timoreimann (Contributor, Author)

@pohly I replaced the custom logger and env var flag with klog as discussed. I also optimized and reduced the amount of logging a bit more.

Please take another look.

@timoreimann (Contributor, Author)

I should manually test-drive this PR a bit more, so for now

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 22, 2021
@@ -299,31 +296,33 @@ func (cl *Resources) cleanupVolume(ctx context.Context, offset int, volumeID str
Secrets: cl.Context.Secrets.ControllerUnpublishVolumeSecret,
},
)
logger.Errorf(err, "ControllerUnpublishVolume failed: %s", err)
ExpectWithOffset(offset, err).NotTo(HaveOccurred(), "[volume cleanup] ControllerUnpublishVolume failed")
Contributor

Previously errors during cleanup were logged, but cleanup continued, right? Now it aborts.

Error handling during cleanup is tricky. Ideally, as much as possible should get cleaned up and then a summary of all unsolved problems should be raised as a test failure. The latter ensures that a test that itself ran fine but then has problems during cleanup is reported as failed overall.

timoreimann (Contributor, Author)

Whoops, somehow I missed that we used to continue to clean up even when we hit errors.

I made another update to collect all errors and assert at the end of the Cleanup call. I think this goes even beyond what we had before, where we'd assert at the end of a cleanup{Volume,Snapshot} call.

One thing I want to mention is that a failure to, say, node-unpublish a volume could mean that follow-up cleanup operations on the same resource (node-unstage, controller-unpublish, delete) are then more likely to fail as well. I'm wondering if this complicates understanding the produced error messages, and whether it could make more sense to skip follow-up cleanup operations within a single resource (while continuing to clean up all other resources).

I followed your suggestion for now, but let me know if you have thoughts on the above and whether it should influence another update.
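
For illustration, the collect-then-assert pattern could look roughly like this (hypothetical helper names, assumes fmt is imported; not the PR's exact code):

```go
// cleanupVolume runs every cleanup step even if earlier ones fail,
// collecting the errors instead of asserting immediately.
func (cl *Resources) cleanupVolume(ctx context.Context, id string) []error {
	var errs []error
	record := func(op string, err error) {
		if err != nil {
			errs = append(errs, fmt.Errorf("[volume cleanup] %s failed: %w", op, err))
		}
	}
	record("NodeUnpublishVolume", cl.nodeUnpublish(ctx, id))
	record("NodeUnstageVolume", cl.nodeUnstage(ctx, id))
	record("ControllerUnpublishVolume", cl.controllerUnpublish(ctx, id))
	record("DeleteVolume", cl.deleteVolume(ctx, id))
	return errs
}

// Cleanup then aggregates the per-resource errors and asserts once:
//
//	Expect(allErrs).To(BeEmpty(), "resource cleanup failed")
```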

Contributor

I agree that errors are likely to cause other errors. But that is nothing new; developers should be familiar with follow-up errors and focus on the first one when trying to understand what went wrong.

So I think this behavior is fine.

pohly (Contributor) left a comment

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 30, 2021
@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pohly, timoreimann

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 30, 2021
@xing-yang (Contributor)

I'm fine with having this PR merged first and handling the volume/snapshot deletion ordering issue in another PR.

@xing-yang (Contributor)

/lgtm

@timoreimann (Contributor, Author)

If no one else plans to, I can drive the follow-up PR.

Thanks to everyone involved.

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 2, 2021
@k8s-ci-robot k8s-ci-robot merged commit 44a7535 into kubernetes-csi:master Feb 2, 2021