New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
e2e storage: wait for PV deletion also for late binding #90335
Conversation
4c5d1b9
to
9e00c6c
Compare
/retest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching this! I also recently noticed this myself: #90370
test/e2e/storage/testsuites/base.go
Outdated
pv2, err := cs.CoreV1().PersistentVolumes().Get(context.TODO(), pvc.Spec.VolumeName, metav1.GetOptions{}) | ||
switch { | ||
case err == nil: | ||
pv = pv2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a pv2? pv is supposed to be nil here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been bitten by scoping rules in Go before, so I prefer being explicit about updating some variable in an outer scope.
The "simple" approach doesn't work:
pv, err := cs.CoreV1().PersistentVolumes().Get(context.TODO(), pvc.Spec.VolumeName, metav1.GetOptions{})
It creates a new pv
variable which shadows the one we really want to update.
This works, but I find it ugly:
var err error
pv, err = cs.CoreV1().PersistentVolumes().Get(context.TODO(), pvc.Spec.VolumeName, metav1.GetOptions{})
🤷
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
err
is already initialized at L315, so you shouldn't need to declare it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would work... until that L315 changes.
I still don't like how scoping works here, but as it reduces the number of lines I'll go with it:
pv, err = cs.CoreV1().PersistentVolumes().Get(context.TODO(), pvc.Spec.VolumeName, metav1.GetOptions{})
if err != nil {
cleanUpErrs = append(cleanUpErrs, errors.Wrapf(err, "Failed to find PV %v", pvc.Spec.VolumeName))
}
This simplified version also assumes that pv is nil in case of an error and not some incomplete garbage. Unfortunately (IMHO) Go doesn't enforce that.
When a test pattern or storage class uses late binding, the cleanup code didn't know about the PV that may have been created for the PVC since setting it up and thus then also didn't wait for PV deletion. This is problematic for test isolation because the next test was allowed to be started before fully cleaning up. Worse, it the driver gets removed after the test, the volume might never get deleted.
9e00c6c
to
e3d258d
Compare
/lgtm |
/kind important-soon |
@msau42: The label(s) In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/priority important-soon |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: msau42, pohly The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest |
/retest Review the full test history for this PR. Silence the bot with an |
1 similar comment
/retest Review the full test history for this PR. Silence the bot with an |
/retest |
/retest Review the full test history for this PR. Silence the bot with an |
1 similar comment
/retest Review the full test history for this PR. Silence the bot with an |
What type of PR is this?
/kind failing-test
What this PR does / why we need it:
When a test pattern or storage class uses late binding, the cleanup
code didn't know about the PV that may have been created for the PVC
since setting it up and thus then also didn't wait for PV deletion.
This is problematic for test isolation because the next test was
allowed to be started before fully cleaning up. Worse, it the driver
gets removed after the test, the volume might never get deleted.
Does this PR introduce a user-facing change?: