Local PV Stress test: don't fail on deleting missing PV #119745
Conversation
Please note that we're already in Test Freeze. Fast forwards are scheduled to happen every 6 hours; the most recent run was: Thu Aug 3 10:25:55 UTC 2023.
This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the appropriate label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/assign @jsafrane
/lgtm
LGTM label has been added. Git tree hash: e5680ed94ede7ee8a02221521305468702b68930
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: jsafrane, tsmetana. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/retest
What type of PR is this?
/kind bug
/kind flake
What this PR does / why we need it:
The test
[sig-storage] PersistentVolumes-local Stress with local volumes [Serial] should be able to process many pods and reuse local volumes
can sometimes fail with a 404 error while deleting a PV. This causes the test to fail, but the 404 error seems to be bogus. The test periodically creates and deletes local volumes, reusing their names. At the end of the test all the volumes are cleaned up while the test loop itself is still running, so the deletion in the test loop may try to delete an already cleaned-up volume.
There is a check in the test that tries to skip the missing volume, but it runs too early, so the test loop races with the cleanup: the cleanup may delete the volume first, and the test then fails when it tries to delete the same volume a second time. (The
pvc is nil
log proves the cleanup code is already running.) The simplest fix is to ignore the missing volume in the test; it was meant to be skipped anyway.
Special notes for your reviewer:
This looks to be difficult to reproduce, since the failure happens only occasionally (in up to 10% of test runs on a stressed system).
Does this PR introduce a user-facing change?
No, it's a test fix.
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
N/A