CSI E2E: retry csi-pod creation #68822
test/e2e/storage/csi_objects.go
Outdated
	var err error
	ret, err = podClient.Create(pod)
	return err
}).ShouldNot(HaveOccurred(), "Failed to create %q pod: %v", pod.GetName())
extra %v, needs err
I meant to remove the "%v". Gomega itself will dump the error. But well-spotted, thanks.
Further testing also showed that the default Expect() timeouts were too short. I've bumped that to 1m in the revised commit.
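To illustrate the review comment above, here is a minimal standalone sketch of what Go's fmt package does when a format string has more verbs than arguments. The helper name render and the pod name "csi-pod" are only illustrative and are not taken from the PR; the formats are kept in variables so the mismatch is evaluated at run time.

```go
package main

import "fmt"

// Both formats are stored in variables (not constants) so the example
// demonstrates the run-time behaviour of the mismatched format string.
var (
	// brokenFormat has two verbs; the call below supplies only one argument.
	brokenFormat = "Failed to create %q pod: %v"
	// fixedFormat drops the trailing verb, as described in the reply above.
	fixedFormat = "Failed to create %q pod"
)

// render is a hypothetical helper that simply forwards to fmt.Sprintf.
func render(format string, args ...interface{}) string {
	return fmt.Sprintf(format, args...)
}

func main() {
	fmt.Println(render(brokenFormat, "csi-pod")) // Failed to create "csi-pod" pod: %!v(MISSING)
	fmt.Println(render(fixedFormat, "csi-pod"))  // Failed to create "csi-pod" pod
}
```

With the extra verb removed, the message stays clean, and reporting the underlying error is left to the assertion library, as the reply above says.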
/kind bug
0991e4f to b30f556
/retest
test/e2e/storage/csi_objects.go
Outdated
// We could use a DaemonSet, but then the name of the csi-pod changes
// during each test run. It's simpler to just try for a while here.
var ret *v1.Pod
Eventually(func() error {
May be worth adding a CreatePodWithRetry utility function?
Yes, that may be useful. I'll do that.
Done. I ended up adding CreateEventually to pod.go's PodClient, which looked like the right place because it already had a Create (without retry).
Normally the pod would get created via a DaemonSet controller, but during testing it is easier to create it directly. We just need to ignore errors (like 'No API token found for service account "csi-service-account"') and retry for a while. If the error persists, the test will still abort and report it eventually. This problem also occurs elsewhere, so a utility function in the framework for it seems justified.
Fixes: kubernetes#68776
b30f556 to 6dbb07c
/retest
/lgtm
/assign saad-ali
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: pohly, saad-ali
As discussed upstream (kubernetes/kubernetes#67882), the 'No API token found for service account "csi-service-account"' error is normal and must be handled by trying to create the pod multiple times, either manually or via a DaemonSet. Here we simply loop with Gomega (same fix that is also proposed upstream in kubernetes/kubernetes#68822).
Normally the pod would get created via a DaemonSet controller, but
during testing it is easier to create it directly. We just need to
ignore errors (like 'No API token found for service account
"csi-service-account"') and retry for a while. If the error persists,
the test will still abort and report it eventually.
What this PR does / why we need it:
The CSI E2E can fail randomly due to a race condition.
Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #68776
Release note:
/sig storage