New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wait for pod not running or gone in storage tests #113135
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: soltysh The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/triage accepted |
/lgtm |
/retest |
cc @pohly |
@@ -421,6 +421,25 @@ func WaitForPodNameUnschedulableInNamespace(c clientset.Interface, podName, name | |||
}) | |||
} | |||
|
|||
// WaitTimeoutForPodNoLongerRunningOrNotFoundInNamespace waits default amount of time (defaultPodDeletionTimeout) | |||
// for the specified pod to stop running or disappear. Returns an error if timeout occurs first. | |||
func WaitTimeoutForPodNoLongerRunningOrNotFoundInNamespace(c clientset.Interface, podName, namespace string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this not the same as WaitForPodNotFoundInNamespace ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
L545
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WaitForPodNotFoundInNamespace
waits explicitly for NotFound
, whereas this is either NotFound or Completed (ie. WaitTimeoutForPodNoLongerRunningInNamespace
) so you can think of it as a combination of the two. That's how the code in storage was relying on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the test that uses this, it seems like it's called after the pod is deleted? Wouldn't it be better to just wait for NotFound in that case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is already merged, but left some questions about whether the original behavior was correct.
@@ -421,6 +421,25 @@ func WaitForPodNameUnschedulableInNamespace(c clientset.Interface, podName, name | |||
}) | |||
} | |||
|
|||
// WaitTimeoutForPodNoLongerRunningOrNotFoundInNamespace waits default amount of time (defaultPodDeletionTimeout) | |||
// for the specified pod to stop running or disappear. Returns an error if timeout occurs first. | |||
func WaitTimeoutForPodNoLongerRunningOrNotFoundInNamespace(c clientset.Interface, podName, namespace string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at the test that uses this, it seems like it's called after the pod is deleted? Wouldn't it be better to just wait for NotFound in that case?
e2epv.PVPVCCleanup(c, ns, config.pv, config.pvc) | ||
err = e2epv.DeletePVSource(config.pvSource) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will these succeed if the pod still exists (but isn't running)?
@@ -93,7 +93,7 @@ func PodsUseStaticPVsOrFail(f *framework.Framework, podCount int, image string) | |||
e2epod.DeletePodOrFail(c, ns, config.pod.Name) | |||
} | |||
for _, config := range configs { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: this could be run in parallel to speed it up.
I'll sync with storage folks and get them updated in followups, your points are totally valid @tallclair |
What type of PR is this?
The storage test
[sig-storage] Multi-AZ Cluster Volumes should schedule pods in the same zones as statically provisioned PVs
fromkubernetes/test/e2e/storage/ubernetes_lite_volumes.go
Line 53 in 83415e5
kubernetes/test/e2e/storage/ubernetes_lite_volumes.go
Line 96 in 83415e5
During refactoring in #109704 that condition has changed, previously
WaitForPodNoLongerRunningInNamespace
worked for both NotFound and Completed pod, after - it only checks Completed pods. This PR introduces another method which works for both conditions.Found in OpenShift CI (https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/27416/pull-ci-openshift-origin-master-e2e-gcp-ovn/1582239480008413184) where this test consistently fails, since we run more packed cluster where PodGC will kick in sooner removing the pods.
/kind cleanup
/kind failing-test
Special notes for your reviewer:
/assign @jsafrane
/cc @aojea @tallclair
Does this PR introduce a user-facing change?