Fix test:Probing container should have monotonically increasing restart #108652
@249043822: This issue is currently awaiting triage. If a SIG or subproject determines this is a relevant issue, they will accept it by applying the The Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Force-pushed from 263fbe1 to d7d801d.
Isn't this relaxing the condition too much?
Any suggestions? Thanks.
@@ -192,7 +192,7 @@ var _ = SIGDescribe("Probing container", func() {
 			FailureThreshold: 1,
 		}
 		pod := livenessPodSpec(f.Namespace.Name, nil, livenessProbe)
-		RunLivenessTest(f, pod, 5, time.Minute*5)
+		RunLivenessTest(f, pod, 3, time.Minute*10)
The backoffs alone for 5 retries add up to almost 2 minutes. On top of that, the default timeout for a single restart is 4 minutes. So the timeout here is clearly inadequate; it should be at least 6 minutes. Since the first restart must be the longest, as it involves pod scheduling, I'd expect 14 minutes to be enough for 5 retries (assuming that restarting the pod takes half the time needed to schedule and start it).
-		RunLivenessTest(f, pod, 3, time.Minute*10)
+		// ~2 minutes backoff timeouts + 4 minutes defaultObservationTimeout + 2 minutes for each pod restart
+		RunLivenessTest(f, pod, 5, 2*time.Minute+defaultObservationTimeout+4*2*time.Minute)
I think 14 minutes is an acceptable timeout for a conformance test.
If we switch to 3 retries, the same logic gives 30 sec + 4 minutes + 2 * 2 minutes = 8.5 minutes, so there is no need for 10 minutes. I'd prefer 5 retries, though, just to be on the safe side and to avoid changing the conformance test.
Done
Force-pushed from d7d801d to b131d49.
Force-pushed from b131d49 to d7d7e0d.
?
/lgtm
let's see if this will make it
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: 249043822, SergeyKanzhelev The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
What type of PR is this?
/kind failing-test
What this PR does / why we need it:
The failure log shows that the sync was very slow in some circumstances, so I don't think 5 restarts are needed; we could set the count to 3 and extend the waiting time.
Which issue(s) this PR fixes:
Fixes #108504
Special notes for your reviewer:
/cc @SergeyKanzhelev @ehashman
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: