[Flake] TestPriorityAndFairnessWithPanicRecoveryAndTimeoutFilter flaking #111154
Comments
/priority important-soon |
/cc @lavalamp |
@MikeSpreitzer and @wojtek-t, could this be related to #110104? |
@aojea: I see no direct relationship, but hey, anything's possible. We learned that with the misery in integration test failures when we merged not-really-guilty other stuff. |
The expanded logging turned up some new evidence. See https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/111162/pull-kubernetes-unit/1547751720010911744 , where the complaint is now
|
The problem here is not APF, it is something in the timeout filter or the test server. |
Is something in APF triggering a timeout? |
The flake definitely came in with #110104. On the prior commit on master:
On the merge commit of #110104:
|
/assign @MikeSpreitzer |
Note that the flaking test deliberately suffers server-side timeouts. The symptom in the flakes is that the test checks that the client gets kube's designed response to a timeout while handling (a particular form of HTTP response message), but sometimes the client gets something else (an http2 internal error?). #110104 is a change in metrics and should have nothing to do with how timeouts while handling are detected or handled. I will see what I can see... |
@liggitt , regarding your comment above (#111154 (comment)): note that the output shown (the panic "as designed") is normal; if there are failures involved, the failing part was elided. |
Also, the failures seem to be coming from one particular subtest ( |
On closer examination of the timeout filter, it appears that this "INTERNAL ERROR" is something that can happen normally in the course of handling a timeout. So I will adjust the test to accept this. |
I have updated #111162 to include the fix for this flake. |
Which jobs are flaking?
pull-kubernetes-unit
Which tests are flaking?
TestPriorityAndFairnessWithPanicRecoveryAndTimeoutFilter
Since when has it been flaking?
7/14
Testgrid link
https://testgrid.k8s.io/presubmits-kubernetes-blocking#pull-kubernetes-unit&include-filter-by-regex=TestPriorityAndFairnessWithPanicRecoveryAndTimeoutFilter&width=20
Reason for failure (if possible)
https://storage.googleapis.com/k8s-triage/index.html?pr=1&test=TestPriorityAndFairnessWithPanicRecoveryAndTimeoutFilter
Anything else we need to know?
No response
Relevant SIG(s)
/sig api-machinery