test: prevent panic on k8s services host fw test on some runs. #25747
/test |
0ef3120 to ce7d345
/test |
LGTM.
Do we have an example of such a panic? Or a related issue? So that we can trace past occurrences back to this PR and know that they should be fixed now. |
The change introduced in 0e20d30 as a workaround for a flake can panic, depending on the order in which the tests are run, if the deploymentManager is not set up with a kubectl object. k8s/services already deploys Cilium with the host firewall enabled, so this moves installing Cilium out of the host-fw preparation step and defers that to the caller (such as in datapath_configuration). In addition, this was causing failures with the hostfw K8sServices test because the deploymentManager Cilium install procedure is stricter about ensuring the liveness of the agents' health endpoints. Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
ce7d345 to 5865474
Good point. I was working off a log file someone sent me; I'll open an issue to track it. |
/test |
I don't think we can see this in CI. It happened to Maxim (IIRC) locally because he wasn't running the tests in the same order as the CI or because he was running only this test. |
Oh, I stand corrected. So what is the random part here that makes this not fail 100% of the time? |
I'm having a hard time figuring that out. I initially presumed that the tests ran in no set order, but that seems to be wrong. In all cases it appears as though none of the tests that would call |
I'll see if I can figure it out here: #25748 |
/test |
/test-backport-1.13
Job 'Cilium-PR-K8s-1.24-kernel-4.19' failed:
Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.24-kernel-4.19/12/ If it is a flake and a GitHub issue doesn't already exist to track it, comment Then please upload the Jenkins artifacts to that issue. |
Quoting so we don't lose it:
|
This looks like we're just working around the missing
|
Basically, alternatively I could've added the SetKubectl call to that test but that didn't make much sense to me since that test doesn't use "deploymentManager" (unlike the hostfw DatapathConfig tests). What I don't know is why there are the different approaches. |
/test-1.26-net-next |
/test-1.24-4.19
Job 'Cilium-PR-K8s-1.26-kernel-net-next' failed:
Jenkins URL: https://jenkins.cilium.io/job/Cilium-PR-K8s-1.26-kernel-net-next/337/ If it is a flake and a GitHub issue doesn't already exist to track it, comment Then please upload the Jenkins artifacts to that issue. |
@aanm now also applied this variant of the fix via a2c2be5 :) |
The change introduced in 0e20d30 as a workaround for a flake can panic, depending on the order in which the tests are run, if the deploymentManager is not set up with a kubectl object.
k8s/services already deploys Cilium with the host firewall enabled, so this moves installing Cilium out of the host-fw preparation step and defers that to the caller (such as in datapath_configuration).
Fixes #25775
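For readers unfamiliar with the failure mode, here is a minimal Go sketch of the panic described above and of why deferring the install to the caller avoids it. It is an illustration under assumptions, not the actual Cilium test helper code: the `Kubectl` and `DeploymentManager` types, the `Apply` and `DeployCilium` methods, and the `prepareHostFirewall` function are hypothetical stand-ins. Only the `SetKubectl` name comes from the discussion, and its real signature may differ.

```go
package main

import "fmt"

// Kubectl is a hypothetical stand-in for the test helper's kubectl wrapper.
type Kubectl struct{ applied []string }

// Apply records a manifest name. With a nil receiver, the field access panics.
func (k *Kubectl) Apply(name string) { k.applied = append(k.applied, name) }

// DeploymentManager is a hypothetical stand-in for the test deployment manager.
type DeploymentManager struct{ kubectl *Kubectl }

// SetKubectl stores the kubectl handle (name taken from the discussion,
// signature assumed).
func (m *DeploymentManager) SetKubectl(k *Kubectl) { m.kubectl = k }

// DeployCilium dereferences m.kubectl and therefore panics if SetKubectl
// was never called on this manager.
func (m *DeploymentManager) DeployCilium() { m.kubectl.Apply("cilium") }

// prepareHostFirewall models the preparation step. installCilium=true models
// the old behaviour (install through the manager inside the step);
// installCilium=false models the fix (the caller installs Cilium itself).
// The panic is recovered only so the sketch can report it as an error.
func prepareHostFirewall(m *DeploymentManager, installCilium bool) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	if installCilium {
		m.DeployCilium()
	}
	return nil
}

func main() {
	unset := &DeploymentManager{} // SetKubectl never called
	fmt.Println("old step, manager without kubectl:", prepareHostFirewall(unset, true))
	fmt.Println("new step, manager without kubectl:", prepareHostFirewall(unset, false))

	ready := &DeploymentManager{}
	ready.SetKubectl(&Kubectl{})
	fmt.Println("old step, manager with kubectl:", prepareHostFirewall(ready, true))
}
```

In this sketch the preparation step only fails when it installs through a manager that has no kubectl handle, which matches the order-dependent behaviour discussed in the thread.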