fix:prevent goroutine leakage for pkg/k8s/watchers #22362
Commit 22a0cd3000edea030948482287392cab63031eed does not contain "Signed-off-by". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin
Commit d52e29f35457c27e90be0197d16d89a8b3abd964 does not contain "Signed-off-by". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin
Thanks for the PR. The code doesn't compile right now, so I would suggest that we keep it in draft until you are able to address the CI failures and push a fresh version that passes the initial smoke tests. Then, please mark the PR "Ready for review" and we can take a fresh look.
Commit ae19cbd14a1f493b62557bac6482f7d93e4b8bcb does not contain "Signed-off-by". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin
I pushed again, but CI still failed. How can I get it to pass?
Most jobs failed during image pull. The image pull details link shows that the build failed and has the details of the make target failure. It should be reproducible with
Commit 7c944970b3f8b39f2c9baa7d45045076ed772721 does not contain "Signed-off-by". Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin
Ready for review
I marked this as release-note/misc since it looks like this will only ensure that certain goroutines are cleaned up during shutdown, and when that occurs the agent will exit anyway. If you think that there could be any user-facing impact from this PR, please describe how that impact could occur (ideally with some example observations from an environment where you observed this).
I referred to #21913 for this change.
I clicked ready for review and renamed the PR per #22362 (comment).
I will run CI, then assigned reviewers can take a look.
thank you
From a function API level, I would say this behaviour might be a bit unintuitive. If I were to change the top-level initK8sWatchers call to take a Context with a timeout, my expectation would be that it only constrains how long the init procedure takes to complete, not how long the k8s controllers end up running (i.e. if someone changed the global timeout expecting to constrain init times...). Anyway, I see that this pattern is already being followed, so otherwise it lgtm (it might be worth adding a note about this behaviour).
There are many failures, but the jobs have expired. I would suggest rebasing your PR against master, then we can re-run CI again and try to establish whether CI is identifying code issues with the PR or not.
I pushed again, and all CI checks pass.
/test
Job 'Cilium-PR-K8s-1.25-kernel-4.19' hit: #21519 (90.01% similarity)
@joestringer
514f55d to 3b2a176
/test
Job 'Cilium-PR-K8s-1.16-kernel-4.9' failed.
If it is a flake and a GitHub issue doesn't already exist to track it, comment
A couple of tests were still failing, so I clicked the rebase button and re-triggered full CI. Let's see what CI says after that.
Signed-off-by: yulng <wei.yang@daocloud.io>
I rebased and pushed again; all CI checks pass now.
Each time you rebase, it hides the results of the last CI run. If we cannot see the results of the CI run, then we cannot merge the PR. I will run the CI again, then we can see whether the CI passes or not. Please hold back on rebasing for now.
/test
Use the ctx passed to ciliumClusterwideEnvoyConfigInit instead of wait.NeverStop.
reference #21913
@tommyp1ckles @aanm @joamaki