test: Only restart KubeDNS if required #11207
test-me-please
0f08186 to 38b2ae4
test-me-please
38b2ae4 to c06ec3f
test-me-please
Maybe this fixes #10980?
Yes, it does. I'll add a fixes tag.
Still needs more work:
c06ec3f to 0bac652
test-me-please
0bac652 to 00e22b1
test-me-please K8s-1.17 and K8s-1.11:
00e22b1 to 2cca7f6
retest-4.19 EDIT: Passed
Awesome stuff! Two minor, but nice to have change requests left inline.
Instead of always restarting the kube-dns deployment, split the validation of the installation into a separate function so we can validate the deployment and only restart kube-dns if we have to.

While doing so, replace the validation with a more efficient version that invokes as few kubectl execs within loops as possible and parallelizes operations where possible.

Given that Cilium is typically re-deployed before this logic is executed, the slowest path is typically the service plumbing. Wait for the Kubernetes DNS service to be plumbed, with an aggressive timeout, so that the restart is avoided in the common case.

Example output:

```
STEP: Checking if kube-dns service is plumbed correctly
STEP: Checking if pods have identity
STEP: Checking if DNS can resolve
STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-b9dcp: ClusterIP 10.71.240.10 not found in service list of cilium pod cilium-b9dcp
STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-fm9qm: ClusterIP 10.71.240.10 not found in service list of cilium pod cilium-fm9qm
STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-mxjvw: ClusterIP 10.71.240.10 not found in service list of cilium pod cilium-mxjvw
STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-fm9qm: ClusterIP 10.71.240.10 not found in service list of cilium pod cilium-fm9qm
STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-mxjvw: ClusterIP 10.71.240.10 not found in service list of cilium pod cilium-mxjvw
STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-b9dcp: ClusterIP 10.71.240.10 not found in service list of cilium pod cilium-b9dcp
STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-fm9qm: ClusterIP 10.71.240.10 not found in service list of cilium pod cilium-fm9qm
STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-b9dcp: ClusterIP 10.71.240.10 not found in service list of cilium pod cilium-b9dcp
STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-mxjvw: ClusterIP 10.71.240.10 not found in service list of cilium pod cilium-mxjvw
STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-b9dcp: ClusterIP 10.71.240.10 not found in service list of cilium pod cilium-b9dcp
STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-mxjvw: ClusterIP 10.71.240.10 not found in service list of cilium pod cilium-mxjvw
STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-fm9qm: ClusterIP 10.71.240.10 not found in service list of cilium pod cilium-fm9qm
STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-b9dcp: ClusterIP 10.71.240.10 not found in service list of cilium pod cilium-b9dcp
STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-fm9qm: ClusterIP 10.71.240.10 not found in service list of cilium pod cilium-fm9qm
STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-mxjvw: ClusterIP 10.71.240.10 not found in service list of cilium pod cilium-mxjvw
STEP: Kubernetes DNS is not ready: ClusterIP 10.71.240.10 not found in service list of cilium pod cilium-mxjvw
STEP: Restarting Kubernetes DSN (-l k8s-app=kube-dns)
STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-mxjvw: unable to find service backend 10.68.1.228:53 in datapath of cilium pod cilium-mxjvw
STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-b9dcp: unable to find service backend 10.68.1.228:53 in cilium pod cilium-b9dcp
STEP: Checking service kube-system/kube-dns plumbing in cilium pod cilium-fm9qm: unable to find service backend 10.68.1.228:53 in cilium pod cilium-fm9qm
STEP: Waiting for Kubernetes DNS to become operational
STEP: Checking if deployment is ready
STEP: Checking if kube-dns service is plumbed correctly
STEP: Checking if pods have identity
STEP: Checking if DNS can resolve
```

Signed-off-by: Thomas Graf <thomas@cilium.io>
Instead of restarting all kube-system pods, restart the unmanaged pods.

Signed-off-by: Thomas Graf <thomas@cilium.io>
2cca7f6 to 8694b8d
Deploy preview for cilium-docs failed. Built with commit 8694b8d https://app.netlify.com/sites/cilium-docs/deploys/5ec6c950db57fb00075abd64
test-me-please
Instead of always restarting the kube-dns deployment, split the validation of the installation into a separate function so we can validate the deployment and only restart kube-dns if we have to.
While doing so, replace the validation with a more efficient version that invokes as few kubectl execs within loops as possible and parallelizes operations where possible.
Given that Cilium is typically re-deployed before this logic is executed, the slowest path is typically the service plumbing. Wait for the Kubernetes DNS service to be plumbed, with an aggressive timeout, so that the restart is avoided in the common case.
Example output:
Before: ~1min
After: ~10s
Fixes: #10980