Cilium v1.14.2 with Kubernetes v1.28 is unstable #27982
Comments
When this occurs on a node, the kubernetes Service cluster IP can't be used from the host, which aligns with what Pods see.
Once Cilium agents start (using the workaround above), the same curl command works from hosts. Seems like Cilium having run on the node in the past interferes with IPVS functionality.
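(For reference, a minimal way to check this from an affected node; 10.3.0.1 is this cluster's `kubernetes` Service ClusterIP per the issue description, so adjust for your own cluster.)

```sh
# From the host: curl the in-cluster kubernetes Service ClusterIP.
# A 401/403 response still proves connectivity; on affected nodes the
# connection simply times out instead.
curl -k --connect-timeout 5 https://10.3.0.1:443/version

# From a pod (may schedule on another node unless pinned), behavior matches the host.
kubectl run curl-check --rm -it --restart=Never --image=curlimages/curl \
  --command -- curl -k --connect-timeout 5 https://10.3.0.1:443/version
```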
Thanks for this issue @dghubble, and especially for the great investigation. Cilium 1.14 actually only supports up to Kubernetes 1.27 - the client upgrade has only been merged into
This actually happens after a node reboot. I am using a k3s cluster with kured to have automatic reboots after updates. It won't be able to access kube-dns anymore, so it cannot reach anything.
It looks like there may be two issues in play here: An upstream issue (kubernetes/kubernetes#120247), and possibly #27848. The upstream issue fix is in and will be included in Kubernetes 1.28.2, due out soon, and the other investigation is ongoing at #27848.
@dghubble @Silvest89 would you be able to test it again with Kubernetes 1.28.2? Thank you
```
cicd-kub-control-01:/home/icce# cilium status
DaemonSet    cilium    Desired: 3, Unavailable: 3/3
cicd-kub-control-01:/home/icce# kubectl get nodes
NAME    STATUS    ROLES    AGE    VERSION
cicd-kub-control-01:/home/icce# kubectl -n kube-system logs cilium-9ch5r
```
Preliminarily, on a Kubernetes v1.28.2 cluster, I've not been able to reproduce the issue. After restarting nodes (which I suspected was the trigger before), Cilium can reach the apiserver just fine. I observed the original issue in real production clusters though, after several days of use, so I'll have more confidence in a few days.
Thank you for the feedback! Let's leave it in
This seems to be a duplicate of #27900; should we close it @aanm @julianwiedmann?
I've seen this occur once on a new cluster with Kubernetes v1.28.2 and Cilium v1.14.2. Most clusters have been fine since those upgrades. Is there anything specific I should be collecting to confirm it's the same issue? Unfortunately, I usually have to apply mitigations asap and can't afford to leave clusters in this broken state for long.
This issue can still happen. I've had to explicitly set a `KUBERNETES_SERVICE_HOST`.
This can take days of real-world usage to become evident. Fresh clusters looked fine, but they're not fine. |
@dghubble that could be related to an issue with kube-proxy and not Cilium itself. As you pointed out, connecting directly to an external DNS record works, but not via the cluster IP, for which kube-proxy does the service translation.
@squeed 👋🏻 long time! Yeah, my suspicion is that it's related to this overlapping responsibility kube-proxy and Cilium have for managing the apiserver's own Kubernetes Service traffic.

@tedli The workaround for this issue is giving Cilium explicit IP addresses for the apiserver (undesired). If you're seeing issues in that case, you're probably describing a separate issue.
K8s v1.28.0 causes the following regression: #27982. Most noticeably, this has been causing k8s conformance test failures. Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com>
I saw it recur on one node today! A Cilium agent pod was unable to reach kube-apiserver. My usual workaround is to modify the Cilium DaemonSet to explicitly set a `KUBERNETES_SERVICE_HOST`.

Bugtool: https://storage.googleapis.com/dghubble/bugtool.tar.gz (too big for GitHub)
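(For completeness, a sketch of that DaemonSet workaround; `apiserver.example.internal` below is a placeholder, not an address from this cluster.)

```sh
# Point the Cilium agents directly at a kube-apiserver address,
# bypassing the in-cluster kubernetes Service ClusterIP.
kubectl -n kube-system set env daemonset/cilium \
  KUBERNETES_SERVICE_HOST=apiserver.example.internal \
  KUBERNETES_SERVICE_PORT=6443

# Wait for the agents to roll out and become Ready again.
kubectl -n kube-system rollout status daemonset/cilium
```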
So, looking at the bugtool, I see the following set of backends for the apiserver service:
As of the time of the bugtool, are those correct? My theory is that Cilium is missing changes to the

Separately, #25169 may also be relevant to this.
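(A hedged sketch of how one might compare the two views; the pod name is a placeholder, and the in-agent `cilium` CLI may be named or located differently per install.)

```sh
# What the apiserver itself advertises as kubernetes Service backends.
kubectl get endpoints kubernetes -o yaml

# What the Cilium agent on the suspect node believes the backends are.
kubectl -n kube-system exec <cilium-pod-on-affected-node> -c cilium-agent -- \
  cilium service list
```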
Yeah, this looks odd. Those two backend IPs are from the Pod CIDR range (
I'm not sure what the Pod IPs Cilium was seeing correspond to anymore. Cilium now shows the right backend.
Interesting that the apiserver restarted 33h ago, but maybe a coincidence. And only Cilium on one node got into this bad state. The prior apiserver logs show it clearing the
Btw, I've preferred Cilium using in-cluster discovery. In theory, Cilium could just read the
@dghubble makes perfect sense; the ultimate solution may be to add a flag disabling socket-lb for host-netns processes (or perhaps just the Cilium agent). Then Cilium would receive load balancing from kube-proxy, and pods would use socket-lb. WDYT?
I suspect that would fix this situation. Did Cilium 1.13 with partial kube-proxy previously work this way? It's odd this became a problem so recently. |
* With Cilium v1.14, Cilium's kube-proxy partial mode changed to either be enabled or disabled (not partial). This sometimes leaves Cilium (and the host) unable to reach the kube-apiserver via the in-cluster Kubernetes Service IP, until the host is rebooted
* As a workaround, configure Cilium to rely on external DNS resolvers to find the IP address of the apiserver. This is less portable and less "clean" than using in-cluster discovery, but also what Cilium wants users to do. Revert this when the upstream issue cilium/cilium#27982 is resolved
@dghubble I've been running my cluster for more than a month now, haven't run into any issues. Cluster bootstrapped using k3s on Hetzner Cloud |
I believe 1.14 brings changes to the socket-lb, but that’s not my area of expertise. @aditighag, any pointers? |
I was fixated on this issue for a day before I found this thread. I have a problem with what this means for a highly-available setup, because I then have to address the 'outside' load balancer for 'in-cluster' traffic. I do not like it, and I cannot imagine this is the way it is meant to be. We build in PKI and an overlay to secure all communication, then we break it open to talk over the 'outside' infrastructure network instead of using the built-in overlay Kubernetes service. I cannot believe this.
Issue is older. Also 1.13.9 breaks when configuring OIDC on api-server... |
I can concur it is a 1.28 issue. Provisioning k8s 1.27, Cilium does not break after configuring OIDC. By breaking, I refer to the familiar issue: unable to contact the k8s api-server / Forbidden 10.2.0.1;6443.
This issue has been automatically marked as stale because it has not had recent activity.
This issue has not seen any activity since it was marked stale. |
I ultimately had to adapt our Kubernetes distro to tell Cilium the DNS name resolving to any of the apiservers. The approach and resolver vary based on the cloud provider. It's a shame; Cilium used to support the in-cluster kubernetes ClusterIP, but now it effectively relies on an external resolver.
Is there an existing issue for this?
What happened?
Starting in Cilium v1.14.0 on Kubernetes v1.28.1, Cilium agents can lose connection to `kube-apiserver` when using `kube-proxy` and the kubernetes service ClusterIP. This looks closely related to #27900.

Cilium supports hybrid modes in which Cilium can coexist with `kube-proxy` while performing some or all of its responsibilities (e.g. there are reasons one might not wish to remove kube-proxy). Cilium v1.14 removed the `kube-proxy-replacement` `partial` mode and changed it to either `true` or `false`. But something else appears to have changed:

Consider a cluster with a `kube-proxy` daemonset. `kube-proxy` uses `ipvs` to load balance the default `kubernetes` service ClusterIP to a kube-apiserver endpoint. Cilium agents respect the default `KUBERNETES_SERVICE_HOST` by default (10.3.0.1), which usually works fine.

But I've noticed there is a (yet unknown) sequence of events whereby connectivity to the kubernetes service Cluster IP breaks on certain nodes. This can happen after days of otherwise running normally. I think it's related to node restarts because I see it more on spot instances. The result is that the Cilium agent on those nodes crashloops, unable to reach the apiserver.
Workaround
The workaround is updating Cilium agent to have an explicit kube-apiserver IP address or DNS record in a `KUBERNETES_SERVICE_HOST` environment variable, but this should not be necessary and is undesired. Workloads (including Cilium agent) on clusters with `kube-proxy` should be able to use in-cluster service discovery.

I suspect the wrinkle here is that Cilium itself can interact with Kubernetes Service mappings. That or something about Kubernetes v1.28 itself.
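(For illustration only, a hedged sketch of that workaround via the Cilium Helm chart's `k8sServiceHost`/`k8sServicePort` values; the address shown is a placeholder.)

```sh
# Give Cilium an explicit apiserver address instead of the in-cluster ClusterIP.
helm upgrade cilium cilium/cilium \
  --namespace kube-system \
  --reuse-values \
  --set k8sServiceHost=apiserver.example.internal \
  --set k8sServicePort=6443
```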
Scope
I've observed this with KubeProxyReplacement `false` (enabling the individual features) and with KubeProxyReplacement `true` (enabled). Neither mode is related to the fix.
Cilium Version
Cilium v1.14.0, v1.14.1
Kernel Version
Linux ip-10-0-11-132 6.4.7-200.fc38.aarch64 #1 SMP PREEMPT_DYNAMIC Thu Jul 27 20:22:11 UTC 2023 aarch64 GNU/Linux
Kubernetes Version
Kubernetes v1.28.1
Sysdump
No response
Relevant log output
No response
Anything else?
The fallback here should be kube-proxy's IPVS, which does program the right LVS rules
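(A quick check of that fallback, assuming `ipvsadm` is installed on the node and 10.3.0.1:443 is the kubernetes ClusterIP as above.)

```sh
# The kubernetes ClusterIP virtual server should list real kube-apiserver
# endpoints as its destinations if kube-proxy programmed IPVS correctly.
sudo ipvsadm -Ln | grep -A 3 "10.3.0.1:443"
```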
Code of Conduct