Cilium in EKS without kube-proxy #10462
Comments
Notes on testing this from @brb and @nebril:
|
The current EKS AMI (e.g. ami-0907724389e8705d9 in us-west-2) ships kernel version 4.14 (4.14.165-133.209.amzn2.x86_64 to be exact), and 4.19 will be required. There is a package for it, but installing it would require a node reboot or a custom AMI. |
@errordeveloper Do you know which 4.19 version exactly (asking, as we require 4.19.57 or newer)? |
They call it 4.19.84-33.70.amzn2. |
Nice, then all features of the kube-proxy replacement should work. |
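For anyone checking their own nodes against the 4.19.57 minimum mentioned above, here is a minimal sketch that compares a kernel release string (as printed by `uname -r`) with that threshold. The helper names `parse_kernel_release` and `meets_minimum` are made up for illustration and are not part of Cilium. The sketch looks only at the leading major.minor.patch numbers and ignores distro suffixes such as `-33.70.amzn2`.

```python
import re

# Minimum kernel stated in this thread for the kube-proxy replacement.
MIN_KERNEL = (4, 19, 57)


def parse_kernel_release(release):
    """Return (major, minor, patch) from a `uname -r` style string.

    Hypothetical helper: only the leading numeric components are read,
    and a missing patch component is treated as 0.
    """
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) if part else 0 for part in match.groups())


def meets_minimum(release, minimum=MIN_KERNEL):
    """True if the release is at least the given (major, minor, patch)."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    # Release strings quoted in this thread.
    for release in (
        "4.14.165-133.209.amzn2.x86_64",
        "4.19.84-33.70.amzn2",
        "4.15.0-1057-aws",
    ):
        print(release, meets_minimum(release))
```

Of the kernels quoted in this thread, only the 4.19.84 one passes this check. A real check would be run on each node with the output of `uname -r`.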
I guess we could create an AMI, if we wanna show it working in demo or blog post. I'll check Ubuntu AMIs also. |
Probably we don't want to maintain an AMI if there is one with the required kernel version. |
Yeah, I wouldn't suggest creating one that will be maintained, more just as one-off, presumably new kernels will ship as default in AL2 AMIs soon enough. |
So the Ubuntu AMI is 18.04.3 and it has kernel version 4.15.0-1057-aws. 18.04.4, which was released last month, actually ships with 5.3, so that might come to EKS sometime soon (but you never know; Canonical has been updating EKS AMIs very proactively in my experience). |
About disable-the-aws-node-daemonset-eks-only documentation, why replace the image instead of just deleting the whole daemonset? Same for kube-proxy and CoreDNS, they are just "add-ons" applied to the cluster when it's created. |
This was actually fixed in #10461 earlier, the docs from master haven't been published yet. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. |
I'm an EKS user with problems installing Cilium so dropping notes here after talking with @tgraf in Slack. We're using the latest official AMI but it does run through a userdata script to set node labels, install nessus agent, and probably most relevant, runs Installing Cilium results in intermittent connection failures and latency to different systems. I did confirm drops with Cilium itself:
And the connectivity test results:
After some experimenting in a test cluster, we narrowed this down to NodeLocalDNS which seems to be a known issue on recent kernels where Taking this to my real clusters, I did some quick testing. Setting I'm a little stumped why NodeLocalDNS seemed to be a key factor in the test cluster but removing it seemed irrelevant in the real one. Are there any changes that could persist even through node replacement? In the past, I've had Istio and Calico on this cluster as well, but neither are currently installed. Environment |
@jaygorrell Thanks. Can you paste the following from the node which reported the drops:
|
Just flipped it back to
|
@jaygorrell Can you please provide a sysdump? |
During business hours I have to keep things in working order so I have put things back into a working state for now. That means I installed NodeLocalDNS again and set If you do indeed need a dump from the cluster when it's in the busted state, let me know if you need it without NodeLocalDNS installed. Since things break for me regardless of that if |
Yes, please w/o node-local DNS. |
(potentially also related: #10645) |
To get this I uninstalled NodeLocalDNS, removed the nodelocaldns kubelet arg, and replaced each host. At first I didn't have problems but had forgotten to flip |
@jaygorrell Thanks for the sysdump. Can you list some |
@jaygorrell The failures in #10462 (comment) are the same as in #12824. I'm currently working on the fix. |
Should have been fixed via #14201, closing. |
Proposal / RFE
Is your feature request related to a problem?
Cilium can replace kube-proxy, so it should be possible to do this in EKS.
Describe the solution you'd like
Verify it works, update EKS documentation to show how to run Cilium without kube-proxy.