Cilium broken on Amazon Linux 2 and RHEL8 #12429

Closed
rifelpet opened this issue Sep 27, 2021 · 7 comments · Fixed by #12504
Labels
blocks-next, kind/bug

Comments

@rifelpet
Member

/kind bug

copied from #12141 (comment)

https://testgrid.k8s.io/kops-grid#kops-grid-cilium-rhel8-k22-containerd

kubelet logs report:

E0916 00:19:04.290207 18754 nestedpendingoperations.go:301] Operation for "{volumeName:kubernetes.io/host-path/23431797-5c17-4404-b61b-33d2e8ec3d54-cilium-cgroup podName:23431797-5c17-4404-b61b-33d2e8ec3d54 nodeName:}" failed. No retries permitted until 2021-09-16 00:19:36.290185341 +0000 UTC m=+151.386939478 (durationBeforeRetry 32s). Error: MountVolume.SetUp failed for volume "cilium-cgroup" (UniqueName: "kubernetes.io/host-path/23431797-5c17-4404-b61b-33d2e8ec3d54-cilium-cgroup") pod "cilium-8rdpr" (UID: "23431797-5c17-4404-b61b-33d2e8ec3d54") : mkdir /sys/fs/cgroup/unified: read-only file system

We could mount cgroup2 explicitly somewhere else as well, like we used to with bpffs. Or we could drop support for distros that are not that modern.

In #11696 we deprecated CentOS 8 but not RHEL 8. AL2 is still the most modern version of Amazon Linux, so I don't think we should be deprecating it. I think it would be best if we mount cgroup2 elsewhere.

k8s-ci-robot added the kind/bug label on Sep 27, 2021
@rifelpet
Member Author

rifelpet commented Sep 27, 2021

Also we may want to consider this blocking 1.22 given the change was introduced in 1.22 and is breaking clusters for supported distros.

@olemarkus
Member

I agree. I'll revive the mount functionality and have it mount cgroup2 at the default cilium location (/run/cilium/cgroupv2).
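For readers following along, here is a rough sketch of the check behind that idea: look through the contents of /proc/mounts and decide whether a cgroup2 filesystem is already mounted at the target path. This is not the kOps implementation. The function names and the sample mount table are made up for illustration, and the sample only approximates what a cgroup v1 host might show.

```python
# Hypothetical helper names; not taken from kOps or cilium.
DEFAULT_TARGET = "/run/cilium/cgroupv2"  # path mentioned in the comment above


def find_cgroup2_mounts(mounts_text):
    """Return every mount point whose filesystem type is cgroup2."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts columns: device, mount point, fs type, options, dump, pass
        if len(fields) >= 3 and fields[2] == "cgroup2":
            points.append(fields[1])
    return points


def needs_cgroup2_mount(mounts_text, target=DEFAULT_TARGET):
    """True when no cgroup2 filesystem is mounted at the target path."""
    return target not in find_cgroup2_mounts(mounts_text)


# Illustrative mount table for a cgroup v1 host (assumed, not captured
# from a real node). /sys/fs/cgroup is a read-only tmpfs here, which is
# consistent with the "mkdir /sys/fs/cgroup/unified: read-only file system"
# error in the kubelet log.
SAMPLE_CGROUP_V1_MOUNTS = """\
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,name=systemd 0 0
bpf /sys/fs/bpf bpf rw,nosuid,nodev,noexec,relatime,mode=700 0 0
"""

if __name__ == "__main__":
    # On a real node you would pass open("/proc/mounts").read() instead.
    print(needs_cgroup2_mount(SAMPLE_CGROUP_V1_MOUNTS))
```

If the check reports that a mount is needed, the mount itself would be something along the lines of `mount -t cgroup2 none /run/cilium/cgroupv2`, run as root after creating the directory.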

@olemarkus
Member

RHEL8's kernel is too old for cilium, but #12431 should take care of the AL2 5.10 kernel.

@rifelpet
Member Author

Here are the cilium docs mentioning the kernel version requirements for some of the more advanced features. That page also mentions an overall requirement of 4.9 and that RHEL8 is supported. Is our manifest using the advanced features by default?
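As a side note, a kernel requirement like the 4.9 figure above can be checked against the string reported by `uname -r`. The sketch below is only an assumption about how one might do that. The helper names are hypothetical, the release strings are illustrative rather than copied from a real node, and a plain version comparison can mislead on distros such as RHEL that backport features into older kernel versions.

```python
import re


def parse_kernel_release(release):
    """Turn a `uname -r` style string into a (major, minor, patch) tuple."""
    match = re.match(r"^(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    major, minor, patch = match.groups()
    # A missing patch component is treated as 0.
    return (int(major), int(minor), int(patch or 0))


def meets_minimum(release, minimum):
    """True when the running kernel is at least the given (major, minor, patch)."""
    return parse_kernel_release(release) >= minimum


if __name__ == "__main__":
    overall_minimum = (4, 9, 0)  # the overall requirement quoted in this thread
    # Illustrative release strings for an AL2 5.10 kernel and a RHEL 8 kernel.
    for release in ("5.10.62-55.141.amzn2.x86_64", "4.18.0-305.el8.x86_64"):
        print(release, meets_minimum(release, overall_minimum))
```

Individual features may need a newer kernel than the overall minimum, so the threshold is left as a parameter rather than hard-coded.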

@olemarkus
Member

olemarkus commented Sep 28, 2021

I am not sure I'd say this is an "advanced feature" anymore as it has been the default in cilium for quite some time, and for kOps as well. But "Kubernetes Without kube-proxy" is enabled by default (on new clusters) as of kOps 1.19.

@rifelpet
Member Author

In that case, maybe we add a 1.22 release note mentioning that cilium + RHEL8 is not supported. We can skip those grid jobs too (and add AL2).

@johngmyers
Member

@rifelpet is the current release note sufficient?

4 participants