fix cgroup2 support #2844
Conversation
Opened #2845 to request cgroup v2 CI infrastructure.
Fix issue #900: cgroup2 support was introduced in PR #2584, but got broken in f3de60f. It was failing with:
F1210 19:13:37.305388 4955 server.go:181] cannot set feature gate SupportPodPidsLimit to false, feature is locked to true
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Force-pushed from 2d6aa3a to 127b96e
This looks fine to me, but we really need @Oats87 to review.
Overall, this PR looks good to me, but I do have some concerns around functionality. As far as I can tell this will not break our existing cgroups v1 support, and so should be safe to merge, although we should do regression testing to verify this.
@@ -182,13 +184,34 @@ func checkCgroups() (kubeletRoot, runtimeRoot string, hasCFS, hasPIDs bool) {
	}
	defer f.Close()

+	v2 := cgroups.Mode() == cgroups.Unified
I am not very familiar with github.com/containerd/cgroups behavior -- is it possible that we'll receive a cgroups.Unified result from cgroups.Mode() even if we don't have systemd as our init process? i.e. on a system with unified cgroups but no systemd.
Yes, that is possible, as long as cgroup2 is mounted on /sys/fs/cgroup.
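In other words, the detection is purely about the mount layout, not the init system. The sketch below illustrates that idea with a hypothetical helper (not the actual containerd/cgroups code, which uses statfs on /sys/fs/cgroup): classify a host from `mount`-style output lines like the ones pasted later in this thread.

```go
package main

import (
	"fmt"
	"strings"
)

// cgroupMode classifies the host's cgroup setup from "mount"-style lines.
// Rough sketch of what cgroups.Mode() decides: "unified" when cgroup2 is
// mounted directly on /sys/fs/cgroup, "hybrid" when cgroup2 is mounted
// elsewhere (e.g. /sys/fs/cgroup/unified) alongside v1 hierarchies, and
// "legacy" when no cgroup2 mount exists. No systemd is required.
func cgroupMode(mountLines []string) string {
	for _, l := range mountLines {
		f := strings.Fields(l)
		// expected shape: "<src> on <mountpoint> type <fstype> (<opts>)"
		if len(f) < 5 || f[3] != "type" {
			continue
		}
		mountpoint, fstype := f[2], f[4]
		if fstype == "cgroup2" {
			if mountpoint == "/sys/fs/cgroup" {
				return "unified"
			}
			return "hybrid"
		}
	}
	return "legacy"
}

func main() {
	fedora := []string{
		"cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,seclabel,nsdelegate)",
	}
	devuan := []string{
		"tmpfs on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,mode=755)",
		"cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)",
	}
	fmt.Println(cgroupMode(fedora)) // unified
	fmt.Println(cgroupMode(devuan)) // hybrid
}
```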
@AkihiroSuda do you know if there is a non-systemd based system that only supports cgroups v2 that we could test this on?
OpenRC seems to support cgroup v2 (aka "unified mode") as well:
https://github.com/OpenRC/openrc/blob/170ce2624a98952e6eb871cd9915c7c1b76ab887/etc/rc.conf#L194-L199
I couldn't actually find an OpenRC-based distro that uses unified mode. Fedora 33 uses unified mode with systemd, and k3s works fine after applying this PR:
[sysadm@fedora01 ~]$ mount | grep cgroup
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,seclabel,nsdelegate)
Run as a systemd unit:
[sysadm@fedora01 ~]$ systemd-cgls
...
└─system.slice
├─k3s.service
│ ├─3742 /usr/local/bin/k3s server
│ └─3755 containerd
Run from a shell:
[sysadm@fedora01 ~]$ systemd-cgls
...
├─user.slice
│ └─user-1000.slice
│ ├─user@1000.service
│ │ └─init.scope
│ │ ├─747 /usr/lib/systemd/systemd --user
│ │ └─749 (sd-pam)
│ ├─session-3.scope
│ │ ├─ 773 sshd: sysadm [priv]
│ │ ├─ 777 sshd: sysadm@pts/0
│ │ ├─ 778 -bash
│ │ ├─ 903 sudo -i
│ │ ├─ 905 -bash
│ │ ├─1842 tee k3s.log
│ │ ├─1864 containerd
│ │ ├─2273 /var/lib/rancher/k3s/data/8e9ca8da90942295d88eb1e4277ae10c3cc9b17b43853382ee1d0dd19e4caa10/bin/containerd-shim-runc-v2 -namespace k8s.io -id 605179ece3>
│ │ ├─2300 /var/lib/rancher/k3s/data/8e9ca8da90942295d88eb1e4277ae10c3cc9b17b43853382ee1d0dd19e4caa10/bin/containerd-shim-runc-v2 -namespace k8s.io -id 8549898bd0>
│ │ ├─2330 /var/lib/rancher/k3s/data/8e9ca8da90942295d88eb1e4277ae10c3cc9b17b43853382ee1d0dd19e4caa10/bin/containerd-shim-runc-v2 -namespace k8s.io -id 566e62505e>
│ │ ├─2924 /var/lib/rancher/k3s/data/8e9ca8da90942295d88eb1e4277ae10c3cc9b17b43853382ee1d0dd19e4caa10/bin/containerd-shim-runc-v2 -namespace k8s.io -id 41bac9411d>
│ │ └─3036 /var/lib/rancher/k3s/data/8e9ca8da90942295d88eb1e4277ae10c3cc9b17b43853382ee1d0dd19e4caa10/bin/containerd-shim-runc-v2 -namespace k8s.io -id fac53cab8b>
└─k3s
└─1841 /usr/local/bin/k3s server
I tried Devuan Beowulf with OpenRC and it is in hybrid mode with all the controllers running under v2 by default. I tried editing /etc/rc.conf but it appears that none of the settings in there make any difference - I can't put it into legacy OR unified mode, or move any of the required controllers from v2 to v1.
sysadm@devuan01:~$ mount | grep cgroup
tmpfs on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,mode=755)
cgroup2 on /sys/fs/cgroup/unified type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate)
cgroup on /sys/fs/cgroup/elogind type cgroup (rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/lib/elogind/elogind-cgroups-agent,name=elogind)
K3s does NOT work on Devuan (with or without this PR) because we don't support hybrid mode with the controllers under v2 - as currently implemented it only works if the pids and cgroups controllers are in v1 (legacy or hybrid mode), or if we are in 'pure' unified mode.
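The key fact behind the hybrid-mode problem is that a controller is bound to exactly one hierarchy: anything listed in the v2 mount's cgroup.controllers file is unavailable to v1 consumers. A hypothetical helper (for illustration, not from the k3s codebase) that parses that file's single space-separated line:

```go
package main

import (
	"fmt"
	"strings"
)

// v2Controllers parses the contents of a cgroup.controllers file (a single
// space-separated line, e.g. from /sys/fs/cgroup/unified/cgroup.controllers
// on a hybrid host) into a set. Controllers in this set are bound to the v2
// hierarchy and therefore NOT usable through v1 -- a controller can only be
// active in one hierarchy at a time.
func v2Controllers(contents string) map[string]bool {
	set := map[string]bool{}
	for _, c := range strings.Fields(contents) {
		set[c] = true
	}
	return set
}

func main() {
	// Example value as it might appear on a host with everything on v2.
	ctrls := v2Controllers("cpuset cpu io memory hugetlb pids")
	fmt.Println(ctrls["pids"], ctrls["cpu"]) // true true -> v1-only code won't find them
	fmt.Println(ctrls["net_cls"])            // false -> not bound to v2 in this example
}
```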
@brandond FYI. I built a version of k3OS that combines this PR with OpenRC in "unified" mode. You can find it at: https://github.com/olljanat/k3os/releases/tag/rootless-cgroup-v2-test
AFAIU cgroup is using v2 mode and k3s still works:
k3os-11941 [~]$ ls -l /sys/fs/cgroup/
total 0
drwxr-xr-x 2 root root 0 Jan 25 22:04 acpid
drwxr-xr-x 2 root root 0 Jan 25 22:04 ccapply
-r--r--r-- 1 root root 0 Jan 25 22:04 cgroup.controllers
-rw-r--r-- 1 root root 0 Jan 25 22:04 cgroup.max.depth
-rw-r--r-- 1 root root 0 Jan 25 22:04 cgroup.max.descendants
-rw-r--r-- 1 root root 0 Jan 25 22:04 cgroup.procs
-r--r--r-- 1 root root 0 Jan 25 22:04 cgroup.stat
-rw-r--r-- 1 root root 0 Jan 25 22:11 cgroup.subtree_control
-rw-r--r-- 1 root root 0 Jan 25 22:04 cgroup.threads
drwxr-xr-x 2 root root 0 Jan 25 22:04 connman
-rw-r--r-- 1 root root 0 Jan 25 22:04 cpu.pressure
-r--r--r-- 1 root root 0 Jan 25 22:04 cpuset.cpus.effective
-r--r--r-- 1 root root 0 Jan 25 22:04 cpuset.mems.effective
drwxr-xr-x 2 root root 0 Jan 25 22:04 dbus
drwxr-xr-x 2 root root 0 Jan 25 22:04 haveged
-rw-r--r-- 1 root root 0 Jan 25 22:04 io.cost.model
-rw-r--r-- 1 root root 0 Jan 25 22:04 io.cost.qos
-rw-r--r-- 1 root root 0 Jan 25 22:04 io.pressure
drwxr-xr-x 2 root root 0 Jan 25 22:04 iscsid
drwxr-xr-x 2 root root 0 Jan 25 22:04 issue
drwxr-xr-x 2 root root 0 Jan 25 22:04 k3s-service
drwxr-xr-x 4 root root 0 Jan 25 22:04 kubepods
-rw-r--r-- 1 root root 0 Jan 25 22:04 memory.pressure
drwxr-xr-x 2 root root 0 Jan 25 22:04 sshd
drwxr-xr-x 2 root root 0 Jan 25 22:04 syslog
drwxr-xr-x 2 root root 0 Jan 25 22:04 udev
k3os-11941 [~]$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
k3os-11941 Ready control-plane,master 7m41s v1.20.2+k3s-127b96e8
@olljanat if you are confirming that this works with OpenRC in unified mode then I'm satisfied. We can open another issue to track hybrid mode with the cpu,pids controllers in v2.
"hybrid" mode is almost meaningless in the container ecosystem, because a controller cannot be enabled for both v1 and v2 simultaneously.
Yeah hybrid mostly just means we should check v2 and fall back to v1. Right now we just assume they are available in v1 in hybrid mode.
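The "check v2 and fall back to v1" idea can be sketched as a lookup over two controller sets, one populated from the v2 mount's cgroup.controllers and one from the v1 subsystems in /proc/cgroups. This is a hypothetical sketch of the proposed fallback, not how the current checkCgroups() is structured:

```go
package main

import "fmt"

// hasController reports whether a controller is usable, checking the v2 set
// first and falling back to the v1 set. In hybrid mode a controller lives in
// exactly one of the two, so checking both covers every layout.
func hasController(name string, v2, v1 map[string]bool) bool {
	if v2[name] {
		return true
	}
	return v1[name]
}

func main() {
	// Devuan-style hybrid host: cpu and pids moved to v2, cpuset left in v1.
	v2 := map[string]bool{"cpu": true, "pids": true}
	v1 := map[string]bool{"cpuset": true}
	fmt.Println(hasController("pids", v2, v1))   // true (found in v2)
	fmt.Println(hasController("cpuset", v2, v1)) // true (v1 fallback)
	fmt.Println(hasController("io", v2, v1))     // false (in neither)
}
```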
If the issue is that, without this PR, k3s fails to start when cgroup v2 is used, then I can confirm that it works with OpenRC in unified mode. Or is there something else I should check?
@@ -197,7 +220,7 @@ func checkCgroups() (kubeletRoot, runtimeRoot string, hasCFS, hasPIDs bool) {
			if _, err := os.Stat(p); err == nil {
				hasCFS = true
			}
-		} else if system == "name=systemd" {
+		} else if system == "name=systemd" || v2 {
When on a non-systemd system but with unified cgroups, is it OK that we aren't setting kubeletRoot if we don't have a .scope?
Honestly, I'm not familiar with the kubeletRoot stuff, but this PR matches the v1 behavior.
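For context on the .scope question: the root is derived from the process's own cgroup path as reported by /proc/self/cgroup, and only a systemd .scope leaf triggers setting it. A hypothetical helper illustrating that shape (for discussion only, not the actual k3s implementation):

```go
package main

import (
	"fmt"
	"strings"
)

// kubeletRootFromCgroup takes one line of /proc/self/cgroup (v2 format:
// "0::/<path>") and returns the parent cgroup to use as the kubelet root
// only when the process sits inside a systemd .scope; otherwise it returns
// "" so the kubelet keeps its default root.
func kubeletRootFromCgroup(line string) string {
	parts := strings.SplitN(line, ":", 3)
	if len(parts) != 3 {
		return ""
	}
	path := parts[2]
	leaf := path[strings.LastIndex(path, "/")+1:]
	if strings.HasSuffix(leaf, ".scope") {
		return strings.TrimSuffix(path, "/"+leaf)
	}
	return "" // non-systemd (e.g. OpenRC): no .scope, leave root unset
}

func main() {
	fmt.Println(kubeletRootFromCgroup("0::/user.slice/user-1000.slice/session-3.scope"))
	fmt.Println(kubeletRootFromCgroup("0::/k3s") == "") // true: no .scope leaf
}
```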
Would it be possible to get CI for this? Opened #2845. GitHub Actions with Vagrant can be used for smoke testing, though it is probably too heavy for the entire sonobuoy testing.
Would it be possible to expedite a new release to include #2844?
We will probably just wait for the next upstream patch release: #2857
I know it's too late; I just want clarification: once enabled, will these features be supported on the latest k3s with the latest Kubernetes, or will support follow the usual last three or four Kubernetes releases? Like, for example, will it be available on
Proposed Changes
cgroup2 support was introduced in #2584, but got broken in f3de60f
It was failing with "F1210 19:13:37.305388 4955 server.go:181] cannot set feature gate SupportPodPidsLimit to false, feature is locked to true"
Types of Changes
BugFix
Verification
sudo k3s server on Ubuntu 20.10 with cgroup2
Linked Issues
Fix #900
Further Comments