systemctl status shows wrong cgroup tree #21945

Closed
codicodi opened this issue Dec 30, 2021 · 7 comments

@codicodi
Contributor

systemd version the issue has been seen with

systemd 250 (250-4-arch)
+PAM +AUDIT -SELINUX -APPARMOR -IMA +SMACK +SECCOMP +GCRYPT +GNUTLS +OPENSSL +ACL +BLKID +CURL +ELFUTILS +FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBFDISK +PCRE2 -PWQUALITY +P11KIT -QRENCODE +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK +XKBCOMMON +UTMP -SYSVINIT default-hierarchy=unified

Used distribution

Arch Linux

Linux kernel version used (uname -a)

Linux enterprise 5.15.12-zen1-1-zen #1 ZEN SMP PREEMPT Wed, 29 Dec 2021 12:04:52 +0000 x86_64 GNU/Linux

CPU architecture issue was seen on

x86-64

Expected behaviour you didn't see

systemctl status shows tree of slices/scopes/sessions/services

Unexpected behaviour you saw
image

After updating from 249.7, systemctl status started showing a flat process list under net_cls. As far as I know, my VPN software [1] uses net_cls cgroups to control split tunneling.

$ mount | grep cgroup
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot)
net_cls on /sys/fs/cgroup/net_cls type cgroup (rw,relatime,net_cls)

If I stop the VPN service and unmount /sys/fs/cgroup/net_cls, the issue is (temporarily) fixed.
systemd-cgls is unaffected. systemctl status --user works fine as well.

[1] https://github.com/mullvad/mullvadvpn-app
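
To check whether a system is in the state described above, the `mount` output can be scanned for cgroup-v1 filesystems mounted below a cgroup2 mount point. The following is a minimal sketch, not part of the original report: the function name `find_nested_v1_mounts` and the regular expression are illustrative assumptions, and it only handles the plain `mount` line format shown in this issue (no spaces in mount points).

```python
import re

# One line of `mount` output: "<source> on <target> type <fstype> (<options>)"
MOUNT_LINE = re.compile(r"^(\S+) on (\S+) type (\S+) \((.*)\)$")


def find_nested_v1_mounts(mount_output):
    """Return mount points of cgroup-v1 filesystems located below a cgroup2 mount.

    Hypothetical helper for illustration; it is not a systemd API.
    """
    v2_roots = []
    v1_mounts = []
    for line in mount_output.splitlines():
        match = MOUNT_LINE.match(line.strip())
        if not match:
            continue
        _source, target, fstype, _options = match.groups()
        if fstype == "cgroup2":
            v2_roots.append(target.rstrip("/"))
        elif fstype == "cgroup":
            v1_mounts.append(target)
    # Keep only v1 mounts whose path lies strictly below a v2 mount point.
    return [
        target
        for target in v1_mounts
        if any(target.startswith(root + "/") for root in v2_roots)
    ]


# The two lines reported in this issue.
SAMPLE = """\
cgroup2 on /sys/fs/cgroup type cgroup2 (rw,nosuid,nodev,noexec,relatime,nsdelegate,memory_recursiveprot)
net_cls on /sys/fs/cgroup/net_cls type cgroup (rw,relatime,net_cls)
"""

if __name__ == "__main__":
    print(find_nested_v1_mounts(SAMPLE))
```

On a live system the input could come from `subprocess.run(["mount"], capture_output=True, text=True).stdout`. For the two lines quoted in the report, the function returns the `net_cls` mount point.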

@codicodi codicodi changed the title systemctl status shows wrong cgoup tree systemctl status shows wrong cgroup tree Dec 30, 2021
@mrc0mmand
Member

I wonder if this issue is related to #22089, since it seems to be in the same area.

@mrc0mmand mrc0mmand added pid1 regression ⚠️ A bug in something that used to work correctly and broke through some recent commit labels Jan 12, 2022
@mrc0mmand
Member

Yup, reverting 038cae0 helps.

Reproducer:

# mkdir -p /sys/fs/cgroup/net_cls
# mount -t cgroup -onet_cls net_cls /sys/fs/cgroup/net_cls
# systemctl status --no-pager
● arch.localdomain
    State: running
     Jobs: 0 queued
   Failed: 0 units
    Since: Wed 2022-01-12 12:12:22 UTC; 2h 8min ago
   CGroup: /
           └─net_cls
             ├─   1 /usr/lib/systemd/systemd --system --deserialize 51
             ├─ 243 /usr/lib/systemd/systemd-journald
             ├─ 247 /usr/lib/systemd/systemd-udevd
             ├─ 251 /usr/lib/systemd/systemd-networkd
             ├─ 280 /usr/lib/systemd/systemd-resolved
             ├─ 287 /usr/bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation --syslog-only
             ├─ 291 "dhcpcd: [manager] [ip4] [ip6]"
             ├─ 292 "dhcpcd: [privileged proxy]"
             ├─ 293 "dhcpcd: [network proxy]"
             ├─ 294 "dhcpcd: [control proxy]"
             ├─ 295 /usr/lib/systemd/systemd-logind
             ├─ 301 "sshd: /usr/bin/sshd -D [listener] 0 of 10-100 startups"
             ├─ 303 /usr/lib/polkit-1/polkitd --no-debug
             ├─ 304 /sbin/agetty -o "-p -- \\u" --noclear - linux
             ├─ 315 "dhcpcd: [BPF ARP] eth0 192.168.122.175"
             ├─ 322 "sshd: vagrant [priv]"
             ├─ 325 /usr/lib/systemd/systemd --user
             ├─ 326 "(sd-pam)"
             ├─ 332 "sshd: vagrant@pts/0"
             ├─ 333 -bash
             ├─ 362 /usr/lib/systemd/systemd-machined
             ├─ 498 /usr/bin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/vagrant-libvirt.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_leaseshelp…
             ├─ 499 /usr/bin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/vagrant-libvirt.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_leaseshelp…
             ├─ 509 /usr/bin/virtlogd
             ├─1551 /bin/sleep infinity
             ├─1608 /usr/lib/systemd/systemd --user
             ├─1610 "(sd-pam)"
             ├─1621 "sshd: vagrant [priv]"
             ├─1623 "sshd: vagrant@pts/2"
             ├─1624 -bash
             ├─1626 sudo su -
             ├─1627 su -
             ├─1628 -bash
             └─2630 systemctl status --no-pager

@mrc0mmand mrc0mmand added this to the v251 milestone Jan 12, 2022
yuwata added a commit to yuwata/systemd that referenced this issue Jan 12, 2022
@yuwata
Member

yuwata commented Jan 12, 2022

Also systemd-cgls is broken when net_cls is mounted.

Fix is waiting in #22095.

@codicodi
Contributor Author

> Also systemd-cgls is broken when net_cls is mounted.

That's not the case for me.
Anyway, yuwata@18e5664 works fine, thanks!

@keszybz
Member

keszybz commented Jan 12, 2022

As I commented on #22095, I don't think this is something we can support:

This doesn't look right. The bug is about people mounting cgroup-v1 controllers in a directory below the cgroup-v2 hierarchy. In particular, any v1 controller will behave like this, not just net_cls.

This can never work — as described in the cgroup delegation documents, systemd expects to have exclusive ownership of the hierarchy (except where delegation is enabled). I think we should close this as CANTFIX.

You need to either revert to cgroup-v1 or hybrid mode, or (very very much preferred) fix the software in question to use cgroups-v2.
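
One way to see that a process belongs to a v1 hierarchy in addition to the unified one is to read `/proc/<pid>/cgroup`, where each line has the form `hierarchy-ID:controller-list:cgroup-path` and the cgroup-v2 entry uses ID `0` with an empty controller list. The sketch below is illustrative only: the function name is hypothetical and the sample text is an assumed example of what such a file may contain with `net_cls` mounted, not output captured from the affected machines.

```python
def split_cgroup_memberships(proc_cgroup_text):
    """Split /proc/<pid>/cgroup content into (v2_path, {v1_controller_list: path}).

    Hypothetical helper for illustration; it is not a systemd API.
    """
    v2_path = None
    v1_paths = {}
    for line in proc_cgroup_text.splitlines():
        if not line:
            continue
        # Split at most twice so that colons inside the path are preserved.
        hierarchy_id, controllers, path = line.split(":", 2)
        if hierarchy_id == "0" and controllers == "":
            v2_path = path  # the unified (cgroup-v2) hierarchy
        else:
            v1_paths[controllers] = path  # a cgroup-v1 hierarchy
    return v2_path, v1_paths


# Assumed example content, not taken from the issue.
EXAMPLE = "1:net_cls:/\n0::/init.scope\n"

if __name__ == "__main__":
    v2, v1 = split_cgroup_memberships(EXAMPLE)
    print("v2 path:", v2)
    print("v1 memberships:", v1)
```

A non-empty second element indicates that a v1 controller is attached alongside the unified hierarchy, which is the setup described here as unsupported.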

@bluca
Member

bluca commented Jan 12, 2022

> This can never work — as described in the cgroup delegation documents, systemd expects to have exclusive ownership of the hierarchy (except where delegation is enabled). I think we should close this as CANTFIX.

For reference, this is the doc: https://systemd.io/CGROUP_DELEGATION/

@codicodi
Contributor Author

Alright, thanks for the explanation.

> fix the software in question to use cgroups-v2

I'll read some more on cgroups and report it there

@yuwata yuwata added cant-fix and removed regression ⚠️ A bug in something that used to work correctly and broke through some recent commit labels Jan 12, 2022