ghost service PIDs in LXC containers #3520

Closed
harridu opened this issue Aug 14, 2020 · 7 comments

@harridu
Contributor

harridu commented Aug 14, 2020

Required information

  • Distribution:
    Debian 10
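  • The output of lxc-start --version, lxc-checkconfig, uname -a, cat /proc/self/cgroup and cat /proc/1/mounts (on the host il08, then inside the container il02):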
root@il08:~# cat /etc/debian_version
10.5
root@il08:~# lxc-start --version
4.0.4
root@il08:~# lxc-checkconfig 
LXC version 4.0.4
Kernel configuration not found at /proc/config.gz; searching...
Kernel configuration found at /boot/config-5.6.0-0.bpo.2-amd64
--- Namespaces ---
Namespaces: enabled
Utsname namespace: enabled
Ipc namespace: enabled
Pid namespace: enabled
User namespace: enabled
newuidmap is not installed
newgidmap is not installed
Network namespace: enabled

--- Control groups ---
Cgroups: enabled

Cgroup v1 mount points: 
/sys/fs/cgroup/cpuset
/sys/fs/cgroup/cpu
/sys/fs/cgroup/cpuacct
/sys/fs/cgroup/blkio
/sys/fs/cgroup/memory
/sys/fs/cgroup/devices
/sys/fs/cgroup/freezer
/sys/fs/cgroup/net_cls
/sys/fs/cgroup/perf_event
/sys/fs/cgroup/net_prio
/sys/fs/cgroup/pids
/sys/fs/cgroup/rdma

Cgroup v2 mount points: 


Cgroup v1 systemd controller: missing
Cgroup v1 clone_children flag: enabled
Cgroup device: enabled
Cgroup sched: enabled
Cgroup cpu account: enabled
Cgroup memory controller: enabled
Cgroup cpuset: enabled

--- Misc ---
Veth pair device: enabled, loaded
Macvlan: enabled, not loaded
Vlan: enabled, not loaded
Bridges: enabled, loaded
Advanced netfilter: enabled, loaded
CONFIG_NF_NAT_IPV4: missing
CONFIG_NF_NAT_IPV6: missing
CONFIG_IP_NF_TARGET_MASQUERADE: enabled, not loaded
CONFIG_IP6_NF_TARGET_MASQUERADE: enabled, not loaded
CONFIG_NETFILTER_XT_TARGET_CHECKSUM: enabled, loaded
CONFIG_NETFILTER_XT_MATCH_COMMENT: enabled, not loaded
FUSE (for use with lxcfs): enabled, not loaded

--- Checkpoint/Restore ---
checkpoint restore: enabled
CONFIG_FHANDLE: enabled
CONFIG_EVENTFD: enabled
CONFIG_EPOLL: enabled
CONFIG_UNIX_DIAG: enabled
CONFIG_INET_DIAG: enabled
CONFIG_PACKET_DIAG: enabled
CONFIG_NETLINK_DIAG: enabled
File capabilities: 

Note : Before booting a new kernel, you can check its configuration
usage : CONFIG=/path/to/config /usr/bin/lxc-checkconfig

root@il08:~# uname -a
Linux il08.ac.aixigo.de 5.6.0-0.bpo.2-amd64 #1 SMP Debian 5.6.14-2~bpo10+1 (2020-06-09) x86_64 GNU/Linux
root@il08:~# cat /proc/self/cgroup
13:name=systemd:/
12:rdma:/
11:pids:/
10:perf_event:/
9:net_prio:/
8:net_cls:/
7:memory:/
6:freezer:/
5:devices:/
4:cpuset:/
3:cpuacct:/
2:cpu:/
1:blkio:/
0::/
root@il08:~# cat /proc/1/mounts
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,nosuid,relatime,size=8043192k,nr_inodes=2010798,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=1612620k,mode=755 0 0
/dev/sda1 / ext4 rw,noatime 0 0
tmpfs /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
pstore /sys/fs/pstore pstore rw,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev,noexec,relatime,size=6580660k 0 0
/dev/sda4 /export ext4 rw,noatime 0 0
cgroup /sys/fs/cgroup tmpfs rw,relatime,size=12k,mode=755 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,relatime,cpuset,release_agent=/run/cgmanager/agents/cgm-release-agent.cpuset,clone_children 0 0
cgroup /sys/fs/cgroup/cpu cgroup rw,relatime,cpu,release_agent=/run/cgmanager/agents/cgm-release-agent.cpu 0 0
cgroup /sys/fs/cgroup/cpuacct cgroup rw,relatime,cpuacct,release_agent=/run/cgmanager/agents/cgm-release-agent.cpuacct 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,relatime,blkio,release_agent=/run/cgmanager/agents/cgm-release-agent.blkio 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,relatime,memory,release_agent=/run/cgmanager/agents/cgm-release-agent.memory 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,relatime,devices,release_agent=/run/cgmanager/agents/cgm-release-agent.devices 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,relatime,freezer,release_agent=/run/cgmanager/agents/cgm-release-agent.freezer 0 0
cgroup /sys/fs/cgroup/net_cls cgroup rw,relatime,net_cls,release_agent=/run/cgmanager/agents/cgm-release-agent.net_cls 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,relatime,perf_event,release_agent=/run/cgmanager/agents/cgm-release-agent.perf_event 0 0
cgroup /sys/fs/cgroup/net_prio cgroup rw,relatime,net_prio,release_agent=/run/cgmanager/agents/cgm-release-agent.net_prio 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,relatime,pids,release_agent=/run/cgmanager/agents/cgm-release-agent.pids 0 0
cgroup /sys/fs/cgroup/rdma cgroup rw,relatime,rdma,release_agent=/run/cgmanager/agents/cgm-release-agent.rdma 0 0
root@il08:~# lxc-attach -n il02
root@il02:~# cat /proc/self/cgroup
13:name=systemd:/
12:rdma:/
11:pids:/
10:perf_event:/
9:net_prio:/
8:net_cls:/
7:memory:/
6:freezer:/
5:devices:/
4:cpuset:/
3:cpuacct:/
2:cpu:/
1:blkio:/
0::/
root@il02:~# cat /proc/1/mounts
/dev/sda4 / ext4 rw,noatime 0 0
none /dev tmpfs rw,relatime,size=492k,mode=755 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
proc /proc/sys/net proc rw,nosuid,nodev,noexec,relatime 0 0
proc /proc/sys proc ro,nosuid,nodev,noexec,relatime 0 0
proc /proc/sysrq-trigger proc ro,nosuid,nodev,noexec,relatime 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
sysfs /sys sysfs ro,nosuid,nodev,noexec,relatime 0 0
sysfs /sys/devices/virtual/net sysfs rw,relatime 0 0
sysfs /sys/devices/virtual/net sysfs rw,nosuid,nodev,noexec,relatime 0 0
devpts /dev/console devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
none /proc/sys/kernel/random/boot_id tmpfs ro,nosuid,nodev,noexec,relatime,size=492k,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666,max=1024 0 0
devpts /dev/ptmx devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666,max=1024 0 0
devpts /dev/tty1 devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666,max=1024 0 0
devpts /dev/tty2 devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666,max=1024 0 0
devpts /dev/tty3 devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666,max=1024 0 0
devpts /dev/tty4 devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=666,max=1024 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
tmpfs /run tmpfs rw,nosuid,nodev,mode=755 0 0
tmpfs /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup2 /sys/fs/cgroup/unified cgroup2 rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,release_agent=/run/cgmanager/agents/cgm-release-agent.systemd,name=systemd 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset,release_agent=/run/cgmanager/agents/cgm-release-agent.cpuset,clone_children 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices,release_agent=/run/cgmanager/agents/cgm-release-agent.devices 0 0
cgroup /sys/fs/cgroup/rdma cgroup rw,nosuid,nodev,noexec,relatime,rdma,release_agent=/run/cgmanager/agents/cgm-release-agent.rdma 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio,release_agent=/run/cgmanager/agents/cgm-release-agent.blkio 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event,release_agent=/run/cgmanager/agents/cgm-release-agent.perf_event 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer,release_agent=/run/cgmanager/agents/cgm-release-agent.freezer 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids,release_agent=/run/cgmanager/agents/cgm-release-agent.pids 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory,release_agent=/run/cgmanager/agents/cgm-release-agent.memory 0 0
mqueue /dev/mqueue mqueue rw,relatime 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime,pagesize=2M 0 0

Issue description

There are ghost services with PID=0 in cgroup.procs in the container:

root@il02:~# for i in /sys/fs/cgroup/unified/system.slice/*/cgroup.procs ; do test -n "$(cat $i)" || continue; echo $i; cat $i; echo; done
/sys/fs/cgroup/unified/system.slice/atd.service/cgroup.procs
85

/sys/fs/cgroup/unified/system.slice/bind9.service/cgroup.procs
0
108

/sys/fs/cgroup/unified/system.slice/console-getty.service/cgroup.procs
0
86

/sys/fs/cgroup/unified/system.slice/cron.service/cgroup.procs
0
74

/sys/fs/cgroup/unified/system.slice/dbus.service/cgroup.procs
0
83

/sys/fs/cgroup/unified/system.slice/inetd.service/cgroup.procs
71

/sys/fs/cgroup/unified/system.slice/isc-dhcp-server.service/cgroup.procs
143

/sys/fs/cgroup/unified/system.slice/nscd.service/cgroup.procs
80

/sys/fs/cgroup/unified/system.slice/opensmtpd.service/cgroup.procs
0
0
0
0
0
0
0
115
116
117
118
119
120
121

/sys/fs/cgroup/unified/system.slice/rsyslog.service/cgroup.procs
0
70

/sys/fs/cgroup/unified/system.slice/ssh.service/cgroup.procs
0
123

/sys/fs/cgroup/unified/system.slice/systemd-journald.service/cgroup.procs
0
17

/sys/fs/cgroup/unified/system.slice/systemd-logind.service/cgroup.procs
0
69

/sys/fs/cgroup/unified/system.slice/unattended-upgrades.service/cgroup.procs
122

/sys/fs/cgroup/unified/system.slice/zabbix-agent.service/cgroup.procs
82
124
125
126
127
128

Apparently all these "0" entries break systemd in the container; see https://lists.freedesktop.org/archives/systemd-devel/2020-August/044999.html

Lennart wrote:

Is it possible the container and the host run in the very same cgroup
hierarchy?

If that's the case (and it looks like it): this is not
supported. Please file a bug against LXC, it's very clearly broken.

I have seen this issue with LXC hosts running either systemd or sysvinit.
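
The loop in the issue description prints every non-empty cgroup.procs file. To narrow the output down to the affected services, a minimal sketch along the following lines might help: it counts the lines that are exactly 0 in each file. The helper name count_zero_pids and its default path are assumptions chosen for illustration; they are not part of LXC or systemd.

```shell
#!/bin/sh
# Sketch only: count_zero_pids is a hypothetical helper, not an LXC or systemd tool.
# It prints "<count> <file>" for every cgroup.procs file one level below the
# given directory that contains at least one line that is exactly "0".
count_zero_pids() {
    root="${1:-/sys/fs/cgroup/unified/system.slice}"
    for f in "$root"/*/cgroup.procs; do
        # Skip unreadable files and the unexpanded glob when nothing matches.
        [ -r "$f" ] || continue
        # grep -c prints the number of matching lines; -x matches whole lines only.
        # "|| true" keeps the script alive under "set -e" when the count is 0.
        n=$(grep -c -x 0 "$f" || true)
        if [ "$n" -gt 0 ]; then
            printf '%s %s\n' "$n" "$f"
        fi
    done
    return 0
}
```

Inside the container this could be called as count_zero_pids with no argument, or with another slice directory as its first argument.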

@brauner
Member

brauner commented Aug 14, 2020

Can you show:

cat /proc/<container-init-pid>/cgroup

as seen from the host, please?

@harridu
Contributor Author

harridu commented Aug 14, 2020

sure

# cat /proc/2106/cgroup 
13:name=systemd:/init.scope
12:rdma:/lxc.payload.il02
11:pids:/lxc.payload.il02
10:perf_event:/lxc.payload.il02
9:net_prio:/lxc.payload.il02
8:net_cls:/lxc.payload.il02
7:memory:/lxc.payload.il02
6:freezer:/lxc.payload.il02
5:devices:/lxc.payload.il02
4:cpuset:/lxc.payload.il02
3:cpuacct:/lxc.payload.il02
2:cpu:/lxc.payload.il02
1:blkio:/lxc.payload.il02
0::/init.scope
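
In the output above, every cgroup v1 controller places the container's init under /lxc.payload.il02, while the unified entry (the line starting with 0::) and the name=systemd hierarchy show /init.scope. A small sketch for pulling out just the unified path is given below. The function name cgroup2_path is hypothetical, and it simply filters the file it is given.

```shell
#!/bin/sh
# Sketch only: cgroup2_path is a hypothetical helper name.
# Given a /proc/<pid>/cgroup file, print the cgroup v2 path, i.e. the part
# after "0::" on the unified-hierarchy line. Prints nothing if there is none.
cgroup2_path() {
    sed -n 's/^0:://p' "$1"
}
```

On the host, cgroup2_path /proc/2106/cgroup would print the unified path for the container init shown above. Assuming lxc-info's -p and -H options are available to print the bare init PID, cgroup2_path "/proc/$(lxc-info -n il02 -p -H)/cgroup" avoids hard-coding the PID.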

@brauner
Member

brauner commented Aug 14, 2020 via email

@harridu
Contributor Author

harridu commented Aug 14, 2020

Thanks for your detailed explanation. I highly appreciate it. Of course I will try the workarounds/fixes you suggested and post the results here.

@harridu
Contributor Author

harridu commented Aug 16, 2020

lxc.mount.auto = cgroup:rw:force

in the LXC config file does not help, AFAICS. systemd in the container uses cgroupv2 nevertheless. I could have disabled cgroupv2 on the kernel command line, but I did not try that.

I have modified cgroupfs-mount on the host to mount cgroup2 as well, e.g.:

mkdir -p /sys/fs/cgroup/unified
mount -t cgroup2 -o rw,nosuid,nodev,noexec,relatime,nsdelegate cgroup2 /sys/fs/cgroup/unified || echo ignored

That makes systemd in the LXC container happy.
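
The two commands above run unconditionally. A slightly more defensive variant, sketched below, first checks whether a cgroup2 filesystem is already mounted and only mounts it when it is missing. The function names has_cgroup2 and mount_cgroup2 are hypothetical, and the mount options are copied unchanged from the commands above; this is a sketch, not what cgroupfs-mount itself does.

```shell
#!/bin/sh
# Sketch only: has_cgroup2 and mount_cgroup2 are hypothetical helper names.

# Succeed if the given mounts file (default: /proc/mounts) lists at least one
# entry whose filesystem type, the third field, is cgroup2.
has_cgroup2() {
    awk '$3 == "cgroup2" { found = 1 } END { exit !found }' "${1:-/proc/mounts}"
}

# Mount the unified hierarchy on the host only if it is not mounted yet.
# Needs root; the options are the ones used in the workaround above.
mount_cgroup2() {
    has_cgroup2 && return 0
    mkdir -p /sys/fs/cgroup/unified
    mount -t cgroup2 -o rw,nosuid,nodev,noexec,relatime,nsdelegate \
        cgroup2 /sys/fs/cgroup/unified
}
```

Run against the host's /proc/1/mounts from the issue description, has_cgroup2 would fail because no cgroup2 entry is listed there, while the container's mount table contains one at /sys/fs/cgroup/unified.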

@brauner
Member

brauner commented Aug 30, 2020

I'm not happy about the kernel in this respect, but this weird scenario was bound to happen at some point because of how cgroup2 mounts work. I'm closing this, but if this becomes a persistent issue we need to start thinking about doing something clever.

brauner closed this as completed Aug 30, 2020
@brauner
Member

brauner commented Aug 30, 2020

Thanks for the bug report!
