Processes launched via docker exec are not placed into the correct cgroup #42704
Comments
I'm wondering if this is either a bug or omission in containerd. Doing some initial searching, the container's cgroup parent is set when creating the container; Lines 808 to 814 in 9674540
That property (
moby/vendor/github.com/opencontainers/runtime-spec/specs-go/config.go Lines 162 to 165 in 9674540
However when doing a moby/libcontainerd/remote/client.go Lines 307 to 310 in 2773f81
moby/vendor/github.com/containerd/containerd/task.go Lines 158 to 159 in 2773f81
The moby/vendor/github.com/opencontainers/runtime-spec/specs-go/config.go Lines 32 to 33 in 9674540
Based on that, I would expect containerd (?) to inherit this from the container itself. If that's indeed what should happen, but not the case, that may be a bug in containerd 🤔
From a discussion on Slack I had with a containerd maintainer;
@AkihiroSuda perhaps you know; could this be a bug/issue in runc?
Here is what happens in your case, @raxod502.
This can be seen by e.g.
root@ubu2004:/home/kir# docker exec ed0892fae38b cat /proc/self/cgroup
12:memory:/my.slice/my-example.slice/my-example-cgroup.slice/docker-ed0892fae38beb03ad3211cac14af97bbb09ceff17f81ab1455741a5a995e218.scope
11:cpuset:/my.slice/my-example.slice/my-example-cgroup.slice/docker-ed0892fae38beb03ad3211cac14af97bbb09ceff17f81ab1455741a5a995e218.scope
10:blkio:/my.slice/my-example.slice/my-example-cgroup.slice/docker-ed0892fae38beb03ad3211cac14af97bbb09ceff17f81ab1455741a5a995e218.scope
9:perf_event:/my.slice/my-example.slice/my-example-cgroup.slice/docker-ed0892fae38beb03ad3211cac14af97bbb09ceff17f81ab1455741a5a995e218.scope
8:net_cls,net_prio:/my.slice/my-example.slice/my-example-cgroup.slice/docker-ed0892fae38beb03ad3211cac14af97bbb09ceff17f81ab1455741a5a995e218.scope
7:pids:/my.slice/my-example.slice/my-example-cgroup.slice/docker-ed0892fae38beb03ad3211cac14af97bbb09ceff17f81ab1455741a5a995e218.scope
6:cpu,cpuacct:/my.slice/my-example.slice/my-example-cgroup.slice/docker-ed0892fae38beb03ad3211cac14af97bbb09ceff17f81ab1455741a5a995e218.scope
5:rdma:/
4:devices:/my.slice/my-example.slice/my-example-cgroup.slice/docker-ed0892fae38beb03ad3211cac14af97bbb09ceff17f81ab1455741a5a995e218.scope
3:hugetlb:/my.slice/my-example.slice/my-example-cgroup.slice/docker-ed0892fae38beb03ad3211cac14af97bbb09ceff17f81ab1455741a5a995e218.scope
2:freezer:/my.slice/my-example.slice/my-example-cgroup.slice/docker-ed0892fae38beb03ad3211cac14af97bbb09ceff17f81ab1455741a5a995e218.scope
1:name=systemd:/my.slice/my-example.slice/my-example-cgroup.slice/docker-ed0892fae38beb03ad3211cac14af97bbb09ceff17f81ab1455741a5a995e218.scope
0::/system.slice/containerd.service
The thing is, runc does not really support a hybrid cgroup hierarchy; this is why it does not add the process being executed to the proper cgroup v2 scope (identified by
Having said that, there's a runc PR to support the hybrid cgroup hierarchy (opencontainers/runc#2087), which can't yet be merged (and will probably be merged later as part of opencontainers/runc#3059).
Now, what you see is systemd looking into cgroup v2 only when reporting the process cgroup, which may or may not be a bug in systemd.
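To make the split visible, here is a minimal sketch that separates the cgroup v1 entries from the cgroup v2 (unified) entry in `/proc/<pid>/cgroup` content. The function name and the abbreviated sample are illustrative assumptions, not part of runc, containerd, or Docker; the only fact relied on is the documented `hierarchy-ID:controller-list:path` line format, where the v2 entry has hierarchy ID 0 and an empty controller list.

```python
from typing import Dict, Optional, Tuple

# Abbreviated sample in the shape of the output above; the container ID is
# shortened to a placeholder for readability.
SAMPLE = """\
12:memory:/my.slice/my-example.slice/my-example-cgroup.slice/docker-ed0892fae38b.scope
7:pids:/my.slice/my-example.slice/my-example-cgroup.slice/docker-ed0892fae38b.scope
5:rdma:/
1:name=systemd:/my.slice/my-example.slice/my-example-cgroup.slice/docker-ed0892fae38b.scope
0::/system.slice/containerd.service
"""


def parse_cgroup_file(text: str) -> Tuple[Dict[str, str], Optional[str]]:
    """Return (v1 entries keyed by controller list, v2 unified path or None)."""
    v1: Dict[str, str] = {}
    v2: Optional[str] = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first two colons only, so a path containing ':' survives.
        hier_id, controllers, path = line.split(":", 2)
        if hier_id == "0" and controllers == "":
            v2 = path  # cgroup v2 (unified hierarchy) entry
        else:
            v1[controllers] = path  # cgroup v1 hierarchy (or named hierarchy)
    return v1, v2


v1, v2 = parse_cgroup_file(SAMPLE)
print("v2 (unified) path:", v2)
for controllers, path in v1.items():
    marker = "same as v2" if path == v2 else "differs from v2"
    print(f"{controllers}: {path} ({marker})")
```

On a hybrid host, a tool that reads only the `0::` entry and a tool that reads the v1 controller entries will report different cgroups for the same process, which matches the discrepancy discussed here.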
Well, systemd looking into cgroup v2 only seems like arguably the correct behavior, because the resource limits imposed by cgroup v1 seem to be ignored when using systemd-defined cgroups. Do you think this is the kind of issue that can be worked around by changing something about the system environment? The end goal is simply to use
Can you please elaborate on this (and/or provide a quick repro demonstrating this)? I don't think it's true (because if it is, we have a huge issue with runc wrt hybrid cgroup hierarchy). Here's my repro:
kir@ubu2004:~$ sudo docker ps
CONTAINER ID   IMAGE     COMMAND     CREATED        STATUS        PORTS     NAMES
ed0892fae38b   alpine    "/bin/sh"   46 hours ago   Up 46 hours             intelligent_leavitt
kir@ubu2004:~$ sudo docker update --pids-limit 16 ed0892fae38b
ed0892fae38b
kir@ubu2004:~$ sudo docker exec -it ed0892fae38b sh
/ # sleep 1h &
/ # sleep 1h &
/ # sleep 1h &
/ # sleep 1h &
/ # sleep 1h &
/ # sleep 1h &
/ # sleep 1h &
/ # sleep 1h &
/ # sleep 1h &
sh: can't fork: Resource temporarily unavailable
As you can see, pids limit is enforced for
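One way to cross-check a repro like this from the host is to find the `pids.max` file of the cgroup v1 `pids` hierarchy that a process belongs to. The sketch below only builds the path. The helper names are hypothetical, and it assumes that v1 controllers are mounted at `/sys/fs/cgroup/<controller>` and that the cgroup file was read without a cgroup namespace hiding the full path; neither is guaranteed on every system.

```python
import posixpath
from typing import Optional


def v1_controller_path(cgroup_text: str, controller: str) -> Optional[str]:
    """Return the cgroup path of the v1 hierarchy that carries `controller`."""
    for line in cgroup_text.splitlines():
        line = line.strip()
        if not line:
            continue
        _hier_id, controllers, path = line.split(":", 2)
        # A v1 hierarchy may carry several co-mounted controllers, e.g. "cpu,cpuacct".
        if controller in controllers.split(","):
            return path
    return None


def pids_max_file(cgroup_text: str, cgroup_root: str = "/sys/fs/cgroup") -> Optional[str]:
    """Build the path of pids.max for the process described by cgroup_text."""
    path = v1_controller_path(cgroup_text, "pids")
    if path is None:
        return None  # no v1 pids hierarchy listed (e.g. a pure cgroup v2 host)
    # Assumption: the pids controller is mounted at <cgroup_root>/pids.
    return posixpath.join(cgroup_root, "pids", path.lstrip("/"), "pids.max")


# Hypothetical, shortened cgroup file content for an exec'ed process.
EXAMPLE = "7:pids:/my.slice/docker-abc.scope\n0::/system.slice/containerd.service\n"
print(pids_max_file(EXAMPLE))
```

If the file at the returned path contains the value passed to `docker update --pids-limit`, the v1 limit applies to that process regardless of what the `0::` entry says.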
Now that I do further testing, I'm finding that although the cgroup parent is not set like I would expect, the configured resource limits for the container cgroup parent are still applied correctly. This is not what I was observing before, so I must have had something else incorrectly configured in the system. Closing this issue as I can't provide a reproducible example of the bad behavior. Thanks for your help!
I had the same issue this week. Stress testing and The solution was to add a slash before the slice name. Example
Description
Processes launched via docker exec are not placed into the correct cgroup when using the --cgroup-parent option on docker run with the systemd cgroup driver.
Steps to reproduce the issue:
Create /etc/systemd/system/my-example-cgroup.slice with contents:
Edit /etc/docker/daemon.json as follows:
Reload systemd and Docker configuration: sudo systemctl daemon-reload and sudo systemctl restart docker.
Now start a container and run something within it, e.g.:
From another terminal, we can verify the cgroup is set correctly:
However, now let's use docker exec:
From another terminal, we can see the cgroup is not set correctly (should be the same as the previous process):
Consequently, cgroup resource limits are not enforced for any processes launched via docker exec.
Describe the results you received:
Processes started via docker run are placed into the container's --cgroup-parent, but processes started via docker exec are placed into the default /system.slice/containerd.service cgroup.
Describe the results you expected:
All processes in the container, no matter how they are started, are placed into the container's --cgroup-parent.
Output of docker version:
Output of docker info:
Additional environment details (AWS, VirtualBox, physical, etc.): This is on an EC2 instance.
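The comparison in the reproduction steps (the process started by docker run versus the one started by docker exec) can be sketched as a small check that reports which hierarchies differ between two processes. The function names are hypothetical. The sample strings are illustrative only: the original command output was not preserved here, so they are shaped after the reported results.

```python
from typing import Dict, Optional, Tuple


def parse_cgroup_text(text: str) -> Dict[str, str]:
    """Map each hierarchy (controller list, or "unified" for cgroup v2) to its path."""
    entries: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        _hier_id, controllers, path = line.split(":", 2)
        entries[controllers or "unified"] = path
    return entries


def read_cgroup(pid: int) -> str:
    """Read /proc/<pid>/cgroup for a live process (Linux only)."""
    with open(f"/proc/{pid}/cgroup") as f:
        return f.read()


def differing_hierarchies(
    a_text: str, b_text: str
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return the hierarchies whose cgroup path differs between two processes."""
    a, b = parse_cgroup_text(a_text), parse_cgroup_text(b_text)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Illustrative values only (placeholder scope name, not taken from the report).
RUN_PROC = "7:pids:/my-example-cgroup.slice/docker-abc.scope\n0::/my-example-cgroup.slice/docker-abc.scope\n"
EXEC_PROC = "7:pids:/my-example-cgroup.slice/docker-abc.scope\n0::/system.slice/containerd.service\n"

for hierarchy, (run_path, exec_path) in differing_hierarchies(RUN_PROC, EXEC_PROC).items():
    print(f"{hierarchy}: run={run_path} exec={exec_path}")
```

On a real host, `differing_hierarchies(read_cgroup(pid_a), read_cgroup(pid_b))` would compare two live processes. An empty result means both are in the same cgroup in every hierarchy.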