
Cannot start a container with systemd 232 #1175

Closed

francois2metz opened this issue Nov 5, 2016 · 9 comments

Comments

@francois2metz

Hi,

With systemd 232, /sys/fs/cgroup/systemd is now mounted as cgroup2, aka the unified hierarchy: systemd/systemd#3965

This breaks runc with the error: no subsystem for mount
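On an affected host you can confirm how the systemd hierarchy is mounted by looking at /proc/self/mountinfo. A minimal sketch; the `classify_systemd_mount` helper and the sample mountinfo line are illustrative, not part of runc:

```shell
# Classify the mountinfo entry for /sys/fs/cgroup/systemd.
# "cgroup"  => legacy v1 name=systemd hierarchy (what runc expected here);
# "cgroup2" => unified hierarchy, which triggers "no subsystem for mount".
# For a live check, pipe /proc/self/mountinfo into this instead of the sample.
classify_systemd_mount() {
  awk '$5 == "/sys/fs/cgroup/systemd" {
         # The filesystem type is the field right after the "-" separator.
         for (i = 6; i <= NF; i++) if ($i == "-") { print $(i + 1); exit }
       }'
}

# Illustrative mountinfo line resembling systemd 232's new default:
echo '28 25 0:24 / /sys/fs/cgroup/systemd rw,nosuid shared:10 - cgroup2 cgroup rw' \
  | classify_systemd_mount
# → cgroup2
```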

I had to use the systemd.legacy_systemd_cgroup_controller=yes boot parameter to fix it.

@cyphar
Member

cyphar commented Nov 5, 2016

... scream internally

While this should be okay to handle, I really wish they hadn't implemented that. The cgroup migration code (having processes in some cgroupv2 controllers but not others) is not very safe IMO. Not to mention that the incompatibilities will make this an enormous pain to deal with.

@qzio

qzio commented Nov 8, 2016

Just want to confirm that the workaround to use systemd.legacy_systemd_cgroup_controller=yes as a boot parameter (adding it to /etc/default/grub @ GRUB_CMDLINE_LINUX_DEFAULT and running update-grub && reboot) "fixed" it for me.
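For reference, a sketch of the GRUB edit qzio describes (Debian/Ubuntu layout assumed; the `append_param` helper is illustrative, demonstrated on a sample line rather than the real file, which you would edit as root before running `update-grub && reboot`):

```shell
# Append the workaround parameter to GRUB_CMDLINE_LINUX_DEFAULT.
# On a real system this sed would target /etc/default/grub.
append_param() {
  sed 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/GRUB_CMDLINE_LINUX_DEFAULT="\1 systemd.legacy_systemd_cgroup_controller=yes"/'
}

# Demonstrated on a sample line:
echo 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"' | append_param
# → GRUB_CMDLINE_LINUX_DEFAULT="quiet splash systemd.legacy_systemd_cgroup_controller=yes"
```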

martinpitt added a commit to martinpitt/systemd that referenced this issue Nov 9, 2016
Too many things don't get along with the unified hierarchy yet:

 * opencontainers/runc#1175
 * moby/moby#28109
 * lxc/lxc#1280

So revert the default to the legacy hierarchy for now. Developers of the above
software can opt into the unified hierarchy with
"systemd.legacy_systemd_cgroup_controller=0".
@cyphar
Member

cyphar commented Nov 10, 2016

It looks like systemd/systemd#4628 is going to revert the regression. From the discussion linked, it looks like the solution is for us to mount a fake v1 name=systemd hierarchy and then just use that as the bindmount. Now, this isn't pretty by any stretch of the imagination (and I have my doubts about the overall stability of systemd's truly insane cgroup handling) but we don't really have a choice right now.
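A rough sketch of the fallback cyphar describes: mounting a v1 name=systemd hierarchy by hand so there is something to bind-mount into the container. This requires root, so it is shown in dry-run form; the `run` wrapper is purely illustrative:

```shell
# Dry-run wrapper: prints each command instead of executing it.
# To actually apply this (as root), replace the body with "$@".
run() { echo "+ $*"; }

# Create and mount a fake v1 name=systemd hierarchy (no controllers attached),
# which runc can then use as the source of its bind mount.
run mkdir -p /sys/fs/cgroup/systemd
run mount -t cgroup -o none,name=systemd cgroup /sys/fs/cgroup/systemd
```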

keszybz pushed a commit to systemd/systemd that referenced this issue Nov 10, 2016
@martinpitt

systemd/systemd#4628 reverted this for now, but you can still boot with systemd.legacy_systemd_cgroup_controller=0 to get the unified behaviour; that is useful for developers to make runc/docker/lxc/etc. work with the unified hierarchy eventually.
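To check which mode a running system actually booted into, one option is to inspect the filesystem type of the systemd hierarchy (a sketch only; mount paths vary between distributions and container environments):

```shell
# "cgroup" means the legacy v1 name=systemd hierarchy; "cgroup2" means the
# unified hierarchy that systemd 232 briefly made the default.
stat -fc %T /sys/fs/cgroup/systemd 2>/dev/null \
  || echo "no systemd hierarchy mounted"
```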

fbuihuu pushed a commit to fbuihuu/systemd-opensuse-next that referenced this issue Nov 21, 2016
keszybz pushed a commit to systemd/systemd-stable that referenced this issue Jan 31, 2017
globin pushed a commit to NixOS/systemd that referenced this issue Feb 16, 2017
@WhyNotHugo

I can confirm that this is still an issue as of systemd-233, and the above workaround still works.

@cyphar
Member

cyphar commented Apr 4, 2017

The latest version of runC includes #1266 which should fix this problem in the hybrid mode that systemd 232 shipped. AFAIK the newest Docker release should contain that fix.

dongsupark pushed a commit to kinvolk/kube-spawn that referenced this issue Jul 13, 2017
With runc 1.0.0-rc2 on Container Linux 1465, kube-spawn init hangs
forever with message like: "Created API client, waiting for the
control plane to become ready".
That's because docker daemon cannot execute runc, which returns error
like "no subsystem for mount". See also:
opencontainers/runc#1175 (comment)

This issue was apparently resolved in runc 1.0.0-rc3, so in theory
runc 1.0.0-rc3 should work fine with Docker 17.05. Unfortunately on
Container Linux, it's not trivial to replace only the runc binary with
a custom one, because Container Linux makes use of torcx to provide
docker as well as runc: /run/torcx/unpack is sealed, read-only mounted.
It's simply not doable to change those binaries altogether at run-time.

As workaround, we should change cgroupdriver for docker and kubelet
from systemd to cgroupfs. Then init process will succeed without hanging
forever.

See also #45
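For reference, the cgroupdriver change described in the commit message might look like this on the Docker side (written to /tmp here as a sketch; the real file is /etc/docker/daemon.json, dockerd must be restarted afterwards, and kubelet needs the matching --cgroup-driver=cgroupfs flag):

```shell
# Switch Docker's cgroup driver from systemd to cgroupfs so dockerd no longer
# depends on the name=systemd hierarchy that runc 1.0.0-rc2 cannot handle.
cat <<'EOF' > /tmp/daemon.json
{
  "exec-opts": ["native.cgroupdriver=cgroupfs"]
}
EOF
# On a real host: install as /etc/docker/daemon.json and restart dockerd.
cat /tmp/daemon.json
```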
dongsupark pushed a commit to kinvolk/kube-spawn that referenced this issue Jul 13, 2017
Yamakuzure pushed a commit to elogind/elogind that referenced this issue Jul 17, 2017
Yamakuzure pushed a commit to elogind/elogind that referenced this issue Jul 18, 2017
Too many things don't get along with the unified hierarchy yet:

 * opencontainers/runc#1175
 * moby/moby#28109
 * lxc/lxc#1280

So revert the default to the legacy hierarchy for now. Developers of the above
software can opt into the unified hierarchy with
"elogind.legacy_elogind_cgroup_controller=0".
@ajw107

ajw107 commented Oct 30, 2017

I know this is an old thread, but just to make it pop up in Google searches a bit more: upon upgrading my Ubuntu Server from 17.04 to 17.10 I ran headlong into this issue as well (17.10 uses systemd 234, if that helps). I'm about to try the later fix of changing the cgroup driver before I resort to the initial workaround of changing the kernel boot parameters, which could break other things that expect the new cgroup implementation on 17.10.

@cyphar
Member

cyphar commented Oct 31, 2017

@ajw107 runc does have a fix for this bug (#1266). Most distributions should've backported that fix (we did that in SLE and openSUSE for example).

@select

select commented Nov 6, 2017

@ajw107 did you manage to fix that issue? I have the same problem and frankly not really a clue what I can do to solve this. Also thanks @cyphar I see the fix but no idea what to do with it.
