Cannot start a container with systemd 232 #1175

Closed
francois2metz opened this Issue Nov 5, 2016 · 9 comments

francois2metz commented Nov 5, 2016

Hi,

With systemd 232, /sys/fs/cgroup/systemd is now mounted as cgroup2, a.k.a. the unified hierarchy (systemd/systemd#3965).

This breaks runc with the error "no subsystem for mount".

I had to boot with the systemd.legacy_systemd_cgroup_controller=yes kernel parameter to work around it.
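A quick way to check which hierarchy a given host is using is to look at the filesystem type of the /sys/fs/cgroup/systemd mount. The sketch below runs against a sample mounts line for illustration; on a real system you would read /proc/self/mounts directly.

```shell
# Inspect how the systemd cgroup mount point is mounted. fstype "cgroup"
# means the legacy v1 name=systemd hierarchy; "cgroup2" means the unified
# hierarchy that systemd 232 started using.
# (Sample line shown here; on a real host pipe /proc/self/mounts instead.)
line='cgroup2 /sys/fs/cgroup/systemd cgroup2 rw,nosuid,nodev,noexec 0 0'
printf '%s\n' "$line" | awk '$2 == "/sys/fs/cgroup/systemd" { print $3 }'
```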

cyphar (Member) commented Nov 5, 2016

... scream internally

While this should be okay to handle, I really wish they hadn't implemented that. The cgroup migration code (having processes in some cgroupv2 controllers but not others) is not very safe IMO. Not to mention that the incompatibilities will make this an enormous pain to deal with.

qzio commented Nov 8, 2016

Just want to confirm that the workaround "fixed" it for me: add systemd.legacy_systemd_cgroup_controller=yes to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, then run update-grub and reboot.
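The grub edit described above can be sketched as follows. This is an illustration against a sample file, not a tested recipe; on a real Debian/Ubuntu-style system the target is /etc/default/grub, followed by update-grub and a reboot.

```shell
# Append the legacy-cgroup parameter to the default kernel command line.
# Demonstrated on a sample copy; on a real system edit /etc/default/grub
# and then run: sudo update-grub && sudo reboot
printf 'GRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n' > /tmp/grub.sample
sed -i 's/^\(GRUB_CMDLINE_LINUX_DEFAULT="[^"]*\)"/\1 systemd.legacy_systemd_cgroup_controller=yes"/' /tmp/grub.sample
cat /tmp/grub.sample
# -> GRUB_CMDLINE_LINUX_DEFAULT="quiet splash systemd.legacy_systemd_cgroup_controller=yes"
```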


martinpitt added a commit to martinpitt/systemd that referenced this issue Nov 9, 2016

core: don't use the unified hierarchy for the systemd cgroup yet
Too many things don't get along with the unified hierarchy yet:

 * opencontainers/runc#1175
 * moby/moby#28109
 * lxc/lxc#1280

So revert the default to the legacy hierarchy for now. Developers of the above
software can opt into the unified hierarchy with
"systemd.legacy_systemd_cgroup_controller=0".
cyphar (Member) commented Nov 10, 2016

It looks like systemd/systemd#4628 is going to revert the regression. From the discussion linked, it looks like the solution is for us to mount a fake v1 name=systemd hierarchy and then just use that as the bindmount. Now, this isn't pretty by any stretch of the imagination (and I have my doubts about the overall stability of systemd's truly insane cgroup handling) but we don't really have a choice right now.

keszybz added a commit to systemd/systemd that referenced this issue Nov 10, 2016

core: don't use the unified hierarchy for the systemd cgroup yet (#4628)
Too many things don't get along with the unified hierarchy yet:

 * opencontainers/runc#1175
 * moby/moby#28109
 * lxc/lxc#1280

So revert the default to the legacy hierarchy for now. Developers of the above
software can opt into the unified hierarchy with
"systemd.legacy_systemd_cgroup_controller=0".
martinpitt commented Nov 10, 2016

systemd/systemd#4628 reverted this for now, but you can still boot with systemd.legacy_systemd_cgroup_controller=0 to get the unified behaviour; that is useful for developers to make runc/docker/lxc/etc. work with the unified hierarchy eventually.


fbuihuu added a commit to fbuihuu/systemd-opensuse-next that referenced this issue Nov 21, 2016

core: don't use the unified hierarchy for the systemd cgroup yet (#4628)
Too many things don't get along with the unified hierarchy yet:

 * opencontainers/runc#1175
 * moby/moby#28109
 * lxc/lxc#1280

So revert the default to the legacy hierarchy for now. Developers of the above
software can opt into the unified hierarchy with
"systemd.legacy_systemd_cgroup_controller=0".
(cherry picked from commit 843d5ba)

keszybz added a commit to systemd/systemd-stable that referenced this issue Jan 31, 2017

core: don't use the unified hierarchy for the systemd cgroup yet (#4628)
Too many things don't get along with the unified hierarchy yet:

 * opencontainers/runc#1175
 * moby/moby#28109
 * lxc/lxc#1280

So revert the default to the legacy hierarchy for now. Developers of the above
software can opt into the unified hierarchy with
"systemd.legacy_systemd_cgroup_controller=0".
(cherry picked from commit 843d5ba)

globin added a commit to NixOS/systemd that referenced this issue Feb 16, 2017

core: don't use the unified hierarchy for the systemd cgroup yet (#4628)
Too many things don't get along with the unified hierarchy yet:

 * opencontainers/runc#1175
 * moby/moby#28109
 * lxc/lxc#1280

So revert the default to the legacy hierarchy for now. Developers of the above
software can opt into the unified hierarchy with
"systemd.legacy_systemd_cgroup_controller=0".
(cherry picked from commit 843d5ba)
WhyNotHugo commented Apr 4, 2017

I can confirm that this is still an issue as of systemd-233, and the above workaround still works.


cyphar (Member) commented Apr 4, 2017

The latest version of runC includes #1266, which should fix this problem in the hybrid mode that systemd 232 shipped. AFAIK the newest Docker release should contain that fix.

dongsupark added a commit to kinvolk/kube-spawn that referenced this issue Jul 13, 2017

etc: change cgroup driver to cgroupfs
With runc 1.0.0-rc2 on Container Linux 1465, kube-spawn init hangs
forever with message like: "Created API client, waiting for the
control plane to become ready".
That's because docker daemon cannot execute runc, which returns error
like "no subsystem for mount". See also:
opencontainers/runc#1175 (comment)

This issue was apparently resolved in runc 1.0.0-rc3, so in theory
runc 1.0.0-rc3 should work fine with Docker 17.05. Unfortunately on
Container Linux, it's not trivial to replace only the runc binary with
a custom one, because Container Linux makes use of torcx to provide
docker as well as runc: /run/torcx/unpack is sealed, read-only mounted.
It's simply not doable to change those binaries altogether at run-time.

As workaround, we should change cgroupdriver for docker and kubelet
from systemd to cgroupfs. Then init process will succeed without hanging
forever.

See also #45
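The cgroup-driver workaround from the commit message above can be sketched as follows. The /etc/docker/daemon.json path and the exec-opts key are standard dockerd configuration, but treat this as an untested illustration (written to /tmp here for demonstration).

```shell
# Switch Docker's cgroup driver from systemd to cgroupfs so dockerd does
# not depend on the systemd cgroup hierarchy. On a real host this file is
# /etc/docker/daemon.json, followed by a daemon restart; kubelet must be
# configured to match (e.g. --cgroup-driver=cgroupfs).
cat > /tmp/daemon.json <<'EOF'
{
  "exec-opts": ["native.cgroupdriver=cgroupfs"]
}
EOF
cat /tmp/daemon.json
```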


Yamakuzure added a commit to elogind/elogind that referenced this issue Jul 17, 2017

core: don't use the unified hierarchy for the systemd cgroup yet (#4628)
Too many things don't get along with the unified hierarchy yet:

 * opencontainers/runc#1175
 * moby/moby#28109
 * lxc/lxc#1280

So revert the default to the legacy hierarchy for now. Developers of the above
software can opt into the unified hierarchy with
"systemd.legacy_systemd_cgroup_controller=0".
(cherry picked from commit 843d5baf6aad6c53fc00ea8d95d83209a4f92de1)

Yamakuzure added a commit to elogind/elogind that referenced this issue Jul 18, 2017

core: don't use the unified hierarchy for the elogind cgroup yet (#4628)
Too many things don't get along with the unified hierarchy yet:

 * opencontainers/runc#1175
 * moby/moby#28109
 * lxc/lxc#1280

So revert the default to the legacy hierarchy for now. Developers of the above
software can opt into the unified hierarchy with
"elogind.legacy_elogind_cgroup_controller=0".
ajw107 commented Oct 30, 2017

I know this is an old thread, but just to make it pop up in Google searches a bit more: upon upgrading my Ubuntu Server from 17.04 to 17.10 I ran headlong into this issue as well (17.10 uses systemd 234, if that helps). I'm about to try the later fix of changing the cgroup driver before resorting to the initial workaround of changing the kernel boot parameters, which could break other things on 17.10 that expect the new cgroup implementation.


cyphar (Member) commented Oct 31, 2017

@ajw107 runc does have a fix for this bug (#1266). Most distributions should've backported that fix (we did that in SLE and openSUSE for example).

select commented Nov 6, 2017

@ajw107 did you manage to fix the issue? I have the same problem and frankly don't have a clue how to solve it. Also, thanks @cyphar; I see the fix, but I have no idea what to do with it.

