Cannot start privileged containers without cap sys_admin on Linux Kernel 4.6 and newer #1737
Comments
Hi, note that I was running Jessie with a 4.9 kernel (so cgroup namespaces were already available), but LXC 1.0 didn't behave the same way. |
@evgeni, right. Off the top of my head, this would require an implementation of cgroup-mixed with cgfsng, which I'm currently not sure we have. But it shouldn't be too difficult to hack into the driver. The other option is to update to |
@brauner I think we do, or am I misreading cgfsng_mount? I removed https://github.com/lxc/lxc/blob/master/src/lxc/cgroups/cgfsng.c#L1627-L1628 and was able to start a container with /bin/sh as init, and it had cgroups mounted fine. |
Cool. So I assume @hallyn didn't put this there purely because cgroup namespaces make this feature unnecessary. I assume it is because before my patch to correctly sync between |
Yeah, that code is there for a reason. I just don't understand it properly, but you seem to ;) |
Has there been any recent progress on this issue? Anything we can do to help here? |
Oh, we somehow never followed up on this. Sorry, my bad. I'm testing a patch now. |
I'm sending a patch that enables pre-mounting the cgroup filesystems when and @evverx is tracking this in |
Great to see someone working on this. I will check with my coworker whether we might be able to test your patch. I think I can actually rule out systemd as the culprit here (at least the Debian versions). We tested Debian Jessie and Debian Stretch, each with new and old kernels, and both showed the same behaviour: the old kernel worked on both Debian/systemd versions, the new kernel did not. |
@brauner wrong ev... ;) |
In case cgroup namespaces are supported but we do not have CAP_SYS_ADMIN we need to mount cgroups for the container. This patch enables both privileged and unprivileged containers without CAP_SYS_ADMIN. Closes lxc#1737. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
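The condition in this commit message can be sketched as follows. This is only an illustration of the stated case, not LXC's code (the real logic lives in C in cgfsng.c). The function names are hypothetical, and the check for `/proc/self/ns/cgroup` is an assumption about how one might detect cgroup namespace support from userspace.

```python
import os


def cgroup_namespaces_supported(ns_path="/proc/self/ns/cgroup"):
    """Return True if the kernel exposes a cgroup namespace entry for this process.

    Assumption: the presence of this procfs entry is used as the indicator.
    """
    return os.path.exists(ns_path)


def must_premount_cgroups(cgns_supported, keeps_sys_admin):
    """Model only the case named in the commit message.

    With cgroup namespaces available, a container that keeps CAP_SYS_ADMIN
    can mount cgroups itself. If the capability is dropped, the mounts have
    to be set up for the container before its init starts.
    """
    return cgns_supported and not keeps_sys_admin


if __name__ == "__main__":
    supported = cgroup_namespaces_supported()
    print("cgroup namespaces supported:", supported)
    print("pre-mount needed when sys_admin is dropped:",
          must_premount_cgroups(supported, keeps_sys_admin=False))
```

The sketch deliberately leaves out kernels without cgroup namespaces, since the commit message does not describe that path.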
@kartoffelheinz, I sent a patch that should fix your problem. |
There's one more tweak needed here. Currently we only mount writable cgroups, which for privileged containers means all controllers but for unprivileged containers means only a subset of them. While that's not a big deal, since none of the others are writable, we should still mount them. |
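To see which controllers exist at all (as opposed to the writable subset), one option is to read `/proc/self/cgroup`, whose lines have the form `hierarchy-ID:controller-list:cgroup-path`. The following is a minimal sketch of such a parser. The function name and the sample lines are made up for illustration and are not taken from this report.

```python
def parse_proc_cgroup(text):
    """Parse the contents of /proc/<pid>/cgroup into (id, controllers, path) tuples."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the first two colons only, since the path may contain colons.
        hierarchy_id, controllers, path = line.split(":", 2)
        # An empty controller list marks the cgroup v2 (unified) hierarchy.
        names = controllers.split(",") if controllers else []
        entries.append((int(hierarchy_id), names, path))
    return entries


if __name__ == "__main__":
    # Illustrative input, not output captured from the reporter's system.
    sample = "4:cpu,cpuacct:/lxc/c1\n1:name=systemd:/lxc/c1\n"
    for hierarchy_id, names, path in parse_proc_cgroup(sample):
        print(hierarchy_id, names, path)
```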
Any plans to backport this fix to the 2.0.x releases? Or does this depend on significant rewrites that are only present in 2.0? I tried backporting this to |
A bit more digging suggests that the fix backported to 2.0.9 did not work, because:
The problem with loading I wanted to try if removing |
Oh really? If so that'd be a bug. If you can show/reproduce this, please open a new issue. |
I just confirmed that removing the The relevant Debian bug about this issue is here: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=875733 |
This breaks the "cgfsng" backend within lxc, which prevents the fix for lxc/lxc#1737 from working. It seems that the default, when this config value is not defined, is to use all cgroups anyway, so removing this should not break anything (see lxc/lxc#2084 about this).
I did some testing with the patch from For example the sequence:
It might be some kind of race condition because it doesn't always happen with |
This bug has been present for at least a year now, but since we only use Debian and its stock kernel was well below 4.6 (3.16 to be precise), we thought it might get fixed "by itself". Unfortunately, this was not the case, and now Debian 9 has arrived with kernel 4.9 and the bug prevents us from upgrading.
Required information
lxc-start --version: 2.0.7
lxc-checkconfig:
uname -a: Linux ws 4.9.0-3-amd64 #1 SMP Debian 4.9.30-2+deb9u2 (2017-06-26) x86_64 GNU/Linux
cat /proc/self/cgroup
cat /proc/1/mounts
Issue description
If you try to start a privileged container with lxc.cap.drop = sys_admin (anyone not insane will want this) on a kernel newer than 4.5, the container will not boot (it hangs on init), emitting the following error:
This works just fine with kernels up to and including 4.5. I remember that something about the cgroup architecture changed around that time, but I'm no kernel developer.
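For anyone who wants to check whether a host and container config fall into the combination described above (sys_admin dropped, kernel 4.6 or newer), here is a rough sketch. The function names are hypothetical, and the parser only handles plain `key = value` lines. It assumes that `lxc.cap.drop` may list several space-separated capabilities and that an empty value clears the list.

```python
import re


def dropped_capabilities(config_text):
    """Collect the capabilities named by lxc.cap.drop lines in an LXC config."""
    dropped = set()
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep or key.strip() != "lxc.cap.drop":
            continue
        names = value.split()
        if names:
            dropped.update(name.lower() for name in names)
        else:
            # Assumption: an empty value resets the drop list.
            dropped.clear()
    return dropped


def kernel_at_least(release, major, minor):
    """Compare the leading 'major.minor' of a release string such as `uname -r` prints."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return (int(match.group(1)), int(match.group(2))) >= (major, minor)


def matches_report(config_text, release):
    """True for the combination in this report: sys_admin dropped on kernel >= 4.6."""
    return ("sys_admin" in dropped_capabilities(config_text)
            and kernel_at_least(release, 4, 6))


if __name__ == "__main__":
    print(matches_report("lxc.cap.drop = sys_admin\n", "4.9.0-3-amd64"))
```

The release string in the usage line is the one from the `uname -a` output in this report. On a live system, `platform.release()` could supply it instead.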
Steps to reproduce
Information to attach
lxc-start -n <c> -o <log> -l DEBUG: test.log.txt