archlinux in unprivileged lxc cannot start #1678
Comments
Hm, I'm able to boot Archlinux just fine here even with a new systemd version. I might have to install it in a VM and try for myself. |
@brauner Here is the way I built it. |
@brauner 172.17.16.238 from canonical-lxd |
It's got to do with unprivileged user starting the container vs root starting the container.
So it's got to be something in your cgroup setup that's making systemd unhappy. My guess is that it's related to the unified hierarchy and possibly has to do with your user not owning that part of the tree. |
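One way to see which part of the unified tree a process sits in is to read its entry in /proc/self/cgroup, where the cgroup v2 line has the form `0::<path>`. A minimal sketch, assuming a POSIX shell; the function name `unified_cgroup_path` is made up for illustration:

```shell
#!/bin/sh
# Print the cgroup path of the unified (cgroup v2) entry found in a
# /proc/<pid>/cgroup-style file. Such entries have the form "0::<path>".
# $1: file to read, e.g. /proc/self/cgroup
# Prints nothing if the file has no cgroup v2 entry.
unified_cgroup_path() {
    sed -n 's/^0:://p' "$1"
}
```

On the host, something like `ls -ld "/sys/fs/cgroup/unified$(unified_cgroup_path /proc/self/cgroup)"` would then show who owns that directory, assuming the unified hierarchy is mounted at /sys/fs/cgroup/unified.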
I suspect just installing libpam-cgfs would fix your problem. |
Ah, it probably would if there was a version of it with the needed cherry-picks for the unified hierarchy... |
#!/bin/sh
# Usage (run as root): <script> USERNAME SHELL_PID
uid=$(id -u "${1}")
gid=$(id -g "${1}")
# Make new child cpusets inherit their configuration from the parent
echo 1 > /sys/fs/cgroup/cpuset/cgroup.clone_children
for cgroup in /sys/fs/cgroup/*; do
    slice="${cgroup}/user.slice/user-${uid}.slice"
    mkdir -p "${slice}"
    chown -R "${uid}:${gid}" "${slice}"
    # Move the given PID into the new cgroup on every hierarchy except unified
    if [ "$(basename "${cgroup}")" != "unified" ]; then
        echo "${2}" > "${slice}/tasks"
    fi
done

Run that as root, passing the username as the first argument and the shell's PID as the second argument. That will set up all your cgroups cleanly, at which point your container will happily start.
So not actually an LXC bug: the issue was a bad cgroup setup on the host, which was preventing systemd in the container from mounting the "unified" controller (which is what changed in new systemd). If you configure your host to have your user own every single one of the controllers that are used, then the container starts up properly.
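To check whether the user really owns its slice in every hierarchy, a rough sketch along the following lines may help. It assumes the `user.slice/user-<uid>.slice` layout used in the script above and GNU `stat`; the function name `check_user_slices` is hypothetical:

```shell
#!/bin/sh
# Report, per hierarchy, whether user-<uid>.slice exists and is owned by <uid>.
# $1: cgroup root (e.g. /sys/fs/cgroup), $2: numeric uid
# Returns 0 only if every hierarchy has a slice owned by that uid.
check_user_slices() {
    root=$1
    uid=$2
    rc=0
    for cgroup in "${root}"/*; do
        [ -d "${cgroup}" ] || continue
        name=$(basename "${cgroup}")
        slice="${cgroup}/user.slice/user-${uid}.slice"
        if [ ! -d "${slice}" ]; then
            echo "${name}: missing"
            rc=1
        elif [ "$(stat -c %u "${slice}")" != "${uid}" ]; then
            # stat -c %u prints the numeric owner (GNU coreutils)
            echo "${name}: wrong owner"
            rc=1
        else
            echo "${name}: ok"
        fi
    done
    return "${rc}"
}
```

Calling it as `check_user_slices /sys/fs/cgroup "$(id -u)"` lists each hierarchy with `ok`, `missing`, or `wrong owner`, which makes it easier to spot the one systemd in the container trips over.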
Sorry, I was off yesterday. :) Yes, that's exactly what I suspected. The approach @stgraber outlined is OK but problematic in the long run. This currently only works with the unified hierarchy because it is the empty hierarchy, i.e. there are no controllers enabled in it. Also, I'd be interested in the full trace log for the booting container that @stgraber just started after making those cgroup changes. The fact that it looks like the cgfs driver is used is weird. Actually, the cgfsng driver should be used and should just work fine.
@brauner VM is still up, you can go poke at it :) |
@stgraber sorry I found that I have made a mistake, the lxc-download template will download archlinux rootfs with systemd 232 |
@stgraber |
@stgraber and in privileged lxc with systemd 233 |
This is still a problem for me. I can't update systemd in any of my unprivileged containers as it breaks the container. I don't really understand cgroups / systemd / unprivileged containers and how they interact with each other (I have some reading to do!). So I don't know which piece of software is at fault here. The issue is not resolved. Why is this thread closed? Is there a solution in here that I've missed? Also, I have just created a new unprivileged "test" container using the lxc-download template. The container starts up no problem, but it is using systemd 232-8. When I upgrade it to systemd 233.75-3 it breaks. I can't even stop the container: after executing the lxc-stop -n test command it just sits there doing nothing and the command never completes. Perhaps I haven't been waiting long enough, but I've been having to reboot the host machine to get it to stop.
By the way, the only thing that prevents
Adding There is systemd/systemd#6408, but I'm wondering which part of |
@evverx, yeah, I had some other work to finish first but I'm going to take care of this likely during this week. :) |
@jjb2016 |
This is great guys - thanks a lot for taking notice of this! |
Closes lxc#1669. Closes lxc#1678. Relates to systemd/systemd#6408. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
Hey everyone, I just sent a branch to implement support for the empty cgroup v2 hierarchy. Note, for unprivileged containers run by unprivileged users two conditions must be met:
The second condition is caused by the specific delegation model the cgroup v2 hierarchy implements. A visual guide to what I mean is:
where the line
and
If you have a recent enough version of
Closes #1669. Closes #1678. Relates to systemd/systemd#6408. Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
i think we still got this issue on NixOS in our nixcloud-container abstraction:
if someone wants to have to configurations, see this:
@stgraber the script you proposed on #1678 (comment) does not work for us as we start the lxc-start as root user. now i wonder if that is a bad idea and if we should always start lxc-start with a 'normal' user. any thoughts on that? @ss1h2a3tw does your workaround degrade security? @brauner what you wrote in #1678 (comment) we don't use LXCFS at all IIRC but you seem to execute your stuff as normal user 'chb' as well. is this a must? also you added your change into 2.1 which is what we use so i had assumed that this error shouldn't have appeared after all. help? |
It seems you're starting an unprivileged container as root. That is perfectly fine. It means the setup process for the container runs as root but the container itself runs unprivileged.
That's just an argument to systemd itself and it doesn't degrade security. It's a valid boot option that systemd itself exposes.
If you start the container as root the pam module @qknight, the boot you're showing seems fine to me. What exactly is the problem you're observing? Could you please boot the container with:
and append/paste the contents of
@brauner i think my problem is completely different from the OP's posting. i've just seen the same error messages and assumed it was the same. i'm just concerned about:
is that a different problem? |
This should be fixed once poettering/systemd@f0f0fe3 is merged. It would be great if you could check that systemd/systemd#7246 works for you.
…roup/unified` It's possible for `systemd` inside an unprivileged user namespace container to be able to mount `cgroup2` on `/sys/fs/cgroup/unified` without being able to create directories there. When this happens, `systemd` fails to boot, making it impossible to reexecute itself without restarting the container runtime. In this patch the issue is avoided by trying creating a temporary directory after mounting `cgroup2` and falling back to `v1` if `mkdir` fails. Closes systemd#6408 and lxc/lxc#1678.
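The probing idea in that commit message can be illustrated with a small shell sketch: try to create a temporary directory under the mount point and pick the layout based on the result. This is only an illustration of the technique, not systemd's actual code; the function names and the `unified`/`legacy` labels are made up here:

```shell
#!/bin/sh
# Return 0 if a directory can be created under $1, else 1.
# The probe directory is removed again on success.
can_mkdir_in() {
    probe="$1/.probe.$$"
    if mkdir "${probe}" 2>/dev/null; then
        rmdir "${probe}"
        return 0
    fi
    return 1
}

# Print "unified" when directories can be created under $1,
# otherwise print "legacy" to signal a fallback to cgroup v1.
pick_cgroup_layout() {
    if can_mkdir_in "$1"; then
        echo unified
    else
        echo legacy
    fi
}
```

For example, `pick_cgroup_layout /sys/fs/cgroup/unified` run inside the container would print `legacy` in the situation described above, where the mount succeeds but `mkdir` does not.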
@evverx thanks for your help! i tried to apply all the patches you linked from the poettering PR on 2.34 but it failed. then i tried to apply only systemd/systemd@d3070fb, which also failed. should i use the 'master' version for testing instead? i've also checked what contains the patch d3070fbdf6077d7da9dbafa198fff8dea712d2ff, but only master has it.
As said before, it looks like your container is starting just fine. If not, please, as requested before, append the required debug output from your container.
That shouldn't be fatal and shouldn't really be a problem for your container. |
@brauner YES, you are right. the container starts and my problem isn't the OP's problem. sorry for the thread hijacking.
@qknight, np. So the utmp service should show you a clean exit status when you do |
@qknight, I think that the log level of those messages should be simply downgraded to |
@brauner regarding the utmp service i have this:
On Tue, Nov 21, 2017 at 09:42:27PM +0000, Joachim Schiele wrote:
> @brauner regarding the utmp service i have this:
> ```
> systemctl status systemd-update-utmp.service
> ● systemd-update-utmp.service - Update UTMP about System Boot/Shutdown
> Loaded: loaded (/nix/store/hxaj535hm6p7gi24zv2k2ifvqadc3js2-systemd-234/example/systemd/system/systemd-update-utmp.service; enabled; vendor preset: enabled)
> Drop-In: /nix/store/9hhz80d2lf6byxwmhd6yy5pm0wc142h4-system-units/systemd-update-utmp.service.d
> └─overrides.conf
> Active: active (exited) since Tue 2017-11-14 11:52:48 UTC; 1 weeks 0 days ago
> Docs: man:systemd-update-utmp.service(8)
> man:utmp(5)
> Process: 205 ExecStart=/nix/store/hxaj535hm6p7gi24zv2k2ifvqadc3js2-systemd-234/lib/systemd/systemd-update-utmp reboot (code=exited, status=0/SUCCESS)
> Main PID: 205 (code=exited, status=0/SUCCESS)

Right, so this exit success suggests that systemd is not considering this a
failure and will just move on.

> Tasks: 0 (limit: 4915)
> CGroup: /system.slice/systemd-update-utmp.service
> Nov 14 11:52:48 v34 systemd[1]: Started Update UTMP about System Boot/Shutdown.
> Nov 14 11:52:48 v34 systemd[1]: systemd-update-utmp.service: Failed to set invocation ID on control group /system.slice/systemd-update-utmp.service, ignoring: Operation not permitted

This error is non-critical and @evverx already discusses the option of making
this a debug message and not a warning in systemd.
When systemd is running inside a container employing user namespaces it currently mounts the unified cgroup hierarchy without being able to write to it. This causes systemd to freeze during boot. This patch checks whether the unified cgroup hierarchy is writable. If it is not it will not mount it. This solution is based on a patch by Evgeny Vereshchagin. Closes systemd#6408. Closes lxc/lxc#1678 .
When lxc uses the cgfsng cgroup driver, lxc-start works as expected, but cgfsng fails for an unprivileged container, so the cgfs driver is used instead. When a new container is started as the root user, cgfsng works and the container starts as expected. Here is the output of lxc-ls with cgfsng debug enabled. While the unprivileged start fails in cgfsng, the root user prints a handful of hierarchies. Does this help to debug the problem further, @brauner?
This script really works for my environment, which is current Manjaro with an Ubuntu container installed with:
But I have to rerun the script after every reboot. I have never dealt with cgroups, so I am not sure what I should do to preserve the configuration properly. Thank you. By the way, I am managing
It seems that current mainline systemd performs some new operations that unprivileged lxc blocks.
It works fine in privileged lxc.
I can confirm that systemd 232-8 still works well.
Required information
lxc-start --version: 2.0.8
lxc-checkconfig: all green
uname -a: 4.11.7-1-userns (the linux-userns in AUR)
cat /proc/self/cgroup
cat /proc/1/mounts
Issue description
Use the lxc-download template to download the archlinux amd64 image and try to run it.
systemd's error