Mounting block device fails when CLONE_NEWUSER #221

Closed
yadutaf opened this issue May 15, 2014 · 5 comments

@yadutaf

yadutaf commented May 15, 2014

When lxc.id_map is set, any lxc.mount.entry involving a block device fails. tmpfs, procfs, and bind mounts work as expected.

A workaround is to:

  1. Mount the block device in the host namespace
  2. Bind-mount it from the host into the container
  3. Unmount the block device from the host
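
For illustration, the three steps above could be sketched in C roughly as follows. This is only a sketch under assumptions: the function name and all four arguments (device, filesystem type, host mount point, container target path) are hypothetical and not taken from LXC, and it has to run as root in the host namespaces before the container starts.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mount.h>

/*
 * Hypothetical helper, not part of LXC. Returns 0 on success, -1 on failure.
 * dev, fstype, host_mnt and container_target are illustrative arguments.
 */
static int bind_blockdev_into_container(const char *dev, const char *fstype,
                                        const char *host_mnt,
                                        const char *container_target)
{
    /* 1. Mount the block device in the host namespace. */
    if (mount(dev, host_mnt, fstype, 0, NULL) < 0) {
        perror("mount block device");
        return -1;
    }

    /* 2. Bind-mount the host mount point onto the container's target path. */
    if (mount(host_mnt, container_target, NULL, MS_BIND, NULL) < 0) {
        perror("bind mount");
        umount(host_mnt); /* best-effort cleanup of step 1 */
        return -1;
    }

    /* 3. Unmount from the host; the bind mount keeps the filesystem mounted. */
    if (umount(host_mnt) < 0) {
        perror("umount host mount point");
        return -1;
    }

    return 0;
}
```

A bind mount needs no access to the block device itself, which is presumably why it still works once the container lives in its own user namespace.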

I suspect this is caused by CLONE_NEWUSER being passed to clone too early (before the mounts are done). One option I see would be to unshare(CLONE_NEWUSER) after all mounts are done and then signal the parent process to set up the required id_map; another would be to split lxc_setup into 2 distinct parts, the first running outside this namespace. What do you think of this approach?

For the time being, I use the workaround but I hope to have some time to hack something into LXC itself in the coming weeks.

@stgraber stgraber added the bug label May 20, 2014
@stgraber stgraber added this to the lxc-1.0.4 milestone May 20, 2014
@hallyn
Member

hallyn commented May 20, 2014

The constraints are: (1) an unprivileged user must unshare userns before he can unshare mntns. (2) when the container goes away - even uncleanly - the mount must go away.

I think the cleanest way to do this in the code (which is still not as clean as I'd like) would be to handle this right after src/lxc/start.c's call to attach_block_device(). Check whether getuid() == 0 and lxc.id_map is not empty. If that is the case, then unshare a mnt_ns right there, mount the block device to $lxcpath/$lxc_name/rootfs, and update the lxc_conf->rootfs.path to be $lxcpath/$lxc_name/rootfs.
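
A rough sketch of the check-and-premount step described above, written as standalone C rather than as actual LXC code. The `sketch_conf` struct and both function names are hypothetical stand-ins for whatever lxc_conf carries, and making `/` private after the unshare is an added assumption (so the mount does not propagate back to the host on systems where `/` is a shared mount), not something stated in this thread.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/mount.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the relevant parts of the container config. */
struct sketch_conf {
    size_t id_map_len;         /* number of lxc.id_map entries */
    bool rootfs_is_blockdev;   /* rootfs is backed by a block device */
    const char *rootfs_dev;    /* block device to mount */
    const char *rootfs_fstype; /* filesystem type on that device */
    char rootfs_path[4096];    /* path the container will use as rootfs */
};

/* True when host root starts a container with an id map and a bdev rootfs. */
static bool needs_premount(uid_t uid, const struct sketch_conf *conf)
{
    return uid == 0 && conf->id_map_len > 0 && conf->rootfs_is_blockdev;
}

/* Returns 0 on success (or when nothing needs to be done), -1 on failure. */
static int premount_rootfs(struct sketch_conf *conf, const char *lxcpath,
                           const char *name)
{
    char target[4096];
    int n;

    if (!needs_premount(getuid(), conf))
        return 0;

    n = snprintf(target, sizeof(target), "%s/%s/rootfs", lxcpath, name);
    if (n < 0 || (size_t)n >= sizeof(target))
        return -1;

    /* New mount namespace: the mount goes away when its processes exit. */
    if (unshare(CLONE_NEWNS) < 0)
        return -1;

    /* Assumption: stop mount propagation back to the host namespace. */
    if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL) < 0)
        return -1;

    /* Mount the block device while still in the initial user namespace. */
    if (mount(conf->rootfs_dev, target, conf->rootfs_fstype, 0, NULL) < 0)
        return -1;

    /* Point the container's rootfs at the already-mounted directory. */
    snprintf(conf->rootfs_path, sizeof(conf->rootfs_path), "%s", target);
    return 0;
}
```

Because the mount lives in a mount namespace owned by the starting process, it should disappear when the container goes away, even uncleanly, which matches constraint (2).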

Based on your last paragraph, I gather you're interested in coding up a patch for this, so I'll wait for that. (If that's not the case please let me know)

@yadutaf
Author

yadutaf commented May 20, 2014

I'd be tempted to keep the mounts where they currently are and, if id_map is not empty and getuid() == 0, unshare at that point and signal the parent process to set up the id_map. This way, we keep the same workflow. Obviously, if the user is not root, we don't change anything. Would that be cleaner?
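
To make the idea concrete, here is a minimal sketch of the "mount first, unshare the user namespace later, then let the parent write the id map" handshake, using a pair of pipes for signalling. It is not LXC code: `do_mounts`, `write_id_map` and `start_child_with_late_userns` are hypothetical names, the mount step is a placeholder, and the sketch only covers the case where the caller is root on the host.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Placeholder for the container's mount setup (hypothetical). */
static int do_mounts(void) { return 0; }

/* Parent side: write a mapping to /proc/<pid>/uid_map or gid_map. */
static int write_id_map(pid_t pid, const char *file, const char *mapping)
{
    char path[64];
    FILE *f;
    int ok;

    snprintf(path, sizeof(path), "/proc/%d/%s", (int)pid, file);
    f = fopen(path, "w");
    if (!f)
        return -1;
    ok = fputs(mapping, f) >= 0;
    if (fclose(f) != 0) /* the kernel reports write errors on flush */
        ok = 0;
    return ok ? 0 : -1;
}

static int start_child_with_late_userns(const char *uid_map, const char *gid_map)
{
    int to_parent[2], to_child[2], status, ret = -1;
    char c = 0;
    pid_t pid;

    if (pipe(to_parent) < 0 || pipe(to_child) < 0)
        return -1;
    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        close(to_parent[0]);
        close(to_child[1]);
        if (do_mounts() < 0)            /* mounts happen in the initial userns */
            _exit(1);
        if (unshare(CLONE_NEWUSER) < 0) /* only now enter a new userns */
            _exit(2);
        if (write(to_parent[1], &c, 1) != 1) /* ask parent for the id map */
            _exit(3);
        if (read(to_child[0], &c, 1) != 1)   /* wait until it is written */
            _exit(4);
        _exit(0); /* the rest of the container setup would continue here */
    }

    close(to_parent[1]);
    close(to_child[0]);
    if (read(to_parent[0], &c, 1) == 1 &&
        write_id_map(pid, "uid_map", uid_map) == 0 &&
        write_id_map(pid, "gid_map", gid_map) == 0 &&
        write(to_child[1], &c, 1) == 1)
        ret = 0;
    close(to_parent[0]);
    close(to_child[1]); /* on failure this unblocks the child's read */
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        ret = -1;
    return ret;
}
```

A call might look like `start_child_with_late_userns("0 100000 65536\n", "0 100000 65536\n")`, where the mapping values are purely illustrative. An unprivileged caller gains nothing from this ordering, since it cannot perform the mounts before entering a user namespace in the first place.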

I will submit a patch next week.

@hallyn
Member

hallyn commented May 20, 2014

I don't believe it is possible to do this and have it work for unprivileged users. However, if you can make it work and the patch is clean, all the better :)

@hallyn hallyn self-assigned this May 22, 2014
@hallyn
Member

hallyn commented May 22, 2014

This is needed for another driver to create qcow2-based unprivileged containers, so I am going to post a patch tonight for this. If you come up with a cleaner patch later on we'll happily take a look.

stgraber pushed a commit that referenced this issue May 25, 2014
It is not possible to mount a block device from a non-init user namespace.
Therefore if root on the host is starting a container with a uid
mapping, and the rootfs is a block device, then mount the rootfs before
we spawn the container init task.

This addresses #221

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
@yadutaf
Author

yadutaf commented May 27, 2014

I have a proof of concept on my side that works for both lxc.rootfs and lxc.mount.entry while preserving current behavior for non-root containers. The only problem is that it will most probably conflict with commit 35120d9. I'll keep you informed as soon as possible.

stgraber pushed a commit that referenced this issue Jun 4, 2014
It is not possible to mount a block device from a non-init user namespace.
Therefore if root on the host is starting a container with a uid
mapping, and the rootfs is a block device, then mount the rootfs before
we spawn the container init task.

This addresses #221

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>
@stgraber stgraber modified the milestones: lxc-1.0.5, lxc-1.0.4 Jun 13, 2014
@hallyn hallyn closed this as completed Jun 24, 2014
z-image pushed a commit to z-image/lxc that referenced this issue Oct 16, 2016
It is not possible to mount a block device from a non-init user namespace.
Therefore if root on the host is starting a container with a uid
mapping, and the rootfs is a block device, then mount the rootfs before
we spawn the container init task.

This addresses lxc#221

Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Acked-by: Stéphane Graber <stgraber@ubuntu.com>