Not able to run container with non root user #1513

erezo · 2017-07-10T08:12:28Z

Hi,

running with:

CentOS Linux release 7.2.1511 (Core)

runc version 1.0.0-rc3
commit: 5c73abb
spec: 1.0.0-rc5-dev

Docker version 17.06.0-ce, build 02c1d87

I followed the building section and installed runc in a custom directory, then created an OCI bundle and a rootless spec, and finally tried to run the rootless container, but I keep getting the following error:

../usr/local/sbin/runc --root /tmp/runc run mycontainerid

container_linux.go:265: starting container process caused "process_linux.go:250: running exec setns process for init caused "exit status 34""

I enabled user_namespace with grubby and rebooted, and the setting appears to be applied:

BOOT_IMAGE=/vmlinuz-3.10.0-327.el7.x86_64 root=UUID=4cb23129-4832-4b25-abfb-be8d63160eac ro crashkernel=auto rd.lvm.lv=vg00/lswap nomodeset rhgb quiet LANG=en_US.UTF-8 user_namespace.enable=1

Still, I cannot run a container as a non-root user. It works when I run it as root, but not as a non-root user.

Any idea?
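A quick way to confirm that a boot parameter actually reached the running kernel is to look for it as a whole word in /proc/cmdline. The following is a minimal sketch, not part of runc; the helper name has_cmdline_param is made up for this example.

```shell
# has_cmdline_param CMDLINE PARAM
# Succeeds only if PARAM appears as a complete, space-separated word in CMDLINE.
# (Hypothetical helper name, used for illustration only.)
has_cmdline_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Check the running kernel, if /proc/cmdline is readable on this system.
if [ -r /proc/cmdline ]; then
    if has_cmdline_param "$(cat /proc/cmdline)" "user_namespace.enable=1"; then
        echo "user_namespace.enable=1 is on the running kernel's command line"
    else
        echo "user_namespace.enable=1 is NOT on the running kernel's command line"
    fi
fi
```

Matching on whole words avoids a false positive from a longer value such as user_namespace.enable=10. Note that the parameter being present only shows it was passed to the kernel, not that the kernel honours it.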

tamagokun · 2017-07-21T16:56:52Z

Same here.

Based on other issues where this exit status is discussed, exit status 34 should correspond to https://github.com/opencontainers/runc/blob/master/libcontainer/nsenter/nsexec.c#L723.

Using the latest master (1.0.0-rc6-dev)

ChethanSuresh · 2017-07-28T10:01:31Z

I received a similar error with exit status 4. The first step that helped me was to run
$ strace -f runc --root /path/to/myroot run id
and see where it fails.
In my case, it was a permission-denied error when writing to the /proc/self/oom_score_adj file.
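When the strace output is long, it can help to save it to a file and filter for syscalls that failed with a permission error. This is a rough sketch under the assumption that the failure shows up as EPERM or EACCES; the function name perm_failures and the trace file name are placeholders.

```shell
# perm_failures TRACE_FILE
# Print the traced syscalls that returned -1 with EPERM or EACCES.
# (Hypothetical helper; the grep pattern relies on strace's "= -1 ERRNO" suffix.)
perm_failures() {
    grep -E '= -1 (EPERM|EACCES)' "$1"
}

# Example usage (the runc arguments are the placeholders from the comment above):
#   strace -f -o runc.trace runc --root /path/to/myroot run id
#   perm_failures runc.trace
```

The -o option makes strace write the trace to a file instead of stderr, which keeps it separate from runc's own error output.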

cyphar · 2017-08-05T03:05:04Z

> In my case, it was permission denied to write to /proc/self/oom_score_adj file.

I fixed this a while ago in 6bd4bd9.

cyphar · 2017-08-05T03:07:04Z

@erezo I believe @mrunalp may know more about this. In particular, I would not be surprised if it's caused by SELinux or some other interaction where the way we're creating the namespaces isn't the way SELinux wants it to be done.

I also believe he was complaining a few months ago that some of the nsexec changes broke on RHEL in certain cases, but I don't think we ever figured out how we can cleanly solve that issue.

rhatdan · 2017-08-05T10:34:11Z

On RHEL and CentOS platforms, I believe you have to set a sysctl to allow unprivileged user namespaces.

cyphar · 2017-08-05T11:51:13Z

@rhatdan Oh, is user_namespace.enable=1 on the kernel cmdline not enough anymore? I thought you only needed the sysctl if you didn't set the cmdline.

TomSweeneyRedHat · 2017-08-05T17:23:34Z

'''
On RHEL 7.3 and later, starting in June 2017, it's now:
grubby --default-kernel

/boot/vmlinuz-3.10.0-514.16.1.el7.x86_64
** update namespace.unpriv_enable in kernel from above **
grubby --args=namespace.unpriv_enable=1 --update-kernel /boot/vmlinuz-3.10.0-682.el7.x86_64

The sysctl command that's been added is:
sysctl user.max_user_namespaces=15076

I don't know when (if?) this has been implemented in CentOS.
'''
(Trying with auto formatting off)
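To see whether the sysctl mentioned above is in effect, one option is to read the corresponding file under /proc/sys. The sketch below assumes that a value of 0 means no user namespaces can be created; the helper name userns_limit_ok is invented for this example.

```shell
# userns_limit_ok VALUE
# Succeeds when VALUE is a positive integer, i.e. at least one
# user namespace may be created. (Hypothetical helper name.)
userns_limit_ok() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty or not a plain number
    esac
    [ "$1" -gt 0 ]
}

# user.max_user_namespaces is exposed as this file on kernels that have it.
limit_file=/proc/sys/user/max_user_namespaces
if [ -r "$limit_file" ]; then
    limit=$(cat "$limit_file")
    if userns_limit_ok "$limit"; then
        echo "user.max_user_namespaces=$limit (user namespaces allowed)"
    else
        echo "user.max_user_namespaces=$limit (user namespaces disabled)"
    fi
else
    echo "$limit_file not found; this kernel does not expose the sysctl"
fi
```

A value set with the sysctl command does not survive a reboot. Persisting it is usually done with a file under /etc/sysctl.d/, though the exact file name is up to the administrator.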

cyphar · 2017-08-05T18:25:32Z

Oh I just noticed that @erezo was trying to run inside a Docker container. Docker uses a seccomp policy to disable CLONE_NEWUSER. You'll want to write your own seccomp policy or run the container with --security-opt="seccomp=unconfined"

erezo · 2017-08-09T15:23:33Z

@cyphar I wasn't trying to run inside Docker. I just mentioned the Docker version also installed on that machine.

ChethanSuresh · 2017-08-10T08:24:48Z

> I fixed this a while ago in 6bd4bd9.

Sorry for the interruption, @cyphar (off the issue topic), but in android-3.18 the /proc/self/oom_score_adj file seems to be created with only user read permission (400). Even a simple echo 1 > /proc/self/oom_score_adj fails with permission denied for a non-root user.

Therefore, update_oom_score_adj failed.

I presumed that the change must be on the kernel side. Is my understanding right?

cyphar · 2017-08-10T08:56:50Z

@chethanmaurian Yes, that's an ABI breakage from the Android kernel fork. I guess they think it makes it more secure for some reason? Here's the commit that made the change.

cyphar · 2017-08-10T10:15:59Z

@erezo Ah okay, I was a bit confused because the Docker version isn't really relevant unless you're running inside Docker (runc doesn't use Docker, it's the other way around). I'll boot up a CentOS 7.2 VM over the weekend to figure out why it's broken.

cyphar · 2017-08-10T16:28:50Z

This looks suspiciously like one of those really fun "only CentOS is broken" bugs. In particular, it looks like on CentOS 7 you can't do an unshare(CLONE_NEWUSER|CLONE_NEWNS):

% unshare -U true
% unshare -Uuinprf true
% unshare -Um true
unshare: unshare failed: Operation not permitted
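The three manual checks above can be wrapped in a small loop so the same probe is easy to repeat on other machines. This is only a sketch; the probe function is a made-up helper, and it assumes the unshare utility from util-linux is installed.

```shell
# probe LABEL COMMAND [ARGS...]
# Run COMMAND quietly and report whether it succeeded, including any
# error text it printed. (Hypothetical helper for illustration.)
probe() {
    label=$1
    shift
    if err=$("$@" 2>&1); then
        echo "$label: ok"
    else
        echo "$label: failed${err:+ ($err)}"
        return 1
    fi
}

# Same flag combinations as in the comment above: user namespace alone,
# everything except the mount namespace, and user + mount namespace.
if command -v unshare >/dev/null 2>&1; then
    for flags in -U -Uuinprf -Um; do
        probe "unshare $flags" unshare "$flags" true || true
    done
fi
```

On a kernel with the restriction described in this thread, only the -Um line would be expected to fail, since it is the only one that requests a new mount namespace.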

@mrunalp @rhatdan Would this be an SELinux problem? I tried setenforce 0 but it looks like that doesn't change anything. I tried pulling the CentOS kernel sources, but soon realised that there's no patch information so searching would be a huge pain. I found the patch, see below.

cyphar · 2017-08-10T16:39:59Z

Nope, it's actually just a very simple patch to fs/namespace.c:

struct mnt_namespace *copy_mnt_ns(unsigned long flags, struct mnt_namespace *ns,
		struct user_namespace *user_ns, struct fs_struct *new_fs)
{
	struct mnt_namespace *new_ns;

	BUG_ON(!ns);
	get_mnt_ns(ns);

	if (!(flags & CLONE_NEWNS))
		return ns;

	/* Unprivileged creation currently disabled in RHEL7 */
	if (!capable(CAP_SYS_ADMIN)) {
		put_mnt_ns(ns);
		return ERR_PTR(-EPERM);
	}

	new_ns = dup_mnt_ns(ns, user_ns, new_fs);

	put_mnt_ns(ns);
	return new_ns;
}

Where (obviously) this is the patched part:

	/* Unprivileged creation currently disabled in RHEL7 */
	if (!capable(CAP_SYS_ADMIN)) {
		put_mnt_ns(ns);
		return ERR_PTR(-EPERM);
	}

On paper it should be possible to use runc without mount namespaces. But this is quite disappointing IMO.

rhatdan · 2017-08-14T14:30:22Z

If you setenforce 0 and something is still denied, there is a 98.9% chance it is not SELinux.

User namespaces and mounting do not mix well. The only things you are allowed to mount in a user namespace are tmpfs and bind mounts, I believe. All other mounting has to be completed by the container runtime before the user namespace is created. I know that the user namespace guys are working hard to fix this, but it is a very difficult problem.

cyphar · 2017-08-14T15:24:28Z

@rhatdan If you see this comment, I identified that it's a patch that RHEL applies to completely disable the ability to create a new mount namespace inside a user namespace. You're right it has nothing to do with SELinux, this is the patched part:

	/* Unprivileged creation currently disabled in RHEL7 */
	if (!capable(CAP_SYS_ADMIN)) {
		put_mnt_ns(ns);
		return ERR_PTR(-EPERM);
	}

I'm not exactly sure how this works with Docker, since I was under the impression that --userns-remap wouldn't work if that feature was disabled (maybe you know more about that than me). Note that in other kernels (openSUSE, Fedora, Ubuntu, Debian, etc) this works (the upstream kernel does not impose this restriction). Is there a reason for this decision? Is it possible to make this a sysctl or cmdline knob?

Omnifarious · 2018-02-27T22:33:41Z

@cyphar - Has RH given an explanation of why they apply this patch, or a way to disable it? I know some people who really want to ship a product that relies on creating a mount namespace inside a user namespace.

cyphar · 2018-02-28T03:08:58Z

I spoke to Eric Biederman about this patch when I last saw him. In short, the patch is applied because the RHEL kernel doesn't have a lot of the patches required to make mountns-inside-userns secure, so they just disable it. Newer versions of RHEL have this patch removed, I believe, though this won't help you if you're stuck on 7.4.

rhatdan · 2018-02-28T12:02:03Z

I believe this will be allowed in RHEL 7.5, which should be coming in the next couple of months.

Omnifarious · 2018-02-28T20:54:25Z

@rhatdan - The company on whose behalf I care about this is pretty large, and the feature that depends on this is a major and important one. We should probably establish contact with Red Hat about this and make sure.

Right now we're considering a lot of fairly hacky workarounds, and all of them involve elevated privilege levels for things we were hoping wouldn't need them. This company telling its customers that they have to upgrade for this feature will not be that unreasonable. They won't like it, but it won't be a disaster.

FelikZ · 2018-03-13T10:55:46Z

See moby/moby#35806

cyphar · 2018-03-13T11:25:44Z

Or more importantly, refer to https://discuss.linuxcontainers.org/t/centos-7-kernel-514-693-cannot-start-any-nodes-after-update/641/17. It looks like they added a boot parameter after I looked at the source above -- since the checks allowing it weren't present in the sources I downloaded.

The TL;DR is that on newer RHEL 7.4 releases you need to use both user_namespace.enable=1 and namespace.unpriv_enable=1.
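Since changing boot parameters is easy to get wrong, the sketch below only prints a grubby command that sets both parameters, so it can be reviewed before being run as root. The function name userns_grubby_cmd is hypothetical, and using ALL as the default target is an assumption; pass a specific kernel path (for example, the output of grubby --default-kernel) to limit the change to one boot entry.

```shell
# userns_grubby_cmd [KERNEL]
# Print (do not run) a grubby invocation that adds both boot parameters.
# KERNEL defaults to ALL, which grubby treats as "every installed kernel".
# (Hypothetical helper name, for illustration only.)
userns_grubby_cmd() {
    kernel=${1:-ALL}
    printf 'grubby --update-kernel=%s --args="%s"\n' \
        "$kernel" "user_namespace.enable=1 namespace.unpriv_enable=1"
}

# Show the command for all installed kernels.
userns_grubby_cmd
```

After running the printed command and rebooting, the parameters should be visible in /proc/cmdline.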

MichaelOVertolli · 2018-07-25T18:37:27Z

I had a similar issue on CentOS 7.5 which I localized to sysctl failing to assign user.max_user_namespaces=15076 on boot. It turned out to be an SELinux issue. Calling ausearch -m avc showed a rather weird error. The key part is:

scontext=system_u:system_r:systemd_sysctl_t:s0
tcontext=system_u:system_r:systemd_sysctl_t:s0

Anyway, apparently this is a known bug in kernel 3.10.0. The easiest fix is probably to use audit2allow. My type enforcement file looked like this:

module systemd_sysctl_t 1.0;

require {
  type systemd_sysctl_t;
  class capability sys_resource;
}

allow systemd_sysctl_t self:capability sys_resource;
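For anyone who wants to load a type enforcement file like the one above by hand, the usual sequence is to compile it, package it, and install the package. The sketch below assumes the file is saved as NAME.te in the current directory with NAME matching the module name; the function name install_te_module is invented for this example, and the final step needs root.

```shell
# install_te_module NAME
# Compile NAME.te into NAME.mod, package it as NAME.pp, and load it.
# Stops at the first step that fails. (Hypothetical helper name.)
install_te_module() {
    name=$1
    checkmodule -M -m -o "$name.mod" "$name.te" &&
        semodule_package -o "$name.pp" -m "$name.mod" &&
        semodule -i "$name.pp"
}

# Example usage, assuming the policy above was saved as systemd_sysctl_t.te:
#   install_te_module systemd_sysctl_t
```

audit2allow can also generate and package a module directly from the audit log with its -M option, which is probably what was meant by "the easiest fix" above.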

A few other details:
I modified /etc/docker/daemon.json rather than /etc/sysconfig/docker. The documentation recommends the latter, but I believe the sysconfig approach is deprecated. Consequently, systemctl status docker | grep userns didn't show anything (i.e., that command is not a valid check anymore).

I also used both user_namespace.enable=1 and namespace.unpriv_enable=1. I haven't checked if they are required, though.

cyphar · 2018-11-14T04:55:30Z

Given all of the above discussion, I'm pretty sure that this can be closed. It's possible that in future RHEL or CentOS releases there will be more out-of-tree knobs. Please open a new issue if that is the case.

erezo changed the title ~~Getting error when trying to run mycontainer example~~ Not able to run container with non root user Jul 10, 2017

cyphar mentioned this issue Mar 5, 2018

Got starting container process caused "process_linux.go:301: running exec setns process for init caused \"exit status 40\"": unknown. from time to time #1740

Closed

jamiejackson mentioned this issue Apr 5, 2018

Namespace problem in CentOS 7.4 haxorof/ansible-role-docker-ce#32

Closed

lukasheinrich mentioned this issue Apr 6, 2018

COPY fails on some base-images (cern/cc7-base) cyphar/orca-build#19

Open

ChethanSuresh mentioned this issue Jul 26, 2018

problem following README instruction to run rootless container #1850

Closed

cyphar closed this as completed Nov 14, 2018
