Having non-root privileges on the host and root inside the container #2918

Closed
pwaller opened this Issue Nov 27, 2013 · 11 comments

@pwaller
Contributor
pwaller commented Nov 27, 2013

This issue is for discussion of one of the two halves of #1034 ("taking advantage of CONFIG_USER_NS").

With user namespaces, a container's processes could run with ordinary user privileges on the host while appearing to have root privileges inside the container.

xref #503 (comment)

@fermayo
Member
fermayo commented Dec 12, 2013

+1

Does anyone have at least an idea of the modifications required for docker to work with the lxc.id_map parameter?

@creack
Contributor
creack commented Jan 9, 2014

IIRC, user namespaces are not properly supported in "older" kernels (pre 3.10?). /cc @jpetazzo

@dineshs-altiscale
Contributor

Latest kernel from last weekend (3.14.1) is needed for this to work properly. I was hitting a kernel regression with mounting proc earlier, which is now fixed.

With that latest kernel, some manual configuration and quick/dirty changes (mostly) to container.start, I am able to run bash as root in the container but UID 100000 on my CentOS 6.5 host.

My usage model is as follows.

A set of UID ranges from the host are mapped into the container through a new option to docker run.

docker run -uidmap <container UID>:<host UID>:<range> -i -t ubuntu bash

The process will then run with a virtual UID as specified by -u (or default). The UID must be a part of the virtual UID space passed to docker run.
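
The mapping rule described above can be sketched in a few lines of Python. This is only an illustration, not the code from the patch: the names `IdMap`, `parse_uidmap` and `to_host_uid` are hypothetical, and the range of 65536 is an assumed value (the comment only mentions host UID 100000).

```python
from typing import NamedTuple, Optional


class IdMap(NamedTuple):
    container_start: int  # first UID as seen inside the container
    host_start: int       # first UID as seen on the host
    length: int           # number of consecutive UIDs in the range


def parse_uidmap(spec: str) -> IdMap:
    """Parse '<container UID>:<host UID>:<range>' into an IdMap."""
    container_start, host_start, length = (int(part) for part in spec.split(":"))
    if container_start < 0 or host_start < 0 or length <= 0:
        raise ValueError(f"invalid uidmap: {spec!r}")
    return IdMap(container_start, host_start, length)


def to_host_uid(mapping: IdMap, container_uid: int) -> Optional[int]:
    """Return the host UID for container_uid, or None if it is outside the map."""
    offset = container_uid - mapping.container_start
    if 0 <= offset < mapping.length:
        return mapping.host_start + offset
    return None


# Assumed example values: container root mapped to host UID 100000.
mapping = parse_uidmap("0:100000:65536")
print(to_host_uid(mapping, 0))      # 100000 (root in the container)
print(to_host_uid(mapping, 1000))   # 101000 (an ordinary container user)
print(to_host_uid(mapping, 70000))  # None (outside the mapped range)
```

A UID given with -u that falls outside the range has no host counterpart, which is why it must be part of the mapped space.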

The key change is to translate UIDs and GIDs on all files in the container to their virtual values and adjust permissions of a few global directories so that the process won't stumble on missing root privileges on the host.

Traversing the entire container file system does take some time. As an optimization, it's worth committing those changes back to the image and adding a flag so that UID translation doesn't have to be repeated the next time.
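
For readers who want a feel for the traversal, here is a rough Python sketch. It assumes that "translating" means shifting each file's on-disk owner and group by the host offset of the map, and that one offset is used for both UIDs and GIDs; the actual patch may work differently. `iter_tree` and `shift_ownership` are hypothetical names, and the `chown` parameter exists only so the walk can be tried as a dry run without root.

```python
import os
import tempfile
from typing import Callable, Iterator


def iter_tree(root: str) -> Iterator[str]:
    """Yield root and every path beneath it, without following symlinks."""
    yield root
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            yield os.path.join(dirpath, name)


def shift_ownership(
    root: str,
    offset: int,
    length: int,
    chown: Callable[[str, int, int], None] = os.lchown,
) -> int:
    """Shift the owner UID and GID of each path under root by offset.

    IDs at or above `length` are left alone, which also avoids shifting
    a tree twice when offset >= length. Returns the number of paths changed.
    """
    changed = 0
    for path in iter_tree(root):
        st = os.lstat(path)  # lstat: inspect symlinks themselves, not their targets
        if st.st_uid < length and st.st_gid < length:
            chown(path, st.st_uid + offset, st.st_gid + offset)
            changed += 1
    return changed


if __name__ == "__main__":
    # Dry run on a throwaway tree: print what would be changed instead of chowning.
    with tempfile.TemporaryDirectory() as rootfs:
        os.makedirs(os.path.join(rootfs, "etc"))
        open(os.path.join(rootfs, "etc", "passwd"), "w").close()
        count = shift_ownership(
            rootfs, 100000, 2**31,
            chown=lambda path, uid, gid: print(f"chown {uid}:{gid} {path}"),
        )
        print(f"{count} paths would be changed")
```

Running it with the default `os.lchown` needs root on the host, and every file in the image is touched once, which is where the traversal cost comes from.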

@dineshs-altiscale dineshs-altiscale added a commit to dineshs-altiscale/docker that referenced this issue Mar 11, 2014
@dineshs-altiscale dineshs-altiscale Support for user namespaces
This exposes UID namespace support.  A new command line option (--uidmap)
maps a set of virtual UIDs to which the application within the container
is confined.  The application could potentially be the root in the
container but unprivileged on the host.

Addresses issue #2918

Docker-DCO-1.1-Signed-off-by: Dinesh Subhraveti <dineshs@altiscale.com> (github: dineshs-altiscale)
b4a363b
@dineshs-altiscale dineshs-altiscale added five more commits to dineshs-altiscale/docker that referenced this issue, each with the same "Support for user namespaces" message as above:
53c482b (Mar 19, 2014)
f14d3df (Mar 19, 2014)
0ca1124 (Mar 22, 2014)
858f5d2 (Mar 22, 2014)
104725d (Mar 23, 2014)
@nearsemiring

Any update on this? A namespace change isn't going to help users stuck on legacy hosts. Speed isn't an issue. It could run 100x slower and still be useful.

@jessfraz jessfraz added the feature label Feb 25, 2015
@jessfraz
Contributor

duplicate of #7906

@jessfraz jessfraz closed this Feb 26, 2015
@tphyahoo

Could this have label project/security added?

@damienmg
damienmg commented Feb 2, 2016

I don't see how this is a duplicate of #7906, though there is some connection.

I am trying to do exactly that: run a docker daemon inside a namespace sandbox, and it does not work. With docker 10rc2, I get the following log:

INFO[0000] Graph migration to content-addressability took 0.00 seconds 
WARN[0000] Running modprobe bridge br_netfilter failed with message: modprobe: ERROR: could not insert 'bridge': Operation not permitted
modprobe: WARNING: Module br_netfilter not found.
insmod /lib/modules/3.13.0-74-generic/kernel/net/llc/llc.ko 
, error: exit status 1 
WARN[0000] Running modprobe nf_nat failed with message: `modprobe: ERROR: could not insert 'nf_nat': Operation not permitted
insmod /lib/modules/3.13.0-74-generic/kernel/net/netfilter/nf_conntrack.ko`, error: exit status 1 
INFO[0000] Default bridge (docker0) is assigned with an IP address 172.17.0.0/16. Daemon option --bip can be used to set a preferred IP address 
WARN[0000] Your kernel does not support cgroup memory limit: mountpoint for memory not found 
WARN[0000] mountpoint for cpu not found                 
WARN[0000] mountpoint for blkio not found               
WARN[0000] mountpoint for cpuset not found              
FATA[0000] Error starting daemon: Devices cgroup isn't mounted 

You can obviously run a docker daemon inside a docker container in privileged mode, but that does not help in running the docker daemon without requiring root privileges.

If that bug is really resolved, is there any description somewhere on how to do it?

@PAStheLoD

@damienmg What happens if you mount a cgroups hierarchy before starting docker?

@damienmg
damienmg commented Feb 2, 2016

Same thing, whether I do the mount inside or outside the namespace sandbox.

@bittner
bittner commented Jun 16, 2016

Sorry in advance for a maybe stupid question:

From what I understand reading this discussion and #15187, with the advent of user namespaces in Docker 1.9 the problem of having files owned by root on the host file system should be a thing of the past.

Files still owned by root on a mounted volume

Now, I run docker-engine 1.11.2 and docker-compose 1.7.1. And the files that the Docker container of my (Python) application generates in the mounted volume of the container (e.g. .pyc files) are still all owned by root on my host system. Which is a bit of a pain when I want to clean up the project and have the intermediate build files, etc. removed from the project. (I have to use sudo to first chown them or delete them using sudo straight away. Feels a bit wacky.)

Is this a different issue, or am I missing something?

@justincormack
Member

@bittner have you enabled user namespaces? They are not enabled by default. However, they do not yet fix all the issues with ownership; there is still work going on upstream in the kernel to make a better solution.
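
One way to check whether a process is really running under a remapped user namespace is to look at /proc/self/uid_map from inside the container. That file is a Linux kernel interface not mentioned in this thread; each line holds an ID inside the namespace, the corresponding ID outside it, and a range length. The Python sketch below parses that format; `parse_id_map`, `is_remapped` and `current_uid_map` are hypothetical helper names, and the sample values are illustrative.

```python
from typing import List, Tuple

IdRange = Tuple[int, int, int]  # (ID inside namespace, ID outside, length)


def parse_id_map(text: str) -> List[IdRange]:
    """Parse the contents of a uid_map or gid_map file into ranges."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 3:
            inside, outside, count = (int(field) for field in fields)
            entries.append((inside, outside, count))
    return entries


def is_remapped(entries: List[IdRange]) -> bool:
    """True when any range maps to a different ID outside the namespace."""
    return any(inside != outside for inside, outside, _ in entries)


def current_uid_map(path: str = "/proc/self/uid_map") -> List[IdRange]:
    """Read the UID map of the calling process (Linux only)."""
    with open(path) as handle:
        return parse_id_map(handle.read())


# What the initial (host) namespace typically shows: an identity map.
print(is_remapped(parse_id_map("         0          0 4294967295\n")))  # False
# Illustrative remapped container: root inside is UID 100000 outside.
print(is_remapped(parse_id_map("         0     100000      65536\n")))  # True
```

If the map is the identity map, root in the container is still root on the host, and files written to a mounted volume will be owned by root there.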
