Allow running dockerd as a non-root user (Rootless mode) #38050

Open: AkihiroSuda wants to merge 1 commit into master from AkihiroSuda:rootless

@AkihiroSuda
Member

AkihiroSuda commented Oct 16, 2018

- What I did

Allow running dockerd in an unprivileged user namespace (rootless mode).
Close #37375

No SETUID/SETCAP binary is required, except newuidmap and newgidmap.

For Kubernetes integration, please refer to https://github.com/rootless-containers/usernetes .

This PR contains two commits, but the first one is the same as #38038 (overlayfs in userns for Ubuntu).
I'll rebase this PR when #38038 gets merged.
(Updated: #38083 is merged now)

- How I did it

By using user_namespaces(7), mount_namespaces(7), network_namespaces(7), and slirp4netns.

Please refer to docs/rootless.md for the details.

- How to verify it

  • Make sure /etc/subuid and /etc/subgid contain an entry for your user
$ id -u
1001
$ whoami
penguin
$ grep ^$(whoami): /etc/subuid
penguin:231072:65536
$ grep ^$(whoami): /etc/subgid
penguin:231072:65536
  • Start daemon: dockerd-rootless.sh --experimental
  • Start client: docker -H unix://$XDG_RUNTIME_DIR/docker.sock run ...
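The subuid/subgid check in the first step can be sketched as a small shell function. This is only an illustration: `has_subid_entry` is a hypothetical helper that is not part of this PR, and it assumes entries are keyed by login name in the `name:start:count` form shown above.

```shell
# has_subid_entry USER FILE
# Succeeds if FILE contains a "USER:start:count" line with numeric fields.
# Hypothetical helper for illustration; not part of dockerd-rootless.sh.
has_subid_entry() {
    [ -r "$2" ] || return 1
    awk -F: -v u="$1" '
        $1 == u && $2 ~ /^[0-9]+$/ && $3 ~ /^[0-9]+$/ { found = 1 }
        END { exit !found }
    ' "$2"
}

# Report on both files for the current user.
for f in /etc/subuid /etc/subgid; do
    if has_subid_entry "$(id -un)" "$f"; then
        echo "$f: entry found for $(id -un)"
    else
        echo "$f: no entry for $(id -un)"
    fi
done
```

Matching the first field exactly with awk avoids false positives for users whose names share a prefix (for example `penguin` and `penguin0`).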

Remarks:

  • Some distros such as Debian (excluding Ubuntu) and Arch Linux require sudo sh -c "echo 1 > /proc/sys/kernel/unprivileged_userns_clone".
  • Some distros require sudo modprobe ip_tables iptable_mangle iptable_nat iptable_filter.
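As a rough sketch, the first remark could be checked before starting the daemon. `userns_clone_enabled` is a hypothetical helper, and it assumes that a missing `unprivileged_userns_clone` file means the distro does not use this knob at all. That does not prove user namespaces are usable, since other limits may still apply.

```shell
# userns_clone_enabled [FILE]
# Succeeds if the distro-specific knob is absent or set to 1.
# Hypothetical helper for illustration; FILE defaults to the path from the remark.
userns_clone_enabled() {
    knob=${1:-/proc/sys/kernel/unprivileged_userns_clone}
    # Assumption: kernels without this knob do not gate userns on it.
    [ -e "$knob" ] || return 0
    [ "$(cat "$knob")" = "1" ]
}

if userns_clone_enabled; then
    echo "unprivileged_userns_clone: OK (enabled or not present)"
else
    echo "unprivileged_userns_clone is 0; see the remark above" >&2
fi
```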

Restrictions:

  • Only the vfs graphdriver is supported. However, on Ubuntu and a few other distros, overlay2 and overlay are also supported. Starting with Linux 4.18, we will also be able to implement FUSE snapshotters.
  • Cgroups (including docker top) and AppArmor are disabled at the moment. In the future, cgroups will be optionally available when delegation permission is configured on the host.
  • Checkpoint is not supported at the moment.
  • Running rootless dockerd in rootless/rootful dockerd is also possible, but not fully tested.

- Description for the changelog

Allow running dockerd in an unprivileged user namespace (rootless mode)

- A picture of a cute animal (not mandatory but encouraged)

penguin

https://en.wikipedia.org/wiki/Little_penguin#/media/File:Eudyptula_minor_Bruny_1.jpg

Screenshot:

penguin0@suda-ws01:~$ id
uid=1002(penguin0) gid=1006(penguin0) groups=1006(penguin0)
penguin0@suda-ws01:~$ ps u
USER        PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
penguin0 122952  0.0  0.0  21484  5156 pts/3    Ss   16:58   0:00 /bin/bash -l
penguin0 123093  0.0  0.0  21484  5200 pts/4    Ss   16:58   0:00 /bin/bash -l
penguin0 123094  0.0  0.0 134792  2860 pts/4    S    16:58   0:00 (sd-pam)
penguin0 123252  0.0  0.0   4628   784 pts/4    S+   16:58   0:00 /bin/sh /usr/local/bin/dockerd-rootless.sh --experimental
penguin0 123253  0.0  0.0 105772  3696 pts/4    Sl+  16:58   0:00 rootlesskit --net=slirp4netns --mtu=65520 --copy-up=/etc --copy-up=/run /usr/local/bin/dockerd-rootless.sh --experimental
penguin0 123257  0.0  0.0 105516  4024 pts/4    Sl+  16:58   0:00 /proc/self/exe --net=slirp4netns --mtu=65520 --copy-up=/etc --copy-up=/run /usr/local/bin/dockerd-rootless.sh --experimental
penguin0 123265  0.0  0.0   2980  1072 pts/4    S+   16:58   0:00 slirp4netns --mtu 65520 123257 tap0
penguin0 123281  0.0  0.0   4628   828 pts/4    S+   16:58   0:00 /bin/sh /usr/local/bin/dockerd-rootless.sh --experimental
penguin0 123283  0.6  0.8 583536 65728 pts/4    Sl+  16:58   0:00 dockerd --experimental
penguin0 125126  0.0  0.0  38372  3688 pts/3    R+   17:00   0:00 ps u
penguin0@suda-ws01:~$ docker -H unix:///run/user/1002/docker.sock run --rm hello-world

Hello from Docker!
This message shows that your installation appears to be working correctly.

To generate this message, Docker took the following steps:
 1. The Docker client contacted the Docker daemon.
 2. The Docker daemon pulled the "hello-world" image from the Docker Hub.
    (amd64)
 3. The Docker daemon created a new container from that image which runs the
    executable that produces the output you are currently reading.
 4. The Docker daemon streamed that output to the Docker client, which sent it
    to your terminal.

To try something more ambitious, you can run an Ubuntu container with:
 $ docker run -it ubuntu bash

Share images, automate workflows, and more with a free Docker ID:
 https://hub.docker.com/

For more examples and ideas, visit:
 https://docs.docker.com/get-started/

@AkihiroSuda AkihiroSuda requested a review from tianon as a code owner Oct 16, 2018

@AkihiroSuda AkihiroSuda referenced this pull request Oct 16, 2018

Open

[UMBRELLA] patch submission status #42

3 of 6 tasks complete

@vdemeester vdemeester requested review from thaJeztah , cpuguy83 and dmcgowan Oct 16, 2018

@AkihiroSuda

Member

AkihiroSuda commented Oct 16, 2018

@AkihiroSuda AkihiroSuda force-pushed the AkihiroSuda:rootless branch 2 times, most recently from 483ab2e to e183cfb Oct 16, 2018

@AkihiroSuda AkihiroSuda force-pushed the AkihiroSuda:rootless branch from e183cfb to 21cc7a8 Oct 16, 2018

@AkihiroSuda AkihiroSuda changed the title Rootless mode Allow running dockerd as a non-root user (Rootless mode) Oct 16, 2018

@AkihiroSuda AkihiroSuda force-pushed the AkihiroSuda:rootless branch 2 times, most recently from abb3322 to 79c8968 Oct 16, 2018

@codecov


codecov bot commented Oct 16, 2018

Codecov Report

❗️ No coverage uploaded for pull request base (master@5ec3138).
The diff coverage is 19.35%.

@@            Coverage Diff            @@
##             master   #38050   +/-   ##
=========================================
  Coverage          ?   36.59%           
=========================================
  Files             ?      608           
  Lines             ?    45304           
  Branches          ?        0           
=========================================
  Hits              ?    16578           
  Misses            ?    26435           
  Partials          ?     2291
@sargun

Contributor

sargun commented Oct 17, 2018

How can you delegate cgroups? A piece of work prior to this might be supporting cgroup namespace?

@AkihiroSuda

Member

AkihiroSuda commented Oct 17, 2018

How can you delegate cgroups? A piece of work prior to this might be supporting cgroup namespace?

Cgroup delegation is disabled in this PR and is likely to be a separate PR in the future.

Until we can get full cgroups v2 support in runc (blocked due to lack of freezer and device subsystems, see opencontainers/runc#654), we would need to use pam_cgfs, although it is unlikely to be available on Red Hat distros: containers/libpod#1429

@thaJeztah
Member

thaJeztah left a comment

Not too familiar with all the requirements to make this work, but had a quick glance over, and left some comments/suggestions 🤗

Resolved review threads (all outdated):

  • Dockerfile
  • cmd/dockerd/config_common_unix.go
  • cmd/dockerd/daemon_unix.go (2 threads)
  • contrib/dockerd-rootless.sh
  • pkg/archive/archive_linux.go (5 threads)

@thaJeztah thaJeztah added this to backlog in maintainers-session Oct 17, 2018

@AkihiroSuda AkihiroSuda force-pushed the AkihiroSuda:rootless branch from 79c8968 to 76a6d66 Oct 19, 2018

@AkihiroSuda

Member

AkihiroSuda commented Oct 19, 2018

addressed comments

@derek derek bot added the rebuild/z label Jan 10, 2019

@olljanat

Contributor

olljanat commented Jan 10, 2019

I'm OK with this change as it passes all existing tests, but it would be a good idea to also add rootless-mode CI, or include it as part of the current experimental build server.

@AkihiroSuda

Member

AkihiroSuda commented Jan 11, 2019

@thaJeztah Can we add rootless to Jenkins?

@AkihiroSuda

Member

AkihiroSuda commented Jan 11, 2019

Rootless mode could be tested with DOCKER_REMOTE_DAEMON=1 DOCKER_TEST_HOST=unix:///run/user/1001/docker.sock make test-integration. (This currently seems broken: it appears to test against the local daemon only.)
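To avoid hard-coding the UID in the socket path, the host URL could be derived from the environment. A minimal sketch, assuming the socket lives at `$XDG_RUNTIME_DIR/docker.sock` as in the PR description; `rootless_docker_host` is a hypothetical helper, and the `/run/user/<uid>` fallback is an assumption based on the usual systemd layout.

```shell
# rootless_docker_host
# Prints the unix:// URL of the rootless daemon socket for the current user.
# Hypothetical helper; falls back to /run/user/<uid> when XDG_RUNTIME_DIR is unset.
rootless_docker_host() {
    runtime_dir=${XDG_RUNTIME_DIR:-/run/user/$(id -u)}
    echo "unix://${runtime_dir}/docker.sock"
}

# Show the value that would be passed as DOCKER_TEST_HOST or to `docker -H`.
echo "DOCKER_TEST_HOST=$(rootless_docker_host)"
```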

We would need to make sure cgroup tests are skipped on rootless mode in follow-up PRs.

@AkihiroSuda

Member

AkihiroSuda commented Jan 14, 2019

Resolved review thread (outdated): cmd/dockerd/daemon.go

@thaJeztah
Member

thaJeztah left a comment

Some concerns were raised about the slirp4netns binary being GPL-licensed; not sure if that has consequences (so let's hold off on merging until we've checked whether that's a problem).

@thaJeztah

Member

thaJeztah commented Jan 16, 2019

FWIW there's one file already in this repo that's using that license;

# Distributed under the terms of the GNU General Public License v2

@justincormack

Contributor

justincormack commented Jan 16, 2019

I think slirp4netns should be packaged separately; mixing licensing in one package is not a good idea, and the source must be available, so downloading from the network is not acceptable. It would be a good idea to get distros to package it if we are going to use it.

@thaJeztah

Member

thaJeztah commented Jan 16, 2019

From the README (https://github.com/rootless-containers/slirp4netns), it looks like it's already packaged on some distros:

  • RHEL 8 & Fedora (28 or later)
  • Arch Linux
  • openSUSE Tumbleweed
  • openSUSE Leap 15.0
  • SUSE Linux Enterprise 15
  • Debian GNU/Linux Sid
@thaJeztah

Member

thaJeztah commented Jan 16, 2019

But we'd need a script to install it for CI (in this repo), detection of whether it's installed (so that we give a proper error if the feature is not available), and perhaps a convenience script for those who install from the static packages 🤔 (thinking out loud)

@AkihiroSuda

Member

AkihiroSuda commented Jan 16, 2019

Is it better to use VPNKit instead by default?

Concerns:

@tonistiigi

Member

tonistiigi commented Jan 16, 2019

mixing licensing in one package is not a good idea

There is no plan atm to start including slirp4netns in regular packages.

looks like it's already packaged on some distros

With the current implementation, which provides scripts that should be run as an unprivileged user (instead of dropping to rootless from root), this doesn't solve the main use cases. The idea discussed offline was only to provide an extra tarball in https://download.docker.com/linux/static/nightly/x86_64/ (e.g. docker-contrib-0.0.0-xxx-xxx.tgz) with the extra binaries/gpl-stub that rootless depends on. This makes it possible to have an install script that can be run without ever needing sudo (on some systems at least). If you can install packages, you can just install regular docker.

Is it better to use VPNKit instead by default?

If we can't figure this out, then I think that's the best option. The launcher script can still use slirp4netns as the default if it can be found on the system. Or the rootless install script can pull it from https://github.com/rootless-containers/slirp4netns/releases .
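A rough sketch of what such a launcher-side default could look like, assuming both binaries are looked up on PATH under their upstream names. `pick_net_driver` is a hypothetical function, not something this PR ships, and it also gives the explicit error mentioned earlier when neither binary is available.

```shell
# pick_net_driver
# Prints "slirp4netns" if it is on PATH, otherwise "vpnkit" if that is on PATH.
# Fails with a message on stderr when neither is installed.
# Hypothetical helper for illustration only.
pick_net_driver() {
    for candidate in slirp4netns vpnkit; do
        if command -v "$candidate" >/dev/null 2>&1; then
            echo "$candidate"
            return 0
        fi
    done
    echo "rootless mode needs slirp4netns or vpnkit in PATH" >&2
    return 1
}

if driver=$(pick_net_driver); then
    echo "network driver: $driver"
fi
```

The printed name could then be handed to the launcher's network option (the `ps u` output in the description shows `rootlesskit --net=slirp4netns`).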

Low throughput:

When I tested it, the throughput was much lower than in your stats. I can test again with new MTU config. I think the numbers you provided are acceptable.

Compiling VPNkit takes a lot of time. Can we use prebuilt binary?

@djs55 Do you know where we could get that without slowing down the moby build?

@djs55


djs55 commented Jan 16, 2019

I set up a hub autobuilder for moby/vpnkit as an experiment: https://cloud.docker.com/repository/docker/djs55/vpnkit . The image contains a statically linked binary:

$ docker run -it djs55/vpnkit /vpnkit --help
NAME
       vpnkit - proxy TCP/IP connections from an ethernet link via sockets

SYNOPSIS
       vpnkit [OPTION]...
...

We've not tagged a release of moby/vpnkit in a while -- perhaps we should do that soon.

@cpuguy83

Contributor

cpuguy83 commented Jan 16, 2019

re: vpnkit performance, has this been built with go1.11? io.Copy(tcpConn, tcpConn) should be much faster in go1.11.

@AkihiroSuda

Member

AkihiroSuda commented Jan 17, 2019

djs55/vpnkit seems amd64-only?

re: vpnkit performance, has this been built with go1.11? io.Copy(tcpConn, tcpConn) should be much faster in go1.11.

Thanks, will try to run benchmark with Go 1.11

@AkihiroSuda

Member

AkihiroSuda commented Jan 17, 2019

Uh, there is no io.Copy, only io.Reader.Read; *vmnet.Vif.Write and *vmnet.Vif.Read; io.Writer.Write

https://github.com/rootless-containers/rootlesskit/blob/325f47b88ce7a68fbde9827d0f6205f2f9070e79/pkg/network/vpnkit/vpnkit.go#L186-L209

@cpuguy83

Contributor

cpuguy83 commented Jan 17, 2019

Ah I was looking at libproxy in vpnkit and assumed that was what was being used :(

Well, if these are TCP conns being proxied there, io.Copy will use splice(2) on Linux instead of a user-space copy.

@AkihiroSuda

Member

AkihiroSuda commented Jan 18, 2019

Updated to use prebuilt djs55/vpnkit binary.

Support for non-amd64 and slirp4netns can be discussed in follow-up PR series.

allow running `dockerd` in an unprivileged user namespace (rootless mode)

Please refer to `docs/rootless.md`.

TLDR:
 * Make sure `/etc/subuid` and `/etc/subgid` contain the entry for you
 * `dockerd-rootless.sh --experimental`
 * `docker -H unix://$XDG_RUNTIME_DIR/docker.sock run ...`

Signed-off-by: Akihiro Suda <suda.akihiro@lab.ntt.co.jp>

@AkihiroSuda AkihiroSuda force-pushed the AkihiroSuda:rootless branch from 4c22b85 to 6cac613 Jan 18, 2019

@@ -233,6 +238,10 @@ RUN cd /docker-py \
&& pip install paramiko==2.4.2 \
&& pip install yamllint==1.5.0 \
&& pip install -r test-requirements.txt
COPY --from=rootlesskit /build/ /usr/local/bin/
# VPNKit git b4c8b69e68f74c69a6e2fff696a3a196b061dde6 (1/5/2019)
# FIXME: currently, this always install amd64 binary

@thaJeztah

thaJeztah Jan 18, 2019

Member

Wonder if we can build from source; there's a Dockerfile in the repo to build vpnkit (https://github.com/moby/vpnkit/blob/master/Dockerfile), but not sure we should copy that (perhaps the steps from the Dockerfile could be moved into the Makefile?). @djs55, think that would work?

@AkihiroSuda

AkihiroSuda Jan 18, 2019

Member

It requires more than 10 minutes...

Any chance to get non-amd64 prebuilt binaries?
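Until non-amd64 binaries exist, the install step could at least refuse to copy the amd64 binary onto other architectures. A minimal sketch, assuming the prebuilt image is amd64-only as noted earlier in this thread; `vpnkit_prebuilt_supported` is a hypothetical helper, and the machine names are the common `uname -m` spellings.

```shell
# vpnkit_prebuilt_supported [MACHINE]
# Succeeds only when MACHINE (default: `uname -m`) is an amd64 spelling.
# Hypothetical guard for illustration; not part of the Dockerfile in this PR.
vpnkit_prebuilt_supported() {
    machine=${1:-$(uname -m)}
    case "$machine" in
        x86_64|amd64) return 0 ;;
        *) return 1 ;;
    esac
}

if vpnkit_prebuilt_supported; then
    echo "prebuilt vpnkit binary can be used on $(uname -m)"
else
    echo "no prebuilt vpnkit for $(uname -m); skipping" >&2
fi
```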
