Can't bind to privileged ports as non-root #8460

Open
tianon opened this Issue Oct 7, 2014 · 44 comments

@tianon
Member

tianon commented Oct 7, 2014

The simplest way to reproduce this is:

$ docker run --rm -u 1000 php:apache
...
(13)Permission denied: AH00072: make_sock: could not bind to address [::]:80
(13)Permission denied: AH00072: make_sock: could not bind to address 0.0.0.0:80
...

So, that led me to try this, but with the same result:

$ docker run --rm -u 1000 --cap-add NET_BIND_SERVICE php:apache
...
(13)Permission denied: AH00072: make_sock: could not bind to address [::]:80
(13)Permission denied: AH00072: make_sock: could not bind to address 0.0.0.0:80
...

IMO, it seems reasonable to allow non-root to bind to privileged ports inside the container, especially since they have a private net namespace, so I was actually surprised this wasn't already taken care of. I'm also confused as to why the --cap-add didn't work, but maybe that's because it adds the cap to the whitelist of things not to remove, rather than adding it if it isn't already there? I'm grasping at straws here.

@tianon

Member

tianon commented Oct 7, 2014

Maybe I've even just got the wrong CAP I'm trying to add.

@tianon

Member

tianon commented Oct 7, 2014

Essentially we'd love to drop the User/Group from the Apache config in that image, and if we could bind on port 80 as non-root, we could just use USER directly to do that, which would be magical and let users arbitrarily change the exact UID they run Apache as to make sure it can access any files they're sharing from another container, for example.

@jpetazzo

Contributor

jpetazzo commented Oct 7, 2014

+1 (you got the right cap, but the cap also has to be granted to the process I believe)

@crosbymichael

Member

crosbymichael commented Oct 7, 2014

Do you have anything else like apparmor or selinux?

@tianon

Member

tianon commented Oct 7, 2014

Definitely not here - those things make life hard. 😄

@jakajancar

jakajancar commented Oct 21, 2014

I don't mind if only root is allowed to bind to privileged ports (to mimic containerless environments), but our deployment approach uses authbind to give the permission to other users. Unfortunately, authbind doesn't seem to work in Docker (I always get Permission denied when attempting to bind regardless).

@allingeek

Contributor

allingeek commented Feb 1, 2015

An even simpler example:

$ sudo docker run --rm -u nobody --cap-add net_bind_service ubuntu:latest nc -l 0.0.0.0 80
nc: Permission denied

Furthermore, capsh indicates that cap_net_bind_service is included:

$ docker run --rm -u nobody --cap-add net_bind_service ubuntu:latest capsh --print
Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap+i
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap,37
Securebits: 00/0x0/1'b0
 secure-noroot: no (unlocked)
 secure-no-suid-fixup: no (unlocked)
 secure-keep-caps: no (unlocked)
uid=65534(nobody)
gid=65534(nogroup)
groups=

Running on boot2docker:

$ uname -a
Linux boot2docker 3.16.7-tinycore64 #1 SMP Tue Dec 16 23:03:39 UTC 2014 x86_64 GNU/Linux

Versions:

$ docker version
Client version: 1.4.1
Client API version: 1.16
Go version (client): go1.3.3
Git commit (client): 5bc2ff8
OS/Arch (client): darwin/amd64
Server version: 1.4.1
Server API version: 1.16
Go version (server): go1.3.3
Git commit (server): 5bc2ff8

What is the guidance? Is this supposed to work?
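
A note on reading the capsh output above: the `+i` suffix on the `Current:` line means the listed capabilities are raised only in the inheritable set, whereas binding a low port depends on the effective set. The following is a minimal sketch, not an official tool, for testing whether a capability bit is set in a hex mask of the kind shown on the `CapEff` line of `/proc/<pid>/status`. The helper name `cap_in_mask` is hypothetical. The mask `00000000a80425fb` is an assumption derived by adding up the 14 capabilities listed above, and bit 10 is the number assigned to CAP_NET_BIND_SERVICE in the kernel headers.

```shell
#!/bin/sh
# cap_in_mask MASK BIT: print "yes" if capability number BIT is set in the
# hex MASK (the format used by the CapInh/CapPrm/CapEff lines in
# /proc/<pid>/status), otherwise print "no".
# cap_in_mask is a hypothetical helper name, not part of Docker or libcap.
cap_in_mask() {
    if [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]; then
        echo yes
    else
        echo no
    fi
}

# CAP_NET_BIND_SERVICE is capability number 10.
NET_BIND_SERVICE=10

# Mask formed by the 14 capabilities in the capsh list above (assumed value).
cap_in_mask 00000000a80425fb "$NET_BIND_SERVICE"

# An all-zero mask, i.e. an empty effective set.
cap_in_mask 0000000000000000 "$NET_BIND_SERVICE"
```

Inside a container, the live value to feed into the helper can be read with `grep CapEff /proc/self/status`.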

@cpuguy83

Contributor

cpuguy83 commented Feb 2, 2015

This fails with --privileged as well, fyi.

spf13 added kind/bug and removed bug labels Mar 21, 2015

@mrunalp

Contributor

mrunalp commented Mar 27, 2015

The issue here is that the process's effective capabilities don't have
cap_net_bind_service when the user is not root unless the file you are
trying to execute has those capabilities bits set.

Example to get it working for nc:

[root@localhost ~]# docker run -it --rm -u 1000 ubuntu /bin/nc.openbsd -l 0.0.0.0 80
nc.openbsd: Permission denied

# Run an image as root, setcap the necessary file and commit it.
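[root@localhost ~]# docker run -it ubuntu bash
root@caa42897b59c:/#
root@caa42897b59c:/# setcap 'cap_net_bind_service=+ep' /bin/nc.openbsd
root@caa42897b59c:/# getcap  /bin/nc.openbsd
/bin/nc.openbsd = cap_net_bind_service+ep
root@caa42897b59c:/# exit
exit
# Commit the stopped container as a new image (no --rm above, so it still exists).
[root@localhost ~]# docker commit caa42897b59c ubuntu-user-nc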

# Run the new image as non-root
[root@localhost ~]# docker run -it --rm -u 1000 ubuntu-user-nc /bin/nc.openbsd -l 0.0.0.0 80

works!
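
To make the explanation above concrete, here is a rough sketch of the rules that capabilities(7) gives for how a non-root process's capability sets change across execve(). It is a simplified model in shell arithmetic, not kernel code. The function name `exec_caps` and its argument order are hypothetical, and set-user-ID/set-group-ID files are not modelled.

```shell
#!/bin/sh
# exec_caps P_INH P_BND P_AMB F_INH F_PRM F_EFF
# P_* are the process's inheritable, bounding and ambient sets before exec;
# F_* are the file's inheritable and permitted sets (hex masks such as 0x400)
# and the file's effective flag (0 or 1).
# Prints the permitted and effective sets after exec for a NON-root process.
# exec_caps is a hypothetical name used only for this illustration.
exec_caps() {
    p_inh=$1; p_bnd=$2; p_amb=$3; f_inh=$4; f_prm=$5; f_eff=$6

    # Ambient capabilities are dropped when the file carries capabilities.
    if [ $(( $f_inh | $f_prm )) -ne 0 ]; then
        new_amb=0
    else
        new_amb=$(( $p_amb ))
    fi

    # P'(permitted) = (P(inh) & F(inh)) | (F(prm) & P(bounding)) | P'(ambient)
    new_prm=$(( ($p_inh & $f_inh) | ($f_prm & $p_bnd) | $new_amb ))

    # P'(effective) = F(effective) ? P'(permitted) : P'(ambient)
    if [ "$f_eff" -eq 1 ]; then
        new_eff=$new_prm
    else
        new_eff=$new_amb
    fi

    printf 'permitted=0x%x effective=0x%x\n' "$new_prm" "$new_eff"
}

CAP=0x400   # bit 10, CAP_NET_BIND_SERVICE

# --cap-add only: capability inheritable and in the bounding set, plain binary.
exec_caps $CAP $CAP 0 0 0 0

# Same process, binary marked with setcap 'cap_net_bind_service=+ep'.
exec_caps $CAP $CAP 0 0 $CAP 1

# Plain binary, capability also in the ambient set.
exec_caps $CAP $CAP $CAP 0 0 0
```

The first call shows why `--cap-add` by itself leaves a non-root process with nothing: with no file capabilities, every term of the permitted-set formula is zero. The third call models the ambient set, which exists only on kernels that support ambient capabilities.
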
@tianon

Member

tianon commented Mar 31, 2015

@mrunalp don't we have Docker backends that don't properly support those cap bits though?

@mrunalp

Contributor

mrunalp commented Mar 31, 2015

@tianon There is really no other option here besides using user namespaces or setting caps on files :(

@eparis

Contributor

eparis commented Mar 31, 2015

so setting the executables as suid root (even worse)

@tianon

Member

tianon commented Apr 1, 2015

@mrunalp you've got me intrigued with mention of user namespaces -- do you just mean mapping the non-root user to be host root, or is there some other feature of user namespaces that makes privileged port mapping possible as non-root without setting caps on files? 😄

@mrunalp

Contributor

mrunalp commented Apr 1, 2015

@tianon In case of user namespaces, the container process uid is 0 inside (and non-zero outside). When a process with uid 0 execs a binary (irrespective of which user namespace it belongs to), the rules for capabilities for root apply and it gains all the capabilities in its permitted and effective sets except those masked by the bounding set. So, from a capabilities perspective it will behave just like regular root and since we have CAP_NET_BIND_SERVICE in the default template, it will work :)
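
The root case described above can be sketched the same way. Per capabilities(7), when a process with uid 0 execs a binary, the file's capability sets are treated as all ones, so the new permitted and effective sets are simply the process's inheritable set OR-ed with its bounding set. The function name `root_exec_caps` is hypothetical, and the bounding mask `0xa80425fb` is an assumed value standing in for Docker's default capability list quoted earlier in this thread.

```shell
#!/bin/sh
# root_exec_caps P_INH P_BND
# Prints the permitted and effective sets after exec for a process whose
# uid is 0 (with the secure-noroot securebit unset, as in the capsh output
# earlier in this thread).
# root_exec_caps is a hypothetical name used only for this illustration.
root_exec_caps() {
    # P'(permitted) = P(inheritable) | P(bounding); P'(effective) = P'(permitted)
    new_prm=$(( $1 | $2 ))
    printf 'permitted=0x%x effective=0x%x\n' "$new_prm" "$new_prm"
}

# Empty inheritable set, bounding set assumed to be the default capability list.
root_exec_caps 0x0 0xa80425fb
```

Because bit 10 (CAP_NET_BIND_SERVICE) is part of that bounding mask, it ends up in the effective set without any file capabilities being involved.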

@tianon

Member

tianon commented Apr 1, 2015

Ah, but we'd still have the issue where USER and -u make this not work; shucks.

Looks like it's time to document that setting capabilities on files is the solution going forward.

@mrunalp

Contributor

mrunalp commented Apr 2, 2015

@tianon Yep, this should be documented.

@thaJeztah

Member

thaJeztah commented Apr 4, 2015

@tianon @mrunalp apparently you both have a clear understanding of what has to be documented. Would one of you be willing to create a PR adding a description (and example) to the documentation?

Also, is documentation the only actionable item here? Or should this stay open after that (in hope of a better solution)?

@mrunalp

Contributor

mrunalp commented Apr 4, 2015

I can add the documentation changes. I don't think there is any fix short of a kernel change if it ever happens.


@thaJeztah

Member

thaJeztah commented Apr 4, 2015

@mrunalp That would be great! In that case, I think you should add a closes #8460 to that PR

@tianon can you agree with that? (as it's your issue)

justincormack added a commit to justincormack/docker that referenced this issue Jan 23, 2017

Add ambient capabilities gated by no new privileges
This is a rework of support for ambient capabilities, to avoid
the issues in the previous version, where there was a conflict
between two use cases, programs that want to use sudo and programs
that want to grant unprivileged users direct capabilities.

If you do not use the `--security-opt no-new-privileges` flag,
nothing changes with this patch. `sudo`, suid binaries and filesystem
capabilities elevate privileges, and non root users can only use
privileges via these mechanisms as on a normal Linux userspace.

With the `no-new-privileges` flag, the kernel does not allow caps
to be granted via suid binaries, so it is assumed that the user wants
to be granted capabilities directly, so ambient capabilities are
granted. For root this makes little difference, but for a normal
user this means that they can be granted capabilities directly, so
that privileged operations can be performed directly.

As previously no capabilities were granted to a non root user with
`no-new-privileges`, we take the opportunity to reduce the default
capability set in this case to only the three safest capabilities:
`CAP_KILL`, `CAP_AUDIT_WRITE` and `CAP_NET_BIND_SERVICE`. Other capabilities
must be granted with `--cap-add`.

`runc` commit is in opencontainers/runc#1286
Spec commit is in opencontainers/runtime-spec#668

fix moby#8460

Signed-off-by: Justin Cormack <justin.cormack@docker.com>

@vishvananda

vishvananda commented Jul 1, 2017

Just an FYI for people who don't want to deal with CAP_NET_BIND_SERVICE but would like to bind low ports: as of kernel 4.11, there is a sysctl you can set in the network namespace that will allow you to bind low ports [1]. If you are using a user namespace and you don't have privileges in the network namespace, you will have to set it on the network namespace beforehand, but everyone else can set it as part of docker run:

docker run -it --sysctl net.ipv4.ip_unprivileged_port_start=0 hello

[1] http://elixir.free-electrons.com/linux/v4.11.8/source/Documentation/networking/ip-sysctl.txt#L832
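
As a small illustration of what that sysctl changes: ports below the value of `net.ipv4.ip_unprivileged_port_start` need CAP_NET_BIND_SERVICE, and ports at or above it do not. The sketch below assumes the documented default of 1024 when the sysctl file is absent (kernels older than 4.11). The helper names `unprivileged_port_start` and `needs_privilege` are hypothetical.

```shell
#!/bin/sh
# unprivileged_port_start: print the first port that can be bound without
# CAP_NET_BIND_SERVICE in the current network namespace. Falls back to 1024
# when the sysctl does not exist.
# Both function names here are hypothetical helpers for illustration.
unprivileged_port_start() {
    f=/proc/sys/net/ipv4/ip_unprivileged_port_start
    if [ -r "$f" ]; then
        cat "$f"
    else
        echo 1024
    fi
}

# needs_privilege PORT [START]: print "yes" if binding PORT requires
# CAP_NET_BIND_SERVICE given threshold START (default: the live sysctl value).
needs_privilege() {
    port=$1
    start=${2:-$(unprivileged_port_start)}
    if [ "$port" -lt "$start" ]; then
        echo yes
    else
        echo no
    fi
}

needs_privilege 80 1024   # threshold at its default
needs_privilege 80 0      # threshold lowered to 0, as in the docker run above
```

Calling `needs_privilege 80` with no second argument inside a container reports the answer for that container's own network namespace.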

@vdemeester

Member

vdemeester commented Feb 14, 2018

Seems like this can be closed 👼

vdemeester closed this Feb 14, 2018

@jcberthon

Contributor

jcberthon commented Feb 14, 2018

Hi @vdemeester

As far as I can see the runc spec now defines how to set ambient capabilities which could solve this problem (see opencontainers/runtime-spec#675). But as far as I can tell, this is not yet implemented in Docker (or Moby), at least on the command line or in compose files.

Therefore I would expect this issue to be kept open until this is implemented. But I could be misguided.

@thaJeztah

Member

thaJeztah commented Feb 15, 2018

Yes, I think @justincormack was looking at that a while back

@justincormack

Contributor

justincormack commented Jun 29, 2018

There is a Kubernetes proposal to support ambient capabilities in CRI (kubernetes/community#2285); we will probably implement something in Docker to support that when it happens.

@irsl

irsl commented Jul 26, 2018

I couldn't find this documented anywhere, so adding here. This is how to switch user to a non-privileged one while keeping a certain capability (net_bind_service in this example). Busybox/alpine based images need bash to be installed first next to libcap.

capsh --caps="cap_net_bind_service+eip cap_setgid,cap_setuid+ep" --gid=1000 --uid=1000 --print -- -c "exec /my/favourite/webserver"

@mjameswh

mjameswh commented Nov 15, 2018

Is there any actual risk in setting --sysctl net.ipv4.ip_unprivileged_port_start=0 on a container when that container is attached only to bridge networks (or actually, any network except host)? After all, even if a process opens an extra low port from inside the container, it won't be accessible unless the port is actually exposed and mapped to an external port from the host. And the possibility that a port that is expected to be legitimately opened gets hijacked by an illegitimate process is no greater with this config than it is now, since the illegitimate process would have to start before the legitimate process opens its port, so very early in the container launch sequence, at which time the process would be running as root anyway in the current approach...

What I mean by this is that setting net.ipv4.ip_unprivileged_port_start to 0 by default (that is, unless the container is attached directly to a network of type host) would appear to be a possible and reasonable solution, and most likely more secure than the status quo. Of course, this is arguably a patch, and yes, this could simply be highlighted in documentation. However, from the large number of outside tickets pointing to this one, it seems to be a very common issue, and it appears that many people, if not most, end up lowering their security (by running their container as root) because they failed to find documentation on how to fix the issue by themselves.
