Can't bind to privileged ports as non-root #8460

Closed
tianon opened this issue Oct 7, 2014 · 44 comments · Fixed by #41030

Comments

@tianon
Member

tianon commented Oct 7, 2014

The simplest way to reproduce this is:

$ docker run --rm -u 1000 php:apache
...
(13)Permission denied: AH00072: make_sock: could not bind to address [::]:80
(13)Permission denied: AH00072: make_sock: could not bind to address 0.0.0.0:80
...

So, that led me to try this, but with the same result:

$ docker run --rm -u 1000 --cap-add NET_BIND_SERVICE php:apache
...
(13)Permission denied: AH00072: make_sock: could not bind to address [::]:80
(13)Permission denied: AH00072: make_sock: could not bind to address 0.0.0.0:80
...

IMO, it seems reasonable to allow non-root to bind to privileged ports inside the container, especially since they have a private net namespace, so I was actually surprised this wasn't already taken care of. I'm also confused as to why the --cap-add didn't work, but maybe that's because it adds the cap to the whitelist of things to not remove, not necessarily adds it if it isn't there? I'm grasping at straws here.
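For readers following along: the AH00072 errors above are Apache reporting EACCES from bind(2). Here is a minimal Python sketch of the same attempt, not taken from the thread (the helper name try_bind is made up for illustration). It can be run inside a container with and without -u to compare results:

```python
import errno
import socket


def try_bind(port, host="0.0.0.0"):
    """Try to bind a TCP socket; return (ok, errno_name_or_None)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            sock.bind((host, port))
        except OSError as exc:
            # Map the numeric errno to its symbolic name, e.g. "EACCES".
            return False, errno.errorcode.get(exc.errno, str(exc.errno))
        return True, None


if __name__ == "__main__":
    # Port 80 is below the traditional privileged-port threshold of 1024, so a
    # process without an effective CAP_NET_BIND_SERVICE is expected to get
    # EACCES; port 0 asks the kernel for any free port and needs no capability.
    for port in (80, 0):
        print(port, try_bind(port))
```

Whether the port 80 attempt fails depends on the user, the capability sets, and the kernel settings in effect, so the sketch only reports what it observes.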

@tianon
Member Author

tianon commented Oct 7, 2014

Maybe I've even just got the wrong CAP I'm trying to add.

@tianon
Member Author

tianon commented Oct 7, 2014

Essentially, we'd love to drop the User/Group directives from the Apache config in that image. If we could bind to port 80 as non-root, we could just use USER directly, which would be magical: users could arbitrarily change the exact UID Apache runs as, to make sure it can access any files they're sharing from another container, for example.

@jpetazzo
Contributor

jpetazzo commented Oct 7, 2014

+1 (you got the right cap, but the cap also has to be granted to the process I believe)

@crosbymichael
Contributor

Do you have anything else enabled, like AppArmor or SELinux?

@tianon
Member Author

tianon commented Oct 7, 2014

Definitely not here - those things make life hard. 😄

@jakajancar

I don't mind if only root is allowed to bind to privileged ports (to mimic containerless environments), but our deployment approach uses authbind to give the permission to other users. Unfortunately, authbind doesn't seem to work in Docker (I always get Permission denied when attempting to bind regardless).

@allingeek
Contributor

An even simpler example:

$ sudo docker run --rm -u nobody --cap-add net_bind_service ubuntu:latest nc -l 0.0.0.0 80
nc: Permission denied

Furthermore, capsh indicates that cap_net_bind_service is included:

$ docker run --rm -u nobody --cap-add net_bind_service ubuntu:latest capsh --print
Current: = cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap+i
Bounding set =cap_chown,cap_dac_override,cap_fowner,cap_fsetid,cap_kill,cap_setgid,cap_setuid,cap_setpcap,cap_net_bind_service,cap_net_raw,cap_sys_chroot,cap_mknod,cap_audit_write,cap_setfcap,37
Securebits: 00/0x0/1'b0
 secure-noroot: no (unlocked)
 secure-no-suid-fixup: no (unlocked)
 secure-keep-caps: no (unlocked)
uid=65534(nobody)
gid=65534(nogroup)
groups=

Running on boot2docker:

$ uname -a
Linux boot2docker 3.16.7-tinycore64 #1 SMP Tue Dec 16 23:03:39 UTC 2014 x86_64 GNU/Linux

Versions:

$ docker version
Client version: 1.4.1
Client API version: 1.16
Go version (client): go1.3.3
Git commit (client): 5bc2ff8
OS/Arch (client): darwin/amd64
Server version: 1.4.1
Server API version: 1.16
Go version (server): go1.3.3
Git commit (server): 5bc2ff8

What is the guidance? Is this supposed to work?
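One reading of the capsh output above: the `+i` suffix puts the listed capabilities in the inheritable set only, leaving the effective set (the one checked at bind time) empty. A rough Python sketch for checking this from inside a container by decoding the Cap* masks in /proc/self/status follows. The helper names are illustrative, and bit 10 is CAP_NET_BIND_SERVICE per linux/capability.h:

```python
import os

CAP_NET_BIND_SERVICE = 10  # bit index from <linux/capability.h>


def parse_cap_sets(status_text):
    """Return {'CapInh': int, 'CapPrm': int, ...} from /proc/<pid>/status text."""
    sets = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key.startswith("Cap"):
            # Each Cap* field is a hexadecimal bitmask.
            sets[key] = int(value.strip(), 16)
    return sets


def has_cap(mask, cap=CAP_NET_BIND_SERVICE):
    """True if the given capability bit is set in the mask."""
    return bool((mask >> cap) & 1)


if __name__ == "__main__" and os.path.exists("/proc/self/status"):
    with open("/proc/self/status") as fh:
        for name, mask in parse_cap_sets(fh.read()).items():
            print(f"{name}: net_bind_service={has_cap(mask)}")
```

If CapInh and CapBnd report the bit but CapEff does not, that would match the Permission denied seen with nc.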

@cpuguy83
Member

cpuguy83 commented Feb 2, 2015

This fails with --privileged as well, fyi.

spf13 added the kind/bug label and removed the bug label on Mar 21, 2015
@mrunalp
Contributor

mrunalp commented Mar 27, 2015

The issue here is that, when the user is not root, the process's effective capability set doesn't include cap_net_bind_service unless the file you are trying to execute has those capability bits set.

Example to get it working for nc:

[root@localhost ~]# docker run -it --rm -u 1000 ubuntu /bin/nc.openbsd -l 0.0.0.0 80
nc.openbsd: Permission denied

# Run an image as root, setcap the necessary file, and commit the container as ubuntu-user-nc (the docker commit step is not shown here).
[root@localhost ~]# docker run -it --rm ubuntu bash
root@caa42897b59c:/#
root@caa42897b59c:/# setcap 'cap_net_bind_service=+ep' /bin/nc.openbsd
root@caa42897b59c:/# getcap  /bin/nc.openbsd
/bin/nc.openbsd = cap_net_bind_service+ep
root@caa42897b59c:/# exit
exit

# Run the new image as non-root
[root@localhost ~]# docker run -it --rm -u 1000 ubuntu-user-nc /bin/nc.openbsd -l 0.0.0.0 80

works!

@tianon
Member Author

tianon commented Mar 31, 2015

@mrunalp don't we have Docker backends that don't properly support those cap bits though?

@mrunalp
Contributor

mrunalp commented Mar 31, 2015

@tianon There is really no other option here besides using user namespaces or setting caps on files :(

@eparis
Contributor

eparis commented Mar 31, 2015

so setting the executables as suid root (even worse)

@tianon
Member Author

tianon commented Apr 1, 2015

@mrunalp you've got me intrigued with mention of user namespaces -- do you just mean mapping the non-root user to be host root, or is there some other feature of user namespaces that makes privileged port mapping possible as non-root without setting caps on files? 😄

@mrunalp
Contributor

mrunalp commented Apr 1, 2015

@tianon In case of user namespaces, the container process uid is 0 inside (and non-zero outside). When a process with uid 0 execs a binary (irrespective of which user namespace it belongs to), the rules for capabilities for root apply and it gains all the capabilities in its permitted and effective sets except those masked by the bounding set. So, from a capabilities perspective it will behave just like regular root and since we have CAP_NET_BIND_SERVICE in the default template, it will work :)
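The rules discussed in this exchange can be sketched as a small model of the capability transformation at execve(), following the formulas in capabilities(7). This is a simplified illustration, not kernel code: it ignores ambient capabilities and securebits, collapses real and effective UID into a single uid argument, and the names FileCaps, caps_after_exec, and DEFAULT_CAPS are made up here. DEFAULT_CAPS is the mask for the capability list in the capsh output quoted earlier in the thread.

```python
from dataclasses import dataclass

CAP_NET_BIND_SERVICE = 1 << 10  # bit 10 in <linux/capability.h>
ALL_CAPS = (1 << 64) - 1
# Bits 0, 1, 3-8, 10, 13, 18, 27, 29, 31: the set shown by capsh in the thread.
DEFAULT_CAPS = 0xA80425FB


@dataclass
class FileCaps:
    """Capabilities attached to an executable, as set with setcap."""
    permitted: int = 0
    inheritable: int = 0
    effective: bool = False  # the "e" flag in e.g. cap_net_bind_service+ep


def caps_after_exec(inheritable, bounding, file_caps, uid):
    """Return (permitted, effective) process sets after execve()."""
    if uid == 0:
        # For root, the kernel treats the file sets as fully populated and
        # the file effective bit as on.
        file_caps = FileCaps(ALL_CAPS, ALL_CAPS, True)
    permitted = (inheritable & file_caps.inheritable) | (file_caps.permitted & bounding)
    effective = permitted if file_caps.effective else 0
    return permitted, effective


if __name__ == "__main__":
    plain = FileCaps()
    with_setcap = FileCaps(permitted=CAP_NET_BIND_SERVICE, effective=True)
    for label, fcaps, uid in [
        ("non-root, no file caps", plain, 1000),
        ("non-root, setcap +ep", with_setcap, 1000),
        ("uid 0", plain, 0),
    ]:
        _, eff = caps_after_exec(DEFAULT_CAPS, DEFAULT_CAPS, fcaps, uid)
        print(label, bool(eff & CAP_NET_BIND_SERVICE))
```

Under this model, a non-root process ends up with an empty effective set unless the binary carries file capabilities, while uid 0 regains everything the bounding set allows.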

@tianon
Member Author

tianon commented Apr 1, 2015

Ah, but we'd still have the issue where USER and -u make this not work; shucks.

Looks like it's time to document that setting capabilities on files is the solution going forward.

@mrunalp
Contributor

mrunalp commented Apr 2, 2015

@tianon Yep, this should be documented.

@thaJeztah
Member

@tianon @mrunalp apparently you both have a clear understanding of what has to be documented. Would one of you be willing to create a PR adding a description (and example) to the documentation?

Also, is documentation the only actionable item here? Or should this stay open after that (in hope of a better solution)?

@mrunalp
Contributor

mrunalp commented Apr 4, 2015

I can add the documentation changes. I don't think there is any fix short of a kernel change if it ever happens.


@thaJeztah
Member

@mrunalp That would be great! In that case, I think you should add a closes #8460 to that PR

@tianon can you agree with that? (as it's your issue)

@mjameswh
Contributor

mjameswh commented Nov 15, 2018

Is there any actual risk in setting --sysctl net.ipv4.ip_unprivileged_port_start=0 on a container when that container is attached only to bridge networks (or, actually, any network except host)? After all, even if a process opens an extra low port from inside the container, it won't be accessible unless the port is actually exposed and mapped to an external port from the host. And the risk that a port expected to be legitimately opened gets hijacked by an illegitimate process is no greater with this config than it is now: the illegitimate process would have to start before the legitimate process opens its port, so very early in the container launch sequence, at which time the process would be running as root anyway in the current approach...

What I mean by this is that setting net.ipv4.ip_unprivileged_port_start to 0 by default (that is, unless the container is attached directly to a network of type host) would appear to be a possible and reasonable solution, and most likely more secure than the status quo. Of course, this is arguably a patch, and yes, this could simply be highlighted in the documentation. However, judging from the large number of outside tickets pointing to this one, it seems to be a very common issue, and it appears that many people, if not most, end up lowering their security (by running their container as root) because they failed to find documentation on how to fix the issue themselves.
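To see what the sysctl changes from inside a container, a small Python sketch can read the current threshold and report whether a given port would need CAP_NET_BIND_SERVICE. The function names are illustrative. The 1024 fallback is an assumption for kernels that do not expose the sysctl, and port 0 is treated as "let the kernel pick", which never needs the capability:

```python
from pathlib import Path

SYSCTL_PATH = Path("/proc/sys/net/ipv4/ip_unprivileged_port_start")


def unprivileged_port_start(path=SYSCTL_PATH, default=1024):
    """Read the threshold for this network namespace; fall back to `default`."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return default


def needs_net_bind_service(port, start):
    """Ports in 1..start-1 require CAP_NET_BIND_SERVICE to bind."""
    return 0 < port < start


if __name__ == "__main__":
    start = unprivileged_port_start()
    for port in (80, 443, 8080):
        print(port, "privileged" if needs_net_bind_service(port, start) else "unprivileged")
```

With the threshold at 0, no port falls in the privileged range, which is the effect the --sysctl flag is aiming for.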

mendhak added a commit to mendhak/docker-http-https-echo that referenced this issue Nov 25, 2020
I've chowned the /app directory in the container to node user.  This allows the process to access the private key.

However! I've not set the USER to node in the Dockerfile because then the process would be unable to bind to port 80 internally (or anything below 1024).
I didn't want to force it either and make it a breaking change, because many existing setups will be mapping 8080 externally to 80/443 internally, for example.

The workaround is therefore on the user (sorry): they will need to set the environment variable to a higher number, or use the --sysctl net.ipv4.ip_unprivileged_port_start=0 flag

Reference: moby/moby#8460 (comment)

Issue #14