BLOG: How to run a more secure non root user container. #203

Closed
rhatdan opened this issue Dec 17, 2015 · 18 comments

Comments

@rhatdan
Member

rhatdan commented Dec 17, 2015

I was asked a question about running processes as non-root users inside of a Docker container: could they still get privileges?

For more background on Linux capabilities, see: http://linux.die.net/man/7/capabilities

We'll start with a simple container where the primary process is running as root. One can look at the capabilities of the current process via grep Cap /proc/self/status. There is also a capsh utility.

# docker run --rm -ti fedora grep Cap /proc/self/status
CapInh: 00000000a80425fb
CapPrm: 00000000a80425fb
CapEff: 00000000a80425fb
CapBnd: 00000000a80425fb
CapAmb: 0000000000000000

Notice that the Effective Capabilities (CapEff) is a non-zero value, which means that the process has capabilities.

Using the pscap tool, I see that the process has these capabilities.

chown, dac_override, fowner, fsetid, kill, setgid, setuid, setpcap, net_bind_service, net_raw, sys_chroot, mknod, audit_write, setfcap
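The hex mask and the list of names above are two views of the same data: each set bit in CapEff corresponds to one capability number from `<linux/capability.h>`. As a rough sketch (the helper name `decode_caps` is made up for illustration, and only the first 32 capability bits are covered), the mask can be decoded without pscap like this:

```python
# Capability names indexed by bit number, as defined in <linux/capability.h>.
# Only bits 0-31 are listed here; newer capabilities (bit 32 and up) are ignored.
CAP_NAMES = [
    "chown", "dac_override", "dac_read_search", "fowner", "fsetid", "kill",
    "setgid", "setuid", "setpcap", "linux_immutable", "net_bind_service",
    "net_broadcast", "net_admin", "net_raw", "ipc_lock", "ipc_owner",
    "sys_module", "sys_rawio", "sys_chroot", "sys_ptrace", "sys_pacct",
    "sys_admin", "sys_boot", "sys_nice", "sys_resource", "sys_time",
    "sys_tty_config", "mknod", "lease", "audit_write", "audit_control",
    "setfcap",
]


def decode_caps(hex_mask):
    """Return the names of the capabilities whose bits are set in hex_mask."""
    mask = int(hex_mask, 16)
    return [name for bit, name in enumerate(CAP_NAMES) if mask & (1 << bit)]


# Mask taken from the CapEff line of the root container above.
print(", ".join(decode_caps("00000000a80425fb")))
```

For the mask shown above this prints the same fourteen names that pscap reported.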

Now let's run a container as a non-root user using the -u option.

# docker run --rm -ti -u 3267 fedora grep Cap /proc/self/status
CapInh: 00000000a80425fb
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 00000000a80425fb
CapAmb: 0000000000000000

Notice that the effective capabilities (CapEff) are all zero, but the bounding set of capabilities (CapBnd) is not. This means that if a setuid binary is included in the image, a process could use it to gain these capabilities. Not surprisingly, the bounding set matches the value from the previous container.

So even though this process is running as non root inside the container, it could potentially run with the same capabilities as above if the image builder included a setuid binary.

chown, dac_override, fowner, fsetid, kill, setgid, setuid, setpcap, net_bind_service, net_raw, sys_chroot, mknod, audit_write, setfcap
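The comparison between CapEff and CapBnd can also be done mechanically. The following is only a sketch: the function names `parse_cap_sets` and `regainable` are hypothetical, and a non-zero result only says the bits are still in the bounding set. Whether they can really be picked up depends on a suitable setuid binary being present in the image.

```python
def parse_cap_sets(status_text):
    """Parse the Cap* lines of /proc/<pid>/status into {name: integer mask}."""
    sets = {}
    for line in status_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.startswith("Cap"):
            sets[key] = int(value.strip(), 16)
    return sets


def regainable(sets):
    """Bits that are in the bounding set but not currently effective."""
    return sets["CapBnd"] & ~sets["CapEff"]


# Output copied from the non-root container above.
NON_ROOT = """CapInh: 00000000a80425fb
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 00000000a80425fb
CapAmb: 0000000000000000"""

print(hex(regainable(parse_cap_sets(NON_ROOT))))
```

For the non-root container this prints 0xa80425fb, i.e. every default capability is still within reach even though none is effective right now.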

Docker has a nice feature where you can drop all capabilities via --cap-drop=all. Now, if we execute the same container with a non privileged user and drop all capabilities:

# docker run --rm -ti --cap-drop=all -u 3267 fedora grep Cap /proc/self/status
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 0000000000000000
CapAmb: 0000000000000000

Now this user cannot gain any capabilities on the system. I would advise almost all Docker users who run their containers as non-privileged users to use this feature. This adds a lot of security to the system.

@cgwalters
Member

(edited for typos, grammar, and some extra markdown, used --rm too)

@rhatdan
Member Author

rhatdan commented Dec 18, 2015

Thanks. I think we want this blog to go out after the new year; there are too few readers now. I also want to do the meeting on Monday and see if anything else comes up.

@jzb
Contributor

jzb commented Dec 21, 2015

Yeah, I will plan on putting this up on 5 January unless I hear otherwise. Nobody's going to be paying much attention this week.

@rhatdan
Member Author

rhatdan commented Jan 4, 2016

I think this is all set to go tomorrow.

@cgwalters
Member

I think I mentioned this elsewhere, but an issue with this is that a setuid binary still gives you uid 0, which even without capabilities can be a privilege escalation path. See https://bugs.freedesktop.org/show_bug.cgi?id=35623 for example.

Certainly outside of a filesystem namespace, it's enough to change /usr/bin/bash or /root/.bashrc so that the next admin that logs in executes your code, etc.

Inside of a container it is probably safer...as long as you don't have mounts visible to the host. And you are relying on things like /sys being mounted read-only.

@rhatdan
Member Author

rhatdan commented Jan 4, 2016

Well, the sudo/su commands will fail when they attempt the setuid() system call. But yes, if you are running as root inside of the container without capabilities, you can still manipulate files owned by root or group root with rw privs.

@cgwalters
Member

Note that with PR_SET_NO_NEW_PRIVS, as https://git.gnome.org/browse/linux-user-chroot uses, the setuid bit (including transitioning to uid 0) is entirely skipped.
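For anyone who wants to try the flag by hand, here is a minimal, Linux-only sketch that sets it on the current process through prctl(2) using ctypes. The constants 38 and 39 come from `<linux/prctl.h>`; the helper names are made up for this example, and this is not how linux-user-chroot, runc, or Docker implement it.

```python
import ctypes
import os

# Constants from <linux/prctl.h>.
PR_SET_NO_NEW_PRIVS = 38
PR_GET_NO_NEW_PRIVS = 39

# Loading with None gives access to the C library symbols of this process (Linux).
_libc = ctypes.CDLL(None, use_errno=True)


def _prctl(option, arg2=0):
    # prctl(2) is variadic, so pass every remaining argument as unsigned long.
    args = [ctypes.c_ulong(v) for v in (arg2, 0, 0, 0)]
    result = _libc.prctl(ctypes.c_int(option), *args)
    if result < 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return result


def set_no_new_privs():
    """Set the flag for this process; it is inherited by children and cannot be unset."""
    _prctl(PR_SET_NO_NEW_PRIVS, 1)


def has_no_new_privs():
    """Report whether the flag is currently set for this process."""
    return _prctl(PR_GET_NO_NEW_PRIVS) == 1


if __name__ == "__main__":
    set_no_new_privs()
    print("no_new_privs set:", has_no_new_privs())
```

Any program exec'd after `set_no_new_privs()` keeps the flag, so a setuid binary launched from that point on runs without the uid change.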

@rhatdan
Member Author

rhatdan commented Jan 4, 2016

I have been asking Steve for a situation where PR_SET_NO_NEW_PRIVS should be used.

I have started throwing together a pull request for this in runc/docker. Not sure if we want

docker run --nonewprivs

Or should this be the default for docker run --cap-drop=all?

@cgwalters
Member

Ok, looks like I was wrong, without CAP_SETUID the kernel will ignore the setuid bit (see security/commoncap.c in cap_bprm_set_creds() in b4ba1f0f6533e3a49976f5736b263478509099a0 ).

NNP (on top of a missing CAP_SETUID) will at least still ensure bounded SELinux domain transitions, but that doesn't currently matter for Docker because we only use one domain inside a container.

That said, I think it would be compatible for --cap-drop=all to actually imply NNP too, and would add a small degree of extra safety.

The way I think of it - NNP (+ seccomp) is the future. Capabilities are mostly useless. We should be aiming for a setuid-less world, and NNP forces that.

@rhatdan
Member Author

rhatdan commented Jan 4, 2016

I have @mrunalp working on a patch for runc/docker to add NNP. I guess we already do some of this in the seccomp patch, but I would like to have something separate from that at the CLI level.

@jberkus
Contributor

jberkus commented Feb 25, 2016

So ... I take it this blog post is still not ready?

@bproffitt
Contributor

This ran on Jan 7:
http://www.projectatomic.io/blog/2016/01/how-to-run-a-more-secure-non-root-user-container/

BKP


Brian Proffitt
Principal Community Analyst
Open Source and Standards
@TheTechScribe
574.383.9BKP

@rhatdan rhatdan closed this as completed Feb 25, 2016
@michalrabinowitch

@rhamilto how would one apply those potential caps as a non-root user? CapBnd, for example.

@rhamilto
Contributor

rhamilto commented Nov 5, 2018

^ @rhatdan

@rhatdan
Member Author

rhatdan commented Nov 5, 2018

@michalrabinowitch setuid applications like sudo, su or random ones on the disk image.
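To see which setuid applications an image actually ships, one option is to walk its filesystem and look for the setuid bit. This is only a sketch: `find_setuid_files` is a hypothetical helper, and the `/usr` starting point is an arbitrary choice (scanning `/` also works, but wanders into `/proc` and `/sys`).

```python
import os
import stat


def find_setuid_files(root):
    """Yield the path of every regular file under root that has the setuid bit set."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # entry vanished or is not readable
            if stat.S_ISREG(st.st_mode) and st.st_mode & stat.S_ISUID:
                yield path


if __name__ == "__main__":
    # Run inside the container (or against an unpacked image) to list candidates.
    for path in find_setuid_files("/usr"):
        print(path)
```

Every path it prints is a binary that could hand capabilities from the bounding set back to a non-root user, unless capabilities are dropped or NNP is set.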

@mydockergit

@michalrabinowitch setuid applications like sudo, su or random ones on the disk image.

But if you are a non-root user inside the container, you won't be able to run sudo or su because they will ask for the root password.

@rhatdan
Member Author

rhatdan commented Dec 5, 2018

Random ones on disk might not ask for the password, and sudo can be set up to not require a password.

@rhatdan
Member Author

rhatdan commented Dec 5, 2018

We are looking for defense in depth; relying on software to be written correctly is not enough protection, in my opinion.

spantaleev added a commit to spantaleev/matrix-docker-ansible-deploy that referenced this issue Jan 28, 2019
We run containers as a non-root user (no effective capabilities).

Still, if a setuid binary is available in a container image, it could
potentially be used to give the user the default capabilities that the
container was started with. For Docker, the default set currently is:
- "CAP_CHOWN"
- "CAP_DAC_OVERRIDE"
- "CAP_FSETID"
- "CAP_FOWNER"
- "CAP_MKNOD"
- "CAP_NET_RAW"
- "CAP_SETGID"
- "CAP_SETUID"
- "CAP_SETFCAP"
- "CAP_SETPCAP"
- "CAP_NET_BIND_SERVICE"
- "CAP_SYS_CHROOT"
- "CAP_KILL"
- "CAP_AUDIT_WRITE"

We'd rather prevent such a potential escalation by dropping ALL
capabilities.

The problem is nicely explained here: projectatomic/atomic-site#203
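As a cross-check, the fourteen names in the commit message can be folded back into a bitmask and compared with the CapBnd value from the issue. The sketch below assumes the bit numbers from `<linux/capability.h>`; `caps_to_mask` is a name invented for this example.

```python
# Bit numbers from <linux/capability.h> for the capabilities listed above.
DEFAULT_CAPS = {
    "CAP_CHOWN": 0,
    "CAP_DAC_OVERRIDE": 1,
    "CAP_FSETID": 4,
    "CAP_FOWNER": 3,
    "CAP_MKNOD": 27,
    "CAP_NET_RAW": 13,
    "CAP_SETGID": 6,
    "CAP_SETUID": 7,
    "CAP_SETFCAP": 31,
    "CAP_SETPCAP": 8,
    "CAP_NET_BIND_SERVICE": 10,
    "CAP_SYS_CHROOT": 18,
    "CAP_KILL": 5,
    "CAP_AUDIT_WRITE": 29,
}


def caps_to_mask(bits):
    """Combine capability bit numbers into a single integer mask."""
    mask = 0
    for bit in bits:
        mask |= 1 << bit
    return mask


# Formatted the same way as the Cap* lines in /proc/<pid>/status.
print(f"{caps_to_mask(DEFAULT_CAPS.values()):016x}")
```

This prints 00000000a80425fb, the same value the issue shows for CapBnd in a default container.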