Allow setns() in container, or add flag to allow it specifically #496

Open
mercmobily opened this issue Nov 22, 2018 · 18 comments

Comments

@mercmobily

commented Nov 22, 2018

The Docker update from 1.11.x to 1.12.x seems to have broken setns() calls inside containers. setns() is used by Chrome for creating namespaces. I figured this out after reading this SO post.

The only solution right now is to run Chrome with --no-sandbox, but that's way, way less than ideal.
Another "solution" is to run the container with --cap-add=SYS_ADMIN -- which is a rather broad thing to do.

  • This is a bug report
  • This is a feature request
  • I searched existing issues before opening this one

Expected behavior

I expect to EITHER have a flag to enable setns() in the container (so that Chrome can run securely), OR allow setns() in docker containers.

Actual behavior

Right now, the whole world is effectively using --no-sandbox to run Chrome in containers.
Seriously.

Steps to reproduce the behavior

  • Create a docker container with Chrome in it
  • Try to run Chrome
  • Try again with --no-sandbox

Output of docker version:

    Client:
     Version:      1.13.1
     API version:  1.26
     Go version:   go1.8.3
     Git commit:   092cba3
     Built:        Thu Oct 12 22:34:44 2017
     OS/Arch:      linux/amd64

    Server:
     Version:      1.13.1
     API version:  1.26 (minimum version 1.12)
     Go version:   go1.8.3
     Git commit:   092cba3
     Built:        Thu Oct 12 22:34:44 2017
     OS/Arch:      linux/amd64
     Experimental: false
 

Output of docker info:

 
Containers: 1
 Running: 1
 Paused: 0
 Stopped: 0
Images: 23
Server Version: 1.13.1
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 23
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: aa8187dbd3b7ad67d8e5e3a15115d3eef43a7ed1
runc version: 9df8b306d01f59d3a8029be411de015b7304dd8f
init version: N/A (expected: 949e6facb77383876aeff8a6944dde66b3089574)
Security Options:
 apparmor
 seccomp
  Profile: default
Kernel Version: 4.13.0-46-generic
Operating System: Ubuntu 17.10
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 7.68 GiB
Name: merc-B250M-D3H
ID: 5VQF:HZG3:ULIM:TQOZ:ITG2:SUGX:HFZ2:QBZH:HJR6:GABW:COXR:CY3E
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
WARNING: No swap limit support
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

@cpuguy83

Collaborator

commented Nov 22, 2018

Most likely this is the seccomp policy blocking setns. You can supply a custom seccomp policy.

But also, you are already running Chrome in a container. What is the need to do another "setns"?

@mercmobily

Author

commented Nov 22, 2018

I can see this page that explains how to set seccomp profiles for Docker, and especially the option --security-opt seccomp=/path/to/seccomp/profile.json. What would the profile.json have to contain to just whitelist setns()?

As for your second question, imagine a complex application that needs Chrome in headless mode, to run some client-side testing or to generate PDFs -- or whatever. In this case, running Chrome without a sandbox would imply that a hacker could exploit some of Chrome's vulnerabilities to gain access to the instance.

If you let me know, and I see that it works, I will make sure I tell pretty much everybody online (that would include the Selenium people, but also countless people out there on SO and various forums) that there is a solution other than disabling the sandbox, which in many cases is a really bad idea.

@thaJeztah

Member

commented Nov 22, 2018

@mercmobily

Author

commented Nov 22, 2018

Ah, I even looked into those repos a lot in order to figure out what was going on... but never together, and without knowing about the seccomp flag.
So... The Chrome JSON file seems to be listing a lot of calls to allow. I guess it's because it basically overrides the full default seccomp setting. Is there no way to make this future-proof, and say "apply the default setting, with this difference" so to speak?

@mercmobily

Author

commented Nov 22, 2018

@jessfraz I saw a few tickets on your repos where people asked you about the Chrome issue; you referred them to your dotfiles. However, I believe a more detailed explanation would help when this problem arises. Just my humble 2c -- thank you for everything!

@thaJeztah

Member

commented Nov 22, 2018

Is there no way to make this future-proof, and say "apply the default setting, with this difference" so to speak?

Yes, the seccomp profile is unfortunately quite verbose. This was by design, because the default profile is configurable on the daemon, so if a container would only specify the "diff" (assuming the daemon runs the default profile), the result would be unpredictable. So for that reason, the seccomp profile requires you to specify exactly what the profile should look like.

Perhaps it would be a fun "pet" project to create a seccomp-profile generator, i.e. something like;

seccomp-bake \
  --default-profile=profiles/seccomp/default.json \
  --whitelist-add=foo,bar,baz \
  -o ./my-profile.json

(although probably could be done with, e.g., jq)
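
As a rough illustration of that generator idea, here is a minimal Python sketch. It is not an existing tool: `bake_profile` and `bake_file` are hypothetical names, and the code assumes the profile layout quoted in this thread (a top-level `"syscalls"` list whose rules carry a `"names"` list and an `"action"`), treating the first unconditional `SCMP_ACT_ALLOW` rule as the main allow-list.

```python
import json
from pathlib import Path
from typing import Iterable


def bake_profile(default_profile: dict, extra_syscalls: Iterable[str]) -> dict:
    """Return a copy of default_profile whose main allow-list also
    contains extra_syscalls. The input dict is not modified."""
    # A JSON round trip gives a cheap deep copy of plain JSON data.
    profile = json.loads(json.dumps(default_profile))
    extra = set(extra_syscalls)

    for rule in profile.get("syscalls", []):
        # Assumption: the main allow-list is the first SCMP_ACT_ALLOW rule
        # with no argument filters and no includes/excludes conditions.
        conditional = rule.get("args") or rule.get("includes") or rule.get("excludes")
        if rule.get("action") == "SCMP_ACT_ALLOW" and "names" in rule and not conditional:
            rule["names"] = sorted(set(rule["names"]) | extra)
            return profile

    # No suitable rule found: add a new unconditional allow rule instead.
    profile.setdefault("syscalls", []).append(
        {"names": sorted(extra), "action": "SCMP_ACT_ALLOW", "args": []}
    )
    return profile


def bake_file(default_path: str, extra_syscalls: Iterable[str], out_path: str) -> None:
    """Read the default profile, add the extra syscalls, write the result."""
    default_profile = json.loads(Path(default_path).read_text())
    baked = bake_profile(default_profile, extra_syscalls)
    Path(out_path).write_text(json.dumps(baked, indent=2) + "\n")
```

Calling `bake_file("profiles/seccomp/default.json", ["foo", "bar", "baz"], "./my-profile.json")` would mirror the hypothetical command above; the resulting file could then be passed with `--security-opt seccomp=./my-profile.json`. Because the output is a full profile, it still has to be regenerated whenever the default profile changes.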

Another improvement would be this proposal; moby/moby#32801 (adding "entitlements"), which would make setting security options more user-friendly

@mercmobily

Author

commented Nov 22, 2018

Hi,

Alright... I will test this out on my own machine (mainly making sure that setns() is the only thing Chrome needs, and if it isn't, wrestling permissions till I get it right, possibly checking @jessfraz's settings for Chrome) and will then proceed to mass-answering people with the same issue in the gazillion places I've found (probably just pointing to this issue, which right now is pure gold to a lot of people out there).

@mercmobily

Author

commented Nov 23, 2018

Hi,

So, it's not just setns -- as I imagined. After cutting and sorting and diffing, here is the list of calls that are NOT whitelisted in the default config file but are listed in @jessfraz's Chrome config file.

    > arch_prctl
    > chroot
    > clone
    > fanotify_init
    > name_to_handle_at
    > open_by_handle_at
    > setdomainname
    > sethostname
    > syslog
    > unshare
    > vhangup
    > setns

I frankly don't know if all of them are needed. I assume @jessfraz would have straced chrome and checked which calls were called... maybe?
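
The cutting, sorting and diffing can also be scripted. The following is a minimal Python sketch, not the method actually used for the list above: the function names are hypothetical, only unconditional `SCMP_ACT_ALLOW` rules are counted (an assumption), and both a `"names"` list and a single `"name"` key are accepted on the assumption that older profiles use the latter.

```python
def allowed_syscalls(profile: dict) -> set:
    """Collect syscalls that a profile allows unconditionally."""
    names = set()
    for rule in profile.get("syscalls", []):
        if rule.get("action") != "SCMP_ACT_ALLOW":
            continue
        # Skip rules that only apply with argument filters or under
        # includes/excludes conditions (e.g. a required capability).
        if rule.get("args") or rule.get("includes") or rule.get("excludes"):
            continue
        # Assumption: newer profiles use "names" (a list), older ones "name".
        names.update(rule.get("names") or [])
        if "name" in rule:
            names.add(rule["name"])
    return names


def extra_syscalls(default_profile: dict, custom_profile: dict) -> list:
    """Syscalls allowed by custom_profile but not by default_profile."""
    return sorted(allowed_syscalls(custom_profile) - allowed_syscalls(default_profile))
```

Loading both files with `json.load` and calling `extra_syscalls(default, chrome)` gives a list comparable to the one above. Whether conditional rules should count as "whitelisted" is a judgment call; this sketch ignores them.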

So, at this stage if somebody wants to run Chrome in a docker container, they can basically:

  • Get the default seccomp config file

  • Add the calls above in the whitelist at the top, the one that starts with:

    "syscalls": [
    {
    "names": [

  • Enjoy a safe Chrome.

I think the Selenium people are the first ones who must be warned, since right now basically anybody running Travis/Selenium is running an insecure sandbox-less Chrome. That's planet-wide.

Before I go out there and tell everybody, may I ask: I realise that the list above is the full list that will make it work with Chrome. But... can it be shortened? How was it worked out? Trial & error? Strace? Grepping Chrome's source?

I guess the best person to answer would be @jessfraz -- any hints?

@cpuguy83

Collaborator

commented Nov 23, 2018

Almost certainly strace

@mercmobily

Author

commented Nov 23, 2018

@cpuguy83 If that is the case, there is no point in trying to shorten it.

Do you think it's worthwhile trying my luck, and seeing if the Docker people would accept a pull request adding CHROME as an option, the same way CAP_SYS_ADMIN is?

This would be to help out all those people out there trying to get headless Chrome to do software testing in a container...

@cpuguy83

Collaborator

commented Nov 23, 2018

Sorry, no.
Those capabilities are actual Linux capabilities.

@thaJeztah

Member

commented Nov 23, 2018

I think the Selenium people are the first ones who must be warned, since right now basically anybody running Travis/Selenium is running an insecure sandbox-less Chrome. That's planet-wide.

Chrome will be sandboxed as a whole by the container; if those containers are minimal (only contain chrome, and the bare minimum required), and follow best practices (such as running as a non-privileged user, running with a read-only filesystem, having --security-opt=no-new-privileges set, and applying memory and CPU constraints), no damage could be done beyond what's inside the container. Possibly, the profile could be tightened further, as the default profile is a "generic" profile for common use.
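
To make those practices concrete, here is a minimal Python sketch that assembles a `docker run` command line with the options mentioned above. The function name, the image name and the default user, memory and CPU values are illustrative assumptions, not recommendations from this thread. Chrome may also need some writable scratch space (for example a tmpfs mount), which this sketch leaves out.

```python
from typing import List, Optional


def hardened_run_args(
    image: str,
    user: str = "1000:1000",        # assumed non-privileged UID:GID
    memory: str = "1g",             # assumed memory limit
    cpus: str = "1.0",              # assumed CPU limit
    seccomp_profile: Optional[str] = None,
) -> List[str]:
    """Build an argv list for `docker run` following the practices above."""
    args = [
        "docker", "run", "--rm",
        "--user", user,                      # run as a non-privileged user
        "--read-only",                       # read-only root filesystem
        "--security-opt=no-new-privileges",  # block privilege escalation
        "--memory", memory,                  # memory constraint
        "--cpus", cpus,                      # CPU constraint
    ]
    if seccomp_profile is not None:
        # Optional custom seccomp profile, as discussed earlier in the thread.
        args += ["--security-opt", f"seccomp={seccomp_profile}"]
    args.append(image)
    return args
```

For example, `hardened_run_args("example/headless-chrome")` (a hypothetical image name) returns a list that can be handed to `subprocess.run`; building a list rather than a shell string avoids quoting problems.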

Note that @jessfraz's Dockerfile (and seccomp profile) is targeted at desktop / interactive use of the Chrome container, and therefore may be more permissive than required for your use case (running Selenium tests in headless mode).

Given that more syscalls are whitelisted in the Chrome seccomp profile, that actually means the profile is less restrictive than the default, thus introducing more risks if the container gets compromised.

@mercmobily

Author

commented Nov 23, 2018

@thaJeztah Yours is a compelling argument. However, if the container must be able to connect to a database server, for example, a non-sandboxed Chrome might become the gateway to gain read access to the database and get credentials. If a shell is obtained, the intruder will be able to reach hosts that would normally be unreachable. So, while it's true that a malicious user exploiting a Chrome vulnerability would "only" be able to access the container, there are many cases where access to that container's data -- and even just having a shell in that container -- might be a bigger problem than expected. You can surely think of several dangerous scenarios if you have an application server that needs to run headless Chrome (to create PDFs, for example).

Your comment on the possibility of headless Chrome not needing all of these:

> arch_prctl
> chroot
> clone
> fanotify_init
> name_to_handle_at
> open_by_handle_at
> setdomainname
> sethostname
> syslog
> unshare
> vhangup
> setns

is interesting; looking at them, I doubt headless Chrome would need much less. But it would need investigation for sure.

@mercmobily

Author

commented Nov 26, 2018

@thaJeztah Any thoughts? I don't want to recommend anything to anybody unless it's sound advice, and your message cast some doubts on my reasoning. When you write "if those containers are minimal (only contain chrome, and the bare minimum required)", I think that those "minimum requirements" for server-side Chrome will inevitably have to report the results to another host, possibly have access to hosts otherwise protected, and have some privileges to do so. On the other hand, do you think the syscalls above have security implications? (arch_prctl jumps to mind)

@thaJeztah

Member

commented Nov 26, 2018

I think that those "minimum requirements" for server-side Chrome will inevitably have to report the results to another host, possibly have access to hosts otherwise protected, and have some privileges to do so.

If this is in a CI environment, you should assume the content you're running is compromised, and configure what the container is allowed to access based on that assumption. In a Docker setup, that could also mean connecting the container to a network that only allows it to reach the services/containers you want it to be able to reach. (If this is about "results", and you don't want it to be able to "push" those changes, perhaps writing them to a file and collecting them afterwards would be an option.) That said, I don't have a lot of experience with setting up Selenium, so I'm not sure I can give more advice on that part 😅

On the other hand, do you think the syscalls above have security implications? (arch_prctl jumps to mind)

I'll defer that one to @justincormack and @jessfraz, who are probably better at answering that.

@mercmobily

Author

commented Nov 26, 2018

@lucifer1004

commented Jan 9, 2019

My use case is running Chrome (headful) with AWS Fargate, where neither --cap-add nor --security-opt can be used. Does this mean I can only run Chrome with --no-sandbox?

@thaJeztah

Member

commented Jan 9, 2019

My use case is running Chrome (headful) with AWS Fargate, where neither --cap-add nor --security-opt can be used. Does this mean I can only run Chrome with --no-sandbox?

If there's no option to customize those (or the daemon configuration), then probably: yes.

If a dedicated option were added for this, you'd probably not be able to configure that on Fargate either, so it may be better to open a feature request with AWS.

tkp1n added a commit to tkp1n/chromium-ci that referenced this issue Sep 29, 2019
bajtos added a commit to strongloop/loopback that referenced this issue Sep 30, 2019
See the discussion in
docker/for-linux#496

Signed-off-by: Miroslav Bajtoš <mbajtoss@gmail.com>