Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dockerd: make -H tcp://... even harder without --tlsverify #37299

Closed
wants to merge 1 commit into from

Conversation

cyphar
Copy link
Contributor

@cyphar cyphar commented Jun 16, 2018

Historically we have been a little bit too forgiving to people who run
Docker with an unauthenticated API server, and recent events have
shown that a worrying amount of people are running Docker daemons with
the API accessible from the internet.

For a long time Docker has not used a TCP port by default, and using it
without --tlsverify has elicited an all-caps warning for a while as
well. But obviously this is not sufficient (maybe people aren't noticing
the warnings). We cannot just disallow this (because it is in theory
possible to set this up safely, and we use it ourselves in testing), so
instead add a new flag with a scary name to make sure that users are
aware of the significant risk of binding the API server to a TCP port
without any authentication:

% sudo dockerd -H tcp://0.0.0.0:1337 --give-the-internet-root-access

While it might sound like a bit of a joke, these sorts of "scary and
very obvious" flags exist in other tools. For instance, flashrom
requires you to use the flag "i_want_a_brick" if you are trying to flash
a laptop's ROM in a way it doesn't think is safe (and anyone reading
such an option should recognise why the flag has such a scary name).

Midge our cute new kitten cat by dougwoods

Midge our cute new kitten cat by dougwoods

Ref: docker/hub-feedback#1121
Signed-off-by: Aleksa Sarai asarai@suse.de

@cyphar cyphar force-pushed the make-binding-to-tcp-harder branch 2 times, most recently from bedc4d9 to 7962115 Compare June 16, 2018 15:14
@cyphar cyphar requested a review from vdemeester as a code owner June 16, 2018 15:14
@codecov
Copy link

codecov bot commented Jun 16, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@de0abf4). Click here to learn what that means.
The diff coverage is 42.85%.

@@            Coverage Diff            @@
##             master   #37299   +/-   ##
=========================================
  Coverage          ?   35.03%           
=========================================
  Files             ?      609           
  Lines             ?    45005           
  Branches          ?        0           
=========================================
  Hits              ?    15769           
  Misses            ?    27123           
  Partials          ?     2113

@cyphar
Copy link
Contributor Author

cyphar commented Jun 16, 2018

WoW is broken because it uses -H tcp://. I can't fix this because the test scripts are in Jenkins, so someone with access to Jenkins would need to fix it (just add the --give-the-internet-root-access option).

15:16:58 INFO: Args: -H tcp://0.0.0.0:2357 --graph d:\CI\CI-796211553\daemon --pidfile d:\CI\CI-796211553\docker.pid

Historically we have been a little bit too forgiving to people who run
Docker with an unauthenticated API server, and recent events[1] have
shown that a worrying amount of people are running Docker daemons with
the API accessible from the internet.

For a long time Docker has not used a TCP port by default, and using it
without --tlsverify has elicited an all-caps warning for a while as
well. But obviously this is not sufficient (maybe people aren't noticing
the warnings). We cannot just disallow this (because it is in theory
possible to set this up safely, and we use it ourselves in testing), so
instead add a new flag with a scary name to make sure that users are
aware of the significant risk of binding the API server to a TCP port
without any authentication:

  % sudo dockerd -H tcp://0.0.0.0:1337 --give-the-internet-root-access

While it might sound like a bit of a joke, these sorts of "scary and
very obvious" flags exist in other tools. For instance, flashrom
requires you to use the flag "i_want_a_brick" if you are trying to flash
a laptop's ROM in a way it doesn't think is safe (and anyone reading
such an option should recognise why the flag has such a scary name).

[1]: https://kromtech.com/blog/security-center/cryptojacking-invades-cloud-how-modern-containerization-trend-is-exploited-by-attackers

Signed-off-by: Aleksa Sarai <asarai@suse.de>
@cyphar cyphar force-pushed the make-binding-to-tcp-harder branch from 7962115 to 658cc5c Compare June 17, 2018 01:12
@AkihiroSuda
Copy link
Member

What do you recommend for dind usecase?

@@ -72,7 +73,8 @@ func (o *daemonOptions) InstallFlags(flags *pflag.FlagSet) {
flags.Var(opts.NewQuotedString(&tlsOptions.KeyFile), "tlskey", "Path to TLS key file")

hostOpt := opts.NewNamedListOptsRef("hosts", &o.Hosts, opts.ValidateHost)
flags.VarP(hostOpt, "host", "H", "Daemon socket(s) to connect to")
flags.VarP(hostOpt, "host", "H", "Daemon socket(s) to bind to")
flags.BoolVar(&o.InsecureHostBind, "give-the-internet-root-access", false, "Allow -H to bind to a TCP address without --tlsverify (this could allow anyone on your network to get root access to your *host*)")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think we will support rootless mode. Just --insecure-bind-tcp seems fine.

Copy link
Contributor Author

@cyphar cyphar Jun 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think --insecure is scary enough. Maybe --allow-internet-facing-remote-code-execution or something? 😉

if proto == "tcp" && (serverConfig.TLSConfig == nil || serverConfig.TLSConfig.ClientAuth != tls.RequireAndVerifyClientCert) {
logrus.Warn("[!] DON'T BIND ON ANY IP ADDRESS WITHOUT setting --tlsverify IF YOU DON'T KNOW WHAT YOU'RE DOING [!]")
if !cli.Config.InsecureHostBind {
return nil, fmt.Errorf("refusing to bind to an IP address without --tlsverify configured: use --give-the-internet-root-access to ignore this error")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will break existing deployments.
Probably we should have some grace period? (cc @thaJeztah )

Or maybe we could inject time.Sleep(punishment)

Copy link
Contributor Author

@cyphar cyphar Jun 17, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will break existing deployments.

Well, yes. But I'm not sure how to gracefully deny it -- adding a sleep would not be noticed (people would probably blame systemd) and nobody has noticed the warning message on startup that has been there for many years. We could make docker (the client) output a warning on each operation but that will also break deployments. (Personally I think it should've been done this way in the first place.)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would definitely require a deprecation period; one thing to make this situation more visible would be to include a warning to docker info, if the CLI is using an insecure TCP connection.

Also; making it easier to do the right thing (which could be, e.g. automatically generating a certificate) would help moving things to a better place

@cyphar
Copy link
Contributor Author

cyphar commented Jun 17, 2018

What do you recommend for dind usecase?

Do you mean the "bind-mount-the-socket" usecase? I mean, I wouldn't recommend that either. If you want I could bind-mount /var/run/docker.sock over itself with MS_UNBINDABLE -- but that would break in much more subtle ways.

@AkihiroSuda
Copy link
Member

I meant a case where DinD client connects to a DinD daemon using a TCP socket using a Docker network

$ docker run --privileged --name dind -d docker
$ docker run --link dind:docker docker docker ...

@cyphar
Copy link
Contributor Author

cyphar commented Jun 18, 2018

The dind default CMD could be updated to use the --give-the-internet-root-access (or whatever we call the flag) -- assuming that the default ENTRYPOINT has ["docker"]. Running a --privileged container gives that container almost root access on your machine (in theory) so an RCE in a --privileged container through Docker is almost as bad as an RCE as root on your box.

The only other option I can think of is that we auto-detect that we're in a container but I really don't like that idea. Not to mention that people should still be careful even when running Docker inside a (privileged) container.

@Faheetah
Copy link

Is the flag really being called "--give-the-internet-root-access"? This is definitely not the case if you are running in a virtual machine or an otherwise segregated environment. The naming of the flag implies that it will not only open up the container to internet access (even if it's technically infeasible), but give it root access. If you want to name it something practical, going along with what other projects do, --ignore-insecure-tls-warnings would be more descriptive. --give-the-internet-root-access is not only confusing but comes off unprofessional.

This is aside from the fact that you can't prevent stupid. If people don't care to learn the implications of running without TLS on an externally bound port, I don't think a flag is going to change their mind. If anything a more effective approach would be to require TLS by default and require a flag to disable TLS.

@cyphar
Copy link
Contributor Author

cyphar commented Jun 19, 2018

Is the flag really being called "--give-the-internet-root-access"? This is definitely not the case if you are running in a virtual machine or an otherwise segregated environment.

Exposing the API to your network is giving the network root access to the machine that the Docker daemon is running on. If you're running the Docker daemon in a VM then I think it should be somewhat expected that --give-the-internet-root-access would give the network root access to the VM.

It is definitely possible to configure this to be done safely, but it is non-trivial and it is pretty obvious that users aren't doing it properly. I would prefer that there be a very scary-looking option required in order to undermine the security of a machine, so that users don't do this by accident (or intentionally) without knowing that they're doing something bad.

As for whether the option will be called --give-the-internet-root-access I imagine that it will be bikeshedded and eventually it'll be given a fairly non-scary name like --insecure-bind-tcp (which I think defeats the point of having a flag like this -- if the flag name doesn't look scary then people will just blindly use it and ignore the problem). Maybe --give-the-network-root-access would be more accurate for firewalled networks but I'm not sure it's clear. What about --allow-network-based-privileged-code-execution?

If anything a more effective approach would be to require TLS by default and require a flag to disable TLS.

It wouldn't be possible to do that automatically because we don't know what the CA root (for authorising clients) should be. We could just accept no connections by default (with effectively a nil CA root) but then that's a much more silent failure than actually not starting dockerd without requiring a scary flag to enable it.

@thaJeztah
Copy link
Member

ping @justincormack @cyli @n4ss for thoughts / input

@Faheetah
Copy link

Could just require either —tlsverify or whatever flag like —disable-tls-authentication then docker would not run insecure by default, and the flag would make it clear the user is not disabling encrypted transport but also cert based auth. That’s how SSH runs and this would be the container equivalent of requiring key based auth or explicitly allowing passwordless ssh.

@cyphar
Copy link
Contributor Author

cyphar commented Jun 20, 2018

@Faheetah That is essentially what this patch is doing (the flag just has a different name). All jokes aside, I don't really mind if we change the flag name -- at the end of that day that's mostly just bikeshedding. I just want people to stop exposing their machines to unauthenticated root access. 😸

@thaJeztah thaJeztah added this to backlog in maintainers-session Jun 20, 2018
@justincormack
Copy link
Contributor

@cyphar I opened #36357 a while back to just deprecate it completely. I think if we add a flag to make it harder it should be part of a long term deprecation roadmap, eg the flag should be --deprecated-root-my-box-please or whatever from the start.

@cyphar
Copy link
Contributor Author

cyphar commented Jun 22, 2018

@justincormack I can change the flag to --deprecated-root-my-box-please 😉. But the question is how you'd like the deprecation to work -- should we print a warning on docker info before we require the flag?

@thaJeztah I just looked up how docker info warnings are printed and I'm a little bit concerned -- as far as I can tell the errors aren't actually created by dockerd they are created on the client-side based on what information came from GET /info? I guess you'd want the error to be based on whether the server connection was over TLS (and not based on some entry in the response to GET /info)?

@cyli
Copy link
Contributor

cyli commented Jun 22, 2018

Wondering if we should return, as part of the system info, the trusted client CA root certificate?

@thaJeztah
Copy link
Member

I can tell the errors aren't actually created by dockerd they are created on the client-side based on what information came from GET /info? I guess you'd want the error to be based on whether the server connection was over TLS (and not based on some entry in the response to GET /info)?

Ah, yes; those warnings being generated client-side have been a thorn in my side for a long time (I think we should move them to the daemon, and return a list of warning that the client just prints).

For this use-case, I assume the client has all information that's needed to decide if this warning should be printed or not, so we would be able to print the warning based on that information, correct? (So not using /info output for that, but print it in the same location)

@cpuguy83
Copy link
Member

We can add a "Warnings" field to the Info struct, no?

@thaJeztah
Copy link
Member

We can add a "Warnings" field to the Info struct, no?

Yes that was my thinking; if present, the client would use that (backward compatibility with older API versions), and otherwise use the current behavior, and generate those warnings client-side.

@cyphar
Copy link
Contributor Author

cyphar commented Jun 25, 2018

We can add a "Warnings" field to the Info struct, no?

Yeah, that's what I would do (it's what I thought /info did until I tried to figure out where the errors are coming from). Though obviously we can't drop the warnings code from the client due to needing backwards compatibility.

@thaJeztah
Copy link
Member

Discussing this in the maintainers meeting and we definitely agree that we should improve this; things being discussed;

  • add an (incrementing) delay for people that use either an unprotected, or insecure configuration (to make them aware something's "wrong") (include message in HTTP header?)
  • we can add a flag / environment-variable to disable that behavior to given them an escape hatch
  • move the warnings from the client to the daemon (this can be done separate from this PR, because that's something we should do anyway 😄)
  • officially deprecate unprotected/insecure (and mark for removal in XX releases)
  • include information about the deprecation in the warning that's printed ("to be removed in version x.y and/or a date after which it will be removed")
  • add a metrics label about insecure/unprotected API to the prometheus metrics
  • for future; look into generating certificates to get people started (to be discussed; how do we get those from the daemon to the client in case of a remote daemon?)

@cyphar let us know if you want to help with those; perhaps we should create an epic for tracking those changes

@thaJeztah
Copy link
Member

@justincormack @cyli @n4ss ^^ wdyt?

@thaJeztah thaJeztah removed this from backlog in maintainers-session Jul 5, 2018
@cyli
Copy link
Contributor

cyli commented Jul 5, 2018

+1 warning on daemon side, prometheus metrics, and deprecation warning

Could you explain a bit about the delay? Is it per API request/per IP? Would the warning in the HTTP header be translated into a warning that the client would display per command? (This way, every command run against the daemon would display a warning, not just docker info.) Would it affect docker commands run against the docker socket instead of just the HTTP API as well, so admins would see it every time they did anything locally?

Also, this may not be desirable since mutual TLS works pretty well, but one possible usability improvement for security the daemon by default is generating server TLS certs and providing an auth token that can be used by clients instead. Jupyter notebook introduced this for example: https://jupyter-notebook.readthedocs.io/en/stable/security.html

It might be easier/friendlier for people to get set up with a remote daemon by copying an API token, as opposed to setting up a client CA cert and a key, and possibly signing leaf client certs with the CA.

@cyphar
Copy link
Contributor Author

cyphar commented Jul 5, 2018

@cyli

+1 warning on daemon side

There already has been a daemon warning for at least 4 years. I'm not sure people have seen it -- most people run inside systemd and given how chatty Docker (and containerd) are in logs, I'm not surprised that people miss a warning very early in the startup of dockerd.

@cyli
Copy link
Contributor

cyli commented Jul 6, 2018

@cyphar Sorry, I was referring to your and @thaJeztah's suggestion for generating warnings for docker info on daemon side.

@cyphar
Copy link
Contributor Author

cyphar commented Jul 6, 2018

Ah okay. Yeah, I agree.

@AkihiroSuda
Copy link
Member

What's current status?

Wondering if this is closable in favor of #37684

@cyphar
Copy link
Contributor Author

cyphar commented Sep 18, 2018

@AkihiroSuda No, #37684 was a first step towards deprecation -- the long-term goal should be that we require a flag to allow this (rather than just have another warning). I don't think closing this makes sense because otherwise we'll just forget -- instead I think this should be put on a future milestone with a deprecation notice added to the warning in #37684.

@derek derek bot added the rebuild/* label Dec 21, 2018
@derek derek bot added the status/failing-ci Indicates that the PR in its current state fails the test suite label Dec 22, 2018
@olljanat
Copy link
Contributor

@cyphar plz check those failing tests

@AkihiroSuda
Copy link
Member

I'm in favor of this PR but we should honor the future deprecation policy, although I feel waiting for 3 releases (19.09, 20.03, and 20.09) is too much in this case: Before an existing feature is removed it is labeled as “deprecated” within the documentation and remains in Docker for at least 3 stable releases unless specified explicitly otherwise. https://docs.docker.com/engine/#feature-deprecation-policy
cc @thaJeztah @andrewhsu

Also, I'm wondering we can let every invocation of docker command to print a warning when the daemon is listening on TCP without TLS.

@AkihiroSuda
Copy link
Member

btw I opened a dind doc PR docker-library/docs#1525 and a dind proposal docker-library/docker#164 toward deprecating dind w/o TLS

@cpuguy83
Copy link
Member

#41285 is merged now which:

  1. Adds a deprecation notice for "unauthenticated TCP by default"
  2. Sleeps for 15s when authentication is not explicitly disabled (--tls=false or --tlsverify=false)
  3. Sleeps for 1s for all cases of unauthenticated TCP to help draw attention the insecure configuration.
  4. Always enable --tlsverify if --tls=true unless --tlsverify is explicitly set to false.
  5. Give a free pass to localhost (even if it's not really secure either...)

The deprecation notice warns that in the next release we'll enable --tls=true by default (and by extension --tlsverify will be true by default.
It would be nice to follow up with an easy way to generate certs... especially for automated configuration.

Closing this one. Thanks for all the discussion. We finally landed on a solution that seems like it will take care most of the cases of unexpectedly exposing the docker API to the world.

People should also note, we do support tunneling API access over SSH, which is negotiated by the docker CLI... e.g. DOCKER_HOST=ssh://1.2.3.4 docker run ...

@cpuguy83 cpuguy83 closed this Sep 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/security impact/changelog status/failing-ci Indicates that the PR in its current state fails the test suite status/1-design-review
Projects
Development

Successfully merging this pull request may close these issues.

None yet