daemon: archive: pause containers before doing filesystem operations #39252

cyphar · 2019-05-22T09:07:29Z

(Note: This PR was made public after discussions with the Docker security team, if you find a security vulnerability please report it directly to security@docker.com.)

There are certain classes of attacks (as evidenced in CVE-2018-15664)
which are caused by our allowing container processes to be executing
while we are doing filesystem operations on the container. In
particular, there are trivial TOCTOU races in symlink resolution and
scoping that can be exploited.

The most complete solution to this problem would be to modify
chrootarchive so that all of the archive operations occur with the root
as the container rootfs (and not the parent directory, which is what
causes the vulnerability since the parent is attacker-controlled).
Unfortunately, changes to this core piece of Docker are almost
impossible (the TarUntar interface has many copies and reimplementations
that would all need to be modified to be able to handle a new "root"
argument).

So, we instead settle for the next-best option which is to pause the
container during our usage of the filesystem. This is far from an ideal
solution (you can image some attack scenarios such as shared volume
mounts) where this is ineffectual but it does block the most basic
attack.

I am currently working on some new kernel functionality that would allow
for much safer resolution of paths inside untrusted roots, but as
above it would be difficult to modify Docker to use it. I am working on
adding support to filepath-securejoin though (however this will
require quite a few inteface changes).

Fixes: CVE-2018-15664
Signed-off-by: Aleksa Sarai asarai@suse.de

cyphar · 2019-05-22T09:08:08Z

This is still a WIP since I still need to test that my reproducer no longer triggers the issue.

daemon/archive.go

thaJeztah · 2019-05-22T12:37:09Z

/cc @justincormack @cpuguy83

codecov · 2019-05-23T13:17:59Z

Codecov Report

Merging #39252 into master will increase coverage by 0.02%.
The diff coverage is 0%.

@@            Coverage Diff             @@
##           master   #39252      +/-   ##
==========================================
+ Coverage   37.05%   37.08%   +0.02%     
==========================================
  Files         612      612              
  Lines       45478    45584     +106     
==========================================
+ Hits        16853    16904      +51     
- Misses      26341    26392      +51     
- Partials     2284     2288       +4

thaJeztah · 2019-05-23T13:37:15Z

Wondering if we need a --pause option on docker cp, similar to what we have on docker commit / docker container commit;

 -p, --pause            Pause container during commit (default true)

(could be either --no-pause if the default should be true (which will be a breaking change likely), or --pause to make it optional)

cyphar · 2019-05-23T13:42:48Z

Since this is a security fix I'm not sure how good of an idea it is to allow disabling it -- the attack scripts I have allow attackers to read and write to anywhere on the host filesystem as root. Adding footguns for their own sake feels like a bad idea.

docker commit -p is more of a functional feature to (in theory) allow for atomic snapshots on storage drivers which don't have them (like overlayfs).

thaJeztah · 2019-05-23T13:51:06Z

Yes 😞 definitely understood; I can see this being a disruptive change though (as in; users that would currently would now seeing their container going offline). Surely, not really best practice if you need the actual container to get important data out 🤷‍♂, but still.

cyphar · 2019-05-23T13:57:19Z

I can do some more testing to see how bad the impact is, but in theory we're talking about hundreds of milliseconds of downtime -- docker pause doesn't kill the container process after all it freezes the container's cgroup.

EDIT: Actually no, it could be pretty bad if you're copying large files...

thaJeztah · 2019-05-23T14:01:05Z

I can do some more testing to see how bad the impact is, but in theory we're talking about hundreds of milliseconds of downtime -- docker pause doesn't kill the container process after all it freezes the container's cgroup.

Oh, I misunderstood, and thought the container would be paused during the whole copy operation 😊 😅. Should've taken the time to go through the whole flow. Sorry for that! Jumping between conversations 😊

cyphar · 2019-05-23T14:02:08Z

No sorry, you are right -- it does apply to the entire copy operation. I forgot people copy pretty large files into and out of containers. Hmm, maybe we do need an off switch for this...

AkihiroSuda · 2019-05-23T14:10:05Z

Pause doesnt work with rootless, any workaround?

cyphar · 2019-05-23T14:12:53Z

For rootless we could just not do the pausing. As I mention in the commit message, this is a best-effort protection because (after spending several weeks working on it) I concluded it's effectively impossible to implement the actual protection necessary to defend against the attack in Docker. You would need to rewrite too many interfaces and too much glue code that it's not really doable anymore.

olljanat · 2019-05-23T18:09:42Z

Derek add label: rebuild/janky

cpuguy83 · 2019-05-23T18:29:24Z

How come we aren't using chrootarchive here?

cyphar · 2019-05-23T22:14:16Z

@cpuguy83 At the very end of the chain, chrootarchive is being used. The problem is that chrootarchive is being used to chroot into the parent directory of the directory where the extraction is occurring (which is attacker-controlled since it's inside the container). So, if you get hit by the race you end up chrooting into the host filesystem. As I said above, this could be fixed by adding a root parameter to all of the TarUntar interfaces but this turns out to be quite hard purely due to the number of changes needed all over the place.

There are certain classes of attacks (as evidenced in CVE-2018-15664) which are caused by our allowing container processes to be executing while we are doing filesystem operations on the container. In particular, there are trivial TOCTOU races in symlink resolution and scoping that can be exploited[1]. The most complete solution to this problem would be to modify chrootarchive so that all of the archive operations occur with the root as the container rootfs (and not the parent directory, which is what causes the vulnerability since the parent is attacker-controlled). Unfortunately, changes to this core piece of Docker are almost impossible (the TarUntar interface has many copies and reimplementations that would all need to be modified to be able to handle a new "root" argument). So, we instead settle for the next-best option which is to pause the container during our usage of the filesystem. This is far from an ideal solution (you can image some attack scenarios such as shared volume mounts) where this is ineffectual but it does block the most basic attack. I am currently working on some new kernel functionality that would allow for much safer resolution of paths inside untrusted roots[2], but as above it would be difficult to modify Docker to use it. I am working on adding support to filepath-securejoin[3] though (however this will require quite a few inteface changes). [1]: https://bugzilla.suse.com/show_bug.cgi?id=1096726 [2]: https://lwn.net/Articles/767547/ [outdated API] [3]: https://github.com/cyphar/filepath-securejoin Fixes: CVE-2018-15664 Signed-off-by: Aleksa Sarai <asarai@suse.de>

cpuguy83 · 2019-05-28T17:12:22Z

Thinking we could add a flag to set the daemon level default (--default-pause-on-cp)

Or alternatively we could add a flag which requires the user to set pause on cp (--require-puase-on-cp) and error out if pause is not specified.

I will be happy to look at updating interfaces to make this better.
I was already considering building an external tool to deal with container filesystem stuff for cases like docker cp, because it's pretty horrific today with manipulating mount namespaces in the root namespace and other things... which to provide any real benefit requires updating these interfaces anyway.

cpuguy83 · 2019-05-28T19:11:15Z

Just a thought, we could take a temporary snapshot on the fs if the container is running (the if part is just for optimization).
I'm not sure if we would even need to modify the layer store/graphdriver api's for this.

/cc @dmcgowan

arno01 · 2019-05-30T07:17:02Z

Are there real use cases people really need docker cp and cannot live without it?

I am just saying if that is too complicated to fix and is too dangerous to leave, I would opt for having some flag like --enable-insecure-docker-cp for dockerd, but this is only because I am having in mind docker cp is a handy feature only for quick dev tests and that if someone needs to copy some data from the container, should rather use a shared docker volume or use other tools like nsenter.

justincormack · 2019-05-30T08:40:57Z

@arno01 it is hard to know how people are using docker cp; I hope people are not using it in production, just for dev. If this is the case, adding an automatic pause seems an ok mitigation without any additional changes for now.

arno01 · 2019-05-30T09:08:20Z

I am afraid that pausing (SIGSTOP) containers is also a breaking change as it will lead to the error prone situations such as when someone is docker cp'ing something big as usual (for him, in a prod, maybe even scripted or just one time ad-hoc copy) will unexpectedly start pausing the running process which in turn (if left paused for too long) may break all network connections (or socket's buffer can fill up) to the service running in that container.

WDYT?

cpuguy83 · 2019-05-30T22:08:18Z

I have implemented chrootarchive within a specified root (the container's root) here: #39292

The change itself is fairly isolated (2 new functions in chrootarchive).

cyphar · 2019-05-30T22:35:25Z

I'm closing this PR in favour of that one.

@cpuguy83 I should've been more specific -- the difficulty I had was in trying to fix and restructure all the users of FollowSymlinkInScope.

GordonTheTurtle added the status/0-triage label May 22, 2019

cyphar force-pushed the archive-pause-containers branch from 95531f7 to 101c2d6 Compare May 22, 2019 09:11

AkihiroSuda reviewed May 22, 2019

View reviewed changes

daemon/archive.go Outdated Show resolved Hide resolved

cyphar force-pushed the archive-pause-containers branch 2 times, most recently from 8fbdb8c to 88799f6 Compare May 22, 2019 12:35

cyphar force-pushed the archive-pause-containers branch 2 times, most recently from 78377ce to fede627 Compare May 23, 2019 13:17

cyphar force-pushed the archive-pause-containers branch from fede627 to e9cac66 Compare May 23, 2019 13:22

cyphar marked this pull request as ready for review May 23, 2019 13:43

derek bot added the rebuild/janky label May 23, 2019

GordonTheTurtle removed the rebuild/janky label May 23, 2019

cyphar force-pushed the archive-pause-containers branch from e9cac66 to 6227d8e Compare May 29, 2019 01:06

CodeJuan mentioned this pull request May 30, 2019

feature: add pouch cp cmd AliyunContainerService/pouch#2879

Merged

cyphar closed this May 30, 2019

cyphar deleted the archive-pause-containers branch May 30, 2019 22:35

ret2libc mentioned this pull request May 31, 2019

Resolve symlinks in cp containers/podman#3214

Merged

cpuguy83 mentioned this pull request Jun 3, 2019

Pass root to chroot to for chroot Tar/Untar (CVE-2018-15664) #39292

Merged

This was referenced Jun 3, 2019

[18.09 backport] Pass root to chroot to for chroot Tar/Untar (CVE-2018-15664) docker-archive/engine#253

Merged

[19.03 backport] Pass root to chroot to for chroot Tar/Untar (CVE-2018-15664) docker-archive/engine#254

Merged

andrewhsu mentioned this pull request Jun 12, 2019

[19.03] Revert Pass root to chroot to for chroot Tar/Untar (CVE-2018-15664) docker-archive/engine#275

Merged

MingcongBai mentioned this pull request Aug 21, 2019

docker: CVE-2018-15664 AOSC-Dev/aosc-os-abbs#1965

Closed

3 tasks

tatianab mentioned this pull request Nov 8, 2023

x/vulndb: potential Go vuln in github.com/moby/moby: CVE-2018-15664 golang/vulndb#2212

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

daemon: archive: pause containers before doing filesystem operations #39252

daemon: archive: pause containers before doing filesystem operations #39252

cyphar commented May 22, 2019 •

edited

Loading

cyphar commented May 22, 2019

thaJeztah commented May 22, 2019

codecov bot commented May 23, 2019 •

edited

Loading

thaJeztah commented May 23, 2019

cyphar commented May 23, 2019

thaJeztah commented May 23, 2019

cyphar commented May 23, 2019 •

edited

Loading

thaJeztah commented May 23, 2019

cyphar commented May 23, 2019

AkihiroSuda commented May 23, 2019

cyphar commented May 23, 2019

olljanat commented May 23, 2019

cpuguy83 commented May 23, 2019

cyphar commented May 23, 2019 •

edited

Loading

cpuguy83 commented May 28, 2019

cpuguy83 commented May 28, 2019

arno01 commented May 30, 2019 •

edited

Loading

justincormack commented May 30, 2019

arno01 commented May 30, 2019 •

edited

Loading

cpuguy83 commented May 30, 2019

cyphar commented May 30, 2019

daemon: archive: pause containers before doing filesystem operations #39252

daemon: archive: pause containers before doing filesystem operations #39252

Conversation

cyphar commented May 22, 2019 • edited Loading

cyphar commented May 22, 2019

thaJeztah commented May 22, 2019

codecov bot commented May 23, 2019 • edited Loading

Codecov Report

thaJeztah commented May 23, 2019

cyphar commented May 23, 2019

thaJeztah commented May 23, 2019

cyphar commented May 23, 2019 • edited Loading

thaJeztah commented May 23, 2019

cyphar commented May 23, 2019

AkihiroSuda commented May 23, 2019

cyphar commented May 23, 2019

olljanat commented May 23, 2019

cpuguy83 commented May 23, 2019

cyphar commented May 23, 2019 • edited Loading

cpuguy83 commented May 28, 2019

cpuguy83 commented May 28, 2019

arno01 commented May 30, 2019 • edited Loading

justincormack commented May 30, 2019

arno01 commented May 30, 2019 • edited Loading

cpuguy83 commented May 30, 2019

cyphar commented May 30, 2019

cyphar commented May 22, 2019 •

edited

Loading

codecov bot commented May 23, 2019 •

edited

Loading

cyphar commented May 23, 2019 •

edited

Loading

cyphar commented May 23, 2019 •

edited

Loading

arno01 commented May 30, 2019 •

edited

Loading

arno01 commented May 30, 2019 •

edited

Loading