
Proposal: control cache sources during build with --cache-from #26065

Closed
stevvooe opened this issue Aug 26, 2016 · 6 comments · Fixed by #26839
Labels
area/distribution kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny

Comments

@stevvooe
Contributor

stevvooe commented Aug 26, 2016

This proposal was originally from #24711 (comment).

With several changes to the image format, the scope of build caching has been limited to local nodes. This is problematic for architectures that dispatch builds to arbitrary nodes, since pulling new images and data will not populate the build cache.

The main concern here is cache poisoning. The worst part about it is that it is not at all obvious whether you are affected or protected. It can only be mitigated by limiting the horizon of data that one trusts.

Anything that circumvents that protection, even docker save/load, is going to open your infrastructure up to injection of malicious content. The proposal in #20316 and previous proposals have not addressed this problem. While we all want fast builds (and I really do), introducing cache poisoning to the build step of the infrastructure must be avoided. Could you imagine the impact if someone could just inject a malicious layer into library/ubuntu or library/alpine?

The other aspect to this is the misapplied assumption about the idempotence of shell commands. apt-get update run twice is never guaranteed to have the same result. Ever. That is just not how it works. If you have a build cache that is never purged, you will never update your upstream software. That may or may not be the intent. Even worse, if this build cache gets filled in with remote data, you probably have no visibility into when that command was run.

The underlying problem here is that with 1.10 changes in the image format, we no longer restore the parent image chain when pulling from a registry. As such, a proper solution to this problem involves something that can control the level of trust for content to a distributed build cache.

Let's look at how we build an image, with FROM alpine at the top:

docker pull mysuperapp:v0
docker build -t mysuperapp:v1 .
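
For concreteness, a minimal Dockerfile for mysuperapp might look like the sketch below; the package and file names are illustrative, not from the original issue. Each instruction produces a layer, and those layers are what the build cache matches against.

```dockerfile
# Hypothetical Dockerfile for mysuperapp. Every instruction below is a
# potential cache hit when rebuilding v1 against the layers of v0.
FROM alpine
RUN apk add --no-cache python3
COPY . /app
CMD ["python3", "/app/main.py"]
```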

In this simple case, we cannot assume that a remote mysuperapp:v0 and the ongoing build are related, since that would possibly introduce the cache-poisoning scenario we need to avoid. However, one may have local registry infrastructure that they know they can trust. While we can infer parentage (despite other assertions, this is still possible), we may not be able to trust that parentage, from a build-caching perspective, for all registries. But this build environment is special.

What better way than to tell the build process that it can trust a related image?

docker build --cache-from mysuperapp:v0 -t mysuperapp:v1 .

The above would allow Dockerfile commands to be satisfied from the entries of mysuperapp:v0 in the build of mysuperapp:v1. Job done!

No! We still have a problem. Now my build system has to know tag lineage (mysuperapp:v0 precedes mysuperapp:v1). Let's give the tagless reference a slightly different meaning:

docker pull -a mysuperapp
docker build --cache-from mysuperapp -t mysuperapp:v1 .

In the above, we pull all the tags from mysuperapp, any layer of which can satisfy the build cache. In practice, this probably is a little wide for most scenarios, so we can allow multiple --cache-from directives on the command line:

docker build --cache-from mysuperapp:v0 --cache-from mysuperapp:v1 -t mysuperapp:v2 .

There are many possibilities here to make this more flexible, such as running a registry service dedicated to build caching, e.g. mybuildcache.internal/mysuperapp. Did you know that you can just run a registry and rsync the filesystem around without locking? You can also rsync from multiple sources and merge the result safely (kind of). Such a registry can be purged periodically (or someone could submit a PR to purge old data).
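
As a sketch, a dedicated cache registry could be wired together as follows. The host name mybuildcache.internal and the image names are illustrative assumptions, not part of the proposal's text; only docker run, tag, push, pull, and build with --cache-from are used.

```shell
# Run a registry dedicated to build caching (hypothetical host name).
docker run -d -p 5000:5000 --name buildcache registry:2

# After a successful build, push the result so later builds can trust it.
docker tag mysuperapp:v1 mybuildcache.internal/mysuperapp:v1
docker push mybuildcache.internal/mysuperapp:v1

# On another build node: pull the cache source, then build trusting it.
docker pull mybuildcache.internal/mysuperapp:v1
docker build --cache-from mybuildcache.internal/mysuperapp:v1 -t mysuperapp:v2 .
```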

We can take this even further, but I hope the point is brought home. This is probably less convenient than the original behavior, but it is a good trade-off. It leverages the existing infrastructure and can be extended as use cases change.

Closes #18924.

@tonistiigi
Member

Some notes from #26839

  • Specifying multiple --cache-from images is a bit problematic. If both images match, there is no way (without doing multiple passes) to figure out which image to use. So we pick the first one (letting the user control the priority), but that may not be the longest chain we could have matched in the end. If we allowed matching against one image for some commands and later switched to a different image that had a longer chain, we would risk leaking some information between images, as we only validate history and layers for the cache. Currently I left it so that once we get a match, we only use that target image for the rest of the commands.
  • If the FROM image changes, no cache is reused. This is kind of useful and actually important, because otherwise users could never detect that there are important security fixes in the base image: they would always use the cache. But it may also come as a surprise to some users. I think we should not try to hack around it; rather, users who want different behavior should just use immutable tags or digests as the FROM image.
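
Pinning the FROM image by digest, as suggested in the second note above, would look like the fragment below. The digest shown is a made-up placeholder, not a real alpine digest, and the RUN line is illustrative.

```dockerfile
# Pin the base image by digest so the cache key stays stable even when
# the tag is repointed upstream. The digest below is a placeholder.
FROM alpine@sha256:0000000000000000000000000000000000000000000000000000000000000000
RUN apk add --no-cache curl
```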

@stevvooe
Contributor Author

@tonistiigi Thanks for implementing this feature!

@jonasschneider

jonasschneider commented Jan 19, 2017

Edit: Turns out we were in fact using the cache wrong (the cache is not used when a RUN doesn't modify the filesystem). Sorry for the confusion :)

--cache-from seems to be broken for me in 1.13.0. Here is a Gist that reproduces the problem: https://gist.github.com/jonasschneider/9e0bf96429444c89da1749225d89750a

Setup: Two empty Docker-in-Docker instances are started. The first one builds an image and pushes it to a registry. The second instance pulls down the image and attempts to build the identical Dockerfile, referencing the repo name of the first image in --cache-from.
Expected: The second build uses the cache of the first build.
Actual: The second build starts from scratch.
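
Condensed to its essential commands, the scenario from the Gist looks roughly like the following; the registry address and image name are illustrative stand-ins for the values in the Gist.

```shell
# First builder: build and push the image.
docker build -t registry.local:5000/app:latest .
docker push registry.local:5000/app:latest

# Second builder (empty local cache): pull, then build with --cache-from.
docker pull registry.local:5000/app:latest
docker build --cache-from registry.local:5000/app:latest -t app:latest .
# Expected: "Using cache" for each step; actual: a full rebuild.
```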

Am I using it wrong or is this a bug?

@thaJeztah
Member

@jonasschneider could you open an issue with more details (output of docker version and docker info), so that it can be looked into / checked if it's a bug or not?

@jonasschneider

@thaJeztah Thanks for taking a look. I think we've tracked down the issue to a pretty surprising combination of cache invalidation rules and non-FS-modifying RUN commands not being cached, so there might not be a bug. I'll think it through again and open issues accordingly.

@thaJeztah
Member

Thanks @jonasschneider 👍
