Pulling build cache #20316

Closed
delfick opened this Issue Feb 14, 2016 · 85 comments

Comments

@delfick

delfick commented Feb 14, 2016

Hi,

I found out yesterday that in docker 1.10.1 the parent chain of an image isn't pulled anymore.

This means on my bamboo agent I can no longer pull down the build cache for the images, so all my jobs rebuild every time and I get no time saving (sbt really does take a long time to run; I'd rather have the build cache for when the deps haven't changed).

Is it possible to reintroduce the ability to pull not just the top layer but the parent chain as well, please?

Thanks

Stephen.

@thaJeztah

Member

thaJeztah commented Feb 15, 2016

With the new content-addressable storage, there is no "parent chain"; an image is a collection of layers, and those layers are directly linked to the image (i.e. no need to traverse the parent-images to collect the dependent layers).

AFAIK, the build cache of an image is now separate, i.e. you can only make use of the build cache on the machine that actually built the image, because the build cache depends on both the instructions in the Dockerfile and the build context (the files used during build).

Did this change for you? I.e. were you previously able to docker pull an image on an empty machine, and see docker build skipping lines with "using cache..."?
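
(For anyone wanting to verify this themselves, a minimal check on an empty machine might look like the following; the image name is just an illustration:)

# Pull an image that was built and pushed from another machine:
docker pull myorg/myimage:latest
# Rebuild from the same Dockerfile and build context; before 1.10
# the matching steps would print " ---> Using cache":
docker build -t myorg/myimage:latest .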

@delfick

delfick commented Feb 15, 2016

Yeah, our build agents depend on the "using cache" behaviour to be fast.

The build agents get destroyed at night and recreated in the morning, so they need to recreate the cache by pulling down the latest version of the images.


@dustinlacewell

dustinlacewell commented Feb 16, 2016

Just chiming in here to say that we (Highland team) also depended on this functionality and are doing really ugly things to retain caching features in the face of this change.

@fx

fx commented Feb 17, 2016

+1 this makes CI builds very, very painful.

@kimh

kimh commented Feb 20, 2016

@dustinlacewell I'm curious about your really ugly things. Is that something you can share? I opened #20380 but I suppose the root cause is the same and I'd like to know a way to use cache even if it's very ugly.

@bfosberry

bfosberry commented Mar 3, 2016

+1. This broke caching functionality for us and for our customers; any workarounds or a fix ETA would be appreciated.

@kreisys

kreisys commented Mar 3, 2016

+1. I'm one of those customers.

@sdornan

sdornan commented Mar 4, 2016

+1 another customer

@bfosberry

bfosberry commented Mar 7, 2016

Any thoughts @thaJeztah? Is there a way we can masquerade the build context? Previously if the relevant files for an ADD/COPY were identical it used the cache, even on another machine. How does the new image cache prevent caches from other machines from being used?

@bfosberry

bfosberry commented Mar 7, 2016

This is starting to make sense, so is there a way we can specify for image build layers to be included in the pull?

@luben93

luben93 commented Mar 8, 2016

+1 another customer

@rheinwein

Member

rheinwein commented Mar 8, 2016

+1 as this does increase build times significantly, and also slows down day-to-day work for anyone who spins up lots of disposable VMs during development. @dustinlacewell I would also be very interested in hearing your very ugly things. We attempted to tag and push each individual layer during the build process and push/pull those as a way to recreate the cache, but to no avail.

@jpetazzo

Contributor

jpetazzo commented Mar 8, 2016

Following some internal convo here at Docker —

This is addressing a security issue, and the associated threat model is "as an attacker, I know that you are going to do FROM ubuntu and then RUN apt-get update in your build, so I'm going to trick you into pulling an image that _pretends_ to be the result of ubuntu + apt-get update, so that next time you build, you will end up using my fake image as a cache, instead of the legit one."

With that in mind, we can start thinking about an alternate solution that doesn't compromise security.

@bfosberry

bfosberry commented Mar 8, 2016

That makes sense. It seems like we should be able to come up with a sensible middle ground that does not compromise security, using Notary, or at least in the meantime allow users to bypass the security protections in situations where they are confident of the source of the layers.

@delfick

delfick commented Mar 8, 2016

Surely if an attacker has access to where you are building your images, you have bigger problems?

Also, if they can fake an intermediate image, what stops them faking the final image?


@delfick

delfick commented Mar 8, 2016

After a discussion with a friend at work, I can see it from a different viewpoint.

So let's say we pull down evil/foo, which is FROM ubuntu followed by RUN apt-get update, except with a small surprise included in the image.

Subsequent builds using those same commands will be compromised.

Now, if we base a build on evil/foo we get the same problem regardless of the intermediate images, but in that case we are trusting that evil/foo as a whole is not compromised; we shouldn't have to worry that merely downloading evil/foo will negatively affect subsequent builds of other images.

So, my proposal is: can we put trust at a per-registry level?

So I can say to Docker: I trust that only I can put images into this specific registry, so you may download intermediate images from it as well, because only I have the ability to put them there in the first place. And for public registries, I only download the final image.

@tonistiigi

Member

tonistiigi commented Mar 8, 2016

I propose adding support for loading parent chains in the load endpoint (this already works in legacy mode). I think docker load has somewhat different security properties than pull. Then we can provide an external tool for loading/saving build cache metadata without restarting the daemon. So in CI, you could do docker pull and then try to apply the build cache data on top of it.
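
(A rough sketch of what that CI flow could look like once load supports parent chains; the image name is illustrative, and only the machine that actually built the image has the cache metadata to export:)

# On the builder machine, export the image (with its parent chain,
# under this proposal):
docker save -o myimage.tar myorg/myimage:latest
# On the CI machine, import it and rebuild; matching steps could
# then be served from the restored cache:
docker load -i myimage.tar
docker build -t myorg/myimage:latest .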

@mleventi

mleventi commented Mar 17, 2016

Are there any known workarounds for this issue currently?

@kramarz

kramarz commented Mar 21, 2016

+1. We were using tar to make sure that our build context was always the same, so we could share cached layers. Now it's useless and we have to build everything from scratch on every machine. We use the Jenkins EC2 plugin, so this means a complete rebuild of all our images multiple times per day. We use a private registry with SSL, so we are sure what the layers are.

@amrali

amrali commented Mar 23, 2016

This broke caching functionality for us as well. Is the attack vector registry poisoning or a MITM on docker during a FROM? AFAIK docker securely pulls from the registry.

@mitchcapper

Contributor

mitchcapper commented Mar 25, 2016

The security concern is understandable, but an option to allow the trusting or pulling of cache would be a big win. It doesn't sound like people in this thread mind losing the cache history for images pulled from Docker Hub so much as between their own machines. Being able to pull the build cache from another Docker host (or push and pull it from an internal registry, for example) would solve most people's problems. It would also avoid the security issue (as you are simply pulling from a trusted host that originally followed the security practices). I don't see any security decrease in this practice.

For more details:
This is a complicated issue, but one that most likely needs some sort of shared caching solution; otherwise it requires more aggressive tagging and complex client-side infrastructure.
Right now we build all the images and distribute them to various machines. Some clusters use additional images, so those get built in that cluster. Before, it would pull down everything from the hub, then go through and run build on all the Dockerfiles for the images on that cluster. 99% of these images were built previously, so it used the cache and was done quickly. Then it pushes all images to the registry; again, 99% were previously pushed, so very few new bits are pushed.

With the new system, obviously every build on any cluster that didn't do the originating build ends up having to rebuild every image. In addition, it then ends up re-pushing every image to the registry (which is a lot of overhead).

This could possibly be changed by updating our framework to build and push only the images that have changed, rather than trying to build and push every image. This is not easy, however, as it means additional tagging for every Dockerfile revision, or a client-side database of the 'current' image ID. Essentially, to replicate the old behavior, one needs a client-side build tool that hashes the Dockerfile and build context the way the build daemon does.
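
(A very rough sketch of that client-side hashing, with hypothetical file names. Plain tar output varies with file mtimes and ordering, so a real tool would have to normalize the context the way the daemon does; GNU tar flags are used here for determinism:)

# Fingerprint the Dockerfile plus build context:
context_hash=$(tar --sort=name --mtime='@0' --owner=0 --group=0 -cf - . | sha256sum | cut -d' ' -f1)
# Only rebuild when the fingerprint has changed since the last build:
if [ "$context_hash" != "$(cat .last-build-hash 2>/dev/null)" ]; then
    docker build -t myorg/myimage:latest .
    echo "$context_hash" > .last-build-hash
fi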

@deavid

deavid commented Apr 6, 2016

This issue is effectively stopping me from using docker. Please add to "docker pull" an option to also pull build cache/intermediate images (at our own risk).

I'm new to Docker, so I'm probably wrong, but I don't see any security issue here. If the image hash is now a secure hash instead of a UUID (for example, SHA-512), the probability of collision is almost zero.
If an attacker could trick your Docker client into pulling their image instead of yours, it's because it computes the same hash. Then either both are identical byte for byte, or they spent years brute-forcing a collision. Anyone who could do that could also forge SSL certificates, for example, or full images. I don't see why this would be an attack on intermediate images but not a concern for final images.

Anyway, I'm all for an option that enables this kind of pull "at our own risk". Please.

@tonistiigi

Member

tonistiigi commented Apr 7, 2016

I don't see any security issue here. If the image hash is now a secure hash instead of a UUID

Build cache is not based on the image IDs but uses a different method that tries to map Dockerfile commands to configurations.

But yes, for some cases, like CI that only downloads a single image from a trusted source, there is no issue. We have merged #21385, which will ship in v1.11 and can be used by external tools to import chains of image configurations. I'll try to start working on one such tool soon.

@bfosberry

bfosberry commented Apr 7, 2016

@tonistiigi this is important for us; let us know if we can assist with the development of such tooling.

@mitchcapper

Contributor

mitchcapper commented Apr 7, 2016

@tonistiigi would the behavior be to docker pull, then docker save | docker load? Or would we have to docker save from a hub and use docker load on the machine?

@simonvanderveldt

simonvanderveldt commented Apr 14, 2016

Build cache is not based on the image IDs but uses a different method that tries to map Dockerfile commands to configurations.

@tonistiigi Why doesn't the build cache use image IDs?

thaJeztah added this to the 1.13.0 milestone Sep 26, 2016

@simonvanderveldt

simonvanderveldt commented Sep 26, 2016

@tonistiigi I might be misunderstanding things, but how would #26839 solve not having a build cache on a CI machine? All the build jobs would need to parse the Dockerfile to determine which image is in the FROM line and append it to the docker build command?

@tonistiigi

Member

tonistiigi commented Sep 26, 2016

@simonvanderveldt The image specified in --cache-from is the image from the previous CI build. It doesn't have anything to do with the FROM line in Dockerfile. CI can pull it with regular docker pull, like they could before v1.10.
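
(So once --cache-from is available, a CI job could look roughly like this; the registry and image names are illustrative, and the pull is allowed to fail on the very first build:)

# Warm the local engine with the image from the previous CI run:
docker pull registry.example.com/app:latest || true
# Build, telling the engine it may use that image's layers as cache:
docker build --cache-from registry.example.com/app:latest -t registry.example.com/app:latest .
# Publish so the next CI run can reuse this build's layers:
docker push registry.example.com/app:latest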

@simonvanderveldt

simonvanderveldt commented Sep 26, 2016

@tonistiigi OK, thanks for the clarification. We'll give it a try, let's see how well it works :)

@graingert

Contributor

graingert commented Sep 26, 2016

I think it would be much better to whitelist registries, e.g.:

--history-whitelist 'mycustom.example.com/frontend' 'ubuntu' 'mycustom.example.com/backend'


@delfick

delfick commented Sep 27, 2016

That's a cool feature. But I don't understand how that doesn't conflict with your earlier security concerns... (maybe I need more coffee?)

I agree with Thomas above that this ticket would be better served by whitelisting registries (with a massive warning if you whitelist Docker Hub!).

And, as a bonus, in a way that is global to Docker, so I don't have to change all projects.


marius311 referenced this issue Oct 4, 2016: Getting Error #1 (open)

@andrask

andrask commented Dec 7, 2016

It used to be great that I was able to select a layer from any image and use it as a starting point. Currently, I am given an image that has 4 layers to be stripped off to get to the original base image. The original image is not reconstructable in any other way.

I'll go back to Docker 1.9 and do it there by simply tagging the given layer.

I couldn't find a way to do this with the new system. Does anyone have some advice?

@thaJeztah

Member

thaJeztah commented Dec 7, 2016

@andrask you still can: if the image was built locally, the intermediate layers are still stored locally as images during docker build. Those images are not distributed, though, when doing docker push:

docker build -t foo .
Sending build context to Docker daemon 2.048 kB
Step 1/3 : FROM alpine
 ---> baa5d63471ea
Step 2/3 : RUN echo "step-two" > foobar
 ---> Running in dac42a660616
 ---> 8d8f7ba114a1
Removing intermediate container dac42a660616
Step 3/3 : RUN echo "step-three" > foobar
 ---> Running in fc1292ec6183
 ---> 401c84521cea
Removing intermediate container fc1292ec6183
Successfully built 401c84521cea
docker run --rm 8d8f7ba114a1 cat foobar
step-two

docker run --rm 401c84521cea cat foobar
step-three
@andrask

andrask commented Dec 7, 2016

@thaJeztah Unfortunately, this image was built months ago. No one has the original build any more. We are left with a descendant image that has all the original content, but on lower layers.

With Docker 1.8.3 I just downloaded the image from the registry and tagged the given layer.

Is anything like this possible with the new setup?

@thaJeztah

Member

thaJeztah commented Dec 7, 2016

@andrask the layers are all there; if you docker save -o image.tar <image> you'll get an archive containing all the image and layer data. Not sure how easy it is to reconstruct an image from previous layers; I haven't tried.

@andrask

andrask commented Dec 7, 2016

@thaJeztah Thanks for the info. Though I (and probably many others) would highly appreciate some guidance on how this can be done. Creating the image descriptor by hand and removing the unneeded layer info from the JSON configs would probably work, I guess.

@mitchcapper

Contributor

mitchcapper commented Dec 7, 2016

@andrask you are talking about a small corner case, I believe. Using the tool above, you can already save the metadata for an image after it is built, with all the layers, and transfer that to any system you need. You are talking about the case where you have lost the original build cache and do not want to rebuild it locally on one machine to get the history needed to distribute it. Most of the pain areas described above do not match that.

@andrask

andrask commented Dec 7, 2016

@mitchcapper I'm not sure if this is a corner case. I think it easily falls in line with one of the comments above:

Allowing parent layer metadata to be saved for a layer, regardless of whether the parent layer is in the save command, would be a huge win for those of us working on CI/remote systems.

Reusing parent layers used to be ridiculously easy. It would be good if we could get some comparably easy way to do it now.

PS: just to make my case clearer
It is impossible to rebuild the base from the Dockerfile, as the 3rd-party dependencies have changed significantly in the 8 months since the base was last built. The tags for my base image have been overwritten, and I can only restore them from a descendant image.
With Docker 1.8 I simply pulled the descendant image, tagged the base layer, and was done.
With Docker 1.10+ I'd need to save the image, then manually construct the base image descriptor and reload it. Doable, but sad that it's far more complex.
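
(For anyone attempting the same, a rough, untested sketch of that manual surgery. The layout below is the docker save archive format as of 1.10+, details may vary between versions, and the image name is illustrative:)

# Export the descendant image; the tarball contains manifest.json,
# the image config JSON, and one directory per layer:
docker save -o descendant.tar mine/descendant:latest
mkdir unpacked && tar -xf descendant.tar -C unpacked
# To approximate the old base image, drop the top entries from the
# "Layers" list in manifest.json and the matching "history" and
# "rootfs.diff_ids" entries in the config JSON, then re-import:
tar -cf base.tar -C unpacked . && docker load -i base.tar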

@kbiernat

kbiernat commented Jan 19, 2017

I tried to use the mentioned method with --cache-from; however, it doesn't work as expected.
It does indeed get some more steps from the cache, but not all of them.

The case for me is that there are a few people working on an image (weighing about 2 GB at the moment), and rebuilding most of the image on any change by any of those people, then re-pushing it, is a big pain.

@andrask obviously it's too late, but it's good practice to keep 3rd-party dependencies mirrored in your own infrastructure :) There is NO GUARANTEE that even a huge site (like Launchpad for downloading debs) won't go down for a period of time. Plus, it obviously saves a lot of time when doing wget -O - 500mb-sources-file.tar.gz | tar xzf - && configure && make/install && rm -rf sources in one step.
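
(Spelled out as a single Dockerfile step, with a hypothetical mirror URL and source name, that pattern looks like:)

# Fetch, unpack, build, and clean up in one layer, so the sources
# never persist in the final image:
RUN wget -O - https://mirror.example.com/sources-1.0.tar.gz | tar xzf - \
    && cd sources-1.0 \
    && ./configure \
    && make && make install \
    && cd .. && rm -rf sources-1.0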

@tonistiigi

Member

tonistiigi commented Jan 19, 2017

@kbiernat

I tried to use the mentioned method with --cache-from; however, it doesn't work as expected.
It does indeed get some more steps from the cache, but not all of them.

Have you opened an issue for this?

@kbiernat

kbiernat commented Jan 19, 2017

@tonistiigi no :) I'll try to investigate the reasons and will open an issue with a full description of what the switch changes for me, if needed. Thank you for your interest.

@kish3007

kish3007 commented Mar 1, 2017

I have been trying to make use of the --cache-from option in Docker 1.13, with no luck so far; any help would be really appreciated. These are the steps I have been following:
Dockerfile1:
FROM centos:7
RUN echo "Step1"
RUN echo "step2"

docker build -t kish0509/test1:latest .
docker push kish0509/test1:latest

Dockerfile2:
FROM centos:7
RUN echo "Step1"
RUN echo "step2"
RUN echo "step3"
RUN echo "step4"

In the second machine I have been trying to make use of cache:
docker pull kish0509/test1:latest
docker build --cache-from kish0509/test1:latest -t kish0509/test2:latest .

It doesn't make use of the cache for the first 2 instructions; rather, it runs each instruction again. Am I missing anything?

@tonistiigi

Member

tonistiigi commented Mar 1, 2017

@kish3007 What is the output of
docker history --no-trunc registry/test1:latest
docker history --no-trunc registry/test2:latest

@kish3007

kish3007 commented Mar 1, 2017

docker history --no-trunc kish0509/cachtest:v1-latest

IMAGE                                                                     CREATED             CREATED BY                                                                                          SIZE                COMMENT
sha256:3fa7c14f1eb1c5fdd9175de84dcbd730d809febc5b7ee400c211e291e16124e0   2 hours ago         /bin/sh -c echo "step 2"                                                                            0 B                 
<missing>                                                                 2 hours ago         /bin/sh -c echo "step1"                                                                             0 B                 
<missing>                                                                 2 months ago        /bin/sh -c #(nop)  CMD ["/bin/bash"]                                                                0 B                 
<missing>                                                                 2 months ago        /bin/sh -c #(nop)  LABEL name=CentOS Base Image vendor=CentOS license=GPLv2 build-date=20161214     0 B                 
<missing>                                                                 2 months ago        /bin/sh -c #(nop) ADD file:940c77b6724c00d4208cc72169a63951eaa605672bcc5902ab2013cbae107434 in /    192 MB              
<missing>                                                                 6 months ago        /bin/sh -c #(nop)  MAINTAINER https://github.com/CentOS/sig-cloud-instance-images                   0 B           

docker history --no-trunc kish0509/cachtest:v2-latest

IMAGE                                                                     CREATED             CREATED BY                                                                                          SIZE                COMMENT
sha256:31919b97f2dcd7409d72aa18727eb72acd4a4565a639d016a3866c27fc8d8a07   30 seconds ago      /bin/sh -c echo "step 4"                                                                            0 B                 
sha256:21a9a56449d5ee4f5c6f927b9891fb400243609920d6388dd2a3a96625a21a83   32 seconds ago      /bin/sh -c echo "step 3"                                                                            0 B                 
sha256:14d89a0f37b04ce8dc20d2286d7698f7c8f45469a1a1279ead0960bd75df7fd7   34 seconds ago      /bin/sh -c echo "step 2"                                                                            0 B                 
sha256:08e6a91d1b4c163fc3c1241cf88903d811b830837026276c42300d019af10198   36 seconds ago      /bin/sh -c echo "step1"                                                                             0 B                 
sha256:67591570dd29de0e124ee89d50458b098dbd83b12d73e5fdaf8b4dcbd4ea50f8   2 months ago        /bin/sh -c #(nop)  CMD ["/bin/bash"]                                                                0 B                 
<missing>                                                                 2 months ago        /bin/sh -c #(nop)  LABEL name=CentOS Base Image vendor=CentOS license=GPLv2 build-date=20161214     0 B                 
<missing>                                                                 2 months ago        /bin/sh -c #(nop) ADD file:940c77b6724c00d4208cc72169a63951eaa605672bcc5902ab2013cbae107434 in /    192 MB              
<missing>                                                                 6 months ago        /bin/sh -c #(nop)  MAINTAINER https://github.com/CentOS/sig-cloud-instance-images                   0 B      
@tonistiigi

Member

tonistiigi commented Mar 2, 2017

@kish3007 This seems to be #31189. It should be fixed when you update to v17.03.0-ce.

@kish3007

kish3007 commented Mar 2, 2017

Thank you very much. This issue is fixed in v17.03.0-ce; however, is there a way to pull the image on the fly during docker build? Like this: docker build --cache-from kish0509/cachetest:1 --pull -t kish0509/cachetest:2 .

Right now we pull the version 1 image separately and then build the new image.

@javipolo

javipolo commented Jan 17, 2018

In case someone is going nuts with reusing layers as I did: the "trick" is to pass to --cache-from both the image you are rebuilding (and have it pulled already) AND the image it uses as a base in its FROM line.

Example:
Dockerfile for image custom-gource:0.1

FROM base_image:2.2.1
RUN apt-get update && apt-get install gource
COPY myscript.sh /myscript.sh

In order to rebuild on another host without running the apt-get again, you'll need to:

docker pull custom-gource:0.1
docker build --cache-from=base_image:2.2.1,custom-gource:0.1 . -t custom-gource:0.2

It might seem obvious, but I struggled with this for a long time until I realized that you need to include the base image too.

jcfr added a commit to dockbuild/dockbuild that referenced this issue Feb 26, 2018

makefile: Update build rule to use "--cache-from" and reuse cached layers

To support docker client < 1.13, a fallback without the argument was added.

See moby/moby#20316 (comment)

scottx611x added a commit to parklab/SigMA that referenced this issue Dec 7, 2018
