Caching layers during build #143

Open
j00bar opened this Issue Aug 2, 2016 · 14 comments

@j00bar
Collaborator

j00bar commented Aug 2, 2016

One of the niceties of the Dockerfile is that the Docker engine re-uses cached layers to accelerate rebuilds.

This is not nearly so straightforward in Ansible Container.

This issue is to track the ongoing discussions among the core team as to how we might implement a similar facility into our builds.

@chouseknecht

Member

chouseknecht commented Aug 9, 2016

@j00bar any updated specs since our conversation yesterday?

@j00bar

Collaborator

j00bar commented Aug 10, 2016

Trying to document. #thestruggleisreal What I think we've arrived at, as an optional build approach:

We're going to implement an Ansible execution strategy. This build approach will not orchestrate builds using Docker Compose. For each task, it will calculate a hash based on the previous task's hash, the task itself, the host, and an enumerator of operation order. It will look up an image with a label matching that hash. If found, it will consider that image a cache of the task's result and move on. If not, it will stand up a container for that host from the parent's cached image, execute the one task, stop the container, and commit the container as a new image for the cache.

Thus, besides the builder container, at most one other container will be running at a time. We will still be able to copy/fetch files to/from the running container and the builder container. But the containers being built will not be able to talk directly to one another over the network.

At the end of the build, the resultant image will have one layer per task executed.
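The hash chain described above can be sketched roughly as follows. This is a hypothetical illustration of the scheme, not ansible-container's actual code; the function names and the use of SHA-256 over a canonical JSON payload are assumptions.

```python
import hashlib
import json


def task_fingerprint(parent_hash, task, host, position):
    """Fingerprint one task from: the previous task's hash, the task
    definition, the host, and the task's position in execution order."""
    payload = json.dumps(
        {"parent": parent_hash, "task": task, "host": host, "pos": position},
        sort_keys=True,  # canonical ordering so equal tasks hash equally
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def fingerprint_chain(base_image_id, tasks, host):
    """Walk a play's tasks, producing one fingerprint per task/layer.
    Changing any task invalidates every fingerprint after it."""
    h = base_image_id
    chain = []
    for i, task in enumerate(tasks):
        h = task_fingerprint(h, task, host, i)
        chain.append(h)
    return chain
```

Because each fingerprint folds in its parent, editing an early task automatically busts the cache for all later layers, which mirrors how Dockerfile layer caching behaves.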

I'm also going to suggest that, as part of this strategy, we allow playbook writers to define a special variable per task, say, _cacheable, which, if False, will automatically bust the cache.

Again, this will be an optional build method for speed - the existing build method that results in a single additional layer per build will remain.

@chouseknecht

Member

chouseknecht commented Aug 13, 2016

Added feature/build_cache branch for this. Added as an upstream branch to maybe make collaboration easier.

@jzaccone

jzaccone commented Sep 9, 2016

Has any thought gone into the benefits of using image layers outside of build speed? One of the great things about Docker is that it is fast and lightweight, not only in building but also in shipping and running on different environments. Image layers are a big part of that.

For example:

  1. Faster ship times between hosts
  2. Much smaller footprint on every host where the image lives.
  3. The same as 2, compounded by the reuse of common layers between different containers

Packing everything into one layer is wasteful if the common use case is to change the last layer of the image 95% of the time (i.e. code).

And I guess my point here is that I vote for integrating support for image layers out of the box rather than an optional build parameter.

(thanks for the outsider comments. Will be happy to contribute after discussions)

@j00bar

Collaborator

j00bar commented Sep 9, 2016

@jzaccone Thanks for bringing up those points. Out of the box, Ansible Container adds one layer to the base image during the playbook run. So if you're using CentOS (4 layers in stock image), your built image has 5 layers total. You still get all of the benefits you enumerated.

@dustymabe

Contributor

dustymabe commented Sep 9, 2016

So if you're using CentOS (4 layers in stock image)

Well most of the layers are just metadata:

Layers                                                                                                                                                 
└─970633036444 docker.io/centos:7 /bin/sh -c #(nop) CMD ["/bin/bash"]                                                                                  
  └─<missing> /bin/sh -c #(nop) LABEL name=CentOS Base Image vendor=CentOS license=GPLv2 build-date=20160729                                           
    └─<missing> /bin/sh -c #(nop) ADD file:44ef4e10b27d8c464ad675a8a514a382c8748bb17d1bd707df084f6315076149 in /                                       
      └─<missing> /bin/sh -c #(nop) MAINTAINER https://github.com/CentOS/sig-cloud-instance-images
@jzaccone

jzaccone commented Sep 9, 2016

In a typical application, the real "meat" is going to come in that last layer created by the playbook run. There is much more to gain here by breaking that apart into multiple layers.

@j00bar

Collaborator

j00bar commented Sep 9, 2016

@jzaccone "gain" in terms of what?

@tomsun

tomsun commented Sep 10, 2016

Gain example: dependencies

One application I deploy ATM (to AWS ECR, from a Dockerfile) has:

  • a base image (~150 MB)
  • an application-specific apt-get layer (~300 MB), changes very infrequently, only when the apt-get oneliner in my Dockerfile is changed
  • an application-specific pip install layer (~50MB), changes quite infrequently, only when my application's requirements.txt file is changed
  • my application-specific code - changes on every deploy (~20 MB)

On a typical build only the last layer changes, i.e. I only have to upload ~20 MB to AWS most of the time.

If the recipe for the three custom layers were an ansible-container playbook instead of a Dockerfile, and if they were produced as a single layer, then I would upload ~370 MB of data on every deploy.

That is, a way, within the vocabulary of ansible-container, to express that some task, role, or play needs its own layer - plus some reasonable cache-invalidation criteria to go with it - would be very nice.
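One plausible invalidation criterion for a dependency layer like the pip one above is a content hash of the file that defines it: rebuild the layer only when requirements.txt actually changes. This is a sketch of that idea under my own assumptions, not anything ansible-container implements today.

```python
import hashlib


def file_layer_key(path):
    """Derive a cache key for a dependency layer from its defining file
    (e.g. requirements.txt). The key is stable while the file's contents
    are unchanged, so the layer can be reused across deploys, and it
    changes as soon as the file does, busting the cache."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()
```

Feeding a key like this into the per-task fingerprint would let the apt-get and pip layers survive code-only deploys, leaving just the ~20 MB application layer to rebuild and push.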

@jzaccone

jzaccone commented Sep 16, 2016

@tomsun @j00bar This is the use case that I am referring to. Thank you, tomsun, for providing the example.

To combine my list from above with tomsun's example: the ~370 MB would be duplicated in each unique image, when only a unique layer containing the ~20 MB code change is necessary. This increases the container footprint on every server the image is moved to.

@j00bar

Collaborator

j00bar commented Sep 17, 2016

@tomsun @jzaccone Thanks, y'all - that's tremendously helpful. Your input on my design/UX quandary about how to implement this would be welcome. See here: #217 (comment)

@gregdek gregdek modified the milestones: 0.3, 0.2 Nov 22, 2016

@gregdek gregdek added the in progress label Dec 8, 2016

@kavehv

kavehv commented Dec 12, 2016

@tomsun's point is the main item holding me back from using ansible-container at the moment (and I would love to do so). The potential proliferation of layers due to the current ineffective cache use is problematic for me as well. I'm curious when you think it might be ready. Dockerfiles just don't have the same functionality offered by ansible.

@shanemcd

Member

shanemcd commented Jan 4, 2017

Hey @j00bar, has there been any progress made on this?

We're using containers in our Jenkins infrastructure and without layer caching, building / pushing / pulling flat images would slow our builds down quite a bit.

@j00bar

Collaborator

j00bar commented Jan 12, 2017

@shanemcd There's a WIP branch that implements an execution strategy that identifies and fingerprints layers. I still need to write the code that will actually do the reuse and cleanup of stale layers.

@gregdek gregdek modified the milestone: 0.3 Feb 13, 2017
