[Feature request] Docker build should use cache unless asked not to #6439

Closed
MirzaSikander opened this issue Feb 15, 2018 · 38 comments

@MirzaSikander

Environment

  • Server - VSTS or TFS on-premises?
    VSTS

  • Agent - Hosted or Private:

    • If using Hosted agent, provide agent queue name:
      Linux Hosted Preview

Issue Description

Locally, if you manually build docker containers over and over again with small changes, the docker daemon uses cache to speed up docker build.

On VSTS, every time a build is started, the docker build goes through the same steps it has gone over previously: the base images are fetched and expanded, apt-get installs all the dependencies, files are copied over, etc. This adds a lot of time to each build; in fact, the majority of our CI/CD build time is wasted on this.

Expectation:
If the host being used is the same for each build, it should make use of the cache from previous builds to speed up this step, unless specifically asked not to.

@polys

polys commented May 9, 2018

Could you at least cache base images from public repositories as a first step?

It's such a waste of time downloading images like microsoft/aspnetcore, microsoft/aspnetcore-build, node, python, nginx etc. again and again.

@borkke

borkke commented May 12, 2018

+1 for this one.

Does anyone know if it is the same on TFS?

@polys

polys commented May 12, 2018

On Hosted VS2017 agents, some container images are already cached apparently:
https://github.com/Microsoft/vsts-image-generation/blob/master/images/win/Vs2017-Server2016-Readme.md#docker-images

@linuxchata

Is there information about container images cached on Hosted Linux Preview agents?

@chrispat
Contributor

chrispat commented Jun 8, 2018

The host is not the same for each build; it is a fresh host each time, for security and consistency reasons. We do precache some of those images on the host machines. We can look at caching others, but that cache will always drift from what is currently published, simply due to how often the pool is updated. So it is possible that you build a Docker container with a node base and get a different sha256 for the base image in CI from what you get when you build locally.

@bryanmacfarlane
Contributor

Note that we are going to release a new Ubuntu (released) hosted pool, and we're investigating caching a series of more focused containers (node, dotnet core, etc.). We may keep a couple of tags of each. But as @chrisrpatterson noted, your cache hits will be hit or miss due to a number of factors. We're experimenting now.

This issue should move to vsts-image-generation repo since that's where the hosted VM gen and the new docker images work will happen.

@andreacassioli

This is quite an issue, especially when using CI/CD intensively and with base images such as the ASP.NET SDK or the like (unfortunately even dotnet core 2.1 seems to have a very large SDK image).

@chrisrpatterson: the cache will clearly drift, that's in its nature. The point is that as long as a new image is not published, the one I am using will be cached, as the SHA will be the same (note that people should pin a specific version, not just latest!).

@jikuma jikuma removed their assignment Jul 23, 2018
@andreacassioli

I have noticed that the docker build process always deletes intermediate containers. Why is that? It is not the default Docker behaviour. We have some use cases in which we would like to split a multi-stage build into multiple tasks, but right now there is no way to carry the cache from one stage to the next.

Note that this does not happen when running docker build from a bash task.

@harshil93 harshil93 reopened this Jul 27, 2018
@harshil93
Contributor

@andreacassioli I'm not an expert on Docker, but the default option in docker build is to remove intermediate containers. Do you want to keep the containers after docker build?

https://docs.docker.com/engine/reference/commandline/build/#options

See the --rm
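
If you do want to keep them, one option might be to pass --rm=false through the task's extra arguments. A minimal sketch, assuming the Docker@2 task used later in this thread and a placeholder Dockerfile path and repository name (illustrative only, not a confirmed fix for the task behaviour described above):

- task: Docker@2
  displayName: Build, keeping intermediate containers
  inputs:
    command: build
    repository: my-app               # placeholder
    Dockerfile: '**/Dockerfile'      # placeholder path
    arguments: '--rm=false'          # keep intermediate containers after a successful build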

@andreujuanc

I'd love this!
Even just for testing purposes: I'm trying different settings while configuring my pipeline, and if every queued build is going to use 5 minutes of my free build time, I'll run out of minutes in less than a day :(

@matt-mahdieh

Any progress on this issue? I have been having the same issue on VSTS. Every time I trigger a new build, it downloads the images (in my case the .NET SDK), which is huge. Is there any way to cache them, similar to what is available in Bitbucket Cloud?

@mattnedrich

Any update on this?

@mattquinn

I have a customer interested in this feature.
They understand running your own private build agent can solve this, but the tradeoff of then having to manage that infrastructure isn't compelling.

The idea of being able to leverage the cache in a hosted build agent is compelling - quicker builds, without managing infrastructure - and could help get their builds down from tens of minutes to ~1-2 mins, which on their CI/CD "always deploying" plans would be a massive win.

@AndreyBespamyatnov

I have the same issue with downloading the MSSQL image every time; it takes ~20 min.
Thanks

@p10tyr

p10tyr commented Dec 10, 2018

And now, limited to 1 parallel build, a simple app takes minutes to build in Docker CI because it has to download everything every time... Pleeeeaaaseeeee! I don't want to wait hours for a build to complete and release... We already pay for extra parallel jobs, but this is a bit out of hand now.

@nelisbijl

nelisbijl commented Dec 15, 2018

I have a customer interested in this feature.
They understand running your own private build agent can solve this, but the tradeoff of then having to manage that infrastructure isn't compelling.

The idea of being able to leverage the cache in a hosted build agent is compelling - quicker builds, without managing infrastructure - and could help get their builds down from tens of minutes to ~1-2 mins, which on their CI/CD "always deploying" plans would be a massive win.

I managed to create a self-hosted agent from https://github.com/Microsoft/azure-pipelines-image-generation using Packer. Although the documentation fails to explain how to create a VM from the generated vhd and template file, I got one up and running, installed the agent software and configured it.

However, this agent still fails to cache the costly 'npm install' step in my Dockerfile.
If I run the same docker build command on that machine, in the folder created by the pipeline agent, it does use the cached Docker image!!
Something must be different between using the command line and using the Azure Pipelines agent, but I haven't found it yet.

UPDATE
I found that it is caused by the job's option 'Allow scripts to access the OAuth token'.
Its value is passed to docker build using --build-arg ACCESS_TOKEN=$(System.AccessToken).
After the Dockerfile statement ARG ACCESS_TOKEN, every RUN command is re-executed and no longer uses the cache:

2018-12-15T12:35:24.1151714Z Step 2/6 : COPY .npmrc .npmrc
2018-12-15T12:35:24.1169984Z ---> Using cache
2018-12-15T12:35:24.1170265Z ---> 0d5e206a8336
2018-12-15T12:35:24.1170385Z Step 3/6 : COPY package*.json ./
2018-12-15T12:35:24.1180777Z ---> Using cache
2018-12-15T12:35:24.1181102Z ---> 21eb833bb3c8
2018-12-15T12:35:24.1181370Z Step 4/6 : RUN ls -altrR .
2018-12-15T12:35:24.1186686Z ---> Using cache
2018-12-15T12:35:24.1187008Z ---> 895655fcdf21
2018-12-15T12:35:24.1187084Z Step 5/6 : ARG ACCESS_TOKEN
2018-12-15T12:35:24.1193136Z ---> Using cache
2018-12-15T12:35:24.1193388Z ---> a3a4895ab808
2018-12-15T12:35:24.1193684Z Step 6/6 : RUN ls -altrR .
2018-12-15T12:35:24.3915668Z ---> Running in 52887c15fe39
2018-12-15T12:35:25.4198528Z .:
2018-12-15T12:35:25.4198815Z total 384
2018-12-15T12:35:25.4199606Z drwxr-xr-x 1 root root 4096 Nov 20 08:30 ..
2018-12-15T12:35:25.4200313Z -rw-r--r-- 1 root root 1401 Dec 15 12:24 package.json
2018-12-15T12:35:25.4200657Z -rw-r--r-- 1 root root 373122 Dec 15 12:24 package-lock.json
2018-12-15T12:35:25.4201062Z -rw-r--r-- 1 root root 309 Dec 15 12:24 .npmrc
2018-12-15T12:35:25.4201349Z drwxr-xr-x 1 root root 4096 Dec 15 12:25 .

The second ls statement is re-executed!
Note that the $ACCESS_TOKEN is not used anywhere.

FYI: I use the access token in my .npmrc file to access our private npm registry

I don't know whether System.AccessToken is different on every run (it echoes as ***) or whether its value simply cannot be cached by Docker. Specifying any other value (e.g. --build-arg ACCESS_TOKEN=AT) makes the caching work as expected.

Anyone any idea?

UPDATE 2

Guess what? There is a difference between:

docker build . --build-arg ACCESS_TOKEN=$(System.AccessToken) ...

and

ACCESS_TOKEN=$(System.AccessToken) docker build . --build-arg ACCESS_TOKEN=$ACCESS_TOKEN ...

The second has no problem using docker's cache after the ARG ACCESS_TOKEN statement.
Great performance improvement!

UPDATE 3
After the holidays I came to the conclusion that what I stated in UPDATE 2 is not correct.

You cannot specify an environment variable at the start of a command line. You have to use the dedicated environment section for that (either in the designer or in YAML); otherwise the environment variable will simply not be set (empty string value).
The reason it worked for me was that the ACCESS_TOKEN build arg wasn't actually used.
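
For anyone wondering what that dedicated environment section looks like in YAML, here is a minimal sketch (step and variable names are illustrative). Note that, per the next paragraph, passing the token as a build arg at all still defeats caching, because its value changes on every run:

- script: docker build . --build-arg ACCESS_TOKEN=$ACCESS_TOKEN
  displayName: docker build with the token mapped via the env section
  env:
    ACCESS_TOKEN: $(System.AccessToken)   # mapped explicitly here; setting it inline at the start of the command left it empty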

As $(System.AccessToken) seems to be different on every run, you should not use it in your Docker build if you want to benefit from caching.

However, I somehow needed to authenticate against my private npm registry.
You can use an NpmAuthenticate step for that. This updates the .npmrc files with an authentication token. You have to specify credentials: when left blank, the OAuth token of the agent will be used, which results in a different .npmrc file on every run. Instead, you should use an npm service connection. The drawback is that it needs a PAT (personal access token), which can be valid for no longer than 12 months, so you will have to remember to renew it. The good news is that it results in a constant .npmrc, so Docker can and will use its cache.
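
A minimal sketch of that setup (the service connection name 'MyNpmRegistry', the repository name and the .npmrc path are placeholders, assuming the npmAuthenticate and Docker@2 tasks):

- task: npmAuthenticate@0
  displayName: Write a stable auth token into .npmrc
  inputs:
    workingFile: .npmrc              # the .npmrc that is COPY'd into the image
    customEndpoint: MyNpmRegistry    # npm service connection backed by a PAT

- task: Docker@2
  displayName: Build without passing System.AccessToken, so layers stay cacheable
  inputs:
    command: build
    repository: my-app               # placeholder
    Dockerfile: Dockerfile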

@willemodendaal

willemodendaal commented Jan 31, 2019

Hello from 2019. I guess this is still an issue... pity. Would really like to use VSTS hosted agents, but not being able to cache container images makes the build super slow. 20+ minutes instead of 1 minute.

For reference, this is what my dockerfile looks like. Nothing fancy.

FROM microsoft/aspnet:4.7.2-windowsservercore-ltsc2016
ARG source
WORKDIR /inetpub/wwwroot
COPY . .

@cpumanaz

cpumanaz commented Mar 1, 2019

I am experiencing the same, using the "FROM microsoft/aspnet:4.7.2-windowsservercore-ltsc2016" source image. It was mentioned that not every hash is cached and that there is drift. Is it documented which specific tag(s) are cached on the 'Hosted VS2017' agent, so we can choose those source images?

This is advertised as a feature at the link below, but I do not see it working as advertised.
https://docs.microsoft.com/en-us/azure/devops/pipelines/languages/docker

@cpumanaz

cpumanaz commented Mar 1, 2019

I'm reporting back. The reason many might have issues is using :latest. If you were taught that using the latest tag is bad, then you will not see the performance gains. By adding a docker image list task, I saw which images were pre-loaded.

==============================================================================
Task         : Docker
Description  : Build, tag, push, or run Docker images, or run a Docker command. Task can be used with Docker or Azure Container registry.
Version      : 1.1.27
Author       : Microsoft Corporation
Help         : [More Information](https://go.microsoft.com/fwlink/?linkid=848006)
==============================================================================
[command]"C:\Program Files\Docker\docker.exe" image list
REPOSITORY                    TAG                 IMAGE ID            CREATED             SIZE
microsoft/dotnet-framework    latest              ec599075a73c        7 weeks ago         13.2GB
microsoft/windowsservercore   latest              ea9f7aa13d03        7 weeks ago         11GB
microsoft/aspnet              latest              ddabd3e10c02        3 months ago        13.7GB
microsoft/nanoserver          latest              4c872414bf9d        4 months ago        1.17GB
microsoft/aspnetcore-build    1.0-2.0             5d8be0910d37        6 months ago        3.99GB
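
If you want to verify which images are pre-loaded on your own agent, a simple sketch of such a diagnostic step (a plain script step works just as well as the Docker task used above):

- script: docker image list
  displayName: List images pre-cached on the hosted agent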

@andreujuanc

That's interesting. At least that's a good start.
The problem is that many depend on specific versions, and that might still be an issue :(

@sebnyberg

I keep coming back to this thread to check whether anything is happening on the topic.

As our CI and release pipelines have turned into a sequence of docker and helm commands, the only long running issue is this one. Spending (wasting) almost 90% of build time downloading images and packages due to lack of support for layer caching is disappointing to say the least.

Considering that we no longer have a need for the Azure integration features, we don't depend on using Azure DevOps anymore. Changing all our pipelines to a competitor, however, is also quite a task, so I wish this would see a higher priority.

@damienwebdev

damienwebdev commented May 23, 2019

@bryanmacfarlane @chrispat would this be solvable by microsoft/azure-pipelines-yaml#113 ?

@BenWalters

@chrispat @bryanmacfarlane, do you have any news on this? It's been being discussed for a long time now. Even a timeline would be nice.

@jarednlivingston

@chrispat @bryanmacfarlane, do you have any news on this? It's been being discussed for a long time now. Even a timeline would be nice.

It looks like it might be coming in June 2019:
microsoft/azure-pipelines-yaml#113 (comment)

@oak-tree

Hey!
Docker cache is a must-have feature for DevOps. We are losing so much time rebuilding our image for every single release :(
According to the last comment on microsoft/azure-pipelines-yaml#113 (comment), microsoft/azure-pipelines-yaml#113 is not going to fix this issue. Any idea when Azure DevOps plans to release this feature?

@jfranki

jfranki commented Jun 26, 2019

We have a Dockerfile with a base image of more than 2GB (MATLAB Runtime) stored in an Azure Container Registry. So yeah... that one has to be built locally on every update until this is sorted out.

@BenWalters

This feature request has been open for 18 months now, with the last update over 1 year ago!

@bryanmacfarlane & @chrispat, could we have an update please, as microsoft/azure-pipelines-yaml#113 doesn't look like it'll solve this.

@bryanmacfarlane
Contributor

You are correct that the caching feature won't help you here. It's also not a task issue.

Hosted pools re-image and give you a fresh machine every time.
Private / custom agents on your own VM are the only way to get incremental builds, cached images, etc.

There is a feature which is the best of both worlds.

https://github.com/microsoft/azure-pipelines-agent/blob/master/docs/design/byos.md
That would give you incremental cached containers with the convenience of hosted pools (spun up and down for you).

You should probably create a feature request here as that's a large feature / service addition (not an issue with a task). Get some votes.

https://developercommunity.visualstudio.com/spaces/21/visual-studio-team-services.html?type=idea

@thejoebourneidentity FYI as his team is driving that feature (they own hosted pools).

I'm inclined to close this issue here because it's not a task issue and it just won't get traction here.

@BenWalters

Hi @bryanmacfarlane
Thanks for the update, although I'm a little surprised that this answer wasn't given sooner!
Guess we are at the back of the queue again with this request 👎

@bryanmacfarlane
Contributor

No time has been wasted in the queue because of this issue. It has been actively proposed and already designed per the link above, and the feature has been postponed by the product team.

Closing as this isn't a task issue.

I've forwarded this link to the team ( @thejoebourneidentity ) with the feature and also sent them a mail.

@thejoebourneidentity

Thanks @bryanmacfarlane. Hey folks, Bryan is right. True E2E Docker layer caching won't be achieved in our hosted pools, since we refresh machines between jobs. The postponed 'BYOS' feature, as it is called on the agent repo, is absolutely on our radar and will give you maximum caching flexibility.

In the meantime, if there are base container images you would like to see us include in our hosted pool images, please feel free to request them over on our image gen repo: https://github.com/microsoft/azure-pipelines-image-generation

@robertoandrade

I was able to get this working using the Cache task plus some conditional docker save/load commands: docker save dumps the desired image (and its layers) into a tarball, the Cache task uploads it and restores it on the next run, and docker load puts it back into the local image store. However, the time the Cache task takes, combined with the docker save/load operations, pretty much matched that of downloading the base image/layers from a public registry, so I'm not seeing much benefit in doing it.
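
For reference, a rough sketch of that approach on a Linux agent (the cache key, paths and base image are placeholders, and as noted above it may not end up faster than simply pulling from a registry):

- task: Cache@2
  displayName: Restore/save the docker image tarball
  inputs:
    key: 'docker-base-image | "$(Agent.OS)"'
    path: $(Pipeline.Workspace)/docker-cache
    cacheHitVar: DOCKER_CACHE_RESTORED

- script: docker load -i $(Pipeline.Workspace)/docker-cache/base.tar
  displayName: Load the cached image into the local daemon
  condition: and(succeeded(), eq(variables.DOCKER_CACHE_RESTORED, 'true'))

- script: |
    mkdir -p $(Pipeline.Workspace)/docker-cache
    docker pull mcr.microsoft.com/dotnet/core/sdk:2.2      # placeholder base image
    docker save -o $(Pipeline.Workspace)/docker-cache/base.tar mcr.microsoft.com/dotnet/core/sdk:2.2
  displayName: Pull and save the image so the Cache task can upload it
  condition: and(succeeded(), ne(variables.DOCKER_CACHE_RESTORED, 'true'))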

In an earlier attempt I spent a significant amount of time trying to cache the contents of C:\ProgramData\Docker (where the extracted images are stored), and even to point the docker daemon at another location; that was unsuccessful given the permissions the containers set on the files/folders there.

When running benchmark tests locally in a Windows VM on my Mac, it downloads and extracts files much faster than the Azure Pipelines agents, so I think the key bottleneck is I/O and perhaps networking.

@RobertoPrevato

RobertoPrevato commented Feb 7, 2020

In July 2019 I published files to build Docker images for private Azure DevOps agents. One of them includes Docker, so it supports Docker workloads. If you can run a container on a machine, this is a possible solution to benefit from caching and speed up builds.

https://github.com/RobertoPrevato/AzureDevOps-agents

See: https://github.com/RobertoPrevato/AzureDevOps-agents/tree/master/ubuntu18.04-docker

@letmaik
Member

letmaik commented Apr 4, 2020

@bryanmacfarlane @thejoebourneidentity I think the ask here is not to re-use VMs but to still have a fresh VM each time (that's my use case anyway), with the option of declaring that Docker layers should be cached across runs, similarly to how regular folders can be cached. For that, BYOS wouldn't bring any benefit. What's the technical limitation to getting this going? Is it simply that Docker doesn't easily allow restoring a local cache? Has anyone looked deeply into this already?

@letmaik
Member

letmaik commented Apr 4, 2020

Follow-up: even though it's not as simple as a "checkbox", I was successful in saving/restoring a Docker build cache by using Docker's --cache-from feature. Note that this requires pushing the built images (along with the cache) to a registry, which may not always be available. In another issue, a prototype was done that saves the cache to a folder using the Azure Pipelines cache functionality; however, that approach seems much more involved and also requires installing BuildKit separately.
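
For completeness, a rough sketch of the --cache-from approach on a fresh agent (registry address, repository and tag are placeholders; it assumes BuildKit is available so the inline cache metadata gets pushed along with the image):

- script: |
    docker pull myregistry.azurecr.io/my-app:latest || true   # tolerate the very first run, when nothing is published yet
    DOCKER_BUILDKIT=1 docker build \
      --cache-from myregistry.azurecr.io/my-app:latest \
      --build-arg BUILDKIT_INLINE_CACHE=1 \
      -t myregistry.azurecr.io/my-app:latest .
    docker push myregistry.azurecr.io/my-app:latest           # publish the image plus cache metadata for the next run
  displayName: Build with a registry-backed layer cache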

@adrian-skybaker

The postponed 'BYOS' feature as it is called on the agent repo is absolutely on our radar and will give you maximum caching flexibility.

Relying on this feature seems overkill for a simple, transparent Docker cache, which is offered by other CI/CD platforms. The pipeline caching feature seems a better place for this. Caching source Docker images is conceptually not very different from caching Maven dependencies.

@keesschollaart81
Contributor

keesschollaart81 commented Jul 23, 2020

Pulling :latest and then pointing --cache-from at it, like in the example below, seems to work, or am I missing something? It's true that during the build the agent has to download the image; the argument for doing this would be that the clients downloading this image later only have to download the updated layers, instead of every layer for every release, right?

- task: Docker@2
  inputs:
    containerRegistry: '$(ContainerRegistryName)'
    command: 'login'

- script: "docker pull $(ACR_ADDRESS)/$(REPOSITORY):latest"
  displayName: Pull latest for layer caching
  continueOnError: true # for first build, no cache

- task: Docker@2
  displayName: build
  inputs:
    containerRegistry: '$(ContainerRegistryName)'
    repository: '$(REPOSITORY)'
    command: 'build'
    Dockerfile: './dockerfile'
    buildContext: '$(BUILDCONTEXT)'
    arguments: '--cache-from=$(ACR_ADDRESS)/$(REPOSITORY):latest' 
    tags: |
      $(Build.BuildNumber)
      latest

- task: Docker@2
  displayName: "push"
  inputs:
    command: push
    containerRegistry: "$(ContainerRegistryName)"
    repository: $(REPOSITORY) 
    tags: |
      $(Build.BuildNumber)
      latest
