[Feature request] Docker build should use cache unless asked not to #6439
Comments
Could you at least cache base images from public repositories as a first step? It's such a waste of time downloading images like …
+1 for this one. Does anyone know if it's the same on TFS?
On Hosted VS2017 agents, some container images are apparently already cached: …
Is there information about container images cached on Hosted Linux Preview agents?
The host is not the same for each build; it is a fresh host each time, for security and consistency reasons. We do pre-cache some of those images on the host machines. We can look at caching others, but that cache will always drift from what is currently published, simply because of how often the pool is updated. So it is possible that you build a Docker container with a Node base image and get a different sha256 base when you build in CI from what you get when you build locally.
Note that we are going to release a new Ubuntu hosted pool (now released) and we're investigating caching a series of more focused containers (node, dotnet core, etc.). We may keep a couple of tags of each. But as @chrisrpatterson noted, your cache hits will be hit or miss for a number of reasons. We're experimenting now. This issue should move to the vsts-image-generation repo, since that's where the hosted VM generation and the new Docker image work will happen.
This is quite an issue, especially when using CI/CD intensively and with base images such as the ASP.NET SDK or the like (unfortunately even the dotnet core 2.1 SDK image is very large). @chrisrpatterson: the cache will clearly drift; that's in its own nature. The point is that as long as a new image has not been published, the one I am using will stay cached, as the SHA will be the same (note that people should pin a specific version, not just latest!).
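As a minimal sketch of that pinning advice (not from this thread; the image name is only an example), a script step can resolve the immutable digest behind a mutable tag so it can be pinned in the Dockerfile's FROM line:

- script: |
    # Resolve the immutable digest behind a mutable tag.
    docker pull node:10-alpine
    docker inspect --format='{{index .RepoDigests 0}}' node:10-alpine
    # Prints e.g. node@sha256:<digest>; that digest can be pinned in FROM.
  displayName: Resolve base image digest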
I have noticed that the docker build process always deletes intermediate containers. Why is that? It is not the default Docker behaviour. We have some use cases in which we would like to split a multi-stage build into multiple tasks, but right now there is no way to carry the cache from one stage to the next. Notice that this does not happen when running docker build from a bash task.
@andreacassioli I'm not an expert on Docker, but the default option in docker build is to remove intermediate containers. Do you want to keep the containers after docker build? See the --rm option: https://docs.docker.com/engine/reference/commandline/build/#options
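For completeness, a minimal sketch of opting out of that default from a script step (image name hypothetical); --rm defaults to true, so passing --rm=false keeps the intermediate containers:

- script: |
    # Keep intermediate containers after a successful build
    # (docker build removes them by default).
    docker build --rm=false -t myimage:dev .
  displayName: Build without removing intermediate containers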
I'd love this!
Any progress on this issue? I have been having the same issue on VSTS. Every time I trigger a new build it downloads the images (in my case the .NET SDK), which are huge. Is there any way to cache them, similar to what is available in Bitbucket Cloud?
Any update on this?
I have a customer interested in this feature. The idea of being able to leverage the cache on a hosted build agent is compelling - quicker builds, without managing infrastructure - and could help get their builds down from tens of minutes to ~1-2 minutes, which on their "always deploying" CI/CD plans would be a massive win.
Have the same issue: downloading the MSSQL image every time takes ~20 min.
And now, with builds limited to one parallel job, a simple app takes minutes to build in Docker CI because it has to download everything every time... Pleeeeaaaseeeee! I don't want to wait hours for a build to complete and release. We already pay for extra parallel jobs, but this is getting a bit out of hand now.
I managed to create a self-hosted agent from https://github.com/Microsoft/azure-pipelines-image-generation using Packer. Although the documentation fails to explain how to create a VM from the generated VHD and template file, I got one up and running, installed the agent software and configured it. However, this agent still fails to cache the costly 'npm install' step in my Dockerfile.

UPDATE: The build log shows '2018-12-15T12:35:24.1151714Z Step 2/6 : COPY .npmrc .npmrc', and the second ls statement is re-executed! FYI: I use the access token in my .npmrc file to access our private npm registry. I don't know whether System.AccessToken is different on every run (it echoes as ***), or whether its value simply cannot be cached by Docker. Specifying any other value (e.g. --build-arg ACCESS_TOKEN=AT) makes the caching work as expected. Anyone any idea?
UPDATE 3: You cannot specify an environment variable at the start of a command line. You have to use the dedicated environment section for that (either in the designer or in YAML); otherwise the environment variable will simply not be set (empty string value). As $(System.AccessToken) seems to be different on every run, you should not use it in your Docker build if you want to benefit from caching. However, I still somehow needed to authenticate against my private npm registry.
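A minimal sketch of both points (NPM_TOKEN, the variable name, and the image tag are hypothetical): the secret is mapped through the step's dedicated env section rather than inlined on the command line, and a stable value is used, because anything that changes per run, such as $(System.AccessToken), invalidates every layer that depends on it:

- script: |
    # NPM_TOKEN comes from the 'env' section below; inlining it at the
    # start of the command line would leave it unset (empty string).
    docker build --build-arg NPM_TOKEN="$NPM_TOKEN" -t myapp:ci .
  displayName: Build with npm registry auth
  env:
    NPM_TOKEN: $(npm-registry-token)  # a stable secret variable, not System.AccessToken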
Hello from 2019. I guess this is still an issue... pity. I would really like to use VSTS hosted agents, but not being able to cache container images makes the build super slow: 20+ minutes instead of 1. For reference, my Dockerfile is nothing fancy.
I am experiencing the same, using the FROM microsoft/aspnet:4.7.2-windowsservercore-ltsc2016 source image. It was mentioned that not every hash is cached and there is drift. Is it documented which specific tag(s) are cached on the 'Hosted VS2017' agent, so we can choose those source images? This is advertised as a feature at the link below, but I do not see it working as advertised.
I'm reporting back. The reason that many might have issues is using …
That's interesting. At least that's a good start.
I keep coming back to this thread to check whether anything is happening on this topic. As our CI and release pipelines have turned into a sequence of docker and helm commands, the only long-running issue is this one. Spending (wasting) almost 90% of build time downloading images and packages, due to the lack of support for layer caching, is disappointing to say the least. Since we no longer need the Azure integration features, we don't depend on Azure DevOps anymore. Changing all our pipelines over to a competitor is, however, also quite a task, so I wish this would get higher priority.
@bryanmacfarlane @chrispat would this be solvable by microsoft/azure-pipelines-yaml#113?
@chrispat @bryanmacfarlane, do you have any news on this? It has been under discussion for a long time now. Even a timeline would be nice.
It looks like it might be coming in June 2019: …
Hey!
We have a Dockerfile with a base image of more than 2 GB (MATLAB Runtime) stored in an Azure Container Registry. So yeah... that one has to be built locally on every update until this is sorted out.
This feature request has been open for 18 months now, with the last update over a year ago! @bryanmacfarlane & @chrispat, could we have an update please, as microsoft/azure-pipelines-yaml#113 doesn't look like it'll solve this.
You are correct that the caching feature won't help you here. It's also not a task issue: hosted pools re-image and give you a fresh machine every time. There is a feature which is the best of both worlds: https://github.com/microsoft/azure-pipelines-agent/blob/master/docs/design/byos.md You should probably create a feature request there, as that's a large feature / service addition (not an issue with a task), and get some votes: https://developercommunity.visualstudio.com/spaces/21/visual-studio-team-services.html?type=idea FYI @thejoebourneidentity, as his team is driving that feature (they own hosted pools). I'm inclined to close this issue here because it's not a task issue and it just won't get traction here.
Hi @bryanmacfarlane …
Time hasn't been wasted in the queue because of this issue: it has been actively proposed, already designed per the link above, and the feature has been postponed by product. Closing, as this isn't a task issue. I've forwarded this link to the team that owns the feature (@thejoebourneidentity) and also sent them a mail.
Thanks @bryanmacfarlane. Hey folks, Bryan is right: true end-to-end Docker layer caching won't be achievable in our hosted pools, since we refresh machines between jobs. The postponed 'BYOS' feature, as it is called in the agent repo, is absolutely on our radar and will give you maximum caching flexibility. In the meantime, if there are base container images you would like to see us include in our hosted pool images, please feel free to request them over on our image generation repo: https://github.com/microsoft/azure-pipelines-image-generation
I was able to get this working using the Cache task and some conditional manual steps. In an earlier attempt I spent a significant amount of time trying to cache the contents of C:\ProgramData\Docker (where the deconstructed images are stored), and even tried changing the Docker daemon to point to another location; that was unsuccessful, given the permissions the containers set on the files/folders there. When running benchmark tests locally in a Windows VM on my Mac, it downloads and extracts files much faster than the Azure Pipelines agents, so I think the key bottleneck is I/O and perhaps networking.
In July 2019 I published files to build Docker images for private Azure DevOps agents. One of them includes Docker, so it supports Docker workloads. If you can run a container on a machine, this is a possible solution to benefit from caching and speed up builds. https://github.com/RobertoPrevato/AzureDevOps-agents See: https://github.com/RobertoPrevato/AzureDevOps-agents/tree/master/ubuntu18.04-docker
@bryanmacfarlane @thejoebourneidentity I think the ask here is not to re-use VMs, but to keep a fresh VM each time (that's my use case anyway) with the option of declaring that Docker layers should be cached across runs, similar to how regular folders can be cached. For that, BYOS wouldn't bring any benefit. What's the technical limitation to getting this going? Is it simply that Docker doesn't easily allow restoring a local cache? Has anyone looked deeply into this already?
Follow-up: even though it is not as simple as a "checkbox", I was successful in saving/restoring a Docker build cache by using Docker's …
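The exact mechanism is cut off above, but one plausible shape for this, assuming the Cache@2 task together with docker save and docker load (the image name and cache key below are hypothetical):

variables:
  CACHE_TAR: $(Pipeline.Workspace)/docker-cache/image.tar

steps:
- task: Cache@2
  displayName: Cache docker image tarball
  inputs:
    key: 'docker | "$(Agent.OS)" | Dockerfile'
    path: $(Pipeline.Workspace)/docker-cache

- script: |
    # Restore the previously saved image into the fresh daemon, if the cache hit.
    if [ -f "$(CACHE_TAR)" ]; then docker load -i "$(CACHE_TAR)"; fi
  displayName: docker load from cache

- script: |
    # Build using the restored image as a cache source, then save it for the next run.
    docker build --cache-from myapp:latest -t myapp:latest .
    mkdir -p $(Pipeline.Workspace)/docker-cache
    docker save -o "$(CACHE_TAR)" myapp:latest
  displayName: Build and docker save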
Relying on this feature seems overkill for a simple, transparent Docker cache, which is offered by other CI/CD platforms. The pipeline caching feature seems a better place for this: caching source Docker images is conceptually not very different from caching Maven dependencies.
Pulling the :latest image and then using --cache-from works for me:

- task: Docker@2
  inputs:
    containerRegistry: '$(ContainerRegistryName)'
    command: 'login'
- script: "docker pull $(ACR_ADDRESS)/$(REPOSITORY):latest"
  displayName: Pull latest for layer caching
  continueOnError: true # for first build, no cache
- task: Docker@2
  displayName: build
  inputs:
    containerRegistry: '$(ContainerRegistryName)'
    repository: '$(REPOSITORY)'
    command: 'build'
    Dockerfile: './dockerfile'
    buildContext: '$(BUILDCONTEXT)'
    arguments: '--cache-from=$(ACR_ADDRESS)/$(REPOSITORY):latest'
    tags: |
      $(Build.BuildNumber)
      latest
- task: Docker@2
  displayName: "push"
  inputs:
    command: push
    containerRegistry: "$(ContainerRegistryName)"
    repository: $(REPOSITORY)
    tags: |
      $(Build.BuildNumber)
      latest
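One caveat to add to the snippet above (my own assumption, not something the poster verified): with a multi-stage Dockerfile, --cache-from pointed only at the final image generally cannot reuse the earlier stages' layers, because those layers never make it into the pushed image. A common workaround is to build, tag, and push each stage explicitly with --target; the stage name 'build' and the ':build-stage' tag below are hypothetical:

- script: |
    # Seed the cache for the intermediate stage (tolerate a miss on the first run).
    docker pull $(ACR_ADDRESS)/$(REPOSITORY):build-stage || true
    # Build just the named stage so its layers can be tagged and pushed too.
    docker build --target build \
      --cache-from $(ACR_ADDRESS)/$(REPOSITORY):build-stage \
      -t $(ACR_ADDRESS)/$(REPOSITORY):build-stage .
    docker push $(ACR_ADDRESS)/$(REPOSITORY):build-stage
  displayName: Cache the intermediate stage of a multi-stage build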
Environment
Server - VSTS or TFS on-premises? VSTS
Agent - Hosted or Private? Linux Hosted Preview
Issue Description
Locally, if you manually build Docker containers over and over again with small changes, the Docker daemon uses its layer cache to speed up docker build.
On VSTS, every time a build is started, the docker build goes through the same steps it has gone through previously: the base images are fetched and expanded, apt-get installs all the dependencies, files are copied over, etc. This adds a lot of time to each build; in fact, the majority of build time for our CI/CD is wasted on this.
Expectation:
If the host being used is the same for each build, it should make use of the cache from previous builds to speed up this step, unless specifically asked not to.
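For illustration (my example, not from the original report; the image tag is hypothetical), the expected behaviour is easy to see by building twice against the same daemon; with the classic builder, the second build reports "Using cache" for unchanged steps:

- script: |
    docker build -t demo:ci .  # first build: every step executes
    docker build -t demo:ci .  # same daemon, unchanged inputs: steps print "Using cache"
  displayName: Demonstrate layer caching on a persistent daemon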