Feature: CACHE OFF support, take II #42799
Comments
I don't think caching can be skipped if the files are added to an image layer, because of the content-addressable store used for image layers. In order for the next layer to be built on top of it, the layer still has to be stored and checksummed.
Perhaps you're able to provide a minimal example to showcase this scenario (what does such a Dockerfile look like?). Perhaps a combination of existing options could help.
Here is an example:

```dockerfile
# First Stage
FROM node:14-alpine3.10 as node-build
WORKDIR /usr/app
COPY package*.json ./
RUN npm install
# Every layer above this line should be cached
# Ideally this COPY will never cache nor spend any time generating and storing hashes to cache
# This is where a NOCACHE option could help
COPY . ./
RUN npm run build

# Second Stage
# If NOCACHE was issued in the above section, it should not affect this one
# Ideally this next round of 'npm install' should create a new cache layer
FROM node:14-alpine3.10
WORKDIR /usr/app
COPY --from=node-build /usr/app/package*.json ./
RUN npm install --only=production
# The following COPY should ideally never be cached either, as in the previous section.
COPY --from=node-build /usr/app/dist/index.js ./
CMD npm start
```

I am simply hoping to provide Docker with enough information so that it can intelligently cache the layers that need to be cached, and avoid wasted caching overhead when it will never be leveraged in a real-world scenario. As things sit today, the only time that overhead pays off is when someone runs two builds in a row without making any change at all, and I would not optimize around that scenario. This also assumes Docker is already set up to use caching layers again when it encounters a new `FROM`.
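For concreteness, here is a sketch of what the requested directive might look like. The `CACHE OFF` syntax below is hypothetical and does not exist in any Dockerfile dialect today:

```dockerfile
FROM node:14-alpine3.10 as node-build
WORKDIR /usr/app
COPY package*.json ./
RUN npm install      # cached as usual

CACHE OFF            # hypothetical directive: stop checksumming/caching from here on
COPY . ./
RUN npm run build
```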
@brandonmpetty, a feature like this would only be telling the builder not to reuse the previously-built layer on subsequent builds. The layer still needs to be built, stored, and checksummed, because it still needs to be used as part of the resulting image.

That said, I still think this would be a useful feature. Here's the justification. As I understand it, Docker decides whether it can reuse the cache for a given step based on whether any of its inputs have changed: the instruction itself, the parent layer it builds on, and (for `COPY` and `ADD`) the checksums of the files being copied.
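The first point can be illustrated with a minimal sketch (my mental model of a content-addressable store, not Docker's actual code): the digest of a layer's bytes *is* its identity, so it must be computed before any later layer or manifest can reference it, regardless of any cache setting.

```shell
# In a content-addressable store, a layer is addressed by the hash of its
# contents. Even with cache reuse disabled, this digest must be computed
# before the layer can be referenced at all.
layer="contents of the freshly built layer"
digest=$(printf '%s' "$layer" | sha256sum | awk '{print $1}')
echo "sha256:$digest"
```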
A common use case for Docker is to encapsulate CI builds: the Docker build performs a git clone and executes the build. However, the clone instruction itself never changes, so Docker reuses the stale cached layer even when the repository has new commits. To address that, people resort to cache-busting workarounds such as ever-changing build arguments. The addition of a simple directive, e.g. `NOCACHE`, would make those workarounds unnecessary. If there is a reason not to do this, I haven't heard it yet, despite people asking for it for years. I'll reassert Brandon's original request.
@jakerobb Here's what I am using to rebuild in CI when something in git has actually changed: a script that feeds information about the latest commit into the build, and then a matching piece in the Dockerfile. This will only rebuild the following layers if the git repo actually changed.
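The snippets were lost from the quoted comment; a common version of this cache-busting pattern looks something like the following sketch (the script, image name, and repository URL are assumptions, not the commenter's exact setup):

```shell
#!/bin/sh
# Pass the current commit SHA as a build argument so that layers declared
# after the matching ARG are rebuilt only when the repository changes.
GIT_SHA=$(git rev-parse HEAD 2>/dev/null || echo unknown)

# The corresponding piece in the Dockerfile would be:
#   ARG GIT_SHA
#   RUN git clone https://example.com/repo.git app   # invalidated when GIT_SHA changes
echo "docker build --build-arg GIT_SHA=${GIT_SHA} -t myapp ."
```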
Not having basic features like this really makes me question things sometimes, as if designing features that make sense actually causes pain to some people. It's unacceptable that my Dockerfile does not execute its `RUN` statements, even though I changed the list of dependencies to be added, because it's using the cache in production (and no, I can't bust the cache on this specific production server).
@MauriceArikoglu It would help if you posted what problem you are having.
What does this mean? Inputs may include the order of execution, the build context, build arguments, or the base image of the build stage. Do you always want it to make sure the base image is up to date? Make sure to use `--pull` for that. Beyond that, I think something like this would need a concrete design proposal.
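For reference, the cache-control knobs that already exist on `docker build` today (these flags are real; the image tag is a placeholder):

```shell
# --no-cache   do not use the cache for any step of this build
# --pull       always attempt to pull a newer version of the base image
echo "docker build --pull --no-cache -t myapp ."
```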
Production is important, but what about development, when you have to build the same Dockerfile a dozen times per day? How often do you forget to add a newline for cache busting and have to rebuild after that? For me, personally, this feature would save a lot of development hours. So why not implement it?
Unless you have to update your system libraries a dozen times a day, there's no reason to rebuild your dev image that often. I bet you're doing that every time you change one of your project dependencies, or even something in your code base. You should probably use a bind mount for that; then you can restart your container (if you don't have hot reload) instead of rebuilding the image. If you're not sure how to optimize your developer experience when Docker is in the loop, you should probably seek help on our forum or on our Community Slack.
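A sketch of the bind-mount workflow suggested above (the image, paths, and `npm start` command are placeholder assumptions matching the earlier example, not a prescribed setup):

```shell
# Mount the source tree into the container so code changes are visible
# immediately, without rebuilding the image.
echo 'docker run --rm -v "$PWD":/usr/app -w /usr/app node:14-alpine3.10 npm start'
```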
Background
Adding the ability to tell Docker to avoid caching in the Dockerfile: #1996
I am interested in bringing up the "NOCACHE" issue once again.
Why? Because many think it would be a great feature, as my use case below will detail, and because the repo moderators have failed to lock these issues or give a clear final word as to why they were closed in the first place.
Feel free to close this on its merits, but please lock it or #1996 and state a clear reason why.
Why I want the feature
I am interested from a performance perspective.
I am not sure how `COPY` is actually implemented; the docs only hint that, for `ADD` and `COPY` instructions, a checksum is calculated for each file and compared against the checksums in existing images. By adding a `CACHE OFF` option, not only would the checksum analysis not have to be performed, but if Docker is pre-calculating and storing the checksums for the layer, it could avoid that entirely.
Example
If we know that the `COPY` layer is likely to always contain different files, we could avoid caching altogether after that point. This could be a huge savings if there are a lot of files being hashed. Also, since this would be part of a build pipeline, I would assume that my next `FROM` statement would automatically set up caching again.
A typical TypeScript pattern is to do a build in the first stage, requiring dev dependencies to perform the actual build, and then in the second stage `npm install` only the production dependencies and copy over the built content from the earlier stage. The goal is to be able to cache BOTH `npm install` calls, since those layers almost never change given a `package-lock.json`, while completely avoiding needless checksum calculations and storage after those points.
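A rough, self-contained illustration of the overhead being described: hashing a build context takes time proportional to its contents, even when the resulting checksums will never match a cached layer. (The file count and sizes here are arbitrary, chosen only for the demonstration.)

```shell
# Create a throwaway directory of files and checksum all of them, as a
# builder would when deciding whether a COPY step can hit the cache.
dir=$(mktemp -d)
for i in $(seq 1 50); do head -c 4096 /dev/urandom > "$dir/f$i"; done
count=$(find "$dir" -type f -exec sha256sum {} + | wc -l | tr -d ' ')
echo "hashed $count files"
rm -rf "$dir"
```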