.dockerignore not respected when copying from builder in multi stage builds #33923

Open
dovreshef opened this issue Jul 3, 2017 · 17 comments

@dovreshef

I use multi-stage builds to manage image size and to keep data out of the final image.
The first stage downloads some private git repositories using an injected SSH key, and the second copies the repos into the final image without the .git data.
Unfortunately, .dockerignore does not appear to be respected by COPY --from=builder commands, and as a result the final image still has the .git folder with all the data inside.

@tonistiigi
Member

Yes, .dockerignore filters the local files before they are sent to the daemon. It does not apply to individual COPY commands between stages. You can either remove the unnecessary files in the source stage or copy individual paths (multiple paths and wildcards are allowed) instead of the root directory.
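The second option can be sketched as follows. This is a minimal illustration, not the maintainers' recommended layout: the alpine:3.19 base image, the file names, and the /app destination are all hypothetical stand-ins for a real clone step.

```dockerfile
FROM alpine:3.19 AS builder
WORKDIR /source
# Stand-in for the real clone step: a small tree that includes a .git directory.
RUN mkdir -p .git src docs \
    && touch .git/config src/main.py docs/index.md README.md

FROM alpine:3.19
# List only the paths that should reach the final image.
# .git is never named, so it stays behind in the builder stage.
COPY --from=builder /source/src /app/src
COPY --from=builder /source/docs /app/docs
# A wildcard source needs a destination that ends with a slash.
COPY --from=builder /source/*.md /app/
```

The trade-off is that every new top-level path has to be added to the list by hand.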

@dovreshef
Author

@tonistiigi Thanks for the answer.

Since multi-stage builds are often used for exactly this purpose (cloning a private git repository, modifying it, and copying it to the next stage), shouldn't COPY support .dockerignore between stages as well?

I understand it's a matter of opinion, but it seems cleaner to me for COPY between stages to support the same filtering.

@tonistiigi
Member

I agree that this is a common use case, but I don't think .dockerignore is the solution. It is coincidental that in your case the filter you want to apply between stages is the same as the one applying to your local source files. We should figure out how to apply these filters in the Dockerfile itself in a cleaner way than the workaround I posted.

@huarmenta

Is there a solution to this yet?
Running ls -la in the Builder stage shows the files as expected, that is, without the files listed in .dockerignore. I ran the same check after the COPY --from=Builder --chown=app:app /app /app command and got the same result, but inside a docker-compose run web sh session all the files are there. How is this possible?

@thaJeztah
Member

Possibly #32507 (RUN --mount) could help partially. In this case, though, assuming you're not interested in the .git directory after the "builder" stage has completed, something like this would work:

FROM foo AS builder
RUN <your git clone etc.> /source
RUN rm -r /source/.git

FROM bar AS finalstage
COPY --from=builder /source /somewhere

@thaJeztah
Member

Related: #37333

@asbjornu

> It is coincidental that in your case the filter you want to apply between stages is the same as the one applying to your local source files.

I don't believe that's coincidental at all, @tonistiigi. I was very surprised to discover that COPY did not respect the rules defined in .dockerignore.

Especially given the botched syntax and semantics of the COPY statement (detailed in #15771), not being able to easily exclude a boatload of folders and files makes this a rather massive headache to work around.

The current "solution" involves either hundreds of COPY statements (if you want to copy hundreds of sub-directories into an exact replica of the directory hierarchy inside the container) or several RUN rm -rf ... statements, both of which feel like hacks around missing core functionality in Docker.

> We should figure out how to apply these filters in Dockerfile itself in a cleaner way than the workaround I posted.

I would be fine with something like this:

COPY --use-dockerignore <source> <target>

Or this:

RM --use-dockerignore ./

Having to repeat every entry from the .dockerignore, as you would with alternatives such as an --exclude argument for COPY, just adds to the headache and is no real solution to me.

I understand that using .dockerignore by default is out of the question for backwards-compatibility reasons, but not making it possible to use at all just seems unintuitive and weird.

@hetii

hetii commented Mar 4, 2021

Hello guys, it's 2021 and this issue is still not resolved :/

Below is my workaround, which honors multiple .dockerignore files at any path in the project between stages.

This needs to be added at the end of each stage where you have unneeded artifacts:
RUN bash -O nullglob -O dotglob -O globstar -O extglob -c 'for path in $(cat **/.dockerignore);rm -rf ${path}; done'

Please note that it can delete host files if you mount project files using the volume mechanism!

Also, the syntax used by .dockerignore and the syntax used by extended globs are not compatible in some respects, e.g.:
in .dockerignore: !foo; with extglob: !(foo)

You also need bash in your image, at least version 4.

@thaJeztah
Member

> Please note that it can delete host files if you mount project files using the volume mechanism!

I guess you mean when using buildah or the Red Hat fork of Docker? Docker itself doesn't support volumes/bind-mounts for that reason. The RUN --mount ... feature in BuildKit defaults to read-only, and in r/w mode changes are discarded after the RUN completes, so it won't be able to remove files from your host.

@jacklrs

jacklrs commented Jul 2, 2021

@hetii thx for this RUN command to clean up before the next stage. I got a syntax error because of a missing do before rm -rf. This worked:
RUN bash -O nullglob -O dotglob -O globstar -O extglob -c 'for path in $(cat **/.dockerignore); do rm -rf ${path}; done'

@NicholasYamamoto

Hate to be that guy but do you happen to know of the equivalent command in sh @hetii? So far this is the only solution I've found for this issue that still remains open (seriously?), and it would be awesome to be able to use this in an Alpine-based image without having to install bash!
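One possible shape for a plain sh variant is sketched below. It is a rough approximation and not an equivalent of the bash command: it reads only a single .dockerignore, skips "!" exceptions, and relies on ordinary sh globbing, which has no recursive ** (so **/.git would only match one directory level down) and does not expand patterns that contain spaces correctly. The alpine:3.19 image, the file names, and the /tmp/dockerignore path are hypothetical.

```dockerfile
FROM alpine:3.19 AS builder
WORKDIR /source

# Stand-in for the real clone/build step (hypothetical files).
RUN mkdir -p .git src && touch .git/config .env src/main.py

# Assumption: .dockerignore does not exclude itself, so it can be copied.
COPY .dockerignore /tmp/dockerignore

# Skip comments, "!" exceptions and empty lines, strip a leading "/",
# then let sh expand each remaining pattern relative to /source.
# $pat is left unquoted on purpose so that sh expands wildcards in it.
RUN grep -v -e '^#' -e '^!' -e '^$' /tmp/dockerignore \
    | while IFS= read -r pat; do \
        pat=${pat#/}; \
        rm -rf -- $pat; \
      done

FROM alpine:3.19
COPY --from=builder /source /app
```

Stripping the leading "/" matters: without it, a pattern such as /build would be passed to rm as an absolute path inside the image rather than a path under /source.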

@notusertelken

Is running a script to delete files according to the dockerignore still the only way to do this?

@HolyNoodle

HolyNoodle commented May 1, 2023

Adding to the case.

Same as the others: I would like COPY to support .dockerignore between stages of a multi-stage build so that I can optimize the image size.

I have a multi-stage build on a monorepo with X applications and libraries in TS. To build a production image, I need my dev dependencies installed in order to transpile.

Once that's done, I want to leave those dependencies behind and reinstall only the production ones.

Without the requested feature, we need to add X rm commands to clean the src folder. That could be a tedious one-time task, but it is not: every time you add or remove an app/lib, you have to figure out the new rm commands.

Support for .dockerignore is more flexible than any other solution, imho.

A side note: NPM can prune dependencies for prod. In my case I use pnpm, whose pruning doesn't work in monorepos...

@emilyjerger

Still a problem for me too!

@make-github-pseudonymous-again

@JulienBacquart

JulienBacquart commented Apr 19, 2024

If you can download the repository as a .tar.gz archive, a possible solution is that the tar command accepts an --exclude-from= flag, which can take a .dockerignore file.

As an example:

FROM debian:bookworm-slim as builder

WORKDIR /tmp

# Download release from GitHub
ADD https://github.com/user/repository/archive/refs/tags/myrelease.tar.gz .

# Extract from tar archive excluding files according to the .dockerignore file
RUN --mount=type=bind,source=.dockerignore,target=.dockerignore \
    tar -xf myrelease.tar.gz --exclude-from=.dockerignore

FROM python:${PYTHON_VERSION}-slim as base

WORKDIR /app

# Copy the source code into the container.
COPY --from=builder /tmp/ .

Make sure that your .dockerignore file doesn't exclude itself, otherwise you will not be able to mount it.

...
**/.classpath
# We need the dockerignore file when we extract from the tar.gz archive 
# **/.dockerignore
**/.env
**/.git
**/.gitignore
...

@thaJeztah
Member

For those looking for exclusion patterns when copying between stages, a pull request was merged in the Dockerfile syntax that introduces an --exclude option on COPY and ADD.

documentation:

It's currently in the "labs" variant of the Dockerfile syntax and requires a # syntax directive to use (but will eventually make its way into the stable syntax). More info can be found in this blog-post;
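A minimal sketch of what this might look like for the original .git case. The 1.7-labs tag, the alpine:3.19 image, and the paths are assumptions for illustration; in particular, the assumption that the .git pattern is matched relative to the copied source directory should be checked against the documentation for the syntax version in use.

```dockerfile
# syntax=docker/dockerfile:1.7-labs
FROM alpine:3.19 AS builder
WORKDIR /source
# Stand-in for the real clone step (hypothetical files).
RUN mkdir -p .git src && touch .git/config src/main.py

FROM alpine:3.19
# Copy the whole tree from the builder stage, leaving .git behind.
# Assumption: the pattern is evaluated relative to /source.
COPY --from=builder --exclude=.git /source /app
```

The # syntax line has to be the first line of the Dockerfile to take effect, and the build needs BuildKit.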
