Dockerfile COPY with file globs will copy files from subdirectories to the destination directory #15858

Closed
jfchevrette opened this issue Aug 26, 2015 · 41 comments

@jfchevrette jfchevrette commented Aug 26, 2015

Description of problem:
When using COPY in a Dockerfile and using globs to copy files & folders, docker will (sometimes?) also copy files from subfolders to the destination folder.

$ docker version
Client:
 Version:      1.8.1
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   d12ea79
 Built:        Thu Aug 13 19:47:52 UTC 2015
 OS/Arch:      darwin/amd64

Server:
 Version:      1.8.0
 API version:  1.20
 Go version:   go1.4.2
 Git commit:   0d03096
 Built:        Tue Aug 11 17:17:40 UTC 2015
 OS/Arch:      linux/amd64

$ docker info
Containers: 26
Images: 152
Storage Driver: aufs
 Root Dir: /mnt/sda1/var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 204
 Dirperm1 Supported: true
Execution Driver: native-0.2
Logging Driver: json-file
Kernel Version: 4.0.9-boot2docker
Operating System: Boot2Docker 1.8.0 (TCL 6.3); master : 7f12e95 - Tue Aug 11 17:55:16 UTC 2015
CPUs: 4
Total Memory: 3.858 GiB
Name: dev
ID: 7EON:IEHP:Z5QW:KG4Z:PG5J:DV4W:77S4:MJPX:2C5P:Z5UY:O22A:SYNK
Debug mode (server): true
File Descriptors: 42
Goroutines: 95
System Time: 2015-08-26T17:17:34.772268259Z
EventsListeners: 1
Init SHA1:
Init Path: /usr/local/bin/docker
Docker Root Dir: /mnt/sda1/var/lib/docker
Username: jfchevrette
Registry: https://index.docker.io/v1/
Labels:
 provider=vmwarefusion

$ uname -a
Darwin cerberus.local 14.5.0 Darwin Kernel Version 14.5.0: Wed Jul 29 02:26:53 PDT 2015; root:xnu-2782.40.9~1/RELEASE_X86_64 x86_64

Environment details:
Local setup on OS X with boot2docker, built with docker-machine

How to Reproduce:

Context

$ tree
.
├── Dockerfile
└── files
    ├── dir
    │   ├── dirfile1
    │   ├── dirfile2
    │   └── dirfile3
    ├── file1
    ├── file2
    └── file3

Dockerfile

FROM busybox

RUN mkdir /test
COPY files/* /test/

Actual Results

$ docker run -it copybug ls -1 /test/
dirfile1
dirfile2
dirfile3
file1
file2
file3

Expected Results
The resulting image should have the same directory structure as the build context.

Author

@jfchevrette jfchevrette commented Aug 26, 2015

Updated the original message with the output of docker info and uname -a, and reformatted it to follow the issue reporting template.

Contributor

@jrabbit jrabbit commented Sep 1, 2015

I've had this on 1.6.2 and 1.8:
https://gist.github.com/jrabbit/e4f864ca1664ec0dd288
For some reason, second-level directories are treated the way first-level ones should be.

For those googling: if you're having issues with COPY * /src, try COPY / /src.

Contributor

@duglin duglin commented Sep 1, 2015

@jfchevrette I think I know why this is happening.
You have COPY files/* /test/, which expands to COPY files/dir files/file1 files/file2 files/file3 /test/. If you split this up into individual COPY commands (e.g. COPY files/dir /test/) you'll see that (for better or worse) COPY will copy the contents of each arg dir into the destination dir. Not the arg dir itself, but the contents. If you added a 3rd level of dirs I bet those will stick around.

I'm not thrilled with the fact that COPY doesn't preserve the top-level dir, but it's been that way for a while now.

You can try to make this less painful by copying one level higher in the src tree, if possible.
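
The behavior described above can be imitated outside Docker. The following is a minimal sketch that emulates it with cp: it is not Docker's implementation, and the repro-15858-copy scratch directory is a made-up name used only for illustration.

```shell
#!/bin/sh
# Emulation only: imitates the COPY behavior described above with cp.
# This is not Docker's code; repro-15858-copy is a made-up scratch directory.
set -e

rm -rf repro-15858-copy
mkdir -p repro-15858-copy/files/dir repro-15858-copy/test
touch repro-15858-copy/files/file1 repro-15858-copy/files/file2 repro-15858-copy/files/file3
touch repro-15858-copy/files/dir/dirfile1 repro-15858-copy/files/dir/dirfile2 \
      repro-15858-copy/files/dir/dirfile3

# files/* expands to: files/dir files/file1 files/file2 files/file3
for src in repro-15858-copy/files/*; do
    if [ -d "$src" ]; then
        # A directory argument contributes its contents, not the directory itself.
        cp -R "$src"/. repro-15858-copy/test/
    else
        cp "$src" repro-15858-copy/test/
    fi
done

# Lists dirfile1..3 and file1..3 side by side, as in "Actual Results".
ls -1 repro-15858-copy/test
```

Because each directory argument is replaced by its contents, the dir level disappears while the plain files are copied as expected.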

Author

@jfchevrette jfchevrette commented Sep 1, 2015

I'm pretty confident that @duglin is right, and it could be very risky to change that behavior. Many Dockerfiles may break or simply copy unintended stuff.

However, I'd argue that in the long run it would be better if COPY followed the way tools such as cp or rsync handle globs and trailing slashes on folders. It's definitely not expected for COPY to copy files from a subfolder matching dir/* into the destination, IMO.
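
For comparison, here is a small sketch of what cp does with the same glob on the layout from the original report. The cp-glob-demo directory name is an assumption made for the example.

```shell
#!/bin/sh
# Sketch of the cp behavior the comment refers to; cp-glob-demo is a
# hypothetical scratch directory.
set -e

rm -rf cp-glob-demo
mkdir -p cp-glob-demo/files/dir cp-glob-demo/test
touch cp-glob-demo/files/file1 cp-glob-demo/files/file2 cp-glob-demo/files/file3
touch cp-glob-demo/files/dir/dirfile1 cp-glob-demo/files/dir/dirfile2 \
      cp-glob-demo/files/dir/dirfile3

# The shell expands files/* to the same four arguments, but cp -R keeps
# "dir" as a directory under the destination instead of flattening it.
cp -R cp-glob-demo/files/* cp-glob-demo/test/

find cp-glob-demo/test | sort
```

Here dirfile1 through dirfile3 stay under test/dir, which is the layout the issue lists under Expected Results.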

Contributor

@duglin duglin commented Sep 1, 2015

@jfchevrette yep - first chance we get we should "fix" this.
Closing it for now...

@tugberkugurlu tugberkugurlu commented Feb 27, 2016

@duglin so, closing means it will not get fixed?

Contributor

@duglin duglin commented Feb 27, 2016

@tugberkugurlu yup, at least for now. There's work underway to redo the entire build infrastructure and when we do that is when we can make COPY (or its new equivalent) act the way it should.

@tugberkugurlu tugberkugurlu commented Feb 27, 2016

@duglin thanks. Is it possible to keep this issue open and update the status here? Or is there any other issue for this that I can subscribe to?

Contributor

@duglin duglin commented Feb 27, 2016

@tugberkugurlu I thought we had an issue for "client-side builder support" but I can't seem to find it. So all we may have is what the ROADMAP ( https://github.com/docker/docker/blob/master/ROADMAP.md#22-dockerfile-syntax ) says.

As for keeping the issue open, I don't think we can do that. The general rule that Docker has been following is to close any issue that isn't actionable right away. Issues for future work are typically closed and then reopened once the state of things change such that some action (PR) can be taken for the issue.

@deric deric commented Nov 5, 2016

@duglin This is a very serious issue; you shouldn't just close it because the problem was introduced in the 0.1 release. It would be more appropriate to target this for the 2.0 release (milestones are on GitHub too).

I guess most people use:

COPY . /app

and blacklist all other folders in .dockerignore, or have a single-level directory structure and use COPY, which actually has mv semantics:

COPY src /myapp

It's quite hard for me to imagine that someone would actually use COPY for flattening a directory structure. The other workaround for this is using tar -cf .. & ADD tarfile.tar.gz. Changing at least this would be really helpful. The other thing is respecting slashes in directory names, COPY src /src vs COPY src/ /src (which are currently completely ignored).

@tjwebb tjwebb commented Jan 14, 2017

duglin closed this on Sep 1, 2015

@duglin This is a ridiculous and infuriating issue and should not be closed. The COPY command behaves specifically in disagreement with the documented usage and examples.

Member

@thaJeztah thaJeztah commented Jan 14, 2017

@tjwebb there's still an open issue, #29211. This can only be looked into if there's a way to fix this that's fully backward compatible. We're open to suggestions if you have a proposal for how this could be implemented (if you do, feel free to write it up and open a proposal, linking to this issue). Note that there's already a difference between (for example) OS X and Linux in the way cp is handled:

mkdir -p repro-15858 \
  && cd repro-15858 \
  && mkdir -p source/dir1 source/dir2 \
  && touch source/file1 source/dir1/dir1-file1 \
  && mkdir -p target1 target2 target3 target4 target5 target6

cp -r source target1 \
&& cp -r source/ target2 \
&& cp -r source/ target3/ \
&& cp -r source/* target4/ \
&& cp -r source/dir* target5/ \
&& cp -r source/dir*/ target6/ \
&& tree

OS X:

.
├── source
│   ├── dir1
│   │   └── dir1-file1
│   ├── dir2
│   └── file1
├── target1
│   └── source
│       ├── dir1
│       │   └── dir1-file1
│       ├── dir2
│       └── file1
├── target2
│   ├── dir1
│   │   └── dir1-file1
│   ├── dir2
│   └── file1
├── target3
│   ├── dir1
│   │   └── dir1-file1
│   ├── dir2
│   └── file1
├── target4
│   ├── dir1
│   │   └── dir1-file1
│   ├── dir2
│   └── file1
├── target5
│   ├── dir1
│   │   └── dir1-file1
│   └── dir2
└── target6
    └── dir1-file1

20 directories, 12 files

On Ubuntu (/bin/sh)

.
|-- source
|   |-- dir1
|   |   `-- dir1-file1
|   |-- dir2
|   `-- file1
|-- target1
|   `-- source
|       |-- dir1
|       |   `-- dir1-file1
|       |-- dir2
|       `-- file1
|-- target2
|   `-- source
|       |-- dir1
|       |   `-- dir1-file1
|       |-- dir2
|       `-- file1
|-- target3
|   `-- source
|       |-- dir1
|       |   `-- dir1-file1
|       |-- dir2
|       `-- file1
|-- target4
|   |-- dir1
|   |   `-- dir1-file1
|   |-- dir2
|   `-- file1
|-- target5
|   |-- dir1
|   |   `-- dir1-file1
|   `-- dir2
`-- target6
    |-- dir1
    |   `-- dir1-file1
    `-- dir2

24 directories, 12 files
diff --git a/macos.txt b/ubuntu.txt
index 188d2c3..d776f19 100644
--- a/macos.txt
+++ b/ubuntu.txt
@@ -11,15 +11,17 @@
 │       ├── dir2
 │       └── file1
 ├── target2
-│   ├── dir1
-│   │   └── dir1-file1
-│   ├── dir2
-│   └── file1
+│   └── source
+│       ├── dir1
+│       │   └── dir1-file1
+│       ├── dir2
+│       └── file1
 ├── target3
-│   ├── dir1
-│   │   └── dir1-file1
-│   ├── dir2
-│   └── file1
+│   └── source
+│       ├── dir1
+│       │   └── dir1-file1
+│       ├── dir2
+│       └── file1
 ├── target4
 │   ├── dir1
 │   │   └── dir1-file1
@@ -30,6 +32,8 @@
 │   │   └── dir1-file1
 │   └── dir2
 └── target6
-    └── dir1-file1
+    ├── dir1
+    │   └── dir1-file1
+    └── dir2
 
-20 directories, 12 files
+24 directories, 12 files
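
The differences above come from the trailing slash on the source. One spelling that, as far as I know, behaves the same with both GNU and BSD cp is source/. as the source operand. The sketch below assumes a scratch directory named repro-15858-portable, which is not part of the original reproduction.

```shell
#!/bin/sh
# Sketch: copy the contents of a directory without relying on how cp treats
# a trailing slash. repro-15858-portable is a hypothetical scratch directory.
set -e

rm -rf repro-15858-portable
mkdir -p repro-15858-portable/source/dir1 repro-15858-portable/source/dir2
mkdir -p repro-15858-portable/target
touch repro-15858-portable/source/file1 repro-15858-portable/source/dir1/dir1-file1

# "source/." names the contents of source, so no extra "source" level is
# created under target.
cp -R repro-15858-portable/source/. repro-15858-portable/target/

find repro-15858-portable/target | sort
```

This gives the target2 layout from the OS X run (dir1, dir2 and file1 directly under the target) rather than the nested source directory from the Ubuntu run.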

@AshleyAitken AshleyAitken commented Apr 2, 2017

Make a new command CP and get it right this time please.

@nickjbyrne nickjbyrne commented May 11, 2017

I would echo the above; this must have wasted countless development hours. It's extremely unintuitive.

@anorth2 anorth2 commented Jun 8, 2017

+1 from me. This is really stupid behavior and could easily be remedied by just adding a CP command that performs how COPY should have.

"Backwards compatibility" is a cop-out.

@snobu snobu commented Sep 19, 2017

The TL;DR version:

Don't use COPY * /app, it doesn't do what you'd expect it to do.
Use COPY . /app instead to preserve the directory tree.
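
A minimal sketch of that advice, written as a script that lays out a throwaway build context. The copy-dot-demo directory, the file names and the .dockerignore entries are all assumptions for illustration; the docker commands are left as comments because they need a running daemon.

```shell
#!/bin/sh
# Sketch only: copy-dot-demo and its files are hypothetical.
set -e

rm -rf copy-dot-demo
mkdir -p copy-dot-demo/src/lib
touch copy-dot-demo/src/main.js copy-dot-demo/src/lib/util.js

# Keep unwanted paths out of the build context (example entries).
cat > copy-dot-demo/.dockerignore <<'EOF'
.git
node_modules
EOF

cat > copy-dot-demo/Dockerfile <<'EOF'
FROM busybox
# "." is a single directory argument, so its whole tree lands under /app.
COPY . /app
EOF

# With Docker available, the result could be inspected with:
#   docker build -t copy-dot-demo copy-dot-demo
#   docker run --rm copy-dot-demo find /app
```

Since . is one directory rather than a glob expansion, nothing below it is flattened; .dockerignore is then the place to trim what gets sent.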

@adsl99801 adsl99801 commented Feb 26, 2018

COPY is only able to copy its subfolder.

@divmgl divmgl commented Mar 3, 2018

Just spent countless hours on this... Why does this even work this way?

@gaui gaui commented Mar 25, 2018

I'm using Paket and want to copy the following in the right structure:

.
├── .paket/
│   ├── paket.exe
│   ├── paket.bootstrapper.exe
├── paket.dependencies
├── paket.lock
├── projectN/

And by doing COPY *paket* ./ it results in this inside the container:

.
├── paket.dependencies
├── paket.lock

How about adding a --glob or --recursive flag for COPY and ADD ?

@laurencefass laurencefass commented Apr 13, 2018

COPY . /destination preserves sub-folders.

@fnagel fnagel commented Apr 13, 2018

Three years and this is still an issue :-/

@BaibhavVishal123 BaibhavVishal123 commented Apr 18, 2018

Can we get an ETA for when this will be fixed?

@laurencefass laurencefass commented Apr 18, 2018

not an issue...
from above...
COPY . /destination preserves sub-folders.

@snobu snobu commented Apr 18, 2018

True, no longer an issue after you fume for half a day and end up here. Sure :)
Let's be constructive,

We really need a new CP command or a --recursive flag to COPY so backwards compatibility is preserved.

Top points if we also show a warning on image build when we detect possible misuse, like:
Directory structure not preserved with COPY *, use CP or COPY . More here <link>.

@intellix intellix commented Apr 23, 2018

I'm looking for this for copying across nested lerna package.json files in subdirectories to better utilise npm install cache to only trigger when dependencies change. Currently all files changed cause dependencies to install again.

Something like this would be great:

COPY ["package.json", "packages/*/package.json", "/app/"]

@zentby zentby commented May 3, 2018

Go check #29211 guys. This one has been closed and no one cares.

@andradei andradei commented May 3, 2018

@zentby Conversation is here, issue is tracked there (since this one is closed)... It's confusing.

@instabledesign instabledesign commented May 12, 2018

A workaround is to COPY the files and then RUN a cp -R command:

COPY files /tmp/
RUN cp -R /tmp/etc/* /etc/ && rm -rf /tmp/etc

@intellix intellix commented May 12, 2018

That won't work, @instabledesign: the COPY command invalidates the cache whenever any copied file changes, including files that shouldn't invalidate it (for instance, I only want to copy the files relating to npm dependency installation, as those don't often change).

trajano added a commit to trajano/trajano-portfolio that referenced this issue Jun 13, 2018
Fixes the missing assets due to moby/moby#15858

@kayjtea kayjtea commented Jun 27, 2018

I also needed to copy just a set of files (in my case, *.sln and *.csproj files for dotnet core) to preserve the cache. One workaround is to create a tarball of just the files you want and then ADD the tarball in the Dockerfile. Yeah, now you have to have a shell script in addition to the Dockerfile...

build.sh

#!/bin/bash

# unfortunately there's no easy way to copy just the *.sln and *.csproj (see https://github.com/moby/moby/issues/15858)
# so we generate a tar file containing the required files for the layer

find .. -name '*.csproj' -o -name 'Finomial.InternalServicesCore.sln' -o -name 'nuget.config' | sort | tar cf dotnet-restore.tar -T - 2> /dev/null

docker build -t finomial/iscore-build -f Dockerfile ..

Dockerfile

FROM microsoft/aspnetcore-build:2.0
WORKDIR /src

# set up a layer for dotnet restore 

ADD docker/dotnet-restore.tar ./

RUN dotnet restore

# now copy all the source and do the dotnet build
COPY . ./

RUN dotnet publish --no-restore -c Release -o bin Finomial.InternalServicesCore.sln

@sgsunder sgsunder commented Jul 19, 2018

You can use multiple COPY commands to do this, but that has the disadvantage of creating multiple image layers and bloating your final image size.

As kayjtea mentioned above, you can also wrap the docker build command in a helper build script to create tarballs that preserve directory structure, and ADD them in, but that adds complexity and breaks things like docker-compose build and Docker Hub automated builds.

Really, COPY should function just like a POSIX-compliant /bin/cp -r command, but it seems like that won't happen for 'backwards compatibility,' even though the current behavior is completely unintuitive for anyone with experience in *nix systems.

The best compromise I have found is to use a multi-stage build as a hack:

FROM scratch as project_root
# Use COPY to move individual directories
# and WORKDIR to change directory
WORKDIR /
COPY ./file1 .
COPY ./dir1/ ./dir1/
COPY ./dir2/ .
WORKDIR /newDir
COPY ./file2 .

# The actual final build you end up using/pushing
# Node.js app as example
FROM node
WORKDIR /opt/app

COPY package.json .
RUN npm install

COPY --from=project_root / .
CMD ["npm", "start"]

This is self contained within one Dockerfile, and only creates one layer in the final image, just like how an ADD project.tar would work.

sjkaris added a commit to Omnition/synthetic-load-generator that referenced this issue Feb 7, 2019
Docker has some very weird behavior on copying files with the COPY
command in which COPY dir/* /tmp/ will not copy the directory structure
but instead a flat list of all the files under that dir...

for a good time see moby/moby#15858

Testing Done: locally

@ruffsl ruffsl commented Feb 9, 2019

Having a complete COPY command would really help when attempting to preserve the docker build cache. The ROS community develops using nested workspaces of packages, with each one declaring dependencies in its own package.xml file. These files are used by a dependency manager to install any upstream libraries. These package.xml files change relatively infrequently compared to the code in the packages themselves once the groundwork is set. If the directory tree structure was preserved during a copy, we could simply copy our workspace during the docker build in two stages to maximise caching, e.g.:

# copy project dependency metadata
COPY ./**/package.xml /opt/ws/

# install step that fetches unsatisfied dependency
RUN dependency_manager install --workspace /opt/ws/

# copy the rest of the project's code
COPY ./ /opt/ws/

# compile code with cached dependencies
RUN build_tool build --workspace /opt/ws/

Thus the cache for the dependency install layer above would only bust if the developer happened to change a declared dependency, while a change in the package's code would only bust the compilation layer.

Currently, all the matched package.xml files are copied on top of each other to the root of the destination directory, with the last globbed file being the only package.xml that persists in the image. This is really quite unintuitive for users: why are copied files overwritten on top of each other, and why is it undefined which one eventually persists in the image?
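
The overwrite can be imitated locally. This is an emulation with cp rather than Docker itself, and the flatten-demo directory and package names are invented for the example; the sort is only there to make the demo deterministic.

```shell
#!/bin/sh
# Emulation only: every match is copied into one flat destination, so files
# with the same name overwrite each other. flatten-demo is hypothetical.
set -e

rm -rf flatten-demo
mkdir -p flatten-demo/ws/pkg_a flatten-demo/ws/pkg_b flatten-demo/dest
echo "package a" > flatten-demo/ws/pkg_a/package.xml
echo "package b" > flatten-demo/ws/pkg_b/package.xml

# Sorted here so that the demo always ends with pkg_b's file.
find flatten-demo/ws -name package.xml | sort | while read -r f; do
    cp "$f" flatten-demo/dest/
done

# Two manifests went in, a single package.xml remains.
find flatten-demo/dest -type f
cat flatten-demo/dest/package.xml
```

Whichever file is copied last wins, which is why the dependency manager would only ever see one package's metadata.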

@benmccallum benmccallum commented Jun 14, 2019

This is such a pain in basically every stack that has package management, so it affects so many of us. Can it be fixed? Sheesh. It's been an issue since 2015! The suggestion to use a new command of CP is a good one.

@zenozen zenozen commented Jun 21, 2019

Can we reopen this? It's very tedious that the COPY command uses an internal Go function for path matching rather than a widely adopted standard like glob.

@ruffsl ruffsl commented Sep 11, 2019

Those who'd like to copy via globbing using a workaround with experimental BuildKit syntax, even if caching isn't as precise or robust, can take a look at the comments here: #39530 (comment)

I'd still like to see this issue re-opened so we can cache on selective glob style copies.

@ruffsl ruffsl commented Sep 17, 2019

I realized a relatively simple workaround for my example in #15858 (comment) via multi-stage builds, and thought many of you here with similar needs may appreciate arbitrary caching on copied artifacts from the build context. Using multi-stage builds, it's possible to filter/preprocess the directory to cache:

# Add prior stage to cache/copy from
FROM ubuntu AS package_cache

# Copy from build context
WORKDIR /tmp
COPY ./ ./src

# Filter or glob files to cache upon
RUN mkdir ./cache && cd ./src && \
    find ./ -name "package.xml" | \
      xargs cp --parents -t ../cache

# Continue with primary stage
FROM ubuntu

# copy project dependency metadata
COPY --from=package_cache /tmp/cache /opt/ws/

# install step that fetches unsatisfied dependency
RUN dependency_manager install --workspace /opt/ws/

# copy the rest of the project's code
COPY ./ /opt/ws/

# compile code with cached dependencies
RUN build_tool build --workspace /opt/ws/

For a real-world working example, you could also take a look here: ros-planning/navigation2#1122
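
The filter step in the first stage relies on cp --parents, which is a GNU coreutils option. Where it is unavailable (a BusyBox- or BSD-based stage, for instance), the same filtering can be sketched with mkdir -p and a plain cp. The parents-demo directory and package names below are made up so the step can be tried outside a build.

```shell
#!/bin/sh
# Sketch: filter package.xml files while keeping their relative paths,
# without GNU cp --parents. parents-demo is a hypothetical scratch directory.
set -e

rm -rf parents-demo
mkdir -p parents-demo/src/pkg_a/src parents-demo/src/nested/pkg_b parents-demo/cache
touch parents-demo/src/pkg_a/package.xml parents-demo/src/pkg_a/src/main.cpp
touch parents-demo/src/nested/pkg_b/package.xml

# Recreate each match's parent directory under cache, then copy the file.
(
    cd parents-demo/src
    find . -name package.xml | while read -r f; do
        mkdir -p "../cache/$(dirname "$f")"
        cp "$f" "../cache/$f"
    done
)

find parents-demo/cache -type f | sort
```

Only the manifests end up under cache, each at its original relative path, so a later COPY --from of that directory carries the structure along.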

@hannadrehman hannadrehman commented Oct 10, 2019

I'm looking for this for copying across nested lerna package.json files in subdirectories to better utilise npm install cache to only trigger when dependencies change. Currently all files changed cause dependencies to install again.

Something like this would be great:

COPY ["package.json", "packages/*/package.json", "/app/"]

I'm having the exact same use case.

@kirill-konshin kirill-konshin commented Oct 18, 2019

I'm looking for this for copying across nested lerna package.json files in subdirectories to better utilise npm install cache to only trigger when dependencies change. Currently all files changed cause dependencies to install again.

Something like this would be great:

COPY ["package.json", "packages/*/package.json", "/app/"]

This case but for Yarn workspaces.

@anilanar anilanar commented Oct 21, 2019

It's 2020 and this is still not fixed.

@benmccallum benmccallum commented Nov 6, 2019

If anyone is struggling with this in a dotnet setting, I've solved it for us by writing a dotnet core global tool that restores the directory structure for the *.csproj files, allowing a restore to follow. See documentation on how to do it here.

@benmccallum benmccallum commented Nov 6, 2019

FYI, theoretically a similar approach could be used in other settings, but essentially the tool is reverse-engineering the folder structure, so I'm not sure how easy or even possible that would be for say a lerna or yarn workspaces setup. Happy to investigate it if there's interest. Could even be possible in the same tool if folks were happy to install the dotnet core runtime for it to work, else the same approach I've done would need to be built in a language that doesn't require a new dependency, like node I guess.
