dockerfile: Split the ADD command #3050

amluto · 2013-12-04T21:29:38Z

Currently the ADD command is IMO far too magical. It can add local and remote files. It will sometimes untar a file and it will sometimes not untar a file. If a file is a tarball that you want to copy, you accidentally untar it. If the file is a tarball in some unrecognized compressed format that you want to untar, you accidentally copy it.

There's also essentially no security for downloads.

Can we have unambiguous options, please. For example:

COPY-IN <local file or directory> <target in image>

UNTAR <local tarball> <target directory in image>

DOWNLOAD NOVERIFY <URL> <target>

DOWNLOAD SHA256 <sha256 hash> <URL> <target>

etc.

The text was updated successfully, but these errors were encountered:

vieux · 2013-12-04T21:38:49Z

👍 👍 👍 👍 👍

I agree with you Currently the ADD command is IMO far too magical

Here are my comments:

We should keep options in one keyword
Dockerfile syntax should be easy to learn, no need to have so many commands
Right now with your example there is no way to untar a remote tarball, I would suggest something like:
- COPY/INJECT -> copy local/remote files/directory/archive in the image
- UNTAR -> copy local/remote archive in the image and untar them

tianon · 2013-12-04T22:03:09Z

Massive +1s, and I love @vieux's suggestion of just COPY and UNTAR commands to remove some of the magic here. This is a major roadblock for a couple of my projects right now.

As a side note, the hash verification stuff was already covered in #2579, and I think that'd be the appropriate place to discuss that further, especially since splitting ADD alone is a big enough topic for one issue. :)

I think the "remote download" items could also do with Last-Modified and/or ETag HEAD lookups for when we eventually get "ADD caching", but that's not really the discussion here either. 👍

amluto · 2013-12-04T22:29:59Z

I can work with COPY and UNTAR.

scalp42 · 2013-12-08T04:23:30Z

explicit > implicit +1

SvenDowideit · 2013-12-11T07:05:09Z

why UNTAR rather than RUN tar zxvf?

ie, INSERT a_tarfile then (do i need to set a WORKDIR?) RUN tar zxvf a_tarfile

first up, INSERT is thus the same as docker insert and when i need to run something...

shykes · 2013-12-11T07:22:12Z

One difference is that "run tar" depends on the version of tar available in the image. It's better to guarantee the same version everywhere (ideally docker would embed its own implementation) , and insome scenarios the images will not have tar at all.

On Tue, Dec 10, 2013 at 11:05 PM, Sven Dowideit notifications@github.com
wrote:

why UNTAR rather than RUN tar zxvf?
ie, INSERT a_tarfile then (do i need to set a WORKDIR?) RUN tar zxvf a_tarfile

first up, INSERT is thus the same as docker insert and when i need to run something...

Reply to this email directly or view it on GitHub:
#3050 (comment)

SvenDowideit · 2013-12-11T09:54:44Z

understood

tianon · 2013-12-11T14:44:59Z

Also because that would double the size of our base images, not to mention where does the original tar come from when we start with nothing?

I think it's very important that we keep the ability to create a layer entirely from a tar (for our base images at least), I would just be extremely grateful for a way to do so with a remote tar too, and if we could disentangle ADD while we're at it, I see that as a win, personally.

tonyarkles · 2014-03-06T22:27:18Z

Another vote for disentangling it. I've just run into a problem, and even my hacky workaround wasn't enough to get around the ADD magic.

I've got a tarball in my context (local) that needs to get copied into the image compressed (it gets used later on directly, not decompressed). I started out trying to ADD it, but discovered that it got untarred. As a workaround, I renamed it to .foo instead of .tar.gz and it still got decompressed!

Not sure what my next attempt is going to be, but it'd be awesome if there were a straightforward way to just say "copy this file as-is".

Edit: A final workaround that actually worked:

Before hand:
openssl enc -aes-256-cbc -k password -in myfilename.tar.gz -out myfilename.encrypted

In Dockerfile:
ADD ./myfilename.encrypted /root/
RUN openssl enc -d -aes-256-cbc -k password -in /root/myfilename.encrypted -out /root/myfilename.tar.gz

pda · 2014-03-17T17:52:41Z

Dockerfile / docker build should implement minimal functionality, and instead delegate to commands which are available on the CLI as docker <cmd> ….

I think ADD is the biggest offender on this front. Some symptoms:

ADD in a Dockerfile is confusing and doesn't have its own documentation page.
docker add doesn't exist; no CLI mechanism to recursively add a local directory to an image.

I imagine ADD is tricky because it depends on the build context (tarred $PWD sent to the docker daemon). But I'm sure we can work around that. If running with a build context, use that; if not, tar and send the specified directory to be added to the image.

Maybe this warrants an issue, not a comment, but it seems very relevant here.

pnasrat · 2014-04-19T15:17:46Z

We have a bunch of bugs on ADD at the moment. What's the best way to pitch a strawman implementation?

thedeeno · 2014-04-29T20:55:04Z

+1 this magical ADD also has surprising results when you WANT an archive in the image. For instance, when adding a locally built gem I need to tar it first so the expansion doesn't break my RUN gem install xxx command.

snarlysodboxer · 2014-04-30T22:51:56Z

+1 for no magical untar. - this is not intuitional IMO.

This has caused hours of headache and troubleshooting.
Example:

ADD ./database_to_import.tar /tmp/database_to_import.tar
RUN pg_restore --clean --format tar -d database_name /tmp/database_to_import.tar

fails with confusing errors that are hard to troubleshoot because /tmp/database_to_import.tar becomes a directory inside the container without warning (except for one sentence deep in the ADD documentation.)

For now the only way I can find to get a tar file into an image without it auto-untarring is to make an extra directory and put the tar file in there like this:

ADD ./database_dir/ /tmp/
RUN pg_restore --clean --format tar -d database_name /tmp/database_to_import.tar

But that quickly becomes cumbersome.

unclejack · 2014-07-08T15:35:07Z

There's a new COPY instruction which can be used to copy local files to the image. This COPY instruction doesn't extract, it only copies.

Splitting up the ADD instruction still requires some discussion. Decompressing and extracting a tarball should still be possible after splitting up ADD.

EdBoraas · 2014-08-07T14:53:52Z

So, now that we have COPY, is there any hope at all of having ADD unpack remote archives as it does for local archvies? The consistency would be a huge plus, and I have some base image cases that would benefit from this. (Alternatively, if ADD's behavior should just be left alone, how about an UNPACK command that always unpacks; i.e. the flip-side of the COPY case?)

tiborvass · 2014-08-07T16:12:26Z

@EdBoraas We thought about that, and although we definitely would like to have UNPACK or EXTRACT commands, we're also thinking we probably should not have such commands automatically download remote files. Which brings to DOWNLOAD and the problem on how to combine those now 3 commands: optional commands (DOWNLOAD vs UNPACK DOWNLOAD, and COPY vs UNPACK COPY), or using pipes, etc. The goal is to deprecate ADD without changing its current quirky behavior that many people rely on.

Bilge · 2014-10-08T19:00:18Z

👍 I require the ability to use an archive as the rootfs for my image stored remotely. Currently there is no way to do this with a Dockerfile. ADD will only extract local images which limits my options.

Whether ADD should support extracting remote archives (which introduces a BC break) or a new command is added to extract remote archives (which preserves BC and could be named more intuitively) is of little concern to me since at this moment in time the feature to extract remote archives into the file system is completely missing but I would make good use of it if it were available.

However, I'd like to cast a vote in favour of UNPACK or EXTRACT.

dreamcat4 · 2015-03-01T10:08:06Z

Need the following:

To have ADD or UNPACK or EXTRACT work for at least a minimum of compressed .tar.gz and tgz compressed format. But also as many others as is possible.
Have ADD or UNPACK or EXTRACT be able to fetch and download and unpack in 1 step. From remote http:// curl/wget URL - whether it is ssl verified or not.

duglin · 2015-03-01T13:20:40Z

Once we have #10775 merged we can look to add options to ADD to allow for finer control over what happens. I would prefer to do that than to add a ton of new top level Dockerfile commands that almost do the same thing as each other. We can even then look to merge COPY and ADD, even having just those two causes confusion.

thaJeztah · 2015-03-01T14:58:41Z

Once we have #10775 merged we can look to add options to ADD to allow for finer control over what happens

👍

dreamcat4 · 2015-03-04T11:04:23Z

We also need a reliable caching of ADD'ed tarballs. Docker build is supposed to set mtime from last-modified HTTP header. But isn't working for us, neither from github releases URLS (redirected to s3 with a session key). Nor from local Apache webserver.

Where #7748 is as another possible solution to our caching problem. It downloads them every time, which sucks.

jessfraz · 2015-07-10T18:56:05Z

Hello!
We are no longer accepting patches to the Dockerfile syntax as you can read about here: https://github.com/docker/docker/blob/master/ROADMAP.md#22-dockerfile-syntax

Mainly:

Allowing the Builder to be implemented as a separate utility consuming the Engine's API will open the door for many possibilities, such as offering alternate syntaxes or DSL for existing languages without cluttering the Engine's codebase

Then from there, patches/features like this can be re-thought. Hope you can understand.

I spent a good 4 hours trying to understand why the images would build correctly locally but when built in jenkins/kaniko they would miss the `dist` folders. Turns out ADD is too magical, and in some occasions it would not untar. See: moby/moby#3050

This was referenced Mar 17, 2014

Decompression behavor of ADD should have an opt-out #4693

Closed

docker expands a .tar.gz archive when it isn't expected to do so DUMPING files into root directory / #4391

Closed

creack added the /project/helpwanted label Mar 17, 2014

tianon mentioned this issue Apr 16, 2014

automatically expanding files (like gems) that are not tar archives #5241

Closed

proppy mentioned this issue Apr 19, 2014

add -add flag to docker run #3578

Closed

unclejack added the /project/point-o label May 12, 2014

unclejack self-assigned this May 12, 2014

vieux removed the /project/point-o label May 16, 2014

tiborvass mentioned this issue May 27, 2014

"ADD <tarball-url> /dir" copies the file instead of untarring it #2474

Closed

unclejack mentioned this issue May 28, 2014

add support for COPY to docker build #6089

Merged

shin- added the Distribution label Jul 1, 2014

bsdlp mentioned this issue Jul 8, 2014

ADD file.tgz no longer decompresses #6881

Closed

unclejack mentioned this issue Jul 9, 2014

Replace ADD with COPY in the docs (where suitable) #6912

Closed

amluto unassigned unclejack Jul 24, 2014

dmp42 removed the Distribution label Aug 27, 2014

jessfraz added kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. white-belt labels Feb 26, 2015

dreamcat4 mentioned this issue Mar 4, 2015

Docker does not cache: ADD https://github.com/.../s6-overlay-0.1.0-linux-amd64.tar.gz just-containers/s6-overlay#4

Closed

spf13 added exp/beginner and removed white-belt labels Mar 21, 2015

jessfraz closed this as completed Jul 10, 2015

aslicerh mentioned this issue Aug 10, 2017

Refactor tests and add ASP.NET Core test redhat-developer/s2i-dotnetcore#101

Merged

This was referenced Mar 24, 2021

[DOCS/CLI] Removes ADD in favor of COPY backstage/backstage#5087

Closed

[DOCS/CLI, DOCKER] Fixes untar not happening under certain conditions backstage/backstage#5088

Merged

Onyad mentioned this issue May 4, 2023

Khoruzhii docker mipt-backend-academy-winter-2023/team-5#4

Closed

thaJeztah added the area/builder label Oct 13, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

dockerfile: Split the ADD command #3050

dockerfile: Split the ADD command #3050

amluto commented Dec 4, 2013

vieux commented Dec 4, 2013

tianon commented Dec 4, 2013

amluto commented Dec 4, 2013

scalp42 commented Dec 8, 2013

SvenDowideit commented Dec 11, 2013

shykes commented Dec 11, 2013

first up, `INSERT` is thus the same as `docker insert` and when i need to run something...

SvenDowideit commented Dec 11, 2013

tianon commented Dec 11, 2013

tonyarkles commented Mar 6, 2014

pda commented Mar 17, 2014

pnasrat commented Apr 19, 2014

thedeeno commented Apr 29, 2014

snarlysodboxer commented Apr 30, 2014

unclejack commented Jul 8, 2014

EdBoraas commented Aug 7, 2014

tiborvass commented Aug 7, 2014

Bilge commented Oct 8, 2014

dreamcat4 commented Mar 1, 2015

duglin commented Mar 1, 2015

thaJeztah commented Mar 1, 2015

dreamcat4 commented Mar 4, 2015

jessfraz commented Jul 10, 2015

dockerfile: Split the ADD command #3050

dockerfile: Split the ADD command #3050

Comments

amluto commented Dec 4, 2013

vieux commented Dec 4, 2013

tianon commented Dec 4, 2013

amluto commented Dec 4, 2013

scalp42 commented Dec 8, 2013

SvenDowideit commented Dec 11, 2013

shykes commented Dec 11, 2013

first up, INSERT is thus the same as docker insert and when i need to run something...

SvenDowideit commented Dec 11, 2013

tianon commented Dec 11, 2013

tonyarkles commented Mar 6, 2014

pda commented Mar 17, 2014

pnasrat commented Apr 19, 2014

thedeeno commented Apr 29, 2014

snarlysodboxer commented Apr 30, 2014

unclejack commented Jul 8, 2014

EdBoraas commented Aug 7, 2014

tiborvass commented Aug 7, 2014

Bilge commented Oct 8, 2014

dreamcat4 commented Mar 1, 2015

duglin commented Mar 1, 2015

thaJeztah commented Mar 1, 2015

dreamcat4 commented Mar 4, 2015

jessfraz commented Jul 10, 2015

first up, `INSERT` is thus the same as `docker insert` and when i need to run something...