Dockerfile ADD relative symlinks support #18789

Open
nazar-pc opened this issue Dec 19, 2015 · 51 comments
Labels
area/builder kind/enhancement Enhancements are not bugs or new features but can improve usability or performance.

Comments

@nazar-pc

I want to raise the question from #1676 once again.

I have a set of images that are intended to work together.
One of the images is Consul; it is used for DNS in the other containers for service discovery.
The script that adds Consul to /etc/resolv.conf is the same in multiple images.

What I'd like to have is relative symlink support, so that I can keep a single copy of the mentioned script and point symlinks to it from the other images.
Since all Dockerfiles are within the same repository and the symlinks are relative, I do not see any reason at all why this should be forbidden.

I'm going to add automatic Ceph client support to my images, and maintaining 2 scripts that are copied into 5+ places becomes really difficult and awkward (more chances to forget to update something, an unnecessarily bigger diff on each commit, etc.).
Most of the images are based on official images like mariadb, nginx, haproxy, etc., so I don't have a common base to put the files in.

TL;DR edition:

# Dockerfile
...
ADD consul-dns.sh /
...
$ LANG=C stat consul-dns.sh 
  File: 'consul-dns.sh' -> '../common/consul-dns.sh'
  Size: 23          Blocks: 8          IO Block: 4096   symbolic link
...

The symlink is relative and stays within the same repository, so there are no problems.

@GordonTheTurtle

Hi!

Please read this important information about creating issues.

If you are reporting a new issue, make sure that we do not have any duplicates already open. You can ensure this by searching the issue list for this repository. If there is a duplicate, please close your issue and add a comment to the existing issue instead.

If you suspect your issue is a bug, please edit your issue description to include the BUG REPORT INFORMATION shown below. If you fail to provide this information within 7 days, we cannot debug your issue and will close it. We will, however, reopen it if you later provide the information.

This is an automated, informational response.

Thank you.

For more information about reporting issues, see https://github.com/docker/docker/blob/master/CONTRIBUTING.md#reporting-other-issues


BUG REPORT INFORMATION

Use the commands below to provide key information from your environment:

docker version:
docker info:
uname -a:

Provide additional environment details (AWS, VirtualBox, physical, etc.):

List the steps to reproduce the issue:
1.
2.
3.

Describe the results you received:

Describe the results you expected:

Provide additional info you think is important:

----------END REPORT ---------

#ENEEDMOREINFO

@nazar-pc nazar-pc changed the title Dockerfile ADD symlinks support Dockerfile ADD relative symlinks support Dec 19, 2015
@phemmer
Contributor

phemmer commented Dec 19, 2015

See #6094 (comment) which has a simple (though not perfect) workaround:

tar -czh . | docker build -

The "not perfect" bit being that all symlinks are dereferenced, not just those pointing to files outside the archive.

Also, note, on some systems this may not work well due to #15785
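For reference, the same trick with a tag (the image name here is illustrative):

tar -czh . | docker build -t myimage -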

@nazar-pc
Author

It will not work for automated builds on Docker Hub.

@stmuraka

+1
I have a similar use case, documented in #19246.
Why can't the file be dereferenced (on the client side) prior to the build on the server? I don't see how this changes the behavior of the build, since all files will be copied and will replace the symlinked file.

@jdmarshall

There's a gap in the feature set between Docker and Docker Compose. What I need to be able to do is ask docker-compose to build me an image that contains files from Directory A and files from Directory B. Currently there is no way to do that, because both projects disavow responsibility.

If my Dockerfile is in Directory A and my build artifacts are in Directory B, I can use volumes to mount them as long as DOCKER_HOST is localhost. Once it slips to another machine (testing, a complex dev environment, or eventually production), there is no equivalent way to deploy, because Docker refuses to allow/follow symlinks.

All we want is to symlink from our 'docker' directory to the build artifacts (e.g., '../dist'). This is not an unreasonable request.

Lots of us are using opinionated frameworks. The build artifacts are organized one way and one way only. We can't just rearrange our project output to make Docker happy. We can't position our Dockerfile so that it only sees those files and itself. So we either choose to leak our entire project into the Docker image (potentially containing files full of proprietary information), or we have to completely rewrite our toolchains' build scripts to generate a different directory structure, one that can have a Dockerfile at the top level.

What's more likely to happen is our teams will vote to use something other than Docker. Or we'll do something brittle and blame Docker for bad outcomes.

@nazar-pc
Author

For Docker Compose, in some cases you can try to use mount --bind, even though it is quite ugly.

@thaJeztah
Member

I do not see any reason at all why this should be forbidden.

@nazar-pc @jdmarshall have a look at this repo; https://github.com/thaJeztah/keyjacker
and consider what would happen if people are tricked into building that (it doesn't do a thing, but with a bit of imagination ...)

I wonder, would the -f option be a solution for your use-case? https://docs.docker.com/engine/reference/commandline/build/#specify-dockerfile-f
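For example, running from the repository root (tag and paths borrowed from this thread's layout):

$ docker build -f mariadb/Dockerfile -t nazarpc/webserver:mariadb .

Note that COPY/ADD paths are then resolved relative to the context root (the repository root), not relative to the Dockerfile.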

@nazar-pc
Author

I think it can be fixed by controlling which user is used to access the build files. The user might be restricted to accessing only the files necessary for the build process (the root of the cloned repository), like a chroot or something.

However, the -f option seems to be a good solution, at least for my use case, but I do not see this separation in the build settings on Docker Hub.

@thaJeztah
Member

I think it can be fixed by controlling which user is used to access build files.

Currently, that'd be the user that's running the build, and you probably have access to your keys

@nazar-pc
Author

I've tried to make a build locally with -f. It makes sense, but the Dockerfile needs to be modified as well, since the context is changed, which is odd; it would be more natural to have everything inside the Dockerfile relative to the Dockerfile's location.

Here is what seems logical to me, but doesn't work:

nazar-pc@nazar-pc /w/g/docker-webserver> ls
common/  mariadb/
nazar-pc@nazar-pc /w/g/docker-webserver> ls common/
consul-dns.sh*
nazar-pc@nazar-pc /w/g/docker-webserver> ls mariadb/
ceph-mount.sh*  Dockerfile  galera.cnf  webserver-entrypoint.sh*
nazar-pc@nazar-pc /w/g/docker-webserver> cd mariadb/
nazar-pc@nazar-pc /w/g/d/mariadb> cat Dockerfile 

...

COPY ceph-mount.sh /
COPY ../common/consul-dns.sh /
COPY galera.cnf /etc/mysql_dist/galera.cfg
COPY webserver-entrypoint.sh /

...

nazar-pc@nazar-pc /w/g/d/mariadb> docker build -f Dockerfile -t nazarpc/webserver:mariadb ..
Sending build context to Docker daemon 1.097 MB
Step 1 : FROM mariadb:10.1
 ---> aa45ab08ad04
Step 2 : MAINTAINER Nazar Mokrynskyi <nazar@mokrynskyi.com>
 ---> Using cache
 ---> 7411fff4bc0f
Step 3 : ...
 ---> Using cache
 ---> f7856f83b9ca
Step 4 : COPY ceph-mount.sh /
lstat ceph-mount.sh: no such file or directory

I know this is likely to cause a BC break, but the current situation is not as good as it should be.
Also, Docker Hub doesn't support changing the context. There is a clear problem here and no generally working solution yet.
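For reference, the failure above happens because COPY paths are resolved relative to the context (the .. passed to docker build), not relative to the Dockerfile. A sketch of the same Dockerfile adjusted for that context:

COPY mariadb/ceph-mount.sh /
COPY common/consul-dns.sh /
COPY mariadb/galera.cnf /etc/mysql_dist/galera.cfg
COPY mariadb/webserver-entrypoint.sh /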

@nazar-pc
Author

The idea is that everything is relative to the Dockerfile (simple & reliable), and the context just provides a limit on how far up the directory tree the Dockerfile can reference (namely ../../some-file).

@thaJeztah
Member

The idea of -f is: "here's a context, and use this Dockerfile to build it". I agree that it's a bit confusing if the Dockerfile is in a different location, although you could put all Dockerfiles at the root of the repository, e.g. Dockerfile.service1, Dockerfile.service2
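Each image is then built from the repository root, along the lines of (the tag is illustrative):

$ docker build -f Dockerfile.service1 -t org/service1 .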

@nazar-pc
Author

I think this is the only really working way at the moment to share files for Docker Hub images, but that would be a lot of files in the root directory :(

@jdmarshall

You guys really need to get your story straight about security and repeatability concerns. People are closing issues about security concerns in the name of 'repeatability' and you guys are downplaying repeatability issues in the name of security. And meanwhile you're still training developers to curl-execute bash files off the internet. I can't begin to explain how frustrating this all is.

I'll experiment with -f, but it doesn't solve all my use cases (example: an Ember or Angular app served by nginx, so the nginx configuration files are stored with the Dockerfile, and the contents of the root are in ./dist or ./build). At any rate I don't see a way to use -f from docker-compose, so the gap is still there.

@thaJeztah
Member

You guys really need to get your story straight about security and repeatability concerns.

Please watch your tone, no need to get offensive here; I'm trying to explain why the current design is the way it is, and I'm looking for a solution that may help. Also, realize we cannot satisfy everybody's needs and use-cases.

People are closing issues about security concerns in the name of 'repeatability'

Can you provide an example?

At any rate I don't see a way to use -f from docker-compose, so the gap is still there.

Try the dockerfile instruction in docker-compose.yaml, which is the equivalent in docker-compose.
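A minimal sketch (the service name and paths here are hypothetical):

# docker-compose.yml
version: "2"
services:
  service1:
    build:
      context: .
      dockerfile: service1/Dockerfile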

@stmuraka

@thaJeztah

@nazar-pc @jdmarshall have a look at this repo; https://github.com/thaJeztah/keyjacker
and consider what would happen if people are tricked into building that (it doesn't do a thing, but with a bit of imagination ...)

Users can always shoot themselves in the foot.

Consider some random website with a script
http://some.random.website/runme.sh:

rm -rf /

Dockerfile

FROM someimage
ADD http://some.random.website/runme.sh /runme.sh
CMD ./runme.sh

Run:

$ docker build -t myimage .
$ docker run -d -v /:/host myimage

It's about enabling capabilities. User permissions/management (i.e. who can build, etc.) is business as usual. Copying a (sym)linked file still requires the user to have permissions on that file. You can't just link any old file and have exclusive rights to it. In the case where a user doesn't have rights to the file, the build should fail. But if a user does have rights to the file, I don't see what the concern is.

Keeping track of and updating the same config file(s) used across many images tends to lead to a lot of misconfigured images. As noted by @jdmarshall, changing one's build process (e.g. writing wrapper scripts) to work around Docker build deficiencies is not acceptable, IMO. Some of the goals of Docker are to enable the standardization and agility of application/service deployment.

If you're concerned that users could pull in files "by accident", you could throw some kind of warning at the end of the build saying - Docker build copied files from outside its build path - use at your own risk.

@jdmarshall

| Please watch your tone, no need to get offensive here

I'm being sincere, tone or otherwise. There are many tickets being filed in these two arenas, with a lot of traffic on them, and it's easy to see that those commenting and those grooming the backlog aren't in accord, either with each other or among themselves. It would be in everybody's best interests if you guys had a come-to-jesus meeting to figure out how to proceed. Is one important? Are both important? When it's a tossup, who wins? (I think most of us would vote "security")

Regarding your call for issue numbers, unfortunately there's no "show me all the issues I've commented on" in GitHub yet. I'll keep looking, but there's #9176 and a bunch of related issues that were closed in favor of it (and then it too was closed). In those cases I was trying to get credentials into a docker image without capturing them.

There's a ticket (#13490) that was opened in May of last year to describe sane ways to get secrets into Docker without leaving them in the final image. No visible progress has been made.

And #332, one read of which is around security (confidential data), which still draws new comments long after being closed.

@vdemeester
Member

Note that at some point the builder should be in a separate project (see #14298), and thus, at some point, it's going to be easier to specify a different builder than the default one; one that would support the feature you want. Some work is being done in that area, and that is the reason why Dockerfile changes are frozen for now.

In my opinion, having a separate way to build an image will solve most of those questions. It's about separation of concerns and giving more choice.

PS: it's already possible to create a custom builder even now (it's not optimal and it's not as pluggable as it might become, but it's possible).

@duglin
Contributor

duglin commented Jan 20, 2016

unfortunately there's no "show me all the issues I've commented on" in github yet

try: is:issue is:open commenter:jdmarshall

@jdmarshall

@vdemeester Cool! I'll keep an eye on that one. Having it separate should make it easier for people to try out different solutions to all these issues and we can argue about working solutions instead of hypothetical ones :) I wonder what people will come up with.

@jdmarshall

I ended up working around this issue by rearranging my project structure, creating some fairly complex .dockerignore files, and putting the Dockerfile in the 'root' of each module. Sooner or later I'm going to make a mistake and include a file in a docker image that I'll regret. I'm certain that the odds of this are quite a bit higher than the odds that a symlink would be used for nefarious purposes.

But it works. For now. Sort of.

@noreabu

noreabu commented Mar 16, 2016

If you have root access and are utilizing directories, this might be a less annoying workaround:

sudo mount --bind project-sources /home/foo/bar/project-sources-branches/branchname

@vdemeester
Member

@arianitu the behavior is like that (not being able to add some stuff from outside the context) because of the architecture of docker (client/server). When doing a docker build, the daemon might not have access to these folders (if it's remote).

You have alternatives to sending 500 MB of folders, though:

  • Using .dockerignore to ignore the files and folders you don't want to send in the context (and thus not sending 500 MB to the daemon); see the sketch after this list.
  • Using an alternative client-side builder (there are not too many around, though 😓).
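A minimal sketch of the first option (the entries here are hypothetical examples of bulky files to keep out of the context):

# .dockerignore
logs/
dist/
*.tar.gz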

@arianitu

@vdemeester my apologies, I moved my comment to the other issue because it seemed more appropriate over there.

@jaloren

jaloren commented Jun 29, 2016

@vdemeester I do not find this argument persuasive. An extremely common use case is to git clone the source from which the docker image will be built, where the git repo will contain all the dependencies locally (by design). In that case, I will have all the files local (because I set it up that way) and all I want to do is include some specific directories or files through symbolic links.

What I do not want to do is: 1) write a shell script to manually copy the files into the build directory and then manually remove them and/or 2) completely rework the directory structure in a way just to please docker build and not because it makes sense for my project.

I don't understand why that's not permitted, and considering that this is one of many, many tickets opened on this issue, I think this is a common sentiment. And the response is always "But The Architecture". And my response is that the architecture is flawed if it doesn't permit such a common build use case for docker. Why can't there be an option on docker build that says "allow the use of relative links during build" or "follow symbolic links"? The default behavior wouldn't change, but those who want this functionality could turn it on.

@hholst80

hholst80 commented Oct 9, 2016

How would docker build know which symlinks to resolve and which to keep? What do we mean by ADD some-symlink .?

@jeremymills

Any updates on this?

@thaJeztah thaJeztah added area/builder kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. labels Jan 11, 2017
@lunemec

lunemec commented Jan 18, 2017

+1 on this functionality. There should at least be an option to allow symlink following inside the context directory.

Or a counter-proposal: instead of .dockerignore, have the exact opposite.
Suppose this workflow:

  1. programming something awesome
  2. program outputs tons of data, say logs, into the dev directory (the same as the docker context)
  3. finish the task, run tests, build the binary
  4. try to build it, test if everything is ok
  5. cancel the build because you were sending 10 GB of data to the docker engine, because you forgot it sends everything inside the context to the engine

Maybe it would be better to have the user specify exactly what to use as context (or multiple contexts)? Because the way it is now (send everything, and block things later if we remember) is a "security" flaw anyway :)

I believe I'm not the only one who did exactly the same thing as described in the workflow :)

And yes, I know this would be a very incompatible change to the way docker build behaves ...
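For what it's worth, a whitelist-style .dockerignore can approximate "specify exactly what to use" today: ignore everything, then re-include what the build needs (the re-included paths here are hypothetical):

# .dockerignore
*
!Dockerfile
!src/
!consul-dns.sh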

@hholst80

tl;dr: my $0.02, feel free to ignore.

@jdmarshall docker already does that: it includes all symlinks as-is. The "bug" above asks for some of the files to be resolved; I think that is a breaking change.

If this feature were to be included in docker, it would have to be a new keyword, and not break the existing functionality of ADD and COPY. It would probably be a bit messy to implement, because you would have to pre-scan the Dockerfile and include those files in the build context. I am sure it can be done, but I am guessing it would be easier to just hack something together for your specific use case than to try to build a generic solution that fits everyone using docker.

@lunemec

lunemec commented Jan 27, 2017

@hholst80 I agree that for compatibility reasons it would be easier to just hack something to make it work, but consider the number of people who would want something like this in this thread alone... (not to mention the questions on Stack Overflow)...

@vitage

vitage commented Feb 7, 2017

+1

@aensidhe

We have several services which we planned to move into docker. We have several common files that are shared between those services. We would like to have this folder structure:

  • services
    • common/various-common-files-here
    • service
      • symlink to common
      • Dockerfile

We do not want to copy-paste the common files; changes to them should be synchronized.
We do not want to use the root folder as the build context because it slows builds down.

@treefitty

Ok, this "need to include shared files" issue has stopped me from using docker-compose for a while, so today I set aside some time to come up with a solution; the logic being that someone out there must've come across this issue and found a solution... I appreciate the moby security concerns, yet there must be a solution.

And there is a solution.
And it works really nicely (I've confirmed this and everything is working nicely).

Full credit goes to Matt for his post on StackOverflow.

He outlines a number of solutions, but I thought the first one would be the cleanest and easiest to implement & maintain... no dramas so far.

So to anyone else facing this who ends up here, try his advice on creating a common base image; it works a treat!

@nazar-pc
Author

The whole point was to have different base images. In my case, multiple images are based on docker official images like mariadb or nginx. Thus I can't use the same base image, yet the underlying OS is the same and the commands work fine.

I've ended up moving all Dockerfiles to the root of the project with custom names. It is bulky and harder to maintain, but at least it works.

@aensidhe

You can use a multi-stage build.

  1. Build an image with the shared files and tag it somehow: shared_files
  2. Copy the files into your image, derived from whatever you want:
FROM shared_files as file_provider
....

FROM nginx
COPY --from=file_provider .... ....
....
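A concrete variant of the sketch above, assuming the shared image ships /consul-dns.sh (multi-stage builds require Docker 17.05+):

FROM shared_files:latest AS file_provider

FROM nginx
COPY --from=file_provider /consul-dns.sh /consul-dns.sh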

@nazar-pc
Author

I think a multi-stage build is overkill when I simply need to copy a few files. I don't need to compile or download anything, just copy files.

@rocketraman

Another use case: a monorepo with two Node.JS projects:

monorepo/project-a
monorepo/project-b

project-a has a dependency on project-b -- in order to make the Node.JS build always use a predictable (dare I say, repeatable?) version of project-b, project-a declares its dependency on project-b via

"project-b": "file:../project-b"

Now, I can't build the Dockerfile for project-a, which contains an npm ci --production, because project-b isn't part of its context. I can't include the entire monorepo in the context for project-a, because there is a ton of unrelated stuff in the monorepo, and sending gigabytes of context to Docker for project-a just to get project-b is dumb and ridiculously slow.

I can't include node_modules from outside Docker in the context and just copy it in via the Dockerfile rather than doing npm ci --production, because node_modules contains symlinks also, and this sucks anyway.

I don't want to make the dependency in project-a a library dependency because then my node build is not repeatable for a given branch of the code, which may require the specific version of that library that is checked out in the monorepo.

The tar trick works, but a) the command line needs to exclude the stuff in .dockerignore manually, and b) this makes the whole thing non-cross-platform and complicates docker builds in CI systems.
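(For the record, point a) means spelling the excludes out by hand, along these lines, with the exclude patterns and tag being illustrative:

tar -czh --exclude='./node_modules' --exclude='./.git' . | docker build -t project-a -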

The build is already non-repeatable because I run npm install inside the Dockerfile and who the hell knows what stuff is going to get pulled in when that happens (and I am aware of that non-repeatability and accept it, and work around it by tagging docker builds appropriately so I can go back to a particular build without having to rebuild the Dockerfile). Therefore, Docker's misguided attempt to save me from myself is just annoying.

@thaJeztah
Member

@rocketraman you may be interested in #37129

@rocketraman

@thaJeztah Yes, I think that would solve my problem.

@jdmarshall

@rocketraman That's exactly the situation that drew me into this issue. If you want build artifacts included in two separate docker images, you have to do some copying of files to make that happen (which means they can get out of sync).

@aensidhe

aensidhe commented Nov 7, 2018

@jdmarshall you can use the tianon/true image as a shared base for that, as described in my comment above.

@thaJeztah
Member

you can use the tianon/true image as a shared base for that, as described in my comment above.

FROM scratch should work as well if you just want to build an image / stage with some files in it
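e.g. a minimal sketch, borrowing the script name from earlier in this thread:

FROM scratch
COPY consul-dns.sh /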

@jdmarshall

@aensidhe I think you're grossly misunderstanding what I mean by 'build artifacts'.

Output from a build process, when shared between multiple projects, will likely not want the same file structure inside each docker image. Having a base image works for binaries. For config files, and most especially for the module systems of many programming languages you would want to containerize, it accomplishes nothing.

@jdmarshall

@thaJeztah

I think the addition of

RUN --mount=type=ssh git clone git@github.com:myorg/myproject.git myproject

basically kills off the security argument against symlinks. If someone can borrow my credentials and talk to authenticated services, there are a lot worse things they can do to me than stealing some files and putting them into a docker image. It's literally a different embodiment of the same attack you used in your example.

I still think the Docker teams lack consensus and it really needs to be sorted out, for everybody's sake.

Better advice is: Don't run a program you got off the internet. Don't run a shell script you got off the internet. Don't run a Dockerfile you got off the internet. Don't run a docker-compose.yml you got off the internet.

@thaJeztah
Member

basically kills off the security argument against symlinks. If someone can borrow my credentials and talk to authenticated services, there are a lot worse things they can do to me than stealing some files and putting them into a docker image. It's literally a different embodiment of the same attack you used in your example.

The --mount=type=ssh option requires you to set the --ssh flag when running the build; the Dockerfile by itself cannot obtain access to your credentials unless you explicitly allow it to (see docker/cli#1419)
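i.e. something along these lines (BuildKit required):

$ DOCKER_BUILDKIT=1 docker build --ssh default .

Without the --ssh flag, the RUN --mount=type=ssh step has no SSH agent socket to use.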

@haizaar

haizaar commented Dec 7, 2018

One caveat with the tar trick is that .dockerignore has no effect anymore. Luckily, tar supports --exclude-vcs-ignores, so you can convert your .dockerignore to a relevant VCS-ignore file (probably a distinct one from the VCS you actually use).
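A sketch of that combination, assuming GNU tar 1.28+ and a VCS-style ignore file in place (the tag is illustrative):

tar -czh --exclude-vcs-ignores . | docker build -t myimage -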

@ghost

ghost commented Feb 13, 2019

+1 for this feature (or even just allowing ADD to get files above the Dockerfile directory).

I'm not sure why Docker thinks it is their job to protect users from potential wrongdoers by limiting functionality in the name of security. I can take care of myself, and I much prefer a full feature set to security. I would be really surprised if that weren't the majority opinion of Docker users. If Docker really wants to help provide secure images, then certify safe images on Docker Hub, and/or allow flagging of unsafe images.

@kkm000

kkm000 commented Dec 9, 2019

@thaJeztah wrote:

I do not see any reason at all why this should be forbidden.

@nazar-pc @jdmarshall have a look at this repo; https://github.com/thaJeztah/keyjacker
and consider what would happen if people are tricked into building that (it doesn't do a thing, but with a bit of imagination ...)

I respectfully disagree with your example. The gimmethekey symlinks in your example are part of the build context. My understanding of the issue is not that people want to wrap symlinks into the build context and send them to the server as symlinks. What everyone is asking for is: please resolve the symlinks before adding the files to the build context. Exactly the thing that tar -czh . would do, and the part of the common workaround that just about everyone with a repository of more than a couple of Dockerfiles has to implement.

So in the example, the gimmethekey link would be resolved on the client, and the client keys would be packaged into the context, if the invoking user has access to those files. Well, if that's what they want (I'd rather not, but...), let them. This can be stupidity, but it is not a security risk. Besides, if the linked file is inaccessible to the user invoking the build, the trick will fail, because packaging the build context is performed on the client under the user's own real uid (Docker does sandbox the packaging, doesn't it?). So the example that you provided would just fail with an access permission violation.

I understand that you want to make docker build as foolproof as possible, but I promise that we'll always come up with a better fool! IMO, changing the design goal to be as foolproof as feasible would make Docker much better. Pretty please!

"If people are tricked into running the file" is not a good argument either. The internets are full of sites tricking people into downloading "runme.exe" files (with varying degrees of success, but often successfully). We can patch security holes in software, but we cannot patch security holes in users' brains. Slashing features to achieve that entirely unreachable goal is fruitless, in my view.

@wshayes

wshayes commented Apr 12, 2020

I've put together some truly awful, brittle hacks to get around this limitation. I've got several microservices that all depend on some common code.

  1. I tried to keep my Dockerfiles in the directory above the individual microservices - but then the .dockerignore file doesn't work.

  2. I currently run a watchexec process to sync my common code into each microservice so that the code/config is available to each one - a really ugly solution, brittle as expected, and a LOT of effort for someone new to the project to get set up and comfortable with.

Allowing symlinks for docker builds would make this really easy to manage, and only things that were explicitly included in the docker build root dir by symlink or hardlink would be included in the docker build process.
