docker build should support privileged operations #1916

Closed
darklajid opened this issue Sep 18, 2013 · 288 comments
Labels
area/builder kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny

Comments

@darklajid

Currently there seems to be no way to run privileged operations outside of docker run -privileged.

That means that I cannot do the same things in a Dockerfile. My recent issue: I'd like to run fuse (for encfs) inside of a container. Installing fuse is already a mess with hacks and ugly workarounds (see [1] and [2]), because mknod fails/isn't supported without a privileged build step.

The only workaround right now is to do the installation manually, using run -privileged, and creating a new 'fuse base image'. Which means that I cannot describe the whole container, from an official base image to finish, in a single Dockerfile.
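
The manual workaround described above can be sketched as a small shell function. This is only an illustration under stated assumptions: the function name build_privileged_base, the temporary container name, and the image names in the usage note are hypothetical; the flag is spelled --privileged as in current Docker releases (this thread uses the older -privileged spelling); and the DOCKER variable is an addition that allows a dry run by substituting echo.

```shell
#!/bin/sh
# Sketch of the "run privileged, then commit" workaround.
# DOCKER can be overridden (e.g. DOCKER=echo) to print the commands
# instead of running them.
DOCKER="${DOCKER:-docker}"

# build_privileged_base BASE_IMAGE NEW_IMAGE COMMAND
# Runs COMMAND in a privileged container started from BASE_IMAGE,
# saves the resulting filesystem as NEW_IMAGE, and removes the
# temporary container. If run or commit fails, the container is left
# in place for inspection.
build_privileged_base() {
    base_image=$1
    new_image=$2
    setup_command=$3
    container="privileged-build-$$"   # hypothetical temporary container name

    "$DOCKER" run --privileged --name "$container" "$base_image" sh -c "$setup_command" || return 1
    "$DOCKER" commit "$container" "$new_image" || return 1
    "$DOCKER" rm "$container"
}
```

A call such as build_privileged_base ubuntu fuse-base 'apt-get update && apt-get install -y fuse' would then produce an image that a Dockerfile can start from with FROM fuse-base. The privileged step still lives outside the Dockerfile, which is exactly the limitation this issue is about.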

I'd therefore suggest adding either

  • a docker build -privileged
    this should do the same thing as run -privileged, i.e. removing all caps limitations

or

  • a RUNP command in the Dockerfile
    this should .. well .. RUN, but with _P_rivileges

I tried looking at the source, but I'm useless with go and couldn't find a decent entrypoint to attach a proof of concept, unfortunately. :(

1: https://github.com/rogaha/docker-desktop/blob/master/Dockerfile#L40
2: #514 (comment)

@vieux
Contributor

vieux commented Sep 18, 2013

If we go for this, I'm more in favor of the RUNP option, instead of having
all container running in -privileged mode.

Victor VIEUX
http://vvieux.com

@jpetazzo
Contributor

Actually, we might have to do both — i.e., RUNP + require a "-privileged"
flag.

If we rely only on RUNP (without requiring "-privileged"), then we would
have to wonder when we do a build "is this build safe?".
If we rely only on "-privileged", we miss the information (in the
Dockerfile) that "this action requires extended privileges".

I think a combination of both is the safest way.

@darklajid
Author

Sounds reasonable. For me this feature (being able to create device nodes) makes or breaks the ability to create the deployment in Docker. If I can help, mostly with testing, I'd gladly invest some time. I tried looking at the source but have failed so far: it seems the commands available in a build file are found via reflection, so I added a runp command that sets config.privileged to true, but so far I've been unable to build and test it, so I'm stuck.

@jpetazzo
Contributor

I'd suggest RUNP + build -privileged.

lights up some smoke signals to catch attention of @shykes, @crosbymichael

... And then we'll have to find someone to implement it, of course ☺
Would that be something you'd want to try (with appropriate guidance and feedback from the core team, of course) ?

@darklajid
Author

If the last part was targeted at me: Sure, why not. I'm already messing with the go code (not a language I'm familiar with, but see above: I'm trying to figure out what's going on anyway).

With a couple of pointers / someone to ping for some questions I'd certainly give it a try.

@shykes
Contributor

shykes commented Sep 20, 2013

I'm not sold on RUNP or build -privileged.

Generally I don't like anything that introduces different possible builds of the same input. That's why you can't pass arguments or env variables to a build.

Specifically I don't like introducing dependencies on "privileged" all over the place, because it designates a set of capabilities that is a) very large and b) not clearly spec-ed or defined. That's ok as a coarse mechanism for sysadmins to bypass security in an all-or-nothing way - an "escape hatch" when the standard docker execution environment is not enough. It's similar in that way to bind-mounts and custom lxc-conf.


@darklajid
Author

Well, do you agree that it should be possible to build a docker image that - for example - runs fuse?
For that we'd need to mknod.

The way I see it, there's no way these builds could be different depending on parameters: The build will work (caps are not / less restricted than now) or fail (status quo). There's little to no risk of different 'versions' of the same build file, right?

@lukewpatterson

I'm running into this issue now. To build the image I need, I have to perform a series of run -privileged steps + a commit step, rather than building a Dockerfile. Ideally, it would be nice to express the image build steps in a Dockerfile.

@jpetazzo
Contributor

Is it also related to mknod operations?
If you could describe exactly the actions that require privileged mode in
your case, it would be very helpful!
Thanks,

@lukewpatterson

Hey @jpetazzo, from the mailing list, here is the issue I'm facing: https://groups.google.com/forum/#!topic/docker-user/1pFhqlfbqQI

I'm trying to mount a filesystem I created (to work around aufs and something about journaling) inside the container. The specific command I'm running is mount -o loop=/dev/loop0 /db/disk-image /home/db2inst1, where /db/disk-image was created with dd if=/dev/zero of=disk-image count=409600 and /home/db2inst1 is where I'm trying to start db2 from.

@jpetazzo
Contributor

If I understand correctly, during the installation process, you need a non-AUFS directory — or rather, something that supports O_DIRECT. If that's the case, Docker 0.7 should solve the problem, since it will use ext4 (and block-level snapshots) instead of AUFS.

@orikremer

+1 for this as well.

Installing packages that require changes to memory settings and kernel configuration (e.g. Vertica DB, WebSphere MQ) can only be done in privileged mode.

@unclejack
Contributor

Let's try to separate concerns when it comes to running / building with "privileged": it can be required just during the build, just during execution via docker run or both.

It should be possible to allow a build to do something requiring a bit more permissions for a step (or more) if that's necessary. I needed this for a project and had to convert half of a Dockerfile to a shell script which invoked the build and continued to build things in privileged mode, so having a "privileged" build would be useful.

However, we shouldn't go all the way down to privileged mode by default just so we can use sysctl to change some settings. This should be done via image configuration or via command line args to be passed to docker run.

@jpetazzo
Contributor

jpetazzo commented Oct 3, 2013

Right. @orikremer, do you have details on the parameters that Vertica DB and WebSphere MQ were trying to change?

If it's stuff in /sys or /proc, the best solution might be to put some mock up there instead, rather than switching to privileged mode (since the changes won't be persisted anyway).

In the long run, a mock filesystem might capture the changes and convert them to Dockerfile directives, instructing the runtime that "hey, this container needs such or such tweak".

@orikremer

@jpetazzo It's been a couple of days since I created the image. AFAIR Vertica was complaining that it doesn't have enough memory and both were trying to change max open files.
I will try to recreate the image using a Dockerfile and report back.

@ewindisch
Contributor

Noting issue #2080 as it is related.

@orikremer

@jpetazzo started recreating the image without -privileged. Two issues so far:

  • nice in limits.conf: Vertica adds "dbadmin - nice 0" to /etc/security/limits.conf. When trying to switch to that user when running in a non-privileged container I get a "could not open session" error. In a privileged container switch user works with no errors.
  • max open files: since the max needed in the container was higher than the one set in host I had to change /etc/init/docker.conf on the host and set "limit nofile" and then ulimit -n in the container. Is that the correct approach ?
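
The open-files point can be checked from inside the container before running an installer. A minimal sketch, assuming a shell whose ulimit builtin accepts -n, -S and -H (bash and dash do); the function name can_raise_nofile and the value 65536 in the usage note are hypothetical:

```shell
#!/bin/sh
# can_raise_nofile TARGET
# Succeeds if the soft open-files limit can be raised to TARGET without
# extra privileges, i.e. if TARGET does not exceed the hard limit this
# shell inherited from the process that started it.
can_raise_nofile() {
    target=$1
    hard=$(ulimit -Hn)   # prints a number or the word "unlimited"
    [ "$hard" = "unlimited" ] || [ "$target" -le "$hard" ]
}
```

For example, can_raise_nofile 65536 && ulimit -S -n 65536 raises only the soft limit and does nothing when the hard limit is too low. Raising the hard limit itself requires the sys_resource capability, which is why the change had to be made on the host side in the setup described above.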

@jpetazzo
Contributor

jpetazzo commented Oct 6, 2013

When trying to switch to that user,

How does the switch happen? I don't understand how -privileged would help with user-switching; I'm probably missing something here :-)

max open files

If I understand correctly, the Vertica installer tries to set the max number of open files to a very high number, and that only works if Docker was started with such a high number or with the -privileged flag; right?

@orikremer

switching user - su dbadmin fails with that error.
I was able to reproduce by:

  • pull a new image (centos-6.4-x86_64) and run non privileged
  • useradd testuser
  • edit /etc/security/limits.conf, add "testuser - nice 0"
  • try su testuser --> should fail with "could not open session"

In a -privileged container su testuser works fine.

max open files - correct. The installer tries to set it to a number higher than the host has. This only works by increasing the host's setting or by starting with -privileged.

@jpetazzo
Contributor

jpetazzo commented Oct 7, 2013

I just tried with the following Dockerfile:

FROM ubuntu
RUN useradd testuser
RUN echo testuser - nice 0 > /etc/security/limits.conf
CMD su testuser

And it works fine. What's the exact name of the image you're using?
(I tried centos-6.4-x86_64 but looks like I can't pull it!)

@mikewaters

@lukewpatterson Can you share how you got the loop filesystem working inside your container?

@orikremer

@jpetazzo Running this docker file

FROM backjlack/centos-6.4-x86_64
RUN useradd testuser
RUN echo 'testuser - nice 0' >> /etc/security/limits.conf
RUN su testuser
RUN echo 'test' > ~/test.txt

failed with:

ori@ubuntu:~/su_test$ sudo docker build .
Uploading context 10240 bytes
Step 1 : FROM backjlack/centos-6.4-x86_64
 ---> b1343935b9e5
Step 2 : RUN useradd testuser
 ---> Running in b41d9aa2be1b
 ---> 2ff05b54e806
Step 3 : RUN echo 'testuser - nice 0' >> /etc/security/limits.conf
 ---> Running in e83291fafc66
 ---> 03b85baf140a
Step 4 : RUN su testuser
 ---> Running in c289f6e5f3f4
could not open session
Error build: The command [/bin/sh -c su testuser] returned a non-zero code: 1
The command [/bin/sh -c su testuser] returned a non-zero code: 1
ori@ubuntu:~/su_test$

@jpetazzo
Contributor

jpetazzo commented Oct 7, 2013

I turned on debugging for the PAM module (by adding debug to the pam_limits.so line in /etc/pam.d/system-auth), installed syslog, tried to su again, and here's what I found in /var/log/secure:

Oct 7 14:12:23 8be1e7bc5590 su: pam_limits(su:session): reading settings from '/etc/security/limits.conf'
Oct 7 14:12:23 8be1e7bc5590 su: pam_limits(su:session): process_limit: processing - nice 0 for USER
Oct 7 14:12:23 8be1e7bc5590 su: pam_limits(su:session): reading settings from '/etc/security/limits.d/90-nproc.conf'
Oct 7 14:12:23 8be1e7bc5590 su: pam_limits(su:session): process_limit: processing soft nproc 1024 for DEFAULT
Oct 7 14:12:23 8be1e7bc5590 su: pam_limits(su:session): Could not set limit for 'nice': Operation not permitted

Then I straced the su process, and found out that the following system call was failing:

setrlimit(RLIMIT_NICE, {rlim_cur=20, rlim_max=20}) = -1 EPERM (Operation not permitted)

This, in turn, causes the pam_limits module to report a failure; and this prevents su from continuing.
Interestingly, on Ubuntu, pam_limits is not enabled for su by default; and even if you enable it, the setrlimit call fails, but su continues and works anyway.
It might be related to the audit code, I'm not sure.

Now, why is setrlimit failing? Because the container is missing the sys_resource capability, which is required to raise any kind of limit.
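
Whether a container has that capability can also be checked without strace. A minimal sketch, assuming a Linux /proc filesystem and a shell that accepts hexadecimal constants in arithmetic expansion; the function names are hypothetical, and the bit number 24 is the value of CAP_SYS_RESOURCE in the kernel's linux/capability.h:

```shell
#!/bin/sh
CAP_SYS_RESOURCE=24   # bit number from linux/capability.h

# has_cap MASK BIT
# Succeeds if bit BIT is set in the hexadecimal capability mask MASK.
has_cap() {
    [ $(( (0x$1 >> $2) & 1 )) -eq 1 ]
}

# current_cap_mask
# Prints the effective capability mask (the CapEff field) of this shell,
# read from /proc/self/status.
current_cap_mask() {
    while read -r key value; do
        if [ "$key" = "CapEff:" ]; then
            echo "$value"
            return 0
        fi
    done < /proc/self/status
    return 1
}
```

Running has_cap "$(current_cap_mask)" "$CAP_SYS_RESOURCE" inside the container should fail in the non-privileged case described here and succeed in a -privileged container.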

I think I would just comment out that limits.conf directive.
(By the way, it's bad practice to add stuff directly to limits.conf; it should go to a separate file in limits.d, I think.)

Note: since you already increased the limit for number of open files for Docker, you could also raise the limit for the max priority; that should work as well!

I hope this helps.

@tianon
Member

tianon commented Oct 7, 2013

In that Dockerfile, you've got the following line by itself:

RUN su testuser

There's no command to go with this (and it won't apply any resulting shell to subsequent RUN commands), so I wouldn't be surprised if it's really failing while trying to open a shell in a place where doing so makes no sense, since docker build is not an interactive process. I don't have time right now to confirm, but it's probably worth a try with an actual command being passed to su.

@orikremer

@jpetazzo Thanks for the detailed description. I will try raising the max priority for Docker and see if that helps.

(By the way, it's bad practice to add stuff directly to limits.conf; it should go to a separate file in limits.d, I think.)

Agreed, but as this is the Vertica installer code that's putting it there, I am trying to work around that.

@tianon The same happens if I run this in an interactive shell (/bin/bash).

@tianon
Member

tianon commented Oct 7, 2013

My apologies, I think it was still worth a try.

The point about that line not making much sense in the Dockerfile still does apply though. You probably wanted something more like this (after you figure out the limits issues):

RUN su testuser -c 'echo test > ~/test.txt'

@orikremer

@tianon you're right, it doesn't make much sense. That was merely to demonstrate that the su itself fails.

@jpetazzo
Contributor

To get back to the original discussion: I believe it is fine from a security standpoint (and useful!) to allow setfcap and mknod capabilities in the build process (and probably in regular container execution as well). Does anyone see any problem that could stem from that?

@zigguratt

@jpetazzo Quite the opposite! It would solve many problems I'm encountering. I think this is necessary for people who want to run Docker containers that act/look more like a real machine.

@barelnir

Hi

docker build randomly chooses a hostname starting with zero ('0'), which breaks our application. I tried to run "hostname" inside my Dockerfile in that case, but faced the same issue.

I also would like to have an option to run the docker build with RUNP or get an option to choose the hostname during build.

@nelsonjchen
Contributor

nelsonjchen commented Sep 16, 2018

Has anybody tried building these kinds of images with Kaniko? I just did it with @maneamarius 's Dockerfile on Docker for Mac and it seems to build successfully once you call Kaniko's docker run "build" command with --cap-add=SYS_PTRACE. Though, I'm having a bit of trouble loading the resulting tarball locally, the RAM usage is a bit high since it can't use overlayfs, and layer caching is still WIP. Things might Just Work if I push to a registry but I haven't tried that yet.

docker run --cap-add=SYS_PTRACE --rm -v $(pwd):/workspace gcr.io/kaniko-project/executor:latest --dockerfile=Dockerfile --context=/workspace --tarPath=/workspace/test.tar --destination=test  --single-snapshot

@avidspartan1

Having this feature would greatly help efforts to build Docker images via Puppet on Redhat/CentOS base images.

@nelsonjchen
Contributor

nelsonjchen commented Jan 10, 2019

Since I last posted, I've followed up on the changes in Kaniko. It no longer builds the tarball in memory but writes it to disk, which means support for Dockerfiles describing big images. Layer caching is still a WIP, but there is an option for caching base images for the moment (so there is currently no fast "edit a RUN step, save, and rebuild" iteration, but we can cache alpine, ubuntu, and whatever other popular bases are out there).

It's at a state where I've been successful in building @maneamarius's Dockerfile that emerges Golang in a Gentoo image in this project/demo without modifying @maneamarius 's Dockerfile or chopping it up in any way (EDIT: I've since had to modify the Dockerfile to pin the gentoo base image to the version that was latest at the time of this post. Otherwise, it's still unmodified.) :

https://github.com/nelsonjchen/kaniko-privileged-maneamarius-moby-1916

I've also configured Azure Pipelines to build the Dockerfile into an image with Kaniko with --cap-add=SYS_PTRACE, load Kaniko's output tarball, and run go version in the generated image. I figured some interactive "proof of life" would be interesting. Some of the earlier comments in here also were concerned about CI systems so I figured I'll configure a public CI system to work as well. BTW, Travis CI was considered but the build output was too long and it got terminated and Azure is perfectly happy with 166k lines of output. If the Dockerfile built with about 70k less lines of output, it probably would have succeeded on Travis CI. A link to the Azure Pipeline build outputs is at the top of the README.

@alexey-vostrikov

Use buildah Luke

@AkihiroSuda
Member

I'm closing this issue, because the feature is now available as docker buildx build --allow security.insecure

https://github.com/docker/buildx/blob/master/README.md#--allowentitlement
https://github.com/moby/buildkit/blob/master/frontend/dockerfile/docs/experimental.md#run---securityinsecuresandbox

@bharathappali

bharathappali commented Sep 27, 2019

@AkihiroSuda I have updated my Docker to version 19.03 to try buildx. When I tried the command you mentioned, it gave me an error:

$ docker buildx build --allow security.insecure -t sample-petclinic -f Dockerfile .
[+] Building 0.0s (0/0)
failed to solve: rpc error: code = Unknown desc = entitlement security.insecure is not allowed

Docker version:

Client: Docker Engine - Enterprise
 Version:           19.03.2
 API version:       1.40
 Go version:        go1.12.8
 Git commit:        c92ab06
 Built:             Tue Sep  3 15:57:09 2019
 OS/Arch:           linux/amd64
 Experimental:      true

Server: Docker Engine - Enterprise
 Engine:
  Version:          19.03.2
  API version:      1.40 (minimum version 1.12)
  Go version:       go1.12.8
  Git commit:       c92ab06
  Built:            Tue Sep  3 15:55:37 2019
  OS/Arch:          linux/amd64
  Experimental:     true
 containerd:
  Version:          1.2.6
  GitCommit:        894b81a4b802e4eb2a91d1ce216b8817763c29fb
 runc:
  Version:          1.0.0-rc8
  GitCommit:        425e105d5a03fabd737a126ad93d62a9eeede87f
 docker-init:
  Version:          0.18.0
  GitCommit:        fec3683

@AkihiroSuda
Member

buildx docs: For entitlements to be enabled, the buildkitd daemon also needs to allow them with --allow-insecure-entitlement

@bharathappali

Thanks @AkihiroSuda . It worked now.

@uvwild

uvwild commented Feb 25, 2020

Just to add another use case:
I am trying to fix a Dockerfile build of an IBM DB2 container with a test database.
IBM removed the v10 image from the hub, but the v11 DB image only starts with --privileged.
So all the code setting up the database in the Dockerfile no longer works, because db2 doesn't start without privileged mode. :(
There seems to be a complicated workaround using docker run and docker commit.
In a production build pipeline this creates a lot of extra complexity.

I have to ask like https://github.com/maneamarius in #1916 (comment)

Why is it such a big deal to support this? The build does execute a run under the hood.

In this specific use case a privileged build option would support a kind of "backward compatibility" and I know I am not the only one who had this issue after my web research.

@bharathappali

bharathappali commented Feb 26, 2020

@uvwild I'm not sure if this helps your use case, but you can try building with kaniko. Your image will be built without a Docker daemon, and you can extract the image once it's done. Using kaniko is just like running a container, so you can use --privileged or --cap-add <capability which is needed>, which might solve your problem.

I accept it's not the complete solution you were expecting, but it's an easier workaround which may fit in your build pipeline.

EDIT: As @alexey-vostrikov said, buildah could be a more feasible solution for use cases which need --privileged to build an image.

mensinda added a commit to mensinda/meson that referenced this issue May 27, 2020
In their infinite wisdom, the docker devs decided in
moby/moby#1916 that adding support
for --cap-add=SYS_PTRACE for `docker build` is not a priority
or that they just don't want to do that.

So we have to work around this limitation by creating a
temporary base image, then run the install script with
`docker run`. Next, the resulting container is converted to
an image with `docker commit`. Finally we clean up the mess
we have made.
cpuguy83 pushed a commit to cpuguy83/docker that referenced this issue May 25, 2021
Docker deamon doesn't work with --fixed-cidr on windows
@jerblack

buildx docs: For entitlements to be enabled, the buildkitd daemon also needs to allow them with --allow-insecure-entitlement

What does this mean? Where is this done? I can't find documentation on this anywhere.

@dlants

dlants commented Mar 6, 2023

I was able to get past the "fuse: device not found, try 'modprobe fuse' first" error (while trying to run tup, https://gittup.org/tup/) by doing the following:

Dockerfile:

# syntax=docker/dockerfile:1-labs
...
RUN --security=insecure <command requiring fuse>

see: https://docs.docker.com/engine/reference/builder/#run---securityinsecure

Commands:

docker buildx create --buildkitd-flags '--allow-insecure-entitlement security.insecure' --name insecure-builder
docker buildx use insecure-builder
docker buildx build --allow security.insecure .

see: https://docs.docker.com/engine/reference/commandline/buildx_build/#allow
see: https://docs.docker.com/engine/reference/commandline/buildx_create/#buildkitd-flags

I don't totally understand all of these settings or if they're wise to use, so beware! Definitely wish there was an easier way to do this!
