Implement --cap and --namespace flags for better control #6687
Comments
ping @philips @unclejack @rhatdan |
@crosbymichael +1 to adding this feature. I can help with changes to libcontainer to make this happen. |
+1 |
Does this mean we are going away from the --opts patch altogether, and I should work on the --selinuxopts? |
I actually like the feature, but the CLI is a little ugly. --namespace -PID PID looks like an option but it really is not, and does this eliminate the use of -P as an option? systemd-nspawn does this using --capability and --drop-capability; see man systemd-nspawn. |
I agree with @rhatdan, but more concisely just --drop-cap, --add-cap Also, for namespaces it would be really nice to have the same functionality that we currently have with net for all namespaces... so host/new/container:<id/name> |
Any thoughts about adding this support to Dockerfiles, such that these things can be baked into an image? Dockerfile could also allow/deny override of the baked-in config via docker run ... |
I like --add-cap, other options are --keep-cap or --allow-cap. namespace is sort of overloaded, what about something like --pid-namespace=true|false? I think as separate issues it'd be great to list capabilities via docker inspect, or possibly a new API method. |
@rhatdan yes, I would just work on a |
@jeremyeder No, we are not ready to add anything to the image until we have some type of control that says do not run any untrusted image with elevated permissions on my docker host. For now we will keep these as runtime only options |
FWIW, I like the alternative ideas over the +/- syntax. |
I agree with @rhatdan I like to see the flags grouped together. What about |
I see a dichotomy: docker default - you can only add capabilities, not drop**. docker run --add-cap .... ** is there a case for taking away from the default whitelist capabilities of docker? It seems that would only break things? |
@crosbymichael Regarding Dockerfile support. As mentioned in #6616, often applications will not run correctly if not granted the capabilities they require. I recommend that Dockerfile capability specification should not actually provide elevated permissions, but should indicate required permissions, so a clear error can be issued in the case that they are not granted, higher-level systems can interrogate the image to determine required capabilities, etc. |
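The idea above, that an image declares the capabilities it requires and the runtime reports a clear error when they are not granted, can be sketched roughly as follows. This is only an illustration: `missingCaps`, the `required` list, and the `granted` list are hypothetical names, and nothing here reflects an actual Dockerfile instruction or Docker API.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// missingCaps returns the capabilities listed in required that are absent
// from granted. Names are compared case-insensitively and the result is
// sorted so the error message is stable.
func missingCaps(required, granted []string) []string {
	have := make(map[string]bool, len(granted))
	for _, c := range granted {
		have[strings.ToUpper(c)] = true
	}
	var missing []string
	for _, c := range required {
		if name := strings.ToUpper(c); !have[name] {
			missing = append(missing, name)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	required := []string{"NET_ADMIN", "SYS_TIME"} // hypothetical image metadata
	granted := []string{"CHOWN", "NET_ADMIN"}     // hypothetical runtime set
	if m := missingCaps(required, granted); len(m) > 0 {
		fmt.Printf("container requires capabilities that were not granted: %s\n",
			strings.Join(m, ", "))
	}
}
```

A higher-level system could call the same check before scheduling a container, instead of discovering the problem as an unexplained "permission denied" at runtime.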
@michaelneale The problem with this is that --privileged means more than just --drop-cap. It also turns off MAC support (SELinux or AppArmor) and maybe a couple of other things. Again, I look to systemd-nspawn, which has the all flag: --drop-cap=all --add-cap=SYS_ADMIN --drop-namespace=all --add-namespace=MNT. We probably should separate the discussion of Dockerfiles from the discussion of the CLI for dropping/adding namespaces and capabilities. |
@bgrant0607 I do agree. One of the difficult things for users of "Contained" Docker Containers will be figuring out why a container is getting "permission denied". There are several potential reasons for this, read-only file systems among them, and the kernel does not help you out. We did have an effort to try to help, but Linus did not like it. https://fedoraproject.org/wiki/Features/FriendlyEPERM Currently the only thing a user can do is turn everything (all security) off (--privileged), and if it works they are happy but insecure. |
I really like the all flag @rhatdan is suggesting. It makes things very explicit, which is extremely important imo, and you don't have to wonder which caps Docker is adding or not adding by default or if the defaults ever change, etc. @michaelneale I'm actually not seeing a case for Docker to have a default whitelist at all. That it does isn't very well documented, and I think it leads to a false sense of security as well. I get the impression the Docker whitelist was created when Docker was just a project to be used by dotcloud, and best fit their use cases. |
Regarding namespaces, instead of just on/off, can we have the ability to specify another container id/name, similar to the --net flag? So --namespace pid=container:otherguy. Just as it's nice to be able to have namespaces turned off so that you can augment the host system with things like monitoring tools, it is just as nice to be able to augment other containers. |
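To make the host/new/container:<id/name> proposal concrete, here is a minimal sketch of how a value such as pid=container:otherguy might be parsed. The `--namespace` flag was only a proposal in this thread, so `nsMode` and `parseNamespaceFlag` are hypothetical names and the accepted modes are an assumption taken from the comments above.

```go
package main

import (
	"fmt"
	"strings"
)

// nsMode is a hypothetical parsed form of "<ns>=<mode>".
type nsMode struct {
	Namespace string // e.g. "pid", "ipc", "net"
	Mode      string // "host", "new", or "container"
	Container string // set only when Mode == "container"
}

// parseNamespaceFlag parses specs like "pid=host", "ipc=new", or
// "pid=container:otherguy" and rejects anything else.
func parseNamespaceFlag(s string) (nsMode, error) {
	parts := strings.SplitN(s, "=", 2)
	if len(parts) != 2 || parts[0] == "" {
		return nsMode{}, fmt.Errorf("invalid namespace spec %q: want <ns>=<mode>", s)
	}
	ns, mode := parts[0], parts[1]
	switch {
	case mode == "host" || mode == "new":
		return nsMode{Namespace: ns, Mode: mode}, nil
	case strings.HasPrefix(mode, "container:"):
		id := strings.TrimPrefix(mode, "container:")
		if id == "" {
			return nsMode{}, fmt.Errorf("invalid namespace spec %q: missing container id or name", s)
		}
		return nsMode{Namespace: ns, Mode: "container", Container: id}, nil
	}
	return nsMode{}, fmt.Errorf("invalid namespace mode %q: want host, new, or container:<id/name>", mode)
}

func main() {
	m, err := parseNamespaceFlag("pid=container:otherguy")
	fmt.Println(m, err)
}
```

Keeping one grammar for every namespace would mirror what the net option already accepts, which is the consistency being asked for here.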
@danbeaulieu Docker's whitelist has been created by docker and the community (anyone remember the great MKNOD battle?). It was not created to best fit the dotcloud use case, and the list evolves as we learn more about how people are using docker. Yes, I think we should have an |
Sadly I missed the MKNOD battle, I would have been on the side of don't grant it. And get rid of cgroup device node stuff, which I still have no clue why this is not a namespace. Having a keyword like "all" and "none" would make building up or taking away capabilities a lot easier. I think having a group of predefined CAPABILITIES is good so people will run in a "semi"-secure environment. I just wish it was easy to see what the default list is. Perhaps we should list it in the docker run --help |
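For illustration, one possible way to combine a default list with add/drop lists and an "all" keyword is sketched below. The precedence rules are assumptions made for this sketch, not Docker's actual behavior: "ALL" in the add list starts from every known capability, otherwise "ALL" in the drop list starts from nothing, otherwise the defaults are used; explicit adds are then applied, and explicit drops win last. The `known` and `defaults` lists in `main` are placeholders, not Docker's real whitelist.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// toSet upper-cases names and stores them in a set.
func toSet(names []string) map[string]bool {
	set := make(map[string]bool, len(names))
	for _, n := range names {
		set[strings.ToUpper(n)] = true
	}
	return set
}

// resolveCaps computes the effective capability set under the assumed rules:
// pick a starting set based on the "ALL" keyword, apply explicit adds, then
// apply explicit drops. The result is sorted.
func resolveCaps(known, defaults, adds, drops []string) []string {
	addSet, dropSet := toSet(adds), toSet(drops)
	set := make(map[string]bool)
	switch {
	case addSet["ALL"]:
		set = toSet(known)
	case !dropSet["ALL"]:
		set = toSet(defaults)
	}
	for c := range addSet {
		if c != "ALL" {
			set[c] = true
		}
	}
	for c := range dropSet {
		if c != "ALL" {
			delete(set, c)
		}
	}
	out := make([]string, 0, len(set))
	for c := range set {
		out = append(out, c)
	}
	sort.Strings(out)
	return out
}

func main() {
	known := []string{"CHOWN", "MKNOD", "NET_ADMIN", "SYS_ADMIN"} // placeholder list
	defaults := []string{"CHOWN", "MKNOD"}                        // placeholder defaults
	fmt.Println(resolveCaps(known, defaults, []string{"SYS_ADMIN"}, []string{"all"}))
}
```

Printing the resolved list, for example from an inspect-style command, would also address the wish above that the default set be easy to see.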
If add/remove capabilities/namespaces goes in, can I do share @rhatdan concern that if something is getting denied, people might just run privileged and therefore disable MAC entirely. |
@rhatdan I started a basic implementation for caps here: https://github.com/vieux/docker/compare/dotcloud:master...vieux:cap_add_drop?expand=1 it supports Any idea ? /cc @crosbymichael |
all is fine with me |
@rhatdan nevermind I edited my comment, |
See #6968 |
One thing we discussed part of #2452 was support for FreeBSD and how this feature looks when we consider supporting other operating systems. Do we simply assume that all arguments to --cap are OS-dependent, or do we attempt a mapping to our own verbs to the OS-specific ones? I'm inclined to say it should be OS-dependent. |
It should be OS Specific. We should not be inventing a new language. How close is the --namespace patch? |
Has there been any progress on this? I am highly interested in the ability to drop specific namespaces. |
--cap-add and --cap-drop were included with Docker 1.2. No --namespace flag has been added. We should probably close this issue, possibly creating one for more granular namespace configuration. |
Perhaps my use case would help here. What I'm trying to do is mount something within a volume in one container, and then share the volume (containing the mount) with another container. Because of the mount namespace, when the volume is shared with another container, it appears empty. If I could disable the mount namespace, I could work around this. |
Currently you can disable the network, UTS, and IPC namespaces with --net=host and --ipc=host. I have submitted a patch to disable the PID namespace with --pid=host. The user namespace is not enabled yet. That leaves mount, but using the MNT namespace for handling the image content is more or less the definition of docker. Not sure how you would disable that and still have docker. libcontainer allows you to disable/enable all combinations. |
Well I imagine images would break left and right if the root ( |
imo it would be cool to have it tho - perhaps its a way for a container to union on top of another host dir / container, and in the process, we make even lighter weight containers. :) |
Even if you shared / or /usr you would probably still want to have applications write to locations that are different per container /var and /tmp for example. But if you want just to have namespaces, why not just use the unshare command. |
This is complete with |
We discussed this and think it would be best to not use something like --opts to be able to add and remove individual capabilities and namespaces for containers. We would rather use a dedicated flag and make sure that these features work across libcontainer and lxc. So our proposal is to add these two flags and provide a warning when elevating privileges for a container so that a user is aware of what they are doing.
Caps
http://linux.die.net/man/7/capabilities
NET_ADMIN
cap to a container.
Any additional caps that are retained for a container should produce an error on stderr when running the container to let the user know they have elevated the privileges for a container.
Namespaces
There are a few reasons why you might want to modify the namespaces applied to a container. If you are running a monitoring application in a container, you may not want to drop into your own PID namespace, so that you can still see all of your parent's PIDs.
Changing the default namespace profile should result in a warning to the user.
Privileged
Using the privileged flag should result in a warning on stderr just like the two new flags above.
Development
We can discuss some of the details in this issue but if you would like to work on any of the changes please let me know.
I think all three of these changes (caps, namespaces, and the privileged warning) can be done separately. Ideally the privileged warning code can be merged in first so we have the warning pipeline to work with on the other PRs. Caps should be easy to implement. Namespaces will be a little more challenging because they may require some runtime/libcontainer changes to support the different configuration.
Suggestions?