Non-root support for volumes #12944
GCE e2e build/test failed for commit 7ffc2e6.
return nil
}

// Do images need to be pulled here?
For the record, I think this will need to be a function of the container runtime, because the image pull happens behind a `Runtime` interface method call.
7ffc2e6
to
eceffd5
Compare
GCE e2e build/test failed for commit eceffd5a364f09b3c8e73aa2ce6182fe461a66e2.
@pmorie Re SELinux in this PR and #9844 what if we do the following:
This would give good defaults and the options to change them for more strict security. WDYT?
There's a discussion going on to do a pod security context - that has
On Tue, Aug 25, 2015 at 4:11 PM, Sami Wagiaalla notifications@github.com
Clayton Coleman | Lead Engineer, OpenShift
eceffd5
to
d63eabd
Compare
@smarterclayton @swagiaal I don't want to use
GCE e2e build/test failed for commit d63eabd4b3668241f80057b3251c4e7b935ce97e.
3. The builders for distributed file systems should return the correct values based on the `Manage`
field of the volume source and the SELinux support of that volume type.

TODO: persistent volumes
@markturansky, does this concern fall under the 'dynamic provisioning' work?
I will read this entire PR carefully to understand how it will work with PV and then provide feedback.
d63eabd
to
7450cbc
Compare
@thockin @vishh @smarterclayton @swagiaal PTAL, made some major revisions and additions to this tonight.
GCE e2e build/test failed for commit 7450cbc.
@childsb @wattsteve PTAL
I'd like to see 2nd order multi-tenancy addressed, or the use case of an existing storage/LDAP system with UID/GIDs that may overlap with UID/GIDs on the host or in other containers. If the decision is to specifically not allow it, that should be stated (I think it should be allowed).
@childsb I'm not exactly clear on what '2nd order' multitenancy is -- what would be first vs. second order?
@childsb I am planning to add a section on cluster-wide UID/GID provisioning, just want to make sure I know what you're looking for.
@pmorie I would define 1st/2nd order tenancy like:
@childsb Is this the same topic we've previously discussed? You brought up examples of hadoop, tomcat, etc.
@pmorie Yes, same issue.
@childsb Okay, I will address that in the section I add about that. TL;DR: You should make sure your containers run as a single UID; handling multiple UIDs within containers is not going to be supported.
If you do not use :z you must generate pod level MCS labels and override the container ones

The Kubelet must analyze the pod spec to determine which UIDs need to use which volumes. If a
container's security context's `RunAsUser` field is not set, the Kubelet must inspect the image via
the container runtime to determine which UID the image will run as. Once the list of UIDs that need
I think the kubelet should avoid inspecting container images. If RunAsUser
is not specified then rely on the GID to grant volume permissions
I agree, and will change it. I don't think there should be any ownership determination. We already inspect the image though for the `RunAsNonRoot` feature.
We discussed this before and it is part of the policy because it matches
end user expectation. Images are late bound to nodes, so it has to be
enforced at the kubelet.
You can represent the policy you describe already.
On Aug 27, 2015, at 5:29 PM, Sami Wagiaalla notifications@github.com
wrote:
In docs/proposals/volumes.md
#12944 (comment):
+1. When does Kubernetes need to manage the ownership of a volume?
+2. How does Kubernetes determine the ownership of a volume?
+
+#### When to determine ownership
+
+Whether Kubernetes should manage the ownership of a volume for a distributed filesystem depends upon
+both the file system type and the cluster operator's policy. For example:
+
+1. Some organizations will manage ownership of volumes externally to the cluster
+2. It is not possible to securely `chown` or `chmod` paths within some distributed filesystems
+
+#### How to determine ownership
+
+The Kubelet must analyze the pod spec to determine which UIDs need to use which volumes. If a
+container's security context's `RunAsUser` field is not set, the Kubelet must inspect the image via
+the container runtime to determine which UID the image will run as. Once the list of UIDs that need
I meant this in the context of RunAsNonRoot - late binding the image requires late binding the policy
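For illustration, the per-volume UID analysis debated in this thread could be sketched roughly as below. This is only a sketch under stated assumptions: `Container`, `VolumeUsers`, and `usersByVolume` are hypothetical names and are not Kubernetes API types. Following the suggestion above, the sketch does not inspect images; it only flags volumes used by a container whose `RunAsUser` is unset, so a caller could fall back to group-based access for those volumes.

```go
package main

import (
	"fmt"
	"sort"
)

// Container is a hypothetical, simplified stand-in for a container spec.
// It is NOT the Kubernetes API type.
type Container struct {
	Name         string
	RunAsUser    *int64   // nil when the security context does not set a UID
	VolumeMounts []string // names of the volumes this container mounts
}

// VolumeUsers records which explicitly declared UIDs use a volume, and
// whether any container mounting it leaves its UID unspecified.
type VolumeUsers struct {
	UIDs       []int64
	UnknownUID bool
}

// usersByVolume groups the declared UIDs by volume name. Containers without
// RunAsUser are not resolved here; they only set UnknownUID on the volume.
func usersByVolume(containers []Container) map[string]*VolumeUsers {
	result := map[string]*VolumeUsers{}
	seen := map[string]map[int64]bool{}
	for _, c := range containers {
		for _, v := range c.VolumeMounts {
			if result[v] == nil {
				result[v] = &VolumeUsers{}
				seen[v] = map[int64]bool{}
			}
			if c.RunAsUser == nil {
				result[v].UnknownUID = true
				continue
			}
			uid := *c.RunAsUser
			if !seen[v][uid] {
				seen[v][uid] = true
				result[v].UIDs = append(result[v].UIDs, uid)
			}
		}
	}
	// Sort for deterministic output per volume.
	for _, u := range result {
		sort.Slice(u.UIDs, func(i, j int) bool { return u.UIDs[i] < u.UIDs[j] })
	}
	return result
}

func main() {
	root, app := int64(0), int64(1001)
	containers := []Container{
		{Name: "init", RunAsUser: &root, VolumeMounts: []string{"data"}},
		{Name: "app", RunAsUser: &app, VolumeMounts: []string{"data", "cache"}},
		{Name: "sidecar", VolumeMounts: []string{"cache"}},
	}
	for name, u := range usersByVolume(containers) {
		fmt.Printf("%s: uids=%v unknownUID=%v\n", name, u.UIDs, u.UnknownUID)
	}
}
```

A volume whose `UIDs` list mixes root and non-root users, or whose `UnknownUID` flag is set, is the case where a shared supplemental group becomes relevant.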
880b0d6
to
d096d83
Compare
@thockin @smarterclayton PTAL
Unit, integration and GCE e2e build/test failed for commit d096d83b7eee0b20b06a09fa7f10b4bb5d82c399.
d096d83
to
26d1081
Compare
@thockin I updated the doc based on our discussion today. Since the SELinux one is going to have similar copy, would you care to review this one first? If we're over the hump on this one, I will update the SELinux proposal as well.
Unit, integration and GCE e2e test build/test passed for commit 26d10817f78323d7d8a5c1a03bc33d1d8ea80542.
Labelling this PR as size/XL

If the list of UIDs that need to use a volume includes both root and non-root users, supplemental
groups can be applied to enable sharing volumes between containers. The ownership and permissions
`root:<supplemental group> 0770` will make a volume usable from both containers running as root and
You need the mode to be 02770 on the dir. If the setgid ("group sticky") bit is not set, users can write to the volume as themselves, but new files get the process's FSGID as their group rather than the directory's group.
I'm going to say we want the setgid bit set for now. I'm still thinking through all the implications of sticky bit. I think we want it, trying to determine if we should set it on all volumes.
Just a couple nits, then LGTM. I don't want to re-review, so feel free to self-lgtm this :)
26d1081
to
32b6646
Compare
By the power vested in me by #12944 (comment), I declare this PR LGTM.
Thanks for the review @thockin
Unit, integration and GCE e2e build/test failed for commit 32b6646.
@k8s-bot test this
Unit, integration and GCE e2e build/test failed for commit 32b6646.
Automatic merge from submit-queue
Auto commit by PR queue bot
This is a WIP proposal for handling concerns around volume ownership and pods running containers as non-root UIDs.
I'm still hacking around to see what kind of changes will be necessary, but I think the use case and analysis are probably ready for discussion.
@thockin @vishh @smarterclayton @ncdc