
Supplemental groups on PVs mutate pod spec in kubelet #27197

Closed
pmorie opened this issue Jun 10, 2016 · 17 comments · Fixed by #27639
Labels
kind/bug · priority/critical-urgent · sig/storage

Comments

@pmorie
Member

pmorie commented Jun 10, 2016

#20490 added a beta feature that allows PVs to be annotated with a GID that owns them. This facilitates volumes like NFS, which cannot be safely chowned/chmoded from the client and thus do not work with FSGroup. There are a couple of issues with this PR that need to be sorted out:

  1. The pod spec should not be mutated. We should formalize in the contract for volume plugins that the api.Pod they are passed is a read-only view and must not be mutated. We should probably make a copy of the pod when creating Mounters (see the sketch after this list).
  2. Currently there is no security policy check of whether the Pod can use the supplemental group the volume is annotated with. We must be able to validate this.
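
To make the intended contract concrete, here is a minimal sketch, in Go with simplified stand-in types rather than the real api.Pod and volume plugin interfaces, of deep-copying the cached pod before constructing a Mounter so that plugin-side mutation stays local:

```go
// Minimal sketch only: simplified stand-in types, not the real api.Pod or
// volume plugin interfaces. The point is the shape of the fix: give each
// Mounter a private deep copy so plugin-side mutation can never reach the
// kubelet's cached object.
package main

import "fmt"

type Pod struct {
	Name               string
	SupplementalGroups []int64
}

// deepCopy returns an independent copy, including a fresh backing array
// for the slice, so appends on the copy cannot alias the original.
func (p *Pod) deepCopy() *Pod {
	c := *p
	c.SupplementalGroups = append([]int64(nil), p.SupplementalGroups...)
	return &c
}

// newMounterView models Mounter construction: the volume's annotated GID
// is merged into a copy, leaving the cached pod untouched.
func newMounterView(cached *Pod, volumeGID int64) *Pod {
	view := cached.deepCopy()
	view.SupplementalGroups = append(view.SupplementalGroups, volumeGID)
	return view
}

func main() {
	cached := &Pod{Name: "web", SupplementalGroups: []int64{1000}}
	view := newMounterView(cached, 777)
	fmt.Println(cached.SupplementalGroups) // [1000]: the cache is unchanged
	fmt.Println(view.SupplementalGroups)   // [1000 777]
}
```

Copying once per Mounter construction, rather than on every call into the plugin, also keeps the cost of the copy bounded.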

@kubernetes/sig-storage

@pmorie pmorie added kind/bug, priority/critical-urgent, and sig/storage labels Jun 10, 2016
@pmorie pmorie added this to the v1.3 milestone Jun 10, 2016
@derekwaynecarr
Member

@vishh - this is similar to some of the discussions we've had about not mutating internal caches in the kubelet.

@pmorie
Member Author

pmorie commented Jun 10, 2016

@pweil- touch point with security policy here.

@vishh
Contributor

vishh commented Jun 10, 2016

@derekwaynecarr My main concern with the copy approach is that the kubelet code might become too complicated to understand and reason about. I hope that concern can be addressed through code cleanup and clear abstractions.

@pmorie
Member Author

pmorie commented Jun 10, 2016

Discussed a fix for the pod mutation with @saad-ali here: https://github.com/kubernetes/kubernetes/pull/26801/files#r66683068

@yujuhong
Contributor

@derekwaynecarr My main concern with the copy approach is that the kubelet code might become too complicated to understand and reason about. I hope that concern can be addressed through code cleanup and clear abstractions.

This could also cause performance issues if objects such as pods are copied on every call, even though most components in the kubelet don't need to mutate them.

@goltermann
Contributor

This is a P0 with no assignee. Anyone know who it should go to?

@pmorie
Member Author

pmorie commented Jun 13, 2016

@goltermann I'm trying to figure that out now. Might be @saad-ali -- we have a 1:1 today and we'll know more after that.

@pmorie pmorie added priority/important-soon and removed priority/critical-urgent labels Jun 13, 2016
@pmorie
Member Author

pmorie commented Jun 13, 2016

Downgrading to P1 from P0 after talking to @eparis and @saad-ali.

@matchstick
Contributor

@pmorie Making this a P1 means that we expect it to get done in 1.3 time frame. That means that someone will resolve it in the next two weeks. Do we have folks who can do that? I am not sure that @saad-ali can get to it in that time frame.

@pmorie
Member Author

pmorie commented Jun 13, 2016

@matchstick at this point I was lowering it because I don't think it should block the beta.

@pmorie pmorie added priority/backlog and removed priority/important-soon labels Jun 13, 2016
@goltermann
Contributor

To be super clear, @saad-ali and @pmorie - P2 means we don't think it needs to be fixed at all for v1.3 (but we'd fix it in 1.4). Is that accurate?

@eparis
Contributor

eparis commented Jun 13, 2016

I'd like to know what @dchen1107 thinks of what they are doing in the kubelet to a (supposedly) read-only object. If she doesn't scream and rip out her hair, I think we could wait until 1.4 with little/reasonable negative effect for end users.

@eparis
Contributor

eparis commented Jun 13, 2016

Although, @liggitt, this is going to bite us for OpenShift.

@dchen1107
Member

Here are the reasons we decided to push the priority to P2 after talking to the related parties:

  1. The 1.2 release added a beta(?) feature called fsGroup (FSGroup implementation #15352, ...) that lets you specify a special supplemental group that will own a pod's volumes.
  2. However, for NFS, you can't safely do a chown from the client for a number of reasons. Also, a lot of NFS volumes are provisioned to be owned by some GID that you must have in order to use the volume. As a result, an administrator has to coordinate with the consumer of such a volume to say 'use group X when you use NFS volume Y'.
  3. To resolve the above issue, Automatically Add Supplemental Groups from Volumes to Pods #20490 was introduced to annotate the PV with the GID and have it used automatically (a sketch of the annotation lookup follows this list).
  4. Unfortunately, the PR (Automatically Add Supplemental Groups from Volumes to Pods #20490) itself introduces the issues @pmorie lists in the issue description. Given the current release status, there is no easy way today to fix this particular issue, so I suggested de-prioritizing it to P2 to unblock the 1.3 release.
  5. Meanwhile, to mitigate the potential issues, especially the lack of security policy validation, I suggested we revert Automatically Add Supplemental Groups from Volumes to Pods #20490. We can then plan to resolve the issue in the 1.4 release.
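
To make point 3 concrete, here is a minimal Go sketch of reading such a GID annotation off a PV. The annotation key pv.beta.kubernetes.io/gid is my understanding of the one this feature uses, and the types are simplified stand-ins, not the real Kubernetes API:

```go
// Minimal sketch only: PersistentVolume here is a simplified stand-in, not
// the real API type. The annotation key is my reading of the one used by
// this feature; treat it as an assumption.
package main

import (
	"fmt"
	"strconv"
)

const gidAnnotation = "pv.beta.kubernetes.io/gid"

type PersistentVolume struct {
	Name        string
	Annotations map[string]string
}

// volumeGID returns the GID annotated on the PV, or ok=false when the
// annotation is absent or not a parseable integer.
func volumeGID(pv *PersistentVolume) (gid int64, ok bool) {
	raw, found := pv.Annotations[gidAnnotation]
	if !found {
		return 0, false
	}
	gid, err := strconv.ParseInt(raw, 10, 64)
	if err != nil {
		return 0, false
	}
	return gid, true
}

func main() {
	pv := &PersistentVolume{
		Name:        "nfs-share",
		Annotations: map[string]string{gidAnnotation: "777"},
	}
	if gid, ok := volumeGID(pv); ok {
		fmt.Printf("processes in the pod need supplemental group %d to use %s\n", gid, pv.Name)
	}
}
```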

@pmorie please correct me if anything is missing.

cc/ @matchstick @saad-ali

@smarterclayton
Contributor

@mfojtik / @pweil- if this is reverted here, we'll need a carry and the security policy aspects closed downstream.

@pweil-
Contributor

pweil- commented Jun 15, 2016

the security policy aspects closed downstream.

@mfojtik let's create a bug/issue/card to make sure this isn't missed as a post-rebase item. At a minimum we'll need to do a policy check to ensure the groups are allowed (build already does something similar for UID access). A sketch of such a check follows below.

Edit: created openshift/origin#9361
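
For illustration, here is a minimal sketch of the kind of policy check being discussed: verifying that every supplemental group a pod would gain, including a volume's annotated GID, falls inside ranges the active policy allows. The GroupPolicy and IDRange types are hypothetical stand-ins loosely modeled on SCC-style group ranges, not a real Kubernetes or OpenShift API:

```go
// Hypothetical sketch only: GroupPolicy and IDRange are stand-ins, not a
// real API. The point is the shape of the check: every group the pod would
// run with, including a volume-derived GID, must fall in an allowed range.
package main

import "fmt"

type IDRange struct{ Min, Max int64 }

type GroupPolicy struct {
	// An empty AllowedRanges is treated as "allow any", mirroring
	// policies that leave group ranges unrestricted.
	AllowedRanges []IDRange
}

func (p GroupPolicy) groupAllowed(gid int64) bool {
	if len(p.AllowedRanges) == 0 {
		return true
	}
	for _, r := range p.AllowedRanges {
		if gid >= r.Min && gid <= r.Max {
			return true
		}
	}
	return false
}

// validateSupplementalGroups rejects the pod when any effective group,
// whether declared on the pod or derived from a volume annotation, is
// outside the allowed ranges.
func validateSupplementalGroups(p GroupPolicy, gids []int64) error {
	for _, gid := range gids {
		if !p.groupAllowed(gid) {
			return fmt.Errorf("supplemental group %d is not allowed by policy", gid)
		}
	}
	return nil
}

func main() {
	policy := GroupPolicy{AllowedRanges: []IDRange{{Min: 1000, Max: 2000}}}
	// 777 stands in for a GID picked up from a PV annotation.
	fmt.Println(validateSupplementalGroups(policy, []int64{1000, 777}))
}
```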

@matchstick matchstick added the priority/critical-urgent label Jun 16, 2016
@matchstick matchstick removed the priority/backlog label Jun 16, 2016
@matchstick matchstick modified the milestones: next-candidate, v1.3 Jun 16, 2016
@matchstick
Contributor

So we are removing most of the P2s from the v1.3 milestone, as they can be pushed to v1.4. Given this issue's special nature, I think everyone in @kubernetes/sig-storage will keep it in mind and try our best not to let it fall through the cracks (I know Saad and I feel that way). After we release v1.3 we should revisit this issue.

I am going to mark it with the milestone "next-candidate" and raise it to P0, which is where we are bucketing many issues we wish we could have gotten into the latest release. There are a few others like this.

Hopefully this makes sense.

k8s-github-robot pushed a commit that referenced this issue Jun 21, 2016
Automatic merge from submit-queue

Remove pod mutation for volumes annotated with supplemental groups

Removes the pod mutation added in #20490 -- partially resolves #27197 from the standpoint of making the feature inactive in 1.3.  Our plan is to make this work correctly in 1.4.

@kubernetes/sig-storage