
kubeadm: prevent PSP blocking of upgrade image prepull #77792

Merged
k8s-ci-robot merged 1 commit into kubernetes:master from neolit123:kubeadm-psp-upgrade-fix on Jul 2, 2019

Conversation

@neolit123
Member

commented May 12, 2019

What this PR does / why we need it:
kubeadm: prevent PSP blocking of upgrade image prepull

If the cluster has a PSP that blocks Pods from running as root,
the DS that handles the upgrade prepull will fail to create its Pods.

Work around that by adding a PodSecurityContext with RunAsUser=999.

see the discussions here:
kubernetes/kubeadm#1208
#77787
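
For reference, a minimal sketch of what the fix amounts to in the prepull DaemonSet builder (the helper name setNonRootSecurityContext is illustrative; the SecurityContext fields are the real core/v1 API):

import (
	v1 "k8s.io/api/core/v1"
	utilpointer "k8s.io/utils/pointer"
)

// Give the prepull Pod template an explicit PodSecurityContext with a
// non-zero UID, so a restrictive PSP with a MustRunAsNonRoot rule neither
// mutates nor rejects the Pods.
func setNonRootSecurityContext(spec *v1.PodSpec) {
	spec.SecurityContext = &v1.PodSecurityContext{
		RunAsUser: utilpointer.Int64Ptr(999), // any non-zero UID works; this PR uses 999
	}
}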

Which issue(s) this PR fixes:

Fixes kubernetes/kubeadm#1208

Special notes for your reviewer:
example ClusterRole + PSP are here:
https://gist.github.com/praseodym/88ea84fef721f0da5ecf6b1f42ba77d7#file-psp-clusterroles-yaml

Does this PR introduce a user-facing change?:

kubeadm: prevent PSP blocking of upgrade image prepull by using a non-root user

/priority important-longterm
/kind bug
@kubernetes/sig-cluster-lifecycle-pr-reviews
cc @praseodym

Holding for review from:
cc @mauilion @joshrosso
/hold

@k8s-ci-robot
Contributor

commented May 12, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: neolit123


@@ -183,6 +184,12 @@ func buildPrePullDaemonSet(component, image string) *apps.DaemonSet {
			},
			Tolerations:                   []v1.Toleration{constants.ControlPlaneToleration},
			TerminationGracePeriodSeconds: &gracePeriodSecs,
			// Explicitly add a PodSecurityContext to allow these Pods to run as root.
			// This prevents restrictive PSPs from blocking the Pod creation, by automatically

@liggitt
Member
May 13, 2019

If a permissive PSP is present that allows the pod to be created as-is, it will be created as is (with no podSecurityContext and no explicit mustRunAsNonRoot setting)

The only time the pod would be forced to mustRunAsNonRoot: true by PSP is if that was the only PSP available to the pod. In that case, setting this to false here would cause the pod to be completely rejected by PSP, and the pod could not be created at all. That might be what you want, since it surfaces a permission problem earlier, but setting this to false will not prevent a restrictive PSP from blocking the pod creation.

@praseodym
Contributor
May 13, 2019

Unfortunately that is not what happens. PSPs get evaluated in order, so it makes sense to have a restricted PSP with a MustRunAsNonRoot rule as the first in order. In that case, with no runAsNonRoot field set in the SecurityContext, the pod will be forced to run as non-root by the PSP. This can be seen in the following code, which specifically triggers on sc.RunAsNonRoot() == nil and sets its value to true:

// if we're using the non-root strategy set the marker that this container should not be
// run as root which will signal to the kubelet to do a final check either on the runAsUser
// or, if runAsUser is not set, the image UID will be checked.
if sc.RunAsNonRoot() == nil && sc.RunAsUser() == nil && s.psp.Spec.RunAsUser.Rule == policy.RunAsUserStrategyMustRunAsNonRoot {
	nonRoot := true
	sc.SetRunAsNonRoot(&nonRoot)
}

The kubelet evaluates the actual runtime user (i.e. the value specified by the USER instruction in the Docker image) and sees that the container will run as root, causing pod creation to fail.

This happens even when a less restrictive PSP (i.e. one without a MustRunAsNonRoot rule) is available: it never gets tried by the PSP admission controller, because the pod does not explicitly specify that it will run as root.
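
For reference, the kubelet-side check described above boils down to roughly the following (a simplified sketch; the function signature and fallback handling here are illustrative, not the kubelet's actual code):

import "fmt"

// Simplified sketch of the kubelet's final runAsNonRoot verification:
// with runAsNonRoot=true and no explicit runAsUser, the UID baked into
// the image (its USER instruction) decides whether the container starts.
func verifyRunAsNonRoot(runAsNonRoot *bool, runAsUser *int64, imageUser string) error {
	if runAsNonRoot == nil || !*runAsNonRoot {
		return nil // nothing to enforce
	}
	if runAsUser != nil {
		if *runAsUser == 0 {
			return fmt.Errorf("container's runAsUser breaks non-root policy")
		}
		return nil
	}
	// No explicit runAsUser: fall back to the image's USER.
	if imageUser == "" || imageUser == "0" || imageUser == "root" {
		return fmt.Errorf("container has runAsNonRoot and image will run as root")
	}
	return nil
}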

@liggitt
Member
May 13, 2019

PSPs get evaluated in order, so it makes sense to have a restricted PSP with a MustRunAsNonRoot rule as the first in order.

Non-mutating PSPs are preferred over mutating ones

@praseodym
Contributor
May 13, 2019

Then it seems like that preference isn't working out like it should. When I create this pod:

apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  containers:
  - name: test
    image: debian

on a cluster with these PSPs configured, the pod gets admitted under the a-restricted PSP with a MustRunAsNonRoot rule:

I0513 19:34:54.344576       1 admission.go:135] pod test (generate: ) in namespace kube-system validated against provider a-restricted

That log line comes from the part of the PSP admission plugin where mutation is allowed:

// compute the context. Mutation is allowed. ValidatedPSPAnnotation is not taken into account.
allowedPod, pspName, validationErrs, err := c.computeSecurityContext(a, pod, true, "")
if err != nil {
	return admission.NewForbidden(a, err)
}
if allowedPod != nil {
	*pod = *allowedPod
	// annotate and accept the pod
	klog.V(4).Infof("pod %s (generate: %s) in namespace %s validated against provider %s", pod.Name, pod.GenerateName, a.GetNamespace(), pspName)

And indeed, the pod never starts, because it was mutated to have a runAsNonRoot: true SecurityContext:

  Warning  Failed     60s (x8 over 2m22s)  kubelet, k8s-06    Error: container has runAsNonRoot and image will run as root

The same pod with a runAsNonRoot: false SecurityContext manually set gets created successfully with the c-moderate PSP applied.

@liggitt
Member
May 13, 2019

Then it seems like that preference isn't working out like it should. When I create this pod:
...
on a cluster with these PSPs configured, the pod gets admitted to the a-restricted PSP with a MustRunAsNonRoot rule:

I see. In your case, you have multiple policies which require mutating the pod, so no policy allows the pod as is. In your cluster, you've configured the restrictive policy to take precedence over the moderate policy.

The full logic is:

  • If any authorized policies allow the pod as-is, admit the pod with no changes
  • If any authorized policies allow the pod with mutations, admit the pod under the first policy alphabetically

In your case, both the a-restricted and c-moderate policies would allow your example pod if it had some changes applied (dropped capabilities, etc). Since a-restricted comes first alphabetically, the restricted policy was chosen.

By explicitly setting securityContext: {runAsNonRoot: false}, your pod is no longer permitted at all by the a-restricted policy, so the c-moderate policy is the only one that would allow it.

If you want the moderate policy to be preferred for pods that it would allow, it would be simpler to name it so that it comes before your restrictive policy.

If the image prepull pod requires root access (does it?), then setting runAsUser: 0 in the pod spec is fine. Like I said, it will not allow the pod to run in a scenario where the only PSPs available require running as non-root.
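
In rough pseudocode, that selection logic looks like this (an illustrative sketch assuming a hypothetical Policy interface, not the actual admission plugin code):

import (
	"fmt"
	"reflect"

	v1 "k8s.io/api/core/v1"
)

// Policy is a hypothetical stand-in for one authorized PSP.
type Policy interface {
	Name() string
	// Apply returns a copy of the pod mutated to satisfy the policy,
	// or ok=false if the pod cannot satisfy it at all.
	Apply(pod *v1.Pod) (mutated *v1.Pod, ok bool)
}

// pickPSP sketches the selection rules: a policy that admits the pod
// unmodified always wins; otherwise the first policy (in alphabetical
// order) that admits a mutated copy wins.
func pickPSP(pod *v1.Pod, policies []Policy) (*v1.Pod, string, error) {
	var firstMutated *v1.Pod // first mutating match, in policy name order
	var firstName string
	for _, p := range policies { // assumed sorted alphabetically by name
		mutated, ok := p.Apply(pod)
		if !ok {
			continue
		}
		if reflect.DeepEqual(pod, mutated) {
			return mutated, p.Name(), nil // non-mutating match: admit as-is
		}
		if firstMutated == nil {
			firstMutated, firstName = mutated, p.Name()
		}
	}
	if firstMutated != nil {
		return firstMutated, firstName, nil
	}
	return nil, "", fmt.Errorf("no authorized policy allows the pod")
}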

@mauilion
Contributor
May 16, 2019

TIL! Thanks Liggitt.

@praseodym
Contributor

commented May 20, 2019

Considering that the prepull pod only runs /bin/sleep, could we also specify runAsUser: 999 (where 999 is some UID > 0) instead?

@liggitt
Member

commented May 20, 2019

Considering that the prepull pod only runs /bin/sleep, could we also specify runAsUser: 999 (where 999 is some UID > 0) instead?

Yes, either specifying a non-zero UID or updating the image to not run as root would work as well.

@neolit123
Member Author

commented Jun 10, 2019

@praseodym
Just to double-check: instead of the change this PR is doing, we could do this?

apiVersion: v1
kind: Pod
spec:
  securityContext:
    runAsUser: X

The pod itself might not require root, but the command (kubeadm upgrade) that creates it does require root, so X=0 seems fine in that regard?

@praseodym
Contributor

commented Jun 11, 2019

The pod itself might not require root, but the command (kubeadm upgrade) that creates it does require root, so X=0 seems fine in that regard?

It would be, although that would be the same as setting runAsNonRoot: false. Because the pod doesn't need to run as root, I'd set it to something like runAsUser: 999 instead.

@neolit123 neolit123 force-pushed the neolit123:kubeadm-psp-upgrade-fix branch from f50d9a2 to ef84382 Jun 19, 2019
@neolit123
Member Author

commented Jun 19, 2019

Updated to use runAsUser=999.
Also tested locally; the upgrade Pods work fine with this.
/hold cancel

@neolit123
Member Author

commented Jun 19, 2019

/retest

@rosti
Member

left a comment

Thanks @neolit123 !
@praseodym can you verify that this fix actually fixes your issue?
/lgtm
/hold

cmd/kubeadm/app/phases/upgrade/prepull.go (outdated review thread, resolved)
@k8s-ci-robot k8s-ci-robot added the lgtm label Jul 1, 2019
@neolit123 neolit123 force-pushed the neolit123:kubeadm-psp-upgrade-fix branch from ef84382 to 857ccab Jul 1, 2019
@k8s-ci-robot k8s-ci-robot removed the lgtm label Jul 1, 2019
@neolit123
Member Author

commented Jul 1, 2019

Updated to use Int64Ptr.

@neolit123
Member Author

commented Jul 1, 2019

@rosti

@praseodym can you verify that this fix actually fixes your issue?
/hold

Also, this is something we want in general, as we don't need to run the upgrade DS Pods as root.

If the cluster has a PSP that blocks Pods from running as root,
the DS that handles the upgrade prepull will fail to create its Pods.

Work around that by adding a PodSecurityContext with RunAsUser=999.
@neolit123 neolit123 force-pushed the neolit123:kubeadm-psp-upgrade-fix branch from 857ccab to 668d697 Jul 1, 2019
@rosti
Member

left a comment

Thanks @neolit123 !
Ok, let's merge this as is. If the original issue is not entirely fixed that way, we can always send new PRs.
/lgtm
/hold cancel

@k8s-ci-robot k8s-ci-robot added lgtm and removed do-not-merge/hold labels Jul 1, 2019
@fejta-bot

commented Jul 1, 2019

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@neolit123
Member Author

commented Jul 1, 2019

/retest

@fejta-bot

commented Jul 2, 2019

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

1 similar comment from @fejta-bot on Jul 2, 2019.

@k8s-ci-robot k8s-ci-robot merged commit 003c4e5 into kubernetes:master Jul 2, 2019
23 checks passed
cla/linuxfoundation: neolit123 authorized
pull-kubernetes-bazel-build: Job succeeded.
pull-kubernetes-bazel-test: Job succeeded.
pull-kubernetes-conformance-image-test: Skipped.
pull-kubernetes-cross: Skipped.
pull-kubernetes-dependencies: Job succeeded.
pull-kubernetes-e2e-gce: Job succeeded.
pull-kubernetes-e2e-gce-100-performance: Job succeeded.
pull-kubernetes-e2e-gce-csi-serial: Skipped.
pull-kubernetes-e2e-gce-device-plugin-gpu: Job succeeded.
pull-kubernetes-e2e-gce-iscsi: Skipped.
pull-kubernetes-e2e-gce-iscsi-serial: Skipped.
pull-kubernetes-e2e-gce-storage-slow: Skipped.
pull-kubernetes-godeps: Skipped.
pull-kubernetes-integration: Job succeeded.
pull-kubernetes-kubemark-e2e-gce-big: Job succeeded.
pull-kubernetes-local-e2e: Skipped.
pull-kubernetes-node-e2e: Job succeeded.
pull-kubernetes-node-e2e-containerd: Job succeeded.
pull-kubernetes-typecheck: Job succeeded.
pull-kubernetes-verify: Job succeeded.
pull-publishing-bot-validate: Skipped.
tide: In merge pool.
@liggitt liggitt added this to the v1.16 milestone Aug 6, 2019