
PSP Replacement KEP #2582

Merged — 37 commits merged into kubernetes:master on May 12, 2021

Conversation

@tallclair (Member) commented Mar 22, 2021

This KEP proposes a new policy mechanism to replace the use cases covered by PodSecurityPolicy.

This is an adaptation of the initial proposal that's been under discussion by members of sig-auth and sig-security, here: https://docs.google.com/document/d/1dpfDF3Dk4HhbQe74AyCpzUYMjp4ZhiEgGXSMpVWLlqQ/edit?usp=sharing

Most of the content is copied over from that doc, with the following additions:

  • motivation & goals sections (copied from the PSP replacement goals doc)
  • admission configuration section - This is mostly just spec'ing out what we were already discussing
  • risks & mitigations - highlighting some concerns that have been raised about this approach.
  • updates - discuss how pod updates will be handled in more detail, this section needs review
  • test plan
  • monitoring - needs more input

@liggitt recorded a demo of this proposal, which you can find here: https://youtu.be/SRg_apFQaHE

Enhancement issue: #2579


Outstanding unresolved sections:

Alpha blockers:

  • The name of this feature
  • The mode labels (currently allow enforce, warning, audit)
  • SELinux policy (close to resolution)
  • Capabilities policy
  • Monitoring (needs input from @kubernetes/sig-instrumentation-approvers )
  • CSI inline volume restrictions
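For reference, the namespace labels under discussion would look something like the sketch below. The `pod-security.kubernetes.io` prefix and the per-mode label/version pairs match what is referenced elsewhere in this thread; the mode names and level values shown are illustrative, since their final naming was still an open question at the time of this comment.

```yaml
# Sketch: per-mode pod security labels applied to a namespace.
# Mode names (enforce/warn/audit) and levels (baseline/restricted)
# are illustrative — both were still under discussion.
apiVersion: v1
kind: Namespace
metadata:
  name: my-app
  labels:
    pod-security.kubernetes.io/enforce: baseline
    pod-security.kubernetes.io/enforce-version: v1.22
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```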

Implementation-time decisions:

  • Timeout & limit on namespace update warnings
  • Metric version label cardinality

Beta blockers:

  • Deprecation / removal policy for old profile versions
  • Ephemeral containers handling
  • Windows pod handling
  • PSP migration workflow & support

Required approvals:

@tallclair tallclair added sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/security Categorizes an issue or PR as relevant to SIG Security. labels Mar 22, 2021
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 22, 2021
@k8s-ci-robot k8s-ci-robot added kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Mar 22, 2021

Note that policies are not guaranteed to be backwards compatible, and a newer restricted policy
could require setting a field that doesn't exist in the current API version.
Contributor

This seems like an option for a perma-break. However, I don't see a practical way of knowing the version of a cluster. I don't believe we force behavior for /version for conformance purposes.

@liggitt (Member) commented Mar 24, 2021

discussed this in the 3/24 breakout meeting without arriving at a specific conclusion. thought about various mechanisms for communicating the version of the cluster to the webhook:

  • check /version (unclear that is guaranteed to return the kubernetes version)
  • configure it in the webhook manifest (only works if you remember to keep the webhook in sync, makes skew during upgrade hard, only works if a single API server is talking to the webhook)
  • configure it in the webhook invocation (e.g. webhook path)
  • add server Kubernetes version into admission review

Member Author

I thought that we did resolve this? The conclusion I drew from our discussion was that under the webhook implementation, policy versions would be tied to the webhook version, not the cluster version.

Member

That's certainly the easiest, implementation-wise, but makes for tricky ordering on upgrade... if you upgrade the webhook first, then its latest restricted policy can start requiring fields to be set that the calling server might not be capable of setting yet.

I think the webhook implementation is likely secondary, so I'm ok saying that the webhook library version determines the meaning of latest, but we need to clearly document the expected order on upgrades between API server and webhook.

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels May 11, 2021
@enj (Member) left a comment

Minor comments.

/lgtm


The following audit annotations will be added:

1. `pod-security.kubernetes.io/enforce-policy = <policy_level>:<resolved_version>` Record which policy was evaluated
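As a concrete illustration of the combined value format: for a namespace enforcing the restricted policy pinned to v1.22 (hypothetical values), the annotation on an audit event might read:

```yaml
# Hypothetical audit-event annotation: policy level and resolved
# version are combined ("munged") into a single value.
annotations:
  pod-security.kubernetes.io/enforce-policy: "restricted:v1.22"
```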
Member

Curious why level and version are munged into one key instead of two.

Member Author

It cuts down on the size & verbosity of the logs. It's useful to keep them separate on labels because of the limitations of label selectors, but most log processors that I've seen can handle separating these values if need be. Is there a reason you'd want to see them separated?

Member

This just seemed inconsistent with the labels used on namespaces.


_Blocking for Beta._

How long will old profiles be kept for? What is the removal policy?
Member

+1 on keeping forever.

Comment on lines +168 to +173
- Using labels enables various workflows around policy management through kubectl, for example
issuing queries like `kubectl get namespaces -l
pod-security.kubernetes.io/enforce-version!=v1.22` to find namespaces where the enforcing
policy isn't pinned to the most recent version.
- Keeping the options on namespaces allows atomic create-and-set-policy, as opposed to creating a
namespace and then creating a second object inside the namespace.
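The atomic create-and-set-policy point above can be sketched as a single manifest: because the labels are part of the Namespace object itself, they exist from the moment the namespace does, with no window where pods could be admitted unpoliced. Namespace name and levels below are illustrative.

```yaml
# Sketch: policy labels set atomically at namespace creation
# (e.g. via `kubectl apply -f`), rather than namespace-then-policy-object.
apiVersion: v1
kind: Namespace
metadata:
  name: team-a   # illustrative
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: v1.22
```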
Member

If we have a separate cluster scoped object called PodSecurity where the name of object matches the name of the namespace, one interesting property is that authorization to write said PodSecurity is distinct from write on namespace.

That being said, I like the flexibility and UX of the label based approach.

So should we define extra authorization (SAR checks with virtual verbs similar to PSP's use) checks needed to set these labels and enforce them in admission?

Member Author

So should we define extra authorization (SAR checks with virtual verbs similar to PSP's use) checks needed to set these labels and enforce them in admission?

I think there is a use case for generic label policy, and I'd be interested in a proposal for it, but IMO we shouldn't implement something special for pod security.

Member Author

If we have a separate cluster scoped object called PodSecurity where the name of object matches the name of the namespace, one interesting property is that authorization to write said PodSecurity is distinct from write on namespace.

I feel like we discussed this before, but I can't remember if there were any concerns aside from losing out on the label-selector UX. If we went this route, we'd probably want to add a custom kubectl command for it, but it would need to be fairly complicated to cover all the use cases we'd get for free with the label selector.

Member

If we have a separate cluster scoped object called PodSecurity where the name of object matches the name of the namespace, one interesting property is that authorization to write said PodSecurity is distinct from write on namespace.

I feel like we discussed this before, but I can't remember if there were any concerns aside from losing out on the label-selector UX. If we went this route, we'd probably want to add a custom kubectl command for it, but it would need to be fairly complicated to cover all the use cases we'd get for free with the label selector.

Note that this is mostly a thought exercise about what we lose by using label selectors instead of a distinct object.

Member

So should we define extra authorization (SAR checks with virtual verbs similar to PSP's use) checks needed to set these labels and enforce them in admission?

I think there is a use case for generic label policy, and I'd be interested in a proposal for it, but IMO we shouldn't implement something special for pod security.

I do not know if I buy the "let us wait for the generic label policy" approach, especially since we want the pod security stuff to mostly be static. Having a one-off approach for pod security that encodes some well known permissions for the pod security labels could provide significant value and safety to this feature, especially in environments where users are allowed to provision namespaces.

Comment on lines +743 to +744
We are targeting GA in v1.24 to allow for migration off PodSecurityPolicy before it is removed in
v1.25.
Member

While I understand the rationale, this seems aggressive for exactly the wrong reason. 😞

Member Author

Yeah. We shouldn't rush it just for the sake of rushing it. If there are red flags, we'll hold it back. PSP is only beta, so I don't think it would be terrible if this was still beta in v1.24 (although I'd really like to get it past alpha).

SIG Auth Old automation moved this from In Review (v1.22) to In Progress (v1.22) May 11, 2021
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 11, 2021
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 11, 2021
@tabbysable (Member)

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 12, 2021
@IanColdwater (Contributor)

/approve 🎉

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: BenTheElder, deads2k, enj, IanColdwater, tallclair

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@sftim (Contributor) commented May 12, 2021

Once this is merged, do we want to go back and tweak the blog article about it?

@jsturtevant (Contributor)

/lgtm

@tallclair (Member Author)

That's a wrap!

/remove-hold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 12, 2021
@k8s-ci-robot k8s-ci-robot merged commit 7712007 into kubernetes:master May 12, 2021
SIG Auth Old automation moved this from In Progress (v1.22) to Closed / Done May 12, 2021
vadorovsky added a commit to lockc-project/lockc that referenced this pull request Jun 29, 2021
This change adds the runc wrapper which has a purpose of applying
policies set through Kubernetes namespaces. For now, it uses the
namespace labels proposed in the PSP Replacement KEP[0].

The runc wrapper checks whether the container which is about to be
started was scheduled by kubelet (by checking runc annotations set by
kubelet / runtime servers). If some non-default policy is set, that
policy is written for the particular container to the BPF map, so that
BPF programs can be aware of that policy.

For now, there is no way of proving that those annotations really come
from Kubernetes. Coming up with some sane way of securely proving that
will be something to implement in a follow up work.

[0] kubernetes/enhancements#2582

Signed-off-by: Michal Rostecki <mrostecki@opensuse.org>