-
Notifications
You must be signed in to change notification settings - Fork 29
Add PodSecurityPolicy Support #428
Conversation
@@ -1,4 +1,4 @@ | |||
{% if salt.caasp_pillar.get('addons:dns', False) %} | |||
{% if salt.caasp_pillar.get('addons:dns', True) %} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was defaulting in the static pillar to True - sync'd it up while I noticed it.
4225b2c
to
d84b4fa
Compare
I think this should reference this bugzilla issue |
allowPrivilegeEscalation: true | ||
defaultAllowPrivilegeEscalation: true | ||
# Capabilities | ||
allowedCapabilities: [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a completely open PSP, this setting could be '*'
in case a container is not privileged
but still needs certain capabilities?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
929d811
to
ac6b3fb
Compare
Some background information for capabilities in Kubernetes. There are three fields in the config: DefaultAddCapabilities"The capabilities which are added to containers by default, in addition to the runtime defaults." This means that any capability we specify here will be added on top of what our runtime of choice is adding by default. AllowedCapabilitiesThis is a list of capabilities that will be granted to a container if it requests it. None of them will be added by default but on demand. RequiredDropCapabilitiesAll specified capabilities must be dropped from the container. It is important that no capability in this list is also referenced in the two above ones. Notice: if our runtime of choice adds In the context of capabilities, having (only) three profiles for different security levels is really tough. That's just my 2 cents, but I would prefer to provide profiles tailored for specific use cases and workloads (e.g., web server, database). |
ef71be3
to
36b5e56
Compare
# control anyway as a belt+braces protection. /dev/null may be a better | ||
# option, but the implications of pointing this towards a device are | ||
# unclear. | ||
- pathPrefix: /opt/kubernetes-hostpath-volumes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have to point this at a path, I would point it at one that we know is going to never be used (and is obvious should never be used). Something like /tmp/caasp-kubernetes-locked-down-hostpath
or something? Though I appreciate that's somewhat uglier. But I believe it shouldn't be the same as the one we set for the unprivileged
PSP.
As for pointing it at a device, there theoretically shouldn't be an issue if pathPrefix
handles path targets properly. Having access to /dev/null
on the host is the same as /dev/null
in the container. Though it should be noted that in runc
we re-open /dev/null
once inside a container (though again this then resulted in CVE-2015-3627), making me suspicious about allowing access to the host /dev/null
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we have to point this at a path, I would point it at one that we know is going to never be used (and is obvious should never be used). Something like /tmp/caasp-kubernetes-locked-down-hostpath
Yea - I like the notion of making it clear this is not intended to be useable. I'll make it something like /invalid-kubernetes-hostpaths/
- given the RO root FS, and that name, it will be unusable and obviously unusable - even if the hostPath volume type restriction is somehow bypassed.
As for pointing it at a device, there theoretically shouldn't be an issue if pathPrefix handles path targets properly.
Yea, the special-caseness of host devices made me rethink my initial setting of /dev/null
- I think you're agreeing with me here, that we should avoid referencing the device as it's unclear what that would end up allowing should something, somewhere, have a bug...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(As an aside, I just took a very quick look how pathPrefix
is implemented because the documentation had some very odd wordings and I'm a little concerned. I am going to take a closer look tomorrow, as I must be missing something...)
supplementalGroups: | ||
rule: RunAsAny | ||
fsGroup: | ||
rule: RunAsAny |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we would want these to be RunAsNonRoot
-- but this might break users who (IMO incorrectly) are running containers as root. We don't use user namespaces at the moment, which is why this is a concern. But then we would have to disable allowPrivilegeEscalation
and defaultAllowPrivilegeEscalation
if we wanted the protection to be properly done (which then would break ping
as well).
But as I mention below, I'm not entirely sure who the target is for the unprivileged
PSP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally we would want these to be RunAsNonRoot -- but this might break users who (IMO incorrectly) are running containers as root.
For better or worse (okay - just worse), 99% of containers run as root. We want to provide a reasonable default for "unprivileged containers" where operators won't immediately jump to using the privileged
policy as nothing works.
But as I mention below, I'm not entirely sure who the target is for the unprivileged PSP.
I'll update the PR commit messages to be more clear on this. In short - unprivileged is meant to be a reasonable set of default privileges for the real-world of workloads our users have. For example, 99% of workloads have no need to drop the network/IPC/PID namespacing, so we don't allow that. However, 99% of real world workloads expect to start up as root
, so we allow it (e.g. go look at the Helm charts repo - maybe 1% allow specifying the necessary non-root account details)
allowedCapabilities: [] | ||
defaultAddCapabilities: [] | ||
requiredDropCapabilities: | ||
- ALL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a little bit extreme if you also want to allow users to run as root. Most of the properties of "running as root" are actually capabilities that need to be allowed (such as being able to change the owner of a file).
However there also appears to be a fairly large shortcoming in the capabilities support in PSPs. You cannot whitelist the absolute set of capapbilities, you can only:
- Whitelist a set of capabilities plus the runtime defaults (with
allowedCapabilities
); or - Blacklist some subset of the runtime defaults (with
requiredDropCapabilities
) and whitelist other capabilities (withallowedCapabilities
); or - Add capabilities to the default set (with
defaultAddCapabilities
), combined with any of the above.
There's no way to say "only allow these capabilities I define, drop everything else even if it's a runtime default". This is a shortcoming of Docker's --cap-drop
and --cap-add
interface, but I don't see why this design also applies to CNI...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a little bit extreme if you also want to allow users to run as root. Most of the properties of "running as root" are actually capabilities that need to be allowed (such as being able to change the owner of a file).
This is exactly the sort of info I need :) What are a reasonable set of allowances here?
However there also appears to be a fairly large shortcoming in the capabilities support in PSPs. You cannot whitelist the absolute set of capapbilities, you can only:
Yea, that annoyed me too. How do other runtimes handle this? And - What are the default set docker keeps no matter what?
hostIPC: false | ||
hostNetwork: false | ||
hostPorts: | ||
- min: 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is only for unprivileged users, we might want this to be [1024,65536)
. But maybe I'm misunderstanding what the target audience is for the unprivileged
PSP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really wanted to do that :( - but for example, the pretty common apache/nginx containers from dockerhub go straight to port 80. There is no need for them to do this, especially in the context of Kubernetes, but they do. If we make this only allow unprivileged IP ports - then, most workloads will need to be ran under the "privileged" policy.
Maybe this points to a flaw in my policy naming.
kind: PodSecurityPolicy | ||
metadata: | ||
name: privileged | ||
annotations: {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you need to explicitly specify seccomp.security.alpha.kubernetes.io/allowedProfileNames: '*'
. According to the docs, not including this annotation means the default cannot be changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated to be explicit about this.
Agree - however, we need to provide some generic defaults that suit "most" workloads. Operators are free to add extra policies with more specific restrictions or allowances. I see these policies as existing to provide some defaults which can either be used directly, or better yet - can be used as a template for more specific requirements. |
I've added my comments on the |
1b3eb40
to
d2b90b4
Compare
@@ -67,6 +67,7 @@ kube_log_level: '2' | |||
|
|||
# install the addons (ie, DNS) | |||
addons: | |||
psp: 'true' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the PodSecurityPolicy really a addon? Is it optional? Are we going to expose this flag to the user?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like it to be optional yes, although I don't think we want to expose that in Velum - I just want is as a "advanced" option for anyone who happens to need it disabled for whatever reason.
services-setup: | ||
salt.state: | ||
- tgt: {{ super_master }} | ||
- sls: | ||
- addons | ||
- addons.psp | ||
- cni |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should create a /services
folder and move all these things there...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yea, I agree - we should rename the addons -> services folder. I won't do that in this PR, but I will create a followup once this merges to do the refactor.
Add support for PodSecurityPolicy's, allowing us to disable use of the hostPath volume type. This change adds 2 PSP's: * unprivileged (Default assigned to all users) The unprivileged PodSecurityPolicy is intended to be a reasonable compromise between the reality of Kubernetes workloads, and suse:caasp:psp:privileged. By default, we'll grant this PSP to all users and service accounts. * privileged The privileged PodSecurityPolicy is intended to be given only to trusted workloads. It provides for as few restrictions as possible and should only be assigned to highly trusted users. Fixes bsc#1047535
Add support for PodSecurityPolicy's, allowing us to disable use of the
hostPath volume type.
This change adds 2 PSP's:
The unprivileged PodSecurityPolicy is intended to be a
reasonable compromise between the reality of Kubernetes workloads, and
suse:caasp:psp:privileged. By default, we'll grant this PSP to all
users and service accounts.
The privileged PodSecurityPolicy is intended to be given
only to trusted workloads. It provides for as few restrictions as possible
and should only be assigned to highly trusted users.
Fixes bsc#1047535