Join GitHub today
GitHub is home to over 40 million developers working together to host and review code, manage projects, and build software together.
Sign upKEP for promoting seccomp to GA #1148
Conversation
This comment has been minimized.
This comment has been minimized.
|
/lgtm but leaving it to @liggitt and @derekwaynecarr for the final approval. nit: Per our offline discussion, can you make downgrade workflow more explicit here? |
This comment has been minimized.
This comment has been minimized.
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dchen1107, tallclair The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
|
||
| - Declare seccomp GA | ||
| - Fully document and formally spec the feature support | ||
| - Migrate the annotations to standard API fields |
This comment has been minimized.
This comment has been minimized.
liggitt
Jul 27, 2019
•
Member
suggest using language like "add equivalent API fields" rather than "migrate" (which sounds like we will move data, which we know doesn't work well)
edit: if the migration only occurs in pod objects where the fields are immutable after creation, this can work... the description below should be explicit that no data movement is done in pod template objects embedded in other types
edit 2: you were already explicit about pod templates... nevermind
This comment has been minimized.
This comment has been minimized.
tallclair
Jul 30, 2019
Author
Member
If the pod fields are immutable, is it possible to still set them on update as part of the storage flow? The criteria would be:
- The seccomp annotations are set, and do not change as part of the update.
- The seccomp fields are not set.
I guess the first criteria is problematic, since the seccomp fields could be changed in the update, and then a subsequent update could leave them be, so maybe that doesn't matter.
| type SeccompOptions struct { | ||
| // The seccomp profile to run with. | ||
| SeccompProfile |
This comment has been minimized.
This comment has been minimized.
liggitt
Jul 27, 2019
Member
was this intended to be embedded rather than something like Profile SeccompProfile? what other options do you anticipate?
This comment has been minimized.
This comment has been minimized.
tallclair
Jul 30, 2019
Author
Member
Yes, I modeled it off the Volume type.
The only option that comes to mind is the fallback strategy. Currently, if the seccomp options can't be enforced the node silently ignores them. In contrast, when AppArmor can't be enforced the pod is held in a Blocked state.
From a security perspective, silently failing could leave a potential security hole. From a portability perspective, it's nice to be able to choose the best practices options with a best-effort approach. Hence it might make sense to give the user the choice.
This comment has been minimized.
This comment has been minimized.
liggitt
Dec 9, 2019
Member
what happens for nodes/runtimes that don't enable seccomp is worth documenting, at least
This comment has been minimized.
This comment has been minimized.
liggitt
Dec 9, 2019
Member
maybe just securityContext.seccompProfile instead of embedding/nesting (can prefix other seccomp fields with seccomp if they are needed in the future
| } | ||
| // Only one profile source may be set. | ||
| // +union |
This comment has been minimized.
This comment has been minimized.
liggitt
Jul 27, 2019
Member
cc @apelisse ... don't we want to add a discriminator here? what's the current approach to doing that?
This comment has been minimized.
This comment has been minimized.
tallclair
Jul 30, 2019
Author
Member
Are discriminators required if the structure is immutable? I guess we'd still need it for pod templates...
This comment has been minimized.
This comment has been minimized.
tallclair
Jul 30, 2019
•
Author
Member
Does the discriminator field need to be a 1:1 mapping with union fields? If not, the API could look like this:
type SeccompProfile struct {
// +unionDiscriminator
// +optional
Type *SeccompProfileType
// +optional
LocalhostProfile *string
}
type SeccompProfileType string
const (
SeccompProfileUnconfined SeccompProfileType = "Unconfined"
SeccompProfileRuntimeDefault SeccompProfileType = "RuntimeDefault"
SeccompProfileLocalhost SeccompProfileType = "Localhost"
)
In other words, it seems redundant to have a the type indicate a boolean value that must be set to true. In those cases, the discriminator should be sufficient to indicate the option.
This comment has been minimized.
This comment has been minimized.
liggitt
Jul 30, 2019
Member
Does the discriminator field need to be a 1:1 mapping with union fields?
No, a union type without an associated member field is permitted
| type SeccompProfileSet struct { | ||
| // Whether the unconfined profile is included in this set. | ||
| // +optional | ||
| Unconfined *bool |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
tallclair
Jul 30, 2019
Author
Member
Probably not. Maybe this should just be required? Is it permissible to have an optional non-pointer bool if the false & nil are equivalent? I suppose a reason to have nil is to allow for defaulting in mutating admission
| // Load a profile defined in static file on the node. | ||
| // The profile must be preconfigured on the node to work. | ||
| // +optional | ||
| LocalhostProfile *string |
This comment has been minimized.
This comment has been minimized.
liggitt
Jul 30, 2019
Member
do the current annotations contain sufficient detail to know whether they should convert to a runtime or localhost profile?
This comment has been minimized.
This comment has been minimized.
tallclair
Jul 30, 2019
•
Author
Member
Yes, currently the legal values for the annotations are:
unconfined # the real default
docker/default # equivalent to runtime/default
runtime/default
localhost/<path> # where path is relative to the node's configured seccomp root
This comment has been minimized.
This comment has been minimized.
liggitt
Dec 9, 2019
Member
spec out how annotation -> field conversion would work in the docker/default case (and how we'd validate annotation and field matched in that case... probably a special case)
This comment has been minimized.
This comment has been minimized.
liggitt
Dec 9, 2019
Member
would we use API defaulting to set RuntimeProfile to "default" if type is set to "Runtime"? if so, would we clear RuntimeProfile on update if Type was changed to something other than "Runtime"?
This comment has been minimized.
This comment has been minimized.
|
Note to reviewers: this PR is for 1.17 - please prioritize reviewing 1.16 PRs over this. |
| // The allowed localhostProfiles. Values may end in '*' to include all | ||
| // localhostProfiles with a prefix. | ||
| // +optional | ||
| LocalhostProfiles []string |
This comment has been minimized.
This comment has been minimized.
AkihiroSuda
Jul 30, 2019
Can we use Config object for storing profiles across the nodes? Can be another KEP though.
This comment has been minimized.
This comment has been minimized.
tallclair
Aug 5, 2019
Author
Member
I definitely agree this would be useful, but implementing it is a big project that I don't want to block GA on. See non-goals:
- Providing mechanisms for loading profiles from outside the static seccomp node directory
| All API skew is resolved in the API server. New Kubelets will only use the seccomp values specified | ||
| in the fields, and ignore the annotations. | ||
|
|
||
| #### Pod Creation |
This comment has been minimized.
This comment has been minimized.
AkihiroSuda
Jul 30, 2019
What is the expected behavior if the CRI runtime doesn't support seccomp? The pod should be created without seccomp (as in the current alpha annotation) or should result in an error?
This comment has been minimized.
This comment has been minimized.
tallclair
Aug 5, 2019
Author
Member
I think we need to maintain the backwards compatible behavior of ignoring it (I'll add a line to be explicit about this). Going forward, I left space in the API for a FailureStrategy which could dictate how to handle that situation (e.g. some applications may depend on seccomp application to run securely, while others might accept a best-effort best-practices approach).
This comment has been minimized.
This comment has been minimized.
justincormack
Oct 30, 2019
No application should depend on a policy; a policy can be applied as a non privileged operation by any code, so if it requires it then it can just apply it.
| harder to make some of the enhancements listed under [Non-Goals](#non-goals). Since the current | ||
| behavior is unguarded, I think we already need to treat the behavior as GA (which is why it's been | ||
| so hard to change the default profile), so I do not think these changes will actually increase the | ||
| friction. |
This comment has been minimized.
This comment has been minimized.
mayakacz
Aug 1, 2019
This is my biggest concern, but I think the arguments you've laid out are correct.
This is a necessary step to get to the good stuff :)
Thanks so much Tim!
This comment has been minimized.
This comment has been minimized.
mrunalp
commented
Aug 6, 2019
|
This looks fine for promoting what we have right now. However, I think that we would want to follow up with owning/defining seccomp policies by k8s better for workload portability as well as handling of failure scenarios. |
This comment has been minimized.
This comment has been minimized.
|
Yes, sorry, this is targeting v1.17 |
This comment has been minimized.
This comment has been minimized.
|
Updated the API with the union descriminator ( PTAL (still hoping to land this in v1.17) |
This comment has been minimized.
This comment has been minimized.
|
Looks like this will probably slip to v1.18, unless an API reviewer has time to approve it. (But prioritize #1243 over this). |
This comment has been minimized.
This comment has been minimized.
|
We discussed this at SIG node meeting today and decided to punt this to v1.18. Thanks! |
This comment has been minimized.
This comment has been minimized.
|
I'd like to pick this back up in v1.18. Who is willing to do the API review? |
This comment has been minimized.
This comment has been minimized.
I can take it, though I'll need to page context back in. Can you set up an hour to go over it? |
| fields are enforced even if the node version trails the API version. Since [we | ||
| support](https://kubernetes.io/docs/setup/release/version-skew-policy/) up to 2 minor releases of | ||
| version skew between the master and node, this behavior will be removed in version N+2 (where N is | ||
| the release with the upgraded seccomp support). |
This comment has been minimized.
This comment has been minimized.
liggitt
Dec 9, 2019
Member
update this to say "when the oldest supported kubelet is 1.18", since we're thinking through extending support to 4 releases
This comment has been minimized.
This comment has been minimized.
tallclair
Jan 17, 2020
Author
Member
Are we also considering extending version-skew support to 4 releases? Or just that we would carry patches for 4 releases?
This comment has been minimized.
This comment has been minimized.
liggitt
Jan 17, 2020
Member
we're considering the latter. the current kubelet/apiserver skew was set so the oldest kubelet would work with the newest apiserver, to avoid needing to do multiple nodepool recreations in a single cluster upgrade cycle
This comment has been minimized.
This comment has been minimized.
tallclair
Jan 17, 2020
Author
Member
Actually, I think we should just keep this for at least 4 versions. We don't need to rush to remove the reconciliation.
|
|
||
| #### Pod Update | ||
|
|
||
| The seccomp fields on a pod are immutable. |
This comment has been minimized.
This comment has been minimized.
liggitt
Dec 9, 2019
Member
This is not specific to this field, but we need to consider the case of an update from an old client that drops the new spec field. I think for this (and for other immutable fields), we should probably preserve the existing field value if an update attempts to clear it (since that could be coming from a client unaware it needs to round-trip this new field)
This comment has been minimized.
This comment has been minimized.
tallclair
Jan 17, 2020
Author
Member
I filed kubernetes/kubernetes#87301 to track this. It's not clear to me what the right way of handling this is. One argument for special-casing it for seccomp is that the field may be set by something the client does know about, the annotation. So there could be a situation where:
- Client creates pod with seccomp annotation
- Seccomp attempts to update the pod
- Update is rejected for dropping the seccomp field
| that case, the PSP enforcement should check both values as the effective value depends on the node | ||
| version running the pod. | ||
|
|
||
| #### PodTemplates |
This comment has been minimized.
This comment has been minimized.
liggitt
Dec 9, 2019
Member
consider how cluster operators will know they have things that need updating. a metric that fires on any podtemplate object with seccomp annotations, and (once we drop pod field -> annotation defaulting) any pod with a seccomp annotation might be helpful for tracking this
This comment has been minimized.
This comment has been minimized.
tallclair
Jan 17, 2020
Author
Member
Good idea. It could be any combination of:
- Metric counter
- Log line
- Event (on the controller)
- Warning annotation (on the controller or created pod)
A metric seems too hidden, if you don't know to look for it. I like the event, except it will disappear if you don't notice it right away. I'm leaning towards an annotation on both the controller object and the pod:
warning.kubernetes.io/seccomp: "Seccomp set through annotations. Support for setting seccomp through annotations will be dropped in v1.X"
We may also want to consider forbidding setting the seccomp annotation after we drop support, so that pods aren't being run unconfined unexpectedly. That would be more likely to break users, but follows the secure failure principle.
|
|
||
| If no seccomp annotations or fields are specified, no action is necessary. | ||
|
|
||
| If _only_ seccomp fields are specified, add the corresponding annotations. This ensures that the |
This comment has been minimized.
This comment has been minimized.
liggitt
Dec 9, 2019
Member
the proposed field is more expressive than the annotation (allows runtime profiles other than "default")... need to resolve if/how those would be converted to annotations that would not cause 1.17-level API servers to choke in validation
| // Use a predefined profile defined by the runtime. | ||
| // Most runtimes only support "default" | ||
| // +optional | ||
| RuntimeProfile *string |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
tallclair
Jan 17, 2020
Author
Member
As I go through this, I'm second guessing this field. AFAIK there aren't any runtimes that support multiple built in profiles, and this feature has never been requested.
If we dropped this field for now (just assume it's runtime/default), how bad would it be to add it back at some point in the future if it was needed? As long as we defaulted the new field to "default" and it's immutable, it doesn't seem like it would be that problematic? Or am I forgetting something?
| // Load a profile defined in static file on the node. | ||
| // The profile must be preconfigured on the node to work. | ||
| // +optional | ||
| LocalhostProfile *string |
This comment has been minimized.
This comment has been minimized.
tallclair
Jan 17, 2020
Author
Member
In the design review, we discussed not wanting to add the LocalhostProfile field if the intention is to deprecate and eventually remove that feature. Here's an alternative proposal for how to handle localhost:
Drop the LocalhostProfile *string field. Keep the SeccompProfileLocalhost SeccompProfileType, but optionally change its value to LocalhostDeprecated.
When creating a pod, the profile type can only be set to "Localhost" if the annotation is also set to localhost. When the kubelet goes to enforce the localhost profile, it fetches the path from the annotation.
The one gotcha is how to handle annotation update in this case. I'm tempted to say allow the update, and if the kubelet goes to enforce it and the annotation isn't set to localhost anymore, just treat it as an invalid localhost path (fail the pod).
This comment has been minimized.
This comment has been minimized.
|
@tallclair: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
This comment has been minimized.
This comment has been minimized.
|
Sorry it took so long to iterate on this. It is not currently a top priority for me, and may not make the v1.18 cutoff if it needs much more work. There are 3 open issues:
|
tallclair commentedJul 17, 2019
This is a proposal to upgrade the seccomp annotation on pods & pod security policies to a field, and
mark the feature as GA. This proposal aims to do the bare minimum to clean up the feature, without
blocking future enhancements.
/sig-node
/sig-auth
/priority important-longterm
/assign @liggitt @dchen1107 @derekwaynecarr
/cc @jessfraz