Restricted Pod E2E tests #109946
Conversation
Sorry, looks like I have a little more work to do on this.
Force-pushed from 2b0b97f to 02c1c0d.
@@ -235,12 +235,12 @@ var _ = SIGDescribe("Pods", func() {
 			Spec: v1.PodSpec{
 				Containers: []v1.Container{
 					{
-						Name:  "nginx",
-						Image: imageutils.GetE2EImage(imageutils.Nginx),
+						Name:  "pause",
does changing from nginx to pause change the lifecycle characteristics of the pod (w.r.t. handling shutdown signals)? same question applies in multiple places
Good question. Both the pause container and nginx handle SIGINT and SIGTERM as a fast shutdown, which I think is what containerd uses. nginx additionally handles SIGQUIT, but I don't see any references to SIGQUIT in k/k, containerd, or cri-o.
Maybe @SergeyKanzhelev or @endocrimes have additional context.
This test doesn't cover any corner cases of shutdown so this seems ok to me 👍
test/e2e/framework/pod/utils.go
Outdated
// GetRestrictedPodSecurityContext returns a minimal restricted pod security context.
func GetRestrictedPodSecurityContext() *v1.PodSecurityContext {
	return &v1.PodSecurityContext{
		RunAsNonRoot: pointer.BoolPtr(true),
		RunAsUser:    pointer.Int64(nonRootUser),
I'm still unsure about this... this isn't really a minimal restricted pod security context, right? Shouldn't `runAsNonRoot: true` be sufficient?
Part of my goal with these methods was that most e2e test authors can just apply `MixinRestrictedPodSecurity` and not think about the restricted profile at all. If we don't set `RunAsUser` here (and in the mixin), then for all cases other than the pause pod the test author will need to manually set a `RunAsUser`. They'll probably pick something random that gets cargo-culted around.
What's your argument against this?
If we don't set RunAsUser here (and in the mixin) then for all cases other than the pause pod the test author will need to manually set a RunAsUser
Would they have to set it, or do you mean because the container image would have to be built to run with a non-root uid, which is uncommon?
What's your argument against this?
It would break images that expect to run with a particular non-zero uid at runtime, right?
I searched through our test image manifests, and the only ones that set a user are:
- pause: user doesn't matter
- nonroot: all the tests using this override the user anyway
- windows containers: don't support Restricted for now (added TODO)
I'd prefer that the tests which require a specific user are explicit about it, rather than having all the tests that don't have an opinion set something arbitrary.
rather than having all the tests that don't have an opinion set something arbitrary
but… isn't this setting something arbitrary (1000) on all tests that don't have an opinion?
Is runAsUser generally needed as boilerplate, though? I thought runAsNonRoot was supposed to be sufficient.
If an e2e wanted to exercise `runAsNonRoot` functionality with an image that did not run as root by default and didn't want to set a `runAsUser`, that would mean they couldn't use `GetRestrictedPodSecurityContext` or `MustMixinRestrictedPodSecurity` to do so, because those wouldn't actually set the minimal securityContext that is compatible with the restricted PSS.
- IMO, if a test is explicitly testing the runAsNonRoot functionality (with or without a runAsUser set), then the test should be explicitly writing & managing the SecurityContext, and not deferring that to a helper.
- There should probably be 1-2 E2E tests explicitly checking runAsNonRoot behavior. I'd prefer to optimize these convenience methods for the common path, and force those couple of tests to write the pod manually.
Ok, fair enough. Make sure the docs don't claim to produce a minimal valid restricted securityContext, and document the uid chosen.
This reasoning makes sense to me 👍 - A comment summarizing this would go a long way for future readers of the framework code here tho.
Updated comments.
/assign
/approve
looks good from a node pov, just needs the comment/doc updates
Added comments about the default non-root user. Also squashed commits & rebased.
/hold cancel
/test pull-kubernetes-e2e-capz-conformance
func GetRestrictedPodSecurityContext() *v1.PodSecurityContext {
	return &v1.PodSecurityContext{
		RunAsNonRoot: pointer.BoolPtr(true),
		RunAsUser:    pointer.Int64(DefaultNonRootUser),
what do windows nodes do if runAsUser is non-nil? do they ignore it or fail?
It is ignored by kubelet:
kubernetes/pkg/kubelet/kuberuntime/security_context_windows.go
Lines 46 to 49 in 68fc207
if effectiveSc.RunAsUser != nil {
	klog.InfoS("Windows container does not support SecurityContext.RunAsUser, please use SecurityContext.WindowsOptions",
		"pod", klog.KObj(pod), "containerName", container.Name)
}
With the PodOS feature enabled and the OS field set (beta in 1.24), the admission controller would reject it:
kubernetes/staging/src/k8s.io/api/core/v1/types.go
Lines 3286 to 3307 in 976a940
// If the OS field is set to windows, following fields must be unset:
// - spec.hostPID
// - spec.hostIPC
// - spec.securityContext.seLinuxOptions
// - spec.securityContext.seccompProfile
// - spec.securityContext.fsGroup
// - spec.securityContext.fsGroupChangePolicy
// - spec.securityContext.sysctls
// - spec.shareProcessNamespace
// - spec.securityContext.runAsUser
// - spec.securityContext.runAsGroup
// - spec.securityContext.supplementalGroups
// - spec.containers[*].securityContext.seLinuxOptions
// - spec.containers[*].securityContext.seccompProfile
// - spec.containers[*].securityContext.capabilities
// - spec.containers[*].securityContext.readOnlyRootFilesystem
// - spec.containers[*].securityContext.privileged
// - spec.containers[*].securityContext.allowPrivilegeEscalation
// - spec.containers[*].securityContext.procMount
// - spec.containers[*].securityContext.runAsUser
// - spec.containers[*].securityContext.runAsGroup
// +optional
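Concretely, under that PodOS validation, a manifest like the following hypothetical example (names and image are illustrative, not from the PR) would be rejected at admission because `spec.securityContext.runAsUser` is set while `spec.os.name` is `windows`:

```yaml
# Hypothetical example: rejected at admission when the PodOS feature is
# enabled, because runAsUser must be unset for os.name: windows.
apiVersion: v1
kind: Pod
metadata:
  name: windows-runasuser-example
spec:
  os:
    name: windows
  securityContext:
    runAsUser: 1000   # must be unset when os.name is windows
  containers:
  - name: pause
    image: k8s.gcr.io/pause:3.7
```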
ok, there's a TODO to avoid setting RunAsUser in MustMixinRestrictedPodSecurity for explicitly windows pods as part of the PodOS work. I just wanted to make sure this wasn't going to fail in a windows kubelet for pods that didn't explicitly indicate their OS.
So, this did break the tests because of #102849
May 26 10:47:49.396: INFO: At 2022-05-26 10:44:57 +0000 UTC - event for pod-update-d046cc17-42c7-493c-8c0c-fea0e9ac3164: {kubelet capz-conf-qwm24} FailedMount: (combined from similar events): MountVolume.SetUp failed for volume "kube-api-access-9l7hz" : chown c:\var\lib\kubelet\pods\0a6b5b55-fc8f-42b4-b285-19aedc311882\volumes\kubernetes.io~projected\kube-api-access-9l7hz\..2022_05_26_10_46_59.624505313\token: not supported by windows
this isn't setting RunAsUserName, but does the windows kubelet have similar problems with RunAsUser?
if so, that seems like a good reason to drop RunAsUser here
Yes, the title is slightly misleading, but it states in the issue: "In addition if a Windows Pod is created with a Projected Volume and RunAsUser set, the Pod will be stuck at ContainerCreating"
ok. @tallclair, that's a good reason not to set RunAsUser here for pods that could run on Windows. @jsturtevant, do you mind opening a PR to drop the RunAsUser field?
linking here for anyone following the thread #110235
/triage accepted
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: endocrimes, liggitt, tallclair
/retest
What type of PR is this?
/kind cleanup
What this PR does / why we need it:
Adds `MixinRestrictedPodSecurity` and corresponding `MustMixinRestrictedPodSecurity` utilities, which set the required fields to pass the restricted pod security checks, and verify that the given pod does in fact meet the requirements. These are complementary to the existing `GetRestricted{Pod,Container}SecurityContext` functions, but in most cases require fewer changes.

Updates the `test/e2e/common/node/pods.go` tests to set the restricted namespace label and use the new utilities.

Does this PR introduce a user-facing change?
/area test
/assign @liggitt @s-urbaniak