
[PodSecurity] Aggregate identical warnings for multiple pods in a namespace #103213

Closed · 2 tasks · Tracked by #103192
tallclair opened this issue Jun 25, 2021 · 7 comments · Fixed by #105889
Assignees
Labels
kind/feature: Categorizes issue or PR as related to a new feature.
sig/auth: Categorizes an issue or PR as relevant to SIG Auth.
triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Comments

@tallclair
Member

tallclair commented Jun 25, 2021

  • Aggregate identical warnings for multiple pods in a namespace
  • Test coverage of aggregation
@k8s-ci-robot added the needs-sig and needs-triage labels Jun 25, 2021
@tallclair added the sig/auth label Jun 25, 2021
@k8s-ci-robot removed the needs-sig label Jun 25, 2021
@liggitt added the kind/feature and triage/accepted labels Jun 27, 2021
@k8s-ci-robot removed the needs-triage label Jun 27, 2021
@enj added this to Needs Triage in SIG Auth Old Jul 6, 2021
@njuptlzf
Contributor

njuptlzf commented Jul 8, 2021

I'll first test the current behavior before making changes.
/assign

@njuptlzf
Contributor

njuptlzf commented Jul 8, 2021

If a pod has multiple causes of errors, for example:

[
noruntimeclasspod: message, message2
runtimeclass1pod: message, message2
runtimeclass2pod: message, message2
runtimeclass3pod: message1, message2
runtimeclass4pod: message1, message2
runtimeclass5pod: message1
runtimeclass6pod: message2
]

I have two thoughts:

  1. Treat the pod's full set of error causes as a single key:
[(message, message2): [noruntimeclasspod, runtimeclass1pod, runtimeclass2pod],
(message1, message2): [runtimeclass3pod, runtimeclass4pod],
(message1): [runtimeclass5pod],
(message2): [runtimeclass6pod]]
  2. Use each individual error cause as a key and associate the related pods with it:
[(message): [noruntimeclasspod, runtimeclass1pod, runtimeclass2pod],
(message1): [runtimeclass3pod, runtimeclass4pod, runtimeclass5pod],
(message2): [noruntimeclasspod, runtimeclass1pod, runtimeclass2pod, runtimeclass3pod, runtimeclass4pod, runtimeclass6pod]]

With thought 1 the warning messages are repeated a lot; with thought 2 the pod names are repeated a lot.

Based on the requirements, I would lean toward thought 2.
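
For concreteness, here is a minimal Go sketch of thought 2, assuming the per-pod warnings have already been collected into a map. The pod names, messages, and the aggregateByMessage helper are illustrative only and are not taken from the actual PodSecurity admission code.

```go
package main

import (
	"fmt"
	"sort"
)

// aggregateByMessage sketches "thought 2": every warning message becomes a
// key, mapped to the names of all pods that triggered it.
func aggregateByMessage(podWarnings map[string][]string) map[string][]string {
	byMessage := map[string][]string{}
	for pod, messages := range podWarnings {
		for _, msg := range messages {
			byMessage[msg] = append(byMessage[msg], pod)
		}
	}
	for _, pods := range byMessage {
		sort.Strings(pods) // stable output
	}
	return byMessage
}

func main() {
	// Hypothetical input, mirroring part of the example above.
	podWarnings := map[string][]string{
		"noruntimeclasspod": {"message", "message2"},
		"runtimeclass3pod":  {"message1", "message2"},
		"runtimeclass5pod":  {"message1"},
	}
	for msg, pods := range aggregateByMessage(podWarnings) {
		fmt.Printf("(%s): %v\n", msg, pods)
	}
}
```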

@tallclair
Member Author

@njuptlzf I prefer the second approach you propose as well.

The only caveat: for a namespace with a small number of pods (one in the extreme case), this approach would actually end up being much more verbose than the un-aggregated case. One option would be to set a threshold for the number of pods with errors and only aggregate when that threshold is exceeded. However, having two different potential output formats could be annoying if you're trying to do any automation on the response. @liggitt WDYT?

@liggitt
Member

liggitt commented Jul 27, 2021

Thinking through the workflow of someone dealing with the warnings across multiple pods... to fix warnings, they still have to visit the individual pod or workload definitions and make appropriate changes. To make that easy, I still think the pod should be the primary unit of organization, not the individual messages.

The main reason I think we should aggregate pods with identical warnings is that it is common to have effectively identical pods created from the same workload controller. If I have a replicaset with 100 pods, I'd rather see:

myrs1-pod-hg2f32hg (and 99 other pods): <... all warnings about myrs1-pod-hg2f32hg ...>

Taking the example above:

[
noruntimeclasspod: message, message2
runtimeclass1pod: message, message2
runtimeclass2pod: message, message2
runtimeclass3pod: message1, message2
runtimeclass4pod: message1, message2
runtimeclass5pod: message1
runtimeclass6pod: message2
]

it's easier to figure out everything I need to do to fix specific pods/workloads if we aggregate to this format:

noruntimeclasspod (and 2 other pods): message, message2
runtimeclass3pod (and 1 other pod): message1, message2
runtimeclass5pod: message1
runtimeclass6pod: message2

I can then fix those four pods/workloads completely, then rerun the dry-run to check my work. If I modified root workload definitions that affected other elided pods, great! If some of the elided pods just happened to have identical warnings but were not controlled by the same root workload definitions, then they'll be surfaced by name when I recheck.
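
As a rough illustration of this format, here is a hedged Go sketch that groups pods sharing an identical warning set and reports one pod name plus a count of the elided ones. The aggregateByWarningSet helper and the exact output wording are assumptions for illustration, not the output format ultimately implemented in #105889.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// aggregateByWarningSet groups pods that share an identical set of warnings
// and reports one pod by name plus a count of the elided ones.
func aggregateByWarningSet(podWarnings map[string][]string) []string {
	groups := map[string][]string{} // sorted, joined warnings -> pod names
	for pod, messages := range podWarnings {
		msgs := append([]string(nil), messages...)
		sort.Strings(msgs)
		key := strings.Join(msgs, ", ")
		groups[key] = append(groups[key], pod)
	}

	var lines []string
	for warnings, pods := range groups {
		sort.Strings(pods)
		switch extra := len(pods) - 1; {
		case extra == 0:
			lines = append(lines, fmt.Sprintf("%s: %s", pods[0], warnings))
		case extra == 1:
			lines = append(lines, fmt.Sprintf("%s (and 1 other pod): %s", pods[0], warnings))
		default:
			lines = append(lines, fmt.Sprintf("%s (and %d other pods): %s", pods[0], extra, warnings))
		}
	}
	sort.Strings(lines)
	return lines
}

func main() {
	// Hypothetical input matching the example in this thread.
	podWarnings := map[string][]string{
		"noruntimeclasspod": {"message", "message2"},
		"runtimeclass1pod":  {"message", "message2"},
		"runtimeclass2pod":  {"message", "message2"},
		"runtimeclass3pod":  {"message1", "message2"},
		"runtimeclass4pod":  {"message1", "message2"},
		"runtimeclass5pod":  {"message1"},
		"runtimeclass6pod":  {"message2"},
	}
	for _, line := range aggregateByWarningSet(podWarnings) {
		fmt.Println(line)
	}
}
```

With that input the sketch prints the four aggregated lines shown above, one per distinct warning set.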

@liggitt moved this from Beta to In Progress in SIG-Auth: PodSecurity Jul 27, 2021
@njuptlzf
Contributor

njuptlzf commented Jul 28, 2021

Okay, let me revise the logic and the unit tests.

@tallclair
Member Author

@liggitt that makes sense, but we might want to key off the controller in addition to the error messages. E.g.

mydeploy-pod-hg2f32hg (and 99 others from apps/v1 ReplicaSet mydeploy-f2g3gc): message, message1
mydeploy2-pod-3f42hu (and 30 others from apps/v1 ReplicaSet mydeploy2-f2234u): message, message1

or is that getting too verbose?
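
As a hedged sketch of what that extended key might look like, the following Go type is hypothetical (it does not exist in the PodSecurity code); in practice the owner fields would come from the pod's controller ownerReference, and pods would be grouped in a map keyed by this struct so the elided-pod count stays scoped to a single workload.

```go
package podsecurity // hypothetical package, for illustration only

// groupKey is a hypothetical aggregation key that includes the controlling
// workload in addition to the warning set, so "(and N others)" counts are
// scoped to one controller rather than to the whole namespace.
type groupKey struct {
	ownerKind string // e.g. "apps/v1 ReplicaSet", from the controller ownerReference
	ownerName string // e.g. "mydeploy-f2g3gc"
	warnings  string // sorted, comma-joined warning messages
}
```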

@liggitt
Member

liggitt commented Sep 7, 2021

hmm... I don't feel super-strongly either way, but that wouldn't necessarily point you at the object you'd actually need to edit to fix the issue... in the case of a deployment, you'd want to edit the deployment, but the pod ownerRefs would be pointing at the intermediate replicaset.

@enj moved this from Needs Triage to In Progress in SIG Auth Old Sep 27, 2021
@liggitt moved this from In Progress to Done (1.23, Beta) in SIG-Auth: PodSecurity Oct 26, 2021
SIG Auth Old automation moved this from In Progress to Closed / Done Oct 26, 2021