OADP-290 Bug: The velero-privileged SCC is causing the CIS benchmark to fail #576

Closed
1 task done
alexisph opened this issue Feb 23, 2022 · 14 comments · Fixed by #860
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.
Milestone

Comments

@alexisph

alexisph commented Feb 23, 2022

Contact Details

No response

Describe bug

As per the title, the velero-privileged SCC is causing the CIS benchmark to fail.

The Red Hat OADP operator is installed in our disconnected v4.8 cluster. The Compliance operator is checking compliance to the CIS Red Hat OpenShift Container Platform 4 Benchmark V1.1 and this rule is failing:

Limit Container Capabilities
Rule ID: xccdf_org.ssgproject.content_rule_scc_limit_container_allowed_capabilities
Result: fail
Severity: medium
Description: Containers should not enable more capabilities than needed as this opens the door for malicious use. To enable only the required capabilities, the appropriate Security Context Constraints (SCCs) should set capabilities as a list in allowedCapabilities.
Rationale: By default, containers run with a default set of capabilities as assigned by the Container Runtime which can include dangerous or highly privileged capabilities. Capabilities should be dropped unless absolutely critical for the container to run software as added capabilities that are not required allow for malicious containers or attackers.

Verification:

oc get scc -o json | jq '[.items[] | select(.metadata.name != "privileged") | .metadata.name, .allowedCapabilities]'

[
  "anyuid",
  null,
  "hostaccess",
  null,
  "hostmount-anyuid",
  null,
  "hostnetwork",
  null,
  "log-collector-scc",
  null,
  "machine-api-termination-handler",
  null,
  "node-exporter",
  null,
  "nonroot",
  null,
  "restricted",
  null,
  "velero-privileged",
  [
    "*"
  ]
]

The policy rule is checking the output of this command:

oc get scc -o json | jq '[.items[] | select(.metadata.name != "privileged")] | map(.allowedCapabilities == null)'

[
  true,
  true,
  true,
  true,
  true,
  true,
  true,
  true,
  true,
  false
]

The velero-privileged SCC was created by the OADP operator. It sets a value in allowedCapabilities (currently "*") and thus the policy rule is failing. Please consider using the privileged SCC instead.
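
The rule logic described above can be reproduced offline. The following is a minimal Python sketch, not the Compliance operator's implementation: it applies the same test as the jq expression (skip the SCC named privileged, then require allowedCapabilities to be null) to sample data. The function name `failing_sccs`, the `exempt` parameter, and the trimmed `SAMPLE` document are assumptions made for illustration; real `oc get scc -o json` output carries many more fields.

```python
import json

# Sample data shaped like `oc get scc -o json`, trimmed to the two fields the
# rule reads. This is illustrative input, not output captured from a cluster.
SAMPLE = json.loads("""
{"items": [
  {"metadata": {"name": "anyuid"}, "allowedCapabilities": null},
  {"metadata": {"name": "privileged"}, "allowedCapabilities": ["*"]},
  {"metadata": {"name": "restricted"}, "allowedCapabilities": null},
  {"metadata": {"name": "velero-privileged"}, "allowedCapabilities": ["*"]}
]}
""")


def failing_sccs(scc_list, exempt=("privileged",)):
    """Return names of SCCs whose allowedCapabilities is not null.

    SCCs named in `exempt` are skipped, mirroring the
    select(.metadata.name != "privileged") filter in the jq expression.
    """
    return [
        item["metadata"]["name"]
        for item in scc_list["items"]
        if item["metadata"]["name"] not in exempt
        and item.get("allowedCapabilities") is not None
    ]


if __name__ == "__main__":
    print(failing_sccs(SAMPLE))
```

Because the test is "not null" rather than "no wildcard", any SCC other than privileged that sets allowedCapabilities is reported, whatever the list contains.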

Thanks in advance.

What happened?

The CIS benchmark failed because the SCC created by the operator sets a value in allowedCapabilities.
I expected the OADP operator not to create an SCC that causes the benchmark to fail.

OADP Version

v1.0.0 (Stable Red Hat operator)

OpenShift Version

4.8

Velero pod logs

No response

Restic pod logs

No response

Operator pod logs

2022-02-23T10:33:02.565Z        DEBUG   events  Normal  {"object": {"kind":"SecurityContextConstraints","name":"velero-privileged","uid":"e8b64dd3-1870-449f-b395-df69d4b4bac7","apiVersion":"security.openshift.io/v1","resourceVersion":"290547648"}, "reason": "VeleroSecurityContextConstraintsReconciled", "message": "performed created on velero scc velero-privileged"}
E0223 10:33:02.567037       1 event.go:264] Server rejected event '&v1.Event{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ObjectMeta:v1.ObjectMeta{Name:"velero-privileged.16d663be3cbc23de", GenerateName:"", Namespace:"default", SelfLink:"", UID:"", ResourceVersion:"", Generation:0, CreationTimestamp:v1.Time{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, DeletionTimestamp:(*v1.Time)(nil), DeletionGracePeriodSeconds:(*int64)(nil), Labels:map[string]string(nil), Annotations:map[string]string(nil), OwnerReferences:[]v1.OwnerReference(nil), Finalizers:[]string(nil), ClusterName:"", ManagedFields:[]v1.ManagedFieldsEntry(nil)}, InvolvedObject:v1.ObjectReference{Kind:"SecurityContextConstraints", Namespace:"", Name:"velero-privileged", UID:"e8b64dd3-1870-449f-b395-df69d4b4bac7", APIVersion:"security.openshift.io/v1", ResourceVersion:"290547648", FieldPath:""}, Reason:"VeleroSecurityContextConstraintsReconciled", Message:"performed created on velero scc velero-privileged", Source:v1.EventSource{Component:"DPA-controller", Host:""}, FirstTimestamp:v1.Time{Time:time.Time{wall:0xc07da137a1a3f7de, ext:20843219340, loc:(*time.Location)(0x2c0c520)}}, LastTimestamp:v1.Time{Time:time.Time{wall:0xc07da137a1a3f7de, ext:20843219340, loc:(*time.Location)(0x2c0c520)}}, Count:1, Type:"Normal", EventTime:v1.MicroTime{Time:time.Time{wall:0x0, ext:0, loc:(*time.Location)(nil)}}, Series:(*v1.EventSeries)(nil), Action:"", Related:(*v1.ObjectReference)(nil), ReportingController:"", ReportingInstance:""}': 'events is forbidden: User "system:serviceaccount:openshift-adp:openshift-adp-controller-manager" cannot create resource "events" in API group "" in the namespace "default"' (will not retry!)

New issue

  • This issue is new
@alexisph alexisph added the kind/bug Categorizes issue or PR as related to a bug. label Feb 23, 2022
@kaovilai
Member

kaovilai commented Feb 24, 2022

Resolution: the appropriate Security Context Constraints (SCCs) should set capabilities as a list in allowedCapabilities.
Velero still needs a lot of capabilities, but defining a list instead of a wildcard may resolve the warning.

@kaovilai kaovilai self-assigned this Feb 24, 2022
@alexisph
Author

alexisph commented Feb 24, 2022

@kaovilai, the policy rule is checking this command: oc get scc -o json | jq '[.items[] | select(.metadata.name != "privileged")] | map(.allowedCapabilities == null)'
So it returns false for the velero-privileged SCC because the SCC sets a value in allowedCapabilities (currently "*"). It seems the rule will fail even if the capabilities list is limited. I've edited my original bug report.

I'm not sure what the resolution should be here: either a) ignoring the velero-privileged SCC in the security policy, as is already done for the privileged SCC, or b) making OADP use the privileged SCC instead of creating a new velero-privileged SCC.

@sfowl

sfowl commented Mar 15, 2022

Why does OADP use velero-privileged instead of the default privileged SCC? Is it a less permissive option?

@kaovilai
Member

From internal discussion: our concern was that the privileged SCC could be modified on a cluster and break our app. So velero-privileged is created to guarantee the SCC we require, so to speak.

@kaovilai
Member

kaovilai commented Mar 16, 2022

It should be possible to use the privileged SCC, but we may need to run some tests. I think what we have is (edit: not) less permissive.

@kaovilai
Member

I think it should be possible to query the privileged SCC and check for the values we expect. If it's what we expect, we use it and the CIS benchmark passes. If it's not, we create velero-privileged. If someone then complains about CIS, we tell them to fix the privileged SCC.
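
The check-then-fall-back idea in this comment could look roughly like the Python sketch below. It is only an illustration of the decision, not code from the operator: the function name `choose_scc` and the `EXPECTED` baseline are hypothetical, and the three fields listed are examples of real SCC fields rather than the actual set of values OADP depends on.

```python
# Hypothetical baseline of values the operator would require from the
# privileged SCC. The field names exist on SCC objects, but this particular
# selection is an assumption for illustration only.
EXPECTED = {
    "allowPrivilegedContainer": True,
    "allowHostDirVolumePlugin": True,
    "allowedCapabilities": ["*"],
}


def choose_scc(privileged_scc, expected=EXPECTED):
    """Pick which SCC to use.

    Returns "privileged" when the cluster's privileged SCC still carries every
    expected value, otherwise the fallback name "velero-privileged".
    """
    for field, want in expected.items():
        if privileged_scc.get(field) != want:
            return "velero-privileged"
    return "privileged"


if __name__ == "__main__":
    unmodified = dict(EXPECTED)
    tampered = dict(EXPECTED, allowPrivilegedContainer=False)
    print(choose_scc(unmodified))
    print(choose_scc(tampered))
```

With this shape, the extra SCC (and the CIS finding that comes with it) only appears on clusters where the privileged SCC has been changed.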

@jmontleon
Contributor

jmontleon commented Mar 16, 2022

I believe the velero-privileged scc is a hangover from MTC for AOS 3.x.

For OCP 4.x, the way to use the privileged SCC is to grant the use verb on it, like so:
https://github.com/konveyor/mig-operator/blob/release-1.6.3/deploy/olm-catalog/bundle/manifests/crane-operator.v1.6.3.clusterserviceversion.yaml#L771-L778
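
For readers who do not follow the link, the general shape of an RBAC rule that grants the use verb on a named SCC is sketched below in Python. This is not a copy of the linked manifest lines; the helper name `scc_use_rule` is hypothetical, and the rule would still need to sit in a Role or ClusterRole bound to the relevant service account.

```python
import json


def scc_use_rule(scc_name="privileged"):
    """Build an RBAC policy rule granting the `use` verb on one named SCC."""
    return {
        "apiGroups": ["security.openshift.io"],
        "resources": ["securitycontextconstraints"],
        "resourceNames": [scc_name],
        "verbs": ["use"],
    }


if __name__ == "__main__":
    # JSON is also valid YAML, so the output can be pasted into the `rules`
    # list of a Role or ClusterRole manifest.
    print(json.dumps(scc_use_rule(), indent=2))
```

Granting access this way leaves the SCC object itself untouched, so no operator-specific SCC has to be created for the benchmark to trip over.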

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 14, 2022
@alexisph
Author

/remove-lifecycle stale

@openshift-ci openshift-ci bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 21, 2022
@kaovilai kaovilai assigned hhpatel14 and unassigned kaovilai Sep 19, 2022
@kaovilai
Member

/lifecycle frozen

@openshift-ci openshift-ci bot added the lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. label Sep 19, 2022
@kaovilai
Member

Also tracked on: https://issues.redhat.com/browse/OADP-290

@kaovilai kaovilai changed the title from "Bug: The velero-privileged SCC is causing the CIS benchmark to fail" to "OADP-290 Bug: The velero-privileged SCC is causing the CIS benchmark to fail" Sep 19, 2022
@msfrucht

@kaovilai

@jmontleon Changing over to the use verb for SCCs is something I highly recommend. On a product I worked on, we found that applying the SCC directly to pods can cause a race condition: after the SCC is edited, it takes time for the change to be applied, and anything done in that window that is affected by the SCC can hit the race.

@kaovilai
Member

kaovilai commented Nov 4, 2022

Thanks @msfrucht. Fix in #860 should land in 1.2.0 and resolve this issue.

@kaovilai
Member

This will also land in 1.1.2

7 participants