Kubelet volume relabel fails for Dynatrace OneAgent CSI volumes #2556

Closed · baichinger opened this issue Nov 7, 2022 · 11 comments

Labels: area/kubernetes (K8s including EKS, EKS-A, and including VMW), status/needs-info (Further information is requested), status/needs-triage (Pending triage or re-evaluation)

@baichinger

Image I'm using:
Bottlerocket OS 1.10.1 (aws-k8s-1.23)

What I expected to happen:
Dynatrace OneAgent CSI volumes work on Bottlerocket.

What actually happened:

  1. Dynatrace OneAgent CSI driver successfully mounts the volume to the pod:
    {"level":"info","ts":"2022-11-07T08:39:07.861Z","logger":"csi-driver","msg":"GRPC call","method":"NodePublishVolume","volume-id":"csi-1f0a5b5faf0aa280913d7a16ab0023393ec52aa563886dc7eebfdef711c13ca4"}
    {"level":"info","ts":"2022-11-07T08:39:07.861Z","logger":"csi-driver","msg":"publishing volume","csiMode":"app","target":"/var/lib/kubelet/pods/174ca84d-4ed5-4c9b-8f09-ce397f5c07a3/volumes/kubernetes.io~csi/oneagent-bin/mount","fstype":"","readonly":false,"volumeID":"csi-1f0a5b5faf0aa280913d7a16ab0023393ec52aa563886dc7eebfdef711c13ca4","attributes":{"csi.storage.k8s.io/ephemeral":"true","csi.storage.k8s.io/pod.name":"pause","csi.storage.k8s.io/pod.namespace":"default","csi.storage.k8s.io/pod.uid":"174ca84d-4ed5-4c9b-8f09-ce397f5c07a3","csi.storage.k8s.io/serviceAccount.name":"default","dynakube":"pup86408-application-monitoring","mode":"app"},"mountflags":[]}
    
  2. Kubelet tries to relabel the files and fails:
    Error: failed to generate container "d3408d8e5fbc55df1769265072ade6c67c6a7176bafdae267f4fff904fdb7844" spec: failed to generate spec: relabel "/var/lib/kubelet/pods/174ca84d-4ed5-4c9b-8f09-ce397f5c07a3/volumes/kubernetes.io~csi/oneagent-bin/mount" with "system_u:object_r:data_t:s0:c65,c249" failed: lsetxattr /var/lib/kubelet/pods/174ca84d-4ed5-4c9b-8f09-ce397f5c07a3/volumes/kubernetes.io~csi/oneagent-bin/mount/agent/bin: permission denied
    
  3. Pod is blocked from starting up with status Init:CreateContainerError.

Additional notes:

How to reproduce the problem:

  1. Create a Dynatrace free trial account: https://www.dynatrace.com/trial/
  2. Deploy Dynatrace Operator: https://www.dynatrace.com/support/help/setup-and-configuration/setup-on-container-platforms/kubernetes/get-started-with-kubernetes-monitoring/set-up-k8s-monitoring-helm
  3. Configure applicationMonitoring:
    apiVersion: dynatrace.com/v1beta1
    kind: DynaKube
    metadata:
      name: application-monitoring
    spec:
      apiUrl: https://ENVIRONMENTID.live.dynatrace.com/api
      oneAgent:
        applicationMonitoring:
          useCSIDriver: true
    
  4. Deploy a K8s pod to a Bottlerocket instance:
    kind: Pod
    apiVersion: v1
    metadata:
      name: pause
    spec:
      containers:
        - name: pause
          image: k8s.gcr.io/pause:3.1
    
  5. Observe the pod failing by inspecting its events:
    ❯ k get pods pause
    NAME    READY   STATUS                      RESTARTS   AGE
    pause   0/1     Init:CreateContainerError   0          98s
    
    ❯ k describe pod pause
    Events:
      Type     Reason     Age                  From               Message
      ----     ------     ----                 ----               -------
      Normal   Scheduled  2m28s                default-scheduler  Successfully assigned default/pause to ip-172-31-34-110.ec2.internal
      Warning  Failed     2m28s                kubelet            Error: failed to generate container "d5473948bd9728c3174cce919fa9f4ce811a75c5a9619e162a2b4946cec65968" spec: failed to generate spec: relabel "/var/lib/kubelet/pods/450a7bbe-87f7-4055-9a65-f6e205826644/volumes/kubernetes.io~csi/oneagent-bin/mount" with "system_u:object_r:data_t:s0:c434,c580" failed: lsetxattr /var/lib/kubelet/pods/450a7bbe-87f7-4055-9a65-f6e205826644/volumes/kubernetes.io~csi/oneagent-bin/mount/agent: permission denied
    
@etungsten added the area/kubernetes label on Nov 7, 2022
@etungsten (Contributor) commented Nov 7, 2022

Thanks for the issue. We'll take a closer look. In the meantime, if you are able to reach the admin container on the host, can you try retrieving system logs and checking what AVC denial messages show up? Here is one of the ways to gather logs on Bottlerocket: https://github.com/bottlerocket-os/bottlerocket#logs.

@baichinger (Author)

These are the messages logged repeatedly:

# journalctl -f
...
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal containerd[1543]: time="2022-11-08T07:23:35.602782153Z" level=info msg="CreateContainer within sandbox \"537245d90ee69ef1d84844cd83cee92ba9c6c2786933ae285a76768db4dcdeeb\" for container &ContainerMetadata{Name:install-oneagent,Attempt:0,}"
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal audit[1543]: AVC avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="agent" dev="nvme1n1p1" ino=135060 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=dir permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal audit[1543]: AVC avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="bin" dev="nvme1n1p1" ino=135061 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=dir permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal audit[1543]: AVC avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="1.255.30.20221105-054308" dev="nvme1n1p1" ino=135062 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=dir permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal audit[1543]: AVC avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="any" dev="nvme1n1p1" ino=135063 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=dir permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal audit[1543]: AVC avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="dotnet" dev="nvme1n1p1" ino=135065 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=dir permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal audit[1543]: AVC avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="#135066" dev="nvme1n1p1" ino=135066 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=file permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal audit[1543]: AVC avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="#135067" dev="nvme1n1p1" ino=135067 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=file permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal audit[1543]: AVC avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="#135068" dev="nvme1n1p1" ino=135068 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=file permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal audit[1543]: AVC avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="java" dev="nvme1n1p1" ino=135069 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=dir permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal audit[1543]: AVC avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="#135071" dev="nvme1n1p1" ino=135071 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=file permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal containerd[1543]: time="2022-11-08T07:23:35.604082663Z" level=error msg="CreateContainer within sandbox \"537245d90ee69ef1d84844cd83cee92ba9c6c2786933ae285a76768db4dcdeeb\" for &ContainerMetadata{Name:install-oneagent,Attempt:0,} failed" error="failed to generate container \"df7a7aa48e23bd2a6798302a9cf935cff5f859985b5768daab9d78f2b1944476\" spec: failed to generate spec: relabel \"/var/lib/kubelet/pods/094a40c0-753b-4542-9980-6a71a268523c/volumes/kubernetes.io~csi/oneagent-bin/mount\" with \"system_u:object_r:data_t:s0:c196,c830\" failed: lsetxattr /var/lib/kubelet/pods/094a40c0-753b-4542-9980-6a71a268523c/volumes/kubernetes.io~csi/oneagent-bin/mount/agent/bin: permission denied"
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal kubelet[1585]: E1108 07:23:35.604271    1585 remote_runtime.go:416] "CreateContainer in sandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to generate container \"df7a7aa48e23bd2a6798302a9cf935cff5f859985b5768daab9d78f2b1944476\" spec: failed to generate spec: relabel \"/var/lib/kubelet/pods/094a40c0-753b-4542-9980-6a71a268523c/volumes/kubernetes.io~csi/oneagent-bin/mount\" with \"system_u:object_r:data_t:s0:c196,c830\" failed: lsetxattr /var/lib/kubelet/pods/094a40c0-753b-4542-9980-6a71a268523c/volumes/kubernetes.io~csi/oneagent-bin/mount/agent/bin: permission denied" podSandboxID="537245d90ee69ef1d84844cd83cee92ba9c6c2786933ae285a76768db4dcdeeb"
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal kubelet[1585]: E1108 07:23:35.604412    1585 kuberuntime_manager.go:919] init container &Container{Name:install-oneagent,Image:docker.io/dynatrace/dynatrace-operator:v0.9.1,Command:[],Args:[init],WorkingDir:,Ports:[]ContainerPort{},Env:[]EnvVar{EnvVar{Name:CONTAINERS_COUNT,Value:1,ValueFrom:nil,},EnvVar{Name:FAILURE_POLICY,Value:silent,ValueFrom:nil,},EnvVar{Name:K8S_PODNAME,Value:,ValueFrom:&EnvVarSource{FieldRef:&ObjectFieldSelector{APIVersion:v1,FieldPath:metadata.name,},ResourceFieldRef:nil,ConfigMapKeyRef:nil,SecretKeyRef:nil,},},EnvVar{Name:K8S_PODUID,Value:,ValueFrom:&EnvVarSource{FieldRef:&ObjectFieldSelector{APIVersion:v1,FieldPath:metadata.uid,},ResourceFieldRef:nil,ConfigMapKeyRef:nil,SecretKeyRef:nil,},},EnvVar{Name:K8S_BASEPODNAME,Value:pause,ValueFrom:nil,},EnvVar{Name:K8S_CLUSTER_ID,Value:336bde54-fbc5-4135-802c-813c6a8c1350,ValueFrom:nil,},EnvVar{Name:K8S_NAMESPACE,Value:,ValueFrom:&EnvVarSource{FieldRef:&ObjectFieldSelector{APIVersion:v1,FieldPath:metadata.namespace,},ResourceFieldRef:nil,ConfigMapKeyRef:nil,SecretKeyRef:nil,},},EnvVar{Name:K8S_NODE_NAME,Value:,ValueFrom:&EnvVarSource{FieldRef:&ObjectFieldSelector{APIVersion:v1,FieldPath:spec.nodeName,},ResourceFieldRef:nil,ConfigMapKeyRef:nil,SecretKeyRef:nil,},},EnvVar{Name:FLAVOR,Value:,ValueFrom:nil,},EnvVar{Name:TECHNOLOGIES,Value:all,ValueFrom:nil,},EnvVar{Name:INSTALLPATH,Value:/opt/dynatrace/oneagent-paas,ValueFrom:nil,},EnvVar{Name:INSTALLER_URL,Value:,ValueFrom:nil,},EnvVar{Name:VERSION,Value:,ValueFrom:nil,},EnvVar{Name:MODE,Value:provisioned,ValueFrom:nil,},EnvVar{Name:ONEAGENT_INJECTED,Value:true,ValueFrom:nil,},EnvVar{Name:CONTAINER_1_NAME,Value:pause,ValueFrom:nil,},EnvVar{Name:CONTAINER_1_IMAGE,Value:k8s.gcr.io/pause:3.1,ValueFrom:nil,},EnvVar{Name:DT_WORKLOAD_KIND,Value:DaemonSet,ValueFrom:nil,},EnvVar{Name:DT_WORKLOAD_NAME,Value:pause,ValueFrom:nil,},EnvVar{Name:DATA_INGEST_INJECTED,Value:true,ValueFrom:nil,},},Resources:ResourceRequirements{Limits:ResourceList{},Requests:ResourceList{},},VolumeMounts:[]VolumeMount{VolumeMount{Name:oneagent-bin,ReadOnly:false,MountPath:/mnt/bin,SubPath:,MountPropagation:nil,SubPathExpr:,},VolumeMount{Name:oneagent-share,ReadOnly:false,MountPath:/mnt/share,SubPath:,MountPropagation:nil,SubPathExpr:,},VolumeMount{Name:injection-config,ReadOnly:false,MountPath:/mnt/config,SubPath:,MountPropagation:nil,SubPathExpr:,},VolumeMount{Name:data-ingest-enrichment,ReadOnly:false,MountPath:/var/lib/dynatrace/enrichment,SubPath:,MountPropagation:nil,SubPathExpr:,},VolumeMount{Name:kube-api-access-mm55x,ReadOnly:true,MountPath:/var/run/secrets/kubernetes.io/serviceaccount,SubPath:,MountPropagation:nil,SubPathExpr:,},},LivenessProbe:nil,ReadinessProbe:nil,Lifecycle:nil,TerminationMessagePath:/dev/termination-log,ImagePullPolicy:IfNotPresent,SecurityContext:nil,Stdin:false,StdinOnce:false,TTY:false,EnvFrom:[]EnvFromSource{},TerminationMessagePolicy:File,VolumeDevices:[]VolumeDevice{},StartupProbe:nil,} start failed in pod pause-bkkjs_default(094a40c0-753b-4542-9980-6a71a268523c): CreateContainerError: failed to generate container "df7a7aa48e23bd2a6798302a9cf935cff5f859985b5768daab9d78f2b1944476" spec: failed to generate spec: relabel "/var/lib/kubelet/pods/094a40c0-753b-4542-9980-6a71a268523c/volumes/kubernetes.io~csi/oneagent-bin/mount" with "system_u:object_r:data_t:s0:c196,c830" failed: lsetxattr /var/lib/kubelet/pods/094a40c0-753b-4542-9980-6a71a268523c/volumes/kubernetes.io~csi/oneagent-bin/mount/agent/bin: permission denied
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal kubelet[1585]: E1108 07:23:35.604466    1585 pod_workers.go:951] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"install-oneagent\" with CreateContainerError: \"failed to generate container \\\"df7a7aa48e23bd2a6798302a9cf935cff5f859985b5768daab9d78f2b1944476\\\" spec: failed to generate spec: relabel \\\"/var/lib/kubelet/pods/094a40c0-753b-4542-9980-6a71a268523c/volumes/kubernetes.io~csi/oneagent-bin/mount\\\" with \\\"system_u:object_r:data_t:s0:c196,c830\\\" failed: lsetxattr /var/lib/kubelet/pods/094a40c0-753b-4542-9980-6a71a268523c/volumes/kubernetes.io~csi/oneagent-bin/mount/agent/bin: permission denied\"" pod="default/pause-bkkjs" podUID=094a40c0-753b-4542-9980-6a71a268523c
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal kernel: audit: type=1400 audit(1667892215.599:201704): avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="agent" dev="nvme1n1p1" ino=135060 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=dir permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal kernel: audit: type=1400 audit(1667892215.599:201705): avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="bin" dev="nvme1n1p1" ino=135061 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=dir permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal kernel: audit: type=1400 audit(1667892215.599:201706): avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="1.255.30.20221105-054308" dev="nvme1n1p1" ino=135062 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=dir permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal kernel: audit: type=1400 audit(1667892215.599:201707): avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="any" dev="nvme1n1p1" ino=135063 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=dir permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal kernel: audit: type=1400 audit(1667892215.599:201708): avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="dotnet" dev="nvme1n1p1" ino=135065 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=dir permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal kernel: audit: type=1400 audit(1667892215.599:201709): avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="#135066" dev="nvme1n1p1" ino=135066 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=file permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal kernel: audit: type=1400 audit(1667892215.599:201710): avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="#135067" dev="nvme1n1p1" ino=135067 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=file permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal kernel: audit: type=1400 audit(1667892215.599:201711): avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="#135068" dev="nvme1n1p1" ino=135068 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=file permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal kernel: audit: type=1400 audit(1667892215.599:201712): avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="java" dev="nvme1n1p1" ino=135069 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=dir permissive=0
Nov 08 07:23:35 ip-172-31-34-110.ec2.internal kernel: audit: type=1400 audit(1667892215.599:201713): avc:  denied  { relabelfrom } for  pid=1543 comm="containerd" name="#135071" dev="nvme1n1p1" ino=135071 scontext=system_u:system_r:control_t:s0-s0:c0.c1023 tcontext=system_u:object_r:local_t:s0 tclass=file permissive=0
Nov 08 07:23:37 ip-172-31-34-110.ec2.internal systemd[1]: run-containerd-runc-k8s.io-edb5da62c927d4e3124354ca6020eeb9a90502cf3edbd23c75df7d58d74f9f19-runc.KK81qm.mount: Deactivated successfully.
...

Do you need the archive from logdog as well?

@bcressey (Contributor) commented Nov 8, 2022

What is the backing storage for OneAgent CSI volumes like this?

/var/lib/kubelet/pods/174ca84d-4ed5-4c9b-8f09-ce397f5c07a3/volumes/kubernetes.io~csi/oneagent-bin/mount

Based on the logs, I would guess that it might be backed directly by a host directory bind-mounted into the CSI driver, which is then bind-mounted onto the pod's mount point. kubelet is (correctly) working out that the underlying filesystem supports SELinux and marking the volume for relabeling, but containerd can't actually relabel it, since a direct relabel from local_t to data_t is not allowed.
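
For reference, one way to confirm the labels from the admin container (a sketch, assuming host access via sudo sheltie; the backing path is the one the driver uses per the follow-up comment below):

    # Label of the CSI driver's backing directory on the host
    # (expected: local_t, matching the tcontext in the AVC denials above)
    ls -dZ /var/lib/kubelet/plugins/csi.oneagent.dynatrace.com/data
    # Label kubelet wants on the pod's volume mount (data_t plus per-pod MCS categories)
    ls -dZ /var/lib/kubelet/pods/*/volumes/kubernetes.io~csi/oneagent-bin/mount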

Ideally, the pod's CSI volume mount would start with a data_t label of some kind. The first approach that comes to mind is to mount an emptyDir into the CSI driver from the host, which should receive a system_u:object_r:data_t:s0 label because of the seLinuxOptions. Then, for each pod's CSI volume mount, copy the agent files into that emptyDir and use the emptyDir's subdirectory as the source of the bind mount for the pod volume mount point.
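
A minimal sketch of that emptyDir change in the CSI driver's DaemonSet (illustrative only; the volume name data-dir follows the wording in later comments, while the container name and mount path are hypothetical):

    # CSI driver DaemonSet pod spec (fragment)
    spec:
      containers:
        - name: csi-driver            # hypothetical container name
          volumeMounts:
            - name: data-dir
              mountPath: /data        # hypothetical in-container path for agent files
      volumes:
        - name: data-dir
          emptyDir: {}                # was: hostPath; should receive a data_t label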

@baichinger (Author)

Your analysis is correct. The details are:

  • OneAgent CSI stores agent files in /var/lib/kubelet/plugins/csi.oneagent.dynatrace.com/data on the host.
  • Volume consumers (monitored application pods) get an overlayfs mount composed of
    • agent files (binaries), read-only
    • per-pod configuration, read-write

This saves (a) network bandwidth, as agents are downloaded only once per node, and (b) disk storage (~800 MB) per pod.
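
Roughly, the per-pod publish then boils down to an overlay mount like this (a sketch; the subdirectory layout under the plugin's data directory and the placeholders in angle brackets are illustrative):

    DATA=/var/lib/kubelet/plugins/csi.oneagent.dynatrace.com/data
    # lowerdir: shared agent binaries (ro); upperdir/workdir: per-pod state (rw)
    mount -t overlay overlay \
      -o lowerdir=$DATA/<agent-version>,upperdir=$DATA/<volume-id>/var,workdir=$DATA/<volume-id>/work \
      /var/lib/kubelet/pods/<pod-uid>/volumes/kubernetes.io~csi/oneagent-bin/mount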

Switching from hostPath to emptyDir for the data directory is denied as well, though without AVC entries:

# journalctl -f
...
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal containerd[1543]: time="2022-11-08T18:26:46.608780927Z" level=info msg="CreateContainer within sandbox \"8a7ed05bb4fa58c64c0cc46b9a3f0471e14d291c7fe319009668d3f1653aee59\" for container &ContainerMetadata{Name:install-oneagent,Attempt:0,}"
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal audit: SELINUX_ERR op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=dir
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal audit: SELINUX_ERR op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=dir
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal audit: SELINUX_ERR op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=dir
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal audit: SELINUX_ERR op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=dir
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal audit: SELINUX_ERR op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=dir
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal audit: SELINUX_ERR op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=file
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal audit: SELINUX_ERR op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=file
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal kernel: audit: type=1401 audit(1667932006.605:228118): op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=dir
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal kernel: audit: type=1401 audit(1667932006.605:228119): op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=dir
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal kernel: audit: type=1401 audit(1667932006.605:228120): op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=dir
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal kernel: audit: type=1401 audit(1667932006.605:228121): op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=dir
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal kernel: audit: type=1401 audit(1667932006.605:228122): op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=dir
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal kernel: audit: type=1401 audit(1667932006.605:228123): op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=file
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal kernel: audit: type=1401 audit(1667932006.605:228124): op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=file
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal kernel: audit: type=1401 audit(1667932006.605:228125): op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=file
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal kernel: audit: type=1401 audit(1667932006.605:228126): op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=dir
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal kernel: audit: type=1401 audit(1667932006.605:228127): op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=file
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal audit: SELINUX_ERR op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=file
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal audit: SELINUX_ERR op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=dir
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal audit: SELINUX_ERR op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=file
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal audit: SELINUX_ERR op=security_validate_transition seresult=denied oldcontext=system_u:object_r:data_t:s0 newcontext=system_u:object_r:data_t:s0:c63,c833 taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 tclass=file
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal containerd[1543]: time="2022-11-08T18:26:46.610407664Z" level=error msg="CreateContainer within sandbox \"8a7ed05bb4fa58c64c0cc46b9a3f0471e14d291c7fe319009668d3f1653aee59\" for &ContainerMetadata{Name:install-oneagent,Attempt:0,} failed" error="failed to generate container \"bfaefb49dc050c0955570a2d923ad5d7b0c54922d07ea36b549aeb2b522a91be\" spec: failed to generate spec: relabel \"/var/lib/kubelet/pods/a904a098-e9d9-4bd2-a5ce-31b6d481e898/volumes/kubernetes.io~csi/oneagent-bin/mount\" with \"system_u:object_r:data_t:s0:c63,c833\" failed: lsetxattr /var/lib/kubelet/pods/a904a098-e9d9-4bd2-a5ce-31b6d481e898/volumes/kubernetes.io~csi/oneagent-bin/mount/agent: operation not permitted"
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal kubelet[1585]: E1108 18:26:46.610731    1585 remote_runtime.go:416] "CreateContainer in sandbox from runtime service failed" err="rpc error: code = Unknown desc = failed to generate container \"bfaefb49dc050c0955570a2d923ad5d7b0c54922d07ea36b549aeb2b522a91be\" spec: failed to generate spec: relabel \"/var/lib/kubelet/pods/a904a098-e9d9-4bd2-a5ce-31b6d481e898/volumes/kubernetes.io~csi/oneagent-bin/mount\" with \"system_u:object_r:data_t:s0:c63,c833\" failed: lsetxattr /var/lib/kubelet/pods/a904a098-e9d9-4bd2-a5ce-31b6d481e898/volumes/kubernetes.io~csi/oneagent-bin/mount/agent: operation not permitted" podSandboxID="8a7ed05bb4fa58c64c0cc46b9a3f0471e14d291c7fe319009668d3f1653aee59"
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal kubelet[1585]: E1108 18:26:46.610887    1585 kuberuntime_manager.go:919] init container &Container{Name:install-oneagent,Image:docker.io/dynatrace/dynatrace-operator:v0.9.1,Command:[],Args:[init],WorkingDir:,Ports:[]ContainerPort{},Env:[]EnvVar{EnvVar{Name:CONTAINERS_COUNT,Value:1,ValueFrom:nil,},EnvVar{Name:FAILURE_POLICY,Value:silent,ValueFrom:nil,},EnvVar{Name:K8S_PODNAME,Value:,ValueFrom:&EnvVarSource{FieldRef:&ObjectFieldSelector{APIVersion:v1,FieldPath:metadata.name,},ResourceFieldRef:nil,ConfigMapKeyRef:nil,SecretKeyRef:nil,},},EnvVar{Name:K8S_PODUID,Value:,ValueFrom:&EnvVarSource{FieldRef:&ObjectFieldSelector{APIVersion:v1,FieldPath:metadata.uid,},ResourceFieldRef:nil,ConfigMapKeyRef:nil,SecretKeyRef:nil,},},EnvVar{Name:K8S_BASEPODNAME,Value:pause,ValueFrom:nil,},EnvVar{Name:K8S_CLUSTER_ID,Value:336bde54-fbc5-4135-802c-813c6a8c1350,ValueFrom:nil,},EnvVar{Name:K8S_NAMESPACE,Value:,ValueFrom:&EnvVarSource{FieldRef:&ObjectFieldSelector{APIVersion:v1,FieldPath:metadata.namespace,},ResourceFieldRef:nil,ConfigMapKeyRef:nil,SecretKeyRef:nil,},},EnvVar{Name:K8S_NODE_NAME,Value:,ValueFrom:&EnvVarSource{FieldRef:&ObjectFieldSelector{APIVersion:v1,FieldPath:spec.nodeName,},ResourceFieldRef:nil,ConfigMapKeyRef:nil,SecretKeyRef:nil,},},EnvVar{Name:FLAVOR,Value:,ValueFrom:nil,},EnvVar{Name:TECHNOLOGIES,Value:all,ValueFrom:nil,},EnvVar{Name:INSTALLPATH,Value:/opt/dynatrace/oneagent-paas,ValueFrom:nil,},EnvVar{Name:INSTALLER_URL,Value:,ValueFrom:nil,},EnvVar{Name:VERSION,Value:,ValueFrom:nil,},EnvVar{Name:MODE,Value:provisioned,ValueFrom:nil,},EnvVar{Name:ONEAGENT_INJECTED,Value:true,ValueFrom:nil,},EnvVar{Name:CONTAINER_1_NAME,Value:pause,ValueFrom:nil,},EnvVar{Name:CONTAINER_1_IMAGE,Value:k8s.gcr.io/pause:3.1,ValueFrom:nil,},EnvVar{Name:DT_WORKLOAD_KIND,Value:DaemonSet,ValueFrom:nil,},EnvVar{Name:DT_WORKLOAD_NAME,Value:pause,ValueFrom:nil,},EnvVar{Name:DATA_INGEST_INJECTED,Value:true,ValueFrom:nil,},},Resources:ResourceRequirements{Limits:ResourceList{},Requests:ResourceList{},},VolumeMounts:[]VolumeMount{VolumeMount{Name:oneagent-bin,ReadOnly:false,MountPath:/mnt/bin,SubPath:,MountPropagation:nil,SubPathExpr:,},VolumeMount{Name:oneagent-share,ReadOnly:false,MountPath:/mnt/share,SubPath:,MountPropagation:nil,SubPathExpr:,},VolumeMount{Name:injection-config,ReadOnly:false,MountPath:/mnt/config,SubPath:,MountPropagation:nil,SubPathExpr:,},VolumeMount{Name:data-ingest-enrichment,ReadOnly:false,MountPath:/var/lib/dynatrace/enrichment,SubPath:,MountPropagation:nil,SubPathExpr:,},VolumeMount{Name:kube-api-access-7zkmw,ReadOnly:true,MountPath:/var/run/secrets/kubernetes.io/serviceaccount,SubPath:,MountPropagation:nil,SubPathExpr:,},},LivenessProbe:nil,ReadinessProbe:nil,Lifecycle:nil,TerminationMessagePath:/dev/termination-log,ImagePullPolicy:IfNotPresent,SecurityContext:nil,Stdin:false,StdinOnce:false,TTY:false,EnvFrom:[]EnvFromSource{},TerminationMessagePolicy:File,VolumeDevices:[]VolumeDevice{},StartupProbe:nil,} start failed in pod pause-dmjln_default(a904a098-e9d9-4bd2-a5ce-31b6d481e898): CreateContainerError: failed to generate container "bfaefb49dc050c0955570a2d923ad5d7b0c54922d07ea36b549aeb2b522a91be" spec: failed to generate spec: relabel "/var/lib/kubelet/pods/a904a098-e9d9-4bd2-a5ce-31b6d481e898/volumes/kubernetes.io~csi/oneagent-bin/mount" with "system_u:object_r:data_t:s0:c63,c833" failed: lsetxattr /var/lib/kubelet/pods/a904a098-e9d9-4bd2-a5ce-31b6d481e898/volumes/kubernetes.io~csi/oneagent-bin/mount/agent: operation not permitted
Nov 08 18:26:46 ip-172-31-34-110.ec2.internal kubelet[1585]: E1108 18:26:46.610950    1585 pod_workers.go:951] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"install-oneagent\" with CreateContainerError: \"failed to generate container \\\"bfaefb49dc050c0955570a2d923ad5d7b0c54922d07ea36b549aeb2b522a91be\\\" spec: failed to generate spec: relabel \\\"/var/lib/kubelet/pods/a904a098-e9d9-4bd2-a5ce-31b6d481e898/volumes/kubernetes.io~csi/oneagent-bin/mount\\\" with \\\"system_u:object_r:data_t:s0:c63,c833\\\" failed: lsetxattr /var/lib/kubelet/pods/a904a098-e9d9-4bd2-a5ce-31b6d481e898/volumes/kubernetes.io~csi/oneagent-bin/mount/agent: operation not permitted\"" pod="default/pause-dmjln" podUID=a904a098-e9d9-4bd2-a5ce-31b6d481e898
...

Is this configuration supposed to work? It does on AL2.

@bcressey (Contributor) commented Nov 8, 2022

Is this configuration supposed to work? It does on AL2.

Ideally the configuration would work on Bottlerocket. AL2 doesn't enable or enforce SELinux by default, and might not have a policy that covers this specific interaction, so it's not a 1:1 comparison.

oldcontext=system_u:object_r:data_t:s0
newcontext=system_u:object_r:data_t:s0:c63,c833
taskcontext=system_u:system_r:control_t:s0-s0:c0.c1023 

That seems like progress of sorts, since this is a relabel transition the SELinux policy might reasonably permit. A privileged subject (control_t:s0-s0:c0.c1023) is attempting to add an MCS category pair (c63,c833) to an object label that lacks them (data_t:s0). That should only narrow the scope of access rather than change it to something else entirely.

In other words, allowing this type of relabel would restrict access to the volume to containers within the pod (plus privileged containers), rather than leaving it open to any unprivileged pod on the system.
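
For illustration, with invented labels:

    data_t:s0              readable by any container domain running at level s0
    data_t:s0:c63,c833     readable only by subjects whose MCS category set includes
                           both c63 and c833: this pod's containers, or a ranged
                           privileged domain such as control_t:s0-s0:c0.c1023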

Is this where I would change data-dir to an emptyDir? I can try making the SELinux policy change and then test out the repro case you've provided.

Also, is the emptyDir change one you'd consider making upstream or does it have unwanted side effects?

@baichinger (Author)

Yes, you identified the right spot.

The switch to emptyDir does have side effects we need to assess. As soon as the CSI driver pod gets deleted (e.g. restarted), the backing store (data-dir) is removed and overlayfs loses its lowerdir. Files remain accessible to some extent if their names are known, but walking the directory tree is not possible. To be honest, this does not seem very promising to me.

Alternatives to an emptyDir data-dir volume that come to mind:

  • A place on the existing host, labeled data_t.
  • A second disk attached to the node, labeled data_t.
  • Allow relabeling from local_t to data_t.

@bcressey (Contributor)

Ah, I had missed that overlayfs was in use. That explains one of the oddities: why the errors are for control_t even though containerd (which runs as runtime_t) is doing the relabel. With overlayfs, a copy of the mounter's credentials is saved and also used for access checks. So we end up checking both the containerd creds (which have permission to relabel) and the CSI plugin's (which don't).

However, the use of overlayfs makes me doubt that enabling the relabel operation to succeed is actually desirable here. Bottlerocket's kernel doesn't enable the "metadata only copy up" feature by default (CONFIG_OVERLAY_FS_METACOPY), so I would expect that modifications to xattrs would trigger copy up, which would defeat the desired caching behavior.
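
One way to double-check that from the host (a sketch; assumes the kernel exposes its config via /proc/config.gz, which may not hold on every image):

    # Is the metadata-only copy-up feature compiled in?
    zcat /proc/config.gz | grep OVERLAY_FS_METACOPY
    # Was any overlay mounted with metacopy=on?
    grep overlay /proc/mounts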

The better path might be to convince kubelet that the SELinux relabel isn't necessary. Can the mount be made read-only?
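
If the driver can do that, publishing the volume read-only might look roughly like this in NodePublishVolume terms (a sketch, not the driver's actual code path):

    # Bind-mount the assembled overlay to the pod's target, then force read-only;
    # a plain bind mount does not apply "ro", so a remount is needed
    mount --bind <overlay-mountpoint> <target>
    mount -o remount,bind,ro <target>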

@baichinger (Author)

Making the mount read-only is indeed possible. Would that be enough to stop kubelet from relabeling? If not, what is necessary to stop it?

@bcressey (Contributor)

From a quick look at the code, it isn't clear to me that kubelet would skip marking the volume for relabeling if it were read-only, but it seems like it'd be a bug in the CRI implementation in containerd if it actually tried to relabel a read-only volume.

@stmcginnis added the status/needs-triage and status/needs-info labels and removed the status/needinfo label on Dec 1, 2022
@bcressey (Contributor) commented Dec 1, 2022

I've verified that kubelet skips relabeling for read-only volumes.

It looks like this might only be true when the volume is marked as readOnly: true in the CRI, though. For SELinux support, there's a check on whether the mount includes the "seclabel" option, but I don't see a similar check for the "ro" option. That makes sense, since some read-only mounts, like tmpfs, can still be relabeled.

@baichinger (Author)

I can confirm that a read-only volume is not subject to relabeling. The proof of concept was successful.

Thanks for your support.
