
failed: lsetxattr read-only file system on pod start #145

Open
woehrl01 opened this issue Feb 14, 2024 · 14 comments

Labels
help-wanted Extra attention is needed

Comments

woehrl01 commented Feb 14, 2024

Hi,

I would like to test out this really promising CSI driver, but I receive the following error:

spec: failed to apply OCI options: relabel "/var/lib/kubelet/pods/ee74a41d-f2b5-4ac3-9722-80454387d5c9/volume-subpaths/source/nginx/18" with "system_u:object_r:data_t:s0:c246,c908" failed: lsetxattr /var/lib/kubelet/pods/ee74a41d-f2b5-4ac3-9722-80454387d5c9/volume-subpaths/source/nginx/18/p: read-only file system

I already changed the mount and the pod to be readable, but I still have that error.
I'm using EKS 1.28 with bottlerocket nodes.

Any ideas what I could try?

Edit: I got it working by setting readOnly: true on the volume directly. Any idea how I can troubleshoot why a writable volume does not work?

Thanks!

@mbtamuli (Contributor)

Hey @woehrl01, could you share a minimal manifest that would help reproduce the issue?

woehrl01 (Author) commented Feb 15, 2024

Thank you @mbtamuli!

Absolutely!

Actually, I'm using all defaults. The driver is installed with the default Helm values (v1.1.0, with containerd), and I'm using the following configuration:

https://github.com/warm-metal/container-image-csi-driver/blob/v1.1.0/sample/ephemeral-volume.yaml

It works for me as soon as I add the readOnly flag to the volume definition.
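
For illustration, this is roughly what that change looks like (a minimal sketch based on the linked sample, not a copy of it; the pod name and image are placeholders):

apiVersion: v1
kind: Pod
metadata:
  name: ephemeral-volume-test
spec:
  containers:
    - name: app
      image: someimage:versiontag        # placeholder
      volumeMounts:
        - name: source
          mountPath: /target
  volumes:
    - name: source
      csi:
        driver: csi-image.warm-metal.tech
        readOnly: true                   # adding this flag makes the pod start
        volumeAttributes:
          image: someimage:versiontag    # placeholder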

@mugdha-adhav (Collaborator)

@woehrl01 could you please share the logs from the node-plugin DaemonSet pod running on the same node where the workload was deployed?

woehrl01 (Author) commented Feb 15, 2024

Feb 14, 2024 18:16:56.928
config.go:144] looking for config.json at /config.json
 
Feb 14, 2024 18:16:56.928
config.go:144] looking for config.json at /config.json
 
Feb 14, 2024 18:16:56.928
config.go:144] looking for config.json at /root/.docker/config.json
 
Feb 14, 2024 18:16:56.929
config.go:144] looking for config.json at /.docker/config.json
 
Feb 14, 2024 18:16:56.929
config.go:110] looking for .dockercfg at /.dockercfg
 
Feb 14, 2024 18:16:56.929
config.go:110] looking for .dockercfg at /.dockercfg
 
Feb 14, 2024 18:16:56.929
config.go:110] looking for .dockercfg at /root/.dockercfg
 
Feb 14, 2024 18:16:56.929
config.go:110] looking for .dockercfg at /.dockercfg
 
Feb 14, 2024 18:16:56.929
provider.go:82] Docker config file not found: couldn't find valid .dockercfg after checking in [ /root /]
 
Feb 14, 2024 18:16:56.932
aws_credentials.go:180] unable to get ECR credentials from cache, checking ECR API
 
Feb 14, 2024 18:16:56.944
aws_credentials.go:295] AWS request: ecr:GetAuthorizationToken in eu-central-1
 
Feb 14, 2024 18:16:57.022
aws_credentials.go:187] Got ECR credentials from ECR API for ecr_url
 
Feb 14, 2024 18:17:20.514
pullexecutor.go:92] "Finished pulling image" pod-name="" namespace="" uid="" request-id="10b0b1a4-0b65-41f0-a51c-418f38296289" image="someimage:versiontag" pull-duration="23.583414224s" image-size="161.81 MiB"
 
Feb 14, 2024 18:17:20.514
mountexecutor.go:63] "Mounting image" pod-name="" namespace="" uid="" request-id="10b0b1a4-0b65-41f0-a51c-418f38296289" image="someimage"
 
Feb 14, 2024 18:17:20.564
containerd.go:82] image "someimage:versiontag" unpacked
 
Feb 14, 2024 18:17:20.566
mounter.go:193] create read-write snapshot of image "someimage:versiontag" with key "csi-image.warm-metal.tech-csi-38e100b5eeb529e82e9cfb68091a734c716db143c2be52cf78b720df72f331cd"
 
Feb 14, 2024 18:17:20.568
containerd.go:120] create rw snapshot "csi-image.warm-metal.tech-csi-38e100b5eeb529e82e9cfb68091a734c716db143c2be52cf78b720df72f331cd" for image "sha256:f15ada610a14d12f77b5180515b5df4a7dc81bf5cff0dcf35e929f9f6968cb87" with metadata map[string]string{"containerd.io/gc.root":"2024-02-14T17:17:20Z"}
 
Feb 14, 2024 18:17:20.581
mountexecutor.go:87] "Finished mounting" pod-name="" namespace="" uid="" request-id="10b0b1a4-0b65-41f0-a51c-418f38296289" image="someimage" mount-duration="66.150696ms"
 
Feb 14, 2024 18:17:30.040
utils.go:97] GRPC call: /csi.v1.Identity/Probe
 
Feb 14, 2024 18:17:47.720
utils.go:97] GRPC call: /csi.v1.Node/NodeGetCapabilities
 
Feb 14, 2024 18:18:29.938
utils.go:97] GRPC call: /csi.v1.Identity/Probe
 
Feb 14, 2024 18:19:07.579
utils.go:97] GRPC call: /csi.v1.Node/NodeGetCapabilities
 
Feb 14, 2024 18:19:29.938
utils.go:97] GRPC call: /csi.v1.Identity/Probe

 
Feb 14, 2024 18:25:15.557
utils.go:97] GRPC call: /csi.v1.Node/NodePublishVolume
 
Feb 14, 2024 18:25:15.563
node_server.go:64] "Incoming NodePublishVolume request" pod-name="" namespace="" uid="" request-id="c74dea77-bfbb-4c26-a7e4-d33917a022c6" request string="volume_id:\"csi-ab8d5c61d4c47fe3a6f19f5e916c4e4ff1f812848b01adbf694a27d4ca2bc3ae\" target_path:\"/var/lib/kubelet/pods/b961b108-161d-45c5-9430-81eaa8d2bd29/volumes/kubernetes.io~csi/source/mount\" volume_capability:<mount:<> access_mode:<mode:SINGLE_NODE_WRITER > > volume_context:<key:\"csi.storage.k8s.io/ephemeral\" value:\"true\" > volume_context:<key:\"csi.storage.k8s.io/pod.name\" value:\"combined-web-5ff68fbfdb-7tjn8\" > volume_context:<key:\"csi.storage.k8s.io/pod.namespace\" value:\"p67747\" > volume_context:<key:\"csi.storage.k8s.io/pod.uid\" value:\"b961b108-161d-45c5-9430-81eaa8d2bd29\" > volume_context:<key:\"csi.storage.k8s.io/serviceAccount.name\" value:\"default\" > volume_context:<key:\"image\" value:\"someimage:versiontag\" > "
 
Feb 14, 2024 18:25:15.579
mount_linux.go:286] 'umount /tmp/kubelet-detect-safe-umount2272271457' failed with: exit status 1, output: umount: can't unmount /tmp/kubelet-detect-safe-umount2272271457: Invalid argument
 
Feb 14, 2024 18:25:15.579
mount_linux.go:288] Detected umount with unsafe 'not mounted' behavior
 
Feb 14, 2024 18:25:15.582
plugins.go:73] Registering credential provider: .dockercfg
 
Feb 14, 2024 18:25:15.582
plugins.go:73] Registering credential provider: amazon-ecr
 
Feb 14, 2024 18:25:15.589
mountexecutor.go:63] "Mounting image" pod-name="" namespace="" uid="" request-id="c74dea77-bfbb-4c26-a7e4-d33917a022c6" image="someimage"
 
Feb 14, 2024 18:25:15.605
containerd.go:82] image "someimage:versiontag" unpacked
 
Feb 14, 2024 18:25:15.606
mounter.go:193] create read-write snapshot of image "someimage:versiontag" with key "csi-image.warm-metal.tech-csi-ab8d5c61d4c47fe3a6f19f5e916c4e4ff1f812848b01adbf694a27d4ca2bc3ae"
 
Feb 14, 2024 18:25:15.607
containerd.go:120] create rw snapshot "csi-image.warm-metal.tech-csi-ab8d5c61d4c47fe3a6f19f5e916c4e4ff1f812848b01adbf694a27d4ca2bc3ae" for image "sha256:f15ada610a14d12f77b5180515b5df4a7dc81bf5cff0dcf35e929f9f6968cb87" with metadata map[string]string{"containerd.io/gc.root":"2024-02-14T17:25:15Z"}
 
Feb 14, 2024 18:25:15.619
mountexecutor.go:87] "Finished mounting" pod-name="" namespace="" uid="" request-id="c74dea77-bfbb-4c26-a7e4-d33917a022c6" image="someimage" mount-duration="29.932342ms"
 
Feb 14, 2024 18:25:29.938
utils.go:97] GRPC call: /csi.v1.Identity/Probe
 
Feb 14, 2024 18:25:47.739
utils.go:97] GRPC call: /csi.v1.Node/NodeGetCapabilities
 
Feb 14, 2024 18:26:29.938
utils.go:97] GRPC call: /csi.v1.Identity/Probe
 
Feb 14, 2024 18:27:43.251
utils.go:97] GRPC call: /csi.v1.Node/NodeUnpublishVolume
 
Feb 14, 2024 18:27:43.252
node_server.go:180] unmount request: volume_id:"csi-38e100b5eeb529e82e9cfb68091a734c716db143c2be52cf78b720df72f331cd" target_path:"/var/lib/kubelet/pods/fb74c7d2-2668-4b5b-8849-0d18cb98ee30/volumes/kubernetes.io~csi/source/mount"
 
Feb 14, 2024 18:27:43.282
mount_linux.go:286] 'umount /tmp/kubelet-detect-safe-umount566345078' failed with: exit status 1, output: umount: can't unmount /tmp/kubelet-detect-safe-umount566345078: Invalid argument
 
Feb 14, 2024 18:27:43.282
mount_linux.go:288] Detected umount with unsafe 'not mounted' behavior
 
Feb 14, 2024 18:27:43.293
mounter.go:211] unmount volume "csi-38e100b5eeb529e82e9cfb68091a734c716db143c2be52cf78b720df72f331cd" at "/var/lib/kubelet/pods/fb74c7d2-2668-4b5b-8849-0d18cb98ee30/volumes/kubernetes.io~csi/source/mount"
 
Feb 14, 2024 18:27:43.297
mounter.go:216] try to unref read-only snapshot
 
Feb 14, 2024 18:27:43.297
mounter.go:135] target "/var/lib/kubelet/pods/fb74c7d2-2668-4b5b-8849-0d18cb98ee30/volumes/kubernetes.io~csi/source/mount" is not read-only
 
Feb 14, 2024 18:27:43.297
mounter.go:222] delete the read-write snapshot
 
Feb 14, 2024 18:27:43.300
containerd.go:189] remove snapshot "csi-image.warm-metal.tech-csi-38e100b5eeb529e82e9cfb68091a734c716db143c2be52cf78b720df72f331cd"
 
Feb 14, 2024 18:28:29.938
utils.go:97] GRPC call: /csi.v1.Identity/Probe
 
Feb 14, 2024 18:29:13.668
utils.go:97] GRPC call: /csi.v1.Node/NodeGetCapabilities

@mugdha-adhav (Collaborator)

Currently, our automated builds only test the driver against k8s version v1.25. Here's the compatibility matrix info.

I tested the driver on a kind cluster with k8s version v1.28.3 using containerd, and I was able to run the sample ephemeral workload as expected.

$ kubectl logs ephemeral-volume-thb2m
+ '[' /target '!='  ]
+ '[' -f /target/csi-file1 ]
+ '[' -f /target/csi-file2 ]
+ '[' -d /target/csi-folder1 ]
+ '[' -f /target/csi-folder1/file ]
+ exit 0

@mugdha-adhav (Collaborator)

@woehrl01 I don't see any relevant errors in the logs you shared here.

Also, where is the error you shared in the issue description coming from?

@woehrl01 (Author)

@mugdha-adhav yes, you're right, the logs don't show any problems; that's why I asked if you have additional ideas on how to debug this.

The error I receive is a Kubernetes event on the pod, created by the kubelet:

 {  "event.firstTimestamp": "2024-02-14T19:03:22Z", "event.involvedObject.apiVersion": "v1", "event.involvedObject.fieldPath": "spec.containers{ng}", "event.involvedObject.kind": "Pod", "event.involvedObject.name": "h-8648c99fbf-wkgrg", "event.involvedObject.namespace": "j", "event.involvedObject.resourceVersion": "1600765086", "event.involvedObject.uid": "7b39946a-5be40ee06e49b7d", "event.lastTimestamp": "2024-02-14T19:03:34Z", "event.message": "(combined from similar events): Error: failed to generate container \"218b131a8748748b7ba121c4a2fd5a6b182659fcecdff0357bd106aa1b1fcfb4\" spec: failed to apply OCI options: relabel \"/var/lib/kubelet/pods/7b39946a-5be4-49c4-8f52-20ee06e49b7d/volumes/kubernetes.io~csi/source/mount\" with \"system_u:object_r:data_t:s0:c211,c621\" failed: lsetxattr /var/lib/kubelet/pods/7b39946a-5be4-49c4-8f52-20ee06e49b7d/volumes/kubernetes.io~csi/source/mount/var: read-only file system", "event.metadata.creationTimestamp": "2024-02-14T19:03:22Z", "event.metadata.managedFields[0].apiVersion": "v1", "event.metadata.managedFields[0].fieldsType": "FieldsV1", "event.metadata.managedFields[0].manager": "kubelet", "event.metadata.managedFields[0].operation": "Update", "event.metadata.managedFields[0].time": "2024-02-14T19:03:34Z", "event.metadata.name": "h8648c99fbf-wkgrg.17b3d004cd0490fd", "event.metadata.namespace": "p67747", "event.metadata.resourceVersion": "1600786275", "event.metadata.uid": "ac9784a8-f562-48d4-b9be-fe926e9a3c13", "event.reason": "Failed", "event.source.component": "kubelet", "event.source.host": "ip--.f.compute.internal", "event.type": "Warning", "integrationName": "kube_events", "integrationVersion": "2.8.1", "old_event.count": 1, "old_event.firstTimestamp": "2024-02-14T19:03:22Z", "old_event.involvedObject.apiVersion": "v1", "old_event.involvedObject.fieldPath": "spec.containers{ng}", "old_event.involvedObject.kind": "Pod", "old_event.involvedObject.name": "h-8648c99fbf-wkgrg", "old_event.involvedObject.namespace": "h", "old_event.involvedObject.resourceVersion": "1600765086", "old_event.involvedObject.uid": "7b39946a-5beee06e49b7d", "old_event.lastTimestamp": "2024-02-14T19:03:22Z", "old_event.message": "(combined from similar events): Error: failed to generate container \"be6c7b7e8c3d4a3386a047312d173f6a94490d461e3073d6205cc1cf888f8f24\" spec: failed to apply OCI options: relabel \"/var/lib/kubelet/pods/7b39946a-5be4-49c4-8f52-20ee06e49b7d/volumes/kubernetes.io~csi/source/mount\" with \"system_u:object_r:data_t:s0:c211,c621\" failed: lsetxattr /var/lib/kubelet/pods/7b39946a-5be4-49c4-8f52-20ee06e49b7d/volumes/kubernetes.io~csi/source/mount/var: read-only file system", "old_event.metadata.creationTimestamp": "2024-02-14T19:03:22Z", "old_event.metadata.managedFields[0].apiVersion": "v1", "old_event.metadata.managedFields[0].fieldsType": "FieldsV1", "old_event.metadata.managedFields[0].manager": "kubelet", "old_event.metadata.managedFields[0].operation": "Update", "old_event.metadata.managedFields[0].time": "2024-02-14T19:03:22Z", "old_event.metadata.name": "h-8648c99fbf-wkgrg.17b3d004cd0490fd", "old_event.metadata.namespace": "h", "old_event.metadata.resourceVersion": "1600782603", "old_event.metadata.uid": "ac9784a8-f562-48d4-b9b", "old_event.reason": "Failed", "old_event.source.component": "kubelet", "old_event.source.host": ".compute.internal", "old_event.type": "Warning", "summary": "(combined from similar events): Error: failed to generate container \"218b131a8748748b7ba121c4a2fd5a6b182659fcecdff0357bd106aa1b1fcfb4\" spec: failed to apply OCI 
options: relabel \"/var/lib/kubelet/pods/7b39946a-5be4-49c4-8f52-20ee06e49b7d/volumes/kubernetes.io~csi/source/mount\" with \"system_u:object_r:data_t:s0:c211,c621\" failed: lsetxattr /var/lib/kubelet/pods/7b39946a-5be4-49c4-8f52-20ee06e49b7d/volumes/kubernetes.io~csi/source/mount/var: read-only file system", "timestamp": 1707937414000, "verb": "UPDATE"

woehrl01 (Author) commented Feb 16, 2024

I just found the following related issue for a different CSI driver on Bottlerocket; it looks like the problem is related to SELinux + hostPath mounts:

bottlerocket-os/bottlerocket#2556

A fix, which passes different mount labels, is described here:
bottlerocket-os/bottlerocket#2656 (comment)

@mugdha-adhav (Collaborator)

Interesting, it seems the issue is platform-specific. We could add a values-bottlerocket-selinux.yaml file to our charts and add support for passing volumes and volumeMounts parameters.

woehrl01 (Author) commented Feb 16, 2024

@mugdha-adhav I think it makes sense to add those different mount options to the Helm chart. I also think the context needs to be passed in the source code. If I see it correctly, the additional mount options need to be set here, right before the mount.All call:

func (s snapshotMounter) Mount(ctx context.Context, key backend.SnapshotKey, target backend.MountTarget, ro bool) error {
	mounts, err := s.snapshotter.Mounts(ctx, string(key))
	if err != nil {
		klog.Errorf("unable to retrieve mounts of snapshot %q: %s", key, err)
		return err
	}
	err = mount.All(mounts, string(target))
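
Purely as a sketch (this is not existing driver code, and it assumes that mounting with an explicit SELinux context= option, as suggested in the Bottlerocket issue above, is the right fix; the label value and how it gets configured are open questions), a hypothetical helper could label the mounts before they are passed to mount.All:

package mounter

import (
	"github.com/containerd/containerd/mount"
)

// withSELinuxContext appends a context= option to every mount returned by the
// snapshotter, so the filesystem is labelled at mount time and the kubelet no
// longer needs to relabel (lsetxattr) the read-only layers afterwards.
func withSELinuxContext(mounts []mount.Mount, label string) []mount.Mount {
	opt := `context="` + label + `"`
	for i := range mounts {
		mounts[i].Options = append(mounts[i].Options, opt)
	}
	return mounts
}

The call in Mount would then become something like err = mount.All(withSELinuxContext(mounts, label), string(target)), with the label supplied through a flag or Helm value.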

@mugdha-adhav (Collaborator)

@woehrl01 I would appreciate your help with sending out a PR for this. Please let me know if you want me to assign this issue to you.

haydn-j-evans commented May 10, 2024

Hi,

We had the same issue when trying to mount a Loki PVC on Bottlerocket 1.29.

Loki sets its data chunks to read-only after it has finished writing to them; I assume this is what is actually causing the failure (files in the r/w mount are read-only, so SELinux cannot relabel them).

(screenshot attached)

github-actions bot commented Aug 9, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs.

github-actions bot added the Stale label on Aug 9, 2024
mugdha-adhav (Collaborator) commented Aug 15, 2024

@woehrl01 (or anyone else) would you be interested in sending a fix for this? I haven't worked with bottlerocket yet, so it might take me a bit longer to get the fix out.

mugdha-adhav added the help-wanted label and removed the Stale label on Aug 15, 2024