
[FEATURE] add volume attribute for filesystem owner #1165

Closed
dmayle opened this issue Apr 8, 2020 · 7 comments
Labels
area/kubernetes Kubernetes related like K8s version compatibility component/longhorn-manager Longhorn manager (control plane) kind/feature Feature request, new feature priority/2 Nice to implement or fix in this release (managed by PO) require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated require/doc Require updating the longhorn.io documentation
Milestone

Comments

@dmayle
Contributor

dmayle commented Apr 8, 2020

Is your feature request related to a problem? Please describe.
When provisioning volumes (dynamically or otherwise), they are unusable in non-root containers because the default ownership and mode are root:root 0755. Volumes provisioned this way therefore need additional configuration steps before a non-root container can use them.

Describe the solution you'd like
I'd like to see CSI volume attributes that can set the initial ownership and permissions, something like 'initialOwner' and 'initialPermissions'.
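As an illustrative sketch only: these attribute names are the proposal in this issue, not an existing Longhorn API. One way they could be surfaced is as StorageClass parameters that the CSI driver passes through as volume attributes (`driver.longhorn.io` and `numberOfReplicas` are real Longhorn identifiers; `initialOwner` and `initialPermissions` are hypothetical):

```yaml
# Hypothetical sketch: initialOwner/initialPermissions do not exist in
# Longhorn today; they illustrate how the proposed attributes might look.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: longhorn-nonroot
provisioner: driver.longhorn.io
parameters:
  numberOfReplicas: "3"
  # Proposed attributes, applied once, right after the initial format:
  initialOwner: "1000:1000"    # uid:gid for the filesystem root
  initialPermissions: "0770"   # mode for the filesystem root
```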

Describe alternatives you've considered
I have a workaround where I use an init container to check if the volume is empty, and then set the permissions.
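A minimal sketch of that kind of workaround, assuming a volume mounted at /data and a target uid/gid of 1000 (all names here are illustrative, not from this issue):

```yaml
# Illustrative init container: chown the volume before the main
# (non-root) container starts. Runs as root only for this one step.
initContainers:
  - name: fix-volume-perms
    image: busybox:1.36
    command:
      - sh
      - -c
      # Only touch the volume if it still looks freshly formatted
      # (nothing but lost+found), so existing data is not re-chowned.
      - |
        if [ -z "$(ls -A /data | grep -v lost+found)" ]; then
          chown -R 1000:1000 /data && chmod 0770 /data
        fi
    volumeMounts:
      - name: data
        mountPath: /data
```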

Additional context
The initial format happens in https://github.com/longhorn/longhorn-manager/blob/master/csi/node_server.go#L81, where k8s.io/util/mount is used to transparently format the volume on first mount. Since that package neither supports this feature nor reports back whether it had to format the volume, it would either have to be changed or replaced with an alternative implementation.

The source that matters in k8s.io/util/mount is the SafeFormatAndMount.FormatAndMount path, which silently runs mkfs when it finds no existing filesystem on the device.

@yasker yasker added area/kubernetes Kubernetes related like K8s version compatibility component/longhorn-manager Longhorn manager (control plane) kind/feature Feature request, new feature labels Apr 8, 2020
@Rancheroo

Rancheroo commented May 5, 2020

I have this issue as well when running containers as a non-root, limited-access user: the container cannot access the mount and fails with Permission denied.

I have tried setting the security context, but that did not get me far.
https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#set-the-security-context-for-a-pod

The above post is very useful and I will give the workaround a try. However, this would be a useful built-in feature: root shouldn't be the default user in a container, and I would expect many people to hit this once Longhorn is widely used in the enterprise.

@yasker yasker added this to the v1.1.0 milestone May 7, 2020
@yasker yasker added require/auto-e2e-test Require adding/updating auto e2e test cases if they can be automated require/doc Require updating the longhorn.io documentation labels Jun 12, 2020
@yasker yasker added the priority/2 Nice to implement or fix in this release (managed by PO) label Jul 22, 2020
@yasker yasker modified the milestones: v1.1.0, v1.1.1 Oct 1, 2020
@virus2016

This might be the issue with my MongoDB instances. Randomly, after we take a snapshot or backup, the pod crashes and cannot start back up because a file is locked and the user trying to access it cannot remove the lock. As soon as I delete the pod, Mongo starts up perfectly.

Does anyone else have this problem?

@yasker
Member

yasker commented Oct 12, 2020

@virus2016 I am not sure you're experiencing the same issue here. Can you file a bug and provide the reproduction steps? If you can send us a support bundle (see the footer of the UI), that would be even better.

@parsley42

I'm definitely experiencing the same issue. For other folks who hit this, the fix for me was just setting fsGroup: x in the podSecurityContext (for me, x=1 -> "daemon"). When this is set, the kubelet recursively updates group permissions on the mount.
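For reference, the fsGroup fix described above looks roughly like this in the pod spec (gid 1 matches the "daemon" example; the container name, image, and mount path are illustrative):

```yaml
# Pod-level security context: with fsGroup set, the kubelet recursively
# changes the group and group permissions of the volume contents on mount.
securityContext:
  fsGroup: 1
containers:
  - name: app
    image: example/app:latest   # illustrative image name
    volumeMounts:
      - name: data
        mountPath: /data
```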

@yasker yasker modified the milestones: v1.1.1, Planning Oct 16, 2020
@virus2016

> @virus2016 I am not sure you're experiencing the same issue here. Can you file a bug and provide the reproduce steps? If you can send us a support bundle (see the footer of UI) that would be even better.

Yes, I will do. After a couple of weeks looking into this, it is definitely a permissions issue specific to Longhorn. I have run 3 deployments of a MongoDB replica set:

  • no volumes attached
  • digitalocean volumes attached
  • longhorn volumes attached

I am using the Rancher MongoDB replica set template in Apps. All run perfectly apart from the Longhorn one.

At a random time, one MongoDB pod with Longhorn-attached storage gets into a crash loop with the following:

```
2020-10-25T10:56:46.490+0000 I STORAGE  [main] Engine custom option: cache_size=600M
2020-10-25T10:56:46.513+0000 I CONTROL  [initandlisten] MongoDB starting : pid=1 port=27017 dbpath=/data/db 64-bit host=appdrift-mongodb-replicaset-2
2020-10-25T10:56:46.513+0000 I CONTROL  [initandlisten] db version v3.6.14
2020-10-25T10:56:46.513+0000 I CONTROL  [initandlisten] git version: cbef87692475857c7ee6e764c8f5104b39c342a1
2020-10-25T10:56:46.513+0000 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.2g  1 Mar 2016
2020-10-25T10:56:46.513+0000 I CONTROL  [initandlisten] allocator: tcmalloc
2020-10-25T10:56:46.513+0000 I CONTROL  [initandlisten] modules: none
2020-10-25T10:56:46.513+0000 I CONTROL  [initandlisten] build environment:
2020-10-25T10:56:46.513+0000 I CONTROL  [initandlisten]     distmod: ubuntu1604
2020-10-25T10:56:46.513+0000 I CONTROL  [initandlisten]     distarch: x86_64
2020-10-25T10:56:46.513+0000 I CONTROL  [initandlisten]     target_arch: x86_64
2020-10-25T10:56:46.513+0000 I CONTROL  [initandlisten] options: { config: "/data/configdb/mongod.conf", net: { bindIp: "0.0.0.0", port: 27017 }, replication: { replSet: "appdriftlive" }, security: { authorization: "enabled", keyFile: "/data/configdb/key.txt" }, storage: { dbPath: "/data/db", wiredTiger: { engineConfig: { configString: "cache_size=600M" } } } }
2020-10-25T10:56:46.513+0000 I STORAGE  [initandlisten] exception in initAndListen: DBPathInUse: Unable to create/open the lock file: /data/db/mongod.lock (Read-only file system). Ensure the user executing mongod is the owner of the lock file and has the appropriate permissions. Also make sure that another mongod instance is not already running on the /data/db directory, terminating
2020-10-25T10:56:46.513+0000 I CONTROL  [initandlisten] now exiting
2020-10-25T10:56:46.513+0000 I CONTROL  [initandlisten] shutting down with code:100
```

Soon after, the other pods follow.

I have created issue #1909.

Thanks for your help so far - I love Longhorn!

@samip5

samip5 commented Sep 15, 2021

@dmayle Could you please share your workaround?

Longhorn is definitely not respecting podSecurityContext, even with version 1.2.0 of Longhorn.

@yasker
Member

yasker commented Sep 15, 2021

@samip5 v1.2.0 has a bug in the security context. Can you check #2964 for a workaround?

I will close this issue since it's a year old. Feel free to file a new issue if #2964 is not the problem you're experiencing.

Development

No branches or pull requests

7 participants