
CSI plugin functionality is broken: "attacher.Attach failed: volumeattachments.storage.k8s.io is forbidden" #732

Closed
ilyasotkov opened this issue Aug 12, 2019 · 36 comments
Labels
kind/bug Something isn't working

Comments

@ilyasotkov

Bug

When installing a CSI driver, PVCs and PVs are created, but volumes will not attach to a pod, failing with this error:

E0727 14:03:55.220699    2205 csi_attacher.go:93] kubernetes.io/csi: attacher.Attach failed: volumeattachments.storage.k8s.io is forbidden: User "system:node:master1" cannot create resource "volumeattachments" in API group "storage.k8s.io" at the cluster scope: can only get individual resources of this type

To Reproduce

Expected behavior
PVs are able to attach to a pod without issue

Additional context
The issue has been reported for multiple CSI drivers and all things point to k3s. I can personally confirm that this works normally when used with a kubespray cluster.

hetznercloud/csi-driver: hetznercloud/csi-driver#46
longhorn CSI driver: https://forums.rancher.com/t/longhorn-on-k3s-pv-attach-error/14920

@DracoBlue

DracoBlue commented Aug 12, 2019

I tried with Azure's hostpath driver from https://github.com/Azure/kubernetes-volume-drivers/tree/master/csi/hostpath and got the same message on k3s:

AttachVolume.Attach failed for volume "kubernetes-dynamic-pv-42256afbbc7411e9" : volumeattachments.storage.k8s.io is forbidden: User "system:node:k3s-test" cannot create resource "volumeattachments" in API group "storage.k8s.io" at the cluster scope: can only get individual resources of this type
Unable to mount volumes for pod "my-csi-app_default(3ac675b9-bc74-11e9-a41e-96000029eacb)": timeout expired waiting for volumes to attach or mount for pod "default"/"my-csi-app". list of unmounted volumes=[my-csi-volume]. list of unattached volumes=[my-csi-volume default-token-8d8kh]

0.6.1: works
0.7.x: broken
0.8.x: broken

@vitobotta

Same problem here 😔

@vitobotta

As a temporary workaround, are there any changes I can make to permissions or something to make it work? Thanks

@vitobotta

vitobotta commented Aug 12, 2019

I don't know much about authentication and authorization yet... but I was playing a little and got a volume attached and mounted by doing this:

  1. Because the system:node ClusterRole had only the verb 'get' for 'volumeattachments', I added 'create', 'delete', 'patch', 'update', 'list' and 'watch' after seeing what's in the other sections of the role...

  2. I edited the ClusterRoleBinding for system:node and since there were no subjects I tried adding these:

subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:nodes

I have no idea what I'm doing, but with this change it works LOL :D The volume is mounted and working correctly. Is this something that should be done in K3s?
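For reference, a rough sketch of those two edits with kubectl (assuming the default system:node ClusterRole/ClusterRoleBinding names; verify against your own cluster before applying):

# Sketch of the workaround above; kubectl edit opens the object in your editor
kubectl edit clusterrole system:node           # add create/delete/patch/update/list/watch to the volumeattachments rule
kubectl edit clusterrolebinding system:node    # add the system:nodes Group subject shown above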

@costela

costela commented Aug 12, 2019

@vitobotta I see two possible issues with your workaround (which isn't to say you shouldn't use it, just FYI):

  1. The default RBAC roles and bindings may get auto-reconciled by the apiserver, which could revert your changes.

  2. It may pose a security issue: every node can now "steal" volumes, which might be a problem depending on what you want to use your cluster for.

If those are acceptable trade-offs, then I think you're good to go (no guarantees though: I may be missing some other reason)

@vitobotta

Hi @costela, like I said, I had no idea what I was doing :D Anyway it's just me using my small clusters, so I would be worried about the first point only. I'm going to try restarting k3s and see if I lose the changes I made.

@vitobotta

Just tried restarting K3s on both master and workers, and the changes are still there and volumes still attach fine.

@costela

costela commented Aug 12, 2019

@vitobotta cool, so auto-reconciliation isn't an issue.

As for your other question:

Is this something that should be done in K3s?

Probably not. The system:nodes Group should not be used after k8s 1.8.
Still a valid workaround, though.

@vitobotta

I checked on another cluster deployed with Rancher (not K3s) and I see the same thing about that group. So if this isn't something that should be fixed in k3s... where should it be fixed?

@costela

costela commented Aug 12, 2019

@vitobotta The system:nodes Group is deprecated in favor of Node authorization and the NodeRestriction admission plugin. Both are on by default in k3s (and k8s, for that matter).
So this specific solution should not be done by k3s. Which isn't to say the final fix for this issue won't turn out to be in k3s. It just probably won't involve the system:nodes Group.

My current bet is on the service-account token auth. I suspect the attachdetach-controller isn't getting the credentials it should. Maybe related to the "embedded" way k3s runs it. Didn't get a chance to dig deep enough to confirm that yet, though 😞

@ilyasotkov
Author

@vitobotta the link above also says

To opt out of this reconciliation, set the rbac.authorization.kubernetes.io/autoupdate annotation on a default cluster role or rolebinding to false. Be aware that missing default permissions and subjects can result in non-functional clusters.

so the auto-reconciliation issue would be easily avoidable if ever present. But it's a hack ;)
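For illustration, opting the default binding out of reconciliation would look roughly like this (a sketch; "system:node" is the binding edited in the workaround above):

# Sketch: disable RBAC auto-reconciliation for a default ClusterRoleBinding, per the docs quoted above
kubectl annotate clusterrolebinding system:node \
  rbac.authorization.kubernetes.io/autoupdate=false --overwrite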

@ilyasotkov
Author

ilyasotkov commented Aug 12, 2019

@costela

My current bet is on the service-account token auth. I suspect the attachdetach-controller isn't getting the credentials it should. Maybe related to the "embedded" way k3s runs it. Didn't get a chance to dig deep enough to confirm that yet, though

I agree, I think we can pin the issue on the fact that in k3s system:node:<node_name> credentials are being used (authorized via NodeAuthorizer or via RBACAuthorizer in @vitobotta's hack) instead of the attachdetach-controller serviceaccount. I checked and in vanilla Kubernetes this service account has all the needed permissions on volumeattachments.

I also dug a bit through the Kubernetes codebase and the process goes like this:

attachdetach-controller finds and initializes the csiPlugin via the VolumePluginMgr, passing its credentials to the plugin. The csiAttacher of the csiPlugin would then attempt to create a volumeattachment in this line:

https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/csi/csi_attacher.go#L105

and fail in k3s.
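For context, the object that line tries to create is a VolumeAttachment, roughly like this (a sketch; all field values are placeholders):

# Roughly what the csiAttacher asks the API server to create (placeholder values)
apiVersion: storage.k8s.io/v1
kind: VolumeAttachment
metadata:
  name: csi-<hash of driver, PV and node>
spec:
  attacher: csi.example.com          # CSI driver name
  nodeName: worker-1                 # node the volume should attach to
  source:
    persistentVolumeName: pvc-0123   # PV backing the claim

It is this create call that gets rejected when it arrives authenticated as system:node:<node_name> instead of the controller's service account.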

@vitobotta

The thing is that I need to create a couple of clusters for development asap and I would like to use K3s because it's cheaper to run. I have spent quite a bit of time so far learning Kubernetes (I'm still a noob sadly) so I would like to go back to coding...

Is it safe to use the hack I posted earlier or do I risk for example that future versions of K3s might screw it up?

@ilyasotkov from what you say it seems that the problem is with K3s, if vanilla Kubernetes has all the correct permissions. Right?

@ilyasotkov
Author

@vitobotta It's without doubt a k3s issue because I tried it with a vanilla Kubernetes cluster (installed via kubespray / kubeadm on identical Hetzner Cloud infrastructure) and didn't face any similar issue.

@vitobotta

Yeah I tried it with Kubernetes deployed with Rancher and didn't have any problems.

@erikwilson erikwilson added the kind/bug Something isn't working label Aug 16, 2019
@DracoBlue

Is there a solution that can be incorporated into k3s to resolve:

it may pose a security issue: every node can now "steal" volumes, which might be a problem depending on what you want to use your cluster for.

in the suggested workaround? I couldn't find the source where k3s does magic stuff to initialize the attacher with the wrong service account. Any hint where I can find this?

@costela

costela commented Aug 23, 2019

@DracoBlue if we knew exactly where the error was, it would probably be fixed already 😉
k3s starts the controller-manager (which includes the attachdetach-controller) here. The controller-manager starts loading the service account here, if I'm not mistaken. Somewhere down this path lies our issue (or at least that's my current unverified theory).

As for the workaround problem: no, I don't think there's a solution for it, other than fixing the attachdetach-controller authentication (or whatever causes it as a side-effect).
I'd gladly be proven wrong, though!

@grumps

grumps commented Aug 23, 2019

I'm also seeing this issue with the Digital Ocean CSI driver. I actually had it all working, then destroyed my instance and started over again for automation, and now it doesn't work. Based on the releases page I believe I was using 0.7.0, so it must have broken in the 0.8.0 release.

@erikwilson
Contributor

This appears to be broken in v0.7.0. That version introduced a certs refactor which should utilize node authorization (https://kubernetes.io/docs/reference/access-authn-authz/node/). I am not sure if kubespray uses node authorization, but from a quick look it appears not.

@DracoBlue

DracoBlue commented Aug 25, 2019

The NodeRestriction docs at https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#noderestriction say:

kubelets must use credentials in the system:nodes group, with a username in the form system:node:<nodeName>

If I understand your responses correctly, the right way would be to ensure there is an extra service account for the attachdetach-controller instead of using the system:node:<nodeName> user. It gets initialized in k8s by https://github.com/kubernetes/kubernetes/blob/103e926604de6f79161b78af3e792d0ed282bc06/cmd/kube-controller-manager/app/controllermanager.go#L404 with a call to https://github.com/kubernetes/kubernetes/blob/9bae1bc56804db4905abebcd408e0f02e199ab93/cmd/kube-controller-manager/app/core.go#L250.

Looks like the authorization is created here https://github.com/kubernetes/kubernetes/blob/9c973c6d2c33e88521ebcebec2fdd9cbddccd857/pkg/controller/client_builder.go#L111

In k8s the role binding for this account is defined here https://github.com/kubernetes/kubernetes/blob/22ff2673249d39f4559aa16f985c15cb4b66488c/plugin/pkg/auth/authorizer/rbac/bootstrappolicy/testdata/controller-role-bindings.yaml#L17

# kubectl describe clusterrolebindings/system:controller:attachdetach-controller
Name:         system:controller:attachdetach-controller
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  rbac.authorization.kubernetes.io/autoupdate: true
Role:
  Kind:  ClusterRole
  Name:  system:controller:attachdetach-controller
Subjects:
  Kind            Name                     Namespace
  ----            ----                     ---------
  ServiceAccount  attachdetach-controller  kube-system
# kubectl describe clusterroles/system:controller:attachdetach-controller
Name:         system:controller:attachdetach-controller
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  rbac.authorization.kubernetes.io/autoupdate: true
PolicyRule:
  Resources                         Non-Resource URLs  Resource Names  Verbs
  ---------                         -----------------  --------------  -----
  volumeattachments.storage.k8s.io  []                 []              [create delete get list watch]
  events                            []                 []              [create patch update]
  nodes                             []                 []              [get list watch]
  csidrivers.storage.k8s.io         []                 []              [get list watch]
  persistentvolumeclaims            []                 []              [list watch]
  persistentvolumes                 []                 []              [list watch]
  pods                              []                 []              [list watch]
  nodes/status                      []                 []              [patch update]

I will try to prepare a PR with such changes!

PS: I installed 0.6.1 and retried it. Worked out of the box.

@erikwilson
Contributor

Thanks for looking into this @DracoBlue! Were you able to make any progress?

It does seem related to the attach detach controller logic somehow; I found this issue piraeusdatastore/linstor-csi#4 which has a similar error when that controller is disabled. The controller does run and an attachdetach-controller service account exists on the system, though.

@erikwilson
Contributor

Turning on more debugging with -v 9 shows that the node authorizer is attempted but falls back to RBAC:

I0825 21:22:34.910001   24619 node_authorizer.go:161] NODE DENY: k3s-1 &authorizer.AttributesRecord{User:(*user.DefaultInfo)(0xc004d41180), Verb:"create", Namespace:"", APIGroup:"storage.k8s.io", APIVersion:"v1", Resource:"volumeattachments", Subresource:"", Name:"", ResourceRequest:true, Path:"/apis/storage.k8s.io/v1/volumeattachments"}
I0825 21:22:34.910100   24619 rbac.go:118] RBAC DENY: user "system:node:k3s-1" groups ["system:nodes" "system:authenticated"] cannot "create" resource "volumeattachments.storage.k8s.io" cluster-wide
I0825 21:22:34.910113   24619 authorization.go:73] Forbidden: "/apis/storage.k8s.io/v1/volumeattachments", Reason: "can only get individual resources of this type"

It looks like the node authorizer is hard coded to reject the request: https://github.com/kubernetes/kubernetes/blob/v1.14.6/plugin/pkg/auth/authorizer/node/node_authorizer.go#L161
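One way to reproduce the denial without digging through logs is impersonation (a sketch; the node name is a placeholder):

# Sketch: ask the API server the same question as the impersonated node user
kubectl auth can-i create volumeattachments.storage.k8s.io \
  --as=system:node:k3s-1 \
  --as-group=system:nodes --as-group=system:authenticated
# on affected versions this should answer "no"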

@erikwilson
Contributor

Similarly we are running with --use-service-account-credentials=true for controller-manager and appear to be using the attach detach controller service account.

I think, as @vitobotta found, an RBAC rule is going to be the easiest fix at the moment. Instead of modifying the system:node role to allow creating volume attachments and binding system:nodes to it, I just created a new role and binding for system:nodes with minimal permissions:

kubectl apply -f - <<EOF

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:nodes:volumeattachments
rules:
- apiGroups:
  - storage.k8s.io
  resources:
  - volumeattachments
  verbs:
  - create
  - watch

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:nodes:volumeattachments
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:nodes:volumeattachments
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:nodes

EOF

@DracoBlue

@erikwilson yup! Then I won't try to fix this at the other end. The ClusterRoleBinding will be shipped with k3s then and CSI should be usable with >0.6.1 again ;)

@ilyasotkov
Author

Similarly we are running with --use-service-account-credentials=true for controller-manager and appear to be using the attach detach controller service account.

The service account is being created but not used, which is the core of the problem.

The request (create volumeattachment) in k3s is coming from system:node:<nodeName> but should be coming from the attachdetach-controller service account.
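A quick way to confirm the account and its default binding exist even though they aren't being used (a sketch; the names match the kubectl describe output above):

# Sketch: the controller's service account and binding are present...
kubectl -n kube-system get serviceaccount attachdetach-controller
kubectl get clusterrolebinding system:controller:attachdetach-controller
# ...yet the denied create arrives authenticated as system:node:<nodeName>, not as this account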

@erikwilson
Contributor

I agree that it is weird that system:node:<nodeName> auth is being used. Looking through the kubelet code, it defers to the attachdetach-controller for the attach, so I'm not sure how those credentials end up being used. I think that service account is being used for other stuff, because if you remove the role binding there will be other issues. I think the RBAC change is just a temporary stop-gap; I would like to get to the bottom of what is going on here.

@erikwilson
Contributor

This has something to do with us running a combined binary: it looks like some CSI operations are being picked up by the kubelet instead of the attachdetach controller. If you run the server with --disable-agent and run a separate agent, the CSI attach will work correctly. Stack trace of the attacher code:

github.com/rancher/k3s/vendor/k8s.io/kubernetes/pkg/volume/csi.(*csiAttacher).Attach(0xc006719e30, 0xc00669cc20, 0xc00453019b, 0x5, 0xc008cfeb30, 0x10, 0x10, 0x2f99aa0)
        /Users/erik/go/src/github.com/rancher/k3s/vendor/k8s.io/kubernetes/pkg/volume/csi/csi_attacher.go:90 +0x2d6
github.com/rancher/k3s/vendor/k8s.io/kubernetes/pkg/volume/util/operationexecutor.(*operationGenerator).GenerateAttachVolumeFunc.func2(0x8, 0x3557a98, 0xc004458d00, 0xc002a07720)
        /Users/erik/go/src/github.com/rancher/k3s/vendor/k8s.io/kubernetes/pkg/volume/util/operationexecutor/operation_generator.go:346 +0x9b
github.com/rancher/k3s/vendor/k8s.io/kubernetes/pkg/volume/util/nestedpendingoperations.(*nestedPendingOperations).Run.func1(0xc006f76200, 0xc002a07720, 0x4e, 0x0, 0x0, 0xc0087e18c0, 0xc00659de00, 0xc0087e1940, 0x38, 0x40, ...)
        /Users/erik/go/src/github.com/rancher/k3s/vendor/k8s.io/kubernetes/pkg/volume/util/nestedpendingoperations/nestedpendingoperations.go:143 +0x146
created by github.com/rancher/k3s/vendor/k8s.io/kubernetes/pkg/volume/util/nestedpendingoperations.(*nestedPendingOperations).Run
        /Users/erik/go/src/github.com/rancher/k3s/vendor/k8s.io/kubernetes/pkg/volume/util/nestedpendingoperations/nestedpendingoperations.go:130 +0x2ce
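A rough sketch of the split-process layout described above (the flag is as named in this comment; the server address and token are placeholders):

# Sketch of the --disable-agent workaround: server without the embedded kubelet...
k3s server --disable-agent &
# ...plus a separate agent process joined to it
k3s agent --server https://<server-ip>:6443 --token <node-token>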

@erikwilson
Contributor

The issue appears to be that basically csi_plugin.go is maintaining some global state that makes it difficult for us to run kubelet and the attachdetach controller in the same process.

We tried to work around it with 4ce15bb, which was okay with the previous node permissions but is now having issues with the node authorizer. Basically there are three global variables in the upstream csi_plugin.go code: csiDrivers, nim, and PluginHandler, which need to be made into instance variables for the kubelet and attachdetach routines (see note here).

Still trying to figure out the best way to approach the problem; unfortunately, getting rid of those global variables appears not to be easy. Suggestions welcome! :)

@erikwilson
Contributor

This will be fixed with the next release, which moves to k8s v1.15. The nim package variable is the main culprit here since it contains host information which is different for the attachdetach controller and the kubelet. In general we can say that package-level state like this is a bad idea; unfortunately, there is no good way for us to audit the code for variables like this. We could try forking to help isolate processes, but that will likely come with an additional memory cost.

@vitobotta

Hi, is there a planned date for the next release? Thanks!

@vitobotta

Yes! I just tried 0.9.0 RC2 and I didn't need the hack this time. Thanks! :)

@ilyasotkov
Author

Closing as this has been fixed in the v0.9.0 release.

@smartin015

FWIW I ran across this issue today - had a test v1.19.7+k3s1 cluster running Longhorn, and for reasons decided to run sudo service k3s restart and sudo service k3s-agent restart on each of my nodes in sequence.

One of the pods kept failing with MountVolume.WaitForAttach failed for volume "pvc-da55e64f-4928-4a78-bb15-2b811712a17d" : volume pvc-da55e64f-4928-4a78-bb15-2b811712a17d has GET error for volume attachment csi-03f2267d55258d7abc4cfa13778de5e53882fa7d96d2d1decb616cb26b9d1472: volumeattachments.storage.k8s.io "csi-03f2267d55258d7abc4cfa13778de5e53882fa7d96d2d1decb616cb26b9d1472" is forbidden: User "system:node:hostname" cannot get resource "volumeattachments" in API group "storage.k8s.io" at the cluster scope: no relationship found between node 'hostname' and this object

I modified and pushed a new PVC, adding a "2" to the end of the metadata.name field, then edited the deployment spec.template.spec.volumes.persistentVolumeClaim.claimName to match and pushed that... the pod came up just fine.

Finally, I edited the deployment config's volume claim name back to the original, and it appears to be attaching again now (logs appear normal, and I can kubectl exec into the container and read the mounted files which appear to be intact).

I have no idea how things got into this state - but I hope this helps other folks who come across this problem.
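A rough outline of those steps, with placeholder names:

# Sketch of the rename-and-repoint workaround above (all names are placeholders)
kubectl get pvc my-claim -o yaml > my-claim-2.yaml
# edit my-claim-2.yaml: change metadata.name to "my-claim-2" and drop status/uid/resourceVersion
kubectl apply -f my-claim-2.yaml
kubectl edit deployment my-app   # point spec.template.spec.volumes[].persistentVolumeClaim.claimName at "my-claim-2"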

@soakes

soakes commented Apr 5, 2022

Similarly we are running with --use-service-account-credentials=true for controller-manager and appear to be using the attach detach controller service account.

I think like @vitobotta found an RBAC rule is going to be the easiest fix at the moment. Instead of modifying the system:node role for create volume attachment and binding system:nodes to it I just created a new role and binding for system:nodes with minimal permissions:

kubectl apply -f - <<EOF

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:nodes:volumeattachments
rules:
- apiGroups:
  - storage.k8s.io
  resources:
  - volumeattachments
  verbs:
  - create
  - watch

---

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:nodes:volumeattachments
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:nodes:volumeattachments
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:nodes

EOF

I also hit the same problem using Longhorn 1.2.4 (latest to date) with k3s v1.22.8+k3s1, where ANY pod using Longhorn's RWX mode failed to start with the same error listed by @ilyasotkov. This happened when the node was rebooted for kernel updates; in this case the k3s node is just a standalone machine and not part of a cluster, so it could not push the pods to a surviving node.

Applying the fix as mentioned, by creating the yaml file and deleting/applying the deployment, brought back the affected pods and resolved the problem. No data loss suffered.

All other pods which use RWO mode were unaffected and did not crash, so likely there is a bug with the RWX implementation. In any case, for anyone suffering from this problem, the solution from @erikwilson worked. Thank you.

@palarconGit

I faced the same problem, but unfortunately giving more rights for volume attachment to the cluster role, as suggested by @vitobotta above, did not solve the issue.

There were 2 pods that raised this error.
I noticed that each pod was deployed on cluster node 2, but they tried to attach volumes that already existed and
were attached on cluster node 1.
I had the idea to force Kubernetes to deploy the pods on node 1 by adding cluster node 1 affinity to their deployment.
(Documentation here: https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes-using-node-affinity/)
After deleting the failing pods with a kubectl command, the pods were redeployed on cluster node 1 and
managed to attach the volumes.
This solved the issue for me.

Hope this will help.

@Sierra1011

I modified and pushed a new PVC, adding a "2" to the end of the metadata.name field, then edited the deployment spec.template.spec.volumes.persistentVolumeClaim.claimName to match and pushed that... the pod came up just fine.

Finally, I edited the deployment config's volume claim name back to the original, and it appears to be attaching again now (logs appear normal, and I can kubectl exec into the container and read the mounted files which appear to be intact).

I have no idea how things got into this state - but I hope this helps other folks who come across this problem.

Running Longhorn on a K3s 1.24.9 cluster and woke up to this issue; you just saved my morning/day! 🙌
