CSI plugin functionality is broken: "attacher.Attach failed: volumeattachments.storage.k8s.io is forbidden" #732
I tried with Azure's hostpath driver from https://github.com/Azure/kubernetes-volume-drivers/tree/master/csi/hostpath and got the same message on k3s:
0.6.1: works |
Same problem here 😔 |
As a temporary workaround, are there any changes I can make to permissions or something to make it work? Thanks |
I don't know much about authentication and authorization yet... but I was playing around a little and got a volume attached and mounted by doing this:
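(The exact change was lost from this comment in export; judging from the rest of the thread, a plausible reconstruction of the workaround is adding a rule like the following to a ClusterRole bound to the nodes group. The rule below is a hedged sketch, not the commenter's verbatim change:)

```yaml
# Hypothetical reconstruction of the workaround described above:
# grant the system:nodes group permission to manage VolumeAttachments.
rules:
- apiGroups:
  - storage.k8s.io
  resources:
  - volumeattachments
  verbs:
  - create
  - get
  - list
  - watch
```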
I have no idea of what I'm doing but with this change it works LOL :D the volume is mounted and working correctly. Is this something that should be done in K3s? |
@vitobotta I see two possible issues with your workaround (which isn't to say you shouldn't use it, just FYI):
If those are acceptable trade-offs, then I think you're good to go (no guarantees though: I may be missing some other reason) |
Hi @costela , like I said I had no idea of what I was doing :D Anyway it's just me using my small clusters, so I would be worried about the first point only. I'm going to try and restart k3s and see if I lose the changes I made. |
Just tried restarting K3s on both master and workers, and the changes are still there and volumes still attach fine. |
@vitobotta cool, so auto-reconciliation isn't an issue. As for your other question:
Probably not. The |
I checked on another cluster deployed with Rancher (not K3s) and I see the same thing about that group. So if this isn't something that should be fixed in k3s... where should it? |
@vitobotta My current bet is on the service-account token auth. I suspect the |
@vitobotta the link above also says
so the auto-reconciliation issue would be easily avoidable if ever present. But it's a hack ;) |
I agree, I think we can pin the issue on something k3s-specific. I also dug a bit through the Kubernetes codebase and the process goes like this: the attachdetach-controller finds and initializes the csiPlugin via the VolumePluginMgr, passing its credentials to the plugin. The csiAttacher of the csiPlugin then attempts to create a volumeattachment in this line: https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/csi/csi_attacher.go#L105 and fails in k3s. |
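For context, the object the csiAttacher tries to create looks roughly like this (a sketch; the attacher, node, and PV names are made up, and at the time of this thread the API version may still have been `storage.k8s.io/v1beta1`):

```yaml
# Sketch of the VolumeAttachment object created by the csiAttacher.
# attacher, nodeName, and persistentVolumeName are illustrative values.
apiVersion: storage.k8s.io/v1
kind: VolumeAttachment
metadata:
  name: csi-1234567890abcdef
spec:
  attacher: csi.example.com        # name of the CSI driver (assumption)
  nodeName: worker-1               # node the volume should attach to
  source:
    persistentVolumeName: pvc-abc  # PV backing the claim
```

This is the `create volumeattachments` request that the "forbidden" error in the issue title refers to.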
The thing is that I need to create a couple of clusters for development asap and I would like to use K3s because it's cheaper to run. I have spent quite a bit of time so far learning Kubernetes (I'm still a noob sadly) so I would like to go back to coding... Is it safe to use the hack I posted earlier, or do I risk, for example, that future versions of K3s might break it? @ilyasotkov from what you say it seems that the problem is with K3s, if vanilla Kubernetes has all the correct permissions. Right? |
@vitobotta It's without doubt a k3s issue because I tried it with a vanilla Kubernetes cluster (installed via kubespray / kubeadm on identical Hetzner Cloud infrastructure) and didn't face any similar issue. |
Yeah I tried it with Kubernetes deployed with Rancher and didn't have any problems. |
Is there a solution which can be incorporated into k3s to resolve the problem in the suggested workaround? I couldn't find the source where k3s does magic stuff to initialize the attacher with the wrong service account. Any hint where I can find this? |
@DracoBlue if we knew exactly where the error was, it would probably be fixed already 😉 As for the workaround problem: no, I don't think there's a solution for it, other than fixing the |
I'm also seeing this issue with the DigitalOcean CSI driver. I actually had it all working, then destroyed my instance and started over again for automation, and now it doesn't work. Based on the releases page I believe I was using 0.7.0. So it must have broken in the 0.8.0 release. |
This appears to be broken in v0.7.0. That version introduced a certs refactor which should utilize node authorization (https://kubernetes.io/docs/reference/access-authn-authz/node/). I am not sure if kubespray uses node authorization, but from a quick look it appears not. |
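For comparison, kubeadm-provisioned clusters typically enable the Node authorizer and the NodeRestriction admission plugin via apiserver flags like these (illustrative fragment based on common kubeadm defaults, not taken from this thread):

```shell
# Typical kube-apiserver flags on a kubeadm cluster (illustrative):
kube-apiserver \
  --authorization-mode=Node,RBAC \
  --enable-admission-plugins=NodeRestriction \
  ...
```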
The NodeRestriction docs at https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#noderestriction say:
If I understand your responses correctly, the right way would be to ensure there is an extra service account for the controller.

Looks like the authorization is created here: https://github.com/kubernetes/kubernetes/blob/9c973c6d2c33e88521ebcebec2fdd9cbddccd857/pkg/controller/client_builder.go#L111

In k8s the account is created here: https://github.com/kubernetes/kubernetes/blob/22ff2673249d39f4559aa16f985c15cb4b66488c/plugin/pkg/auth/authorizer/rbac/bootstrappolicy/testdata/controller-role-bindings.yaml#L17
I will try to prepare a PR with such changes! PS: I installed 0.6.1 and retried it. Worked out of the box. |
Thanks for looking into this @DracoBlue! Were you able to make any progress? It does seem related to the attach/detach controller logic somehow; I found this issue piraeusdatastore/linstor-csi#4 which has a similar error when that controller is disabled. The controller does run and an attachdetach-controller service account exists on the system, though. |
Turning on more debugging with
It looks like the node authorizer is hard coded to reject the request: https://github.com/kubernetes/kubernetes/blob/v1.14.6/plugin/pkg/auth/authorizer/node/node_authorizer.go#L161 |
Similarly we are running with the node authorizer. I think, like @vitobotta found, an RBAC rule is going to be the easiest fix at the moment. Instead of modifying an existing role:

```shell
kubectl apply -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: system:nodes:volumeattachments
rules:
- apiGroups:
  - storage.k8s.io
  resources:
  - volumeattachments
  verbs:
  - create
  - watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: system:nodes:volumeattachments
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:nodes:volumeattachments
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:nodes
EOF
```
@erikwilson yup! So I won't try to fix this at the other end. The ClusterRoleBinding will be shipped with k3s then, and CSI should be usable with >0.6.1 again ;) |
The service account is being created but not used, which is the core of the problem. The request (create volumeattachment) in k3s is coming from |
I agree that it is weird that |
This has something to do with us running a combined binary, it looks like some CSI operations are being picked up by the kubelet instead of the attachdetach controller. If you run the server with
The issue appears to be that the combined binary shares client credentials between components. We tried to work around it with 4ce15bb, which was okay with the previous node permissions, but with the node authorizer it is now having issues. Basically there are three global variables in the upstream code that get in the way. Still trying to figure out the best way to approach the problem; unfortunately getting rid of those global variables appears not to be easy. Suggestions welcome! :) |
This will be fixed with the next release and using k8s v1.15. The |
Hi, is there a planned date for the next release? Thanks! |
Yes! I just tried 0.9.0 RC2 and I didn't need the hack this time. Thanks! :) |
Closing as this has been fixed in the v0.9.0 release. |
FWIW I ran across this issue today. I had a test v1.19.7+k3s1 cluster running Longhorn, and for reasons decided to redeploy. One of the pods kept failing with the attach error above. I modified and pushed a new PVC, adding a "2" to the end of the name. Finally, I edited the deployment config's volume claim name back to the original, and it appears to be attaching again now (logs appear normal). I have no idea how things got into this state, but I hope this helps other folks who come across this problem. |
I also hit the same problem using Longhorn. Applying the fix as mentioned, by creating the yaml file and deleting/reapplying the deployment, brought back the affected pods, which resolved the problem. No data loss suffered. All other pods which use |
I faced the same problem, but unfortunately granting more rights regarding volume attachment to the cluster role, as suggested by @vitobotta above, did not help to solve the issue. There were 2 pods that raised this error. Hope this will help. |
Running Longhorn on a K3s 1.24.9 cluster and woke up to this issue, you just saved my morning/day! 🙌 |
Bug
When installing a CSI Driver, PVCs and PVs are created but will not attach to a pod with this error:
To Reproduce
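(The original reproduction steps were lost in export. A minimal sketch, assuming any CSI driver installed on an affected k3s version, with made-up StorageClass, image, and resource names, would be applying something like the following and watching the pod stay stuck in ContainerCreating:)

```yaml
# Hypothetical minimal reproduction: a PVC bound to a CSI-backed
# StorageClass plus a pod that mounts it. All names are illustrative.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: test-pvc
spec:
  accessModes: ["ReadWriteOnce"]
  storageClassName: my-csi-sc   # assumed CSI StorageClass name
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: test-pod
spec:
  containers:
  - name: app
    image: busybox
    command: ["sleep", "3600"]
    volumeMounts:
    - mountPath: /data
      name: vol
  volumes:
  - name: vol
    persistentVolumeClaim:
      claimName: test-pvc
```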
Expected behavior
PVs are able to attach to a pod without issue
Additional context
The issue has been reported for multiple CSI drivers and all things point to k3s. I can personally confirm that this works normally when used with a kubespray cluster.
hetznercloud/csi-driver: hetznercloud/csi-driver#46
longhorn CSI driver: https://forums.rancher.com/t/longhorn-on-k3s-pv-attach-error/14920