I've upgraded to Trident v23.04 via helm and got the following error in the trident-controller pod in main container:
chmod /plugin/: operation not permitted
leaving the pod in CrashLoopBackOff.
The article [1] says it may be related to the helm chart and I should install via tridentctl.
First try
- I removed trident v23.04 via helm
- I reinstalled trident v23.04 via tridentctl
- Nothing changed
Trial and Error
- I disabled SELinux via
setentforce permissive
- I deleted CRD TridentVersion instances and tried many times to reinstall trident.
- I tried fixing permissions in
/var/lib/kubelet/pods/<pod-uuid>/volumes/kubernetes.io~empty-dir/socket-dir on the node
- Nothing changed
Downgrade
- I uninstalled trident v23.04 via tridentctl
- I installed trident v22.10 via helm
- trident-csi Pod was running but the DeamonSet
trident-csi-node wasn't running anymore:
Error creating: pods "trident-csi-" is forbidden: unable to validate against any security context constraint.
Another chance
- I uninstalled trident v22.10 via helm
- I installed trident v22.10 via tridentctl
- Nothing changed
Solution
- I copied the SCCs
trident-node-linux and trident-controller from a working cluster
- I added the SCCs to the trident namespaces as described in article [2]
- DeamonSet trident-csi v22.10 came up
Upgrade
- I uninstalled trident v22.10 via tridentctl
- I installed trident v23.04 via tridentctl
- Still working
Helm Upgrade
- I uninstalled trident v23.04 via tridentctl
- I installed trident v23.04 via helm
- Still working, error is finally gone
Possible reason
- tridentctl nor helm are installing missing SCCs
- tridentctl nor helm are not detecting clean installations
References
[1] https://kb.netapp.com/Cloud/Astra/Trident/Upgrade_of_Trident_fails_pod_communications_with_plug-in_dir_chmod_error
[2] https://kb.netapp.com/Cloud/Astra/Trident/The_trident-csi_pods_are_not_rebuilt%2C_erroring_on_security_constraints_in_Openshift
Environment
- Trident version: v23.04, v22.10
- Trident installation flags used:
tridentctl install -n trident
- Kubernetes version: v1.24
- Kubernetes orchestrator: OKD 4.11.x
I've upgraded to Trident v23.04 via helm and got the following error in the
trident-controllerpod inmaincontainer:leaving the pod in CrashLoopBackOff.
The article [1] says it may be related to the helm chart and I should install via
tridentctl.First try
Trial and Error
setentforce permissive/var/lib/kubelet/pods/<pod-uuid>/volumes/kubernetes.io~empty-dir/socket-diron the nodeDowngrade
trident-csi-nodewasn't running anymore:Another chance
Solution
trident-node-linuxandtrident-controllerfrom a working clusterUpgrade
Helm Upgrade
Possible reason
References
[1] https://kb.netapp.com/Cloud/Astra/Trident/Upgrade_of_Trident_fails_pod_communications_with_plug-in_dir_chmod_error
[2] https://kb.netapp.com/Cloud/Astra/Trident/The_trident-csi_pods_are_not_rebuilt%2C_erroring_on_security_constraints_in_Openshift
Environment
tridentctl install -n trident