Describe the bug
After a fresh Trident installation and backend creation following the NetApp Trident Backend Configuration documentation, volumes are neither created on nor mounted from the NetApp backend (CVO).
Although the backend is created successfully, the expected volume is never created on it. Instead, a "magic" volume of type "tmpfs" is mounted in the pod for the PVC, rather than the expected shared volume from the NetApp backend.
The Trident operator, controller, and node pods fail to bind to the "ontap-nas" storage class and do not create a volume on the NetApp backend when the PV or PVC is created. Although the PV, PVC, and pod are created successfully, the NFS-shared NetApp volume is not mounted in the pod.
The trident-controller pod logs show errors and warnings that may be related to this issue.
Environment
Provide accurate information about the environment to help us reproduce the issue.
- Trident version: 23.10
- Trident installation flags used:
- Container runtime: containerd://1.6.18
- Kubernetes version: v1.28.1
- Kubernetes orchestrator: GKE v1.26.6-gke.1700
- Kubernetes enabled feature gates:
- OS: Container-Optimized OS from Google (Kernel Version: 5.15.107+)
- NetApp backend types: CVO NetApp Release 9.14.0: Sun Jul 30 11:19:35 UTC 2023
- Other:
To Reproduce
kubectl apply -f merged_manifests.yml -n trident
apiVersion: v1
kind: Secret
metadata:
  name: backend-tbc-ontap-nas-secret
type: Opaque
stringData:
  username: <redacted>
  password: <redacted>
---
apiVersion: trident.netapp.io/v1
kind: TridentBackendConfig
metadata:
  name: backend-tbc-ontap-nas
spec:
  version: 1
  backendName: ontap-nas-backend
  storageDriverName: ontap-nas
  managementLIF: <redacted>
  dataLIF: <redacted>
  svm: <svm>
  credentials:
    name: backend-tbc-ontap-nas-secret
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ontapnasudp
provisioner: csi.trident.netapp.io
mountOptions: ["rwx", "nfsvers=3", "proto=udp"]
parameters:
  backendType: "ontap-nas"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-storage
  labels:
    type: local
spec:
  storageClassName: ontapnasudp
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp/trident-test"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-storage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: ontapnasudp
---
kind: Pod
apiVersion: v1
metadata:
  name: pv-pod
spec:
  volumes:
    - name: pv-storage
      persistentVolumeClaim:
        claimName: pvc-storage
  containers:
    - name: pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/tmp/trident-test"
          name: pv-storage
Expected behavior
Upon successful creation of the Trident backend, it is expected that a volume would be created on the NetApp backend (CVO) that corresponds to any PV or PVC created in the Kubernetes cluster. The Trident operator, controller, and node pods should bind to the "ontap-nas" storage class and initiate the volume creation on the backend.
Once the PV, PVC, and pod are created, the NFS shared NetApp volume should be mounted in the pod and be visible when inspecting the pod's volume details. Thus, the expected behavior is a seamless creation and mounting of NetApp volumes in the Kubernetes pods through the Trident operator.
~/repo/ kubectl get tridentbackendconfigs -n trident-helm
NAME                    BACKEND NAME        BACKEND UUID                           PHASE   STATUS
backend-tbc-ontap-nas   ontap-nas-backend   54168d75-18e7-46e9-8b1a-d50a467c6aab   Bound   Success
~/repo/ kubectl get tridentbackendconfigs -n trident-helm -o yaml
apiVersion: v1
items:
- apiVersion: trident.netapp.io/v1
  kind: TridentBackendConfig
  metadata:
    creationTimestamp: "2023-11-30T11:06:38Z"
    finalizers:
    - trident.netapp.io
    generation: 1
    name: backend-tbc-ontap-nas
    namespace: trident-helm
    resourceVersion: "548627922"
    uid: d32f26e6-b942-4243-a9d3-007b0786c10d
  spec:
    backendName: ontap-nas-backend
    credentials:
      name: backend-tbc-ontap-nas-secret
    dataLIF: <redacted>
    managementLIF: <redacted>
    storageDriverName: ontap-nas
    svm: <redacted>
    version: 1
  status:
    backendInfo:
      backendName: ontap-nas-backend
      backendUUID: 54168d75-18e7-46e9-8b1a-d50a467c6aab
    deletionPolicy: delete
    lastOperationStatus: Success
    message: Backend 'ontap-nas-backend' created
    phase: Bound
kind: List
metadata:
  resourceVersion: ""
Storage Class:
~/repo/ kubectl get sc ontapnasudp -o yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"allowVolumeExpansion":true,"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{},"name":"ontapnasudp"},"mountOptions":["rwx","nfsvers=3","proto=udp"],"parameters":{"backendType":"ontap-nas"},"provisioner":"csi.trident.netapp.io","volumeBindingMode":"Immediate"}
  creationTimestamp: "2023-11-30T12:32:00Z"
  name: ontapnasudp
  resourceVersion: "548675493"
  uid: 29728bf2-dd12-45ac-9111-abe15a6c25f2
mountOptions:
- rwx
- nfsvers=3
- proto=udp
parameters:
  backendType: ontap-nas
provisioner: csi.trident.netapp.io
reclaimPolicy: Delete
volumeBindingMode: Immediate
Persistent Volume (PV)
~/repo/ kubectl get pv -n trident -o yaml pv-storage
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"PersistentVolume","metadata":{"annotations":{},"labels":{"type":"local"},"name":"pv-storage"},"spec":{"accessModes":["ReadWriteMany"],"capacity":{"storage":"10Gi"},"hostPath":{"path":"/tmp/trident-test"},"storageClassName":"ontapnasudp"}}
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2023-11-30T12:33:09Z"
  finalizers:
  - kubernetes.io/pv-protection
  labels:
    type: local
  name: pv-storage
  resourceVersion: "548676147"
  uid: 7519b4d8-8f29-4a09-bad8-412082192e20
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 10Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: pvc-storage
    namespace: trident
    resourceVersion: "548676145"
    uid: d2e363e4-79f4-4f79-bdb2-c4374e37e2ad
  hostPath:
    path: /tmp/trident-test
    type: ""
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ontapnasudp
  volumeMode: Filesystem
status:
  phase: Bound
Persistent Volume Claim (PVC)
~/repo/ kubectl get pvc -n trident -o yaml pvc-storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"pvc-storage","namespace":"trident"},"spec":{"accessModes":["ReadWriteMany"],"resources":{"requests":{"storage":"10Gi"}},"storageClassName":"ontapnasudp"}}
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2023-11-30T12:33:09Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: pvc-storage
  namespace: trident
  resourceVersion: "548676149"
  uid: d2e363e4-79f4-4f79-bdb2-c4374e37e2ad
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: ontapnasudp
  volumeMode: Filesystem
  volumeName: pv-storage
status:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 10Gi
  phase: Bound
The mount in pv-pod (the pod)
root@pv-pod:/# df -h | grep trid
tmpfs 3.9G 4.0K 3.9G 1% /tmp/trident-test
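As a cross-check of the df output above, the filesystem type of the mount can also be read from /proc/mounts inside the pod. The following is only a minimal sketch: the helper name mount_fstype is hypothetical (it is not part of Trident or kubectl), and it assumes the standard /proc/mounts field order of device, mount point, filesystem type.

```shell
# mount_fstype MOUNTPOINT
# Hypothetical helper: reads /proc/mounts-formatted lines on stdin and prints
# the filesystem type (third field) of every entry mounted at MOUNTPOINT.
mount_fstype() {
  awk -v mp="$1" '$2 == mp { print $3 }'
}

# Possible usage against the pod from this report (needs cluster access):
#   kubectl exec -n trident pv-pod -- cat /proc/mounts | mount_fstype /tmp/trident-test
```

An NFS-backed mount would be expected to report a type such as nfs or nfs4 here, whereas the mount in this report shows up as tmpfs.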
Additional context
The trident-controller pod logs show the following errors and additional information:
csi-attacher I1205 10:46:30.285060 1 connection.go:201] GRPC error: <nil>
csi-attacher I1205 10:47:30.287315 1 connection.go:201] GRPC error: <nil>
csi-attacher I1205 10:48:30.294277 1 connection.go:201] GRPC error: <nil>
trident-main time="2023-12-05T10:48:31Z" level=error msg="Trident-ACP version is empty." error="<nil>" logLayer=rest_frontend requestID=19bcda76-3268-4fa1-b7e1-6f2ae7ff0833 requestSource=REST workflow="core=version"
csi-attacher W1205 10:48:30.294433 1 csi_handler.go:173] Failed to repair volume handle for driver pd.csi.storage.gke.io: node handle has wrong number of elements; got 1, wanted 6 or more
csi-attacher I1205 10:48:30.294442 1 csi_handler.go:740] Found NodeID <redacted> in CSINode <redacted>
csi-attacher W1205 10:48:30.294461 1 csi_handler.go:173] Failed to repair volume handle for driver pd.csi.storage.gke.io: node handle has wrong number of elements; got 1, wanted 6 or more
trident-main time="2023-12-05T10:48:31Z" level=debug msg="REST API call received." Duration="10.192µs" Method=GET RequestURL=/trident/v1/version Route=GetVersion logLayer=rest_frontend requestID=19bcda76-3268-4fa1-b7e1-6f2ae7ff0833 requestSource=REST workflow="trident_rest=logger"
trident-main time="2023-12-05T10:48:31Z" level=debug msg="Getting Trident-ACP version." logLayer=rest_frontend requestID=19bcda76-3268-4fa1-b7e1-6f2ae7ff0833 requestSource=REST workflow="core=version"
trident-main time="2023-12-05T10:48:31Z" level=warning msg="ACP is not enabled." logLayer=rest_frontend requestID=19bcda76-3268-4fa1-b7e1-6f2ae7ff0833 requestSource=REST workflow="core=version"
trident-main time="2023-12-05T10:48:31Z" level=error msg="Trident-ACP version is empty." error="<nil>" logLayer=rest_frontend requestID=19bcda76-3268-4fa1-b7e1-6f2ae7ff0833 requestSource=REST workflow="core=version"
trident-main time="2023-12-05T10:48:31Z" level=debug msg="REST API call complete." Duration="978.427µs" Method=GET RequestURL=/trident/v1/version Route=GetVersion StatusCode=200 logLayer=rest_frontend requestID=19bcda76-3268-4fa1-b7e1-6f2ae7ff0833 requestSource=REST workflow="trident_rest=logger"
trident-main time="2023-12-05T10:48:37Z" level=debug msg="Node updated in cache." logLayer=csi_frontend name=<redacted> requestID=7a27627a-c480-446c-a35a-9addc41b1692 requestSource=Kubernetes workflow="node=update"
trident-main time="2023-12-05T10:48:37Z" level=debug msg="Node updated in cache." logLayer=csi_frontend name=<redacted> requestID=d33b8f54-6142-4e3d-bc0b-f9458a7c37ec requestSource=Kubernetes workflow="node=update"
trident-main time="2023-12-05T10:48:37Z" level=warning msg="K8S helper has no record of the updated storage class; instead it will try to create it." logLayer=csi_frontend name=ontapnasudp parameters="map[backendType:ontap-nas]" provisioner=csi.trident.netapp.io requestID=c2f7f1ba-2333-4d73-b142-ac072a9ef5fd requestSource=Kubernetes workflow="storage_class=update"
trident-main time="2023-12-05T10:48:37Z" level=debug msg="Node updated in cache." logLayer=csi_frontend name=gke-e-infra-gke-e-infra-gke-b083af15-zn8s requestID=86a4ae35-eeb7-4340-a415-929d93b662cf requestSource=Kubernetes workflow="node=update"
trident-main time="2023-12-05T10:48:37Z" level=debug msg="Node updated in cache." logLayer=csi_frontend name=gke-e-infra-gke-e-infra-gke-88607822-luiw requestID=8ea3a1cc-0807-4ae3-893e-ddb8f4ed4766 requestSource=Kubernetes workflow="node=update"
trident-main time="2023-12-05T10:48:37Z" level=warning msg="K8S helper could not add a storage class: object is being deleted: tridentstorageclasses.trident.netapp.io "ontapnasudp" already exists" logLayer=csi_frontend name=ontapnasudp parameters="map[backendType:ontap-nas]" provisioner=csi.trident.netapp.io requestID=c2f7f1ba-2333-4d73-b142-ac072a9ef5fd requestSource=Kubernetes workflow="storage_class=update"
trident-main time="2023-12-05T10:48:49Z" level=debug msg="Node updated in cache." logLayer=csi_frontend name=<redacted> requestID=e3db19b3-4afd-441d-85dd-37022ed9831b requestSource=Kubernetes workflow="node=update"