Issue creating a volume and mounting to a pod in GKE with Trident  #874

@andreasvandaalen

Describe the bug
Following a fresh Trident installation and backend creation according to the NetApp Trident Backend Configuration, we had an issue with volume creation and mounting on the NetApp backend (CVO).

Despite the backend's successful creation, the expected volume is never created on the backend. Instead, a "magic" volume appears mounted in the pod as the PVC; it is of type "tmpfs" rather than the anticipated shared volume from the NetApp backend.

The Trident operator, controller, and node pods fail to bind to the "ontap-nas" storage class and do not create a volume on the NetApp backend upon PV or PVC creation. Although the PV, PVC, and pod are successfully created, the NFS shared NetApp volume is not displayed in the pod.

Trident-controller pod logs show errors and warnings potentially related to this issue.

Environment
Provide accurate information about the environment to help us reproduce the issue.

  • Trident version: 23.10
  • Trident installation flags used:
  • Container runtime: containerd://1.6.18
  • Kubernetes version: v1.28.1
  • Kubernetes orchestrator: GKE v1.26.6-gke.1700
  • Kubernetes enabled feature gates:
  • OS: Container-Optimized OS from Google (Kernel Version: 5.15.107+)
  • NetApp backend types: CVO NetApp Release 9.14.0: Sun Jul 30 11:19:35 UTC 2023
  • Other:

To Reproduce
kubectl apply -f merged_manifests.yml -n trident

apiVersion: v1
kind: Secret
metadata:
  name: backend-tbc-ontap-nas-secret
type: Opaque
stringData:
  username: <redacted>
  password: <redacted>
---
apiVersion: trident.netapp.io/v1
kind: TridentBackendConfig
metadata:
  name: backend-tbc-ontap-nas
spec:
  version: 1
  backendName: ontap-nas-backend
  storageDriverName: ontap-nas
  managementLIF: <redacted>
  dataLIF: <redacted>
  svm: <svm>
  credentials:
    name: backend-tbc-ontap-nas-secret
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ontapnasudp
provisioner: csi.trident.netapp.io
mountOptions: ["rwx", "nfsvers=3", "proto=udp"]
parameters:
  backendType: "ontap-nas"
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv-storage
  labels:
    type: local
spec:
  storageClassName: ontapnasudp
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/tmp/trident-test"
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: pvc-storage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
  storageClassName: ontapnasudp
---
kind: Pod
apiVersion: v1
metadata:
  name: pv-pod
spec:
  volumes:
    - name: pv-storage
      persistentVolumeClaim:
        claimName: pvc-storage
  containers:
    - name: pv-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/tmp/trident-test"
          name: pv-storage

Expected behavior

Once the Trident backend is created successfully, a volume should be created on the NetApp backend (CVO) for each PV or PVC created in the Kubernetes cluster. The Trident operator, controller, and node pods should bind to the "ontap-nas" storage class and initiate volume creation on the backend.

Once the PV, PVC, and pod are created, the NFS shared NetApp volume should be mounted in the pod and be visible when inspecting the pod's volume details. Thus, the expected behavior is a seamless creation and mounting of NetApp volumes in the Kubernetes pods through the Trident operator.

 ~/repo/ kubectl get tridentbackendconfigs -n trident-helm
NAME                    BACKEND NAME        BACKEND UUID                           PHASE   STATUS
backend-tbc-ontap-nas   ontap-nas-backend   54168d75-18e7-46e9-8b1a-d50a467c6aab   Bound   Success
 ~/repo/ kubectl get tridentbackendconfigs -n trident-helm -o yaml
apiVersion: v1
items:
- apiVersion: trident.netapp.io/v1
  kind: TridentBackendConfig
  metadata:
    creationTimestamp: "2023-11-30T11:06:38Z"
    finalizers:
    - trident.netapp.io
    generation: 1
    name: backend-tbc-ontap-nas
    namespace: trident-helm
    resourceVersion: "548627922"
    uid: d32f26e6-b942-4243-a9d3-007b0786c10d
  spec:
    backendName: ontap-nas-backend
    credentials:
      name: backend-tbc-ontap-nas-secret
    dataLIF: <redacted> 
    managementLIF: <redacted>
    storageDriverName: ontap-nas
    svm: <redacted>
    version: 1
  status:
    backendInfo:
      backendName: ontap-nas-backend
      backendUUID: 54168d75-18e7-46e9-8b1a-d50a467c6aab
    deletionPolicy: delete
    lastOperationStatus: Success
    message: Backend 'ontap-nas-backend' created
    phase: Bound
kind: List
metadata:
  resourceVersion: ""

Storage Class:

 ~/repo/ kubectl get sc ontapnasudp -o yaml
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"allowVolumeExpansion":true,"apiVersion":"storage.k8s.io/v1","kind":"StorageClass","metadata":{"annotations":{},"name":"ontapnasudp"},"mountOptions":["rwx","nfsvers=3","proto=udp"],"parameters":{"backendType":"ontap-nas"},"provisioner":"csi.trident.netapp.io","volumeBindingMode":"Immediate"}
  creationTimestamp: "2023-11-30T12:32:00Z"
  name: ontapnasudp
  resourceVersion: "548675493"
  uid: 29728bf2-dd12-45ac-9111-abe15a6c25f2
mountOptions:
- rwx
- nfsvers=3
- proto=udp
parameters:
  backendType: ontap-nas
provisioner: csi.trident.netapp.io
reclaimPolicy: Delete
volumeBindingMode: Immediate

Persistent Volume (PV)

~/repo/ kubectl get pv -n trident -o yaml pv-storage
apiVersion: v1
kind: PersistentVolume
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"PersistentVolume","metadata":{"annotations":{},"labels":{"type":"local"},"name":"pv-storage"},"spec":{"accessModes":["ReadWriteMany"],"capacity":{"storage":"10Gi"},"hostPath":{"path":"/tmp/trident-test"},"storageClassName":"ontapnasudp"}}
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2023-11-30T12:33:09Z"
  finalizers:
  - kubernetes.io/pv-protection
  labels:
    type: local
  name: pv-storage
  resourceVersion: "548676147"
  uid: 7519b4d8-8f29-4a09-bad8-412082192e20
spec:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 10Gi
  claimRef:
    apiVersion: v1
    kind: PersistentVolumeClaim
    name: pvc-storage
    namespace: trident
    resourceVersion: "548676145"
    uid: d2e363e4-79f4-4f79-bdb2-c4374e37e2ad
  hostPath:
    path: /tmp/trident-test
    type: ""
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ontapnasudp
  volumeMode: Filesystem
status:
  phase: Bound
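
A quick way to see which volume source backs the PV shown above is to look at the top-level key under `spec`. A volume provisioned through a CSI driver carries a `csi` block there, while the dump above carries `hostPath`. The helper below is only a rough sketch: the function name `pv_source` is hypothetical, and it assumes the two-space indentation that `kubectl ... -o yaml` prints.

```shell
# Report which volume source backs a PV, given `kubectl get pv <name> -o yaml` on stdin.
# Assumption: only hostPath, csi and nfs are recognised; anything else prints "unknown".
pv_source() {
  awk '
    /^  (hostPath|csi|nfs):/ { sub(/:.*/, "", $1); print $1; found = 1 }
    END { if (!found) print "unknown" }
  '
}

# Excerpt of the spec printed above; on a live cluster, pipe kubectl output instead:
#   kubectl get pv pv-storage -o yaml | pv_source
pv_source <<'EOF'
spec:
  accessModes:
  - ReadWriteMany
  hostPath:
    path: /tmp/trident-test
    type: ""
  storageClassName: ontapnasudp
EOF
# prints: hostPath
```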

Persistent Volume Claim (PVC)

 ~/repo/ kubectl get pvc -n trident -o yaml pvc-storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"v1","kind":"PersistentVolumeClaim","metadata":{"annotations":{},"name":"pvc-storage","namespace":"trident"},"spec":{"accessModes":["ReadWriteMany"],"resources":{"requests":{"storage":"10Gi"}},"storageClassName":"ontapnasudp"}}
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
  creationTimestamp: "2023-11-30T12:33:09Z"
  finalizers:
  - kubernetes.io/pvc-protection
  name: pvc-storage
  namespace: trident
  resourceVersion: "548676149"
  uid: d2e363e4-79f4-4f79-bdb2-c4374e37e2ad
spec:
  accessModes:
  - ReadWriteMany
  resources:
    requests:
      storage: 10Gi
  storageClassName: ontapnasudp
  volumeMode: Filesystem
  volumeName: pv-storage
status:
  accessModes:
  - ReadWriteMany
  capacity:
    storage: 10Gi
  phase: Bound

The mount in pv-pod (the pod)

root@pv-pod:/# df -h | grep trid
tmpfs           3.9G  4.0K  3.9G   1% /tmp/trident-test
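
The first column of the `df` line above is the mount source. For an NFS mount it would normally read `<server>:/<export>` rather than `tmpfs`. A minimal sketch for pulling that column out for a given mount point follows; the function name `mount_source` is hypothetical, and it assumes mount paths without spaces.

```shell
# Print the mount source (first df column) for a given mount point.
# Reads `df -h`-style output on stdin and matches on the last column.
mount_source() {
  awk -v target="$1" '$NF == target { print $1 }'
}

# Line copied from the output above; inside the pod, pipe `df -h` instead:
#   df -h | mount_source /tmp/trident-test
printf '%s\n' 'tmpfs           3.9G  4.0K  3.9G   1% /tmp/trident-test' \
  | mount_source /tmp/trident-test
# prints: tmpfs
```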

Additional context
The trident-controller pod logs the following errors and additional information:

csi-attacher I1205 10:46:30.285060 1 connection.go:201] GRPC error: <nil>
csi-attacher I1205 10:47:30.287315 1 connection.go:201] GRPC error: <nil>
csi-attacher I1205 10:48:30.294277 1 connection.go:201] GRPC error: <nil>
trident-main time="2023-12-05T10:48:31Z" level=error msg="Trident-ACP version is empty." error="<nil>" logLayer=rest_frontend requestID=19bcda76-3268-4fa1-b7e1-6f2ae7ff0833 requestSource=REST workflow="core=version"

csi-attacher W1205 10:48:30.294433 1 csi_handler.go:173] Failed to repair volume handle for driver pd.csi.storage.gke.io: node handle has wrong number of elements; got 1, wanted 6 or more
csi-attacher I1205 10:48:30.294442 1 csi_handler.go:740] Found NodeID <redacted> in CSINode <redacted>
csi-attacher W1205 10:48:30.294461 1 csi_handler.go:173] Failed to repair volume handle for driver pd.csi.storage.gke.io: node handle has wrong number of elements; got 1, wanted 6 or more
trident-main time="2023-12-05T10:48:31Z" level=debug msg="REST API call received." Duration="10.192µs" Method=GET RequestURL=/trident/v1/version Route=GetVersion logLayer=rest_frontend requestID=19bcda76-3268-4fa1-b7e1-6f2ae7ff0833 requestSource=REST workflow="trident_rest=logger"
trident-main time="2023-12-05T10:48:31Z" level=debug msg="Getting Trident-ACP version." logLayer=rest_frontend requestID=19bcda76-3268-4fa1-b7e1-6f2ae7ff0833 requestSource=REST workflow="core=version"
trident-main time="2023-12-05T10:48:31Z" level=warning msg="ACP is not enabled." logLayer=rest_frontend requestID=19bcda76-3268-4fa1-b7e1-6f2ae7ff0833 requestSource=REST workflow="core=version"
trident-main time="2023-12-05T10:48:31Z" level=error msg="Trident-ACP version is empty." error="<nil>" logLayer=rest_frontend requestID=19bcda76-3268-4fa1-b7e1-6f2ae7ff0833 requestSource=REST workflow="core=version"
trident-main time="2023-12-05T10:48:31Z" level=debug msg="REST API call complete." Duration="978.427µs" Method=GET RequestURL=/trident/v1/version Route=GetVersion StatusCode=200 logLayer=rest_frontend requestID=19bcda76-3268-4fa1-b7e1-6f2ae7ff0833 requestSource=REST workflow="trident_rest=logger"
trident-main time="2023-12-05T10:48:37Z" level=debug msg="Node updated in cache." logLayer=csi_frontend name=<redacted> requestID=7a27627a-c480-446c-a35a-9addc41b1692 requestSource=Kubernetes workflow="node=update"
trident-main time="2023-12-05T10:48:37Z" level=debug msg="Node updated in cache." logLayer=csi_frontend name=<redacted> requestID=d33b8f54-6142-4e3d-bc0b-f9458a7c37ec requestSource=Kubernetes workflow="node=update"
trident-main time="2023-12-05T10:48:37Z" level=warning msg="K8S helper has no record of the updated storage class; instead it will try to create it." logLayer=csi_frontend name=ontapnasudp parameters="map[backendType:ontap-nas]" provisioner=csi.trident.netapp.io requestID=c2f7f1ba-2333-4d73-b142-ac072a9ef5fd requestSource=Kubernetes workflow="storage_class=update"
trident-main time="2023-12-05T10:48:37Z" level=debug msg="Node updated in cache." logLayer=csi_frontend name=gke-e-infra-gke-e-infra-gke-b083af15-zn8s requestID=86a4ae35-eeb7-4340-a415-929d93b662cf requestSource=Kubernetes workflow="node=update"
trident-main time="2023-12-05T10:48:37Z" level=debug msg="Node updated in cache." logLayer=csi_frontend name=gke-e-infra-gke-e-infra-gke-88607822-luiw requestID=8ea3a1cc-0807-4ae3-893e-ddb8f4ed4766 requestSource=Kubernetes workflow="node=update"
trident-main time="2023-12-05T10:48:37Z" level=warning msg="K8S helper could not add a storage class: object is being deleted: tridentstorageclasses.trident.netapp.io "ontapnasudp" already exists" logLayer=csi_frontend name=ontapnasudp parameters="map[backendType:ontap-nas]" provisioner=csi.trident.netapp.io requestID=c2f7f1ba-2333-4d73-b142-ac072a9ef5fd requestSource=Kubernetes workflow="storage_class=update"
trident-main time="2023-12-05T10:48:49Z" level=debug msg="Node updated in cache." logLayer=csi_frontend name=<redacted> requestID=e3db19b3-4afd-441d-85dd-37022ed9831b requestSource=Kubernetes workflow="node=update"
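
Most of the trident-main entries above are debug-level noise. To narrow a captured log down to the warning and error entries, a filter on the `level=` field is usually enough. The sketch below is an illustration only: the function name `trident_problems` is hypothetical, and the deployment, namespace, and container names in the commented kubectl line are assumptions to adjust for the actual installation.

```shell
# Keep only log entries whose level= field is warning or error.
trident_problems() {
  grep -E 'level=(warning|error)( |$)'
}

# Two entries from the log above (shortened), fed through the filter.
# On a live cluster, something like the following could be piped in instead
# (deployment, namespace and container names are assumptions):
#   kubectl logs deploy/trident-controller -n trident -c trident-main | trident_problems
trident_problems <<'EOF'
trident-main time="2023-12-05T10:48:31Z" level=debug msg="Getting Trident-ACP version." logLayer=rest_frontend
trident-main time="2023-12-05T10:48:31Z" level=warning msg="ACP is not enabled." logLayer=rest_frontend
EOF
```

Only the `level=warning` entry passes the filter. The csi-attacher lines use a different (klog) format and are not matched by this pattern.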
