Mount 32 Failure During Upgrade from Trident 22.10 to Trident 23.01 #844

While upgrading our environment, we hit an issue where certain upgraded pods intermittently fail to mount their volumes (roughly 50% of the time).

**Describe the bug**
After the software on a Kubernetes node is upgraded (with Trident going from 22.10 to 23.01), the underlying volume fails to mount.

This issue is encountered on a PV that is "multi-mounted" - that is, two separate pods running on the same node are accessing the same PV in filesystem mode.

Here's the specific message that is printed when the upgrade bug occurs:
```
time="2023-06-23T17:39:46Z" level=error msg="Mount failed." error="exit status 32" requestID=f3c8025d-dad5-4cb3-84a1-ad0406182858 requestSource=CSI
time="2023-06-23T17:39:46Z" level=error msg="GRPC error: rpc error: code = Internal desc = unable to mount device; exit status 32"
```
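To gauge how often the failure occurs on a node, the Trident node logs can be filtered for these entries. The following is a minimal sketch, assuming the logs use the `key=value` text format shown above; `parse_logfmt` and `mount_failures` are hypothetical helper names, not part of Trident.

```python
import shlex


def parse_logfmt(line):
    """Split a key=value log line into a dict, honoring double-quoted values."""
    try:
        tokens = shlex.split(line)
    except ValueError:
        return {}  # unbalanced quotes, e.g. a truncated line
    fields = {}
    for token in tokens:
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


def mount_failures(lines):
    """Yield (time, requestID, error) for entries logged as 'Mount failed.'."""
    for line in lines:
        entry = parse_logfmt(line)
        if entry.get("level") == "error" and entry.get("msg") == "Mount failed.":
            yield entry.get("time"), entry.get("requestID"), entry.get("error")


if __name__ == "__main__":
    # Sample line copied from the report above.
    sample = (
        'time="2023-06-23T17:39:46Z" level=error msg="Mount failed." '
        'error="exit status 32" '
        "requestID=f3c8025d-dad5-4cb3-84a1-ad0406182858 requestSource=CSI"
    )
    for when, request_id, error in mount_failures([sample]):
        print(when, request_id, error)
```

Grouping the results by `requestID` makes it easier to match each failed mount with the corresponding GRPC error line.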

We believe we have traced the root cause to a regression introduced by this change:
https://github.com/NetApp/trident/commit/aa3e565a3869b311acbf15aeb2908458e09f1889

**Environment**

- Trident version: Going from 22.10 to 23.01.
- OS: Ubuntu
- NetApp backend types: OTS & ONTAP AFF

**To Reproduce**
Steps to reproduce the behavior:
* Set up a pair of pods that share an underlying PV. Ensure that both pods are bound to the same node.
* Upgrade Trident from 22.10 to 23.01
* Take a look at the Trident tracking information
* You'll note that there is a missing field for certain volumes:
```
root@node0:/var/lib/trident/tracking# cat pvc-7a90e5d5-3b73-45ec-a08c-9963fe04933c.json | jq
{
  "localhost": true,
  "fstype": "ext4",
  "sharedTarget": true,
  "LUKSEncryption": "false",
  "iscsiTargetPortal": "172.0.0.14",
  "iscsiPortals": [
    "172.0.0.5",
    "172.0.0.6",
    "172.0.0.7"
  ],
  "iscsiTargetIqn": "iqn.1992-08.com.netapp:sn.e90faeca0ff711eea04c005056acda88:vs.4",
  "iscsiLunNumber": 5,
  "iscsiInterface": "default",
  "iscsiIgroup": "node0-b3135d6a-2cbf-4383-abea-235403b560e8",
  "useCHAP": true,
  "iscsiUsername": "dude-initiator",
  "iscsiInitiatorSecret": "IAaPKlD6ygOf0AhC",
  "iscsiTargetUsername": "dude-iscsi-target",
  "iscsiTargetSecret": "ZiYqVKFGDN4ouieZ",
  "VolumeTrackingInfoPath": "",
  "stagingTargetPath": "/var/lib/kubelet/plugins/kubernetes.io/csi/csi.trident.netapp.io/8e2c5043cdde8eef0e3d303ef5eaacafa803b3671810b8e46bcb5e3e7fa12964/globalmount",
  "publishedTargetPaths": {
    "/var/lib/kubelet/pods/8a7775fe-14ae-4089-95b9-8e764cef43fc/volumes/kubernetes.io~csi/pvc-7a90e5d5-3b73-45ec-a08c-9963fe04933c/mount": {}
  }
}
```
* When the bug manifests, the `rawDevicePath` field is missing from the tracking file (as in the output above)
* This results in an `exit status 32` error on the next attempt to mount the underlying PV.
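To find affected volumes on a node without inspecting each file by hand, the tracking directory can be scanned for entries that lack `rawDevicePath`. This is a minimal sketch, not part of Trident: the directory comes from the shell prompt above, the helper name `find_missing_raw_device_path` is hypothetical, and it assumes that only entries carrying `iscsiTargetIqn` are expected to have a `rawDevicePath`.

```python
import json
from pathlib import Path

# Directory taken from the shell prompt in this report; adjust if yours differs.
TRACKING_DIR = Path("/var/lib/trident/tracking")


def find_missing_raw_device_path(tracking_dir):
    """Return names of iSCSI tracking files with no non-empty rawDevicePath."""
    missing = []
    for path in sorted(Path(tracking_dir).glob("*.json")):
        try:
            info = json.loads(path.read_text())
        except (OSError, json.JSONDecodeError):
            continue  # skip unreadable or partially written files
        if not isinstance(info, dict):
            continue
        # Assumption: only iSCSI-backed entries should carry rawDevicePath.
        if "iscsiTargetIqn" not in info:
            continue
        if not info.get("rawDevicePath"):
            missing.append(path.name)
    return missing


if __name__ == "__main__":
    for name in find_missing_raw_device_path(TRACKING_DIR):
        print(f"{name}: rawDevicePath missing or empty")
```

Running it on the node before and after the upgrade should show whether the field disappears for multi-mounted volumes.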

**Expected behavior**
* On upgrade, `rawDevicePath` should be present

