
Race between pvc mount and multipathd path addition after upgrading to kubernetes 1.20.2 #511

Closed
ffilippopoulos opened this issue Jan 19, 2021 · 19 comments

Comments

@ffilippopoulos

Describe the bug

After upgrading to Kubernetes 1.20.2 we occasionally see PVC mounts fail because the path (e.g. /dev/sdc) that Trident is trying to mount is already locked by multipathd. The trident-csi pods report that the device is /dev/sdc already mounted or mount point busy and keep looping over and over.

Environment
Provide accurate information about the environment to help us reproduce the issue.

  • Trident version: 2.10.1
  • Trident installation flags used: tridentctl install -n sys-trident --generate-custom-yaml
  • Container runtime: docker://19.3.14
  • Kubernetes version: v1.20.2
  • Kubernetes orchestrator: self managed cluster on-prem
  • Kubernetes enabled feature gates: defaults
  • OS: Flatcar Container Linux by Kinvolk 2605.11.0
  • NetApp backend types: ONTAP SAN 9.7

To Reproduce
Use the latest Trident version on a Kubernetes 1.20.2 cluster and restart pods with a PVC a few times until the error occurs.

Expected behavior
Trident should wait for multipath devices to be present, or be able to recover and retry after a mount failure.
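For illustration only, a recover-and-retry flow along these lines could look roughly like the Go sketch below. The device name, mountpoint, timings, and the /sys/block holder lookup are assumptions for the example; this is not Trident's actual code path.

package main

import (
	"fmt"
	"os"
	"os/exec"
	"path/filepath"
	"strings"
	"time"
)

// mountWithRecovery sketches the expected behaviour: try the mount, and if it
// fails because multipathd has claimed the raw path in the meantime, wait for
// the dm holder to appear and retry against the /dev/mapper device instead.
func mountWithRecovery(device, mountpoint string) error {
	if err := exec.Command("mount", "/dev/"+device, mountpoint).Run(); err == nil {
		return nil
	}
	// Raw-path mount failed (e.g. "already mounted or mount point busy");
	// wait briefly for the multipath holder and retry on it.
	deadline := time.Now().Add(10 * time.Second)
	for time.Now().Before(deadline) {
		holders, _ := os.ReadDir(filepath.Join("/sys/block", device, "holders"))
		if len(holders) > 0 {
			name, err := os.ReadFile(filepath.Join("/sys/block", holders[0].Name(), "dm", "name"))
			if err != nil {
				return err
			}
			return exec.Command("mount", "/dev/mapper/"+strings.TrimSpace(string(name)), mountpoint).Run()
		}
		time.Sleep(time.Second)
	}
	return fmt.Errorf("no multipath device appeared for %s", device)
}

func main() {
	if err := mountWithRecovery("sdc", "/mnt/test"); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}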

Additional context
Describing the affected kube pods shows:

 Warning  FailedMount             39s (x23 over 31m)   kubelet                  MountVolume.SetUp failed for volume "pvc-8789f2d4-c5b7-46c8-b975-660e51e0d641" : rpc error: code = Internal desc = unable to mount device; exit status 32

From trident-csi debug logs we see:

time="2021-01-19T15:13:27Z" level=debug msg=">>>> osutils.findMultipathDeviceForDevice" device=sdc requestID=294166a3-f3a7-48ab-9ce3-b77fe6c4ec90 requestSource=CSI
time="2021-01-19T15:13:27Z" level=debug msg="Could not find multipath device for device." device=sdc requestID=294166a3-f3a7-48ab-9ce3-b77fe6c4ec90 requestSource=CSI
time="2021-01-19T15:13:27Z" level=debug msg="<<<< osutils.findMultipathDeviceForDevice" device=sdc requestID=294166a3-f3a7-48ab-9ce3-b77fe6c4ec90 requestSource=CSI
time="2021-01-19T15:31:53Z" level=debug msg=">>>> osutils.MountDevice" device=/dev/sdc mountpoint="/var/lib/kubelet/pods/ecfb5625-17fc-4b65-8163-25398d6987f0/volumes/kubernetes.io~csi/pvc-8789f2d4-c5b7-46c8-b975-660e51e0d641/mount" options= requestID=1c742d5c-5e9d-40f4-ae94-342bf97ea23f requestSource=CSI
time="2021-01-19T15:31:53Z" level=debug msg=">>>> osutils.IsMounted" requestID=1c742d5c-5e9d-40f4-ae94-342bf97ea23f requestSource=CSI source=/dev/sdc target="/var/lib/kubelet/pods/ecfb5625-17fc-4b65-8163-25398d6987f0/volumes/kubernetes.io~csi/pvc-8789f2d4-c5b7-46c8-b975-660e51e0d641/mount"
time="2021-01-19T15:31:53Z" level=debug msg="Source device: /dev/sdc, Target: /var/lib/kubelet/pods/ecfb5625-17fc-4b65-8163-25398d6987f0/volumes/kubernetes.io~csi/pvc-8789f2d4-c5b7-46c8-b975-660e51e0d641/mount, is mounted: false" requestID=1c742d5c-5e9d-40f4-ae94-342bf97ea23f requestSource=CSI
time="2021-01-19T15:31:53Z" level=debug msg="<<<< osutils.IsMounted" requestID=1c742d5c-5e9d-40f4-ae94-342bf97ea23f requestSource=CSI source=/dev/sdc target="/var/lib/kubelet/pods/ecfb5625-17fc-4b65-8163-25398d6987f0/volumes/kubernetes.io~csi/pvc-8789f2d4-c5b7-46c8-b975-660e51e0d641/mount"
time="2021-01-19T15:31:53Z" level=debug msg="Already mounted: false, mountpoint exists: false" requestID=1c742d5c-5e9d-40f4-ae94-342bf97ea23f requestSource=CSI
time="2021-01-19T15:31:53Z" level=debug msg=">>>> osutils.execCommand." args="[/dev/sdc /var/lib/kubelet/pods/ecfb5625-17fc-4b65-8163-25398d6987f0/volumes/kubernetes.io~csi/pvc-8789f2d4-c5b7-46c8-b975-660e51e0d641/mount]" command=mount requestID=1c742d5c-5e9d-40f4-ae94-342bf97ea23f requestSource=CSI
time="2021-01-19T15:31:53Z" level=debug msg="<<<< osutils.execCommand." command=mount error="exit status 32" output="mount: /var/lib/kubelet/pods/ecfb5625-17fc-4b65-8163-25398d6987f0/volumes/kubernetes.io~csi/pvc-8789f2d4-c5b7-46c8-b975-660e51e0d641/mount: /dev/sdc already mounted or mount point busy." requestID=1c742d5c-5e9d-40f4-ae94-342bf97ea23f requestSource=CSI
time="2021-01-19T15:31:53Z" level=error msg="Mount failed." error="exit status 32" requestID=1c742d5c-5e9d-40f4-ae94-342bf97ea23f requestSource=CSI
time="2021-01-19T15:31:53Z" level=debug msg="<<<< osutils.MountDevice" requestID=1c742d5c-5e9d-40f4-ae94-342bf97ea23f requestSource=CSI
time="2021-01-19T15:31:53Z" level=debug msg="<<<< NodePublishVolume" Method=NodePublishVolume Type=CSI_Node requestID=1c742d5c-5e9d-40f4-ae94-342bf97ea23f requestSource=CSI
time="2021-01-19T15:31:53Z" level=debug msg="Released shared lock (NodePublishVolume-pvc-8789f2d4-c5b7-46c8-b975-660e51e0d641)." lock=csi_node_server requestID=1c742d5c-5e9d-40f4-ae94-342bf97ea23f requestSource=CSI
time="2021-01-19T15:31:53Z" level=error msg="GRPC error: rpc error: code = Internal desc = unable to mount device; exit status 32" requestID=1c742d5c-5e9d-40f4-ae94-342bf97ea23f requestSource=CSI

The reason that this path is busy on the host seems to be multipath:

$ sudo lsof  | grep sdc
multipath   845                   root    5r      BLK               8,32       0t0     163141 /dev/sdc
multipath   845   847             root    5r      BLK               8,32       0t0     163141 /dev/sdc
multipath   845   848             root    5r      BLK               8,32       0t0     163141 /dev/sdc
multipath   845   849             root    5r      BLK               8,32       0t0     163141 /dev/sdc
multipath   845   850             root    5r      BLK               8,32       0t0     163141 /dev/sdc
multipath   845   851             root    5r      BLK               8,32       0t0     163141 /dev/sdc
multipath   845 39963             root    5r      BLK               8,32       0t0     163141 /dev/sdc

also sdc is now pointing to the correct dm:

ls /sys/block/sdc/holders/
dm-0
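For reference, that holder relationship is enough to map the raw path back to its multipath node; below is a minimal Go sketch that reads /sys/block/<dev>/holders and /sys/block/<dm>/dm/name (device names are illustrative, and this is not Trident's actual findMultipathDeviceForDevice implementation).

package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// multipathNodeFor maps a SCSI path such as "sdc" to its device-mapper node
// and /dev/mapper name, e.g. ("dm-0", "/dev/mapper/3600a0980...").
func multipathNodeFor(device string) (string, string, error) {
	holders, err := os.ReadDir(filepath.Join("/sys/block", device, "holders"))
	if err != nil || len(holders) == 0 {
		return "", "", fmt.Errorf("no multipath holder for %s yet: %v", device, err)
	}
	dm := holders[0].Name() // e.g. "dm-0"
	name, err := os.ReadFile(filepath.Join("/sys/block", dm, "dm", "name"))
	if err != nil {
		return dm, "", err
	}
	return dm, "/dev/mapper/" + strings.TrimSpace(string(name)), nil
}

func main() {
	dm, mapper, err := multipathNodeFor("sdc")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("sdc is held by %s (%s)\n", dm, mapper)
}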

Please note the intermittent mount failures appeared only after upgrading kubernetes from 1.19.2 to 1.20.2 and we are able to reproduce them quite consistently.

Any advice on how to continue debugging that from our side would be much appreciated.

@gnarl
Contributor

gnarl commented Jan 21, 2021

Hi @ffilippopoulos,

We've tried to reproduce this issue internally and haven't been able to do so. There may be something specific to your environment that is causing this issue. Please contact NetApp Support so that we can investigate this issue further.

To open a case with NetApp:
  • Go to https://www.netapp.com/company/contact-us/support/ and find the appropriate number for your region to call in, or log in.
  • Note: Trident is a product supported by NetApp based on a supported NetApp storage serial number (SN).
  • Open the case on the NetApp storage SN and provide a description of the problem.
  • Be sure to mention that the product is Trident on Kubernetes, provide the details, and reference this GitHub issue.
  • The case will be directed to Trident support engineers for response.

@bswartz
Contributor

bswartz commented Jan 29, 2021

@ffilippopoulos Hey, we have been experimenting with Trident on top of Flatcar Linux and we are observing periodic core dumps from iscsiadm. Have you seen anything like that in your logs? This is the sort of thing we're seeing:

time="2021-01-29T21:11:22Z" level=debug msg="<<<< osutils.execCommand." command=iscsiadm error="signal: segmentation fault (core dumped)" output=

It's possible that we've screwed up our Flatcar Linux installation somehow, but it's also possible that Flatcar Linux ships a broken Open-iSCSI userspace. As you know, Flatcar Linux isn't a supported Linux distro for Trident, but there's every reason to expect it to work as long as the kernel and userspace tools are recent enough and not buggy. The core dumps make me suspect there could be bugs, though.

@ffilippopoulos
Author

Hi @bswartz, tyvm for looking at this issue!
We are currently using Flatcar version 2605.11.0, which comes with kernel 5.4.87, iscsid version 2.1.2, and multipath-tools v0.6.4.

We do not see any core dumps like the one you posted, but we see the following logs when the mount gets stuck:

time="2021-02-01T12:38:35Z" level=debug msg="Acquired shared lock (NodePublishVolume-pvc-fd5da671-4e99-4fad-97cf-43225859725c)." lock=csi_node_server requestID=39e39f5d-97c0-481c-b489-194a9a16da0e requestSource=CSI
time="2021-02-01T12:38:35Z" level=debug msg=">>>> NodePublishVolume" Method=NodePublishVolume Type=CSI_Node requestID=39e39f5d-97c0-481c-b489-194a9a16da0e requestSource=CSI
time="2021-02-01T12:38:35Z" level=debug msg="Publish Info found" requestID=39e39f5d-97c0-481c-b489-194a9a16da0e requestSource=CSI
time="2021-02-01T12:38:35Z" level=debug msg=">>>> osutils.MountDevice" device=/dev/sdc mountpoint="/var/lib/kubelet/pods/c341e768-d253-4907-a62c-417b27b350ea/volumes/kubernetes.io~csi/pvc-fd5da671-4e99-4fad-97cf-43225859725c/mount" options= requestID=39e39f5d-97c0-481c-b489-194a9a16da0e requestSource=CSI
time="2021-02-01T12:38:35Z" level=debug msg=">>>> osutils.IsMounted" requestID=39e39f5d-97c0-481c-b489-194a9a16da0e requestSource=CSI source=/dev/sdc target="/var/lib/kubelet/pods/c341e768-d253-4907-a62c-417b27b350ea/volumes/kubernetes.io~csi/pvc-fd5da671-4e99-4fad-97cf-43225859725c/mount"
time="2021-02-01T12:38:35Z" level=debug msg="Source device: /dev/sdc, Target: /var/lib/kubelet/pods/c341e768-d253-4907-a62c-417b27b350ea/volumes/kubernetes.io~csi/pvc-fd5da671-4e99-4fad-97cf-43225859725c/mount, is mounted: false" requestID=39e39f5d-97c0-481c-b489-194a9a16da0e requestSource=CSI
time="2021-02-01T12:38:35Z" level=debug msg="<<<< osutils.IsMounted" requestID=39e39f5d-97c0-481c-b489-194a9a16da0e requestSource=CSI source=/dev/sdc target="/var/lib/kubelet/pods/c341e768-d253-4907-a62c-417b27b350ea/volumes/kubernetes.io~csi/pvc-fd5da671-4e99-4fad-97cf-43225859725c/mount"
time="2021-02-01T12:38:35Z" level=debug msg="Already mounted: false, mountpoint exists: false" requestID=39e39f5d-97c0-481c-b489-194a9a16da0e requestSource=CSI
time="2021-02-01T12:38:35Z" level=debug msg=">>>> osutils.execCommand." args="[/dev/sdc /var/lib/kubelet/pods/c341e768-d253-4907-a62c-417b27b350ea/volumes/kubernetes.io~csi/pvc-fd5da671-4e99-4fad-97cf-43225859725c/mount]" command=mount requestID=39e39f5d-97c0-481c-b489-194a9a16da0e requestSource=CSI
time="2021-02-01T12:38:35Z" level=debug msg="<<<< osutils.execCommand." command=mount error="exit status 32" output="mount: /var/lib/kubelet/pods/c341e768-d253-4907-a62c-417b27b350ea/volumes/kubernetes.io~csi/pvc-fd5da671-4e99-4fad-97cf-43225859725c/mount: /dev/sdc already mounted or mount point busy." requestID=39e39f5d-97c0-481c-b489-194a9a16da0e requestSource=CSI
time="2021-02-01T12:38:35Z" level=error msg="Mount failed." error="exit status 32" requestID=39e39f5d-97c0-481c-b489-194a9a16da0e requestSource=CSI
time="2021-02-01T12:38:35Z" level=debug msg="<<<< osutils.MountDevice" requestID=39e39f5d-97c0-481c-b489-194a9a16da0e requestSource=CSI
time="2021-02-01T12:38:35Z" level=debug msg="<<<< NodePublishVolume" Method=NodePublishVolume Type=CSI_Node requestID=39e39f5d-97c0-481c-b489-194a9a16da0e requestSource=CSI
time="2021-02-01T12:38:35Z" level=debug msg="Released shared lock (NodePublishVolume-pvc-fd5da671-4e99-4fad-97cf-43225859725c)." lock=csi_node_server requestID=39e39f5d-97c0-481c-b489-194a9a16da0e requestSource=CSI
time="2021-02-01T12:38:35Z" level=error msg="GRPC error: rpc error: code = Internal desc = unable to mount device; exit status 32" requestID=39e39f5d-97c0-481c-b489-194a9a16da0e requestSource=CSI

Please note that this appears only on nodes of our cluster that run Kubernetes 1.20.2, so we believe the kube version upgrade is significant here. As stated in the description, the issue is probably that multipathd does not add the path under the respective dm in time, and Trident decides to proceed by mounting one of the paths directly (/dev/sdc in the example logs above). That mount cannot succeed because multipath "locks" the device in the meantime:

sudo lsof | grep sdc
multipath     864                     root    5r      BLK               8,32       0t0   22437370 /dev/sdc
multipath     864     865             root    5r      BLK               8,32       0t0   22437370 /dev/sdc
multipath     864     866             root    5r      BLK               8,32       0t0   22437370 /dev/sdc
multipath     864     867             root    5r      BLK               8,32       0t0   22437370 /dev/sdc
multipath     864     868             root    5r      BLK               8,32       0t0   22437370 /dev/sdc
multipath     864     869             root    5r      BLK               8,32       0t0   22437370 /dev/sdc
multipath     864 1920499             root    5r      BLK               8,32       0t0   22437370 /dev/sdc

verbosity 3 multipathd logs from the same host:

Feb 01 12:38:37 worker-8.dev.merit.uw.systems multipathd[864]: sdc: path state = running
Feb 01 12:38:37 worker-8.dev.merit.uw.systems multipathd[864]: sdc: get_state
Feb 01 12:38:37 worker-8.dev.merit.uw.systems multipathd[864]: 8:32: tur checker starting up
Feb 01 12:38:37 worker-8.dev.merit.uw.systems multipathd[864]: 8:32: tur checker finished, state up
Feb 01 12:38:37 worker-8.dev.merit.uw.systems multipathd[864]: sdc: state = up
Feb 01 12:38:37 worker-8.dev.merit.uw.systems multipathd[864]: 3600a098038314438492b5057512f6859: disassemble map [4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handler 1 alua 2 1 service-time>
Feb 01 12:38:37 worker-8.dev.merit.uw.systems multipathd[864]: 3600a098038314438492b5057512f6859: disassemble status [2 0 1 0 2 1 A 0 1 2 8:16 A 0 0 1 E 0 1 2 8:32 A 0 0 1 ]
Feb 01 12:38:37 worker-8.dev.merit.uw.systems multipathd[864]: sdc: mask = 0x8
Feb 01 12:38:37 worker-8.dev.merit.uw.systems multipathd[864]: sdc: path state = running
Feb 01 12:38:37 worker-8.dev.merit.uw.systems multipathd[864]: sdc: sysfs prio = 10
Feb 01 12:38:37 worker-8.dev.merit.uw.systems multipathd[864]: checked 1 path in 0.000865 secs
Feb 01 12:38:39 worker-8.dev.merit.uw.systems multipathd[864]: sdb: path state = running
Feb 01 12:38:39 worker-8.dev.merit.uw.systems multipathd[864]: sdb: get_state
Feb 01 12:38:39 worker-8.dev.merit.uw.systems multipathd[864]: 8:16: tur checker starting up
Feb 01 12:38:39 worker-8.dev.merit.uw.systems multipathd[864]: 8:16: tur checker finished, state up
Feb 01 12:38:39 worker-8.dev.merit.uw.systems multipathd[864]: sdb: state = up
Feb 01 12:38:39 worker-8.dev.merit.uw.systems multipathd[864]: 3600a098038314438492b5057512f6859: disassemble map [4 queue_if_no_path pg_init_retries 50 retain_attached_hw_handler 1 alua 2 1 service-time>
Feb 01 12:38:39 worker-8.dev.merit.uw.systems multipathd[864]: 3600a098038314438492b5057512f6859: disassemble status [2 0 1 0 2 1 A 0 1 2 8:16 A 0 0 1 E 0 1 2 8:32 A 0 0 1 ]
Feb 01 12:38:39 worker-8.dev.merit.uw.systems multipathd[864]: sdb: mask = 0x8
Feb 01 12:38:39 worker-8.dev.merit.uw.systems multipathd[864]: sdb: path state = running
Feb 01 12:38:39 worker-8.dev.merit.uw.systems multipathd[864]: sdb: sysfs prio = 50
Feb 01 12:38:39 worker-8.dev.merit.uw.systems multipathd[864]: checked 1 path in 0.000763 secs

Also note that we tried updating Flatcar to the latest version today (2605.11.0, which comes with an iscsi bump to version 2.1.3) and we ran into the core dumps you posted above.
Any ideas on how to proceed further with this one?

@ffilippopoulos
Author

Digging a little bit more into this, I believe that our problem isn't related to multipath, but to iSCSI scans failing to find all the devices in time:

time="2021-02-03T17:44:46Z" level=debug msg="Invoked single-LUN scan." requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7d15 requestSource=CSI scanCmd="0 0 6" scanFile=/sys/class/scsi_host/host6/scan
time="2021-02-03T17:44:47Z" level=debug msg="Invoked single-LUN scan." requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7d15 requestSource=CSI scanCmd="0 0 6" scanFile=/sys/class/scsi_host/host7/scan
time="2021-02-03T17:44:47Z" level=debug msg="<<<< osutils.iSCSIScanTargetLUN" hosts="[6 7]" lunID=6 requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7d15 requestSource=CSI
time="2021-02-03T17:44:47Z" level=debug msg="Scanning paths: [/sys/class/scsi_host/host6/device/session1/iscsi_session/session1/device/target6:0:0/6:0:0:6 /sys/class/scsi_host/host7/device/session2/iscsi_
session/session2/device/target7:0:0/7:0:0:6]" requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7d15 requestSource=CSI
time="2021-02-03T17:44:47Z" level=debug msg="All devices not yet present, waiting." increment=996.461192ms requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7d15 requestSource=CSI
time="2021-02-03T17:44:48Z" level=debug msg="All devices not yet present, waiting." increment=1.306578758s requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7d15 requestSource=CSI
time="2021-02-03T17:44:49Z" level=debug msg="All devices not yet present, waiting." increment=2.179888496s requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7d15 requestSource=CSI
time="2021-02-03T17:44:51Z" level=debug msg="Paths found so far: []" requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7d15 requestSource=CSI
time="2021-02-03T17:44:51Z" level=debug msg="No devices present yet, waiting." increment=990.081425ms requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7d15 requestSource=CSI
time="2021-02-03T17:44:52Z" level=debug msg="Paths found: []" requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7d15 requestSource=CSI
time="2021-02-03T17:44:52Z" level=debug msg="<<<< osutils.waitForDeviceScanIfNeeded" iSCSINodeName="iqn.1992-08.com.netapp:sn.0205ffce026911ebb4d9d039ea1a7953:vs.9" lunID=6 requestID=7e4b5dce-31dd-4958-bd
c2-8a6a612e7d15 requestSource=CSI
time="2021-02-03T17:44:52Z" level=debug msg=">>>> osutils.GetISCSIHostSessionMapForTarget" iSCSINodeName="iqn.1992-08.com.netapp:sn.0205ffce026911ebb4d9d039ea1a7953:vs.9" requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7d15 requestSource=CSI
time="2021-02-03T17:44:52Z" level=debug msg="Found iSCSI host/session." hostNumber=6 requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7d15 requestSource=CSI sessionNumber=1
time="2021-02-03T17:44:52Z" level=debug msg="Found iSCSI host/session." hostNumber=7 requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7d15 requestSource=CSI sessionNumber=2
time="2021-02-03T17:44:52Z" level=debug msg="<<<< osutils.GetISCSIHostSessionMapForTarget" iSCSINodeName="iqn.1992-08.com.netapp:sn.0205ffce026911ebb4d9d039ea1a7953:vs.9" requestID=7e4b5dce-31dd-4958-bdc2
-8a6a612e7d15 requestSource=CSI
time="2021-02-03T17:44:52Z" level=debug msg="Get LUN Serial" path="/sys/class/scsi_host/host7/device/session2/iscsi_session/session2/device/target7:0:0/7:0:0:6" requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7
d15 requestSource=CSI
time="2021-02-03T17:44:52Z" level=debug msg="LUN serial check passed" lun=6 path="/sys/class/scsi_host/host7/device/session2/iscsi_session/session2/device/target7:0:0/7:0:0:6" requestID=7e4b5dce-31dd-4958
-bdc2-8a6a612e7d15 requestSource=CSI serial=81D8I+PWQ/hW target="iqn.1992-08.com.netapp:sn.0205ffce026911ebb4d9d039ea1a7953:vs.9"
time="2021-02-03T17:44:52Z" level=debug msg="Get LUN Serial" path="/sys/class/scsi_host/host6/device/session1/iscsi_session/session1/device/target6:0:0/6:0:0:6" requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7
d15 requestSource=CSI
time="2021-02-03T17:44:52Z" level=debug msg="LUN serial check passed" lun=6 path="/sys/class/scsi_host/host6/device/session1/iscsi_session/session1/device/target6:0:0/6:0:0:6" requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7d15 requestSource=CSI serial=81D8I+PWQ/hW target="iqn.1992-08.com.netapp:sn.0205ffce026911ebb4d9d039ea1a7953:vs.9"
time="2021-02-03T17:44:52Z" level=debug msg=">>>> osutils.waitForMultipathDeviceForLUN" iSCSINodeName="iqn.1992-08.com.netapp:sn.0205ffce026911ebb4d9d039ea1a7953:vs.9" lunID=6 requestID=7e4b5dce-31dd-4958
-bdc2-8a6a612e7d15 requestSource=CSI
time="2021-02-03T17:44:52Z" level=debug msg=">>>> osutils.GetISCSIHostSessionMapForTarget" iSCSINodeName="iqn.1992-08.com.netapp:sn.0205ffce026911ebb4d9d039ea1a7953:vs.9" requestID=7e4b5dce-31dd-4958-bdc2
-8a6a612e7d15 requestSource=CSI
time="2021-02-03T17:44:52Z" level=debug msg="Found iSCSI host/session." hostNumber=6 requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7d15 requestSource=CSI sessionNumber=1
time="2021-02-03T17:44:52Z" level=debug msg="Found iSCSI host/session." hostNumber=7 requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7d15 requestSource=CSI sessionNumber=2
time="2021-02-03T17:44:52Z" level=debug msg="<<<< osutils.GetISCSIHostSessionMapForTarget" iSCSINodeName="iqn.1992-08.com.netapp:sn.0205ffce026911ebb4d9d039ea1a7953:vs.9" requestID=7e4b5dce-31dd-4958-bdc2
-8a6a612e7d15 requestSource=CSI
time="2021-02-03T17:44:52Z" level=debug msg=">>>> osutils.waitForMultipathDeviceForDevices" devices="[sdc]" requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7d15 requestSource=CSI
time="2021-02-03T17:44:52Z" level=debug msg="Skipping multipath discovery, 1 device(s) specified." requestID=7e4b5dce-31dd-4958-bdc2-8a6a612e7d15 requestSource=CSI

In the logs above, it seems that Trident finds only one device (sdc) and thus decides to skip multipath discovery and mount the scanned device directly. That is when the race with multipath occurs.
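To make the suspected race concrete, here is a deliberately simplified Go sketch of the decision the logs suggest; it is not the actual osutils.go logic, and the function and return values are invented for illustration.

package main

import "fmt"

// devicePathToMount illustrates the race: with one iSCSI session per portal, a
// healthy scan should surface one SCSI path per session. If only one path has
// appeared when the wait ends, multipath discovery is skipped and the raw
// /dev/sdX is handed to mount -- racing with multipathd claiming that path.
func devicePathToMount(devices []string, sessions int) string {
	if len(devices) < sessions {
		fmt.Printf("only %d of %d expected paths present\n", len(devices), sessions)
	}
	if len(devices) == 1 {
		return "/dev/" + devices[0] // raw path; the racy case seen in the logs
	}
	return "/dev/dm-0" // the multipath node would be resolved here (illustrative)
}

func main() {
	fmt.Println(devicePathToMount([]string{"sdc"}, 2))
}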

@ffilippopoulos
Author

@bswartz I've run into an old comment of yours about a similar situation: kubernetes/kubernetes#60894 (comment)

Do you think Trident could also follow a similar approach, where a delay is introduced based on the number of portals when fewer devices than expected are discovered?
I also saw comments about the significance of find_multipaths, which defaults to "no" on our nodes; I tried setting it to "yes", but no luck. I still see this happening quite often.

@ffilippopoulos
Author

Based on the above ^ I've tried this: https://github.com/ffilippopoulos/trident/pull/1/files and it looks to do the job. It basically checks whether len(devices) < len(hostSessionMap) and allows a few more seconds to discover more devices.

time="2021-02-09T17:48:21Z" level=debug msg="Devices less than sessions, waiting" increment=934.767587ms requestID=fb1b3b2c-0a9f-4204-8b42-d0189763efca requestSource=CSI
time="2021-02-09T17:48:22Z" level=debug msg="LUN target devices found." devices="[sdb sdc]" requestID=fb1b3b2c-0a9f-4204-8b42-d0189763efca requestSource=CSI

If you guys think that this approach won't interfere with anything else, we could even PR it directly.
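For context, the gist of that change is roughly the following Go sketch: keep re-listing the paths while fewer have shown up than there are iSCSI host sessions, up to a bounded extra window (the 5 seconds mentioned above). The listPaths callback is a stand-in for whatever enumerates the per-session path entries; this is a simplification, not the actual diff.

package main

import (
	"fmt"
	"time"
)

// waitForAllPaths waits a bounded extra window while fewer SCSI paths have
// appeared than there are iSCSI host sessions, then returns whatever was found.
func waitForAllPaths(listPaths func() []string, sessions int, window time.Duration) []string {
	deadline := time.Now().Add(window)
	backoff := 500 * time.Millisecond
	devices := listPaths()
	for len(devices) < sessions && time.Now().Before(deadline) {
		fmt.Printf("Devices (%d) fewer than sessions (%d), waiting %v\n", len(devices), sessions, backoff)
		time.Sleep(backoff)
		backoff *= 2
		devices = listPaths()
	}
	return devices // after the window, proceed with whatever was discovered
}

func main() {
	// Illustrative call: pretend the scan keeps returning a single path.
	stub := func() []string { return []string{"sdc"} }
	fmt.Println(waitForAllPaths(stub, 2, 5*time.Second))
}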

@bswartz
Contributor

bswartz commented Feb 10, 2021

Sorry I haven't been looking at this issue the last few days @ffilippopoulos. I think your analysis is on to something. Trident's current multipath behavior is a little fragile because we don't give users any way to tell Trident whether they want multipathing enabled or disabled -- we simply try to detect it, and in your case we may be detecting it wrong. It isn't the first time I've seen a problem like this, and my first instinct is to add some sort of control knob that users can use to explicitly turn multipathing on or off. In the "on" case, we could be more aggressive about waiting for the multipath devices to show up. I'm worried that changing it to wait in ALL cases will introduce problems in situations where multipathing either isn't enabled or is in a degraded state. Let me know if you disagree or have an alternative suggestion. I'm going to take a look at adding a top-level option to the Trident backend to control this behavior.
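Purely as a strawman for the knob being discussed, a tri-state option along the following lines could gate the waiting behaviour. The names and values below are hypothetical -- no such option exists in Trident today.

package main

import "fmt"

// MultipathPolicy is a hypothetical backend option controlling multipath handling.
type MultipathPolicy string

const (
	MultipathAuto     MultipathPolicy = "auto"     // current behaviour: try to detect
	MultipathRequired MultipathPolicy = "required" // wait aggressively for the dm device
	MultipathDisabled MultipathPolicy = "disabled" // mount the raw SCSI path
)

// shouldWaitForMultipath decides whether to keep waiting for the dm device.
func shouldWaitForMultipath(policy MultipathPolicy, pathsFound, sessions int) bool {
	switch policy {
	case MultipathRequired:
		return true
	case MultipathDisabled:
		return false
	default: // auto: only wait while fewer paths than sessions have shown up
		return pathsFound < sessions
	}
}

func main() {
	fmt.Println(shouldWaitForMultipath(MultipathRequired, 1, 2)) // true
}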

@ffilippopoulos
Author

I'm worried that changing it to wait in ALL cases will introduce problems in situations where multipathing either isn't enabled, or it's in a degraded state.

It will at least introduce an unneeded delay, depending on how much time is allowed for waiting on multipath devices; that is why, in the code changes I linked above after forking Trident, I only allow it to try for 5 seconds.
I do not object to applying such a wait policy only when multipathing is explicitly enabled via a flag, but we'd definitely need the extra wait time allowed. In my change I added len(devices) < len(hostSessionMap) as a rule of thumb to decide whether to wait or not, but I appreciate there might be setups I am not aware of where this would slow things down.

@bswartz
Contributor

bswartz commented Feb 10, 2021

The most obvious risk is when multipathing actually is enabled, and you have a path down at the moment you try to attach a pod. In that case, it's not the best policy to wait for the down path to come back up to allow the pod to start up.

@ffilippopoulos
Author

Yup, I agree; that is why we are talking about a small wait window (in my case I added 5 seconds, but I think it could be less). After that we'd have to proceed with whatever is discovered.

@ffilippopoulos
Author

Also, note that the 5-second period might not be enough (we observed that loaded nodes in a cluster rollout might need more). It would be worthwhile to make the wait period configurable, imo.

@bswartz
Contributor

bswartz commented Feb 13, 2021

I also saw comments about the significance of find_multipaths, which defaults to "no" in our nodes and tried setting it to "yes", but no luck. I still see this happening quite often.

I strongly recommend setting find_multipaths to no in your multipath.conf file. Also, I've noticed that open-iscsi 2.1.3 is buggy, and I have downgraded to 2.1.2 on my dev system to avoid some core dump issues. When I have time I may pursue the core dumps with the open-iscsi team. They're definitely caused by changes in the 2.1.3 package.

@bswartz
Contributor

bswartz commented Feb 17, 2021

Please refer to open-iscsi/open-iscsi#252 for the open-iscsi bug.

@ffilippopoulos
Author

Is there any update regarding this one?

@scaleoutsean
Contributor

I think I've been seeing a related problem with an older stack. iSCSI mounts fail (sometimes also against SolidFire, which I briefly set up to check whether it's only ONTAP-related) even though the PVC and PV are both bound. It seems that iSCSI and/or multipath aren't ready: in the pod events I get attachdetach-controller AttachVolume.Attach succeeded for volume (and the PVC/PV are both good), immediately followed by kubelet MountVolume.WaitForAttach failed for volume "default-xxxxxxxx" : failed to get any path for iscsi disk, last err seen: although the device is clearly bound and accessible via its non-multipath name (/dev/sdb; the iSCSI session is there, and formatting it from the host works). So it seems related to iSCSI and multipathd.

I'll try @ffilippopoulos' patch and do more testing, but if anyone has any other patches or ideas (I'll try w/o find_multipaths as well), let us know what else we can try.

Tangential to this: ONTAP best practices for Linux recommend enabling find_multipaths and in general don't consider Kubernetes use cases; both of these may benefit from our feedback.

@george-angel

Hello, we have been running with the utilitywarehouse#3 patch, which implements @bswartz's suggestion of having an explicit mpath flag.

We can confirm this also mitigates our issue and is an alternative to the #568 solution.

In the conversation on our case with NetApp support (https://www.netapp.com/company/contact-us/support/), we were told that NetApp cannot support this change and is working on its own fix, which will "probably be available in the next release". After that our case was closed, and we were told to "follow our newsletter".

@gnarl
Contributor

gnarl commented Jun 17, 2021

Hi @george-angel,

While investigating this issue further it was discovered that the mpath flag isn't necessary. It was also decided that additional improvements and refactoring of the osutils.go iSCSI code are needed. In order to complete these code changes, changes are also needed in our functional test suite (e.g. multipathd manipulation, network connectivity, etc.).

Once we are further along with the testing and are convinced that this issue is solved and backwards compatibility is maintained we will be able to share more information with you. Thanks for your collaboration so far.

@george-angel

Thanks for the update @gnarl 👍

@gnarl
Contributor

gnarl commented Jul 26, 2021

This issue was addressed with commit 2eee92b and will be included in the Trident 21.07 release.

@gnarl gnarl closed this as completed Jul 26, 2021