Merge pull request #67140 from NetApp/multipath-race-fix
Automatic merge from submit-queue (batch tested with PRs 67017, 67190, 67110, 67140, 66873). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Add wait loop for multipath devices to appear

It takes a variable amount of time for the multipath daemon
to create /dev/dm-XX in response to new LUNs being discovered.
The old iscsi_util code only discovered the multipath device
if it was created quickly enough, but in a significant number
of cases, kubelet would grab one of the individual paths and
put a filesystem on it before multipathd could construct a
multipath device.

This change waits for the multipath device to get created for
up to 10 seconds, but only if the PV actually had more than
one portal.

fixes #60894

```release-note
Dynamic provisioners that create iSCSI PVs can ensure that multipath is used by specifying 2 or more target portals in the PV, which will cause kubelet to wait up to 10 seconds for the multipath device. PVs with just one portal continue to work as before, with kubelet not waiting for the multipath device and just using the first disk it finds.
```
Kubernetes Submit Queue committed Aug 11, 2018
2 parents 032a096 + 39e52dd commit 1dfe2e8
Showing 1 changed file with 40 additions and 13 deletions.
53 changes: 40 additions & 13 deletions pkg/volume/iscsi/iscsi_util.go
@@ -241,6 +241,31 @@ func scanOneLun(hostNumber int, lunNumber int) error {
 	return nil
 }
 
+func waitForMultiPathToExist(devicePaths []string, maxRetries int, deviceUtil volumeutil.DeviceUtil) string {
+	if 0 == len(devicePaths) {
+		return ""
+	}
+
+	for i := 0; i < maxRetries; i++ {
+		for _, path := range devicePaths {
+			// There shouldnt be any empty device paths. However adding this check
+			// for safer side to avoid the possibility of an empty entry.
+			if path == "" {
+				continue
+			}
+			// check if the dev is using mpio and if so mount it via the dm-XX device
+			if mappedDevicePath := deviceUtil.FindMultipathDeviceForDevice(path); mappedDevicePath != "" {
+				return mappedDevicePath
+			}
+		}
+		if i == maxRetries-1 {
+			break
+		}
+		time.Sleep(time.Second)
+	}
+	return ""
+}
+
 // AttachDisk returns devicePath of volume if attach succeeded otherwise returns error
 func (util *ISCSIUtil) AttachDisk(b iscsiDiskMounter) (string, error) {
 	var devicePath string
@@ -381,19 +406,21 @@ func (util *ISCSIUtil) AttachDisk(b iscsiDiskMounter) (string, error) {
 		glog.Errorf("iscsi: last error occurred during iscsi init:\n%v", lastErr)
 	}
 
-	//Make sure we use a valid devicepath to find mpio device.
-	devicePath = devicePaths[0]
-	for _, path := range devicePaths {
-		// There shouldnt be any empty device paths. However adding this check
-		// for safer side to avoid the possibility of an empty entry.
-		if path == "" {
-			continue
-		}
-		// check if the dev is using mpio and if so mount it via the dm-XX device
-		if mappedDevicePath := b.deviceUtil.FindMultipathDeviceForDevice(path); mappedDevicePath != "" {
-			devicePath = mappedDevicePath
-			break
-		}
+	// Try to find a multipath device for the volume
+	if 1 < len(bkpPortal) {
+		// If the PV has 2 or more portals, wait up to 10 seconds for the multipath
+		// device to appear
+		devicePath = waitForMultiPathToExist(devicePaths, 10, b.deviceUtil)
+	} else {
+		// For PVs with 1 portal, just try one time to find the multipath device. This
+		// avoids a long pause when the multipath device will never get created, and
+		// matches legacy behavior.
+		devicePath = waitForMultiPathToExist(devicePaths, 1, b.deviceUtil)
+	}
+
+	// When no multipath device is found, just use the first (and presumably only) device
+	if devicePath == "" {
+		devicePath = devicePaths[0]
 	}
 
 	glog.V(5).Infof("iscsi: AttachDisk devicePath: %s", devicePath)