Skip to content

Commit

Permalink
Improve iSCSI multipath, login and detach
Browse files Browse the repository at this point in the history
This PR addresses the following issue:

1. Fix the find_multipaths=no multipath documentation and its handling when mounting a SAN volume, also ensure find_multipaths=yes use case does not break but throws a warning.
2. Enable multipath device to show up even if the number of devices identified is 1 for the default find_multipaths value.
3. However, for the find_multipaths=yes use case do not wait for multipath device if the original number of portal count is 1 (target portal and other iSCSI portals).
4. Add 10 seconds of timeout for iSCSI login.
5. Skip device read when detaching the device.
  • Loading branch information
rohit-arora-dev committed Jun 29, 2021
1 parent d983448 commit 2eee92b
Show file tree
Hide file tree
Showing 5 changed files with 229 additions and 56 deletions.
15 changes: 14 additions & 1 deletion docs/docker/install/host_config.rst
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,12 @@ iSCSI
sudo mpathconf --enable --with_multipathd y
.. note::
You should ensure the ``/etc/multipath.conf`` file does not contain ``find_multipaths yes``
under the ``defaults`` section. Acceptable values for ``find_multipath`` are ``no``,
``greedy``, or ``strict``, but because older versions of multipath-tools don't support values
other than ``no`` it's safest to omit ``find_multipath`` to get the default behavior.

#. Ensure that `iscsid` and `multipathd` are enabled and running:

.. code-block:: bash
Expand Down Expand Up @@ -78,10 +84,17 @@ iSCSI
sudo tee /etc/multipath.conf <<-'EOF'
defaults {
user_friendly_names yes
find_multipaths yes
}
EOF
.. note::
You should ensure the ``/etc/multipath.conf`` file does not contain ``find_multipaths yes``
under the ``defaults`` section. Acceptable values for ``find_multipath`` are ``no``,
``greedy``, or ``strict``, but because older versions of multipath-tools don't support values
other than ``no`` it's safest to omit ``find_multipath`` to get the default behavior.
.. code-block:: bash
sudo service multipath-tools restart
#. Ensure that ``iscsid`` and ``multipathd`` are running:
Expand Down
17 changes: 15 additions & 2 deletions docs/kubernetes/operations/tasks/worker.rst
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ Keep in mind the following considerations when using iSCSI volumes:
.. code-block:: bash
sudo sed -i 's/^\(node.session.auth.chap_algs\).*/\1 = MD5/' /etc/iscsi/iscsid.conf
* When using worker nodes that run RHEL/RedHat CoreOS with iSCSI
PVs, make sure to specify the ``discard`` mountOption in the
`StorageClass <https://kubernetes.io/docs/concepts/storage/storage-classes/#mount-options>`_
Expand Down Expand Up @@ -86,6 +86,12 @@ Keep in mind the following considerations when using iSCSI volumes:
sudo mpathconf --enable --with_multipathd y
.. note::
You should ensure the ``/etc/multipath.conf`` file does not contain ``find_multipaths yes``
under the ``defaults`` section. Acceptable values for ``find_multipath`` are ``no``,
``greedy``, or ``strict``, but because older versions of multipath-tools don't support values
other than ``no`` it's safest to omit ``find_multipath`` to get the default behavior.

#. Ensure that ``iscsid`` and ``multipathd`` are running:

.. code-block:: bash
Expand Down Expand Up @@ -132,10 +138,17 @@ Keep in mind the following considerations when using iSCSI volumes:
sudo tee /etc/multipath.conf <<-'EOF'
defaults {
user_friendly_names yes
find_multipaths yes
}
EOF
.. note::
You should ensure the ``/etc/multipath.conf`` file does not contain ``find_multipaths yes``
under the ``defaults`` section. Acceptable values for ``find_multipath`` are ``no``,
``greedy``, or ``strict``, but because older versions of multipath-tools don't support values
other than ``no`` it's safest to omit ``find_multipath`` to get the default behavior.
.. code-block:: bash
sudo systemctl enable --now multipath-tools.service
sudo service multipath-tools restart
Expand Down
126 changes: 86 additions & 40 deletions utils/osutils.go
Original file line number Diff line number Diff line change
Expand Up @@ -106,6 +106,7 @@ func AttachISCSIVolume(ctx context.Context, name, mountpoint string, publishInfo
var lunSerial = publishInfo.IscsiLunSerial
var fstype = publishInfo.FilesystemType
var options = publishInfo.MountOptions
var advertisedPortalCount = 1 + len(publishInfo.IscsiPortals)

if iscsiInterface == "" {
iscsiInterface = "default"
Expand Down Expand Up @@ -199,7 +200,7 @@ func AttachISCSIVolume(ctx context.Context, name, mountpoint string, publishInfo
return err
}

err = waitForMultipathDeviceForLUN(ctx, lunID, targetIQN)
err = waitForMultipathDeviceForLUN(ctx, lunID, advertisedPortalCount, targetIQN)
if err != nil {
return err
}
Expand Down Expand Up @@ -696,13 +697,12 @@ func getDeviceInfoForLUN(
devicePath = "/dev/" + devices[0]
}

err = ensureDeviceReadableWithRetry(ctx, devicePath)
if err != nil {
return nil, err
}

fsType := ""
if needFSType {
err = ensureDeviceReadableWithRetry(ctx, devicePath)
if err != nil {
return nil, err
}

fsType, err = getFSType(ctx, devicePath)
if err != nil {
Expand Down Expand Up @@ -773,7 +773,7 @@ func getDeviceInfoForMountPath(ctx context.Context, mountpath string) (*ScsiDevi
}

// waitForMultipathDeviceForLUN
func waitForMultipathDeviceForLUN(ctx context.Context, lunID int, iSCSINodeName string) error {
func waitForMultipathDeviceForLUN(ctx context.Context, lunID, advertisedPortalCount int, iSCSINodeName string) error {

fields := log.Fields{
"lunID": lunID,
Expand All @@ -794,61 +794,84 @@ func waitForMultipathDeviceForLUN(ctx context.Context, lunID int, iSCSINodeName
return err
}

waitForMultipathDeviceForDevices(ctx, devices)
waitForMultipathDeviceForDevices(ctx, advertisedPortalCount, devices)
return nil
}

// waitForMultipathDeviceForDevices accepts a list of sd* device names and waits until
// a multipath device is present for at least one of those. It returns the name of the
// multipath device, or an empty string if multipathd isn't running or there is only one path.
func waitForMultipathDeviceForDevices(ctx context.Context, devices []string) string {
func waitForMultipathDeviceForDevices(ctx context.Context, advertisedPortalCount int, devices []string) string {

fields := log.Fields{"devices": devices}
Logc(ctx).WithFields(fields).Debug(">>>> osutils.waitForMultipathDeviceForDevices")
defer Logc(ctx).WithFields(fields).Debug("<<<< osutils.waitForMultipathDeviceForDevices")

if len(devices) <= 1 {
// Even is the device len = 1, we should check for the multipath device, other devices may be
// taking time to be acknowledged
if len(devices) < 1 {
Logc(ctx).Debugf("Skipping multipath discovery, %d device(s) specified.", len(devices))
return ""
} else if !multipathdIsRunning(ctx) {
Logc(ctx).Debug("Skipping multipath discovery, multipathd isn't running.")
return ""
}

maxDuration := multipathDeviceDiscoveryTimeoutSecs * time.Second
multipathDevice := ""
if !multipathdIsRunning(ctx) {
Logc(ctx).Debug("Skipping multipath discovery, multipathd isn't running.")

checkMultipathDeviceExists := func() error {
if len(devices) > 1 || advertisedPortalCount > 1 {
Logc(ctx).Warnf("A multipath device may not exist, the multipathd is not running, however the " +
"number of devices (%d) or portals (%d) is greater than 1.", len(devices), advertisedPortalCount)
}

for _, device := range devices {
multipathDevice = findMultipathDeviceForDevice(ctx, device)
if multipathDevice != "" {
return nil
return ""
} else {
if findMultipathsValue, err := identifyFindMultipathsValue(ctx); err != nil {
// If Trident is unable to find the find_multipaths value, assume it to be default "no"
Logc(ctx).Errorf("unable to get the find_multipaths value from the multipath.conf: %v", err)
} else if findMultipathsValue == "yes" || findMultipathsValue == "smart" {
Logc(ctx).Warnf("A multipath device may not exist, the multipathd is running but find_multipaths " +
"value is set to '%s' in the multipath configuration. !!!Please correct this issue!!!", findMultipathsValue)

if advertisedPortalCount <= 1 {
Logc(ctx).Warnf("Skipping multipath discovery, %d portal(s) specified.", advertisedPortalCount)
return ""
}
}
if multipathDevice == "" {
return errors.New("multipath device not yet present")

maxDuration := multipathDeviceDiscoveryTimeoutSecs * time.Second
multipathDevice := ""

checkMultipathDeviceExists := func() error {

for _, device := range devices {
multipathDevice = findMultipathDeviceForDevice(ctx, device)
if multipathDevice != "" {
return nil
}
}
if multipathDevice == "" {
return errors.New("multipath device not yet present")
}
return nil
}
return nil
}

deviceNotify := func(err error, duration time.Duration) {
Logc(ctx).WithField("increment", duration).Debug("Multipath device not yet present, waiting.")
}
deviceNotify := func(err error, duration time.Duration) {
Logc(ctx).WithField("increment", duration).Debug("Multipath device not yet present, waiting.")
}

multipathDeviceBackoff := backoff.NewExponentialBackOff()
multipathDeviceBackoff.InitialInterval = 1 * time.Second
multipathDeviceBackoff.Multiplier = 1.414 // approx sqrt(2)
multipathDeviceBackoff.RandomizationFactor = 0.1
multipathDeviceBackoff.MaxElapsedTime = maxDuration
multipathDeviceBackoff := backoff.NewExponentialBackOff()
multipathDeviceBackoff.InitialInterval = 1 * time.Second
multipathDeviceBackoff.Multiplier = 1.414 // approx sqrt(2)
multipathDeviceBackoff.RandomizationFactor = 0.1
multipathDeviceBackoff.MaxElapsedTime = maxDuration

// Run the check/scan using an exponential backoff
if err := backoff.RetryNotify(checkMultipathDeviceExists, multipathDeviceBackoff, deviceNotify); err != nil {
Logc(ctx).Warnf("Could not find multipath device after %3.2f seconds.", maxDuration.Seconds())
} else {
Logc(ctx).WithField("multipathDevice", multipathDevice).Debug("Multipath device found.")
// Run the check/scan using an exponential backoff
if err := backoff.RetryNotify(checkMultipathDeviceExists, multipathDeviceBackoff, deviceNotify); err != nil {
Logc(ctx).Warnf("Could not find multipath device after %3.2f seconds.", maxDuration.Seconds())
} else {
Logc(ctx).WithField("multipathDevice", multipathDevice).Debug("Multipath device found.")
}
return multipathDevice
}
return multipathDevice
}

// waitForDevice accepts a device name and waits until it is present and returns error if it times out
Expand Down Expand Up @@ -2854,7 +2877,7 @@ func loginISCSITarget(ctx context.Context, iqn, portal string) error {

args := []string{"-m", "node", "-T", iqn, "-l", "-p", formatPortal(portal)}
listAllISCSIDevices(ctx)
if _, err := execIscsiadmCommand(ctx, args...); err != nil {
if _, err := execIscsiadmCommandWithTimeout(ctx, 10, args...); err != nil {
Logc(ctx).WithField("error", err).Error("Error logging in to iSCSI target.")
return err
}
Expand Down Expand Up @@ -2919,7 +2942,7 @@ func loginWithChap(
}

loginArgs := append(args, []string{"--login"}...)
if _, err := execIscsiadmCommand(ctx, loginArgs...); err != nil {
if _, err := execIscsiadmCommandWithTimeout(ctx, 10, loginArgs...); err != nil {
Logc(ctx).Error("Error running iscsiadm login.")
return err
}
Expand Down Expand Up @@ -3103,6 +3126,11 @@ func execIscsiadmCommand(ctx context.Context, args ...string) ([]byte, error) {
return execCommand(ctx, "iscsiadm", args...)
}

// execIscsiadmCommandWithTimeout uses the 'iscsiadm' command to perform operations with timeout
func execIscsiadmCommandWithTimeout(ctx context.Context, timeout time.Duration, args ...string) ([]byte, error) {
return execCommandWithTimeout(ctx, "iscsiadm", timeout, true, args...)
}

// execCommand invokes an external process
func execCommand(ctx context.Context, name string, args ...string) ([]byte, error) {

Expand Down Expand Up @@ -3265,3 +3293,21 @@ func listAllISCSIDevices(ctx context.Context) {
"iscsiadm -m session output": string(out2),
}).Trace("Listing all iSCSI Devices.")
}

// identifyFindMultipathsValue reads /etc/multipath.conf and identifies find_multipaths value (if set)
func identifyFindMultipathsValue(ctx context.Context) (string, error) {
if output, err := execCommandWithTimeout(ctx, "multipathd", 5, false, "show", "config"); err != nil {
Logc(ctx).WithFields(log.Fields{
"error": err,
}).Error("Could not read multipathd configuration")

return "", fmt.Errorf("could not read multipathd configuration: %v", err)
} else {
findMultipathsValue := GetFindMultipathValue(string(output))

Logc(ctx).WithField("findMultipathsValue", findMultipathsValue).Debug("Multipath find_multipaths value found.")

return findMultipathsValue, nil
}

}
25 changes: 25 additions & 0 deletions utils/utils.go
Original file line number Diff line number Diff line change
Expand Up @@ -497,6 +497,31 @@ func CountSpacesBeforeText(text string) int {
return len(text) - len(strings.TrimLeft(text, " \t"))
}

// GetFindMultipathValue returns the value of find_multipaths
// Returned values:
// no (or off): Create a multipath device for every path that is not explicitly disabled
// yes (or on): Create a device if one of some conditions are met
// other possible values: smart, greedy, strict
func GetFindMultipathValue(text string) string {

// This matches pattern in a multiline string of type " find_multipaths: yes"
tagsWithIndentationRegex := regexp.MustCompile(`(?m)^[\t ]*find_multipaths[\t ]*["|']?(?P<tagName>[\w-_]+)["|']?[\t ]*$`)
tag := tagsWithIndentationRegex.FindStringSubmatch(text)

// Since we have two of `()` in the pattern, we want to use the tag identified by the second `()`.
if len(tag) > 1 {
if tag[1] == "off" {
return "no"
} else if tag[1] == "on" {
return "yes"
}

return tag[1]
}

return ""
}

// GetNFSVersionFromMountOptions accepts a set of mount options, a default NFS version, and a list of
// supported NFS versions, and it returns the NFS version specified by the mount options, or the default
// if none is found, plus an error (if any). If a set of supported versions is supplied, and the returned
Expand Down

0 comments on commit 2eee92b

Please sign in to comment.