Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added auto_online_001_pos test as apart of ZED/FMA tests in the ZTS #5350

Closed
wants to merge 1 commit into from

Conversation

sydneyvanda
Copy link
Contributor

Also suppressed output from is_real_device/is_loop_device/is_mpath_device -
was making the log file very cluttered with useless error messages
"ie /dev/mapper/sdc is not a block device"

Auto-online works with real devices (sd- and mpath).
Multipath functionality needs more testing with real multipath set-up.

Signed-off-by: Sydney Vanda sydney.m.vanda@intel.com
Reviewed-by: Don Brady don.brady@intel.com

@mention-bot
Copy link

@sydneyvanda, thanks for your PR! By analyzing the history of the files in this pull request, we identified @behlendorf, @jwk404 and @ahrens to be potential reviewers.

if (($? != 0)); then
log_note "Starting ZED"
log_must $SYSTEMCTL start zfs-zed
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to avoid adding a dependency on systemd for these tests. Adding this dependency rules out running the tests in-tree and on systems which don't use systemd.

Instead I suggest starting the ZED explicitly in the setup.sh and terminating it in cleanup.sh. If the ZED is already running I'd suggest skipping this test group since we can't be sure the right version of the ZED is running since we didn't start it. Checking for /var/run/zed.pid is a easy way to do this.

Starting the ZED ourselves has a few advantages:

  • We'll know for certain it's the version we expect.
  • We can control the location of the ZEDLET directory, zed.pid, and zed.state files.
  • We can run it in the foreground with the verbose flag and control the location of the log file.
  • As features are added to the ZED it'll be easy to configure them for testing.

else
log_note "System logging path not found"
export ZED_LOG=""
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be removed if the test suite starts/stop the ZED.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@behlendorf are you suggesting not using log at all? I was thinking of specifying the zedlet directory when starting zed like you suggested, and before that copying "/usr/local/etc/zfs/zed.d/all-syslog.sh" into either /var/tmp or some other directory (to bypass in-tree permissions of zedlets) and then still parsing the syslog for the events (more filtered then zpool events). Unless you were trying to suggest something else?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was suggesting that if you launch the ZED as part of the test suite you can redirect stdout+stderr to files of your choosing. Plus when running in the foreground additional information will be logged which might be helpful for the purposes of writing test cases. It's possible we'll need to tweak the ZED logging to accommodate this but that's OK.

I was thinking of specifying the zedlet directory when starting zed like you suggested

Yeah I think that makes sense. That'll let you customize it as needed. You'll also want to specify a custom pid file.

if ( is_real_device $device ); then
dev_id="$($UDEVADM info $DEV_DSKDIR/$device | $EGREP \
disk/by-id | $NAWK '{print $2; exit}' \
| $NAWK -F / '{print $3}')"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect this won't work on CentOS6 due to the older version of udevadm but we'll have to try it and see.

else
log_fail
fi
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this function really needed? It looks like we may be able to do without this, see my comment on on_off_disk. Then we'd just need the pass the device name/path around.

| $NAWK -F : '{print $2}')"
targ="$($LS /sys/block/$dm_name/slaves/$slave/device/scsi_device/ \
| $NAWK -F : '{print $3}')"
lun="$($LS /sys/block/$dm_name/slaves/$slave/device/scsi_device/ \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[cstyle] 80 character limit. The same applied to everything in this patch.

$ECHO "offline" > \
/sys/class/scsi_device/${HBA}:${CHAN}:${TARG}:${LUN}/device/state
$ECHO "1" > \
/sys/class/scsi_device/${HBA}:${CHAN}:${TARG}:${LUN}/device/delete
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should be able to achieve this more realistically by offlining and removing all underlying devices from the slaves directory above. This way you'll also be able to exercise the multipath code for detecting that an underlying device went away. Plus you won't need to know the hba, chan, target, len. Which is nice.

fi
elif [[ $state == "online" ]]; then
$ECHO "$CHAN $TARG $LUN" > \
/sys/class/scsi_host/host${HBA}/scan
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good that this is so precisely targeted but we could probably also force a full rescan:

echo "- - -" > /sys/class/scsi_host/host*/scan

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@behlendorf played around with this idea, but I don't think a full rescan of all SCSI hosts is needed (there is a script within RedHat to accomplish this if so) but maybe unneeded. instead I was thinking just scan the 1 host (still would not need chan targ or lun)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect forcing a full rescan would be a simpler and more robust way to handle this. Plus it's the kind of thing people will do and it's need to not cause problems. As you pointed out RedHat provides a scsi-rescan tool as do other distributions.

disk1=${DISKS%% *}
if ! ( is_real_device $disk1 || is_mpath_device $disk1 ); then
log_unsupported "This test can only be run on real disks."
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you move this block to setup.sh the whole test group can be cleanly skipped. Although I think we can probably find a way to run most/all of these tests without real block devices. The scsi_debug.ko module provides a convenient way to create a virtual real scsi device and a mechanism to cause interesting failures.

@sydneyvanda sydneyvanda force-pushed the zts_auto-online branch 2 times, most recently from e543aaa to 9d6a4c2 Compare December 5, 2016 16:36
Automated auto-online test to go along with ZED FMA integration (PR 4673)
auto_online_001.pos works with real devices (sd- and mpath) and with non-real
block devices (loop) by adding a scsi_debug device to the pool

Note: In order for test group to run, ZED must not currently be running.
Kernel 3.16.37 or higher needed for scsi_debug to work properly
If timeout occurs on test using a scsi_debug device (error noticed on Ubuntu
system), a reboot might be needed in order for test to pass. (more
investigation into this)

Also suppressed output from is_real_device/is_loop_device/is_mpath_device -
was making the log file very cluttered with useless error messages
"ie /dev/mapper/sdc is not a block device" from previous patch

Signed-off-by: Sydney Vanda <sydney.m.vanda@intel.com>
Reviewed-by: Don Brady <don.brady@intel.com>
@behlendorf
Copy link
Contributor

Updated version in #5774

@behlendorf behlendorf closed this Feb 24, 2017
@behlendorf behlendorf moved this from In Progress to Done in 0.7.0-rc4 Mar 20, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

3 participants