New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Sign up for GitHub

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jump to bottom

Added auto_online_001_pos test as apart of ZED/FMA tests in the ZTS #5350

Closed

sydneyvanda wants to merge 1 commit into openzfs:master from sydneyvanda:zts_auto-online

Contributor

sydneyvanda commented Oct 29, 2016

Also suppressed output from is_real_device/is_loop_device/is_mpath_device -
was making the log file very cluttered with useless error messages
"ie /dev/mapper/sdc is not a block device"

Auto-online works with real devices (sd- and mpath).
Multipath functionality needs more testing with real multipath set-up.

Signed-off-by: Sydney Vanda sydney.m.vanda@intel.com
Reviewed-by: Don Brady don.brady@intel.com

mention-bot commented Oct 29, 2016

@sydneyvanda, thanks for your PR! By analyzing the history of the files in this pull request, we identified @behlendorf, @jwk404 and @ahrens to be potential reviewers.

sydneyvanda mentioned this pull request

Update migration tests #5315

Merged

sydneyvanda force-pushed the zts_auto-online branch from 18eb5a9 to 422da65 Compare

October 31, 2016 17:15

behlendorf requested changes

View reviewed changes

tests/zfs-tests/tests/functional/fault/auto_online_001_pos.ksh

+              if (($? != 0)); then
+              	log_note "Starting ZED"
+              	log_must $SYSTEMCTL start zfs-zed
+              fi

Contributor

behlendorf Nov 2, 2016

I'd like to avoid adding a dependency on systemd for these tests. Adding this dependency rules out running the tests in-tree and on systems which don't use systemd.

Instead I suggest starting the ZED explicitly in the setup.sh and terminating it in cleanup.sh. If the ZED is already running I'd suggest skipping this test group since we can't be sure the right version of the ZED is running since we didn't start it. Checking for /var/run/zed.pid is a easy way to do this.

Starting the ZED ourselves has a few advantages:

We'll know for certain it's the version we expect.
We can control the location of the ZEDLET directory, zed.pid, and zed.state files.
We can run it in the foreground with the verbose flag and control the location of the log file.
As features are added to the ZED it'll be easy to configure them for testing.

tests/zfs-tests/include/default.cfg.in

+              	else
+              		log_note "System logging path not found"
+              		export ZED_LOG=""
+              	fi

Contributor

behlendorf Nov 2, 2016

Can be removed if the test suite starts/stop the ZED.

Contributor Author

sydneyvanda Nov 14, 2016

@behlendorf are you suggesting not using log at all? I was thinking of specifying the zedlet directory when starting zed like you suggested, and before that copying "/usr/local/etc/zfs/zed.d/all-syslog.sh" into either /var/tmp or some other directory (to bypass in-tree permissions of zedlets) and then still parsing the syslog for the events (more filtered then zpool events). Unless you were trying to suggest something else?

Contributor

behlendorf Nov 14, 2016

I was suggesting that if you launch the ZED as part of the test suite you can redirect stdout+stderr to files of your choosing. Plus when running in the foreground additional information will be logged which might be helpful for the purposes of writing test cases. It's possible we'll need to tweak the ZED logging to accommodate this but that's OK.

I was thinking of specifying the zedlet directory when starting zed like you suggested

Yeah I think that makes sense. That'll let you customize it as needed. You'll also want to specify a custom pid file.

tests/zfs-tests/include/libtest.shlib

+              		if ( is_real_device $device ); then
+              			dev_id="$($UDEVADM info $DEV_DSKDIR/$device | $EGREP \
+              			    disk/by-id | $NAWK '{print $2; exit}' \
+              			    | $NAWK -F / '{print $3}')"

Contributor

behlendorf Nov 2, 2016

I suspect this won't work on CentOS6 due to the older version of udevadm but we'll have to try it and see.

tests/zfs-tests/include/libtest.shlib

+              	else
+              		log_fail
+              	fi
+              }

Contributor

behlendorf Nov 2, 2016

Is this function really needed? It looks like we may be able to do without this, see my comment on on_off_disk. Then we'd just need the pass the device name/path around.

tests/zfs-tests/include/libtest.shlib

+              			    | $NAWK -F : '{print $2}')"
+              			targ="$($LS /sys/block/$dm_name/slaves/$slave/device/scsi_device/ \
+              			    | $NAWK -F : '{print $3}')"
+              			lun="$($LS /sys/block/$dm_name/slaves/$slave/device/scsi_device/ \

Contributor

behlendorf Nov 2, 2016

[cstyle] 80 character limit. The same applied to everything in this patch.

tests/zfs-tests/include/libtest.shlib

+              				$ECHO "offline" > \
+              				    /sys/class/scsi_device/${HBA}:${CHAN}:${TARG}:${LUN}/device/state
+              				$ECHO "1" > \
+              				    /sys/class/scsi_device/${HBA}:${CHAN}:${TARG}:${LUN}/device/delete

Contributor

behlendorf Nov 2, 2016

You should be able to achieve this more realistically by offlining and removing all underlying devices from the slaves directory above. This way you'll also be able to exercise the multipath code for detecting that an underlying device went away. Plus you won't need to know the hba, chan, target, len. Which is nice.

tests/zfs-tests/include/libtest.shlib

+              			fi
+              		elif [[ $state == "online" ]]; then
+              			$ECHO "$CHAN $TARG $LUN" > \
+              			    /sys/class/scsi_host/host${HBA}/scan

Contributor

behlendorf Nov 2, 2016

It's good that this is so precisely targeted but we could probably also force a full rescan:

echo "- - -" > /sys/class/scsi_host/host*/scan

Contributor Author

sydneyvanda Nov 14, 2016

@behlendorf played around with this idea, but I don't think a full rescan of all SCSI hosts is needed (there is a script within RedHat to accomplish this if so) but maybe unneeded. instead I was thinking just scan the 1 host (still would not need chan targ or lun)

Contributor

behlendorf Nov 14, 2016

I suspect forcing a full rescan would be a simpler and more robust way to handle this. Plus it's the kind of thing people will do and it's need to not cause problems. As you pointed out RedHat provides a scsi-rescan tool as do other distributions.

tests/zfs-tests/tests/functional/fault/auto_online_001_pos.ksh

+              disk1=${DISKS%% *}
+              if ! ( is_real_device $disk1 || is_mpath_device $disk1 ); then
+              	log_unsupported "This test can only be run on real disks."
+              fi

Contributor

behlendorf Nov 2, 2016

If you move this block to setup.sh the whole test group can be cleanly skipped. Although I think we can probably find a way to run most/all of these tests without real block devices. The scsi_debug.ko module provides a convenient way to create a virtual real scsi device and a mechanism to cause interesting failures.

sydneyvanda force-pushed the zts_auto-online branch 2 times, most recently from e543aaa to 9d6a4c2 Compare

December 5, 2016 16:36


          Added auto-online test as apart of ZED/FMA in the ZFS Test Suite

aec65eb

Automated auto-online test to go along with ZED FMA integration (PR 4673)
auto_online_001.pos works with real devices (sd- and mpath) and with non-real
block devices (loop) by adding a scsi_debug device to the pool

Note: In order for test group to run, ZED must not currently be running.
Kernel 3.16.37 or higher needed for scsi_debug to work properly
If timeout occurs on test using a scsi_debug device (error noticed on Ubuntu
system), a reboot might be needed in order for test to pass. (more
investigation into this)

Also suppressed output from is_real_device/is_loop_device/is_mpath_device -
was making the log file very cluttered with useless error messages
"ie /dev/mapper/sdc is not a block device" from previous patch

Signed-off-by: Sydney Vanda <sydney.m.vanda@intel.com>
Reviewed-by: Don Brady <don.brady@intel.com>

sydneyvanda force-pushed the zts_auto-online branch from 9d6a4c2 to aec65eb Compare

December 7, 2016 03:11

Contributor

behlendorf commented Feb 24, 2017

Updated version in #5774

behlendorf closed this

behlendorf moved this from In Progress to Done in 0.7.0-rc4

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment