MMP interval and fail_intervals in uberblock
When Multihost is enabled and a pool is imported, uberblock writes include ub_mmp_delay to allow an importing node to calculate the duration of an activity test. This value alone, however, is not enough information.

If zfs_multihost_fail_intervals > 0 on the node with the pool imported, the safe minimum duration of the activity test is well defined, but does not depend on ub_mmp_delay:

    zfs_multihost_fail_intervals * zfs_multihost_interval

If zfs_multihost_fail_intervals == 0 on that node, there is no such well-defined safe duration, and the importing host cannot tell whether mmp_delay is high due to I/O delays or due to a very large zfs_multihost_interval setting on the host which last imported the pool. As a result, it may use a far longer period for the activity test than is necessary.

This patch renames ub_mmp_sequence to ub_mmp_config and uses it to record the zfs_multihost_interval and zfs_multihost_fail_intervals values, as well as the mmp sequence. This allows the importing host to calculate a shorter activity test duration in most situations. These values are also added to the multihost_history kstat records.

ZTS tests are added to verify the new functionality.

In addition, it makes a few other improvements:

* It updates the "sequence" part of ub_mmp_config when MMP writes occur in between syncs. This allows an importing host to detect MMP on the remote host sooner when the pool is idle, as it is not limited to the granularity of ub_timestamp (1 second).
* It issues writes immediately when zfs_multihost_interval is changed, so remote hosts see the updated value as soon as possible.
* It fixes a bug where setting zfs_multihost_fail_intervals = 1 results in immediate pool suspension.
* It reports the nanoseconds remaining in the activity test via /proc/spl/kstat/zfs/<pool>/activity_test (during a tryimport, where the test is normally performed, the pool name is $import).
* It fixes a cleanup issue with test mmp_active_import, where ztest is not killed for some failure modes.
* In ZTS, when checking whether the activity test occurred, it checks against a duration specified via an argument, so that it is clear from reading the test what is expected.

Signed-off-by: Olaf Faaland <email@example.com>
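The idea of recording the interval, fail_intervals, and sequence in a single 64-bit field, and then deriving an activity test duration from it, can be sketched roughly as follows. This is a minimal illustration and not the code from this patch: the bit layout, the cfg_* and activity_test_ms names, and the import_intervals multiplier used in the fail_intervals == 0 fallback are all assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * HYPOTHETICAL layout of a 64-bit config word (not the patch's layout):
 *   bits  0-23  interval in milliseconds
 *   bits 24-39  fail_intervals
 *   bits 40-55  sequence number of MMP writes between syncs
 */
#define	CFG_INTERVAL_MASK	0xFFFFFFULL
#define	CFG_FAIL_SHIFT		24
#define	CFG_FAIL_MASK		0xFFFFULL
#define	CFG_SEQ_SHIFT		40
#define	CFG_SEQ_MASK		0xFFFFULL

/* Pack the three values into one word; the sequence wraps at 16 bits. */
static inline uint64_t
cfg_pack(uint64_t interval_ms, uint64_t fail_intervals, uint64_t seq)
{
	assert(interval_ms <= CFG_INTERVAL_MASK);
	assert(fail_intervals <= CFG_FAIL_MASK);
	return (interval_ms |
	    (fail_intervals << CFG_FAIL_SHIFT) |
	    ((seq & CFG_SEQ_MASK) << CFG_SEQ_SHIFT));
}

static inline uint64_t
cfg_interval_ms(uint64_t cfg)
{
	return (cfg & CFG_INTERVAL_MASK);
}

static inline uint64_t
cfg_fail_intervals(uint64_t cfg)
{
	return ((cfg >> CFG_FAIL_SHIFT) & CFG_FAIL_MASK);
}

static inline uint64_t
cfg_seq(uint64_t cfg)
{
	return ((cfg >> CFG_SEQ_SHIFT) & CFG_SEQ_MASK);
}

/*
 * Activity test duration in milliseconds, as seen by the importing host.
 * If the remote host set fail_intervals > 0, the duration is the product
 * described in the commit message. Otherwise fall back to an ASSUMED
 * heuristic: the larger of the recorded interval and the observed MMP
 * delay, multiplied by a locally chosen number of intervals.
 */
static inline uint64_t
activity_test_ms(uint64_t cfg, uint64_t mmp_delay_ns,
    uint64_t import_intervals)
{
	uint64_t interval_ms = cfg_interval_ms(cfg);
	uint64_t fail_intervals = cfg_fail_intervals(cfg);
	uint64_t delay_ms = mmp_delay_ns / 1000000ULL;

	if (fail_intervals > 0)
		return (fail_intervals * interval_ms);

	return ((delay_ms > interval_ms ? delay_ms : interval_ms) *
	    import_intervals);
}
```

With these assumptions, a remote host configured with a 1000 ms interval and 10 fail intervals yields a 10000 ms test regardless of the observed delay, while a host with fail_intervals set to 0 falls back to the delay-based estimate.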
Showing with 822 additions and 195 deletions.
- +14 −1 cmd/zdb/zdb.c
- +10 −0 include/sys/mmp.h
- +2 −1 include/sys/spa.h
- +53 −3 include/sys/uberblock_impl.h
- +28 −22 man/man5/zfs-module-parameters.5
- +203 −100 module/zfs/mmp.c
- +98 −21 module/zfs/spa.c
- +90 −16 module/zfs/spa_stats.c
- +10 −2 module/zfs/uberblock.c
- +12 −1 module/zfs/vdev_label.c
- +1 −1 tests/runfiles/linux.run
- +1 −0 tests/zfs-tests/tests/functional/mmp/Makefile.am
- +6 −0 tests/zfs-tests/tests/functional/mmp/mmp.cfg
- +106 −13 tests/zfs-tests/tests/functional/mmp/mmp.kshlib
- +15 −5 tests/zfs-tests/tests/functional/mmp/mmp_active_import.ksh
- +80 −0 tests/zfs-tests/tests/functional/mmp/mmp_activity_test_duration.ksh
- +16 −2 tests/zfs-tests/tests/functional/mmp/mmp_inactive_import.ksh
- +17 −5 tests/zfs-tests/tests/functional/mmp/mmp_on_uberblocks.ksh
- +58 −2 tests/zfs-tests/tests/functional/mmp/mmp_reset_interval.ksh
- +2 −0 tests/zfs-tests/tests/functional/mmp/setup.ksh