Reimplement vdev_random_leaf and rename it #6665

Merged 1 commit into openzfs:master on Sep 22, 2017

@ofaaland ofaaland commented Sep 21, 2017

Description

Rename vdev_random_leaf() to mmp_random_leaf(), since it is defined in mmp.c.

Reimplement to recursively walk the device tree to select the leaf. It
searches the entire tree, so that a return value of (NULL) indicates
there were no usable leaves in the pool; all were either not writeable
or had pending mmp writes.

It still chooses the starting child randomly at each level of the tree,
so if the pool's devices are healthy, the mmp writes go to random leaves
with an even distribution. This was verified by testing with
zfs_multihost_history enabled.
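
The selection strategy described above can be illustrated with a small standalone sketch. This is not the code merged in this PR: `toy_vdev_t`, its fields, and `toy_random_leaf()` are hypothetical stand-ins for `vdev_t`, `vdev_writeable()`, `vdev_mmp_pending`, and the real function, and `rand()` stands in for whatever random source the module uses. It only shows the shape of the idea: pick a random starting child, then visit every child once, so the walk always terminates and returns NULL only when no usable leaf exists.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for vdev_t; all names here are illustrative assumptions. */
typedef struct toy_vdev {
	int writeable;			/* models vdev_writeable(vd) */
	int mmp_pending;		/* models vd->vdev_mmp_pending */
	size_t children;		/* models vd->vdev_children */
	struct toy_vdev **child;	/* models vd->vdev_child[] */
} toy_vdev_t;

/*
 * Return a writeable leaf with no pending MMP write, or NULL if the
 * subtree rooted at vd contains no such leaf.
 */
static toy_vdev_t *
toy_random_leaf(toy_vdev_t *vd)
{
	assert(vd != NULL);

	/* An unwriteable vdev rules out its entire subtree. */
	if (!vd->writeable)
		return (NULL);

	/* Leaf: usable only if no MMP write is already in flight. */
	if (vd->children == 0)
		return (vd->mmp_pending == 0 ? vd : NULL);

	/*
	 * Start at a random child, then try each child exactly once,
	 * wrapping around. rand() % n is slightly biased; it is used
	 * here only to keep the sketch short.
	 */
	size_t start = (size_t)rand() % vd->children;
	for (size_t i = 0; i < vd->children; i++) {
		toy_vdev_t *leaf =
		    toy_random_leaf(vd->child[(start + i) % vd->children]);
		if (leaf != NULL)
			return (leaf);
	}

	/* Every child was searched and none had a usable leaf. */
	return (NULL);
}
```

Because each level visits all of its children before giving up, a writeable interior vdev with no writeable children simply yields NULL for that subtree rather than being retried.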

Motivation and Context

The earlier implementation could end up spinning forever if a pool had a
vdev marked writeable, none of whose children were writeable. It also
did not guarantee that if a writeable leaf vdev existed, it would be
found.

Fixes #6631

How Has This Been Tested?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • All commit messages are properly formatted and contain Signed-off-by.
  • Change has been approved by a ZFS on Linux member.

codecov bot commented Sep 21, 2017

Codecov Report

Merging #6665 into master will increase coverage by 0.08%.
The diff coverage is 66.66%.

@@            Coverage Diff             @@
##           master    #6665      +/-   ##
==========================================
+ Coverage    67.3%   67.38%   +0.08%     
==========================================
  Files         196      196              
  Lines       70336    70347      +11     
  Branches    13919    13920       +1     
==========================================
+ Hits        47337    47405      +68     
+ Misses      17592    17543      -49     
+ Partials     5407     5399       -8

Impacted Files               Coverage Δ
module/zfs/mmp.c             87.5% <66.66%> (+1.22%) ⬆️
module/zfs/edonr_zfs.c       0% <0%> (-76%) ⬇️
module/zfs/skein_zfs.c       0% <0%> (-72.73%) ⬇️
module/zfs/zle.c             56.66% <0%> (-43.34%) ⬇️
module/zfs/zio_checksum.c    80.23% <0%> (-4.57%) ⬇️
module/zfs/vdev_queue.c      90.98% <0%> (-3.53%) ⬇️
module/zfs/rrwlock.c         90.59% <0%> (-1.71%) ⬇️
module/zfs/vdev_file.c       75.86% <0%> (-1.15%) ⬇️
module/zcommon/zfs_uio.c     87.2% <0%> (-1.03%) ⬇️
module/zfs/vdev_raidz.c      82.49% <0%> (-0.98%) ⬇️
... and 56 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5df5d06...cf3ea10.

@behlendorf behlendorf left a comment

Functionally this looks good! Thanks.

module/zfs/mmp.c Outdated
* that when the tree is healthy, the leaf chosen will be random with even
* distribution. If there are unhealthy vdevs in the tree, the distribution
* will be really poor only if a large proportion of the vdevs are unhealthy,
* in which case there are other more pressing problems..
Contributor

nit: extra period at end of line

Contributor Author

done

module/zfs/mmp.c Outdated
                else
                        return (NULL);
        }
        if (vd->vdev_children == 0) {
Contributor

When vd->vdev_children == 0, vd->vdev_ops->vdev_op_leaf == B_TRUE. Many of the existing functions that recursively walk the tree already depend on this, so you could simplify this to:

        if (vd->vdev_ops->vdev_op_leaf) {
                if (vd->vdev_mmp_pending == 0)
                        return (vd);
                else
                        return (NULL);
        }

Or even

        if (vd->vdev_ops->vdev_op_leaf)
                return (vd->vdev_mmp_pending == 0 ? vd : NULL);

Contributor Author

done

module/zfs/mmp.c Outdated
* Since we hold SCL_STATE, neither pool nor vdev state can
* change. Therefore, if the root is not dead, there is a
* child that is not dead, and so on down to a leaf.
* Base case: vd has no children
*/
Contributor

In this case, I suggest dropping the Base case and Recursive case comments which I don't think add much. The function is already pretty small and concise (which is good!).

Contributor Author

done

@tcaputi tcaputi left a comment

LGTM

Rename it as mmp_random_leaf() since it is defined in mmp.c.

The earlier implementation could end up spinning forever if a pool had a
vdev marked writeable, none of whose children were writeable.  It also
did not guarantee that if a writeable leaf vdev existed, it would be
found.

Reimplement to recursively walk the device tree to select the leaf.  It
searches the entire tree, so that a return value of (NULL) indicates
there were no usable leaves in the pool; all were either not writeable
or had pending mmp writes.

It still chooses the starting child randomly at each level of the tree,
so if the pool's devices are healthy, the mmp writes go to random leaves
with an even distribution.  This was verified by testing using
zfs_multihost_history enabled.

Fixes openzfs#6631

Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
@ofaaland
Contributor Author

Pushed an update that addressed the review feedback.

@behlendorf behlendorf merged commit d410c6d into openzfs:master Sep 22, 2017
@ofaaland ofaaland deleted the b_mmp_random_leaf branch September 22, 2017 23:27
@behlendorf behlendorf added this to PR Needed for 0.7.4 in 0.7.4 Oct 31, 2017
tonyhutter pushed a commit to tonyhutter/zfs that referenced this pull request Nov 21, 2017
Reviewed by: Thomas Caputi <tcaputi@datto.com>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov>
Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
Closes openzfs#6631 
Closes openzfs#6665
@behlendorf behlendorf moved this from PR Needed for 0.7.4 to Merged to 0.7.4 in 0.7.4 Dec 12, 2017
Successfully merging this pull request may close these issues.

vdev_random_leaf() can loop forever