New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reimplement vdev_random_leaf and rename it #6665
Conversation
Codecov Report
@@ Coverage Diff @@
## master #6665 +/- ##
==========================================
+ Coverage 67.3% 67.38% +0.08%
==========================================
Files 196 196
Lines 70336 70347 +11
Branches 13919 13920 +1
==========================================
+ Hits 47337 47405 +68
+ Misses 17592 17543 -49
+ Partials 5407 5399 -8
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Functionally this looks good! Thanks.
module/zfs/mmp.c
Outdated
| * that when the tree is healthy, the leaf chosen will be random with even | ||
| * distribution. If there are unhealthy vdevs in the tree, the distribution | ||
| * will be really poor only if a large proportion of the vdevs are unhealthy, | ||
| * in which case there are other more pressing problems.. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: extra period at end of line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
module/zfs/mmp.c
Outdated
| else | ||
| return (NULL); | ||
| } | ||
| if (vd->vdev_children == 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When vd->vdev_children == 0 then vd->vdev_ops->vdev_op_leaf == B_TRUE. Many of the existing functions which recursively walk the tree already depends on this. So you could simplify this too.
if (vd->vdev_ops->vdev_op_leaf) {
if (vd->vdev_mmp_pending == 0)
return (vd);
else
return (NULL);
}Or even
if (vd->vdev_ops->vdev_op_leaf)
return (vd->vdev_mmp_pending == 0 ? vd : NULL);There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
module/zfs/mmp.c
Outdated
| * Since we hold SCL_STATE, neither pool nor vdev state can | ||
| * change. Therefore, if the root is not dead, there is a | ||
| * child that is not dead, and so on down to a leaf. | ||
| * Base case: vd has no children | ||
| */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, I suggest dropping the Base case and Recursive case comments which I don't think add much. The function is already pretty small and concise (which is good!).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Rename it as mmp_random_leaf() since it is defined in mmp.c. The earlier implementation could end up spinning forever if a pool had a vdev marked writeable, none of whose children were writeable. It also did not guarantee that if a writeable leaf vdev existed, it would be found. Reimplement to recursively walk the device tree to select the leaf. It searches the entire tree, so that a return value of (NULL) indicates there were no usable leaves in the pool; all were either not writeable or had pending mmp writes. It still chooses the starting child randomly at each level of the tree, so if the pool's devices are healthy, the mmp writes go to random leaves with an even distribution. This was verified by testing using zfs_multihost_history enabled. Fixes openzfs#6631 Signed-off-by: Olaf Faaland <faaland1@llnl.gov>
e82a2ae
to
cf3ea10
Compare
|
Pushed an update that addressed the review feedback. |
Rename it as mmp_random_leaf() since it is defined in mmp.c. The earlier implementation could end up spinning forever if a pool had a vdev marked writeable, none of whose children were writeable. It also did not guarantee that if a writeable leaf vdev existed, it would be found. Reimplement to recursively walk the device tree to select the leaf. It searches the entire tree, so that a return value of (NULL) indicates there were no usable leaves in the pool; all were either not writeable or had pending mmp writes. It still chooses the starting child randomly at each level of the tree, so if the pool's devices are healthy, the mmp writes go to random leaves with an even distribution. This was verified by testing using zfs_multihost_history enabled. Reviewed by: Thomas Caputi <tcaputi@datto.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Giuseppe Di Natale <dinatale2@llnl.gov> Signed-off-by: Olaf Faaland <faaland1@llnl.gov> Closes openzfs#6631 Closes openzfs#6665
Description
Rename the function mmp_random_leaf() since it is defined in mmp.c.
Reimplement to recursively walk the device tree to select the leaf. It
searches the entire tree, so that a return value of (NULL) indicates
there were no usable leaves in the pool; all were either not writeable
or had pending mmp writes.
It still chooses the starting child randomly at each level of the tree,
so if the pool's devices are healthy, the mmp writes go to random leaves
with an even distribution. This was verified by testing using
zfs_multihost_history enabled.
Motivation and Context
The earlier implementation could end up spinning forever if a pool had a
vdev marked writeable, none of whose children were writeable. It also
did not guarantee that if a writeable leaf vdev existed, it would be
found.
Fixes #6631
How Has This Been Tested?
Types of changes
Checklist:
Signed-off-by.