Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix "Detach spare vdev in case if resilvering does not happen" #14722

Merged
merged 1 commit into from
Apr 19, 2023

Conversation

ixhamza
Copy link
Contributor

@ixhamza ixhamza commented Apr 5, 2023

Spare vdev should detach from the pool when a disk is reinserted. However, spare detachment depends on the completion of resilvering, and if resilver does not schedule, the spare vdev keeps attached to the pool until the next resilvering. When a zfs pool contains several disks (e.g. 25+ mirror), resilvering does not always happen when a disk is reinserted. In this patch, spare vdev is manually detached from the pool when resilvering does not occur and it has been tested on both Linux and FreeBSD.

How Has This Been Tested?

Created a pool with 25 data mirrors and two spare vdevs. Detached and reattached several data disks to verify spare detaches when the disk comes back online.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Library ABI change (libzfs, libzfs_core, libnvpair, libuutil and libzfsbootenv)
  • Documentation (a change to man pages or other documentation)

Checklist:

@ixhamza
Copy link
Contributor Author

ixhamza commented Apr 5, 2023

@amotin, @tonyhutter, @asomers - This patch works for both Linux and FreeBSD since both zed and zfsd use the same IOCTL to make the disk online.

@asomers
Copy link
Contributor

asomers commented Apr 5, 2023

Did you test with a 25-disk mirror, or a pool with 25 top-level vdevs, each of which was a two-way mirror? And why doesn't resilvering happen when the original disk gets reinserted? That sounds to me like a bug.

@ixhamza
Copy link
Contributor Author

ixhamza commented Apr 5, 2023

Thanks, @asomers for your response. To clarify, this is easily reproducible on 25 top-level vdevs, each of which has a two-way mirror. I am not very deep in that part of the code but here are the details of why resilvering is avoided. range_tree_is_empty(vd->vdev_dtl[DTL_MISSING]) returns empty for reinserted vdev (https://github.com/openzfs/zfs/blob/master/module/zfs/vdev.c#L3479), which is called from vdev_open(), which is actually called from vdev_online() context. If we have fewer disks, resilvering always happens, i.e., range_tree_space(vd->vdev_dtl[DTL_MISSING]) always has some value.

@asomers
Copy link
Contributor

asomers commented Apr 5, 2023

So there's no resilvering because during the time that the disk was missing, no data was written to that TLV? I guess that makes sense. And this patch causes the ZFS kernel module to automatically detach the spare if it determines that resilvering is not necessary? That sounds good.

@ixhamza
Copy link
Contributor Author

ixhamza commented Apr 5, 2023

Thanks. That's the intention of this patch. When zpool_vdev_online() is called from zed/zfsd, we now manually detach the spare vdev if resilvering is not necessary.

@behlendorf behlendorf added the Status: Code Review Needed Ready for review and testing label Apr 6, 2023
@ixhamza ixhamza force-pushed the NAS-121207-upstream branch 2 times, most recently from d57777b to d5a4321 Compare April 18, 2023 02:40
Spare vdev should detach from the pool when a disk is reinserted.
However, spare detachment depends on the completion of resilvering,
and if resilver does not schedule, the spare vdev keeps attached to
the pool until the next resilvering. When a zfs pool contains
several disks (25+ mirror), resilvering does not always happen when
a disk is reinserted. In this patch, spare vdev is manually detached
from the pool when resilvering does not occur and it has been tested
on both Linux and FreeBSD.

Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Apr 18, 2023
@behlendorf behlendorf merged commit 719534c into openzfs:master Apr 19, 2023
ixhamza added a commit to truenas/zfs that referenced this pull request Apr 21, 2023
Spare vdev should detach from the pool when a disk is reinserted.
However, spare detachment depends on the completion of resilvering,
and if resilver does not schedule, the spare vdev keeps attached to
the pool until the next resilvering. When a zfs pool contains
several disks (25+ mirror), resilvering does not always happen when
a disk is reinserted. In this patch, spare vdev is manually detached
from the pool when resilvering does not occur and it has been tested
on both Linux and FreeBSD.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes openzfs#14722
ixhamza added a commit to truenas/zfs that referenced this pull request Apr 21, 2023
Spare vdev should detach from the pool when a disk is reinserted.
However, spare detachment depends on the completion of resilvering,
and if resilver does not schedule, the spare vdev keeps attached to
the pool until the next resilvering. When a zfs pool contains
several disks (25+ mirror), resilvering does not always happen when
a disk is reinserted. In this patch, spare vdev is manually detached
from the pool when resilvering does not occur and it has been tested
on both Linux and FreeBSD.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes openzfs#14722
ixhamza added a commit to truenas/zfs that referenced this pull request Apr 21, 2023
Spare vdev should detach from the pool when a disk is reinserted.
However, spare detachment depends on the completion of resilvering,
and if resilver does not schedule, the spare vdev keeps attached to
the pool until the next resilvering. When a zfs pool contains
several disks (25+ mirror), resilvering does not always happen when
a disk is reinserted. In this patch, spare vdev is manually detached
from the pool when resilvering does not occur and it has been tested
on both Linux and FreeBSD.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes openzfs#14722
behlendorf pushed a commit that referenced this pull request Apr 24, 2023
Spare vdev should detach from the pool when a disk is reinserted.
However, spare detachment depends on the completion of resilvering,
and if resilver does not schedule, the spare vdev keeps attached to
the pool until the next resilvering. When a zfs pool contains
several disks (25+ mirror), resilvering does not always happen when
a disk is reinserted. In this patch, spare vdev is manually detached
from the pool when resilvering does not occur and it has been tested
on both Linux and FreeBSD.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes #14722
andrewc12 pushed a commit to andrewc12/openzfs that referenced this pull request Apr 30, 2023
Spare vdev should detach from the pool when a disk is reinserted.
However, spare detachment depends on the completion of resilvering,
and if resilver does not schedule, the spare vdev keeps attached to
the pool until the next resilvering. When a zfs pool contains
several disks (25+ mirror), resilvering does not always happen when
a disk is reinserted. In this patch, spare vdev is manually detached
from the pool when resilvering does not occur and it has been tested
on both Linux and FreeBSD.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Ameer Hamza <ahamza@ixsystems.com>
Closes openzfs#14722
@ixhamza ixhamza deleted the NAS-121207-upstream branch August 2, 2023 17:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants