This repository has been archived by the owner on Nov 7, 2019. It is now read-only.
forked from illumos/illumos-gate
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
9075 Improve ZFS pool import/load process and corrupted pool recovery
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Andrew Stormont <andyjstormont@gmail.com>

Some work has been done lately to improve the debuggability of the ZFS pool load (and import) process. This includes:

    7638 Refactor spa_load_impl into several functions
    8961 SPA load/import should tell us why it failed
    7277 zdb should be able to print zfs_dbgmsg's

To iterate on top of that, a few changes were made to make the import process more resilient and crash-free.

One of the first tasks during the pool load process is to parse a config provided from userland that describes what devices the pool is composed of. A vdev tree is generated from that config, and then all the vdevs are opened.

The Meta Object Set (MOS) of the pool is accessed, and several metadata objects that are necessary to load the pool are read. The exact configuration of the pool is also stored inside the MOS. Since the configuration provided from userland is external and might not accurately describe the vdev tree of the pool at the txg that is being loaded, it cannot be relied upon to safely operate the pool. For that reason, the configuration in the MOS is read early on. In the past, the two configurations were compared and, if there was a mismatch, the load process was aborted and an error was returned.

That check was a good way to ensure a pool does not get corrupted; however, it made the pool load process needlessly fragile in cases where the vdev configuration changed or the userland configuration was outdated. Since the MOS is stored in 3 copies, the configuration provided by userland doesn't have to be perfect in order to read its contents. Hence, a new approach has been adopted: the pool is first opened with the untrusted userland configuration just so that the real configuration can be read from the MOS. The trusted MOS configuration is then used to generate a new vdev tree and the pool is re-opened.
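The difference between the old and the new load sequence can be illustrated with a toy model. This is only a sketch, not the kernel code: a "config" is reduced to a space-separated list of top-level vdev names, and the function names `load_pool_old` and `load_pool_new` are hypothetical.

```shell
# Toy model of the pool load sequence. All names are hypothetical and a
# "config" is just a space-separated list of top-level vdev names.

# Old behaviour: any mismatch between the two configs aborts the load.
load_pool_old() {
    userland_config="$1"
    mos_config="$2"
    if [ "$userland_config" != "$mos_config" ]; then
        echo "load aborted: config mismatch"
        return 1
    fi
    echo "loaded with: $mos_config"
}

# New behaviour: the userland config is only used to reach the MOS.
load_pool_new() {
    userland_config="$1"
    mos_config="$2"
    # Step 1: open with the untrusted config (writes disabled) so that
    # the MOS, and the config stored in it, can be read.
    printf 'step 1: open read-only with untrusted config: %s\n' \
        "$userland_config"
    # Step 2: build a new vdev tree from the MOS config and reopen.
    printf 'step 2: reopen with MOS config: %s\n' "$mos_config"
}
```

With an outdated userland config such as `load_pool_new "disk0" "disk0 disk1"`, the model still ends up with the vdev list taken from the MOS, whereas `load_pool_old` with the same arguments returns an error.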
When the pool is opened with an untrusted configuration, writes are disabled to avoid accidentally damaging it. During reads, some sanity checks are performed on block pointers to see if each DVA points to a known vdev; when the configuration is untrusted, instead of panicking the system if those checks fail, we simply avoid issuing reads to the invalid DVAs.

This new two-step pool load process now allows rewinding pools across vdev tree changes such as device replacement, addition, etc. Loading a pool from an external config file in a clustering environment also becomes much safer, since the pool will import even if the config is outdated and didn't, for instance, register a recent device addition.

With this code in place, it became relatively easy to implement a long-sought-after feature: the ability to import a pool with missing top-level (i.e. non-redundant) devices. Note that since this almost guarantees some loss of data, this feature is for now restricted to a read-only import.

Closes #539
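The DVA sanity check described above can be sketched in the same toy style. This is an illustration under stated assumptions, not the actual implementation: `issue_reads` is a hypothetical function, vdevs are plain integer ids, and the "trusted" failure is modelled as an error return rather than a panic.

```shell
# Hypothetical sketch: only issue reads to DVAs whose vdev id is known.
#   $1 = "trusted" or "untrusted" (state of the current config)
#   $2 = space-separated ids of the vdevs in the current vdev tree
#   $3 = space-separated vdev ids referenced by a block pointer's DVAs
issue_reads() {
    config_state="$1"
    known_vdevs="$2"
    dva_vdevs="$3"
    for dva in $dva_vdevs; do
        case " $known_vdevs " in
        *" $dva "*)
            echo "read vdev $dva"
            ;;
        *)
            if [ "$config_state" = "trusted" ]; then
                # Stand-in for the fatal failure of the check.
                echo "fatal: DVA points to unknown vdev $dva"
                return 1
            fi
            # Untrusted config: skip this DVA and try the others.
            echo "skip vdev $dva"
            ;;
        esac
    done
}
```

For example, `issue_reads untrusted "0 1" "0 2 1"` skips the DVA on the unknown vdev 2 and still reads from vdevs 0 and 1, which is why having several copies of the MOS lets the load proceed with an imperfect config.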
Showing 34 changed files with 2,792 additions and 542 deletions.
76 changes: 76 additions & 0 deletions
...c/test/zfs-tests/tests/functional/cli_root/zpool_import/import_cachefile_device_added.ksh
@@ -0,0 +1,76 @@
#!/usr/bin/ksh -p

#
# This file and its contents are supplied under the terms of the
# Common Development and Distribution License ("CDDL"), version 1.0.
# You may only use this file in accordance with the terms of version
# 1.0 of the CDDL.
#
# A full copy of the text of the CDDL should have accompanied this
# source. A copy of the CDDL is also available via the Internet at
# http://www.illumos.org/license/CDDL.
#

#
# Copyright (c) 2016 by Delphix. All rights reserved.
#

. $STF_SUITE/tests/functional/cli_root/zpool_import/zpool_import.kshlib

#
# DESCRIPTION:
# A pool should be importable using an outdated cachefile that is unaware
# that one or two top-level vdevs were added.
#
# STRATEGY:
# 1. Create a pool with some devices and an alternate cachefile.
# 2. Backup the cachefile.
# 3. Add a device/mirror/raid to the pool.
# 4. Export the pool.
# 5. Verify that we can import the pool using the backed-up cachefile.
#

verify_runnable "global"

log_onexit cleanup

function test_add_vdevs
{
	typeset poolcreate="$1"
	typeset addvdevs="$2"
	typeset poolcheck="$3"

	log_note "$0: pool '$poolcreate', add $addvdevs."

	log_must zpool create -o cachefile=$CPATH $TESTPOOL1 $poolcreate

	log_must cp $CPATH $CPATHBKP

	log_must zpool add -f $TESTPOOL1 $addvdevs

	log_must zpool export $TESTPOOL1

	log_must zpool import -c $CPATHBKP $TESTPOOL1
	log_must check_pool_config $TESTPOOL1 "$poolcheck"

	# Cleanup
	log_must zpool destroy $TESTPOOL1
	log_must rm -f $CPATH $CPATHBKP

	log_note ""
}

test_add_vdevs "$VDEV0" "$VDEV1" "$VDEV0 $VDEV1"
test_add_vdevs "$VDEV0 $VDEV1" "$VDEV2" "$VDEV0 $VDEV1 $VDEV2"
test_add_vdevs "$VDEV0" "$VDEV1 $VDEV2" "$VDEV0 $VDEV1 $VDEV2"
test_add_vdevs "$VDEV0" "mirror $VDEV1 $VDEV2" \
    "$VDEV0 mirror $VDEV1 $VDEV2"
test_add_vdevs "mirror $VDEV0 $VDEV1" "mirror $VDEV2 $VDEV3" \
    "mirror $VDEV0 $VDEV1 mirror $VDEV2 $VDEV3"
test_add_vdevs "$VDEV0" "raidz $VDEV1 $VDEV2 $VDEV3" \
    "$VDEV0 raidz $VDEV1 $VDEV2 $VDEV3"
test_add_vdevs "$VDEV0" "log $VDEV1" "$VDEV0 log $VDEV1"
test_add_vdevs "$VDEV0 log $VDEV1" "$VDEV2" "$VDEV0 $VDEV2 log $VDEV1"
test_add_vdevs "$VDEV0" "$VDEV1 log $VDEV2" "$VDEV0 $VDEV1 log $VDEV2"

log_pass "zpool import -c cachefile_unaware_of_add passed."
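To try the same sequence outside the test suite, the steps of `test_add_vdevs` can be wrapped in a small dry-run helper. This is a sketch under assumptions: the `run` and `stale_cachefile_import` names, the `DRY_RUN` switch and all paths are hypothetical, and the `check_pool_config` step from the kshlib is left out because its implementation is not shown here. Only `zpool` subcommands and flags that appear in the test above are used.

```shell
# Hypothetical helper: print the command sequence by default, or execute
# it when DRY_RUN=0 (which needs root and real or file-backed vdevs).
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# $1 pool name, $2 cachefile, $3 cachefile backup,
# $4 initial vdev spec, $5 vdev spec added after the backup is taken.
stale_cachefile_import() {
    pool="$1"
    cachefile="$2"
    backup="$3"
    initial="$4"
    added="$5"
    # $initial and $added are intentionally unquoted so that a spec
    # such as "mirror a b" expands to several arguments.
    run zpool create -o cachefile="$cachefile" "$pool" $initial
    run cp "$cachefile" "$backup"
    run zpool add -f "$pool" $added
    run zpool export "$pool"
    # The backup predates the add, so this import uses a stale config.
    run zpool import -c "$backup" "$pool"
    run zpool destroy "$pool"
    run rm -f "$cachefile" "$backup"
}
```

Calling `stale_cachefile_import tpool /tmp/c /tmp/c.bak "/tmp/v0" "mirror /tmp/v1 /tmp/v2"` in dry-run mode lists the seven commands in order, which makes it easy to review the sequence before running it against a scratch pool.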