Feature Flags + Async Destroy + Empty Bpobj #1148

Closed
wants to merge 9 commits into from

4 participants

@behlendorf
ZFS on Linux member

This pull request contains all the required upstream Illumos patches to implement feature flags, the new asynchronous destroy feature, and the new empty bpobj feature. The branch:

  • Builds cleanly on all supported platforms
  • Passes ztest
  • Passes all the usual regression tests
  • Behaves as expected during manual testing

I would like to merge this feature in for the rc14 tag. However, before I do that I want to see some additional real world testing. If your interested in getting this support added to ZoL please consider running this branch on a test zpool and reporting any issues you encounter. I'd love to see the following scenarios tested using pool with interesting test data.

  • zpool upgrade a v28 pool to a v5000 feature flags pool
  • zpool upgrade a pre-v28 pool to a v5000 feature flags pool
  • Verify a v5000 pool created by ZoL can be imported under the latest Illumos/FreeBSD
  • Verify a v5000 pool created by Illumos/FreeBSD can be imported under ZoL
  • Verify async destroy works as expected
  • Any other usage model you can think to try
behlendorf added some commits Dec 13, 2012
@behlendorf behlendorf Illumos #2619 and #2747
2619 asynchronous destruction of ZFS file systems
2747 SPA versioning with zfs feature flags
Reviewed by: Matt Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <gwilson@delphix.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Reviewed by: Dan Kruchinin <dan.kruchinin@gmail.com>
Approved by: Eric Schrock <Eric.Schrock@delphix.com>

NOTE: The grub specific changes were not ported.  This change
must be made to the Linux grub packages.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
49ac1d2
@behlendorf behlendorf Revert "Temporarily disable the reguid test."
This reverts commit d135245.
Since feature flags have now been merged we can apply the real
upstream fix from Illumos.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
a52ca4a
@behlendorf behlendorf Illumos #3090 and #3102
3090 vdev_reopen() during reguid causes vdev to be treated as corrupt
3102 vdev_uberblock_load() and vdev_validate() may read the wrong label

Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: Christopher Siden <chris.siden@delphix.com>
Reviewed by: Garrett D'Amore <garrett@damore.org>
Approved by: Eric Schrock <Eric.Schrock@delphix.com>

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes #939
0ef0ee9
@behlendorf behlendorf Illumos #2762
2762 zpool command should have better support for feature flags
Reviewed by: Matthew Ahrens <mahrens@delphix.com>
Reviewed by: George Wilson <george.wilson@delphix.com>
Approved by: Eric Schrock <Eric.Schrock@delphix.com>
fd98b16
@behlendorf behlendorf Set featureflags as the release
For the purposes of testing this branch set the release field
to featureflags.  This commit will be dropped when the branch
is merged.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
7a599ea
@behlendorf behlendorf Illumos #3086
3086 unnecessarily setting DS_FLAG_INCONSISTENT on async
destroyed datasets
Reviewed by: Christopher Siden <chris.siden@delphix.com>
Approved by: Eric Schrock <Eric.Schrock@delphix.com>
63a18d3
@behlendorf behlendorf Illumos #3349
3349 zpool upgrade -V bumps the on disk version number, but leaves
the in version
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Christopher Siden <chris.siden@delphix.com>
Reviewed by: Matt Ahrens <matthew.ahrens@delphix.com>
Reviewed by: Richard Lowe <richlowe@richlowe.net>
Approved by: Dan McDonald <danmcd@nexenta.com>
ceb4cd7
@behlendorf behlendorf Fix __zio_execute() asynchronous dispatch
To save valuable stack all zio's were made asynchronous when in the
tgx_sync_thread context or during pool initialization.  See commit
2fac4c2 for the original patch and motivation.

Unfortuantely, the changes to dsl_pool_sync_context() made by the
feature flags broke this logic causing in __zio_execute() to dispatch
itself infinitely when called during pool initialization.  This
commit refines the existing logic to specificly target only the two
cases we care about.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
f554250
@mmatuska

Do you consider importing the bpobj feature (Illumos #3104) at the same time? That will make user transition easier.
https://www.illumos.org/issues/3104

@behlendorf behlendorf Illumos #3104
3104 eliminate empty bpobjs
Reviewed by: George Wilson <george.wilson@delphix.com>
Reviewed by: Adam Leventhal <ahl@delphix.com>
Reviewed by: Christopher Siden <chris.siden@delphix.com>
Reviewed by: Garrett D'Amore <garrett@damore.org>
Approved by: Eric Schrock <eric.schrock@delphix.com>
5b63e33
@behlendorf
ZFS on Linux member

@mmatuska Good thought, I've added the bpobj feature to the branch for testing.

@lundman

I've done some tests.

FreeBSD-9.1RC3, which uses v28
OmniOS, omnios-a373ec8 IllumOS clone, v5000
ZOL-feature
and my own ZOL+ with new feature@roger, to test/simulate adding a new feature.

Tested:

  • ZOL v5000 -> OmniOS import, which enables empty_bpobj (my code is pre-#3104)

  • OmniOS v5000->ZOL import OK, with extra;
    mypool unsupported@com.delphix:empty_bpobj inactive local

  • ZOL v28 -> FBSD import OK

  • FBSD v28 ->ZOL import OK, upgrade OK.

  • ZOL v28 -> Omni import OK, upgrade OK.

  • ZOL+ v5000 -> ZOL import OK, with extra
    mypool unsupported@net.lundman:roger inactive local

I went a little further, and tested the readonly situations with my ZOL+ pool feature@roger;

ZOL+
# zpool create -f mypool ~/src/pool-image.bin
# zfs create -o roger=on mypool/BOOM
# zfs create -o roger=off mypool/DOOM
# zfs export mypool

ZOL
zpool import -fd ~/src/ 
   pool: mypool
     id: 239823477007881923
  state: UNAVAIL
status: The pool can only be accessed in read-only mode on this system. It
        cannot be accessed in read-write mode because it uses the following
        feature(s) not supported on this system:
        net.lundman:roger
action: The pool cannot be imported in read-write mode. Import the pool with
        "-o readonly=on", access the pool on a system that supports the
        required feature(s), or recreate the pool from backup.
 config:

        mypool                             UNAVAIL  unsupported feature(s)
          /media/sf_zfssrc/pool-image.bin  ONLINE

# zpool import -o readonly=on -fd ~/src/ mypool
filesystem 'mypool/BOOM' can not be mounted due to error 22
cannot mount 'mypool/BOOM': Invalid argument

# zfs list
NAME          USED  AVAIL  REFER  MOUNTPOINT
mypool        196K   984M    32K  /mypool
mypool/BOOM    31K   984M    31K  /mypool/BOOM
mypool/DOOM    30K   984M    30K  /mypool/DOOM

# zfs mount
mypool                          /mypool
mypool/DOOM                     /mypool/DOOM

# zpool get all mypool
mypool  feature@async_destroy           enabled                         local
mypool  unsupported@net.lundman:roger   readonly                        local

Which I think is all correct.

@lundman

Also, adding IllumOS-#3104, that also works:

# zpool upgrade
This system supports ZFS pool feature flags.

All pools are formatted using feature flags.

Every feature flags pool has all supported features enabled.

# zpool get all mypool
mypool  feature@async_destroy           enabled                         local
mypool  feature@empty_bpobj             enabled                         local

The only issue I came across, was doing something stupid:


# mv pool-image.bin images/
# zpool export mypool


# zpool export mypoolDec 25 16:33:51 zfsdev kernel: [ 3720.233728] INFO: task umount:8321 blocked for more than 120 seconds.
Dec 25 16:33:51 zfsdev kernel: [ 3720.233835] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 25 16:33:51 zfsdev kernel: [ 3720.233930] umount          D ffff8800602139c0     0  8321   8318 0x00000000
Dec 25 16:33:51 zfsdev kernel: [ 3720.233933]  ffff880045dd9c78 0000000000000086 ffff880045735c00 ffff880045dd9fd8
Dec 25 16:33:51 zfsdev kernel: [ 3720.233936]  ffff880045dd9fd8 ffff880045dd9fd8 ffffffff81c13460 ffff880045735c00
Dec 25 16:33:51 zfsdev kernel: [ 3720.233938]  ffff880045dd9c88 ffff88005ce34288 ffff88005ce341f0 ffff88005ce34290
Dec 25 16:33:51 zfsdev kernel: [ 3720.233940] Call Trace:
Dec 25 16:33:51 zfsdev kernel: [ 3720.233948]  [<ffffffff816827b9>] schedule+0x29/0x70
Dec 25 16:33:51 zfsdev kernel: [ 3720.233958]  [<ffffffffa012e7dc>] cv_wait_common+0x9c/0x190 [spl]
Dec 25 16:33:51 zfsdev kernel: [ 3720.233962]  [<ffffffff810768a0>] ? finish_wait+0x80/0x80
Dec 25 16:33:51 zfsdev kernel: [ 3720.233967]  [<ffffffffa012e903>] __cv_wait+0x13/0x20 [spl]
Dec 25 16:33:51 zfsdev kernel: [ 3720.233994]  [<ffffffffa022b383>] txg_wait_synced+0xb3/0x190 [zfs]
Dec 25 16:33:51 zfsdev kernel: [ 3720.234014]  [<ffffffffa025c156>] zfs_sb_teardown+0x2a6/0x370 [zfs]
Dec 25 16:33:51 zfsdev kernel: [ 3720.234033]  [<ffffffffa025c530>] zfs_umount+0x30/0xc0 [zfs]
Dec 25 16:33:51 zfsdev kernel: [ 3720.234051]  [<ffffffffa0277b5e>] zpl_put_super+0xe/0x10 [zfs]
Dec 25 16:33:51 zfsdev kernel: [ 3720.234054]  [<ffffffff81184581>] generic_shutdown_super+0x61/0xf0
Dec 25 16:33:51 zfsdev kernel: [ 3720.234056]  [<ffffffff811846a6>] kill_anon_super+0x16/0x30
Dec 25 16:33:51 zfsdev kernel: [ 3720.234074]  [<ffffffffa027794e>] zpl_kill_sb+0x1e/0x30 [zfs]
Dec 25 16:33:51 zfsdev kernel: [ 3720.234076]  [<ffffffff81183fc7>] deactivate_locked_super+0x57/0x90
Dec 25 16:33:51 zfsdev kernel: [ 3720.234078]  [<ffffffff8118423e>] deactivate_super+0x4e/0x70
Dec 25 16:33:51 zfsdev kernel: [ 3720.234080]  [<ffffffff811a04be>] mntput_no_expire+0xfe/0x160
Dec 25 16:33:51 zfsdev kernel: [ 3720.234082]  [<ffffffff811a1316>] sys_umount+0x76/0x3a0
Dec 25 16:33:51 zfsdev kernel: [ 3720.234085]  [<ffffffff8168b969>] system_call_fastpath+0x16/0x1b

So, it is unattractive that the process hangs, but moving the "active image" is a bit silly. Worth making a ticket?

@Vilbrekin

I've played a bit with your patch as well. Tested pools between OpenIndiana and Arch back and forth.
It works quite well. I just encountered bug at export time on a 3GB pool after playing with the .zfs/snapshot folder (I need to find a way to reproduce it):

[root@arch ~]# zpool export test_big
cannot unmount '/test_big/web/poirsouille': umount failed
[ 1007.924316] BUG: Dentry ffff88003b1ed8c0{i=4,n=/} still in use (1) [unmount of zfs zfs]
[ 1007.924677] ------------[ cut here ]------------
[ 1007.924941] kernel BUG at fs/dcache.c:967!
[ 1007.925141] invalid opcode: 0000 [#1] PREEMPT SMP
[ 1007.925339] Modules linked in: zfs(PO) zunicode(PO) zavl(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate nfsv4 auth_rpcgss kvm_amd psmouse kvm crc32c_intel ghash_clmulni_intel vmwgfx ttm drm processor ppdev i2c_piix4 parport_pc parport aesni_intel aes_x86_64 aes_generic shpchp serio_raw ablk_helper cryptd pcspkr evdev pci_hotplug intel_agp e1000 intel_gtt vmw_balloon i2c_core microcode button ac container nfs lockd sunrpc fscache ext4 crc16 jbd2 mbcache sd_mod sr_mod cdrom mptspi scsi_transport_spi ata_generic mptscsih pata_acpi floppy mptbase ata_piix libata scsi_mod [last unloaded: zfs]
[ 1007.927841] CPU 2
[ 1007.927859] Pid: 1092, comm: umount Tainted: P        W  O 3.6.6-1-ARCH #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
[ 1007.929251] RIP: 0010:[]  [] shrink_dcache_for_umount_subtree+0x1d7/0x1e0
[ 1007.930432] RSP: 0018:ffff880034a95d88  EFLAGS: 00010296
[ 1007.931010] RAX: 000000000000004b RBX: ffff88003b1ed8c0 RCX: 00000000000000cf
[ 1007.931702] RDX: 0000000000000016 RSI: 0000000000000046 RDI: 0000000000000246
[ 1007.932469] RBP: ffff880034a95da8 R08: 0000000000000000 R09: 0000000000000514
[ 1007.933278] R10: 0000000000000001 R11: 0000000000aaaaaa R12: ffff88003b1ed8c0
[ 1007.934098] R13: ffffffffa09cf2e0 R14: ffff88003be7ee40 R15: ffff8800307ca440
[ 1007.934877] FS:  00007ff0e37a6780(0000) GS:ffff88003f640000(0000) knlGS:0000000000000000
[ 1007.935697] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1007.936524] CR2: 00007ff0e3364220 CR3: 0000000012877000 CR4: 00000000000407e0
[ 1007.937459] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1007.938343] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1007.939209] Process umount (pid: 1092, threadinfo ffff880034a94000, task ffff88003b5a4080)
[ 1007.940047] Stack:
[ 1007.940880]  ffff88003aaf7b70 0000000000000000 ffff88003aaf7800 ffffffffa09ef6e0
[ 1007.941833]  ffff880034a95dc8 ffffffff81197733 ffff880034a95de8 ffff88003aaf7800
[ 1007.942806]  ffff880034a95df8 ffffffff8118160c ffff8800307ca420 0000000000000021
[ 1007.943719] Call Trace:
[ 1007.944455]  [] shrink_dcache_for_umount+0x33/0x60
[ 1007.945480]  [] generic_shutdown_super+0x2c/0xf0
[ 1007.946515]  [] kill_anon_super+0x16/0x30
[ 1007.947612]  [] zpl_kill_sb+0x1e/0x30 [zfs]
[ 1007.949001]  [] deactivate_locked_super+0x3c/0xa0
[ 1007.950016]  [] deactivate_super+0x4e/0x70
[ 1007.951017]  [] mntput_no_expire+0x101/0x160
[ 1007.951975]  [] mntput+0x26/0x40
[ 1007.952987]  [] release_mounts+0x8b/0xa0
[ 1007.953989]  [] sys_umount+0x1f1/0x3a0
[ 1007.954956]  [] system_call_fastpath+0x1a/0x1f
[ 1007.955979] Code: 00 00 48 8b 40 28 4c 8b 08 48 8b 43 30 48 85 c0 74 1b 48 8b 50 40 48 89 34 24 48 c7 c7 c0 ac 6f 81 48 89 de 31 c0 e8 36 26 2f 00 <0f> 0b 31 d2 eb e5 0f 0b 90 55 48 89 e5 41 57 41 56 41 55 41 54
[ 1007.958861] RIP  [] shrink_dcache_for_umount_subtree+0x1d7/0x1e0
[ 1007.959829]  RSP 
[ 1007.960745] ---[ end trace 6e895d64c49dad86 ]---
@behlendorf
ZFS on Linux member

Both of these look like real issues. But I suspect neither are related to the feature flag changes. Let's open new issues for each of them.

Thank you for testing this code and posting your results.

@behlendorf
ZFS on Linux member

Merged feature-flags branch in to master as the following commits. Thanks for everyone for testing this, we'll shake out any remaining (minor) issues in master.

c1cdd99 Merge branch 'feature-flags'
1eb5bfa Illumos #3145, #3212
753c383 Illumos #3104: eliminate empty bpobjs
9157970 Fix __zio_execute() asynchronous dispatch
ea0b253 Illumos #3349: zpool upgrade -V bumps the on disk version number
29809a6 Illumos #3086: unnecessarily setting DS_FLAG_INCONSISTENT on async
b9b24bb Illumos #2762: zpool command should have better support for feature flag
3bc7e0f Illumos #3090 and #3102
5ac0c30 Revert "Temporarily disable the reguid test."
9ae529e Illumos #2619 and #2747

@behlendorf behlendorf closed this Jan 8, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment