Feature Flags + Async Destroy + Empty Bpobj #1148
Would you consider importing the bpobj feature (Illumos #3104) at the same time? That would make the transition easier for users.
https://www.illumos.org/issues/3104
@mmatuska Good thought, I've added the bpobj feature to the branch for testing.
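For anyone testing a pool built from the branch, the state of the new feature can be checked per pool. A minimal sketch, assuming a placeholder pool named mypool ("enabled" means the feature is merely available, "active" means empty bpobjs are actually in use on disk):
# zpool get feature@empty_bpobj mypool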
I've done some tests with the following systems:
FreeBSD 9.1-RC3, which uses v28
OmniOS (omnios-a373ec8, an Illumos clone), v5000
ZOL with this feature-flags branch, v5000
and my own ZOL+ build with a new feature@roger, to test/simulate adding a new feature.
Tested:
ZOL v5000 -> OmniOS import, which enables empty_bpobj (my code is pre-#3104)
OmniOS v5000 -> ZOL import OK, with extra:
mypool unsupported@com.delphix:empty_bpobj inactive local
ZOL v28 -> FBSD import OK
FBSD v28 -> ZOL import OK, upgrade OK.
ZOL v28 -> OmniOS import OK, upgrade OK.
ZOL+ v5000 -> ZOL import OK, with extra:
mypool unsupported@net.lundman:roger inactive local
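For anyone repeating these cross-imports, the feature state on the receiving system can be inspected with something like the following (a sketch; the image directory and pool name are placeholders matching the tests above):
# zpool import -d ~/src/ mypool
# zpool get all mypool | grep -E '(feature|unsupported)@'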
I went a little further and tested the read-only situations with my ZOL+ pool's feature@roger:
ZOL+
# zpool create -f mypool ~/src/pool-image.bin
# zfs create -o roger=on mypool/BOOM
# zfs create -o roger=off mypool/DOOM
# zpool export mypool
ZOL
# zpool import -fd ~/src/
pool: mypool
id: 239823477007881923
state: UNAVAIL
status: The pool can only be accessed in read-only mode on this system. It
cannot be accessed in read-write mode because it uses the following
feature(s) not supported on this system:
net.lundman:roger
action: The pool cannot be imported in read-write mode. Import the pool with
"-o readonly=on", access the pool on a system that supports the
required feature(s), or recreate the pool from backup.
config:
mypool UNAVAIL unsupported feature(s)
/media/sf_zfssrc/pool-image.bin ONLINE
# zpool import -o readonly=on -fd ~/src/ mypool
filesystem 'mypool/BOOM' can not be mounted due to error 22
cannot mount 'mypool/BOOM': Invalid argument
# zfs list
NAME USED AVAIL REFER MOUNTPOINT
mypool 196K 984M 32K /mypool
mypool/BOOM 31K 984M 31K /mypool/BOOM
mypool/DOOM 30K 984M 30K /mypool/DOOM
# zfs mount
mypool /mypool
mypool/DOOM /mypool/DOOM
# zpool get all mypool
mypool feature@async_destroy enabled local
mypool unsupported@net.lundman:roger readonly local
Which I think is all correct.
Also, after adding Illumos #3104, that works as well:
# zpool upgrade
This system supports ZFS pool feature flags.
All pools are formatted using feature flags.
Every feature flags pool has all supported features enabled.
# zpool get all mypool
mypool feature@async_destroy enabled local
mypool feature@empty_bpobj enabled local
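As an aside, if I recall the syntax correctly, a single feature can also be enabled without bumping everything via zpool upgrade; a sketch, assuming the same mypool:
# zpool set feature@empty_bpobj=enabled mypool
# zpool get feature@empty_bpobj mypool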
The only issue I came across was when doing something stupid:
# mv pool-image.bin images/
# zpool export mypool
# zpool export mypool
Dec 25 16:33:51 zfsdev kernel: [ 3720.233728] INFO: task umount:8321 blocked for more than 120 seconds.
Dec 25 16:33:51 zfsdev kernel: [ 3720.233835] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 25 16:33:51 zfsdev kernel: [ 3720.233930] umount D ffff8800602139c0 0 8321 8318 0x00000000
Dec 25 16:33:51 zfsdev kernel: [ 3720.233933] ffff880045dd9c78 0000000000000086 ffff880045735c00 ffff880045dd9fd8
Dec 25 16:33:51 zfsdev kernel: [ 3720.233936] ffff880045dd9fd8 ffff880045dd9fd8 ffffffff81c13460 ffff880045735c00
Dec 25 16:33:51 zfsdev kernel: [ 3720.233938] ffff880045dd9c88 ffff88005ce34288 ffff88005ce341f0 ffff88005ce34290
Dec 25 16:33:51 zfsdev kernel: [ 3720.233940] Call Trace:
Dec 25 16:33:51 zfsdev kernel: [ 3720.233948] [<ffffffff816827b9>] schedule+0x29/0x70
Dec 25 16:33:51 zfsdev kernel: [ 3720.233958] [<ffffffffa012e7dc>] cv_wait_common+0x9c/0x190 [spl]
Dec 25 16:33:51 zfsdev kernel: [ 3720.233962] [<ffffffff810768a0>] ? finish_wait+0x80/0x80
Dec 25 16:33:51 zfsdev kernel: [ 3720.233967] [<ffffffffa012e903>] __cv_wait+0x13/0x20 [spl]
Dec 25 16:33:51 zfsdev kernel: [ 3720.233994] [<ffffffffa022b383>] txg_wait_synced+0xb3/0x190 [zfs]
Dec 25 16:33:51 zfsdev kernel: [ 3720.234014] [<ffffffffa025c156>] zfs_sb_teardown+0x2a6/0x370 [zfs]
Dec 25 16:33:51 zfsdev kernel: [ 3720.234033] [<ffffffffa025c530>] zfs_umount+0x30/0xc0 [zfs]
Dec 25 16:33:51 zfsdev kernel: [ 3720.234051] [<ffffffffa0277b5e>] zpl_put_super+0xe/0x10 [zfs]
Dec 25 16:33:51 zfsdev kernel: [ 3720.234054] [<ffffffff81184581>] generic_shutdown_super+0x61/0xf0
Dec 25 16:33:51 zfsdev kernel: [ 3720.234056] [<ffffffff811846a6>] kill_anon_super+0x16/0x30
Dec 25 16:33:51 zfsdev kernel: [ 3720.234074] [<ffffffffa027794e>] zpl_kill_sb+0x1e/0x30 [zfs]
Dec 25 16:33:51 zfsdev kernel: [ 3720.234076] [<ffffffff81183fc7>] deactivate_locked_super+0x57/0x90
Dec 25 16:33:51 zfsdev kernel: [ 3720.234078] [<ffffffff8118423e>] deactivate_super+0x4e/0x70
Dec 25 16:33:51 zfsdev kernel: [ 3720.234080] [<ffffffff811a04be>] mntput_no_expire+0xfe/0x160
Dec 25 16:33:51 zfsdev kernel: [ 3720.234082] [<ffffffff811a1316>] sys_umount+0x76/0x3a0
Dec 25 16:33:51 zfsdev kernel: [ 3720.234085] [<ffffffff8168b969>] system_call_fastpath+0x16/0x1b
So it is unattractive that the process hangs, but moving the "active" image out from under an imported pool is a bit silly. Worth making a ticket?
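For reference, the ordering that avoids the hang is simply to export before relocating the backing file, then point the import at the new directory; a sketch using the same placeholder paths as above:
# zpool export mypool
# mv ~/src/pool-image.bin ~/src/images/
# zpool import -d ~/src/images/ mypool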
I've played a bit with your patch as well, and tested pools back and forth between OpenIndiana and Arch.
It works quite well. I just encountered a bug at export time on a 3 GB pool after playing with the .zfs/snapshot folder (I need to find a way to reproduce it):
[root@arch ~]# zpool export test_big
cannot unmount '/test_big/web/poirsouille': umount failed
[ 1007.924316] BUG: Dentry ffff88003b1ed8c0{i=4,n=/} still in use (1) [unmount of zfs zfs]
[ 1007.924677] ------------[ cut here ]------------
[ 1007.924941] kernel BUG at fs/dcache.c:967!
[ 1007.925141] invalid opcode: 0000 [#1] PREEMPT SMP
[ 1007.925339] Modules linked in: zfs(PO) zunicode(PO) zavl(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate nfsv4 auth_rpcgss kvm_amd psmouse kvm crc32c_intel ghash_clmulni_intel vmwgfx ttm drm processor ppdev i2c_piix4 parport_pc parport aesni_intel aes_x86_64 aes_generic shpchp serio_raw ablk_helper cryptd pcspkr evdev pci_hotplug intel_agp e1000 intel_gtt vmw_balloon i2c_core microcode button ac container nfs lockd sunrpc fscache ext4 crc16 jbd2 mbcache sd_mod sr_mod cdrom mptspi scsi_transport_spi ata_generic mptscsih pata_acpi floppy mptbase ata_piix libata scsi_mod [last unloaded: zfs]
[ 1007.927841] CPU 2
[ 1007.927859] Pid: 1092, comm: umount Tainted: P W O 3.6.6-1-ARCH #1 VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform
[ 1007.929251] RIP: 0010:[] [] shrink_dcache_for_umount_subtree+0x1d7/0x1e0
[ 1007.930432] RSP: 0018:ffff880034a95d88 EFLAGS: 00010296
[ 1007.931010] RAX: 000000000000004b RBX: ffff88003b1ed8c0 RCX: 00000000000000cf
[ 1007.931702] RDX: 0000000000000016 RSI: 0000000000000046 RDI: 0000000000000246
[ 1007.932469] RBP: ffff880034a95da8 R08: 0000000000000000 R09: 0000000000000514
[ 1007.933278] R10: 0000000000000001 R11: 0000000000aaaaaa R12: ffff88003b1ed8c0
[ 1007.934098] R13: ffffffffa09cf2e0 R14: ffff88003be7ee40 R15: ffff8800307ca440
[ 1007.934877] FS: 00007ff0e37a6780(0000) GS:ffff88003f640000(0000) knlGS:0000000000000000
[ 1007.935697] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1007.936524] CR2: 00007ff0e3364220 CR3: 0000000012877000 CR4: 00000000000407e0
[ 1007.937459] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1007.938343] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1007.939209] Process umount (pid: 1092, threadinfo ffff880034a94000, task ffff88003b5a4080)
[ 1007.940047] Stack:
[ 1007.940880] ffff88003aaf7b70 0000000000000000 ffff88003aaf7800 ffffffffa09ef6e0
[ 1007.941833] ffff880034a95dc8 ffffffff81197733 ffff880034a95de8 ffff88003aaf7800
[ 1007.942806] ffff880034a95df8 ffffffff8118160c ffff8800307ca420 0000000000000021
[ 1007.943719] Call Trace:
[ 1007.944455] [] shrink_dcache_for_umount+0x33/0x60
[ 1007.945480] [] generic_shutdown_super+0x2c/0xf0
[ 1007.946515] [] kill_anon_super+0x16/0x30
[ 1007.947612] [] zpl_kill_sb+0x1e/0x30 [zfs]
[ 1007.949001] [] deactivate_locked_super+0x3c/0xa0
[ 1007.950016] [] deactivate_super+0x4e/0x70
[ 1007.951017] [] mntput_no_expire+0x101/0x160
[ 1007.951975] [] mntput+0x26/0x40
[ 1007.952987] [] release_mounts+0x8b/0xa0
[ 1007.953989] [] sys_umount+0x1f1/0x3a0
[ 1007.954956] [] system_call_fastpath+0x1a/0x1f
[ 1007.955979] Code: 00 00 48 8b 40 28 4c 8b 08 48 8b 43 30 48 85 c0 74 1b 48 8b 50 40 48 89 34 24 48 c7 c7 c0 ac 6f 81 48 89 de 31 c0 e8 36 26 2f 00 <0f> 0b 31 d2 eb e5 0f 0b 90 55 48 89 e5 41 57 41 56 41 55 41 54
[ 1007.958861] RIP [] shrink_dcache_for_umount_subtree+0x1d7/0x1e0
[ 1007.959829] RSP
[ 1007.960745] ---[ end trace 6e895d64c49dad86 ]---
Both of these look like real issues, but I suspect neither is related to the feature flag changes. Let's open new issues for each of them.
Thank you for testing this code and posting your results.
Merged the feature-flags branch into master as the following commits. Thanks to everyone for testing this; we'll shake out any remaining (minor) issues in master.
c1cdd99 Merge branch 'feature-flags'
1eb5bfa Illumos #3145, #3212
753c383 Illumos #3104: eliminate empty bpobjs
9157970 Fix __zio_execute() asynchronous dispatch
ea0b253 Illumos #3349: zpool upgrade -V bumps the on disk version number
29809a6 Illumos #3086: unnecessarily setting DS_FLAG_INCONSISTENT on async
b9b24bb Illumos #2762: zpool command should have better support for feature flag
3bc7e0f Illumos #3090 and #3102
5ac0c30 Revert "Temporarily disable the reguid test."
9ae529e Illumos #2619 and #2747
This pull request contains all the required upstream Illumos patches to implement feature flags, the new asynchronous destroy feature, and the new empty bpobj feature. The branch:
I would like to merge this feature in for the rc14 tag. However, before I do that I want to see some additional real-world testing. If you're interested in getting this support added to ZoL, please consider running this branch on a test zpool and reporting any issues you encounter. I'd love to see the following scenarios tested using pools with interesting test data (a rough setup sketch follows the list):
zpool upgrade a v28 pool to a v5000 feature flags pool
zpool upgrade a pre-v28 pool to a v5000 feature flags pool
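One possible way to set up such a test on a scratch file-backed pool (a sketch under assumptions: the pool name "tank" and image path are placeholders, and a pre-v28 pool would just use a lower value such as -o version=15):
# truncate -s 1G /var/tmp/tank.img
# zpool create -o version=28 tank /var/tmp/tank.img
(populate tank with filesystems, snapshots, clones, and other interesting data)
# zpool upgrade tank
# zpool get all tank | grep feature@
After the upgrade, the feature@ properties should show up as enabled, or active once the corresponding on-disk changes are in use.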