
Verify On Disk Compatibility #5

Closed
behlendorf opened this issue May 14, 2010 · 2 comments
Comments

@behlendorf
Contributor

When porting this to Linux I tried to be very careful to maintain on-disk compatibility. That is, you should be able to export a zpool spread over a set of devices from a Solaris ZFS system, move those same devices to a Linux ZFS system, and successfully import the zpool. This should work, but it has never actually been verified, and that needs to be done.
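A minimal verification procedure might look like the following sketch. The pool name `tank` and the device directory are placeholders, and the commands assume the standard zpool utilities are present on both systems:

```shell
# On the Solaris/OpenSolaris system: cleanly export the pool so its
# labels are marked inactive before the devices are moved.
zpool export tank

# After physically moving the devices to the Linux system, scan for
# importable pools; -d restricts the search to one device directory.
zpool import -d /dev/disk/by-id

# If the pool is found, import it and verify it end to end.
zpool import tank
zpool status tank
zpool scrub tank
```

A scrub after the import forces every block to be read back and checksummed, which gives a stronger on-disk compatibility check than a successful import alone.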

@behlendorf
Contributor Author

Related to this, we need to verify that this port is compatible with the latest zfs-fuse. The more common case, I suspect, will be people trying out our port with an existing zfs-fuse configuration.
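One way to test this without dedicated hardware would be to build a pool on file-backed vdevs under zfs-fuse and then import the same files with this port. This is only a sketch; the file paths and pool name are placeholders, and it assumes both implementations accept file vdevs and honor `-d` the same way:

```shell
# Under zfs-fuse: create a small mirrored pool backed by plain files.
truncate -s 256M /var/tmp/vdev0 /var/tmp/vdev1
zpool create fusepool mirror /var/tmp/vdev0 /var/tmp/vdev1
zpool export fusepool

# Under the Linux port: point the import scan at the same directory.
zpool import -d /var/tmp fusepool
zpool status fusepool
```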

@behlendorf
Contributor Author

Jim Silva was nice enough to set up an OpenSolaris (2009.06) box for us to verify this functionality. The OpenSolaris system was running ZFS pool v14 and zfs v3. The Linux port was at v0.5.1, running pool v28 and zfs v5.

The Linux port was able to properly identify and assemble a ZFS file system created on the OpenSolaris system. Conversely, a ZFS file system created under the Linux port was properly detected on the OpenSolaris system but could not be used, because OpenSolaris correctly determined it to be a much newer on-disk version.
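For round-tripping in the other direction, the pool and file system can be pinned to the older on-disk versions at creation time, assuming the `version` properties behave as they do on Solaris; `tank` and the device path are placeholders:

```shell
# List the on-disk format versions this build understands.
zpool upgrade -v
zfs upgrade -v

# Create a pool and file system the v14/v3 OpenSolaris system can
# import, instead of the Linux defaults (v28/v5), so the older system
# does not reject them as too new.
zpool create -o version=14 tank /dev/sdb
zfs create -o version=3 tank/data
```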

@timbrody timbrody mentioned this issue May 30, 2012
@marku89 marku89 mentioned this issue Dec 6, 2013
kernelOfTruth pushed a commit to kernelOfTruth/zfs that referenced this issue Mar 1, 2015
Use 3 threads and 8 tasks.  Dispatch the final 3 tasks with TQ_FRONT.
The first three tasks keep the worker threads busy while we stuff the
queues.  Use msleep() to force a known execution order, assuming
TQ_FRONT is properly honored.  Verify that the expected completion
order occurs.

The splat_taskq_test5_order() function may be useful in more than
one test.  This commit generalizes it by renaming the function to
splat_taskq_test_order() and adding a name argument instead of
assuming SPLAT_TASKQ_TEST5_NAME as the test name.

The documentation for splat taskq regression test #5 swaps the two required
completion orders in the diagram.  This commit corrects the error.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
jdike added a commit to jdike/zfs that referenced this issue Aug 1, 2019
This patch is an RFC.  There are at least two things it could be
getting wrong:
    1 - The use of a mutex to protect an increment; I couldn't see how
        to get atomic operations in this code
    2 - There could easily be a better fix than putting each newly
        allocated sa_os_t in its own lockdep class

This fix does eliminate the lockdep warnings, so there's that.

The two lockdep reports are below.  They show two different deadlock
scenarios, but they share a common link, which is
    thread 1 holding sa_lock and trying to get zap->zap_rwlock:
       zap_lockdir_impl+0x858/0x16c0 [zfs]
       zap_lockdir+0xd2/0x100 [zfs]
       zap_lookup_norm+0x7f/0x100 [zfs]
       zap_lookup+0x12/0x20 [zfs]
       sa_setup+0x902/0x1380 [zfs]
       zfsvfs_init+0x3d6/0xb20 [zfs]
       zfsvfs_create+0x5dd/0x900 [zfs]
       zfs_domount+0xa3/0xe20 [zfs]

    thread 2 trying to get sa_lock, either in sa_setup:
       sa_setup+0x742/0x1380 [zfs]
       zfsvfs_init+0x3d6/0xb20 [zfs]
       zfsvfs_create+0x5dd/0x900 [zfs]
       zfs_domount+0xa3/0xe20 [zfs]
    or in sa_build_index:
       sa_build_index+0x13d/0x790 [zfs]
       sa_handle_get_from_db+0x368/0x500 [zfs]
       zfs_znode_sa_init.isra.0+0x24b/0x330 [zfs]
       zfs_znode_alloc+0x3da/0x1a40 [zfs]
       zfs_zget+0x39a/0x6e0 [zfs]
       zfs_root+0x101/0x160 [zfs]
       zfs_domount+0x91f/0xea0 [zfs]

AFAICT, sa_os_t is unique to its zfsvfs, so if we have two stacks
calling zfs_domount, each has a different zfsvfs and thus a different
sa, and there is no real deadlock here.

The sa_setup vs sa_setup case is easy, since each is referring to a
newly allocated sa_os_t.

In the sa_build_index vs sa_setup case, we need to reason that the
sa_os_t is unique to a zfsvfs.

======================================================
WARNING: possible circular locking dependency detected
4.19.55-4.19.2-debug-b494b4b34cd8ef26 #1 Tainted: G        W  O
------------------------------------------------------
kswapd0/716 is trying to acquire lock:
00000000ac111d4a (&zfsvfs->z_teardown_inactive_lock){.+.+}, at: zfs_inactive+0x132/0xb40 [zfs]

but task is already holding lock:
00000000218b764d (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #4 (fs_reclaim){+.+.}:
       kmem_cache_alloc_node_trace+0x43/0x380
       __kmalloc_node+0x3c/0x60
       spl_kmem_alloc+0xd9/0x1f0 [spl]
       zap_name_alloc+0x34/0x480 [zfs]
       zap_lookup_impl+0x27/0x3a0 [zfs]
       zap_lookup_norm+0xb9/0x100 [zfs]
       zap_lookup+0x12/0x20 [zfs]
       dsl_dir_hold+0x341/0x660 [zfs]
       dsl_dataset_hold+0xb6/0x6c0 [zfs]
       dmu_objset_hold+0xca/0x120 [zfs]
       zpl_mount+0x90/0x3b0 [zfs]
       mount_fs+0x86/0x2b0
       vfs_kern_mount+0x68/0x3c0
       do_mount+0x306/0x2550
       ksys_mount+0x7e/0xd0
       __x64_sys_mount+0xba/0x150
       do_syscall_64+0x9b/0x410
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #3 (&zap->zap_rwlock){++++}:
       zap_lockdir_impl+0x7ed/0x15c0 [zfs]
       zap_lockdir+0xd2/0x100 [zfs]
       zap_lookup_norm+0x7f/0x100 [zfs]
       zap_lookup+0x12/0x20 [zfs]
       sa_setup+0x902/0x1380 [zfs]
       zfsvfs_init+0x3d6/0xb20 [zfs]
       zfsvfs_create+0x5dd/0x900 [zfs]
       zfs_domount+0xa3/0xe20 [zfs]
       zpl_mount+0x270/0x3b0 [zfs]
       mount_fs+0x86/0x2b0
       vfs_kern_mount+0x68/0x3c0
       do_mount+0x306/0x2550
       ksys_mount+0x7e/0xd0
       __x64_sys_mount+0xba/0x150
       do_syscall_64+0x9b/0x410
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #2 (&sa->sa_lock){+.+.}:
       sa_setup+0x742/0x1380 [zfs]
       zfsvfs_init+0x3d6/0xb20 [zfs]
       zfsvfs_create+0x5dd/0x900 [zfs]
       zfs_domount+0xa3/0xe20 [zfs]
       zpl_mount+0x270/0x3b0 [zfs]
       mount_fs+0x86/0x2b0
       vfs_kern_mount+0x68/0x3c0
       do_mount+0x306/0x2550
       ksys_mount+0x7e/0xd0
       __x64_sys_mount+0xba/0x150
       do_syscall_64+0x9b/0x410
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #1 (&os->os_user_ptr_lock){+.+.}:
       zfs_get_vfs_flag_unmounted+0x63/0x3c0 [zfs]
       dmu_free_long_range+0x963/0xda0 [zfs]
       zfs_rmnode+0x719/0x9c0 [zfs]
       zfs_inactive+0x306/0xb40 [zfs]
       zpl_evict_inode+0xa7/0x140 [zfs]
       evict+0x212/0x570
       do_unlinkat+0x2e6/0x540
       do_syscall_64+0x9b/0x410
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 (&zfsvfs->z_teardown_inactive_lock){.+.+}:
       down_read+0x3f/0xe0
       zfs_inactive+0x132/0xb40 [zfs]
       zpl_evict_inode+0xa7/0x140 [zfs]
       evict+0x212/0x570
       dispose_list+0xfa/0x1d0
       prune_icache_sb+0xd3/0x140
       super_cache_scan+0x292/0x440
       do_shrink_slab+0x2b9/0x800
       shrink_slab+0x195/0x410
       shrink_node+0x2e1/0x10f0
       kswapd+0x71c/0x11c0
       kthread+0x2e7/0x3e0
       ret_from_fork+0x3a/0x50

other info that might help us debug this:

Chain exists of:
  &zfsvfs->z_teardown_inactive_lock --> &zap->zap_rwlock --> fs_reclaim

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               lock(&zap->zap_rwlock);
                               lock(fs_reclaim);
  lock(&zfsvfs->z_teardown_inactive_lock);

 *** DEADLOCK ***

3 locks held by kswapd0/716:
 #0: 00000000218b764d (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x5/0x30
 #1: 00000000f9a6bfa1 (shrinker_rwsem){++++}, at: shrink_slab+0x109/0x410
 #2: 0000000076154958 (&type->s_umount_key#50){.+.+}, at: trylock_super+0x16/0xc0

stack backtrace:
CPU: 5 PID: 716 Comm: kswapd0 Tainted: G        W  O      4.19.55-4.19.2-debug-b494b4b34cd8ef26 #1
Hardware name: Dell Inc. PowerEdge R510/0W844P, BIOS 1.1.4 11/04/2009
Call Trace:
 dump_stack+0x91/0xeb
 print_circular_bug.isra.16+0x30b/0x5b0
 ? save_trace+0xd6/0x240
 __lock_acquire+0x41be/0x4f10
 ? debug_show_all_locks+0x2d0/0x2d0
 ? sched_clock_cpu+0x133/0x170
 ? lock_acquire+0x153/0x330
 lock_acquire+0x153/0x330
 ? zfs_inactive+0x132/0xb40 [zfs]
 down_read+0x3f/0xe0
 ? zfs_inactive+0x132/0xb40 [zfs]
 zfs_inactive+0x132/0xb40 [zfs]
 ? zfs_dirty_inode+0xa20/0xa20 [zfs]
 ? _raw_spin_unlock_irq+0x2d/0x40
 zpl_evict_inode+0xa7/0x140 [zfs]
 evict+0x212/0x570
 dispose_list+0xfa/0x1d0
 ? list_lru_walk_one+0x9c/0xd0
 prune_icache_sb+0xd3/0x140
 ? invalidate_inodes+0x370/0x370
 ? list_lru_count_one+0x179/0x310
 super_cache_scan+0x292/0x440
 do_shrink_slab+0x2b9/0x800
 shrink_slab+0x195/0x410
 ? unregister_shrinker+0x290/0x290
 shrink_node+0x2e1/0x10f0
 ? shrink_node_memcg+0x1230/0x1230
 ? zone_watermark_ok_safe+0x35/0x270
 ? lock_acquire+0x153/0x330
 ? __fs_reclaim_acquire+0x5/0x30
 ? pgdat_balanced+0x91/0xd0
 kswapd+0x71c/0x11c0
 ? mem_cgroup_shrink_node+0x460/0x460
 ? sched_clock_cpu+0x133/0x170
 ? _raw_spin_unlock_irq+0x29/0x40
 ? wait_woken+0x260/0x260
 ? check_flags.part.23+0x480/0x480
 ? __kthread_parkme+0xad/0x180
 ? mem_cgroup_shrink_node+0x460/0x460
 kthread+0x2e7/0x3e0
 ? kthread_park+0x120/0x120
 ret_from_fork+0x3a/0x50

======================================================
WARNING: possible circular locking dependency detected
4.19.55-4.19.2-debug-b494b4b34cd8ef26 #1 Tainted: G           O
------------------------------------------------------
mount.zfs/3249 is trying to acquire lock:
000000000347bea0 (&zp->z_lock){+.+.}, at: zpl_mmap+0x27e/0x550 [zfs]

but task is already holding lock:
00000000224314a3 (&mm->mmap_sem){++++}, at: vm_mmap_pgoff+0x118/0x190

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #6 (&mm->mmap_sem){++++}:
       _copy_from_user+0x20/0xd0
       scsi_cmd_ioctl+0x47d/0x620
       cdrom_ioctl+0x10b/0x29b0
       sr_block_ioctl+0x107/0x150 [sr_mod]
       blkdev_ioctl+0x946/0x1600
       block_ioctl+0xdd/0x130
       do_vfs_ioctl+0x176/0xf70
       ksys_ioctl+0x66/0x70
       __x64_sys_ioctl+0x6f/0xb0
       do_syscall_64+0x9b/0x410
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #5 (sr_mutex){+.+.}:
       sr_block_open+0x104/0x1a0 [sr_mod]
       __blkdev_get+0x249/0x11c0
       blkdev_get+0x280/0x7a0
       do_dentry_open+0x7ee/0x1020
       path_openat+0x11a7/0x2500
       do_filp_open+0x17f/0x260
       do_sys_open+0x195/0x300
       __se_sys_open+0xbf/0xf0
       do_syscall_64+0x9b/0x410
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #4 (&bdev->bd_mutex){+.+.}:
       __blkdev_get+0x383/0x11c0
       blkdev_get+0x3bc/0x7a0
       blkdev_get_by_path+0x73/0xc0
       vdev_disk_open+0x4c8/0x12e0 [zfs]
       vdev_open+0x34c/0x13e0 [zfs]
       vdev_open_child+0x46/0xd0 [zfs]
       taskq_thread+0x979/0x1480 [spl]
       kthread+0x2e7/0x3e0
       ret_from_fork+0x3a/0x50

-> #3 (&vd->vd_lock){++++}:
       vdev_disk_io_start+0x13e/0x2230 [zfs]
       zio_vdev_io_start+0x358/0x990 [zfs]
       zio_nowait+0x1f4/0x3a0 [zfs]
       vdev_mirror_io_start+0x211/0x7b0 [zfs]
       zio_vdev_io_start+0x7d3/0x990 [zfs]
       zio_nowait+0x1f4/0x3a0 [zfs]
       arc_read+0x1782/0x43a0 [zfs]
       dbuf_read_impl.constprop.13+0xcb4/0x1fe0 [zfs]
       dbuf_read+0x2c8/0x12a0 [zfs]
       dmu_buf_hold_by_dnode+0x6d/0xd0 [zfs]
       zap_get_leaf_byblk.isra.6.part.7+0xd3/0x9d0 [zfs]
       zap_deref_leaf+0x1f3/0x290 [zfs]
       fzap_lookup+0x13c/0x340 [zfs]
       zap_lookup_impl+0x84/0x3a0 [zfs]
       zap_lookup_norm+0xb9/0x100 [zfs]
       zap_lookup+0x12/0x20 [zfs]
       spa_dir_prop+0x56/0xa0 [zfs]
       spa_ld_trusted_config+0xd0/0xe70 [zfs]
       spa_ld_mos_with_trusted_config+0x2b/0xb0 [zfs]
       spa_load+0x14d/0x27d0 [zfs]
       spa_tryimport+0x32e/0xa90 [zfs]
       zfs_ioc_pool_tryimport+0x107/0x190 [zfs]
       zfsdev_ioctl+0x1047/0x1370 [zfs]
       do_vfs_ioctl+0x176/0xf70
       ksys_ioctl+0x66/0x70
       __x64_sys_ioctl+0x6f/0xb0
       do_syscall_64+0x9b/0x410
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #2 (&zap->zap_rwlock){++++}:
       zap_lockdir_impl+0x858/0x16c0 [zfs]
       zap_lockdir+0xd2/0x100 [zfs]
       zap_lookup_norm+0x7f/0x100 [zfs]
       zap_lookup+0x12/0x20 [zfs]
       sa_setup+0x902/0x1380 [zfs]
       zfsvfs_init+0x6c8/0xc70 [zfs]
       zfsvfs_create_impl+0x5cf/0x970 [zfs]
       zfsvfs_create+0xc6/0x130 [zfs]
       zfs_domount+0x16f/0xea0 [zfs]
       zpl_mount+0x270/0x3b0 [zfs]
       mount_fs+0x86/0x2b0
       vfs_kern_mount+0x68/0x3c0
       do_mount+0x306/0x2550
       ksys_mount+0x7e/0xd0
       __x64_sys_mount+0xba/0x150
       do_syscall_64+0x9b/0x410
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #1 (&sa->sa_lock){+.+.}:
       sa_build_index+0x13d/0x790 [zfs]
       sa_handle_get_from_db+0x368/0x500 [zfs]
       zfs_znode_sa_init.isra.0+0x24b/0x330 [zfs]
       zfs_znode_alloc+0x3da/0x1a40 [zfs]
       zfs_zget+0x39a/0x6e0 [zfs]
       zfs_root+0x101/0x160 [zfs]
       zfs_domount+0x91f/0xea0 [zfs]
       zpl_mount+0x270/0x3b0 [zfs]
       mount_fs+0x86/0x2b0
       vfs_kern_mount+0x68/0x3c0
       do_mount+0x306/0x2550
       ksys_mount+0x7e/0xd0
       __x64_sys_mount+0xba/0x150
       do_syscall_64+0x9b/0x410
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 (&zp->z_lock){+.+.}:
       __mutex_lock+0xef/0x1380
       zpl_mmap+0x27e/0x550 [zfs]
       mmap_region+0x8fa/0x1150
       do_mmap+0x89a/0xd60
       vm_mmap_pgoff+0x14a/0x190
       ksys_mmap_pgoff+0x16b/0x490
       do_syscall_64+0x9b/0x410
       entry_SYSCALL_64_after_hwframe+0x49/0xbe

other info that might help us debug this:

Chain exists of:
  &zp->z_lock --> sr_mutex --> &mm->mmap_sem

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&mm->mmap_sem);
                               lock(sr_mutex);
                               lock(&mm->mmap_sem);
  lock(&zp->z_lock);

 *** DEADLOCK ***

1 lock held by mount.zfs/3249:
 #0: 00000000224314a3 (&mm->mmap_sem){++++}, at: vm_mmap_pgoff+0x118/0x190

stack backtrace:
CPU: 3 PID: 3249 Comm: mount.zfs Tainted: G           O      4.19.55-4.19.2-debug-b494b4b34cd8ef26 #1
Hardware name: Dell Inc. PowerEdge R510/0W844P, BIOS 1.1.4 11/04/2009
Call Trace:
 dump_stack+0x91/0xeb
 print_circular_bug.isra.16+0x30b/0x5b0
 ? save_trace+0xd6/0x240
 __lock_acquire+0x41be/0x4f10
 ? debug_show_all_locks+0x2d0/0x2d0
 ? sched_clock_cpu+0x18/0x170
 ? sched_clock_cpu+0x18/0x170
 ? __lock_acquire+0xe3b/0x4f10
 ? reacquire_held_locks+0x191/0x430
 ? reacquire_held_locks+0x191/0x430
 ? lock_acquire+0x153/0x330
 lock_acquire+0x153/0x330
 ? zpl_mmap+0x27e/0x550 [zfs]
 ? zpl_mmap+0x27e/0x550 [zfs]
 __mutex_lock+0xef/0x1380
 ? zpl_mmap+0x27e/0x550 [zfs]
 ? __mutex_add_waiter+0x160/0x160
 ? zpl_mmap+0x27e/0x550 [zfs]
 ? sched_clock+0x5/0x10
 ? sched_clock_cpu+0x18/0x170
 ? __mutex_add_waiter+0x160/0x160
 ? touch_atime+0xcd/0x230
 ? atime_needs_update+0x540/0x540
 ? do_raw_spin_unlock+0x54/0x250
 ? zpl_mmap+0x27e/0x550 [zfs]
 zpl_mmap+0x27e/0x550 [zfs]
 ? memset+0x1f/0x40
 mmap_region+0x8fa/0x1150
 ? arch_get_unmapped_area+0x460/0x460
 ? vm_brk+0x10/0x10
 ? lock_acquire+0x153/0x330
 ? lock_acquire+0x153/0x330
 ? security_mmap_addr+0x56/0x80
 ? get_unmapped_area+0x222/0x350
 do_mmap+0x89a/0xd60
 ? proc_keys_start+0x3d0/0x3d0
 vm_mmap_pgoff+0x14a/0x190
 ? vma_is_stack_for_current+0x90/0x90
 ? __ia32_sys_dup3+0xb0/0xb0
 ? vfs_statx_fd+0x49/0x80
 ? __se_sys_newfstat+0x75/0xa0
 ksys_mmap_pgoff+0x16b/0x490
 ? find_mergeable_anon_vma+0x90/0x90
 ? trace_hardirqs_on_thunk+0x1a/0x1c
 ? do_syscall_64+0x18/0x410
 do_syscall_64+0x9b/0x410
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

Not-signed-off-by: Jeff Dike <jdike@akamai.com>
ahrens referenced this issue in ahrens/zfs Dec 9, 2019
After spa_vdev_remove_aux() is called, the config nvlist is no longer
valid, as it's been replaced by the new one (with the specified device
removed).  Therefore any pointers into the nvlist are no longer valid.
So we can't save the result of `fnvlist_lookup_string(nv, ZPOOL_CONFIG_PATH)`
(in vd_path) across the call to spa_vdev_remove_aux().

Instead, use spa_strdup() to save a copy of the string before calling
spa_vdev_remove_aux.

Found by AddressSanitizer:

ERROR: AddressSanitizer: heap-use-after-free on address 0x608000a1fcd0 at pc 0x7fe88b0c166e bp 0x7fe878414ad0 sp 0x7fe878414278
READ of size 34 at 0x608000a1fcd0 thread T686
    #0 0x7fe88b0c166d  (/usr/lib/x86_64-linux-gnu/libasan.so.4+0x5166d)
    #1 0x7fe88a5acd6e in spa_strdup ../../module/zfs/spa_misc.c:1447
    #2 0x7fe88a688034 in spa_vdev_remove ../../module/zfs/vdev_removal.c:2259
    #3 0x55ffbc7748f8 in ztest_vdev_aux_add_remove /export/home/delphix/zfs/cmd/ztest/ztest.c:3229
    #4 0x55ffbc769fba in ztest_execute /export/home/delphix/zfs/cmd/ztest/ztest.c:6714
    #5 0x55ffbc779a90 in ztest_thread /export/home/delphix/zfs/cmd/ztest/ztest.c:6761
    #6 0x7fe889cbc6da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da)
    #7 0x7fe8899e588e in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x12188e)

0x608000a1fcd0 is located 48 bytes inside of 88-byte region [0x608000a1fca0,0x608000a1fcf8)
freed by thread T686 here:
    #0 0x7fe88b14e7b8 in __interceptor_free (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xde7b8)
    #1 0x7fe88ae541c5 in nvlist_free ../../module/nvpair/nvpair.c:874
    #2 0x7fe88ae543ba in nvpair_free ../../module/nvpair/nvpair.c:844
    #3 0x7fe88ae57400 in nvlist_remove_nvpair ../../module/nvpair/nvpair.c:978
    #4 0x7fe88a683c81 in spa_vdev_remove_aux ../../module/zfs/vdev_removal.c:185
    #5 0x7fe88a68857c in spa_vdev_remove ../../module/zfs/vdev_removal.c:2221
    #6 0x55ffbc7748f8 in ztest_vdev_aux_add_remove /export/home/delphix/zfs/cmd/ztest/ztest.c:3229
    #7 0x55ffbc769fba in ztest_execute /export/home/delphix/zfs/cmd/ztest/ztest.c:6714
    #8 0x55ffbc779a90 in ztest_thread /export/home/delphix/zfs/cmd/ztest/ztest.c:6761
    #9 0x7fe889cbc6da in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76da)
ahrens referenced this issue in ahrens/zfs Dec 9, 2019
ahrens referenced this issue in ahrens/zfs Dec 10, 2019
behlendorf pushed a commit that referenced this issue Dec 11, 2019
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ryan Moeller <ryan@ixsystems.com>
Signed-off-by: Matthew Ahrens <mahrens@delphix.com>
Closes #9706
tonyhutter pushed a commit to tonyhutter/zfs that referenced this issue Dec 26, 2019
tonyhutter pushed a commit to tonyhutter/zfs that referenced this issue Dec 27, 2019
tonyhutter pushed a commit that referenced this issue Jan 23, 2020
markroper pushed a commit to markroper/zfs that referenced this issue Feb 12, 2020
Using zfs with Lustre, an arc_read can trigger kernel memory allocation
that in turn leads to a memory reclaim callback and a deadlock within a
single zfs process. This change uses spl_fstrans_mark and
spl_fstrans_unmark to prevent the reclaim attempt and the deadlock
(https://zfsonlinux.topicbox.com/groups/zfs-devel/T4db2c705ec1804ba).
The stack trace observed is:

     #0 [ffffc9002b98adc8] __schedule at ffffffff81610f2e
     #1 [ffffc9002b98ae68] schedule at ffffffff81611558
     #2 [ffffc9002b98ae70] schedule_preempt_disabled at ffffffff8161184a
     #3 [ffffc9002b98ae78] __mutex_lock at ffffffff816131e8
     #4 [ffffc9002b98af18] arc_buf_destroy at ffffffffa0bf37d7 [zfs]
     #5 [ffffc9002b98af48] dbuf_destroy at ffffffffa0bfa6fe [zfs]
     #6 [ffffc9002b98af88] dbuf_evict_one at ffffffffa0bfaa96 [zfs]
     #7 [ffffc9002b98afa0] dbuf_rele_and_unlock at ffffffffa0bfa561 [zfs]
     #8 [ffffc9002b98b050] dbuf_rele_and_unlock at ffffffffa0bfa32b [zfs]
     #9 [ffffc9002b98b100] osd_object_delete at ffffffffa0b64ecc [osd_zfs]
    #10 [ffffc9002b98b118] lu_object_free at ffffffffa06d6a74 [obdclass]
    #11 [ffffc9002b98b178] lu_site_purge_objects at ffffffffa06d7fc1 [obdclass]
    #12 [ffffc9002b98b220] lu_cache_shrink_scan at ffffffffa06d81b8 [obdclass]
    #13 [ffffc9002b98b278] shrink_slab at ffffffff811ca9d8
    #14 [ffffc9002b98b338] shrink_node at ffffffff811cfd94
    #15 [ffffc9002b98b3b8] do_try_to_free_pages at ffffffff811cfe63
    #16 [ffffc9002b98b408] try_to_free_pages at ffffffff811d01c4
    #17 [ffffc9002b98b488] __alloc_pages_slowpath at ffffffff811be7f2
    #18 [ffffc9002b98b580] __alloc_pages_nodemask at ffffffff811bf3ed
    #19 [ffffc9002b98b5e0] new_slab at ffffffff81226304
    #20 [ffffc9002b98b638] ___slab_alloc at ffffffff812272ab
    #21 [ffffc9002b98b6f8] __slab_alloc at ffffffff8122740c
    #22 [ffffc9002b98b708] kmem_cache_alloc at ffffffff81227578
    #23 [ffffc9002b98b740] spl_kmem_cache_alloc at ffffffffa048a1fd [spl]
    #24 [ffffc9002b98b780] arc_buf_alloc_impl at ffffffffa0befba2 [zfs]
    #25 [ffffc9002b98b7b0] arc_read at ffffffffa0bf0924 [zfs]
    #26 [ffffc9002b98b858] dbuf_read at ffffffffa0bf9083 [zfs]
    #27 [ffffc9002b98b900] dmu_buf_hold_by_dnode at ffffffffa0c04869 [zfs]

Signed-off-by: Mark Roper <markroper@gmail.com>
problame added a commit to problame/zfs that referenced this issue Oct 10, 2020
This is a fixup of commit 0fdd610

See added test case for a reproducer.

Stack trace:

    panic: VERIFY3(nvlist_next_nvpair(redactnvl, pair) == NULL) failed (0xfffff80003ce5d18x == 0x)

    cpuid = 7
    time = 1602212370
    KDB: stack backtrace:
    #0 0xffffffff80c1d297 at kdb_backtrace+0x67
    #1 0xffffffff80bd05cd at vpanic+0x19d
    #2 0xffffffff828446fa at spl_panic+0x3a
    #3 0xffffffff828af85d at dmu_redact_snap+0x39d
    #4 0xffffffff829c0370 at zfs_ioc_redact+0xa0
    #5 0xffffffff829bba44 at zfsdev_ioctl_common+0x4a4
    #6 0xffffffff8284c3ed at zfsdev_ioctl+0x14d
    #7 0xffffffff80a85ead at devfs_ioctl+0xad
    #8 0xffffffff8122a46c at VOP_IOCTL_APV+0x7c
    #9 0xffffffff80cb0a3a at vn_ioctl+0x16a
    #10 0xffffffff80a8649f at devfs_ioctl_f+0x1f
    #11 0xffffffff80c3b55e at kern_ioctl+0x2be
    #12 0xffffffff80c3b22d at sys_ioctl+0x15d
    #13 0xffffffff810a88e4 at amd64_syscall+0x364
    #14 0xffffffff81082330 at fast_syscall_common+0x101

Signed-off-by: Christian Schwarz <me@cschwarz.com>
rob-wing pushed a commit to KlaraSystems/zfs that referenced this issue Feb 17, 2023
Under certain loads, the following panic is hit:

    panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1)
    cpuid = 17
    KDB: stack backtrace:
    #0 0xffffffff805e29c5 at kdb_backtrace+0x65
    #1 0xffffffff8059620f at vpanic+0x17f
    #2 0xffffffff81a27f4a at spl_panic+0x3a
    #3 0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40
    #4 0xffffffff8066fdee at vinactivef+0xde
    #5 0xffffffff80670b8a at vgonel+0x1ea
    #6 0xffffffff806711e1 at vgone+0x31
    #7 0xffffffff8065fa0d at vfs_hash_insert+0x26d
    #8 0xffffffff81a39069 at sfs_vgetx+0x149
    #9 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
    #10 0xffffffff80661c2c at lookup+0x45c
    #11 0xffffffff80660e59 at namei+0x259
    #12 0xffffffff8067e3d3 at kern_statat+0xf3
    #13 0xffffffff8067eacf at sys_fstatat+0x2f
    #14 0xffffffff808b5ecc at amd64_syscall+0x10c
    #15 0xffffffff8088f07b at fast_syscall_common+0xf8

A race condition can occur when allocating a new vnode and adding that
vnode to the vfs hash. If the newly created vnode loses the race when
being inserted into the vfs hash, it will not be recycled as its
usecount is greater than zero, hitting the above assertion.

Fix this by dropping the assertion.

FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700

Signed-off-by:  Rob Wing <rob.wing@klarasystems.com>
Sponsored-by:   rsync.net
Sponsored-by:   Klara, Inc.
rob-wing pushed a commit to KlaraSystems/zfs that referenced this issue Feb 17, 2023
Under certain loads, the following panic is hit:

    panic: page fault
    KDB: stack backtrace:
    #0 0xffffffff805db025 at kdb_backtrace+0x65
    #1 0xffffffff8058e86f at vpanic+0x17f
    #2 0xffffffff8058e6e3 at panic+0x43
    #3 0xffffffff808adc15 at trap_fatal+0x385
    #4 0xffffffff808adc6f at trap_pfault+0x4f
    #5 0xffffffff80886da8 at calltrap+0x8
    #6 0xffffffff80669186 at vgonel+0x186
    #7 0xffffffff80669841 at vgone+0x31
    #8 0xffffffff8065806d at vfs_hash_insert+0x26d
    #9 0xffffffff81a39069 at sfs_vgetx+0x149
    #10 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
    #11 0xffffffff8065a28c at lookup+0x45c
    #12 0xffffffff806594b9 at namei+0x259
    #13 0xffffffff80676a33 at kern_statat+0xf3
    #14 0xffffffff8067712f at sys_fstatat+0x2f
    #15 0xffffffff808ae50c at amd64_syscall+0x10c
    #16 0xffffffff808876bb at fast_syscall_common+0xf8

The page fault occurs because vgonel() will call VOP_CLOSE() for active
vnodes. For this reason, define vop_close for zfsctl_ops_snapshot. While
here, define vop_open for consistency.

After adding the necessary vop, the bug progresses to the following
panic:

    panic: VERIFY3(vrecycle(vp) == 1) failed (0 == 1)
    cpuid = 17
    KDB: stack backtrace:
    #0 0xffffffff805e29c5 at kdb_backtrace+0x65
    #1 0xffffffff8059620f at vpanic+0x17f
    #2 0xffffffff81a27f4a at spl_panic+0x3a
    #3 0xffffffff81a3a4d0 at zfsctl_snapshot_inactive+0x40
    #4 0xffffffff8066fdee at vinactivef+0xde
    #5 0xffffffff80670b8a at vgonel+0x1ea
    #6 0xffffffff806711e1 at vgone+0x31
    #7 0xffffffff8065fa0d at vfs_hash_insert+0x26d
    #8 0xffffffff81a39069 at sfs_vgetx+0x149
    #9 0xffffffff81a39c54 at zfsctl_snapdir_lookup+0x1e4
    #10 0xffffffff80661c2c at lookup+0x45c
    #11 0xffffffff80660e59 at namei+0x259
    #12 0xffffffff8067e3d3 at kern_statat+0xf3
    #13 0xffffffff8067eacf at sys_fstatat+0x2f
    #14 0xffffffff808b5ecc at amd64_syscall+0x10c
    #15 0xffffffff8088f07b at fast_syscall_common+0xf8

This is caused by a race condition that can occur when allocating a new
vnode and adding that vnode to the vfs hash. If the newly created vnode
loses the race when being inserted into the vfs hash, it will not be
recycled as its usecount is greater than zero, hitting the above
assertion.

Fix this by dropping the assertion.

FreeBSD-issue: https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=252700

Signed-off-by:  Rob Wing <rob.wing@klarasystems.com>
Submitted-by:   Klara, Inc.
Sponsored-by:   rsync.net
behlendorf pushed a commit that referenced this issue Feb 22, 2023
Reviewed-by: Andriy Gapon <avg@FreeBSD.org>
Reviewed-by: Mateusz Guzik <mjguzik@gmail.com>
Reviewed-by: Alek Pinchuk <apinchuk@axcient.com>
Reviewed-by: Ryan Moeller <ryan@iXsystems.com>
Signed-off-by: Rob Wing <rob.wing@klarasystems.com>
Co-authored-by: Rob Wing <rob.wing@klarasystems.com>
Submitted-by: Klara, Inc.
Sponsored-by: rsync.net
Closes #14501
behlendorf pushed a commit that referenced this issue May 30, 2023
EchterAgo pushed a commit to EchterAgo/zfs that referenced this issue Aug 4, 2023
alex-stetsenko added a commit to KlaraSystems/zfs that referenced this issue Apr 20, 2024
Sponsored-by: iXsystems, Inc.
Sponsored-by: Klara, Inc.
Signed-off-by: Alexander Stetsenko <alex.stetsenko@klarasystems.com>
This issue was closed.