zfs rename hangs after changing the volsize property #2525

Closed
arturpzol opened this issue Jul 23, 2014 · 2 comments

@arturpzol

Hello,
I have a problem with zvol rename: it hangs (it looks like a deadlock) after I change the volsize property and remove the zvol from SCST from a script. When I run the same steps manually in a terminal, the problem does not reproduce. When I run them from a script, it reproduces every time. Adding a sleep between removing the zvol from SCST and renaming it works around the issue.

Please see the dmesg output below, captured with the SysRq show-blocked-tasks trigger:

[ 768.200202] zd0: detected capacity change from 1288437760 to 2421489664
[ 768.300701] zd0: detected capacity change from 2421489664 to 1342177280
[ 768.393700] scst: Removed LUN 0 from group iqn.2014-7:target0 (target iqn.2014-7:target0)
[ 768.484603] dev_vdisk: Detached virtual device 5c79653767732f41 ("/dev/Pool-0/2")
[ 768.484615] scst: Detached from virtual device 5c79653767732f41 (id 1)
[ 768.484617] dev_vdisk: Virtual device 5c79653767732f41 unregistered
[ 967.081617] SysRq : Show Blocked State
[ 967.081630] task PC stack pid father
[ 967.081669] txg_sync D 0000000000000000 0 16367 2 0x00000000
[ 967.081679] ffff88003f890000 0000000000000002 0000000000000292 ffffffff81967440
[ 967.081687] ffff8800220b3fd8 ffff8800220b3fd8 ffff8800220b3fd8 ffff88003f890000
[ 967.081694] 0000000000000030 ffff88003f890000 00000000ffffffff ffff88003f890000
[ 967.081701] Call Trace:
[ 967.081778] [] ? arc_buf_thaw+0x3b/0x90 [zfs]
[ 967.081840] [] ? dbuf_dirty+0x4a8/0x870 [zfs]
[ 967.081852] [] ? avl_find+0x52/0x90 [zavl]
[ 967.081863] [] ? __kmalloc+0xda/0x120
[ 967.081875] [] ? schedule_preempt_disabled+0x9/0x10
[ 967.081959] [] ? mze_find+0xbe/0xd0 [zfs]
[ 967.081966] [] ? __mutex_lock_slowpath+0x109/0x1a0
[ 967.081988] [] ? kmem_free_debug+0x41/0x140 [spl]
[ 967.081996] [] ? mutex_lock+0x1a/0x40
[ 967.082073] [] ? zvol_rename_minors+0x69/0x170 [zfs]
[ 967.082143] [] ? dsl_dir_rename_sync+0x1c5/0x390 [zfs]
[ 967.082249] [] ? dsl_sync_task_sync+0x10a/0x110 [zfs]
[ 967.082320] [] ? dsl_pool_sync+0x2fb/0x420 [zfs]
[ 967.082394] [] ? spa_sync+0x3f9/0xaf0 [zfs]
[ 967.082403] [] ? kvm_clock_read+0x1f/0x30
[ 967.082411] [] ? ktime_get_ts+0x3d/0xd0
[ 967.082489] [] ? txg_sync_thread+0x3a8/0x600 [zfs]
[ 967.082496] [] ? kvm_clock_read+0x1f/0x30
[ 967.082573] [] ? txg_thread_wait.isra.2+0x30/0x30 [zfs]
[ 967.082592] [] ? thread_generic_wrapper+0x75/0x90 [spl]
[ 967.082610] [] ? __thread_create+0x310/0x310 [spl]
[ 967.082618] [] ? kthread+0xb3/0xc0
[ 967.082625] [] ? alloc_pid+0x1c0/0x490
[ 967.082633] [] ? kthread_freezable_should_stop+0x60/0x60
[ 967.082640] [] ? ret_from_fork+0x7c/0xb0
[ 967.082648] [] ? kthread_freezable_should_stop+0x60/0x60
[ 967.082678] kworker/0:1 D 0000000000000002 0 30097 2 0x00000000
[ 967.082691] Workqueue: events delayed_fput
[ 967.082695] ffff88003d098d00 0000000000000002 ffff88001130618e ffff88003f891380
[ 967.082700] ffff880013831fd8 ffff880013831fd8 ffff880013831fd8 ffff88003d098d00
[ 967.082706] ffff880013831c68 ffffffffff0a0000 00000000000002ad ffff88003bb24000
[ 967.082712] Call Trace:
[ 967.082730] [] ? trace_put_tcd+0x11/0x20 [spl]
[ 967.082746] [] ? spl_debug_msg+0x43a/0x860 [spl]
[ 967.082753] [] ? try_to_wake_up+0xcb/0x290
[ 967.082775] [] ? cv_wait_common+0x105/0x1c0 [spl]
[ 967.082783] [] ? abort_exclusive_wait+0xb0/0xb0
[ 967.082862] [] ? txg_wait_synced+0xab/0x180 [zfs]
[ 967.082936] [] ? zvol_last_close+0xa6/0xb0 [zfs]
[ 967.083008] [] ? zvol_release+0x88/0x90 [zfs]
[ 967.083016] [] ? __blkdev_put+0x15b/0x1a0
[ 967.083023] [] ? blkdev_close+0x20/0x30
[ 967.083029] [] ? __fput+0xae/0x240
[ 967.083035] [] ? delayed_fput+0x92/0xb0
[ 967.083041] [] ? process_one_work+0x141/0x3b0
[ 967.083050] [] ? pwq_activate_delayed_work+0x2e/0x50
[ 967.083056] [] ? worker_thread+0x114/0x370
[ 967.083064] [] ? sched_clock+0x5/0x10
[ 967.083071] [] ? manage_workers.isra.28+0x2c0/0x2c0
[ 967.083078] [] ? kthread+0xb3/0xc0
[ 967.083084] [] ? alloc_pid+0x1c0/0x490
[ 967.083092] [] ? kthread_freezable_should_stop+0x60/0x60
[ 967.083098] [] ? ret_from_fork+0x7c/0xb0
[ 967.083106] [] ? kthread_freezable_should_stop+0x60/0x60
[ 967.083111] zfs D 0000000000000001 0 32560 31545 0x00000000
[ 967.083118] ffff88003f891380 0000000000000002 ffff88001130621b ffff88003f896800
[ 967.083123] ffff880011343fd8 ffff880011343fd8 ffff880011343fd8 ffff88003f891380
[ 967.083128] ffff880011343c88 ffffffffff0a0000 00000000000002ad ffff88003bb24000
[ 967.083134] Call Trace:
[ 967.083151] [] ? trace_put_tcd+0x11/0x20 [spl]
[ 967.083167] [] ? spl_debug_msg+0x43a/0x860 [spl]
[ 967.083187] [] ? cv_wait_common+0x105/0x1c0 [spl]
[ 967.083195] [] ? abort_exclusive_wait+0xb0/0xb0
[ 967.083273] [] ? txg_wait_synced+0xab/0x180 [zfs]
[ 967.083343] [] ? dsl_dir_hold+0x360/0x360 [zfs]
[ 967.083412] [] ? dsl_dir_rename_check+0x260/0x260 [zfs]
[ 967.083482] [] ? dsl_sync_task+0xc9/0x240 [zfs]
[ 967.083551] [] ? dsl_dir_hold+0x360/0x360 [zfs]
[ 967.083620] [] ? dsl_dir_rename_check+0x260/0x260 [zfs]
[ 967.083689] [] ? dsl_dir_rename+0x29/0x30 [zfs]
[ 967.083765] [] ? zfsdev_ioctl+0x4a1/0x510 [zfs]
[ 967.083773] [] ? get_vtime_delta+0x16/0x80
[ 967.083780] [] ? do_vfs_ioctl+0x8a/0x4f0
[ 967.083786] [] ? get_vtime_delta+0x16/0x80
[ 967.083793] [] ? vtime_account_user+0x50/0x70
[ 967.083799] [] ? SyS_ioctl+0xa0/0xc0
[ 967.083806] [] ? tracesys+0xdd/0xe2

ps shows:

32560 ? D 0:00 _ /usr/sbin/zfs rename Pool-0/2 Pool-0/rrr

I am using kernel 3.10.36 with ZoL 0.6.3.
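
For anyone scripting the same sequence, the workaround mentioned above (a sleep between removing the zvol from SCST and the rename) might look roughly like the sketch below. It is only an illustration: `resize_detach_rename`, `detach_from_scst`, and `settle_seconds` are hypothetical names, the SCST removal step is left as a caller-supplied callback because the exact command is not shown in this report, and the default delay is a guess. The sleep only avoids the race in practice; the underlying deadlock is still there.

```python
import subprocess
import time
from typing import Callable, Sequence


def resize_detach_rename(
    dataset: str,
    new_name: str,
    volsize: str,
    detach_from_scst: Callable[[str], None],  # hypothetical: removes the zvol from SCST
    settle_seconds: float = 5.0,              # assumed value, not taken from the report
    run: Callable[[Sequence[str]], object] = subprocess.check_call,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Resize a zvol, detach it from SCST, wait, then rename it."""
    # Step 1: change the volsize property, as in the failing scenario.
    run(["zfs", "set", f"volsize={volsize}", dataset])
    # Step 2: remove the zvol from SCST (how this is done is up to the caller).
    detach_from_scst(dataset)
    # Step 3: the workaround. Give the block device time to be fully closed
    # before the rename is issued.
    sleep(settle_seconds)
    # Step 4: rename the zvol.
    run(["zfs", "rename", dataset, new_name])
```

The `run` and `sleep` parameters exist only so the sequence can be exercised without touching a real pool; by default they call `subprocess.check_call` and `time.sleep`.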

@behlendorf behlendorf added this to the 0.6.5 milestone Jul 24, 2014
@behlendorf behlendorf added the Bug label Jul 24, 2014
@behlendorf
Contributor

Thanks for the bug report. The stacks you provided show the deadlock pretty clearly, so while I haven't looked at this too closely, I suspect it will be a pretty easy fix.

To quickly summarize: the txg_sync thread is blocked waiting on the zvol_state_lock in the zvol rename sync task. Meanwhile, a generic worker thread doing a delayed fput on the block device has acquired the zvol_state_lock and is waiting for the txg to finish syncing. Deadlock.
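
The cycle described above can be modelled in user space with two threads. This is a minimal sketch, not ZFS code: a plain `threading.Lock` stands in for zvol_state_lock, an `Event` stands in for "the txg has finished syncing", and the function name `simulate` and its parameters are made up for illustration. Timeouts are used so the model reports the stall instead of hanging. The `release_before_wait=True` mode shows one generic way to break such a cycle; it is not meant to describe the fix that was eventually merged.

```python
import threading


def simulate(release_before_wait: bool, timeout: float = 0.5) -> bool:
    """Return True if both threads complete, False if they stall on each other."""
    zvol_state_lock = threading.Lock()   # stand-in for zvol_state_lock
    txg_synced = threading.Event()       # stand-in for "txg finished syncing"
    closer_has_lock = threading.Event()  # scaffolding: forces the bad interleaving
    finished = []

    def closer() -> None:
        # Models zvol_release() -> zvol_last_close() -> txg_wait_synced().
        zvol_state_lock.acquire()
        closer_has_lock.set()
        if release_before_wait:
            zvol_state_lock.release()
            ok = txg_synced.wait(timeout)
        else:
            # Waits for the sync while still holding the lock.
            ok = txg_synced.wait(timeout)
            zvol_state_lock.release()
        if ok:
            finished.append("closer")

    def txg_sync() -> None:
        # Models txg_sync -> dsl_dir_rename_sync() -> zvol_rename_minors().
        closer_has_lock.wait()
        if zvol_state_lock.acquire(timeout=timeout):
            zvol_state_lock.release()
            txg_synced.set()  # the sync can only finish after taking the lock
            finished.append("txg_sync")

    threads = [threading.Thread(target=closer), threading.Thread(target=txg_sync)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(finished) == 2
```

Calling `simulate(release_before_wait=False)` returns `False` because the closer never sees the sync complete while it holds the lock; `simulate(release_before_wait=True)` returns `True`.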

@ryao
Contributor

ryao commented Sep 2, 2014

This is related to #2652. Here we take the zvol_state_lock mutex in zvol_release() and then block on the transaction sync. Then txg_sync tries to take zvol_state_lock and we deadlock. The same fix ought to tackle both issues.

@behlendorf behlendorf modified the milestones: 0.6.4, 0.6.5 Sep 4, 2014
ryao pushed a commit to ryao/zfs that referenced this issue Sep 5, 2014
This commit should prevent a deadlock on dp_config_rwlock when
running `zfs rename` by ensuring zvol_rename_minors() is not
called under this lock.

Signed-off-by: Stanislav Seletskiy <s.seletskiy@gmail.com>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Signed-off-by: Tim Chase <tim@chase2k.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Closes openzfs#2652.
Closes openzfs#2525.
DeHackEd pushed a commit to DeHackEd/zfs that referenced this issue Sep 18, 2014
ryao pushed a commit to ryao/zfs that referenced this issue Nov 29, 2014