Hang of zfs on heavy delete #568

Closed
fox-pluto opened this issue Feb 9, 2012 · 7 comments

@fox-pluto

Hi,

I found a problem when deleting a large number of files from a very full volume. My system is an Intel x86_64 machine running Ubuntu (kernel 3.0.0-12-server #20-Ubuntu SMP) with 4 GB of RAM.

The Linux system stays up, but all read and write operations on the ZFS volumes stop.

My ZFS version is 0.6RC6.

Here is the log I found in messages:

Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.636307] INFO: task arc_reclaim:2461 blocked for more than 120 seconds.
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.636690] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637078] arc_reclaim D 0000000000000001 0 2461 2 0x00000000
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637083] ffff88010af23a40 0000000000000046 ffff88010af239e0 ffffffff810329a9
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637085] ffff88010af23fd8 ffff88010af23fd8 ffff88010af23fd8 0000000000012a40
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637088] ffff8800d1925c80 ffff88010ebcdc80 ffff88010af23a60 ffff880103ede330
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637091] Call Trace:
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637100] [] ? default_spin_lock_flags+0x9/0x10
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637115] [] cv_wait_common+0x77/0xd0 [spl]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637118] [] ? add_wait_queue+0x60/0x60
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637123] [] __cv_wait+0x13/0x20 [spl]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637156] [] txg_wait_open+0x73/0xa0 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637171] [] dmu_tx_wait+0xed/0xf0 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637185] [] dmu_tx_assign+0x66/0x420 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637196] [] dmu_free_long_range_impl+0x184/0x270 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637208] [] dmu_free_long_range+0x4b/0x70 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637224] [] ? sa_handle_destroy+0x8f/0xa0 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637241] [] zfs_rmnode+0x60/0x340 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637260] [] zfs_zinactive+0x89/0xd0 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637278] [] zfs_inactive+0x5e/0x1c0 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637282] [] ? truncate_pagecache+0x59/0x70
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637300] [] zpl_evict_inode+0x28/0x30 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637313] [] evict+0x91/0x170
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637316] [] dispose_list+0x3e/0x50
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637318] [] prune_icache+0x16f/0x320
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637321] [] shrink_icache_memory+0x21/0x50
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637331] [] arc_kmem_reap_now+0x82/0x110 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637342] [] arc_reclaim_thread+0x11a/0x150 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637352] [] ? arc_shrinker_func+0xd0/0xd0 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637357] [] thread_generic_wrapper+0x78/0x90 [spl]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637362] [] ? __thread_create+0x160/0x160 [spl]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637364] [] kthread+0x8c/0xa0
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637371] [] kernel_thread_helper+0x4/0x10
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637373] [] ? flush_kthread_worker+0xa0/0xa0
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637375] [] ? gs_change+0x13/0x13
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637472] INFO: task zfs_iput_taskq/:12268 blocked for more than 120 seconds.
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.637892] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638326] zfs_iput_taskq/ D 0000000000000000 0 12268 2 0x00000000
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638330] ffff880104789ad0 0000000000000046 ffff880104789a70 ffffffff810329a9
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638332] ffff880104789fd8 ffff880104789fd8 ffff880104789fd8 0000000000012a40
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638335] ffff8800810ddc80 ffff8800c701ae40 0000000000000000 ffffffff81c02380
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638337] Call Trace:
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638342] [] ? default_spin_lock_flags+0x9/0x10
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638345] [] __wait_on_freeing_inode+0xa8/0xd0
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638349] [] ? autoremove_wake_function+0x40/0x40
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638351] [] find_inode_fast+0x6c/0xa0
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638353] [] ilookup+0x76/0xd0
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638380] [] zfs_zget+0xba/0x210 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638393] [] ? dnode_rele+0x54/0x90 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638410] [] zfs_unlinked_drain+0x94/0x120 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638416] [] ? __switch_to+0xca/0x310
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638420] [] ? _raw_spin_lock+0xe/0x20
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638424] [] ? add_partial+0x58/0x90
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638441] [] taskq_thread+0x1b5/0x390 [spl]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638444] [] ? try_to_wake_up+0x200/0x200
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638449] [] ? task_alloc+0x160/0x160 [spl]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.638452] [] kthread+0x8c/0xa0
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.639466] [] ? dnode_rele+0x54/0x90 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.639484] [] zfs_unlinked_drain+0x94/0x120 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.639487] [] ? __switch_to+0xca/0x310
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.639490] [] ? _raw_spin_lock+0xe/0x20
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.639493] [] ? add_partial+0x58/0x90
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.639501] [] taskq_thread+0x1b5/0x390 [spl]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.639512] [] ? try_to_wake_up+0x200/0x200
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.639517] [] ? task_alloc+0x160/0x160 [spl]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.639520] [] kthread+0x8c/0xa0
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.639523] [] kernel_thread_helper+0x4/0x10
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.639526] [] ? flush_kthread_worker+0xa0/0xa0
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.639528] [] ? gs_change+0x13/0x13
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.639552] INFO: task rm:6210 blocked for more than 120 seconds.
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640054] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640560] rm D 0000000000000001 0 6210 27181 0x00000000
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640563] ffff880052791ad8 0000000000000086 ffff880052791a78 ffffffff810329a9
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640566] ffff880052791fd8 ffff880052791fd8 ffff880052791fd8 0000000000012a40
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640569] ffff8800a0712e40 ffff88010e9b2e40 ffff880052791af8 ffff880103ede330
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640572] Call Trace:
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640576] [] ? default_spin_lock_flags+0x9/0x10
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640586] [] cv_wait_common+0x77/0xd0 [spl]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640589] [] ? add_wait_queue+0x60/0x60
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640593] [] __cv_wait+0x13/0x20 [spl]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640617] [] txg_wait_open+0x73/0xa0 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640629] [] dmu_tx_wait+0xed/0xf0 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640642] [] dmu_tx_assign+0x66/0x420 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640663] [] dmu_free_long_range_impl+0x184/0x270 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640674] [] dmu_free_long_range+0x4b/0x70 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640692] [] zfs_rmnode+0x60/0x340 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640710] [] zfs_zinactive+0x89/0xd0 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640728] [] zfs_inactive+0x5e/0x1c0 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640745] [] ? truncate_pagecache+0x59/0x70
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640762] [] zpl_evict_inode+0x28/0x30 [zfs]
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640765] [] evict+0x91/0x170
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640767] [] iput_final+0xd2/0x1a0
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640769] [] iput+0x38/0x50
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640771] [] do_unlinkat+0x148/0x1c0
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640774] [] ? sys_newfstatat+0x2a/0x40
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640776] [] sys_unlinkat+0x22/0x40
Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640780] [] system_call_fastpath+0x16/0x1b
Feb 9 10:24:44 ubuntu-brick kernel: [ 2217.814731] INFO: rcu_sched_state detected stall on CPU 0 (t=15000 jiffies)
Feb 9 10:25:01 ubuntu-brick CRON[18556]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.458556] INFO: task arc_reclaim:2461 blocked for more than 120 seconds.
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459156] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459761] arc_reclaim D 0000000000000001 0 2461 2 0x00000000
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459765] ffff88010af23a40 0000000000000046 ffff88010af239e0 ffffffff810329a9
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459768] ffff88010af23fd8 ffff88010af23fd8 ffff88010af23fd8 0000000000012a40
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459771] ffff8800d1925c80 ffff88010ebcdc80 ffff88010af23a60 ffff880103ede330
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459774] Call Trace:
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459782] [] ? default_spin_lock_flags+0x9/0x10
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459795] [] cv_wait_common+0x77/0xd0 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459800] [] ? add_wait_queue+0x60/0x60
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459805] [] __cv_wait+0x13/0x20 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459830] [] txg_wait_open+0x73/0xa0 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459844] [] dmu_tx_wait+0xed/0xf0 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459858] [] dmu_tx_assign+0x66/0x420 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459871] [] dmu_free_long_range_impl+0x184/0x270 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459884] [] dmu_free_long_range+0x4b/0x70 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459902] [] ? sa_handle_destroy+0x8f/0xa0 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459919] [] zfs_rmnode+0x60/0x340 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459939] [] zfs_zinactive+0x89/0xd0 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459957] [] zfs_inactive+0x5e/0x1c0 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459961] [] ? truncate_pagecache+0x59/0x70
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459979] [] zpl_evict_inode+0x28/0x30 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459983] [] evict+0x91/0x170
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459985] [] dispose_list+0x3e/0x50
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459987] [] prune_icache+0x16f/0x320
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.459989] [] shrink_icache_memory+0x21/0x50
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.460000] [] arc_kmem_reap_now+0x82/0x110 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.460010] [] arc_reclaim_thread+0x11a/0x150 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.460021] [] ? arc_shrinker_func+0xd0/0xd0 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.460026] [] thread_generic_wrapper+0x78/0x90 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.460031] [] ? __thread_create+0x160/0x160 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.460034] [] kthread+0x8c/0xa0
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.460038] [] kernel_thread_helper+0x4/0x10
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.460041] [] ? flush_kthread_worker+0xa0/0xa0
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.460043] [] ? gs_change+0x13/0x13
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.460094] INFO: task z_wr_int/5:11812 blocked for more than 120 seconds.
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.460702] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461338] z_wr_int/5 D 0000000000000001 0 11812 2 0x00000000
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461341] ffff8800816d1d10 0000000000000046 ffff880110f87488 ffffffffa01812b6
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461344] ffff8800816d1fd8 ffff8800816d1fd8 ffff8800816d1fd8 0000000000012a40
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461346] ffff880098e78000 ffff8800816cae40 ffffffffffffff10 ffff880082a12720
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461349] Call Trace:
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461355] [] ? kmem_free_debug+0x16/0x20 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461358] [] __mutex_lock_slowpath+0xd7/0x150
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461377] [] ? zio_destroy+0xa6/0xe0 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461379] [] mutex_lock+0x22/0x40
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461397] [] vdev_queue_io_done+0x30/0xd0 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461415] [] zio_vdev_io_done+0x88/0x190 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461433] [] zio_execute+0x9f/0xf0 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461438] [] taskq_thread+0x1b5/0x390 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461442] [] ? try_to_wake_up+0x200/0x200
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461446] [] ? task_alloc+0x160/0x160 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461448] [] kthread+0x8c/0xa0
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461451] [] kernel_thread_helper+0x4/0x10
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461453] [] ? flush_kthread_worker+0xa0/0xa0
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461455] [] ? gs_change+0x13/0x13
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.461457] INFO: task z_wr_int/6:11813 blocked for more than 120 seconds.
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.462118] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.462876] z_wr_int/6 D 0000000000000002 0 11813 2 0x00000000
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.462891] ffff8800816d3d10 0000000000000046 000000002cd05616 ffffffffa01812b6
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.462898] ffff8800816d3fd8 ffff8800816d3fd8 ffff8800816d3fd8 0000000000012a40
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.462904] ffff88008164c560 ffff8800816cc560 ffffffffffffff10 ffff880082a12720
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.462910] Call Trace:
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.462917] [] ? kmem_free_debug+0x16/0x20 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.462921] [] __mutex_lock_slowpath+0xd7/0x150
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.462942] [] ? vdev_cache_write+0x154/0x170 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.462946] [] mutex_lock+0x22/0x40
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.462965] [] vdev_queue_io_done+0x30/0xd0 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.462985] [] zio_vdev_io_done+0x88/0x190 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.463004] [] zio_execute+0x9f/0xf0 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.463011] [] taskq_thread+0x1b5/0x390 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.463014] [] ? try_to_wake_up+0x200/0x200
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.463021] [] ? task_alloc+0x160/0x160 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.463025] [] kthread+0x8c/0xa0
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.463029] [] kernel_thread_helper+0x4/0x10
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.463033] [] ? flush_kthread_worker+0xa0/0xa0
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.463037] [] ? gs_change+0x13/0x13
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.463040] INFO: task z_wr_int/9:11816 blocked for more than 120 seconds.
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.463750] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464480] z_wr_int/9 D 0000000000000001 0 11816 2 0x00000000
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464484] ffff8800816e1d10 0000000000000046 ffffffffa01812b6 ffff8800816d9720
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464487] ffff8800816e1fd8 ffff8800816e1fd8 ffff8800816e1fd8 0000000000012a40
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464490] ffff88010da18000 ffff8800816d9720 ffff8800816e1d40 ffff880082a12720
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464493] Call Trace:
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464500] [] ? kmem_free_debug+0x16/0x20 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464503] [] __mutex_lock_slowpath+0xd7/0x150
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464524] [] ? vdev_cache_write+0x154/0x170 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464527] [] mutex_lock+0x22/0x40
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464546] [] vdev_queue_io_done+0x30/0xd0 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464564] [] zio_vdev_io_done+0x88/0x190 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464581] [] zio_execute+0x9f/0xf0 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464586] [] taskq_thread+0x1b5/0x390 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464589] [] ? try_to_wake_up+0x200/0x200
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464594] [] ? task_alloc+0x160/0x160 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464596] [] kthread+0x8c/0xa0
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464599] [] kernel_thread_helper+0x4/0x10
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464602] [] ? flush_kthread_worker+0xa0/0xa0
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464604] [] ? gs_change+0x13/0x13
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.464606] INFO: task z_wr_int/13:11820 blocked for more than 120 seconds.
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.465332] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466093] z_wr_int/13 D 0000000000000001 0 11820 2 0x00000000
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466096] ffff8800816f9d10 0000000000000046 ffffffffa01812b6 ffff8800816f0000
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466099] ffff8800816f9fd8 ffff8800816f9fd8 ffff8800816f9fd8 0000000000012a40
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466101] ffff8800986adc80 ffff8800816f0000 ffff8800816f9d40 ffff880082a12720
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466104] Call Trace:
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466109] [] ? kmem_free_debug+0x16/0x20 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466112] [] __mutex_lock_slowpath+0xd7/0x150
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466130] [] ? vdev_cache_write+0x154/0x170 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466133] [] mutex_lock+0x22/0x40
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466150] [] vdev_queue_io_done+0x30/0xd0 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466168] [] zio_vdev_io_done+0x88/0x190 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466186] [] zio_execute+0x9f/0xf0 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466191] [] taskq_thread+0x1b5/0x390 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466193] [] ? try_to_wake_up+0x200/0x200
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466198] [] ? task_alloc+0x160/0x160 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466200] [] kthread+0x8c/0xa0
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466203] [] kernel_thread_helper+0x4/0x10
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466205] [] ? flush_kthread_worker+0xa0/0xa0
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466207] [] ? gs_change+0x13/0x13
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.466220] INFO: task txg_sync:11946 blocked for more than 120 seconds.
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.467064] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.467890] txg_sync D 0000000000000002 0 11946 2 0x00000000
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.467893] ffff880100ad5c00 0000000000000046 ffff880100ad5ba0 ffffffff810329a9
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.467896] ffff880100ad5fd8 ffff880100ad5fd8 ffff880100ad5fd8 0000000000012a40
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.467899] ffff8800816dae40 ffff880082a02e40 ffff880082821640 ffffc9008622d790
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.467901] Call Trace:
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.467905] [] ? default_spin_lock_flags+0x9/0x10
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.467912] [] cv_wait_common+0x77/0xd0 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.467915] [] ? add_wait_queue+0x60/0x60
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.467920] [] __cv_wait+0x13/0x20 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.467938] [] zio_wait+0xfb/0x170 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.467954] [] dsl_pool_sync+0xca/0x450 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.467971] [] spa_sync+0x38e/0xa00 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.467973] [] ? default_wake_function+0x12/0x20
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.467976] [] ? autoremove_wake_function+0x16/0x40
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.467979] [] ? __wake_up+0x53/0x70
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.467996] [] txg_sync_thread+0x216/0x390 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.468012] [] ? txg_init+0x260/0x260 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.468029] [] ? txg_init+0x260/0x260 [zfs]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.468034] [] thread_generic_wrapper+0x78/0x90 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.468039] [] ? __thread_create+0x160/0x160 [spl]
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.468041] [] kthread+0x8c/0xa0
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.468044] [] kernel_thread_helper+0x4/0x10
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.468047] [] ? flush_kthread_worker+0xa0/0xa0
Feb 9 10:25:43 ubuntu-brick kernel: [ 2277.468049] [] ? gs_change+0x13/0x13
Feb 9 10:27:44 ubuntu-brick kernel: [ 2397.660752] INFO: rcu_sched_state detected stall on CPU 0 (t=60031 jiffies)

Do you think the bug is already fixed in head?

Thanks for your help; let me know if you need more info.
Stefano
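
As a reading aid for the hung-task reports above, here is a minimal Python sketch that groups the log into one entry per "blocked for more than N seconds" report and lists the definite (non-`?`) stack frames for each. The names `HUNG_RE`, `FRAME_RE`, and `summarize_hung_tasks` are hypothetical helpers, not part of ZFS or SPL, and the patterns assume the syslog layout shown in this report (the pasted trace has empty `[]` where frame addresses would normally appear; the pattern tolerates both forms).

```python
import re

# Header printed by the kernel hung-task detector, e.g.
# "INFO: task arc_reclaim:2461 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<name>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)

# A definite stack frame such as "[] txg_wait_open+0x73/0xa0 [zfs]".
# Frames marked with "?" are unreliable and are deliberately skipped.
FRAME_RE = re.compile(r"\]\s+(?P<sym>[A-Za-z_]\w*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_hung_tasks(lines):
    """Return one dict per hung-task report: task name, pid, and frames."""
    reports = []
    current = None
    for line in lines:
        header = HUNG_RE.search(line)
        if header:
            current = {
                "task": header.group("name"),
                "pid": int(header.group("pid")),
                "frames": [],
            }
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            current["frames"].append(frame.group("sym"))
    return reports


if __name__ == "__main__":
    sample = [
        "Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.639552] INFO: task rm:6210 blocked for more than 120 seconds.",
        "Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640572] Call Trace:",
        "Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640576] [] ? default_spin_lock_flags+0x9/0x10",
        "Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640586] [] cv_wait_common+0x77/0xd0 [spl]",
        "Feb 9 10:23:43 ubuntu-brick kernel: [ 2157.640617] [] txg_wait_open+0x73/0xa0 [zfs]",
    ]
    for report in summarize_hung_tasks(sample):
        print(report["task"], report["pid"], " <- ".join(report["frames"]))
```

Run over the full log, this kind of summary makes it easier to see that `arc_reclaim` and `rm` are both waiting in `txg_wait_open`, while `txg_sync` itself is waiting in `zio_wait`.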

@behlendorf
Contributor

Yes, there's a good chance your issue was resolved by commit 08d08eb which is in head. You'll want to update to the latest spl and zfs sources and try it again.

@fox-pluto
Author

Hi Brian,

I followed your suggestion. I am now testing with RHEL 6.2 (kernel 2.6.32-220.el6.x86_64 #1 SMP Wed Nov 9 08:03:13 EST 2011 x86_64 x86_64 x86_64 GNU/Linux) and the head ZFS versions zfsonlinux-zfs-b10c77f and zfsonlinux-spl-feedc43, but I again hit a kernel crash when deleting a lot of files.
Here are the dump lines:

BUG: scheduling while atomic: zfs_iput_taskq//4826/0xffff8800
Modules linked in: bridge stp llc autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 dm_round_robin dm_multipath r8169 mii xhci_hcd microcode serio_raw iTCO_wdt iTCO_vendor_support i2c_i801 shpchp zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) spl(U) zlib_deflate snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc ses enclosure sg ext4 mbcache jbd2 sd_mod crc_t10dif pata_acpi ata_generic ata_piix mpt2sas scsi_transport_sas raid_class i915 drm_kms_helper drm i2c_algo_bit i2c_core video output dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
Pid: 4826, comm: zfs_iput_taskq/ Tainted: P W ---------------- 2.6.32-220.el6.x86_64 #1
Call Trace:
[] ? __schedule_bug+0x66/0x70
[] ? thread_return+0x67a/0x77e
[] ? arc_read_nolock+0x530/0x810 [zfs]
BUG: unable to handle kernel paging request at 0000000606de70a0
IP: [] task_rq_lock+0x4d/0xa0
PGD 75f69067 PUD 0
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/host0/port-0:1/expander-0:3/port-0:3:1/expander-0:4/port-0:4:26/end_device-0:4:26/target0:0:104/0:0:104:0/state
CPU 2
Modules linked in: bridge stp llc autofs4 sunrpc cpufreq_ondemand acpi_cpufreq freq_table mperf ipv6 dm_round_robin dm_multipath r8169 mii xhci_hcd microcode serio_raw iTCO_wdt iTCO_vendor_support i2c_i801 shpchp zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) spl(U) zlib_deflate snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd soundcore snd_page_alloc ses enclosure sg ext4 mbcache jbd2 sd_mod crc_t10dif pata_acpi ata_generic ata_piix mpt2sas scsi_transport_sas raid_class i915 drm_kms_helper drm i2c_algo_bit i2c_core video output dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]

Pid: 4642, comm: z_rd_int/2 Tainted: P W ---------------- 2.6.32-220.el6.x86_64 #1 Gigabyte Technology Co., Ltd. H55N-USB3/H55N-USB3
RIP: 0010:[] [] task_rq_lock+0x4d/0xa0
RSP: 0018:ffff8800d188fae0 EFLAGS: 00010082
RAX: 00000000d0a3e070 RBX: 0000000000015fc0 RCX: 0000000000000000
RDX: 0000000000000082 RSI: ffff8800d188fb38 RDI: ffff8800d0afd580
RBP: ffff8800d188fb00 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8800d0afd580
R13: ffff8800d188fb38 R14: 0000000000015fc0 R15: 0000000000000003
FS: 0000000000000000(0000) GS:ffff880028280000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000606de70a0 CR3: 0000000075f40000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process z_rd_int/2 (pid: 4642, threadinfo ffff8800d188e000, task ffff8800d188d500)
Stack:
ffff8800d0afd580 0000000000000001 0000000000000000 0000000000000002
<0> ffff8800d188fb70 ffffffff8105f68c 0000000000011210 ffff880000000003
<0> 00000003d188fb88 0000000000000001 ffff8800d188fbe0 0000000000000082
Call Trace:
[] try_to_wake_up+0x3c/0x400
[] default_wake_function+0x12/0x20
[] autoremove_wake_function+0x16/0x40
[] __wake_up_common+0x59/0x90
[] __wake_up+0x48/0x70
[] __cv_broadcast+0x3c/0xd0 [spl]
[] ? mutex_lock+0x1e/0x50
[] zio_done+0x4ad/0xbf0 [zfs]
[] ? zio_remove_child+0x97/0xb0 [zfs]
[] zio_done+0x686/0xbf0 [zfs]
[] zio_done+0x686/0xbf0 [zfs]
[] zio_done+0x686/0xbf0 [zfs]
[] zio_execute+0x99/0xf0 [zfs]
[] taskq_thread+0x212/0x590 [spl]
[] ? thread_return+0x4e/0x77e
[] ? default_wake_function+0x0/0x20
[] ? taskq_thread+0x0/0x590 [spl]
[] kthread+0x96/0xa0
[] child_rip+0xa/0x20
[] ? kthread+0x0/0xa0
[] ? child_rip+0x0/0x20
Code: c3 c0 5f 01 00 49 89 fc 49 89 f5 9c 58 0f 1f 44 00 00 48 89 c2 fa 66 0f 1f 44 00 00 49 89 55 00 49 8b 44 24 08 49 89 de 8b 40 18 <4c> 03 34 c5 20 6d bf 81 4c 89 f7 e8 53 da 49 00 49 8b 44 24 08
RIP [] task_rq_lock+0x4d/0xa0
RSP
CR2: 0000000606de70a0
crash>
crash> ps -p 4642
PID: 0 TASK: ffffffff81a8d020 CPU: 0 COMMAND: "swapper"
PID: 2 TASK: ffff8801155baa80 CPU: 2 COMMAND: "kthreadd"
PID: 4642 TASK: ffff8800d188d500 CPU: 2 COMMAND: "z_rd_int/2"

crash>

Do you think these two bugs are related? Do you need any more info? (I have the vmcore memory dump if you need it.)

Thanks,
Stefano

@ryao
Contributor

ryao commented Feb 18, 2012

Stefano, what did you do to trigger this? How can it be reproduced?
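
The thread does not record the reporter's exact procedure, so the following is only a generic sketch of a heavy-delete workload, not a confirmed reproducer. The function name `create_and_delete` and its parameters are hypothetical; pointing `directory` at a dataset on the affected pool, and choosing counts and sizes large enough to matter, are left to whoever runs it.

```python
import os
import tempfile
import time


def create_and_delete(directory, count=1000, size=4096):
    """Create `count` files of `size` bytes in `directory`, then unlink them.

    Returns (files_deleted, seconds_spent_deleting).
    """
    payload = b"\0" * size
    paths = []
    for i in range(count):
        path = os.path.join(directory, f"stress_{i:06d}.dat")
        with open(path, "wb") as fh:
            fh.write(payload)
        paths.append(path)

    # Time only the delete phase, since that is where the hang was reported.
    start = time.monotonic()
    for path in paths:
        os.unlink(path)
    return len(paths), time.monotonic() - start


if __name__ == "__main__":
    # A temporary directory keeps the demo harmless; it is NOT on ZFS
    # unless the system's temp location happens to be.
    with tempfile.TemporaryDirectory() as scratch:
        deleted, seconds = create_and_delete(scratch, count=100)
        print(f"deleted {deleted} files in {seconds:.3f}s")
```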

@mikhmv

mikhmv commented Feb 24, 2012

Hi, I just reported a similar issue.

In my case the problem happens every time I remove a big file, such as one of 180 GB. There is no problem removing a 40 GB file.
The bug number is #583.

@behlendorf
Contributor

The second issue is certainly different from the first; in fact, the first should be fixed in the latest source. We'll certainly dig into the second crash you reported too.

@mikhmv

mikhmv commented Mar 15, 2012

This bug has returned with the new version of ZFS.

@behlendorf
Contributor

Can you please open a new issue with any debugging output available from dmesg? To avoid confusion, I'm closing this issue.

pcd1193182 added a commit to pcd1193182/zfs that referenced this issue Sep 26, 2023
…lamation stress test (openzfs#568)

Signed-off-by: Paul Dagnelie <pcd@delphix.com>