blocked for more than 120 seconds, then lockup #853

Closed
chrisrd opened this Issue · 8 comments

5 participants

@chrisrd

spl-e8267ac, zfs-ed3fc80, linux-3.3

I had a lockup occur after 40 hours of a constant rsync destination workload (without xattrs):

Jul 29 05:01:28 b5 kernel: [146061.152476] INFO: task kswapd0:78 blocked for more than 120 seconds.
Jul 29 05:01:28 b5 kernel: [146061.152530] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 29 05:01:28 b5 kernel: [146061.152611] kswapd0         D 0000000000000000     0    78      2 0x00000000
Jul 29 05:01:28 b5 kernel: [146061.152667]  ffff8806215a3910 0000000000000046 0000000000000000 0000000000000000
Jul 29 05:01:28 b5 kernel: [146061.152751]  ffff880624049350 ffff8806215a3fd8 ffff8806215a3fd8 ffff8806215a3fd8
Jul 29 05:01:28 b5 kernel: [146061.152834]  ffff880621418670 ffff880624049350 ffff8806215a3920 ffff88062372a2e8
Jul 29 05:01:28 b5 kernel: [146061.152917] Call Trace:
Jul 29 05:01:28 b5 kernel: [146061.152961]  [<ffffffff813c503f>] schedule+0x3f/0x60
Jul 29 05:01:28 b5 kernel: [146061.153015]  [<ffffffffa03eb8c8>] cv_wait_common+0x98/0x190 [spl]
Jul 29 05:01:28 b5 kernel: [146061.153066]  [<ffffffff81074813>] ? __wake_up+0x53/0x70
Jul 29 05:01:28 b5 kernel: [146061.153114]  [<ffffffff8106c240>] ? add_wait_queue+0x60/0x60
Jul 29 05:01:28 b5 kernel: [146061.153163]  [<ffffffffa03eb9f3>] __cv_wait+0x13/0x20 [spl]
Jul 29 05:01:28 b5 kernel: [146061.153245]  [<ffffffffa0511d1b>] txg_wait_open+0x9b/0x170 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.153307]  [<ffffffffa04bb38d>] dmu_tx_wait+0xfd/0x150 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.153369]  [<ffffffffa04bb489>] dmu_tx_assign+0xa9/0x630 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.153440]  [<ffffffffa0559e46>] zfs_inactive+0x216/0x400 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.153488]  [<ffffffff810f018e>] ? truncate_pagecache+0x5e/0x70
Jul 29 05:01:28 b5 kernel: [146061.153558]  [<ffffffffa0578988>] zpl_evict_inode+0x28/0x30 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.153607]  [<ffffffff81152548>] evict+0xb8/0x1c0
Jul 29 05:01:28 b5 kernel: [146061.153650]  [<ffffffff81152be1>] dispose_list+0x41/0x50
Jul 29 05:01:28 b5 kernel: [146061.153695]  [<ffffffff81153615>] prune_icache_sb+0x185/0x340
Jul 29 05:01:28 b5 kernel: [146061.153743]  [<ffffffff8113bdfe>] prune_super+0xee/0x1a0
Jul 29 05:01:28 b5 kernel: [146061.153789]  [<ffffffff810f3c64>] shrink_slab+0x154/0x310
Jul 29 05:01:28 b5 kernel: [146061.153833]  [<ffffffff810f7037>] balance_pgdat+0x4f7/0x6c0
Jul 29 05:01:28 b5 kernel: [146061.153879]  [<ffffffff810f7362>] kswapd+0x162/0x3d0
Jul 29 05:01:28 b5 kernel: [146061.153922]  [<ffffffff8106c240>] ? add_wait_queue+0x60/0x60
Jul 29 05:01:28 b5 kernel: [146061.153968]  [<ffffffff810f7200>] ? balance_pgdat+0x6c0/0x6c0
Jul 29 05:01:28 b5 kernel: [146061.154013]  [<ffffffff8106b7e3>] kthread+0x93/0xa0
Jul 29 05:01:28 b5 kernel: [146061.154058]  [<ffffffff813cefa4>] kernel_thread_helper+0x4/0x10
Jul 29 05:01:28 b5 kernel: [146061.154104]  [<ffffffff8106b750>] ? kthread_freezable_should_stop+0x70/0x70
Jul 29 05:01:28 b5 kernel: [146061.154153]  [<ffffffff813cefa0>] ? gs_change+0x13/0x13
Jul 29 05:01:28 b5 kernel: [146061.154197] INFO: task kswapd1:79 blocked for more than 120 seconds.
Jul 29 05:01:28 b5 kernel: [146061.154243] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 29 05:01:28 b5 kernel: [146061.154318] kswapd1         D ffffffff81407b00     0    79      2 0x00000000
Jul 29 05:01:28 b5 kernel: [146061.154372]  ffff8806215a5910 0000000000000046 ae958806215a5960 0000000000000044
Jul 29 05:01:28 b5 kernel: [146061.154455]  ffff8806240499c0 ffff8806215a5fd8 ffff8806215a5fd8 ffff8806215a5fd8
Jul 29 05:01:28 b5 kernel: [146061.154539]  ffff8806245db9f0 ffff8806240499c0 ffff8806215a5920 ffff88062372a2e8
Jul 29 05:01:28 b5 kernel: [146061.154622] Call Trace:
Jul 29 05:01:28 b5 kernel: [146061.154657]  [<ffffffff813c503f>] schedule+0x3f/0x60
Jul 29 05:01:28 b5 kernel: [146061.154704]  [<ffffffffa03eb8c8>] cv_wait_common+0x98/0x190 [spl]
Jul 29 05:01:28 b5 kernel: [146061.154751]  [<ffffffff81074813>] ? __wake_up+0x53/0x70
Jul 29 05:01:28 b5 kernel: [146061.154795]  [<ffffffff8106c240>] ? add_wait_queue+0x60/0x60
Jul 29 05:01:28 b5 kernel: [146061.154844]  [<ffffffffa03eb9f3>] __cv_wait+0x13/0x20 [spl]
Jul 29 05:01:28 b5 kernel: [146061.154911]  [<ffffffffa0511d1b>] txg_wait_open+0x9b/0x170 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.154973]  [<ffffffffa04bb38d>] dmu_tx_wait+0xfd/0x150 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.155034]  [<ffffffffa04bb489>] dmu_tx_assign+0xa9/0x630 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.155103]  [<ffffffffa0559e46>] zfs_inactive+0x216/0x400 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.155150]  [<ffffffff810f018e>] ? truncate_pagecache+0x5e/0x70
Jul 29 05:01:28 b5 kernel: [146061.155219]  [<ffffffffa0578988>] zpl_evict_inode+0x28/0x30 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.155266]  [<ffffffff81152548>] evict+0xb8/0x1c0
Jul 29 05:01:28 b5 kernel: [146061.155309]  [<ffffffff81152be1>] dispose_list+0x41/0x50
Jul 29 05:01:28 b5 kernel: [146061.155354]  [<ffffffff81153615>] prune_icache_sb+0x185/0x340
Jul 29 05:01:28 b5 kernel: [146061.155399]  [<ffffffff8113bdfe>] prune_super+0xee/0x1a0
Jul 29 05:01:28 b5 kernel: [146061.155444]  [<ffffffff810f3c64>] shrink_slab+0x154/0x310
Jul 29 05:01:28 b5 kernel: [146061.155488]  [<ffffffff810f7037>] balance_pgdat+0x4f7/0x6c0
Jul 29 05:01:28 b5 kernel: [146061.155534]  [<ffffffff810f7362>] kswapd+0x162/0x3d0
Jul 29 05:01:28 b5 kernel: [146061.155577]  [<ffffffff8106c240>] ? add_wait_queue+0x60/0x60
Jul 29 05:01:28 b5 kernel: [146061.155622]  [<ffffffff810f7200>] ? balance_pgdat+0x6c0/0x6c0
Jul 29 05:01:28 b5 kernel: [146061.155668]  [<ffffffff8106b7e3>] kthread+0x93/0xa0
Jul 29 05:01:28 b5 kernel: [146061.155712]  [<ffffffff810782f0>] ? __migrate_task+0x170/0x170
Jul 29 05:01:28 b5 kernel: [146061.155758]  [<ffffffff813cefa4>] kernel_thread_helper+0x4/0x10
Jul 29 05:01:28 b5 kernel: [146061.155805]  [<ffffffff8106b750>] ? kthread_freezable_should_stop+0x70/0x70
Jul 29 05:01:28 b5 kernel: [146061.155854]  [<ffffffff813cefa0>] ? gs_change+0x13/0x13
Jul 29 05:01:28 b5 kernel: [146061.155960] INFO: task txg_quiesce:13876 blocked for more than 120 seconds.
Jul 29 05:01:28 b5 kernel: [146061.156009] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 29 05:01:28 b5 kernel: [146061.156083] txg_quiesce     D ffffffff81407b00     0 13876      2 0x00000000
Jul 29 05:01:28 b5 kernel: [146061.156137]  ffff8805f3f69d50 0000000000000046 ffff8805f3f69d20 0000000000000049
Jul 29 05:01:28 b5 kernel: [146061.156220]  ffff8805f4d34d40 ffff8805f3f69fd8 ffff8805f3f69fd8 ffff8805f3f69fd8
Jul 29 05:01:28 b5 kernel: [146061.156303]  ffff88062457ed70 ffff8805f4d34d40 ffff8805f3f69d60 ffffc900547b6c50
Jul 29 05:01:28 b5 kernel: [146061.156386] Call Trace:
Jul 29 05:01:28 b5 kernel: [146061.156422]  [<ffffffff813c503f>] schedule+0x3f/0x60
Jul 29 05:01:28 b5 kernel: [146061.156469]  [<ffffffffa03eb8c8>] cv_wait_common+0x98/0x190 [spl]
Jul 29 05:01:28 b5 kernel: [146061.156516]  [<ffffffff8106c240>] ? add_wait_queue+0x60/0x60
Jul 29 05:01:28 b5 kernel: [146061.156565]  [<ffffffffa03eb9f3>] __cv_wait+0x13/0x20 [spl]
Jul 29 05:01:28 b5 kernel: [146061.156632]  [<ffffffffa0512e5b>] txg_quiesce_thread+0x20b/0x3a0 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.156702]  [<ffffffffa0512c50>] ? txg_sync_thread+0x4d0/0x4d0 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.156753]  [<ffffffffa03e4f58>] thread_generic_wrapper+0x78/0x90 [spl]
Jul 29 05:01:28 b5 kernel: [146061.156805]  [<ffffffffa03e4ee0>] ? __thread_create+0x310/0x310 [spl]
Jul 29 05:01:28 b5 kernel: [146061.156853]  [<ffffffff8106b7e3>] kthread+0x93/0xa0
Jul 29 05:01:28 b5 kernel: [146061.156896]  [<ffffffff813cefa4>] kernel_thread_helper+0x4/0x10
Jul 29 05:01:28 b5 kernel: [146061.156942]  [<ffffffff8106b750>] ? kthread_freezable_should_stop+0x70/0x70
Jul 29 05:01:28 b5 kernel: [146061.156992]  [<ffffffff813cefa0>] ? gs_change+0x13/0x13
Jul 29 05:01:28 b5 kernel: [146061.157039] INFO: task rsync:5738 blocked for more than 120 seconds.
Jul 29 05:01:28 b5 kernel: [146061.157085] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 29 05:01:28 b5 kernel: [146061.157159] rsync           D 0000000000000000     0  5738   5737 0x00000000
Jul 29 05:01:28 b5 kernel: [146061.157214]  ffff8801b3987b38 0000000000000082 0000000000000000 0000000000000000
Jul 29 05:01:28 b5 kernel: [146061.157297]  ffff88062404ed70 ffff8801b3987fd8 ffff8801b3987fd8 ffff8801b3987fd8
Jul 29 05:01:28 b5 kernel: [146061.157380]  ffff880613526700 ffff88062404ed70 ffff8801b3987b48 ffff88062372a2e8
Jul 29 05:01:28 b5 kernel: [146061.157463] Call Trace:
Jul 29 05:01:28 b5 kernel: [146061.157498]  [<ffffffff813c503f>] schedule+0x3f/0x60
Jul 29 05:01:28 b5 kernel: [146061.157545]  [<ffffffffa03eb8c8>] cv_wait_common+0x98/0x190 [spl]
Jul 29 05:01:28 b5 kernel: [146061.157592]  [<ffffffff81074813>] ? __wake_up+0x53/0x70
Jul 29 05:01:28 b5 kernel: [146061.157636]  [<ffffffff8106c240>] ? add_wait_queue+0x60/0x60
Jul 29 05:01:28 b5 kernel: [146061.157685]  [<ffffffffa03eb9f3>] __cv_wait+0x13/0x20 [spl]
Jul 29 05:01:28 b5 kernel: [146061.157752]  [<ffffffffa0511d1b>] txg_wait_open+0x9b/0x170 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.157814]  [<ffffffffa04bb38d>] dmu_tx_wait+0xfd/0x150 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.157883]  [<ffffffffa0558cf7>] zfs_write+0x377/0xf80 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.157930]  [<ffffffff8104cab6>] ? current_fs_time+0x16/0x60
Jul 29 05:01:28 b5 kernel: [146061.157998]  [<ffffffffa0577472>] zpl_write_common+0x52/0x80 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.158068]  [<ffffffffa0577508>] zpl_write+0x68/0xa0 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.158113]  [<ffffffff81138cc3>] vfs_write+0xb3/0x180
Jul 29 05:01:28 b5 kernel: [146061.158156]  [<ffffffff81138fea>] sys_write+0x4a/0x90
Jul 29 05:01:28 b5 kernel: [146061.158203]  [<ffffffff813cdce9>] system_call_fastpath+0x16/0x1b
Jul 29 05:01:28 b5 kernel: [146061.158253] INFO: task rsync:8813 blocked for more than 120 seconds.
Jul 29 05:01:28 b5 kernel: [146061.158299] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 29 05:01:28 b5 kernel: [146061.158374] rsync           D 0000000000000000     0  8813   8812 0x00000000
Jul 29 05:01:28 b5 kernel: [146061.158428]  ffff8808e02f72a8 0000000000000086 0000000000000000 0000000000000000
Jul 29 05:01:28 b5 kernel: [146061.158511]  ffff880c23181350 ffff8808e02f7fd8 ffff8808e02f7fd8 ffff8808e02f7fd8
Jul 29 05:01:28 b5 kernel: [146061.158594]  ffff880c24126090 ffff880c23181350 ffff8808e02f72b8 ffff88062372a2e8
Jul 29 05:01:28 b5 kernel: [146061.158677] Call Trace:
Jul 29 05:01:28 b5 kernel: [146061.158712]  [<ffffffff813c503f>] schedule+0x3f/0x60
Jul 29 05:01:28 b5 kernel: [146061.158759]  [<ffffffffa03eb8c8>] cv_wait_common+0x98/0x190 [spl]
Jul 29 05:01:28 b5 kernel: [146061.158807]  [<ffffffff81074813>] ? __wake_up+0x53/0x70
Jul 29 05:01:28 b5 kernel: [146061.160052]  [<ffffffff8106c240>] ? add_wait_queue+0x60/0x60
Jul 29 05:01:28 b5 kernel: [146061.160100]  [<ffffffffa03eb9f3>] __cv_wait+0x13/0x20 [spl]
Jul 29 05:01:28 b5 kernel: [146061.160167]  [<ffffffffa0511d1b>] txg_wait_open+0x9b/0x170 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.160230]  [<ffffffffa04bb38d>] dmu_tx_wait+0xfd/0x150 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.160291]  [<ffffffffa04bb489>] dmu_tx_assign+0xa9/0x630 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.160360]  [<ffffffffa0559e46>] zfs_inactive+0x216/0x400 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.160407]  [<ffffffff810f018e>] ? truncate_pagecache+0x5e/0x70
Jul 29 05:01:28 b5 kernel: [146061.160476]  [<ffffffffa0578988>] zpl_evict_inode+0x28/0x30 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.160523]  [<ffffffff81152548>] evict+0xb8/0x1c0
Jul 29 05:01:28 b5 kernel: [146061.160566]  [<ffffffff81152be1>] dispose_list+0x41/0x50
Jul 29 05:01:28 b5 kernel: [146061.160610]  [<ffffffff81153615>] prune_icache_sb+0x185/0x340
Jul 29 05:01:28 b5 kernel: [146061.160656]  [<ffffffff8113bdfe>] prune_super+0xee/0x1a0
Jul 29 05:01:28 b5 kernel: [146061.160700]  [<ffffffff810f3c64>] shrink_slab+0x154/0x310
Jul 29 05:01:28 b5 kernel: [146061.160745]  [<ffffffff810f6752>] do_try_to_free_pages+0x1a2/0x420
Jul 29 05:01:28 b5 kernel: [146061.160792]  [<ffffffff810f6a7b>] try_to_free_pages+0xab/0x170
Jul 29 05:01:28 b5 kernel: [146061.160839]  [<ffffffff810eb0ad>] __alloc_pages_nodemask+0x49d/0x7f0
Jul 29 05:01:28 b5 kernel: [146061.160890]  [<ffffffffa03e1c13>] ? kmem_alloc_debug+0x263/0x350 [spl]
Jul 29 05:01:28 b5 kernel: [146061.160942]  [<ffffffff81121113>] alloc_pages_current+0x83/0xf0
Jul 29 05:01:28 b5 kernel: [146061.160989]  [<ffffffff810e6c0e>] __get_free_pages+0xe/0x40
Jul 29 05:01:28 b5 kernel: [146061.161037]  [<ffffffffa03ee912>] kv_alloc.isra.9+0x2c/0x88 [spl]
Jul 29 05:01:28 b5 kernel: [146061.161087]  [<ffffffffa03e247f>] spl_kmem_cache_alloc+0x59f/0xa90 [spl]
Jul 29 05:01:28 b5 kernel: [146061.161137]  [<ffffffff81037769>] ? default_spin_lock_flags+0x9/0x10
Jul 29 05:01:28 b5 kernel: [146061.161196]  [<ffffffffa049f7db>] ? dbuf_read+0x88b/0xe30 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.161258]  [<ffffffffa04c11a2>] dnode_create+0x42/0x1e0 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.161321]  [<ffffffffa04c3f83>] dnode_hold_impl+0x713/0xa10 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.161386]  [<ffffffffa04de413>] ? dsl_dir_tempreserve_impl+0x293/0x460 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.161462]  [<ffffffffa03e1b9b>] ? kmem_alloc_debug+0x1eb/0x350 [spl]
Jul 29 05:01:28 b5 kernel: [146061.161527]  [<ffffffffa04ce40a>] ? dsl_dataset_check_quota+0x10a/0x260 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.161615]  [<ffffffffa04b0ce4>] dmu_object_alloc+0xe4/0x1b0 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.161686]  [<ffffffffa05333df>] zap_create_norm+0x2f/0x60 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.161755]  [<ffffffffa055ed87>] zfs_mknode+0xcf7/0xe30 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.161820]  [<ffffffffa04dfb61>] ? dsl_dir_tempreserve_space+0x1e1/0x320 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.161914]  [<ffffffffa04f5758>] ? refcount_add_many+0x78/0x110 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.161984]  [<ffffffffa0511ba1>] ? txg_rele_to_quiesce+0x11/0x20 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.162047]  [<ffffffffa04bb8bc>] ? dmu_tx_assign+0x4dc/0x630 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.162118]  [<ffffffffa0556fdd>] zfs_mkdir+0x45d/0x5d0 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.162186]  [<ffffffffa0578272>] zpl_mkdir+0xa2/0x130 [zfs]
Jul 29 05:01:28 b5 kernel: [146061.162232]  [<ffffffff81144879>] vfs_mkdir+0x69/0xe0
Jul 29 05:01:28 b5 kernel: [146061.162275]  [<ffffffff8114899c>] sys_mkdirat+0xbc/0xd0
Jul 29 05:01:28 b5 kernel: [146061.162319]  [<ffffffff811489c9>] sys_mkdir+0x19/0x20
Jul 29 05:01:28 b5 kernel: [146061.162424]  [<ffffffff813cdce9>] system_call_fastpath+0x16/0x1b

An identical set of messages (except for timestamps) appeared 120 seconds after this one; the kernel log was then empty until I power cycled (the reboot command hung) at 11:08.

After the "blocked" messages the box made no further progress in the next 6 hours, until the power cycle.

The box has 48 GB of memory, with a few ZFS module parameters changed from their defaults:

zfs_arc_max=21474836480 zfs_arc_meta_limit=21474836480 zfs_arc_shrink_shift=3 zfs_arc_meta_prune=104857600
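
(For reference only: settings like these are usually made persistent via a modprobe.d options line; a minimal sketch is below, where the file name is just a convention and the values mirror the ones above.)

# /etc/modprobe.d/zfs.conf -- hypothetical example; values mirror those above
options zfs zfs_arc_max=21474836480 zfs_arc_meta_limit=21474836480 zfs_arc_shrink_shift=3 zfs_arc_meta_prune=104857600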

I had a zpool iostat 60 running from before the lockup; during the 6 hours following the lockup it showed occasional bursts of activity separated by hours of zero activity:

2012-07-29-04:56:05 pool2       38.4T  21.2T    242    863  3.57M  46.9M
2012-07-29-04:57:05 pool2       38.4T  21.2T    265  1.01K   469K  43.1M
2012-07-29-04:58:05 pool2       38.4T  21.2T    139    520   204K  19.2M
2012-07-29-04:59:05 pool2       38.4T  21.2T      0      0      0      0
2012-07-29-05:00:05 pool2       38.4T  21.2T      0      0      0      0
2012-07-29-05:01:05 pool2       38.4T  21.2T      0      0      0      0
...
2012-07-29-05:54:05 pool2       38.4T  21.2T      0      0  1.18K      0
...
2012-07-29-07:28:05 pool2       38.4T  21.2T      1      0  3.30K      0
...
2012-07-29-08:52:05 pool2       38.4T  21.2T      0      0    179      0
...
2012-07-29-10:42:06 pool2       38.4T  21.2T     11      0  52.7K      0
...
2012-07-29-10:53:06 pool2       38.4T  21.2T     14      0  72.4K      0
...
2012-07-29-11:05:06 pool2       38.4T  21.2T      0      0      0      0
2012-07-29-11:06:06 pool2       38.4T  21.2T      0      0      0      0
2012-07-29-11:07:06 pool2       38.4T  21.2T      0      0      0      0
<<< power-cycle >>>

I had a few hours of ARC stats logged at 5-minute intervals in a screen session, which showed little or no change in the hours preceding the power cycle, e.g.:

arc_meta_used                   4    16995899248
arc_meta_limit                  4    21474836480
arc_meta_max                    4    21514774104
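
(For anyone wanting to reproduce this kind of log, a rough sketch, assuming the usual kstat location /proc/spl/kstat/zfs/arcstats:)

# hypothetical logging loop: record the arc_meta_* counters every 5 minutes
while sleep 300; do
    date
    grep -E '^arc_meta_(used|limit|max)' /proc/spl/kstat/zfs/arcstats
done | tee -a arcstats.log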

Just before the power cycle, uptime showed a constant load average of around 7, and perf top showed ZFS wasn't active:

Events: 3K cycles
 23.84%  [kernel]                       [k] intel_idle
  4.37%  [kernel]                       [k] find_busiest_group
  2.21%  libc-2.13.so                   [.] 0x78e34
  2.18%  [kernel]                       [k] _raw_spin_unlock_irqrestore
  1.95%  [kernel]                       [k] __schedule
  1.77%  [kernel]                       [k] native_read_tsc
  1.54%  [kernel]                       [k] menu_select
  1.47%  [kernel]                       [k] __hrtimer_start_range_ns
  1.32%  [kernel]                       [k] tick_program_event
  1.27%  [kernel]                       [k] read_tsc
  1.26%  [kernel]                       [k] __ticket_spin_lock
  1.25%  [kernel]                       [k] leave_mm
  1.24%  [kernel]                       [k] rb_prev
  1.23%  libslang.so.2.2.4              [.] 0x8d1fa
  1.21%  [kernel]                       [k] cpuidle_idle_call
  1.01%  [kernel]                       [k] do_select
  0.96%  [kernel]                       [k] apic_timer_interrupt
  0.95%  [kernel]                       [k] getnstimeofday
  0.94%  [kernel]                       [k] pm_qos_request
  0.92%  [kernel]                       [k] find_next_bit
  0.89%  [kernel]                       [k] ktime_get_real
  0.82%  [kernel]                       [k] __remove_hrtimer
  0.80%  [kernel]                       [k] idle_cpu
  0.79%  [kernel]                       [k] cpumask_next_and
  0.73%  [kernel]                       [k] __switch_to

What further evidence can be gathered to help track down what's causing these lockups? (I'm happy to add debug code or whatever.)
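
(One thing I could try next time, assuming magic sysrq is enabled, is dumping task stacks to the kernel log while the box is wedged, e.g.:)

# dump all task states/stacks (sysrq 't'), or only D-state tasks (sysrq 'w')
echo t > /proc/sysrq-trigger
echo w > /proc/sysrq-trigger
dmesg > /tmp/stacks-after-hang.txt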

@chrisrd

My naive reading of this says we have three tasks trying to free up memory by evicting inodes via zpl_evict_inode(), but to do so they're waiting for a transaction group to open:

[146061.152476] INFO: task kswapd0:78 blocked for more than 120 seconds.
[146061.153245]  [<ffffffffa0511d1b>] txg_wait_open+0x9b/0x170 [zfs]
[146061.153307]  [<ffffffffa04bb38d>] dmu_tx_wait+0xfd/0x150 [zfs]
[146061.153369]  [<ffffffffa04bb489>] dmu_tx_assign+0xa9/0x630 [zfs]
[146061.153440]  [<ffffffffa0559e46>] zfs_inactive+0x216/0x400 [zfs]
[146061.153488]  [<ffffffff810f018e>] ? truncate_pagecache+0x5e/0x70
[146061.153558]  [<ffffffffa0578988>] zpl_evict_inode+0x28/0x30 [zfs]
[146061.153607]  [<ffffffff81152548>] evict+0xb8/0x1c0
[146061.153650]  [<ffffffff81152be1>] dispose_list+0x41/0x50
[146061.153695]  [<ffffffff81153615>] prune_icache_sb+0x185/0x340
[146061.153743]  [<ffffffff8113bdfe>] prune_super+0xee/0x1a0
[146061.153789]  [<ffffffff810f3c64>] shrink_slab+0x154/0x310

[146061.154197] INFO: task kswapd1:79 blocked for more than 120 seconds.
[146061.154911]  [<ffffffffa0511d1b>] txg_wait_open+0x9b/0x170 [zfs]
[146061.154973]  [<ffffffffa04bb38d>] dmu_tx_wait+0xfd/0x150 [zfs]
[146061.155034]  [<ffffffffa04bb489>] dmu_tx_assign+0xa9/0x630 [zfs]
[146061.155103]  [<ffffffffa0559e46>] zfs_inactive+0x216/0x400 [zfs]
[146061.155150]  [<ffffffff810f018e>] ? truncate_pagecache+0x5e/0x70
[146061.155219]  [<ffffffffa0578988>] zpl_evict_inode+0x28/0x30 [zfs]
[146061.155266]  [<ffffffff81152548>] evict+0xb8/0x1c0
[146061.155309]  [<ffffffff81152be1>] dispose_list+0x41/0x50
[146061.155354]  [<ffffffff81153615>] prune_icache_sb+0x185/0x340
[146061.155399]  [<ffffffff8113bdfe>] prune_super+0xee/0x1a0
[146061.155444]  [<ffffffff810f3c64>] shrink_slab+0x154/0x310

[146061.158253] INFO: task rsync:8813 blocked for more than 120 seconds.
[146061.160167]  [<ffffffffa0511d1b>] txg_wait_open+0x9b/0x170 [zfs]
[146061.160230]  [<ffffffffa04bb38d>] dmu_tx_wait+0xfd/0x150 [zfs]
[146061.160291]  [<ffffffffa04bb489>] dmu_tx_assign+0xa9/0x630 [zfs]
[146061.160360]  [<ffffffffa0559e46>] zfs_inactive+0x216/0x400 [zfs]
[146061.160407]  [<ffffffff810f018e>] ? truncate_pagecache+0x5e/0x70
[146061.160476]  [<ffffffffa0578988>] zpl_evict_inode+0x28/0x30 [zfs]
[146061.160523]  [<ffffffff81152548>] evict+0xb8/0x1c0
[146061.160566]  [<ffffffff81152be1>] dispose_list+0x41/0x50
[146061.160610]  [<ffffffff81153615>] prune_icache_sb+0x185/0x340
[146061.160656]  [<ffffffff8113bdfe>] prune_super+0xee/0x1a0
[146061.160700]  [<ffffffff810f3c64>] shrink_slab+0x154/0x310

We have another task also waiting for a transaction group to open, this time from zfs_write():

[146061.157039] INFO: task rsync:5738 blocked for more than 120 seconds.
[146061.157752]  [<ffffffffa0511d1b>] txg_wait_open+0x9b/0x170 [zfs]
[146061.157814]  [<ffffffffa04bb38d>] dmu_tx_wait+0xfd/0x150 [zfs]
[146061.157883]  [<ffffffffa0558cf7>] zfs_write+0x377/0xf80 [zfs]

Then we have one more task which I guess is involved in freeing up transaction groups but is also stalled:

[146061.155960] INFO: task txg_quiesce:13876 blocked for more than 120 seconds.
[146061.156632]  [<ffffffffa0512e5b>] txg_quiesce_thread+0x20b/0x3a0 [zfs]
[146061.156702]  [<ffffffffa0512c50>] ? txg_sync_thread+0x4d0/0x4d0 [zfs]
[146061.156753]  [<ffffffffa03e4f58>] thread_generic_wrapper+0x78/0x90 [spl]
[146061.156805]  [<ffffffffa03e4ee0>] ? __thread_create+0x310/0x310 [spl]
[146061.156853]  [<ffffffff8106b7e3>] kthread+0x93/0xa0

I'm having difficulty following the code flow here, e.g. the stack trace shows txg_quiesce_thread() called from txg_sync_thread(), although that doesn't happen (directly) in the code (the '?' entries in these traces are just addresses found on the stack, so they may not be real callers). And the various threads and mutexes etc. are making my brain hurt. However, it looks like txg_quiesce_thread() is probably waiting on txg_sync_thread(), which runs in another task.

So it may be that the problem lies in txg_sync_thread() getting hung up somehow.

I see that txg_sync_thread() has:

#ifdef _KERNEL
        /*
         * Disable the normal reclaim path for the txg_sync thread.  This
         * ensures the thread will never enter dmu_tx_assign() which can
         * otherwise occur due to direct reclaim.  If this is allowed to
         * happen the system can deadlock.  Direct reclaim call path:
         *
         *   ->shrink_icache_memory->prune_icache->dispose_list->
         *   clear_inode->zpl_clear_inode->zfs_inactive->dmu_tx_assign
         */
        current->flags |= PF_MEMALLOC;
#endif /* _KERNEL */

...and, per issues zfsonlinux/spl#116 and #726, it seems that PF_MEMALLOC can be problematic (although I'm not seeing any page allocation failure messages).

Is this pointing towards the problem, or am I completely off track here?

@behlendorf
Owner

You're on the right track. Those threads are all waiting on the next txg, and the thread that's responsible for moving those txgs forward is called txg_sync. It's likely holding the mutex in question and is blocked on something. Often this is related to allocating memory, or a reclaim path, which is where the PF_MEMALLOC issues come into play. Anyway, if you can still get the txg_sync stack we can determine what that something is.

@chrisrd

Unfortunately I didn't think to dump the stacks before the reboot, but I'll do that if/when I see the (or any) problem again.
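
(A rough sketch of how I could grab just that stack next time, assuming /proc/<pid>/stack is available on this kernel:)

# dump the kernel stack of each txg_sync thread (one per pool)
for pid in $(pgrep -x txg_sync); do
    echo "=== txg_sync pid $pid ==="
    cat /proc/$pid/stack
done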

In the meantime I've applied Ryao's PF_MEMALLOC ZoL and kernel patches from #726 (updating to linux-3.4 in the process).

I had another odd, possibly related occurrence when trying to reboot into the new kernel: two unkillable rsync processes, with ps showing both in state D with WCHAN cv_wait_common, i.e. they were stuck in ZoL. Dumping everything in /proc/spl a minute apart showed some seemingly minor differences (full dumps available if required):

# /proc/spl/kstat/zfs/fm
-0 1 0x01 4 192 198141201357 77210358545308
+0 1 0x01 4 192 198141201357 77270403744022
-zio_cache                             0x00020    311296    245616    16384      952     19    17  4510    304   258 72160
+zio_cache                             0x00020    409600    367472    16384      952     25    25  4510    400   386 72160

# /proc/sys/kernel/spl/kmem/slab_kmem_alloc
-23308995616
+23309117472

# /proc/sys/kernel/spl/kmem/slab_kmem_total
-29162696704
+29162795008

# /proc/sys/kernel/spl/vm/availrmem
-1352620
+1352585

# /proc/sys/kernel/spl/vm/freemem
-1321846
+1321799

With no other signs of life after about 15 minutes, I just power cycled.
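
(For completeness, snapshots like the ones above can be taken with something along these lines; the paths are the ones shown above and the script is only a sketch:)

# hypothetical snapshot script: copy the SPL proc entries for later diffing
ts=$(date +%H%M%S)
for d in /proc/spl /proc/sys/kernel/spl; do
    find "$d" -type f 2>/dev/null | while read -r f; do
        mkdir -p "/tmp/spl-$ts$(dirname "$f")"
        cat "$f" > "/tmp/spl-$ts$f" 2>/dev/null
    done
done
# repeat a minute later with a new timestamp, then: diff -ru /tmp/spl-<t1> /tmp/spl-<t2>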

@ryao

Try the patches in zfsonlinux/spl#147.

@behlendorf
Owner

Closing issue. This is believed to have been resolved by the recent memory management improvement which will be part of -rc11.

@behlendorf behlendorf closed this
@Stoatwblr

Still getting it on rc12

@maxximino

@Stoatwblr are you seeing exactly the same stack traces? If not, can you post yours? What about rc13?

@Stoatwblr

It hasn't crashed - yet - but it has on previous occasions. Right now spl_kmem_cache is eating all available CPU.

Update: It was like that for about 2 hours and then recovered.

If it's any help to the devs: this is also happening with an rsync (ZFS to ZFS). It's about 500 GB, with a shade over 5 million files involved. I've had it happen about a dozen times while rsyncing, and this is the first time it hasn't needed a reboot to recover (new ZFS drivers were installed 4 days ago at the last crash).

The source area has compression and dedupe off. Target has them both on.

ZFS version is 0.6.0.91-0ubuntu1~quantal1, 8 GB RAM, ARC limited to 4 GB, kernel 3.7.0-7.

Because it recovered, I doubt stack traces will help much.

Let me know if there's anything I can supply which will help out.
