0.6.3-1 zfs: fop (maybe write or getxattr) soft lockup, txg_quiesce hung, and other fops hung #2897

Closed
inevity opened this issue Nov 14, 2014 · 4 comments

@inevity

inevity commented Nov 14, 2014

CentOS, kernel 2.6.32-358.el6.x86_64
ZFS 0.6.3-1
No crash/kdump information, only the dmesg log.

Some questions:
1. The soft lockup call trace entries are all question marks; how can we confirm the actual call chain? And why does the soft lockup report contain two identical copies of the same call trace?
2. Can we make any progress with only the information supplied here?

Nov 12 22:10:11 node1 kernel: BUG: soft lockup - CPU#2 stuck for 67s! [glusterfsd:17385]
Nov 12 22:10:11 node1 kernel: Modules linked in: zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) spl(U) zlib_deflate tcp_diag inet_diag bonding 8021q garp stp llc ipv6 vhost_net macvtap macvlan tun kvm_intel kvm uinput acpi_pad sg microcode sb_edac edac_core iTCO_wdt iTCO_vendor_support shpchp e1000e i2c_i801 i2c_core igb dca ptp pps_core ext3 jbd mbcache sd_mod crc_t10dif ahci isci libsas mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash dm_log dm_mod [last unloaded: zlib_deflate]
Nov 12 22:10:11 node1 kernel: CPU 2
Nov 12 22:10:11 node1 kernel: Modules linked in: zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) spl(U) zlib_deflate tcp_diag inet_diag bonding 8021q garp stp llc ipv6 vhost_net macvtap macvlan tun kvm_intel kvm uinput acpi_pad sg microcode sb_edac edac_core iTCO_wdt iTCO_vendor_support shpchp e1000e i2c_i801 i2c_core igb dca ptp pps_core ext3 jbd mbcache sd_mod crc_t10dif ahci isci libsas mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash dm_log dm_mod [last unloaded: zlib_deflate]
Nov 12 22:10:11 node1 kernel:
Nov 12 22:10:11 node1 kernel: Pid: 17385, comm: glusterfsd Tainted: P --------------- 2.6.32-358.el6.x86_64 #1 Sugon Romley-EN/60R8U2
Nov 12 22:10:11 node1 kernel: RIP: 0010:[] [] _spin_lock+0x1e/0x30
Nov 12 22:10:11 node1 kernel: RSP: 0018:ffff88028d135a78 EFLAGS: 00000202
Nov 12 22:10:11 node1 kernel: RAX: 0000000000000000 RBX: ffff88028d135a78 RCX: 0000000000000000
Nov 12 22:10:11 node1 kernel: RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff8806bc744de4
Nov 12 22:10:11 node1 kernel: RBP: ffffffff8100bb8e R08: 0000000000000000 R09: ffffc90212254000
Nov 12 22:10:11 node1 kernel: R10: ffff88028d1359d8 R11: ffff8803853080b0 R12: ffff880600000040
Nov 12 22:10:11 node1 kernel: R13: 0000000000000000 R14: ffff880385308048 R15: ffff8803853080b8
Nov 12 22:10:11 node1 kernel: FS: 00007f48ef600700(0000) GS:ffff880028300000(0000) knlGS:0000000000000000
Nov 12 22:10:11 node1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 12 22:10:11 node1 kernel: CR2: 0000000001822498 CR3: 0000000870a94000 CR4: 00000000000407e0
Nov 12 22:10:11 node1 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 12 22:10:11 node1 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Nov 12 22:10:11 node1 kernel: Process glusterfsd (pid: 17385, threadinfo ffff88028d134000, task ffff8808562e2ae0)
Nov 12 22:10:11 node1 kernel: Stack:
Nov 12 22:10:11 node1 kernel: ffff88028d135a98 ffffffff8150e935 ffff8806bc744d88 0000000000000029
Nov 12 22:10:11 node1 kernel: ffff88028d135aa8 ffffffff8150e90b ffff88028d135b58 ffffffffa040f547
Nov 12 22:10:11 node1 kernel: ffff88028d135b08 ffff8803f0745b30 ffff8803f0745b30 0000000000000000
Nov 12 22:10:11 node1 kernel: Call Trace:
Nov 12 22:10:11 node1 kernel: [] ? __mutex_unlock_slowpath+0x25/0x60
Nov 12 22:10:11 node1 kernel: [] ? mutex_unlock+0x1b/0x20
Nov 12 22:10:11 node1 kernel: [] ? dbuf_read+0x3d7/0x770 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? dmu_buf_will_dirty+0xa4/0x100 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? dmu_write_uio_dnode+0x70/0x140 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? dmu_write_uio_dbuf+0x46/0x60 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? zfs_write+0xb95/0xcf0 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? __zpl_xattr_get+0x185/0x200 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? vfs_getxattr+0xae/0xe0
Nov 12 22:10:11 node1 kernel: [] ? zfs_getattr_fast+0xe7/0x160 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? zpl_write+0x9b/0xe0 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? vfs_write+0xb8/0x1a0
Nov 12 22:10:11 node1 kernel: [] ? sys_pwrite64+0x82/0xa0
Nov 12 22:10:11 node1 kernel: [] ? system_call_fastpath+0x16/0x1b
Nov 12 22:10:11 node1 kernel: Code: 00 00 00 01 74 05 e8 b2 33 d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89
Nov 12 22:10:11 node1 kernel: Call Trace:
Nov 12 22:10:11 node1 kernel: [] ? __mutex_unlock_slowpath+0x25/0x60
Nov 12 22:10:11 node1 kernel: [] ? mutex_unlock+0x1b/0x20
Nov 12 22:10:11 node1 kernel: [] ? dbuf_read+0x3d7/0x770 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? dmu_buf_will_dirty+0xa4/0x100 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? dmu_write_uio_dnode+0x70/0x140 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? dmu_write_uio_dbuf+0x46/0x60 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? zfs_write+0xb95/0xcf0 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? __zpl_xattr_get+0x185/0x200 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? vfs_getxattr+0xae/0xe0
Nov 12 22:10:11 node1 kernel: [] ? zfs_getattr_fast+0xe7/0x160 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? zpl_write+0x9b/0xe0 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? vfs_write+0xb8/0x1a0
Nov 12 22:10:11 node1 kernel: [] ? sys_pwrite64+0x82/0xa0
Nov 12 22:10:11 node1 kernel: [] ? system_call_fastpath+0x16/0x1b
Nov 12 22:11:24 node1 kernel: INFO: task txg_quiesce:16538 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: txg_quiesce D 0000000000000000 0 16538 2 0x00000000
Nov 12 22:11:24 node1 kernel: ffff8803c5dfbd40 0000000000000046 0000000000000000 ffff8804dcf5da00
Nov 12 22:11:24 node1 kernel: 0000000000000049 0000000000000286 ffff8803c5dfbde0 ffffffffa0348a07
Nov 12 22:11:24 node1 kernel: ffff880141bfdaf8 ffff8803c5dfbfd8 000000000000fb88 ffff880141bfdaf8
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] ? spl_debug_msg+0x427/0x9d0 [spl]
Nov 12 22:11:24 node1 kernel: [] cv_wait_common+0x105/0x1c0 [spl]
Nov 12 22:11:24 node1 kernel: [] ? autoremove_wake_function+0x0/0x40
Nov 12 22:11:24 node1 kernel: [] ? __bitmap_weight+0x8c/0xb0
Nov 12 22:11:24 node1 kernel: [] __cv_wait+0x15/0x20 [spl]
Nov 12 22:11:24 node1 kernel: [] txg_quiesce_thread+0x21b/0x370 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? set_user_nice+0xc9/0x130
Nov 12 22:11:24 node1 kernel: [] ? txg_quiesce_thread+0x0/0x370 [zfs]
Nov 12 22:11:24 node1 kernel: [] thread_generic_wrapper+0x68/0x80 [spl]
Nov 12 22:11:24 node1 kernel: [] ? thread_generic_wrapper+0x0/0x80 [spl]
Nov 12 22:11:24 node1 kernel: [] kthread+0x96/0xa0
Nov 12 22:11:24 node1 kernel: [] child_rip+0xa/0x20
Nov 12 22:11:24 node1 kernel: [] ? kthread+0x0/0xa0
Nov 12 22:11:24 node1 kernel: [] ? child_rip+0x0/0x20
Nov 12 22:11:24 node1 kernel: INFO: task glusterfsd:17366 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: glusterfsd D 0000000000000001 0 17366 1 0x00000000
Nov 12 22:11:24 node1 kernel: ffff8804df141878 0000000000000082 ffff8804df141840 ffff8804df14183c
Nov 12 22:11:24 node1 kernel: ffff8804df1417f8 ffff88087fc23280 ffff880028316700 00000000000003fd
Nov 12 22:11:24 node1 kernel: ffff88036cf07058 ffff8804df141fd8 000000000000fb88 ffff88036cf07058
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
Nov 12 22:11:24 node1 kernel: [] mutex_lock+0x2b/0x50
Nov 12 22:11:24 node1 kernel: [] dbuf_find+0x5b/0x100 [zfs]
Nov 12 22:11:24 node1 kernel: [] __dbuf_hold_impl+0xa1/0x4a0 [zfs]
Nov 12 22:11:24 node1 kernel: [] dbuf_hold_impl+0x86/0xc0 [zfs]
Nov 12 22:11:24 node1 kernel: [] dbuf_hold+0x20/0x30 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_buf_hold_noread+0x87/0x180 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_buf_hold+0x3b/0x90 [zfs]
Nov 12 22:11:24 node1 kernel: [] zap_get_leaf_byblk+0x4f/0x2c0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? dmu_object_info_from_dnode+0xa5/0x100 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? mutex_lock+0x1e/0x50
Nov 12 22:11:24 node1 kernel: [] zap_deref_leaf+0x6a/0x80 [zfs]
Nov 12 22:11:24 node1 kernel: [] fzap_cursor_retrieve+0xfc/0x2a0 [zfs]
Nov 12 22:11:24 node1 kernel: [] zap_cursor_retrieve+0x17b/0x2f0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? sa_lookup_impl+0x13/0x20 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? filldir+0x7d/0xe0
Nov 12 22:11:24 node1 kernel: [] zfs_readdir+0x17f/0x530 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? nameidata_to_filp+0x54/0x70
Nov 12 22:11:24 node1 kernel: [] ? do_filp_open+0x72a/0xdd0
Nov 12 22:11:24 node1 kernel: [] zpl_readdir+0x58/0x80 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? filldir+0x0/0xe0
Nov 12 22:11:24 node1 kernel: [] ? filldir+0x0/0xe0
Nov 12 22:11:24 node1 kernel: [] vfs_readdir+0xc0/0xe0
Nov 12 22:11:24 node1 kernel: [] ? generic_file_llseek+0x5b/0x70
Nov 12 22:11:24 node1 kernel: [] sys_getdents+0x89/0xf0
Nov 12 22:11:24 node1 kernel: [] system_call_fastpath+0x16/0x1b
Nov 12 22:11:24 node1 kernel: INFO: task glusterfsd:17367 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: glusterfsd D 0000000000000000 0 17367 1 0x00000000
Nov 12 22:11:24 node1 kernel: ffff8806bec6b968 0000000000000082 0000000000000000 ffff8803a25c0000
Nov 12 22:11:24 node1 kernel: ffff8806bec6b9e8 ffffffffa034e0ba ffff8806bec6b918 0000000000000246
Nov 12 22:11:24 node1 kernel: ffff880720337098 ffff8806bec6bfd8 000000000000fb88 ffff880720337098
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] ? spl_kmem_cache_alloc+0x9a/0x1070 [spl]
Nov 12 22:11:24 node1 kernel: [] rwsem_down_failed_common+0x95/0x1d0
Nov 12 22:11:24 node1 kernel: [] rwsem_down_write_failed+0x23/0x30
Nov 12 22:11:24 node1 kernel: [] call_rwsem_down_write_failed+0x13/0x20
Nov 12 22:11:24 node1 kernel: [] ? down_write+0x32/0x40
Nov 12 22:11:24 node1 kernel: [] dnode_new_blkid+0xcf/0x310 [zfs]
Nov 12 22:11:24 node1 kernel: [] dbuf_dirty+0x61e/0x820 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_buf_will_dirty+0xaf/0x100 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_write_uio_dnode+0x70/0x140 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_write_uio_dbuf+0x46/0x60 [zfs]
Nov 12 22:11:24 node1 kernel: [] zfs_write+0xb95/0xcf0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __zpl_xattr_get+0x185/0x200 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? vfs_getxattr+0xae/0xe0
Nov 12 22:11:24 node1 kernel: [] ? zfs_getattr_fast+0xe7/0x160 [zfs]
Nov 12 22:11:24 node1 kernel: [] zpl_write+0x9b/0xe0 [zfs]
Nov 12 22:11:24 node1 kernel: [] vfs_write+0xb8/0x1a0
Nov 12 22:11:24 node1 kernel: [] sys_pwrite64+0x82/0xa0
Nov 12 22:11:24 node1 kernel: [] system_call_fastpath+0x16/0x1b
Nov 12 22:11:24 node1 kernel: INFO: task glusterfsd:17379 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: glusterfsd D 0000000000000001 0 17379 1 0x00000000
Nov 12 22:11:24 node1 kernel: ffff88047294b980 0000000000000082 0000000000000000 ffff880874a9b9c0
Nov 12 22:11:24 node1 kernel: 0000000000000000 ffff88017867d800 000000000000000e ffff8805160c13a8
Nov 12 22:11:24 node1 kernel: ffff8806805dfaf8 ffff88047294bfd8 000000000000fb88 ffff8806805dfaf8
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] ? ip_output+0xb8/0xc0
Nov 12 22:11:24 node1 kernel: [] rwsem_down_failed_common+0x95/0x1d0
Nov 12 22:11:24 node1 kernel: [] ? down_read+0x16/0x30
Nov 12 22:11:24 node1 kernel: [] ? dmu_zfetch+0x32b/0xe30 [zfs]
Nov 12 22:11:24 node1 kernel: [] rwsem_down_read_failed+0x26/0x30
Nov 12 22:11:24 node1 kernel: [] call_rwsem_down_read_failed+0x14/0x30
Nov 12 22:11:24 node1 kernel: [] ? down_read+0x24/0x30
Nov 12 22:11:24 node1 kernel: [] dmu_tx_check_ioerr+0x35/0x140 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_count_write+0x409/0x6f0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __kmalloc+0x20c/0x220
Nov 12 22:11:24 node1 kernel: [] ? kmem_alloc_debug+0x8f/0x4c0 [spl]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_hold_write+0x4f/0x70 [zfs]
Nov 12 22:11:24 node1 kernel: [] zfs_write+0x406/0xcf0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __zpl_xattr_get+0x185/0x200 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? strfree+0xe/0x10 [spl]
Nov 12 22:11:24 node1 kernel: [] ? vfs_getxattr+0xae/0xe0
Nov 12 22:11:24 node1 kernel: [] ? zfs_getattr_fast+0xe7/0x160 [zfs]
Nov 12 22:11:24 node1 kernel: [] zpl_write+0x9b/0xe0 [zfs]
Nov 12 22:11:24 node1 kernel: [] vfs_write+0xb8/0x1a0
Nov 12 22:11:24 node1 kernel: [] sys_pwrite64+0x82/0xa0
Nov 12 22:11:24 node1 kernel: [] system_call_fastpath+0x16/0x1b
Nov 12 22:11:24 node1 kernel: INFO: task glusterfsd:17380 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: glusterfsd D 0000000000000001 0 17380 1 0x00000000
Nov 12 22:11:24 node1 kernel: ffff880285729980 0000000000000082 0000000000000000 0000000000000000
Nov 12 22:11:24 node1 kernel: 0000000000000000 ffff88023db2b200 000000000000000e ffff88061a7b96a8
Nov 12 22:11:24 node1 kernel: ffff88086e1c8638 ffff880285729fd8 000000000000fb88 ffff88086e1c8638
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] ? ip_output+0xb8/0xc0
Nov 12 22:11:24 node1 kernel: [] rwsem_down_failed_common+0x95/0x1d0
Nov 12 22:11:24 node1 kernel: [] ? down_read+0x16/0x30
Nov 12 22:11:24 node1 kernel: [] ? dmu_zfetch+0x32b/0xe30 [zfs]
Nov 12 22:11:24 node1 kernel: [] rwsem_down_read_failed+0x26/0x30
Nov 12 22:11:24 node1 kernel: [] call_rwsem_down_read_failed+0x14/0x30
Nov 12 22:11:24 node1 kernel: [] ? down_read+0x24/0x30
Nov 12 22:11:24 node1 kernel: [] dmu_tx_check_ioerr+0x35/0x140 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_count_write+0x409/0x6f0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __kmalloc+0x20c/0x220
Nov 12 22:11:24 node1 kernel: [] ? kmem_alloc_debug+0x8f/0x4c0 [spl]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_hold_write+0x4f/0x70 [zfs]
Nov 12 22:11:24 node1 kernel: [] zfs_write+0x406/0xcf0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __zpl_xattr_get+0x185/0x200 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? strfree+0xe/0x10 [spl]
Nov 12 22:11:24 node1 kernel: [] ? vfs_getxattr+0xae/0xe0
Nov 12 22:11:24 node1 kernel: [] ? zfs_getattr_fast+0xe7/0x160 [zfs]
Nov 12 22:11:24 node1 kernel: [] zpl_write+0x9b/0xe0 [zfs]
Nov 12 22:11:24 node1 kernel: [] vfs_write+0xb8/0x1a0
Nov 12 22:11:24 node1 kernel: [] sys_pwrite64+0x82/0xa0
Nov 12 22:11:24 node1 kernel: [] system_call_fastpath+0x16/0x1b
Nov 12 22:11:24 node1 kernel: INFO: task glusterfsd:17381 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: glusterfsd D 0000000000000000 0 17381 1 0x00000000
Nov 12 22:11:24 node1 kernel: ffff88028572b918 0000000000000082 ffff88028572b8e0 ffff88028572b8dc
Nov 12 22:11:24 node1 kernel: ffff88086f5f4418 ffff88087fc23080 ffff880028296700 0000000000000400
Nov 12 22:11:24 node1 kernel: ffff8803b1893058 ffff88028572bfd8 000000000000fb88 ffff8803b1893058
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
Nov 12 22:11:24 node1 kernel: [] ? dbuf_read+0x49f/0x770 [zfs]
Nov 12 22:11:24 node1 kernel: [] mutex_lock+0x2b/0x50
Nov 12 22:11:24 node1 kernel: [] dbuf_find+0x5b/0x100 [zfs]
Nov 12 22:11:24 node1 kernel: [] dbuf_prefetch+0x65/0x270 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_zfetch_dofetch+0xfe/0x170 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_zfetch+0x7f8/0xe30 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? arc_read+0x5f3/0x910 [zfs]
Nov 12 22:11:24 node1 kernel: [] dbuf_read+0x407/0x770 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_buf_hold_array_by_dnode+0x139/0x530 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_buf_hold_array+0x65/0x90 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? avl_add+0x38/0x50 [zavl]
Nov 12 22:11:24 node1 kernel: [] dmu_read_uio+0x41/0xd0 [zfs]
Nov 12 22:11:24 node1 kernel: [] zfs_read+0x178/0x4c0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? do_futex+0x100/0xb60
Nov 12 22:11:24 node1 kernel: [] zpl_read+0x9b/0xe0 [zfs]
Nov 12 22:11:24 node1 kernel: [] vfs_read+0xb5/0x1a0
Nov 12 22:11:24 node1 kernel: [] sys_pread64+0x82/0xa0
Nov 12 22:11:24 node1 kernel: [] system_call_fastpath+0x16/0x1b
Nov 12 22:11:24 node1 kernel: INFO: task glusterfsd:17382 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: glusterfsd D 0000000000000000 0 17382 1 0x00000000
Nov 12 22:11:24 node1 kernel: ffff8802ae51d980 0000000000000082 0000000000000000 ffff8802ae51d918
Nov 12 22:11:24 node1 kernel: 0000000000000000 ffff88086e3ce080 000000000000000e ffff88081f934468
Nov 12 22:11:24 node1 kernel: ffff8807c481e638 ffff8802ae51dfd8 000000000000fb88 ffff8807c481e638
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] ? ip_output+0xb8/0xc0
Nov 12 22:11:24 node1 kernel: [] rwsem_down_failed_common+0x95/0x1d0
Nov 12 22:11:24 node1 kernel: [] ? down_read+0x16/0x30
Nov 12 22:11:24 node1 kernel: [] ? dmu_zfetch+0x32b/0xe30 [zfs]
Nov 12 22:11:24 node1 kernel: [] rwsem_down_read_failed+0x26/0x30
Nov 12 22:11:24 node1 kernel: [] call_rwsem_down_read_failed+0x14/0x30
Nov 12 22:11:24 node1 kernel: [] ? down_read+0x24/0x30
Nov 12 22:11:24 node1 kernel: [] dmu_tx_check_ioerr+0x35/0x140 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_count_write+0x409/0x6f0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __kmalloc+0x20c/0x220
Nov 12 22:11:24 node1 kernel: [] ? kmem_alloc_debug+0x8f/0x4c0 [spl]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_hold_write+0x4f/0x70 [zfs]
Nov 12 22:11:24 node1 kernel: [] zfs_write+0x406/0xcf0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __zpl_xattr_get+0x185/0x200 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? strfree+0xe/0x10 [spl]
Nov 12 22:11:24 node1 kernel: [] ? vfs_getxattr+0xae/0xe0
Nov 12 22:11:24 node1 kernel: [] ? zfs_getattr_fast+0xe7/0x160 [zfs]
Nov 12 22:11:24 node1 kernel: [] zpl_write+0x9b/0xe0 [zfs]
Nov 12 22:11:24 node1 kernel: [] vfs_write+0xb8/0x1a0
Nov 12 22:11:24 node1 kernel: [] sys_pwrite64+0x82/0xa0
Nov 12 22:11:24 node1 kernel: [] system_call_fastpath+0x16/0x1b
Nov 12 22:11:24 node1 kernel: INFO: task glusterfsd:17383 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: glusterfsd D 0000000000000001 0 17383 1 0x00000000
Nov 12 22:11:24 node1 kernel: ffff88027e0cd8e8 0000000000000082 ffff88021d643b58 0000000000000242
Nov 12 22:11:24 node1 kernel: 0000000000000001 ffff8804aafa97e8 ffffffff8100bb8e ffff88027e0cd8e8
Nov 12 22:11:24 node1 kernel: ffff880852e825f8 ffff88027e0cdfd8 000000000000fb88 ffff880852e825f8
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] ? apic_timer_interrupt+0xe/0x20
Nov 12 22:11:24 node1 kernel: [] ? mutex_spin_on_owner+0x8d/0xc0
Nov 12 22:11:24 node1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
Nov 12 22:11:24 node1 kernel: [] mutex_lock+0x2b/0x50
Nov 12 22:11:24 node1 kernel: [] dbuf_find+0x5b/0x100 [zfs]
Nov 12 22:11:24 node1 kernel: [] __dbuf_hold_impl+0xa1/0x4a0 [zfs]
Nov 12 22:11:24 node1 kernel: [] dbuf_hold_impl+0x86/0xc0 [zfs]
Nov 12 22:11:24 node1 kernel: [] dbuf_hold_level+0x1f/0x30 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_check_ioerr+0x4a/0x140 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_count_write+0x409/0x6f0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __kmalloc+0x20c/0x220
Nov 12 22:11:24 node1 kernel: [] ? kmem_alloc_debug+0x8f/0x4c0 [spl]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_hold_write+0x4f/0x70 [zfs]
Nov 12 22:11:24 node1 kernel: [] zfs_write+0x406/0xcf0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __zpl_xattr_get+0x185/0x200 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? strfree+0xe/0x10 [spl]
Nov 12 22:11:24 node1 kernel: [] ? vfs_getxattr+0xae/0xe0
Nov 12 22:11:24 node1 kernel: [] ? zfs_getattr_fast+0xe7/0x160 [zfs]
Nov 12 22:11:24 node1 kernel: [] zpl_write+0x9b/0xe0 [zfs]
Nov 12 22:11:24 node1 kernel: [] vfs_write+0xb8/0x1a0
Nov 12 22:11:24 node1 kernel: [] sys_pwrite64+0x82/0xa0
Nov 12 22:11:24 node1 kernel: [] system_call_fastpath+0x16/0x1b
Nov 12 22:11:24 node1 kernel: INFO: task glusterfsd:17384 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: glusterfsd D 0000000000000001 0 17384 1 0x00000000
Nov 12 22:11:24 node1 kernel: ffff8802f7fa1ba8 0000000000000082 ffff8802f7fa1b70 ffff8802f7fa1b6c
Nov 12 22:11:24 node1 kernel: ffff8807518b7b90 ffff88087fc23280 ffff880028316700 00000000000002aa
Nov 12 22:11:24 node1 kernel: ffff880850427098 ffff8802f7fa1fd8 000000000000fb88 ffff880850427098
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
Nov 12 22:11:24 node1 kernel: [] ? __d_lookup+0xa7/0x150
Nov 12 22:11:24 node1 kernel: [] mutex_lock+0x2b/0x50
Nov 12 22:11:24 node1 kernel: [] do_lookup+0x11b/0x230
Nov 12 22:11:24 node1 kernel: [] __link_path_walk+0x734/0x1030
Nov 12 22:11:24 node1 kernel: [] path_walk+0x6a/0xe0
Nov 12 22:11:24 node1 kernel: [] do_path_lookup+0x5b/0xa0
Nov 12 22:11:24 node1 kernel: [] user_path_at+0x57/0xa0
Nov 12 22:11:24 node1 kernel: [] ? do_readv_writev+0x169/0x1f0
Nov 12 22:11:24 node1 kernel: [] vfs_fstatat+0x3c/0x80
Nov 12 22:11:24 node1 kernel: [] vfs_stat+0x1b/0x20
Nov 12 22:11:24 node1 kernel: [] sys_newstat+0x24/0x50
Nov 12 22:11:24 node1 kernel: [] ? mnt_drop_write+0x34/0x40
Nov 12 22:11:24 node1 kernel: [] ? sys_fsetxattr+0x78/0xe0
Nov 12 22:11:24 node1 kernel: [] system_call_fastpath+0x16/0x1b
Nov 12 22:11:24 node1 kernel: INFO: task glusterfsd:17386 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: glusterfsd D 0000000000000001 0 17386 1 0x00000000
Nov 12 22:11:24 node1 kernel: ffff8802d0f4f980 0000000000000082 0000000000000000 ffff8802d0f4f928
Nov 12 22:11:24 node1 kernel: 0000000000000000 ffff88086e3ce080 000000000000000e ffff8808219c0b68
Nov 12 22:11:24 node1 kernel: ffff880850517ab8 ffff8802d0f4ffd8 000000000000fb88 ffff880850517ab8
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] ? ip_output+0xb8/0xc0
Nov 12 22:11:24 node1 kernel: [] rwsem_down_failed_common+0x95/0x1d0
Nov 12 22:11:24 node1 kernel: [] ? down_read+0x16/0x30
Nov 12 22:11:24 node1 kernel: [] ? dmu_zfetch+0x32b/0xe30 [zfs]
Nov 12 22:11:24 node1 kernel: [] rwsem_down_read_failed+0x26/0x30
Nov 12 22:11:24 node1 kernel: [] call_rwsem_down_read_failed+0x14/0x30
Nov 12 22:11:24 node1 kernel: [] ? down_read+0x24/0x30
Nov 12 22:11:24 node1 kernel: [] dmu_tx_check_ioerr+0x35/0x140 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_count_write+0x409/0x6f0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __kmalloc+0x20c/0x220
Nov 12 22:11:24 node1 kernel: [] ? kmem_alloc_debug+0x8f/0x4c0 [spl]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_hold_write+0x4f/0x70 [zfs]
Nov 12 22:11:24 node1 kernel: [] zfs_write+0x406/0xcf0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __zpl_xattr_get+0x185/0x200 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? strfree+0xe/0x10 [spl]
Nov 12 22:11:24 node1 kernel: [] ? vfs_getxattr+0xae/0xe0
Nov 12 22:11:24 node1 kernel: [] ? zfs_getattr_fast+0xe7/0x160 [zfs]
Nov 12 22:11:24 node1 kernel: [] zpl_write+0x9b/0xe0 [zfs]
Nov 12 22:11:24 node1 kernel: [] vfs_write+0xb8/0x1a0
Nov 12 22:11:24 node1 kernel: [] sys_pwrite64+0x82/0xa0
Nov 12 22:11:24 node1 kernel: [] system_call_fastpath+0x16/0x1b

@behlendorf
Contributor

This looks very similar to #2523, which is planned to be fixed in the next tag.

@behlendorf behlendorf added this to the 0.6.4 milestone Nov 14, 2014
@inevity
Author

inevity commented Nov 18, 2014

Thanks, I will follow it.

tuxoko added a commit to tuxoko/spl that referenced this issue Dec 18, 2014
It is known that a mutex in Linux is not safe when it is used to synchronize
the freeing of the object in which the mutex is embedded:
http://lwn.net/Articles/575477/

The known places in ZFS which are suspected to suffer from the race condition
are zio->io_lock and dbuf->db_mtx:
  zio uses zio->io_lock and zio->io_cv to synchronize freeing between zio_wait()
  and zio_done().
  dbuf uses dbuf->db_mtx to protect reference counting.

And it has been reported multiple times that zio->io_lock and dbuf->db_mtx suffer
from corruption, which is exactly the symptom of this race.
openzfs/zfs#2523
openzfs/zfs#2897

This patch fixes this kind of race by forcing serialization on mutex_exit() with
a spinlock, making the mutex safer by sacrificing a bit of performance and
memory overhead.

Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
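
To make the race concrete, below is a minimal user-space analogue of the pattern described above and in the linked LWN article (illustrative only: the struct and function names are invented, and this is not ZFS or SPL code). The waiting thread can free the object while the completing thread is still inside the unlock path of the mutex embedded in that object:

    #include <pthread.h>
    #include <stdlib.h>

    /* Illustrative analogue of a zio-like object: the mutex lives inside
     * the object whose lifetime it is used to synchronize. */
    struct work {
        pthread_mutex_t lock;
        pthread_cond_t  cv;
        int             done;
    };

    /* Analogue of zio_done(): mark completion and wake the waiter. */
    void work_complete(struct work *w)
    {
        pthread_mutex_lock(&w->lock);
        w->done = 1;
        pthread_cond_signal(&w->cv);
        pthread_mutex_unlock(&w->lock);  /* may still touch *w after the waiter resumes */
    }

    /* Analogue of zio_wait(): wait for completion, then free the object. */
    void work_wait_and_free(struct work *w)
    {
        pthread_mutex_lock(&w->lock);
        while (!w->done)
            pthread_cond_wait(&w->cv, &w->lock);
        pthread_mutex_unlock(&w->lock);
        free(w);  /* use-after-free if work_complete() has not fully left its unlock */
    }

A mutex corrupted this way would be consistent with the reports above of a thread spinning indefinitely in _spin_lock and of other threads hanging behind the damaged lock.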
tuxoko added a commit to tuxoko/spl that referenced this issue Dec 18, 2014
tuxoko added a commit to tuxoko/spl that referenced this issue Dec 19, 2014
behlendorf pushed a commit to openzfs/spl that referenced this issue Dec 19, 2014
It is known that mutexes in Linux are not safe when using them to
synchronize the freeing of the object in which the mutex is embedded:

http://lwn.net/Articles/575477/

The known places in ZFS which are suspected to suffer from the race
condition are zio->io_lock and dbuf->db_mtx.

* zio uses zio->io_lock and zio->io_cv to synchronize freeing
  between zio_wait() and zio_done().
* dbuf uses dbuf->db_mtx to protect reference counting.

This patch fixes this kind of race by forcing serialization on
mutex_exit() with a spin lock, making the mutex safe by sacrificing
a bit of performance and memory overhead.

This issue most commonly manifests itself as a deadlock in the zio
pipeline caused by a process spinning on the damaged mutex.  Similar
deadlocks have been reported for the dbuf->db_mtx mutex.  And it can
also cause a NULL dereference or bad paging request under the right
circumstances.

This issue and many like it are linked off the openzfs/zfs#2523
issue. Specifically, this fix resolves at least the following
outstanding issues:

openzfs/zfs#401
openzfs/zfs#2523
openzfs/zfs#2679
openzfs/zfs#2684
openzfs/zfs#2704
openzfs/zfs#2708
openzfs/zfs#2517
openzfs/zfs#2827
openzfs/zfs#2850
openzfs/zfs#2891
openzfs/zfs#2897
openzfs/zfs#2247
openzfs/zfs#2939

Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Closes #421
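
As a rough sketch of the mitigation this commit message describes (simplified, user-space, and hypothetical; the actual change lives in the SPL's include/sys/mutex.h, referenced below as openzfs/spl@a3c1eb7), the mutex is paired with a spinlock that the exiting thread holds across the unlock, so a newly woken owner cannot run ahead and free the embedding object until the previous owner has completely left the exit path:

    #include <pthread.h>

    /* Hypothetical wrapper type; the field names are illustrative. */
    typedef struct kmutex {
        pthread_mutex_t    m_mutex;
        pthread_spinlock_t m_lock;   /* serializes the exit path */
    } kmutex_t;

    void kmutex_init(kmutex_t *mp)
    {
        pthread_mutex_init(&mp->m_mutex, NULL);
        pthread_spin_init(&mp->m_lock, PTHREAD_PROCESS_PRIVATE);
    }

    void kmutex_enter(kmutex_t *mp)
    {
        pthread_mutex_lock(&mp->m_mutex);
        /* Do not proceed (and possibly free the embedding object) until
         * any previous owner has completely finished kmutex_exit(). */
        pthread_spin_lock(&mp->m_lock);
        pthread_spin_unlock(&mp->m_lock);
    }

    void kmutex_exit(kmutex_t *mp)
    {
        /* Hold the spinlock across the unlock so the last access to the
         * mutex memory happens before any waiter can return from
         * kmutex_enter() and free the object. */
        pthread_spin_lock(&mp->m_lock);
        pthread_mutex_unlock(&mp->m_mutex);
        pthread_spin_unlock(&mp->m_lock);
    }

The extra lock/unlock pair on every mutex operation is the performance and memory overhead the commit message refers to.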
@behlendorf
Contributor

This issue, which is a duplicate of #2523, was resolved by the following commit. Full details can be found in the commit message and the related LWN article.

openzfs/spl@a3c1eb7 mutex: force serialization on mutex_exit() to fix races

dajhorn added a commit to zfsonlinux/pkg-spl that referenced this issue Dec 20, 2014
Commit: openzfs/zfs@a3c1eb7
From: Chunwei Chen <tuxoko@gmail.com>
Date: Fri, 19 Dec 2014 11:31:59 +0800
Subject: mutex: force serialization on mutex_exit() to fix races

Backported-by: Darik Horn <dajhorn@vanadac.com>
Closes #421

Conflicts:
        include/sys/mutex.h
behlendorf pushed a commit to openzfs/spl that referenced this issue Dec 23, 2014
@ryao
Contributor

ryao commented Feb 10, 2015

@behlendorf This issue is not a duplicate of #2523. Instead, it is a duplicate of #3091, provided that it occurred on a multi-socket system.

ryao pushed a commit to ryao/spl that referenced this issue Feb 19, 2015