0.6.3-1 zfs: fop (maybe write or getxattr) soft lockup, txg_quiesce hung, and other fops hung #2897

Closed
inevity opened this issue Nov 14, 2014 · 4 comments

@inevity

inevity commented Nov 14, 2014

CentOS, kernel 2.6.32-358.el6.x86_64
ZFS 0.6.3-1
No crash/kdump information, only the dmesg log.

Some questions:
1. The soft lockup call trace entries are all question marks; how can we confirm the actual call chain? And why does the soft lockup report contain two identical copies of the same call trace?
2. Can we make any progress with only the information supplied here?

Nov 12 22:10:11 node1 kernel: BUG: soft lockup - CPU#2 stuck for 67s! [glusterfsd:17385]
Nov 12 22:10:11 node1 kernel: Modules linked in: zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) spl(U) zlib_deflate tcp_diag inet_diag bonding 8021q garp stp llc ipv6 vhost_net macvtap macvlan tun kvm_intel kvm uinput acpi_pad sg microcode sb_edac edac_core iTCO_wdt iTCO_vendor_support shpchp e1000e i2c_i801 i2c_core igb dca ptp pps_core ext3 jbd mbcache sd_mod crc_t10dif ahci isci libsas mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash dm_log dm_mod [last unloaded: zlib_deflate]
Nov 12 22:10:11 node1 kernel: CPU 2
Nov 12 22:10:11 node1 kernel: Modules linked in: zfs(P)(U) zcommon(P)(U) znvpair(P)(U) zavl(P)(U) zunicode(P)(U) spl(U) zlib_deflate tcp_diag inet_diag bonding 8021q garp stp llc ipv6 vhost_net macvtap macvlan tun kvm_intel kvm uinput acpi_pad sg microcode sb_edac edac_core iTCO_wdt iTCO_vendor_support shpchp e1000e i2c_i801 i2c_core igb dca ptp pps_core ext3 jbd mbcache sd_mod crc_t10dif ahci isci libsas mpt2sas scsi_transport_sas raid_class dm_mirror dm_region_hash dm_log dm_mod [last unloaded: zlib_deflate]
Nov 12 22:10:11 node1 kernel:
Nov 12 22:10:11 node1 kernel: Pid: 17385, comm: glusterfsd Tainted: P --------------- 2.6.32-358.el6.x86_64 #1 Sugon Romley-EN/60R8U2
Nov 12 22:10:11 node1 kernel: RIP: 0010:[] [] _spin_lock+0x1e/0x30
Nov 12 22:10:11 node1 kernel: RSP: 0018:ffff88028d135a78 EFLAGS: 00000202
Nov 12 22:10:11 node1 kernel: RAX: 0000000000000000 RBX: ffff88028d135a78 RCX: 0000000000000000
Nov 12 22:10:11 node1 kernel: RDX: 0000000000000002 RSI: 0000000000000000 RDI: ffff8806bc744de4
Nov 12 22:10:11 node1 kernel: RBP: ffffffff8100bb8e R08: 0000000000000000 R09: ffffc90212254000
Nov 12 22:10:11 node1 kernel: R10: ffff88028d1359d8 R11: ffff8803853080b0 R12: ffff880600000040
Nov 12 22:10:11 node1 kernel: R13: 0000000000000000 R14: ffff880385308048 R15: ffff8803853080b8
Nov 12 22:10:11 node1 kernel: FS: 00007f48ef600700(0000) GS:ffff880028300000(0000) knlGS:0000000000000000
Nov 12 22:10:11 node1 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 12 22:10:11 node1 kernel: CR2: 0000000001822498 CR3: 0000000870a94000 CR4: 00000000000407e0
Nov 12 22:10:11 node1 kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Nov 12 22:10:11 node1 kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Nov 12 22:10:11 node1 kernel: Process glusterfsd (pid: 17385, threadinfo ffff88028d134000, task ffff8808562e2ae0)
Nov 12 22:10:11 node1 kernel: Stack:
Nov 12 22:10:11 node1 kernel: ffff88028d135a98 ffffffff8150e935 ffff8806bc744d88 0000000000000029
Nov 12 22:10:11 node1 kernel: ffff88028d135aa8 ffffffff8150e90b ffff88028d135b58 ffffffffa040f547
Nov 12 22:10:11 node1 kernel: ffff88028d135b08 ffff8803f0745b30 ffff8803f0745b30 0000000000000000
Nov 12 22:10:11 node1 kernel: Call Trace:
Nov 12 22:10:11 node1 kernel: [] ? __mutex_unlock_slowpath+0x25/0x60
Nov 12 22:10:11 node1 kernel: [] ? mutex_unlock+0x1b/0x20
Nov 12 22:10:11 node1 kernel: [] ? dbuf_read+0x3d7/0x770 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? dmu_buf_will_dirty+0xa4/0x100 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? dmu_write_uio_dnode+0x70/0x140 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? dmu_write_uio_dbuf+0x46/0x60 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? zfs_write+0xb95/0xcf0 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? __zpl_xattr_get+0x185/0x200 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? vfs_getxattr+0xae/0xe0
Nov 12 22:10:11 node1 kernel: [] ? zfs_getattr_fast+0xe7/0x160 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? zpl_write+0x9b/0xe0 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? vfs_write+0xb8/0x1a0
Nov 12 22:10:11 node1 kernel: [] ? sys_pwrite64+0x82/0xa0
Nov 12 22:10:11 node1 kernel: [] ? system_call_fastpath+0x16/0x1b
Nov 12 22:10:11 node1 kernel: Code: 00 00 00 01 74 05 e8 b2 33 d7 ff c9 c3 55 48 89 e5 0f 1f 44 00 00 b8 00 00 01 00 f0 0f c1 07 0f b7 d0 c1 e8 10 39 c2 74 0e f3 90 <0f> b7 17 eb f5 83 3f 00 75 f4 eb df c9 c3 0f 1f 40 00 55 48 89
Nov 12 22:10:11 node1 kernel: Call Trace:
Nov 12 22:10:11 node1 kernel: [] ? __mutex_unlock_slowpath+0x25/0x60
Nov 12 22:10:11 node1 kernel: [] ? mutex_unlock+0x1b/0x20
Nov 12 22:10:11 node1 kernel: [] ? dbuf_read+0x3d7/0x770 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? dmu_buf_will_dirty+0xa4/0x100 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? dmu_write_uio_dnode+0x70/0x140 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? dmu_write_uio_dbuf+0x46/0x60 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? zfs_write+0xb95/0xcf0 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? __zpl_xattr_get+0x185/0x200 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? vfs_getxattr+0xae/0xe0
Nov 12 22:10:11 node1 kernel: [] ? zfs_getattr_fast+0xe7/0x160 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? zpl_write+0x9b/0xe0 [zfs]
Nov 12 22:10:11 node1 kernel: [] ? vfs_write+0xb8/0x1a0
Nov 12 22:10:11 node1 kernel: [] ? sys_pwrite64+0x82/0xa0
Nov 12 22:10:11 node1 kernel: [] ? system_call_fastpath+0x16/0x1b
Nov 12 22:11:24 node1 kernel: INFO: task txg_quiesce:16538 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: txg_quiesce D 0000000000000000 0 16538 2 0x00000000
Nov 12 22:11:24 node1 kernel: ffff8803c5dfbd40 0000000000000046 0000000000000000 ffff8804dcf5da00
Nov 12 22:11:24 node1 kernel: 0000000000000049 0000000000000286 ffff8803c5dfbde0 ffffffffa0348a07
Nov 12 22:11:24 node1 kernel: ffff880141bfdaf8 ffff8803c5dfbfd8 000000000000fb88 ffff880141bfdaf8
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] ? spl_debug_msg+0x427/0x9d0 [spl]
Nov 12 22:11:24 node1 kernel: [] cv_wait_common+0x105/0x1c0 [spl]
Nov 12 22:11:24 node1 kernel: [] ? autoremove_wake_function+0x0/0x40
Nov 12 22:11:24 node1 kernel: [] ? __bitmap_weight+0x8c/0xb0
Nov 12 22:11:24 node1 kernel: [] __cv_wait+0x15/0x20 [spl]
Nov 12 22:11:24 node1 kernel: [] txg_quiesce_thread+0x21b/0x370 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? set_user_nice+0xc9/0x130
Nov 12 22:11:24 node1 kernel: [] ? txg_quiesce_thread+0x0/0x370 [zfs]
Nov 12 22:11:24 node1 kernel: [] thread_generic_wrapper+0x68/0x80 [spl]
Nov 12 22:11:24 node1 kernel: [] ? thread_generic_wrapper+0x0/0x80 [spl]
Nov 12 22:11:24 node1 kernel: [] kthread+0x96/0xa0
Nov 12 22:11:24 node1 kernel: [] child_rip+0xa/0x20
Nov 12 22:11:24 node1 kernel: [] ? kthread+0x0/0xa0
Nov 12 22:11:24 node1 kernel: [] ? child_rip+0x0/0x20
Nov 12 22:11:24 node1 kernel: INFO: task glusterfsd:17366 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: glusterfsd D 0000000000000001 0 17366 1 0x00000000
Nov 12 22:11:24 node1 kernel: ffff8804df141878 0000000000000082 ffff8804df141840 ffff8804df14183c
Nov 12 22:11:24 node1 kernel: ffff8804df1417f8 ffff88087fc23280 ffff880028316700 00000000000003fd
Nov 12 22:11:24 node1 kernel: ffff88036cf07058 ffff8804df141fd8 000000000000fb88 ffff88036cf07058
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
Nov 12 22:11:24 node1 kernel: [] mutex_lock+0x2b/0x50
Nov 12 22:11:24 node1 kernel: [] dbuf_find+0x5b/0x100 [zfs]
Nov 12 22:11:24 node1 kernel: [] __dbuf_hold_impl+0xa1/0x4a0 [zfs]
Nov 12 22:11:24 node1 kernel: [] dbuf_hold_impl+0x86/0xc0 [zfs]
Nov 12 22:11:24 node1 kernel: [] dbuf_hold+0x20/0x30 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_buf_hold_noread+0x87/0x180 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_buf_hold+0x3b/0x90 [zfs]
Nov 12 22:11:24 node1 kernel: [] zap_get_leaf_byblk+0x4f/0x2c0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? dmu_object_info_from_dnode+0xa5/0x100 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? mutex_lock+0x1e/0x50
Nov 12 22:11:24 node1 kernel: [] zap_deref_leaf+0x6a/0x80 [zfs]
Nov 12 22:11:24 node1 kernel: [] fzap_cursor_retrieve+0xfc/0x2a0 [zfs]
Nov 12 22:11:24 node1 kernel: [] zap_cursor_retrieve+0x17b/0x2f0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? sa_lookup_impl+0x13/0x20 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? filldir+0x7d/0xe0
Nov 12 22:11:24 node1 kernel: [] zfs_readdir+0x17f/0x530 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? nameidata_to_filp+0x54/0x70
Nov 12 22:11:24 node1 kernel: [] ? do_filp_open+0x72a/0xdd0
Nov 12 22:11:24 node1 kernel: [] zpl_readdir+0x58/0x80 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? filldir+0x0/0xe0
Nov 12 22:11:24 node1 kernel: [] ? filldir+0x0/0xe0
Nov 12 22:11:24 node1 kernel: [] vfs_readdir+0xc0/0xe0
Nov 12 22:11:24 node1 kernel: [] ? generic_file_llseek+0x5b/0x70
Nov 12 22:11:24 node1 kernel: [] sys_getdents+0x89/0xf0
Nov 12 22:11:24 node1 kernel: [] system_call_fastpath+0x16/0x1b
Nov 12 22:11:24 node1 kernel: INFO: task glusterfsd:17367 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: glusterfsd D 0000000000000000 0 17367 1 0x00000000
Nov 12 22:11:24 node1 kernel: ffff8806bec6b968 0000000000000082 0000000000000000 ffff8803a25c0000
Nov 12 22:11:24 node1 kernel: ffff8806bec6b9e8 ffffffffa034e0ba ffff8806bec6b918 0000000000000246
Nov 12 22:11:24 node1 kernel: ffff880720337098 ffff8806bec6bfd8 000000000000fb88 ffff880720337098
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] ? spl_kmem_cache_alloc+0x9a/0x1070 [spl]
Nov 12 22:11:24 node1 kernel: [] rwsem_down_failed_common+0x95/0x1d0
Nov 12 22:11:24 node1 kernel: [] rwsem_down_write_failed+0x23/0x30
Nov 12 22:11:24 node1 kernel: [] call_rwsem_down_write_failed+0x13/0x20
Nov 12 22:11:24 node1 kernel: [] ? down_write+0x32/0x40
Nov 12 22:11:24 node1 kernel: [] dnode_new_blkid+0xcf/0x310 [zfs]
Nov 12 22:11:24 node1 kernel: [] dbuf_dirty+0x61e/0x820 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_buf_will_dirty+0xaf/0x100 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_write_uio_dnode+0x70/0x140 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_write_uio_dbuf+0x46/0x60 [zfs]
Nov 12 22:11:24 node1 kernel: [] zfs_write+0xb95/0xcf0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __zpl_xattr_get+0x185/0x200 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? vfs_getxattr+0xae/0xe0
Nov 12 22:11:24 node1 kernel: [] ? zfs_getattr_fast+0xe7/0x160 [zfs]
Nov 12 22:11:24 node1 kernel: [] zpl_write+0x9b/0xe0 [zfs]
Nov 12 22:11:24 node1 kernel: [] vfs_write+0xb8/0x1a0
Nov 12 22:11:24 node1 kernel: [] sys_pwrite64+0x82/0xa0
Nov 12 22:11:24 node1 kernel: [] system_call_fastpath+0x16/0x1b
Nov 12 22:11:24 node1 kernel: INFO: task glusterfsd:17379 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: glusterfsd D 0000000000000001 0 17379 1 0x00000000
Nov 12 22:11:24 node1 kernel: ffff88047294b980 0000000000000082 0000000000000000 ffff880874a9b9c0
Nov 12 22:11:24 node1 kernel: 0000000000000000 ffff88017867d800 000000000000000e ffff8805160c13a8
Nov 12 22:11:24 node1 kernel: ffff8806805dfaf8 ffff88047294bfd8 000000000000fb88 ffff8806805dfaf8
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] ? ip_output+0xb8/0xc0
Nov 12 22:11:24 node1 kernel: [] rwsem_down_failed_common+0x95/0x1d0
Nov 12 22:11:24 node1 kernel: [] ? down_read+0x16/0x30
Nov 12 22:11:24 node1 kernel: [] ? dmu_zfetch+0x32b/0xe30 [zfs]
Nov 12 22:11:24 node1 kernel: [] rwsem_down_read_failed+0x26/0x30
Nov 12 22:11:24 node1 kernel: [] call_rwsem_down_read_failed+0x14/0x30
Nov 12 22:11:24 node1 kernel: [] ? down_read+0x24/0x30
Nov 12 22:11:24 node1 kernel: [] dmu_tx_check_ioerr+0x35/0x140 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_count_write+0x409/0x6f0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __kmalloc+0x20c/0x220
Nov 12 22:11:24 node1 kernel: [] ? kmem_alloc_debug+0x8f/0x4c0 [spl]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_hold_write+0x4f/0x70 [zfs]
Nov 12 22:11:24 node1 kernel: [] zfs_write+0x406/0xcf0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __zpl_xattr_get+0x185/0x200 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? strfree+0xe/0x10 [spl]
Nov 12 22:11:24 node1 kernel: [] ? vfs_getxattr+0xae/0xe0
Nov 12 22:11:24 node1 kernel: [] ? zfs_getattr_fast+0xe7/0x160 [zfs]
Nov 12 22:11:24 node1 kernel: [] zpl_write+0x9b/0xe0 [zfs]
Nov 12 22:11:24 node1 kernel: [] vfs_write+0xb8/0x1a0
Nov 12 22:11:24 node1 kernel: [] sys_pwrite64+0x82/0xa0
Nov 12 22:11:24 node1 kernel: [] system_call_fastpath+0x16/0x1b
Nov 12 22:11:24 node1 kernel: INFO: task glusterfsd:17380 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: glusterfsd D 0000000000000001 0 17380 1 0x00000000
Nov 12 22:11:24 node1 kernel: ffff880285729980 0000000000000082 0000000000000000 0000000000000000
Nov 12 22:11:24 node1 kernel: 0000000000000000 ffff88023db2b200 000000000000000e ffff88061a7b96a8
Nov 12 22:11:24 node1 kernel: ffff88086e1c8638 ffff880285729fd8 000000000000fb88 ffff88086e1c8638
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] ? ip_output+0xb8/0xc0
Nov 12 22:11:24 node1 kernel: [] rwsem_down_failed_common+0x95/0x1d0
Nov 12 22:11:24 node1 kernel: [] ? down_read+0x16/0x30
Nov 12 22:11:24 node1 kernel: [] ? dmu_zfetch+0x32b/0xe30 [zfs]
Nov 12 22:11:24 node1 kernel: [] rwsem_down_read_failed+0x26/0x30
Nov 12 22:11:24 node1 kernel: [] call_rwsem_down_read_failed+0x14/0x30
Nov 12 22:11:24 node1 kernel: [] ? down_read+0x24/0x30
Nov 12 22:11:24 node1 kernel: [] dmu_tx_check_ioerr+0x35/0x140 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_count_write+0x409/0x6f0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __kmalloc+0x20c/0x220
Nov 12 22:11:24 node1 kernel: [] ? kmem_alloc_debug+0x8f/0x4c0 [spl]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_hold_write+0x4f/0x70 [zfs]
Nov 12 22:11:24 node1 kernel: [] zfs_write+0x406/0xcf0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __zpl_xattr_get+0x185/0x200 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? strfree+0xe/0x10 [spl]
Nov 12 22:11:24 node1 kernel: [] ? vfs_getxattr+0xae/0xe0
Nov 12 22:11:24 node1 kernel: [] ? zfs_getattr_fast+0xe7/0x160 [zfs]
Nov 12 22:11:24 node1 kernel: [] zpl_write+0x9b/0xe0 [zfs]
Nov 12 22:11:24 node1 kernel: [] vfs_write+0xb8/0x1a0
Nov 12 22:11:24 node1 kernel: [] sys_pwrite64+0x82/0xa0
Nov 12 22:11:24 node1 kernel: [] system_call_fastpath+0x16/0x1b
Nov 12 22:11:24 node1 kernel: INFO: task glusterfsd:17381 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: glusterfsd D 0000000000000000 0 17381 1 0x00000000
Nov 12 22:11:24 node1 kernel: ffff88028572b918 0000000000000082 ffff88028572b8e0 ffff88028572b8dc
Nov 12 22:11:24 node1 kernel: ffff88086f5f4418 ffff88087fc23080 ffff880028296700 0000000000000400
Nov 12 22:11:24 node1 kernel: ffff8803b1893058 ffff88028572bfd8 000000000000fb88 ffff8803b1893058
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
Nov 12 22:11:24 node1 kernel: [] ? dbuf_read+0x49f/0x770 [zfs]
Nov 12 22:11:24 node1 kernel: [] mutex_lock+0x2b/0x50
Nov 12 22:11:24 node1 kernel: [] dbuf_find+0x5b/0x100 [zfs]
Nov 12 22:11:24 node1 kernel: [] dbuf_prefetch+0x65/0x270 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_zfetch_dofetch+0xfe/0x170 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_zfetch+0x7f8/0xe30 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? arc_read+0x5f3/0x910 [zfs]
Nov 12 22:11:24 node1 kernel: [] dbuf_read+0x407/0x770 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_buf_hold_array_by_dnode+0x139/0x530 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_buf_hold_array+0x65/0x90 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? avl_add+0x38/0x50 [zavl]
Nov 12 22:11:24 node1 kernel: [] dmu_read_uio+0x41/0xd0 [zfs]
Nov 12 22:11:24 node1 kernel: [] zfs_read+0x178/0x4c0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? do_futex+0x100/0xb60
Nov 12 22:11:24 node1 kernel: [] zpl_read+0x9b/0xe0 [zfs]
Nov 12 22:11:24 node1 kernel: [] vfs_read+0xb5/0x1a0
Nov 12 22:11:24 node1 kernel: [] sys_pread64+0x82/0xa0
Nov 12 22:11:24 node1 kernel: [] system_call_fastpath+0x16/0x1b
Nov 12 22:11:24 node1 kernel: INFO: task glusterfsd:17382 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: glusterfsd D 0000000000000000 0 17382 1 0x00000000
Nov 12 22:11:24 node1 kernel: ffff8802ae51d980 0000000000000082 0000000000000000 ffff8802ae51d918
Nov 12 22:11:24 node1 kernel: 0000000000000000 ffff88086e3ce080 000000000000000e ffff88081f934468
Nov 12 22:11:24 node1 kernel: ffff8807c481e638 ffff8802ae51dfd8 000000000000fb88 ffff8807c481e638
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] ? ip_output+0xb8/0xc0
Nov 12 22:11:24 node1 kernel: [] rwsem_down_failed_common+0x95/0x1d0
Nov 12 22:11:24 node1 kernel: [] ? down_read+0x16/0x30
Nov 12 22:11:24 node1 kernel: [] ? dmu_zfetch+0x32b/0xe30 [zfs]
Nov 12 22:11:24 node1 kernel: [] rwsem_down_read_failed+0x26/0x30
Nov 12 22:11:24 node1 kernel: [] call_rwsem_down_read_failed+0x14/0x30
Nov 12 22:11:24 node1 kernel: [] ? down_read+0x24/0x30
Nov 12 22:11:24 node1 kernel: [] dmu_tx_check_ioerr+0x35/0x140 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_count_write+0x409/0x6f0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __kmalloc+0x20c/0x220
Nov 12 22:11:24 node1 kernel: [] ? kmem_alloc_debug+0x8f/0x4c0 [spl]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_hold_write+0x4f/0x70 [zfs]
Nov 12 22:11:24 node1 kernel: [] zfs_write+0x406/0xcf0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __zpl_xattr_get+0x185/0x200 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? strfree+0xe/0x10 [spl]
Nov 12 22:11:24 node1 kernel: [] ? vfs_getxattr+0xae/0xe0
Nov 12 22:11:24 node1 kernel: [] ? zfs_getattr_fast+0xe7/0x160 [zfs]
Nov 12 22:11:24 node1 kernel: [] zpl_write+0x9b/0xe0 [zfs]
Nov 12 22:11:24 node1 kernel: [] vfs_write+0xb8/0x1a0
Nov 12 22:11:24 node1 kernel: [] sys_pwrite64+0x82/0xa0
Nov 12 22:11:24 node1 kernel: [] system_call_fastpath+0x16/0x1b
Nov 12 22:11:24 node1 kernel: INFO: task glusterfsd:17383 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: glusterfsd D 0000000000000001 0 17383 1 0x00000000
Nov 12 22:11:24 node1 kernel: ffff88027e0cd8e8 0000000000000082 ffff88021d643b58 0000000000000242
Nov 12 22:11:24 node1 kernel: 0000000000000001 ffff8804aafa97e8 ffffffff8100bb8e ffff88027e0cd8e8
Nov 12 22:11:24 node1 kernel: ffff880852e825f8 ffff88027e0cdfd8 000000000000fb88 ffff880852e825f8
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] ? apic_timer_interrupt+0xe/0x20
Nov 12 22:11:24 node1 kernel: [] ? mutex_spin_on_owner+0x8d/0xc0
Nov 12 22:11:24 node1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
Nov 12 22:11:24 node1 kernel: [] mutex_lock+0x2b/0x50
Nov 12 22:11:24 node1 kernel: [] dbuf_find+0x5b/0x100 [zfs]
Nov 12 22:11:24 node1 kernel: [] __dbuf_hold_impl+0xa1/0x4a0 [zfs]
Nov 12 22:11:24 node1 kernel: [] dbuf_hold_impl+0x86/0xc0 [zfs]
Nov 12 22:11:24 node1 kernel: [] dbuf_hold_level+0x1f/0x30 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_check_ioerr+0x4a/0x140 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_count_write+0x409/0x6f0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __kmalloc+0x20c/0x220
Nov 12 22:11:24 node1 kernel: [] ? kmem_alloc_debug+0x8f/0x4c0 [spl]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_hold_write+0x4f/0x70 [zfs]
Nov 12 22:11:24 node1 kernel: [] zfs_write+0x406/0xcf0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __zpl_xattr_get+0x185/0x200 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? strfree+0xe/0x10 [spl]
Nov 12 22:11:24 node1 kernel: [] ? vfs_getxattr+0xae/0xe0
Nov 12 22:11:24 node1 kernel: [] ? zfs_getattr_fast+0xe7/0x160 [zfs]
Nov 12 22:11:24 node1 kernel: [] zpl_write+0x9b/0xe0 [zfs]
Nov 12 22:11:24 node1 kernel: [] vfs_write+0xb8/0x1a0
Nov 12 22:11:24 node1 kernel: [] sys_pwrite64+0x82/0xa0
Nov 12 22:11:24 node1 kernel: [] system_call_fastpath+0x16/0x1b
Nov 12 22:11:24 node1 kernel: INFO: task glusterfsd:17384 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: glusterfsd D 0000000000000001 0 17384 1 0x00000000
Nov 12 22:11:24 node1 kernel: ffff8802f7fa1ba8 0000000000000082 ffff8802f7fa1b70 ffff8802f7fa1b6c
Nov 12 22:11:24 node1 kernel: ffff8807518b7b90 ffff88087fc23280 ffff880028316700 00000000000002aa
Nov 12 22:11:24 node1 kernel: ffff880850427098 ffff8802f7fa1fd8 000000000000fb88 ffff880850427098
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
Nov 12 22:11:24 node1 kernel: [] ? __d_lookup+0xa7/0x150
Nov 12 22:11:24 node1 kernel: [] mutex_lock+0x2b/0x50
Nov 12 22:11:24 node1 kernel: [] do_lookup+0x11b/0x230
Nov 12 22:11:24 node1 kernel: [] __link_path_walk+0x734/0x1030
Nov 12 22:11:24 node1 kernel: [] path_walk+0x6a/0xe0
Nov 12 22:11:24 node1 kernel: [] do_path_lookup+0x5b/0xa0
Nov 12 22:11:24 node1 kernel: [] user_path_at+0x57/0xa0
Nov 12 22:11:24 node1 kernel: [] ? do_readv_writev+0x169/0x1f0
Nov 12 22:11:24 node1 kernel: [] vfs_fstatat+0x3c/0x80
Nov 12 22:11:24 node1 kernel: [] vfs_stat+0x1b/0x20
Nov 12 22:11:24 node1 kernel: [] sys_newstat+0x24/0x50
Nov 12 22:11:24 node1 kernel: [] ? mnt_drop_write+0x34/0x40
Nov 12 22:11:24 node1 kernel: [] ? sys_fsetxattr+0x78/0xe0
Nov 12 22:11:24 node1 kernel: [] system_call_fastpath+0x16/0x1b
Nov 12 22:11:24 node1 kernel: INFO: task glusterfsd:17386 blocked for more than 120 seconds.
Nov 12 22:11:24 node1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 12 22:11:24 node1 kernel: glusterfsd D 0000000000000001 0 17386 1 0x00000000
Nov 12 22:11:24 node1 kernel: ffff8802d0f4f980 0000000000000082 0000000000000000 ffff8802d0f4f928
Nov 12 22:11:24 node1 kernel: 0000000000000000 ffff88086e3ce080 000000000000000e ffff8808219c0b68
Nov 12 22:11:24 node1 kernel: ffff880850517ab8 ffff8802d0f4ffd8 000000000000fb88 ffff880850517ab8
Nov 12 22:11:24 node1 kernel: Call Trace:
Nov 12 22:11:24 node1 kernel: [] ? ip_output+0xb8/0xc0
Nov 12 22:11:24 node1 kernel: [] rwsem_down_failed_common+0x95/0x1d0
Nov 12 22:11:24 node1 kernel: [] ? down_read+0x16/0x30
Nov 12 22:11:24 node1 kernel: [] ? dmu_zfetch+0x32b/0xe30 [zfs]
Nov 12 22:11:24 node1 kernel: [] rwsem_down_read_failed+0x26/0x30
Nov 12 22:11:24 node1 kernel: [] call_rwsem_down_read_failed+0x14/0x30
Nov 12 22:11:24 node1 kernel: [] ? down_read+0x24/0x30
Nov 12 22:11:24 node1 kernel: [] dmu_tx_check_ioerr+0x35/0x140 [zfs]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_count_write+0x409/0x6f0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __kmalloc+0x20c/0x220
Nov 12 22:11:24 node1 kernel: [] ? kmem_alloc_debug+0x8f/0x4c0 [spl]
Nov 12 22:11:24 node1 kernel: [] dmu_tx_hold_write+0x4f/0x70 [zfs]
Nov 12 22:11:24 node1 kernel: [] zfs_write+0x406/0xcf0 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? __zpl_xattr_get+0x185/0x200 [zfs]
Nov 12 22:11:24 node1 kernel: [] ? strfree+0xe/0x10 [spl]
Nov 12 22:11:24 node1 kernel: [] ? vfs_getxattr+0xae/0xe0
Nov 12 22:11:24 node1 kernel: [] ? zfs_getattr_fast+0xe7/0x160 [zfs]
Nov 12 22:11:24 node1 kernel: [] zpl_write+0x9b/0xe0 [zfs]
Nov 12 22:11:24 node1 kernel: [] vfs_write+0xb8/0x1a0
Nov 12 22:11:24 node1 kernel: [] sys_pwrite64+0x82/0xa0
Nov 12 22:11:24 node1 kernel: [] system_call_fastpath+0x16/0x1b

@behlendorf
Contributor

This looks very similar to #2523, which is planned to be fixed in the next tag.

@behlendorf behlendorf added this to the 0.6.4 milestone Nov 14, 2014
@inevity
Author

inevity commented Nov 18, 2014

Thanks, I will follow it.

tuxoko added a commit to tuxoko/spl that referenced this issue Dec 18, 2014
It is known that a mutex in Linux is not safe when it is used to synchronize
the freeing of the object in which the mutex is embedded:
http://lwn.net/Articles/575477/

The known places in ZFS which are suspected to suffer from the race condition
are zio->io_lock and dbuf->db_mtx:
  zio uses zio->io_lock and zio->io_cv to synchronize freeing between zio_wait()
  and zio_done().
  dbuf uses dbuf->db_mtx to protect reference counting.

And it has been reported multiple times that zio->io_lock and dbuf->db_mtx suffer
from corruption, which is exactly the symptom of this race.
openzfs/zfs#2523
openzfs/zfs#2897

This patch fixes this kind of race by forcing serialization on mutex_exit() with
a spinlock, making the mutex safer by sacrificing a bit of performance and
memory overhead.

Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
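
To make the race concrete, below is a minimal user-space analogue of the pattern described above and in the linked LWN article (illustrative only: the struct and function names are invented, and this is not ZFS or SPL code). The waiting thread can free the object while the completing thread is still inside the unlock path of the mutex embedded in that object:

    #include <pthread.h>
    #include <stdlib.h>

    /* Illustrative analogue of a zio-like object: the mutex lives inside
     * the object whose lifetime it is used to synchronize. */
    struct work {
        pthread_mutex_t lock;
        pthread_cond_t  cv;
        int             done;
    };

    /* Analogue of zio_done(): mark completion and wake the waiter. */
    void work_complete(struct work *w)
    {
        pthread_mutex_lock(&w->lock);
        w->done = 1;
        pthread_cond_signal(&w->cv);
        pthread_mutex_unlock(&w->lock);  /* may still touch *w after the waiter resumes */
    }

    /* Analogue of zio_wait(): wait for completion, then free the object. */
    void work_wait_and_free(struct work *w)
    {
        pthread_mutex_lock(&w->lock);
        while (!w->done)
            pthread_cond_wait(&w->cv, &w->lock);
        pthread_mutex_unlock(&w->lock);
        free(w);  /* use-after-free if work_complete() has not fully left its unlock */
    }

A mutex corrupted this way would be consistent with the reports above of a thread spinning indefinitely in _spin_lock and of other threads hanging behind the damaged lock.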
tuxoko added a commit to tuxoko/spl that referenced this issue Dec 18, 2014
tuxoko added a commit to tuxoko/spl that referenced this issue Dec 19, 2014
behlendorf pushed a commit to openzfs/spl that referenced this issue Dec 19, 2014
It is known that mutexes in Linux are not safe when using them to
synchronize the freeing of the object in which the mutex is embedded:

http://lwn.net/Articles/575477/

The known places in ZFS which are suspected to suffer from the race
condition are zio->io_lock and dbuf->db_mtx.

* zio uses zio->io_lock and zio->io_cv to synchronize freeing
  between zio_wait() and zio_done().
* dbuf uses dbuf->db_mtx to protect reference counting.

This patch fixes this kind of race by forcing serialization on
mutex_exit() with a spin lock, making the mutex safe by sacrificing
a bit of performance and memory overhead.

This issue most commonly manifests itself as a deadlock in the zio
pipeline caused by a process spinning on the damaged mutex.  Similar
deadlocks have been reported for the dbuf->db_mtx mutex.  And it can
also cause a NULL dereference or bad paging request under the right
circumstances.

This issue and many like it are linked off the openzfs/zfs#2523
issue. Specifically, this fix resolves at least the following
outstanding issues:

openzfs/zfs#401
openzfs/zfs#2523
openzfs/zfs#2679
openzfs/zfs#2684
openzfs/zfs#2704
openzfs/zfs#2708
openzfs/zfs#2517
openzfs/zfs#2827
openzfs/zfs#2850
openzfs/zfs#2891
openzfs/zfs#2897
openzfs/zfs#2247
openzfs/zfs#2939

Signed-off-by: Chunwei Chen <tuxoko@gmail.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Richard Yao <ryao@gentoo.org>
Closes #421
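
As a rough sketch of the mitigation this commit message describes (simplified, user-space, and hypothetical; the actual change lives in the SPL's include/sys/mutex.h, referenced below as openzfs/spl@a3c1eb7), the mutex is paired with a spinlock that the exiting thread holds across the unlock, so a newly woken owner cannot run ahead and free the embedding object until the previous owner has completely left the exit path:

    #include <pthread.h>

    /* Hypothetical wrapper type; the field names are illustrative. */
    typedef struct kmutex {
        pthread_mutex_t    m_mutex;
        pthread_spinlock_t m_lock;   /* serializes the exit path */
    } kmutex_t;

    void kmutex_init(kmutex_t *mp)
    {
        pthread_mutex_init(&mp->m_mutex, NULL);
        pthread_spin_init(&mp->m_lock, PTHREAD_PROCESS_PRIVATE);
    }

    void kmutex_enter(kmutex_t *mp)
    {
        pthread_mutex_lock(&mp->m_mutex);
        /* Do not proceed (and possibly free the embedding object) until
         * any previous owner has completely finished kmutex_exit(). */
        pthread_spin_lock(&mp->m_lock);
        pthread_spin_unlock(&mp->m_lock);
    }

    void kmutex_exit(kmutex_t *mp)
    {
        /* Hold the spinlock across the unlock so the last access to the
         * mutex memory happens before any waiter can return from
         * kmutex_enter() and free the object. */
        pthread_spin_lock(&mp->m_lock);
        pthread_mutex_unlock(&mp->m_mutex);
        pthread_spin_unlock(&mp->m_lock);
    }

The extra lock/unlock pair on every mutex operation is the performance and memory overhead the commit message refers to.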
@behlendorf
Contributor

This issue, which is a duplicate of #2523, was resolved by the following commit. Full details can be found in the commit message and the related LWN article.

openzfs/spl@a3c1eb7 mutex: force serialization on mutex_exit() to fix races

dajhorn added a commit to zfsonlinux/pkg-spl that referenced this issue Dec 20, 2014
Commit: openzfs/zfs@a3c1eb7
From: Chunwei Chen <tuxoko@gmail.com>
Date: Fri, 19 Dec 2014 11:31:59 +0800
Subject: mutex: force serialization on mutex_exit() to fix races

Backported-by: Darik Horn <dajhorn@vanadac.com>
Closes #421

Conflicts:
        include/sys/mutex.h
behlendorf pushed a commit to openzfs/spl that referenced this issue Dec 23, 2014
@ryao
Contributor

ryao commented Feb 10, 2015

@behlendorf This issue is not a duplicate of #2523. Instead, it is a duplicate of #3091, provided that it occurred on a multi-socket system.

ryao pushed a commit to ryao/spl that referenced this issue Feb 19, 2015