Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

System hangs (bad RIP value) when disk used in pool is removed (zfs-0.6.5.1) #3821

Closed
ab-oe opened this issue Sep 23, 2015 · 9 comments
Closed
Milestone

Comments

@ab-oe
Copy link
Contributor

ab-oe commented Sep 23, 2015

Hello,
I created new zpool with one disk and started copying files. Then I unplugged this disk and system hung instead of suspending I/O. After a few tests I was able to capture the call trace:
call_trace

On version 0.6.4 everything works well. There is no possibility to get I/O suspended with the latest ZoL because system always hangs.

@kernelOfTruth
Copy link
Contributor

@ab-oe thanks for the report !

referencing: #3817 Hang on USB disconnect

that report also shows

z_null_int

@behlendorf behlendorf added this to the 0.6.5.2 milestone Sep 23, 2015
@behlendorf
Copy link
Contributor

@ab-oe f you're able to reproduce this could you try setting the module option spl_taskq_thread_dynamic=0. That may resolve the issue and we an acceptable work around until it can be address properly. This effectively reverts the taskq's to their 0.6.4 behavior.

@ab-oe
Copy link
Contributor Author

ab-oe commented Sep 24, 2015

@behlendorf unfortunatelly setting spl_taskq_thread_dynamic to 0 doesn't resolve this issue. System hung just like before.

@ab-oe
Copy link
Contributor Author

ab-oe commented Sep 24, 2015

I performed bisection and see that issue with missing I/O suspend was introduced in b39c22b the z_null_int takes 100% of CPU. System is hardly responsive but it still works.

I got following call trace:

[ 1039.410577] INFO: rcu_sched self-detected stall on CPU { 2}  (t=6000 jiffies g=10308 c=10307 q=55247)
[ 1039.410585] sending NMI to all CPUs:
[ 1039.410643] NMI backtrace for cpu 0
[ 1039.410648] CPU: 0 PID: 6010 Comm: dd Tainted: P           O 3.10.36-oe64-ga7b1265 #4
[ 1039.410650] Hardware name: Supermicro X8DTS/X8DTS, BIOS 2.1     06/25/2012
[ 1039.410652] task: ffff8802f1038d00 ti: ffff8802eccaa000 task.ti: ffff8802eccaa000
[ 1039.410654] RIP: 0010:[<ffffffff81065898>]  [<ffffffff81065898>] hrtimer_run_pending+0x8/0xe0
[ 1039.410662] RSP: 0018:ffff880333203eb0  EFLAGS: 00000296
[ 1039.410664] RAX: 0000000000000001 RBX: 0000000000000001 RCX: ffffffff81980a40
[ 1039.410665] RDX: 000000000000008c RSI: 0000000000000010 RDI: ffffffff81951088
[ 1039.410667] RBP: 0000000000000100 R08: 000000000000c590 R09: 000000001966ad03
[ 1039.410668] R10: 0000020abff35118 R11: 000000f201b2b18c R12: 0000000000000001
[ 1039.410670] R13: ffff8802eccabfd8 R14: ffffffff81951088 R15: ffffffff81b60480
[ 1039.410672] FS:  00007fa329928700(0000) GS:ffff880333200000(0000) knlGS:0000000000000000
[ 1039.410674] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1039.410675] CR2: 0000000000915088 CR3: 00000002ed2a3000 CR4: 00000000000007f0
[ 1039.410677] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1039.410678] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1039.410680] Stack:
[ 1039.410681]  ffffffff8105001c 000000000000008c 0000000000000010 ffff8802f1038d00
[ 1039.410683]  0000000000000001 0000000000000100 0000000000000001 ffff8802eccabfd8
[ 1039.410686]  ffffffff81951088 0000000000000001 ffffffff8104a2f6 00000000810805ca
[ 1039.410688] Call Trace:
[ 1039.410690]  <IRQ> 

[ 1039.410694]  [<ffffffff8105001c>] ? run_timer_softirq+0x1c/0x200
[ 1039.410699]  [<ffffffff8104a2f6>] ? __do_softirq+0x116/0x210
[ 1039.410703]  [<ffffffff8169a01c>] ? call_softirq+0x1c/0x30
[ 1039.410707]  [<ffffffff8100d345>] ? do_softirq+0x45/0x80
[ 1039.410710]  [<ffffffff8104a55e>] ? irq_exit+0x9e/0xc0
[ 1039.410713]  [<ffffffff8102ba07>] ? smp_apic_timer_interrupt+0x67/0xa0
[ 1039.410716]  [<ffffffff8169985d>] ? apic_timer_interrupt+0x6d/0x80
[ 1039.410717]  <EOI> 

[ 1039.410750]  [<ffffffffa03bb065>] ? dbuf_hold_impl+0x75/0x90 [zfs]
[ 1039.410766]  [<ffffffffa03bacc8>] ? __dbuf_hold_impl+0x1d8/0x500 [zfs]
[ 1039.410772]  [<ffffffffa030ca83>] ? spl_kmem_zalloc+0x93/0x1c0 [spl]
[ 1039.410788]  [<ffffffffa03bb056>] ? dbuf_hold_impl+0x66/0x90 [zfs]
[ 1039.410804]  [<ffffffffa03bb346>] ? dbuf_hold+0x16/0x30 [zfs]
[ 1039.410825]  [<ffffffffa03d2e1a>] ? dnode_hold_impl+0x8a/0x4f0 [zfs]
[ 1039.410847]  [<ffffffffa03d71d9>] ? dsl_dataset_prev_snap_txg+0x29/0x70 [zfs]
[ 1039.410868]  [<ffffffffa03d7ddd>] ? dsl_dataset_block_freeable+0x3d/0x60 [zfs]
[ 1039.410888]  [<ffffffffa03ceb07>] ? dmu_tx_hold_object_impl.isra.7+0x27/0x110 [zfs]
[ 1039.410909]  [<ffffffffa03cfc05>] ? dmu_tx_hold_write+0x15/0x50 [zfs]
[ 1039.410939]  [<ffffffffa04476f5>] ? zfs_write+0x375/0xb00 [zfs]
[ 1039.410943]  [<ffffffff813bdfae>] ? account+0x10e/0x280
[ 1039.410972]  [<ffffffffa045c313>] ? zpl_write+0xb3/0x120 [zfs]
[ 1039.410975]  [<ffffffff810ff2e6>] ? vfs_write+0xb6/0x1b0
[ 1039.410978]  [<ffffffff810ffbb2>] ? SyS_write+0x42/0x90
[ 1039.410980]  [<ffffffff81698e18>] ? tracesys+0x7e/0xe2
[ 1039.410982]  [<ffffffff81698e77>] ? tracesys+0xdd/0xe2
[ 1039.410984] Code: 44 00 00 e8 9b ff ff ff eb e7 66 0f 1f 84 00 00 00 00 00 e9 bb ff ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 65 8b 04 25 d8 e2 00 00 <85> c0 74 04 c3 0f 1f 00 8b 15 a2 b0 9b 00 41 54 31 ff 55 53 85 
[ 1039.411009] NMI backtrace for cpu 1
[ 1039.411014] CPU: 1 PID: 0 Comm: swapper/1 Tainted: P           O 3.10.36-oe64-ga7b1265 #4
[ 1039.411015] Hardware name: Supermicro X8DTS/X8DTS, BIOS 2.1     06/25/2012
[ 1039.411017] task: ffff88033109ba80 ti: ffff8803310c6000 task.ti: ffff8803310c6000
[ 1039.411019] RIP: 0010:[<ffffffff81353b4c>]  [<ffffffff81353b4c>] intel_idle+0xcc/0x132
[ 1039.411025] RSP: 0018:ffff8803310c7e88  EFLAGS: 00000046
[ 1039.411027] RAX: 0000000000000020 RBX: 0000000000000008 RCX: 0000000000000001
[ 1039.411028] RDX: 0000000000000000 RSI: ffff8803310c7fd8 RDI: 0000000000000046
[ 1039.411030] RBP: 0000000000000020 R08: 0000000000d8e696 R09: 000000001966ad03
[ 1039.411032] R10: 0000020abec22151 R11: 00000000000001dc R12: 0000000000000003
[ 1039.411033] R13: 0000000000000004 R14: 0000000000000004 R15: ffff8803310c7fd8
[ 1039.411035] FS:  0000000000000000(0000) GS:ffff880333220000(0000) knlGS:0000000000000000
[ 1039.411037] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1039.411038] CR2: 00007f7912018510 CR3: 0000000001958000 CR4: 00000000000007e0
[ 1039.411040] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1039.411041] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1039.411042] Stack:
[ 1039.411043]  000000018107a90c ffff880333239000 ffff8803310c7ed8 ffffffff8199adc0
[ 1039.411046]  000000f1f7892584 ffffffff815643d5 ffff880333239000 ffffffff8199adc0
[ 1039.411048]  0000000000000000 ffff8803310c7fd8 0000000000000004 ffffffff815644f9
[ 1039.411051] Call Trace:
[ 1039.411056]  [<ffffffff815643d5>] ? cpuidle_enter_state+0x35/0xc0
[ 1039.411059]  [<ffffffff815644f9>] ? cpuidle_idle_call+0x99/0x160
[ 1039.411062]  [<ffffffff81013495>] ? arch_cpu_idle+0x5/0x30
[ 1039.411067]  [<ffffffff8107a309>] ? cpu_startup_entry+0x129/0x180
[ 1039.411068] Code: 89 d1 48 8d 86 38 e0 ff ff 0f 01 c8 0f ae f0 65 48 8b 04 25 70 c6 00 00 48 8b 80 38 e0 ff ff a8 08 75 08 b1 01 48 89 e8 0f 01 c9 <85> 1d 5e 72 64 00 75 0f 48 8d 74 24 04 bf 05 00 00 00 e8 6d c7 
[ 1039.411092] NMI backtrace for cpu 2
[ 1039.411095] CPU: 2 PID: 3503 Comm: z_null_int Tainted: P           O 3.10.36-oe64-ga7b1265 #4
[ 1039.411096] Hardware name: Supermicro X8DTS/X8DTS, BIOS 2.1     06/25/2012
[ 1039.411098] task: ffff88032c72ce00 ti: ffff8802ef4f4000 task.ti: ffff8802ef4f4000
[ 1039.411100] RIP: 0010:[<ffffffff813049f6>]  [<ffffffff813049f6>] delay_tsc+0x26/0x60
[ 1039.411106] RSP: 0018:ffff880333243de8  EFLAGS: 00000046
[ 1039.411108] RAX: 0000000000000002 RBX: 00000000c0d05c2e RCX: 0000000000000003
[ 1039.411109] RDX: 00000000c0d05c4e RSI: 0000000000000080 RDI: 0000000000030d49
[ 1039.411111] RBP: 0000000000030d49 R08: ffffffff81a208f0 R09: 000000000000fffe
[ 1039.411112] R10: ffff88033324e880 R11: 00000000000005bd R12: 0000000000000002
[ 1039.411114] R13: 0000000000000400 R14: 0000000000000002 R15: 0000000000000006
[ 1039.411116] FS:  0000000000000000(0000) GS:ffff880333240000(0000) knlGS:0000000000000000
[ 1039.411117] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1039.411119] CR2: 00007ff19ee92510 CR3: 0000000001958000 CR4: 00000000000007e0
[ 1039.411120] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1039.411122] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1039.411123] Stack:
[ 1039.411128]  0000000000001000 00000000000003e9 ffffffff81a208f0 ffffffff8102b8e2
[ 1039.411148]  0000000000000003 000000000000b03a ffffffff8102c5dd 0000000000000082
[ 1039.411166]  0000000000002710 ffffffff81a20900 ffffffff81984900 ffffffff81984840
[ 1039.411186] Call Trace:
[ 1039.411191]  <IRQ> 

[ 1039.411203]  [<ffffffff8102b8e2>] ? native_safe_apic_wait_icr_idle+0x22/0x50
[ 1039.411213]  [<ffffffff8102c5dd>] ? default_send_IPI_mask_sequence_phys+0xcd/0xf0
[ 1039.411224]  [<ffffffff8102c742>] ? arch_trigger_all_cpu_backtrace+0x52/0x90
[ 1039.411235]  [<ffffffff810adf0d>] ? rcu_check_callbacks+0x2ed/0x580
[ 1039.411240]  [<ffffffff8105052b>] ? update_process_times+0x3b/0x70
[ 1039.411244]  [<ffffffff81082744>] ? tick_sched_timer+0x34/0x50
[ 1039.411247]  [<ffffffff8106509c>] ? __run_hrtimer.isra.32+0x3c/0xc0
[ 1039.411249]  [<ffffffff810656bf>] ? hrtimer_interrupt+0xef/0x240
[ 1039.411252]  [<ffffffff8102ba02>] ? smp_apic_timer_interrupt+0x62/0xa0
[ 1039.411255]  [<ffffffff8169985d>] ? apic_timer_interrupt+0x6d/0x80
[ 1039.411256]  <EOI> 

[ 1039.411290]  [<ffffffffa0454bdf>] ? zio_execute+0xff/0x190 [zfs]
[ 1039.411296]  [<ffffffffa0310110>] ? taskq_thread+0x220/0x470 [spl]
[ 1039.411300]  [<ffffffff8106cf80>] ? wake_up_process+0x50/0x50
[ 1039.411305]  [<ffffffffa030fef0>] ? taskq_cancel_id+0x110/0x110 [spl]
[ 1039.411309]  [<ffffffff81061e12>] ? kthread+0xc2/0xd0
[ 1039.411312]  [<ffffffff81060000>] ? add_sysfs_param.isra.7+0x150/0x1a0
[ 1039.411315]  [<ffffffff81061d50>] ? kthread_create_on_node+0x110/0x110
[ 1039.411318]  [<ffffffff81698bbc>] ? ret_from_fork+0x7c/0xb0
[ 1039.411321]  [<ffffffff81061d50>] ? kthread_create_on_node+0x110/0x110
[ 1039.411322] Code: 00 00 00 00 00 41 54 55 89 fd 53 65 44 8b 24 25 34 b0 00 00 66 66 90 0f ae e8 e8 f6 d1 d0 ff 66 90 89 c3 66 90 66 66 90 0f ae e8 <e8> e5 d1 d0 ff 66 90 89 c2 29 d8 39 e8 73 1b f3 90 65 8b 04 25 
[ 1039.411416] NMI backtrace for cpu 3
[ 1039.411420] CPU: 3 PID: 0 Comm: swapper/3 Tainted: P           O 3.10.36-oe64-ga7b1265 #4
[ 1039.411422] Hardware name: Supermicro X8DTS/X8DTS, BIOS 2.1     06/25/2012
[ 1039.411424] task: ffff88033109c780 ti: ffff8803310ca000 task.ti: ffff8803310ca000
[ 1039.411425] RIP: 0010:[<ffffffff81353b4c>]  [<ffffffff81353b4c>] intel_idle+0xcc/0x132
[ 1039.411431] RSP: 0018:ffff8803310cbe88  EFLAGS: 00000046
[ 1039.411433] RAX: 0000000000000020 RBX: 0000000000000008 RCX: 0000000000000001
[ 1039.411434] RDX: 0000000000000000 RSI: ffff8803310cbfd8 RDI: 0000000000000046
[ 1039.411436] RBP: 0000000000000020 R08: 0000000000d950ab R09: 000000001966ad03
[ 1039.411438] R10: 0000020abff35118 R11: 00000000000001de R12: 0000000000000003
[ 1039.411439] R13: 0000000000000004 R14: 0000000000000004 R15: ffff8803310cbfd8
[ 1039.411441] FS:  0000000000000000(0000) GS:ffff880333260000(0000) knlGS:0000000000000000
[ 1039.411443] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 1039.411444] CR2: 00007fa79ec9255b CR3: 0000000001958000 CR4: 00000000000007e0
[ 1039.411446] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1039.411447] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1039.411448] Stack:
[ 1039.411449]  000000038107a90c ffff880333279000 ffff8803310cbed8 ffffffff8199adc0
[ 1039.411452]  000000f1f821ee85 ffffffff815643d5 ffff880333279000 ffffffff8199adc0
[ 1039.411454]  0000000000000000 ffff8803310cbfd8 0000000000000004 ffffffff815644f9
[ 1039.411457] Call Trace:
[ 1039.411461]  [<ffffffff815643d5>] ? cpuidle_enter_state+0x35/0xc0
[ 1039.411463]  [<ffffffff815644f9>] ? cpuidle_idle_call+0x99/0x160
[ 1039.411467]  [<ffffffff81013495>] ? arch_cpu_idle+0x5/0x30
[ 1039.411470]  [<ffffffff8107a309>] ? cpu_startup_entry+0x129/0x180
[ 1039.411471] Code: 89 d1 48 8d 86 38 e0 ff ff 0f 01 c8 0f ae f0 65 48 8b 04 25 70 c6 00 00 48 8b 80 38 e0 ff ff a8 08 75 08 b1 01 48 89 e8 0f 01 c9 <85> 1d 5e 72 64 00 75 0f 48 8d 74 24 04 bf 05 00 00 00 e8 6d c7 

@behlendorf
Copy link
Contributor

@ab-oe thanks for posting the debugging. This definitely looks like a duplicate of #3652, and it's clear that the z_null_int thread is getting blocked spinning on the taskq spin lock. Thanks for bisecting the change, that's helpful. Have you tried reverting b39c22b and setting spl_taskq_thread_dynamic to 0. Does it resolve the issue?

@ab-oe
Copy link
Contributor Author

ab-oe commented Sep 25, 2015

@behlendorf yes it works with reverted b39c22b it works even if spl_taskq_thread_dynamic is set to 1. I haven't found the commit that causes immediate system hang when disk is removed yet.

@behlendorf
Copy link
Contributor

Fix proposed in #3833.

behlendorf added a commit that referenced this issue Sep 25, 2015
Commit b39c22b set the READ_SYNC and WRITE_SYNC flags for a bio
based on the ZIO_PRIORITY_* flag passed in.  This had the unnoticed
side-effect of making the vdev_disk_io_start() synchronous for
certain I/Os.

This in turn resulted in vdev_disk_io_start() being able to
re-dispatch zio's which would result in a RCU stalls when a disk
was removed from the system.  Additionally, this could negatively
impact performance and explains the performance regressions reported
in both #3829 and #3780.

This patch resolves the issue by making the blocking behavior
dependent on a 'wait' flag being passed rather than overloading
the passed bio flags.

Finally, the WRITE_SYNC and READ_SYNC behavior is restricted to
non-rotational devices where there is no benefit to queuing to
aggregate the I/O.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #3652
Issue #3780
Issue #3785
Issue #3817
Issue #3821
Issue #3829
Issue #3832
Issue #3870
@behlendorf
Copy link
Contributor

Resolved by 5592404 which will be cherry-picked in to 0.6.5.2 release.

@ab-oe
Copy link
Contributor Author

ab-oe commented Sep 28, 2015

@behlendorf thank you. The 5592404 fixes this issue.
There is another issue #3577 I tested it lately on ZFS version with b39c22b and it seems that WRITE_SYNC resolved it and now it is introduced again.

behlendorf added a commit that referenced this issue Sep 30, 2015
Commit b39c22b set the READ_SYNC and WRITE_SYNC flags for a bio
based on the ZIO_PRIORITY_* flag passed in.  This had the unnoticed
side-effect of making the vdev_disk_io_start() synchronous for
certain I/Os.

This in turn resulted in vdev_disk_io_start() being able to
re-dispatch zio's which would result in a RCU stalls when a disk
was removed from the system.  Additionally, this could negatively
impact performance and explains the performance regressions reported
in both #3829 and #3780.

This patch resolves the issue by making the blocking behavior
dependent on a 'wait' flag being passed rather than overloading
the passed bio flags.

Finally, the WRITE_SYNC and READ_SYNC behavior is restricted to
non-rotational devices where there is no benefit to queuing to
aggregate the I/O.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #3652
Issue #3780
Issue #3785
Issue #3817
Issue #3821
Issue #3829
Issue #3832
Issue #3870
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants