Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

task txg_sync blocked for more than 120 seconds on 0.6.5.3 #3952

Closed
stevenj opened this issue Oct 26, 2015 · 5 comments
Closed

task txg_sync blocked for more than 120 seconds on 0.6.5.3 #3952

stevenj opened this issue Oct 26, 2015 · 5 comments

Comments

@stevenj
Copy link

stevenj commented Oct 26, 2015

task txg_sync blocked for more than 120 seconds

I dont know if its because of the latest ZFS release, but now my Zpools are painfully slow, Booting takes in the order of minutes. I am also seeing task "txg_sync:1625 blocked for more than 120 seconds." periodically inside dmesg and also during boot.

This is on Ubuntu 15.04. My Kernel : Linux version 3.19.0-31-generic (buildd@lcy01-07) (gcc version 4.9.2 (Ubuntu 4.9.2-10ubuntu13) ) #36-Ubuntu SMP Wed Oct 7 15:04:02 UTC 2015

I believe this may be the same problem, or related to #3950 because I also have the issue of very very slow read speeds. This issue was not a problem before, but seems to have occurred recently (Possibly after the last upgrade of ZFS to V0.6.5.3-1).

This is the first time it happens during Boot. I have tried rebooting, and it does this all the time now.

[   59.004566] zavl: module license 'CDDL' taints kernel.
[   59.004569] Disabling lock debugging due to kernel taint
[   59.004582] zavl: module verification failed: signature and/or  required key missing - tainting kernel
[   59.006984] SPL: Loaded module v0.6.5.3-1~vivid
[   59.023353] ZFS: Loaded module v0.6.5.3-1~vivid, ZFS pool version 5000, ZFS filesystem version 5
[   68.632938] SPL: using hostid 0x007f0101
[  480.115322] INFO: task txg_sync:1625 blocked for more than 120 seconds.
[  480.115392]       Tainted: P           OE  3.19.0-31-generic #36-Ubuntu
[  480.115452] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  480.115514] txg_sync        D ffff88040b78fb38     0  1625      2 0x00000000
[  480.115518]  ffff88040b78fb38 ffff88040b73b110 0000000000014240 ffff88040b78ffd8
[  480.115520]  0000000000014240 ffff88042c70a740 ffff88040b73b110 ffff88042b6263c0
[  480.115522]  ffff88043ed94b38 ffff880402eb0be8 ffff880402eb0c28 ffff880402eb0c10
[  480.115524] Call Trace:
[  480.115531]  [<ffffffff817c7920>] io_schedule+0xa0/0x130
[  480.115538]  [<ffffffffc0236d0f>] cv_wait_common+0x9f/0x120 [spl]
[  480.115541]  [<ffffffff810b76a0>] ? wait_woken+0x90/0x90
[  480.115546]  [<ffffffffc0236de8>] __cv_wait_io+0x18/0x20 [spl]
[  480.115577]  [<ffffffffc04080b3>] zio_wait+0x123/0x210 [zfs]
[  480.115593]  [<ffffffffc03907a1>] dsl_pool_sync+0xb1/0x460 [zfs]
[  480.115612]  [<ffffffffc03b7560>] ? spa_lookup+0x60/0x60 [zfs]
[  480.115630]  [<ffffffffc03abc30>] spa_sync+0x360/0xb40 [zfs]
[  480.115632]  [<ffffffff810b6ff5>] ? __wake_up_common+0x55/0x90
[  480.115635]  [<ffffffff8101e169>] ? read_tsc+0x9/0x10
[  480.115655]  [<ffffffffc03bd98c>] txg_sync_thread+0x3cc/0x6a0 [zfs]
[  480.115657]  [<ffffffff810b194e>] ? pick_next_task_fair+0x1ae/0x8b0
[  480.115677]  [<ffffffffc03bd5c0>] ? txg_delay+0x160/0x160 [zfs]
[  480.115680]  [<ffffffffc0231fd1>] thread_generic_wrapper+0x71/0x80 [spl]
[  480.115684]  [<ffffffffc0231f60>] ? __thread_exit+0x20/0x20 [spl]
[  480.115686]  [<ffffffff81095959>] kthread+0xc9/0xe0
[  480.115689]  [<ffffffff81095890>] ? kthread_create_on_node+0x1c0/0x1c0
[  480.115691]  [<ffffffff817cbe98>] ret_from_fork+0x58/0x90
[  480.115693]  [<ffffffff81095890>] ? kthread_create_on_node+0x1c0/0x1c0

And here it is again, happening later on. It happens regularly.

[ 3118.527371] INFO: task txg_sync:1625 blocked for more than 120 seconds.
[ 3118.527390]       Tainted: P           OE  3.19.0-31-generic #36-Ubuntu
[ 3118.527393] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3118.527397] txg_sync        D ffff88040b78fb38     0  1625      2 0x00000000
[ 3118.527405]  ffff88040b78fb38 ffff88040b73b110 0000000000014240 ffff88040b78ffd8
[ 3118.527412]  0000000000014240 ffffffff81c1a580 ffff88040b73b110 ffff88042b6263c0
[ 3118.527418]  ffff88043ec14b38 ffff88020d8a14a8 ffff88020d8a14e8 ffff88020d8a14d0
[ 3118.527424] Call Trace:
[ 3118.527439]  [<ffffffff817c7920>] io_schedule+0xa0/0x130
[ 3118.527473]  [<ffffffffc0236d0f>] cv_wait_common+0x9f/0x120 [spl]
[ 3118.527481]  [<ffffffff810b76a0>] ? wait_woken+0x90/0x90
[ 3118.527499]  [<ffffffffc0236de8>] __cv_wait_io+0x18/0x20 [spl]
[ 3118.527582]  [<ffffffffc04080b3>] zio_wait+0x123/0x210 [zfs]
[ 3118.527633]  [<ffffffffc03907a1>] dsl_pool_sync+0xb1/0x460 [zfs]
[ 3118.527694]  [<ffffffffc03b7560>] ? spa_lookup+0x60/0x60 [zfs]
[ 3118.527754]  [<ffffffffc03abc30>] spa_sync+0x360/0xb40 [zfs]
[ 3118.527760]  [<ffffffff810b6ff5>] ? __wake_up_common+0x55/0x90
[ 3118.527768]  [<ffffffff8101e169>] ? read_tsc+0x9/0x10
[ 3118.527831]  [<ffffffffc03bd98c>] txg_sync_thread+0x3cc/0x6a0 [zfs]
[ 3118.527838]  [<ffffffff810b194e>] ? pick_next_task_fair+0x1ae/0x8b0
[ 3118.527902]  [<ffffffffc03bd5c0>] ? txg_delay+0x160/0x160 [zfs]
[ 3118.527917]  [<ffffffffc0231fd1>] thread_generic_wrapper+0x71/0x80 [spl]
[ 3118.527932]  [<ffffffffc0231f60>] ? __thread_exit+0x20/0x20 [spl]
[ 3118.527939]  [<ffffffff81095959>] kthread+0xc9/0xe0
[ 3118.527946]  [<ffffffff81095890>] ? kthread_create_on_node+0x1c0/0x1c0
[ 3118.527951]  [<ffffffff817cbe98>] ret_from_fork+0x58/0x90
[ 3118.527958]  [<ffffffff81095890>] ? kthread_create_on_node+0x1c0/0x1c0
@stevenj stevenj changed the title task txg_sync blocked for more than 120 seconds. task txg_sync blocked for more than 120 seconds. on 0.6.5.3 Oct 26, 2015
@stevenj stevenj changed the title task txg_sync blocked for more than 120 seconds. on 0.6.5.3 task txg_sync blocked for more than 120 seconds on 0.6.5.3 Oct 26, 2015
@stangri
Copy link

stangri commented Jan 6, 2016

Same problem on 14.04 LTS (Linux 3.13.0-74-generic #118-Ubuntu SMP Thu Dec 17 22:52:10 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux). When it happens the ZFS read operations take very long time. And it's not just txg_sync, but any process on the server accessing ZFS pool it seems.

[ 1321.169300] INFO: task txg_sync:1443 blocked for more than 120 seconds.
[ 1321.169442]       Tainted: P           OX 3.13.0-74-generic #118-Ubuntu
[ 1321.169485] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1321.169534] txg_sync        D ffff8802bcd13180     0  1443      2 0x00000000
[ 1321.169539]  ffff8800ba487bc8 0000000000000046 ffff8802ac111800 ffff8800ba487fd8
[ 1321.169544]  0000000000013180 0000000000013180 ffff8802ac111800 ffff8802bcd13a18
[ 1321.169546]  ffff88026f3e3c08 ffff88026f3e3c48 ffff88026f3e3c30 0000000000000001
[ 1321.169549] Call Trace:
[ 1321.169557]  [<ffffffff8172929d>] io_schedule+0x9d/0x140
[ 1321.169635]  [<ffffffffa00a0b0f>] cv_wait_common+0x9f/0x120 [spl]
[ 1321.169645]  [<ffffffff810ab460>] ? prepare_to_wait_event+0x100/0x100
[ 1321.169654]  [<ffffffffa00a0be8>] __cv_wait_io+0x18/0x20 [spl]
[ 1321.169712]  [<ffffffffa01f9213>] zio_wait+0x123/0x210 [zfs]
[ 1321.169743]  [<ffffffffa01837c1>] dsl_pool_sync+0xb1/0x470 [zfs]
[ 1321.169775]  [<ffffffffa019e2b5>] spa_sync+0x365/0xb20 [zfs]
[ 1321.169778]  [<ffffffff810ab472>] ? autoremove_wake_function+0x12/0x40
[ 1321.169782]  [<ffffffff8101b709>] ? read_tsc+0x9/0x20
[ 1321.169816]  [<ffffffffa01b0639>] txg_sync_thread+0x3b9/0x620 [zfs]
[ 1321.169849]  [<ffffffffa01b0280>] ? txg_quiesce_thread+0x3f0/0x3f0 [zfs]
[ 1321.169856]  [<ffffffffa009be81>] thread_generic_wrapper+0x71/0x80 [spl]
[ 1321.169862]  [<ffffffffa009be10>] ? __thread_exit+0x20/0x20 [spl]
[ 1321.169865]  [<ffffffff8108b8a2>] kthread+0xd2/0xf0
[ 1321.169867]  [<ffffffff8108b7d0>] ? kthread_create_on_node+0x1c0/0x1c0
[ 1321.169871]  [<ffffffff817356a8>] ret_from_fork+0x58/0x90
[ 1321.169874]  [<ffffffff8108b7d0>] ? kthread_create_on_node+0x1c0/0x1c0
[ 1680.995172] INFO: task Plex Media Serv:2645 blocked for more than 120 seconds.
[ 1680.995233]       Tainted: P           OX 3.13.0-74-generic #118-Ubuntu
[ 1680.995274] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1680.995325] Plex Media Serv D ffff8802bcc13180     0  2645   2270 0x00000000
[ 1680.995330]  ffff880295397c00 0000000000000082 ffff8802ac0fc800 ffff880295397fd8
[ 1680.995334]  0000000000013180 0000000000013180 ffff8802ac0fc800 ffff8802bcc13a18
[ 1680.995337]  ffff8802641f6a88 ffff8802641f6ac8 ffff8802641f6ab0 0000000000000001
[ 1680.995340] Call Trace:
[ 1680.995347]  [<ffffffff8172929d>] io_schedule+0x9d/0x140
[ 1680.995365]  [<ffffffffa00a0b0f>] cv_wait_common+0x9f/0x120 [spl]
[ 1680.995370]  [<ffffffff810ab460>] ? prepare_to_wait_event+0x100/0x100
[ 1680.995378]  [<ffffffffa00a0be8>] __cv_wait_io+0x18/0x20 [spl]
[ 1680.995432]  [<ffffffffa01f9213>] zio_wait+0x123/0x210 [zfs]
[ 1680.995458]  [<ffffffffa015cc8f>] dmu_buf_hold_array_by_dnode+0x14f/0x4d0 [zfs]
[ 1680.995485]  [<ffffffffa015d853>] dmu_read_uio_dnode+0x43/0xe0 [zfs]
[ 1680.995511]  [<ffffffffa015e827>] dmu_read_uio_dbuf+0x47/0x60 [zfs]
[ 1680.995548]  [<ffffffffa01e30c6>] zfs_read+0x286/0x400 [zfs]
[ 1680.995584]  [<ffffffffa01ff4b1>] zpl_read_common_iovec.constprop.9+0x71/0xc0 [zfs]
[ 1680.995619]  [<ffffffffa01ff5ff>] zpl_read+0x6f/0x90 [zfs]
[ 1680.995623]  [<ffffffff811be395>] vfs_read+0x95/0x160
[ 1680.995626]  [<ffffffff811beea9>] SyS_read+0x49/0xa0
[ 1680.995629]  [<ffffffff8173575d>] system_call_fastpath+0x1a/0x1f
[ 1989.553640] init: gateone main process (1706) killed by TERM signal
[ 2640.537339] INFO: task Plex Media Serv:2645 blocked for more than 120 seconds.
[ 2640.537400]       Tainted: P           OX 3.13.0-74-generic #118-Ubuntu
[ 2640.537776] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2640.538100] Plex Media Serv D ffff8802bcc13180     0  2645   2270 0x00000000
[ 2640.538106]  ffff880295397c48 0000000000000082 ffff8802ac0fc800 ffff880295397fd8
[ 2640.538110]  0000000000013180 0000000000013180 ffff8802ac0fc800 ffff88025dc868c0
[ 2640.538113]  ffff88025dc86870 ffff88025dc868c8 ffff88025dc86898 0000000000000000
[ 2640.538116] Call Trace:
[ 2640.538124]  [<ffffffff81728f79>] schedule+0x29/0x70
[ 2640.538141]  [<ffffffffa00a0b55>] cv_wait_common+0xe5/0x120 [spl]
[ 2640.538147]  [<ffffffff810ab460>] ? prepare_to_wait_event+0x100/0x100
[ 2640.538154]  [<ffffffffa00a0ba5>] __cv_wait+0x15/0x20 [spl]
[ 2640.538195]  [<ffffffffa015cd03>] dmu_buf_hold_array_by_dnode+0x1c3/0x4d0 [zfs]
[ 2640.538223]  [<ffffffffa015d853>] dmu_read_uio_dnode+0x43/0xe0 [zfs]
[ 2640.538249]  [<ffffffffa015e827>] dmu_read_uio_dbuf+0x47/0x60 [zfs]
[ 2640.538290]  [<ffffffffa01e30c6>] zfs_read+0x286/0x400 [zfs]
[ 2640.538327]  [<ffffffffa01ff4b1>] zpl_read_common_iovec.constprop.9+0x71/0xc0 [zfs]
[ 2640.538362]  [<ffffffffa01ff5ff>] zpl_read+0x6f/0x90 [zfs]
[ 2640.538386]  [<ffffffff811be395>] vfs_read+0x95/0x160
[ 2640.538390]  [<ffffffff811beea9>] SyS_read+0x49/0xa0
[ 2640.538393]  [<ffffffff8173575d>] system_call_fastpath+0x1a/0x1f
[ 2640.538396] INFO: task Plex Media Serv:2646 blocked for more than 120 seconds.
[ 2640.538735]       Tainted: P           OX 3.13.0-74-generic #118-Ubuntu
[ 2640.538995] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2640.539343] Plex Media Serv D ffff8802bcd13180     0  2646   2270 0x00000000
[ 2640.539347]  ffff8802953c5c00 0000000000000082 ffff8802ac0fe000 ffff8802953c5fd8
[ 2640.539350]  0000000000013180 0000000000013180 ffff8802ac0fe000 ffff8802bcd13a18
[ 2640.539353]  ffff88025d6b7c08 ffff88025d6b7c48 ffff88025d6b7c30 0000000000000001
[ 2640.539355] Call Trace:
[ 2640.539359]  [<ffffffff8172929d>] io_schedule+0x9d/0x140
[ 2640.539368]  [<ffffffffa00a0b0f>] cv_wait_common+0x9f/0x120 [spl]
[ 2640.539371]  [<ffffffff810ab460>] ? prepare_to_wait_event+0x100/0x100
[ 2640.539378]  [<ffffffffa00a0be8>] __cv_wait_io+0x18/0x20 [spl]
[ 2640.539417]  [<ffffffffa01f9213>] zio_wait+0x123/0x210 [zfs]
[ 2640.539443]  [<ffffffffa015cc8f>] dmu_buf_hold_array_by_dnode+0x14f/0x4d0 [zfs]
[ 2640.539469]  [<ffffffffa015d853>] dmu_read_uio_dnode+0x43/0xe0 [zfs]
[ 2640.539495]  [<ffffffffa015e827>] dmu_read_uio_dbuf+0x47/0x60 [zfs]
[ 2640.539530]  [<ffffffffa01e30c6>] zfs_read+0x286/0x400 [zfs]
[ 2640.539565]  [<ffffffffa01ff4b1>] zpl_read_common_iovec.constprop.9+0x71/0xc0 [zfs]
[ 2640.539601]  [<ffffffffa01ff5ff>] zpl_read+0x6f/0x90 [zfs]
[ 2640.539604]  [<ffffffff811be395>] vfs_read+0x95/0x160
[ 2640.539606]  [<ffffffff811beea9>] SyS_read+0x49/0xa0
[ 2640.539609]  [<ffffffff8173575d>] system_call_fastpath+0x1a/0x1f
[ 3240.280943] INFO: task txg_sync:1443 blocked for more than 120 seconds.
[ 3240.281003]       Tainted: P           OX 3.13.0-74-generic #118-Ubuntu
[ 3240.281486] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3240.281782] txg_sync        D ffff8802bcd13180     0  1443      2 0x00000000
[ 3240.281787]  ffff8800ba487bc8 0000000000000046 ffff8802ac111800 ffff8800ba487fd8
[ 3240.281792]  0000000000013180 0000000000013180 ffff8802ac111800 ffff8802bcd13a18
[ 3240.281794]  ffff880252da3c08 ffff880252da3c48 ffff880252da3c30 0000000000000001
[ 3240.281797] Call Trace:
[ 3240.281805]  [<ffffffff8172929d>] io_schedule+0x9d/0x140
[ 3240.281823]  [<ffffffffa00a0b0f>] cv_wait_common+0x9f/0x120 [spl]
[ 3240.281829]  [<ffffffff810ab460>] ? prepare_to_wait_event+0x100/0x100
[ 3240.281837]  [<ffffffffa00a0be8>] __cv_wait_io+0x18/0x20 [spl]
[ 3240.281891]  [<ffffffffa01f9213>] zio_wait+0x123/0x210 [zfs]
[ 3240.281921]  [<ffffffffa01837c1>] dsl_pool_sync+0xb1/0x470 [zfs]
[ 3240.281954]  [<ffffffffa019e2b5>] spa_sync+0x365/0xb20 [zfs]
[ 3240.281957]  [<ffffffff810ab472>] ? autoremove_wake_function+0x12/0x40
[ 3240.281961]  [<ffffffff8101b709>] ? read_tsc+0x9/0x20
[ 3240.281995]  [<ffffffffa01b0639>] txg_sync_thread+0x3b9/0x620 [zfs]
[ 3240.282028]  [<ffffffffa01b0280>] ? txg_quiesce_thread+0x3f0/0x3f0 [zfs]
[ 3240.282035]  [<ffffffffa009be81>] thread_generic_wrapper+0x71/0x80 [spl]
[ 3240.282041]  [<ffffffffa009be10>] ? __thread_exit+0x20/0x20 [spl]
[ 3240.282044]  [<ffffffff8108b8a2>] kthread+0xd2/0xf0
[ 3240.282047]  [<ffffffff8108b7d0>] ? kthread_create_on_node+0x1c0/0x1c0
[ 3240.282050]  [<ffffffff817356a8>] ret_from_fork+0x58/0x90
[ 3240.282053]  [<ffffffff8108b7d0>] ? kthread_create_on_node+0x1c0/0x1c0
[ 4799.569431] INFO: task python:23350 blocked for more than 120 seconds.
[ 4799.569648]       Tainted: P           OX 3.13.0-74-generic #118-Ubuntu
[ 4799.569963] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4799.570344] python          D ffff8802bcd13180     0 23350   2652 0x00000000
[ 4799.570350]  ffff88024f865c00 0000000000000086 ffff8802ade14800 ffff88024f865fd8
[ 4799.570355]  0000000000013180 0000000000013180 ffff8802ade14800 ffff8802bcd13a18
[ 4799.570358]  ffff88008ee0cbe8 ffff88008ee0cc28 ffff88008ee0cc10 0000000000000001
[ 4799.570381] Call Trace:
[ 4799.570406]  [<ffffffff8172929d>] io_schedule+0x9d/0x140
[ 4799.570428]  [<ffffffffa00a0b0f>] cv_wait_common+0x9f/0x120 [spl]
[ 4799.570434]  [<ffffffff810ab460>] ? prepare_to_wait_event+0x100/0x100
[ 4799.570441]  [<ffffffffa00a0be8>] __cv_wait_io+0x18/0x20 [spl]
[ 4799.570497]  [<ffffffffa01f9213>] zio_wait+0x123/0x210 [zfs]
[ 4799.570523]  [<ffffffffa015cc8f>] dmu_buf_hold_array_by_dnode+0x14f/0x4d0 [zfs]
[ 4799.570549]  [<ffffffffa015d853>] dmu_read_uio_dnode+0x43/0xe0 [zfs]
[ 4799.570576]  [<ffffffffa015e827>] dmu_read_uio_dbuf+0x47/0x60 [zfs]
[ 4799.570611]  [<ffffffffa01e2f7d>] zfs_read+0x13d/0x400 [zfs]
[ 4799.570648]  [<ffffffffa01ff4b1>] zpl_read_common_iovec.constprop.9+0x71/0xc0 [zfs]
[ 4799.570683]  [<ffffffffa01ff5ff>] zpl_read+0x6f/0x90 [zfs]
[ 4799.570688]  [<ffffffff811be395>] vfs_read+0x95/0x160
[ 4799.570690]  [<ffffffff811beea9>] SyS_read+0x49/0xa0
[ 4799.570694]  [<ffffffff8173575d>] system_call_fastpath+0x1a/0x1f
[ 4919.514992] INFO: task Plex Media Serv:21840 blocked for more than 120 seconds.
[ 4919.515054]       Tainted: P           OX 3.13.0-74-generic #118-Ubuntu
[ 4919.515328] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4919.515554] Plex Media Serv D ffff8802bcc13180     0 21840   2270 0x00000000
[ 4919.515560]  ffff88008ef83c48 0000000000000082 ffff880245044800 ffff88008ef83fd8
[ 4919.515564]  0000000000013180 0000000000013180 ffff880245044800 ffff88024f860670
[ 4919.515567]  ffff88024f860620 ffff88024f860678 ffff88024f860648 0000000000000000
[ 4919.515570] Call Trace:
[ 4919.515578]  [<ffffffff81728f79>] schedule+0x29/0x70
[ 4919.515597]  [<ffffffffa00a0b55>] cv_wait_common+0xe5/0x120 [spl]
[ 4919.515602]  [<ffffffff810ab460>] ? prepare_to_wait_event+0x100/0x100
[ 4919.515610]  [<ffffffffa00a0ba5>] __cv_wait+0x15/0x20 [spl]
[ 4919.515656]  [<ffffffffa015cd03>] dmu_buf_hold_array_by_dnode+0x1c3/0x4d0 [zfs]
[ 4919.515683]  [<ffffffffa015d853>] dmu_read_uio_dnode+0x43/0xe0 [zfs]
[ 4919.515710]  [<ffffffffa015e827>] dmu_read_uio_dbuf+0x47/0x60 [zfs]
[ 4919.515757]  [<ffffffffa01e2f7d>] zfs_read+0x13d/0x400 [zfs]
[ 4919.515804]  [<ffffffffa01ff4b1>] zpl_read_common_iovec.constprop.9+0x71/0xc0 [zfs]
[ 4919.515839]  [<ffffffffa01ff5ff>] zpl_read+0x6f/0x90 [zfs]
[ 4919.515844]  [<ffffffff811be395>] vfs_read+0x95/0x160
[ 4919.515847]  [<ffffffff811beea9>] SyS_read+0x49/0xa0
[ 4919.515850]  [<ffffffff8173575d>] system_call_fastpath+0x1a/0x1f
[ 4919.515862] INFO: task python:23350 blocked for more than 120 seconds.
[ 4919.516096]       Tainted: P           OX 3.13.0-74-generic #118-Ubuntu
[ 4919.516365] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4919.516707] python          D ffff8802bcd13180     0 23350   2652 0x00000000
[ 4919.516711]  ffff88024f865c00 0000000000000086 ffff8802ade14800 ffff88024f865fd8
[ 4919.516714]  0000000000013180 0000000000013180 ffff8802ade14800 ffff8802bcd13a18
[ 4919.516717]  ffff88008ee0cbe8 ffff88008ee0cc28 ffff88008ee0cc10 0000000000000001
[ 4919.516719] Call Trace:
[ 4919.516723]  [<ffffffff8172929d>] io_schedule+0x9d/0x140
[ 4919.516732]  [<ffffffffa00a0b0f>] cv_wait_common+0x9f/0x120 [spl]
[ 4919.516736]  [<ffffffff810ab460>] ? prepare_to_wait_event+0x100/0x100
[ 4919.516743]  [<ffffffffa00a0be8>] __cv_wait_io+0x18/0x20 [spl]
[ 4919.516788]  [<ffffffffa01f9213>] zio_wait+0x123/0x210 [zfs]
[ 4919.516814]  [<ffffffffa015cc8f>] dmu_buf_hold_array_by_dnode+0x14f/0x4d0 [zfs]
[ 4919.516840]  [<ffffffffa015d853>] dmu_read_uio_dnode+0x43/0xe0 [zfs]
[ 4919.516866]  [<ffffffffa015e827>] dmu_read_uio_dbuf+0x47/0x60 [zfs]
[ 4919.516902]  [<ffffffffa01e2f7d>] zfs_read+0x13d/0x400 [zfs]
[ 4919.516937]  [<ffffffffa01ff4b1>] zpl_read_common_iovec.constprop.9+0x71/0xc0 [zfs]
[ 4919.516972]  [<ffffffffa01ff5ff>] zpl_read+0x6f/0x90 [zfs]
[ 4919.516975]  [<ffffffff811be395>] vfs_read+0x95/0x160
[ 4919.516978]  [<ffffffff811beea9>] SyS_read+0x49/0xa0
[ 4919.516980]  [<ffffffff8173575d>] system_call_fastpath+0x1a/0x1f
[ 4919.517008] INFO: task Plex New Transc:24600 blocked for more than 120 seconds.
[ 4919.517303]       Tainted: P           OX 3.13.0-74-generic #118-Ubuntu
[ 4919.517716] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4919.517981] Plex New Transc D ffff8802bcc13180     0 24600  23369 0x00000004
[ 4919.517985]  ffff88008c179c00 0000000000000082 ffff8802396e4800 ffff88008c179fd8
[ 4919.517988]  0000000000013180 0000000000013180 ffff8802396e4800 ffff8802bcc13a18
[ 4919.517990]  ffff880237ce4328 ffff880237ce4368 ffff880237ce4350 0000000000000001
[ 4919.517992] Call Trace:
[ 4919.517996]  [<ffffffff8172929d>] io_schedule+0x9d/0x140
[ 4919.518005]  [<ffffffffa00a0b0f>] cv_wait_common+0x9f/0x120 [spl]
[ 4919.518010]  [<ffffffff810ab460>] ? prepare_to_wait_event+0x100/0x100
[ 4919.518016]  [<ffffffffa00a0be8>] __cv_wait_io+0x18/0x20 [spl]
[ 4919.518056]  [<ffffffffa01f9213>] zio_wait+0x123/0x210 [zfs]
[ 4919.518082]  [<ffffffffa015cc8f>] dmu_buf_hold_array_by_dnode+0x14f/0x4d0 [zfs]
[ 4919.518108]  [<ffffffffa015d853>] dmu_read_uio_dnode+0x43/0xe0 [zfs]
[ 4919.518134]  [<ffffffffa015e827>] dmu_read_uio_dbuf+0x47/0x60 [zfs]
[ 4919.518169]  [<ffffffffa01e2f7d>] zfs_read+0x13d/0x400 [zfs]
[ 4919.518205]  [<ffffffffa01ff4b1>] zpl_read_common_iovec.constprop.9+0x71/0xc0 [zfs]
[ 4919.518240]  [<ffffffffa01ff5ff>] zpl_read+0x6f/0x90 [zfs]
[ 4919.518243]  [<ffffffff811be395>] vfs_read+0x95/0x160
[ 4919.518245]  [<ffffffff811beea9>] SyS_read+0x49/0xa0
[ 4919.518248]  [<ffffffff8173575d>] system_call_fastpath+0x1a/0x1f
[ 8997.630826] INFO: task txg_sync:1443 blocked for more than 120 seconds.
[ 8997.630946]       Tainted: P           OX 3.13.0-74-generic #118-Ubuntu
[ 8997.631165] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 8997.631424] txg_sync        D ffff8802bcc13180     0  1443      2 0x00000000
[ 8997.631430]  ffff8800ba487bc8 0000000000000046 ffff8802ac111800 ffff8800ba487fd8
[ 8997.631434]  0000000000013180 0000000000013180 ffff8802ac111800 ffff8802bcc13a18
[ 8997.631437]  ffff880077a314a8 ffff880077a314e8 ffff880077a314d0 0000000000000001
[ 8997.631440] Call Trace:
[ 8997.631449]  [<ffffffff8172929d>] io_schedule+0x9d/0x140
[ 8997.631468]  [<ffffffffa00a0b0f>] cv_wait_common+0x9f/0x120 [spl]
[ 8997.631473]  [<ffffffff810ab460>] ? prepare_to_wait_event+0x100/0x100
[ 8997.631481]  [<ffffffffa00a0be8>] __cv_wait_io+0x18/0x20 [spl]
[ 8997.631545]  [<ffffffffa01f9213>] zio_wait+0x123/0x210 [zfs]
[ 8997.631598]  [<ffffffffa01837c1>] dsl_pool_sync+0xb1/0x470 [zfs]
[ 8997.631630]  [<ffffffffa019e2b5>] spa_sync+0x365/0xb20 [zfs]
[ 8997.631634]  [<ffffffff810ab472>] ? autoremove_wake_function+0x12/0x40
[ 8997.631637]  [<ffffffff8101b709>] ? read_tsc+0x9/0x20
[ 8997.631671]  [<ffffffffa01b0639>] txg_sync_thread+0x3b9/0x620 [zfs]
[ 8997.631704]  [<ffffffffa01b0280>] ? txg_quiesce_thread+0x3f0/0x3f0 [zfs]
[ 8997.631711]  [<ffffffffa009be81>] thread_generic_wrapper+0x71/0x80 [spl]
[ 8997.631717]  [<ffffffffa009be10>] ? __thread_exit+0x20/0x20 [spl]
[ 8997.631720]  [<ffffffff8108b8a2>] kthread+0xd2/0xf0
[ 8997.631723]  [<ffffffff8108b7d0>] ? kthread_create_on_node+0x1c0/0x1c0
[ 8997.631726]  [<ffffffff817356a8>] ret_from_fork+0x58/0x90
[ 8997.631729]  [<ffffffff8108b7d0>] ? kthread_create_on_node+0x1c0/0x1c0

@STRML
Copy link

STRML commented Jul 1, 2016

I have this problem as well on a server I'm using on AWS to evaluate switching some of our data storage to ZFS. It's ZFS over EBS, which is potentially problematic as we can see.

We have the same txg_sync errors as above. In addition, I've seen:

[493320.284084] INFO: task arc_reclaim:355 blocked for more than 120 seconds.
[493320.290423]       Tainted: P         C OE  3.19.0-61-generic #69~14.04.1-Ubuntu
[493320.295910] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[493320.300814] arc_reclaim     D ffff8800ea3cfc08     0   355      2 0x00000000
[493320.300820]  ffff8800ea3cfc08 ffff8800354cbae0 0000000000013e80 ffff8800ea3cffd8
[493320.300823]  0000000000013e80 ffff8800354cb110 ffff8800354cbae0 ffff8800ea3cfbe8
[493320.300826]  ffff8800e95a81b0 ffff8800e95a81b4 ffff8800354cbae0 00000000ffffffff
[493320.300829] Call Trace:
[493320.300840]  [<ffffffff817b9179>] schedule_preempt_disabled+0x29/0x70
[493320.300843]  [<ffffffff817bae95>] __mutex_lock_slowpath+0x95/0x100
[493320.300888]  [<ffffffffc02e50b1>] ? arc_state_multilist_index_func+0x51/0x70 [zfs]
[493320.300892]  [<ffffffff817baf23>] mutex_lock+0x23/0x37
[493320.300916]  [<ffffffffc02e6362>] arc_hdr_realloc+0x92/0x230 [zfs]
[493320.300941]  [<ffffffffc02e965d>] arc_evict_state+0x70d/0x840 [zfs]
[493320.300967]  [<ffffffffc02e9837>] arc_adjust_impl.constprop.27+0x37/0x50 [zfs]
[493320.300991]  [<ffffffffc02e9ab6>] arc_adjust+0x266/0x480 [zfs]
[493320.301015]  [<ffffffffc02eb233>] arc_shrink+0xb3/0xc0 [zfs]
[493320.301040]  [<ffffffffc02eb468>] arc_reclaim_thread+0x228/0x230 [zfs]
[493320.301064]  [<ffffffffc02eb240>] ? arc_shrink+0xc0/0xc0 [zfs]
[493320.301074]  [<ffffffffc0204e91>] thread_generic_wrapper+0x71/0x80 [spl]
[493320.301081]  [<ffffffffc0204e20>] ? __thread_exit+0x20/0x20 [spl]
[493320.301086]  [<ffffffff81094ae9>] kthread+0xc9/0xe0
[493320.301089]  [<ffffffff81094a20>] ? kthread_create_on_node+0x1c0/0x1c0
[493320.301093]  [<ffffffff817bcf58>] ret_from_fork+0x58/0x90
[493320.301096]  [<ffffffff81094a20>] ? kthread_create_on_node+0x1c0/0x1c0
[493320.301127] INFO: task txg_sync:1516 blocked for more than 120 seconds.
[493320.305010]       Tainted: P         C OE  3.19.0-61-generic #69~14.04.1-Ubuntu
[493320.309580] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[493320.314466] txg_sync        D ffff8800e6567b48     0  1516      2 0x00000000
[493320.314471]  ffff8800e6567b48 ffff8800e7f313a0 0000000000013e80 ffff8800e6567fd8
[493320.314474]  0000000000013e80 ffff8800ea90ebf0 ffff8800e7f313a0 ffff88001bd98328
[493320.314477]  ffff8800efc34778 ffff88001bd98328 ffff88001bd98368 ffff88001bd98350
[493320.314480] Call Trace:
[493320.314488]  [<ffffffff817b8f70>] io_schedule+0xa0/0x130
[493320.314505]  [<ffffffffc0209b21>] cv_wait_common+0xa1/0x120 [spl]
[493320.314509]  [<ffffffff810b6310>] ? prepare_to_wait_event+0x110/0x110
[493320.314517]  [<ffffffffc0209bf8>] __cv_wait_io+0x18/0x20 [spl]
[493320.314561]  [<ffffffffc0398e03>] zio_wait+0x123/0x210 [zfs]
[493320.314593]  [<ffffffffc0321df1>] dsl_pool_sync+0xb1/0x460 [zfs]
[493320.314628]  [<ffffffffc03483e0>] ? spa_lookup+0x60/0x60 [zfs]
[493320.314661]  [<ffffffffc033ccd8>] spa_sync+0x378/0xb50 [zfs]
[493320.314664]  [<ffffffff810b5b48>] ? __wake_up_common+0x58/0x90
[493320.314668]  [<ffffffff8100b665>] ? xen_clocksource_get_cycles+0x15/0x20
[493320.314703]  [<ffffffffc034f20f>] txg_sync_thread+0x3bf/0x630 [zfs]
[493320.314737]  [<ffffffffc034ee50>] ? txg_quiesce_thread+0x3f0/0x3f0 [zfs]
[493320.314745]  [<ffffffffc0204e91>] thread_generic_wrapper+0x71/0x80 [spl]
[493320.314751]  [<ffffffffc0204e20>] ? __thread_exit+0x20/0x20 [spl]
[493320.314755]  [<ffffffff81094ae9>] kthread+0xc9/0xe0
[493320.314758]  [<ffffffff81094a20>] ? kthread_create_on_node+0x1c0/0x1c0
[493320.314762]  [<ffffffff817bcf58>] ret_from_fork+0x58/0x90
[493320.314765]  [<ffffffff81094a20>] ? kthread_create_on_node+0x1c0/0x1c0

(Apologies if you got an email with another comment; that was for our internal tracking issue)

@eniac111
Copy link

I have the same problem on a production server:

1200.556245] INFO: task sh:6471 blocked for more than 120 seconds.
[ 1200.564882] Tainted: P O 4.4.0-79-generic #100-Ubuntu
[ 1200.573637] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1200.590820] sh D ffff883f7e35f918 0 6471 6469 0x00000000
[ 1200.590830] ffff883f7e35f918 ffff883f7e35f9a0 ffff881fb2604600 ffff883facc81c00
[ 1200.590835] ffff883f7e360000 ffff881fbf856dc0 7fffffffffffffff ffffffff8183d150
[ 1200.590838] ffff883f7e35fa70 ffff883f7e35f930 ffffffff8183c955 0000000000000000
[ 1200.590843] Call Trace:
[ 1200.590858] [] ? bit_wait+0x60/0x60
[ 1200.590864] [] schedule+0x35/0x80
[ 1200.590868] [] schedule_timeout+0x1b5/0x270
[ 1200.590888] [] ? generic_make_request_checks+0x185/0x530
[ 1200.590898] [] ? mempool_alloc_slab+0x15/0x20
[ 1200.590909] [] ? ktime_get+0x3c/0xb0
[ 1200.590912] [] ? bit_wait+0x60/0x60
[ 1200.590917] [] io_schedule_timeout+0xa4/0x110
[ 1200.590920] [] bit_wait_io+0x1b/0x70
[ 1200.590923] [] __wait_on_bit+0x5d/0x90
[ 1200.590927] [] ? bit_wait+0x60/0x60
[ 1200.590930] [] out_of_line_wait_on_bit+0x82/0xb0
[ 1200.590938] [] ? autoremove_wake_function+0x40/0x40
[ 1200.590945] [] __wait_on_buffer+0x32/0x40
[ 1200.590954] [] ext4_find_entry+0x1dd/0x720
[ 1200.590963] [] ? __d_alloc+0x25/0x180
[ 1200.590972] [] ext4_lookup+0x7c/0x270
[ 1200.590984] [] lookup_real+0x1d/0x60
[ 1200.590993] [] path_openat+0xbd4/0x1330
[ 1200.591003] [] ? cpumask_any_but+0x2c/0x40
[ 1200.591015] [] ? wp_page_copy.isra.56+0x333/0x550
[ 1200.591018] [] do_filp_open+0x91/0x100
[ 1200.591025] [] ? apparmor_cred_prepare+0x2f/0x50
[ 1200.591031] [] ? security_prepare_creds+0x43/0x60
[ 1200.591040] [] do_open_execat+0x78/0x1d0
[ 1200.591047] [] do_execveat_common.isra.33+0x1aa/0x740
[ 1200.591052] [] SyS_execve+0x3a/0x50
[ 1200.591058] [] stub_execve+0x5/0x5
[ 1200.591067] [] ? entry_SYSCALL_64_fastpath+0x16/0x71

@dreamcat4
Copy link

are you guys all taking snapshots with cron (zfs-auto-snapshot) ?

@STRML
Copy link

STRML commented Jul 12, 2017

I was not at the time I reported this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants
@dreamcat4 @stevenj @eniac111 @STRML @stangri and others