Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

writing process hung at txg_wait_open #3645

Closed
jxiong opened this issue Jul 28, 2015 · 4 comments
Closed

writing process hung at txg_wait_open #3645

jxiong opened this issue Jul 28, 2015 · 4 comments
Milestone

Comments

@jxiong
Copy link
Contributor

jxiong commented Jul 28, 2015

I saw this issue when I was benchmarking ZFS performance. An iozone thread has been hung for a long time. txg_sync process is busy waiting. Please check the backtrace of both processes.

{noformat}
[10459.403319] iozone D ffff88082f353440 0 6714 1 0x00000004
[10459.403321] ffff880085e83b28 0000000000000082 ffff88055b805180 ffff880085e83fd8
[10459.403323] 0000000000013440 0000000000013440 ffff880803713d20 ffff8807f3701368
[10459.403326] ffff8807f3701220 ffff8807f3701370 ffff8807f3701248 0000000000000000
[10459.403328] Call Trace:
[10459.403330] [] schedule+0x29/0x70
[10459.403334] [] cv_wait_common+0xe5/0x120 [spl]
[10459.403337] [] ? prepare_to_wait_event+0x100/0x100
[10459.403341] [] __cv_wait+0x15/0x20 [spl]
[10459.403360] [] txg_wait_open+0x83/0xd0 [zfs]
[10459.403379] [] dmu_tx_wait+0x380/0x390 [zfs]
[10459.403387] [] ? mutex_lock+0x12/0x2f
[10459.403406] [] dmu_tx_assign+0x9a/0x510 [zfs]
[10459.403423] [] dmu_free_long_range+0x18c/0x240 [zfs]
[10459.403449] [] zfs_rmnode+0x5d/0x340 [zfs]
[10459.403473] [] zfs_zinactive+0x168/0x180 [zfs]
[10459.403494] [] zfs_inactive+0x60/0x200 [zfs]
[10459.403518] [] zpl_evict_inode+0x43/0x60 [zfs]
[10459.403521] [] evict+0xb4/0x180
[10459.403523] [] iput+0xf5/0x180
[10459.403525] [] do_unlinkat+0x193/0x2c0
[10459.403528] [] ? SYSC_newstat+0x2f/0x40
[10459.403530] [] SyS_unlink+0x16/0x20
[10459.403532] [] system_call_fastpath+0x1a/0x1f
{noformat}

and txg_sync:

{noformat}
[16932.253158] txg_sync R running task 0 26338 2 0x00000000
[16932.253166] ffff8807b1c77c38 ffff8807f659a3e8 ffffc900086ad7f0 000000000ccd7dca
[16932.253177] ffffffffc09adb20 ffffffffc09adb20 0000000000000000 000000000000c210
[16932.253185] 0000000000001000 0000000000000000 0000000000000000 ffff8807b1c77c68
[16932.253187] Call Trace:
[16932.253190] [] ? __kmalloc_node+0x1c9/0x2a0
[16932.253193] [] ? spl_kmem_zalloc+0xc0/0x170 [spl]
[16932.253197] [] ? spl_kmem_zalloc+0xc0/0x170 [spl]
[16932.253201] [] ? __cv_wait_io+0x18/0x20 [spl]
[16932.253226] [] ? zio_wait+0x123/0x210 [zfs]
[16932.253240] [] ? ddt_get_dedup_stats+0x3a/0x60 [zfs]
[16932.253243] [] ? mod_timer+0x12a/0x1e0
[16932.253260] [] ? dsl_pool_sync+0xb1/0x470 [zfs]
[16932.253267] [] ? spl_kmem_free+0x2a/0x40 [spl]
[16932.253292] [] ? spa_update_dspace+0x26/0x40 [zfs]
[16932.253315] [] ? spa_sync+0x3a2/0xb20 [zfs]
[16932.253321] [] ? autoremove_wake_function+0x12/0x40
[16932.253327] [] ? read_tsc+0x9/0x20
[16932.253350] [] ? txg_sync_thread+0x36b/0x630 [zfs]
[16932.253357] [] ? sched_clock+0x9/0x10
[16932.253380] [] ? txg_quiesce_thread+0x380/0x380 [zfs]
[16932.253388] [] ? thread_generic_wrapper+0x71/0x80 [spl]
[16932.253395] [] ? __thread_exit+0x20/0x20 [spl]
[16932.253402] [] ? kthread+0xd2/0xf0
[16932.253408] [] ? kthread_create_on_node+0x1c0/0x1c0
[16932.253413] [] ? ret_from_fork+0x7c/0xb0
[16932.253415] [] ? kthread_create_on_node+0x1c0/0x1c0
{noformat}

screen shot 2015-07-28 at 2 38 30 pm

@behlendorf
Copy link
Contributor

@jxiong have you been able to reproduce this with the latest code?

@behlendorf behlendorf added this to the 0.7.0 milestone Nov 4, 2015
@jxiong
Copy link
Contributor Author

jxiong commented Nov 4, 2015

I only saw it once when I was doing it last time. I haven't tried it on latest baseline yet, I will do that.

@behlendorf behlendorf modified the milestones: 0.8.0, 0.7.0 Mar 26, 2016
@fdr
Copy link

fdr commented May 24, 2016

I've managed to run across something very similar. The workload was a parallel bulk load to Postgres on EC2, with Amazon Linux 2016.03. The ZFS version is 0.6.5.7-1.el6. A postmaster process was using 100% system CPU right before the instance was auto-restarted by monitoring, and that means that it was unable to accept connections for a couple of minutes consecutively. Interesting other information is it is preceded by an OOM condition, though this system has a sysctl.conf that disables overcommit. Without further ado, the logs:

May 24 17:28:37 ip-172-31-55-208 kernel: [ 4803.913779] [ pid ]   uid  tgid total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4803.918620] [ 2310]     0  2310     2721       94      11       3        0         -1000 udevd
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4803.923526] [ 6796]     0  6796     2340      122       9       3        0             0 dhclient
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4803.928459] [ 6855]     0  6855    28022       68      25       3        0         -1000 auditd
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4803.933328] [ 6876]     0  6876    61847       72      23       3        0             0 rsyslogd
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4803.938381] [ 6890]     0  6890     3468       78      11       3        0             0 irqbalance
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4803.943471] [ 6899]     0  6899     1095       21       8       3        0             0 rngd
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4803.948297] [ 6917]    32  6917     8823       99      23       3        0             0 rpcbind
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4803.953395] [ 6938]    29  6938     9965      201      24       3        0             0 rpc.statd
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4803.958230] [ 6969]    81  6969     5448       58      16       3        0             0 dbus-daemon
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4803.963083] [ 7020]     0  7020     7835       62      20       3        0             0 zed
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4803.967636] [ 7055]    99  7055     3291       41      10       3        0             0 dnsmasq
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4803.972194] [ 7169]     0  7169    19460      205      41       3        0         -1000 sshd
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4803.976664] [ 7291]    38  7291     7322      143      18       3        0             0 ntpd
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4803.981176] [ 7377]     0  7377    22247      429      45       3        0             0 sendmail
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4803.989394] [ 7386]    51  7386    20112      367      41       3        0             0 sendmail
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4803.993958] [ 7398]     0  7398    29879      145      15       4        0             0 crond
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4803.998390] [ 7412]     0  7412     4267       39      13       3        0             0 atd
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.002896] [ 7492]     0  7492     2720       97      10       3        0         -1000 udevd
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.007442] [ 7499]     0  7499     2830      207      10       3        0         -1000 udevd
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.011840] [ 7670]     0  7670     1615       30       8       3        0             0 agetty
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.016591] [ 7672]     0  7672     1078       24       8       3        0             0 mingetty
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.021168] [ 7675]     0  7675     1078       24       8       3        0             0 mingetty
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.025730] [ 7678]     0  7678     1078       24       8       3        0             0 mingetty
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.030375] [ 7680]     0  7680     1078       24       7       3        0             0 mingetty
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.034942] [ 7682]     0  7682     1078       23       8       3        0             0 mingetty
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.039505] [ 7684]     0  7684     1078       24       8       3        0             0 mingetty
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.044220] [ 8636]     0  8636    29451      268      62       4        0             0 sshd
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.049072] [ 8638]   500  8638    29643      470      62       4        0             0 sshd
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.053854] [ 8639]   500  8639    28837       96      15       3        0             0 bash
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.058169] [ 9348]     0  9348    46016      155      47       3        0             0 sudo
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.062624] [ 9349]     0  9349    28870      144      14       3        0             0 bash
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.067074] [14296]    26 14296  4088198   101665     253       5        0             0 postmaster
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.071878] [14298]    26 14298    49334      301      47       3        0             0 postmaster
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.076560] [14300]    26 14300  4090188   825739    1680       8        0             0 postmaster
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.081445] [14301]    26 14301  4088243    31372     123       5        0             0 postmaster
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.086085] [14302]    26 14302  4088198     4420      60       4        0             0 postmaster
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.090669] [14303]    26 14303  4088339      643      59       5        0             0 postmaster
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.095259] [14304]    26 14304    49801      310      48       4        0             0 postmaster
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.100784] [14305]    26 14305    49836      341      48       4        0             0 postmaster
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.105494] [14306]    26 14306  4088229      337      51       4        0             0 postmaster
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.110183] [15432]    26 15432  4093319    42310    1103       8        0             0 postmaster
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.114796] [15433]    26 15433  4088581    22839     136       5        0             0 postmaster
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.119705] [15434]    26 15434  4093183    22827     171       5        0             0 postmaster
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.124520] [15435]    26 15435  4088577    22932     165       5        0             0 postmaster
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.129736] [15436]    26 15436  4096914   400685    1691       8        0             0 postmaster
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.135685] [15437]    26 15437  4094302   119586     564       8        0             0 postmaster
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.140870] [15438]    26 15438  4093223   171873     648       7        0             0 postmaster
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.145620] [15439]    26 15439  4102753   490134    1854       8        0             0 postmaster
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.150205] [22421]     0 22421    33775       95      24       3        0             0 zpool
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.154848] Out of memory: Kill process 14300 (postmaster) score 52 or sacrifice child
May 24 17:28:37 ip-172-31-55-208 kernel: [ 4804.158789] Killed process 14300 (postmaster) total-vm:16360752kB, anon-rss:9040kB, file-rss:3293916kB
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.420074] INFO: task kthreadd:2 blocked for more than 120 seconds.
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.423361]       Tainted: P           OE   4.4.10-22.54.amzn1.x86_64 #1
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.426716] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.430538] kthreadd        D ffff880f0e3eb6a0     0     2      0 0x00000000
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.434075]  ffff880f0e3eb6a0 ffff880f0dc30000 ffff880f0e379d40 ffff880f0e3ec000
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.437992]  ffff880f08b59220 ffff880f08b59370 ffff880f08b59248 0000000000000000
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.441925]  ffff880f0e3eb6b8 ffffffff814d99f5 ffff880f08b59368 ffff880f0e3eb718
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.445755] Call Trace:
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.446985]  [<ffffffff814d99f5>] schedule+0x35/0x80
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.449452]  [<ffffffffa02e274d>] cv_wait_common+0xed/0x120 [spl]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.452385]  [<ffffffff810ac670>] ? prepare_to_wait_event+0xf0/0xf0
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.455414]  [<ffffffffa02e2795>] __cv_wait+0x15/0x20 [spl]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.458113]  [<ffffffffa0b1065a>] txg_wait_open+0xba/0x100 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.461152]  [<ffffffffa0acf19f>] dmu_tx_wait+0x36f/0x380 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.464098]  [<ffffffffa0acf23e>] dmu_tx_assign+0x8e/0x4e0 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.467001]  [<ffffffffa0b492f3>] zfs_inactive+0xd3/0x230 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.469906]  [<ffffffffa0b5e603>] zpl_evict_inode+0x43/0x60 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.472905]  [<ffffffff811ebf6e>] evict+0xbe/0x1a0
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.475228]  [<ffffffff811ec086>] dispose_list+0x36/0x50
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.477962]  [<ffffffff811ed3cb>] prune_icache_sb+0x4b/0x60
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.480884]  [<ffffffff811d5d21>] super_cache_scan+0x141/0x190
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.483841]  [<ffffffff81170934>] shrink_slab.part.41+0x1e4/0x390
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.486853]  [<ffffffff811746a9>] shrink_zone+0x2a9/0x2c0
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.489548]  [<ffffffff81174a45>] do_try_to_free_pages+0x175/0x440
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.492512]  [<ffffffff81174dc5>] try_to_free_pages+0xb5/0x170
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.495293]  [<ffffffff81168a8a>] __alloc_pages_nodemask+0x53a/0xa60
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.498436]  [<ffffffff8100c345>] ? xen_clocksource_read+0x15/0x20
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.501416]  [<ffffffff8109fd3f>] ? update_curr+0xdf/0x170
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.504079]  [<ffffffff81169171>] alloc_kmem_pages_node+0x51/0xd0
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.506985]  [<ffffffff8106b63a>] copy_process+0x15a/0x1a60
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.509743]  [<ffffffff8109dfc1>] ? set_next_entity+0x1d1/0x710
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.512606]  [<ffffffff8109fb65>] ? pick_next_entity+0xa5/0x160
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.515409]  [<ffffffff81089fb0>] ? kthread_park+0x60/0x60
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.518100]  [<ffffffff8106d0b2>] _do_fork+0x82/0x300
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.520492]  [<ffffffff814d94bb>] ? __schedule+0x34b/0x850
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.523138]  [<ffffffff810b2c91>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.526943]  [<ffffffff8106d359>] kernel_thread+0x29/0x30
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.529592]  [<ffffffff8108abf4>] kthreadd+0x2d4/0x320
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.532119]  [<ffffffff8108a920>] ? kthread_create_on_cpu+0x60/0x60
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.535294]  [<ffffffff814dd6cf>] ret_from_fork+0x3f/0x70
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.538085]  [<ffffffff8108a920>] ? kthread_create_on_cpu+0x60/0x60
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.541395] INFO: task kswapd0:689 blocked for more than 120 seconds.
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.544870]       Tainted: P           OE   4.4.10-22.54.amzn1.x86_64 #1
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.548558] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.552586] kswapd0         D ffff880f0a7e7998     0   689      2 0x00000000
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.556253]  ffff880f0a7e7998 ffff880f0dc31d40 ffff880f0dfdd7c0 ffff880f0a7e8000
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.560344]  ffff880f08b59220 ffff880f08b59370 ffff880f08b59248 0000000000000000
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.564361]  ffff880f0a7e79b0 ffffffff814d99f5 ffff880f08b59368 ffff880f0a7e7a10
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.568389] Call Trace:
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.569701]  [<ffffffff814d99f5>] schedule+0x35/0x80
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.572271]  [<ffffffffa02e274d>] cv_wait_common+0xed/0x120 [spl]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.575350]  [<ffffffff810ac670>] ? prepare_to_wait_event+0xf0/0xf0
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.578673]  [<ffffffffa02e2795>] __cv_wait+0x15/0x20 [spl]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.581697]  [<ffffffffa0b1065a>] txg_wait_open+0xba/0x100 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.584768]  [<ffffffffa0acf19f>] dmu_tx_wait+0x36f/0x380 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.587788]  [<ffffffffa0acf23e>] dmu_tx_assign+0x8e/0x4e0 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.590910]  [<ffffffffa0b492f3>] zfs_inactive+0xd3/0x230 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.593953]  [<ffffffffa0b5e603>] zpl_evict_inode+0x43/0x60 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.597091]  [<ffffffff811ebf6e>] evict+0xbe/0x1a0
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.599599]  [<ffffffff811ec086>] dispose_list+0x36/0x50
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.602253]  [<ffffffff811ed3cb>] prune_icache_sb+0x4b/0x60
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.604943]  [<ffffffff811d5d21>] super_cache_scan+0x141/0x190
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.607832]  [<ffffffff81170934>] shrink_slab.part.41+0x1e4/0x390
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.610886]  [<ffffffff811746a9>] shrink_zone+0x2a9/0x2c0
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.613666]  [<ffffffff81175614>] kswapd+0x4b4/0x960
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.616304]  [<ffffffff81175160>] ? mem_cgroup_shrink_node_zone+0x190/0x190
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.619710]  [<ffffffff8108a079>] kthread+0xc9/0xe0
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.622223]  [<ffffffff81089fb0>] ? kthread_park+0x60/0x60
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.624860]  [<ffffffff814dd6cf>] ret_from_fork+0x3f/0x70
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.627484]  [<ffffffff81089fb0>] ? kthread_park+0x60/0x60
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.630240] INFO: task txg_sync:5378 blocked for more than 120 seconds.
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.633420]       Tainted: P           OE   4.4.10-22.54.amzn1.x86_64 #1
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.636628] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.640425] txg_sync        D ffff880efb98faf8     0  5378      2 0x00000000
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.643951]  ffff880efb98faf8 ffff880f0c0b0000 ffff880efb9157c0 ffff880efb990000
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.647793]  0000000000000000 7fffffffffffffff ffff880e0d977dd0 0000000000000001
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.651874]  ffff880efb98fb10 ffffffff814d99f5 ffff880f52015340 ffff880efb98fbb0
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.655825] Call Trace:
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.657084]  [<ffffffff814d99f5>] schedule+0x35/0x80
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.659497]  [<ffffffff814dc2d1>] schedule_timeout+0x231/0x2b0
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.662333]  [<ffffffff810b2c91>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.666161]  [<ffffffff8100c365>] ? xen_clocksource_get_cycles+0x15/0x20
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.669387]  [<ffffffff810d233a>] ? ktime_get+0x3a/0x90
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.671912]  [<ffffffff814d9106>] io_schedule_timeout+0xa6/0x110
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.674937]  [<ffffffffa02e270d>] cv_wait_common+0xad/0x120 [spl]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.677939]  [<ffffffff810ac670>] ? prepare_to_wait_event+0xf0/0xf0
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.681163]  [<ffffffffa02e27d8>] __cv_wait_io+0x18/0x20 [spl]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.684102]  [<ffffffffa0b5739e>] zio_wait+0x10e/0x1f0 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.686905]  [<ffffffffa0ae60e8>] dsl_pool_sync+0xa8/0x410 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.689957]  [<ffffffffa0afda38>] spa_sync+0x358/0xaf0 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.692865]  [<ffffffff810ac682>] ? autoremove_wake_function+0x12/0x40
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.696122]  [<ffffffff810b2c91>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.700154]  [<ffffffffa0b111aa>] txg_sync_thread+0x3aa/0x600 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.703204]  [<ffffffffa0b10e00>] ? txg_quiesce_thread+0x3e0/0x3e0 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.706457]  [<ffffffffa02ddca1>] thread_generic_wrapper+0x71/0x80 [spl]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.709634]  [<ffffffffa02ddc30>] ? __thread_exit+0x20/0x20 [spl]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.712590]  [<ffffffff8108a079>] kthread+0xc9/0xe0
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.714970]  [<ffffffff81089fb0>] ? kthread_park+0x60/0x60
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.717639]  [<ffffffff814dd6cf>] ret_from_fork+0x3f/0x70
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.720272]  [<ffffffff81089fb0>] ? kthread_park+0x60/0x60
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.722947] INFO: task postmaster:14300 blocked for more than 120 seconds.
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.726261]       Tainted: P           OE   4.4.10-22.54.amzn1.x86_64 #1
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.729520] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.733300] postmaster      D ffff880e1d033c48     0 14300  14296 0x00100004
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.736888]  ffff880e1d033c48 ffff880f086c9d40 ffff8800eb1957c0 ffff880e1d034000
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.740769]  0000000000000000 7fffffffffffffff ffff88066b0c44f0 0000000000000001
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.744686]  ffff880e1d033c60 ffffffff814d99f5 ffff880f520d5340 ffff880e1d033cf8
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.748669] Call Trace:
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.750021]  [<ffffffff814d99f5>] schedule+0x35/0x80
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.752440]  [<ffffffff814dc2d1>] schedule_timeout+0x231/0x2b0
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.755263]  [<ffffffff8100c365>] ? xen_clocksource_get_cycles+0x15/0x20
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.758495]  [<ffffffff810d233a>] ? ktime_get+0x3a/0x90
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.761249]  [<ffffffff814d9106>] io_schedule_timeout+0xa6/0x110
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.764288]  [<ffffffffa02e270d>] cv_wait_common+0xad/0x120 [spl]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.767317]  [<ffffffff810ac670>] ? prepare_to_wait_event+0xf0/0xf0
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.770369]  [<ffffffffa02e27d8>] __cv_wait_io+0x18/0x20 [spl]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.773369]  [<ffffffffa0b5739e>] zio_wait+0x10e/0x1f0 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.776020]  [<ffffffffa0b51654>] zil_commit.part.11+0x434/0x7a0 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.779208]  [<ffffffff810b2c91>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.783078]  [<ffffffffa0b519d7>] zil_commit+0x17/0x20 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.785826]  [<ffffffffa0b435ea>] zfs_fsync+0x7a/0xf0 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.788529]  [<ffffffffa0b5cc65>] zpl_fsync+0x65/0x90 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.791203]  [<ffffffff812032bd>] vfs_fsync_range+0x3d/0xb0
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.793917]  [<ffffffff8120338d>] do_fsync+0x3d/0x70
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.796317]  [<ffffffff81203600>] SyS_fsync+0x10/0x20
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.798794]  [<ffffffff814dd36e>] entry_SYSCALL_64_fastpath+0x12/0x71
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.801928] INFO: task postmaster:15432 blocked for more than 120 seconds.
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.805233]       Tainted: P           OE   4.4.10-22.54.amzn1.x86_64 #1
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.808484] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.812275] postmaster      D ffff880e307979f0     0 15432  14296 0x00000000
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.815839]  ffff880e307979f0 ffff880efb913a80 ffff880ef958ba80 ffff880e30798000
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.819722]  ffff880f08b59220 ffff880f08b59370 ffff880f08b59248 0000000000000000
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.823616]  ffff880e30797a08 ffffffff814d99f5 ffff880f08b59368 ffff880e30797a68
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.827565] Call Trace:
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.828808]  [<ffffffff814d99f5>] schedule+0x35/0x80
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.831120]  [<ffffffffa02e274d>] cv_wait_common+0xed/0x120 [spl]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.834337]  [<ffffffff810ac670>] ? prepare_to_wait_event+0xf0/0xf0
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.837568]  [<ffffffffa02e2795>] __cv_wait+0x15/0x20 [spl]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.840307]  [<ffffffffa0b1065a>] txg_wait_open+0xba/0x100 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.843249]  [<ffffffffa0acf19f>] dmu_tx_wait+0x36f/0x380 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.846186]  [<ffffffffa0acf23e>] dmu_tx_assign+0x8e/0x4e0 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.849142]  [<ffffffffa0b4495d>] zfs_write+0x3dd/0xbb0 [zfs]
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.851884]  [<ffffffff81167f8c>] ? get_page_from_freelist+0x3cc/0x990
May 24 17:29:28 ip-172-31-55-208 kernel: [ 4920.855049]  [<ffffffff8116867f>] ? __alloc_pages_nodemask+0x12f/0xa60
May 24 17:29:29 ip-172-31-55-208 kernel: [ 4920.858253]  [<ffffffff810b2c91>] ? __raw_callee_save___pv_queued_spin_unlock+0x11/0x20
May 24 17:29:29 ip-172-31-55-208 kernel: [ 4920.862117]  [<ffffffffa0b5c7ba>] zpl_write_common_iovec+0x7a/0xd0 [zfs]
May 24 17:29:29 ip-172-31-55-208 kernel: [ 4920.865447]  [<ffffffffa0b5c9b8>] zpl_write+0x78/0xa0 [zfs]
May 24 17:29:29 ip-172-31-55-208 kernel: [ 4920.868121]  [<ffffffff811d2378>] __vfs_write+0x28/0xe0
May 24 17:29:29 ip-172-31-55-208 kernel: [ 4920.870670]  [<ffffffff810b26a7>] ? percpu_down_read+0x17/0x50
May 24 17:29:29 ip-172-31-55-208 kernel: [ 4920.873436]  [<ffffffff811d2a22>] vfs_write+0xa2/0x1a0
May 24 17:29:29 ip-172-31-55-208 kernel: [ 4920.875889]  [<ffffffff8105ea40>] ? __do_page_fault+0x1a0/0x3f0
May 24 17:29:29 ip-172-31-55-208 kernel: [ 4920.878807]  [<ffffffff811d3726>] SyS_write+0x46/0xa0
May 24 17:29:29 ip-172-31-55-208 kernel: [ 4920.881248]  [<ffffffff811d25a7>] ? SyS_lseek+0x87/0xb0
May 24 17:29:29 ip-172-31-55-208 kernel: [ 4920.883766]  [<ffffffff814dd36e>] entry_SYSCALL_64_fastpath+0x12/0x71

@behlendorf
Copy link
Contributor

Closing as stale.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants