This repository has been archived by the owner on Feb 26, 2020. It is now read-only.

spl_kmem_cache spinning problem (again) combined with txg_sync hung task #247

Closed
cousins opened this issue Jun 10, 2013 · 18 comments

@cousins

cousins commented Jun 10, 2013

I have 0.6.1 running on a CentOS 6.4 system. I have a large pool made up of sixty 4 TB drives in six raidz2 sets of 10 drives each. To create this I did:

zpool create -f -o ashift=12 pool2 raidz2 dev1 dev2 dev3 ... dev10 raidz2 dev11 dev12 ... dev20 raidz2 ...

It has mostly worked fine, but once in a while when I rsync to it, it hangs and the spl_kmem_cache process sits at 100%. dmesg shows a series of:

INFO: task txg_sync:790 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
txg_sync D 0000000000000010 0 790 2 0x00000080
ffff88201e751b60 0000000000000046 ffff88201e751b20 ffffffff8106317e
ffff88201e751b90 ffff882000000000 00000000fee6cd10 0000000000000001
ffff88201f5465f8 ffff88201e751fd8 000000000000fb88 ffff88201f5465f8
Call Trace:
[] ? try_to_wake_up+0x24e/0x3e0
[] ? ktime_get_ts+0xa9/0xe0
[] io_schedule+0x73/0xc0
[] cv_wait_common+0xac/0x1c0 [spl]
[] ? autoremove_wake_function+0x0/0x40
[] __cv_wait_io+0x18/0x20 [spl]
[] zio_wait+0xfb/0x190 [zfs]
[] dsl_pool_sync+0x2f5/0x540 [zfs]
[] spa_sync+0x39e/0xa00 [zfs]
[] ? read_tsc+0x9/0x20
[] txg_sync_thread+0x307/0x590 [zfs]
[] ? set_user_nice+0xc9/0x130
[] ? txg_sync_thread+0x0/0x590 [zfs]
[] thread_generic_wrapper+0x68/0x80 [spl]
[] ? thread_generic_wrapper+0x0/0x80 [spl]
[] kthread+0x96/0xa0
[] child_rip+0xa/0x20
[] ? kthread+0x0/0xa0
[] ? child_rip+0x0/0x20

I have searched through the other spl_kmem_cache issues on this tracker, but I haven't seen any that match this, so I wanted to report it. If I can do anything to help resolve it, I will.

Thanks.

Steve

@ryao
Contributor

ryao commented Jun 10, 2013

This was reported to the ZFS issue tracker in openzfs/zfs#1454. Your backtrace is much better than the one in that report; it should be easier to identify the cause of this bug using it. That report inspired me to write ryao/spl@5717902. The commit message describes an additional issue that could cause what you have observed, but it would be hard to speculate on what is happening on your system without first testing this patch.

Would you test my patch and report whether it had any effect? Also, would you elaborate more on your system configuration? Specifically, where is the rootfs stored, is there swap, and if you have swap, is it on a zvol?
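For reference, those questions can be answered with a few standard commands; a minimal sketch, assuming the pool is imported:

    df -h /                # where the rootfs lives
    swapon -s              # active swap devices; zvol-backed swap shows up as /dev/zd*
    zfs list -t volume     # any zvols defined in the pools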

@cousins
Author

cousins commented Jun 10, 2013

I'll give it a try. It sometimes takes a while (days) for it to happen, so I'll let you know if it happens again with this patch. Just to be sure I have the right thing: it is just editing one line and adding __GFP_IO | to the flags?
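For the record, one way to apply and test that single commit against a git checkout of spl (GitHub serves any commit as a mailbox-format patch at the commit URL plus .patch; the rebuild steps assume an autotools build from git):

    cd spl
    curl -L https://github.com/ryao/spl/commit/5717902.patch | git am
    ./autogen.sh && ./configure && make && make install   # configure may need --with-linux=<path> on some setups
    # reload the spl/zfs modules (or reboot) for the change to take effect

Editing the one line by hand and rebuilding works just as well.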

To answer your questions, if I understand your rootfs question correctly, the OS is installed on a pair of mirrored (md) 120 GB Intel 520 SSDs. The root file system is ext4. There is 4 GB of swap space but it is not on a zvol; it is also on the mirrored SSDs.

Also, at least right now it is showing that no swap is being used, which isn't surprising with this much RAM.

Other information: it is a Supermicro system with two Xeon E5-2620 CPUs (HT is on) and 128 GB of RAM.

Thanks. Let me know if you need anything else.

Steve

@ryao
Contributor

ryao commented Jun 10, 2013

That patch is a one-line change, and from what you told me, I think it is unlikely to resolve your problem. Feedback from testing would still be slightly useful, although it is probably not necessary. A fix for the other issue that I spotted would have a better chance of fixing your issue. I will post an update when I have it, although I do not expect to find time in the next few days.

@cousins
Author

cousins commented Jun 10, 2013

OK. I'll leave it as is then and wait for your next fix. Thanks very much.

Steve

@cousins
Author

cousins commented Jun 11, 2013

This system is now having some hardware problems that I am diagnosing with the vendor. I believe one of the HBAs is bad. I'll get back to you on this once that is resolved.

@ryao
Contributor

ryao commented Jun 11, 2013

The deadman thread that was added to zfsonlinux/zfs head by openzfs/zfs@cc92e9d would force a reboot if one of the controllers stopped responding. This is not in 0.6.1, and it is conceivable that things could start spinning indefinitely if a controller stops working. However, it is not clear to me how the spinning would happen without the pool also stopping responding to IO.
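For what it's worth, on a build that includes that commit the deadman is controlled by module parameters; a sketch of checking them, assuming the parameter names from the referenced commit (they do not exist on 0.6.1):

    ls /sys/module/zfs/parameters/ | grep deadman
    cat /sys/module/zfs/parameters/zfs_deadman_synctime   # hang-detection threshold
    cat /sys/module/zfs/parameters/zfs_deadman_enabled    # whether the deadman acts on a detected hang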

@cousins
Author

cousins commented Jun 14, 2013

I did a couple of sizable rsyncs and all seemed OK, but now I see that kswapd0 and kswapd1 are at 100% (the system is otherwise idle):

top - 14:36:10 up 2 days, 19:21, 2 users, load average: 2.00, 2.00, 2.00
Tasks: 974 total, 4 running, 970 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 8.4%sy, 0.0%ni, 91.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 132126220k total, 109089644k used, 23036576k free, 115772k buffers
Swap: 4192184k total, 0k used, 4192184k free, 61491424k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
258 root 20 0 0 0 0 R 99.7 0.0 1200:21 kswapd0
259 root 20 0 0 0 0 R 99.4 0.0 1191:56 kswapd1
347 root 39 19 0 0 0 S 1.0 0.0 39:36.95 kipmi0
4579 root RT 0 5543m 15m 3376 S 0.7 0.0 44:04.58 multipathd
19717 root 20 0 15692 1968 952 S 0.7 0.0 0:01.47 top

Now for some reason my shell is also using 100% of one core:

top - 17:44:46 up 2 days, 22:29, 4 users, load average: 3.01, 3.02, 3.04
Tasks: 980 total, 4 running, 976 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 12.6%sy, 0.0%ni, 87.4%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 132126220k total, 34503956k used, 97622264k free, 15856k buffers
Swap: 4192184k total, 0k used, 4192184k free, 27456k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
259 root 20 0 0 0 0 R 99.6 0.0 1379:55 kswapd1
20752 root 20 0 105m 1844 1368 R 99.6 0.0 71:12.64 sh
258 root 20 0 0 0 0 R 99.3 0.0 1388:19 kswapd0
347 root 39 19 0 0 0 S 1.0 0.0 41:38.08 kipmi0
21924 root 20 0 15700 1960 952 R 1.0 0.0 0:00.08 top

Looking at perf top I see:

Samples: 26K of event 'cycles', Event count (approx.): 14163818214
44.91% [spl] [k] __spl_kmem_cache_generic_shrinker
34.35% [spl] [k] spl_kmem_cache_reap_now
7.51% [kernel] [k] _spin_lock
5.18% [spl] [k] spl_slab_reclaim
1.62% [kernel] [k] bit_waitqueue
0.74% [zfs] [k] arc_kmem_reap_now
0.57% [kernel] [k] up_read
0.54% [kernel] [k] wake_up_bit
0.44% [kernel] [k] __wake_up_bit
0.39% [kernel] [k] __phys_addr
0.36% [kernel] [k] down_read
0.30% [kernel] [k] port_inb
0.29% perf [.] 0x000000000009b893

Does this relate to the spl_kmem_cache spinning problem at all? I see no entries in dmesg or /var/log/messages pertaining to much of anything (a couple of packages added, and an attempt at clearing the kswapd spinning by turning swap off and then back on).

Any ideas or things to check?

Thanks,

Steve

@cousins
Author

cousins commented Jun 17, 2013

Any closer on this? It has happened again: spl_kmem_cache is up around 100%.

top shows:

top - 10:10:46 up 2 days, 14:38, 3 users, load average: 2.04, 6.57, 10.98
Tasks: 983 total, 2 running, 981 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 4.3%sy, 0.0%ni, 95.7%id, 0.1%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 132126220k total, 80799416k used, 51326804k free, 146824k buffers
Swap: 4192184k total, 12k used, 4192172k free, 21797336k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
16476 root 39 19 0 0 0 R 99.7 0.0 793:00.29 spl_kmem_cache/
347 root 39 19 0 0 0 S 1.0 0.0 36:18.00 kipmi0
21657 root 20 0 15696 1956 952 R 0.7 0.0 0:00.27 top
20678 root RT 0 5549m 15m 3376 S 0.3 0.0 43:37.61 multipathd
21564 cousins 20 0 97864 1808 856 S 0.3 0.0 0:00.01 sshd
28251 root 20 0 0 0 0 S 0.3 0.0 34:46.61 kondemand/0
28298 root 20 0 11028 804 424 S 0.3 0.0 2:35.88 irqbalance
30847 root 0 -20 0 0 0 S 0.3 0.0 219:47.84 txg_sync
1 root 20 0 19360 1436 1128 S 0.0 0.0 0:02.07 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.06 kthreadd

Looking at each core (HT is active) I see:

Cpu0 : 0.0%us,100.0%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu1 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu2 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu3 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu4 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu5 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu6 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu7 : 0.3%us, 1.0%sy, 0.0%ni, 98.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu8 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu9 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu10 : 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu11 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu12 : 0.0%us, 0.0%sy, 0.0%ni, 99.3%id, 0.7%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu13 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu14 : 0.0%us, 1.0%sy, 0.0%ni, 99.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu15 : 0.0%us, 0.3%sy, 0.0%ni, 99.7%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu16 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu17 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu18 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu19 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu20 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu21 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu22 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Cpu23 : 0.0%us, 0.0%sy, 0.0%ni,100.0%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st

perf top shows (is this helpful?):
Samples: 1K of event 'cycles', Event count (approx.): 698121684
26.22% [spl] [k] spl_kmem_cache_reap_now
14.67% [spl] [k] __spl_kmem_cache_generic_shrinker
8.42% [spl] [k] spl_slab_reclaim
7.81% [kernel] [k] _spin_lock
4.68% [kernel] [k] bit_waitqueue
4.44% perf [.] 0x000000000004bd4b
3.59% [kernel] [k] strcmp
3.04% [kernel] [k] module_get_kallsym
2.08% [kernel] [k] kallsyms_expand_symbol
1.95% [kernel] [k] number
1.87% [kernel] [k] wake_up_bit
1.40% [kernel] [k] vsnprintf
1.40% [kernel] [k] format_decode
1.25% [kernel] [k] __phys_addr
1.16% [kernel] [k] strnlen
1.09% perf [.] rb_next
1.06% libc-2.12.so [.] memcpy
1.00% [kernel] [k] string
0.86% perf [.] kallsyms__parse
0.78% [kernel] [k] pointer
0.76% libc-2.12.so [.] _IO_getdelim
0.62% perf [.] rb_insert_color
0.62% [kernel] [k] get_task_cred
0.60% [kernel] [k] clear_page_c
0.55% libc-2.12.so [.] __strcmp_sse42
0.55% libc-2.12.so [.] __strstr_sse42
0.55% libc-2.12.so [.] _int_malloc
0.55% [kernel] [k] __wake_up_bit
0.47% [kernel] [k] seq_vprintf
0.39% [kernel] [k] update_iter
0.39% [kernel] [k] memcpy
0.39% libc-2.12.so [.] memchr
0.39% [kernel] [k] strlcpy
0.39% libc-2.12.so [.] __strchr_sse42
0.31% [kernel] [k] copy_user_generic_string
0.23% [kernel] [k] security_real_capable_noaudit
0.23% libc-2.12.so [.] _IO_feof
0.23% libc-2.12.so [.] __strlen_sse42
0.23% [kernel] [k] seq_read
0.22% [kernel] [k] list_del
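A flat profile like this shows where the cycles go but not who is driving the reclaim; a system-wide capture with call graphs would be more telling (standard perf usage, nothing ZFS-specific):

    perf record -a -g sleep 30   # profile all CPUs for 30 seconds
    perf report                  # browse call chains to see what is invoking the spl shrinker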

This happened while rsyncing again. The directory tree it was working on has a lot of files in one of its directories: 285,224 files for a total of about 250 GB. rsync was building the list of files to copy when the spinning occurred.

Any help would be greatly appreciated.

Thanks,

Steve

@cousins cousins closed this as completed Jun 17, 2013
@cousins cousins reopened this Jun 17, 2013
@behlendorf
Contributor

@cousins The first issue you reported, the pool hang, would be completely explained by the bad HBA; it was clearly waiting on IO. The second (different) issue, the spinning, shows that ZFS is trying very hard to reclaim memory. When it's in this state, could you issue an echo t > /proc/sysrq-trigger to dump the stacks for the system to the console? This should help in determining exactly why it's unable to reclaim memory.
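For completeness, a sketch of capturing that dump on CentOS 6, where sysrq is disabled by default; the traces land in the kernel ring buffer, so a serial or netconsole log will capture them more completely than dmesg if the buffer wraps:

    echo 1 > /proc/sys/kernel/sysrq   # enable all sysrq functions
    echo t > /proc/sysrq-trigger      # dump every task's stack to the kernel log
    dmesg > /tmp/sysrq-t.txt          # save what the ring buffer still holds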

@cousins
Author

cousins commented Jul 18, 2013

Hi Brian,

After going over the hardware in detail and finally finding a flaky SAS cable connection, we have been up and running for about four weeks without any hardware issues, and other than the performance problem you're helping with (du performance on TBs of data with millions of files) it has been very solid. Until today. Last night I had the two rsyncs running and then added another smaller rsync from another server. All of these were going to /pool2. At the same time I started deleting files in another dataset, /pool2/asl, which was set up to test the hardware.

This morning I found spl_kmem_cache spinning and the load up at 48.00. I can log in but I can't do anything zfs related. df shows the volumes OK (it doesn't hang) but I can't cd to them. zpool status hangs. /var/log/messages shows that it happened shortly after midnight last night (trace is below).

I don't see any messages about hardware problems. multipath -l shows all 81 drives active and all looks good in that regard.

Let me know if you need anything else; I'll leave it in this state for a while in case more debugging information would help.

Here is the trace:

Jul 18 00:23:51 nfs1 kernel: INFO: task z_wr_int/0:10333 blocked for more than 600 seconds.
Jul 18 00:23:51 nfs1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 00:23:51 nfs1 kernel: z_wr_int/0 D 0000000000000000 0 10333 2 0x00000080
Jul 18 00:23:52 nfs1 kernel: ffff88101e7e1ce0 0000000000000046 ffff880dfcb5b880 ffff881401dbaa18
Jul 18 00:23:52 nfs1 kernel: ffff88101e7e1c80 ffffffffa0644617 ffff880c3a721380 ffff881401dba710
Jul 18 00:23:52 nfs1 kernel: ffff881066a98638 ffff88101e7e1fd8 000000000000fb88 ffff881066a98638
Jul 18 00:23:52 nfs1 kernel: Call Trace:
Jul 18 00:23:52 nfs1 kernel: [] ? zio_remove_child+0x97/0xb0 [zfs]
Jul 18 00:23:52 nfs1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
Jul 18 00:23:52 nfs1 kernel: [] mutex_lock+0x2b/0x50
Jul 18 00:23:52 nfs1 kernel: [] vdev_queue_io_done+0x30/0xd0 [zfs]
Jul 18 00:23:52 nfs1 kernel: [] zio_vdev_io_done+0x88/0x190 [zfs]
Jul 18 00:23:52 nfs1 kernel: [] zio_execute+0xb3/0x130 [zfs]
Jul 18 00:23:52 nfs1 kernel: [] taskq_thread+0x218/0x4b0 [spl]
Jul 18 00:23:52 nfs1 kernel: [] ? thread_return+0x4e/0x76e
Jul 18 00:23:52 nfs1 kernel: [] ? default_wake_function+0x0/0x20
Jul 18 00:23:52 nfs1 kernel: [] ? taskq_thread+0x0/0x4b0 [spl]
Jul 18 00:23:52 nfs1 kernel: [] kthread+0x96/0xa0
Jul 18 00:23:52 nfs1 kernel: [] child_rip+0xa/0x20
Jul 18 00:23:52 nfs1 kernel: [] ? kthread+0x0/0xa0
Jul 18 00:23:52 nfs1 kernel: [] ? child_rip+0x0/0x20
Jul 18 00:23:52 nfs1 kernel: INFO: task z_wr_int/1:10334 blocked for more than 600 seconds.
Jul 18 00:23:52 nfs1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 00:23:52 nfs1 kernel: z_wr_int/1 D 0000000000000001 0 10334 2 0x00000080
Jul 18 00:23:52 nfs1 kernel: ffff88101e7e3ce0 0000000000000046 ffff88101e7e3ca8 ffff88101e7e3ca4
Jul 18 00:23:52 nfs1 kernel: 0000000000003ed9 ffff88107fc24500 ffff8800282f6700 000000000000020e
Jul 18 00:23:52 nfs1 kernel: ffff881066a99af8 ffff88101e7e3fd8 000000000000fb88 ffff881066a99af8
Jul 18 00:23:52 nfs1 kernel: Call Trace:
Jul 18 00:23:52 nfs1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
Jul 18 00:23:52 nfs1 kernel: [] mutex_lock+0x2b/0x50
Jul 18 00:23:52 nfs1 kernel: [] vdev_queue_io_done+0x30/0xd0 [zfs]
Jul 18 00:23:52 nfs1 kernel: [] zio_vdev_io_done+0x88/0x190 [zfs]
Jul 18 00:23:52 nfs1 kernel: [] zio_execute+0xb3/0x130 [zfs]
Jul 18 00:23:52 nfs1 kernel: [] taskq_thread+0x218/0x4b0 [spl]
Jul 18 00:23:52 nfs1 kernel: [] ? thread_return+0x4e/0x76e
Jul 18 00:23:52 nfs1 kernel: [] ? default_wake_function+0x0/0x20
Jul 18 00:23:52 nfs1 kernel: [] ? taskq_thread+0x0/0x4b0 [spl]
Jul 18 00:23:52 nfs1 kernel: [] kthread+0x96/0xa0
Jul 18 00:23:53 nfs1 kernel: [] child_rip+0xa/0x20
Jul 18 00:23:53 nfs1 kernel: [] ? kthread+0x0/0xa0
Jul 18 00:23:53 nfs1 kernel: [] ? child_rip+0x0/0x20
Jul 18 00:23:53 nfs1 kernel: INFO: task z_wr_int/2:10335 blocked for more than 600 seconds.
Jul 18 00:23:53 nfs1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 00:23:53 nfs1 kernel: z_wr_int/2 D 0000000000000002 0 10335 2 0x00000080
Jul 18 00:23:53 nfs1 kernel: ffff88101e7e5ce0 0000000000000046 0000000000000000 ffff88101e7e5ca4
Jul 18 00:23:53 nfs1 kernel: 0000000000000000 ffff88107fc24700 ffff880028216700 000000000000000f
Jul 18 00:23:53 nfs1 kernel: ffff8810663cb098 ffff88101e7e5fd8 000000000000fb88 ffff8810663cb098
Jul 18 00:23:53 nfs1 kernel: Call Trace:
Jul 18 00:23:53 nfs1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
Jul 18 00:23:53 nfs1 kernel: [] mutex_lock+0x2b/0x50
Jul 18 00:23:53 nfs1 kernel: [] vdev_queue_io_done+0x30/0xd0 [zfs]
Jul 18 00:23:53 nfs1 kernel: [] zio_vdev_io_done+0x88/0x190 [zfs]
Jul 18 00:23:53 nfs1 kernel: [] zio_execute+0xb3/0x130 [zfs]
Jul 18 00:23:53 nfs1 kernel: [] taskq_thread+0x218/0x4b0 [spl]
Jul 18 00:23:53 nfs1 kernel: [] ? thread_return+0x4e/0x76e
Jul 18 00:23:53 nfs1 kernel: [] ? default_wake_function+0x0/0x20
Jul 18 00:23:53 nfs1 kernel: [] ? taskq_thread+0x0/0x4b0 [spl]
Jul 18 00:23:53 nfs1 kernel: [] kthread+0x96/0xa0
Jul 18 00:23:53 nfs1 kernel: [] child_rip+0xa/0x20
Jul 18 00:23:53 nfs1 kernel: [] ? kthread+0x0/0xa0
Jul 18 00:23:53 nfs1 kernel: [] ? child_rip+0x0/0x20
Jul 18 00:23:53 nfs1 kernel: INFO: task z_wr_int/7:10340 blocked for more than 600 seconds.
Jul 18 00:23:53 nfs1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 00:23:53 nfs1 kernel: z_wr_int/7 D 0000000000000007 0 10340 2 0x00000080
Jul 18 00:23:53 nfs1 kernel: ffff88101e7efce0 0000000000000046 0000000000000000 ffffffff81050247
Jul 18 00:23:53 nfs1 kernel: 0000000000016700 ffff8810672daaa0 ffff8820694d4080 ffffffff8160b780
Jul 18 00:23:53 nfs1 kernel: ffff8810672db058 ffff88101e7effd8 000000000000fb88 ffff8810672db058
Jul 18 00:23:53 nfs1 kernel: Call Trace:
Jul 18 00:23:53 nfs1 kernel: [] ? walk_tg_tree_from+0x67/0xc0
Jul 18 00:23:53 nfs1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
Jul 18 00:23:53 nfs1 kernel: [] mutex_lock+0x2b/0x50
Jul 18 00:23:53 nfs1 kernel: [] vdev_queue_io_done+0x30/0xd0 [zfs]
Jul 18 00:23:53 nfs1 kernel: [] zio_vdev_io_done+0x88/0x190 [zfs]
Jul 18 00:23:54 nfs1 kernel: [] zio_execute+0xb3/0x130 [zfs]
Jul 18 00:23:54 nfs1 kernel: [] taskq_thread+0x218/0x4b0 [spl]
Jul 18 00:23:54 nfs1 kernel: [] ? thread_return+0x4e/0x76e
Jul 18 00:23:54 nfs1 kernel: [] ? default_wake_function+0x0/0x20
Jul 18 00:23:54 nfs1 kernel: [] ? taskq_thread+0x0/0x4b0 [spl]
Jul 18 00:23:54 nfs1 kernel: [] kthread+0x96/0xa0
Jul 18 00:23:54 nfs1 kernel: [] child_rip+0xa/0x20
Jul 18 00:23:54 nfs1 kernel: [] ? kthread+0x0/0xa0
Jul 18 00:23:54 nfs1 kernel: [] ? child_rip+0x0/0x20
Jul 18 00:23:54 nfs1 kernel: INFO: task z_wr_int/8:10341 blocked for more than 600 seconds.
Jul 18 00:23:54 nfs1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 00:23:54 nfs1 kernel: z_wr_int/8 D 0000000000000008 0 10341 2 0x00000080
Jul 18 00:23:54 nfs1 kernel: ffff88101e0e1ce0 0000000000000046 ffff88101e0e1ca8 ffff88101e0e1ca4
Jul 18 00:23:54 nfs1 kernel: 0000000000016700 ffff88107fc25300 ffff8800282f6700 000000000000042c
Jul 18 00:23:54 nfs1 kernel: ffff8810672da5f8 ffff88101e0e1fd8 000000000000fb88 ffff8810672da5f8
Jul 18 00:23:54 nfs1 kernel: Call Trace:
Jul 18 00:23:54 nfs1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
Jul 18 00:23:54 nfs1 kernel: [] mutex_lock+0x2b/0x50
Jul 18 00:23:54 nfs1 kernel: [] vdev_queue_io_done+0x30/0xd0 [zfs]
Jul 18 00:23:54 nfs1 kernel: [] zio_vdev_io_done+0x88/0x190 [zfs]
Jul 18 00:23:54 nfs1 kernel: [] zio_execute+0xb3/0x130 [zfs]
Jul 18 00:23:54 nfs1 kernel: [] taskq_thread+0x218/0x4b0 [spl]
Jul 18 00:23:54 nfs1 kernel: [] ? thread_return+0x4e/0x76e
Jul 18 00:23:54 nfs1 kernel: [] ? default_wake_function+0x0/0x20
Jul 18 00:23:54 nfs1 kernel: [] ? taskq_thread+0x0/0x4b0 [spl]
Jul 18 00:23:54 nfs1 kernel: [] kthread+0x96/0xa0
Jul 18 00:23:54 nfs1 kernel: [] child_rip+0xa/0x20
Jul 18 00:23:54 nfs1 kernel: [] ? kthread+0x0/0xa0
Jul 18 00:23:54 nfs1 kernel: [] ? child_rip+0x0/0x20
Jul 18 00:23:54 nfs1 kernel: INFO: task z_wr_int/9:10342 blocked for more than 600 seconds.
Jul 18 00:23:54 nfs1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 00:23:54 nfs1 kernel: z_wr_int/9 D 0000000000000009 0 10342 2 0x00000080
Jul 18 00:23:54 nfs1 kernel: ffff88101e0e5ce0 0000000000000046 ffff88101e0e5c80 ffffffff81050247
Jul 18 00:23:54 nfs1 kernel: 0000000000016700 ffff88101a541540 ffff8820694d4ae0 ffffffff8160b780
Jul 18 00:23:55 nfs1 kernel: ffff88101a541af8 ffff88101e0e5fd8 000000000000fb88 ffff88101a541af8
Jul 18 00:23:55 nfs1 kernel: Call Trace:
Jul 18 00:23:55 nfs1 kernel: [] ? walk_tg_tree_from+0x67/0xc0
Jul 18 00:23:55 nfs1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
Jul 18 00:23:55 nfs1 kernel: [] mutex_lock+0x2b/0x50
Jul 18 00:23:55 nfs1 kernel: [] vdev_queue_io_done+0x30/0xd0 [zfs]
Jul 18 00:23:55 nfs1 kernel: [] zio_vdev_io_done+0x88/0x190 [zfs]
Jul 18 00:23:55 nfs1 kernel: [] zio_execute+0xb3/0x130 [zfs]
Jul 18 00:23:55 nfs1 kernel: [] taskq_thread+0x218/0x4b0 [spl]
Jul 18 00:23:55 nfs1 kernel: [] ? thread_return+0x4e/0x76e
Jul 18 00:23:55 nfs1 kernel: [] ? default_wake_function+0x0/0x20
Jul 18 00:23:55 nfs1 kernel: [] ? taskq_thread+0x0/0x4b0 [spl]
Jul 18 00:23:55 nfs1 kernel: [] kthread+0x96/0xa0
Jul 18 00:23:55 nfs1 kernel: [] child_rip+0xa/0x20
Jul 18 00:23:55 nfs1 kernel: [] ? kthread+0x0/0xa0
Jul 18 00:23:55 nfs1 kernel: [] ? child_rip+0x0/0x20
Jul 18 00:23:55 nfs1 kernel: INFO: task z_wr_int/10:10343 blocked for more than 600 seconds.
Jul 18 00:23:55 nfs1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 00:23:55 nfs1 kernel: z_wr_int/10 D 000000000000000a 0 10343 2 0x00000080
Jul 18 00:23:55 nfs1 kernel: ffff88101e0e7ce0 0000000000000046 ffff882069624c00 0000000000016700
Jul 18 00:23:55 nfs1 kernel: 0000000000016700 ffff8810158f7500 ffff881069076aa0 ffffffff8160b780
Jul 18 00:23:55 nfs1 kernel: ffff8810158f7ab8 ffff88101e0e7fd8 000000000000fb88 ffff8810158f7ab8
Jul 18 00:23:55 nfs1 kernel: Call Trace:
Jul 18 00:23:55 nfs1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
Jul 18 00:23:55 nfs1 kernel: [] mutex_lock+0x2b/0x50
Jul 18 00:23:55 nfs1 kernel: [] vdev_queue_io_done+0x30/0xd0 [zfs]
Jul 18 00:23:55 nfs1 kernel: [] zio_vdev_io_done+0x88/0x190 [zfs]
Jul 18 00:23:55 nfs1 kernel: [] zio_execute+0xb3/0x130 [zfs]
Jul 18 00:23:55 nfs1 kernel: [] taskq_thread+0x218/0x4b0 [spl]
Jul 18 00:23:55 nfs1 kernel: [] ? thread_return+0x4e/0x76e
Jul 18 00:23:55 nfs1 kernel: [] ? default_wake_function+0x0/0x20
Jul 18 00:23:55 nfs1 kernel: [] ? taskq_thread+0x0/0x4b0 [spl]
Jul 18 00:23:55 nfs1 kernel: [] kthread+0x96/0xa0
Jul 18 00:23:55 nfs1 kernel: [] child_rip+0xa/0x20
Jul 18 00:23:55 nfs1 kernel: [] ? kthread+0x0/0xa0
Jul 18 00:23:55 nfs1 kernel: [] ? child_rip+0x0/0x20
Jul 18 00:23:55 nfs1 kernel: INFO: task z_wr_int/11:10344 blocked for more than 600 seconds.
Jul 18 00:23:55 nfs1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 00:23:56 nfs1 kernel: z_wr_int/11 D 000000000000000b 0 10344 2 0x00000080
Jul 18 00:23:56 nfs1 kernel: ffff88101e0e9ce0 0000000000000046 0000000000000000 ffff88101e0e9ca4
Jul 18 00:23:56 nfs1 kernel: 0000000000000000 ffff88107fc25900 ffff8810b88f6700 0000000000000400
Jul 18 00:23:56 nfs1 kernel: ffff8810158f65f8 ffff88101e0e9fd8 000000000000fb88 ffff8810158f65f8
Jul 18 00:23:56 nfs1 kernel: Call Trace:
Jul 18 00:23:56 nfs1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
Jul 18 00:23:56 nfs1 kernel: [] mutex_lock+0x2b/0x50
Jul 18 00:23:56 nfs1 kernel: [] vdev_queue_io_done+0x30/0xd0 [zfs]
Jul 18 00:23:56 nfs1 kernel: [] zio_vdev_io_done+0x88/0x190 [zfs]
Jul 18 00:23:56 nfs1 kernel: [] zio_execute+0xb3/0x130 [zfs]
Jul 18 00:23:56 nfs1 kernel: [] taskq_thread+0x218/0x4b0 [spl]
Jul 18 00:23:56 nfs1 kernel: [] ? thread_return+0x4e/0x76e
Jul 18 00:23:56 nfs1 kernel: [] ? default_wake_function+0x0/0x20
Jul 18 00:23:56 nfs1 kernel: [] ? taskq_thread+0x0/0x4b0 [spl]
Jul 18 00:23:56 nfs1 kernel: [] kthread+0x96/0xa0
Jul 18 00:23:56 nfs1 kernel: [] child_rip+0xa/0x20
Jul 18 00:23:56 nfs1 kernel: [] ? kthread+0x0/0xa0
Jul 18 00:23:56 nfs1 kernel: [] ? child_rip+0x0/0x20
Jul 18 00:23:56 nfs1 kernel: INFO: task z_wr_int/12:10345 blocked for more than 600 seconds.
Jul 18 00:23:56 nfs1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 00:23:56 nfs1 kernel: z_wr_int/12 D 000000000000000c 0 10345 2 0x00000080
Jul 18 00:23:56 nfs1 kernel: ffff88101e0edce0 0000000000000046 ffff8800282cfec0 0000000000016700
Jul 18 00:23:56 nfs1 kernel: 0000000000016700 ffff881066398080 ffff8810690bcae0 ffffffff8160b780
Jul 18 00:23:56 nfs1 kernel: ffff881066398638 ffff88101e0edfd8 000000000000fb88 ffff881066398638
Jul 18 00:23:56 nfs1 kernel: Call Trace:
Jul 18 00:23:56 nfs1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
Jul 18 00:23:56 nfs1 kernel: [] mutex_lock+0x2b/0x50
Jul 18 00:23:56 nfs1 kernel: [] vdev_queue_io_done+0x30/0xd0 [zfs]
Jul 18 00:23:56 nfs1 kernel: [] zio_vdev_io_done+0x88/0x190 [zfs]
Jul 18 00:23:56 nfs1 kernel: [] zio_execute+0xb3/0x130 [zfs]
Jul 18 00:23:56 nfs1 kernel: [] taskq_thread+0x218/0x4b0 [spl]
Jul 18 00:23:56 nfs1 kernel: [] ? thread_return+0x4e/0x76e
Jul 18 00:23:56 nfs1 kernel: [] ? default_wake_function+0x0/0x20
Jul 18 00:23:56 nfs1 kernel: [] ? taskq_thread+0x0/0x4b0 [spl]
Jul 18 00:23:56 nfs1 kernel: [] kthread+0x96/0xa0
Jul 18 00:23:57 nfs1 kernel: [] child_rip+0xa/0x20
Jul 18 00:23:57 nfs1 kernel: [] ? kthread+0x0/0xa0
Jul 18 00:23:57 nfs1 kernel: [] ? child_rip+0x0/0x20
Jul 18 00:23:57 nfs1 kernel: INFO: task z_wr_int/13:10346 blocked for more than 600 seconds.
Jul 18 00:23:57 nfs1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jul 18 00:23:57 nfs1 kernel: z_wr_int/13 D 000000000000000d 0 10346 2 0x00000080
Jul 18 00:23:57 nfs1 kernel: ffff88101e0efce0 0000000000000046 ffff8806778da148 ffff881c9bf75dd8
Jul 18 00:23:57 nfs1 kernel: ffff88101e0efc80 ffffffffa0644617 ffff8819b6ffaa40 ffff881c9bf75ad0
Jul 18 00:23:57 nfs1 kernel: ffff881066399af8 ffff88101e0effd8 000000000000fb88 ffff881066399af8
Jul 18 00:23:57 nfs1 kernel: Call Trace:
Jul 18 00:23:57 nfs1 kernel: [] ? zio_remove_child+0x97/0xb0 [zfs]
Jul 18 00:23:57 nfs1 kernel: [] __mutex_lock_slowpath+0x13e/0x180
Jul 18 00:23:57 nfs1 kernel: [] mutex_lock+0x2b/0x50
Jul 18 00:23:57 nfs1 kernel: [] vdev_queue_io_done+0x30/0xd0 [zfs]
Jul 18 00:23:57 nfs1 kernel: [] zio_vdev_io_done+0x88/0x190 [zfs]
Jul 18 00:23:57 nfs1 kernel: [] zio_execute+0xb3/0x130 [zfs]
Jul 18 00:23:57 nfs1 kernel: [] taskq_thread+0x218/0x4b0 [spl]
Jul 18 00:23:57 nfs1 kernel: [] ? thread_return+0x4e/0x76e
Jul 18 00:23:57 nfs1 kernel: [] ? default_wake_function+0x0/0x20
Jul 18 00:23:57 nfs1 kernel: [] ? taskq_thread+0x0/0x4b0 [spl]
Jul 18 00:23:57 nfs1 kernel: [] kthread+0x96/0xa0
Jul 18 00:23:57 nfs1 kernel: [] child_rip+0xa/0x20
Jul 18 00:23:57 nfs1 kernel: [] ? kthread+0x0/0xa0
Jul 18 00:23:57 nfs1 kernel: [] ? child_rip+0x0/0x20

Thanks for your help.

Steve

@cousins
Author

cousins commented Jul 18, 2013

FWIW, here is what "perf top" shows:

Samples: 212K of event 'cycles', Event count (approx.): 35451406588
37.77% [spl] [k] spl_kmem_cache_reap_now
19.02% [spl] [k] __spl_kmem_cache_generic_shrinker
13.16% [kernel] [k] _spin_lock
11.73% [spl] [k] spl_slab_reclaim
5.62% [kernel] [k] bit_waitqueue
2.34% [kernel] [k] wake_up_bit
1.81% libxul.so (deleted) [.] 0x0000000000c0c908
1.60% [kernel] [k] __wake_up_bit
1.48% [kernel] [k] __phys_addr
0.83% [kernel] [k] port_inb
0.24% [kernel] [k] up_read
0.22% [kernel] [k] down_read
0.19% [kernel] [k] __vmalloc_node
0.17% [kernel] [k] remove_vm_area

and just plain top:

top - 13:01:24 up 26 days, 21:38, 8 users, load average: 51.01, 51.02, 51.00
Tasks: 1003 total, 2 running, 1000 sleeping, 0 stopped, 1 zombie
Cpu(s): 0.0%us, 4.3%sy, 0.0%ni, 87.4%id, 8.3%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 132126220k total, 109173804k used, 22952416k free, 264580k buffers
Swap: 4192184k total, 3448k used, 4188736k free, 1173484k cached

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
5686 root 39 19 0 0 0 R 99.0 0.0 2497:53 spl_kmem_cache/
347 root 39 19 0 0 0 S 1.0 0.0 401:23.35 kipmi0
30899 root 20 0 15692 1992 952 R 1.0 0.0 0:00.18 top
261 root 39 19 0 0 0 S 0.3 0.0 8:10.55 khugepaged
16821 root 20 0 267m 160m 520 D 0.3 0.1 215:26.10 rsync
20080 asl 20 0 1031m 231m 25m S 0.3 0.2 74:56.90 firefox
1 root 20 0 19356 1496 1172 S 0.0 0.0 0:11.84 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.63 kthreadd
3 root RT 0 0 0 0 S 0.0 0.0 2:08.68 migration/0
4 root 20 0 0 0 0 S 0.0 0.0 20:48.71 ksoftirqd/0

@behlendorf
Contributor

@cousins Go ahead and reboot the machine, and thanks for the additional debugging. As it happens, this looks like exactly the issue I'm working on right now. Others have reported the same problem occurring after many weeks of uptime. This is one of the few issues still blocking the 0.6.2 release.

@cousins
Author

cousins commented Jul 18, 2013

Hi Brian,

OK. Thanks. Any rough estimate on how long this might take to fix?

Steve

@cousins
Author

cousins commented Jul 19, 2013

Something somewhat similar seems to have happened today, with an uptime of only 17 hours. Nothing seems to have crashed yet (no backtrace in logs or console), but the zfs volume has hung and kswapd0, kswapd1, and spl_kmem_cache are all at 100%, along with one rsync process and makewhatis (started by cron). The load is up around 10.0 and I can't do anything with that zfs volume; however, zpool status does respond and shows everything in good shape.

What I did after starting the system up again was:

   # destroyed the temporary dataset
   zfs destroy pool2/asl

and then two rsyncs to the pool2 pool.

I have another smaller system (20 TB with a 13-drive raidz2 pool plus a spare) that hasn't had any of this problem. It is on Ubuntu 12.04.2 LTS with the 3.2.0-29-generic kernel and zfs/spl 0.6.1.

The larger system (that I'm having trouble with) is on CentOS 6.4 with the 2.6.32-358.11.1.el6.x86_64 kernel.

Are there kernel-related issues too?

As I use this more, more and more commands are locking up. My guess is that I'll get a trace soon... sure enough, I just checked the console and got:

nfs1.localdomain login: INFO: task khugepaged:261 blocked for more than 600 sec.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
khugepaged D 000000000000000a 0 261 2 0x00000000
ffff88106654fc90 0000000000000046 0000000000000000 ffff880000033c00
ffff88106654d540 00000000004352da ffff88106654fd40 ffffffff8112bb13
ffff88106654daf8 ffff88106654ffd8 000000000000fb88 ffff88106654daf8
Call Trace:
[] ? __alloc_pages_nodemask+0x113/0x8d0
[] rwsem_down_failed_common+0x95/0x1d0
[] rwsem_down_write_failed+0x23/0x30
[] call_rwsem_down_write_failed+0x13/0x20
[] ? down_write+0x32/0x40
[] khugepaged+0x7f6/0x1310
[] ? autoremove_wake_function+0x0/0x40
[] ? khugepaged+0x0/0x1310
[] kthread+0x96/0xa0
[] child_rip+0xa/0x20
[] ? kthread+0x0/0xa0
[] ? child_rip+0x0/0x20
INFO: task firefox:19169 blocked for more than 600 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
firefox D 0000000000000004 0 19169 18845 0x00000084
ffff8811e7209788 0000000000000086 0000000000000000 ffff880e16508df8
ffff8811e7209718 ffffffffa058c86a 0000000000000000 0000000000000000
ffff882056b825f8 ffff8811e7209fd8 000000000000fb88 ffff882056b825f8
Call Trace:
[] ? arc_buf_eviction_needed+0x6a/0xa0 [zfs]
[] __mutex_lock_slowpath+0x13e/0x180
[] mutex_lock+0x2b/0x50
[] zfs_zinactive+0x7b/0x110 [zfs]
[] zfs_inactive+0x7f/0x220 [zfs]
[] zpl_clear_inode+0xe/0x10 [zfs]
[] clear_inode+0xac/0x140
[] dispose_list+0x40/0x120
[] shrink_icache_memory+0x274/0x2e0
[] shrink_slab+0x12a/0x1a0
[] do_try_to_free_pages+0x3f7/0x610
[] try_to_free_pages+0x92/0x120
[] ? __alloc_pages_direct_compact+0x40/0x1c0
[] __alloc_pages_nodemask+0x478/0x8d0
[] alloc_pages_vma+0x9a/0x150
[] handle_pte_fault+0x76b/0xb50
[] ? pte_alloc_one+0x37/0x50
[] ? do_huge_pmd_anonymous_page+0xb9/0x380
[] handle_mm_fault+0x23a/0x310
[] __do_page_fault+0x139/0x480
[] ? do_mmap_pgoff+0x33a/0x380
[] do_page_fault+0x3e/0xa0
[] page_fault+0x25/0x30
INFO: task Socket Thread:19190 blocked for more than 600 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Socket Thread D 0000000000000008 0 19190 18845 0x00000080
ffff880cc544bcf0 0000000000000086 ffffffff81484f40 0000000000000000
00000b500000356b ffff8810084d4040 ffff880612d28840 ffff880cc544be58
ffff8810048bfab8 ffff880cc544bfd8 000000000000fb88 ffff8810048bfab8
Call Trace:
[] ? ip_queue_xmit+0x190/0x420
[] rwsem_down_failed_common+0x95/0x1d0
[] rwsem_down_read_failed+0x26/0x30
[] call_rwsem_down_read_failed+0x14/0x30
[] ? down_read+0x24/0x30
[] __do_page_fault+0x18f/0x480
[] ? __d_free+0x3f/0x60
[] ? d_free+0x58/0x60
[] ? mntput_no_expire+0x30/0x110
[] ? __fput+0x1a1/0x210
[] do_page_fault+0x3e/0xa0
[] page_fault+0x25/0x30
INFO: task DNS Resolver #1:19204 blocked for more than 600 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
DNS Resolver D 0000000000000011 0 19204 18845 0x00000080
ffff880e44761d80 0000000000000086 0000000000000000 ffff8820539987c0
0000000000000d4c 0000000000000000 0000000000000000 0000000000000000
ffff881006b99058 ffff880e44761fd8 000000000000fb88 ffff881006b99058
Call Trace:
[] ? futex_wake+0x10e/0x120
[] rwsem_down_failed_common+0x95/0x1d0
[] rwsem_down_read_failed+0x26/0x30
[] call_rwsem_down_read_failed+0x14/0x30
[] ? down_read+0x24/0x30
[] sys_madvise+0x18c/0x7b0
[] ? sys_futex+0x7b/0x170
[] ? audit_syscall_entry+0x1d7/0x200
[] ? __audit_syscall_exit+0x265/0x290
[] system_call_fastpath+0x16/0x1b
INFO: task DNS Resolver #2:19205 blocked for more than 600 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
DNS Resolver D 0000000000000011 0 19205 18845 0x00000080
ffff880a3cd3dd80 0000000000000086 0000000000000000 ffff8820539987c0
0000000000000d4c 0000000000000000 0000000000000000 0000000000000000
ffff8810085ab098 ffff880a3cd3dfd8 000000000000fb88 ffff8810085ab098
Call Trace:
[] ? futex_wake+0x10e/0x120
[] rwsem_down_failed_common+0x95/0x1d0
[] rwsem_down_read_failed+0x26/0x30
[] call_rwsem_down_read_failed+0x14/0x30
[] ? down_read+0x24/0x30
[] sys_madvise+0x18c/0x7b0
[] ? sys_futex+0x7b/0x170
[] ? audit_syscall_entry+0x1d7/0x200
[] ? __audit_syscall_exit+0x265/0x290
[] system_call_fastpath+0x16/0x1b
INFO: task DNS Resolver #3:19206 blocked for more than 600 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
DNS Resolver D 0000000000000011 0 19206 18845 0x00000080
ffff880a31f29d80 0000000000000086 0000000000000000 ffff8820539987c0
0000000000000d4c 0000000000000000 0000000000000000 0000000000000000
ffff8810048a3af8 ffff880a31f29fd8 000000000000fb88 ffff8810048a3af8
Call Trace:
[] ? futex_wake+0x10e/0x120
[] rwsem_down_failed_common+0x95/0x1d0
[] rwsem_down_read_failed+0x26/0x30
[] call_rwsem_down_read_failed+0x14/0x30
[] ? down_read+0x24/0x30
[] sys_madvise+0x18c/0x7b0
[] ? sys_futex+0x7b/0x170
[] ? audit_syscall_entry+0x1d7/0x200
[] ? __audit_syscall_exit+0x265/0x290
[] system_call_fastpath+0x16/0x1b
INFO: task ps:19504 blocked for more than 600 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ps D 0000000000000014 0 19504 19469 0x00000084
ffff8815e4e13c50 0000000000000086 0000000000000000 ffff8815e4e13c48
ffff882069bcbd40 ffff8815e4e13bd8 ffffffff8119b00a ffff8815e4e13d18
ffff882051cbdab8 ffff8815e4e13fd8 000000000000fb88 ffff882051cbdab8
Call Trace:
[] ? dput+0x9a/0x150
[] rwsem_down_failed_common+0x95/0x1d0
[] ? path_to_nameidata+0x25/0x60
[] rwsem_down_read_failed+0x26/0x30
[] call_rwsem_down_read_failed+0x14/0x30
[] ? down_read+0x24/0x30
[] __access_remote_vm+0x41/0x1f0
[] ? security_inode_permission+0x1f/0x30
[] ? nameidata_to_filp+0x54/0x70
[] access_process_vm+0x5b/0x80
[] proc_pid_cmdline+0x6d/0x120
[] ? alloc_pages_current+0xaa/0x110
[] proc_info_read+0xad/0xf0
[] vfs_read+0xb5/0x1a0
[] sys_read+0x51/0x90
[] ? __audit_syscall_exit+0x265/0x290
[] system_call_fastpath+0x16/0x1b
INFO: task ps:19517 blocked for more than 600 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
ps D 0000000000000006 0 19517 18743 0x00000080
ffff8815e4e19c50 0000000000000082 0000000000000000 ffff8815e4e19c48
ffff882069bcbd40 ffff8815e4e19bd8 ffffffff8119b00a ffff8815e4e19d18
ffff882056a65ab8 ffff8815e4e19fd8 000000000000fb88 ffff882056a65ab8
Call Trace:
[] ? dput+0x9a/0x150
[] rwsem_down_failed_common+0x95/0x1d0
[] ? path_to_nameidata+0x25/0x60
[] rwsem_down_read_failed+0x26/0x30
[] call_rwsem_down_read_failed+0x14/0x30
[] ? down_read+0x24/0x30
[] __access_remote_vm+0x41/0x1f0
[] ? security_inode_permission+0x1f/0x30
[] ? nameidata_to_filp+0x54/0x70
[] access_process_vm+0x5b/0x80
[] proc_pid_cmdline+0x6d/0x120
[] ? alloc_pages_current+0xaa/0x110
[] proc_info_read+0xad/0xf0
[] vfs_read+0xb5/0x1a0
[] sys_read+0x51/0x90
[] ? __audit_syscall_exit+0x265/0x290
[] system_call_fastpath+0x16/0x1b

Any ideas?

Thanks,

Steve

@cousins
Copy link
Author

cousins commented Aug 8, 2013

@behlendorf Just checking to see what the status is. Any idea on when a fix may be out? I have reduced zfs_arc_max to 30 GB, which seems to be helping in general, but I don't know if it will do anything as far as this long-term problem goes. This system has 128 GB of RAM. Would I be better off setting zfs_arc_max to something even lower, or would it potentially help performance if I increased it a bit? The default value seemed to play a part in making the system unstable.
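For reference, the usual way to pin zfs_arc_max is as a module option so it applies from boot (value in bytes; 30 GiB shown); runtime writes to the /sys parameter may not take effect on releases of this vintage, and the effective ceiling can be read back from the ARC kstats:

    echo "options zfs zfs_arc_max=32212254720" > /etc/modprobe.d/zfs.conf   # 30 GiB, applied at module load
    grep c_max /proc/spl/kstat/zfs/arcstats                                 # verify the effective ceiling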

Thanks,

Steve

@behlendorf
Contributor

Several fixes that may help were merged into master just in the last few days. Can you pull the latest SPL/ZFS source and use that? I've also heard reports of issues with transparent huge pages on RHEL 6.3.
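For anyone following along, a sketch of the from-source route on CentOS (SPL must be built and installed before ZFS; this assumes the make rpm target present in the tree at the time, and the exact RPM names will vary):

    git clone https://github.com/zfsonlinux/spl.git
    cd spl && ./autogen.sh && ./configure && make rpm && rpm -Uvh *.x86_64.rpm && cd ..
    git clone https://github.com/zfsonlinux/zfs.git
    cd zfs && ./autogen.sh && ./configure && make rpm && rpm -Uvh *.x86_64.rpm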

@cousins
Author

cousins commented Aug 9, 2013

Thanks @behlendorf and @ryao. I have the latest code installed and it is running now. We'll see how it does over the weekend and (hopefully) beyond.

Have a great weekend.

Steve

@behlendorf
Contributor

Since we haven't heard anything back, I'm going to assume things are better. We'll open a new issue if that's not the case.
