Disable direct reclaim on zvols #669
Conversation
On my openSUSE 12.1 system, this patch does not work. I increased min_free_kbytes and patched zvol.c, but with memtester the system freezes as soon as it hits the swap space. I tried it twice: once it froze at 270 MB, the next time at 670 MB.
This is a default kernel: CONFIG_PREEMPT_NONE=y is set, while CONFIG_PREEMPT_VOLUNTARY and CONFIG_PREEMPT are not. So I need to recompile the kernel to test your patch.
CONFIG_PREEMPT_NONE=y is what should be set. How much RAM do you have? What is your pool configuration? How quickly does it freeze without this patch? If you are booting off ZFS, did you remember to rebuild your initramfs?
@pyavdr I just tried running memtester on my system and it works. Note that I also have patches from pull requests #618, #651 and #660 installed, so they might be the reason my system is stable while yours is not. In particular, I suspect that pull request #618 is responsible. Pull request #660 might also play a role.
OK, I use 12 GB in a VM session, and can go up to 25 GB when needed. (The pool is the latest version: 4 drives, no raidz/mirror, with a 10 GB zfsswap volume.) I would like to try those patches too, but I can't figure out how to apply all these various patches in their different states (I edited your patch manually), so I need some time to apply them. The "how to" for applying all these patches is spread all over. If you find the time, maybe you could put them together into a single patch; it would be easier to figure out how to apply. But besides that, if you really found it, congratulations!
@pyavdr You can get the code with all of these patches from the gentoo branch of my fork: https://github.com/gentoofan/zfs/tree/gentoo I intend to snapshot that branch when I feel that ZFS is ready to enter Gentoo's testing tree. By the way, pull request #660 resolves a stability issue that predominantly affects systems with more than 8GB of RAM. It seems like it might address your issue. Alternatively, you could reduce the amount of RAM that you give to your virtual machine.
How much RAM are you asking memtester to use? It will do mlock, which will prevent the kernel from having access to that RAM. It is possible that this is what is killing your system. On my system with 8GB of RAM, I only let memtester take half of it. There was a significant lag when it did this because of mlock, although it went away after the kernel finished reorganizing system memory. Also note that 131072 is the correct value for my system, but the correct value for yours could be higher given that you have more RAM.
OK, I applied #651 and set vm.min_free_kbytes=512000. Started memtester with 2000/4000/6000/8000/10000: no problem, it runs through, no swap needed. Started memtester with 12000: hmmm: freeze, with some traffic on the zfsswap devices. After some minutes, my Windows 7 host gives a BSOD: memory management. I need to reboot the whole system, brrrrrr.
I cannot say that this is surprising. Memtester does allocations that are incapable of being swapped. When you run memtester, memory is effectively removed from your system. Giving it nearly all of your RAM like you did would likely kill Linux regardless of the filesystem used. You need to find something else to do allocations that can be swapped. A few instances of …
OK, starting instances of python -c "print 21010" & ... one ... 20 seconds waiting ... two ... three, and so on ... after 7 instances the system hits the swap space, which leads to a freeze.
What is the default setting for vm.min_free_kbytes? Did you try increasing it?
I started some hours ago with the default of about 68000, increased it to 131072, and finally to 512000.
Are you certain that you properly patched the kernel modules and that you are running tests with the updated setting? If you are using code from my Git repository, did you do …
Yes, I'm pretty sure. What I did to merge the patches: I edited the specific files (.../module/zfs/arc.c, ...) with the changes from the patches. These are only a few lines, no problem. After that: make; make install. Make finds the changes, compiles, and links. "make install" copies the new files to their destination, then I reboot. In the last weeks I changed several source files in zpool.c or zfs.c, so that procedure works as expected. Just to be sure, I now tried it from the start: make clean; ./configure; make; make install; reboot. After starting the pythons, it freezes again. I don't know the git interface yet; I need to check it, there is already a book lying here on my desk :-).
Thanks for digging into this; I suspected something like this was going on. There have been a number of similar deadlocks which were resolved by disabling direct reclaim where appropriate. In general, I've tried to do this in as targeted a manner as possible. As I'm sure you saw, I have been forced to resort to setting PF_MEMALLOC on occasion. Considerable care needs to be taken when doing this to ensure no other bits get mistakenly cleared or set in current->flags. For this reason, it's usually better to target the specific memory allocations which are causing the issue. Direct reclaim can be disabled for those kmem_alloc()'s by using KM_PUSHPAGE instead of KM_SLEEP. In this case I'm not 100% sure it will be possible to pass KM_PUSHPAGE to all the offending allocations under zvol_write(), but it's worth investigating since it may be cleaner. Getting a full stack of the exact deadlock would be helpful. However, if we stick with your proposed patch, you're going to need to move the setting of PF_MEMALLOC to the very top of the function, or even better into a wrapper function. Both zil_commit() and zfs_range_lock() have the potential to enter direct reclaim, and they are outside the flag's scope.
@behlendorf I am not certain that a wrapper function would be appropriate given that there is only a single call to zvol_write(), but it might be useful to write a helper macro that takes a variable, its type, some flags to set temporarily in that variable, a function pointer, and the arguments to that function. It could be used to wrap the zvol_write() call, but I am not sure where the appropriate place would be to put this macro. I have modified the patch to address the issues that you highlighted. I have also updated the pull request message at the top. In theory, we could use thread-local storage to control the value used in all allocations that currently use KM_SLEEP, so that we can flip them to KM_PUSHPAGE on demand. That would work well with the wrapper macro idea. In addition, I suspect that the reason I need to set vm.min_free_kbytes is that indirect reclaim can fail. I believe that results in the additional deadlock I have observed with this patch, where the system is not immediately crippled. I think we can solve that by modifying the SPL to maintain a pool of pages as per /proc/sys/kernel/spl/vm/swapfs_reserve and to provide a thread-local storage flag that ZFS can set to permit indirect reclaim to draw from those pages. That should probably be a separate patch.
On second thought, maintaining a pool of pages in the SPL that are released on demand would suffer from a race condition on SMP systems, where another thread could steal the pages meant for ZFS. This could also happen on uniprocessor systems where preemption is possible if we are not careful. Addressing that could require implementing a memory allocator for ZFS, which should be able to guarantee that pages reserved for ZFS would only be used by ZFS.
I have done some additional testing. This patch permits a single-disk pool to use swap on a zvol if vm.min_free_kbytes is sufficiently high, but it does not appear to have the same effect on a pool with a single 6-disk raidz vdev on my server, which has 16GB of RAM. Increasing the value of vm.min_free_kbytes to 1048576 permits some amount of writing to swap, but then all writes stop as what appears to be a soft deadlock occurs. Running the …
@gentoofan Your updated patch doesn't do what you think it does. The zvol_dispatch() function dispatches the zvol_write() function to be executed in the context of one of the zvol taskq threads. So you're setting the PF_MEMALLOC flag in the dispatching thread, but that will have no effect, since zvol_write() will be done by one of the taskq worker threads. If we're going to take the PF_MEMALLOC approach, this bit must be set in zvol_write(). I'd suggest a wrapped function like this:

```c
__zvol_write()
{
	/* Existing zvol_write() implementation */
}

zvol_write()
{
	if (current->flags & PF_MEMALLOC) {
		error = __zvol_write();
	} else {
		current->flags |= PF_MEMALLOC;
		error = __zvol_write();
		current->flags &= ~PF_MEMALLOC;
	}

	return (error);
}
```

This should also resolve your indirect reclaim case for kswapd, which will already have set PF_MEMALLOC. See commit 6a95d0b for a better explanation of this race; we fixed a similar subtle issue in the mmap code the same way. Longer term, I think the best way to address this is still to use KM_PUSHPAGE in all the offending allocations. This is what all the other filesystems in the Linux kernel do; they must be very careful about not allocating any memory in the write path. If they absolutely have to, this flag can be used, which really maps to GFP_NOFS.
Related to this, I still really want to see a stack to ensure we're addressing the real deadlock here. Do you happen to have a trivial reproducer for a VM? I'm set up to get a stack.
@behlendorf Thanks for catching that. I will revise the patch shortly. As for reproducing this, give the VM 2G of RAM and do this:
…
@behlendorf I have pushed a revised version of my patch, but I think that this still needs more work. Swap appears to work properly on both my desktop and my server, and there is no longer any need to edit vm.min_free_kbytes. Unfortunately, I was able to observe a hard deadlock on my desktop when running an instance of …

I believe that a deadlock can occur where other threads consume pages as indirect reclaim frees them, starving the kernel thread that needs those pages to be able to swap. I think that can be fixed by implementing a TLS flag in the SPL that would enable us to flip allocations to KM_NOSLEEP. This would guarantee that allocations fall back on emergency memory pools, which should prevent the deadlock I observed under load.
It looks like using PF_MEMALLOC is inappropriate: http://lkml.indiana.edu/hypermail/linux/kernel/0911.2/00576.html The main issue is that PF_MEMALLOC permits ZFS to take pages out of ZONE_DMA. That could cause a crash by exhausting the pages available for DMA. We should be able to address this issue by flipping allocations to use KM_NOSLEEP instead of setting PF_MEMALLOC.
Exactly right, PF_MEMALLOC is a bit of a last resort and we should avoid using it if at all possible. The two places in the existing code where I was forced to use it are because I was unable to modify the exact point of allocation, since it was in the kernel proper. Setting PF_MEMALLOC allowed me to work around the issue without forcing people to patch their kernels. Anyway, back to this particular issue. I agree the best solution is to pass the proper flags at all the offending allocation points. KM_PUSHPAGE should be enough for this; I don't think we need to resort to KM_NOSLEEP, which comes with its own issues. We just need to avoid a deadlock due to reentering reclaim while we're writing out pages. I'll try to get some stacks tomorrow, which I expect will make the issue a bit more concrete.
@behlendorf If you have a kernel patch to eliminate the need for PF_MEMALLOC in ZFS code that is ready for upstream, I could try talking to Greg Kroah-Hartman about sending it to Linus Torvalds for inclusion. Greg is a Gentoo developer and he might be willing to assist my ZFS efforts in Gentoo.
@gentoofan Yes and no. Ricardo and I started a nice thread on linux-mm with Andrew Morton and got everyone to agree that this is in fact a real bug which should be fixed. However, the right fix (in the thread) is pretty invasive and ends up touching all the various arches, which makes it a testing nightmare. Anyway, since I was able to work around it (which I need to do for older kernels anyway), I stopped pushing the issue. Also, since it relates to vmalloc(), which is something we need to stop using heavily in the long run, I didn't feel it was worth the fight. Still, I encourage you to read the thread.
@behlendorf I tried modifying the code to use KM_PUSHPAGE, but the system will not write to swap: https://github.com/gentoofan/spl/commits/gentoo I am still examining this, but any thoughts that you might have would be appreciated.
Increasing vm.min_free_kbytes to 524288 enables my new patchset to swap. Without that, the system refuses to write to swap, but it does not hard deadlock immediately. Lower values might also work, although I have not tested them yet. Also, the deadlock involving 4 simultaneous python processes does not appear to occur with my new patchset.
To be honest, I'm not a big fan of the TSD approach. Having the lower layers modify the passed flags is asking for problems in my view. Plus, the TSD code is already rarely used in ZFS, and I've been tempted a few times to remove it. I'd prefer to either: A) just set PF_MEMALLOC, or B) explicitly pass KM_PUSHPAGE for all impacted allocations. The latter might be a little broad, but it is hardly any worse than disabling reclaim for all zvol writes. I will try to spend some time on this myself over the next week or two.
The following appears to work:
I guess the question should be what the loopback device does that zvols fail to do.
@behlendorf It looks like the loopback device works because of the following lines in taskq_thread() in the SPL, which set PF_MEMALLOC:
…
I noticed that the loopback device sets nice to -20. I patched the SPL to use that as well and ran my stress tests. My desktop no longer deadlocks when running 4 simultaneous python processes, so I have opened a pull request with zfsonlinux/spl. Addressing the issue of PF_MEMALLOC taking pages from ZONE_DMA is important, but that would probably be best addressed as part of a more comprehensive fix.
I have revised my patch to set the flag that is being used in the codepath taken by the loopback device. I have also revised the commit message to reflect that. I am now at the point where I feel that this should close issue #342.
Previously, it was possible for the direct reclaim path to be invoked when a write to a zvol was made. When a zvol is used as a swap device, this often causes swap requests to depend on additional swap requests, which deadlocks. We address this by disabling the direct reclaim path on zvols. This closes issue openzfs#342.
@ryao So what's the latest on this change? I lost track of the latest testing. Is setting TASKQ_NORECLAIM enough to resolve most issues? If so, I'm not averse to merging it, since it clearly does help, although I suspect this will need more work.
@behlendorf Setting TASKQ_NORECLAIM eliminated all issues that I have encountered with swap on zvols. The only known issue is the theoretical one of DMA pages being consumed by ARC. The dma-kmalloc* entries in my desktop's /proc/slabinfo show only a single slab consuming DMA pages after several days of uptime and several instances of heavy swap usage. This suggests to me that crashes caused by DMA page consumption would be incredibly rare in practice:

```
slabinfo - version: 2.1
# name <active_objs> <num_objs> : tunables : slabdata <active_slabs> <num_slabs>
delayed_node 0 0 328 24 2 : tunables 0 0 0 : slabdata 0 0 0
```
Awesome, then I'll merge this patch into master since it's clearly safe and improves stability. It may be a little broad, but we can always revisit this later if it leads to issues such as larger latencies. Thank you again for all your testing of this change and for iterating with me on a reasonable fix.
Merged as commit ce90208 |