
Reinstate zvol_taskq to fix aio on zvol #5824

Closed

Conversation

Contributor

@tuxoko tuxoko commented Feb 23, 2017

Commit 37f9dac removed the zvol_taskq used for processing zvol requests. I
imagine this was removed because, after we switched to a make_request_fn based
approach, we no longer receive requests from interrupt context.

However, this also made all bio requests synchronous, causing a serious
performance issue: the bio submitter waits for every bio it submits,
effectively limiting the iodepth to 1.

This patch reinstates zvol_taskq and refactors zvol_{read,write,discard} to
take a bio as their argument.

Signed-off-by: Chunwei Chen david.chen@osnexus.com
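
For illustration only, here is a minimal sketch of the kind of dispatch the patch describes; the taskq field, worker function, and flags below are assumptions for the sketch, not the exact code in the patch:

    /*
     * Hypothetical sketch: service each bio from a taskq worker instead of
     * inline in zvol_request(), so the submitter is not blocked until the
     * I/O completes.  zv_taskq and zvol_request_task are illustrative names.
     */
    static void
    zvol_request(struct request_queue *q, struct bio *bio)
    {
            zvol_state_t *zv = q->queuedata;

            /* taskq_dispatch() returns 0 when no task could be allocated. */
            if (taskq_dispatch(zv->zv_taskq, zvol_request_task, bio,
                TQ_NOSLEEP) == 0)
                    zvol_request_task(bio); /* fall back to synchronous handling */
    }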

How Has This Been Tested?

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My code follows the ZFS on Linux code style requirements.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • Change has been approved by a ZFS on Linux member.

@mention-bot

@tuxoko, thanks for your PR! By analyzing the history of the files in this pull request, we identified @bprotopopov, @behlendorf and @edillmann to be potential reviewers.

rl_t *rl;
dmu_tx_t *tx;
#ifdef HAVE_GENERIC_IO_ACCT
unsigned long start_jif = jiffies;
Contributor

nit: I prefer to see the start-time jiffies adjacent to the generic_start_io_acct() call; indeed, you could even include the generic_start_io_acct() inside the ifdef, for clarity.

Contributor

Yes, that would be nice. You can even now declare the variable there since 4a5d7f8 was merged.

@@ -56,9 +56,11 @@

unsigned int zvol_inhibit_dev = 0;
unsigned int zvol_major = ZVOL_MAJOR;
unsigned int zvol_threads = 32;
Contributor

It seems to me that reflecting the current value of zvol_threads via blk_update_nr_requests() would make this more consistent with other blk devices.
A stretch goal, which is fine to defer as later work, is to allow zvol_threads to be changed via nr_requests in sysfs.

Contributor

@behlendorf behlendorf left a comment

By parallelizing this again with a taskq we're potentially allowing IOs to be submitted and completed out of order. For example, the request queue might contain a WRITE followed by a READ for the same LBA. That could end up being processed by the taskq as a READ followed by a WRITE, which wouldn't be good.

One possible way to correctly handle this would be to take the required range locks synchronously in zvol_request() prior to dispatching the bio to the taskq. That way we're certain all the operations will be processed in the order in which they were received, and still done in parallel when possible. We may however need to take some care to ensure this can never deadlock due to pending requests in the taskq.

This would also be a great opportunity to investigate implementing the zvol as a multi-queue block device. In theory this should also allow us to improve the sustained IOPS rate for a zvol.
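
A rough sketch of that ordering-preserving approach, assuming illustrative names for the per-request struct, the range-lock field, and the worker function (none of these are the final code):

    /* Carries a bio plus the range lock taken at submission time. */
    typedef struct zv_request {
            zvol_state_t    *zvr_zv;
            struct bio      *zvr_bio;
            rl_t            *zvr_rl;
    } zv_request_t;

    static void
    zvol_submit_write(zvol_state_t *zv, struct bio *bio, uint64_t offset,
        uint64_t size)
    {
            zv_request_t *zvr = kmem_alloc(sizeof (*zvr), KM_SLEEP);

            zvr->zvr_zv = zv;
            zvr->zvr_bio = bio;
            /* Lock synchronously, in submission order, before going async... */
            zvr->zvr_rl = zfs_range_lock(&zv->zv_range_lock, offset, size,
                RL_WRITER);
            /* ...the worker performs the write and then drops zvr_rl. */
            (void) taskq_dispatch(zv->zv_taskq, zvol_write_task, zvr, TQ_SLEEP);
    }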

zvol_state_t *zv = q->queuedata;
fstrans_cookie_t cookie = spl_fstrans_mark();
uint64_t offset = BIO_BI_SECTOR(bio)<<9;
Contributor

nit: Let's add single spaces around the << operator.


@tuxoko
Contributor Author

tuxoko commented Feb 27, 2017

One possible way to correctly handle this would be to take the required range locks synchronously in zvol_request() prior to dispatching the bio to the taskq. That way we're certain all the operations will be processed in the order in which they were received, and still done in parallel when possible. We may however need to take some care to ensure this can never deadlock due to pending requests in the taskq.

I have thought of that, I just wasn't sure if we needed it. But I think you're right that we should do this. I don't think there's any way to deadlock since the lock dependency doesn't change at all.

This would also be a great opportunity to investigate implementing the zvol as a multi-queue block device. In theory this should also allow us to improve the sustained IOPS rate for a zvol.

I had wanted to do that for a long time, but it would mean we need to support two completely different zvol_request paths due to legacy kernel support. So I haven't brought myself to do it.

@behlendorf
Contributor

I don't think there's any way to deadlock since the lock dependency doesn't change at all.

I can't think of a scenario either since we won't be taking any locks in the taskq function.

@sempervictus
Contributor

sempervictus commented Feb 28, 2017

This PR seems to be hitting the code path which causes the bug I describe in #4265. I ran the same non-sync-IO tiotests against a fresh ZVOL created before the upgrade, and one recreated after the upgrade, and saw something rather interesting. In one of the test cycles after the upgrade, the linear/random write speed ratio flipped - the linear writes went >2X faster than the random, but the random writes slowed down to hell. I have Linux prefetch disabled on ZVOLs using fad4380317, both before and after, and up till now have not seen anything since the ZVOL overhaul that made linear writes work even near the speed of the underlying SSD.

As I mention in the issue, the ZVOL use case is rather important for shared storage, and today it's impractical for significantly loading consumers and/or contending ones. If there is something serializing the write requests into blocking IOs, this PR might be hitting it, and while the results aren't great from the paltry bench I ran (random writes at half the speed, sometimes, maybe), it seems to be in the right ballpark.

Here's what I'm seeing using the same non-sync-IO tiotest I ran for the issue benchmarks:
Pre patch:

| Write        1024 MBs |    6.8 s | 151.517 MB/s |   8.0 %  | 767.9 % |
| Random Write  469 MBs |    1.9 s | 241.722 MB/s |  24.2 %  | 756.3 % |
| Write        1024 MBs |    7.5 s | 137.415 MB/s |   7.7 %  | 711.5 % |
| Random Write  469 MBs |    2.0 s | 234.369 MB/s |  19.4 %  | 751.2 % |
| Write        1024 MBs |    7.5 s | 136.030 MB/s |   7.1 %  | 703.2 % |
| Random Write  469 MBs |    3.0 s | 157.182 MB/s |  12.9 %  | 524.1 % |

Post patch:

| Write        1024 MBs |    7.5 s | 135.746 MB/s |  11.2 %  | 484.5 % |
| Random Write  469 MBs |    1.8 s | 265.354 MB/s |  23.2 %  | 261.1 % |
| Write        1024 MBs |    3.5 s | 289.724 MB/s |  17.1 %  | 477.1 % |
| Random Write  469 MBs |    3.8 s | 122.846 MB/s |   8.1 %  | 265.2 % |
| Write        1024 MBs |    7.0 s | 145.581 MB/s |   9.5 %  | 249.2 % |
| Random Write  469 MBs |    2.3 s | 199.482 MB/s |  15.1 %  | 176.1 % |

3X loop of tiotest to reproduce:

for i in 1 2 3; do tiotest -R -d /dev/zvol/dpool/images/ssd-testvol -r 30000 -f 256 -t 4 ; done

EDIT:

This PR also destroys read speeds. Looking at the same benchmarks, I'm guessing we're getting blocked in the AIO layer before we hit ARC:
Pre patch:

| Read         1024 MBs |    0.8 s | 1260.590 MB/s |  64.9 %  | 1142.5 % |
| Random Read   469 MBs |    1.3 s | 369.003 MB/s |  35.4 %  | 1337.2 % |
| Read         1024 MBs |    0.8 s | 1236.318 MB/s |  59.3 %  | 1130.5 % |
| Random Read   469 MBs |    1.3 s | 351.625 MB/s |  37.2 %  | 1282.8 % |
| Read         1024 MBs |    0.8 s | 1245.544 MB/s |  65.5 %  | 1121.4 % |
| Random Read   469 MBs |    1.3 s | 358.593 MB/s |  32.2 %  | 1320.2 % |

Post patch:

| Read         1024 MBs |    4.2 s | 245.208 MB/s |  26.8 %  | 419.7 % |
| Random Read   469 MBs |    1.8 s | 259.411 MB/s |  28.3 %  | 409.6 % |
| Read         1024 MBs |    4.2 s | 244.125 MB/s |  24.4 %  | 419.7 % |
| Random Read   469 MBs |    1.8 s | 259.415 MB/s |  28.8 %  | 408.0 % |
| Read         1024 MBs |    4.2 s | 244.016 MB/s |  25.5 %  | 421.1 % |
| Random Read   469 MBs |    1.8 s | 260.071 MB/s |  30.5 %  | 404.5 % |

@sempervictus
Contributor

Running these workloads at 15% of the volsize 10X (each overwriting the last, supposedly), write performance goes down to roughly 80/80 MB/s (linear/random), while reads appear capped at that ~250 MB/s mark.

On a related note, we need a consistent benchmarking approach which accounts for all the desirable synthetic counters (fio, others?) and real-world applications written by people who don't think about IO models or OS semantics (tiotest, others?), and produces common outputs we can all work with. Different users will be interested in different representations of said benchmarks and their associated tunables, since people backing ASM with ZVOLs and those backing Cinder will probably not have the same IO patterns (till a Cinder user builds a RAC in their cloud).

@tuxoko: This work is obviously of great interest to us, so if you need us to test ideas/options, we can definitely allocate some resources, including bare metal if need be.

@tuxoko
Contributor Author

tuxoko commented Feb 28, 2017

@sempervictus
Thanks for the testing. That's strange, I wouldn't expect it to affect sequential reads that much.

Also, I'm still struggling with some performance inconsistencies. I originally tested this patch on a bare-metal pool of 8 spindle disks in mirrored pairs with 256GB RAM. Doing fio 4k randwrite with iodepth 32 would cause strange behavior: it would write for 3~5 sec, hang for 0.5~2 min, and repeat. I figured out that reducing zfs_dirty_data_max from 10% of total memory to 1G greatly reduces this behavior.

However, when testing on a KVM guest with SSD-backed virtual disks and 8GB RAM, the same test would still show intermittent hangs of 2~4 sec.

@sempervictus
Contributor

@tuxoko: the readahead patch was not helping me here, so I dropped it, added TRIM support, and the numbers are looking a lot better on the same 840 SSD.
Reads after TRIM + allowing Linux readahead to work on ZVOLs:

| Read         1024 MBs |    0.7 s | 1394.343 MB/s |  79.6 %  | 968.1 % |
| Random Read   469 MBs |    2.0 s | 232.904 MB/s |  27.5 %  | 426.8 % |
| Read         1024 MBs |    0.7 s | 1372.518 MB/s |  65.8 %  | 967.8 % |
| Random Read   469 MBs |    2.0 s | 229.402 MB/s |  31.0 %  | 420.7 % |
| Read         1024 MBs |    0.7 s | 1391.270 MB/s |  74.3 %  | 972.9 % |
| Random Read   469 MBs |    2.0 s | 234.224 MB/s |  27.7 %  | 425.2 % |

Writes in the same conditions:

| Write        1024 MBs |    3.2 s | 319.611 MB/s |  20.4 %  | 459.8 % |
| Random Write  469 MBs |    3.6 s | 128.858 MB/s |  10.9 %  | 131.5 % |
| Write        1024 MBs |    3.6 s | 286.579 MB/s |  15.5 %  | 485.8 % |
| Random Write  469 MBs |    1.2 s | 407.385 MB/s |  23.3 %  | 479.7 % |
| Write        1024 MBs |    3.7 s | 273.846 MB/s |  11.6 %  | 430.9 % |
| Random Write  469 MBs |    0.9 s | 496.128 MB/s |  39.7 %  | 588.1 % |

To see how reuse of the volume space affects this stack, I ran the same test 20X with 8 threads instead of 4 to double the amount written. It's a bit "all over the place" in terms of numbers, which is slightly concerning, but it definitely shows improvement and doesn't show a clear trend of performance degrading to a crawl. Results paste - http://pastebin.com/BvQ9XLnu

@behlendorf
Contributor

behlendorf commented Feb 28, 2017

It may be worth exploring creating multiple zvol taskqs. Previously there was some concern over contention on the taskq spin lock, which could result in inconsistent behavior. One taskq per queue in a multi-queue setup might work well.

On a related note, setting maxalloc to the number of threads might be helpful to restrict how fast new tasks can be added to the taskq. If this approach seems promising we'd want to improve the blocking code in the SPL's task_alloc() function.
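
As a sketch of that suggestion, using the illumos/SPL taskq_create() interface (the choice of zvol_threads for both minalloc and maxalloc is an assumption about the idea above, not merged code):

    static taskq_t *
    zvol_taskq_alloc(void)
    {
            /*
             * Cap maxalloc at the thread count so task allocation itself
             * throttles how quickly new requests can be queued.
             */
            return (taskq_create("zvol", zvol_threads, maxclsyspri,
                zvol_threads /* minalloc */, zvol_threads /* maxalloc */,
                TASKQ_PREPOPULATE));
    }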

@dweeezil
Contributor

I just got a test rig set up for this and have run a couple of initial 1 hour write-only tests with this fio configuration:

[test]
        blocksize=8k
        scramble_buffers=1
        disk_util=0
        invalidate=0
        size=256m
        numjobs=32
        create_serialize=1
        direct=1
        filename=/dev/zvol/tank/v1
        offset=0
        offset_increment=10g
        group_reporting=1
        ioengine=libaio
        iodepth=10
        rw=write
        thread=1
        time_based=1
        runtime=3600
        fsync=0
        fallocate=none

The pool is 50 spinning disks in a set of 25 2-drive mirrors so it ought to have pretty good bandwidth and iops. I capped the ARC at 32GiB for this test, but the test never used much more than 8GiB of it. One test was run with today's master and the other with the zvol taskq patch rebased to the same master. The test system has 40 cores by 2 threads (80 threads).

Picking just the write bandwidth and IO depth output from current master:

  write: io=1000.8GB, bw=291491KB/s, iops=36436, runt=3600049msec
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=131172559/d=0, short=r=0/w=0/d=0

and with the taskq it is:

  write: io=1043.2GB, bw=303849KB/s, iops=37981, runt=3600005msec
  IO depths    : 1=0.1%, 2=0.1%, 4=0.1%, 8=100.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.1%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=136732348/d=0, short=r=0/w=0/d=0

I'd be more than happy to run any other battery of tests but this one doesn't show a whole lot of difference.

@sempervictus
Contributor

@dweeezil: Thanks for putting together such a robust rig :). What happens when job concurrency is replaced with thread concurrency? A la numjobs=1 thread=32 with size=8192m (if I remember my FIO correctly)?

@dweeezil
Contributor

@sempervictus Actually, thanks to a corporate citizen with an interest in the advancement of OpenZFS.

Anyway, I will try the same test with threads but in my experience it generally doesn't change the results much. I normally run with processes for easier access via strace, etc.

I'd sure love to be able to concoct an fio workload that exposes a big problem with the current zvol (req based) system, but so far I haven't been able to do so.

@sempervictus
Contributor

sempervictus commented Feb 28, 2017

@dweeezil: I generally use fio to work out how databases and other consumers with synchronous IO requirements will fare, or at least those with predictable access requirements based on their workload. When it comes to dealing with consumers such as the KV indices/datastores everyone's so fond of these days, I actually often find tiotest and fio with direct=0 to be a slightly better indicator of the likely profile some of these chaotic consumers will need.

So far as test profiles go, I'd say we should keep something like what you're currently doing as a baseline, add non-direct tests, and implement some real-world scenarios like everyone's favorite iSCSI export. With AIO working again, SCST might be a viable option (though IIRC it needs SG hooks), but LIO is always there and seems to induce undue iowait on ZVOLs in real-world environments. There's always sync=always if you want to ramp up the hate factor...

I've seen strange things like multiple VMs using LVM iSCSI exports from the same large ZVOL and occasionally bringing it to a crawl with some unpleasant order/priority of small IOs which just happen to line up poorly. I've always thought this to be indicative of some misunderstanding between the OS and ZFS IO elevators. There's also ASM/RAC, but that requires a faustian licensing bargain, and potentially the NFS workload which can become a different kind of mess depending on the NFS conditions in kernel and between hosts.

@richardelling
Contributor

The value of this is not in the average, overall performance. The value is to provide an async experience to those consumers who don't handle async themselves. Specifically, a consumer that expects generic_make_request() to return prior to the BIO_END_IO() will be happier. Such consumers exist, but beware of testing with the various load generators because they tend to have pluggable engines with differing policies.

@sempervictus
Contributor

sempervictus commented Feb 28, 2017 via email

@tuxoko
Contributor Author

tuxoko commented Mar 1, 2017

However, when testing on a KVM guest with SSD-backed virtual disks and 8GB RAM, the same test would still show intermittent hangs of 2~4 sec.

OK, so this is probably due to how qemu/kvm handles the virtual disk cache. I turned the cache off and things look a lot better.

@richardelling
Contributor

@sempervictus I think we are in agreement. Regression testing is important. In this case, there are only a few consumers that are blocking on this, so the appropriate test is needed to prove the functionality works.

@tuxoko I think an appropriate change is to keep the current, blocking behaviour between what is essentially generic_start_io_acct() and generic_end_io_acct(), when zvol_threads == 0. Only create the threads and offload to them when zvol_threads > 1. That way we can take care of the blocking consumers as an exception, rather than the rule. Thoughts?

@tuxoko
Contributor Author

tuxoko commented Mar 2, 2017

@richardelling
What do you mean by blocking consumer? submit_bio is always expected to be an asynchronous interface. If the user wants to wait, it will set up bio->bi_end_io to signal completion and explicitly wait for it. And there's no way the zvol would know what the user wants.
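
For context, a minimal sketch of that convention; the exact submit_bio()/bi_end_io signatures vary across kernel versions, so treat this as illustrative rather than definitive:

    /* Completion callback the submitter installs itself. */
    static void
    my_end_io(struct bio *bio)
    {
            complete(bio->bi_private);
    }

    /*
     * A submitter that wants to block does so explicitly; submit_bio()
     * itself never waits for the I/O.
     */
    static void
    submit_bio_and_wait(struct bio *bio)
    {
            DECLARE_COMPLETION_ONSTACK(done);

            bio->bi_private = &done;
            bio->bi_end_io = my_end_io;
            submit_bio(bio);
            wait_for_completion(&done);
    }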

@richardelling
Contributor

@tuxoko as I read the current code, BIO_END_IO completes before zvol_request completes. I'll use the term blocking, because the request blocks until after the I/O completes. It does not matter if the consumer sets a callback. Functionally, this is fine. There appear to be very few consumers that will expose this behaviour.

@sempervictus
Contributor

sempervictus commented Mar 15, 2017

So it looks like I found some viable tuning knobs in sysfs relating to how the block device writes are handled with this patch:
1 - Linear writes via caching strategy - switching /sys/devices/virtual/block/zdX/queue/write_cache from 'write back' to 'write through' with this patch roughly doubles linear write performance on a used-and-abused ZVOL, from ~180MB/s to ~400MB/s. Makes sense; we already have ARC, why write back?
2 - Random writes by eliminating merging at the Linux layer - switching /sys/devices/virtual/block/zdX/queue/nomerges from 0 to 1 took random writes from ~250MB/s to ~700MB/s. Also makes sense, since the ZFS pipeline will aggregate/merge those IOs anyway. Another 40-80MB/s in random writes seems to come from setting rotational to 1, which I'm at a loss to explain.
We may want to set ZVOLs to 'write through' by default with nomerges at 1 if @dweeezil or others can reproduce these results. The numbers seem to "merit checking out."

EDIT: these tests were done with read_ahead_kb set to 0 and compression=lz4, on a single 850 Pro. This PR, the TRIM PR, and multi-threaded SPA sync are included in the stack; the build was done yesterday. I can pull the commit ref if needed.

EDIT2: I took a quick look through blkdev_compat.h and it looks like calling blk_queue_set_write_cache as blk_queue_set_write_cache(zv->zv_queue, B_FALSE, B_TRUE); should remove the write-cache issue, but QUEUE_FLAG_NOMERGES looks a bit more complicated, and I've not yet even found where we hook the max_readahead_kb control. Benchmark loops in the background are still producing the same results; it seems to slow down ~15% with volblocksize=4k (from 8k), but there's no significant increase past 8K. Also, does anyone know how we play with the blk_mq vs blk_sg differences, if at all? The MQ changes seem to fit rather well into the idea of worker threads in ZFS.
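
If those defaults were applied in the driver rather than via sysfs, a hedged sketch might look like the following; the write-cache call is the compat wrapper mentioned above, while the NOMERGES flag helper varies by kernel version, so this is an untested assumption:

    /* Advertise write-through semantics; ARC already buffers writes. */
    blk_queue_set_write_cache(zv->zv_queue, B_FALSE, B_TRUE);

    /* Skip block-layer merging; the ZFS pipeline aggregates IOs itself. */
    queue_flag_set_unlocked(QUEUE_FLAG_NOMERGES, zv->zv_queue);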

@sempervictus
Contributor

@dweeezil: any chance you could add #5902 to this patch stack in your testbed and run those numbers again with FIO? Initial testing on my end with O_DIRECT still shows a pretty significant improvement.

@dweeezil
Contributor

@sempervictus Sure. I'll add it to my todo list this weekend (which was mainly to get the TRIM stuff refreshed). Presumably my current 4.9.8 upstream kernel will be OK for this testing?

@sempervictus
Contributor

sempervictus commented Mar 18, 2017 via email

@ryao
Contributor

ryao commented Mar 20, 2017

@tuxoko Right now, only reads are synchronous while writes are asynchronous, unless we write a partial uncached record. In the case of a partial write of a cached record or a write of a whole record (cached or uncached), we return to userspace after copying into the kernel, and the DMU then handles things from there.

The correct solution is to rework the DMU so that asynchronous dispatch of reads and writes occurs there, and then execute a callback to notify userspace when it finishes. There actually is a callback notification mechanism in the ZIO layer, but it needs changes to support reading/writing from a userspace buffer.

That was my original plan to take performance to the next level at some point after that patch. zvol performance became a low priority for me soon afterward because people at my new job at the time didn't care about zvols. Now that I am no longer there, I have time to care again. This definitely has my attention because it is affecting me at the moment, but it is number 4 on my list of things to do right now.

@sempervictus
Contributor

sempervictus commented Mar 23, 2017 via email

@ryao
Contributor

ryao commented Mar 29, 2017

@sempervictus We'll see. It is number 3 on my to do list.

@behlendorf
Contributor

behlendorf commented Apr 8, 2017

@tuxoko OK, it's pretty clear this is a change we need to make. When you get a moment can you rebase this patch against master so we can get a clean, current test run?

EDIT: @sempervictus does this PR still negatively impact your workloads even with your proposed changes in #5902?

@tuxoko
Contributor Author

tuxoko commented Apr 10, 2017

Rebased.

Commit 37f9dac removed the zvol_taskq for processing zvol requests. I
imagine this was removed because, after we switched to a make_request_fn
based approach, we no longer receive requests from interrupt context.

However, this also made all bio requests synchronous, causing a serious
performance issue as the bio submitter would wait for every bio it
submitted, effectively limiting the iodepth to 1.

This patch reinstates zvol_taskq, and to make sure overlapped I/Os are
ordered properly, we take the range lock in zvol_request and pass it along
with the bio to the I/O functions zvol_{write,discard,read}.

Signed-off-by: Chunwei Chen <david.chen@osnexus.com>

In order to facilitate getting this merged and aid in benchmarking,
add a zvol_request_sync module option to switch between sync and
async request handling.  For the moment, the default behavior
remains sync.

Additionally fix a few minor nits.

Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
@behlendorf
Contributor

Based on the discussion above, and what I've seen in other issues, it seems clear that some workloads/configurations benefit from handling bio requests asynchronously while others do better with synchronous semantics.

In order to get this merged and facilitate more comprehensive performance testing, I've added a patch to this PR which adds a zvol_request_sync=1 module option that allows administrators to switch between the sync and async behavior. I've left the default behavior as synchronous while we further investigate the performance implications.
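
Roughly, the switch amounts to something like the sketch below (only the zvol_request_sync name comes from the patch; the surrounding names and flow are assumptions):

    static unsigned int zvol_request_sync = 1;      /* default: keep sync behavior */
    module_param(zvol_request_sync, uint, 0644);
    MODULE_PARM_DESC(zvol_request_sync, "Synchronously handle bio requests");

    static void
    zvol_request(struct request_queue *q, struct bio *bio)
    {
            zvol_state_t *zv = q->queuedata;

            /* Either service the bio inline or hand it to the taskq. */
            if (zvol_request_sync ||
                taskq_dispatch(zv->zv_taskq, zvol_request_task, bio,
                TQ_NOSLEEP) == 0)
                    zvol_request_task(bio);
    }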

@ryao @tuxoko @sempervictus @richardelling please comment and review.

@sempervictus
Contributor

@behlendorf: VM testing hasn't produced any crashes in ztest loops, so that's good - with either 1 or 0 for zvol_request_sync. For clarification, when you say "switch between", do you mean to say that runtime changes to the module option will alter behavior? I've been reloading the module to test in the VM just to be safe.
Will try to get real world numbers in the next day or two. Thank you very much.

@koplover

I've been running some benchmark tests on our servers with 0.7.0 RC3 and this PR applied.

I've outlined the results in that thread to keep it all together, see #4880

@behlendorf
Contributor

Do you mean to say that runtime changes to the module option will alter behavior?

@sempervictus yes exactly. You can safely switch zvol_request_sync at run time to change the behavior. What you can't currently do is change the maximum number of zvol_threads at run time.

@koplover thanks for testing this patch and posting your results. I really appreciate the feedback. The more data points we have regarding performance the better.

Since this change has been reviewed, passes all the automated + manual testing, and doesn't change the existing behavior, I'll be merging it to master. What I'd like to determine over the next week or two is what the optimal default values are and which of the changes in #5902 we need to bring in. As always, additional performance analysis is appreciated!

@tuxoko
Contributor Author

tuxoko commented Apr 26, 2017

@behlendorf
The module option patch LGTM.

@behlendorf
Contributor

@tuxoko thanks! I've squashed the commits and merged this.

wuyan6293 pushed a commit to wuyan6293/zfs that referenced this pull request Nov 5, 2018