Huge performance drop (30%~60%) after upgrading to 0.7.9 from 0.6.5.11 #7834
The use of scatter/gather lists for the ARC rather than chopping up vmalloc()'d blocks does incur a performance hit, but this seems a bit much...
@DeHackEd is there any module param or compile-time define I can set in order to disable s/g on 0.7.9 and redo benchmarks?
PS: I forgot to add, I've tried 0.7.9 with kernel 4.4.152 (from elrepo), but results are even a little bit worse (~5% slower) than 0.7.9 with Red Hat's stock kernel.
@pruiz you can set zfs_abd_scatter_enabled=0 to disable scatter/gather ABDs. We've also done some work in the master branch to improve performance. If you're comfortable building ZFS from source it would be interesting to see how the master branch compares on your hardware.
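For anyone else who wants to try this, a minimal sketch of setting that module parameter (assuming it is exposed writable under /sys/module/zfs/parameters on your build):

```sh
# Disable scatter/gather ABDs at runtime (affects newly allocated ARC buffers)
echo 0 > /sys/module/zfs/parameters/zfs_abd_scatter_enabled

# Make the setting persistent across module reloads and reboots
echo "options zfs zfs_abd_scatter_enabled=0" >> /etc/modprobe.d/zfs.conf
```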
Hi @behlendorf, here are some preliminary results with 0.7.9 + zfs_abd_scatter_enabled=0 (same zpool & dataset settings as previously):
And here are some preliminary results with 0.7.9 + zfs_compressed_arc_enabled=0 (same zpool & dataset settings as in my original testing):
And results from 0.7.9 with zfs_abd_scatter_enabled=0 + zfs_compressed_arc_enabled=0:
I'll try master tomorrow and report here.
Well, I've built zfs from master (v0.7.0-1533_g47ab01a), and initial testing does not look promising :(
Possibly the slowdown with the 0.7.x version is somewhere in the codepath taken because of logbias=throughput on the dataset? Asking as I'm running with logbias=latency, and I vaguely remember having benchmarked 0.7 as faster than the 0.6 series I upgraded a certain system from a while ago...
I did some tests with logbias=latency with similar results. But I will repeat them and post here.
One thing I didn't originally notice from your first post is that you're testing with a small recordsize, which is a less common configuration than the larger default recordsizes.
@behlendorf yeah, our intended use in this case is for a db server, so 8k or 16k should be the optimal recordsize. Probably not as common as bigger recordsizes, as you stated. Anyway, I would be more than happy to test other configurations/options if you guys need it.
@pruiz, great job!
Tests with recordsize=4k, logbias=throughput (fio, using randrw 70/30, as usual, with both 1G & 16G test files):
Notes:
I will try to add test results against v0.7.9 if I find some spare time tonight, as I would love to know whether those 4k/SYNC IOPS results of v0.7-master are reproducible with v0.7.9 too.
Using the zfs-tests performance regression suite, I'm seeing similar regressions for cached reads, random reads, and random writes. I'll start bisecting commits between the 0.6.5.11 and 0.7.0 tags.
I would like to add some details here. I have used ZFS on FreeBSD for almost 10 years, and it has always had decent ZFS performance. But I have a newer build with only SSDs and an Optane 900p as SLOG, and the sync write performance is really bad. I've compared with different Linux distributions and other filesystems. The tool I use to test sync write performance is pg_test_fsync. Here is the performance on my FreeBSD server with 3x raidz vdevs of 5x 5400RPM spinners (15 disks total) and a 32GB Optane:
With 6x striped 800GB enterprise-class SSDs, an Optane 900p as SLOG, and ZFS on Linux 0.8.0-rc1 on Ubuntu 18.04:
For comparison, exact same hardware, default settings, ZoL 0.7.9, benchmarked with pg_test_fsync:
If there is anything I can help with, please ask. I now know how to build from source 8-)
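For anyone who wants to reproduce these numbers, pg_test_fsync ships with PostgreSQL; a minimal sketch (the file path is a placeholder, point it at the dataset under test):

```sh
# Run each fsync method for 5 seconds against a file on the ZFS dataset
pg_test_fsync -s 5 -f /tank/pgdata/pg_test_fsync.out
```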
What are the settings on the zfs filesystem (the default recordsize=128k would explain quite a lot)?
In my case it is the default, which is a dynamic record size, which means that when pg_test_fsync writes, ZFS will write 8kB blocks.
ZFS will maintain files in a filesystem in $recordsize-sized blocks on-disk. In case you want zfs to write 8k on-disk blocks, set recordsize=8k on the dataset.
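For example (using the reporter's dataset name; substitute your own):

```sh
zfs set recordsize=8k DATA/db-data
```

Note that recordsize only affects newly written data; existing files keep their current block size until they are rewritten.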
The recordsize is dynamic: it writes 8kB even if the recordsize is higher. As this is easy to test, I can confirm that I get the exact same numbers. Also, iostat says the SLOG, which is an Optane 900p, is writing around 20-30MB/s.
IIRC recordsize is dynamic for files with size < recordsize, or when compression is enabled.
Are you sure you're not being impacted by the write throttle? It can be tuned; see https://github.com/zfsonlinux/zfs/wiki/ZFS-Transaction-Delay
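A rough sketch of checking and loosening the throttle for a test run (these are the stock module parameter names; the example value is arbitrary):

```sh
# Inspect the current write-throttle tunables
grep . /sys/module/zfs/parameters/zfs_dirty_data_max \
       /sys/module/zfs/parameters/zfs_delay_min_dirty_percent \
       /sys/module/zfs/parameters/zfs_delay_scale

# Example: raise the dirty data ceiling to 4 GiB so delays kick in later
echo $((4 * 1024 * 1024 * 1024)) > /sys/module/zfs/parameters/zfs_dirty_data_max
```

If the numbers improve noticeably with a looser throttle, the regression is more likely in the transaction delay logic than in the I/O path itself.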
Are you able to bisect this at all, even just using released versions? As a starting point, is 0.7.0 good like 0.6.5.11 or bad like 0.7.9?
Are you running stock settings, or what was tweaked here?
More data would help too, if you can provide it. From what I can tell above, 0.7 may have lower bandwidth and I/Os, but it has quite a bit lower latency.
One commit that comes to mind for anyone able to bisect is 1ce23dc. It will [EDIT] increase [/EDIT] latency for single-threaded synchronous workloads such as pg_test_fsync.
In master, https://github.com/zfsonlinux/zfs/pull/7736/commits reduced taskq context switching and thus solved the above issue. @pruiz Would you be able to test with master or 0.8 code to verify?
Additionally, I noticed two things:
Pre write throttle numbers:
PR 7736 numbers:
So we may still have some regressions. I'm looking at #2 now. Does it make sense to open new issue(s)?
I'd also like to mention that disabling dynamic taskqs (spl_taskq_thread_dynamic=0) is worth testing as well.
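If anyone wants to test that, a sketch (spl_taskq_thread_dynamic is an SPL module option and, as far as I know, read-only at runtime, so it has to be set at module load):

```sh
echo "options spl spl_taskq_thread_dynamic=0" >> /etc/modprobe.d/spl.conf
# Reload the spl/zfs modules (or reboot) for the change to take effect
```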
Thanks!
I used fio to test the performance of a zfs-0.7.11 zvol: the write amplification is more than 6x, and this seriously affects zvol performance.
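For anyone trying to reproduce this, one rough way to estimate write amplification on a zvol (the zvol path and fio parameters here are illustrative, not the original test):

```sh
# Write a known volume of data to the zvol with fio...
fio --name=wamp --filename=/dev/zvol/tank/vol1 --rw=randwrite \
    --bs=8k --size=4G --ioengine=libaio --iodepth=16 --direct=1

# ...while watching how much the pool actually writes to its disks;
# (bytes written per zpool iostat) / (bytes written by fio) approximates
# the write amplification factor.
zpool iostat -v tank 1
```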
@kpande I don't quite understand what you mean. Can you elaborate more?
You are handling all ZIL writes via indirect sync (logbias=throughput). This will trash your ability to aggregate read I/O over time due to data/metadata fragmentation, and will even greatly reduce your ability to aggregate between one data block and another. Any outstanding async write in the same sync domain may suffer as well. I understand the desire for throughput, but here it is coming at the expense of the pool data at large. In the real world, you would seldom set up a dataset like this unless read performance was totally unimportant. If you will read from a block at least once, it's worth doing direct sync. If you test with logbias=latency, you need to either add a SLOG or increase zfs_immediate_write_sz. I'd recommend doing a ZFS send while you watch zpool iostat -r. With 16k indirect writes you should have some absolutely amazing unaggregatable fragmentation.
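Concretely, something along these lines (pool, dataset, and snapshot names are placeholders):

```sh
# Terminal 1: stream a snapshot to /dev/null to force large sequential reads
zfs snapshot DATA/db-data@fragtest
zfs send DATA/db-data@fragtest > /dev/null

# Terminal 2: watch the request-size histograms; a pile of small, unaggregated
# reads here means the indirect sync writes have fragmented the data
zpool iostat -r DATA 1
```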
Another note - it looks like you are suffering reads even on full-block writes. This should help greatly with that:
We have encountered a very similar issue, in the form of a significant performance drop between zfs 0.6.5.9 and 0.7.11. We are able to overcome the issue by setting zfs_abd_scatter_enabled=0 & zfs_compressed_arc_enabled=0. We are using Debian Stretch (version 9.8) and Linux kernel 4.9.0-8-amd64. Our recordsize is 128K and I don't think we would be able to decrease it.
I too have seen cases where ABD scatter/gather isn't as performant, so I can believe that. I don't believe disabling compressed ARC will make much difference, though perhaps it does on some systems.
I tried enabling zfs_compressed_arc_enabled and zfs_abd_scatter_enabled in separate tests.
@pauful What compression algorithm are you using on your datasets? lz4 is very fast to decompress but gzip would certainly cause issues. |
@jwittlincohen lz4 is the compression option used in our pools.
@pruiz Have you tried current master? Some performance oriented commits have been applied so far.
@matveevandrey not yet, but I have it on my TODO list.
Has anybody done performance tests on 0.8.2 and would like to share?
@pruiz could you do the same test on 0.8.2 with the hardware you mentioned earlier?
Right now I have quite limited time; maybe in a week or two.
@pruiz could you do the same test on 0.8.3 with the hardware you mentioned earlier?
Just a lurker on this bug here, as I saw a similar drop on systems here back in 2018 doing the same transition from 0.6.5 to 0.7.x: performance decreased to about 1/5th-1/6th when running v0.7, so I had to fall back. I've been testing randomly over the last two years with newer versions against the same array, but still in the 0.7 line, with no luck. This last week I tried 0.8.3 and performance is back, comparable with 0.6.5. This is on one of my larger dev/qa systems and I will watch it closely for the next month before upgrading the other systems. So 0.8.3 looks promising. Just wanted to bump this and see if pruiz could validate whether this also alleviates his original problem.
@stevecs try 0.8.5; that version delivers more I/Os.
@interduo I'll see if I can get another window but it will probably be a couple weeks. I did a quick look at the commit deltas between 0.8.3 and 0.8.5 but didn't see much to catch my eye for I/O improvements (though I did spot a couple of other commits that were interesting). Can you give me a hint as to which commits you think may be relevant?
I just jumped from 0.8.2 to 0.8.5 with a nice surprise on the I/O graphs. I didn't look at commits.
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
System information
Describe the problem you're observing
I've found a huge performance drop between zfs 0.6.5.11 and 0.7.9 with the following system/setup:
On that system I've created the following RAID10 zpool:
And the following datasets:
All of it created using the following commands:
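Roughly along these lines (a sketch, not the verbatim commands; device names are placeholders and the dataset properties are the ones discussed later in this thread):

```sh
# RAID10-style pool: a stripe of two mirrors
zpool create DATA \
    mirror /dev/disk/by-id/diskA /dev/disk/by-id/diskB \
    mirror /dev/disk/by-id/diskC /dev/disk/by-id/diskD

# Database dataset with a small recordsize and throughput-biased ZIL
zfs create -o recordsize=8k -o logbias=throughput DATA/db-data
```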
While benchmarking (using fio among other tools) against the DATA/db-data dataset, I've found quite a huge performance difference between versions 0.6.5.11 and 0.7.9 of zfs/spl, as can be seen next.
As can be seen, performance drops in both IOPS and bandwidth for all use cases. Examples:

IOPS-intensive workload (~30% difference):
- 0.6.5.11 => 4k,1,1,SYNC => 6526/2775 - 25.5MB/10.8MB
- 0.7.9 => 4k,1,1,SYNC => 4236/1821 - 16.5MB/7.2MB

Bandwidth-intensive workload (~60% difference):
- 0.6.5.11 => 256k,16,16,NOSYNC => 8480/3637 - 2120MB/909MB
- 0.7.9 => 256k,16,16,NOSYNC => 3844/1654 - 889MB/404MB
Tests have been performed using the following commands, averaging 3 repetitions for each case.
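A sketch of the kind of fio invocation behind labels like "4k,1,1,SYNC" (randrw 70/30 as noted above; the exact flags and the mountpoint are assumptions, not the verbatim commands):

```sh
# 4k blocksize, 1 job, queue depth 1, synchronous writes, 1G test file
fio --name=randrw-4k-sync --directory=/DATA/db-data \
    --rw=randrw --rwmixread=70 --bs=4k --size=1G \
    --numjobs=1 --iodepth=1 --ioengine=libaio --sync=1 \
    --group_reporting
```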
Notes: