ZFS Performance with dedup and newer kernels #7769

Closed
yetao opened this issue Aug 7, 2018 · 4 comments

yetao commented Aug 7, 2018

System information

Type Version/Name
Distribution Name Debian
Distribution Version Jessie
Linux Kernel 4.1.30 & 4.4.127 & 4.14.59
Architecture amd64
ZFS Version 0.6.4.2 & 0.7.6 & 0.7.9
SPL Version 0.6.4.2 & 0.7.6 & 0.7.9

Describe the problem you're observing

I have performance issues that seem to be present only on recent kernels. I have backup servers running many concurrent write operations and using dedup (the servers have plenty of RAM: 512 GB, and I have tuned the ARC settings).
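
For context, that kind of ARC tuning is normally done through ZFS module parameters. A minimal sketch of such a configuration (the values below are purely illustrative, not the exact ones used on these servers):

# /etc/modprobe.d/zfs.conf -- illustrative values only, size for your RAM
options zfs zfs_arc_min=137438953472   # 128 GiB floor (hypothetical)
options zfs zfs_arc_max=274877906944   # 256 GiB cap (hypothetical)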

With kernel 4.1 and ZFS 0.6.4.2 everything worked fine, as it did with kernel 3.12 and the same ZFS version. Upgrading the kernel to 4.4.127 and ZFS to 0.7.9 slowed things down a lot: backup jobs didn't complete because of the minimal performance, and deletes were terribly slow.

We reverted to kernel 4.1 and ZFS 0.7.6 (the zpools were upgraded, so we can't go back to 0.6.4.2). Things are a lot better now: deletes are usable and performance is fine as long as we don't go above a certain limit of concurrent operations. But it's very odd that with this ZFS version, write speed with concurrent operations drops to a few KB/s above a certain limit. That was not happening with ZFS 0.6.4.2, where performance scaled well with the number of parallel operations.

Note that these problems only happen when using dedup. For non-dedup pools, everything is fine with any kernel or ZFS version.

I know I'm talking about a 'kernel problem', but ZFS performance is the only thing that seems to be affected. It's difficult to reproduce the workload of those servers and play with versions/options, and I cannot force the load when things are almost blocked, because they are production servers. Therefore I did some benchmarking in test VMs to compare each version and try to get metrics that could explain this behavior. I'm not sure if this is the root cause of the whole thing, but I have seen that IOPS drop significantly under the same conditions, just by changing the ZFS and kernel versions.

Describe how to reproduce the problem

Create a zpool using a disk device directly, then create a ZFS filesystem with dedup and compression (LZ4) enabled. Run some fio testing.
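
A minimal sketch of that setup (the pool name and device are illustrative; the dataset layout on the real servers differs, as the fio path below shows):

# Illustrative reproduction steps -- pool name and device are hypothetical
zpool create testpool /dev/sdb
zfs create -o dedup=on -o compression=lz4 testpool/TEST-DEDUP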

The command I'm running for benchmarking is the following:
fio --time_based --name=benchmark --size=2G --runtime=60 --filename=/mnt/airback/volumes/agg1/TEST-DEDUP/supertest.0 --ioengine=libaio --randrepeat=0 --iodepth=128 --direct=0 --invalidate=1 --verify=0 --verify_fatal=0 --numjobs=50 --rw=randwrite --blocksize=4k --group_reporting

These are some results:
Kernel 4.1.30 + ZFS 0.6.4.2: io=233288KB, aggrb=3727KB/s, minb=3727KB/s, maxb=3727KB/s, mint=62584msec, maxt=62584msec
Kernel 4.1.30 + ZFS 0.7.6: io=250112KB, aggrb=4128KB/s, minb=4128KB/s, maxb=4128KB/s, mint=60584msec, maxt=60584msec
Kernel 4.4.127 + ZFS 0.7.9: io=155344KB, aggrb=2566KB/s, minb=2566KB/s, maxb=2566KB/s, mint=60526msec, maxt=60526msec
Kernel 4.14.59 + ZFS 0.7.6: io=154232KB, aggrb=2565KB/s, minb=2565KB/s, maxb=2565KB/s, mint=60111msec, maxt=60111msec

As can be seen, kernel 4.1 delivers more IOPS and bandwidth in all cases than the other two kernels tested.

I tried some extra tuning (zfs_vdev_sync_write_max_active=64, zio_taskq_batch_pct=90, zfs_vdev_async_write_max_active=64, zfs_dirty_data_max_percent=20, zfs_txg_timeout=3, zvol_threads=64), but saw no improvement.
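
For anyone reproducing this: several of these tunables can be changed at runtime through /sys/module/zfs/parameters (others, such as zio_taskq_batch_pct and zvol_threads, generally only take effect when the module is loaded). A sketch with the values from the test above:

# Runtime tuning of ZFS module parameters (values as tried above)
echo 64 > /sys/module/zfs/parameters/zfs_vdev_sync_write_max_active
echo 64 > /sys/module/zfs/parameters/zfs_vdev_async_write_max_active
echo 3 > /sys/module/zfs/parameters/zfs_txg_timeout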

I can run any other test you suggest.

@behlendorf behlendorf added the Type: Performance Performance improvement or performance problem label Aug 14, 2018
Skaronator commented

Maybe it's the Meltdown/Spectre patches? 3.12 and 4.1 are EOL AFAIK, while 4.4 and 4.14 should have gotten the Meltdown/Spectre patches, which slow down I/O.

What's your CPU? It might be good to run the same test on AMD's Zen CPUs, since they aren't affected by Meltdown.

rincebrain commented Sep 4, 2018

If that's the case, you could try booting with pti=off spectre_v2=off nospec_store_bypass_disable and see if perf improves, though that's, uh, Unsafe in general, for obvious reasons.

Also, which hardware (CPU/chipset/etc) is this running on? AIUI the Spectre mitigations can be very expensive for some workloads, and how painful they are can depend on CPU family as well.
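
One way to confirm which mitigations the kernel actually applied (on kernels new enough to expose this) is to read the sysfs vulnerability status:

# Prints the mitigation state for each known CPU vulnerability
grep . /sys/devices/system/cpu/vulnerabilities/*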

yetao commented Sep 6, 2018

I was already testing with pti=off on kernels 4.4 and 4.14, though admittedly not with the other options @rincebrain suggested. Sadly, it seems there is no appreciable difference with or without these options for me.

In fresh tests I got this:
Kernel 4.14.59 + ZFS 0.7.6 + pti=off: io=81432KB, aggrb=1339KB/s, minb=1339KB/s, maxb=1339KB/s, mint=60798msec, maxt=60798msec
Kernel 4.14.59 + ZFS 0.7.6 + pti=off spectre_v2=off nospec_store_bypass_disable: io=83848KB, aggrb=1393KB/s, minb=1393KB/s, maxb=1393KB/s, mint=60189msec, maxt=60189msec

Kernel 4.4.127 + ZFS 0.7.9 + pti=off: WRITE: io=150592KB, aggrb=2503KB/s, minb=2503KB/s, maxb=2503KB/s, mint=60159msec, maxt=60159msec
Kernel 4.4.127 + ZFS 0.7.9 + pti=off spectre_v2=off nospec_store_bypass_disable: WRITE: io=141872KB, aggrb=2309KB/s, minb=2309KB/s, maxb=2309KB/s, mint=61428msec, maxt=61428msec

Remember these tests are run in test VMs, so the absolute values are not very meaningful, since the underlying server may be under varying load at any given moment. The relative values between tests are relevant, though: conditions are the same within each set of tests.

The CPU of the VM server is an Intel(R) Core(TM) i7-6700K @ 4.00GHz.

However, the production server where the original problem happened (and which forced us to downgrade the kernel to 4.1.x) has an Intel(R) Xeon(R) CPU E5-2603 0 @ 1.80GHz. I have no AMD hardware to test with.

An important point: I'm not sure if this difference in IOPS/bandwidth is responsible for the original problem with delete performance. It could be, but I'm not really sure.

About the other point I mentioned, the performance drop in ZFS 0.7.6/0.7.9 when concurrent operations go above a certain limit: setting zfs_arc_max = zfs_arc_min and zfs_abd_scatter_enabled=0 has improved the situation, though I'm not sure if definitively. I haven't tested it on the original server yet.
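
For completeness, a minimal sketch of making those two settings persistent (the ARC size below is hypothetical; pin zfs_arc_min and zfs_arc_max to the same value sized for the machine):

# /etc/modprobe.d/zfs.conf -- ARC size is hypothetical
options zfs zfs_arc_min=274877906944
options zfs zfs_arc_max=274877906944
options zfs zfs_abd_scatter_enabled=0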

It seems they are different problems:

  • Slow deletes and slow bandwidth/IOPS (not sure if related) on kernels > 4.1 (ZFS 0.7.6 or 0.7.9).
  • Performance drop in writes above a certain load (all kernels, ZFS 0.7.6 or 0.7.9).

I just wanted to clarify that.

About #5182, it's new to me; I'll take a look at it, thanks.

stale bot commented Aug 25, 2020

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale No recent activity for issue label Aug 25, 2020
@stale stale bot closed this as completed Nov 24, 2020