
Huge performance drop (30%~60%) after upgrading to 0.7.9 from 0.6.5.11 #7834

Closed
pruiz opened this issue Aug 27, 2018 · 61 comments

Labels
Status: Stale No recent activity for issue Type: Performance Performance improvement or performance problem

Comments

@pruiz

pruiz commented Aug 27, 2018

System information

Type Version/Name
Distribution Name CentOS
Distribution Version 7.5
Linux Kernel 3.10.0-862.9.1.el7.x86_64
Architecture x86_64
ZFS Version 0.6.5.11 => 0.7.9
SPL Version 0.6.5.11 => 0.7.9

Describe the problem you're observing

I've found a huge performance drop between zfs 0.6.5.11 and 0.7.9 with the following system/setup:

  • System Board: SuperMicro X8DTS
  • CPU: 2x Intel X5687 (8c) @3.60GHz
  • RAM: 32GB DDR3
  • Controller: LSI SAS2 2008 (IT firmware)
  • Disks: 12 x HGST HUSMR3280ASS201 (800GB, SAS/SSD)

On this system I've created the following RAID10 zpool:

  pool: DATA
 state: ONLINE
  scan: none requested
config:

        NAME                              STATE     READ WRITE CKSUM
        DATA                              ONLINE       0     0     0
          mirror-0                        ONLINE       0     0     0
            wwn-0x5000cca09f004a1c-part1  ONLINE       0     0     0
            wwn-0x5000cca09f004c14-part1  ONLINE       0     0     0
          mirror-1                        ONLINE       0     0     0
            wwn-0x5000cca09f00500c-part1  ONLINE       0     0     0
            wwn-0x5000cca09f005214-part1  ONLINE       0     0     0
          mirror-2                        ONLINE       0     0     0
            wwn-0x5000cca09f0052a8-part1  ONLINE       0     0     0
            wwn-0x5000cca09f005318-part1  ONLINE       0     0     0
          mirror-3                        ONLINE       0     0     0
            wwn-0x5000cca09f005700-part1  ONLINE       0     0     0
            wwn-0x5000cca09f005960-part1  ONLINE       0     0     0
          mirror-4                        ONLINE       0     0     0
            wwn-0x5000cca09f006178-part1  ONLINE       0     0     0
            wwn-0x5000cca09f00640c-part1  ONLINE       0     0     0
          mirror-5                        ONLINE       0     0     0
            wwn-0x5000cca09f00642c-part1  ONLINE       0     0     0
            wwn-0x5000cca09f006530-part1  ONLINE       0     0     0

And the following datasets:

NAME           USED  AVAIL  REFER  MOUNTPOINT
DATA          2.05G  4.22T    96K  none
DATA/db-data  1.01G  4.22T  1.01G  legacy

All of it created using the following commands:

zpool create -o ashift=12   DATA \
  mirror wwn-0x5000cca09f004a1c-part1 wwn-0x5000cca09f004c14-part1 \
  mirror wwn-0x5000cca09f00500c-part1 wwn-0x5000cca09f005214-part1 \
  mirror wwn-0x5000cca09f0052a8-part1 wwn-0x5000cca09f005318-part1 \
  mirror wwn-0x5000cca09f005700-part1 wwn-0x5000cca09f005960-part1 \
  mirror wwn-0x5000cca09f006178-part1 wwn-0x5000cca09f00640c-part1  \
  mirror wwn-0x5000cca09f00642c-part1 wwn-0x5000cca09f006530-part1

zfs set compression=lz4 DATA
zfs set mountpoint=none DATA
zfs create -o mountpoint=/mnt/db-data -o xattr=sa -o acltype=off -o atime=off -o relatime=off -o logbias=throughput -o recordsize=16K -o compression=lz4 DATA/db-data
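
A quick way to double-check the resulting layout (a verification sketch; these are standard zpool/zfs property queries, output will of course vary):

zdb -C DATA | grep ashift
zpool get ashift DATA
zfs get recordsize,compression,logbias,atime,xattr DATA/db-data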

While benchmarking (using fio, among other tools) against the DATA/db-data dataset, I've found quite a large performance difference between versions 0.6.5.11 and 0.7.9 of zfs/spl, as can be seen below.

  • Performance results using v0.6.5.11:
FILESIZE BS IODEPTH THREADS WMODE IOP/s (R) IOP/s (W) BW (R) BW (W)
1G 4K 1 1 SYNC 6526 2775 25.5MB 10.8MB
1G 8K 1 1 SYNC 6494 2783 50.7MB 21.7MB
1G 256K 1 1 SYNC 2267 974 567MB 244MB
1G 1M 1 1 SYNC 754 320 755MB 321MB
1G 4K 16 16 SYNC 21300 9151 83.2MB 35.7MB
1G 8K 16 16 SYNC 21500 9243 168MB 72.2MB
1G 256K 16 16 SYNC 3819 1638 955MB 410MB
1G 1M 16 16 SYNC 1303 560 1304MB 560MB
1G 16K 128 128 SYNC 126000 53900 1965MB 842MB
16G 4K 1 1 NOSYNC 17600 7566 68.9MB 29.6MB
16G 8K 1 1 NOSYNC 18300 7819 143MB 61.1MB
16G 256K 1 1 NOSYNC 3773 1623 936MB 403MB
16G 1M 1 1 NOSYNC 1178 502 1179MB 502MB
16G 4K 16 16 NOSYNC 125000 53400 487MB 209MB
16G 8K 16 16 NOSYNC 114000 48700 888MB 381MB
16G 256K 16 16 NOSYNC 8480 3637 2120MB 909MB
16G 1M 16 16 NOSYNC 2189 939 2189MB 940MB
  • Performance results using v0.7.9:
FILESIZE BS IODEPTH THREADS WMODE IOP/s (R) IOP/s (W) BW (R) BW (W)
1G 4K 1 1 SYNC 4236 1821 16.5MB 7.2MB
1G 8K 1 1 SYNC 4137 1764 32.3MB 16.8MB
1G 256K 1 1 SYNC 1413 578 353MB 145MB
1G 1M 1 1 SYNC 471 179 471MB 179MB
1G 4k 16 16 SYNC 18200 7791 70.9MB 30.4MB
1G 16k 16 16 SYNC 16000 7257 265MB 113MB
1G 256k 16 16 SYNC 3476 1498 869MB 375MB
1G 1M 16 16 SYNC 1095 467 1096MB 468MB
1G 16k 128 128 SYNC 43600 18600 681MB 291MB
16G 4K 1 1 SYNC 1842 801 7.4MB 3.2MB
16G 8K 1 1 SYNC 1838 791 14.5MB 6.4MB
16G 256K 1 1 SYNC 783 337 196MB 84.4MB
16G 1M 1 1 SYNC 255 108 255MB 109MB
1G 4K 1 1 NOSYNC 34200 14200 134MB 57.5MB
1G 8K 1 1 NOSYNC 33200 14200 259MB 111MB
1G 16K 1 1 NOSYNC 34300 14700 536MB 229MB
1G 32K 1 1 NOSYNC 17700 7612 552MB 238MB
1G 64K 1 1 NOSYNC 10100 4519 629MB 282MB
1G 128K 1 1 NOSYNC 5824 2383 728MB 298MB
1G 256K 1 1 NOSYNC 2847 1221 718MB 305MB
1G 1M 1 1 NOSYNC 756 322 756MB 322MB
1G 4k 16 16 NOSYNC 89800 38500 351MB 150MB
1G 8k 16 16 NOSYNC 88100 37800 688MB 295MB
1G 16k 16 16 NOSYNC 84300 36100 1317MB 565MB
1G 32k 16 16 NOSYNC 42900 18400 1342MB 576MB
1G 64k 16 16 NOSYNC 20900 8964 1305MB 560MB
16G 4K 1 1 NOSYNC 3033 1280 12.2MB 5.1MB
16G 8K 1 1 NOSYNC 2996 1292 23.8MB 10.3MB
16G 16K 1 1 NOSYNC 3976 1707 62.1MB 26.7MB
16G 32K 1 1 NOSYNC 3128 1342 97.8MB 41MB
16G 64K 1 1 NOSYNC 2462 1068 154MB 66.8MB
16G 128K 1 1 NOSYNC 1673 719 209MB 89.9MB
16G 256K 1 1 NOSYNC 750 322 186MB 80.1MB
16G 1M 1 1 NOSYNC 308 134 309MB 135MB
16G 4k 16 16 NOSYNC 24200 10400 94.4MB 40.5MB
16G 8k 16 16 NOSYNC 24400 10400 190MB 81.5MB
16G 256k 16 16 NOSYNC 3844 1654 889MB 404MB
16G 1M 16 16 NOSYNC 1017 433 1017MB 434MB

As can be seen, performance drops in both IOPS and bandwidth across all use cases. Examples:

  • IOPs intensive workload (~30% difference):
    ** 0.6.5.11 => 4k,1,1,SYNC => 6526/2775 - 25.5MB/10.8MB
    ** 0.7.9 => 4k,1,1,SYNC => 4236/1821 - 16.5MB/7.2MB

  • BW intensive workload (~60% difference):
    ** 0.6.5.11 => 256k,16,16,NOSYNC => 8480/3637 - 2120MB/909MB
    ** 0.7.9 => 256k,16,16,NOSYNC => 3844/1654 - 889MB/404MB

Tests have been performed using the following commands, using average from 3 repetitions on each case.

rm -f kk
echo 3 > /proc/sys/vm/drop_caches
sleep 30
fio --filename=kk \
  -name=test --group_reporting --fallocate=none --ioengine=libaio \
  --rw=randrw --rwmixread=70 --refill_buffers --norandommap --randrepeat=0 --runtime=60 \
  --iodepth=$IODEPTH --numjobs=$THREADS \
  --direct=0 --sync=$WMODE --size=$FILESIZE --bs=$BS --time_based
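
For reference, a hypothetical wrapper for driving the matrix above (not my actual script; it assumes SYNC maps to fio's --sync=1 and NOSYNC to --sync=0):

FILESIZE=1G
for WMODE in 1 0; do
  for BS in 4k 8k 256k 1M; do
    for JOBS in "1 1" "16 16"; do
      set -- $JOBS; IODEPTH=$1; THREADS=$2
      rm -f kk; echo 3 > /proc/sys/vm/drop_caches; sleep 30
      fio --filename=kk --name=test --group_reporting --fallocate=none --ioengine=libaio \
          --rw=randrw --rwmixread=70 --refill_buffers --norandommap --randrepeat=0 --runtime=60 \
          --iodepth=$IODEPTH --numjobs=$THREADS \
          --direct=0 --sync=$WMODE --size=$FILESIZE --bs=$BS --time_based
    done
  done
done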

NOTEs:

  • The zpool and datasets have been re-created from scratch after swapping zfs versions.
  • When upgrading/downgrading zfs/spl kernel modules, userland utilities were upgraded/downgraded too.
  • I've tested using whole-disks instead of partitions, with same results.
  • Tuning zfs module parameters on v0.7.9 (like zfs_vdev_*, etc.) makes no observable difference.
  • I've tried 0.7.9 with kernel 4.4.152 (from elrepo), but results are even a little bit worse (~5% slower) than 0.7.9 with redhat's stock kernel.
@pruiz pruiz changed the title Huge performance drop after upgrading to 0.7.9 from 0.6.5.11 Huge performance drop (30%~60%) after upgrading to 0.7.9 from 0.6.5.11 Aug 27, 2018
@DeHackEd
Contributor

The use of scatter/gather lists for the ARC rather than chopping up vmalloc()'d blocks does incur a performance hit, but this seems a bit much...

@pruiz
Author

pruiz commented Aug 27, 2018

@DeHackEd is there any module param or compile-time define I can set in order to disable s/g on 0.7.9 and redo benchmarks?

@pruiz
Author

pruiz commented Aug 27, 2018

PS: I forgot to add that I've tried 0.7.9 with kernel 4.4.152 (from elrepo), but results are even a little bit worse (~5% slower) than 0.7.9 with Red Hat's stock kernel.

@loli10K loli10K added the Type: Performance Performance improvement or performance problem label Aug 27, 2018
@behlendorf
Contributor

@pruiz you can set zfs_abd_scatter_enabled=0 to force ZFS to use the 0.6.5 allocation strategy and not use scatter/gather lists. You could also try setting zfs_compressed_arc_enabled=0 to disable keeping data compressed in the ARC. Both of these options will increase ZFS's memory footprint and cpu usage, but may improve performance for your test workload. I'd be interested to see your results.

We've also done some work in the master branch to improve performance. If you're comfortable building ZFS from source, it would be interesting to see how the master branch compares on your hardware.
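
For anyone wanting to try those, the tunables can be flipped through sysfs at runtime or made persistent with a modprobe option (a sketch; reloading the modules, or at least re-creating the test file, gives the cleanest comparison):

echo 0 > /sys/module/zfs/parameters/zfs_abd_scatter_enabled
echo 0 > /sys/module/zfs/parameters/zfs_compressed_arc_enabled

# persistent form, applied at module load time
echo "options zfs zfs_abd_scatter_enabled=0 zfs_compressed_arc_enabled=0" > /etc/modprobe.d/zfs.conf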

@pruiz
Author

pruiz commented Aug 27, 2018

Hi @behlendorf,

Here are some preliminary results with 0.7.9 + zfs_abd_scatter_enabled=0 (same zpool & data set settings as previously):

FILESIZE BS IODEPTH THREADS WMODE IOP/s (R) IOP/s (W) BW (R) BW (W)
1G 4K 1 1 SYNC 5582 2399 21.8MB 9.6MB
1G 8K 1 1 SYNC 5155 2209 40.3MB 17.3MB
1G 256K 1 1 SYNC 2197 948 549MB 237MB
1G 1M 1 1 SYNC 687 294 687MB 295MB
1G 4k 16 16 SYNC 22300 9526 86.9MB 37.2MB
1G 8k 16 16 SYNC 22200 9513 174MB 74.3MB
1G 256k 16 16 SYNC 3842 1631 961MB 408MB
1G 1M 16 16 SYNC 1248 528 1248MB 528MB
16G 4K 1 1 NOSYNC 3349 1427 13.1MB 5.7MB
16G 8K 1 1 NOSYNC 3360 1452 26.3MB 11.3MB
16G 256K 1 1 NOSYNC 1674 726 419MB 182MB
16G 1M 1 1 NOSYNC 499 214 499MB 214MB
16G 4K 16 16 NOSYNC 25600 10900 99.8MB 42.7MB
16G 8K 16 16 NOSYNC 25400 10900 199MB 85.1MB
16G 256K 16 16 NOSYNC 4347 1859 1087MB 465MB
16G 1M 16 16 NOSYNC 1183 505 1184MB 506MB
  • Compared to other tests:
  1. IOPs intensive workload:
    ** 0.6.5.11 => 4k,1,1,SYNC => 6526/2775 - 25.5MB/10.8MB
    ** 0.7.9 => 4k,1,1,SYNC => 4236/1821 - 16.5MB/7.2MB
    ** 0.7.9+scatter=0 => 4k,1,1,SYNC => 5582/2399 - 21.8MB/9.6MB

Results: ~15% increase from plain 0.7.9, still lagging behind 0.6.5.11 (by another ~15%)

  2. BW intensive workload:
    ** 0.6.5.11 => 256k,16,16,NOSYNC => 8480/3637 - 2120MB/909MB
    ** 0.7.9 => 256k,16,16,NOSYNC => 3844/1654 - 889MB/404MB
    ** 0.7.9+scatter=0 => 256k,16,16,NOSYNC => 4347/1859 - 1087MB/465MB

Results: ~20% increase from plain 0.7.9, still lagging behind 0.6.5.11 (by ~50%)

@pruiz
Author

pruiz commented Aug 27, 2018

And here are some preliminary results with 0.7.9 + zfs_compressed_arc_enabled=0 (same zpool & data set settings as in my original testing):

FILESIZE BS IODEPTH THREADS WMODE IOP/s (R) IOP/s (W) BW (R) BW (W)
1G 4K 1 1 SYNC 4940 2122 19.3MB 8.4MB
1G 8K 1 1 SYNC 4801 2059 37.5MB 16.1MB
1G 256K 1 1 SYNC 1969 837 492MB 209MB
1G 1M 1 1 SYNC 588 256 588MB 257MB
1G 4K 16 16 SYNC 21700 9299 84.9MB 36.3MB
1G 8K 16 16 SYNC 21500 9238 168MB 72.2MB
1G 256K 16 16 SYNC 3592 1545 898MB 386MB
1G 1M 16 16 SYNC 1086 465 1086MB 466MB
16G 4k 1 1 NOSYNC 3222 1387 12.6MB 5.5MB
16G 8k 1 1 NOSYNC 3233 1381 25.3MB 10.8MB
16G 256k 1 1 NOSYNC 1524 653 381MB 163MB
16G 1M 1 1 NOSYNC 442 192 443MB 192MB
16G 4k 16 16 NOSYNC 23900 10200 93.4MB 40MB
16G 8k 16 16 NOSYNC 23500 10100 184MB 78.7MB
16G 256k 16 16 NOSYNC 3826 1637 957MB 409MB
16G 1M 16 16 NOSYNC 1023 441 1023MB 442MB
  • Compared to other tests:
  1. IOPs intensive workload:
    ** 0.6.5.11 => 4k,1,1,SYNC => 6526/2775 - 25.5MB/10.8MB
    ** 0.7.9 => 4k,1,1,SYNC => 4236/1821 - 16.5MB/7.2MB
    ** 0.7.9+scatter=0 => 4k,1,1,SYNC => 5582/2399 - 21.8MB/9.6MB
    ** 0.7.9+comp_arc=0 => 4k,1,1,SYNC => 4940/2122 - 19.3MB/8.4MB

Results: ~10% increase from plain 0.7.9, still lagging behind 0.6.5.11

  2. BW intensive workload:
    ** 0.6.5.11 => 256k,16,16,NOSYNC => 8480/3637 - 2120MB/909MB
    ** 0.7.9 => 256k,16,16,NOSYNC => 3844/1654 - 889MB/404MB
    ** 0.7.9+scatter=0 => 256k,16,16,NOSYNC => 4347/1859 - 1087MB/465MB
    ** 0.7.9+comp_arc=0 => 256k,16,16,NOSYNC => 3826/1637 - 957MB/409MB

Results: roughly on par with plain 0.7.9 (and ~10% below 0.7.9+scatter=0).

@pruiz
Author

pruiz commented Aug 28, 2018

And results from 0.7.9 with zfs_abd_scatter_enabled=0 + zfs_compressed_arc_enabled=0:

FILESIZE BS IODEPTH THREADS WMODE IOP/s (R) IOP/s (W) BW (R) BW (W)
1G 4K 1 1 SYNC 5277 2271 20.6MB 9MB
1G 8K 1 1 SYNC 5224 2226 40.8MB 17.4MB
1G 256K 1 1 SYNC 2171 935 543MB 234MB
1G 1M 1 1 SYNC 672 289 673MB 290MB
1G 4K 16 16 SYNC 22500 9659 88MB 37.7MB
1G 8K 16 16 SYNC 22400 9599 175MB 74MB
1G 256K 16 16 SYNC 3813 1631 953MB 408MB
1G 1M 16 16 SYNC 1232 530 1232MB 531MB
16G 4k 1 1 NOSYNC 3366 1442 13.2MB 5.7MB
16G 8k 1 1 NOSYNC 3346 1440 26.1MB 11.2MB
16G 256k 1 1 NOSYNC 1748 751 437MB 188MB
16G 1M 1 1 NOSYNC 507 217 508MB 218MB
16G 4k 16 16 NOSYNC 25700 11000 101MB 43.1MB
16G 8k 16 16 NOSYNC 25800 11000 202MB 86.3MB
16G 256k 16 16 NOSYNC 4514 1930 1129MB 483MB
16G 1M 16 16 NOSYNC 1205 513 1206MB 514MB
  • Compared to other tests:
  1. IOPs intensive workload:
    ** 0.6.5.11 => 4k,1,1,SYNC => 6526/2775 - 25.5MB/10.8MB
    ** 0.7.9 => 4k,1,1,SYNC => 4236/1821 - 16.5MB/7.2MB
    ** 0.7.9+scatter=0 => 4k,1,1,SYNC => 5582/2399 - 21.8MB/9.6MB
    ** 0.7.9+comp_arc=0 => 4k,1,1,SYNC => 4940/2122 - 19.3MB/8.4MB
    ** 0.7.9+scatter=0+comp_arc=0 => 4k,1,1,SYNC => 5277/2271 - 20.6MB/9MB

Results: nearly the same performance as with scatter=0 alone (still well above plain 0.7.9)...

  2. BW intensive workload:
    ** 0.6.5.11 => 256k,16,16,NOSYNC => 8480/3637 - 2120MB/909MB
    ** 0.7.9 => 256k,16,16,NOSYNC => 3844/1654 - 889MB/404MB
    ** 0.7.9+scatter=0 => 256k,16,16,NOSYNC => 4347/1859 - 1087MB/465MB
    ** 0.7.9+comp_arc=0 => 256k,16,16,NOSYNC => 3826/1637 - 957MB/409MB
    ** 0.7.9+scatter=0+comp_arc=0 => 256k,16,16,NOSYNC => 4514/1930 - 1129MB/483MB

Results: ~15% increase over plain 0.7.9.

@pruiz
Author

pruiz commented Aug 28, 2018

I'll try master tomorrow and report here..

@pruiz
Author

pruiz commented Aug 28, 2018

Well, I've built zfs from master (v0.7.0-1533_g47ab01a), and initial testing does not look promising :(

FILESIZE BS IODEPTH THREADS WMODE IOP/s (R) IOP/s (W) BW (R) BW (W)
1G 4K 1 1 SYNC 4426 1909 17.3MB 7.6MB
1G 8K 1 1 SYNC 4348 1869 33MB 14.6MB
1G 256K 1 1 SYNC 1840 794 460MB 199MB
1G 1M 1 1 SYNC 597 259 597MB 260MB
..

@GregorKopka
Contributor

Possibly the slowdown with the 0.7.x version is somewhere in the code path taken because of logbias=throughput on the dataset? Asking as I'm running with logbias=latency, and I vaguely remember benchmarking 0.7 as faster than the 0.6 series I upgraded a certain system from a while ago...

@pruiz
Author

pruiz commented Aug 28, 2018 via email

@behlendorf
Contributor

One thing I didn't originally notice from your first post is that recordsize=16k is set on the dataset. That's definitely a less common configuration and a potential reason why you're seeing a performance regression while others have reported an overall improvement. Regardless, we'll need to find the bottleneck. Thank you for bringing it to our attention and posting your performance results.

@pruiz
Author

pruiz commented Aug 31, 2018

@behlendorf yeah, our intended use in this case is for a db server, so 8k or 16k should be the optimal recordsize.. probably not as common as bigger recordsizes, as you stated.

Anyway, I would be more than happy to test other configurations/options if you guys need it.

@matveevandrey

matveevandrey commented Sep 6, 2018

@pruiz, great job!
Would you mind performing your tests with recordsize=4k (since you have ashift=12, the physical block size is 4k as well)? I've also noticed a performance drop after upgrading our NFS server (used as Proxmox shared storage) from 0.6.5 to 0.7.

@pruiz pruiz closed this as completed Sep 6, 2018
@pruiz pruiz reopened this Sep 6, 2018
@pruiz
Author

pruiz commented Sep 6, 2018

Tests with recordsize=4k, logbias=throughput (fio, using randrw 70/30, as usual, with both 1G & 16G test files):

  • Using v0.6.5.11, with 1G test file:
FILESIZE BS IODEPTH THREADS WMODE IOP/s (R) IOP/s (W) BW (R) BW (W)
1G 4K 1 1 SYNC 6498 2786 25.4MB 10.9MB
1G 8K 1 1 SYNC 7243 3100 56.6MB 24.2MB
1G 256K 1 1 SYNC 1349 581 337MB 145MB
1G 1M 1 1 SYNC 372 160 372MB 160MB
1G 4k 16 16 SYNC 25900 11100 101MB 43.4MB
1G 8k 16 16 SYNC 22000 9827 179MB 76.8MB
1G 256k 16 16 SYNC 1927 823 482MB 206MB
1G 1M 16 16 SYNC 483 211 484MB 211MB
1G 4k 1 1 NOSYNC 75300 32200 294MB 126MB
1G 8k 1 1 NOSYNC 49200 21100 384MB 165MB
1G 256k 1 1 NOSYNC 2444 1040 611MB 260MB
1G 1M 1 1 NOSYNC 618 263 618MB 264MB
1G 4k 16 16 NOSYNC 211000 90400 824MB 353MB
1G 8k 16 16 NOSYNC 115000 49300 898MB 385MB
1G 256k 16 16 NOSYNC 4193 1800 1048MB 450MB
1G 1M 16 16 NOSYNC 1065 459 1066MB 460MB
  • Using v0.6.5.11, with 16G test file:
FILESIZE BS IODEPTH THREADS WMODE IOP/s (R) IOP/s (W) BW (R) BW (W)
16G 4K 1 1 SYNC 4678 2003 18.3MB 8MB
16G 1M 1 1 SYNC 287 125 288MB 125MB
16G 4K 16 16 SYNC 23100 9881 90.1MB 38.6MB
16G 1M 16 16 SYNC 462 200 463MB 200MB
16G 4K 1 1 NOSYNC 11800 5083 46.3MB 19.9MB
16G 1M 1 1 NOSYNC 359 154 359MB 155MB
16G 4K 16 16 NOSYNC 99900 42800 390MB 167MB
16G 1M 16 16 NOSYNC 874 373 874MB 374MB
  • Using master (v0.7-1533_g47ab01a), with 1G test file:
FILESIZE BS IODEPTH THREADS WMODE IOP/s (R) IOP/s (W) BW (R) BW (W)
1G 4K 1 1 SYNC 4942 2125 19.3MB 8.5MB
1G 8K 1 1 SYNC 4841 2072 37.8MB 16.2MB
1G 256K 1 1 SYNC 1240 533 310MB 133MB
1G 1M 1 1 SYNC 377 162 378MB 163MB
1G 4K 16 16 SYNC 42500 18200 166MB 71.1MB
1G 8K 16 16 SYNC 32600 13000 255MB 109MB
1G 256K 16 16 SYNC 1666 707 417MB 177MB
1G 1M 16 16 SYNC 438 188 438MB 188MB
1G 4K 1 1 NOSYNC 57500 24600 224MB 96.2MB
1G 8K 1 1 NOSYNC 36600 15700 286MB 122MB
1G 256K 1 1 NOSYNC 1754 754 499MB 189MB
1G 1M 1 1 NOSYNC 460 201 461MB 201MB
1G 4K 16 16 NOSYNC 130000 55800 508MB 218MB
1G 8K 16 16 NOSYNC 64200 27500 501MB 215MB
1G 256K 16 16 NOSYNC 2251 970 563MB 243MB
1G 1M 16 16 NOSYNC 583 248 584MB 249MB
  • Using master (v0.7-1533_g47ab01a), with 16G test file:
FILESIZE BS IODEPTH THREADS WMODE IOP/s (R) IOP/s (W) BW (R) BW (W)
16G 4K 1 1 SYNC 3612 1535 14.1MB 6.1MB
16G 1M 1 1 SYNC 288 122 288MB 123MB
16G 4K 16 16 SYNC 19500 8364 76.1MB 32.7MB
16G 1M 16 16 SYNC 389 168 390MB 169MB
16G 4K 1 1 NOSYNC 12900 5547 50.6MB 21.7MB
16G 1M 1 1 NOSYNC 339 145 340MB 146MB
16G 4K 16 16 NOSYNC 36500 15600 143MB 61.1MB
16G 1M 16 16 NOSYNC 510 219 511MB 220MB
  • Results summary:

    1. Baseline 4k IOPs (SYNC)
      -> v0.6.5.11 - 1G/4k/1/1 => 6498 / 2786 (25.4MB / 10.9MB)
      -> v0.7-master - 1G/4k/1/1 => 4942 / 2125 (19.3MB / 8.5MB)
      => v0.6.5 wins this case by ~20%.

    2. Baseline 4k IOPs (NOSYNC)
      -> v0.6.5.11 - 1G/4k/1/1 => 75300 / 32200 (294MB / 126MB)
      -> v0.7-master - 1G/4k/1/1 => 57500 / 24600 (224MB / 96.2MB)
      => v0.6.5 wins again..

    3. Highest IOPs (SYNC)
      -> v0.6.5.11 - 1G/4k/16/16 => 25900 / 11100 (101MB / 43.4MB)
      -> v0.7-master - 1G/4k/16/16 => 42500 / 18200 (166MB / 71.1MB)
      => Winner is v0.7-master by an impressive 50%+

    4. Highest IOPs (NOSYNC)
      -> v0.6.5.11 - 1G/4k/16/16 => 211000 / 90400 (824MB / 353MB)
      -> v0.7-master - 1G/4k/16/16 => 130000 / 55800 (508MB / 218MB)
      => In this case v0.6.5 wins by far, nearly double.

    5. Highest Throughput (SYNC)
      -> v0.6.5.11 - 1G/1M/16/16 => 483 / 211 (484MB / 211MB)
      -> v0.7-master - 1G/1M/16/16 => 438 / 188 (438MB / 188MB)
      => I would call this a tie.

    6. Highest Throughput (NOSYNC)
      -> v0.6.5.11 - 1G/1M/16/16 => 1065 / 459 (1066MB / 460MB)
      -> v0.7-master - 1G/1M/16/16 => 583 / 248 (584MB / 249MB)
      => Another clear win for v0.6.5.

NOTEs:

  • I've verified with zdb that ashift is 12 as intended.
  • Testing against 0.7-master was done with zfs_abd_scatter_enabled=1 & zfs_compressed_arc_enabled=1.
  • The high concurrency/iodepth gains on 4k/sync with v0.7-master are impressive.. however, it looks like we have a huge drop in the opposite use cases (no-sync tests).

I will try to add test results against v0.7.9 if I find some spare time tonight, as I would love to know whether those 4k/SYNC IOPS results of v0.7-master are reproducible with v0.7.9 too.

@tonynguien
Contributor

Using the zfs-tests performance regression suite, I'm seeing a similar regression for cached reads, random reads, and random writes. I'll start bisecting commits between the 0.6.5.11 and 0.7.0 tags.

@olavgg

olavgg commented Sep 15, 2018

I would like to add some details here. I have used ZFS on FreeBSD for almost 10 years, and it has always had decent performance. But I have a newer build with only SSDs and an Optane 900p as SLOG, and the sync write performance is really bad. I've compared with different Linux distributions and other filesystems.

The tool I use to test sync write performance is pg_test_fsync

Here is the performance on my FreeBSD server with 3 raidz vdevs of 5 x 5400RPM spinners (15 disks total) and a 32GB Optane.

$ pg_test_fsync -f /tank/rot/testfile
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                                   n/a
        fdatasync                          7134.022 ops/sec     140 usecs/op
        fsync                              7138.345 ops/sec     140 usecs/op
        fsync_writethrough                              n/a
        open_sync                          7436.686 ops/sec     134 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                                   n/a
        fdatasync                          5139.483 ops/sec     195 usecs/op
        fsync                              4403.700 ops/sec     227 usecs/op
        fsync_writethrough                              n/a
        open_sync                          2606.494 ops/sec     384 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
         1 * 16kB open_sync write          5082.113 ops/sec     197 usecs/op
         2 *  8kB open_sync writes         3707.069 ops/sec     270 usecs/op
         4 *  4kB open_sync writes         2144.459 ops/sec     466 usecs/op
         8 *  2kB open_sync writes         1271.302 ops/sec     787 usecs/op
        16 *  1kB open_sync writes          636.725 ops/sec    1571 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
        write, fsync, close                5989.971 ops/sec     167 usecs/op
        write, close, fsync                5913.696 ops/sec     169 usecs/op

Non-sync'ed 8kB writes:
        write                             72071.214 ops/sec      14 usecs/op

With 6 x striped 800GB enterprise-class SSDs, an Optane 900p as SLOG, and ZFS on Linux 0.8.0-rc1 on Ubuntu 18.04:

$ sudo /usr/lib/postgresql/10/bin/pg_test_fsync -f /tank/rot/testfile  
5 seconds per test
O_DIRECT supported on this platform for open_datasync and open_sync.

Compare file sync methods using one 8kB write:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                      2574,871 ops/sec     388 usecs/op
        fdatasync                          2265,568 ops/sec     441 usecs/op
        fsync                              2242,302 ops/sec     446 usecs/op
        fsync_writethrough                              n/a
        open_sync                          2510,196 ops/sec     398 usecs/op

Compare file sync methods using two 8kB writes:
(in wal_sync_method preference order, except fdatasync is Linux's default)
        open_datasync                      1301,706 ops/sec     768 usecs/op
        fdatasync                          2101,979 ops/sec     476 usecs/op
        fsync                              2082,698 ops/sec     480 usecs/op
        fsync_writethrough                              n/a
        open_sync                          1441,130 ops/sec     694 usecs/op

Compare open_sync with different write sizes:
(This is designed to compare the cost of writing 16kB in different write
open_sync sizes.)
         1 * 16kB open_sync write          2421,870 ops/sec     413 usecs/op
         2 *  8kB open_sync writes         1286,643 ops/sec     777 usecs/op
         4 *  4kB open_sync writes          674,385 ops/sec    1483 usecs/op
         8 *  2kB open_sync writes          352,586 ops/sec    2836 usecs/op
        16 *  1kB open_sync writes          179,682 ops/sec    5565 usecs/op

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written on a different
descriptor.)
        write, fsync, close                2469,133 ops/sec     405 usecs/op
        write, close, fsync                2522,016 ops/sec     397 usecs/op

Non-sync'ed 8kB writes:
        write                            113709,613 ops/sec       9 usecs/op

For comparison, exact same hardware, default settings, ZoL 0.7.9, benchmarked with pg_test_fsync:

Ubuntu 18.04 2200 iops
Debian 9 2000 iops
CentOS 7 8000 iops

FreeBSD 11.2 16000 iops

Ubuntu 18.04 + XFS 34000 iops
Ubuntu 18.04 + EXT4 32000 iops
Ubuntu 18.04 + BcacheFS 14000 iops 

If there is anything I can help with, please ask. I now know how to build from source 8-)

@GregorKopka
Contributor

/tank/rot/testfile

What are the settings on the zfs filesystem (default of recordsize=128k would explain quite a lot)?
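
(Something like the following would show the relevant properties; the dataset name is inferred from the test path and may differ:)

zfs get recordsize,logbias,sync,compression tank/rot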

@olavgg

olavgg commented Sep 16, 2018

In my case it is the default, which is a dynamic record size, meaning that when pg_test_fsync writes, ZFS will write 8kB blocks.

@GregorKopka
Contributor

GregorKopka commented Sep 16, 2018

ZFS will maintain files in a filesystem in $recordsize sized blocks on-disk.
You tested performance of read-modify-write cycles with 128k on-disk blocks by partially rewriting 8k chunks, which naturally isn't great.

In case you want zfs to write 8k on-disk blocks:
Set recordsize of the filesystem to that value, recreate the file, test again.
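
Concretely, something along these lines (dataset name inferred from the test path; recordsize only applies to newly written blocks, hence recreating the file):

zfs set recordsize=8k tank/rot
rm /tank/rot/testfile
pg_test_fsync -f /tank/rot/testfile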

@olavgg

olavgg commented Sep 16, 2018

The recordsize is dynamic; it writes 8kB even if the recordsize is higher.

As this is easy to test, I can confirm that I get the exact same numbers. Also, iostat says the SLOG, which is an Optane 900p, is writing around 20-30MB/s.

@gmelikov
Member

IIRC recordsize is dynamic only for files smaller than the recordsize, or when compression is enabled.

@richardelling
Contributor

Are you sure you're not being impacted by the write throttle? It can be tuned and
the default tuning is a bit of a guess.

https://github.com/zfsonlinux/zfs/wiki/ZFS-Transaction-Delay
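
A quick way to see whether the throttle is actually engaging, and the main knobs involved (paths are the standard ZoL ones; the pool name is a placeholder):

cat /proc/spl/kstat/zfs/<poolname>/txgs
grep . /sys/module/zfs/parameters/zfs_dirty_data_max \
       /sys/module/zfs/parameters/zfs_delay_min_dirty_percent \
       /sys/module/zfs/parameters/zfs_delay_scale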

@rlaager
Member

rlaager commented Sep 17, 2018

Are you able to bisect this at all, even just using released versions? As a starting point, is 0.7.0 good like 0.6.5.11 or bad like 0.7.9?
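
A bisect between the release tags would look roughly like this (a sketch only; in the 0.7 era SPL is a separate repository, so a matching spl build is needed at each step):

git clone https://github.com/zfsonlinux/zfs.git && cd zfs
git bisect start zfs-0.7.9 zfs-0.6.5.11    # bad tag, then good tag
# at each step: build, install, reload the modules, re-run one fio case, then:
git bisect good    # or: git bisect bad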

@h1z1

h1z1 commented Sep 17, 2018

Tuning zfs module parameters on v0.7.9 (like zfs_vdev_*, etc.) makes no observable difference.

Are you running stock settings or what was tweaked here?

grep . /sys/module/zfs/parameters/*

would help too if you can. From what I can tell above, 0.7 may have lower bandwidth and io/s, but it has quite a bit lower latency.

@dweeezil
Contributor

dweeezil commented Sep 17, 2018

One commit that comes to mind for anyone able to bisect is 1ce23dc. It will [EDIT] increase [/EDIT] latency for single-threaded synchronous workloads such as pg_test_fsync but should help multi-threaded workloads as can be simulated with fio. See https://goo.gl/jBUiH5 for the author's performance testing on this commit.

@tonynguien
Contributor

So the change in zio_notify_parent(), which replaced zio_execute() with zio_taskq_dispatch(), introduced the performance regression. I reverted that change and got performance similar to the pre-throttle code.

In master, https://github.com/zfsonlinux/zfs/pull/7736/commits reduced taskq context switching and thus solved the above issue.

@pruiz Would you be able to test with master or 0.8 code to verify?

@tonynguien
Contributor

tonynguien commented Oct 2, 2018

Additionally, I noticed two things:

  1. Random write and sequential read numbers from the pre-write-throttle code are still higher than the numbers with the 7736 change.

Pre write throttle numbers:

delphix@ZoL-ubuntu-4: grep iop random_writes.ksh.fio**
random_writes.ksh.fio.sync.8k-ios.128-threads.1-filesystems:  write: io=21403MB, bw=182629KB/s, iops=22828, runt=120005msec
random_writes.ksh.fio.sync.8k-ios.1-threads.1-filesystems:  write: io=4094.5MB, bw=34939KB/s, iops=4367, runt=120001msec
random_writes.ksh.fio.sync.8k-ios.32-threads.1-filesystems:  write: io=16115MB, bw=137498KB/s, iops=17187, runt=120011msec
delphix@ZoL-ubuntu-4:
delphix@ZoL-ubuntu-4: grep iop sequential_reads.ksh.fio*
sequential_reads.ksh.fio.sync.128k-ios.128-threads.1-filesystems:  read : io=114173MB, bw=973311KB/s, iops=7603, runt=120119msec
sequential_reads.ksh.fio.sync.128k-ios.16-threads.1-filesystems:  read : io=219190MB, bw=1826.6MB/s, iops=14612, runt=120001msec
sequential_reads.ksh.fio.sync.128k-ios.1-threads.1-filesystems:  read : io=64290MB, bw=548606KB/s, iops=4285, runt=120001msec
sequential_reads.ksh.fio.sync.128k-ios.64-threads.1-filesystems:  read : io=114187MB, bw=974304KB/s, iops=7611, runt=120011msec
sequential_reads.ksh.fio.sync.128k-ios.8-threads.1-filesystems:  read : io=270624MB, bw=2255.2MB/s, iops=18041, runt=120001msec
sequential_reads.ksh.fio.sync.1m-ios.128-threads.1-filesystems:  read : io=114034MB, bw=970752KB/s, iops=948, runt=120289msec
sequential_reads.ksh.fio.sync.1m-ios.16-threads.1-filesystems:  read : io=213920MB, bw=1782.6MB/s, iops=1782, runt=120006msec
sequential_reads.ksh.fio.sync.1m-ios.1-threads.1-filesystems:  read : io=61403MB, bw=523968KB/s, iops=511, runt=120001msec
sequential_reads.ksh.fio.sync.1m-ios.64-threads.1-filesystems:  read : io=115057MB, bw=981288KB/s, iops=958, runt=120065msec
sequential_reads.ksh.fio.sync.1m-ios.8-threads.1-filesystems:  read : io=263363MB, bw=2194.7MB/s, iops=2194, runt=120003msec
delphix@ZoL-ubuntu-4:

7736 numbers

delphix@ZoL-ubuntu-4: grep iop random_writes.ksh.fio**
random_writes.ksh.fio.sync.8k-ios.128-threads.1-filesystems:  write: io=17093MB, bw=145773KB/s, iops=18221, runt=120069msec
random_writes.ksh.fio.sync.8k-ios.1-threads.1-filesystems:  write: io=2680.9MB, bw=22876KB/s, iops=2859, runt=120001msec
random_writes.ksh.fio.sync.8k-ios.32-threads.1-filesystems:  write: io=12982MB, bw=110772KB/s, iops=13846, runt=120006msec
delphix@ZoL-ubuntu-4:
delphix@ZoL-ubuntu-4:
delphix@ZoL-ubuntu-4: grep iop sequential_reads.ksh.fio*
sequential_reads.ksh.fio.sync.128k-ios.128-threads.1-filesystems:  read : io=97437MB, bw=831078KB/s, iops=6492, runt=120055msec
sequential_reads.ksh.fio.sync.128k-ios.16-threads.1-filesystems:  read : io=158444MB, bw=1320.4MB/s, iops=10562, runt=120003msec
sequential_reads.ksh.fio.sync.128k-ios.1-threads.1-filesystems:  read : io=55948MB, bw=477418KB/s, iops=3729, runt=120001msec
sequential_reads.ksh.fio.sync.128k-ios.64-threads.1-filesystems:  read : io=92176MB, bw=786437KB/s, iops=6144, runt=120020msec
sequential_reads.ksh.fio.sync.128k-ios.8-threads.1-filesystems:  read : io=228701MB, bw=1905.9MB/s, iops=15246, runt=120001msec
sequential_reads.ksh.fio.sync.1m-ios.128-threads.1-filesystems:  read : io=98438MB, bw=838174KB/s, iops=818, runt=120262msec
sequential_reads.ksh.fio.sync.1m-ios.16-threads.1-filesystems:  read : io=155480MB, bw=1295.7MB/s, iops=1295, runt=120005msec
sequential_reads.ksh.fio.sync.1m-ios.1-threads.1-filesystems:  read : io=53686MB, bw=458117KB/s, iops=447, runt=120001msec
sequential_reads.ksh.fio.sync.1m-ios.64-threads.1-filesystems:  read : io=93378MB, bw=796341KB/s, iops=777, runt=120073msec
sequential_reads.ksh.fio.sync.1m-ios.8-threads.1-filesystems:  read : io=219143MB, bw=1826.2MB/s, iops=1826, runt=120003msec
delphix@ZoL-ubuntu-4:
  2. Cached read performance also dropped somewhere between 0.6.5 and the write throttle commit.

So we may still have some regressions. I'm looking at #2 now. Does it make sense to open new issue(s)?

@dweeezil
Contributor

dweeezil commented Oct 2, 2018

I'd also like to mention that disabling dynamic taskqs (spl_taskq_thread_dynamic=0) can decrease latency and improve performance, especially in single-threaded benchmark scenarios.
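
(In the 0.7.x split-module layout that is an SPL option; e.g., to make it persistent it takes effect the next time the modules are loaded:)

echo "options spl spl_taskq_thread_dynamic=0" > /etc/modprobe.d/spl.conf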

@tonynguien
Contributor

I'd also like to mention that disabling dynamic taskqs (spl_taskq_thread_dynamic=0) can decrease latency and improve performance, especially in single-threaded benchmark scenarios.

Thanks!

@hedongzhang
Contributor

hedongzhang commented Oct 16, 2018

I used fio to test the performance of a zfs-0.7.11 zvol; the write amplification is more than 6x, which seriously affects zvol performance.

Type Version/Name
Distribution Name redhat-7.4
Distribution Version 7.4
Linux Kernel 3.10.0-693.el7.x86_64
Architecture x86_64
ZFS Version 0.7.11
SPL Version 0.7.11
Hardware 3 x SSD(370G)
  • 8K zvol randwrite

[image: fio results for the 8K zvol randwrite test]

@hedongzhang
Contributor

@kpande I don't quite understand what you mean. Can you elaborate more?

@janetcampbell

You are handling all ZIL writes via indirect sync (logbias=throughput). This will trash your ability to aggregate read I/O over time due to data/metadata fragmentation, and will even greatly reduce your ability to aggregate between one data block and another. Any outstanding async write in the same sync domain may suffer as well.

I understand the desire for throughput but here it is coming at the expense of the pool data at large. In the real world, you would seldom set up a dataset like this unless read performance was totally unimportant. If you will read from a block at least once, it's worth doing direct sync.

If you test with logbias=latency, you need to either add a SLOG or increase zfs_immediate_write_sz.

I'd recommend doing a ZFS send while you watch zpool iostat -r. With 16k indirect writes you should have some absolutely amazing unaggregatable fragmentation.
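
For anyone wanting to try the logbias=latency route on the original pool, roughly (the device path is a placeholder; zfs_immediate_write_sz is in bytes, default 32768):

zfs set logbias=latency DATA/db-data
zpool add DATA log /dev/disk/by-id/<fast-ssd-or-nvme>   # optional dedicated SLOG
echo 131072 > /sys/module/zfs/parameters/zfs_immediate_write_sz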

@janetcampbell

Another note - it looks like you are suffering reads even on full block writes. This should help greatly with that:

#8590

@pauful

pauful commented Apr 23, 2019

We have encountered a very similar issue, in the form of a significant performance drop between zfs 0.6.5.9 and 0.7.11. We are able to overcome the issue by setting zfs_abd_scatter_enabled=0 & zfs_compressed_arc_enabled=0.

We are using Debian Stretch (version 9.8) and linux kernel 4.9.0-8-amd64.

Our recordsize is 128K and I don't think we would be able to decrease it.

@richardelling
Contributor

I too have seen cases where ABD scatter/gather isn't as performant. So I can
believe it makes a difference for some workloads, but don't have a generic
guideline for when to use it and when to not use it. Experiment results appreciated.

I don't believe disabling compressed ARC will make much difference. Perhaps on
small memory machines? Can you toggle that and report results. This will be more
important soon as there is a proposal to force compressed ARC on. #7896

@pauful

pauful commented Apr 26, 2019

I tried enabling zfs_compressed_arc_enabled and zfs_abd_scatter_enabled in separate tests.
Enabling compressed ARC causes the bigger performance drop of the two. Enabling ABD scatter/gather also decreases performance, but not as noticeably as enabling compressed ARC.
Our machine has more than 250G of memory. Let me know if I can help by providing any other information.

@jwittlincohen
Contributor

@pauful What compression algorithm are you using on your datasets? lz4 is very fast to decompress but gzip would certainly cause issues.

@pauful

pauful commented May 9, 2019

@jwittlincohen lz4 is the compression option used in our pools.

@matveevandrey

@pruiz Have you tried current master? Some performance-oriented commits have been applied since.

@pruiz
Author

pruiz commented Jun 21, 2019 via email

@interduo

interduo commented Jul 17, 2019

Has anybody done performance tests on 0.8.2 and would like to share?
I didn't find anything interesting on Google.

@interduo

interduo commented Oct 9, 2019

@pruiz could you do the same test on 0.8.2 with the hardware you mentioned earlier?

@pruiz
Author

pruiz commented Oct 9, 2019 via email

@interduo

interduo commented Feb 18, 2020

@pruiz could you do the same test on 0.8.3 with the hardware you mentioned earlier?
You did good work with that earlier.

@stevecs

stevecs commented Oct 12, 2020

Just a lurker on this bug, as I saw a similar drop on systems here back in 2018 doing the same transition from 0.6.5 to 0.7.x: performance decreased to about 1/5-1/6th when running v0.7, so I had to fall back. I've been testing newer versions against the same array on and off over the last two years, but still within the 0.7 line and with no luck. This last week I tried 0.8.3 and performance is back, comparable with 0.6.5. This is on one of my larger dev/qa systems; I'll watch it closely for the next month before upgrading the other systems. So 0.8.3 looks promising. Just wanted to bump this and see if pruiz could validate whether it also alleviates his original problem.

@interduo

@stevecs try 0.8.5; this version gives more I/Os.

@stevecs

stevecs commented Oct 12, 2020

@interduo I'll see if I can get another window, but it will probably be a couple of weeks. I did a quick look at the commit deltas between 0.8.3 and 0.8.5 but didn't see much to catch my eye for I/O improvements (though I did spot a couple of other commits that were interesting). Can you give me a hint as to what commits you think may be relevant?

@interduo

I just jumped from 0.8.2 to 0.8.5 and got a nice surprise on the I/O graphs. I didn't look at the commits.

@stale

stale bot commented Oct 12, 2021

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale No recent activity for issue label Oct 12, 2021
@stale stale bot closed this as completed Jan 12, 2022