Reduce deflate table sizes #227

Merged
klauspost merged 3 commits into master from reduce-deflate-table-sizes
Feb 23, 2020

Conversation

klauspost
Owner

@klauspost klauspost commented Feb 19, 2020

Fixes #223

benchmark                                           old bytes     new bytes     delta
BenchmarkCompressAllocations/level(-2)/flate-12     342272        342272        +0.00%
BenchmarkCompressAllocations/level(-2)/gzip-12      342448        342448        +0.00%
BenchmarkCompressAllocations/level(-1)/flate-12     3235203       2448774       -24.31%
BenchmarkCompressAllocations/level(-1)/gzip-12      3235381       2448949       -24.31%
BenchmarkCompressAllocations/level(0)/flate-12      339968        339968        +0.00%
BenchmarkCompressAllocations/level(0)/gzip-12       340144        340144        +0.00%
BenchmarkCompressAllocations/level(1)/flate-12      2186626       1924486       -11.99%
BenchmarkCompressAllocations/level(1)/gzip-12       2186802       1924662       -11.99%
BenchmarkCompressAllocations/level(2)/flate-12      3759493       2710923       -27.89%
BenchmarkCompressAllocations/level(2)/gzip-12       3759669       2711102       -27.89%
BenchmarkCompressAllocations/level(3)/flate-12      2710921       2186626       -19.34%
BenchmarkCompressAllocations/level(3)/gzip-12       2711100       2186802       -19.34%
BenchmarkCompressAllocations/level(4)/flate-12      2710925       2186626       -19.34%
BenchmarkCompressAllocations/level(4)/gzip-12       2711098       2186802       -19.34%
BenchmarkCompressAllocations/level(5)/flate-12      3235207       2448774       -24.31%
BenchmarkCompressAllocations/level(5)/gzip-12       3235382       2448948       -24.31%
BenchmarkCompressAllocations/level(6)/flate-12      3235204       2448773       -24.31%
BenchmarkCompressAllocations/level(6)/gzip-12       3235381       2448949       -24.31%
BenchmarkCompressAllocations/level(7)/flate-12      1006980       1006979       -0.00%
BenchmarkCompressAllocations/level(7)/gzip-12       1007156       1007155       -0.00%
BenchmarkCompressAllocations/level(8)/flate-12      1006980       1006979       -0.00%
BenchmarkCompressAllocations/level(8)/gzip-12       1007156       1007155       -0.00%
BenchmarkCompressAllocations/level(9)/flate-12      1006981       1006980       -0.00%
BenchmarkCompressAllocations/level(9)/gzip-12       1007155       1007155       +0.00%

When reducing table bits by 1:

benchmark                                           old bytes     new bytes     delta
BenchmarkCompressAllocations/level(-2)/flate-12     342272        342272        +0.00%
BenchmarkCompressAllocations/level(-2)/gzip-12      342448        342448        +0.00%
BenchmarkCompressAllocations/level(-1)/flate-12     3235203       2055557       -36.46%
BenchmarkCompressAllocations/level(-1)/gzip-12      3235381       2055732       -36.46%
BenchmarkCompressAllocations/level(0)/flate-12      339968        339968        +0.00%
BenchmarkCompressAllocations/level(0)/gzip-12       340144        340145        +0.00%
BenchmarkCompressAllocations/level(1)/flate-12      2186626       1793416       -17.98%
BenchmarkCompressAllocations/level(1)/gzip-12       2186802       1793587       -17.98%
BenchmarkCompressAllocations/level(2)/flate-12      3759493       2186627       -41.84%
BenchmarkCompressAllocations/level(2)/gzip-12       3759669       2186803       -41.84%
BenchmarkCompressAllocations/level(3)/flate-12      2710921       1924486       -29.01%
BenchmarkCompressAllocations/level(3)/gzip-12       2711100       1924662       -29.01%
BenchmarkCompressAllocations/level(4)/flate-12      2710925       1924485       -29.01%
BenchmarkCompressAllocations/level(4)/gzip-12       2711098       1924659       -29.01%
BenchmarkCompressAllocations/level(5)/flate-12      3235207       2055554       -36.46%
BenchmarkCompressAllocations/level(5)/gzip-12       3235382       2055730       -36.46%
BenchmarkCompressAllocations/level(6)/flate-12      3235204       2055555       -36.46%
BenchmarkCompressAllocations/level(6)/gzip-12       3235381       2055731       -36.46%
BenchmarkCompressAllocations/level(7)/flate-12      1006980       1006980       +0.00%
BenchmarkCompressAllocations/level(7)/gzip-12       1007156       1007155       -0.00%
BenchmarkCompressAllocations/level(8)/flate-12      1006980       1006979       -0.00%
BenchmarkCompressAllocations/level(8)/gzip-12       1007156       1007155       -0.00%
BenchmarkCompressAllocations/level(9)/flate-12      1006981       1006979       -0.00%
BenchmarkCompressAllocations/level(9)/gzip-12       1007155       1007155       +0.00%
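
As a rough sanity check on the numbers above: a table indexed by `bits` hash bits has `2^bits` entries, so dropping one bit halves it. The sketch below assumes a hypothetical 4-byte entry (the real entry layout is not shown in this PR text), and `tableBytes` is an illustrative name, not a function from the library. Under that assumption, going from 16 to 15 bits saves 131072 bytes, which is close to the 131070-byte difference between the two level(1) results (1924486 vs 1793416).

```go
package main

import "fmt"

// tableBytes returns the memory used by a hash table with 2^bits entries
// of entrySize bytes each. Both parameters are illustrative and are not
// the library's actual constants.
func tableBytes(bits uint, entrySize int) int {
	return (1 << bits) * entrySize
}

func main() {
	const entrySize = 4 // assumption: one 32-bit offset per entry
	for bits := uint(17); bits >= 15; bits-- {
		fmt.Printf("%d bits: %d bytes\n", bits, tableBytes(bits, entrySize))
	}
	// Saving from removing a single bit at 16 bits.
	fmt.Println("saved:", tableBytes(16, entrySize)-tableBytes(15, entrySize))
}
```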

When reducing the history buffer to 640K (who needs more, right?):

benchmark                                           old bytes     new bytes     delta
BenchmarkCompressAllocations/level(-2)/flate-12     342272        342272        +0.00%
BenchmarkCompressAllocations/level(-2)/gzip-12      342448        342448        +0.00%
BenchmarkCompressAllocations/level(-1)/flate-12     3235203       1400195       -56.72%
BenchmarkCompressAllocations/level(-1)/gzip-12      3235381       1400371       -56.72%
BenchmarkCompressAllocations/level(0)/flate-12      339968        339968        +0.00%
BenchmarkCompressAllocations/level(0)/gzip-12       340144        340144        +0.00%
BenchmarkCompressAllocations/level(1)/flate-12      2186626       1138050       -47.95%
BenchmarkCompressAllocations/level(1)/gzip-12       2186802       1138226       -47.95%
BenchmarkCompressAllocations/level(2)/flate-12      3759493       1531271       -59.27%
BenchmarkCompressAllocations/level(2)/gzip-12       3759669       1531443       -59.27%
BenchmarkCompressAllocations/level(3)/flate-12      2710921       1269123       -53.18%
BenchmarkCompressAllocations/level(3)/gzip-12       2711100       1269299       -53.18%
BenchmarkCompressAllocations/level(4)/flate-12      2710925       1269124       -53.18%
BenchmarkCompressAllocations/level(4)/gzip-12       2711098       1269298       -53.18%
BenchmarkCompressAllocations/level(5)/flate-12      3235207       1400195       -56.72%
BenchmarkCompressAllocations/level(5)/gzip-12       3235382       1400372       -56.72%
BenchmarkCompressAllocations/level(6)/flate-12      3235204       1400194       -56.72%
BenchmarkCompressAllocations/level(6)/gzip-12       3235381       1400370       -56.72%
BenchmarkCompressAllocations/level(7)/flate-12      1006980       1006979       -0.00%
BenchmarkCompressAllocations/level(7)/gzip-12       1007156       1007155       -0.00%
BenchmarkCompressAllocations/level(8)/flate-12      1006980       1006979       -0.00%
BenchmarkCompressAllocations/level(8)/gzip-12       1007156       1007155       -0.00%
BenchmarkCompressAllocations/level(9)/flate-12      1006981       1006982       +0.00%
BenchmarkCompressAllocations/level(9)/gzip-12       1007155       1007158       +0.00%

Speed in before/after pairs:

file	out	level	insize	outsize	millis	mb/s
enwik10	pgzip	1	10000000000	3906338831	3495	2728.07
enwik10	pgzip	2	10000000000	3778852959	5557	1716.09
enwik10	pgzip	3	10000000000	3708169699	4586	2079.25
enwik10	pgzip	4	10000000000	3463551112	4903	1944.73
enwik10	pgzip	5	10000000000	3430167085	7311	1304.36
enwik10	pgzip	6	10000000000	3402617678	7489	1273.32
enwik10	pgzip	7	10000000000	3334393779	9369	1017.89
enwik10	pgzip	8	10000000000	3299391670	12156	784.48
enwik10	pgzip	9	10000000000	3280982678	14925	638.96

file	out	level	insize	outsize	millis	mb/s
enwik10	pgzip	1	10000000000	3914770776	3374	2825.91
enwik10	pgzip	2	10000000000	3781000254	3601	2648.28
enwik10	pgzip	3	10000000000	3710526115	4567	2088.04
enwik10	pgzip	4	10000000000	3476682271	4411	2161.55
enwik10	pgzip	5	10000000000	3438579062	5074	1879.48
enwik10	pgzip	6	10000000000	3411536185	5318	1793.17
enwik10	pgzip	7	10000000000	3334393782	9395	1015.07
enwik10	pgzip	8	10000000000	3299391673	12193	782.10
enwik10	pgzip	9	10000000000	3280982681	14822	643.40

file	out	level	insize	outsize	millis	mb/s
sofia-air-quality-dataset.tar	pgzip	1	15464463872	4085715559	4699	3138.51
sofia-air-quality-dataset.tar	pgzip	2	15464463872	3728455097	5178	2848.12
sofia-air-quality-dataset.tar	pgzip	3	15464463872	3970513842	4879	3022.70
sofia-air-quality-dataset.tar	pgzip	4	15464463872	3086301456	5152	2862.50
sofia-air-quality-dataset.tar	pgzip	5	15464463872	3087736829	7866	1874.73
sofia-air-quality-dataset.tar	pgzip	6	15464463872	3058925149	8417	1751.99
sofia-air-quality-dataset.tar	pgzip	7	15464463872	2910183972	11103	1328.23
sofia-air-quality-dataset.tar	pgzip	8	15464463872	2896336359	14012	1052.52
sofia-air-quality-dataset.tar	pgzip	9	15464463872	2893693092	21044	700.80

file	out	level	insize	outsize	millis	mb/s
sofia-air-quality-dataset.tar	pgzip	1	15464463872	4106815025	4167	3538.45
sofia-air-quality-dataset.tar	pgzip	2	15464463872	3736725250	4502	3275.88
sofia-air-quality-dataset.tar	pgzip	3	15464463872	4007987034	4896	3012.20
sofia-air-quality-dataset.tar	pgzip	4	15464463872	3110916968	4607	3201.08
sofia-air-quality-dataset.tar	pgzip	5	15464463872	3094454129	5158	2859.17
sofia-air-quality-dataset.tar	pgzip	6	15464463872	3063884577	5698	2588.16
sofia-air-quality-dataset.tar	pgzip	7	15464463872	2910183975	11071	1332.07
sofia-air-quality-dataset.tar	pgzip	8	15464463872	2896336362	13956	1056.74
sofia-air-quality-dataset.tar	pgzip	9	15464463872	2893693095	21032	701.20

file	out	level	insize	outsize	millis	mb/s
rawstudio-mint14.tar	pgzip	1	8558382592	3964506844	2773	2942.69
rawstudio-mint14.tar	pgzip	2	8558382592	3906429569	4609	1770.74
rawstudio-mint14.tar	pgzip	3	8558382592	3858462776	3602	2265.43
rawstudio-mint14.tar	pgzip	4	8558382592	3780289322	3859	2114.55
rawstudio-mint14.tar	pgzip	5	8558382592	3766946183	5550	1470.55
rawstudio-mint14.tar	pgzip	6	8558382592	3736991093	5893	1384.94
rawstudio-mint14.tar	pgzip	7	8558382592	3670197931	6448	1265.72
rawstudio-mint14.tar	pgzip	8	8558382592	3653014735	8474	963.07
rawstudio-mint14.tar	pgzip	9	8558382592	3634576029	54262	150.42

file	out	level	insize	outsize	millis	mb/s
rawstudio-mint14.tar	pgzip	1	8558382592	3967671405	2449	3332.00
rawstudio-mint14.tar	pgzip	2	8558382592	3907329379	2613	3122.87
rawstudio-mint14.tar	pgzip	3	8558382592	3865051624	3229	2527.12
rawstudio-mint14.tar	pgzip	4	8558382592	3784259677	3407	2395.09
rawstudio-mint14.tar	pgzip	5	8558382592	3770232836	4017	2031.38
rawstudio-mint14.tar	pgzip	6	8558382592	3740537477	4371	1866.87
rawstudio-mint14.tar	pgzip	7	8558382592	3670197934	6426	1270.05
rawstudio-mint14.tar	pgzip	8	8558382592	3653014738	8428	968.40
rawstudio-mint14.tar	pgzip	9	8558382592	3634576032	54335	150.21

file	out	level	insize	outsize	millis	mb/s
consensus.db.10gb	pgzip	1	10737418240	5204022570	3150	3250.06
consensus.db.10gb	pgzip	2	10737418240	5193794074	6523	1569.72
consensus.db.10gb	pgzip	3	10737418240	5085716845	4037	2535.96
consensus.db.10gb	pgzip	4	10737418240	5085583436	4504	2273.53
consensus.db.10gb	pgzip	5	10737418240	5068059941	7091	1443.96
consensus.db.10gb	pgzip	6	10737418240	5064553842	7547	1356.70
consensus.db.10gb	pgzip	7	10737418240	5002057594	5685	1801.14
consensus.db.10gb	pgzip	8	10737418240	4999037898	7130	1436.06
consensus.db.10gb	pgzip	9	10737418240	4902262269	71537	143.14

file	out	level	insize	outsize	millis	mb/s
consensus.db.10gb	pgzip	1	10737418240	5205701219	2919	3507.26
consensus.db.10gb	pgzip	2	10737418240	5194162927	3085	3318.54
consensus.db.10gb	pgzip	3	10737418240	5171856922	3440	2976.07
consensus.db.10gb	pgzip	4	10737418240	5085217445	4013	2551.13
consensus.db.10gb	pgzip	5	10737418240	5068346020	4763	2149.87
consensus.db.10gb	pgzip	6	10737418240	5064976775	5213	1964.25
consensus.db.10gb	pgzip	7	10737418240	5002057597	5786	1769.70
consensus.db.10gb	pgzip	8	10737418240	4999037901	7242	1413.85
consensus.db.10gb	pgzip	9	10737418240	4902262272	71410	143.40

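Levels 1 to 6 are faster with very small compression loss; levels 7 to 9 are essentially unchanged. The default level 5 in particular gets a great boost: on enwik10 it drops from 7311 ms to 5074 ms (about 31% less time) while the output grows by about 0.25%.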
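
To put numbers on any before/after pair, a small helper can compute the time saved and the output growth for one row. This is only a sketch and not part of the PR: the `result` type and both function names are made up here, and the sample values are copied from the enwik10 level 5 rows in the tables above.

```go
package main

import "fmt"

// result holds the relevant columns of one row from the speed tables.
type result struct {
	level   int
	outsize int64 // compressed size in bytes
	millis  int64 // wall time in milliseconds
}

// timeSavedPct returns how much less time `after` took, in percent of `before`.
func timeSavedPct(before, after result) float64 {
	return 100 * float64(before.millis-after.millis) / float64(before.millis)
}

// sizeGrowthPct returns how much larger the `after` output is, in percent of `before`.
func sizeGrowthPct(before, after result) float64 {
	return 100 * float64(after.outsize-before.outsize) / float64(before.outsize)
}

func main() {
	// enwik10, level 5, before and after.
	before := result{level: 5, outsize: 3430167085, millis: 7311}
	after := result{level: 5, outsize: 3438579062, millis: 5074}
	fmt.Printf("level %d: %.1f%% less time, %.2f%% larger output\n",
		before.level, timeSavedPct(before, after), sizeGrowthPct(before, after))
}
```

The same two functions can be applied to the other files and levels to see where the trade-off is less favourable.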

Fixes #223

After:
```
BenchmarkCompressAllocations/level(-2)/flate-12   	   15889	     75134 ns/op	  342272 B/op	      11 allocs/op
BenchmarkCompressAllocations/level(-2)/gzip-12    	   15424	     76300 ns/op	  342448 B/op	      12 allocs/op
BenchmarkCompressAllocations/level(-1)/flate-12   	    2673	    378711 ns/op	 2448774 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(-1)/gzip-12    	    3342	    377507 ns/op	 2448949 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(0)/flate-12    	   17437	     76986 ns/op	  339968 B/op	       9 allocs/op
BenchmarkCompressAllocations/level(0)/gzip-12     	   15076	     82031 ns/op	  340144 B/op	      10 allocs/op
BenchmarkCompressAllocations/level(1)/flate-12    	    3382	    377466 ns/op	 1924486 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(1)/gzip-12     	    3436	    387788 ns/op	 1924662 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(2)/flate-12    	    1971	    591518 ns/op	 2710923 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(2)/gzip-12     	    2073	    516709 ns/op	 2711102 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(3)/flate-12    	    3250	    426246 ns/op	 2186626 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(3)/gzip-12     	    3084	    420084 ns/op	 2186802 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(4)/flate-12    	    2733	    390467 ns/op	 2186626 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(4)/gzip-12     	    3165	    400509 ns/op	 2186802 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(5)/flate-12    	    2797	    417904 ns/op	 2448774 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(5)/gzip-12     	    2455	    456214 ns/op	 2448948 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(6)/flate-12    	    2733	    471116 ns/op	 2448773 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(6)/gzip-12     	    2673	    443633 ns/op	 2448949 B/op	      15 allocs/op
BenchmarkCompressAllocations/level(7)/flate-12    	    6015	    198306 ns/op	 1006979 B/op	      13 allocs/op
BenchmarkCompressAllocations/level(7)/gzip-12     	    5728	    188045 ns/op	 1007155 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(8)/flate-12    	    6684	    195617 ns/op	 1006979 B/op	      13 allocs/op
BenchmarkCompressAllocations/level(8)/gzip-12     	    6331	    193922 ns/op	 1007155 B/op	      14 allocs/op
BenchmarkCompressAllocations/level(9)/flate-12    	    6015	    193829 ns/op	 1006980 B/op	      13 allocs/op
BenchmarkCompressAllocations/level(9)/gzip-12     	    5728	    197447 ns/op	 1007155 B/op	      14 allocs/op
```
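
For readers who want to take a comparable measurement, the following is a minimal sketch of an allocation benchmark per level. It is not the repository's `BenchmarkCompressAllocations`, whose body is not shown here; it uses the standard library `compress/flate` as a stand-in (swap the import for the package under test), and `benchAlloc` is a hypothetical helper name. The absolute numbers will differ from the output above.

```go
package main

import (
	"compress/flate"
	"fmt"
	"io"
	"testing"
)

// benchAlloc measures creating a writer at the given level, compressing a
// small payload, and closing it, reporting bytes and allocations per op.
func benchAlloc(level int) testing.BenchmarkResult {
	payload := []byte("hello, world")
	return testing.Benchmark(func(b *testing.B) {
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			w, err := flate.NewWriter(io.Discard, level)
			if err != nil {
				b.Fatal(err)
			}
			if _, err := w.Write(payload); err != nil {
				b.Fatal(err)
			}
			if err := w.Close(); err != nil {
				b.Fatal(err)
			}
		}
	})
}

func main() {
	// A few representative levels; -2 is Huffman-only in compress/flate.
	for _, level := range []int{-2, 1, 5, 9} {
		r := benchAlloc(level)
		fmt.Printf("level(%d)\t%d B/op\t%d allocs/op\n",
			level, r.AllocedBytesPerOp(), r.AllocsPerOp())
	}
}
```

Creating a fresh writer inside the loop is deliberate: the table and history buffer sizes discussed in this PR are paid at construction time, so reusing a writer via `Reset` would hide them.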
@klauspost klauspost merged commit 0d728f0 into master Feb 23, 2020
@klauspost klauspost deleted the reduce-deflate-table-sizes branch February 23, 2020 18:35
Successfully merging this pull request may close these issues.

flate: Investigate smaller tables levels1-6