Add support for hardware compression on s390x for zlib-ng #72
Please don't duplicate the common settings. You can declare a variable for the builder and then call some of the methods conditionally.
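A minimal sketch of that suggestion, to make the shape concrete. The `Builder` type here is a hypothetical stand-in for a CMake-style builder, so the example runs without extra crates. The `configure` function and the specific option names are assumptions for illustration, not the actual contents of this PR's build script.

```rust
// Hypothetical stand-in for a CMake-style builder; it only records the
// options so the pattern can be shown without any dependencies.
#[derive(Debug, Default)]
struct Builder {
    defines: Vec<(String, String)>,
}

impl Builder {
    fn new() -> Self {
        Self::default()
    }

    // Record one KEY=VALUE option and return `&mut Self` so calls can chain.
    fn define(&mut self, key: &str, value: &str) -> &mut Self {
        self.defines.push((key.to_string(), value.to_string()));
        self
    }
}

// Configure the build once; only the s390x-specific options are conditional.
// The option names are assumed for illustration.
fn configure(target: &str) -> Builder {
    let mut builder = Builder::new();

    // Common settings, written a single time for every target.
    builder
        .define("BUILD_SHARED_LIBS", "OFF")
        .define("ZLIB_COMPAT", "ON");

    // Target-specific settings, applied to the same builder variable.
    if target.starts_with("s390x") {
        builder
            .define("WITH_DFLTCC_DEFLATE", "ON")
            .define("WITH_DFLTCC_INFLATE", "ON");
    }

    builder
}
```

Calling `configure("s390x-unknown-linux-gnu")` records the two common options plus the two conditional ones, while any other target records only the common pair. The common settings appear exactly once in the source.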
Signed-off-by: Valdemar Erk <valdemar@erk.io>
Sorry for the wait. I have made the changes you requested and also run a few benchmarks, which show up to a 20x speedup in decompression and up to a 220x speedup in compression (at levels 1-6) when used through Flate2.
Flate2 default
Running unittests (target/release/deps/bench-58c514034e2bbc05)
Gnuplot not found, using plotters backend
uncompressed: 3266560 bytes
compression/flate2-1.pack
time: [26.246 ms 26.259 ms 26.271 ms]
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high severe
flate2-1: 1425349 bytes
compression/flate2-1.unpack
time: [16.255 ms 16.284 ms 16.349 ms]
Found 2 outliers among 10 measurements (20.00%)
1 (10.00%) high mild
1 (10.00%) high severe
compression/flate2-2.pack
time: [38.978 ms 38.990 ms 39.005 ms]
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high mild
flate2-2: 1194487 bytes
compression/flate2-2.unpack
time: [14.195 ms 14.203 ms 14.209 ms]
compression/flate2-3.pack
time: [58.104 ms 58.130 ms 58.163 ms]
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high severe
flate2-3: 1111333 bytes
compression/flate2-3.unpack
time: [13.354 ms 13.402 ms 13.463 ms]
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high severe
compression/flate2-4.pack
time: [69.797 ms 70.081 ms 70.445 ms]
flate2-4: 1099059 bytes
compression/flate2-4.unpack
time: [13.618 ms 13.645 ms 13.704 ms]
compression/flate2-5.pack
time: [84.634 ms 84.793 ms 84.965 ms]
flate2-5: 1082945 bytes
compression/flate2-5.unpack
time: [13.397 ms 13.430 ms 13.498 ms]
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high severe
Benchmarking compression/flate2-6.pack: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 6.7sor enable flat sampling.
compression/flate2-6.pack
time: [121.21 ms 121.34 ms 121.48 ms]
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high mild
flate2-6: 1071897 bytes
compression/flate2-6.unpack
time: [13.205 ms 13.213 ms 13.218 ms]
Benchmarking compression/flate2-7.pack: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 7.8sor enable flat sampling.
compression/flate2-7.pack
time: [141.34 ms 141.54 ms 142.03 ms]
flate2-7: 1068897 bytes
compression/flate2-7.unpack
time: [13.118 ms 13.163 ms 13.225 ms]
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high severe
Benchmarking compression/flate2-8.pack: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 8.9sor enable flat sampling.
compression/flate2-8.pack
time: [161.49 ms 161.63 ms 161.77 ms]
flate2-8: 1066961 bytes
compression/flate2-8.unpack
time: [13.051 ms 13.063 ms 13.074 ms]
Found 2 outliers among 10 measurements (20.00%)
2 (20.00%) high mild
Zlib-ng
Running unittests (target/release/deps/bench-75831b7ab974fa31)
WARNING: HTML report generation will become a non-default optional feature in Criterion.rs 0.4.0.
This feature is being moved to cargo-criterion (https://github.com/bheisler/cargo-criterion) and will be optional in a future version of Criterion.rs. To silence this warning, either switch to cargo-criterion or enable the 'html_reports' feature in your Cargo.toml.
Gnuplot not found, using plotters backend
uncompressed: 3266560 bytes
compression/flate2-1.pack
time: [61.815 ms 62.016 ms 62.230 ms]
flate2-1: 1658487 bytes
compression/flate2-1.unpack
time: [9.6884 ms 9.7319 ms 9.7932 ms]
compression/flate2-2.pack
time: [51.364 ms 51.525 ms 51.695 ms]
Found 2 outliers among 10 measurements (20.00%)
1 (10.00%) high mild
1 (10.00%) high severe
flate2-2: 1179064 bytes
compression/flate2-2.unpack
time: [9.0035 ms 9.0360 ms 9.0715 ms]
compression/flate2-3.pack
time: [61.177 ms 61.415 ms 61.680 ms]
flate2-3: 1132397 bytes
compression/flate2-3.unpack
time: [8.5760 ms 8.5807 ms 8.5847 ms]
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high severe
compression/flate2-4.pack
time: [70.395 ms 71.033 ms 71.700 ms]
flate2-4: 1086307 bytes
compression/flate2-4.unpack
time: [8.6578 ms 8.6793 ms 8.7178 ms]
Found 4 outliers among 10 measurements (40.00%)
1 (10.00%) low severe
1 (10.00%) low mild
2 (20.00%) high severe
compression/flate2-5.pack
time: [76.064 ms 76.686 ms 77.311 ms]
flate2-5: 1078796 bytes
compression/flate2-5.unpack
time: [8.4941 ms 8.5150 ms 8.5521 ms]
Found 2 outliers among 10 measurements (20.00%)
1 (10.00%) high mild
1 (10.00%) high severe
compression/flate2-6.pack
time: [80.999 ms 81.256 ms 81.548 ms]
Found 2 outliers among 10 measurements (20.00%)
1 (10.00%) low mild
1 (10.00%) high mild
flate2-6: 1075222 bytes
compression/flate2-6.unpack
time: [8.3507 ms 8.3664 ms 8.3981 ms]
Benchmarking compression/flate2-7.pack: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 5.7s or enable flat sampling.
compression/flate2-7.pack
time: [101.49 ms 102.26 ms 103.42 ms]
flate2-7: 1065662 bytes
compression/flate2-7.unpack
time: [8.7837 ms 8.8895 ms 9.0759 ms]
Benchmarking compression/flate2-8.pack: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 6.3s or enable flat sampling.
compression/flate2-8.pack
time: [113.64 ms 114.89 ms 116.08 ms]
flate2-8: 1063153 bytes
compression/flate2-8.unpack
time: [8.8862 ms 8.9613 ms 9.0268 ms]
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) low mild
Zlib-ng + my patch without hardware accelerated compression (still hw acceleration on level 1 compression)
Running unittests (target/release/deps/bench-8c415de9634e6f9f)
WARNING: HTML report generation will become a non-default optional feature in Criterion.rs 0.4.0.
This feature is being moved to cargo-criterion (https://github.com/bheisler/cargo-criterion) and will be optional in a future version of Criterion.rs. To silence this warning, either switch to cargo-criterion or enable the 'html_reports' feature in your Cargo.toml.
Gnuplot not found, using plotters backend
uncompressed: 3266560 bytes
compression/flate2-1.pack
time: [235.49 us 235.93 us 236.29 us]
change: [-99.621% -99.619% -99.618%] (p = 0.00 < 0.05)
Performance has improved.
flate2-1: 1470182 bytes
compression/flate2-1.unpack
time: [411.82 us 412.07 us 412.25 us]
change: [-95.782% -95.762% -95.745%] (p = 0.00 < 0.05)
Performance has improved.
compression/flate2-2.pack
time: [51.435 ms 51.526 ms 51.677 ms]
change: [-0.1061% +0.2418% +0.5505%] (p = 0.19 > 0.05)
No change in performance detected.
flate2-2: 1179064 bytes
compression/flate2-2.unpack
time: [454.76 us 454.91 us 455.04 us]
change: [-94.987% -94.969% -94.953%] (p = 0.00 < 0.05)
Performance has improved.
compression/flate2-3.pack
time: [61.548 ms 61.742 ms 61.872 ms]
change: [+0.0001% +0.3675% +0.7466%] (p = 0.08 > 0.05)
No change in performance detected.
flate2-3: 1132397 bytes
compression/flate2-3.unpack
time: [440.53 us 440.96 us 441.98 us]
change: [-94.872% -94.860% -94.842%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high severe
compression/flate2-4.pack
time: [69.412 ms 69.624 ms 69.774 ms]
change: [-2.5619% -1.7645% -0.9637%] (p = 0.00 < 0.05)
Change within noise threshold.
flate2-4: 1086307 bytes
compression/flate2-4.unpack
time: [430.13 us 430.38 us 430.67 us]
change: [-95.074% -95.034% -94.994%] (p = 0.00 < 0.05)
Performance has improved.
compression/flate2-5.pack
time: [73.149 ms 73.270 ms 73.480 ms]
change: [-5.9472% -4.9114% -3.9668%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high mild
flate2-5: 1078796 bytes
compression/flate2-5.unpack
time: [428.64 us 429.26 us 429.85 us]
change: [-94.993% -94.966% -94.943%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 10 measurements (20.00%)
2 (20.00%) high mild
compression/flate2-6.pack
time: [78.893 ms 78.966 ms 79.063 ms]
change: [-3.3955% -2.7469% -2.0639%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high severe
flate2-6: 1075222 bytes
compression/flate2-6.unpack
time: [426.40 us 426.51 us 426.62 us]
change: [-94.951% -94.926% -94.905%] (p = 0.00 < 0.05)
Performance has improved.
Benchmarking compression/flate2-7.pack: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 5.3s or enable flat sampling.
compression/flate2-7.pack
time: [96.995 ms 97.125 ms 97.243 ms]
change: [-6.2515% -5.5200% -4.7046%] (p = 0.00 < 0.05)
Performance has improved.
flate2-7: 1065662 bytes
compression/flate2-7.unpack
time: [423.29 us 423.43 us 423.71 us]
change: [-95.305% -95.242% -95.183%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high mild
Benchmarking compression/flate2-8.pack: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 6.1s or enable flat sampling.
compression/flate2-8.pack
time: [110.04 ms 110.24 ms 110.36 ms]
change: [-4.9203% -4.2248% -3.5290%] (p = 0.00 < 0.05)
Performance has improved.
flate2-8: 1063153 bytes
compression/flate2-8.unpack
time: [422.59 us 422.76 us 422.93 us]
change: [-95.308% -95.267% -95.220%] (p = 0.00 < 0.05)
Performance has improved.
Zlib-ng + my patch + hardware accelerated compression
Running unittests (target/release/deps/bench-8c415de9634e6f9f)
WARNING: HTML report generation will become a non-default optional feature in Criterion.rs 0.4.0.
This feature is being moved to cargo-criterion (https://github.com/bheisler/cargo-criterion) and will be optional in a future version of Criterion.rs. To silence this warning, either switch to cargo-criterion or enable the 'html_reports' feature in your Cargo.toml.
Gnuplot not found, using plotters backend
uncompressed: 3266560 bytes
compression/flate2-1.pack
time: [234.29 us 234.36 us 234.42 us]
change: [-0.4435% -0.3905% -0.3425%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high mild
flate2-1: 1470182 bytes
compression/flate2-1.unpack
time: [413.68 us 414.00 us 414.21 us]
change: [+0.2569% +0.4367% +0.5602%] (p = 0.00 < 0.05)
Change within noise threshold.
compression/flate2-2.pack
time: [234.32 us 234.41 us 234.52 us]
change: [-99.560% -99.559% -99.558%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high mild
flate2-2: 1470182 bytes
compression/flate2-2.unpack
time: [413.79 us 413.95 us 414.14 us]
change: [-9.5157% -9.4611% -9.4157%] (p = 0.00 < 0.05)
Performance has improved.
compression/flate2-3.pack
time: [234.39 us 234.46 us 234.56 us]
change: [-99.624% -99.622% -99.621%] (p = 0.00 < 0.05)
Performance has improved.
flate2-3: 1470182 bytes
compression/flate2-3.unpack
time: [413.51 us 413.75 us 413.96 us]
change: [-6.1808% -6.0194% -5.7633%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high severe
compression/flate2-4.pack
time: [234.36 us 234.45 us 234.55 us]
change: [-99.661% -99.661% -99.660%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high mild
flate2-4: 1470182 bytes
compression/flate2-4.unpack
time: [411.54 us 411.80 us 412.04 us]
change: [-4.3757% -4.3002% -4.2225%] (p = 0.00 < 0.05)
Performance has improved.
compression/flate2-5.pack
time: [234.28 us 234.32 us 234.40 us]
change: [-99.681% -99.680% -99.680%] (p = 0.00 < 0.05)
Performance has improved.
flate2-5: 1470182 bytes
compression/flate2-5.unpack
time: [411.24 us 411.41 us 411.58 us]
change: [-4.4804% -4.2792% -4.1013%] (p = 0.00 < 0.05)
Performance has improved.
Found 2 outliers among 10 measurements (20.00%)
2 (20.00%) high mild
compression/flate2-6.pack
time: [234.23 us 234.40 us 234.56 us]
change: [-99.704% -99.703% -99.703%] (p = 0.00 < 0.05)
Performance has improved.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high mild
flate2-6: 1470182 bytes
compression/flate2-6.unpack
time: [411.42 us 411.49 us 411.57 us]
change: [-3.5549% -3.5303% -3.5062%] (p = 0.00 < 0.05)
Performance has improved.
Benchmarking compression/flate2-7.pack: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 5.3s or enable flat sampling.
compression/flate2-7.pack
time: [95.855 ms 95.921 ms 96.015 ms]
change: [-1.2030% -0.8535% -0.3794%] (p = 0.00 < 0.05)
Change within noise threshold.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high mild
flate2-7: 1065662 bytes
compression/flate2-7.unpack
time: [425.18 us 425.33 us 425.44 us]
change: [+0.1861% +0.3293% +0.4405%] (p = 0.00 < 0.05)
Change within noise threshold.
Benchmarking compression/flate2-8.pack: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 6.0s or enable flat sampling.
compression/flate2-8.pack
time: [108.34 ms 108.43 ms 108.59 ms]
change: [-1.6135% -1.3806% -1.1612%] (p = 0.00 < 0.05)
Performance has improved.
flate2-8: 1063153 bytes
compression/flate2-8.unpack
time: [424.46 us 424.74 us 425.09 us]
change: [+0.4369% +0.5072% +0.5841%] (p = 0.00 < 0.05)
Change within noise threshold.
GZP
I also tested it with https://github.com/sstadick/gzp, and the results were surprising. Using the same 100-times-Shakespeare file as the other benchmarks on the site (and 2 threads, as I am limited to that on my VPS), I ran the benchmarks for gzip.
GZP Zlib-ng
Benchmarking Compression/Gzip/2: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 129.0s.
Compression/Gzip/2 time: [12.898 s 12.913 s 12.936 s]
change: [+9943.5% +9963.3% +9984.9%] (p = 0.00 < 0.05)
Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high severe
Benchmarking Compression/Gzip Only: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 128.4s.
Compression/Gzip Only time: [12.802 s 12.875 s 12.975 s]
change: [+10080% +10140% +10227%] (p = 0.00 < 0.05)
Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high severe
GZP Zlib-ng + my patch
Compression/Gzip/2 time: [890.24 ms 928.41 ms 969.07 ms]
change: [-93.093% -92.810% -92.544%] (p = 0.00 < 0.05)
Performance has improved.
Benchmarking Compression/Gzip Only: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 5.6s.
Compression/Gzip Only time: [558.82 ms 559.74 ms 560.81 ms]
change: [-95.686% -95.652% -95.627%] (p = 0.00 < 0.05)
Performance has improved.
They show that GZP with 2 threads takes noticeably longer than flate2 on 1 core for the same input (about 928 ms versus about 560 ms), so there must be some dependency that breaks when it is used from more than one thread at a time.
Misc
Related: It could be nice to get zlib-ng updated for this commit: zlib-ng/zlib-ng@0573840
Docs: https://github.com/zlib-ng/zlib-ng/tree/develop/arch/s390
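For anyone who wants to recompute speedup ratios from the Criterion output above, here is a small sketch. It takes the level 1 median timings from the "Zlib-ng" run and the "Zlib-ng + my patch" run and divides them. The function and constant names are made up for this example, and choosing unpatched zlib-ng at level 1 as the baseline is an assumption. Other baselines or levels give different ratios.

```rust
// Speedup of `after` relative to `before`; both must use the same time unit.
fn speedup(before_us: f64, after_us: f64) -> f64 {
    before_us / after_us
}

// Median timings copied from the benchmark logs, converted to microseconds.
const ZLIB_NG_L1_PACK_US: f64 = 62_016.0; // Zlib-ng, flate2-1.pack: 62.016 ms
const PATCHED_L1_PACK_US: f64 = 235.93; // Zlib-ng + my patch, flate2-1.pack
const ZLIB_NG_L1_UNPACK_US: f64 = 9_731.9; // Zlib-ng, flate2-1.unpack: 9.7319 ms
const PATCHED_L1_UNPACK_US: f64 = 412.07; // Zlib-ng + my patch, flate2-1.unpack

// Returns (compression speedup, decompression speedup) for level 1.
fn level1_speedups() -> (f64, f64) {
    (
        speedup(ZLIB_NG_L1_PACK_US, PATCHED_L1_PACK_US),
        speedup(ZLIB_NG_L1_UNPACK_US, PATCHED_L1_UNPACK_US),
    )
}
```

The same ratio can be read off Criterion's own `change:` line: a change of -99.619% means the new time is about 0.381% of the old one.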