snappy comparison with upstream needs updating #54

Closed
nigeltao opened this issue Apr 23, 2016 · 1 comment

Comments

@nigeltao
Contributor

nigeltao commented Apr 23, 2016

You are obviously free to respond however you want, but FYI the upstream github.com/golang/snappy encoder now implements an asm version of what you call matchLenSSE4, and it (call this "new") now compares favorably with the github.com/klauspost/compress/snappy version (call this "old"):

benchmark                     old MB/s     new MB/s     speedup
BenchmarkWordsEncode1e1-8     153.77       673.38       4.38x
BenchmarkWordsEncode1e2-8     217.81       428.78       1.97x
BenchmarkWordsEncode1e3-8     282.31       446.89       1.58x
BenchmarkWordsEncode1e4-8     225.73       315.17       1.40x
BenchmarkWordsEncode1e5-8     158.92       267.72       1.68x
BenchmarkWordsEncode1e6-8     206.50       311.30       1.51x
BenchmarkRandomEncode-8       4055.50      14507.66     3.58x
Benchmark_ZFlat0-8            481.82       791.69       1.64x
Benchmark_ZFlat1-8            190.36       434.39       2.28x
Benchmark_ZFlat2-8            6436.37      16301.77     2.53x
Benchmark_ZFlat3-8            368.55       632.13       1.72x
Benchmark_ZFlat4-8            3257.82      7990.39      2.45x
Benchmark_ZFlat5-8            474.40       764.96       1.61x
Benchmark_ZFlat6-8            183.83       280.09       1.52x
Benchmark_ZFlat7-8            170.28       262.54       1.54x
Benchmark_ZFlat8-8            190.70       298.19       1.56x
Benchmark_ZFlat9-8            158.43       247.14       1.56x
Benchmark_ZFlat10-8           581.40       1028.24      1.77x
Benchmark_ZFlat11-8           310.57       408.89       1.32x

For the record, here's the -tags=noasm comparison. The numbers are worse for some small inputs (WordsEncode1e2 at 0.39x and WordsEncode1e3 at 0.94x) and for ZFlat3 (0.55x), but better for large inputs, which I'd argue is still a net improvement:

benchmark                     old MB/s     new MB/s     speedup
BenchmarkWordsEncode1e1-8     140.02       677.54       4.84x
BenchmarkWordsEncode1e2-8     224.74       86.86        0.39x
BenchmarkWordsEncode1e3-8     274.82       258.34       0.94x
BenchmarkWordsEncode1e4-8     189.95       244.60       1.29x
BenchmarkWordsEncode1e5-8     140.10       185.91       1.33x
BenchmarkWordsEncode1e6-8     169.03       211.16       1.25x
BenchmarkRandomEncode-8       3746.11      13192.30     3.52x
Benchmark_ZFlat0-8            357.12       430.88       1.21x
Benchmark_ZFlat1-8            181.27       276.50       1.53x
Benchmark_ZFlat2-8            5959.15      14075.70     2.36x
Benchmark_ZFlat3-8            312.09       171.85       0.55x
Benchmark_ZFlat4-8            2008.62      3111.51      1.55x
Benchmark_ZFlat5-8            357.46       425.45       1.19x
Benchmark_ZFlat6-8            155.59       189.98       1.22x
Benchmark_ZFlat7-8            149.70       182.01       1.22x
Benchmark_ZFlat8-8            160.04       199.81       1.25x
Benchmark_ZFlat9-8            140.87       175.73       1.25x
Benchmark_ZFlat10-8           415.88       509.88       1.23x
Benchmark_ZFlat11-8           236.50       274.77       1.16x

In any case, the regular build (without -tags=noasm) is faster with upstream snappy on every benchmark in this limited set.

@klauspost
Owner

Snappy replaced with upstream.
