Remove obsolete SIMD code #57

Merged
merged 6 commits on Feb 22, 2021

klauspost
Contributor

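Probably due to compiler optimizations, the pure Go code is now faster than the SSSE3/AVX/AVX2 assembly code. In the benchmarks below it reaches roughly 20% higher throughput for inputs of 1K and larger: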

```
BenchmarkHash/GEN_/8Bytes-32    	 6486468	       184 ns/op	  43.46 MB/s
BenchmarkHash/GEN_/1K-32        	  545470	      2172 ns/op	 471.36 MB/s
BenchmarkHash/GEN_/8K-32        	   74073	     16106 ns/op	 508.64 MB/s
BenchmarkHash/GEN_/1M-32        	     584	   2034247 ns/op	 515.46 MB/s
BenchmarkHash/GEN_/5M-32        	     100	  10190003 ns/op	 514.51 MB/s
BenchmarkHash/GEN_/10M-32       	      56	  20357139 ns/op	 515.09 MB/s

BenchmarkHash/AVX2/8Bytes-32    	 5263258	       226 ns/op	  35.44 MB/s
BenchmarkHash/AVX2/1K-32        	  444441	      2633 ns/op	 388.98 MB/s
BenchmarkHash/AVX2/8K-32        	   61855	     19513 ns/op	 419.81 MB/s
BenchmarkHash/AVX2/1M-32        	     487	   2462013 ns/op	 425.90 MB/s
BenchmarkHash/AVX2/5M-32        	      91	  12384626 ns/op	 423.34 MB/s
BenchmarkHash/AVX2/10M-32       	      44	  26636364 ns/op	 393.66 MB/s

BenchmarkHash/AVX_/8Bytes-32    	 6349206	       188 ns/op	  42.54 MB/s
BenchmarkHash/AVX_/1K-32        	  461538	      2620 ns/op	 390.91 MB/s
BenchmarkHash/AVX_/8K-32        	   61224	     19567 ns/op	 418.65 MB/s
BenchmarkHash/AVX_/1M-32        	     484	   2473140 ns/op	 423.99 MB/s
BenchmarkHash/AVX_/5M-32        	      99	  12505052 ns/op	 419.26 MB/s
BenchmarkHash/AVX_/10M-32       	      46	  24869557 ns/op	 421.63 MB/s

BenchmarkHash/SSSE/8Bytes-32    	 6282679	       192 ns/op	  41.71 MB/s
BenchmarkHash/SSSE/1K-32        	  461614	      2628 ns/op	 389.69 MB/s
BenchmarkHash/SSSE/8K-32        	   60913	     19651 ns/op	 416.88 MB/s
BenchmarkHash/SSSE/1M-32        	     481	   2488563 ns/op	 421.36 MB/s
BenchmarkHash/SSSE/5M-32        	      91	  12516477 ns/op	 418.88 MB/s
BenchmarkHash/SSSE/10M-32       	      46	  24869561 ns/op	 421.63 MB/s
```

Remove obsolete and slower code. Simplify CPUID code.
@harshavardhana harshavardhana merged commit 6a57409 into minio:master Feb 22, 2021
@klauspost klauspost deleted the remove-obsolete-code branch February 23, 2021 10:35

mvdan commented Feb 25, 2021

For those of you running Go master/tip, this PR and v1.0.0 fixed the following panic:

traceback: unexpected SPWRITE function github.com/minio/sha256-simd.blockAvx2
fatal error: traceback

Hopefully the error mention makes it easier to find the solution by googling :)

MichaelEischer added a commit to MichaelEischer/restic that referenced this pull request Jul 10, 2021
Apparently the standard Go sha256 implementation is now faster than the
assembly implementation. The library now only adds support for SHA
extensions available in some processors.

See minio/sha256-simd#57 for more details.
mfrischknecht pushed a commit to mfrischknecht/restic that referenced this pull request Jun 14, 2022
Apparently the standard Go sha256 implementation is now faster than the
assembly implementation. The library now only adds support for SHA
extensions available in some processors.

See minio/sha256-simd#57 for more details.