Optimize scalar and avx2 implementations #29

Merged 1 commit into minio:master on Oct 4, 2021

klauspost
Contributor

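Use the dependency shortcut in the G function (https://github.com/animetosho/md5-optimisation#dependency-shortcut-in-g-function) and H function re-use (https://github.com/animetosho/md5-optimisation#h-function-re-use) to shorten the dependency chain in the scalar and AVX2 implementations.

Clean up the AVX2 code by removing superfluous loads and moves.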

```
benchmark                              old ns/op     new ns/op     delta
BenchmarkAvx2/32KB-32                  201378        194863        -3.24%
BenchmarkAvx2/64KB-32                  321507        303803        -5.51%
BenchmarkAvx2/128KB-32                 594137        577175        -2.85%
BenchmarkAvx2/256KB-32                 1089630       1014160       -6.93%
BenchmarkAvx2/512KB-32                 2077582       1959077       -5.70%
BenchmarkAvx2/1MB-32                   4191188       3949610       -5.76%
BenchmarkAvx2/2MB-32                   8439181       8042106       -4.71%
BenchmarkAvx2/4MB-32                   16655067      15739187      -5.50%
BenchmarkAvx2/8MB-32                   33017781      31620932      -4.23%
BenchmarkAvx2SingleWriter/32KB-32      41765         39763         -4.79%
BenchmarkAvx2SingleWriter/64KB-32      81884         76866         -6.13%
BenchmarkAvx2SingleWriter/128KB-32     166802        155819        -6.58%
BenchmarkAvx2SingleWriter/256KB-32     329145        306292        -6.94%
BenchmarkAvx2SingleWriter/512KB-32     653422        616564        -5.64%
BenchmarkAvx2SingleWriter/1MB-32       1303555       1237368       -5.08%
BenchmarkAvx2SingleWriter/2MB-32       2596346       2441836       -5.95%
BenchmarkAvx2SingleWriter/4MB-32       5151380       4885766       -5.16%
BenchmarkAvx2SingleWriter/8MB-32       10324461      9765875       -5.41%
```
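
For readers unfamiliar with the two linked optimisations, here is a minimal sketch in plain Go of the identities they appear to rely on. It is not the assembly merged in this PR: the function names (`stepG`, `stepGShortcut`, `twoStepsH`, `twoStepsHReuse`) and the message words in `main` are made up for illustration, and the description of each trick is my reading of the linked notes.

```go
package main

import (
	"fmt"
	"math/bits"
)

// gRef is the textbook MD5 G function: take bits of b where d is set, else c.
func gRef(b, c, d uint32) uint32 { return (b & d) | (c &^ d) }

// stepG is a textbook round-2 step: b + rotl(a + G(b,c,d) + m + k, s).
func stepG(a, b, c, d, m, k uint32, s int) uint32 {
	return b + bits.RotateLeft32(a+gRef(b, c, d)+m+k, s)
}

// stepGShortcut gives the same result. (b & d) and (c &^ d) never share a set
// bit, so the OR can become an ADD, and the c &^ d term can be summed before
// b (the value produced by the previous step) is available.
func stepGShortcut(a, b, c, d, m, k uint32, s int) uint32 {
	early := a + m + k + (c &^ d) // independent of b
	return b + bits.RotateLeft32(early+(b&d), s)
}

// stepH is a textbook round-3 step with H(b,c,d) = b ^ c ^ d.
func stepH(a, b, c, d, m, k uint32, s int) uint32 {
	return b + bits.RotateLeft32(a+(b^c^d)+m+k, s)
}

// twoStepsH runs two consecutive textbook round-3 steps. After the first step
// the registers rotate: (a, b, c, d) becomes (d, b1, b, c).
func twoStepsH(a, b, c, d, m0, k0, m1, k1 uint32, s0, s1 int) (uint32, uint32) {
	b1 := stepH(a, b, c, d, m0, k0, s0)
	b2 := stepH(d, b1, b, c, m1, k1, s1)
	return b1, b2
}

// twoStepsHReuse keeps the first H result and patches it for the second step:
// (b^c^d) ^ d = b^c, and XORing in b1 then yields H(b1, b, c).
func twoStepsHReuse(a, b, c, d, m0, k0, m1, k1 uint32, s0, s1 int) (uint32, uint32) {
	h := b ^ c ^ d
	b1 := b + bits.RotateLeft32(a+h+m0+k0, s0)
	h ^= d  // drop the stale input; does not depend on b1
	h ^= b1 // bring in the newly computed value
	b2 := b1 + bits.RotateLeft32(d+h+m1+k1, s1)
	return b1, b2
}

func main() {
	// MD5 initial state; message words m0 and m1 are arbitrary illustrative values.
	a, b, c, d := uint32(0x67452301), uint32(0xefcdab89), uint32(0x98badcfe), uint32(0x10325476)
	m0, m1 := uint32(0x01234567), uint32(0x89abcdef)

	gSame := stepG(a, b, c, d, m0, 0xf61e2562, 5) == stepGShortcut(a, b, c, d, m0, 0xf61e2562, 5)
	fmt.Println("G shortcut matches textbook step:", gSame)

	r1, r2 := twoStepsH(a, b, c, d, m0, 0xfffa3942, m1, 0x8771f681, 4, 11)
	x1, x2 := twoStepsHReuse(a, b, c, d, m0, 0xfffa3942, m1, 0x8771f681, 4, 11)
	fmt.Println("H re-use matches textbook steps:", r1 == x1 && r2 == x2)
}
```

In both cases the result is unchanged; the gain is that less work sits on the path that waits for the previous step's output, which is consistent with the few-percent improvement in the benchmark above.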
@harshavardhana merged commit 550e0bc into minio:master on Oct 4, 2021
@klauspost deleted the optimize-scaler-avx2 branch on October 5, 2021, 10:21