x/crypto/blake2b: very low performance for AVX and AVX2 code #18563

aead · 2017-01-07T22:25:35Z

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (`go version`)?

1.7.*

What operating system and processor architecture are you using (`go env`)?

amd64/linux on Intel(R) Xeon(R) CPU E5-2620 v3 @ 2.40GHz

What did you do?

`go test -bench=Benchmark` for x/crypto/blake2b

What did you expect to see?

BenchmarkWrite128-12              500000              2084 ns/op          61.40 MB/s
BenchmarkWrite1K-12               100000             16227 ns/op          63.10 MB/s
BenchmarkSum128-12               1000000              2107 ns/op          60.74 MB/s
BenchmarkSum1K-12                 100000             16312 ns/op          62.77 MB/s
PASS
ok      golang.org/x/crypto/blake2b     7.647s
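
As a reading aid for the MB/s column: Go's `testing` package reports throughput as bytes processed per second divided by 1e6. The sketch below recomputes that figure from the ns/op values above. The helper name `throughputMBps` is hypothetical (it is not part of `testing`), and the results can differ from the printed column in the last digit because the ns/op values shown are already rounded.

```go
package main

import "fmt"

// throughputMBps is a hypothetical helper: it converts a per-operation
// payload size and a per-operation time into MB/s (1 MB = 1e6 bytes),
// which is the unit used in the benchmark output.
func throughputMBps(bytesPerOp int, nsPerOp float64) float64 {
	// bytes/ns * 1e9 ns/s / 1e6 bytes/MB == bytes/ns * 1e3
	return float64(bytesPerOp) / nsPerOp * 1e3
}

func main() {
	// Payload sizes are taken from the benchmark names (128 B and 1 KiB);
	// ns/op values are copied from the output in this issue.
	results := []struct {
		name string
		size int
		ns   float64
	}{
		{"Write128", 128, 2084},
		{"Write1K", 1024, 16227},
		{"Sum128", 128, 2107},
		{"Sum1K", 1024, 16312},
	}
	for _, r := range results {
		fmt.Printf("%-10s %7.2f MB/s\n", r.name, throughputMBps(r.size, r.ns))
	}
}
```

All four results land between 60 and 64 MB/s, more than an order of magnitude below the roughly 800 MB/s measured on the i7-6500U.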

What did you see instead?

Performance of about 800 MB/s, as on an i7-6500U


aead · 2017-01-07T23:49:24Z

Replacing `PINSRQ` with `VPINSRQ` and adding `VZEROUPPER` doesn't change anything...

gopherbot · 2017-01-09T22:10:35Z

CL https://golang.org/cl/34993 mentions this issue.

Ref golang/go#18563

On some amd64 CPUs (Xeon E5-2680v4 / E5-2620v3) using SSE and AVX instructions leads to very low performance. On a i7-6500U the SSE-AVX code performs following:

AVX2:
name        time/op
Write128-4   165ns ± 0%
Write1K-4   1.20µs ± 0%
Sum128-4     189ns ± 1%
Sum1K-4     1.22µs ± 0%

name        speed
Write128-4  773MB/s ± 1%
Write1K-4   855MB/s ± 0%
Sum128-4    675MB/s ± 1%
Sum1K-4     838MB/s ± 0%

while the same code achieves values < 65MB/s on a Xeon E5-2620v3.

Replacing the `MOVQ` and `PINSRQ` with the AVX instructions `VMOVQ` and `VPINSRQ` increases the performance of the AVX/AVX2 code to some expected values:

name         old time/op   new time/op   delta
Write128-12  2.20µs ±10%   0.22µs ± 9%   -90.00%  (p=0.029 n=4+4)
Write1K-12   16.2µs ± 0%    1.1µs ± 0%   -93.07%  (p=0.029 n=4+4)
Sum128-12    2.10µs ± 0%   0.22µs ± 0%   -89.47%  (p=0.029 n=4+4)
Sum1K-12     16.3µs ± 0%    1.2µs ± 0%   -92.65%  (p=0.029 n=4+4)

name         old speed      new speed       delta
Write128-12  58.5MB/s ±10%  582.8MB/s ±10%   +897.08%  (p=0.029 n=4+4)
Write1K-12   63.1MB/s ± 0%  909.8MB/s ± 0%  +1341.40%  (p=0.029 n=4+4)
Sum128-12    60.8MB/s ± 0%  576.3MB/s ± 0%   +847.84%  (p=0.029 n=4+4)
Sum1K-12     62.8MB/s ± 0%  855.2MB/s ± 0%  +1260.78%  (p=0.029 n=4+4)

The AVX/AVX2 code now uses only AVX (no SSE) instructions.

Fixes golang/go#18563.

Change-Id: I1961dd8fa02014642587523b7f099816a263c9f5
Reviewed-on: https://go-review.googlesource.com/34993
Reviewed-by: Adam Langley <agl@golang.org>
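
For readers unfamiliar with the delta column in the commit message: it is the relative change from the old to the new measurement. The sketch below recomputes it from the rounded old/new values quoted above. The helper name `deltaPercent` is hypothetical, and because the reported deltas appear to be derived from unrounded measurements, the recomputed figures only approximate the quoted ones.

```go
package main

import "fmt"

// deltaPercent is a hypothetical helper that returns the relative change
// from old to new as a percentage. Negative means the value decreased.
func deltaPercent(old, new float64) float64 {
	return (new - old) / old * 100
}

func main() {
	// Rounded values copied from the commit message.
	// Time per operation in µs for Write128: 2.20 -> 0.22.
	fmt.Printf("Write128 time:  %+.2f%%\n", deltaPercent(2.20, 0.22))
	// Throughput in MB/s for Write1K: 63.1 -> 909.8.
	fmt.Printf("Write1K speed:  %+.2f%%\n", deltaPercent(63.1, 909.8))
}
```

A time delta near -90% and a speed delta near +900% describe the same change: roughly a tenfold speedup on the Xeon E5-2620v3.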

rakyll changed the title ~~blake2b: very low performance for AVX and AVX2 code~~ x/crypto/blake2b: very low performance for AVX and AVX2 code Jan 7, 2017

rakyll added this to the Unreleased milestone Jan 7, 2017

bradfitz added the Performance label Jan 7, 2017

harshavardhana mentioned this issue Jan 11, 2017

api: GetObject() transfer rate observed to have TTFB (Time To First Byte) ~5s delay minio/minio#3559

Closed

harshavardhana added a commit to minio/minio that referenced this issue Jan 25, 2017

Move to blake2b-simd due to perf problems in golang.org/x/crypto

d41dcb7

Ref golang/go#18563

gopherbot closed this as completed in golang/crypto@f671756 Feb 8, 2017

golang locked and limited conversation to collaborators Feb 8, 2018

gopherbot added the FrozenDueToAge label Feb 8, 2018
