LoongArch: chacha20 speed has huge performance degraded #23300
Comments
I cannot reproduce this on a Gentoo 3A6000 machine.
The gcc I used didn't support
According to https://sourceware.org/git/?p=glibc.git;a=commit;h=672b91ba1060887aa8897d0b98af83b96d4a52b0 it seems glibc's hwcap support has been reverted, which would prevent the chacha20 LSX asm path from being called. But the kernel's hwcap bit should not be erased, so why does this happen?
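To check what the kernel itself reports, independent of any glibc-side hwcap handling, one can read the auxiliary vector directly. The following is a minimal sketch, assuming Linux with glibc's `getauxval`; the bit value for LSX is an assumption taken from the Linux LoongArch `<asm/hwcap.h>` header, and the helper names `hwcap_has_lsx` and `report_lsx` are hypothetical.

```c
#include <stdio.h>
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP (Linux, glibc) */

/* Assumption: LSX is bit 4 of AT_HWCAP on LoongArch Linux. Where the
 * kernel header is available, prefer HWCAP_LOONGARCH_LSX from <asm/hwcap.h>. */
#define LSX_HWCAP_BIT (1UL << 4)

/* Hypothetical helper: returns 1 if the given hwcap word has the LSX bit set. */
static int hwcap_has_lsx(unsigned long hwcap)
{
    return (hwcap & LSX_HWCAP_BIT) != 0;
}

/* Hypothetical helper: prints the kernel-provided hwcap word and the LSX bit.
 * The result is only meaningful on a LoongArch machine; on other
 * architectures bit 4 means something else. */
static void report_lsx(void)
{
    unsigned long hwcap = getauxval(AT_HWCAP);
    printf("AT_HWCAP = 0x%lx, LSX bit %s\n",
           hwcap, hwcap_has_lsx(hwcap) ? "set" : "clear");
}
```

Calling `report_lsx()` from a small test program on the affected machine would show whether the kernel still advertises LSX.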
Phew. I'm too stupid :(.
The regression was introduced in PR #22817. In that pull request, the input length check was moved forward, but the related ori instruction was left out, so input of any length falls through to the much slower scalar implementation.
Fixes #23300
CLA: trivial
Reviewed-by: Shane Lontis <shane.lontis@oracle.com>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
(Merged from #23301)
(cherry picked from commit 9710285)
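The effect described in the commit message can be modelled in a few lines of C. This is only an illustrative sketch, not the actual assembly: it assumes the missing ori instruction was the one loading the length threshold into a register, and the names `dispatch`, `threshold_reg` and the threshold value used in the usage note are hypothetical.

```c
#include <stddef.h>

/* Which implementation the dispatcher ends up selecting. */
enum path { PATH_NONE, PATH_SCALAR, PATH_VECTOR };

/* Hypothetical model of the dispatch logic.
 * threshold_reg stands for the register the length is compared against.
 * With the ori present it holds a small constant; without it, it holds
 * whatever value was left there earlier (an assumption about the bug). */
static enum path dispatch(size_t len, size_t threshold_reg, int has_simd)
{
    if (len == 0)
        return PATH_NONE;            /* early length check */
    if (len <= threshold_reg)
        return PATH_SCALAR;          /* short input: scalar code */
    return has_simd ? PATH_VECTOR : PATH_SCALAR;
}
```

In this model, `dispatch(1024, 64, 1)` selects the vector path, while a stale, large register value such as `dispatch(1024, (size_t)-1, 1)` sends the same input to the scalar path, which matches the slowdown reported in the issue.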
Commit 9a41a3c added a chacha20 SIMD asm implementation accelerated with LSX for LoongArch machines.
However, I have now noticed that it has a performance problem on the 3A6000 CPU.
After bisecting, I found that this issue appeared once PR #22817 was merged (commit: b46de72).
Good chacha20 benchmark result (on dfd986b)
Bad chacha20 benchmark result after PR #22817 was merged (on b46de72)