LLVM's SHA256 is very inefficient #56121
Comments
As for what to do about this: rather than calling out to system libraries (defensible on macOS, but weird on other unixen), we should probably try to make our SHA256 implementation less slow. There are SIMD instructions for modern x86 chips: https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sha-extensions.html (using the intrinsics). There are also SIMD instructions for armv8+crypto (which M1 chips support): Looks like there's public-domain code using these at https://sources.debian.org/src/libcrypto++/8.4.0-1/sha_simd.cpp/ which can provide inspiration for replacing (Alternatively,
858e8b1 did the cop-out for now. It is still slow when, e.g., linking a mac/arm binary on a Linux system.
0baf13e did the parallelization. (The commit message has some metrics.) After these changes, things are fairly fast on macOS. Linking an arm64 mac binary on different hosts is still 2.55% slower than it needs to be; the fix for that is to make LLVM's SHA256 implementation faster as described in #56121 (comment)
One thing to note is that the Intel SHA extensions aren't all that common right now. As I understand it, they were originally only on low-power chips. (Though AMD Zen has them now, and I think the newest Intel ones might too?) OpenSSL and friends also have SHA-256 implementations using more general-purpose SIMD, which kick in more often on x86 today. SHA-256 acceleration on Arm is common, though. It's part of the original cryptography extension from Armv8, so most Armv8 chips have it.
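Because availability varies this much, an accelerated SHA-256 would presumably need a runtime check before taking the fast path. The following is only a sketch of such a check, not code from LLVM: `hasHardwareSha256` and `pickSha256Backend` are hypothetical names, the x86 branch assumes the GCC/Clang `<cpuid.h>` helper (CPUID leaf 7, EBX bit 29 reports the SHA extensions), the Linux/Arm branch assumes `getauxval` with `HWCAP_SHA2`, and the macOS/Arm branch assumes the `hw.optional.arm.FEAT_SHA256` sysctl is present on the running OS version.

```cpp
#include <cstddef>
#include <string>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>        // GCC/Clang wrapper around the CPUID instruction
#elif defined(__aarch64__) && defined(__linux__)
#include <asm/hwcap.h>    // HWCAP_SHA2
#include <sys/auxv.h>     // getauxval
#elif defined(__aarch64__) && defined(__APPLE__)
#include <sys/sysctl.h>   // sysctlbyname
#endif

// Hypothetical helper: true if the CPU we are running on reports dedicated
// SHA-256 instructions. Unknown platforms conservatively report false.
bool hasHardwareSha256() {
#if defined(__x86_64__) || defined(__i386__)
  unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
  // Leaf 7, subleaf 0; returns 0 if the leaf is not supported at all.
  if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
    return false;
  return ((ebx >> 29) & 1u) != 0; // EBX bit 29: Intel SHA extensions
#elif defined(__aarch64__) && defined(__linux__)
  return (getauxval(AT_HWCAP) & HWCAP_SHA2) != 0;
#elif defined(__aarch64__) && defined(__APPLE__)
  int value = 0;
  size_t size = sizeof(value);
  // Assumption: this sysctl name exists on the running macOS version.
  if (sysctlbyname("hw.optional.arm.FEAT_SHA256", &value, &size, nullptr, 0) != 0)
    return false;
  return value != 0;
#else
  return false;
#endif
}

// Hypothetical dispatcher: names the code path a hasher would take.
std::string pickSha256Backend() {
  return hasHardwareSha256() ? "hardware" : "portable";
}
```

A real implementation would cache the result once and select a function pointer, so the check is not repeated per block.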
I was able to make ld64.lld 20% (!) faster by tweaking how it computes SHA256.
CommonCrypto/CommonDigest.h on macOS is faster. (17% speedup)
https://github.com/nico/llvm-project/commits/hash has details, proof-of-concept level.
I used the repro file at https://drive.google.com/file/d/1wWCeDWQ3OAyVwadyCdFZ0WXNB11ADnBK/view?usp=sharing as benchmark case.
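To illustrate the CommonCrypto idea, here is a minimal sketch of a wrapper that calls `CC_SHA256` from `CommonCrypto/CommonDigest.h` on Apple hosts and falls back to a plain scalar SHA-256 elsewhere. It is not the code from the linked branch or from LLVM; `sha256`, `portableSha256`, `compress`, and `toHex` are names made up for this example, and the fallback is a straightforward textbook implementation with no SIMD.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#ifdef __APPLE__
#include <CommonCrypto/CommonDigest.h>
#endif

using Digest = std::array<uint8_t, 32>;

static inline uint32_t rotr(uint32_t x, unsigned n) { return (x >> n) | (x << (32 - n)); }

// Runs the SHA-256 compression function over one 64-byte block.
static void compress(uint32_t h[8], const uint8_t *p) {
  static const uint32_t K[64] = {
    0x428a2f98,0x71374491,0xb5c0fbcf,0xe9b5dba5,0x3956c25b,0x59f111f1,0x923f82a4,0xab1c5ed5,
    0xd807aa98,0x12835b01,0x243185be,0x550c7dc3,0x72be5d74,0x80deb1fe,0x9bdc06a7,0xc19bf174,
    0xe49b69c1,0xefbe4786,0x0fc19dc6,0x240ca1cc,0x2de92c6f,0x4a7484aa,0x5cb0a9dc,0x76f988da,
    0x983e5152,0xa831c66d,0xb00327c8,0xbf597fc7,0xc6e00bf3,0xd5a79147,0x06ca6351,0x14292967,
    0x27b70a85,0x2e1b2138,0x4d2c6dfc,0x53380d13,0x650a7354,0x766a0abb,0x81c2c92e,0x92722c85,
    0xa2bfe8a1,0xa81a664b,0xc24b8b70,0xc76c51a3,0xd192e819,0xd6990624,0xf40e3585,0x106aa070,
    0x19a4c116,0x1e376c08,0x2748774c,0x34b0bcb5,0x391c0cb3,0x4ed8aa4a,0x5b9cca4f,0x682e6ff3,
    0x748f82ee,0x78a5636f,0x84c87814,0x8cc70208,0x90befffa,0xa4506ceb,0xbef9a3f7,0xc67178f2};
  uint32_t w[64];
  for (int i = 0; i < 16; ++i) // load the block as big-endian 32-bit words
    w[i] = uint32_t(p[4*i]) << 24 | uint32_t(p[4*i+1]) << 16 |
           uint32_t(p[4*i+2]) << 8 | uint32_t(p[4*i+3]);
  for (int i = 16; i < 64; ++i) { // message schedule
    uint32_t s0 = rotr(w[i-15], 7) ^ rotr(w[i-15], 18) ^ (w[i-15] >> 3);
    uint32_t s1 = rotr(w[i-2], 17) ^ rotr(w[i-2], 19) ^ (w[i-2] >> 10);
    w[i] = w[i-16] + s0 + w[i-7] + s1;
  }
  uint32_t a = h[0], b = h[1], c = h[2], d = h[3], e = h[4], f = h[5], g = h[6], hh = h[7];
  for (int i = 0; i < 64; ++i) { // 64 rounds
    uint32_t t1 = hh + (rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)) + ((e & f) ^ (~e & g)) + K[i] + w[i];
    uint32_t t2 = (rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)) + ((a & b) ^ (a & c) ^ (b & c));
    hh = g; g = f; f = e; e = d + t1; d = c; c = b; b = a; a = t1 + t2;
  }
  h[0] += a; h[1] += b; h[2] += c; h[3] += d; h[4] += e; h[5] += f; h[6] += g; h[7] += hh;
}

// Portable scalar fallback (hypothetical name).
Digest portableSha256(const uint8_t *data, size_t len) {
  uint32_t h[8] = {0x6a09e667,0xbb67ae85,0x3c6ef372,0xa54ff53a,
                   0x510e527f,0x9b05688c,0x1f83d9ab,0x5be0cd19};
  size_t full = len / 64, rem = len % 64;
  for (size_t i = 0; i < full; ++i)
    compress(h, data + 64 * i);
  // Padding: 0x80, zeros, then the message length in bits as 64-bit big-endian.
  uint8_t tail[128] = {0};
  if (rem)
    std::memcpy(tail, data + 64 * full, rem);
  tail[rem] = 0x80;
  size_t tailLen = rem < 56 ? 64 : 128;
  uint64_t bits = uint64_t(len) * 8;
  for (int i = 0; i < 8; ++i)
    tail[tailLen - 1 - i] = uint8_t(bits >> (8 * i));
  compress(h, tail);
  if (tailLen == 128)
    compress(h, tail + 64);
  Digest out;
  for (int i = 0; i < 8; ++i)
    for (int j = 0; j < 4; ++j)
      out[4*i + j] = uint8_t(h[i] >> (24 - 8 * j));
  return out;
}

// Uses the system library on Apple hosts, the portable code elsewhere.
Digest sha256(const uint8_t *data, size_t len) {
#ifdef __APPLE__
  Digest out;
  // CC_LONG is 32 bits wide, so this sketch assumes len < 4 GiB.
  CC_SHA256(data, static_cast<CC_LONG>(len), out.data());
  return out;
#else
  return portableSha256(data, len);
#endif
}

// Formats a digest as lowercase hex.
std::string toHex(const Digest &d) {
  static const char *hex = "0123456789abcdef";
  std::string s;
  for (uint8_t b : d) { s += hex[b >> 4]; s += hex[b & 15]; }
  return s;
}
```

Both paths produce the same digest, so callers only see `sha256`; for example, `toHex(sha256(reinterpret_cast<const uint8_t *>("abc"), 3))` yields the standard test vector `ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad`.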