Add AES intrinsics when __AES__ is not defined #429
Comments
Will this, when implemented, dynamically check the user's CPU and enable or disable AES-NI at runtime? That is sorely needed for distributing binaries that work everywhere and run fast.
Cleared at Pull Request 461. PR 461 takes the "single source" and breaks it into a "base implementation", which is standard C++, and a "SIMD implementation", which includes different ISAs. The base implementation can be built with minimal or no flags. For example, The
Sorry about the late reply. Your comment did not make my radar.
Yes, the checks are dynamic. The distros are well represented. We worked closely with László Böszörményi (@gcsideal), who is our Debian packager and maintainer. He keeps us out of trouble for most things related to distros.

We experienced some pain points because the way Wei did things back in 2005 did not scale well into 2017. For example, Wei literally shadowed intrinsics, and used GCC inline assembly to provide a body for an Intel intrinsic if the intrinsic was missing. It worked great when we only needed AES and CLMUL. The trick broke under Clang due to

We had to switch gears and split the sources using appropriate compiler flags to solve all the problems. It was a big change, and we needed to wait until 6.0 for the change. We should now be in a position to move forward without working around lots of little problems.
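As an illustration of what a dynamic check can look like, here is a minimal sketch, assuming GCC or Clang on x86 and their `<cpuid.h>` header. It reads CPUID leaf 1 and tests ECX bit 25, which reports AES-NI, then picks a code path through a function pointer. The names `DetectAESNI`, `AesBasePath`, `AesNiPath`, `SelectAesImpl`, and `RunBestAesImpl` are made up for this example and are not the library's API, and the two path functions are placeholders for real routines.

```cpp
// Sketch only. DetectAESNI, AesBasePath, AesNiPath, SelectAesImpl, and
// RunBestAesImpl are hypothetical names; they are not the library's API.
#include <cassert>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
# define SKETCH_HAVE_CPUID 1
# include <cpuid.h>   // GCC/Clang wrapper around the CPUID instruction
#endif

// True when the running CPU reports AES-NI: CPUID leaf 1, ECX bit 25.
inline bool DetectAESNI()
{
#if defined(SKETCH_HAVE_CPUID)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) == 0)
        return false;                  // CPUID leaf 1 is unavailable
    return (ecx & (1u << 25)) != 0;
#else
    return false;                      // other platforms: stay on the base path
#endif
}

// Placeholders for the two code paths. In a split-source build, the AES-NI
// routine might live in its own file compiled with a flag such as -maes.
typedef const char* (*AesImplFn)();
inline const char* AesBasePath() { return "base"; }
inline const char* AesNiPath()   { return "aesni"; }

// Choose a path from the runtime detection result, not from __AES__.
inline AesImplFn SelectAesImpl(bool hasAesni)
{
    return hasAesni ? &AesNiPath : &AesBasePath;
}

// Detect once, then dispatch through the selected function pointer.
inline const char* RunBestAesImpl()
{
    static const AesImplFn fn = SelectAesImpl(DetectAESNI());
    assert(fn);
    return fn();
}
```

A caller would invoke `RunBestAesImpl()` and get whichever path the running machine supports, so one generic x86_64 binary can serve both older and newer CPUs.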
👍
This issue tracks adding AES intrinsics when __AES__ is not defined. This happens when a machine lacks AES or -march=native is not used. While it may seem counter-intuitive to add it when it's not needed, the primary use case is distros, which build for generic x86_64, and their users who have more capable machines. A secondary use case is Clang, which on occasion does not enable all CPU features under -march=native.

Also see Fixing "ERROR: failed to generate sha1rnds4 instruction" (and friends) on the mailing list.
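One way to make AES intrinsics available when __AES__ is not defined is a per-function target attribute. This is a different technique from the source split with per-file compiler flags described in the comments, and it is only a sketch, assuming GCC 4.9 or later, or a recent Clang, on x86. The names `AesniRound` and `TryAesniRound` are hypothetical. The intrinsic is guarded by a CPUID check (leaf 1, ECX bit 25) so it never executes on a machine without AES-NI.

```cpp
// Sketch only. AesniRound and TryAesniRound are hypothetical names. Assumes
// GCC 4.9+ or a recent Clang on x86; other targets take the fallback branch.
#include <cassert>
#include <cstring>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
# define SKETCH_X86_GNUC 1
# include <cpuid.h>       // __get_cpuid
# include <emmintrin.h>   // SSE2 loads and stores
# include <wmmintrin.h>   // _mm_aesenc_si128
#endif

#if defined(SKETCH_X86_GNUC)
// The target attribute enables AES-NI for this one function, so the
// translation unit builds even when __AES__ is not defined.
__attribute__((target("sse2,aes")))
static void AesniRound(const unsigned char in[16], const unsigned char key[16],
                       unsigned char out[16])
{
    const __m128i state = _mm_loadu_si128(reinterpret_cast<const __m128i*>(in));
    const __m128i rkey  = _mm_loadu_si128(reinterpret_cast<const __m128i*>(key));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out),
                     _mm_aesenc_si128(state, rkey));
}
#endif

// Runs one AES encryption round with AES-NI when the CPU supports it and
// returns true. Otherwise zeroes 'out' and returns false, so the caller can
// fall back to a portable implementation.
static bool TryAesniRound(const unsigned char in[16], const unsigned char key[16],
                          unsigned char out[16])
{
    assert(in && key && out);
#if defined(SKETCH_X86_GNUC)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    // CPUID leaf 1, ECX bit 25 reports AES-NI.
    if (__get_cpuid(1, &eax, &ebx, &ecx, &edx) && (ecx & (1u << 25)))
    {
        AesniRound(in, key, out);
        return true;
    }
#endif
    std::memset(out, 0, 16);
    return false;
}
```

Built without -maes or -march=native, this still compiles the AES-NI path. With an all-zero block and an all-zero round key, a successful round yields sixteen 0x63 bytes, because SubBytes maps 0x00 to 0x63 and the remaining steps leave a uniform state unchanged.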