HWY_CMAKE_ARM7:BOOL=ON vs Dynamic Dispatch #1271
Comments
This impacts jpeg-xl on Debian/armhf:
@jan-wassenberg The root issue is that clang (or gcc) did not implement target-gated arm_neon.h intrinsics, right? Is that why the complete codebase is compiled with NEON instructions?
Now I remember, this is clang:
That's right, HWY_CMAKE_ARM7 is an alternative to dynamic dispatch and was necessary when the headers required a compiler flag. It is indeed expected that using it generates NEON instructions. GCC now supports dynamic dispatch, and clang-16 apparently does too, but it is not in our package repo yet. As soon as it is and we have tested it, we can change HWY_HAVE_RUNTIME_DISPATCH to 1 there and then stop using HWY_CMAKE_ARM7, at least on Linux, which is currently the only platform on which we are able to detect the Arm CPU capabilities. If someone is interested, we could also support Android.
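For readers unfamiliar with how Arm CPU capabilities can be detected on Linux, here is a minimal sketch of the general idea. It is not Highway's actual implementation: the names HasNeon, QueryHwcap, ChooseSum, SumScalar and SumNeonStandIn are hypothetical and exist only for illustration. The sketch assumes 32-bit Arm Linux, where the kernel reports NEON support in bit 12 of AT_HWCAP (HWCAP_NEON in asm/hwcap.h). On other platforms it falls back to the scalar path.

```cpp
#include <cstddef>

#if defined(__linux__) && defined(__arm__)
#include <sys/auxv.h>  // getauxval, AT_HWCAP
#endif

// Assumption: 32-bit Arm Linux layout, where HWCAP_NEON is bit 12 of AT_HWCAP.
constexpr unsigned long kHwcapNeon = 1UL << 12;

// Hypothetical helper: true if the given hwcap bitmask advertises NEON.
bool HasNeon(unsigned long hwcap) { return (hwcap & kHwcapNeon) != 0; }

// Returns the kernel-provided hwcap bits on 32-bit Arm Linux, or 0 elsewhere
// (meaning "no NEON detected", so callers fall back to scalar code).
unsigned long QueryHwcap() {
#if defined(__linux__) && defined(__arm__)
  return getauxval(AT_HWCAP);
#else
  return 0;
#endif
}

using SumFunc = float (*)(const float*, std::size_t);

// Baseline implementation that runs on any CPU.
float SumScalar(const float* p, std::size_t n) {
  float sum = 0.0f;
  for (std::size_t i = 0; i < n; ++i) sum += p[i];
  return sum;
}

// Stand-in for a NEON implementation. In a real build, only this function
// (or its translation unit) would be compiled with NEON enabled, so that the
// rest of the library stays safe on CPUs without NEON.
float SumNeonStandIn(const float* p, std::size_t n) { return SumScalar(p, n); }

// Runtime dispatch: choose an implementation from the detected capabilities.
SumFunc ChooseSum(unsigned long hwcap) {
  return HasNeon(hwcap) ? &SumNeonStandIn : &SumScalar;
}
```

A caller would typically evaluate ChooseSum(QueryHwcap()) once and reuse the returned pointer. By contrast, building everything with a global NEON flag makes the choice at compile time for every function, which is the behavior reported in this issue.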
hwy claims to have dynamic dispatch. However, when using the CMake flag HWY_CMAKE_ARM7:BOOL=ON, the complete codebase is compiled with NEON instructions (-mfpu=neon-vfpv4). It turns out the shared library takes advantage of this compilation flag and generates code with NEON instructions: