Terribly suboptimal code generated for Linux kernel arch/powerpc/lib/xor_vmx.c #1713
Comments
Thanks for the report. I'm immediately reminded of:
Is this with KASAN? https://lore.kernel.org/7cb1285a-42e6-2b67-664f-7d206bc9fd80@csgroup.eu/
No, that's without KASAN. You can see in the generated code that there are no calls to KASAN functions.
Same on ppc64 (PowerMac G5 build).
GCC 12.2.1_p20221126:
CLANG 15.0.6:
Seems even a bit worse with CLANG 16.0.4 (Talos II build).
GCC 12.2.1_p20230428:
CLANG 16.0.4:
In numbers, the performance difference between the two compilers looks like this (on a Talos II).
GCC 12.2.1_p20221126:
CLANG 16.0.4:
Seems this issue has been taken care of in CLANG 18.1.x. 👍 The resulting code is still larger than GCC's, but performance is now on par with GCC (the in-kernel xor benchmark is faster with CLANG, the raid6 benchmark is faster with GCC).
GCC 13.2.1_p20240210:
CLANG 18.1.3:
Here again is the performance comparison on my Talos, kernel v6.9-rc4.
GCC 13.2.1_p20240210:
CLANG 18.1.3:
Ha, so this appears to be fixed by my commit 35f20786c481 ("powerpc: xor_vmx: Add '-mhard-float' to CFLAGS"), which was fixing a new compiler error from Closing this up for now.
The code generated by GCC seems properly optimised and doesn't use the stack.
With CLANG there is a huge amount of stack use.
Dumping only __xor_altivec_2(), but the same problem occurs with all 4 functions.
GCC 12.2:
CLANG 14:
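For readers without the kernel source at hand, here is a rough scalar sketch of the kind of operation a two-buffer XOR routine such as __xor_altivec_2() performs. This is not the kernel's code: the name xor_2_sketch and its signature are hypothetical, and the real routine in arch/powerpc/lib/xor_vmx.c works on AltiVec vector types rather than plain words. The point is only that a loop this simple can be expected to stay in registers, which is why heavy stack traffic in the generated code stands out.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical scalar stand-in (not the kernel implementation):
 * XOR the contents of src into dst, one machine word at a time.
 * 'bytes' is assumed to be a multiple of sizeof(unsigned long).
 */
void xor_2_sketch(size_t bytes, unsigned long *dst, const unsigned long *src)
{
    assert(bytes % sizeof(unsigned long) == 0);

    size_t words = bytes / sizeof(unsigned long);

    for (size_t i = 0; i < words; i++)
        dst[i] ^= src[i];   /* dst = dst XOR src, in place */
}
```

Building a loop like this with each compiler and comparing the disassembly is one way to check whether values are being spilled to the stack, though a scalar loop will not necessarily reproduce the behaviour reported here for the AltiVec build.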