Scalar matrix multiply code can't recover from <float x 2> ABI restriction #60441
…t v2f32 float ops Pulled out of Issue #60441 - we really need that handling in the middle-end, but there's some obvious DAG cleanups we can try as well
CC @LuoYuanke fb91f0a came about as a stopgap fix for this - if we can fix this in the middle end then the combineConcatVectorOps fix and the widen_fadd/fsub/fmul/fdiv.ll test files become unnecessary
If we are talking about matrices, perhaps we should use the matrix type (https://godbolt.org/z/nzfYMv8Wq). It would generate llvm.matrix intrinsics, and those intrinsics lower well to vector operations.
This is an example of generic code that occurs in many different projects - so asking them all to use matrix_type attributes is going to be tricky, and many will reply that gcc doesn't need it.
In https://godbolt.org/z/19564zseG, I took a quick look at where gcc changes the vector size.
Starting from generic C++ based vector4/matrix4 style types:
Passing Vector4 args by value results in them being converted to { <2 x float>, <2 x float> } types for the x64 ABI, and we never manage to recover from this, even if all calls are eventually inlined away. The matrix multiply code in the compiler explorer link above results in the following IR:
While GCC manages to load/store and process the vectors and matrices in whole xmm/ymm/zmm, LLVM ends up with float2 sub-vectors and relies on the DAG to load/store combine them back together as best it can.
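For illustration, here is a minimal sketch of the kind of code described above. It is not the code from the compiler explorer link: the `Vector4` and `Matrix4` layouts and the helper names `add`, `scale`, `transform` and `multiply` are all assumptions. The by-value `Vector4` parameters are the part that matters, since a struct of four floats is passed in two SSE registers under the x86-64 SysV ABI, which clang represents as a pair of `<2 x float>` values.

```cpp
// Hypothetical 4-float vector type (an assumed layout, not the issue's code).
struct Vector4 {
    float x, y, z, w;
};

// Hypothetical row-major 4x4 matrix built from four Vector4 rows.
struct Matrix4 {
    Vector4 rows[4];
};

// By-value Vector4 arguments: each one is split into two 2-float halves
// at the ABI boundary on x86-64 SysV.
static inline Vector4 add(Vector4 a, Vector4 b) {
    return {a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w};
}

static inline Vector4 scale(Vector4 v, float s) {
    return {v.x * s, v.y * s, v.z * s, v.w * s};
}

// Row vector times matrix: a linear combination of the matrix rows.
static inline Vector4 transform(Vector4 v, const Matrix4 &m) {
    Vector4 r = scale(m.rows[0], v.x);
    r = add(r, scale(m.rows[1], v.y));
    r = add(r, scale(m.rows[2], v.z));
    r = add(r, scale(m.rows[3], v.w));
    return r;
}

// Scalar-style matrix multiply: each output row is a transformed input row.
Matrix4 multiply(const Matrix4 &a, const Matrix4 &b) {
    Matrix4 out;
    for (int i = 0; i < 4; ++i)
        out.rows[i] = transform(a.rows[i], b);
    return out;
}
```

Even when every helper is inlined into `multiply`, this is the shape of code where one would hope for whole 4-float (or wider) vector operations rather than 2-float halves that are recombined later.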