[AArch64] neon big endian miscompiled #65884
Comments
One odd observation: this happens when the lane is set in the range
@llvm/issue-subscribers-backend-aarch64
I debugged it with gdb; some debug info is below.
Focus on
and the
It is caused by
If I change
Hi, do you have any idea about this problem? @davemgreen
I compared the opt pipelines (https://godbolt.org/z/TGToW3jKM); my guess is that the error is caused by the transform
I found a tiny difference: when I let the compare range run from 8 to 13 (4 + 1), https://godbolt.org/z/sxcj49Goq, it works fine. In that case the shuffle vector IR is
When the compare range runs from 0 to 13 (8 + 4 + 1), the <4 x i32> shuffle vector IR is
The shuffle masks differ, and that difference leads to the error.
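To make the point about differing shuffle masks concrete, here is a minimal scalar sketch of how a shufflevector on two <4 x i32> operands picks its lanes. The function name and the masks used in the usage note are hypothetical and are not the masks from the godbolt links above; the sketch only shows that changing a mask changes which input element lands in each lane.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar reference model (illustrative only) of an LLVM-style
   shufflevector on two <4 x i32> inputs: result lane i takes element
   mask[i] of the concatenation a ++ b. */
void shuffle_v4i32(const int32_t a[4], const int32_t b[4],
                   const int mask[4], int32_t out[4])
{
    for (size_t i = 0; i < 4; ++i) {
        int m = mask[i];
        /* Indices 0..3 select from a, indices 4..7 select from b. */
        out[i] = (m < 4) ? a[m] : b[m - 4];
    }
}
```

For example, with a = {10, 11, 12, 13} and b = {20, 21, 22, 23}, the hypothetical mask {0, 1, 2, 7} replaces the last lane with an element of b, while {4, 1, 2, 3} replaces the first lane instead. A mask that is off by this kind of lane reversal would put the loaded value in the wrong lane.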
https://godbolt.org/z/97qKq7rb6
let
I don't know what the correct asm should be; otherwise I could fix it.
In the function I fixed it by inserting one more
Same issue: #65058
I don't think this code is valid, because it mixes ACLE intrinsics with the GCC vector extension. These have different semantics for lane ordering on big endian systems. The ACLE spec has a section on this, which recommends using the ACLE intrinsics consistently: https://github.com/ARM-software/acle/blob/main/main/acle.md#compatibility-with-other-vector-programming-models
No, ignore me, I'd just misread the code, it isn't actually using the GCC vector extension.
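Although the code turned out not to mix programming models, the lane-ordering question is still central to this bug. The following is a rough byte-level sketch, written as a portable model rather than real NEON code, of why lane order is delicate on big endian: an element-wise load (LD1 {v.4s}) keeps array element e in lane e, whereas a whole-register load (LDR q) reads the 16 bytes as one big-endian value and so reverses the lanes. All function names here are hypothetical, and the register layout in the comment is a modelling assumption.

```c
#include <stdint.h>

/* Register model (assumption for this sketch): reg[0] is the least
   significant byte of the 128-bit register, and 32-bit lane i occupies
   reg[4*i] .. reg[4*i + 3]. */

/* Store four 32-bit values to memory the way a big-endian target would. */
void store_be_u32x4(const uint32_t a[4], uint8_t mem[16])
{
    for (int e = 0; e < 4; ++e)
        for (int j = 0; j < 4; ++j)
            mem[4 * e + j] = (uint8_t)(a[e] >> (8 * (3 - j)));
}

/* Model of big-endian LD1 {v.4s}: each element is loaded on its own, so
   array element e lands in lane e. */
void model_ld1_4s_be(const uint8_t mem[16], uint8_t reg[16])
{
    for (int e = 0; e < 4; ++e)
        for (int b = 0; b < 4; ++b)
            reg[4 * e + b] = mem[4 * e + 3 - b];
}

/* Model of big-endian LDR q: the 16 bytes are read as a single 128-bit
   big-endian integer, so the lowest address becomes the top byte. */
void model_ldr_q_be(const uint8_t mem[16], uint8_t reg[16])
{
    for (int k = 0; k < 16; ++k)
        reg[k] = mem[15 - k];
}

/* Read 32-bit lane `lane` out of the modelled register. */
uint32_t lane_u32(const uint8_t reg[16], int lane)
{
    return (uint32_t)reg[4 * lane]
         | ((uint32_t)reg[4 * lane + 1] << 8)
         | ((uint32_t)reg[4 * lane + 2] << 16)
         | ((uint32_t)reg[4 * lane + 3] << 24);
}
```

In this model, loading the array {a0, a1, a2, a3} with model_ld1_4s_be gives lanes a0..a3 in order, while model_ldr_q_be gives a3..a0. A compiler that mixes the two views has to insert lane-reversing instructions at the boundary, and a missing or extra reversal would show up as a value in the wrong lane.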
Actually, the code is from gcc's test suite: https://github.com/gcc-mirror/gcc/blob/master/gcc/testsuite/gcc.target/aarch64/vld1_lane.c.
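For readers unfamiliar with the intrinsic family that test exercises, here is a small scalar sketch of what a lane load such as vld1q_lane_s32 is expected to compute: the input vector with exactly one lane replaced by the value loaded from memory, regardless of endianness. The struct and function names are hypothetical stand-ins, not the arm_neon.h types, and this is not the code from the gcc test.

```c
#include <stdint.h>

/* Hypothetical stand-in for int32x4_t, used only in this model. */
typedef struct {
    int32_t lane[4];
} model_int32x4;

/* Scalar model of a lane load: return `vec` with lane `lane` replaced by
   the value at `ptr`; every other lane is left untouched. The real
   intrinsic requires a constant lane index in 0..3; this model simply
   leaves the vector unchanged for an out-of-range index. */
model_int32x4 model_vld1q_lane_s32(const int32_t *ptr, model_int32x4 vec,
                                   int lane)
{
    if (lane >= 0 && lane < 4)
        vec.lane[lane] = *ptr;
    return vec;
}
```

Comparing the result of the real intrinsic lane by lane against a model like this is one way to pin down which lane a miscompiled big-endian build actually writes.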
Apologies for not responding, I received no notifications for this issue past the first two messages (since I added #backend-aarch64). The fix above sounds very sensible.
https://godbolt.org/z/vWMz5K34r
This code fails when compiled with -O3 -fno-inline. I have read https://llvm.org/docs/BigEndianNEON.html, but I am still confused about the asm:
This code seems to do nothing, but it appears many times.
Can anyone give me a clue about this failure, or help narrow the problem down?
I ran opt-bisect on it and it fails at SLPVectorizerPass, but I guess that pass merely triggers the problem rather than being its root cause.
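The bisection step can be sketched as a plain binary search over the pass limit. This is only an illustration of the idea, assuming the test passes at the lower bound, fails at the upper bound, and stays failing once it starts failing. The callback type and the passes_below stand-in are hypothetical; in practice the callback would rebuild with something like -mllvm -opt-bisect-limit=N and run the test.

```c
#include <stdbool.h>

/* Hypothetical callback: returns true if the test passes when the
   compiler is limited to the first `limit` optional passes. */
typedef bool (*passes_fn)(int limit, void *ctx);

/* Binary search for the smallest limit in (lo, hi] at which the test
   fails. Assumes passes(lo) is true, passes(hi) is false, and the
   outcome is monotone in between. */
int first_failing_limit(int lo, int hi, passes_fn passes, void *ctx)
{
    while (hi - lo > 1) {
        int mid = lo + (hi - lo) / 2;
        if (passes(mid, ctx))
            lo = mid;   /* still good: the culprit is later */
        else
            hi = mid;   /* already bad: the culprit is at or before mid */
    }
    return hi;
}

/* Stand-in predicate for demonstration only: pretends the build breaks
   once the limit reaches the threshold stored in ctx. */
bool passes_below(int limit, void *ctx)
{
    return limit < *(const int *)ctx;
}
```

The limit found this way identifies the pass whose transformation first makes the test fail, which, as noted above, is not necessarily the pass that contains the bug.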