[X86] Possible wrong compilation of scalar comparison into cmpnltpd #63561

Closed
kronbichler opened this issue Jun 27, 2023 · 3 comments
Labels: backend:X86, question (A question, not bug report. Check out https://llvm.org/docs/GettingInvolved.html instead!)

@kronbichler

Hello,

I experienced a bug in code generation for the x86-64 target. For the minimal test case test.txt, compiled on

$ clang++-16 -v
Ubuntu clang version 16.0.0 (1~exp5ubuntu3)
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/11
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/12
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/13
Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/13
Candidate multilib: .;@m64
Selected multilib: .;@m64

with

$ clang++-16 -Og -S test.cc

I get the assembly code

f(double, double, unsigned int, bool):                               # @f(double, double, unsigned int, bool)
        test    esi, esi
        jne     .LBB0_5
        mov     eax, edi
        cvtsi2sd        xmm2, rax
        mulsd   xmm2, xmm1
        addsd   xmm1, xmm2
        movapd  xmm3, xmm0
        cmpnltpd        xmm3, xmm2
        cmpnltpd        xmm1, xmm0
        ....

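See also https://godbolt.org/z/s3v3rdfsd

The upper lane of xmm2 still holds whatever was in the register before the function was entered, because cvtsi2sd and mulsd write only the lower lane. Since cmpnltpd is a packed comparison that operates on both lanes, that stale content can trigger a floating-point exception at the instruction cmpnltpd xmm3, xmm2 (the second-to-last instruction shown above). Specifically, I see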

(gdb) p $xmm2
$1 = {v2_double = {0.40000000000000002, nan(0xc000000000000)}}

showing that the upper lane contains a NaN. The generated code does not raise the FPE with clang-15, nor at optimization level -O0. According to godbolt, -O2 and -O3 also produce the problematic code, for both clang-15 and clang-16.

Please let me know if I should provide a main function that invokes this. All one needs to do is set xmm2 to _mm_set1_pd(std::numeric_limits<float>::signaling_NaN()) and call feenableexcept(FE_DIVBYZERO | FE_INVALID) before calling f(0.2, 0.2, 2, false). I could be wrong, and my code might do something that is not allowed, but I believe the code is valid and the error lies within LLVM.

Note that the code is extracted from a big project, dealii/dealii#15496 (comment)

@llvmbot
Collaborator

llvmbot commented Jun 27, 2023

@llvm/issue-subscribers-backend-x86

@efriedma-quic
Collaborator

If you're using feenableexcept, you need to pass -ftrapping-math or equivalent.

efriedma-quic closed this as not planned on Jun 27, 2023
EugeneZelenko added the question label on Jun 27, 2023
@kronbichler
Author

Thank you for the info; I had not considered that flag.


4 participants