Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clang is suboptimal for (~a ^ b) < ~a #69803

Closed
k-arrows opened this issue Oct 21, 2023 · 5 comments · Fixed by #69882
Closed

clang is suboptimal for (~a ^ b) < ~a #69803

k-arrows opened this issue Oct 21, 2023 · 5 comments · Fixed by #69882

Comments

@k-arrows
Copy link

Consider the following two functions:

int foo(int a, int b)
{
   return (~a ^ b) < ~a;
}

int bar(int a, int b)
{
   return (a ^ b) > a;
}

Clang vs GCC:
https://godbolt.org/z/rxK1nr794

Alive2:
https://alive2.llvm.org/ce/z/aUhKRe

define i32 @src(i32 %0, i32 %1) {
2:
  %3 = xor i32 %0, 4294967295
  %4 = xor i32 %3, %1
  %5 = icmp slt i32 %4, %3
  %6 = zext i1 %5 to i32
  ret i32 %6
}
=>
define i32 @tgt(i32 %0, i32 %1) {
2:
  %3 = xor i32 %1, %0
  %4 = icmp sgt i32 %3, %0
  %5 = zext i1 %4 to i32
  ret i32 %5
}
Transformation seems to be correct!
@github-actions github-actions bot added the clang Clang issues not falling into any other category label Oct 21, 2023
@EugeneZelenko EugeneZelenko added llvm:instcombine missed-optimization and removed clang Clang issues not falling into any other category labels Oct 21, 2023
@elhewaty
Copy link
Contributor

I am working on this.
but I have a question, How can I reach the code responsible for this?
I think the function foldXorOfICmps in InstCombineAndOrXor.cpp.

@dtcxzyw
Copy link
Member

dtcxzyw commented Oct 22, 2023

I am working on this. but I have a question, How can I reach the code responsible for this? I think the function foldXorOfICmps in InstCombineAndOrXor.cpp.

That is incorrect. In summary:

  • If the simplification is similar to an existing one, it is likely to be related to ValueTracking. You should handle more patterns to expose more information.
  • If the simplification doesn't create new instructions (fold into a constant or an existing inst), it is related to InstSimplify. You should find the root of the pattern and look through simplifyXXXInst.
  • If the simplification creates new instructions, it falls into the InstCombine part. You should find the root of the pattern and look through InstCombinerImpl::visitXXXInst.

In this case, you should start with InstCombinerImpl::visitICmpInst in InstCombineCompares.cpp since we want to fold icmp slt (~X ^ Y), ~X into icmp sgt (X ^ Y), X.

@elhewaty
Copy link
Contributor

@nikic @goldsteinn @arsenm @dtcxzyw candidate pull: #69882

@pinskia
Copy link

pinskia commented Nov 5, 2023

The way GCC handles this is a Canonicalization of (~a ^ b) into ~(a ^ b) and then ~a < ~b is what is optimized to a > b.

@dtcxzyw
Copy link
Member

dtcxzyw commented Nov 5, 2023

The way GCC handles this is a Canonicalization of (~a ^ b) into ~(a ^ b) and then ~a < ~b is what is optimized to a > b.

LLVM doesn't canonicalize (~a ^ b) into ~(a ^ b) due to multi-use of ~a.

dtcxzyw pushed a commit that referenced this issue Nov 14, 2023
- [InstCombine] Add test coverage for comparisons of operands including
one-complemented oparands(NFC).
- [InstCombine] Fold xored one-complemented operand comparisons.
Alive2: https://alive2.llvm.org/ce/z/PZMJeB
Fixes #69803.
zahiraam pushed a commit to zahiraam/llvm-project that referenced this issue Nov 20, 2023
…9882)

- [InstCombine] Add test coverage for comparisons of operands including
one-complemented oparands(NFC).
- [InstCombine] Fold xored one-complemented operand comparisons.
Alive2: https://alive2.llvm.org/ce/z/PZMJeB
Fixes llvm#69803.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants