Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[InstCombine] Missed optimization: Fold usub_sat((sub nuw C1, A), C2) to usub_sat(C1 - C2, A) or 0 #82177

Closed
XChy opened this issue Feb 18, 2024 · 6 comments · Fixed by #82280

Comments

@XChy
Copy link
Member

XChy commented Feb 18, 2024

Alive2 proof: https://alive2.llvm.org/ce/z/Bre2we

Motivating example

define i32 @src(i32 %a){
entry:
  %add = sub nuw i32 64, %a
  %cond = call i32 @llvm.usub.sat.i32(i32 %add, i32 14)
  ret i32 %cond
}

can be folded to:

define i32 @tgt(i32 %a){
entry:
  %cond = call i32 @llvm.usub.sat.i32(i32 50, i32 %a)
  ret i32 %cond
}

When C2 u< C1, we get usub_sat(C1 - C2, A), otherwise we get 0. See also the examples in alive2 proof.

Real-world motivation

This snippet of IR is derived from jemalloc/src/psset.c@psset_maybe_remove_purge_list (after O3 pipeline).
The example above is a reduced version. If you're interested in the original suboptimal IR and optimal IR, contact me to get it, please. Actually such pattern is found frequently in the IRs in jemalloc project.

Let me know if you can confirm that it's an optimization opportunity, thanks.

@XChy XChy changed the title [InstCombine] Fold usub_sat((sub nuw C1, A), C2) to usub_sat(C1 - C2, A) or 0 [InstCombine] Missed optimization: Fold usub_sat((sub nuw C1, A), C2) to usub_sat(C1 - C2, A) or 0 Feb 18, 2024
@elhewaty
Copy link
Contributor

I will work on this.

@elhewaty
Copy link
Contributor

If I need to confirm the transformation for vectors.
what can I use? I think llvm.assume doesn't work with vectors.

@cyk2018
Copy link

cyk2018 commented Feb 19, 2024

It seems we must keep the equality between two expression, in your example, it is C1 - A u< C2 with C1 - C2 u< A。And can you give the example such as how can I get this IR from C or C++ Code.

@XChy
Copy link
Member Author

XChy commented Feb 19, 2024

If I need to confirm the transformation for vectors. what can I use? I think llvm.assume doesn't work with vectors.

Since this pattern doesn't involve vector operation, the correctness for scalars implies the correctness for vectors. However, If you do want to verify this pattern in Alive2, you can transform %cmp = icmp ult <4 x i32> %c2, %c1 into a single i1 with @llvm.vector.reduce.and.i1, and then apply llvm.assume.

@XChy
Copy link
Member Author

XChy commented Feb 19, 2024

@cyk2018, tools like https://github.com/travitch/whole-program-llvm and https://github.com/SRI-CSL/gllvm can extract IRs when compiling the whole project. For a single C/C++ file, just clang -emit-llvm -S example.c.

@cyk2018
Copy link

cyk2018 commented Feb 20, 2024

@cyk2018, tools like https://github.com/travitch/whole-program-llvm and https://github.com/SRI-CSL/gllvm can extract IRs when compiling the whole project. For a single C/C++ file, just clang -emit-llvm -S example.c.

Thanks for your links. I am confused in how to generate nuw flag. I have found this method that SCCPSolver can generate this after analysis. The question is not very relevant with this issue. Still sincerely thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants