Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[InstCombine] Missed optimization: fold other_cond && (a s> -1) && a s< b to other_cond && a u< b when b is a specific constant #76623

Closed
XChy opened this issue Dec 30, 2023 · 4 comments · Fixed by #86823

Comments

@XChy
Copy link
Member

XChy commented Dec 30, 2023

Alive2 proof: https://alive2.llvm.org/ce/z/DT9eDZ

Motivating example:

define i1 @src(i32 %a, i1 %other_cond) {
entry:                                      
  %cmp1 = icmp sgt i32 %a, -1
  %or.cond = select i1 %other_cond, i1 %cmp1, i1 false
  %cmp2 = icmp slt i32 %a, 10086
  %ret = select i1 %or.cond, i1 %cmp2, i1 false
  ret i1 %ret
}

can be folded to:

define i1 @tgt(i32 %a, i1 %other_cond) {
entry:                                      
  %or.cond = icmp ult i32 %a, 10086
  %ret = select i1 %other_cond, i1 %or.cond, i1 false
  ret i1 %ret
}

However, it folds if we replace the logical and with a bitwise and. I thought that Reassociate should be to blame, but godbolt told me that similar fold for bitwise and is handled by InstCombine. Therefore, I tag InstCombine here.

Real-world motivation

This snippet of IR is derived from qemu/net/util.c@net_parse_macaddr (after O3 pipeline).
The example above is a reduced version. If you're interested in the original suboptimal IR and optimal IR, see also:https://godbolt.org/z/Ea8GGxKba

Let me know if you can confirm that it's an optimization opportunity, thanks.

@dtcxzyw
Copy link
Member

dtcxzyw commented Dec 30, 2023

It would be fixed by #76621 if we know that select (icmp sgt i32 %a, -1), (icmp slt i32 %a, 10086), false can be simplified into icmp ult i32 %a, 10086.

@nikic
Copy link
Contributor

nikic commented Dec 30, 2023

For cases with a bitwise outer operation, this is handled by

// TODO: Make this recursive; it's a little tricky because an arbitrary
. We're missing handling for the case of a logical outer operation, I believe.

@XChy
Copy link
Member Author

XChy commented Mar 27, 2024

I think this issue can be handled easier. Note that:
(A && B) && C --> A && (B & C) if "C is poison" implies "B is poison": https://alive2.llvm.org/ce/z/nb2W5i
I realized this after I handicrafted a bunch of proofs on the general reassociation...

@XChy
Copy link
Member Author

XChy commented Mar 27, 2024

I will try to file a simple solution for it. Though I also have one much complex patch.

XChy added a commit that referenced this issue Apr 10, 2024
…6823)

Fixes #76623
Alive2 proof: https://alive2.llvm.org/ce/z/gX6znJ (I'm not sure how to
write a proof for such transform, maybe there are mistakes)

In most cases, `icmp(a, C1) && (other_cond && icmp(a, C2))` will be
reduced to `icmp(a, C1) & (other_cond && icmp(a, C2))`, since latter
icmp always implies the poison of the former. After reduction, it's
easier to simplify the icmp chain.
Similarly, this patch does the same thing for `(A && B) && C --> A && (B
& C)`. Maybe we could constraint such reduction only on icmps if there
is regression in benchmarks.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants