[InstCombine] Missing fold for (1 + 2 * a * b > 0) ? a : a * b --> a * b < 0 ? a * b : a #61538

k-arrows · 2023-03-20T11:23:20Z

Test code:

int foo(int a, int b)
{
    return (1 + 2 * a * b > 0) ? a : a * b ;
}

clang

foo(int, int):                               // @foo(int, int)
        mul     w8, w0, w1
        mov     w9, #1                          // =0x1
        orr     w8, w9, w8, lsl #1
        cmp     w8, #1
        csinc   w8, w1, wzr, lt
        mul     w0, w8, w0
        ret

gcc

foo(int, int):
        mul     w1, w0, w1
        cmp     w1, 0
        csel    w0, w1, w0, lt
        ret

https://godbolt.org/z/b79ehqG77

llvmbot · 2023-03-20T14:29:22Z

@llvm/issue-subscribers-backend-aarch64

junaire · 2023-04-05T05:55:35Z

It's not applied to the AArch64 target but to other targets as well. I believe it's a missing fold in the middle end.
See https://alive2.llvm.org/ce/z/P-FXNQ

junaire · 2023-04-05T05:56:42Z

https://godbolt.org/z/dPh8P14q1

k-arrows · 2023-04-05T06:08:11Z

It's not applied to the AArch64 target but to other targets as well. I believe it's a missing fold in the middle end. See https://alive2.llvm.org/ce/z/P-FXNQ

Thank you for your comment. I'm sorry but the issue title was misleading.

junaire · 2023-04-05T06:11:00Z

Even these:

Positive * a * b + 1 > 0 ? a : a * b --> a * b < 0 : a * b ? a
Negative * a * b + 1 > 0 ? a : a * b --> a * b > 0 : a * b ? a

https://alive2.llvm.org/ce/z/9rD0BK

junaire · 2023-04-05T06:11:25Z

It's not applied to the AArch64 target but to other targets as well. I believe it's a missing fold in the middle end. See https://alive2.llvm.org/ce/z/P-FXNQ

Thank you for your comment. I'm sorry but the issue title was misleading.

Feel free to correct it!

We can eliminate the or operation based on the predicate and the relation between OrC and C. sge: X | OrC s>= C --> X s>= 0 iff OrC s>= C s>= 0 sgt: X | OrC s> C --> X s>= 0 iff OrC s> C s>= 0 sle: X | OrC s<= C --> X s< 0 iff OrC s> C s>= 0 slt: X | OrC s< C --> X s< 0 iff OrC s>= C s>= 0 Alive2 links: sge: https://alive2.llvm.org/ce/z/W-6FHE sgt: https://alive2.llvm.org/ce/z/TKK2yJ sle: https://alive2.llvm.org/ce/z/vURQGM slt: https://alive2.llvm.org/ce/z/JAsVfw Related issue: #61538 Signed-off-by: Jun Zhang <jun@junz.org> Differential Revision: https://reviews.llvm.org/D147597

nikic · 2023-04-13T10:48:53Z

Can you please confirm whether this report was based on a real-world motivation? I want to understand whether we should accept a non-principled fix for this or not.

junaire · 2023-04-13T11:14:53Z

Can you please confirm whether this report was based on a real-world motivation? I want to understand whether we should accept a non-principled fix for this or not.

Sorry, I don't understand. What do you mean by "based on a real-world motivation"? I'm not the original bug reporter so I don't know if this is based on some real codebase or just random fuzzing. However, IMHO it's an optimizing compiler's job to optimize the code as long as 1. the transform is beneficial 2. it doesn't cause noticeable compile time regression.

I'll be surprised to see clang can't optimize this trivial example: https://godbolt.org/z/bvPhno5xT

k-arrows · 2023-04-13T12:10:52Z

I'm the original reporter. The conditional expression considered in this issue (I mean "1 + 2 * a * b > 0" ) was originally inspired by the following derivative formula. Such calculations are necessary when considering some kind of differential equation.

$$\{(1+\alpha u)u\}'=(1+2\alpha u)u'$$

If limited to integers, we can fold as indicated in the title of this issue.

However, IMHO it's an optimizing compiler's job to optimize the code as long as 1. the transform is beneficial 2. it doesn't cause noticeable compile time regression.

I agree with junaire. I believe that compilers are based on steady improvement. I would like to express my sincere respect for junaire's efforts (creating patches and responding to reviews).

nikic · 2023-04-13T12:17:25Z

My question here is whether there is production code that will benefit from this optimization. If yes, I am much more willing to accept a hacky solution to the problem. If no, then I will insist on a principled solution.

junaire · 2023-04-13T12:38:02Z

My question here is whether there is production code that will benefit from this optimization. If yes, I am much more willing to accept a hacky solution to the problem. If no, then I will insist on a principled solution.

What do you mean by " hacky solution to the problem"? I guess there may be some misunderstanding here.
FWIW, I don't intend to add some instcombine matches mentioned in earlier comments:

Positive * a * b + 1 > 0 ? a : a * b --> a * b < 0 : a * b ? a
Negative * a * b + 1 > 0 ? a : a * b --> a * b > 0 : a * b ? a

That's too arbitrary and I want some principled solutions as well. For example, I think we at least can improve the following cases:

icmp(X | OrC, C) --> icmp(X, 0) just landed in e3175f7
icmp(Factor * X, * Y, 0) --> icmp(X* Y, 0) I created https://reviews.llvm.org/D148210 but that may be a wrong approach as you commented earlier.

After all of these low-hanging fruits, I can still see some unnecessary computation but that's another story.
Or do you mean the two improvements are also hacky? 😨

The existing logic in `foldICmpMulConstant` can only optimize the constant if we have (X * Y) * C. However, reassociate pass currently produces this form with the constant on the inner multiply, so let's reassociate it in foldICmpMulConstant to enable further optimizations. Related issue: llvm#61538 Signed-off-by: Jun Zhang <jun@junz.org> Differential Revision: https://reviews.llvm.org/D148210

We can eliminate the or operation based on the predicate and the relation between OrC and C. sge: X | OrC s>= C --> X s>= 0 iff OrC s>= C s>= 0 sgt: X | OrC s> C --> X s>= 0 iff OrC s> C s>= 0 sle: X | OrC s<= C --> X s< 0 iff OrC s> C s>= 0 slt: X | OrC s< C --> X s< 0 iff OrC s>= C s>= 0 Alive2 links: sge: https://alive2.llvm.org/ce/z/W-6FHE sgt: https://alive2.llvm.org/ce/z/TKK2yJ sle: https://alive2.llvm.org/ce/z/vURQGM slt: https://alive2.llvm.org/ce/z/JAsVfw Related issue: llvm#61538 Signed-off-by: Jun Zhang <jun@junz.org> Differential Revision: https://reviews.llvm.org/D147597

hamza-spl · 2023-06-05T11:38:11Z

Hi,
I'm interested in what @junaire did here https://reviews.llvm.org/D148210, I had implemented something similar to this, my approach was to add the logic in foldICmpWithZero() as I supposed the optimization only makes sens when comparing to 0 , so we need to completely ignore the unuseful multiplication by the constant in this case, otherwise I can't identify what kind of optimizations could this transformation trigger in more general cases, do you have any idea ?
@junaire could you please let me know if you still working on this ?
Thanks.

pinskia · 2023-12-25T11:22:59Z

Note the missed optimization in this case is something more like:

int foo1(int a, int b)
{
    return (2 * a * b < 0) ? a : a * b ;
}

Looks like LLVM is converting it to:

int foo2(int a, int b)
{
  int c = a*b;
  int d = (2 * c < 0) ? 1 : b ;
 return d * a;
}

Note GCC does not use csel for the ?: until late(r) in the pipeline and expands it into if/then/else after the front-end.

Also 2* is not able to remove by LLVM for some reason, it was converted to <<1 looks like too early too.

github-actions bot added the new issue label Mar 20, 2023

EugeneZelenko added backend:AArch64 missed-optimization and removed new issue labels Mar 20, 2023

junaire added llvm:instcombine and removed backend:AArch64 labels Apr 5, 2023

junaire changed the title ~~[AArch64] optimization of "(1 + 2 * a * b > 0) ? a : a * b"~~ [InstCombine] Missing fold for (1 + 2 * a * b > 0) ? a : a * b --> a * b < 0 ? a * b : a Apr 5, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[InstCombine] Missing fold for (1 + 2 * a * b > 0) ? a : a * b --> a * b < 0 ? a * b : a #61538

[InstCombine] Missing fold for (1 + 2 * a * b > 0) ? a : a * b --> a * b < 0 ? a * b : a #61538

k-arrows commented Mar 20, 2023

llvmbot commented Mar 20, 2023

junaire commented Apr 5, 2023

junaire commented Apr 5, 2023 •

edited

k-arrows commented Apr 5, 2023

junaire commented Apr 5, 2023

junaire commented Apr 5, 2023

nikic commented Apr 13, 2023

junaire commented Apr 13, 2023

k-arrows commented Apr 13, 2023

nikic commented Apr 13, 2023

junaire commented Apr 13, 2023 •

edited

hamza-spl commented Jun 5, 2023 •

edited

pinskia commented Dec 25, 2023

[InstCombine] Missing fold for (1 + 2 * a * b > 0) ? a : a * b --> a * b < 0 ? a * b : a #61538

[InstCombine] Missing fold for (1 + 2 * a * b > 0) ? a : a * b --> a * b < 0 ? a * b : a #61538

Comments

k-arrows commented Mar 20, 2023

llvmbot commented Mar 20, 2023

junaire commented Apr 5, 2023

junaire commented Apr 5, 2023 • edited

k-arrows commented Apr 5, 2023

junaire commented Apr 5, 2023

junaire commented Apr 5, 2023

nikic commented Apr 13, 2023

junaire commented Apr 13, 2023

k-arrows commented Apr 13, 2023

nikic commented Apr 13, 2023

junaire commented Apr 13, 2023 • edited

hamza-spl commented Jun 5, 2023 • edited

pinskia commented Dec 25, 2023

junaire commented Apr 5, 2023 •

edited

junaire commented Apr 13, 2023 •

edited

hamza-spl commented Jun 5, 2023 •

edited