Missed optimizations in conditionally selected constants #53006

uncleasm · 2022-01-05T14:38:05Z

Compared to gcc, clang produces slightly more verbose code for a sequence like this:

int test1(int a) { return a ? -1 : 1; }
        xor     eax, eax
        test    edi, edi
        sete    al
        add     eax, eax
        add     eax, -1
        ret

AFAIK, the optimal sequence is

neg edi
sbb eax, eax
or eax, 1

which could be derived from

int test0(int a) { return a ? -1 : 0; }
        neg     edi
        sbb     eax, eax
        ret

int test1b(int a) { return test0(a) | 1; }

As a side note, gcc can produce the expected output from test1b but not from test1.
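The equivalence between test1 and test1b can be checked directly in C. The following is a minimal sketch, not part of the original report: `forms_agree` and its sample values are hypothetical names chosen for illustration, and it assumes a two's-complement `int`, as on x86.

```c
#include <assert.h>
#include <limits.h>

/* Original form from the report: select between two constants. */
static int test1(int a) { return a ? -1 : 1; }

/* All-ones mask when a is nonzero, zero otherwise (what neg+sbb computes). */
static int test0(int a) { return a ? -1 : 0; }

/* Rewritten form: OR-ing 1 into the mask yields -1 or 1. */
static int test1b(int a) { return test0(a) | 1; }

/* Hypothetical helper: returns 1 if both forms agree on a few edge inputs. */
static int forms_agree(void) {
    const int samples[] = { 0, 1, -1, 2, 42, INT_MAX, INT_MIN };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i) {
        if (test1(samples[i]) != test1b(samples[i]))
            return 0;
    }
    return 1;
}
```

Calling `assert(forms_agree());` from a `main` function would confirm that the two source forms compute the same value, so any difference in the generated code is purely a matter of pattern matching in the compiler.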


RKSimon · 2022-01-06T13:59:08Z

Godbolt: https://godbolt.org/z/hd9W3hWr5

CC @rotateright @LebedevRI

rotateright · 2022-01-06T13:59:33Z

We have combining/lowering to create the neg+sbb already. It could be enhanced to match a new pattern and tack on the 'or' as the final op.

rotateright · 2022-01-06T20:47:42Z

This won't do anything for this exact example, but here's a proposal to improve the existing x86 lowering for select via SBB:
https://reviews.llvm.org/D116765

select (X != 0), -1, Y --> 0 - X; or (sbb), Y
select (X != 0), Y, -1 --> X - 1; or (sbb), Y

We already had these x86 carry-flag transforms, but one was over-specified to handle a "0" select arm only. That's just a special-case of the more general pattern (the 'or' will be deleted if Y is zero).

This is part of solving #53006, but it misses that example because some other combine has already converted that exact pattern into math ops.

Differential Revision: https://reviews.llvm.org/D116765
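The two carry-flag transforms in that commit message can be modeled in plain C. This is only a sketch of the arithmetic, not LLVM's implementation: the function names are hypothetical, the carry flag is represented as an ordinary integer, and `sbb_mask` stands in for `sbb reg, reg`, which computes 0 minus the carry.

```c
#include <assert.h>
#include <stdint.h>

/* Models "sbb reg, reg": 0 - CF, i.e. all ones when the carry is set. */
static uint32_t sbb_mask(unsigned carry) { return 0u - (uint32_t)(carry & 1u); }

/* select (X != 0), -1, Y: computing 0 - X sets the carry exactly when X != 0. */
static uint32_t select_ne_allones_y(uint32_t x, uint32_t y) {
    unsigned carry = (x != 0u);
    return sbb_mask(carry) | y;
}

/* select (X != 0), Y, -1: computing X - 1 borrows exactly when X == 0. */
static uint32_t select_ne_y_allones(uint32_t x, uint32_t y) {
    unsigned carry = (x < 1u);
    return sbb_mask(carry) | y;
}

/* Hypothetical helper: compares both models against the plain selects. */
static int transforms_match(void) {
    const uint32_t xs[] = { 0u, 1u, 2u, 0x80000000u, UINT32_MAX };
    const uint32_t ys[] = { 0u, 1u, 5u, 0x7fffffffu, UINT32_MAX };
    for (unsigned i = 0; i < sizeof xs / sizeof xs[0]; ++i) {
        for (unsigned j = 0; j < sizeof ys / sizeof ys[0]; ++j) {
            uint32_t x = xs[i], y = ys[j];
            if (select_ne_allones_y(x, y) != (x != 0u ? UINT32_MAX : y))
                return 0;
            if (select_ne_y_allones(x, y) != (x != 0u ? y : UINT32_MAX))
                return 0;
        }
    }
    return 1;
}
```

When Y is zero the final OR is a no-op, which matches the note that the 'or' is deleted in that case and only the mask remains.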

rotateright · 2022-01-09T14:23:27Z

We should be using the sbb hack on this example and more often in general now. Whether that translates to better real-world performance is an open question though.

There's a false dependency hazard with sbb that might exist on some (Intel) uarch, and it might be better to use cmov, as seen in the example from #53071.

github-actions bot added the new issue label Jan 5, 2022

rotateright added the backend:X86 label Jan 6, 2022

RKSimon mentioned this issue Jan 8, 2022

[optimization] gcc generate better code than clang base on conditionally selection #53071

Closed

rotateright closed this as completed in e745507 Jan 9, 2022

EugeneZelenko removed the new issue label Jan 19, 2022
