Missed optimizations in conditionally selected constants #53006
Comments
Godbolt: https://godbolt.org/z/hd9W3hWr5
We have combining/lowering to create the neg+sbb already. It could be enhanced to match a new pattern and tack on the 'or' as the final op.
This won't do anything for this exact example, but here's a proposal to improve the existing x86 lowering for select via SBB:
select (X != 0), -1, Y --> 0 - X; or (sbb), Y
select (X != 0), Y, -1 --> X - 1; or (sbb), Y
We already had these x86 carry-flag transforms, but one was over-specified to handle a "0" select arm only. That's just a special case of the more general pattern (the 'or' will be deleted if Y is zero).
This is part of solving #53006, but it misses that example because some other combine has already converted that exact pattern into math ops.
Differential Revision: https://reviews.llvm.org/D116765
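To make the two carry-flag patterns above concrete, here is a rough C model of what the neg/cmp + sbb + or sequences compute. This is only a sketch: the function names are made up for illustration, the carry flag is emulated with a comparison, and it assumes 32-bit operands. It says nothing about what the backend actually emits for a given input.

```c
#include <stdint.h>

/* Reference semantics: select (x != 0), -1, y */
uint32_t select_ne_allones_ref(uint32_t x, uint32_t y) {
    return x != 0 ? UINT32_MAX : y;
}

/* Model of "0 - x; sbb r, r; or r, y" (hypothetical helper name).
   0 - x sets the carry flag exactly when x != 0, and sbb r, r
   computes r - r - CF, i.e. all-ones if CF is set and 0 otherwise. */
uint32_t select_ne_allones_sbb(uint32_t x, uint32_t y) {
    uint32_t cf = (uint32_t)(x != 0); /* emulated CF after 0 - x */
    uint32_t mask = 0u - cf;          /* emulated sbb r, r */
    return mask | y;                  /* the final 'or' */
}

/* Reference semantics: select (x != 0), y, -1 */
uint32_t select_ne_y_ref(uint32_t x, uint32_t y) {
    return x != 0 ? y : UINT32_MAX;
}

/* Model of "x - 1; sbb r, r; or r, y" (hypothetical helper name).
   x - 1 borrows, so CF is set, exactly when x == 0. */
uint32_t select_ne_y_sbb(uint32_t x, uint32_t y) {
    uint32_t cf = (uint32_t)(x < 1u); /* emulated CF after x - 1 */
    uint32_t mask = 0u - cf;          /* emulated sbb r, r */
    return mask | y;                  /* the final 'or' */
}

/* Compares both models against their references on a few sample values
   and returns the number of disagreements. */
int count_mismatches(void) {
    const uint32_t samples[] = {0u, 1u, 2u, 42u, 0x7fffffffu,
                                0x80000000u, UINT32_MAX};
    const int n = (int)(sizeof samples / sizeof samples[0]);
    int bad = 0;
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            uint32_t x = samples[i], y = samples[j];
            bad += select_ne_allones_sbb(x, y) != select_ne_allones_ref(x, y);
            bad += select_ne_y_sbb(x, y) != select_ne_y_ref(x, y);
        }
    }
    return bad;
}
```

When Y is zero the final 'or' is a no-op, which is why the older "0" select arm case falls out of the general pattern.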
We should be using the sbb hack on this example and more often in general now. Whether that translates to better real-world performance is an open question though. There's a false dependency hazard with sbb that might exist on some (Intel) uarchs, and it might be better to use cmov as seen in the example from #53071.
Compared to gcc, clang produces slightly more verbose code for a sequence
AFAIK, the optimal sequence is
which could be encoded from
As a side note, gcc can produce the expected output from test1b but not from test1.