-
Notifications
You must be signed in to change notification settings - Fork 10.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[PowerPC] Check value uses in ValueBit tracking #66040
Conversation
a5d5588
to
ebaafdd
Compare
Gentle ping... any comments? |
Though I do see some codegen improvement, I don't think we should check uses inside |
The motivating case: define i64 @test(i64 %a) {
entry:
%x0 = shl i64 %a, 8
%x1 = and i64 %a, 255
%x2 = or i64 %x0, %x1
%x3 = shl i64 %x2, 16
%x4 = and i64 %x2, 65535
%x5 = or i64 %x3, %x4
%x6 = shl i64 %x5, 32
%x7 = and i64 %x5, 4294967295
%x8 = or i64 %x6, %x7
ret i64 %x8
}
Bit permutation is not good at selecting globally optimized result. This is the limitation until a better selector respecting rotate operations is introduced. |
f201509
to
48a3374
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TBH, this is more like a workaround. Now you replace some known bits with Variable
for operands that have more than one uses. This sounds not good. We should try to get analysis info as accurate as we can. The issue is how we use the analysis info and generate the instructions based on the analysis info. As you can see, now with this patch, the case llvm/test/CodeGen/PowerPC/loop-instr-form-prepare.ll
becomes worse.
For the improvement in llvm/test/CodeGen/PowerPC/rldimi.ll
, as this IR is generated from ppc builtin __rldimi
, I think the right way to fix this maybe adding an ppc intrinsic for the builtin and map the __rldimi
to the new intrinsic and then select the intrinsic in the PPC ISEL td files. Since developer explicitly uses __rldimi
, what the developer expects should be the corresponding PPC instruction rldimi? What do you think?
We can definitely create a new intrinsic for Yes, this patch is indeed a workaround. The reason of this suboptimal codegen is local greedy BitPermutation selector. For longer consideration, we need modeling and recombination of rotate-clear/mask/insert operations in backend. If that shows good result, BitPermutation can be dropped. |
I think we just need to add intrinsic for builtin that are currently supported by clang. See https://www.ibm.com/docs/en/openxl-c-and-cpp-aix/17.1.2?topic=rf-rldimi-builtin-ppc-rldimi-rlwimi-builtin-ppc-rlwimi and https://www.ibm.com/docs/en/openxl-c-and-cpp-aix/17.1.2?topic=functions-rlwnm-builtin-ppc-rlwnm . So we need 3 different intrinsics.
Agreed. From the other two cases diff, we indeed have some issue for the BitPermutation logic. We need a redesign here. |
I opened #82968 to replace this one since both patches differ a lot. |
PowerPC instruction selection tracks bits for every series of bitwise
operations. But in some cases the method selects worse result for complex
PowerPC rotate instructions. Check use of visited operands to avoid suboptimal
analysis.