[opt] Failure to recognise equivalent shuffled ops #60632
Comments
We've seen this kind of problem before, but I'm not finding a previous report. Either way, we never had a solution. Here's a reduction of the original example:
define <2 x i4> @src(<2 x i4> %x, <2 x i4> %y) {
%xsplat = shufflevector <2 x i4> %x, <2 x i4> poison, <2 x i32> zeroinitializer
%vv = mul <2 x i4> %xsplat, %y
%m = mul <2 x i4> %x, %y
%msplat = shufflevector <2 x i4> %m, <2 x i4> poison, <2 x i32> zeroinitializer
%res = add <2 x i4> %vv, %msplat
ret <2 x i4> %res
}
define <2 x i4> @tgt(<2 x i4> %x, <2 x i4> %y) {
%xsplat = shufflevector <2 x i4> %x, <2 x i4> poison, <2 x i32> zeroinitializer
%vv = mul <2 x i4> %xsplat, %y
%msplat = shufflevector <2 x i4> %vv, <2 x i4> poison, <2 x i32> zeroinitializer
%res = add <2 x i4> %vv, %msplat
ret <2 x i4> %res
}
And here's an example with an extractelement rather than a splat at the end:
declare void @use(<2 x i4>)
define i4 @src(<2 x i4> %x, <2 x i4> %y) {
%xsplat = shufflevector <2 x i4> %x, <2 x i4> poison, <2 x i32> zeroinitializer
%vv = mul <2 x i4> %xsplat, %y
call void @use(<2 x i4> %vv)
%m = mul <2 x i4> %x, %y
%m0 = extractelement <2 x i4> %m, i32 0
ret i4 %m0
}
define i4 @tgt(<2 x i4> %x, <2 x i4> %y) {
%xsplat = shufflevector <2 x i4> %x, <2 x i4> poison, <2 x i32> zeroinitializer
%vv = mul <2 x i4> %xsplat, %y
call void @use(<2 x i4> %vv)
%m0 = extractelement <2 x i4> %vv, i32 0
ret i4 %m0
}
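Both src/tgt pairs rely on lane 0 of `mul(splat0(x), y)` being equal to lane 0 of `mul(x, y)`. A minimal Python sketch, assuming wrapping 4-bit arithmetic and ignoring poison/undef entirely, checks this exhaustively over all `<2 x i4>` inputs. The helper names (`splat0`, `src_splat`, `equivalent`, and so on) are made up for illustration; this is a sanity model, not a substitute for an Alive2 proof.

```python
from itertools import product

MASK = 0xF  # i4 arithmetic wraps modulo 16


def splat0(v):
    """Model shufflevector with a zeroinitializer mask: broadcast lane 0."""
    return (v[0], v[0])


def mul(a, b):
    """Lane-wise wrapping i4 multiply."""
    return tuple((p * q) & MASK for p, q in zip(a, b))


def add(a, b):
    """Lane-wise wrapping i4 add."""
    return tuple((p + q) & MASK for p, q in zip(a, b))


def src_splat(x, y):
    vv = mul(splat0(x), y)
    m = mul(x, y)  # the redundant multiply
    return add(vv, splat0(m))


def tgt_splat(x, y):
    vv = mul(splat0(x), y)
    return add(vv, splat0(vv))  # reuse %vv instead of %m


def src_extract(x, y):
    return mul(x, y)[0]


def tgt_extract(x, y):
    return mul(splat0(x), y)[0]


def all_vectors():
    """Every possible <2 x i4> value as a tuple of two ints in [0, 15]."""
    return product(range(16), repeat=2)


def equivalent(f, g):
    """True if f and g agree on every pair of <2 x i4> inputs."""
    return all(f(x, y) == g(x, y) for x in all_vectors() for y in all_vectors())
```

Calling `equivalent(src_splat, tgt_splat)` and `equivalent(src_extract, tgt_extract)` compares the two functions on all 65536 input pairs.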
If we consider looking at this in the DAG - we already have SelectionDAG::doesNodeExist - I wonder how well a SelectionDAG::doesShuffleNodeExist would work in helping us see if an equivalent shuffle already exists? Then the existing shuffle(binop(x,y)) -> binop(shuffle(x),shuffle(y)) folds might be extended to use it. It wouldn't help with the extractelement case though.
… shuffle-of-binop This fold was added with https://reviews.llvm.org/D135876, but we missed the one-use check. This might be the root cause for issue #60632.
Attempt to get this in IR via demanded elements:
Also, I'm not sure if 40d772c changed anything for the motivating program(s) - might want to see how the redundant math is created originally.
…undant instructions In issue #60632, we have vector math ops that differ because an operand is shuffled, but the math has limited demanded elements, so it can be replaced by another instruction: https://alive2.llvm.org/ce/z/TKqq7H I don't think we have anything like this yet - it's like a CSE/GVN fold, but driven by demanded elements of a vector op. This is limited to splat-0 as a first step to keep it simple. Differential Revision: https://reviews.llvm.org/D144760
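The idea of a CSE-like fold driven by demanded elements can be illustrated with a toy model. The sketch below is not the implementation from D144760; `Inst`, `strip_splat0`, and `find_replacement` are hypothetical names, and the model assumes only `mul`, only splat-0 shuffles, and no commuted operands. When just lane 0 of an instruction is demanded, a splat-0 on any operand does not change the demanded lane, so two instructions that match after looking through splat-0 are interchangeable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inst:
    """Toy instruction: an opcode plus operands (other Insts or argument names)."""
    opcode: str
    operands: tuple


def splat0(v):
    return Inst("splat0", (v,))


def mul(a, b):
    return Inst("mul", (a, b))


def strip_splat0(v):
    """Look through a splat-0 shuffle; lane 0 of splat0(v) equals lane 0 of v."""
    if isinstance(v, Inst) and v.opcode == "splat0":
        return v.operands[0]
    return v


def find_replacement(inst, demanded_lanes, existing):
    """Return an existing instruction that agrees with `inst` on lane 0, or None.

    Only applies when lane 0 is the sole demanded lane (assumption of this sketch).
    """
    if demanded_lanes != {0} or inst.opcode != "mul":
        return None
    key = tuple(strip_splat0(op) for op in inst.operands)
    for other in existing:
        if other is inst or other.opcode != inst.opcode:
            continue
        if tuple(strip_splat0(op) for op in other.operands) == key:
            return other
    return None


# Mirrors the extractelement example: %m is only used through lane 0.
vv = mul(splat0("x"), "y")
m = mul("x", "y")
replacement = find_replacement(m, {0}, [vv, m])
```

Here `replacement` is `vv`, matching the `@tgt` function where the extractelement reads from `%vv`. If both lanes were demanded, the function returns `None` and the multiply stays.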
https://gcc.godbolt.org/z/dnsebnPz5
We're seeing cases where multiple uses of a node are preventing vector-combine from merging equivalent shuffles.
Sorry the test case is still more convoluted than necessary :(
opt -O3
As the %2 fmul case will be splatted, we should be able to use %vv again: