-
Notifications
You must be signed in to change notification settings - Fork 11k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DAGCombiner] Freeze maybe poison operands when folding select to logic #84924
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-backend-powerpc @llvm/pr-subscribers-llvm-selectiondag Author: Björn Pettersson (bjope) ChangesWork-in-progress, to fix #84653 Full diff: https://github.com/llvm/llvm-project/pull/84924.diff 1 Files Affected:
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 5476ef87971436..46675f94642cc9 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -11344,28 +11344,34 @@ static SDValue foldBoolSelectToLogic(SDNode *N, SelectionDAG &DAG) {
if (VT != Cond.getValueType() || VT.getScalarSizeInBits() != 1)
return SDValue();
- // select Cond, Cond, F --> or Cond, F
- // select Cond, 1, F --> or Cond, F
+ auto FreezeIfNeeded = [&](SDValue V) {
+ if (!DAG.isGuaranteedNotToBePoison(V))
+ return DAG.getFreeze(V);
+ return V;
+ };
+
+ // select Cond, Cond, F --> or Cond, freeze(F)
+ // select Cond, 1, F --> or Cond, freeze(F)
if (Cond == T || isOneOrOneSplat(T, /* AllowUndefs */ true))
- return matcher.getNode(ISD::OR, SDLoc(N), VT, Cond, F);
+ return matcher.getNode(ISD::OR, SDLoc(N), VT, Cond, FreezeIfNeeded(F));
// select Cond, T, Cond --> and Cond, T
// select Cond, T, 0 --> and Cond, T
if (Cond == F || isNullOrNullSplat(F, /* AllowUndefs */ true))
- return matcher.getNode(ISD::AND, SDLoc(N), VT, Cond, T);
+ return matcher.getNode(ISD::AND, SDLoc(N), VT, Cond, FreezeIfNeeded(T));
// select Cond, T, 1 --> or (not Cond), T
if (isOneOrOneSplat(F, /* AllowUndefs */ true)) {
SDValue NotCond = matcher.getNode(ISD::XOR, SDLoc(N), VT, Cond,
DAG.getAllOnesConstant(SDLoc(N), VT));
- return matcher.getNode(ISD::OR, SDLoc(N), VT, NotCond, T);
+ return matcher.getNode(ISD::OR, SDLoc(N), VT, NotCond, FreezeIfNeeded(T));
}
// select Cond, 0, F --> and (not Cond), F
if (isNullOrNullSplat(T, /* AllowUndefs */ true)) {
SDValue NotCond = matcher.getNode(ISD::XOR, SDLoc(N), VT, Cond,
DAG.getAllOnesConstant(SDLoc(N), VT));
- return matcher.getNode(ISD::AND, SDLoc(N), VT, NotCond, F);
+ return matcher.getNode(ISD::AND, SDLoc(N), VT, NotCond, FreezeIfNeeded(F));
}
return SDValue();
@@ -11394,12 +11400,18 @@ static SDValue foldVSelectToSignBitSplatMask(SDNode *N, SelectionDAG &DAG) {
else
return SDValue();
+ auto FreezeIfNeeded = [&](SDValue V) {
+ if (!DAG.isGuaranteedNotToBePoison(V))
+ return DAG.getFreeze(V);
+ return V;
+ };
+
// (Cond0 s< 0) ? N1 : 0 --> (Cond0 s>> BW-1) & N1
if (isNullOrNullSplat(N2)) {
SDLoc DL(N);
SDValue ShiftAmt = DAG.getConstant(VT.getScalarSizeInBits() - 1, DL, VT);
SDValue Sra = DAG.getNode(ISD::SRA, DL, VT, Cond0, ShiftAmt);
- return DAG.getNode(ISD::AND, DL, VT, Sra, N1);
+ return DAG.getNode(ISD::AND, DL, VT, Sra, FreezeIfNeeded(N1));
}
// (Cond0 s< 0) ? -1 : N2 --> (Cond0 s>> BW-1) | N2
@@ -11407,7 +11419,7 @@ static SDValue foldVSelectToSignBitSplatMask(SDNode *N, SelectionDAG &DAG) {
SDLoc DL(N);
SDValue ShiftAmt = DAG.getConstant(VT.getScalarSizeInBits() - 1, DL, VT);
SDValue Sra = DAG.getNode(ISD::SRA, DL, VT, Cond0, ShiftAmt);
- return DAG.getNode(ISD::OR, DL, VT, Sra, N2);
+ return DAG.getNode(ISD::OR, DL, VT, Sra, FreezeIfNeeded(N2));
}
// If we have to invert the sign bit mask, only do that transform if the
@@ -11419,7 +11431,7 @@ static SDValue foldVSelectToSignBitSplatMask(SDNode *N, SelectionDAG &DAG) {
SDValue ShiftAmt = DAG.getConstant(VT.getScalarSizeInBits() - 1, DL, VT);
SDValue Sra = DAG.getNode(ISD::SRA, DL, VT, Cond0, ShiftAmt);
SDValue Not = DAG.getNOT(DL, Sra, VT);
- return DAG.getNode(ISD::AND, DL, VT, Not, N2);
+ return DAG.getNode(ISD::AND, DL, VT, Not, FreezeIfNeeded(N2));
}
// TODO: There's another pattern in this family, but it may require
|
@@ -6,21 +6,14 @@ | |||
|
|||
; (x0 < x1) && (x2 > x3) | |||
define i32 @cmp_and2(i32 %0, i32 %1, i32 %2, i32 %3) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the typical problem here is that the arguments aren't marked as noundef
. But even if they had been marked as noundef
the lowering of arguments wouldn't resulting adding FREEZE nodes to indicate that the inputs aren't poison. Nor do we have a way to put such attributes on the CopyFromReg (or loads from stack) instructions that would map to accessing the arguments.
(No idea if there is something in AArch64 that could be changed here to still optimize this in a valid way or if the old transform is unsafe from a poison/undef perspective.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does SDAG ever look through CopyFromReg in some way? If not, we could assume that the value is frozen.
We can definitely treat function arguments as frozen in SDAG, I'm just not familiar with where else CopyFromReg gets used...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just not familiar with where else CopyFromReg gets used
SDAG operates per basic block, and I believe CopyFromReg/CopyToReg are generated for any use/def of a "global" value, i.e. one that is live outside of the current basic block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually started on a patch that simply treated CopyFromReg similar as FREEZE in SelectionDAG::isGuaranteedNotToBeUndefOrPoison. Kind of based on the comments in ISDOpcodes.h about CopyFromReg loading from a register "that is defined outside of the scope of this SelectionDAG". So one could for example assume that we get the same value.
I'm not sure however if there could be other problems. On IR level we only treat arguments as not being poison depending on the noundef attribute. So then I started to think that maybe it was more complicated also for SelectionDAG. But maybe we are saved by only looking at one BB at a time.
I think CopyFromReg can be used on any BB, also dealing with values passed on from one BB to another. So it is at least not only about input arguments to the function (but maybe it can be seen as input arguments to the basic block). Would it for example be a problem if the value tracking would make assumtions based on dominating conditions in predecessor basic blocks? But maybe we avoid such things.
(I know that there were some discussion a couple of years back about poison in SelectionDAG, https://discourse.llvm.org/t/funnel-shift-select-and-poison/51255/25, but that never really landed in any formal decision about how to deal with it.)
@@ -295,22 +295,28 @@ define float @select_icmp_sle(i32 %x, i32 %y, float %a, float %b) { | |||
; Test peephole optimizations for select. | |||
define zeroext i1 @select_opt1(i1 zeroext %c, i1 zeroext %a) { | |||
; CHECK-LABEL: select_opt1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I figure that these diffs are a bit annoying. But given that the AssertZext/AssertSext are handles as maybe producing poison we can't hoist the freeze above it. And then I think we lose the optimizations based on that the inputs are masked down to a single bit. But maybe this also kind of originates from CopyFromReg not being treated as an implicit FREEZE (at least if the arguments would have been marked as noundef
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another thing here is that the fast-isel code still optimizes this case. So either there is a bug in fast-isel that it doesn't consider poison here. Or we should be allowed to optimize this also for the regular SelectionDAG ISel.
Worth mentioning is that GlobalISel isn't part of the checks here, but it seems like GlobalISel also would emit the explicit AND.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have wondered whether SelectionDAGBuilder should be inserting freeze to incoming nodes if we know they aren't undef/poison - that would allow freeze(assertzext(x)) or whatever?
Hi @bjope, could you please add the following test for #85190?
|
// select Cond, Cond, F --> or Cond, F | ||
// select Cond, 1, F --> or Cond, F | ||
auto FreezeIfNeeded = [&](SDValue V) { | ||
if (!DAG.isGuaranteedNotToBePoison(V)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a plan to implement impliesPoison
for SDAG
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getNode(ISD::FREEZE) should do this for us?
llvm-project/llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp
Lines 5718 to 5723 in 02cb89b
case ISD::FREEZE: | |
assert(VT == N1.getValueType() && "Unexpected VT!"); | |
if (isGuaranteedNotToBeUndefOrPoison(N1, /*PoisonOnly*/ false, | |
/*Depth*/ 1)) | |
return N1; | |
break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. Nice!
Downstream I also needed some additional checks to just do this for my target (since it is a pain doing it for all targets until this has landed upstream). So that is one reason why I've wrapped the check this way.
Ping for review :) |
llvm/test/CodeGen/PowerPC/pr40922.ll
Outdated
; CHECK-NEXT: crorc 20, 1, 4 | ||
; CHECK-NEXT: andi. 5, 5, 1 | ||
; CHECK-NEXT: crnot 20, 4 | ||
; CHECK-NEXT: cror 20, 1, 20 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This becomes worse. A combine for i1 type seems broken. Let me check why this happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
t43: i32,ch = load<(dereferenceable load (s32) from @a.b)> t10, t63, undef:i32
t44: i32,glue = addc t43, Constant:i32<6>
t46: i32 = and t44, Constant:i32<-17>
t48: i1 = setcc t46, t43, setuge:ch
t54: i1 = freeze t48 ;;This freeze breaks the instruction selection pattern for or(setcc) in PPC td files.
t55: i1 = or t58, t54
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a situation when we do not fold the freeze over an operation (in this case a setcc) due to the operation having more then one operand that may be poison.
However, we got
t43: i32,ch = load<(dereferenceable load (s32) from @a.b)> t10, GlobalAddress:i32<ptr @a.b> 0, undef:i32
t44: i32,glue = addc t43, Constant:i32<6>
t46: i32 = and t44, Constant:i32<-17>
t48: i1 = setcc t46, t43, setuge:ch
t54: i1 = freeze t48
so I think that t46 and t43 only could be poison/undef if t43 is poison/undef. So it would be enough to freeze t43, like this:
t43: i32,ch = load<(dereferenceable load (s32) from @a.b)> t10, GlobalAddress:i32<ptr @a.b> 0, undef:i32
t99: i32 = freeze t43
t44: i32,glue = addc t99, Constant:i32<6>
t46: i32 = and t44, Constant:i32<-17>
t54: i1 = setcc t46, t99, setuge:ch
Not exactly sure, but maybe if it would be possible to improve the analysis in DAGCombiner::visitFREEZE. Idea would be to allow pushing freeze through more operations (with multiple maybe poison/undef operands as we do for BUILD_VECTOR etc)). One could limit it to the scenraio when the isGuaranteedNotToBeUndefOrPoison checks for those operands reaches the same DAG node when recursing, i.e. when the likelyhood is that we still would end up with a single freeze in the end.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For setcc, it's probably profitable to always push freeze, even if there are multiple operands.
The regressions needs to be mitigated to the maximum extent feasible before we make this change. |
Ping? |
Allow pushing freeze through SETCC and SELECT_CC even if there are multiple "maybe poison" operands. In the past we have limited it to a single "maybe poison" operand, but it seems profitable to also allow the multiple operand scenario. One goal here is to avoid some regressions seen in review of llvm#84924 when solving the select->and miscompiles described in llvm#84653
Allow pushing freeze through SETCC and SELECT_CC even if there are multiple "maybe poison" operands. In the past we have limited it to a single "maybe poison" operand, but it seems profitable to also allow the multiple operand scenario. One goal here is to avoid some regressions seen in review of llvm#84924 when solving the select->and miscompiles described in llvm#84653
define zeroext i1 @select_opt2(i1 zeroext %c, i1 zeroext %a) { | ||
; CHECK-LABEL: select_opt2 | ||
; CHECK: eor [[REG:w[0-9]+]], w0, #0x1 | ||
; CHECK: orr {{w[0-9]+}}, [[REG]], w1 | ||
; SISEL: orn [[REG:w[0-9]+]], w1, w0 | ||
; SISEL: and w0, [[REG]], #0x1 | ||
; FISEL: eor [[REG:w[0-9]+]], w0, #0x1 | ||
; FISEL: orr {{w[0-9]+}}, [[REG]], w1 | ||
%1 = select i1 %c, i1 %a, i1 true | ||
ret i1 %1 | ||
} | ||
|
||
define zeroext i1 @select_opt3(i1 zeroext %c, i1 zeroext %a) { | ||
; CHECK-LABEL: select_opt3 | ||
; CHECK: bic {{w[0-9]+}}, w1, w0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add copies of regression testcases with noundef arguments that show the original output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding noundef to the arguments doesn't help here. There will still be an AssertZext being added when lowering the arguments.
Not quite sure how we should deal with the (freeze(AssertZext)).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess we should have an equivalent AssertNoundef or something, but the intersection of all the Assert*s corresponding to argument attributes is not great
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, one idea is that we get rid of the known, IMHO, misuse of AssertSext/AssertZext in DAGTypeLegalizer::PromoteIntRes_FP_TO_XINT. And then try to check all other places that create AssertSext/AssertZext nodes to see if they are violating the desciption of those ISD nodes (i.e. that they "record if a register contains a value that has already been zero or sign extended"). In the PromoteIntRes_FP_TO_XINT those nodes are used to indicate that it is the result is poison if the input violates the assert. But the desciption of the assert nodes just say that the result is poison if the input is poison.
This probably result in various regressions, so maybe not acceptable.
Another idea would be to add some kind of boolean parameter to AssertSext/AssertZext that makes it possible to indicate that the assert node only propagates posion or if it can create poison. This flag could then be used in situations like PromoteIntRes_FP_TO_XINT.
OTOH, considering that #66603 still is open. Maybe the revert in 8505c3b wasn't needed after all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OTOH, considering that #66603 still is open. Maybe the revert in 8505c3b wasn't needed after all?
The issue is still open due to an optimization problem -- the revert was needed to fix a correctness problem.
Another idea would be to add some kind of boolean parameter to AssertSext/AssertZext that makes it possible to indicate that the assert node only propagates posion or if it can create poison. This flag could then be used in situations like PromoteIntRes_FP_TO_XINT.
I think to make this actually useful the statement would have to be that violating the AssertZext with the flag is immediate UB (including the case where the input is poison).
I do think that something in this direction would make sense.
Something I'm unsure about is to what degree nodes with IUB in SDAG are safe without a chain operand -- we have known miscompiles from the fact that division by zero UB is not modeled via chain.
Just like for regular IR we need to treat SELECT as conditionally blocking poison. So (unless the condition itself is poison) the result is only poison if the selected true/false value is poison. Thus, when doing DAG combines that turn SELECT into arithmetic/logical operations (e.g. AND/OR) we need to make sure that the new operations aren't more poisonous. One way to do that is to use FREEZE to make sure the operands aren't posion. This patch aims at fixing the kind of miscompiles reported in llvm#84653 and llvm#85190 Solution is to make sure that we insert FREEZE, if needed to make the fold sound, when using the foldBoolSelectToLogic and foldVSelectToSignBitSplatMask DAG combines. This may result in some (hopefully minor) regressions since we lack some ways to fold away the freeze (or due to isGuaranteedNotToBePoison being too pessimistic). Focus in this patch is to just avoid miscompiles, but I think some of the regressions can be avoided by general improvements regarding poison/freeze handling in SelectionDAG.
Allow pushing freeze through SETCC and SELECT_CC even if there are multiple "maybe poison" operands. In the past we have limited it to a single "maybe poison" operand, but it seems profitable to also allow the multiple operand scenario. One goal here is to avoid some regressions seen in review of llvm#84924 when solving the select->and miscompiles described in llvm#84653
You can test this locally with the following command:git-clang-format --diff 67aec0cd62d607b4e5b7198769be061454ce67b3 e57c1432d57227969a9d6a4f19fd5469a39a4a14 -- llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp View the diff from clang-format here.diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 13683cd288..8f1bda675c 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -15582,8 +15582,7 @@ SDValue DAGCombiner::visitFREEZE(SDNode *N) {
return SDValue();
bool AllowMultipleMaybePoisonOperands =
- N0.getOpcode() == ISD::SELECT_CC ||
- N0.getOpcode() == ISD::SETCC ||
+ N0.getOpcode() == ISD::SELECT_CC || N0.getOpcode() == ISD::SETCC ||
N0.getOpcode() == ISD::BUILD_VECTOR ||
N0.getOpcode() == ISD::BUILD_PAIR ||
N0.getOpcode() == ISD::VECTOR_SHUFFLE ||
|
I've rebased this (again). But I'm not quite sure how to proceed.
|
I think we mainly need a solution for the "AssertZExt/AssertSExt on argument" problem.
Probably, especially if the latter has less test regressions?
Yes, please. If there are no test changes as-is, you could add a test with explicit freeze. |
Thanks for the advice! I'll split this patch in smaller pieces (at least breaking out the "Push freeze through SETCC and SELECT_CC" part). |
Allow pushing freeze through SETCC and SELECT_CC even if there are multiple "maybe poison" operands. In the past we have limited it to a single "maybe poison" operand, but it seems profitable to also allow the multiple operand scenario. One goal here is to avoid some regressions seen in review of llvm#84924 when solving the select->and miscompiles described in llvm#84653
Maybe we should introduce ISD::LOGICAL_AND/ISD::LOGICAL_OR nodes to SelectionDAG? If we delay the lowering to actual AND/OR until operation legalization, we can do various combines without worrying about a bunch of extra freeze operations. |
Yes, that would be an alternative -- this is essentially what we do in IR, though there we just keep the original select but recognize it as a logical and/or pattern. The advantage is that we don't have to introduce freeze -- the disadvantage is that we have to make sure that all optimizations can handle the logical variant. On the IR side, this took some very extensive work. Maybe it's simpler for SDAG as it presumably has less transforms reasoning about complex conditions. |
Reverse ping :) |
With the LLVM 19 release coming up, we might have to take this as-is :( |
Just like for regular IR we need to treat SELECT as conditionally
blocking poison in SelectionDAG. So (unless the condition itself is
poison) the result is only poison if the selected true/false value is
poison.
Thus, when doing DAG combines that turn SELECT into arithmetic/logical
operations (e.g. AND/OR) we need to make sure that the new operations
aren't more poisonous. One way to do that is to use FREEZE to make
sure the operands aren't posion.
This patch aims at fixing the kind of miscompiles reported in
#84653
and
#85190
Solution is to make sure that we insert FREEZE, if needed to make
the fold sound, when using the foldBoolSelectToLogic and
foldVSelectToSignBitSplatMask DAG combines.
This may result in some (hopefully minor) regressions since we lack
some ways to fold away the freeze (or due to isGuaranteedNotToBePoison
being too pessimistic). Focus in this patch is to just avoid
miscompiles, but I think some of the regressions can be avoided by
general improvements regarding poison/freeze handling in SelectionDAG.