Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DAGCombiner] Freeze maybe poison operands when folding select to logic #84924

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

bjope
Copy link
Collaborator

@bjope bjope commented Mar 12, 2024

Just like for regular IR we need to treat SELECT as conditionally
blocking poison in SelectionDAG. So (unless the condition itself is
poison) the result is only poison if the selected true/false value is
poison.
Thus, when doing DAG combines that turn SELECT into arithmetic/logical
operations (e.g. AND/OR) we need to make sure that the new operations
aren't more poisonous. One way to do that is to use FREEZE to make
sure the operands aren't posion.

This patch aims at fixing the kind of miscompiles reported in
#84653
and
#85190

Solution is to make sure that we insert FREEZE, if needed to make
the fold sound, when using the foldBoolSelectToLogic and
foldVSelectToSignBitSplatMask DAG combines.

This may result in some (hopefully minor) regressions since we lack
some ways to fold away the freeze (or due to isGuaranteedNotToBePoison
being too pessimistic). Focus in this patch is to just avoid
miscompiles, but I think some of the regressions can be avoided by
general improvements regarding poison/freeze handling in SelectionDAG.

@llvmbot llvmbot added the llvm:SelectionDAG SelectionDAGISel as well label Mar 12, 2024
@bjope bjope marked this pull request as draft March 12, 2024 14:42
@llvmbot
Copy link
Collaborator

llvmbot commented Mar 12, 2024

@llvm/pr-subscribers-backend-powerpc
@llvm/pr-subscribers-backend-systemz
@llvm/pr-subscribers-backend-x86
@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-llvm-selectiondag

Author: Björn Pettersson (bjope)

Changes

Work-in-progress, to fix #84653


Full diff: https://github.com/llvm/llvm-project/pull/84924.diff

1 Files Affected:

  • (modified) llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp (+21-9)
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 5476ef87971436..46675f94642cc9 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -11344,28 +11344,34 @@ static SDValue foldBoolSelectToLogic(SDNode *N, SelectionDAG &DAG) {
   if (VT != Cond.getValueType() || VT.getScalarSizeInBits() != 1)
     return SDValue();
 
-  // select Cond, Cond, F --> or Cond, F
-  // select Cond, 1, F    --> or Cond, F
+  auto FreezeIfNeeded = [&](SDValue V) {
+    if (!DAG.isGuaranteedNotToBePoison(V))
+      return DAG.getFreeze(V);
+    return V;
+  };
+
+  // select Cond, Cond, F --> or Cond, freeze(F)
+  // select Cond, 1, F    --> or Cond, freeze(F)
   if (Cond == T || isOneOrOneSplat(T, /* AllowUndefs */ true))
-    return matcher.getNode(ISD::OR, SDLoc(N), VT, Cond, F);
+    return matcher.getNode(ISD::OR, SDLoc(N), VT, Cond, FreezeIfNeeded(F));
 
   // select Cond, T, Cond --> and Cond, T
   // select Cond, T, 0    --> and Cond, T
   if (Cond == F || isNullOrNullSplat(F, /* AllowUndefs */ true))
-    return matcher.getNode(ISD::AND, SDLoc(N), VT, Cond, T);
+    return matcher.getNode(ISD::AND, SDLoc(N), VT, Cond, FreezeIfNeeded(T));
 
   // select Cond, T, 1 --> or (not Cond), T
   if (isOneOrOneSplat(F, /* AllowUndefs */ true)) {
     SDValue NotCond = matcher.getNode(ISD::XOR, SDLoc(N), VT, Cond,
                                       DAG.getAllOnesConstant(SDLoc(N), VT));
-    return matcher.getNode(ISD::OR, SDLoc(N), VT, NotCond, T);
+    return matcher.getNode(ISD::OR, SDLoc(N), VT, NotCond, FreezeIfNeeded(T));
   }
 
   // select Cond, 0, F --> and (not Cond), F
   if (isNullOrNullSplat(T, /* AllowUndefs */ true)) {
     SDValue NotCond = matcher.getNode(ISD::XOR, SDLoc(N), VT, Cond,
                                       DAG.getAllOnesConstant(SDLoc(N), VT));
-    return matcher.getNode(ISD::AND, SDLoc(N), VT, NotCond, F);
+    return matcher.getNode(ISD::AND, SDLoc(N), VT, NotCond, FreezeIfNeeded(F));
   }
 
   return SDValue();
@@ -11394,12 +11400,18 @@ static SDValue foldVSelectToSignBitSplatMask(SDNode *N, SelectionDAG &DAG) {
   else
     return SDValue();
 
+  auto FreezeIfNeeded = [&](SDValue V) {
+    if (!DAG.isGuaranteedNotToBePoison(V))
+      return DAG.getFreeze(V);
+    return V;
+  };
+
   // (Cond0 s< 0) ? N1 : 0 --> (Cond0 s>> BW-1) & N1
   if (isNullOrNullSplat(N2)) {
     SDLoc DL(N);
     SDValue ShiftAmt = DAG.getConstant(VT.getScalarSizeInBits() - 1, DL, VT);
     SDValue Sra = DAG.getNode(ISD::SRA, DL, VT, Cond0, ShiftAmt);
-    return DAG.getNode(ISD::AND, DL, VT, Sra, N1);
+    return DAG.getNode(ISD::AND, DL, VT, Sra, FreezeIfNeeded(N1));
   }
 
   // (Cond0 s< 0) ? -1 : N2 --> (Cond0 s>> BW-1) | N2
@@ -11407,7 +11419,7 @@ static SDValue foldVSelectToSignBitSplatMask(SDNode *N, SelectionDAG &DAG) {
     SDLoc DL(N);
     SDValue ShiftAmt = DAG.getConstant(VT.getScalarSizeInBits() - 1, DL, VT);
     SDValue Sra = DAG.getNode(ISD::SRA, DL, VT, Cond0, ShiftAmt);
-    return DAG.getNode(ISD::OR, DL, VT, Sra, N2);
+    return DAG.getNode(ISD::OR, DL, VT, Sra, FreezeIfNeeded(N2));
   }
 
   // If we have to invert the sign bit mask, only do that transform if the
@@ -11419,7 +11431,7 @@ static SDValue foldVSelectToSignBitSplatMask(SDNode *N, SelectionDAG &DAG) {
     SDValue ShiftAmt = DAG.getConstant(VT.getScalarSizeInBits() - 1, DL, VT);
     SDValue Sra = DAG.getNode(ISD::SRA, DL, VT, Cond0, ShiftAmt);
     SDValue Not = DAG.getNOT(DL, Sra, VT);
-    return DAG.getNode(ISD::AND, DL, VT, Not, N2);
+    return DAG.getNode(ISD::AND, DL, VT, Not, FreezeIfNeeded(N2));
   }
 
   // TODO: There's another pattern in this family, but it may require

@@ -6,21 +6,14 @@

; (x0 < x1) && (x2 > x3)
define i32 @cmp_and2(i32 %0, i32 %1, i32 %2, i32 %3) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the typical problem here is that the arguments aren't marked as noundef. But even if they had been marked as noundef the lowering of arguments wouldn't resulting adding FREEZE nodes to indicate that the inputs aren't poison. Nor do we have a way to put such attributes on the CopyFromReg (or loads from stack) instructions that would map to accessing the arguments.

(No idea if there is something in AArch64 that could be changed here to still optimize this in a valid way or if the old transform is unsafe from a poison/undef perspective.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does SDAG ever look through CopyFromReg in some way? If not, we could assume that the value is frozen.

We can definitely treat function arguments as frozen in SDAG, I'm just not familiar with where else CopyFromReg gets used...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm just not familiar with where else CopyFromReg gets used

SDAG operates per basic block, and I believe CopyFromReg/CopyToReg are generated for any use/def of a "global" value, i.e. one that is live outside of the current basic block.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually started on a patch that simply treated CopyFromReg similar as FREEZE in SelectionDAG::isGuaranteedNotToBeUndefOrPoison. Kind of based on the comments in ISDOpcodes.h about CopyFromReg loading from a register "that is defined outside of the scope of this SelectionDAG". So one could for example assume that we get the same value.

I'm not sure however if there could be other problems. On IR level we only treat arguments as not being poison depending on the noundef attribute. So then I started to think that maybe it was more complicated also for SelectionDAG. But maybe we are saved by only looking at one BB at a time.
I think CopyFromReg can be used on any BB, also dealing with values passed on from one BB to another. So it is at least not only about input arguments to the function (but maybe it can be seen as input arguments to the basic block). Would it for example be a problem if the value tracking would make assumtions based on dominating conditions in predecessor basic blocks? But maybe we avoid such things.

(I know that there were some discussion a couple of years back about poison in SelectionDAG, https://discourse.llvm.org/t/funnel-shift-select-and-poison/51255/25, but that never really landed in any formal decision about how to deal with it.)

@@ -295,22 +295,28 @@ define float @select_icmp_sle(i32 %x, i32 %y, float %a, float %b) {
; Test peephole optimizations for select.
define zeroext i1 @select_opt1(i1 zeroext %c, i1 zeroext %a) {
; CHECK-LABEL: select_opt1
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figure that these diffs are a bit annoying. But given that the AssertZext/AssertSext are handles as maybe producing poison we can't hoist the freeze above it. And then I think we lose the optimizations based on that the inputs are masked down to a single bit. But maybe this also kind of originates from CopyFromReg not being treated as an implicit FREEZE (at least if the arguments would have been marked as noundef).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another thing here is that the fast-isel code still optimizes this case. So either there is a bug in fast-isel that it doesn't consider poison here. Or we should be allowed to optimize this also for the regular SelectionDAG ISel.
Worth mentioning is that GlobalISel isn't part of the checks here, but it seems like GlobalISel also would emit the explicit AND.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have wondered whether SelectionDAGBuilder should be inserting freeze to incoming nodes if we know they aren't undef/poison - that would allow freeze(assertzext(x)) or whatever?

@dtcxzyw dtcxzyw linked an issue Mar 14, 2024 that may be closed by this pull request
@dtcxzyw
Copy link
Member

dtcxzyw commented Mar 14, 2024

Hi @bjope, could you please add the following test for #85190?

; RUN: llc < %s -mtriple=riscv64 | FileCheck %s --check-prefix=CHECK,NOZBB
; RUN: llc < %s -mtriple=riscv64 +mattr=+zbb | FileCheck %s --check-prefix=CHECK,ZBB

define i1 @pr85190(i64 %a) {
  %or = or i64 %a, 7
  %cmp1 = icmp slt i64 %a, 0
  %sub = sub nsw i64 -9223372036854775808, %or
  %cmp2 = icmp sgt i64 %sub, %a
  %res = select i1 %cmp1, i1 %cmp2, i1 false
  ret i1 %res
}

@dtcxzyw dtcxzyw requested review from dtcxzyw and topperc March 14, 2024 09:29
@bjope
Copy link
Collaborator Author

bjope commented Mar 14, 2024

Hi @bjope, could you please add the following test for #85190?

Yes, thanks!
And that reminds me that I haven't added the reproducer for #84653 either yet.
I'll prepare a pre-commit with those tests.

// select Cond, Cond, F --> or Cond, F
// select Cond, 1, F --> or Cond, F
auto FreezeIfNeeded = [&](SDValue V) {
if (!DAG.isGuaranteedNotToBePoison(V))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have a plan to implement impliesPoison for SDAG?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getNode(ISD::FREEZE) should do this for us?

case ISD::FREEZE:
assert(VT == N1.getValueType() && "Unexpected VT!");
if (isGuaranteedNotToBeUndefOrPoison(N1, /*PoisonOnly*/ false,
/*Depth*/ 1))
return N1;
break;

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Nice!
Downstream I also needed some additional checks to just do this for my target (since it is a pain doing it for all targets until this has landed upstream). So that is one reason why I've wrapped the check this way.

@dtcxzyw
Copy link
Member

dtcxzyw commented Mar 21, 2024

Ping for review :)

; CHECK-NEXT: crorc 20, 1, 4
; CHECK-NEXT: andi. 5, 5, 1
; CHECK-NEXT: crnot 20, 4
; CHECK-NEXT: cror 20, 1, 20
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This becomes worse. A combine for i1 type seems broken. Let me check why this happens.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  t43: i32,ch = load<(dereferenceable load (s32) from @a.b)> t10, t63, undef:i32
  t44: i32,glue = addc t43, Constant:i32<6>
            t46: i32 = and t44, Constant:i32<-17>
          t48: i1 = setcc t46, t43, setuge:ch
        t54: i1 = freeze t48    ;;This freeze breaks the instruction selection pattern for or(setcc) in PPC td files.
      t55: i1 = or t58, t54

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is a situation when we do not fold the freeze over an operation (in this case a setcc) due to the operation having more then one operand that may be poison.
However, we got

    t43: i32,ch = load<(dereferenceable load (s32) from @a.b)> t10, GlobalAddress:i32<ptr @a.b> 0, undef:i32
    t44: i32,glue = addc t43, Constant:i32<6>
    t46: i32 = and t44, Constant:i32<-17>
    t48: i1 = setcc t46, t43, setuge:ch
    t54: i1 = freeze t48

so I think that t46 and t43 only could be poison/undef if t43 is poison/undef. So it would be enough to freeze t43, like this:

    t43: i32,ch = load<(dereferenceable load (s32) from @a.b)> t10, GlobalAddress:i32<ptr @a.b> 0, undef:i32
    t99: i32 = freeze t43
    t44: i32,glue = addc t99, Constant:i32<6>
    t46: i32 = and t44, Constant:i32<-17>
    t54: i1 = setcc t46, t99, setuge:ch

Not exactly sure, but maybe if it would be possible to improve the analysis in DAGCombiner::visitFREEZE. Idea would be to allow pushing freeze through more operations (with multiple maybe poison/undef operands as we do for BUILD_VECTOR etc)). One could limit it to the scenraio when the isGuaranteedNotToBeUndefOrPoison checks for those operands reaches the same DAG node when recursing, i.e. when the likelyhood is that we still would end up with a single freeze in the end.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For setcc, it's probably profitable to always push freeze, even if there are multiple operands.

@nikic
Copy link
Contributor

nikic commented Mar 21, 2024

Ping for review :)

The regressions needs to be mitigated to the maximum extent feasible before we make this change.

@AtariDreams
Copy link
Contributor

Ping?

bjope added a commit to bjope/llvm-project that referenced this pull request Apr 9, 2024
Allow pushing freeze through SETCC and SELECT_CC even if there are
multiple "maybe poison" operands. In the past we have limited it to
a single "maybe poison" operand, but it seems profitable to also
allow the multiple operand scenario.

One goal here is to avoid some regressions seen in review of
  llvm#84924
when solving the select->and miscompiles described in
  llvm#84653
dtcxzyw added a commit to dtcxzyw/llvm-codegen-benchmark that referenced this pull request Apr 15, 2024
dtcxzyw added a commit to dtcxzyw/llvm-codegen-benchmark that referenced this pull request Apr 16, 2024
bjope added a commit to bjope/llvm-project that referenced this pull request Apr 26, 2024
Allow pushing freeze through SETCC and SELECT_CC even if there are
multiple "maybe poison" operands. In the past we have limited it to
a single "maybe poison" operand, but it seems profitable to also
allow the multiple operand scenario.

One goal here is to avoid some regressions seen in review of
  llvm#84924
when solving the select->and miscompiles described in
  llvm#84653
Comment on lines 305 to -313
define zeroext i1 @select_opt2(i1 zeroext %c, i1 zeroext %a) {
; CHECK-LABEL: select_opt2
; CHECK: eor [[REG:w[0-9]+]], w0, #0x1
; CHECK: orr {{w[0-9]+}}, [[REG]], w1
; SISEL: orn [[REG:w[0-9]+]], w1, w0
; SISEL: and w0, [[REG]], #0x1
; FISEL: eor [[REG:w[0-9]+]], w0, #0x1
; FISEL: orr {{w[0-9]+}}, [[REG]], w1
%1 = select i1 %c, i1 %a, i1 true
ret i1 %1
}

define zeroext i1 @select_opt3(i1 zeroext %c, i1 zeroext %a) {
; CHECK-LABEL: select_opt3
; CHECK: bic {{w[0-9]+}}, w1, w0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add copies of regression testcases with noundef arguments that show the original output?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding noundef to the arguments doesn't help here. There will still be an AssertZext being added when lowering the arguments.
Not quite sure how we should deal with the (freeze(AssertZext)).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we should have an equivalent AssertNoundef or something, but the intersection of all the Assert*s corresponding to argument attributes is not great

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, one idea is that we get rid of the known, IMHO, misuse of AssertSext/AssertZext in DAGTypeLegalizer::PromoteIntRes_FP_TO_XINT. And then try to check all other places that create AssertSext/AssertZext nodes to see if they are violating the desciption of those ISD nodes (i.e. that they "record if a register contains a value that has already been zero or sign extended"). In the PromoteIntRes_FP_TO_XINT those nodes are used to indicate that it is the result is poison if the input violates the assert. But the desciption of the assert nodes just say that the result is poison if the input is poison.
This probably result in various regressions, so maybe not acceptable.

Another idea would be to add some kind of boolean parameter to AssertSext/AssertZext that makes it possible to indicate that the assert node only propagates posion or if it can create poison. This flag could then be used in situations like PromoteIntRes_FP_TO_XINT.

OTOH, considering that #66603 still is open. Maybe the revert in 8505c3b wasn't needed after all?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OTOH, considering that #66603 still is open. Maybe the revert in 8505c3b wasn't needed after all?

The issue is still open due to an optimization problem -- the revert was needed to fix a correctness problem.

Another idea would be to add some kind of boolean parameter to AssertSext/AssertZext that makes it possible to indicate that the assert node only propagates posion or if it can create poison. This flag could then be used in situations like PromoteIntRes_FP_TO_XINT.

I think to make this actually useful the statement would have to be that violating the AssertZext with the flag is immediate UB (including the case where the input is poison).

I do think that something in this direction would make sense.

Something I'm unsure about is to what degree nodes with IUB in SDAG are safe without a chain operand -- we have known miscompiles from the fact that division by zero UB is not modeled via chain.

dtcxzyw added a commit to dtcxzyw/llvm-codegen-benchmark that referenced this pull request Apr 26, 2024
bjope added 2 commits May 28, 2024 11:10
Just like for regular IR we need to treat SELECT as conditionally
blocking poison. So (unless the condition itself is poison) the
result is only poison if the selected true/false value is poison.
Thus, when doing DAG combines that turn SELECT into arithmetic/logical
operations (e.g. AND/OR) we need to make sure that the new operations
aren't more poisonous. One way to do that is to use FREEZE to make
sure the operands aren't posion.

This patch aims at fixing the kind of miscompiles reported in
  llvm#84653
and
  llvm#85190

Solution is to make sure that we insert FREEZE, if needed to make
the fold sound, when using the foldBoolSelectToLogic and
foldVSelectToSignBitSplatMask DAG combines.

This may result in some (hopefully minor) regressions since we lack
some ways to fold away the freeze (or due to isGuaranteedNotToBePoison
being too pessimistic). Focus in this patch is to just avoid
miscompiles, but I think some of the regressions can be avoided by
general improvements regarding poison/freeze handling in SelectionDAG.
Allow pushing freeze through SETCC and SELECT_CC even if there are
multiple "maybe poison" operands. In the past we have limited it to
a single "maybe poison" operand, but it seems profitable to also
allow the multiple operand scenario.

One goal here is to avoid some regressions seen in review of
  llvm#84924
when solving the select->and miscompiles described in
  llvm#84653
Copy link

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 67aec0cd62d607b4e5b7198769be061454ce67b3 e57c1432d57227969a9d6a4f19fd5469a39a4a14 -- llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
View the diff from clang-format here.
diff --git a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
index 13683cd288..8f1bda675c 100644
--- a/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
@@ -15582,8 +15582,7 @@ SDValue DAGCombiner::visitFREEZE(SDNode *N) {
     return SDValue();
 
   bool AllowMultipleMaybePoisonOperands =
-      N0.getOpcode() == ISD::SELECT_CC ||
-      N0.getOpcode() == ISD::SETCC ||
+      N0.getOpcode() == ISD::SELECT_CC || N0.getOpcode() == ISD::SETCC ||
       N0.getOpcode() == ISD::BUILD_VECTOR ||
       N0.getOpcode() == ISD::BUILD_PAIR ||
       N0.getOpcode() == ISD::VECTOR_SHUFFLE ||

@bjope
Copy link
Collaborator Author

bjope commented May 28, 2024

I've rebased this (again). But I'm not quite sure how to proceed.

  • What remains to be solved before landing this? Any particular tests that are of concern?
  • Would it help if I split it up in one patch for foldBoolSelectToLogic and another one for foldVSelectToSignBitSplatMask?
  • Should I pre-commit e57c143 (not sure if there would be any motivating test cases for that one if doing it as a pre-commit, that is why I added it as a second patch in the patch-stack here).

@nikic
Copy link
Contributor

nikic commented May 28, 2024

I've rebased this (again). But I'm not quite sure how to proceed.

* What remains to be solved before landing this? Any particular tests that are of concern?

I think we mainly need a solution for the "AssertZExt/AssertSExt on argument" problem.

* Would it help if I split it up in one patch for foldBoolSelectToLogic and another one for foldVSelectToSignBitSplatMask?

Probably, especially if the latter has less test regressions?

* Should I pre-commit [e57c143](https://github.com/llvm/llvm-project/commit/e57c1432d57227969a9d6a4f19fd5469a39a4a14) (not sure if there would be any motivating test cases for that one if doing it as a pre-commit, that is why I added it as a second patch in the patch-stack here).

Yes, please. If there are no test changes as-is, you could add a test with explicit freeze.

@bjope
Copy link
Collaborator Author

bjope commented May 29, 2024

* Would it help if I split it up in one patch for foldBoolSelectToLogic and another one for foldVSelectToSignBitSplatMask?

Probably, especially if the latter has less test regressions?

* Should I pre-commit [e57c143](https://github.com/llvm/llvm-project/commit/e57c1432d57227969a9d6a4f19fd5469a39a4a14) (not sure if there would be any motivating test cases for that one if doing it as a pre-commit, that is why I added it as a second patch in the patch-stack here).

Yes, please. If there are no test changes as-is, you could add a test with explicit freeze.

Thanks for the advice! I'll split this patch in smaller pieces (at least breaking out the "Push freeze through SETCC and SELECT_CC" part).

bjope added a commit to bjope/llvm-project that referenced this pull request May 29, 2024
Allow pushing freeze through SETCC and SELECT_CC even if there are
multiple "maybe poison" operands. In the past we have limited it to
a single "maybe poison" operand, but it seems profitable to also
allow the multiple operand scenario.

One goal here is to avoid some regressions seen in review of
  llvm#84924
when solving the select->and miscompiles described in
  llvm#84653
@efriedma-quic
Copy link
Collaborator

Maybe we should introduce ISD::LOGICAL_AND/ISD::LOGICAL_OR nodes to SelectionDAG? If we delay the lowering to actual AND/OR until operation legalization, we can do various combines without worrying about a bunch of extra freeze operations.

@nikic
Copy link
Contributor

nikic commented Jun 7, 2024

Maybe we should introduce ISD::LOGICAL_AND/ISD::LOGICAL_OR nodes to SelectionDAG? If we delay the lowering to actual AND/OR until operation legalization, we can do various combines without worrying about a bunch of extra freeze operations.

Yes, that would be an alternative -- this is essentially what we do in IR, though there we just keep the original select but recognize it as a logical and/or pattern. The advantage is that we don't have to introduce freeze -- the disadvantage is that we have to make sure that all optimizations can handle the logical variant. On the IR side, this took some very extensive work. Maybe it's simpler for SDAG as it presumably has less transforms reasoning about complex conditions.

@dtcxzyw
Copy link
Member

dtcxzyw commented Jul 7, 2024

Reverse ping :)
I encountered this issue in real-world code (generated by verilator for RTL simulation).

@nikic
Copy link
Contributor

nikic commented Jul 7, 2024

With the LLVM 19 release coming up, we might have to take this as-is :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RISCV] Miscompile at -O3 with zbb
10 participants