
[SCEVExpander] Attempt to reinfer flags dropped due to CSE #72431

Merged: 3 commits into llvm:main, Dec 7, 2023

Conversation

@preames (Collaborator) commented Nov 15, 2023

LSR uses SCEVExpander to generate induction formulas. The expander internally tries to reuse existing IR expressions. To do that, it needs to strip any poison-generating flags (nsw, nuw, exact, nneg, etc.) which may not be valid for the newly added users.

This is conservatively correct, but has the effect that LSR will strip nneg flags on zext instructions involved in trip counts in loop preheaders. To avoid this, this patch adjusts the expander to reinfer the flags on the CSE candidate if they are legal for all possible users.
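
For context, a minimal IR sketch of the affected preheader pattern; the function and value names here are hypothetical, not taken from the reported test case:

define i64 @sketch(i32 %n) {
entry:
  ; The guard proves %n is positive, so the preheader zext can carry nneg.
  %guard = icmp sgt i32 %n, 0
  br i1 %guard, label %preheader, label %exit

preheader:
  ; Prior to this patch, when the expander CSE'd into this zext for a new
  ; user, the nneg flag was unconditionally dropped, even where it
  ; remained provable for all users.
  %wide.trip.count = zext nneg i32 %n to i64
  br label %exit

exit:
  %res = phi i64 [ 0, %entry ], [ %wide.trip.count, %preheader ]
  ret i64 %res
}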

This should fix the regression reported in #71200.

This should arguably be done inside canReuseInstruction instead, but doing it outside is more conservative compile-time-wise. Both canReuseInstruction and isGuaranteedNotToBePoison walk operand lists, so right now we are performing work which is roughly O(N^2) in the size of the operand graph. We should fix that before making the per-operand step more expensive. My tentative plan is to land this, and then rework the code to sink the logic into more core interfaces.

@llvmbot (Collaborator) commented Nov 15, 2023

@llvm/pr-subscribers-backend-risc-v

@llvm/pr-subscribers-llvm-transforms

Author: Philip Reames (preames)


Full diff: https://github.com/llvm/llvm-project/pull/72431.diff

7 Files Affected:

  • (modified) llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp (+21-1)
  • (modified) llvm/test/CodeGen/Thumb2/mve-blockplacement.ll (+5-7)
  • (modified) llvm/test/Transforms/IRCE/non-loop-invariant-rhs-instr.ll (+1-1)
  • (modified) llvm/test/Transforms/IndVarSimplify/promote-iv-to-eliminate-casts.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopPredication/basic.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopUnroll/runtime-exit-phi-scev-invalidation.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/vdiv.ll (+1-1)
diff --git a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
index 8bfe9e67d15e2f9..94ecaf67e085d46 100644
--- a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
+++ b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
@@ -1532,8 +1532,28 @@ Value *SCEVExpander::expand(const SCEV *S) {
     V = visit(S);
     V = fixupLCSSAFormFor(V);
   } else {
-    for (Instruction *I : DropPoisonGeneratingInsts)
+    for (Instruction *I : DropPoisonGeneratingInsts) {
       I->dropPoisonGeneratingFlagsAndMetadata();
+      // See if we can re-infer from first principles any of the flags we just
+      // dropped.
+      if (auto *OBO = dyn_cast<OverflowingBinaryOperator>(I))
+        if (auto Flags = SE.getStrengthenedNoWrapFlagsFromBinOp(OBO)) {
+          auto *BO = cast<BinaryOperator>(I);
+          BO->setHasNoUnsignedWrap(
+            ScalarEvolution::maskFlags(*Flags, SCEV::FlagNUW) == SCEV::FlagNUW);
+          BO->setHasNoSignedWrap(
+            ScalarEvolution::maskFlags(*Flags, SCEV::FlagNSW) == SCEV::FlagNSW);
+        }
+      if (auto *NNI = dyn_cast<PossiblyNonNegInst>(I)) {
+        auto *Src = NNI->getOperand(0);
+        if (SE.isKnownNonNegative(SE.getSCEV(Src)) ||
+            isKnownNonNegative(Src, DL, 0, &SE.AC, I, &SE.DT) ||
+            isImpliedByDomCondition(ICmpInst::ICMP_SGE, Src,
+                                    Constant::getNullValue(Src->getType()), I,
+                                    DL).value_or(false))
+          NNI->setNonNeg(true);
+      }
+    }
   }
   // Remember the expanded value for this SCEV at this location.
   //
diff --git a/llvm/test/CodeGen/Thumb2/mve-blockplacement.ll b/llvm/test/CodeGen/Thumb2/mve-blockplacement.ll
index e22fd4cabfa529d..4e1efdcdbf26a15 100644
--- a/llvm/test/CodeGen/Thumb2/mve-blockplacement.ll
+++ b/llvm/test/CodeGen/Thumb2/mve-blockplacement.ll
@@ -385,19 +385,17 @@ define i32 @d(i64 %e, i32 %f, i64 %g, i32 %h) {
 ; CHECK-NEXT:    @ implicit-def: $r11
 ; CHECK-NEXT:    mov.w r9, #12
 ; CHECK-NEXT:    str r4, [sp, #12] @ 4-byte Spill
-; CHECK-NEXT:    add.w r0, r0, r2, lsr #1
 ; CHECK-NEXT:    add.w r1, r1, r2, lsr #1
-; CHECK-NEXT:    movw r2, #65532
-; CHECK-NEXT:    vdup.32 q6, r0
-; CHECK-NEXT:    movt r2, #32767
-; CHECK-NEXT:    and.w r3, r1, r2
+; CHECK-NEXT:    add.w r0, r0, r2, lsr #1
+; CHECK-NEXT:    bic r3, r1, #3
 ; CHECK-NEXT:    adr r1, .LCPI1_0
-; CHECK-NEXT:    vdup.32 q7, r0
 ; CHECK-NEXT:    vldrw.u32 q0, [r1]
 ; CHECK-NEXT:    adr r1, .LCPI1_1
 ; CHECK-NEXT:    vldrw.u32 q5, [r1]
-; CHECK-NEXT:    strd r3, r7, [sp, #4] @ 8-byte Folded Spill
+; CHECK-NEXT:    vdup.32 q6, r0
 ; CHECK-NEXT:    vadd.i32 q4, q0, r7
+; CHECK-NEXT:    vdup.32 q7, r0
+; CHECK-NEXT:    strd r3, r7, [sp, #4] @ 8-byte Folded Spill
 ; CHECK-NEXT:    b .LBB1_6
 ; CHECK-NEXT:  .LBB1_2: @ %for.body6.preheader
 ; CHECK-NEXT:    @ in Loop: Header=BB1_6 Depth=1
diff --git a/llvm/test/Transforms/IRCE/non-loop-invariant-rhs-instr.ll b/llvm/test/Transforms/IRCE/non-loop-invariant-rhs-instr.ll
index c28fc59014f5c6f..58dff360ff6a5c0 100644
--- a/llvm/test/Transforms/IRCE/non-loop-invariant-rhs-instr.ll
+++ b/llvm/test/Transforms/IRCE/non-loop-invariant-rhs-instr.ll
@@ -9,7 +9,7 @@ define i32 @test_01(i32 %A, i64 %Len, ptr %array) {
 ; CHECK-NEXT:    br i1 [[TRIPCHECK]], label [[LOOP_PREHEADER:%.*]], label [[ZERO:%.*]]
 ; CHECK:       loop.preheader:
 ; CHECK-NEXT:    [[TMP0:%.*]] = zext i32 [[A:%.*]] to i64
-; CHECK-NEXT:    [[TMP1:%.*]] = add i64 [[TMP0]], 1
+; CHECK-NEXT:    [[TMP1:%.*]] = add nuw nsw i64 [[TMP0]], 1
 ; CHECK-NEXT:    [[SMIN:%.*]] = call i64 @llvm.smin.i64(i64 [[LEN]], i64 0)
 ; CHECK-NEXT:    [[TMP2:%.*]] = sub i64 [[LEN]], [[SMIN]]
 ; CHECK-NEXT:    [[UMIN:%.*]] = call i64 @llvm.umin.i64(i64 [[TMP2]], i64 [[TMP1]])
diff --git a/llvm/test/Transforms/IndVarSimplify/promote-iv-to-eliminate-casts.ll b/llvm/test/Transforms/IndVarSimplify/promote-iv-to-eliminate-casts.ll
index 60e014b0efca53a..3f0ada281b1e340 100644
--- a/llvm/test/Transforms/IndVarSimplify/promote-iv-to-eliminate-casts.ll
+++ b/llvm/test/Transforms/IndVarSimplify/promote-iv-to-eliminate-casts.ll
@@ -182,7 +182,7 @@ define void @promote_latch_condition_decrementing_loop_01(ptr %p, ptr %a) {
 ; CHECK-LABEL: @promote_latch_condition_decrementing_loop_01(
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    [[LEN:%.*]] = load i32, ptr [[P:%.*]], align 4, !range [[RNG0:![0-9]+]]
-; CHECK-NEXT:    [[LEN_MINUS_1:%.*]] = add i32 [[LEN]], -1
+; CHECK-NEXT:    [[LEN_MINUS_1:%.*]] = add nsw i32 [[LEN]], -1
 ; CHECK-NEXT:    [[ZERO_CHECK:%.*]] = icmp eq i32 [[LEN]], 0
 ; CHECK-NEXT:    br i1 [[ZERO_CHECK]], label [[LOOPEXIT:%.*]], label [[PREHEADER:%.*]]
 ; CHECK:       preheader:
diff --git a/llvm/test/Transforms/LoopPredication/basic.ll b/llvm/test/Transforms/LoopPredication/basic.ll
index 2a99963ad0786c4..27c8bc99c407e0c 100644
--- a/llvm/test/Transforms/LoopPredication/basic.ll
+++ b/llvm/test/Transforms/LoopPredication/basic.ll
@@ -1681,7 +1681,7 @@ define i32 @ne_latch_zext(ptr %array, i32 %length, i16 %n16) {
 ; CHECK-LABEL: @ne_latch_zext(
 ; CHECK-NEXT:  loop.preheader:
 ; CHECK-NEXT:    [[N:%.*]] = zext i16 [[N16:%.*]] to i32
-; CHECK-NEXT:    [[NPLUS1:%.*]] = add i32 [[N]], 1
+; CHECK-NEXT:    [[NPLUS1:%.*]] = add nuw nsw i32 [[N]], 1
 ; CHECK-NEXT:    [[TMP0:%.*]] = icmp ule i32 [[NPLUS1]], [[LENGTH:%.*]]
 ; CHECK-NEXT:    [[TMP1:%.*]] = icmp ult i32 0, [[LENGTH]]
 ; CHECK-NEXT:    [[TMP2:%.*]] = and i1 [[TMP1]], [[TMP0]]
diff --git a/llvm/test/Transforms/LoopUnroll/runtime-exit-phi-scev-invalidation.ll b/llvm/test/Transforms/LoopUnroll/runtime-exit-phi-scev-invalidation.ll
index cf875ccdc147ab9..669306c8f3ab747 100644
--- a/llvm/test/Transforms/LoopUnroll/runtime-exit-phi-scev-invalidation.ll
+++ b/llvm/test/Transforms/LoopUnroll/runtime-exit-phi-scev-invalidation.ll
@@ -96,7 +96,7 @@ define void @pr56282() {
 ; CHECK:       inner.2.preheader:
 ; CHECK-NEXT:    br label [[INNER_2]]
 ; CHECK:       inner.2:
-; CHECK-NEXT:    [[OUTER_IV_NEXT]] = add i64 [[OUTER_IV]], 1
+; CHECK-NEXT:    [[OUTER_IV_NEXT]] = add nuw i64 [[OUTER_IV]], 1
 ; CHECK-NEXT:    br label [[OUTER_HEADER]]
 ; CHECK:       exit:
 ; CHECK-NEXT:    ret void
diff --git a/llvm/test/Transforms/PhaseOrdering/X86/vdiv.ll b/llvm/test/Transforms/PhaseOrdering/X86/vdiv.ll
index 246bb0095e1a258..e5582548447aed0 100644
--- a/llvm/test/Transforms/PhaseOrdering/X86/vdiv.ll
+++ b/llvm/test/Transforms/PhaseOrdering/X86/vdiv.ll
@@ -19,7 +19,7 @@ define void @vdiv(ptr %x, ptr %y, double %a, i32 %N) #0 {
 ; CHECK:       for.body.preheader:
 ; CHECK-NEXT:    [[X4:%.*]] = ptrtoint ptr [[X:%.*]] to i64
 ; CHECK-NEXT:    [[Y5:%.*]] = ptrtoint ptr [[Y:%.*]] to i64
-; CHECK-NEXT:    [[WIDE_TRIP_COUNT:%.*]] = zext i32 [[N]] to i64
+; CHECK-NEXT:    [[WIDE_TRIP_COUNT:%.*]] = zext nneg i32 [[N]] to i64
 ; CHECK-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i32 [[N]], 16
 ; CHECK-NEXT:    [[TMP0:%.*]] = sub i64 [[X4]], [[Y5]]
 ; CHECK-NEXT:    [[DIFF_CHECK:%.*]] = icmp ult i64 [[TMP0]], 128

github-actions bot commented Nov 15, 2023

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:
git-clang-format --diff 0e6685ab1a8313cd1dc7eb3c99ff642e6c492aa2 70f64340ab1a7e9d073555f28677cbb0f87c7ae9 -- llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
The diff from clang-format:
diff --git a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
index cd3ac317cd..1ec85b5f5b 100644
--- a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
+++ b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
@@ -1539,16 +1539,17 @@ Value *SCEVExpander::expand(const SCEV *S) {
       if (auto *OBO = dyn_cast<OverflowingBinaryOperator>(I))
         if (auto Flags = SE.getStrengthenedNoWrapFlagsFromBinOp(OBO)) {
           auto *BO = cast<BinaryOperator>(I);
-          BO->setHasNoUnsignedWrap(
-            ScalarEvolution::maskFlags(*Flags, SCEV::FlagNUW) == SCEV::FlagNUW);
-          BO->setHasNoSignedWrap(
-            ScalarEvolution::maskFlags(*Flags, SCEV::FlagNSW) == SCEV::FlagNSW);
+          BO->setHasNoUnsignedWrap(ScalarEvolution::maskFlags(
+                                       *Flags, SCEV::FlagNUW) == SCEV::FlagNUW);
+          BO->setHasNoSignedWrap(ScalarEvolution::maskFlags(
+                                     *Flags, SCEV::FlagNSW) == SCEV::FlagNSW);
         }
       if (auto *NNI = dyn_cast<PossiblyNonNegInst>(I)) {
         auto *Src = NNI->getOperand(0);
         if (isImpliedByDomCondition(ICmpInst::ICMP_SGE, Src,
                                     Constant::getNullValue(Src->getType()), I,
-                                    DL).value_or(false))
+                                    DL)
+                .value_or(false))
           NNI->setNonNeg(true);
       }
     }

@preames (Collaborator, Author) commented Nov 28, 2023

ping

Review thread on the following lines from ScalarEvolutionExpander.cpp:

        if (SE.isKnownNonNegative(SE.getSCEV(Src)) ||
            isKnownNonNegative(Src, DL, 0, &SE.AC, I, &SE.DT) ||
            isImpliedByDomCondition(ICmpInst::ICMP_SGE, Src,
                                    Constant::getNullValue(Src->getType()), I,
                                    DL).value_or(false))
@nikic (Contributor) commented Nov 28, 2023

It looks like you don't have test coverage for all three of these calls. (Edit: As in, only one out of three is covered.)

Contributor commented:
Generally I'd expect ValueTracking's isKnownNonNegative to not give any useful additional information here, with the exception of assumes.
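
A hedged sketch of the assume case (hypothetical IR): SCEV alone may not model the fact, but ValueTracking's isKnownNonNegative can see it through the AssumptionCache:

declare void @llvm.assume(i1)

define i64 @sketch_assume(i32 %n) {
  %nonneg = icmp sge i32 %n, 0
  call void @llvm.assume(i1 %nonneg)
  ; With &SE.AC passed in, isKnownNonNegative(%n) can use the assume above
  ; to justify re-adding nneg on this zext.
  %ext = zext nneg i32 %n to i64
  ret i64 %ext
}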

@preames (Collaborator, Author) replied:
Deleted for now. I'd originally intended for this to be read alongside #72437, but let's keep them separate for now.

@nikic (Contributor) left a comment

I think this is generally fine in terms of approach. Surprisingly, it doesn't seem to have any significant impact on compile-time either.

@nikic (Contributor) left a comment

LGTM

@preames merged commit ffb2af3 into llvm:main on Dec 7, 2023 (4 of 5 checks passed).
@preames deleted the pr-lsr-dont-drop-zext-nneg branch on December 7, 2023 at 21:23.