Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SCEV] Make invalidation in SCEVCallbackVH more thorough #68316

Merged
merged 1 commit into from
Oct 10, 2023

Conversation

nikic
Copy link
Contributor

@nikic nikic commented Oct 5, 2023

When a SCEVCallbackVH is RAUWed, we currently do a def-use walk and remove dependent instructions from the ValueExprMap. However, unlike SCEVs usual invalidation, this does not forget memoized values.

The end result is that we might end up removing a SCEVUnknown from the map, while that expression still has users. Due to that, we may later fail to invalide those expressions. In particular, invalidation of loop dispositions only does something if there is an expression for the value, which would not be the case here.

Fix this by using the standard forgetValue() API, instead of rolling a custom variant.

The impact on compile-time is small but non-zero:
http://llvm-compile-time-tracker.com/compare.php?from=236228f43d018ebf11697610fe6504c167ed6ac8&to=8e1f73fd7f883dc9efaf9fb673d27553f53cfb38&stat=instructions:u

Fixes #68285.

When a SCEVCallbackVH is RAUWed, we currently do a def-use walk
and remove dependent instructions from the ValueExprMap. However,
unlike SCEVs usual invalidation, this does not forget memoized
values.

The end result is that we might end up removing a SCEVUnknown from
the map, while that expression still has users. Due to that, we
may later fail to invalide those expressions. In particular,
invalidation of loop dispositions only does something if there is
an expression for the value, which would not be the case here.

Fix this by using the standard forgetValue() API, instead of
rolling a custom variant.

The impact on compile-time is small but non-zero:
http://llvm-compile-time-tracker.com/compare.php?from=236228f43d018ebf11697610fe6504c167ed6ac8&to=8e1f73fd7f883dc9efaf9fb673d27553f53cfb38&stat=instructions:u

Fixes llvm#68285.
@llvmbot
Copy link

llvmbot commented Oct 5, 2023

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-analysis

Changes

When a SCEVCallbackVH is RAUWed, we currently do a def-use walk and remove dependent instructions from the ValueExprMap. However, unlike SCEVs usual invalidation, this does not forget memoized values.

The end result is that we might end up removing a SCEVUnknown from the map, while that expression still has users. Due to that, we may later fail to invalide those expressions. In particular, invalidation of loop dispositions only does something if there is an expression for the value, which would not be the case here.

Fix this by using the standard forgetValue() API, instead of rolling a custom variant.

The impact on compile-time is small but non-zero:
http://llvm-compile-time-tracker.com/compare.php?from=236228f43d018ebf11697610fe6504c167ed6ac8&to=8e1f73fd7f883dc9efaf9fb673d27553f53cfb38&stat=instructions:u

Fixes #68285.


Full diff: https://github.com/llvm/llvm-project/pull/68316.diff

3 Files Affected:

  • (modified) llvm/lib/Analysis/ScalarEvolution.cpp (+1-20)
  • (modified) llvm/test/Transforms/IndVarSimplify/elim-extend.ll (+11-11)
  • (added) llvm/test/Transforms/LoopSimplify/pr68285.ll (+89)
diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index 7c234fa309a0c29..c24c958ec36b200 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -13284,26 +13284,7 @@ void ScalarEvolution::SCEVCallbackVH::allUsesReplacedWith(Value *V) {
   // Forget all the expressions associated with users of the old value,
   // so that future queries will recompute the expressions using the new
   // value.
-  Value *Old = getValPtr();
-  SmallVector<User *, 16> Worklist(Old->users());
-  SmallPtrSet<User *, 8> Visited;
-  while (!Worklist.empty()) {
-    User *U = Worklist.pop_back_val();
-    // Deleting the Old value will cause this to dangle. Postpone
-    // that until everything else is done.
-    if (U == Old)
-      continue;
-    if (!Visited.insert(U).second)
-      continue;
-    if (PHINode *PN = dyn_cast<PHINode>(U))
-      SE->ConstantEvolutionLoopExitValue.erase(PN);
-    SE->eraseValueFromMap(U);
-    llvm::append_range(Worklist, U->users());
-  }
-  // Delete the Old value.
-  if (PHINode *PN = dyn_cast<PHINode>(Old))
-    SE->ConstantEvolutionLoopExitValue.erase(PN);
-  SE->eraseValueFromMap(Old);
+  SE->forgetValue(getValPtr());
   // this now dangles!
 }
 
diff --git a/llvm/test/Transforms/IndVarSimplify/elim-extend.ll b/llvm/test/Transforms/IndVarSimplify/elim-extend.ll
index 137fd4746ccf66a..54bb9951ff66ab0 100644
--- a/llvm/test/Transforms/IndVarSimplify/elim-extend.ll
+++ b/llvm/test/Transforms/IndVarSimplify/elim-extend.ll
@@ -114,46 +114,46 @@ define void @nestedIV(ptr %address, i32 %limit) nounwind {
 ; CHECK-LABEL: @nestedIV(
 ; CHECK-NEXT:  entry:
 ; CHECK-NEXT:    [[LIMITDEC:%.*]] = add i32 [[LIMIT:%.*]], -1
+; CHECK-NEXT:    [[TMP0:%.*]] = sext i32 [[LIMITDEC]] to i64
 ; CHECK-NEXT:    [[SMAX:%.*]] = call i32 @llvm.smax.i32(i32 [[LIMIT]], i32 1)
-; CHECK-NEXT:    [[WIDE_TRIP_COUNT4:%.*]] = zext i32 [[SMAX]] to i64
+; CHECK-NEXT:    [[WIDE_TRIP_COUNT:%.*]] = zext i32 [[SMAX]] to i64
 ; CHECK-NEXT:    br label [[OUTERLOOP:%.*]]
 ; CHECK:       outerloop:
 ; CHECK-NEXT:    [[INDVARS_IV1:%.*]] = phi i64 [ [[INDVARS_IV_NEXT2:%.*]], [[OUTERMERGE:%.*]] ], [ 0, [[ENTRY:%.*]] ]
 ; CHECK-NEXT:    [[INNERCOUNT:%.*]] = phi i32 [ [[INNERCOUNT_MERGE:%.*]], [[OUTERMERGE]] ], [ 0, [[ENTRY]] ]
-; CHECK-NEXT:    [[TMP0:%.*]] = add nsw i64 [[INDVARS_IV1]], -1
-; CHECK-NEXT:    [[ADR1:%.*]] = getelementptr i8, ptr [[ADDRESS:%.*]], i64 [[TMP0]]
+; CHECK-NEXT:    [[TMP1:%.*]] = add nsw i64 [[INDVARS_IV1]], -1
+; CHECK-NEXT:    [[ADR1:%.*]] = getelementptr i8, ptr [[ADDRESS:%.*]], i64 [[TMP1]]
 ; CHECK-NEXT:    store i8 0, ptr [[ADR1]], align 1
 ; CHECK-NEXT:    br label [[INNERPREHEADER:%.*]]
 ; CHECK:       innerpreheader:
 ; CHECK-NEXT:    [[INNERPRECMP:%.*]] = icmp sgt i32 [[LIMITDEC]], [[INNERCOUNT]]
 ; CHECK-NEXT:    br i1 [[INNERPRECMP]], label [[INNERLOOP_PREHEADER:%.*]], label [[OUTERMERGE]]
 ; CHECK:       innerloop.preheader:
-; CHECK-NEXT:    [[TMP1:%.*]] = sext i32 [[INNERCOUNT]] to i64
-; CHECK-NEXT:    [[WIDE_TRIP_COUNT:%.*]] = sext i32 [[LIMITDEC]] to i64
+; CHECK-NEXT:    [[TMP2:%.*]] = sext i32 [[INNERCOUNT]] to i64
 ; CHECK-NEXT:    br label [[INNERLOOP:%.*]]
 ; CHECK:       innerloop:
-; CHECK-NEXT:    [[INDVARS_IV:%.*]] = phi i64 [ [[TMP1]], [[INNERLOOP_PREHEADER]] ], [ [[INDVARS_IV_NEXT:%.*]], [[INNERLOOP]] ]
+; CHECK-NEXT:    [[INDVARS_IV:%.*]] = phi i64 [ [[TMP2]], [[INNERLOOP_PREHEADER]] ], [ [[INDVARS_IV_NEXT:%.*]], [[INNERLOOP]] ]
 ; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nsw i64 [[INDVARS_IV]], 1
 ; CHECK-NEXT:    [[ADR2:%.*]] = getelementptr i8, ptr [[ADDRESS]], i64 [[INDVARS_IV]]
 ; CHECK-NEXT:    store i8 0, ptr [[ADR2]], align 1
 ; CHECK-NEXT:    [[ADR3:%.*]] = getelementptr i8, ptr [[ADDRESS]], i64 [[INDVARS_IV_NEXT]]
 ; CHECK-NEXT:    store i8 0, ptr [[ADR3]], align 1
-; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i64 [[INDVARS_IV_NEXT]], [[WIDE_TRIP_COUNT]]
+; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i64 [[INDVARS_IV_NEXT]], [[TMP0]]
 ; CHECK-NEXT:    br i1 [[EXITCOND]], label [[INNERLOOP]], label [[INNEREXIT:%.*]]
 ; CHECK:       innerexit:
 ; CHECK-NEXT:    [[INNERCOUNT_LCSSA_WIDE:%.*]] = phi i64 [ [[INDVARS_IV_NEXT]], [[INNERLOOP]] ]
-; CHECK-NEXT:    [[TMP2:%.*]] = trunc i64 [[INNERCOUNT_LCSSA_WIDE]] to i32
+; CHECK-NEXT:    [[TMP3:%.*]] = trunc i64 [[INNERCOUNT_LCSSA_WIDE]] to i32
 ; CHECK-NEXT:    br label [[OUTERMERGE]]
 ; CHECK:       outermerge:
-; CHECK-NEXT:    [[INNERCOUNT_MERGE]] = phi i32 [ [[TMP2]], [[INNEREXIT]] ], [ [[INNERCOUNT]], [[INNERPREHEADER]] ]
+; CHECK-NEXT:    [[INNERCOUNT_MERGE]] = phi i32 [ [[TMP3]], [[INNEREXIT]] ], [ [[INNERCOUNT]], [[INNERPREHEADER]] ]
 ; CHECK-NEXT:    [[ADR4:%.*]] = getelementptr i8, ptr [[ADDRESS]], i64 [[INDVARS_IV1]]
 ; CHECK-NEXT:    store i8 0, ptr [[ADR4]], align 1
 ; CHECK-NEXT:    [[OFS5:%.*]] = sext i32 [[INNERCOUNT_MERGE]] to i64
 ; CHECK-NEXT:    [[ADR5:%.*]] = getelementptr i8, ptr [[ADDRESS]], i64 [[OFS5]]
 ; CHECK-NEXT:    store i8 0, ptr [[ADR5]], align 1
 ; CHECK-NEXT:    [[INDVARS_IV_NEXT2]] = add nuw nsw i64 [[INDVARS_IV1]], 1
-; CHECK-NEXT:    [[EXITCOND5:%.*]] = icmp ne i64 [[INDVARS_IV_NEXT2]], [[WIDE_TRIP_COUNT4]]
-; CHECK-NEXT:    br i1 [[EXITCOND5]], label [[OUTERLOOP]], label [[RETURN:%.*]]
+; CHECK-NEXT:    [[EXITCOND4:%.*]] = icmp ne i64 [[INDVARS_IV_NEXT2]], [[WIDE_TRIP_COUNT]]
+; CHECK-NEXT:    br i1 [[EXITCOND4]], label [[OUTERLOOP]], label [[RETURN:%.*]]
 ; CHECK:       return:
 ; CHECK-NEXT:    ret void
 ;
diff --git a/llvm/test/Transforms/LoopSimplify/pr68285.ll b/llvm/test/Transforms/LoopSimplify/pr68285.ll
new file mode 100644
index 000000000000000..9490dcec7134a3a
--- /dev/null
+++ b/llvm/test/Transforms/LoopSimplify/pr68285.ll
@@ -0,0 +1,89 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 3
+; RUN: opt -S -passes="loop-mssa(simple-loop-unswitch<nontrivial>),loop(loop-unroll-full),loop-simplify,verify<scalar-evolution>" < %s | FileCheck %s
+
+; Should not caused a SCEV verification failure due to incorrect
+; loop dispositions.
+
+define void @g(ptr %b, i32 %arg, i1 %tobool.not) {
+; CHECK-LABEL: define void @g(
+; CHECK-SAME: ptr [[B:%.*]], i32 [[ARG:%.*]], i1 [[TOBOOL_NOT:%.*]]) {
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:    br i1 [[TOBOOL_NOT]], label [[ENTRY_SPLIT_US:%.*]], label [[ENTRY_SPLIT:%.*]]
+; CHECK:       entry.split.us:
+; CHECK-NEXT:    br label [[FOR_BODY_US:%.*]]
+; CHECK:       for.body.us:
+; CHECK-NEXT:    br label [[LAND_END_US:%.*]]
+; CHECK:       land.end.us:
+; CHECK-NEXT:    store i16 0, ptr [[B]], align 1
+; CHECK-NEXT:    br label [[FOR_END:%.*]]
+; CHECK:       entry.split:
+; CHECK-NEXT:    br label [[FOR_BODY_PEEL_BEGIN:%.*]]
+; CHECK:       for.body.peel.begin:
+; CHECK-NEXT:    br label [[FOR_BODY_PEEL:%.*]]
+; CHECK:       for.body.peel:
+; CHECK-NEXT:    br label [[LAND_RHS_PEEL:%.*]]
+; CHECK:       land.rhs.peel:
+; CHECK-NEXT:    [[AND_PEEL:%.*]] = and i32 -1, [[ARG]]
+; CHECK-NEXT:    [[TOBOOL2_PEEL:%.*]] = icmp ne i32 [[AND_PEEL]], 0
+; CHECK-NEXT:    [[DOTPRE_PEEL:%.*]] = load i32, ptr [[B]], align 1
+; CHECK-NEXT:    br label [[LAND_END_PEEL:%.*]]
+; CHECK:       land.end.peel:
+; CHECK-NEXT:    [[IV2_PEEL:%.*]] = phi i32 [ [[DOTPRE_PEEL]], [[LAND_RHS_PEEL]] ]
+; CHECK-NEXT:    [[IV3_PEEL:%.*]] = phi i1 [ [[TOBOOL2_PEEL]], [[LAND_RHS_PEEL]] ]
+; CHECK-NEXT:    [[LAND_EXT_PEEL:%.*]] = zext i1 [[IV3_PEEL]] to i16
+; CHECK-NEXT:    [[CMP3_PEEL:%.*]] = icmp ne i16 0, [[LAND_EXT_PEEL]]
+; CHECK-NEXT:    [[CONV4_PEEL:%.*]] = zext i1 [[CMP3_PEEL]] to i16
+; CHECK-NEXT:    store i16 [[LAND_EXT_PEEL]], ptr [[B]], align 1
+; CHECK-NEXT:    [[CMP_PEEL:%.*]] = icmp slt i32 [[IV2_PEEL]], 0
+; CHECK-NEXT:    br i1 [[CMP_PEEL]], label [[FOR_BODY_PEEL_NEXT:%.*]], label [[FOR_END_SPLIT:%.*]]
+; CHECK:       for.body.peel.next:
+; CHECK-NEXT:    br label [[FOR_BODY_PEEL_NEXT1:%.*]]
+; CHECK:       for.body.peel.next1:
+; CHECK-NEXT:    br label [[ENTRY_SPLIT_PEEL_NEWPH:%.*]]
+; CHECK:       entry.split.peel.newph:
+; CHECK-NEXT:    [[TOBOOL2:%.*]] = icmp ne i32 [[ARG]], 0
+; CHECK-NEXT:    [[LAND_EXT:%.*]] = zext i1 [[TOBOOL2]] to i16
+; CHECK-NEXT:    br label [[FOR_BODY:%.*]]
+; CHECK:       for.body:
+; CHECK-NEXT:    br label [[LAND_RHS:%.*]]
+; CHECK:       land.rhs:
+; CHECK-NEXT:    [[DOTPRE:%.*]] = load i32, ptr [[B]], align 1
+; CHECK-NEXT:    br label [[LAND_END:%.*]]
+; CHECK:       land.end:
+; CHECK-NEXT:    store i16 [[LAND_EXT]], ptr [[B]], align 1
+; CHECK-NEXT:    [[CMP:%.*]] = icmp slt i32 [[DOTPRE]], 0
+; CHECK-NEXT:    br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END_SPLIT_LOOPEXIT:%.*]], !llvm.loop [[LOOP0:![0-9]+]]
+; CHECK:       for.end.split.loopexit:
+; CHECK-NEXT:    br label [[FOR_END_SPLIT]]
+; CHECK:       for.end.split:
+; CHECK-NEXT:    br label [[FOR_END]]
+; CHECK:       for.end:
+; CHECK-NEXT:    ret void
+;
+entry:
+  br label %for.body
+
+for.body:                                         ; preds = %land.end, %entry
+  %iv1 = phi i16 [ 0, %entry ], [ %trunc, %land.end ]
+  br i1 %tobool.not, label %land.end, label %land.rhs
+
+land.rhs:                                         ; preds = %for.body
+  %and = and i32 -1, %arg
+  %tobool2 = icmp ne i32 %and, 0
+  %.pre = load i32, ptr %b, align 1
+  br label %land.end
+
+land.end:                                         ; preds = %land.rhs, %for.body
+  %iv2 = phi i32 [ 0, %for.body ], [ %.pre, %land.rhs ]
+  %iv3 = phi i1 [ false, %for.body ], [ %tobool2, %land.rhs ]
+  %land.ext = zext i1 %iv3 to i16
+  %cmp3 = icmp ne i16 %iv1, %land.ext
+  %conv4 = zext i1 %cmp3 to i16
+  store i16 %land.ext, ptr %b, align 1
+  %cmp = icmp slt i32 %iv2, 0
+  %trunc = trunc i32 %arg to i16
+  br i1 %cmp, label %for.body, label %for.end
+
+for.end:                                          ; preds = %land.end
+  ret void
+}

Copy link
Contributor

@fhahn fhahn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

; CHECK-NEXT: store i8 0, ptr [[ADR1]], align 1
; CHECK-NEXT: br label [[INNERPREHEADER:%.*]]
; CHECK: innerpreheader:
; CHECK-NEXT: [[INNERPRECMP:%.*]] = icmp sgt i32 [[LIMITDEC]], [[INNERCOUNT]]
; CHECK-NEXT: br i1 [[INNERPRECMP]], label [[INNERLOOP_PREHEADER:%.*]], label [[OUTERMERGE]]
; CHECK: innerloop.preheader:
; CHECK-NEXT: [[TMP1:%.*]] = sext i32 [[INNERCOUNT]] to i64
; CHECK-NEXT: [[WIDE_TRIP_COUNT:%.*]] = sext i32 [[LIMITDEC]] to i64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this has the nice side effect of hoisting the sext out of the loop.

@nikic nikic merged commit 32ec6d9 into llvm:main Oct 10, 2023
5 checks passed
@danilaml
Copy link
Collaborator

danilaml commented Oct 20, 2023

This appears to have fixed the hard-to-reduce issue with Assertion 'isAvailableAtLoopEntry(Operands[i], L) && "SCEVAddRecExpr operand is not available at loop entry!"' in verify passes after simple loop unswitch I've been debugging. Nice!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants