Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SCEV] Check AR's wrap flags when expanding. #77827

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

fhahn
Copy link
Contributor

@fhahn fhahn commented Jan 11, 2024

dcc84db introduced the IsIncrementNUW/NSW helpers instead of checking the AddRecs wrap flags, with the reason being that the increment by one in the post-inc form may wrap, even if the original AddRec would not wrap.

When introducing PostInc uses, flags that aren't present on the original AddRec are dropped, which should be sufficient to avoid the expansion result being more poisonous for post-inc uses.

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 11, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes

dcc84db introduced the IsIncrementNUW/NSW helpers instead of checking the AddRecs wrap flags, with the reason being that the increment by one in the post-inc form may wrap, even if the original AddRec would not wrap.

AFAICT there is only an issue with loops in post-inc form. This patch brings back checking the original check of the no-wrap flags on AR, if AR's loop isn't in the set of post-inc loops.


Full diff: https://github.com/llvm/llvm-project/pull/77827.diff

7 Files Affected:

  • (modified) llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp (+19-4)
  • (modified) llvm/test/Transforms/IndVarSimplify/lftr-reuse.ll (+2-2)
  • (modified) llvm/test/Transforms/IndVarSimplify/pr30806-phi-scev.ll (+1-1)
  • (modified) llvm/test/Transforms/IndVarSimplify/pr55925.ll (+2-2)
  • (modified) llvm/test/Transforms/IndVarSimplify/preserve-nsw-during-expansion.ll (+1-1)
  • (modified) llvm/test/Transforms/IndVarSimplify/widen-i32-i8ptr.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopStrengthReduce/X86/expander-crashes.ll (+2-2)
diff --git a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
index cd3ac317cd238e..dbaa506c8bbda0 100644
--- a/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
+++ b/llvm/lib/Transforms/Utils/ScalarEvolutionExpander.cpp
@@ -843,10 +843,16 @@ static bool canBeCheaplyTransformed(ScalarEvolution &SE,
   return false;
 }
 
-static bool IsIncrementNSW(ScalarEvolution &SE, const SCEVAddRecExpr *AR) {
+static bool IsIncrementNSW(ScalarEvolution &SE, const SCEVAddRecExpr *AR,
+                           bool IsPostIncLoop) {
   if (!isa<IntegerType>(AR->getType()))
     return false;
 
+  // Check AR's no-wrap flags, unless it is for a post-inc loop. Increment by 1
+  // for post-inc users may wrap even if the original AR does not.
+  if (!IsPostIncLoop && AR->getNoWrapFlags(SCEV::FlagNSW) & SCEV::FlagNSW)
+    return true;
+
   unsigned BitWidth = cast<IntegerType>(AR->getType())->getBitWidth();
   Type *WideTy = IntegerType::get(AR->getType()->getContext(), BitWidth * 2);
   const SCEV *Step = AR->getStepRecurrence(SE);
@@ -857,9 +863,14 @@ static bool IsIncrementNSW(ScalarEvolution &SE, const SCEVAddRecExpr *AR) {
   return ExtendAfterOp == OpAfterExtend;
 }
 
-static bool IsIncrementNUW(ScalarEvolution &SE, const SCEVAddRecExpr *AR) {
+static bool IsIncrementNUW(ScalarEvolution &SE, const SCEVAddRecExpr *AR,
+                           bool IsPostIncLoop) {
   if (!isa<IntegerType>(AR->getType()))
     return false;
+  // Check AR's no-wrap flags, unless it is for a post-inc loop. Increment by 1
+  // for post-inc users may wrap even if the original AR does not.
+  if (!IsPostIncLoop && AR->getNoWrapFlags(SCEV::FlagNUW) & SCEV::FlagNUW)
+    return true;
 
   unsigned BitWidth = cast<IntegerType>(AR->getType())->getBitWidth();
   Type *WideTy = IntegerType::get(AR->getType()->getContext(), BitWidth * 2);
@@ -1008,8 +1019,12 @@ SCEVExpander::getAddRecExprPHILiterally(const SCEVAddRecExpr *Normalized,
   // The no-wrap behavior proved by IsIncrement(NUW|NSW) is only applicable if
   // we actually do emit an addition.  It does not apply if we emit a
   // subtraction.
-  bool IncrementIsNUW = !useSubtract && IsIncrementNUW(SE, Normalized);
-  bool IncrementIsNSW = !useSubtract && IsIncrementNSW(SE, Normalized);
+  bool IsPostIncLoop = PostIncLoops.contains(Normalized->getLoop()) ||
+                       SavedPostIncLoops.count(Normalized->getLoop());
+  bool IncrementIsNUW =
+      !useSubtract && IsIncrementNUW(SE, Normalized, IsPostIncLoop);
+  bool IncrementIsNSW =
+      !useSubtract && IsIncrementNSW(SE, Normalized, IsPostIncLoop);
 
   // Create the PHI.
   BasicBlock *Header = L->getHeader();
diff --git a/llvm/test/Transforms/IndVarSimplify/lftr-reuse.ll b/llvm/test/Transforms/IndVarSimplify/lftr-reuse.ll
index 8aa698a4cb51d1..a6abd5ffbcb4a6 100644
--- a/llvm/test/Transforms/IndVarSimplify/lftr-reuse.ll
+++ b/llvm/test/Transforms/IndVarSimplify/lftr-reuse.ll
@@ -86,7 +86,7 @@ define void @expandOuterRecurrence(i32 %arg) nounwind {
 ; CHECK-NEXT:    br label [[OUTER_INC]]
 ; CHECK:       outer.inc:
 ; CHECK-NEXT:    [[I_INC]] = add nuw nsw i32 [[I]], 1
-; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add i32 [[INDVARS_IV]], -1
+; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nsw i32 [[INDVARS_IV]], -1
 ; CHECK-NEXT:    [[EXITCOND1:%.*]] = icmp ne i32 [[I_INC]], [[SUB1]]
 ; CHECK-NEXT:    br i1 [[EXITCOND1]], label [[OUTER]], label [[EXIT_LOOPEXIT:%.*]]
 ; CHECK:       exit.loopexit:
@@ -148,7 +148,7 @@ define void @guardedloop(ptr %matrix, ptr %vector,
 ; CHECK-NEXT:    [[VECTORP:%.*]] = getelementptr inbounds [0 x double], ptr [[VECTOR:%.*]], i32 0, i64 [[INDVARS_IV2]]
 ; CHECK-NEXT:    [[V2:%.*]] = load double, ptr [[VECTORP]], align 8
 ; CHECK-NEXT:    call void @use(double [[V2]])
-; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add i64 [[INDVARS_IV]], [[TMP0]]
+; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nsw i64 [[INDVARS_IV]], [[TMP0]]
 ; CHECK-NEXT:    [[INDVARS_IV_NEXT3]] = add nuw nsw i64 [[INDVARS_IV2]], 1
 ; CHECK-NEXT:    [[EXITCOND:%.*]] = icmp ne i64 [[INDVARS_IV_NEXT3]], [[WIDE_TRIP_COUNT]]
 ; CHECK-NEXT:    br i1 [[EXITCOND]], label [[LOOP]], label [[RETURN_LOOPEXIT:%.*]]
diff --git a/llvm/test/Transforms/IndVarSimplify/pr30806-phi-scev.ll b/llvm/test/Transforms/IndVarSimplify/pr30806-phi-scev.ll
index b45f0946399f9d..6a2bbfa5447a9d 100644
--- a/llvm/test/Transforms/IndVarSimplify/pr30806-phi-scev.ll
+++ b/llvm/test/Transforms/IndVarSimplify/pr30806-phi-scev.ll
@@ -43,7 +43,7 @@ define void @foo(ptr %buf, i32 %denominator, ptr %flag) local_unnamed_addr {
 ; CHECK-NEXT:    [[INDVARS_IV:%.*]] = phi i64 [ [[INDVARS_IV_NEXT:%.*]], [[WHILE_BODY]] ], [ 0, [[WHILE_BODY_LR_PH]] ]
 ; CHECK-NEXT:    [[BUF_ADDR_07:%.*]] = phi ptr [ [[BUF]], [[WHILE_BODY_LR_PH]] ], [ [[CALL:%.*]], [[WHILE_BODY]] ]
 ; CHECK-NEXT:    [[TMP2:%.*]] = sext i32 [[DIV]] to i64
-; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add i64 [[INDVARS_IV]], [[TMP2]]
+; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nsw i64 [[INDVARS_IV]], [[TMP2]]
 ; CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr @theSize, align 4
 ; CHECK-NEXT:    store i32 [[TMP3]], ptr [[I]], align 4
 ; CHECK-NEXT:    call void @bar(ptr nonnull [[I]], i64 [[INDVARS_IV_NEXT]])
diff --git a/llvm/test/Transforms/IndVarSimplify/pr55925.ll b/llvm/test/Transforms/IndVarSimplify/pr55925.ll
index 420fc209949d4f..f95f263ae1b1ec 100644
--- a/llvm/test/Transforms/IndVarSimplify/pr55925.ll
+++ b/llvm/test/Transforms/IndVarSimplify/pr55925.ll
@@ -18,7 +18,7 @@ define void @test(ptr %p) personality ptr undef {
 ; CHECK-NEXT:    [[RES:%.*]] = invoke i32 @foo(i32 returned [[TMP0]])
 ; CHECK-NEXT:            to label [[LOOP_LATCH]] unwind label [[EXIT:%.*]]
 ; CHECK:       loop.latch:
-; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add i64 [[INDVARS_IV]], 1
+; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nuw i64 [[INDVARS_IV]], 1
 ; CHECK-NEXT:    [[TMP1:%.*]] = trunc i64 [[INDVARS_IV]] to i32
 ; CHECK-NEXT:    [[TMP2:%.*]] = call i32 @foo(i32 [[TMP1]])
 ; CHECK-NEXT:    br label [[LOOP]]
@@ -64,7 +64,7 @@ define void @test_critedge(i1 %c, ptr %p) personality ptr undef {
 ; CHECK-NEXT:    br label [[LOOP_LATCH]]
 ; CHECK:       loop.latch:
 ; CHECK-NEXT:    [[PHI:%.*]] = phi i32 [ [[TMP1]], [[LOOP_INVOKE]] ], [ 0, [[LOOP_OTHER]] ]
-; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add i64 [[INDVARS_IV]], 1
+; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nuw i64 [[INDVARS_IV]], 1
 ; CHECK-NEXT:    [[TMP2:%.*]] = call i32 @foo(i32 [[PHI]])
 ; CHECK-NEXT:    br label [[LOOP]]
 ; CHECK:       exit:
diff --git a/llvm/test/Transforms/IndVarSimplify/preserve-nsw-during-expansion.ll b/llvm/test/Transforms/IndVarSimplify/preserve-nsw-during-expansion.ll
index 9c2237cff837bd..080bc9b42bbed5 100644
--- a/llvm/test/Transforms/IndVarSimplify/preserve-nsw-during-expansion.ll
+++ b/llvm/test/Transforms/IndVarSimplify/preserve-nsw-during-expansion.ll
@@ -23,7 +23,7 @@ define void @test_s172(i32 noundef %xa, i32 noundef %xb, ptr nocapture noundef %
 ; CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[ARRAYIDX2]], align 4
 ; CHECK-NEXT:    [[ADD:%.*]] = add nsw i32 [[TMP3]], [[TMP2]]
 ; CHECK-NEXT:    store i32 [[ADD]], ptr [[ARRAYIDX2]], align 4
-; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add i64 [[INDVARS_IV]], [[TMP1]]
+; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nsw i64 [[INDVARS_IV]], [[TMP1]]
 ; CHECK-NEXT:    [[CMP:%.*]] = icmp slt i64 [[INDVARS_IV_NEXT]], 32000
 ; CHECK-NEXT:    br i1 [[CMP]], label [[FOR_BODY]], label [[FOR_END_LOOPEXIT:%.*]], !llvm.loop [[LOOP0:![0-9]+]]
 ; CHECK:       for.end.loopexit:
diff --git a/llvm/test/Transforms/IndVarSimplify/widen-i32-i8ptr.ll b/llvm/test/Transforms/IndVarSimplify/widen-i32-i8ptr.ll
index 17ce13d8348784..35e6ca6c2cdee9 100644
--- a/llvm/test/Transforms/IndVarSimplify/widen-i32-i8ptr.ll
+++ b/llvm/test/Transforms/IndVarSimplify/widen-i32-i8ptr.ll
@@ -15,7 +15,7 @@ define dso_local void @Widen_i32_i8ptr() local_unnamed_addr {
 ; CHECK-NEXT:    [[INCDEC_PTR]] = getelementptr inbounds i8, ptr [[GID_0]], i64 1
 ; CHECK-NEXT:    [[ARRAYIDX2115:%.*]] = getelementptr inbounds [15 x ptr], ptr [[PTRIDS]], i64 0, i64 [[INDVARS_IV]]
 ; CHECK-NEXT:    store ptr [[GID_0]], ptr [[ARRAYIDX2115]], align 8
-; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nuw i64 [[INDVARS_IV]], 1
+; CHECK-NEXT:    [[INDVARS_IV_NEXT]] = add nuw nsw i64 [[INDVARS_IV]], 1
 ; CHECK-NEXT:    br label [[FOR_COND2106]]
 ;
 entry:
diff --git a/llvm/test/Transforms/LoopStrengthReduce/X86/expander-crashes.ll b/llvm/test/Transforms/LoopStrengthReduce/X86/expander-crashes.ll
index 29c03b88c5fb1a..d7111e7dcb4dc0 100644
--- a/llvm/test/Transforms/LoopStrengthReduce/X86/expander-crashes.ll
+++ b/llvm/test/Transforms/LoopStrengthReduce/X86/expander-crashes.ll
@@ -17,7 +17,7 @@ define i64 @blam(ptr %start, ptr %end, ptr %ptr.2) {
 ; CHECK-NEXT:    [[LSR_IV4:%.*]] = phi i64 [ [[LSR_IV_NEXT5:%.*]], [[LOOP_1_HEADER]] ], [ [[START1]], [[ENTRY:%.*]] ]
 ; CHECK-NEXT:    [[IV:%.*]] = phi ptr [ [[IV_NEXT:%.*]], [[LOOP_1_HEADER]] ], [ [[START]], [[ENTRY]] ]
 ; CHECK-NEXT:    [[IV_NEXT]] = getelementptr inbounds [[STRUCT_HOGE:%.*]], ptr [[IV]], i64 1
-; CHECK-NEXT:    [[LSR_IV_NEXT5]] = add nuw i64 [[LSR_IV4]], 16
+; CHECK-NEXT:    [[LSR_IV_NEXT5]] = add i64 [[LSR_IV4]], 16
 ; CHECK-NEXT:    [[EC:%.*]] = icmp eq ptr [[IV_NEXT]], [[END:%.*]]
 ; CHECK-NEXT:    br i1 [[EC]], label [[LOOP_2_PH:%.*]], label [[LOOP_1_HEADER]]
 ; CHECK:       loop.2.ph:
@@ -35,7 +35,7 @@ define i64 @blam(ptr %start, ptr %end, ptr %ptr.2) {
 ; CHECK-NEXT:    br i1 [[EC_2]], label [[LOOP_2_EXIT:%.*]], label [[LOOP_2_LATCH]]
 ; CHECK:       loop.2.latch:
 ; CHECK-NEXT:    [[IV2_NEXT]] = getelementptr inbounds [[STRUCT_HOGE]], ptr [[IV2]], i64 1
-; CHECK-NEXT:    [[LSR_IV_NEXT3]] = add i64 [[LSR_IV2]], 16
+; CHECK-NEXT:    [[LSR_IV_NEXT3]] = add nuw i64 [[LSR_IV2]], 16
 ; CHECK-NEXT:    br label [[LOOP_2_HEADER]]
 ; CHECK:       loop.2.exit:
 ; CHECK-NEXT:    ret i64 [[LSR_IV2]]

@nikic
Copy link
Contributor

nikic commented Jan 11, 2024

Does this code need to care about post-inc addrecs at all? At this point, we're expanding the (normalized) pre-inc addrec. If we introduce a post-inc use, then we'll explicitly clear the nowrap flags if necessary here (note that it checks S, not Normalized):

// We might be introducing a new use of the post-inc IV that is not poison
// safe, in which case we should drop poison generating flags. Only keep
// those flags for which SCEV has proven that they always hold.
if (isa<OverflowingBinaryOperator>(Result)) {
auto *I = cast<Instruction>(Result);
if (!S->hasNoUnsignedWrap())
I->setHasNoUnsignedWrap(false);
if (!S->hasNoSignedWrap())
I->setHasNoSignedWrap(false);
}
This makes me think that we'll automatically handle post-inc correctly via that code. (Note that this was introduced only after the referenced commit.)

dcc84db introduced the IsIncrementNUW/NSW helpers instead of
checking the AddRecs wrap flags, with the reason being that the
increment by one in the post-inc form may wrap, even if the original
AddRec would not wrap.

AFAICT there is only an issue with loops in post-inc form. This patch
brings back checking the original check of the no-wrap flags on AR, if
AR's loop isn't in the set of post-inc loops.
@fhahn
Copy link
Contributor Author

fhahn commented Jan 12, 2024

Does this code need to care about post-inc addrecs at all? At this point, we're expanding the (normalized) pre-inc addrec. If we introduce a post-inc use, then we'll explicitly clear the nowrap flags if necessary here (note that it checks S, not Normalized):

Yes I think that should be sufficient. I updated the PR to always check the AddRec flags; to avoid any regressions, I had to do it after the check in the wider type, as the SCEV construction there strengthens flags of some involved AddRecs as unfortunate side effect.

@fhahn fhahn changed the title [SCEV] Check AR's wrap flags when expanding in non-post increment loops. [SCEV] Check AR's wrap flags when expanding. Jan 12, 2024
@@ -22,7 +22,7 @@ define i32 @test() {
; CHECK-NEXT: [[IV_2:%.*]] = phi i32 [ [[LSR_IV_NEXT2_LCSSA]], [[LOOP_2_PH]] ], [ [[IV_2_NEXT:%.*]], [[LOOP_2]] ]
; CHECK-NEXT: call void @use(i32 [[IV_2]])
; CHECK-NEXT: [[IV_2_NEXT]] = add i32 [[IV_2]], 1
; CHECK-NEXT: [[LSR_IV_NEXT]] = add nsw i64 [[LSR_IV]], -1
; CHECK-NEXT: [[LSR_IV_NEXT]] = add nuw nsw i64 [[LSR_IV]], -1
Copy link
Contributor

@nikic nikic Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change looks suspicious. We have initial value 1 and then add -1 with nuw. But this is guarded by a "br i1 false", so it's fine -- just not sure whether it's correct for the right reasons.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is due to range info, LSR will likely cause wrap flags to be strengthened by adding info from BTC

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Test diffs look ok.

@fhahn
Copy link
Contributor Author

fhahn commented Jan 12, 2024

Found a potential issue which I will need to investigate before merging.

fhahn added a commit that referenced this pull request Jan 15, 2024
Extra test for #77827, where
NUW gets added the AddRec due to the BTC being 0.
justinfargnoli pushed a commit to justinfargnoli/llvm-project that referenced this pull request Jan 28, 2024
Extra test for llvm#77827, where
NUW gets added the AddRec due to the BTC being 0.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants