[SCEV] Fix incorrect nsw inference for multiply of addrec #66500

nikic · 2023-09-15T12:19:33Z

SCEV currently preserves the nsw flag when performing an nsw multiply of an nsw addrec. While this is legal for nuw, this is not generally the case for nsw.

This is because nsw mul does not distribute over nsw add: https://alive2.llvm.org/ce/z/mergCt

Instead, we need either both nuw and nsw to be set (https://alive2.llvm.org/ce/z/7wpgGc) or explicitly prove that the distributed multiplications are also nsw (https://alive2.llvm.org/ce/z/wef9su).

Fixes #66066.
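
To make the distributivity claim concrete, here is a minimal sketch that checks it by brute force on 8-bit values in plain C++. It does not reproduce the Alive2 proofs linked above; the helper names (`fitsSigned8`, `fitsUnsigned8`, `countViolations`) are hypothetical and exist only for this illustration. A "violation" is a triple where `a + b` and `(a + b) * c` are both overflow-free, but at least one operation of the distributed form `a * c + b * c` overflows.

```cpp
#include <cassert>

// True if the exact mathematical value V fits in a signed 8-bit integer.
static bool fitsSigned8(int V) { return V >= -128 && V <= 127; }

// True if the exact mathematical value V fits in an unsigned 8-bit integer.
static bool fitsUnsigned8(int V) { return V >= 0 && V <= 255; }

// Counts triples (A, B, C) in [Lo, Hi] for which `(A + B) * C` is
// overflow-free but the distributed form `A * C + B * C` is not.
// All arithmetic is done in `int`, so nothing wraps inside this function.
template <typename FitsFn>
static long countViolations(int Lo, int Hi, FitsFn Fits) {
  assert(Lo <= Hi && "empty range");
  long Violations = 0;
  for (int A = Lo; A <= Hi; ++A) {
    for (int B = Lo; B <= Hi; ++B) {
      if (!Fits(A + B))
        continue; // The original add would already overflow.
      for (int C = Lo; C <= Hi; ++C) {
        if (!Fits((A + B) * C))
          continue; // The original mul would already overflow.
        bool DistributedOk =
            Fits(A * C) && Fits(B * C) && Fits(A * C + B * C);
        if (!DistributedOk)
          ++Violations;
      }
    }
  }
  return Violations;
}

// Unsigned (nuw-like) case: expected to find no violations.
long countUnsignedViolations() {
  return countViolations(0, 255, fitsUnsigned8);
}

// Signed (nsw-like) case: violations exist, e.g. A = 100, B = -99, C = 2,
// where (A + B) * C == 2 but A * C == 200 does not fit in 8 signed bits.
long countSignedViolations() {
  return countViolations(-128, 127, fitsSigned8);
}
```

Calling `countUnsignedViolations()` returns zero because for non-negative values each partial product is bounded by `(a + b) * c`. `countSignedViolations()` returns a positive count, since operands of opposite sign can cancel in the sum while their individual products still overflow.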

llvmbot · 2023-09-15T12:21:37Z

@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-analysis

Patch is 26.86 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/66500.diff

8 Files Affected:

(modified) llvm/lib/Analysis/ScalarEvolution.cpp (+19-9)
(modified) llvm/test/Analysis/Delinearization/constant_functions_multi_dim.ll (+2-2)
(modified) llvm/test/Analysis/LoopAccessAnalysis/number-of-memchecks.ll (+1-1)
(modified) llvm/test/Analysis/ScalarEvolution/flags-from-poison.ll (+11-11)
(modified) llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll (+6-6)
(modified) llvm/test/Analysis/ScalarEvolution/nsw.ll (+2-2)
(modified) llvm/test/Transforms/IndVarSimplify/pr66066.ll (+3-2)
(removed) llvm/test/Transforms/LoopDataPrefetch/AArch64/pr43784.ll (-117)

diff --git a/llvm/lib/Analysis/ScalarEvolution.cpp b/llvm/lib/Analysis/ScalarEvolution.cpp
index f951068c4c79c09..00b1af73671c041 100644
--- a/llvm/lib/Analysis/ScalarEvolution.cpp
+++ b/llvm/lib/Analysis/ScalarEvolution.cpp
@@ -3269,18 +3269,28 @@ const SCEV *ScalarEvolution::getMulExpr(SmallVectorImpl<const SCEV *> &Ops,
       SmallVector<const SCEV *, 4> NewOps;
       NewOps.reserve(AddRec->getNumOperands());
       const SCEV *Scale = getMulExpr(LIOps, SCEV::FlagAnyWrap, Depth + 1);
-      for (unsigned i = 0, e = AddRec->getNumOperands(); i != e; ++i)
+
+      // If both the mul and addrec are nuw, we can preserve nuw.
+      // If both the mul and addrec are nsw, we can only preserve nsw if either
+      // a) they are also nuw, or
+      // b) all multiplications of addrec operands with scale are nsw.
+      SCEV::NoWrapFlags Flags =
+          AddRec->getNoWrapFlags(ComputeFlags({Scale, AddRec}));
+
+      for (unsigned i = 0, e = AddRec->getNumOperands(); i != e; ++i) {
         NewOps.push_back(getMulExpr(Scale, AddRec->getOperand(i),
                                     SCEV::FlagAnyWrap, Depth + 1));
 
-      // Build the new addrec. Propagate the NUW and NSW flags if both the
-      // outer mul and the inner addrec are guaranteed to have no overflow.
-      //
-      // No self-wrap cannot be guaranteed after changing the step size, but
-      // will be inferred if either NUW or NSW is true.
-      SCEV::NoWrapFlags Flags = ComputeFlags({Scale, AddRec});
-      const SCEV *NewRec = getAddRecExpr(
-          NewOps, AddRec->getLoop(), AddRec->getNoWrapFlags(Flags));
+        if (hasFlags(Flags, SCEV::FlagNSW) && !hasFlags(Flags, SCEV::FlagNUW)) {
+          ConstantRange NSWRegion = ConstantRange::makeGuaranteedNoWrapRegion(
+              Instruction::Mul, getSignedRange(Scale),
+              OverflowingBinaryOperator::NoSignedWrap);
+          if (!NSWRegion.contains(getSignedRange(AddRec->getOperand(i))))
+            Flags = clearFlags(Flags, SCEV::FlagNSW);
+        }
+      }
+
+      const SCEV *NewRec = getAddRecExpr(NewOps, AddRec->getLoop(), Flags);
 
       // If all of the other operands were loop invariant, we are done.
       if (Ops.size() == 1) return NewRec;
diff --git a/llvm/test/Analysis/Delinearization/constant_functions_multi_dim.ll b/llvm/test/Analysis/Delinearization/constant_functions_multi_dim.ll
index 00fcbff02e2746a..3044a4868260b4d 100644
--- a/llvm/test/Analysis/Delinearization/constant_functions_multi_dim.ll
+++ b/llvm/test/Analysis/Delinearization/constant_functions_multi_dim.ll
@@ -4,14 +4,14 @@ target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
 
 ; CHECK:      Inst:  %tmp = load float, ptr %arrayidx, align 4
 ; CHECK-NEXT: In Loop with Header: for.inc
-; CHECK-NEXT: AccessFunction: {(4 * %N * %call),+,4}<nsw><%for.inc>
+; CHECK-NEXT: AccessFunction: {(4 * %N * %call),+,4}<%for.inc>
 ; CHECK-NEXT: Base offset: %A
 ; CHECK-NEXT: ArrayDecl[UnknownSize][%N] with elements of 4 bytes.
 ; CHECK-NEXT: ArrayRef[%call][{0,+,1}<nuw><nsw><%for.inc>]
 
 ; CHECK:      Inst:  %tmp5 = load float, ptr %arrayidx4, align 4
 ; CHECK-NEXT: In Loop with Header: for.inc
-; CHECK-NEXT: AccessFunction: {(4 * %call1),+,(4 * %N)}<nsw><%for.inc>
+; CHECK-NEXT: AccessFunction: {(4 * %call1),+,(4 * %N)}<%for.inc>
 ; CHECK-NEXT: Base offset: %B
 ; CHECK-NEXT: ArrayDecl[UnknownSize][%N] with elements of 4 bytes.
 ; CHECK-NEXT: ArrayRef[{0,+,1}<nuw><nsw><%for.inc>][%call1]
diff --git a/llvm/test/Analysis/LoopAccessAnalysis/number-of-memchecks.ll b/llvm/test/Analysis/LoopAccessAnalysis/number-of-memchecks.ll
index 8ddcc152d11c6b9..c268cc55880c15f 100644
--- a/llvm/test/Analysis/LoopAccessAnalysis/number-of-memchecks.ll
+++ b/llvm/test/Analysis/LoopAccessAnalysis/number-of-memchecks.ll
@@ -247,7 +247,7 @@ for.end:                                          ; preds = %for.body
 ; CHECK-NEXT:   Grouped accesses:
 ; CHECK-NEXT:     Group {{.*}}[[ZERO]]:
 ; CHECK-NEXT:       (Low: ((2 * %offset) + %a) High: (10000 + (2 * %offset) + %a))
-; CHECK-NEXT:         Member: {((2 * %offset) + %a),+,2}<nw><%for.body>
+; CHECK-NEXT:         Member: {((2 * %offset) + %a),+,2}<%for.body>
 ; CHECK-NEXT:     Group {{.*}}[[ONE]]:
 ; CHECK-NEXT:       (Low: %a High: (10000 + %a))
 ; CHECK-NEXT:         Member: {%a,+,2}<nw><%for.body>
diff --git a/llvm/test/Analysis/ScalarEvolution/flags-from-poison.ll b/llvm/test/Analysis/ScalarEvolution/flags-from-poison.ll
index c6c9ff082ddd68b..a5bdee5c3b459bb 100644
--- a/llvm/test/Analysis/ScalarEvolution/flags-from-poison.ll
+++ b/llvm/test/Analysis/ScalarEvolution/flags-from-poison.ll
@@ -937,9 +937,9 @@ define void @test-mul-propagates-poison(ptr %input, i32 %offset, i32 %numIterati
 ; CHECK-NEXT:    %index32 = add nsw i32 %i, %offset
 ; CHECK-NEXT:    --> {%offset,+,1}<nsw><%loop> U: full-set S: full-set Exits: (-1 + %offset + %numIterations) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %indexmul = mul nsw i32 %index32, %offset
-; CHECK-NEXT:    --> {(%offset * %offset),+,%offset}<nsw><%loop> U: full-set S: full-set Exits: ((-1 + %offset + %numIterations) * %offset) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    --> {(%offset * %offset),+,%offset}<%loop> U: full-set S: full-set Exits: ((-1 + %offset + %numIterations) * %offset) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %ptr = getelementptr inbounds float, ptr %input, i32 %indexmul
-; CHECK-NEXT:    --> {((4 * (sext i32 (%offset * %offset) to i64))<nsw> + %input),+,(4 * (sext i32 %offset to i64))<nsw>}<nw><%loop> U: full-set S: full-set Exits: ((4 * (sext i32 (%offset * %offset) to i64))<nsw> + (4 * (zext i32 (-1 + %numIterations) to i64) * (sext i32 %offset to i64)) + %input) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    --> ((4 * (sext i32 {(%offset * %offset),+,%offset}<%loop> to i64))<nsw> + %input) U: full-set S: full-set Exits: ((4 * (sext i32 ((-1 + %offset + %numIterations) * %offset) to i64))<nsw> + %input) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %nexti = add nsw i32 %i, 1
 ; CHECK-NEXT:    --> {1,+,1}<nuw><nsw><%loop> U: [1,-2147483648) S: [1,-2147483648) Exits: %numIterations LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:  Determining loop execution counts for: @test-mul-propagates-poison
@@ -1245,11 +1245,11 @@ define void @test-shl-nsw(ptr %input, i32 %start, i32 %numIterations) {
 ; CHECK-NEXT:    %i = phi i32 [ %nexti, %loop ], [ %start, %entry ]
 ; CHECK-NEXT:    --> {%start,+,1}<nsw><%loop> U: full-set S: full-set Exits: (-1 + %numIterations) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %index32 = shl nsw i32 %i, 8
-; CHECK-NEXT:    --> {(256 * %start),+,256}<nsw><%loop> U: [0,-255) S: [-2147483648,2147483393) Exits: (-256 + (256 * %numIterations)) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    --> {(256 * %start),+,256}<%loop> U: [0,-255) S: [-2147483648,2147483393) Exits: (-256 + (256 * %numIterations)) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %index64 = sext i32 %index32 to i64
-; CHECK-NEXT:    --> {(sext i32 (256 * %start) to i64),+,256}<nsw><%loop> U: [0,-255) S: [-2147483648,1101659110913) Exits: ((sext i32 (256 * %start) to i64) + (256 * (zext i32 (-1 + (-1 * %start) + %numIterations) to i64))<nuw><nsw>) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    --> (sext i32 {(256 * %start),+,256}<%loop> to i64) U: [0,-255) S: [-2147483648,2147483393) Exits: (sext i32 (-256 + (256 * %numIterations)) to i64) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %ptr = getelementptr inbounds float, ptr %input, i64 %index64
-; CHECK-NEXT:    --> {((4 * (sext i32 (256 * %start) to i64))<nsw> + %input),+,1024}<nw><%loop> U: full-set S: full-set Exits: ((4 * (sext i32 (256 * %start) to i64))<nsw> + (1024 * (zext i32 (-1 + (-1 * %start) + %numIterations) to i64))<nuw><nsw> + %input) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    --> ((4 * (sext i32 {(256 * %start),+,256}<%loop> to i64))<nsw> + %input) U: full-set S: full-set Exits: ((4 * (sext i32 (-256 + (256 * %numIterations)) to i64))<nsw> + %input) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %nexti = add nsw i32 %i, 1
 ; CHECK-NEXT:    --> {(1 + %start)<nsw>,+,1}<nsw><%loop> U: [-2147483647,-2147483648) S: [-2147483647,-2147483648) Exits: %numIterations LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:  Determining loop execution counts for: @test-shl-nsw
@@ -1325,11 +1325,11 @@ define void @test-shl-nuw-nsw(ptr %input, i32 %start, i32 %numIterations) {
 ; CHECK-NEXT:    %i = phi i32 [ %nexti, %loop ], [ %start, %entry ]
 ; CHECK-NEXT:    --> {%start,+,1}<nsw><%loop> U: full-set S: full-set Exits: (-1 + %numIterations) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %index32 = shl nuw nsw i32 %i, 31
-; CHECK-NEXT:    --> {(-2147483648 * %start),+,-2147483648}<nsw><%loop> U: [0,-2147483647) S: [-2147483648,1) Exits: (-2147483648 + (-2147483648 * %numIterations)) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    --> {(-2147483648 * %start),+,-2147483648}<%loop> U: [0,-2147483647) S: [-2147483648,1) Exits: (-2147483648 + (-2147483648 * %numIterations)) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %index64 = sext i32 %index32 to i64
-; CHECK-NEXT:    --> {(sext i32 (-2147483648 * %start) to i64),+,-2147483648}<nsw><%loop> U: [0,-2147483647) S: [-9223372036854775808,1) Exits: ((sext i32 (-2147483648 * %start) to i64) + (-2147483648 * (zext i32 (-1 + (-1 * %start) + %numIterations) to i64))<nsw>) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    --> (sext i32 {(-2147483648 * %start),+,-2147483648}<%loop> to i64) U: [0,-2147483647) S: [-2147483648,1) Exits: (sext i32 (-2147483648 + (-2147483648 * %numIterations)) to i64) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %ptr = getelementptr inbounds float, ptr %input, i64 %index64
-; CHECK-NEXT:    --> {((4 * (sext i32 (-2147483648 * %start) to i64))<nsw> + %input),+,-8589934592}<nw><%loop> U: full-set S: full-set Exits: ((4 * (sext i32 (-2147483648 * %start) to i64))<nsw> + (-8589934592 * (zext i32 (-1 + (-1 * %start) + %numIterations) to i64)) + %input) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    --> ((4 * (sext i32 {(-2147483648 * %start),+,-2147483648}<%loop> to i64))<nsw> + %input) U: full-set S: full-set Exits: ((4 * (sext i32 (-2147483648 + (-2147483648 * %numIterations)) to i64))<nsw> + %input) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %nexti = add nsw i32 %i, 1
 ; CHECK-NEXT:    --> {(1 + %start)<nsw>,+,1}<nsw><%loop> U: [-2147483647,-2147483648) S: [-2147483647,-2147483648) Exits: %numIterations LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:  Determining loop execution counts for: @test-shl-nuw-nsw
@@ -1405,11 +1405,11 @@ define void @test-shl-nsw-edgecase(ptr %input, i32 %start, i32 %numIterations) {
 ; CHECK-NEXT:    %i = phi i32 [ %nexti, %loop ], [ %start, %entry ]
 ; CHECK-NEXT:    --> {%start,+,1}<nsw><%loop> U: full-set S: full-set Exits: (-1 + %numIterations) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %index32 = shl nsw i32 %i, 30
-; CHECK-NEXT:    --> {(1073741824 * %start),+,1073741824}<nsw><%loop> U: [0,-1073741823) S: [-2147483648,1073741825) Exits: (-1073741824 + (1073741824 * %numIterations)) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    --> {(1073741824 * %start),+,1073741824}<%loop> U: [0,-1073741823) S: [-2147483648,1073741825) Exits: (-1073741824 + (1073741824 * %numIterations)) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %index64 = sext i32 %index32 to i64
-; CHECK-NEXT:    --> {(sext i32 (1073741824 * %start) to i64),+,1073741824}<nsw><%loop> U: [0,-1073741823) S: [-2147483648,4611686018427387905) Exits: ((sext i32 (1073741824 * %start) to i64) + (1073741824 * (zext i32 (-1 + (-1 * %start) + %numIterations) to i64))<nuw><nsw>) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    --> (sext i32 {(1073741824 * %start),+,1073741824}<%loop> to i64) U: [0,-1073741823) S: [-2147483648,1073741825) Exits: (sext i32 (-1073741824 + (1073741824 * %numIterations)) to i64) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %ptr = getelementptr inbounds float, ptr %input, i64 %index64
-; CHECK-NEXT:    --> {((4 * (sext i32 (1073741824 * %start) to i64))<nsw> + %input),+,4294967296}<nw><%loop> U: full-set S: full-set Exits: ((4 * (sext i32 (1073741824 * %start) to i64))<nsw> + (4294967296 * (zext i32 (-1 + (-1 * %start) + %numIterations) to i64))<nuw> + %input) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    --> ((4 * (sext i32 {(1073741824 * %start),+,1073741824}<%loop> to i64))<nsw> + %input) U: full-set S: full-set Exits: ((4 * (sext i32 (-1073741824 + (1073741824 * %numIterations)) to i64))<nsw> + %input) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %nexti = add nsw i32 %i, 1
 ; CHECK-NEXT:    --> {(1 + %start)<nsw>,+,1}<nsw><%loop> U: [-2147483647,-2147483648) S: [-2147483647,-2147483648) Exits: %numIterations LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:  Determining loop execution counts for: @test-shl-nsw-edgecase
diff --git a/llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll b/llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll
index e6fbcbe5333f82e..eb8e9dd09dc4fb4 100644
--- a/llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll
+++ b/llvm/test/Analysis/ScalarEvolution/max-backedge-taken-count-guard-info.ll
@@ -9,7 +9,7 @@ define void @test_guard_less_than_16(ptr nocapture %a, i64 %i) {
 ; CHECK-NEXT:    %iv = phi i64 [ %iv.next, %loop ], [ %i, %entry ]
 ; CHECK-NEXT:    --> {%i,+,1}<nuw><nsw><%loop> U: full-set S: full-set Exits: 15 LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %idx = getelementptr inbounds i32, ptr %a, i64 %iv
-; CHECK-NEXT:    --> {((4 * %i) + %a),+,4}<nw><%loop> U: full-set S: full-set Exits: (60 + %a) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    --> {((4 * %i) + %a),+,4}<%loop> U: full-set S: full-set Exits: (60 + %a) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %iv.next = add nuw nsw i64 %iv, 1
 ; CHECK-NEXT:    --> {(1 + %i),+,1}<nuw><nsw><%loop> U: full-set S: full-set Exits: 16 LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:  Determining loop execution counts for: @test_guard_less_than_16
@@ -42,7 +42,7 @@ define void @test_guard_less_than_16_operands_swapped(ptr nocapture %a, i64 %i)
 ; CHECK-NEXT:    %iv = phi i64 [ %iv.next, %loop ], [ %i, %entry ]
 ; CHECK-NEXT:    --> {%i,+,1}<nuw><nsw><%loop> U: full-set S: full-set Exits: 15 LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %idx = getelementptr inbounds i32, ptr %a, i64 %iv
-; CHECK-NEXT:    --> {((4 * %i) + %a),+,4}<nw><%loop> U: full-set S: full-set Exits: (60 + %a) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    --> {((4 * %i) + %a),+,4}<%loop> U: full-set S: full-set Exits: (60 + %a) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %iv.next = add nuw nsw i64 %iv, 1
 ; CHECK-NEXT:    --> {(1 + %i),+,1}<nuw><nsw><%loop> U: full-set S: full-set Exits: 16 LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:  Determining loop execution counts for: @test_guard_less_than_16_operands_swapped
@@ -75,7 +75,7 @@ define void @test_guard_less_than_16_branches_flipped(ptr nocapture %a, i64 %i)
 ; CHECK-NEXT:    %iv = phi i64 [ %iv.next, %loop ], [ %i, %entry ]
 ; CHECK-NEXT:    --> {%i,+,1}<nuw><nsw><%loop> U: full-set S: full-set Exits: 15 LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %idx = getelementptr inbounds i32, ptr %a, i64 %iv
-; CHECK-NEXT:    --> {((4 * %i) + %a),+,4}<nw><%loop> U: full-set S: full-set Exits: (60 + %a) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    --> {((4 * %i) + %a),+,4}<%loop> U: full-set S: full-set Exits: (60 + %a) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %iv.next = add nuw nsw i64 %iv, 1
 ; CHECK-NEXT:    --> {(1 + %i),+,1}<nuw><nsw><%loop> U: full-set S: full-set Exits: 16 LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:  Determining loop execution counts for: @test_guard_less_than_16_branches_flipped
@@ -108,7 +108,7 @@ define void @test_guard_uge_16_branches_flipped(ptr nocapture %a, i64 %i) {
 ; CHECK-NEXT:    %iv = phi i64 [ %iv.next, %loop ], [ %i, %entry ]
 ; CHECK-NEXT:    --> {%i,+,1}<nuw><nsw><%loop> U: full-set S: full-set Exits: 15 LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %idx = getelementptr inbounds i32, ptr %a, i64 %iv
-; CHECK-NEXT:    --> {((4 * %i) + %a),+,4}<nw><%loop> U: full-set S: full-set Exits: (60 + %a) LoopDispositions: { %loop: Computable }
+; CHECK-NEXT:    --> {((4 * %i) + %a),+,4}<%loop> U: full-set S: full-set Exits: (60 + %a) LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %iv.next = add nuw nsw i64 %iv, 1
 ; CHECK-NEXT:    --> {(1 + %i),+,1}<nuw><nsw><%loop> U: full-set S: full-set Exits: 16 LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:  Determining loop execution counts for: @test_guard_uge_16_branches_flipped
@@ -1219,7 +1219,7 @@ define void @test_guard_slt_sgt_2(ptr nocapture %a, i64 %i) {
 ; CHECK-NEXT:    %iv = phi i64 [ %iv.next, %loop ], [ %i, %entry ]
 ; CHECK-NEXT:    --> {%i,+,1}<nuw><nsw><%loop> U: full-set S: full-set Exits: 17 LoopDispositions: { %loop: Computable }
 ; CHECK-NEXT:    %idx = getelementptr inbounds i32, ptr %a, i64 %iv
-; CHECK-NEXT:    --> {((4 * %i) + %a),+,4}<nw><%loop> U: full-set S: full-s...
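
As a rough model of the flag logic in the `getMulExpr` hunk of the diff, the following standalone sketch keeps nsw only when nuw is also present or when every operand-times-scale product is provably free of signed overflow. It does not use LLVM at all: `NoWrapFlags`, `SignedRange`, `mulNeverOverflows` and `flagsForScaledAddRec` are hypothetical stand-ins, the ranges are assumed to be plain non-wrapping 32-bit intervals, and the exact corner check here is a simplification of the conservative region that `ConstantRange::makeGuaranteedNoWrapRegion` computes in the real patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for SCEV's no-wrap flag bits (not LLVM's enum).
enum NoWrapFlags : unsigned { FlagAnyWrap = 0, FlagNUW = 1, FlagNSW = 2 };

// Closed signed interval [Lo, Hi] of 32-bit values, stored in 64 bits so
// that products of two bounds can be computed exactly.
struct SignedRange {
  int64_t Lo;
  int64_t Hi;
};

static bool fitsInt32(int64_t V) { return V >= INT32_MIN && V <= INT32_MAX; }

// True if X * Y cannot overflow a signed 32-bit integer for any X in A and
// Y in B. A product over a box attains its extremes at the four corners.
bool mulNeverOverflows(SignedRange A, SignedRange B) {
  assert(A.Lo <= A.Hi && B.Lo <= B.Hi && "wrapped ranges are not modelled");
  return fitsInt32(A.Lo * B.Lo) && fitsInt32(A.Lo * B.Hi) &&
         fitsInt32(A.Hi * B.Lo) && fitsInt32(A.Hi * B.Hi);
}

// Flags: the flags common to the outer mul and the inner addrec.
// Returns the flags that may be placed on the scaled addrec.
unsigned flagsForScaledAddRec(unsigned Flags, SignedRange Scale,
                              const std::vector<SignedRange> &Operands) {
  for (const SignedRange &Op : Operands) {
    // nsw without nuw survives only if this operand's product with the
    // scale is itself free of signed overflow.
    bool OnlyNSW = (Flags & FlagNSW) && !(Flags & FlagNUW);
    if (OnlyNSW && !mulNeverOverflows(Scale, Op))
      Flags &= ~static_cast<unsigned>(FlagNSW);
  }
  return Flags;
}
```

For example, scaling an addrec whose start has the full 32-bit range by 256 (the shape of the `shl nsw i32 %i, 8` test above) drops nsw in this model, while the same call with both `FlagNUW` and `FlagNSW` set leaves the flags unchanged.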

nikic · 2023-09-15T12:23:12Z

llvm/test/Transforms/LoopDataPrefetch/AArch64/pr43784.ll

I couldn't figure out how to adjust this test and ended up deleting it because it appears to be entirely broken anyway: It was supposed to test that IR not in loop simplify form is handled gracefully, but then ... it explicitly runs loop-simplify, so this is just pointless.

efriedma-quic

LGTM

fhahn

LGTM, thanks!

fhahn · 2023-09-17T20:24:09Z

llvm/test/Transforms/LoopDataPrefetch/AArch64/pr43784.ll

Yeah, removing seems the best way forward here.

nikic requested review from fhahn, preames and efriedma-quic September 15, 2023 12:19

llvmbot added llvm:analysis llvm:transforms labels Sep 15, 2023

efriedma-quic approved these changes Sep 15, 2023

fhahn approved these changes Sep 17, 2023

nikic merged commit efe4e7a into llvm:main Sep 18, 2023
3 of 4 checks passed
