[LoopUnroll] Add NUW to IV increment introduced by runtime unrolling. #83246

Open
wants to merge 1 commit into
base: main

@fhahn fhahn commented Feb 28, 2024

Runtime unrolling introduces a new induction variable (IV) to control the loop. This IV is compared against a value that is guaranteed to be a multiple of the IV's step, which should guarantee that the loop exits before the IV's increment can wrap in the unsigned sense.

Add NUW to the increment introduced by runtime unrolling.
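
As a rough illustration of the argument above, here is a minimal sketch in plain C++ (not LLVM code) that models the new counter in 8 bits. The function names and the 8-bit width are assumptions made for the example. It only covers the case the description relies on, where the compared value is a nonzero multiple of the step.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 8-bit model of the counter added by runtime unrolling.
// The counter starts at 0, is incremented by `step`, and the loop exits
// when the post-increment value equals `test_val`.
// Returns true if some increment would wrap in the unsigned sense.
bool incrementWraps(std::uint8_t test_val, std::uint8_t step) {
  assert(step != 0 && "step must be nonzero");
  std::uint8_t idx = 0;
  for (;;) {
    unsigned wide = static_cast<unsigned>(idx) + step;    // exact sum
    std::uint8_t next = static_cast<std::uint8_t>(wide);  // wrapping sum
    if (wide > 0xFF)
      return true;   // an `add nuw` would yield poison here
    if (next == test_val)
      return false;  // loop exits before any wrap
    idx = next;
  }
}

// Checks every nonzero multiple of `step` that fits in 8 bits.
bool allNonzeroMultiplesExitWithoutWrap(std::uint8_t step) {
  for (unsigned m = step; m <= 0xFF; m += step)
    if (incrementWraps(static_cast<std::uint8_t>(m), step))
      return false;
  return true;
}
```

In this model, a compared value that is not a multiple of the step is never hit exactly, so the counter runs until it wraps; that is why the "multiple of the step" precondition matters.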

llvmbot commented Feb 28, 2024

@llvm/pr-subscribers-backend-powerpc

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Patch is 70.58 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/83246.diff

20 Files Affected:

  • (modified) llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp (+1-1)
  • (modified) llvm/test/Transforms/LoopUnroll/AArch64/runtime-unroll-generic.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopUnroll/ARM/multi-blocks.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopUnroll/PowerPC/p8-unrolling-legalize-vectors-inseltpoison.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopUnroll/PowerPC/p8-unrolling-legalize-vectors.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopUnroll/WebAssembly/basic-unrolling.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopUnroll/X86/znver3.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopUnroll/runtime-exit-phi-scev-invalidation.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopUnroll/runtime-i128.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopUnroll/runtime-loop-at-most-two-exits.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopUnroll/runtime-loop-multiple-exits.ll (+132-132)
  • (modified) llvm/test/Transforms/LoopUnroll/runtime-loop5.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopUnroll/runtime-multiexit-heuristic.ll (+5-5)
  • (modified) llvm/test/Transforms/LoopUnroll/runtime-unroll-assume-no-remainder.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopUnroll/runtime-unroll-remainder.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopUnroll/scev-invalidation-lcssa.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopUnroll/tripcount-overflow.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopUnrollAndJam/unroll-and-jam.ll (+5-5)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/float-induction-x86.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/extra-unroll-simplifications.ll (+1-1)
diff --git a/llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp b/llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp
index 650f055356c072..f802610c64ccc3 100644
--- a/llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp
+++ b/llvm/lib/Transforms/Utils/LoopUnrollRuntime.cpp
@@ -940,7 +940,7 @@ bool llvm::UnrollRuntimeLoopRemainder(
     B2.SetInsertPoint(LatchBR);
     auto *Zero = ConstantInt::get(NewIdx->getType(), 0);
     auto *One = ConstantInt::get(NewIdx->getType(), 1);
-    Value *IdxNext = B2.CreateAdd(NewIdx, One, NewIdx->getName() + ".next");
+    Value *IdxNext = B2.CreateNUWAdd(NewIdx, One, NewIdx->getName() + ".next");
     auto Pred = LatchBR->getSuccessor(0) == Header ? ICmpInst::ICMP_NE : ICmpInst::ICMP_EQ;
     Value *IdxCmp = B2.CreateICmp(Pred, IdxNext, TestVal, NewIdx->getName() + ".ncmp");
     NewIdx->addIncoming(Zero, NewPreHeader);
diff --git a/llvm/test/Transforms/LoopUnroll/AArch64/runtime-unroll-generic.ll b/llvm/test/Transforms/LoopUnroll/AArch64/runtime-unroll-generic.ll
index cba80a3a2bc5fb..c786e979240692 100644
--- a/llvm/test/Transforms/LoopUnroll/AArch64/runtime-unroll-generic.ll
+++ b/llvm/test/Transforms/LoopUnroll/AArch64/runtime-unroll-generic.ll
@@ -67,7 +67,7 @@ define void @runtime_unroll_generic(i32 %arg_0, ptr %arg_1, ptr %arg_2, ptr %arg
 ; CHECK-A55-NEXT:    [[ADD21_3:%.*]] = add nsw i32 [[MUL16_3]], [[TMP12]]
 ; CHECK-A55-NEXT:    store i32 [[ADD21_3]], ptr [[ARRAYIDX20_3]], align 4
 ; CHECK-A55-NEXT:    [[INDVARS_IV_NEXT_3]] = add nuw nsw i64 [[INDVARS_IV]], 4
-; CHECK-A55-NEXT:    [[NITER_NEXT_3]] = add i64 [[NITER]], 4
+; CHECK-A55-NEXT:    [[NITER_NEXT_3]] = add nuw i64 [[NITER]], 4
 ; CHECK-A55-NEXT:    [[NITER_NCMP_3:%.*]] = icmp eq i64 [[NITER_NEXT_3]], [[UNROLL_ITER]]
 ; CHECK-A55-NEXT:    br i1 [[NITER_NCMP_3]], label [[FOR_END_LOOPEXIT_UNR_LCSSA]], label [[FOR_BODY6]], !llvm.loop [[LOOP0:![0-9]+]]
 ; CHECK-A55:       for.end.loopexit.unr-lcssa:
diff --git a/llvm/test/Transforms/LoopUnroll/ARM/multi-blocks.ll b/llvm/test/Transforms/LoopUnroll/ARM/multi-blocks.ll
index f988c153b8bc71..7eef8b717ba5c2 100644
--- a/llvm/test/Transforms/LoopUnroll/ARM/multi-blocks.ll
+++ b/llvm/test/Transforms/LoopUnroll/ARM/multi-blocks.ll
@@ -133,7 +133,7 @@ define void @test_three_blocks(ptr nocapture %Output, ptr nocapture readonly %Co
 ; CHECK:       for.inc.3:
 ; CHECK-NEXT:    [[TEMP_1_3]] = phi i32 [ [[ADD_3]], [[IF_THEN_3]] ], [ [[TEMP_1_2]], [[FOR_INC_2]] ]
 ; CHECK-NEXT:    [[INC_3]] = add nuw i32 [[J_010]], 4
-; CHECK-NEXT:    [[NITER_NEXT_3]] = add i32 [[NITER]], 4
+; CHECK-NEXT:    [[NITER_NEXT_3]] = add nuw i32 [[NITER]], 4
 ; CHECK-NEXT:    [[NITER_NCMP_3:%.*]] = icmp eq i32 [[NITER_NEXT_3]], [[UNROLL_ITER]]
 ; CHECK-NEXT:    br i1 [[NITER_NCMP_3]], label [[FOR_COND_CLEANUP_LOOPEXIT_UNR_LCSSA_LOOPEXIT]], label [[FOR_BODY]]
 ;
@@ -549,7 +549,7 @@ define void @test_four_blocks(ptr nocapture %Output, ptr nocapture readonly %Con
 ; CHECK:       for.inc.3:
 ; CHECK-NEXT:    [[TEMP_1_3]] = phi i32 [ [[ADD_3]], [[IF_THEN_3]] ], [ [[SUB13_3]], [[IF_ELSE_3]] ]
 ; CHECK-NEXT:    [[INC_3]] = add nuw i32 [[J_027]], 4
-; CHECK-NEXT:    [[NITER_NEXT_3]] = add i32 [[NITER]], 4
+; CHECK-NEXT:    [[NITER_NEXT_3]] = add nuw i32 [[NITER]], 4
 ; CHECK-NEXT:    [[NITER_NCMP_3:%.*]] = icmp eq i32 [[NITER_NEXT_3]], [[UNROLL_ITER]]
 ; CHECK-NEXT:    br i1 [[NITER_NCMP_3]], label [[FOR_COND_CLEANUP_LOOPEXIT_UNR_LCSSA_LOOPEXIT]], label [[FOR_BODY]]
 ;
diff --git a/llvm/test/Transforms/LoopUnroll/PowerPC/p8-unrolling-legalize-vectors-inseltpoison.ll b/llvm/test/Transforms/LoopUnroll/PowerPC/p8-unrolling-legalize-vectors-inseltpoison.ll
index d0041668e77bde..6c65df85a5e649 100644
--- a/llvm/test/Transforms/LoopUnroll/PowerPC/p8-unrolling-legalize-vectors-inseltpoison.ll
+++ b/llvm/test/Transforms/LoopUnroll/PowerPC/p8-unrolling-legalize-vectors-inseltpoison.ll
@@ -48,7 +48,7 @@ define ptr @f(ptr returned %s, i32 zeroext %x, i32 signext %k) local_unnamed_add
 ; CHECK-NEXT:    store <16 x i8> [[TMP12]], ptr [[TMP13]], align 1
 ; CHECK-NEXT:    [[INDEX_NEXT_1]] = add i64 [[INDEX]], 32
 ; CHECK-NEXT:    [[VEC_IND_NEXT13_1]] = add <16 x i32> [[VEC_IND12]], <i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32>
-; CHECK-NEXT:    [[NITER_NEXT_1]] = add i64 [[NITER]], 2
+; CHECK-NEXT:    [[NITER_NEXT_1]] = add nuw i64 [[NITER]], 2
 ; CHECK-NEXT:    [[NITER_NCMP_1:%.*]] = icmp eq i64 [[NITER_NEXT_1]], [[UNROLL_ITER]]
 ; CHECK-NEXT:    br i1 [[NITER_NCMP_1]], label [[MIDDLE_BLOCK_UNR_LCSSA_LOOPEXIT:%.*]], label [[VECTOR_BODY]]
 ; CHECK:       middle.block.unr-lcssa.loopexit:
diff --git a/llvm/test/Transforms/LoopUnroll/PowerPC/p8-unrolling-legalize-vectors.ll b/llvm/test/Transforms/LoopUnroll/PowerPC/p8-unrolling-legalize-vectors.ll
index c2a8675f7ebbab..9783c8ab03fd41 100644
--- a/llvm/test/Transforms/LoopUnroll/PowerPC/p8-unrolling-legalize-vectors.ll
+++ b/llvm/test/Transforms/LoopUnroll/PowerPC/p8-unrolling-legalize-vectors.ll
@@ -48,7 +48,7 @@ define ptr @f(ptr returned %s, i32 zeroext %x, i32 signext %k) local_unnamed_add
 ; CHECK-NEXT:    store <16 x i8> [[TMP12]], ptr [[TMP13]], align 1
 ; CHECK-NEXT:    [[INDEX_NEXT_1]] = add i64 [[INDEX]], 32
 ; CHECK-NEXT:    [[VEC_IND_NEXT13_1]] = add <16 x i32> [[VEC_IND12]], <i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32, i32 32>
-; CHECK-NEXT:    [[NITER_NEXT_1]] = add i64 [[NITER]], 2
+; CHECK-NEXT:    [[NITER_NEXT_1]] = add nuw i64 [[NITER]], 2
 ; CHECK-NEXT:    [[NITER_NCMP_1:%.*]] = icmp eq i64 [[NITER_NEXT_1]], [[UNROLL_ITER]]
 ; CHECK-NEXT:    br i1 [[NITER_NCMP_1]], label [[MIDDLE_BLOCK_UNR_LCSSA_LOOPEXIT:%.*]], label [[VECTOR_BODY]]
 ; CHECK:       middle.block.unr-lcssa.loopexit:
diff --git a/llvm/test/Transforms/LoopUnroll/WebAssembly/basic-unrolling.ll b/llvm/test/Transforms/LoopUnroll/WebAssembly/basic-unrolling.ll
index b59f05bd11f0d8..9546ce97e16e65 100644
--- a/llvm/test/Transforms/LoopUnroll/WebAssembly/basic-unrolling.ll
+++ b/llvm/test/Transforms/LoopUnroll/WebAssembly/basic-unrolling.ll
@@ -162,7 +162,7 @@ define hidden void @runtime(ptr nocapture %a, ptr nocapture readonly %b, ptr noc
 ; CHECK-NEXT:    [[ARRAYIDX2_1:%.*]] = getelementptr inbounds i32, ptr [[A]], i32 [[INC]]
 ; CHECK-NEXT:    store i32 [[MUL_1]], ptr [[ARRAYIDX2_1]], align 4
 ; CHECK-NEXT:    [[INC_1]] = add nuw i32 [[I_09]], 2
-; CHECK-NEXT:    [[NITER_NEXT_1]] = add i32 [[NITER]], 2
+; CHECK-NEXT:    [[NITER_NEXT_1]] = add nuw i32 [[NITER]], 2
 ; CHECK-NEXT:    [[NITER_NCMP_1:%.*]] = icmp eq i32 [[NITER_NEXT_1]], [[UNROLL_ITER]]
 ; CHECK-NEXT:    br i1 [[NITER_NCMP_1]], label [[FOR_COND_CLEANUP_LOOPEXIT_UNR_LCSSA]], label [[FOR_BODY]]
 ;
@@ -203,7 +203,7 @@ define hidden void @dont_unroll_call(ptr nocapture %a, ptr nocapture readonly %b
 ; CHECK-NEXT:    [[MUL:%.*]] = mul nsw i32 [[I1]], [[I]]
 ; CHECK-NEXT:    [[ARRAYIDX2:%.*]] = getelementptr inbounds i32, ptr [[A:%.*]], i32 [[I_013]]
 ; CHECK-NEXT:    store i32 [[MUL]], ptr [[ARRAYIDX2]], align 4
-; CHECK-NEXT:    call i32 (ptr, ...) @printf(ptr noundef nonnull dereferenceable(1) @.str, i32 [[I_013]], i32 [[MUL]])
+; CHECK-NEXT:    [[TMP0:%.*]] = call i32 (ptr, ...) @printf(ptr noundef nonnull dereferenceable(1) @.str, i32 [[I_013]], i32 [[MUL]])
 ; CHECK-NEXT:    [[INC]] = add nuw i32 [[I_013]], 1
 ; CHECK-NEXT:    [[EXITCOND_NOT:%.*]] = icmp eq i32 [[INC]], [[N]]
 ; CHECK-NEXT:    br i1 [[EXITCOND_NOT]], label [[FOR_COND_CLEANUP]], label [[FOR_BODY]]
diff --git a/llvm/test/Transforms/LoopUnroll/X86/znver3.ll b/llvm/test/Transforms/LoopUnroll/X86/znver3.ll
index 30389062a09678..7873bc8eef7f38 100644
--- a/llvm/test/Transforms/LoopUnroll/X86/znver3.ll
+++ b/llvm/test/Transforms/LoopUnroll/X86/znver3.ll
@@ -993,7 +993,7 @@ define i32 @test2(ptr %ary, i64 %n) "target-cpu"="znver3" {
 ; CHECK-NEXT:    [[DUMMY50_7:%.*]] = mul i32 [[DUMMY49_7]], [[DUMMY49_7]]
 ; CHECK-NEXT:    [[SUM_NEXT_7]] = add nsw i32 [[DUMMY50_7]], [[SUM_NEXT_6]]
 ; CHECK-NEXT:    [[INDVARS_IV_NEXT_7]] = add nuw nsw i64 [[INDVARS_IV]], 8
-; CHECK-NEXT:    [[NITER_NEXT_7]] = add i64 [[NITER]], 8
+; CHECK-NEXT:    [[NITER_NEXT_7]] = add nuw i64 [[NITER]], 8
 ; CHECK-NEXT:    [[NITER_NCMP_7:%.*]] = icmp eq i64 [[NITER_NEXT_7]], [[UNROLL_ITER]]
 ; CHECK-NEXT:    br i1 [[NITER_NCMP_7]], label [[FOR_COND_CLEANUP_UNR_LCSSA_LOOPEXIT:%.*]], label [[FOR_BODY]]
 ; CHECK:       for.cond.cleanup.unr-lcssa.loopexit:
diff --git a/llvm/test/Transforms/LoopUnroll/runtime-exit-phi-scev-invalidation.ll b/llvm/test/Transforms/LoopUnroll/runtime-exit-phi-scev-invalidation.ll
index 669306c8f3ab74..da590e03ab867a 100644
--- a/llvm/test/Transforms/LoopUnroll/runtime-exit-phi-scev-invalidation.ll
+++ b/llvm/test/Transforms/LoopUnroll/runtime-exit-phi-scev-invalidation.ll
@@ -60,7 +60,7 @@ define void @pr56282() {
 ; CHECK-NEXT:    [[C_1_7:%.*]] = icmp ugt i32 [[V_7]], 0
 ; CHECK-NEXT:    br i1 [[C_1_7]], label [[INNER_1_LATCH_7]], label [[EXIT_DEOPT_LOOPEXIT]]
 ; CHECK:       inner.1.latch.7:
-; CHECK-NEXT:    [[NITER_NEXT_7]] = add i64 [[NITER]], 8
+; CHECK-NEXT:    [[NITER_NEXT_7]] = add nuw i64 [[NITER]], 8
 ; CHECK-NEXT:    [[NITER_NCMP_7:%.*]] = icmp ne i64 [[NITER_NEXT_7]], [[UNROLL_ITER]]
 ; CHECK-NEXT:    br i1 [[NITER_NCMP_7]], label [[INNER_1_HEADER]], label [[OUTER_MIDDLE_UNR_LCSSA_LOOPEXIT:%.*]]
 ; CHECK:       outer.middle.unr-lcssa.loopexit:
diff --git a/llvm/test/Transforms/LoopUnroll/runtime-i128.ll b/llvm/test/Transforms/LoopUnroll/runtime-i128.ll
index 4cd8e7ca5d16f0..a964b24b1e3981 100644
--- a/llvm/test/Transforms/LoopUnroll/runtime-i128.ll
+++ b/llvm/test/Transforms/LoopUnroll/runtime-i128.ll
@@ -27,7 +27,7 @@ define void @test(i128 %n, i128 %m) {
 ; CHECK-NEXT:    call void @foo()
 ; CHECK-NEXT:    call void @foo()
 ; CHECK-NEXT:    [[IV_NEXT_7]] = add i128 [[IV]], 8
-; CHECK-NEXT:    [[NITER_NEXT_7]] = add i128 [[NITER]], 8
+; CHECK-NEXT:    [[NITER_NEXT_7]] = add nuw i128 [[NITER]], 8
 ; CHECK-NEXT:    [[NITER_NCMP_7:%.*]] = icmp ne i128 [[NITER_NEXT_7]], [[UNROLL_ITER]]
 ; CHECK-NEXT:    br i1 [[NITER_NCMP_7]], label [[LOOP]], label [[EXIT_UNR_LCSSA_LOOPEXIT:%.*]]
 ; CHECK:       exit.unr-lcssa.loopexit:
diff --git a/llvm/test/Transforms/LoopUnroll/runtime-loop-at-most-two-exits.ll b/llvm/test/Transforms/LoopUnroll/runtime-loop-at-most-two-exits.ll
index d5864e91c13e9a..5d1475e45382a3 100644
--- a/llvm/test/Transforms/LoopUnroll/runtime-loop-at-most-two-exits.ll
+++ b/llvm/test/Transforms/LoopUnroll/runtime-loop-at-most-two-exits.ll
@@ -69,7 +69,7 @@ define i32 @test(ptr nocapture %a, i64 %n) {
 ; ENABLED-NEXT:    [[TMP10:%.*]] = load i32, ptr [[ARRAYIDX_7]], align 4
 ; ENABLED-NEXT:    [[ADD_7]] = add nsw i32 [[TMP10]], [[ADD_6]]
 ; ENABLED-NEXT:    [[INDVARS_IV_NEXT_7]] = add i64 [[INDVARS_IV]], 8
-; ENABLED-NEXT:    [[NITER_NEXT_7]] = add i64 [[NITER]], 8
+; ENABLED-NEXT:    [[NITER_NEXT_7]] = add nuw i64 [[NITER]], 8
 ; ENABLED-NEXT:    [[NITER_NCMP_7:%.*]] = icmp eq i64 [[NITER_NEXT_7]], [[UNROLL_ITER]]
 ; ENABLED-NEXT:    br i1 [[NITER_NCMP_7]], label [[FOR_END_UNR_LCSSA_LOOPEXIT:%.*]], label [[HEADER]]
 ; ENABLED:       for.end.unr-lcssa.loopexit:
diff --git a/llvm/test/Transforms/LoopUnroll/runtime-loop-multiple-exits.ll b/llvm/test/Transforms/LoopUnroll/runtime-loop-multiple-exits.ll
index 6cad2e1c4a3246..549bc2d0f0fd8b 100644
--- a/llvm/test/Transforms/LoopUnroll/runtime-loop-multiple-exits.ll
+++ b/llvm/test/Transforms/LoopUnroll/runtime-loop-multiple-exits.ll
@@ -77,7 +77,7 @@ define void @test1(i64 %trip, i1 %cond) {
 ; EPILOG-NEXT:    br i1 false, label %loop_latch.7, label %exit3.loopexit
 ; EPILOG:       loop_latch.7:
 ; EPILOG-NEXT:    %iv_next.7 = add i64 %iv, 8
-; EPILOG-NEXT:    %niter.next.7 = add i64 %niter, 8
+; EPILOG-NEXT:    %niter.next.7 = add nuw i64 %niter, 8
 ; EPILOG-NEXT:    %niter.ncmp.7 = icmp ne i64 %niter.next.7, %unroll_iter
 ; EPILOG-NEXT:    br i1 %niter.ncmp.7, label %loop_header, label %exit2.loopexit.unr-lcssa.loopexit
 ; EPILOG:       exit1.loopexit:
@@ -144,7 +144,7 @@ define void @test1(i64 %trip, i1 %cond) {
 ; EPILOG-BLOCK-NEXT:    br i1 false, label %loop_latch.1, label %exit3.loopexit
 ; EPILOG-BLOCK:       loop_latch.1:
 ; EPILOG-BLOCK-NEXT:    %iv_next.1 = add i64 %iv, 2
-; EPILOG-BLOCK-NEXT:    %niter.next.1 = add i64 %niter, 2
+; EPILOG-BLOCK-NEXT:    %niter.next.1 = add nuw i64 %niter, 2
 ; EPILOG-BLOCK-NEXT:    %niter.ncmp.1 = icmp ne i64 %niter.next.1, %unroll_iter
 ; EPILOG-BLOCK-NEXT:    br i1 %niter.ncmp.1, label %loop_header, label %exit2.loopexit.unr-lcssa.loopexit, !llvm.loop !0
 ; EPILOG-BLOCK:       exit1.loopexit:
@@ -446,7 +446,7 @@ define i32 @test2(ptr nocapture %a, i64 %n) {
 ; EPILOG-NEXT:    %10 = load i32, ptr %arrayidx.7, align 4
 ; EPILOG-NEXT:    %add.7 = add nsw i32 %10, %add.6
 ; EPILOG-NEXT:    %indvars.iv.next.7 = add i64 %indvars.iv, 8
-; EPILOG-NEXT:    %niter.next.7 = add i64 %niter, 8
+; EPILOG-NEXT:    %niter.next.7 = add nuw i64 %niter, 8
 ; EPILOG-NEXT:    %niter.ncmp.7 = icmp eq i64 %niter.next.7, %unroll_iter
 ; EPILOG-NEXT:    br i1 %niter.ncmp.7, label %for.end.unr-lcssa.loopexit, label %header
 ; EPILOG:       for.end.unr-lcssa.loopexit:
@@ -527,7 +527,7 @@ define i32 @test2(ptr nocapture %a, i64 %n) {
 ; EPILOG-BLOCK-NEXT:    %4 = load i32, ptr %arrayidx.1, align 4
 ; EPILOG-BLOCK-NEXT:    %add.1 = add nsw i32 %4, %add
 ; EPILOG-BLOCK-NEXT:    %indvars.iv.next.1 = add i64 %indvars.iv, 2
-; EPILOG-BLOCK-NEXT:    %niter.next.1 = add i64 %niter, 2
+; EPILOG-BLOCK-NEXT:    %niter.next.1 = add nuw i64 %niter, 2
 ; EPILOG-BLOCK-NEXT:    %niter.ncmp.1 = icmp eq i64 %niter.next.1, %unroll_iter
 ; EPILOG-BLOCK-NEXT:    br i1 %niter.ncmp.1, label %for.end.unr-lcssa.loopexit, label %header, !llvm.loop !2
 ; EPILOG-BLOCK:       for.end.unr-lcssa.loopexit:
@@ -807,8 +807,8 @@ define void @test3(i64 %trip, i64 %add) {
 ; EPILOG-NEXT:    br i1 undef, label %loop_latch, label %loop_exiting_bb1
 ; EPILOG:       loop_exiting_bb1:
 ; EPILOG-NEXT:    switch i64 %sum, label %loop_latch [
-; EPILOG-NEXT:    i64 24, label %exit1.loopexit
-; EPILOG-NEXT:    i64 42, label %exit3.loopexit
+; EPILOG-NEXT:      i64 24, label %exit1.loopexit
+; EPILOG-NEXT:      i64 42, label %exit3.loopexit
 ; EPILOG-NEXT:    ]
 ; EPILOG:       exit3.loopexit:
 ; EPILOG-NEXT:    br label %exit3
@@ -821,61 +821,61 @@ define void @test3(i64 %trip, i64 %add) {
 ; EPILOG-NEXT:    br i1 undef, label %loop_latch.1, label %loop_exiting_bb1.1
 ; EPILOG:       loop_exiting_bb1.1:
 ; EPILOG-NEXT:    switch i64 %sum.next, label %loop_latch.1 [
-; EPILOG-NEXT:    i64 24, label %exit1.loopexit
-; EPILOG-NEXT:    i64 42, label %exit3.loopexit
+; EPILOG-NEXT:      i64 24, label %exit1.loopexit
+; EPILOG-NEXT:      i64 42, label %exit3.loopexit
 ; EPILOG-NEXT:    ]
 ; EPILOG:       loop_latch.1:
 ; EPILOG-NEXT:    %sum.next.1 = add i64 %sum.next, %add
 ; EPILOG-NEXT:    br i1 undef, label %loop_latch.2, label %loop_exiting_bb1.2
 ; EPILOG:       loop_exiting_bb1.2:
 ; EPILOG-NEXT:    switch i64 %sum.next.1, label %loop_latch.2 [
-; EPILOG-NEXT:    i64 24, label %exit1.loopexit
-; EPILOG-NEXT:    i64 42, label %exit3.loopexit
+; EPILOG-NEXT:      i64 24, label %exit1.loopexit
+; EPILOG-NEXT:      i64 42, label %exit3.loopexit
 ; EPILOG-NEXT:    ]
 ; EPILOG:       loop_latch.2:
 ; EPILOG-NEXT:    %sum.next.2 = add i64 %sum.next.1, %add
 ; EPILOG-NEXT:    br i1 undef, label %loop_latch.3, label %loop_exiting_bb1.3
 ; EPILOG:       loop_exiting_bb1.3:
 ; EPILOG-NEXT:    switch i64 %sum.next.2, label %loop_latch.3 [
-; EPILOG-NEXT:    i64 24, label %exit1.loopexit
-; EPILOG-NEXT:    i64 42, label %exit3.loopexit
+; EPILOG-NEXT:      i64 24, label %exit1.loopexit
+; EPILOG-NEXT:      i64 42, label %exit3.loopexit
 ; EPILOG-NEXT:    ]
 ; EPILOG:       loop_latch.3:
 ; EPILOG-NEXT:    %sum.next.3 = add i64 %sum.next.2, %add
 ; EPILOG-NEXT:    br i1 undef, label %loop_latch.4, label %loop_exiting_bb1.4
 ; EPILOG:       loop_exiting_bb1.4:
 ; EPILOG-NEXT:    switch i64 %sum.next.3, label %loop_latch.4 [
-; EPILOG-NEXT:    i64 24, label %exit1.loopexit
-; EPILOG-NEXT:    i64 42, label %exit3.loopexit
+; EPILOG-NEXT:      i64 24, label %exit1.loopexit
+; EPILOG-NEXT:      i64 42, label %exit3.loopexit
 ; EPILOG-NEXT:    ]
 ; EPILOG:       loop_latch.4:
 ; EPILOG-NEXT:    %sum.next.4 = add i64 %sum.next.3, %add
 ; EPILOG-NEXT:    br i1 undef, label %loop_latch.5, label %loop_exiting_bb1.5
 ; EPILOG:       loop_exiting_bb1.5:
 ; EPILOG-NEXT:    switch i64 %sum.next.4, label %loop_latch.5 [
-; EPILOG-NEXT:    i64 24, label %exit1.loopexit
-; EPILOG-NEXT:    i64 42, label %exit3.loopexit
+; EPILOG-NEXT:      i64 24, label %exit1.loopexit
+; EPILOG-NEXT:      i64 42, label %exit3.loopexit
 ; EPILOG-NEXT:    ]
 ; EPILOG:       loop_latch.5:
 ; EPILOG-NEXT:    %sum.next.5 = add i64 %sum.next.4, %add
 ; EPILOG-NEXT:    br i1 undef, label %loop_latch.6, label %loop_exiting_bb1.6
 ; EPILOG:       loop_exiting_bb1.6:
 ; EPILOG-NEXT:    switch i64 %sum.next.5, label %loop_latch.6 [
-; EPILOG-NEXT:    i64 24, label %exit1.loopexit
-; EPILOG-NEXT:    i64 42, label %exit3.loopexit
+; EPILOG-NEXT:      i64 24, label %exit1.loopexit
+; EPILOG-NEXT:      i64 42, label %exit3.loopexit
 ; EPILOG-NEXT:    ]
 ; EPILOG:       loop_latch.6:
 ; EPILOG-NEXT:    %sum.next.6 = add i64 %sum.next.5, %add
 ; EPILOG-NEXT:    br i1 undef, label %loop_latch.7, label %loop_exiting_bb1.7
 ; EPILOG:       loop_exiting_bb1.7:
 ; EPILOG-NEXT:    switch i64 %sum.next.6, label %loop_latch.7 [
-; EPILOG-NEXT:    i64 24, label %exit1.loopexit
-; EPILOG-NEXT:    i64 42, label %exit3.loopexit
+; EPILOG-NEXT:      i64 24, label %exit1.loopexit
+; EPILOG-NEXT:      i64 42, label %exit3.loopexit
 ; EPILOG-NEXT:    ]
 ; EPILOG:       loop_latch.7:
 ; EPILOG-NEXT:    %iv_next.7 = add nuw nsw i64 %iv, 8
 ; EPILOG-NEXT:    %sum.next.7 = add i64 %sum.next.6, %add
-; EPILOG-NEXT:    %niter.next.7 = add i64 %niter, 8
+; EPILOG-NEXT:    %niter.next.7 = add nuw i64 %niter, 8
 ; EPILOG-NEXT:    %niter.ncmp.7 = icmp ne i64 %niter.next.7, %unroll_iter
 ; EPILOG-NEXT:    br i1 %niter.ncmp.7, label %loop_header, label %exit2.loopexit.unr-lcssa.loopexit
 ; EPILOG:       exit1.loopexit:
@@ -902,8 +902,8 @@ define void @test3(i64 %trip, i64 %add) {
 ; EPILOG-NEXT:    br i1 undef, label %loop_latch.epil, label %loop_exiting_bb1.epil
 ; EPILOG:       loop_exiting_bb1.epil:
 ; EPILOG-NEXT:    switch i64 %sum.epil, label %loop_latch.epil [
-; EPILOG-NEXT:    i64 24, label %exit1.loopexit1
-; EPILOG-NEXT:    i64 42, label %exit3.loopexit2
+; EPILOG-NEXT:      i64 24, label %exit1.loopexit1
+; EPILOG-NEXT:      i64 42, label %exit3.loopexit2
 ; EPILOG-NEXT:    ]
 ; EPILOG:       loop_latch.epil:
 ; EPILOG-NEXT:    %iv_next.epil = add nuw nsw i64 %iv.epil, 1
@@ -934,8 +934,8 @@ define void @test3(i64 %trip, i64 %add) {
 ; EPILOG-BLOCK-NEXT:    br i1 undef, label %loop_latch, label %loop_exiting_bb1
 ; EPILOG-BLOCK:       loop_exiting_bb1:
 ; EPILOG-BLOCK-NEXT:    switch i64 %sum, label %loop_latch [
-; EPILOG-BLOCK-NEXT:    i64 24, label %exit1.loopexit
-; EPILOG-BLOCK-NEXT:    i64 42, label %exit3.loopexit
+; EPILOG-BLOCK-NEXT:      i64 24, label %exit1.loopexit
+; EPILOG-BLOCK-NEXT:      i64 42, label %exit3.loopexit
 ; EPILOG-BLOCK-NEXT:    ]
 ; EPILOG-BLOCK:       exit3.loopexit:
 ; EPILOG-BLOCK-NEXT:    br label %exit3
@@ -946,13 +946,13 @@ define void @test3(i64 %trip, i64 %add) {
 ; EPILOG-BLOCK-NEXT:    br i1 undef, label %loop_latch.1, label %loop_exiting_bb1.1
 ; EPILOG-BLOCK:       loop_exiting_bb1.1:
 ; EPILOG-BLOCK-NEXT:    switch i64 %sum.next, label %loop_latch.1 [
-; EPILOG-BLOCK-NEXT:    i64 24, label %exit1.loopexit
-; EPILOG-BLOCK-NEXT:    i64 42, label %exit3.loopexit
+; EPILOG-BLOCK-NEXT:      i64 24, label %exit1.loopexit
+; EPILOG-BLOCK-NEXT:      i64 42, label %exit3.loopexit
 ; EPILOG-BLOCK-NEXT:    ]
 ; EPILOG-BLOCK:       loop_latch.1:
 ; EPILOG-BLOCK-NEXT:    %iv_next.1 = add nuw nsw i64 %iv, 2
 ; EPILOG-BLOCK-NEXT:    %sum.next.1 = add i64 %sum.next, %add
-; EPILOG-BLOCK-NEXT:    %niter.next.1 = add i64 %niter, 2
+; EPILOG-BLOCK-NEXT:    %niter.next.1 = add nuw i64 %niter, 2
 ; EPILOG-BLOCK-NEXT:    %niter.ncmp.1 = icmp ne i64 %niter.next.1, %unroll_iter
 ; EPILOG-BLOCK-NEXT:    br i1 %niter.ncmp.1, label %loop_header, label %exit2.loopexit.unr-lcssa.loopexit, !llvm.loop !3
 ; EPILOG-BLOCK:       exit1.loopexit:
@@ -972,8 +972,8 @@ define void @test3(i64 %trip, i64 %add) {
 ; EPILOG-BLOCK-NEXT:    br i1 unde...
[truncated]

@@ -940,7 +940,7 @@ bool llvm::UnrollRuntimeLoopRemainder(
     B2.SetInsertPoint(LatchBR);
     auto *Zero = ConstantInt::get(NewIdx->getType(), 0);
     auto *One = ConstantInt::get(NewIdx->getType(), 1);
-    Value *IdxNext = B2.CreateAdd(NewIdx, One, NewIdx->getName() + ".next");
+    Value *IdxNext = B2.CreateNUWAdd(NewIdx, One, NewIdx->getName() + ".next");

This contradicts the comment just above:
"// Subtle: TestVal can be 0 if we wrapped when computing the trip count,
// thus we must compare the post-increment (wrapping) value."

So I don't believe it holds?

It should hold if we can prove TestVal != 0, but we should already get that in a bunch of places, so unless there's a pass ordering issue, adding it here might not be worthwhile.
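
To make the `TestVal == 0` concern concrete, here is a small sketch in plain C++ (not LLVM code), assuming 8-bit arithmetic and a counter incremented by 1 as in the patched source line. `CounterTrace` and `traceCounter` are hypothetical names, and the model only covers the case where the unrolled loop is actually entered.

```cpp
#include <cstdint>

struct CounterTrace {
  unsigned iterations;    // how many times the latch executed
  bool last_add_wrapped;  // whether the exiting increment overflowed 8 bits
};

// Hypothetical 8-bit model. `backedge_taken` is the backedge-taken count,
// so the trip count is backedge_taken + 1 computed in 8 bits (it wraps to 0
// when backedge_taken == 255). `count` is the unroll factor, must be nonzero.
CounterTrace traceCounter(std::uint8_t backedge_taken, std::uint8_t count) {
  std::uint8_t trip_count = static_cast<std::uint8_t>(backedge_taken + 1);
  std::uint8_t test_val =
      static_cast<std::uint8_t>(trip_count - trip_count % count);
  std::uint8_t idx = 0;
  CounterTrace trace{0, false};
  for (;;) {
    unsigned wide = idx + 1u;                             // exact sum
    std::uint8_t next = static_cast<std::uint8_t>(wide);  // wrapping sum
    ++trace.iterations;
    if (next == test_val) {  // compare the post-increment value
      trace.last_add_wrapped = wide > 0xFF;
      return trace;
    }
    idx = next;
  }
}
```

With `backedge_taken = 255` the trip count wraps to 0, `test_val` is 0, and the loop only exits when the increment itself wraps back to 0. In that situation an `add nuw` would produce poison on the exiting iteration, which is the case the existing source comment warns about.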
