[flang] Clean-up for fir.do_loop generation in lowering. #160630

vzakhari · 2025-09-25T02:28:58Z

This patch changes two things:

We do not need to use the loop counter's last value
for regular do-loops in Lowering.
The loop counter's increment is implied by fir.do_loop
operation, so there is no need to increment it explicitly.

The last point has been especially confusing to me, because it was
unclear why we have an explicit increment if it is implied.
It looks like CFGConversion somehow still makes the final code
correct, i.e. the counter is not incremented twice.
Anyway, the new lowering should look more concise.

This patch changes two things: 1. We do not need to use the loop counter's last value for regular do-loops in Lowering. 2. The loop counter's increment is implied by fir.do_loop operation, so there is no need to increment it explicitly. The last point has been especially confusing to me, because it was unclear why we have an explicit increment if it is implied. It looks like CFGConversion somehow still makes the final code correct, i.e. the counter is not incremented twice. Anyway, the new lowering should look more concise.

llvmbot · 2025-09-25T02:29:33Z

@llvm/pr-subscribers-flang-openmp
@llvm/pr-subscribers-flang-fir-hlfir

@llvm/pr-subscribers-openacc

Author: Slava Zakharin (vzakhari)

Changes

This patch changes two things:

We do not need to use the loop counter's last value
for regular do-loops in Lowering.
The loop counter's increment is implied by fir.do_loop
operation, so there is no need to increment it explicitly.

The last point has been especially confusing to me, because it was
unclear why we have an explicit increment if it is implied.
It looks like CFGConversion somehow still makes the final code
correct, i.e. the counter is not incremented twice.
Anyway, the new lowering should look more concise.

Patch is 49.30 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/160630.diff

19 Files Affected:

(modified) flang/lib/Lower/Bridge.cpp (+5-9)
(modified) flang/lib/Lower/IO.cpp (+3-3)
(modified) flang/test/Fir/convert-to-llvm-openmp-and-fir.fir (+6-9)
(modified) flang/test/Lower/OpenACC/acc-declare.f90 (+9-9)
(modified) flang/test/Lower/OpenMP/hlfir-seqloop-parallel.f90 (+8-8)
(modified) flang/test/Lower/OpenMP/parallel-private-clause-fixes.f90 (+4-5)
(modified) flang/test/Lower/OpenMP/sections-predetermined-private.f90 (+4-4)
(modified) flang/test/Lower/OpenMP/shared-loop.f90 (+12-12)
(modified) flang/test/Lower/OpenMP/wsloop-reduction-allocatable-array-minmax.f90 (+3-4)
(modified) flang/test/Lower/OpenMP/wsloop-variable.f90 (+3-4)
(modified) flang/test/Lower/do_loop.f90 (+28-35)
(modified) flang/test/Lower/do_loop_unstructured.f90 (+1-2)
(modified) flang/test/Lower/infinite_loop.f90 (+4-5)
(modified) flang/test/Lower/io-implied-do-fixes.f90 (+2-4)
(modified) flang/test/Lower/loops.f90 (+2-2)
(modified) flang/test/Lower/loops2.f90 (+4-4)
(modified) flang/test/Lower/mixed_loops.f90 (+4-5)
(modified) flang/test/Lower/nsw.f90 (+1-1)
(modified) flang/test/Transforms/OpenMP/simd-only.mlir (+2-2)

diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index 4a5b9885bb7c4..149e51b501a82 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -2544,7 +2544,7 @@ class FirConverter : public Fortran::lower::AbstractConverter {
         auto loopOp = fir::DoLoopOp::create(
             *builder, loc, lowerValue, upperValue, stepValue,
             /*unordered=*/false,
-            /*finalCountValue=*/true,
+            /*finalCountValue=*/false,
             builder->createConvert(loc, loopVarType, lowerValue));
         info.loopOp = loopOp;
         builder->setInsertionPointToStart(loopOp.getBody());
@@ -2696,22 +2696,18 @@ class FirConverter : public Fortran::lower::AbstractConverter {
         // Decrement tripVariable.
         auto doLoopOp = mlir::cast<fir::DoLoopOp>(info.loopOp);
         builder->setInsertionPointToEnd(doLoopOp.getBody());
-        llvm::SmallVector<mlir::Value, 2> results;
-        results.push_back(mlir::arith::AddIOp::create(
-            *builder, loc, doLoopOp.getInductionVar(), doLoopOp.getStep(),
-            iofAttr));
         // Step loopVariable to help optimizations such as vectorization.
         // Induction variable elimination will clean up as necessary.
         mlir::Value step = builder->createConvert(
             loc, info.getLoopVariableType(), doLoopOp.getStep());
         mlir::Value loopVar =
             fir::LoadOp::create(*builder, loc, info.loopVariable);
-        results.push_back(
-            mlir::arith::AddIOp::create(*builder, loc, loopVar, step, iofAttr));
-        fir::ResultOp::create(*builder, loc, results);
+        mlir::Value loopVarInc =
+            mlir::arith::AddIOp::create(*builder, loc, loopVar, step, iofAttr);
+        fir::ResultOp::create(*builder, loc, loopVarInc);
         builder->setInsertionPointAfter(doLoopOp);
         // The loop control variable may be used after the loop.
-        fir::StoreOp::create(*builder, loc, doLoopOp.getResult(1),
+        fir::StoreOp::create(*builder, loc, doLoopOp.getResult(0),
                              info.loopVariable);
         continue;
       }
diff --git a/flang/lib/Lower/IO.cpp b/flang/lib/Lower/IO.cpp
index 4ad2ac01334fa..98dc78f625b9e 100644
--- a/flang/lib/Lower/IO.cpp
+++ b/flang/lib/Lower/IO.cpp
@@ -977,9 +977,9 @@ static void genIoLoop(Fortran::lower::AbstractConverter &converter,
     fir::StoreOp::create(builder, loc, lcv, loopVar);
     genItemList(ioImpliedDo);
     builder.setInsertionPointToEnd(doLoopOp.getBody());
-    mlir::Value result = mlir::arith::AddIOp::create(
-        builder, loc, doLoopOp.getInductionVar(), doLoopOp.getStep(), iofAttr);
-    fir::ResultOp::create(builder, loc, result);
+    // fir.do_loop's induction variable's increment is implied,
+    // so we do not need to increment it explicitly.
+    fir::ResultOp::create(builder, loc, doLoopOp.getInductionVar());
     builder.setInsertionPointAfter(doLoopOp);
     // The loop control variable may be used after the loop.
     lcv = builder.createConvert(loc, fir::unwrapRefType(loopVar.getType()),
diff --git a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
index 24e5cad84b709..38d51110bbde3 100644
--- a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
+++ b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
@@ -349,7 +349,7 @@ func.func @_QPopenmp_target_data_region() {
     %3 = fir.convert %c1024_i32 : (i32) -> index
     %c1 = arith.constant 1 : index
     %4 = fir.convert %2 : (index) -> i32
-    %5:2 = fir.do_loop %arg0 = %2 to %3 step %c1 iter_args(%arg1 = %4) -> (index, i32) {
+    %5 = fir.do_loop %arg0 = %2 to %3 step %c1 iter_args(%arg1 = %4) -> (i32) {
       fir.store %arg1 to %1 : !fir.ref<i32>
       %6 = fir.load %1 : !fir.ref<i32>
       %7 = fir.load %1 : !fir.ref<i32>
@@ -358,13 +358,12 @@ func.func @_QPopenmp_target_data_region() {
       %9 = arith.subi %8, %c1_i64 : i64
       %10 = fir.coordinate_of %0, %9 : (!fir.ref<!fir.array<1024xi32>>, i64) -> !fir.ref<i32>
       fir.store %6 to %10 : !fir.ref<i32>
-      %11 = arith.addi %arg0, %c1 overflow<nsw> : index
       %12 = fir.convert %c1 : (index) -> i32
       %13 = fir.load %1 : !fir.ref<i32>
       %14 = arith.addi %13, %12 overflow<nsw> : i32
-      fir.result %11, %14 : index, i32
+      fir.result %14 : i32
     }
-    fir.store %5#1 to %1 : !fir.ref<i32>
+    fir.store %5 to %1 : !fir.ref<i32>
     omp.terminator
   }
   return
@@ -404,7 +403,6 @@ func.func @_QPopenmp_target_data_region() {
 // CHECK:             %[[VAL_21:.*]] = llvm.sub %[[VAL_19]], %[[VAL_20]]  : i64
 // CHECK:             %[[VAL_22:.*]] = llvm.getelementptr %[[VAL_1]][0, %[[VAL_21]]] : (!llvm.ptr, i64) -> !llvm.ptr
 // CHECK:             llvm.store %[[VAL_17]], %[[VAL_22]] : i32, !llvm.ptr
-// CHECK:             %[[VAL_23:.*]] = llvm.add %[[VAL_12]], %[[VAL_8]] overflow<nsw> : i64
 // CHECK:             %[[VAL_24:.*]] = llvm.trunc %[[VAL_8]] : i64 to i32
 // CHECK:             %[[VAL_25:.*]] = llvm.load %[[VAL_3]] : !llvm.ptr -> i32
 // CHECK:             %[[VAL_26:.*]] = llvm.add %[[VAL_25]], %[[VAL_24]] overflow<nsw> : i32
@@ -653,18 +651,17 @@ func.func @_QPsb() {
   omp.sections   {
     omp.section {
       %2 = fir.convert %c1 : (index) -> i32
-      %3:2 = fir.do_loop %arg0 = %c1 to %c10 step %c1 iter_args(%arg1 = %2) -> (index, i32) {
+      %3 = fir.do_loop %arg0 = %c1 to %c10 step %c1 iter_args(%arg1 = %2) -> (i32) {
         fir.store %arg1 to %0 : !fir.ref<i32>
         %4 = fir.load %1 : !fir.ref<i32>
         %5 = arith.addi %4, %c1_i32 : i32
         fir.store %5 to %1 : !fir.ref<i32>
-        %6 = arith.addi %arg0, %c1 : index
         %7 = fir.convert %c1 : (index) -> i32
         %8 = fir.load %0 : !fir.ref<i32>
         %9 = arith.addi %8, %7 : i32
-        fir.result %6, %9 : index, i32
+        fir.result %9 : i32
       }
-      fir.store %3#1 to %0 : !fir.ref<i32>
+      fir.store %3 to %0 : !fir.ref<i32>
       omp.terminator
     }
     omp.terminator
diff --git a/flang/test/Lower/OpenACC/acc-declare.f90 b/flang/test/Lower/OpenACC/acc-declare.f90
index edae0e6a4d37e..46c4365f23fd6 100644
--- a/flang/test/Lower/OpenACC/acc-declare.f90
+++ b/flang/test/Lower/OpenACC/acc-declare.f90
@@ -20,7 +20,7 @@ subroutine acc_declare_copy()
 ! CHECK: %[[DECL:.*]]:2 = hlfir.declare %[[ALLOCA]](%{{.*}}) {acc.declare = #acc.declare<dataClause =  acc_copy>, uniq_name = "_QMacc_declareFacc_declare_copyEa"} : (!fir.ref<!fir.array<100xi32>>, !fir.shape<1>) -> (!fir.ref<!fir.array<100xi32>>, !fir.ref<!fir.array<100xi32>>)
 ! CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%[[DECL]]#0 : !fir.ref<!fir.array<100xi32>>) -> !fir.ref<!fir.array<100xi32>> {dataClause = #acc<data_clause acc_copy>, name = "a"}
 ! CHECK: %[[TOKEN:.*]] = acc.declare_enter dataOperands(%[[COPYIN]] : !fir.ref<!fir.array<100xi32>>)
-! CHECK: %{{.*}}:2 = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%{{.*}} = %{{.*}}) -> (index, i32) {
+! CHECK: %{{.*}} = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%{{.*}} = %{{.*}}) -> (i32) {
 ! CHECK: }
 ! CHECK: acc.declare_exit token(%[[TOKEN]]) dataOperands(%[[COPYIN]] : !fir.ref<!fir.array<100xi32>>)
 ! CHECK: acc.copyout accPtr(%[[COPYIN]] : !fir.ref<!fir.array<100xi32>>) to varPtr(%[[DECL]]#0 : !fir.ref<!fir.array<100xi32>>) {dataClause = #acc<data_clause acc_copy>, name = "a"}
@@ -40,7 +40,7 @@ subroutine acc_declare_create()
 ! CHECK: %[[DECL:.*]]:2 = hlfir.declare %[[ALLOCA]](%{{.*}}) {acc.declare = #acc.declare<dataClause =  acc_create>, uniq_name = "_QMacc_declareFacc_declare_createEa"} : (!fir.ref<!fir.array<100xi32>>, !fir.shape<1>) -> (!fir.ref<!fir.array<100xi32>>, !fir.ref<!fir.array<100xi32>>)
 ! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DECL]]#0 : !fir.ref<!fir.array<100xi32>>) -> !fir.ref<!fir.array<100xi32>> {name = "a"}
 ! CHECK: %[[TOKEN:.*]] = acc.declare_enter dataOperands(%[[CREATE]] : !fir.ref<!fir.array<100xi32>>)
-! CHECK: %{{.*}}:2 = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%{{.*}} = %{{.*}}) -> (index, i32) {
+! CHECK: %{{.*}} = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%{{.*}} = %{{.*}}) -> (i32) {
 ! CHECK: }
 ! CHECK: acc.declare_exit token(%[[TOKEN]]) dataOperands(%[[CREATE]] : !fir.ref<!fir.array<100xi32>>)
 ! CHECK: acc.delete accPtr(%[[CREATE]] : !fir.ref<!fir.array<100xi32>>) {dataClause = #acc<data_clause acc_create>, name = "a"}
@@ -60,7 +60,7 @@ subroutine acc_declare_present(a)
 ! CHECK: %[[DECL:.*]]:2 = hlfir.declare %[[ARG0]](%{{.*}}) dummy_scope %{{[0-9]+}} {acc.declare = #acc.declare<dataClause =  acc_present>, uniq_name = "_QMacc_declareFacc_declare_presentEa"} : (!fir.ref<!fir.array<100xi32>>, !fir.shape<1>, !fir.dscope) -> (!fir.ref<!fir.array<100xi32>>, !fir.ref<!fir.array<100xi32>>)
 ! CHECK: %[[PRESENT:.*]] = acc.present varPtr(%[[DECL]]#0 : !fir.ref<!fir.array<100xi32>>) -> !fir.ref<!fir.array<100xi32>> {name = "a"}
 ! CHECK: %[[TOKEN:.*]] = acc.declare_enter dataOperands(%[[PRESENT]] : !fir.ref<!fir.array<100xi32>>)
-! CHECK: %{{.*}}:2 = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%arg{{.*}} = %{{.*}}) -> (index, i32)
+! CHECK: %{{.*}} = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%arg{{.*}} = %{{.*}}) -> (i32)
 ! CHECK: acc.declare_exit token(%[[TOKEN]]) dataOperands(%[[PRESENT]] : !fir.ref<!fir.array<100xi32>>)
 ! CHECK: acc.delete accPtr(%[[PRESENT]] : !fir.ref<!fir.array<100xi32>>) {dataClause = #acc<data_clause acc_present>, name = "a"}
 
@@ -81,7 +81,7 @@ subroutine acc_declare_copyin()
 ! CHECK: %[[COPYIN_A:.*]] = acc.copyin varPtr(%[[ADECL]]#0 : !fir.ref<!fir.array<100xi32>>) -> !fir.ref<!fir.array<100xi32>> {name = "a"}
 ! CHECK: %[[COPYIN_B:.*]] = acc.copyin varPtr(%[[BDECL]]#0 : !fir.ref<!fir.array<10xi32>>) -> !fir.ref<!fir.array<10xi32>> {dataClause = #acc<data_clause acc_copyin_readonly>, name = "b"}
 ! CHECK: acc.declare_enter dataOperands(%[[COPYIN_A]], %[[COPYIN_B]] : !fir.ref<!fir.array<100xi32>>, !fir.ref<!fir.array<10xi32>>)
-! CHECK: %{{.*}}:2 = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%arg{{.*}} = %{{.*}}) -> (index, i32)
+! CHECK: %{{.*}} = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%arg{{.*}} = %{{.*}}) -> (i32)
 ! CHECK: acc.delete accPtr(%[[COPYIN_A]] : !fir.ref<!fir.array<100xi32>>) {dataClause = #acc<data_clause acc_copyin>, name = "a"}
 ! CHECK: acc.delete accPtr(%[[COPYIN_B]] : !fir.ref<!fir.array<10xi32>>) {dataClause = #acc<data_clause acc_copyin_readonly>, name = "b"}
 
@@ -99,7 +99,7 @@ subroutine acc_declare_copyout()
 ! CHECK: %[[ADECL:.*]]:2 = hlfir.declare %[[A]](%{{.*}}) {acc.declare = #acc.declare<dataClause =  acc_copyout>, uniq_name = "_QMacc_declareFacc_declare_copyoutEa"} : (!fir.ref<!fir.array<100xi32>>, !fir.shape<1>) -> (!fir.ref<!fir.array<100xi32>>, !fir.ref<!fir.array<100xi32>>)
 ! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[ADECL]]#0 : !fir.ref<!fir.array<100xi32>>) -> !fir.ref<!fir.array<100xi32>> {dataClause = #acc<data_clause acc_copyout>, name = "a"}
 ! CHECK: %[[TOKEN:.*]] = acc.declare_enter dataOperands(%[[CREATE]] : !fir.ref<!fir.array<100xi32>>)
-! CHECK: %{{.*}}:2 = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%arg{{.*}} = %{{.*}}) -> (index, i32)
+! CHECK: %{{.*}} = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%arg{{.*}} = %{{.*}}) -> (i32)
 ! CHECK: acc.declare_exit token(%[[TOKEN]]) dataOperands(%[[CREATE]] : !fir.ref<!fir.array<100xi32>>)
 ! CHECK: acc.copyout accPtr(%[[CREATE]] : !fir.ref<!fir.array<100xi32>>) to varPtr(%[[ADECL]]#0 : !fir.ref<!fir.array<100xi32>>) {name = "a"}
 ! CHECK: return
@@ -118,7 +118,7 @@ subroutine acc_declare_deviceptr(a)
 ! CHECK: %[[DECL:.*]]:2 = hlfir.declare %[[ARG0]](%{{.*}}) dummy_scope %{{[0-9]+}} {acc.declare = #acc.declare<dataClause =  acc_deviceptr>, uniq_name = "_QMacc_declareFacc_declare_deviceptrEa"} : (!fir.ref<!fir.array<100xi32>>, !fir.shape<1>, !fir.dscope) -> (!fir.ref<!fir.array<100xi32>>, !fir.ref<!fir.array<100xi32>>)
 ! CHECK: %[[DEVICEPTR:.*]] = acc.deviceptr varPtr(%[[DECL]]#0 : !fir.ref<!fir.array<100xi32>>) -> !fir.ref<!fir.array<100xi32>> {name = "a"}
 ! CHECK: acc.declare_enter dataOperands(%[[DEVICEPTR]] : !fir.ref<!fir.array<100xi32>>)
-! CHECK: %{{.*}}:2 = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%arg{{.*}} = %{{.*}}) -> (index, i32)
+! CHECK: %{{.*}} = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%arg{{.*}} = %{{.*}}) -> (i32)
 
   subroutine acc_declare_link(a)
     integer :: a(100), i
@@ -134,7 +134,7 @@ subroutine acc_declare_link(a)
 ! CHECK: %[[DECL:.*]]:2 = hlfir.declare %[[ARG0]](%{{.*}}) dummy_scope %{{[0-9]+}} {acc.declare = #acc.declare<dataClause =  acc_declare_link>, uniq_name = "_QMacc_declareFacc_declare_linkEa"} : (!fir.ref<!fir.array<100xi32>>, !fir.shape<1>, !fir.dscope) -> (!fir.ref<!fir.array<100xi32>>, !fir.ref<!fir.array<100xi32>>)
 ! CHECK: %[[LINK:.*]] = acc.declare_link varPtr(%[[DECL]]#0 : !fir.ref<!fir.array<100xi32>>) -> !fir.ref<!fir.array<100xi32>> {name = "a"}
 ! CHECK: acc.declare_enter dataOperands(%[[LINK]] : !fir.ref<!fir.array<100xi32>>)
-! CHECK: %{{.*}}:2 = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%arg{{.*}} = %{{.*}}) -> (index, i32)
+! CHECK: %{{.*}} = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%arg{{.*}} = %{{.*}}) -> (i32)
 
   subroutine acc_declare_device_resident(a)
     integer :: a(100), i
@@ -150,7 +150,7 @@ subroutine acc_declare_device_resident(a)
 ! CHECK: %[[DECL:.*]]:2 = hlfir.declare %[[ARG0]](%{{.*}}) dummy_scope %{{[0-9]+}} {acc.declare = #acc.declare<dataClause =  acc_declare_device_resident>, uniq_name = "_QMacc_declareFacc_declare_device_residentEa"} : (!fir.ref<!fir.array<100xi32>>, !fir.shape<1>, !fir.dscope) -> (!fir.ref<!fir.array<100xi32>>, !fir.ref<!fir.array<100xi32>>)
 ! CHECK: %[[DEVICERES:.*]] = acc.declare_device_resident varPtr(%[[DECL]]#0 : !fir.ref<!fir.array<100xi32>>) -> !fir.ref<!fir.array<100xi32>> {name = "a"}
 ! CHECK: %[[TOKEN:.*]] = acc.declare_enter dataOperands(%[[DEVICERES]] : !fir.ref<!fir.array<100xi32>>)
-! CHECK: %{{.*}}:2 = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%arg{{.*}} = %{{.*}}) -> (index, i32)
+! CHECK: %{{.*}} = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%arg{{.*}} = %{{.*}}) -> (i32)
 ! CHECK: acc.declare_exit token(%[[TOKEN]]) dataOperands(%[[DEVICERES]] : !fir.ref<!fir.array<100xi32>>)
 ! CHECK: acc.delete accPtr(%[[DEVICERES]] : !fir.ref<!fir.array<100xi32>>) {dataClause = #acc<data_clause acc_declare_device_resident>, name = "a"}
 
@@ -279,7 +279,7 @@ subroutine acc_declare_multiple_directive(a, b)
 ! CHECK: %[[COPYIN:.*]] = acc.copyin varPtr(%[[DECL_A]]#0 : !fir.ref<!fir.array<100xi32>>) -> !fir.ref<!fir.array<100xi32>> {dataClause = #acc<data_clause acc_copy>, name = "a"}
 ! CHECK: %[[CREATE:.*]] = acc.create varPtr(%[[DECL_B]]#0 : !fir.ref<!fir.array<100xi32>>) -> !fir.ref<!fir.array<100xi32>> {dataClause = #acc<data_clause acc_copyout>, name = "b"}
 ! CHECK: acc.declare_enter dataOperands(%[[COPYIN]], %[[CREATE]] : !fir.ref<!fir.array<100xi32>>, !fir.ref<!fir.array<100xi32>>)
-! CHECK: %{{.*}}:{{.*}} = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%{{.*}} = %{{.*}}) -> (index, i32) {
+! CHECK: %{{.*}} = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%{{.*}} = %{{.*}}) -> (i32) {
 
 
 ! CHECK: acc.copyout accPtr(%[[CREATE]] : !fir.ref<!fir.array<100xi32>>) to varPtr(%[[DECL_B]]#0 : !fir.ref<!fir.array<100xi32>>) {name = "b"}
diff --git a/flang/test/Lower/OpenMP/hlfir-seqloop-parallel.f90 b/flang/test/Lower/OpenMP/hlfir-seqloop-parallel.f90
index 0c0d877a17b00..642b11bcd6b75 100644
--- a/flang/test/Lower/OpenMP/hlfir-seqloop-parallel.f90
+++ b/flang/test/Lower/OpenMP/hlfir-seqloop-parallel.f90
@@ -22,10 +22,10 @@ subroutine sb1
 !CHECK:    %[[I_DECL:.*]]:2 = hlfir.declare %[[I_ADDR]] {uniq_name = "_QFsb1Ei"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
 !CHECK:    omp.parallel private({{.*}} %[[I_DECL]]#0 -> %[[I_PVT_ADDR:.*]] : {{.*}}) {
 !CHECK:      %[[I_PVT_DECL:.*]]:2 = hlfir.declare %[[I_PVT_ADDR]] {uniq_name = "_QFsb1Ei"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
-!CHECK:      %[[I_FINAL_VAL:.*]]:2 = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%[[I_VAL:.*]] = %{{.*}}) -> (index, i32) {
+!CHECK:      %[[I_FINAL_VAL:.*]] = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%[[I_VAL:.*]] = %{{.*}}) -> (i32) {
 !CHECK:        fir.store %[[I_VAL]] to %[[I_PVT_DECL]]#0 : !fir.ref<i32>
 !CHECK:      }
-!CHECK:      fir.store %[[I_FINAL_VAL]]#1 to %[[I_PVT_DECL]]#0 : !fir.ref<i32>
+!CHECK:      fir.store %[[I_FINAL_VAL]] to %[[I_PVT_DECL]]#0 : !fir.ref<i32>
 !CHECK:      omp.terminator
 !CHECK:    }
 !CHECK:    return
@@ -58,20 +58,20 @@ subroutine sb2
 
 !CHECK:      %[[I_PVT_DECL:.*]]:2 = hlfir.declare %[[I_PVT_ADDR]] {uniq_name = "_QFsb2Ei"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>)
 
-!CHECK:      %[[FINAL_J_VAL:.*]]:2 = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%[[J_VAL:.*]] = %{{.*}}) -> (index, i32) {
+!CHECK:      %[[FINAL_J_VAL:.*]] = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%[[J_VAL:.*]] = %{{.*}}) -> (i32) {
 !CHECK:        fir.store %[[J_VAL]] to %[[J_PVT_DECL]]#0 : !fir.ref<i32>
 !CHECK:        fir.if %{{.*}} {
-!CHECK:          %[[FINAL_I_VAL:.*]]:2 = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%[[I_VAL:.*]] = %{{.*}}) -> (index, i32) {
+!CHECK:          %[[FINAL_I_VAL:.*]] = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%[[I_VAL:.*]] = %{{.*}}) -> (i32) {
 !CHECK:            fir.store %[[I_VAL]] to %[[I_PVT_DECL]]#0 : !fir.ref<i32>
 !CHECK:          }
-!CHECK:          fir.store %[[FINAL_I_VAL]]#1 to %[[I_PVT_DECL]]#0 : !fir.ref<i32>
+!CHECK:          fir.store %[[FINAL_I_VAL]] to %[[I_PVT_DECL]]#0 : !fir.ref<i32>
 !CHECK:        }
-!CHECK:        %[[FINAL_I_VAL:.*]]:2 = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%[[I_VAL:.*]] = %{{.*}}) -> (index, i32) {
+!CHECK:        %[[FINAL_I_VAL:.*]] = fir.do_loop %{{.*}} = %{{.*}} to %{{.*}} step %{{.*}} iter_args(%[[I_VAL:.*]] = %{{.*}}) -> (i32) {
 !CHECK:          fir.store %[[I_VAL]] to %[[I_PVT_DECL]]#0 : !fir.ref<i32>
 !CHECK:        }
-!CHECK:        fir.store %[[FINAL_I_VAL]]#1 to %[[I_PVT_DECL]]#0 : !fir.ref<i32>
+!CHECK:        fir.store %[[FINAL_I_VAL]] to %[[I_PVT_DECL]]#0 : !fir.ref<i32>
 !CHECK:      }
-!CHECK:      fir.store %[[FINAL_J_VAL]]#1 to %[[J_PVT_DECL]]#0 : !fir.ref<i32>
+!CHECK:      fir.store %[[FINAL_J_VAL]] to %[[J_PVT_DECL]]#0 : !fir.ref<i32>
 !CHECK:      omp.terminator
 !CHECK:    }
 !CHECK:    return
diff --git a/flang/test/Lower/OpenMP/parallel-private-clause-fixes.f90 b/flang/test/Lower/OpenMP/parallel-private-clause-fixes.f90
index 4d1023925fd88..3bb40834afe4c 100644
--- a/flang/test/Lower/OpenMP/parallel-private-clause-fixes.f90
+++ b/flang/test/Lower/OpenMP/parallel-private-clause-fixes.f90
@@ -59,21 +59,20 @@
 ! CHECK:               %[[VAL_10:.*]] = fir.convert %[[VAL_9]] : (i32) -> index
 ! CHECK:               %[[VAL_11:.*]] = arith.constant 1 : index
 ! CHECK:               %[[LB:.*]] = fir.convert %[[VAL_8]] : (index) -> i32
-! CHECK:               %[[VAL_12:.*]]:2 = fir.do_loop %[[VAL_13:[^ ]*]] =
+! CHECK:               %[[VAL_12:.*]] = fir.do_loop %[[VAL_13:[^ ]*]] =
 ! CHECK-SAME:              %[[VAL_8]] to %[[VAL_10]] step %[[VAL_11]]
-! CHECK-SAME:              iter_args(%[[IV:.*]] = %[[LB]]) -> (index, i32) {
+! CHECK-SAME:              iter_args(%[[IV:.*]] = %[[LB]]) -> (i32) {
 ! CHECK:                 fir.store %[[IV]] to %[[PRIV_J_DECL]]#0 : !fir.ref<i32>
 ! CHECK:                 %[[LOAD:.*]] = fir.load %[[PRIV_I_DECL]]#0 : !fir.ref<i32>
 ! CHECK:                 %[[VAL_15:.*]] = fir.load %[[PRIV_J_DECL]]#0 : !fir.ref<i32>
 ! CHECK:                 %[[VAL_16:.*]] = arith.addi %[[LOAD]], %[[VAL_15]] : i32
 ! CHECK:                 hlfir.assign %[[VAL_16]] to %[[PRIV_X_DECL]]#0 : i32, !fir.ref<i32>
-! CHECK:                 %[[VAL_17:.*]] = arith.addi %[[VAL_13]], %[[VAL_11]] overflow<nsw> : index
 ! CHECK:                 %[[STEPCAST:.*]] = fir.convert %[[VAL_11]] : (index) -> i32
 ! CHECK:                 %[[IVLOAD:.*]] = fir.load %[[PRIV_J_DECL]]#0 : !fir.ref<i32>
 ! CHECK:                 %[[IVINC:.*]] = arith.addi %[[I...
[truncated]

jeanPerier

Thanks

flang/lib/Lower/IO.cpp

tblah

LGTM. Thanks for taking the time to update all of those tests

rscottmanley · 2025-09-25T11:31:17Z

Maybe I'm misreading, but looking at the changes to the test, it looks like we're still yielding the original type instead of the primary iv. Why not yield the primary iv and convert the result after the loop? It's still incrementing the user variable?

jeanPerier · 2025-09-25T12:08:20Z

Maybe I'm misreading, but looking at the changes to the test, it looks like we're still yielding the original type instead of the primary iv. Why not yield the primary iv and convert the result after the loop? It's still incrementing the user variable?

Slava will know best. My understanding is that using the original variable type helps LLVM vectorizer: af7edf1

vzakhari · 2025-09-25T16:40:57Z

Maybe I'm misreading, but looking at the changes to the test, it looks like we're still yielding the original type instead of the primary iv. Why not yield the primary iv and convert the result after the loop? It's still incrementing the user variable?

Slava will know best. My understanding is that using the original variable type helps LLVM vectorizer: af7edf1

Yes, incrementing the user IV in the user type was crucial for helping LLVM vectorizer. Jean, Scott and I had some discussions about this today.

It seems that exposing the index loop counter too early in the Lowering does not give us a lot of benefit, and the following representation might be better:

      %3 = fir.do_loop %arg0 = %lb_in_user_type to %ub_in_user_type step %step_in_user_type -> (i32) {
        fir.store %arg0 to %user_iv_loc : !fir.ref<i32>
        ...
        fir.result %arg0 : i32
      }
      fir.store %3 to %user_iv_loc : !fir.ref<i32>

The fir.do_loop to CFG conversion, then should introduce an index loop counter to make sure that the loops with the final increment overflowing the IV user-type work properly.

I plan to merge this PR as-is, and I will experiment with the above lowering scheme.

This patch changes two things: 1. We do not need to use the loop counter's last value for regular do-loops in Lowering. 2. The loop counter's increment is implied by fir.do_loop operation, so there is no need to increment it explicitly. The last point has been especially confusing to me, because it was unclear why we have an explicit increment if it is implied. It looks like CFGConversion somehow still makes the final code correct, i.e. the counter is not incremented twice. Anyway, the new lowering should look more concise.

vzakhari requested review from clementval, jeanPerier, rscottmanley and tblah September 25, 2025 02:28

llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir flang:openmp openacc labels Sep 25, 2025

jeanPerier approved these changes Sep 25, 2025

View reviewed changes

flang/lib/Lower/IO.cpp Show resolved Hide resolved

tblah approved these changes Sep 25, 2025

View reviewed changes

rscottmanley approved these changes Sep 25, 2025

View reviewed changes

vzakhari merged commit d6e20c4 into llvm:main Sep 26, 2025
14 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[flang] Clean-up for fir.do_loop generation in lowering. #160630

[flang] Clean-up for fir.do_loop generation in lowering. #160630

Uh oh!

vzakhari commented Sep 25, 2025

Uh oh!

llvmbot commented Sep 25, 2025 •

edited

Loading

Uh oh!

jeanPerier left a comment

Uh oh!

Uh oh!

tblah left a comment

Uh oh!

rscottmanley commented Sep 25, 2025 •

edited

Loading

Uh oh!

jeanPerier commented Sep 25, 2025

Uh oh!

vzakhari commented Sep 25, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

[flang] Clean-up for fir.do_loop generation in lowering. #160630

[flang] Clean-up for fir.do_loop generation in lowering. #160630

Uh oh!

Conversation

vzakhari commented Sep 25, 2025

Uh oh!

llvmbot commented Sep 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jeanPerier left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

tblah left a comment

Choose a reason for hiding this comment

Uh oh!

rscottmanley commented Sep 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jeanPerier commented Sep 25, 2025

Uh oh!

vzakhari commented Sep 25, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

llvmbot commented Sep 25, 2025 •

edited

Loading

rscottmanley commented Sep 25, 2025 •

edited

Loading