Skip to content

[flang][OpenMP] Move privatizations out of sections #88191

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
May 6, 2024

Conversation

luporl
Copy link
Contributor

@luporl luporl commented Apr 9, 2024

Besides duplicating code, privatizing variables in every section
causes problems when synchronization barriers are used. This
happens because each section is executed by a given thread, which
will cause the program to hang if not all running threads execute
the barrier operation.

Fixes #72824

@llvmbot llvmbot added flang Flang issues not falling into any other category flang:fir-hlfir flang:openmp labels Apr 9, 2024
@llvmbot
Copy link
Member

llvmbot commented Apr 9, 2024

@llvm/pr-subscribers-flang-fir-hlfir

@llvm/pr-subscribers-flang-openmp

Author: Leandro Lupori (luporl)

Changes

Besides duplicating code, privatizing variables in every section
causes problems when synchronization barriers are used. This
happens because each section is executed by a given thread, which
will cause the program to hang if not all running threads execute
the barrier operation.

Fixes #72824


Patch is 33.30 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/88191.diff

4 Files Affected:

  • (modified) flang/lib/Lower/OpenMP/DataSharingProcessor.cpp (+13-81)
  • (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+50-15)
  • (modified) flang/test/Lower/OpenMP/FIR/sections.f90 (+17-39)
  • (modified) flang/test/Lower/OpenMP/sections.f90 (+23-53)
diff --git a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp
index e1aa7524e930d6..f7810ce60fe051 100644
--- a/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/DataSharingProcessor.cpp
@@ -32,9 +32,12 @@ void DataSharingProcessor::processStep1() {
 }
 
 void DataSharingProcessor::processStep2(mlir::Operation *op, bool isLoop) {
-  insPt = firOpBuilder.saveInsertionPoint();
-  copyLastPrivatize(op);
-  firOpBuilder.restoreInsertionPoint(insPt);
+  // 'sections' lastprivate is handled by genOMP()
+  if (!mlir::isa<mlir::omp::SectionsOp>(op)) {
+    insPt = firOpBuilder.saveInsertionPoint();
+    copyLastPrivatize(op);
+    firOpBuilder.restoreInsertionPoint(insPt);
+  }
 
   if (isLoop) {
     // push deallocs out of the loop
@@ -111,6 +114,10 @@ void DataSharingProcessor::collectSymbolsForPrivatization() {
 }
 
 bool DataSharingProcessor::needBarrier() {
+  // Emit implicit barrier to synchronize threads and avoid data races on
+  // initialization of firstprivate variables and post-update of lastprivate
+  // variables.
+  // Emit implicit barrier for linear clause. Maybe on somewhere else.
   for (const Fortran::semantics::Symbol *sym : privatizedSymbols) {
     if (sym->test(Fortran::semantics::Symbol::Flag::OmpFirstPrivate) &&
         sym->test(Fortran::semantics::Symbol::Flag::OmpLastPrivate))
@@ -120,13 +127,6 @@ bool DataSharingProcessor::needBarrier() {
 }
 
 void DataSharingProcessor::insertBarrier() {
-  // Emit implicit barrier to synchronize threads and avoid data races on
-  // initialization of firstprivate variables and post-update of lastprivate
-  // variables.
-  // FIXME: Emit barrier for lastprivate clause when 'sections' directive has
-  // 'nowait' clause. Otherwise, emit barrier when 'sections' directive has
-  // both firstprivate and lastprivate clause.
-  // Emit implicit barrier for linear clause. Maybe on somewhere else.
   if (needBarrier())
     firOpBuilder.create<mlir::omp::BarrierOp>(converter.getCurrentLocation());
 }
@@ -138,77 +138,7 @@ void DataSharingProcessor::insertLastPrivateCompare(mlir::Operation *op) {
     if (clause.id != llvm::omp::OMPC_lastprivate)
       continue;
     // TODO: Add lastprivate support for simd construct
-    if (mlir::isa<mlir::omp::SectionOp>(op)) {
-      if (&eval == &eval.parentConstruct->getLastNestedEvaluation()) {
-        // For `omp.sections`, lastprivatized variables occur in
-        // lexically final `omp.section` operation. The following FIR
-        // shall be generated for the same:
-        //
-        // omp.sections lastprivate(...) {
-        //  omp.section {...}
-        //  omp.section {...}
-        //  omp.section {
-        //      fir.allocate for `private`/`firstprivate`
-        //      <More operations here>
-        //      fir.if %true {
-        //          ^%lpv_update_blk
-        //      }
-        //  }
-        // }
-        //
-        // To keep code consistency while handling privatization
-        // through this control flow, add a `fir.if` operation
-        // that always evaluates to true, in order to create
-        // a dedicated sub-region in `omp.section` where
-        // lastprivate FIR can reside. Later canonicalizations
-        // will optimize away this operation.
-        if (!eval.lowerAsUnstructured()) {
-          auto ifOp = firOpBuilder.create<fir::IfOp>(
-              op->getLoc(),
-              firOpBuilder.createIntegerConstant(
-                  op->getLoc(), firOpBuilder.getIntegerType(1), 0x1),
-              /*else*/ false);
-          firOpBuilder.setInsertionPointToStart(&ifOp.getThenRegion().front());
-
-          const Fortran::parser::OpenMPConstruct *parentOmpConstruct =
-              eval.parentConstruct->getIf<Fortran::parser::OpenMPConstruct>();
-          assert(parentOmpConstruct &&
-                 "Expected a valid enclosing OpenMP construct");
-          const Fortran::parser::OpenMPSectionsConstruct *sectionsConstruct =
-              std::get_if<Fortran::parser::OpenMPSectionsConstruct>(
-                  &parentOmpConstruct->u);
-          assert(sectionsConstruct &&
-                 "Expected an enclosing omp.sections construct");
-          const Fortran::parser::OmpClauseList &sectionsEndClauseList =
-              std::get<Fortran::parser::OmpClauseList>(
-                  std::get<Fortran::parser::OmpEndSectionsDirective>(
-                      sectionsConstruct->t)
-                      .t);
-          for (const Fortran::parser::OmpClause &otherClause :
-               sectionsEndClauseList.v)
-            if (std::get_if<Fortran::parser::OmpClause::Nowait>(&otherClause.u))
-              // Emit implicit barrier to synchronize threads and avoid data
-              // races on post-update of lastprivate variables when `nowait`
-              // clause is present.
-              firOpBuilder.create<mlir::omp::BarrierOp>(
-                  converter.getCurrentLocation());
-          firOpBuilder.setInsertionPointToStart(&ifOp.getThenRegion().front());
-          lastPrivIP = firOpBuilder.saveInsertionPoint();
-          firOpBuilder.setInsertionPoint(ifOp);
-          insPt = firOpBuilder.saveInsertionPoint();
-        } else {
-          // Lastprivate operation is inserted at the end
-          // of the lexically last section in the sections
-          // construct
-          mlir::OpBuilder::InsertPoint unstructuredSectionsIP =
-              firOpBuilder.saveInsertionPoint();
-          mlir::Operation *lastOper = op->getRegion(0).back().getTerminator();
-          firOpBuilder.setInsertionPoint(lastOper);
-          lastPrivIP = firOpBuilder.saveInsertionPoint();
-          firOpBuilder.restoreInsertionPoint(unstructuredSectionsIP);
-        }
-      }
-    } else if (mlir::isa<mlir::omp::WsloopOp>(op)) {
+    if (mlir::isa<mlir::omp::WsloopOp>(op)) {
       // Update the original variable just before exiting the worksharing
       // loop. Conversion as follows:
       //
@@ -259,6 +189,8 @@ void DataSharingProcessor::insertLastPrivateCompare(mlir::Operation *op) {
       assert(loopIV && "loopIV was not set");
       firOpBuilder.create<fir::StoreOp>(op->getLoc(), v, loopIV);
       lastPrivIP = firOpBuilder.saveInsertionPoint();
+    } else if (mlir::isa<mlir::omp::SectionsOp>(op)) {
+      // Already handled by genOMP()
     } else {
       TODO(converter.getCurrentLocation(),
            "lastprivate clause in constructs other than "
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index 160ada379a0889..8bcf47d61deef5 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -671,14 +671,10 @@ static mlir::omp::SectionOp
 genSectionOp(Fortran::lower::AbstractConverter &converter,
              Fortran::semantics::SemanticsContext &semaCtx,
              Fortran::lower::pft::Evaluation &eval, bool genNested,
-             mlir::Location currentLocation,
-             const Fortran::parser::OmpClauseList &sectionsClauseList) {
-  // Currently only private/firstprivate clause is handled, and
-  // all privatization is done within `omp.section` operations.
+             mlir::Location currentLocation) {
   return genOpWithBody<mlir::omp::SectionOp>(
       OpWithBodyGenInfo(converter, semaCtx, currentLocation, eval)
-          .setGenNested(genNested)
-          .setClauses(&sectionsClauseList));
+          .setGenNested(genNested));
 }
 
 static mlir::omp::SingleOp
@@ -1974,6 +1970,7 @@ genOMP(Fortran::lower::AbstractConverter &converter,
           .v;
 
   // Parallel wrapper of PARALLEL SECTIONS construct
+  bool hasNowait = false;
   if (dir == llvm::omp::Directive::OMPD_parallel_sections) {
     genParallelOp(converter, symTable, semaCtx, eval,
                   /*genNested=*/false, currentLocation, sectionsClauseList,
@@ -1983,30 +1980,68 @@ genOMP(Fortran::lower::AbstractConverter &converter,
         std::get<Fortran::parser::OmpEndSectionsDirective>(sectionsConstruct.t);
     const auto &endSectionsClauseList =
         std::get<Fortran::parser::OmpClauseList>(endSectionsDirective.t);
-    ClauseProcessor(converter, semaCtx, endSectionsClauseList)
-        .processNowait(nowaitClauseOperand);
+    hasNowait = ClauseProcessor(converter, semaCtx, endSectionsClauseList)
+                    .processNowait(nowaitClauseOperand);
   }
 
+  // Insert privatizations before SECTIONS
+  symTable.pushScope();
+  DataSharingProcessor dsp(converter, semaCtx, sectionsClauseList, eval);
+  dsp.processStep1();
+
   // SECTIONS construct
-  genOpWithBody<mlir::omp::SectionsOp>(
+  auto sectionsOp = genOpWithBody<mlir::omp::SectionsOp>(
       OpWithBodyGenInfo(converter, semaCtx, currentLocation, eval)
           .setGenNested(false),
       /*reduction_vars=*/mlir::ValueRange(),
       /*reductions=*/nullptr, allocateOperands, allocatorOperands,
       nowaitClauseOperand);
 
+  std::optional<Clause> lastPrivateClause;
+  for (const Fortran::parser::OmpClause &clause : sectionsClauseList.v) {
+    if (std::holds_alternative<Fortran::parser::OmpClause::Lastprivate>(
+            clause.u)) {
+      lastPrivateClause = makeClause(clause, semaCtx);
+    }
+  }
+
   const auto &sectionBlocks =
       std::get<Fortran::parser::OmpSectionBlocks>(sectionsConstruct.t);
   auto &firOpBuilder = converter.getFirOpBuilder();
   auto ip = firOpBuilder.saveInsertionPoint();
-  for (const auto &[nblock, neval] :
-       llvm::zip(sectionBlocks.v, eval.getNestedEvaluations())) {
-    symTable.pushScope();
-    genSectionOp(converter, semaCtx, neval, /*genNested=*/true, currentLocation,
-                 sectionsClauseList);
-    symTable.popScope();
+  auto zippy = llvm::zip(sectionBlocks.v, eval.getNestedEvaluations());
+  auto it = zippy.begin(), next = it, end = zippy.end();
+  ++next;
+  for (; it != end; it = next, ++next) {
+    const auto &[nblock, neval] = *it;
+    mlir::omp::SectionOp sectionOp = genSectionOp(
+        converter, semaCtx, neval, /*genNested=*/true, currentLocation);
+    // For `omp.sections`, lastprivatized variables occur in
+    // lexically final `omp.section` operation.
+    if (next == end && lastPrivateClause) {
+      clause::Lastprivate &lastPrivate =
+          std::get<clause::Lastprivate>(lastPrivateClause.value().u);
+      firOpBuilder.setInsertionPoint(
+          sectionOp.getRegion().back().getTerminator());
+      mlir::OpBuilder::InsertPoint lastPrivIP =
+          converter.getFirOpBuilder().saveInsertionPoint();
+      for (const Object &obj : lastPrivate.v) {
+        Fortran::semantics::Symbol *sym = obj.id();
+        converter.copyHostAssociateVar(*sym, &lastPrivIP);
+      }
+    }
     firOpBuilder.restoreInsertionPoint(ip);
   }
+
+  // Perform DataSharingProcessor's step2 out of SECTIONS
+  firOpBuilder.setInsertionPointAfter(sectionsOp.getOperation());
+  dsp.processStep2(sectionsOp, false);
+  // Emit implicit barrier to synchronize threads and avoid data
+  // races on post-update of lastprivate variables when `nowait`
+  // clause is present.
+  if (hasNowait && lastPrivateClause)
+    firOpBuilder.create<mlir::omp::BarrierOp>(converter.getCurrentLocation());
+  symTable.popScope();
 }
 
 static void
diff --git a/flang/test/Lower/OpenMP/FIR/sections.f90 b/flang/test/Lower/OpenMP/FIR/sections.f90
index 7b313f3dc0b41f..9d8edb5c75d5ce 100644
--- a/flang/test/Lower/OpenMP/FIR/sections.f90
+++ b/flang/test/Lower/OpenMP/FIR/sections.f90
@@ -8,10 +8,10 @@
 !CHECK:   %[[COUNT:.*]] = fir.address_of(@_QFEcount) : !fir.ref<i32>
 !CHECK:   %[[ETA:.*]] = fir.alloca f32 {bindc_name = "eta", uniq_name = "_QFEeta"}
 !CHECK:   %[[CONST_1:.*]] = arith.constant 4 : i64
+!CHECK:   %[[PRIVATE_ETA:.*]] = fir.alloca f32 {bindc_name = "eta", pinned, uniq_name = "_QFEeta"}
+!CHECK:   %[[PRIVATE_DOUBLE_COUNT:.*]] = fir.alloca i32 {bindc_name = "double_count", pinned, uniq_name = "_QFEdouble_count"} 
 !CHECK:   omp.sections allocate(%[[CONST_1]] : i64 -> %0 : !fir.ref<i32>)  {
 !CHECK:     omp.section {
-!CHECK:       %[[PRIVATE_ETA:.*]] = fir.alloca f32 {bindc_name = "eta", pinned, uniq_name = "_QFEeta"}
-!CHECK:       %[[PRIVATE_DOUBLE_COUNT:.*]] = fir.alloca i32 {bindc_name = "double_count", pinned, uniq_name = "_QFEdouble_count"} 
 !CHECK:       %[[const:.*]] = arith.constant 5 : i32
 !CHECK:       fir.store %[[const]] to %[[COUNT]] : !fir.ref<i32>
 !CHECK:       %[[temp_count:.*]] = fir.load %[[COUNT]] : !fir.ref<i32>
@@ -22,8 +22,6 @@
 !CHECK:       omp.terminator
 !CHECK:     }
 !CHECK:     omp.section {
-!CHECK:       %[[PRIVATE_ETA:.*]] = fir.alloca f32 {bindc_name = "eta", pinned, uniq_name = "_QFEeta"}
-!CHECK:       %[[PRIVATE_DOUBLE_COUNT:.*]] = fir.alloca i32 {bindc_name = "double_count", pinned, uniq_name = "_QFEdouble_count"} 
 !CHECK:       %[[temp:.*]] = fir.load %[[PRIVATE_DOUBLE_COUNT]] : !fir.ref<i32>
 !CHECK:       %[[const:.*]] = arith.constant 1 : i32
 !CHECK:       %[[result:.*]] = arith.addi %[[temp]], %[[const]] : i32
@@ -31,8 +29,6 @@
 !CHECK:       omp.terminator
 !CHECK:     }
 !CHECK:     omp.section {
-!CHECK:       %[[PRIVATE_ETA:.*]] = fir.alloca f32 {bindc_name = "eta", pinned, uniq_name = "_QFEeta"}
-!CHECK:       %[[PRIVATE_DOUBLE_COUNT:.*]] = fir.alloca i32 {bindc_name = "double_count", pinned, uniq_name = "_QFEdouble_count"} 
 !CHECK:       %[[temp:.*]] = fir.load %[[PRIVATE_ETA]] : !fir.ref<f32>
 !CHECK:       %[[const:.*]] = arith.constant 7.000000e+00 : f32
 !CHECK:       %[[result:.*]] = arith.subf %[[temp]], %[[const]] {{.*}}: f32
@@ -79,11 +75,11 @@ program sample
 end program sample
 
 !CHECK: func @_QPfirstprivate(%[[ARG:.*]]: !fir.ref<f32> {fir.bindc_name = "alpha"}) {
+!CHECK:   %[[PRIVATE_ALPHA:.*]] = fir.alloca f32 {bindc_name = "alpha", pinned, uniq_name = "_QFfirstprivateEalpha"}
+!CHECK:   %[[temp:.*]] = fir.load %[[ARG]] : !fir.ref<f32>
+!CHECK:   fir.store %[[temp]] to %[[PRIVATE_ALPHA]] : !fir.ref<f32>
 !CHECK:   omp.sections {
 !CHECK:     omp.section  {
-!CHECK:         %[[PRIVATE_ALPHA:.*]] = fir.alloca f32 {bindc_name = "alpha", pinned, uniq_name = "_QFfirstprivateEalpha"}
-!CHECK:         %[[temp:.*]] = fir.load %[[ARG]] : !fir.ref<f32>
-!CHECK:         fir.store %[[temp]] to %[[PRIVATE_ALPHA]] : !fir.ref<f32>
 !CHECK:       omp.terminator
 !CHECK:     }
 !CHECK:     omp.terminator
@@ -114,10 +110,10 @@ subroutine firstprivate(alpha)
 subroutine lastprivate()
 	integer :: x
 !CHECK: %[[X:.*]] = fir.alloca i32 {bindc_name = "x", uniq_name = "_QFlastprivateEx"}
+!CHECK: %[[PRIVATE_X:.*]] = fir.alloca i32 {bindc_name = "x", pinned, uniq_name = "_QFlastprivateEx"}
 !CHECK: omp.sections   {
 	!$omp sections lastprivate(x)
 !CHECK: omp.section {
-!CHECK: %[[PRIVATE_X:.*]] = fir.alloca i32 {bindc_name = "x", pinned, uniq_name = "_QFlastprivateEx"}
 !CHECK: %[[const:.*]] = arith.constant 10 : i32
 !CHECK: %[[temp:.*]] = fir.load %[[PRIVATE_X]] : !fir.ref<i32>
 !CHECK: %[[result:.*]] = arith.muli %c10_i32, %[[temp]] : i32
@@ -127,16 +123,12 @@ subroutine lastprivate()
         !$omp section
             x = x * 10
 !CHECK: omp.section {
-!CHECK: %[[PRIVATE_X:.*]] = fir.alloca i32 {bindc_name = "x", pinned, uniq_name = "_QFlastprivateEx"}
 !CHECK: %[[temp:.*]] = fir.load %[[PRIVATE_X]] : !fir.ref<i32>
 !CHECK: %[[const:.*]] = arith.constant 1 : i32
 !CHECK: %[[result:.*]] = arith.addi %[[temp]], %[[const]] : i32
 !CHECK: fir.store %[[result]] to %[[PRIVATE_X]] : !fir.ref<i32>
-!CHECK: %[[true:.*]] = arith.constant true
-!CHECK: fir.if %[[true]] {
 !CHECK: %[[temp:.*]] = fir.load %[[PRIVATE_X]] : !fir.ref<i32>
 !CHECK: fir.store %[[temp]] to %[[X]] : !fir.ref<i32>
-!CHECK: }
 !CHECK: omp.terminator
 !CHECK: }
         !$omp section
@@ -145,13 +137,13 @@ subroutine lastprivate()
 !CHECK: }
     !$omp end sections
     
-!CHECK: omp.sections   {
-    !$omp sections firstprivate(x) lastprivate(x)
-!CHECK: omp.section {
 !CHECK: %[[PRIVATE_X:.*]] = fir.alloca i32 {bindc_name = "x", pinned, uniq_name = "_QFlastprivateEx"}
 !CHECK: %[[temp:.*]] = fir.load %[[X]] : !fir.ref<i32>
 !CHECK: fir.store %[[temp]] to %[[PRIVATE_X]] : !fir.ref<i32>
 !CHECK: omp.barrier
+!CHECK: omp.sections   {
+    !$omp sections firstprivate(x) lastprivate(x)
+!CHECK: omp.section {
 !CHECK: %[[const:.*]] = arith.constant 10 : i32
 !CHECK: %[[temp:.*]] = fir.load %[[PRIVATE_X]] : !fir.ref<i32>
 !CHECK: %[[result:.*]] = arith.muli %c10_i32, %[[temp]] : i32
@@ -161,19 +153,12 @@ subroutine lastprivate()
         !$omp section
             x = x * 10
 !CHECK: omp.section {
-!CHECK: %[[PRIVATE_X:.*]] = fir.alloca i32 {bindc_name = "x", pinned, uniq_name = "_QFlastprivateEx"}
-!CHECK: %[[temp:.*]] = fir.load %[[X]] : !fir.ref<i32>
-!CHECK: fir.store %[[temp]] to %[[PRIVATE_X]] : !fir.ref<i32>
-!CHECK: omp.barrier
 !CHECK: %[[temp:.*]] = fir.load %[[PRIVATE_X]] : !fir.ref<i32>
 !CHECK: %[[const:.*]] = arith.constant 1 : i32
 !CHECK: %[[result:.*]] = arith.addi %[[temp]], %[[const]] : i32
 !CHECK: fir.store %[[result]] to %[[PRIVATE_X]] : !fir.ref<i32>
-!CHECK: %[[true:.*]] = arith.constant true
-!CHECK: fir.if %true {
 !CHECK: %[[temp:.*]] = fir.load %[[PRIVATE_X]] : !fir.ref<i32>
 !CHECK: fir.store %[[temp]] to %[[X]] : !fir.ref<i32>
-!CHECK: }
 !CHECK: omp.terminator
 !CHECK: }
         !$omp section
@@ -182,13 +167,13 @@ subroutine lastprivate()
 !CHECK: }
     !$omp end sections
 
-!CHECK: omp.sections nowait {
-    !$omp sections firstprivate(x) lastprivate(x)
-!CHECK: omp.section {
 !CHECK: %[[PRIVATE_X:.*]] = fir.alloca i32 {bindc_name = "x", pinned, uniq_name = "_QFlastprivateEx"}
 !CHECK: %[[temp:.*]] = fir.load %[[X]] : !fir.ref<i32>
 !CHECK: fir.store %[[temp]] to %[[PRIVATE_X]] : !fir.ref<i32>
 !CHECK: omp.barrier
+!CHECK: omp.sections nowait {
+    !$omp sections firstprivate(x) lastprivate(x)
+!CHECK: omp.section {
 !CHECK: %[[const:.*]] = arith.constant 10 : i32
 !CHECK: %[[temp:.*]] = fir.load %[[PRIVATE_X]] : !fir.ref<i32>
 !CHECK: %[[result:.*]] = arith.muli %c10_i32, %[[temp]] : i32
@@ -198,31 +183,24 @@ subroutine lastprivate()
         !$omp section
             x = x * 10
 !CHECK: omp.section {
-!CHECK: %[[PRIVATE_X:.*]] = fir.alloca i32 {bindc_name = "x", pinned, uniq_name = "_QFlastprivateEx"}
-!CHECK: %[[temp:.*]] = fir.load %[[X]] : !fir.ref<i32>
-!CHECK: fir.store %[[temp]] to %[[PRIVATE_X]] : !fir.ref<i32>
-!CHECK: omp.barrier
 !CHECK: %[[temp:.*]] = fir.load %[[PRIVATE_X]] : !fir.ref<i32>
 !CHECK: %[[const:.*]] = arith.constant 1 : i32
 !CHECK: %[[result:.*]] = arith.addi %[[temp]], %[[const]] : i32
 !CHECK: fir.store %[[result]] to %[[PRIVATE_X]] : !fir.ref<i32>
-!CHECK: %[[true:.*]] = arith.constant true
-!CHECK: fir.if %true {
 !CHECK: %[[temp:.*]] = fir.load %[[PRIVATE_X]] : !fir.ref<i32>
 !CHECK: fir.store %[[temp]] to %[[X]] : !fir.ref<i32>
-!CHECK: omp.barrier
-!CHECK: }
 !CHECK: omp.terminator
 !CHECK: }
         !$omp section
             x = x + 1
 !CHECK: omp.terminator
 !CHECK: }
+!CHECK: omp.barrier
     !$omp end sections nowait
 
+!CHECK: %[[PRIVATE_X:.*]] = fir.alloca i32 {bindc_name = "x", pinned, uniq_name = "_QFlastprivateEx"}
 !CHECK: omp.sections {
 !CHECK: omp.section {
-!CHECK: %[[PRIVATE_X:.*]] = fir.alloca i32 {bindc_name = "x", pinned, uniq_name = "_QFlastprivateEx"}
 !CHECK: cf.br ^bb1
 !CHECK: ^bb1:  // pred: ^bb0
 !CHECK: %[[INNER_PRIVATE_X:.*]] = fir.load %[[PRIVATE_X]] : !fir.ref<i32>
@@ -247,9 +225,9 @@ subroutine lastprivate()
 
 subroutine unstructured_sections_privatization()
 !CHECK: %[[X:.*]] = fir.alloca f32 {bindc_name = "x", uniq_name = "_QFunstructured_sections_privatizationEx"}
+!CHECK: %[[PRIVATE_X:.*]] = fir.alloca f32 {bindc_name = "x", pinned, uniq_name = "_QFunstructured_sections_privatizationEx"}
 !CHECK: omp.sections {
 !CHECK: omp.section {
-!CHECK: %[[PRIVATE_X:.*]] = fir.alloca f32 {bindc_name = "x", pinned, uniq_name = "_QFunstructured_sections_privatizationEx"}
 !CHECK: cf.br ^bb1
 !CHECK: ^bb1:  // pred: ^bb0
 !CHECK: %[[INNER_PRIVATE_X:.*]] = fir.load %[[PRIVATE_X]] : !fir.ref<f32>
@@ -265,11 +243,11 @@ subroutine unstructured_sections_privatization()
             goto 40
         40  x = x + 1
     !$omp end sections
-!CHECK: omp.sections {
-!CHECK: omp.section {
 !CHECK: %[[PRIVATE_X:.*]] = fir.alloca f32 {bindc_name = "x", pinned, uniq_name = "_QFunstructured_sections_privatizationEx"}
 !CHECK: %[[temp:.*]] = fir.load %[[X]] : !fir.ref<f32>
 !CHECK: fir.store %[[temp]] to %[[PRIVATE_X]] : !fir.ref<f32>
+!CHECK: omp.sections {
+!CHECK: omp.section {
 !CHECK: ...
[truncated]

@luporl luporl force-pushed the luporl-fix-omp-sections branch from 52f40d3 to c2c06a1 Compare April 15, 2024 19:49
Copy link
Contributor

@kiranchandramohan kiranchandramohan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG. Thanks for fixing this. Please wait for @kparzysz

@luporl luporl force-pushed the luporl-fix-omp-sections branch from c2c06a1 to 27e17d2 Compare April 17, 2024 22:18
@luporl luporl force-pushed the luporl-fix-omp-sections branch from 27e17d2 to 3ad0dd1 Compare April 26, 2024 14:06
@luporl
Copy link
Contributor Author

luporl commented Apr 26, 2024

@kparzysz, does it look good to you now?

@luporl
Copy link
Contributor Author

luporl commented May 6, 2024

gentle ping

Besides duplicating code, privatizing variables in every section
causes problems when synchronization barriers are used. This
happens because each section is executed by a given thread, which
will cause the program to hang if not all running threads execute
the barrier operation.

Fixes llvm#72824
@luporl luporl force-pushed the luporl-fix-omp-sections branch from 3ad0dd1 to 9564a35 Compare May 6, 2024 14:43
@luporl
Copy link
Contributor Author

luporl commented May 6, 2024

Thanks for the reviews @kparzysz and @kiranchandramohan.

@luporl luporl merged commit 6542e56 into llvm:main May 6, 2024
@luporl luporl deleted the luporl-fix-omp-sections branch May 6, 2024 16:14
@amd-thtsikas
Copy link

@luporl , please check your registered email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang:openmp flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Flang][OpenMP] Execution of sections construct with firstprivate and lastparivate clauses does not terminate
5 participants