Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flang] run CFG conversion on omp reduction declare ops #84953

Merged
merged 3 commits into from
Mar 20, 2024

Conversation

tblah
Copy link
Contributor

@tblah tblah commented Mar 12, 2024

Most FIR passes only look for FIR operations inside of functions (either because they run only on func.func or they run on the module but iterate over functions internally). But there can also be FIR operations inside of fir.global, some OpenMP and OpenACC container operations.

This has worked so far for fir.global and OpenMP reductions because they only contained very simple FIR code which doesn't need most passes to be lowered into LLVM IR. I am not sure how OpenACC works.

In the long run, I hope to see a more systematic approach to making sure that every pass runs on all of these container operations. I will write an RFC for this soon.

In the meantime, this pass duplicates the CFG conversion pass to also run on omp reduction operations. This is similar to how the AbstractResult pass is already duplicated for fir.global operations.

OpenMP array reductions 2/6
Previous PR: #84952
Next PR: #84954

@llvmbot
Copy link
Collaborator

llvmbot commented Mar 12, 2024

@llvm/pr-subscribers-flang-fir-hlfir

@llvm/pr-subscribers-flang-driver

Author: Tom Eccles (tblah)

Changes

Most FIR passes only look for FIR operations inside of functions (either because they run only on func.func or they run on the module but iterate over functions internally). But there can also be FIR operations inside of fir.global, some OpenMP and OpenACC container operations.

This has worked so far for fir.global and OpenMP reductions because they only contained very simple FIR code which doesn't need most passes to be lowered into LLVM IR. I am not sure how OpenACC works.

In the long run, I hope to see a more systematic approach to making sure that every pass runs on all of these container operations. I will write an RFC for this soon.

In the meantime, this pass duplicates the CFG conversion pass to also run on omp reduction operations. This is similar to how the AbstractResult pass is already duplicated for fir.global operations.

OpenMP array reductions 2/6


Full diff: https://github.com/llvm/llvm-project/pull/84953.diff

15 Files Affected:

  • (modified) flang/include/flang/Optimizer/Transforms/Passes.h (+5-2)
  • (modified) flang/include/flang/Optimizer/Transforms/Passes.td (+10-2)
  • (modified) flang/include/flang/Tools/CLOptions.inc (+3-1)
  • (modified) flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp (+22-8)
  • (modified) flang/test/Driver/bbc-mlir-pass-pipeline.f90 (+4-1)
  • (modified) flang/test/Driver/mlir-debug-pass-pipeline.f90 (+9-7)
  • (modified) flang/test/Driver/mlir-pass-pipeline.f90 (+10-6)
  • (modified) flang/test/Fir/array-value-copy-2.fir (+2-2)
  • (modified) flang/test/Fir/basic-program.fir (+4-1)
  • (modified) flang/test/Fir/convert-to-llvm-openmp-and-fir.fir (+1-1)
  • (modified) flang/test/Fir/loop01.fir (+1-1)
  • (modified) flang/test/Fir/loop02.fir (+2-2)
  • (modified) flang/test/Lower/OpenMP/FIR/flush.f90 (+1-1)
  • (modified) flang/test/Lower/OpenMP/FIR/master.f90 (+1-1)
  • (modified) flang/test/Lower/OpenMP/FIR/parallel-sections.f90 (+1-1)
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.h b/flang/include/flang/Optimizer/Transforms/Passes.h
index e1d22c8c986da7..adf747ebdb400b 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.h
+++ b/flang/include/flang/Optimizer/Transforms/Passes.h
@@ -11,6 +11,7 @@
 
 #include "flang/Optimizer/Dialect/FIROps.h"
 #include "mlir/Dialect/LLVMIR/LLVMAttrs.h"
+#include "mlir/Dialect/OpenMP/OpenMPDialect.h"
 #include "mlir/Pass/Pass.h"
 #include "mlir/Pass/PassRegistry.h"
 #include <memory>
@@ -37,7 +38,8 @@ namespace fir {
 #define GEN_PASS_DECL_ANNOTATECONSTANTOPERANDS
 #define GEN_PASS_DECL_ARRAYVALUECOPY
 #define GEN_PASS_DECL_CHARACTERCONVERSION
-#define GEN_PASS_DECL_CFGCONVERSION
+#define GEN_PASS_DECL_CFGCONVERSIONONFUNC
+#define GEN_PASS_DECL_CFGCONVERSIONONREDUCTION
 #define GEN_PASS_DECL_EXTERNALNAMECONVERSION
 #define GEN_PASS_DECL_MEMREFDATAFLOWOPT
 #define GEN_PASS_DECL_SIMPLIFYINTRINSICS
@@ -53,7 +55,8 @@ std::unique_ptr<mlir::Pass> createAbstractResultOnGlobalOptPass();
 std::unique_ptr<mlir::Pass> createAffineDemotionPass();
 std::unique_ptr<mlir::Pass>
 createArrayValueCopyPass(fir::ArrayValueCopyOptions options = {});
-std::unique_ptr<mlir::Pass> createFirToCfgPass();
+std::unique_ptr<mlir::Pass> createFirToCfgOnFuncPass();
+std::unique_ptr<mlir::Pass> createFirToCfgOnReductionPass();
 std::unique_ptr<mlir::Pass> createCharacterConversionPass();
 std::unique_ptr<mlir::Pass> createExternalNameConversionPass();
 std::unique_ptr<mlir::Pass>
diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index 5fb576fd876254..e6ea92d814400f 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -145,7 +145,8 @@ def CharacterConversion : Pass<"character-conversion"> {
   ];
 }
 
-def CFGConversion : Pass<"cfg-conversion", "::mlir::func::FuncOp"> {
+class CFGConversionBase<string optExt, string operation>
+  : Pass<"cfg-conversion-on-" # optExt # "-opt", operation> {
   let summary = "Convert FIR structured control flow ops to CFG ops.";
   let description = [{
     Transform the `fir.do_loop`, `fir.if`, `fir.iterate_while` and
@@ -154,7 +155,6 @@ def CFGConversion : Pass<"cfg-conversion", "::mlir::func::FuncOp"> {
 
     This pass is required before code gen to the LLVM IR dialect.
   }];
-  let constructor = "::fir::createFirToCfgPass()";
   let dependentDialects = [
     "fir::FIROpsDialect", "mlir::func::FuncDialect"
   ];
@@ -165,6 +165,14 @@ def CFGConversion : Pass<"cfg-conversion", "::mlir::func::FuncOp"> {
   ];
 }
 
+def CFGConversionOnFunc : CFGConversionBase<"func", "mlir::func::FuncOp"> {
+  let constructor = "::fir::createFirToCfgOnFuncPass()";
+}
+
+def CFGConversionOnReduction : CFGConversionBase<"reduce", "mlir::omp::ReductionDeclareOp"> {
+  let constructor = "::fir::createFirToCfgOnReductionPass()";
+}
+
 def ExternalNameConversion : Pass<"external-name-interop", "mlir::ModuleOp"> {
   let summary = "Convert name for external interoperability";
   let description = [{
diff --git a/flang/include/flang/Tools/CLOptions.inc b/flang/include/flang/Tools/CLOptions.inc
index 68e504d0ccb512..21b36dba9ac704 100644
--- a/flang/include/flang/Tools/CLOptions.inc
+++ b/flang/include/flang/Tools/CLOptions.inc
@@ -123,7 +123,9 @@ static void addCanonicalizerPassWithoutRegionSimplification(
 
 inline void addCfgConversionPass(mlir::PassManager &pm) {
   addNestedPassConditionally<mlir::func::FuncOp>(
-      pm, disableCfgConversion, fir::createFirToCfgPass);
+      pm, disableCfgConversion, fir::createFirToCfgOnFuncPass);
+  addNestedPassConditionally<mlir::omp::ReductionDeclareOp>(
+      pm, disableCfgConversion, fir::createFirToCfgOnReductionPass);
 }
 
 inline void addAVC(
diff --git a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp
index 0944b184ca0d4e..f972f59749bea1 100644
--- a/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp
+++ b/flang/lib/Optimizer/Transforms/ControlFlowConverter.cpp
@@ -24,7 +24,8 @@
 #include "llvm/Support/CommandLine.h"
 
 namespace fir {
-#define GEN_PASS_DEF_CFGCONVERSION
+#define GEN_PASS_DEF_CFGCONVERSIONONFUNC
+#define GEN_PASS_DEF_CFGCONVERSIONONREDUCTION
 #include "flang/Optimizer/Transforms/Passes.h.inc"
 } // namespace fir
 
@@ -308,13 +309,14 @@ class CfgIterWhileConv : public mlir::OpRewritePattern<fir::IterWhileOp> {
 };
 
 /// Convert FIR structured control flow ops to CFG ops.
-class CfgConversion : public fir::impl::CFGConversionBase<CfgConversion> {
+template <typename Pass, template <typename> class PassBase>
+class CfgConversionTemplate : public PassBase<Pass> {
 public:
   void runOnOperation() override {
-    auto *context = &getContext();
+    auto *context = &this->getContext();
     mlir::RewritePatternSet patterns(context);
     patterns.insert<CfgLoopConv, CfgIfConv, CfgIterWhileConv>(
-        context, forceLoopToExecuteOnce);
+        context, this->forceLoopToExecuteOnce);
     mlir::ConversionTarget target(*context);
     target.addLegalDialect<mlir::affine::AffineDialect,
                            mlir::cf::ControlFlowDialect, FIROpsDialect,
@@ -323,19 +325,31 @@ class CfgConversion : public fir::impl::CFGConversionBase<CfgConversion> {
     // apply the patterns
     target.addIllegalOp<ResultOp, DoLoopOp, IfOp, IterWhileOp>();
     target.markUnknownOpDynamicallyLegal([](Operation *) { return true; });
-    if (mlir::failed(mlir::applyPartialConversion(getOperation(), target,
+    if (mlir::failed(mlir::applyPartialConversion(this->getOperation(), target,
                                                   std::move(patterns)))) {
       mlir::emitError(mlir::UnknownLoc::get(context),
                       "error in converting to CFG\n");
-      signalPassFailure();
+      this->signalPassFailure();
     }
   }
 };
+
+class CfgConversionOnFunc
+    : public CfgConversionTemplate<CfgConversionOnFunc,
+                                   ::fir::impl::CFGConversionOnFuncBase> {};
+
+class CfgConversionOnReduction
+    : public CfgConversionTemplate<CfgConversionOnReduction,
+                                   ::fir::impl::CFGConversionOnReductionBase> {
+};
 } // namespace
 
 /// Convert FIR's structured control flow ops to CFG ops.  This
 /// conversion enables the `createLowerToCFGPass` to transform these to CFG
 /// form.
-std::unique_ptr<mlir::Pass> fir::createFirToCfgPass() {
-  return std::make_unique<CfgConversion>();
+std::unique_ptr<mlir::Pass> fir::createFirToCfgOnFuncPass() {
+  return std::make_unique<CfgConversionOnFunc>();
+}
+std::unique_ptr<mlir::Pass> fir::createFirToCfgOnReductionPass() {
+  return std::make_unique<CfgConversionOnReduction>();
 }
diff --git a/flang/test/Driver/bbc-mlir-pass-pipeline.f90 b/flang/test/Driver/bbc-mlir-pass-pipeline.f90
index 243a620a9fd003..b25edf3599175e 100644
--- a/flang/test/Driver/bbc-mlir-pass-pipeline.f90
+++ b/flang/test/Driver/bbc-mlir-pass-pipeline.f90
@@ -38,9 +38,12 @@
 ! CHECK-NEXT:   (S) 0 num-cse'd - Number of operations CSE'd
 ! CHECK-NEXT:   (S) 0 num-dce'd - Number of operations DCE'd
 
+! CHECK-NEXT: Pipeline Collection : ['func.func', 'omp.reduction.declare']
 ! CHECK-NEXT: 'func.func' Pipeline
 ! CHECK-NEXT:   PolymorphicOpConversion
-! CHECK-NEXT:   CFGConversion
+! CHECK-NEXT:   CFGConversionOnFunc
+! CHECK-NEXT: 'omp.reduction.declare' Pipeline
+! CHECK-NEXT:   CFGConversionOnReduction
 
 ! CHECK-NEXT: SCFToControlFlow
 ! CHECK-NEXT: Canonicalizer
diff --git a/flang/test/Driver/mlir-debug-pass-pipeline.f90 b/flang/test/Driver/mlir-debug-pass-pipeline.f90
index 45b1717d7187db..a865b170c3d417 100644
--- a/flang/test/Driver/mlir-debug-pass-pipeline.f90
+++ b/flang/test/Driver/mlir-debug-pass-pipeline.f90
@@ -58,10 +58,12 @@
 ! ALL-NEXT:   (S) 0 num-cse'd - Number of operations CSE'd
 ! ALL-NEXT:   (S) 0 num-dce'd - Number of operations DCE'd
 
-! ALL-NEXT: 'func.func' Pipeline
-! ALL-NEXT:   PolymorphicOpConversion
-! ALL-NEXT:   CFGConversion
-
+! ALL-NEXT: Pipeline Collection : ['func.func', 'omp.reduction.declare']
+! ALL-NEXT:   'func.func' Pipeline
+! ALL-NEXT:     PolymorphicOpConversion
+! ALL-NEXT:     CFGConversionOnFunc
+! ALL-NEXT:   'omp.reduction.declare' Pipeline
+! ALL-NEXT:     CFGConversionOnReduction
 ! ALL-NEXT: SCFToControlFlow
 ! ALL-NEXT: Canonicalizer
 ! ALL-NEXT: SimplifyRegionLite
@@ -72,9 +74,9 @@
 
 ! ALL-NEXT: Pipeline Collection : ['fir.global', 'func.func']
 ! ALL-NEXT:   'fir.global' Pipeline
-! ALL-NEXT:    AbstractResultOnGlobalOpt
-! ALL-NEXT:  'func.func' Pipeline
-! ALL-NEXT:    AbstractResultOnFuncOpt
+! ALL-NEXT:   AbstractResultOnGlobalOpt
+! ALL-NEXT:   'func.func' Pipeline
+! ALL-NEXT:   AbstractResultOnFuncOpt
 
 ! ALL-NEXT: CodeGenRewrite
 ! ALL-NEXT:   (S) 0 num-dce'd - Number of operations eliminated
diff --git a/flang/test/Driver/mlir-pass-pipeline.f90 b/flang/test/Driver/mlir-pass-pipeline.f90
index 3d8c42f123e2eb..ee3f41a15ca481 100644
--- a/flang/test/Driver/mlir-pass-pipeline.f90
+++ b/flang/test/Driver/mlir-pass-pipeline.f90
@@ -1,8 +1,8 @@
 ! Test the MLIR pass pipeline
 
-! RUN: %flang_fc1 -S -mmlir --mlir-pass-statistics -mmlir --mlir-pass-statistics-display=pipeline -o /dev/null %s 2>&1 | FileCheck --check-prefixes=ALL %s
+! RUN: %flang_fc1 -S -mmlir --mlir-pass-statistics -mmlir --mlir-pass-statistics-display=pipeline -o /dev/null %s 2>&1 | FileCheck --check-prefixes=ALL,NOTO2 %s
 ! -O0 is the default:
-! RUN: %flang_fc1 -S -mmlir --mlir-pass-statistics -mmlir --mlir-pass-statistics-display=pipeline %s -O0 -o /dev/null 2>&1 | FileCheck --check-prefixes=ALL %s
+! RUN: %flang_fc1 -S -mmlir --mlir-pass-statistics -mmlir --mlir-pass-statistics-display=pipeline %s -O0 -o /dev/null 2>&1 | FileCheck --check-prefixes=ALL,NOTO2 %s
 ! RUN: %flang_fc1 -S -mmlir --mlir-pass-statistics -mmlir --mlir-pass-statistics-display=pipeline %s -O2 -o /dev/null 2>&1 | FileCheck --check-prefixes=ALL,O2 %s
 
 ! REQUIRES: asserts
@@ -49,11 +49,15 @@
 ! ALL-NEXT:   (S) 0 num-cse'd - Number of operations CSE'd
 ! ALL-NEXT:   (S) 0 num-dce'd - Number of operations DCE'd
 
-! ALL-NEXT: 'func.func' Pipeline
-! ALL-NEXT:   PolymorphicOpConversion
+! O2-NEXT:    'func.func' Pipeline
+! O2-NEXT:      PolymorphicOpConversion
 ! O2-NEXT:  AddAliasTags
-! O2-NEXT:  'func.func' Pipeline
-! ALL-NEXT:   CFGConversion
+! ALL-NEXT: Pipeline Collection : ['func.func', 'omp.reduction.declare']
+! ALL-NEXT:    'func.func' Pipeline
+! NOTO2-NEXT:      PolymorphicOpConversion
+! ALL-NEXT:      CFGConversionOnFunc
+! ALL-NEXT:   'omp.reduction.declare' Pipeline
+! ALL-NEXT:      CFGConversionOnReduction
 
 ! ALL-NEXT: SCFToControlFlow
 ! ALL-NEXT: Canonicalizer
diff --git a/flang/test/Fir/array-value-copy-2.fir b/flang/test/Fir/array-value-copy-2.fir
index 21b340af10c6b8..cb8d6ca2b05540 100644
--- a/flang/test/Fir/array-value-copy-2.fir
+++ b/flang/test/Fir/array-value-copy-2.fir
@@ -1,5 +1,5 @@
-// RUN: fir-opt --array-value-copy --cfg-conversion %s | FileCheck %s
-// RUN: fir-opt --array-value-copy="optimize-conflicts=true" --cfg-conversion %s | FileCheck %s
+// RUN: fir-opt --array-value-copy --cfg-conversion-on-func-opt %s | FileCheck %s
+// RUN: fir-opt --array-value-copy="optimize-conflicts=true" --cfg-conversion-on-func-opt %s | FileCheck %s
 
 // CHECK-LABEL: func @_QPslice1(
 // CHECK-NOT: fir.allocmem
diff --git a/flang/test/Fir/basic-program.fir b/flang/test/Fir/basic-program.fir
index d8a9e74c318ce1..516946bd4fb46a 100644
--- a/flang/test/Fir/basic-program.fir
+++ b/flang/test/Fir/basic-program.fir
@@ -60,8 +60,11 @@ func.func @_QQmain() {
 
 // PASSES-NEXT: AddAliasTags
 
+// PASSES-NEXT: Pipeline Collection : ['func.func', 'omp.reduction.declare']
 // PASSES-NEXT: 'func.func' Pipeline
-// PASSES-NEXT:   CFGConversion
+// PASSES-NEXT:   CFGConversionOnFunc
+// PASSES-NEXT: 'omp.reduction.declare' Pipeline
+// PASSES-NEXT:   CFGConversionOnReduction
 
 // PASSES-NEXT: SCFToControlFlow
 // PASSES-NEXT: Canonicalizer
diff --git a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
index a1fc614334dbce..8d9bdca4bf5446 100644
--- a/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
+++ b/flang/test/Fir/convert-to-llvm-openmp-and-fir.fir
@@ -1,4 +1,4 @@
-// RUN: fir-opt --split-input-file --cfg-conversion --fir-to-llvm-ir="target=aarch64-unknown-linux-gnu" %s | FileCheck %s
+// RUN: fir-opt --split-input-file --cfg-conversion-on-func-opt --fir-to-llvm-ir="target=aarch64-unknown-linux-gnu" %s | FileCheck %s
 
 func.func @_QPsb1(%arg0: !fir.ref<i32> {fir.bindc_name = "n"}, %arg1: !fir.ref<!fir.array<?xi32>> {fir.bindc_name = "arr"}) {
   %c1_i64 = arith.constant 1 : i64
diff --git a/flang/test/Fir/loop01.fir b/flang/test/Fir/loop01.fir
index 72ca1c3989e453..c849797b969eba 100644
--- a/flang/test/Fir/loop01.fir
+++ b/flang/test/Fir/loop01.fir
@@ -1,4 +1,4 @@
-// RUN: fir-opt --split-input-file --cfg-conversion %s | FileCheck %s
+// RUN: fir-opt --split-input-file --cfg-conversion-on-func-opt %s | FileCheck %s
 
 func.func @x(%lb : index, %ub : index, %step : index, %b : i1, %addr : !fir.ref<index>) {
   fir.do_loop %iv = %lb to %ub step %step unordered {
diff --git a/flang/test/Fir/loop02.fir b/flang/test/Fir/loop02.fir
index 50948e0e7aa6b5..8918666f0b3460 100644
--- a/flang/test/Fir/loop02.fir
+++ b/flang/test/Fir/loop02.fir
@@ -1,5 +1,5 @@
-// RUN: fir-opt --cfg-conversion="always-execute-loop-body=true" %s | FileCheck %s
-// RUN: fir-opt --cfg-conversion %s | FileCheck %s --check-prefix=NOOPT
+// RUN: fir-opt --cfg-conversion-on-func-opt="always-execute-loop-body=true" %s | FileCheck %s
+// RUN: fir-opt --cfg-conversion-on-func-opt %s | FileCheck %s --check-prefix=NOOPT
 
 func.func @x(%addr : !fir.ref<index>) {
   %bound = arith.constant 452 : index
diff --git a/flang/test/Lower/OpenMP/FIR/flush.f90 b/flang/test/Lower/OpenMP/FIR/flush.f90
index 2c281632b85cb0..2868367fbdba64 100644
--- a/flang/test/Lower/OpenMP/FIR/flush.f90
+++ b/flang/test/Lower/OpenMP/FIR/flush.f90
@@ -1,7 +1,7 @@
 ! This test checks lowering of OpenMP Flush Directive.
 
 !RUN: %flang_fc1 -emit-fir -flang-deprecated-no-hlfir -fopenmp %s -o - | FileCheck %s --check-prefixes="FIRDialect,OMPDialect"
-!RUN: %flang_fc1 -emit-fir -flang-deprecated-no-hlfir -fopenmp %s -o - | fir-opt --cfg-conversion | fir-opt --fir-to-llvm-ir | FileCheck %s --check-prefixes="LLVMIRDialect,OMPDialect"
+!RUN: %flang_fc1 -emit-fir -flang-deprecated-no-hlfir -fopenmp %s -o - | fir-opt --cfg-conversion-on-func-opt | fir-opt --fir-to-llvm-ir | FileCheck %s --check-prefixes="LLVMIRDialect,OMPDialect"
 
 subroutine flush_standalone(a, b, c)
     integer, intent(inout) :: a, b, c
diff --git a/flang/test/Lower/OpenMP/FIR/master.f90 b/flang/test/Lower/OpenMP/FIR/master.f90
index dd9910da2f4190..3bac582c7725a8 100644
--- a/flang/test/Lower/OpenMP/FIR/master.f90
+++ b/flang/test/Lower/OpenMP/FIR/master.f90
@@ -1,5 +1,5 @@
 !RUN: %flang_fc1 -emit-fir -flang-deprecated-no-hlfir -fopenmp %s -o - | FileCheck %s --check-prefixes="FIRDialect,OMPDialect"
-!RUN: %flang_fc1 -emit-fir -flang-deprecated-no-hlfir -fopenmp %s -o - | fir-opt --cfg-conversion | fir-opt --fir-to-llvm-ir | FileCheck %s --check-prefixes="OMPDialect"
+!RUN: %flang_fc1 -emit-fir -flang-deprecated-no-hlfir -fopenmp %s -o - | fir-opt --cfg-conversion-on-func-opt | fir-opt --fir-to-llvm-ir | FileCheck %s --check-prefixes="OMPDialect"
 
 !===============================================================================
 ! parallel construct with function call which has master construct internally
diff --git a/flang/test/Lower/OpenMP/FIR/parallel-sections.f90 b/flang/test/Lower/OpenMP/FIR/parallel-sections.f90
index 33fda178323f2b..78d73f038f0793 100644
--- a/flang/test/Lower/OpenMP/FIR/parallel-sections.f90
+++ b/flang/test/Lower/OpenMP/FIR/parallel-sections.f90
@@ -1,5 +1,5 @@
 !RUN: %flang_fc1 -emit-fir -flang-deprecated-no-hlfir -fopenmp %s -o - | FileCheck %s --check-prefixes="FIRDialect,OMPDialect"
-!RUN: %flang_fc1 -emit-fir -flang-deprecated-no-hlfir -fopenmp %s -o - | fir-opt --cfg-conversion | fir-opt --fir-to-llvm-ir | FileCheck %s --check-prefixes="OMPDialect,LLVMDialect"
+!RUN: %flang_fc1 -emit-fir -flang-deprecated-no-hlfir -fopenmp %s -o - | fir-opt --cfg-conversion-on-func-opt | fir-opt --fir-to-llvm-ir | FileCheck %s --check-prefixes="OMPDialect,LLVMDialect"
 
 !===============================================================================
 ! Parallel sections construct

@clementval
Copy link
Contributor

Wouldn't it be cleaner to expose the patterns via a populateFirCfgConversionPatterns function and reuse it in you extra pass instead of making two pass from the initial file?

We did this recently for the FirToLLVM patterns. #83492

@tblah
Copy link
Contributor Author

tblah commented Mar 14, 2024

Wouldn't it be cleaner to expose the patterns via a populateFirCfgConversionPatterns function and reuse it in you extra pass instead of making two pass from the initial file?

We did this recently for the FirToLLVM patterns. #83492

Thanks for taking a look at this. I have extracted the patterns into an externally visible function, but I don't understand the benefit of splitting the two CFG conversion passes out into different files. I think this could be confusing because it would be unclear which file one should contain the definition of those conversion patterns. Keeping it in one file makes it clear that both do exactly the same thing on different target operations.

@clementval
Copy link
Contributor

Wouldn't it be cleaner to expose the patterns via a populateFirCfgConversionPatterns function and reuse it in you extra pass instead of making two pass from the initial file?
We did this recently for the FirToLLVM patterns. #83492

Thanks for taking a look at this. I have extracted the patterns into an externally visible function, but I don't understand the benefit of splitting the two CFG conversion passes out into different files. I think this could be confusing because it would be unclear which file one should contain the definition of those conversion patterns. Keeping it in one file makes it clear that both do exactly the same thing on different target operations.

Wouldn't applying the patterns on the module in a single pass work here as well?

@tblah
Copy link
Contributor Author

tblah commented Mar 15, 2024

Wouldn't applying the patterns on the module in a single pass work here as well?

Yes that would work but we would loose parallelism. I wanted to keep that because there was talk in the past about breaking up the existing Module passes so that different functions (or other container operations) can be processed in parallel

@tblah tblah force-pushed the users/tblah/omp_array_reduction_1 branch from 636566e to 595cea7 Compare March 15, 2024 11:45
Base automatically changed from users/tblah/omp_array_reduction_1 to main March 15, 2024 11:46
tblah added a commit that referenced this pull request Mar 15, 2024
…4952)

Advise to place the alloca at the start of the first block of whichever
region (init or combiner) we are currently inside.

It probably isn't safe to put an alloca inside of a combiner region
because this will be executed multiple times. But that would be a bug to
fix in Lower/OpenMP.cpp, not here.

OpenMP array reductions 1/6
Next PR: #84953
@clementval
Copy link
Contributor

Wouldn't applying the patterns on the module in a single pass work here as well?

Yes that would work but we would loose parallelism. I wanted to keep that because there was talk in the past about breaking up the existing Module passes so that different functions (or other container operations) can be processed in parallel

Ok that makes sense. It would be nice to have a solution to for top level operations since we have more than two. Have you discuss with MLIR folks yet?

@tblah
Copy link
Contributor Author

tblah commented Mar 19, 2024

Wouldn't applying the patterns on the module in a single pass work here as well?

Yes that would work but we would loose parallelism. I wanted to keep that because there was talk in the past about breaking up the existing Module passes so that different functions (or other container operations) can be processed in parallel

Ok that makes sense. It would be nice to have a solution to for top level operations since we have more than two. Have you discuss with MLIR folks yet?

I have created an RFC here https://discourse.llvm.org/t/rfc-add-an-interface-for-top-level-container-operations/77807

Would it be okay to merge this pass as it is for now (there is already precedent in how fir.global is handled) and then fix everything together once a solution has been agreed upon?

@clementval
Copy link
Contributor

Wouldn't applying the patterns on the module in a single pass work here as well?

Yes that would work but we would loose parallelism. I wanted to keep that because there was talk in the past about breaking up the existing Module passes so that different functions (or other container operations) can be processed in parallel

Ok that makes sense. It would be nice to have a solution to for top level operations since we have more than two. Have you discuss with MLIR folks yet?

I have created an RFC here https://discourse.llvm.org/t/rfc-add-an-interface-for-top-level-container-operations/77807

Would it be okay to merge this pass as it is for now (there is already precedent in how fir.global is handled) and then fix everything together once a solution has been agreed upon?

Thanks for the RFC. I'm ok if you merge this for the time being.

tblah and others added 2 commits March 19, 2024 20:15
Most FIR passes only look for FIR operations inside of functions (either
because they run only on func.func or they run on the module but iterate
over functions internally). But there can also be FIR operations inside
of fir.global, some OpenMP and OpenACC container operations.

This has worked so far for fir.global and OpenMP reductions because they
only contained very simple FIR code which doesn't need most passes to be
lowered into LLVM IR. I am not sure how OpenACC works.

In the long run, I hope to see a more systematic approach to making sure
that every pass runs on all of these container operations. I will write
an RFC for this soon.

In the meantime, this pass duplicates the CFG conversion pass to also
run on omp reduction operations. This is similar to how the
AbstractResult pass is already duplicated for fir.global operations.

Co-authored-by: Mats Petersson <mats.petersson@arm.com>
@tblah tblah force-pushed the users/tblah/omp_array_reduction_2 branch from 30bcab6 to cdad77c Compare March 19, 2024 20:26
@tblah
Copy link
Contributor Author

tblah commented Mar 19, 2024

Thanks for the review.

I rebased on main so it is clearer to everyone what is part of this patch.

Copy link
Contributor

@kiranchandramohan kiranchandramohan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Please add a test before you submit.

@tblah tblah merged commit 1f1e094 into main Mar 20, 2024
3 of 4 checks passed
@tblah tblah deleted the users/tblah/omp_array_reduction_2 branch March 20, 2024 09:47
tblah added a commit that referenced this pull request Mar 20, 2024
OpenMP reduction declare operations can contain FIR code which needs to
be lowered to LLVM. With array reductions, these regions can contain
more complicated operations which need PreCGRewriting. A similar extra
case was already needed for fir::GlobalOp.

OpenMP array reductions 3/6
Previous PR: #84953
Next PR: #84955
chencha3 pushed a commit to chencha3/llvm-project that referenced this pull request Mar 23, 2024
Most FIR passes only look for FIR operations inside of functions (either
because they run only on func.func or they run on the module but iterate
over functions internally). But there can also be FIR operations inside
of fir.global, some OpenMP and OpenACC container operations.

This has worked so far for fir.global and OpenMP reductions because they
only contained very simple FIR code which doesn't need most passes to be
lowered into LLVM IR. I am not sure how OpenACC works.

In the long run, I hope to see a more systematic approach to making sure
that every pass runs on all of these container operations. I will write
an RFC for this soon.

In the meantime, this pass duplicates the CFG conversion pass to also
run on omp reduction operations. This is similar to how the
AbstractResult pass is already duplicated for fir.global operations.

OpenMP array reductions 2/6
Previous PR: llvm#84952
Next PR: llvm#84954

---------

Co-authored-by: Mats Petersson <mats.petersson@arm.com>
chencha3 pushed a commit to chencha3/llvm-project that referenced this pull request Mar 23, 2024
…84954)

OpenMP reduction declare operations can contain FIR code which needs to
be lowered to LLVM. With array reductions, these regions can contain
more complicated operations which need PreCGRewriting. A similar extra
case was already needed for fir::GlobalOp.

OpenMP array reductions 3/6
Previous PR: llvm#84953
Next PR: llvm#84955
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:driver flang:fir-hlfir flang:openmp flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants