Skip to content

[flang][OpenMP] Support user-defined declare reduction with derived types#190288

Merged
Jason-Van-Beusekom merged 1 commit into
llvm:mainfrom
MattPD:flang_176278_declare-reduction-derived
Apr 22, 2026
Merged

[flang][OpenMP] Support user-defined declare reduction with derived types#190288
Jason-Van-Beusekom merged 1 commit into
llvm:mainfrom
MattPD:flang_176278_declare-reduction-derived

Conversation

@MattPD
Copy link
Copy Markdown
Member

@MattPD MattPD commented Apr 2, 2026

Fix lowering of !$omp declare reduction for intrinsic operators applied
to user-defined derived types (e.g., + on type(t)). Previously, this
hit a TODO in ReductionProcessor::getReductionInitValue because the code
tried to compute an init value for a non-predefined type, when it should
instead use the initializer region from the DeclareReductionOp.

This fixes the issue #176278: [Flang][OpenMP] Compilation error when
type-list in declare reduction directive is derived type name.

The root cause was a naming mismatch: genOMP for
OpenMPDeclareReductionConstruct used a raw operator string (e.g., "Add")
as the reduction name, while processReductionArguments at the use site
computed a canonical name via getReductionName (e.g.,
"add_reduction_byref_rec__QFTt"). The lookupSymbol in
createDeclareReductionHelper never found the already-created op, so it
fell through to createDeclareReduction which called
getReductionInitValue with the derived type and hit the TODO.

The fix has four parts:

  1. Consistent names: In genOMP for OpenMPDeclareReductionConstruct,
    compute the reduction name using the same getReductionName scheme that
    processReductionArguments uses, so both sites produce identical symbol
    names. For intrinsic operators, this maps through ReductionIdentifier to
    get the canonical name. For user-defined named reductions, the raw symbol
    name is used directly, matching the existing custom-reduction lookup path.

  2. Reuse reduction: In processReductionArguments, when an intrinsic
    operator reduction is requested, check whether a user-defined declare
    reduction already exists under that canonical name before attempting to
    create a new one. If found, reuse it. This avoids calling
    createDeclareReduction (and thus getReductionInitValue) for types that
    have user-provided initializers.

  3. Reference semantics: Change doReductionByRef to return true for
    derived types. Previously it returned false for both trivial and derived
    types, treating derived types as by-val. This is incorrect for
    user-defined combiners that operate on components via side-effects (e.g.,
    omp_out%x = omp_out%x + omp_in%x): the combiner mutates omp_out in
    place and does not produce a whole-struct value, so convertExprToValue
    returns the component type (i32) rather than the struct type, causing a
    type mismatch in the omp.yield. By-ref is the correct model: the
    combiner stores into the lhs reference and yields it.

The combiner callback in processReductionCombiner is also updated to
handle the by-ref derived-type case: when the combiner result type does
not match the element type (as happens with component-level assignments),
the store is skipped since the assignment already wrote into omp_out as a
side-effect, and only the lhs reference is yielded.

  1. GPU codegen for non-descriptor by-ref reductions: The GPU reduction
    codegen in OMPIRBuilder unconditionally called the DataPtrPtrGen
    callback for all by-ref elements, assuming descriptor-based pointer
    indirection was always needed. For simple derived types (e.g., a struct
    with scalar integer members), there is no Fortran descriptor/box, so
    dataPtrPtrRegion is empty and DataPtrPtrGen is null. This caused a
    std::bad_function_call crash (or an assertion failure in
    getSingleElement when the empty region was inlined).

The fix returns a null DataPtrPtrGen callback from makeRefDataPtrGen
when dataPtrPtrRegion is empty, and guards all call sites in
OMPIRBuilder to skip descriptor indirection when DataPtrPtrGen is null.
For non-descriptor by-ref reductions, the pointer already references the
data directly--no indirection is needed.

Tests:

  • Update declare-reduction-intrinsic-op.f90 from a negative test
    (checking for the TODO error) to a positive test checking generated MLIR.
  • Update omp-declare-reduction-derivedtype.f90 CHECK lines to match the
    reference semantics fix.
  • Add declare-reduction-finalizer.f90 for derived type with init and
    finalization.

Remaining work: declare reduction without an initializer clause is not yet
supported.

Co-authored-by: Matt P. Dziubinski matt-p.dziubinski@hpe.com
Assisted-by: Claude Opus 4.6.

@llvmbot llvmbot added mlir:llvm mlir flang Flang issues not falling into any other category mlir:openmp flang:fir-hlfir flang:openmp clang:openmp OpenMP related changes to Clang labels Apr 2, 2026
@llvmbot
Copy link
Copy Markdown
Member

llvmbot commented Apr 2, 2026

@llvm/pr-subscribers-mlir-openmp
@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-llvm

Author: Matt (MattPD)

Changes

Fix lowering of !$omp declare reduction for intrinsic operators applied
to user-defined derived types (e.g., + on type(t)). Previously, this
hit a TODO in ReductionProcessor::getReductionInitValue because the code
tried to compute an init value for a non-predefined type, when it should
instead use the initializer region from the DeclareReductionOp.

This fixes the issue #176278: [Flang][OpenMP] Compilation error when
type-list in declare reduction directive is derived type name.

The root cause was a naming mismatch: genOMP for
OpenMPDeclareReductionConstruct used a raw operator string (e.g., "Add")
as the reduction name, while processReductionArguments at the use site
computed a canonical name via getReductionName (e.g.,
"add_reduction_byref_rec__QFTt"). The lookupSymbol in
createDeclareReductionHelper never found the already-created op, so it
fell through to createDeclareReduction which called
getReductionInitValue with the derived type and hit the TODO.

The fix has four parts:

  1. Consistent names: In genOMP for OpenMPDeclareReductionConstruct,
    compute the reduction name using the same getReductionName scheme that
    processReductionArguments uses, so both sites produce identical symbol
    names. For intrinsic operators, this maps through ReductionIdentifier to
    get the canonical name. For user-defined named reductions, the raw symbol
    name is used directly, matching the existing custom-reduction lookup path.

  2. Reuse reduction: In processReductionArguments, when an intrinsic
    operator reduction is requested, check whether a user-defined declare
    reduction already exists under that canonical name before attempting to
    create a new one. If found, reuse it. This avoids calling
    createDeclareReduction (and thus getReductionInitValue) for types that
    have user-provided initializers.

  3. Reference semantics: Change doReductionByRef to return true for
    derived types. Previously it returned false for both trivial and derived
    types, treating derived types as by-val. This is incorrect for
    user-defined combiners that operate on components via side-effects (e.g.,
    omp_out%x = omp_out%x + omp_in%x): the combiner mutates omp_out in
    place and does not produce a whole-struct value, so convertExprToValue
    returns the component type (i32) rather than the struct type, causing a
    type mismatch in the omp.yield. By-ref is the correct model: the
    combiner stores into the lhs reference and yields it.

The combiner callback in processReductionCombiner is also updated to
handle the by-ref derived-type case: when the combiner result type does
not match the element type (as happens with component-level assignments),
the store is skipped since the assignment already wrote into omp_out as a
side-effect, and only the lhs reference is yielded.

  1. GPU codegen for non-descriptor by-ref reductions: The GPU reduction
    codegen in OMPIRBuilder unconditionally called the DataPtrPtrGen
    callback for all by-ref elements, assuming descriptor-based pointer
    indirection was always needed. For simple derived types (e.g., a struct
    with scalar integer members), there is no Fortran descriptor/box, so
    dataPtrPtrRegion is empty and DataPtrPtrGen is null. This caused a
    std::bad_function_call crash (or an assertion failure in
    getSingleElement when the empty region was inlined).

The fix returns a null DataPtrPtrGen callback from makeRefDataPtrGen
when dataPtrPtrRegion is empty, and guards all call sites in
OMPIRBuilder to skip descriptor indirection when DataPtrPtrGen is null.
For non-descriptor by-ref reductions, the pointer already references the
data directly--no indirection is needed.

Tests:

  • Update declare-reduction-intrinsic-op.f90 from a negative test
    (checking for the TODO error) to a positive test checking generated MLIR.
  • Update omp-declare-reduction-derivedtype.f90 CHECK lines to match the
    reference semantics fix.
  • Add declare-reduction-finalizer.f90 for derived type with init and
    finalization.

Remaining work: declare reduction without an initializer clause is not yet
supported.

Co-authored-by: Matt P. Dziubinski <matt-p.dziubinski@hpe.com>
Assisted-by: Claude Opus 4.6.


Patch is 43.69 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/190288.diff

11 Files Affected:

  • (modified) flang/include/flang/Lower/OpenMP/Clauses.h (+11)
  • (modified) flang/include/flang/Lower/Support/ReductionProcessor.h (+7-4)
  • (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+13-12)
  • (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+151-30)
  • (modified) flang/lib/Lower/Support/PrivateReductionUtils.cpp (+26)
  • (modified) flang/lib/Lower/Support/ReductionProcessor.cpp (+25-10)
  • (added) flang/test/Lower/OpenMP/declare-reduction-finalizer.f90 (+86)
  • (modified) flang/test/Lower/OpenMP/declare-reduction-intrinsic-op.f90 (+25-2)
  • (modified) flang/test/Lower/OpenMP/omp-declare-reduction-derivedtype.f90 (+18-22)
  • (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+56-50)
  • (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+1-1)
diff --git a/flang/include/flang/Lower/OpenMP/Clauses.h b/flang/include/flang/Lower/OpenMP/Clauses.h
index a325e74327240..f334374280c73 100644
--- a/flang/include/flang/Lower/OpenMP/Clauses.h
+++ b/flang/include/flang/Lower/OpenMP/Clauses.h
@@ -329,6 +329,17 @@ using UsesAllocators = tomp::clause::UsesAllocatorsT<TypeTy, IdTy, ExprTy>;
 using Weak = tomp::clause::WeakT<TypeTy, IdTy, ExprTy>;
 using When = tomp::clause::WhenT<TypeTy, IdTy, ExprTy>;
 using Write = tomp::clause::WriteT<TypeTy, IdTy, ExprTy>;
+
+DefinedOperator makeDefinedOperator(const parser::DefinedOperator &inp,
+                                    semantics::SemanticsContext &semaCtx);
+
+ProcedureDesignator
+makeProcedureDesignator(const parser::ProcedureDesignator &inp,
+                        semantics::SemanticsContext &semaCtx);
+
+ReductionOperator
+makeReductionOperator(const parser::OmpReductionIdentifier &inp,
+                      semantics::SemanticsContext &semaCtx);
 } // namespace clause
 
 using tomp::type::operator==;
diff --git a/flang/include/flang/Lower/Support/ReductionProcessor.h b/flang/include/flang/Lower/Support/ReductionProcessor.h
index bd0447360f089..bbc4879bbe352 100644
--- a/flang/include/flang/Lower/Support/ReductionProcessor.h
+++ b/flang/include/flang/Lower/Support/ReductionProcessor.h
@@ -40,9 +40,11 @@ namespace omp {
 
 class ReductionProcessor {
 public:
-  using GenInitValueCBTy =
-      std::function<mlir::Value(fir::FirOpBuilder &builder, mlir::Location loc,
-                                mlir::Type type, mlir::Value ompOrig)>;
+  // ompOrig: mold/original variable
+  // ompPriv: private allocation (may be null for by-value reductions)
+  using GenInitValueCBTy = std::function<mlir::Value(
+      fir::FirOpBuilder &builder, mlir::Location loc, mlir::Type type,
+      mlir::Value ompOrig, mlir::Value ompPriv)>;
   using GenCombinerCBTy = std::function<void(
       fir::FirOpBuilder &builder, mlir::Location loc, mlir::Type type,
       mlir::Value op1, mlir::Value op2, bool isByRef)>;
@@ -126,7 +128,8 @@ class ReductionProcessor {
   static DeclareRedType createDeclareReductionHelper(
       AbstractConverter &converter, llvm::StringRef reductionOpName,
       mlir::Type type, mlir::Location loc, bool isByRef,
-      GenCombinerCBTy genCombinerCB, GenInitValueCBTy genInitValueCB);
+      GenCombinerCBTy genCombinerCB, GenInitValueCBTy genInitValueCB,
+      const semantics::Symbol *sym = nullptr);
 
   /// Creates an OpenMP reduction declaration and inserts it into the provided
   /// symbol table. The declaration has a constant initializer with the neutral
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index 85493bf45453e..e856fdf5c0fad 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -388,7 +388,8 @@ bool ClauseProcessor::processInitializer(
     ReductionProcessor::GenInitValueCBTy &genInitValueCB) const {
   if (auto *clause = findUniqueClause<omp::clause::Initializer>()) {
     genInitValueCB = [&, clause](fir::FirOpBuilder &builder, mlir::Location loc,
-                                 mlir::Type type, mlir::Value ompOrig) {
+                                 mlir::Type type, mlir::Value moldArg,
+                                 mlir::Value privArg) {
       lower::SymMapScope scope(symMap);
       mlir::Value ompPrivVar;
       const StylizedInstance &inst = clause->v.front();
@@ -396,9 +397,10 @@ bool ClauseProcessor::processInitializer(
       for (const Object &object :
            std::get<StylizedInstance::Variables>(inst.t)) {
         mlir::Value addr;
-        mlir::Type ompOrigType = ompOrig.getType();
+        std::string name = object.sym()->name().ToString();
+        mlir::Type moldArgType = moldArg.getType();
         // Check for unsupported dynamic-length character reductions
-        mlir::Type unwrappedType = fir::unwrapRefType(ompOrigType);
+        mlir::Type unwrappedType = fir::unwrapRefType(moldArgType);
         if (mlir::isa<fir::BoxCharType>(unwrappedType)) {
           TODO(loc, "OpenMP reduction allocation for dynamic length character");
         }
@@ -408,18 +410,20 @@ bool ClauseProcessor::processInitializer(
                  "OpenMP reduction allocation for dynamic length character");
           }
         }
-        // If ompOrig is already a reference, we can use it directly
-        if (fir::isa_ref_type(ompOrigType)) {
-          addr = ompOrig;
+        // For by-ref reductions, omp_priv maps to privArg (the private
+        // allocation) and omp_orig maps to moldArg (the original).
+        if (name == "omp_priv" && privArg) {
+          addr = privArg;
+        } else if (fir::isa_ref_type(moldArgType)) {
+          addr = moldArg;
         } else {
-          addr = builder.createTemporary(loc, ompOrigType);
-          fir::StoreOp::create(builder, loc, ompOrig, addr);
+          addr = builder.createTemporary(loc, moldArgType);
+          fir::StoreOp::create(builder, loc, moldArg, addr);
         }
         fir::FortranVariableFlagsEnum extraFlags = {};
         fir::FortranVariableFlagsAttr attributes =
             Fortran::lower::translateSymbolAttributes(
                 builder.getContext(), *object.sym(), extraFlags);
-        std::string name = object.sym()->name().ToString();
         // Get length parameters for types that need them (e.g., characters).
         // Note: DeclareOp requires exactly one type parameter for non-boxed
         // characters, unlike EmboxOp which doesn't allow them for constant-len.
@@ -451,9 +455,6 @@ bool ClauseProcessor::processInitializer(
               [&](const auto &expr) -> mlir::Value {
                 mlir::Value exprResult = fir::getBase(convertExprToValue(
                     loc, converter, initExpr, symMap, stmtCtx));
-                // Conversion can either give a value or a refrence to a value,
-                // we need to return the reduction type, so an optional load may
-                // be generated.
                 if (auto refType = llvm::dyn_cast<fir::ReferenceType>(
                         exprResult.getType()))
                   if (ompPrivVar.getType() == refType)
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index e2018add11206..5b449c8ba46fd 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -18,6 +18,7 @@
 #include "Decomposer.h"
 #include "Utils.h"
 #include "flang/Common/idioms.h"
+#include "flang/Evaluate/expression.h"
 #include "flang/Evaluate/tools.h"
 #include "flang/Evaluate/type.h"
 #include "flang/Lower/Bridge.h"
@@ -3796,14 +3797,27 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
 
 static ReductionProcessor::GenCombinerCBTy processReductionCombiner(
     lower::AbstractConverter &converter, lower::SymMap &symTable,
-    semantics::SemanticsContext &semaCtx, const clause::Combiner &combiner) {
+    semantics::SemanticsContext &semaCtx, const clause::Combiner &combiner,
+    const parser::OmpStylizedInstance &parserInst) {
+  // Extract the typed assignment from the parser-level instance, if
+  // the combiner is an assignment statement (as opposed to a call).
+  const evaluate::Assignment *assign = nullptr;
+  const auto &instance =
+      std::get<parser::OmpStylizedInstance::Instance>(parserInst.t);
+  if (const auto *assignStmt =
+          std::get_if<parser::AssignmentStmt>(&instance.u)) {
+    if (auto *wrapper = assignStmt->typedAssignment.get())
+      if (wrapper->v)
+        assign = &*wrapper->v;
+  }
   ReductionProcessor::GenCombinerCBTy genCombinerCB;
   const StylizedInstance &inst = combiner.v.front();
   semantics::SomeExpr evalExpr = std::get<StylizedInstance::Instance>(inst.t);
 
-  genCombinerCB = [&, evalExpr](fir::FirOpBuilder &builder, mlir::Location loc,
-                                mlir::Type type, mlir::Value lhs,
-                                mlir::Value rhs, bool isByRef) {
+  genCombinerCB = [&, evalExpr, assign](fir::FirOpBuilder &builder,
+                                        mlir::Location loc, mlir::Type type,
+                                        mlir::Value lhs, mlir::Value rhs,
+                                        bool isByRef) {
     lower::SymMapScope scope(symTable);
     mlir::Value ompOutVar;
     for (const Object &object : std::get<StylizedInstance::Variables>(inst.t)) {
@@ -3838,6 +3852,44 @@ static ReductionProcessor::GenCombinerCBTy processReductionCombiner(
       symTable.addVariableDefinition(*object.sym(), declareOp);
     }
 
+    // For derived types with a typed assignment available, use
+    // hlfir::AssignOp or user-defined assignment directly instead of
+    // trying to convert the expression to a value (which doesn't work
+    // for record types).  Only take this path when the assignment RHS
+    // itself is a derived type -- i.e. the combiner assigns to the whole
+    // derived-type variable (e.g. omp_out = mycombine(omp_out, omp_in)).
+    // When the combiner assigns to a component (e.g. omp_out%x = ...),
+    // the RHS is a scalar intrinsic type and the existing convertExprToValue
+    // path handles it correctly.
+    bool rhsIsDerived =
+        assign && assign->rhs.GetType() &&
+        assign->rhs.GetType()->category() == common::TypeCategory::Derived;
+    if (rhsIsDerived && isByRef &&
+        mlir::isa<fir::RecordType>(fir::unwrapRefType(lhs.getType()))) {
+      lower::StatementContext stmtCtx;
+      hlfir::Entity lhsEntity{ompOutVar};
+      hlfir::Entity rhsEntity = lower::convertExprToHLFIR(
+          loc, converter, assign->rhs, symTable, stmtCtx);
+      common::visit(
+          common::visitors{
+              [&](const evaluate::Assignment::Intrinsic &) {
+                hlfir::AssignOp::create(builder, loc, rhsEntity, lhsEntity);
+              },
+              [&](const evaluate::ProcedureRef &procRef) {
+                lower::convertUserDefinedAssignmentToHLFIR(
+                    loc, converter, procRef, lhsEntity, rhsEntity, symTable);
+              },
+              [&](const auto &) {
+                llvm_unreachable(
+                    "Unexpected assignment type in reduction combiner");
+              },
+          },
+          assign->u);
+      stmtCtx.finalizeAndPop();
+      mlir::omp::YieldOp::create(builder, loc, lhs);
+      return;
+    }
+
     lower::StatementContext stmtCtx;
     mlir::Value result = common::visit(
         common::visitors{
@@ -3845,6 +3897,10 @@ static ReductionProcessor::GenCombinerCBTy processReductionCombiner(
               convertCallToHLFIR(loc, converter, procRef, std::nullopt,
                                  symTable, stmtCtx);
               auto outVal = fir::LoadOp::create(builder, loc, ompOutVar);
+              if (isByRef) {
+                fir::StoreOp::create(builder, loc, outVal, lhs);
+                return mlir::Value{};
+              }
               return outVal;
             },
             [&](const auto &expr) -> mlir::Value {
@@ -3859,12 +3915,35 @@ static ReductionProcessor::GenCombinerCBTy processReductionCombiner(
                 if (expectedType == refType.getElementType())
                   exprResult = fir::LoadOp::create(builder, loc, exprResult);
               }
+              // For component-level derived-type combiners (e.g.
+              // omp_out%x = omp_out%x + omp_in%x), the assignment was
+              // not performed during expression lowering since
+              // convertExprToValue only evaluates the RHS value.
+              // The result type won't match the reduction variable type.
+              // Use the typed assignment LHS to store to the correct
+              // component, then skip the whole-variable store.
+              if (isByRef &&
+                  exprResult.getType() != fir::unwrapRefType(lhs.getType())) {
+                if (assign) {
+                  lower::StatementContext assignCtx;
+                  hlfir::Entity lhsEntity = lower::convertExprToHLFIR(
+                      loc, converter, assign->lhs, symTable, assignCtx);
+                  hlfir::AssignOp::create(builder, loc, exprResult, lhsEntity);
+                  assignCtx.finalizeAndPop();
+                } else {
+                  fir::StoreOp::create(builder, loc, exprResult, ompOutVar);
+                }
+                return mlir::Value{};
+              }
+              if (isByRef) {
+                fir::StoreOp::create(builder, loc, exprResult, lhs);
+                return mlir::Value{};
+              }
               return exprResult;
             }},
         evalExpr.u);
     stmtCtx.finalizeAndPop();
     if (isByRef) {
-      fir::StoreOp::create(builder, loc, result, lhs);
       mlir::omp::YieldOp::create(builder, loc, lhs);
     } else {
       mlir::omp::YieldOp::create(builder, loc, result);
@@ -3957,41 +4036,83 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
   const auto &identifier =
       std::get<parser::OmpReductionIdentifier>(specifier.t);
 
-  std::string reductionNameStr = Fortran::common::visit(
-      common::visitors{
-          [](const parser::ProcedureDesignator &pd) -> std::string {
-            return std::get<parser::Name>(pd.u).ToString();
-          },
-          [](const parser::DefinedOperator &defOp) -> std::string {
-            return Fortran::common::visit(
-                common::visitors{
-                    [](const parser::DefinedOpName &opName) -> std::string {
-                      return opName.v.ToString();
-                    },
-                    [](parser::DefinedOperator::IntrinsicOperator intrOp)
-                        -> std::string {
-                      return std::string(
-                          parser::DefinedOperator::EnumToString(intrOp));
-                    },
-                },
-                defOp.u);
-          },
-      },
-      identifier.u);
+  // Convert the parser-level reduction identifier to the clause-level
+  // representation, then use ReductionProcessor to derive the canonical name.
+  clause::ReductionOperator redOp =
+      clause::makeReductionOperator(identifier, semaCtx);
+
+  // Get the parser-level combiner expression so we can pass each
+  // parser::OmpStylizedInstance to processReductionCombiner.
+  // The combiner expression's instances correspond 1:1 to typeNameList entries.
+  const auto *combinerExpr = parser::omp::GetCombinerExpr(specifier);
+  assert(combinerExpr && "Expecting combiner expression");
+  auto parserInstIt = combinerExpr->v.begin();
 
   for (const auto &typeSpec : typeNameList.v) {
     (void)typeSpec; // Currently unused
+
+    assert(parserInstIt != combinerExpr->v.end() &&
+           "Mismatched combiner instance count");
+    const parser::OmpStylizedInstance &parserInst = *parserInstIt++;
+
     mlir::Type reductionType = getReductionType(converter, specifier);
+    bool isByRef = ReductionProcessor::doReductionByRef(reductionType);
+    // Compute the canonical reduction name the same way
+    // processReductionArguments does.
+    std::string reductionNameStr = Fortran::common::visit(
+        common::visitors{
+            [&](const clause::DefinedOperator &defOp) -> std::string {
+              return Fortran::common::visit(
+                  common::visitors{
+                      [&](const clause::DefinedOperator::IntrinsicOperator
+                              &intrOp) -> std::string {
+                        ReductionProcessor::ReductionIdentifier redId =
+                            ReductionProcessor::getReductionType(intrOp);
+                        return ReductionProcessor::getReductionName(
+                            redId, converter.getFirOpBuilder().getKindMap(),
+                            reductionType, isByRef);
+                      },
+                      [&](const clause::DefinedOperator::DefinedOpName &opName)
+                          -> std::string {
+                        return opName.v.sym()->name().ToString();
+                      },
+                  },
+                  defOp.u);
+            },
+            [&](const clause::ProcedureDesignator &pd) -> std::string {
+              return pd.v.sym()->name().ToString();
+            },
+        },
+        redOp.u);
+
     ReductionProcessor::GenCombinerCBTy genCombinerCB =
-        processReductionCombiner(converter, symTable, semaCtx, combiner);
+        processReductionCombiner(converter, symTable, semaCtx, combiner,
+                                 parserInst);
     ReductionProcessor::GenInitValueCBTy genInitValueCB;
     ClauseProcessor cp(converter, semaCtx, clauses);
     cp.processInitializer(symTable, genInitValueCB);
-    bool isByRef = ReductionProcessor::doReductionByRef(reductionType);
+    mlir::Type redType =
+        isByRef
+            ? static_cast<mlir::Type>(fir::ReferenceType::get(reductionType))
+            : reductionType;
+
+    // Get the omp_out symbol from the combiner for finalization checks
+    // in populateByRefInitAndCleanupRegions.
+    const semantics::Symbol *reductionSym = nullptr;
+    const auto &declList =
+        std::get<std::list<parser::OmpStylizedDeclaration>>(parserInst.t);
+    for (const auto &decl : declList) {
+      const auto &name = std::get<parser::ObjectName>(decl.var.t);
+      if (name.ToString() == "omp_out") {
+        reductionSym = name.symbol;
+        break;
+      }
+    }
+
     ReductionProcessor::createDeclareReductionHelper<
         mlir::omp::DeclareReductionOp>(
-        converter, reductionNameStr, reductionType,
-        converter.getCurrentLocation(), isByRef, genCombinerCB, genInitValueCB);
+        converter, reductionNameStr, redType, converter.getCurrentLocation(),
+        isByRef, genCombinerCB, genInitValueCB, reductionSym);
   }
 }
 
diff --git a/flang/lib/Lower/Support/PrivateReductionUtils.cpp b/flang/lib/Lower/Support/PrivateReductionUtils.cpp
index d1a965d288cad..6314d8c389954 100644
--- a/flang/lib/Lower/Support/PrivateReductionUtils.cpp
+++ b/flang/lib/Lower/Support/PrivateReductionUtils.cpp
@@ -137,6 +137,20 @@ static void createCleanupRegion(Fortran::lower::AbstractConverter &converter,
     return;
   }
 
+  // Handle unboxed derived types that need finalization (e.g. types with
+  // FINAL subroutines). Embox the reference and call the runtime destroy.
+  if (fir::isa_derived(valTy) && mlir::isa<fir::ReferenceType>(argType)) {
+    mlir::Type boxTy = fir::BoxType::get(valTy);
+    mlir::Value box =
+        fir::EmboxOp::create(builder, loc, boxTy, block->getArgument(0));
+    fir::runtime::genDerivedTypeDestroy(builder, loc, box);
+    if (isDoConcurrent)
+      fir::YieldOp::create(builder, loc);
+    else
+      mlir::omp::YieldOp::create(builder, loc);
+    return;
+  }
+
   typeError();
 }
 
@@ -604,6 +618,18 @@ void PopulateInitAndCleanupRegionsHelper::initAndCleanupBoxchar(
 void PopulateInitAndCleanupRegionsHelper::initAndCleanupUnboxedDerivedType(
     bool needsInitialization) {
   builder.setInsertionPointToStart(initBlock);
+  // For reductions with a user-provided init value, store it into the
+  // private variable. Insert after the init value's defining op to
+  // maintain SSA dominance (the init value was generated by the
+  // callback before populateByRefInitAndCleanupRegions was called).
+  if (scalarInitValue && isReduction(kind)) {
+    mlir::OpBuilder::InsertionGuard guard(builder);
+    if (auto *defOp = scalarInitValue.getDefiningOp())
+      builder.setInsertionPointAfter(defOp);
+    else
+      builder.setInsertionPointToEnd(initBlock);
+    fir::StoreOp::create(builder, loc, scalarInitValue, allocatedPrivVarArg);
+  }
   mlir::Type boxedTy = fir::BoxType::get(valType);
   mlir::Value newBox =
       fir::EmboxOp::create(builder, loc, boxedTy, allocatedPrivVarArg);
diff --git a/flang/lib/Lower/Support/ReductionProcessor.cpp b/flang/lib/Lower/Support/ReductionProcessor.cpp
index e0cba4c512258..36bd1f1ef2397 100644
--- a/flang/lib/Lower/Support/ReductionProcessor.cpp
+++ b/flang/lib/Lower/Support/ReductionProcessor.cpp
@@ -502,7 +502,7 @@ template <typename OpType>
 static void createReductionAllocAndInitRegions(
     AbstractConverter &converter, mlir::Location loc, OpType &reductionDecl,
     ReductionProcessor::GenInitValueCBTy genInitValueCB, mlir::Type type,
-    bool isByRef) {
+    bool isByRef, const Fortran::semantics::Symbol *sym)...
[truncated]

@llvmbot
Copy link
Copy Markdown
Member

llvmbot commented Apr 2, 2026

@llvm/pr-subscribers-flang-openmp

Author: Matt (MattPD)

Changes

Fix lowering of !$omp declare reduction for intrinsic operators applied
to user-defined derived types (e.g., + on type(t)). Previously, this
hit a TODO in ReductionProcessor::getReductionInitValue because the code
tried to compute an init value for a non-predefined type, when it should
instead use the initializer region from the DeclareReductionOp.

This fixes the issue #176278: [Flang][OpenMP] Compilation error when
type-list in declare reduction directive is derived type name.

The root cause was a naming mismatch: genOMP for
OpenMPDeclareReductionConstruct used a raw operator string (e.g., "Add")
as the reduction name, while processReductionArguments at the use site
computed a canonical name via getReductionName (e.g.,
"add_reduction_byref_rec__QFTt"). The lookupSymbol in
createDeclareReductionHelper never found the already-created op, so it
fell through to createDeclareReduction which called
getReductionInitValue with the derived type and hit the TODO.

The fix has four parts:

  1. Consistent names: In genOMP for OpenMPDeclareReductionConstruct,
    compute the reduction name using the same getReductionName scheme that
    processReductionArguments uses, so both sites produce identical symbol
    names. For intrinsic operators, this maps through ReductionIdentifier to
    get the canonical name. For user-defined named reductions, the raw symbol
    name is used directly, matching the existing custom-reduction lookup path.

  2. Reuse reduction: In processReductionArguments, when an intrinsic
    operator reduction is requested, check whether a user-defined declare
    reduction already exists under that canonical name before attempting to
    create a new one. If found, reuse it. This avoids calling
    createDeclareReduction (and thus getReductionInitValue) for types that
    have user-provided initializers.

  3. Reference semantics: Change doReductionByRef to return true for
    derived types. Previously it returned false for both trivial and derived
    types, treating derived types as by-val. This is incorrect for
    user-defined combiners that operate on components via side-effects (e.g.,
    omp_out%x = omp_out%x + omp_in%x): the combiner mutates omp_out in
    place and does not produce a whole-struct value, so convertExprToValue
    returns the component type (i32) rather than the struct type, causing a
    type mismatch in the omp.yield. By-ref is the correct model: the
    combiner stores into the lhs reference and yields it.

The combiner callback in processReductionCombiner is also updated to
handle the by-ref derived-type case: when the combiner result type does
not match the element type (as happens with component-level assignments),
the store is skipped since the assignment already wrote into omp_out as a
side-effect, and only the lhs reference is yielded.

  1. GPU codegen for non-descriptor by-ref reductions: The GPU reduction
    codegen in OMPIRBuilder unconditionally called the DataPtrPtrGen
    callback for all by-ref elements, assuming descriptor-based pointer
    indirection was always needed. For simple derived types (e.g., a struct
    with scalar integer members), there is no Fortran descriptor/box, so
    dataPtrPtrRegion is empty and DataPtrPtrGen is null. This caused a
    std::bad_function_call crash (or an assertion failure in
    getSingleElement when the empty region was inlined).

The fix returns a null DataPtrPtrGen callback from makeRefDataPtrGen
when dataPtrPtrRegion is empty, and guards all call sites in
OMPIRBuilder to skip descriptor indirection when DataPtrPtrGen is null.
For non-descriptor by-ref reductions, the pointer already references the
data directly--no indirection is needed.

Tests:

  • Update declare-reduction-intrinsic-op.f90 from a negative test
    (checking for the TODO error) to a positive test checking generated MLIR.
  • Update omp-declare-reduction-derivedtype.f90 CHECK lines to match the
    reference semantics fix.
  • Add declare-reduction-finalizer.f90 for derived type with init and
    finalization.

Remaining work: declare reduction without an initializer clause is not yet
supported.

Co-authored-by: Matt P. Dziubinski <matt-p.dziubinski@hpe.com>
Assisted-by: Claude Opus 4.6.


Patch is 43.69 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/190288.diff

11 Files Affected:

  • (modified) flang/include/flang/Lower/OpenMP/Clauses.h (+11)
  • (modified) flang/include/flang/Lower/Support/ReductionProcessor.h (+7-4)
  • (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+13-12)
  • (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+151-30)
  • (modified) flang/lib/Lower/Support/PrivateReductionUtils.cpp (+26)
  • (modified) flang/lib/Lower/Support/ReductionProcessor.cpp (+25-10)
  • (added) flang/test/Lower/OpenMP/declare-reduction-finalizer.f90 (+86)
  • (modified) flang/test/Lower/OpenMP/declare-reduction-intrinsic-op.f90 (+25-2)
  • (modified) flang/test/Lower/OpenMP/omp-declare-reduction-derivedtype.f90 (+18-22)
  • (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+56-50)
  • (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+1-1)
diff --git a/flang/include/flang/Lower/OpenMP/Clauses.h b/flang/include/flang/Lower/OpenMP/Clauses.h
index a325e74327240..f334374280c73 100644
--- a/flang/include/flang/Lower/OpenMP/Clauses.h
+++ b/flang/include/flang/Lower/OpenMP/Clauses.h
@@ -329,6 +329,17 @@ using UsesAllocators = tomp::clause::UsesAllocatorsT<TypeTy, IdTy, ExprTy>;
 using Weak = tomp::clause::WeakT<TypeTy, IdTy, ExprTy>;
 using When = tomp::clause::WhenT<TypeTy, IdTy, ExprTy>;
 using Write = tomp::clause::WriteT<TypeTy, IdTy, ExprTy>;
+
+DefinedOperator makeDefinedOperator(const parser::DefinedOperator &inp,
+                                    semantics::SemanticsContext &semaCtx);
+
+ProcedureDesignator
+makeProcedureDesignator(const parser::ProcedureDesignator &inp,
+                        semantics::SemanticsContext &semaCtx);
+
+ReductionOperator
+makeReductionOperator(const parser::OmpReductionIdentifier &inp,
+                      semantics::SemanticsContext &semaCtx);
 } // namespace clause
 
 using tomp::type::operator==;
diff --git a/flang/include/flang/Lower/Support/ReductionProcessor.h b/flang/include/flang/Lower/Support/ReductionProcessor.h
index bd0447360f089..bbc4879bbe352 100644
--- a/flang/include/flang/Lower/Support/ReductionProcessor.h
+++ b/flang/include/flang/Lower/Support/ReductionProcessor.h
@@ -40,9 +40,11 @@ namespace omp {
 
 class ReductionProcessor {
 public:
-  using GenInitValueCBTy =
-      std::function<mlir::Value(fir::FirOpBuilder &builder, mlir::Location loc,
-                                mlir::Type type, mlir::Value ompOrig)>;
+  // ompOrig: mold/original variable
+  // ompPriv: private allocation (may be null for by-value reductions)
+  using GenInitValueCBTy = std::function<mlir::Value(
+      fir::FirOpBuilder &builder, mlir::Location loc, mlir::Type type,
+      mlir::Value ompOrig, mlir::Value ompPriv)>;
   using GenCombinerCBTy = std::function<void(
       fir::FirOpBuilder &builder, mlir::Location loc, mlir::Type type,
       mlir::Value op1, mlir::Value op2, bool isByRef)>;
@@ -126,7 +128,8 @@ class ReductionProcessor {
   static DeclareRedType createDeclareReductionHelper(
       AbstractConverter &converter, llvm::StringRef reductionOpName,
       mlir::Type type, mlir::Location loc, bool isByRef,
-      GenCombinerCBTy genCombinerCB, GenInitValueCBTy genInitValueCB);
+      GenCombinerCBTy genCombinerCB, GenInitValueCBTy genInitValueCB,
+      const semantics::Symbol *sym = nullptr);
 
   /// Creates an OpenMP reduction declaration and inserts it into the provided
   /// symbol table. The declaration has a constant initializer with the neutral
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index 85493bf45453e..e856fdf5c0fad 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -388,7 +388,8 @@ bool ClauseProcessor::processInitializer(
     ReductionProcessor::GenInitValueCBTy &genInitValueCB) const {
   if (auto *clause = findUniqueClause<omp::clause::Initializer>()) {
     genInitValueCB = [&, clause](fir::FirOpBuilder &builder, mlir::Location loc,
-                                 mlir::Type type, mlir::Value ompOrig) {
+                                 mlir::Type type, mlir::Value moldArg,
+                                 mlir::Value privArg) {
       lower::SymMapScope scope(symMap);
       mlir::Value ompPrivVar;
       const StylizedInstance &inst = clause->v.front();
@@ -396,9 +397,10 @@ bool ClauseProcessor::processInitializer(
       for (const Object &object :
            std::get<StylizedInstance::Variables>(inst.t)) {
         mlir::Value addr;
-        mlir::Type ompOrigType = ompOrig.getType();
+        std::string name = object.sym()->name().ToString();
+        mlir::Type moldArgType = moldArg.getType();
         // Check for unsupported dynamic-length character reductions
-        mlir::Type unwrappedType = fir::unwrapRefType(ompOrigType);
+        mlir::Type unwrappedType = fir::unwrapRefType(moldArgType);
         if (mlir::isa<fir::BoxCharType>(unwrappedType)) {
           TODO(loc, "OpenMP reduction allocation for dynamic length character");
         }
@@ -408,18 +410,20 @@ bool ClauseProcessor::processInitializer(
                  "OpenMP reduction allocation for dynamic length character");
           }
         }
-        // If ompOrig is already a reference, we can use it directly
-        if (fir::isa_ref_type(ompOrigType)) {
-          addr = ompOrig;
+        // For by-ref reductions, omp_priv maps to privArg (the private
+        // allocation) and omp_orig maps to moldArg (the original).
+        if (name == "omp_priv" && privArg) {
+          addr = privArg;
+        } else if (fir::isa_ref_type(moldArgType)) {
+          addr = moldArg;
         } else {
-          addr = builder.createTemporary(loc, ompOrigType);
-          fir::StoreOp::create(builder, loc, ompOrig, addr);
+          addr = builder.createTemporary(loc, moldArgType);
+          fir::StoreOp::create(builder, loc, moldArg, addr);
         }
         fir::FortranVariableFlagsEnum extraFlags = {};
         fir::FortranVariableFlagsAttr attributes =
             Fortran::lower::translateSymbolAttributes(
                 builder.getContext(), *object.sym(), extraFlags);
-        std::string name = object.sym()->name().ToString();
         // Get length parameters for types that need them (e.g., characters).
         // Note: DeclareOp requires exactly one type parameter for non-boxed
         // characters, unlike EmboxOp which doesn't allow them for constant-len.
@@ -451,9 +455,6 @@ bool ClauseProcessor::processInitializer(
               [&](const auto &expr) -> mlir::Value {
                 mlir::Value exprResult = fir::getBase(convertExprToValue(
                     loc, converter, initExpr, symMap, stmtCtx));
-                // Conversion can either give a value or a refrence to a value,
-                // we need to return the reduction type, so an optional load may
-                // be generated.
                 if (auto refType = llvm::dyn_cast<fir::ReferenceType>(
                         exprResult.getType()))
                   if (ompPrivVar.getType() == refType)
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index e2018add11206..5b449c8ba46fd 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -18,6 +18,7 @@
 #include "Decomposer.h"
 #include "Utils.h"
 #include "flang/Common/idioms.h"
+#include "flang/Evaluate/expression.h"
 #include "flang/Evaluate/tools.h"
 #include "flang/Evaluate/type.h"
 #include "flang/Lower/Bridge.h"
@@ -3796,14 +3797,27 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
 
 static ReductionProcessor::GenCombinerCBTy processReductionCombiner(
     lower::AbstractConverter &converter, lower::SymMap &symTable,
-    semantics::SemanticsContext &semaCtx, const clause::Combiner &combiner) {
+    semantics::SemanticsContext &semaCtx, const clause::Combiner &combiner,
+    const parser::OmpStylizedInstance &parserInst) {
+  // Extract the typed assignment from the parser-level instance, if
+  // the combiner is an assignment statement (as opposed to a call).
+  const evaluate::Assignment *assign = nullptr;
+  const auto &instance =
+      std::get<parser::OmpStylizedInstance::Instance>(parserInst.t);
+  if (const auto *assignStmt =
+          std::get_if<parser::AssignmentStmt>(&instance.u)) {
+    if (auto *wrapper = assignStmt->typedAssignment.get())
+      if (wrapper->v)
+        assign = &*wrapper->v;
+  }
   ReductionProcessor::GenCombinerCBTy genCombinerCB;
   const StylizedInstance &inst = combiner.v.front();
   semantics::SomeExpr evalExpr = std::get<StylizedInstance::Instance>(inst.t);
 
-  genCombinerCB = [&, evalExpr](fir::FirOpBuilder &builder, mlir::Location loc,
-                                mlir::Type type, mlir::Value lhs,
-                                mlir::Value rhs, bool isByRef) {
+  genCombinerCB = [&, evalExpr, assign](fir::FirOpBuilder &builder,
+                                        mlir::Location loc, mlir::Type type,
+                                        mlir::Value lhs, mlir::Value rhs,
+                                        bool isByRef) {
     lower::SymMapScope scope(symTable);
     mlir::Value ompOutVar;
     for (const Object &object : std::get<StylizedInstance::Variables>(inst.t)) {
@@ -3838,6 +3852,44 @@ static ReductionProcessor::GenCombinerCBTy processReductionCombiner(
       symTable.addVariableDefinition(*object.sym(), declareOp);
     }
 
+    // For derived types with a typed assignment available, use
+    // hlfir::AssignOp or user-defined assignment directly instead of
+    // trying to convert the expression to a value (which doesn't work
+    // for record types).  Only take this path when the assignment RHS
+    // itself is a derived type -- i.e. the combiner assigns to the whole
+    // derived-type variable (e.g. omp_out = mycombine(omp_out, omp_in)).
+    // When the combiner assigns to a component (e.g. omp_out%x = ...),
+    // the RHS is a scalar intrinsic type and the existing convertExprToValue
+    // path handles it correctly.
+    bool rhsIsDerived =
+        assign && assign->rhs.GetType() &&
+        assign->rhs.GetType()->category() == common::TypeCategory::Derived;
+    if (rhsIsDerived && isByRef &&
+        mlir::isa<fir::RecordType>(fir::unwrapRefType(lhs.getType()))) {
+      lower::StatementContext stmtCtx;
+      hlfir::Entity lhsEntity{ompOutVar};
+      hlfir::Entity rhsEntity = lower::convertExprToHLFIR(
+          loc, converter, assign->rhs, symTable, stmtCtx);
+      common::visit(
+          common::visitors{
+              [&](const evaluate::Assignment::Intrinsic &) {
+                hlfir::AssignOp::create(builder, loc, rhsEntity, lhsEntity);
+              },
+              [&](const evaluate::ProcedureRef &procRef) {
+                lower::convertUserDefinedAssignmentToHLFIR(
+                    loc, converter, procRef, lhsEntity, rhsEntity, symTable);
+              },
+              [&](const auto &) {
+                llvm_unreachable(
+                    "Unexpected assignment type in reduction combiner");
+              },
+          },
+          assign->u);
+      stmtCtx.finalizeAndPop();
+      mlir::omp::YieldOp::create(builder, loc, lhs);
+      return;
+    }
+
     lower::StatementContext stmtCtx;
     mlir::Value result = common::visit(
         common::visitors{
@@ -3845,6 +3897,10 @@ static ReductionProcessor::GenCombinerCBTy processReductionCombiner(
               convertCallToHLFIR(loc, converter, procRef, std::nullopt,
                                  symTable, stmtCtx);
               auto outVal = fir::LoadOp::create(builder, loc, ompOutVar);
+              if (isByRef) {
+                fir::StoreOp::create(builder, loc, outVal, lhs);
+                return mlir::Value{};
+              }
               return outVal;
             },
             [&](const auto &expr) -> mlir::Value {
@@ -3859,12 +3915,35 @@ static ReductionProcessor::GenCombinerCBTy processReductionCombiner(
                 if (expectedType == refType.getElementType())
                   exprResult = fir::LoadOp::create(builder, loc, exprResult);
               }
+              // For component-level derived-type combiners (e.g.
+              // omp_out%x = omp_out%x + omp_in%x), the assignment was
+              // not performed during expression lowering since
+              // convertExprToValue only evaluates the RHS value.
+              // The result type won't match the reduction variable type.
+              // Use the typed assignment LHS to store to the correct
+              // component, then skip the whole-variable store.
+              if (isByRef &&
+                  exprResult.getType() != fir::unwrapRefType(lhs.getType())) {
+                if (assign) {
+                  lower::StatementContext assignCtx;
+                  hlfir::Entity lhsEntity = lower::convertExprToHLFIR(
+                      loc, converter, assign->lhs, symTable, assignCtx);
+                  hlfir::AssignOp::create(builder, loc, exprResult, lhsEntity);
+                  assignCtx.finalizeAndPop();
+                } else {
+                  fir::StoreOp::create(builder, loc, exprResult, ompOutVar);
+                }
+                return mlir::Value{};
+              }
+              if (isByRef) {
+                fir::StoreOp::create(builder, loc, exprResult, lhs);
+                return mlir::Value{};
+              }
               return exprResult;
             }},
         evalExpr.u);
     stmtCtx.finalizeAndPop();
     if (isByRef) {
-      fir::StoreOp::create(builder, loc, result, lhs);
       mlir::omp::YieldOp::create(builder, loc, lhs);
     } else {
       mlir::omp::YieldOp::create(builder, loc, result);
@@ -3957,41 +4036,83 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
   const auto &identifier =
       std::get<parser::OmpReductionIdentifier>(specifier.t);
 
-  std::string reductionNameStr = Fortran::common::visit(
-      common::visitors{
-          [](const parser::ProcedureDesignator &pd) -> std::string {
-            return std::get<parser::Name>(pd.u).ToString();
-          },
-          [](const parser::DefinedOperator &defOp) -> std::string {
-            return Fortran::common::visit(
-                common::visitors{
-                    [](const parser::DefinedOpName &opName) -> std::string {
-                      return opName.v.ToString();
-                    },
-                    [](parser::DefinedOperator::IntrinsicOperator intrOp)
-                        -> std::string {
-                      return std::string(
-                          parser::DefinedOperator::EnumToString(intrOp));
-                    },
-                },
-                defOp.u);
-          },
-      },
-      identifier.u);
+  // Convert the parser-level reduction identifier to the clause-level
+  // representation, then use ReductionProcessor to derive the canonical name.
+  clause::ReductionOperator redOp =
+      clause::makeReductionOperator(identifier, semaCtx);
+
+  // Get the parser-level combiner expression so we can pass each
+  // parser::OmpStylizedInstance to processReductionCombiner.
+  // The combiner expression's instances correspond 1:1 to typeNameList entries.
+  const auto *combinerExpr = parser::omp::GetCombinerExpr(specifier);
+  assert(combinerExpr && "Expecting combiner expression");
+  auto parserInstIt = combinerExpr->v.begin();
 
   for (const auto &typeSpec : typeNameList.v) {
     (void)typeSpec; // Currently unused
+
+    assert(parserInstIt != combinerExpr->v.end() &&
+           "Mismatched combiner instance count");
+    const parser::OmpStylizedInstance &parserInst = *parserInstIt++;
+
     mlir::Type reductionType = getReductionType(converter, specifier);
+    bool isByRef = ReductionProcessor::doReductionByRef(reductionType);
+    // Compute the canonical reduction name the same way
+    // processReductionArguments does.
+    std::string reductionNameStr = Fortran::common::visit(
+        common::visitors{
+            [&](const clause::DefinedOperator &defOp) -> std::string {
+              return Fortran::common::visit(
+                  common::visitors{
+                      [&](const clause::DefinedOperator::IntrinsicOperator
+                              &intrOp) -> std::string {
+                        ReductionProcessor::ReductionIdentifier redId =
+                            ReductionProcessor::getReductionType(intrOp);
+                        return ReductionProcessor::getReductionName(
+                            redId, converter.getFirOpBuilder().getKindMap(),
+                            reductionType, isByRef);
+                      },
+                      [&](const clause::DefinedOperator::DefinedOpName &opName)
+                          -> std::string {
+                        return opName.v.sym()->name().ToString();
+                      },
+                  },
+                  defOp.u);
+            },
+            [&](const clause::ProcedureDesignator &pd) -> std::string {
+              return pd.v.sym()->name().ToString();
+            },
+        },
+        redOp.u);
+
     ReductionProcessor::GenCombinerCBTy genCombinerCB =
-        processReductionCombiner(converter, symTable, semaCtx, combiner);
+        processReductionCombiner(converter, symTable, semaCtx, combiner,
+                                 parserInst);
     ReductionProcessor::GenInitValueCBTy genInitValueCB;
     ClauseProcessor cp(converter, semaCtx, clauses);
     cp.processInitializer(symTable, genInitValueCB);
-    bool isByRef = ReductionProcessor::doReductionByRef(reductionType);
+    mlir::Type redType =
+        isByRef
+            ? static_cast<mlir::Type>(fir::ReferenceType::get(reductionType))
+            : reductionType;
+
+    // Get the omp_out symbol from the combiner for finalization checks
+    // in populateByRefInitAndCleanupRegions.
+    const semantics::Symbol *reductionSym = nullptr;
+    const auto &declList =
+        std::get<std::list<parser::OmpStylizedDeclaration>>(parserInst.t);
+    for (const auto &decl : declList) {
+      const auto &name = std::get<parser::ObjectName>(decl.var.t);
+      if (name.ToString() == "omp_out") {
+        reductionSym = name.symbol;
+        break;
+      }
+    }
+
     ReductionProcessor::createDeclareReductionHelper<
         mlir::omp::DeclareReductionOp>(
-        converter, reductionNameStr, reductionType,
-        converter.getCurrentLocation(), isByRef, genCombinerCB, genInitValueCB);
+        converter, reductionNameStr, redType, converter.getCurrentLocation(),
+        isByRef, genCombinerCB, genInitValueCB, reductionSym);
   }
 }
 
diff --git a/flang/lib/Lower/Support/PrivateReductionUtils.cpp b/flang/lib/Lower/Support/PrivateReductionUtils.cpp
index d1a965d288cad..6314d8c389954 100644
--- a/flang/lib/Lower/Support/PrivateReductionUtils.cpp
+++ b/flang/lib/Lower/Support/PrivateReductionUtils.cpp
@@ -137,6 +137,20 @@ static void createCleanupRegion(Fortran::lower::AbstractConverter &converter,
     return;
   }
 
+  // Handle unboxed derived types that need finalization (e.g. types with
+  // FINAL subroutines). Embox the reference and call the runtime destroy.
+  if (fir::isa_derived(valTy) && mlir::isa<fir::ReferenceType>(argType)) {
+    mlir::Type boxTy = fir::BoxType::get(valTy);
+    mlir::Value box =
+        fir::EmboxOp::create(builder, loc, boxTy, block->getArgument(0));
+    fir::runtime::genDerivedTypeDestroy(builder, loc, box);
+    if (isDoConcurrent)
+      fir::YieldOp::create(builder, loc);
+    else
+      mlir::omp::YieldOp::create(builder, loc);
+    return;
+  }
+
   typeError();
 }
 
@@ -604,6 +618,18 @@ void PopulateInitAndCleanupRegionsHelper::initAndCleanupBoxchar(
 void PopulateInitAndCleanupRegionsHelper::initAndCleanupUnboxedDerivedType(
     bool needsInitialization) {
   builder.setInsertionPointToStart(initBlock);
+  // For reductions with a user-provided init value, store it into the
+  // private variable. Insert after the init value's defining op to
+  // maintain SSA dominance (the init value was generated by the
+  // callback before populateByRefInitAndCleanupRegions was called).
+  if (scalarInitValue && isReduction(kind)) {
+    mlir::OpBuilder::InsertionGuard guard(builder);
+    if (auto *defOp = scalarInitValue.getDefiningOp())
+      builder.setInsertionPointAfter(defOp);
+    else
+      builder.setInsertionPointToEnd(initBlock);
+    fir::StoreOp::create(builder, loc, scalarInitValue, allocatedPrivVarArg);
+  }
   mlir::Type boxedTy = fir::BoxType::get(valType);
   mlir::Value newBox =
       fir::EmboxOp::create(builder, loc, boxedTy, allocatedPrivVarArg);
diff --git a/flang/lib/Lower/Support/ReductionProcessor.cpp b/flang/lib/Lower/Support/ReductionProcessor.cpp
index e0cba4c512258..36bd1f1ef2397 100644
--- a/flang/lib/Lower/Support/ReductionProcessor.cpp
+++ b/flang/lib/Lower/Support/ReductionProcessor.cpp
@@ -502,7 +502,7 @@ template <typename OpType>
 static void createReductionAllocAndInitRegions(
     AbstractConverter &converter, mlir::Location loc, OpType &reductionDecl,
     ReductionProcessor::GenInitValueCBTy genInitValueCB, mlir::Type type,
-    bool isByRef) {
+    bool isByRef, const Fortran::semantics::Symbol *sym)...
[truncated]

@llvmbot
Copy link
Copy Markdown
Member

llvmbot commented Apr 2, 2026

@llvm/pr-subscribers-flang-fir-hlfir

Author: Matt (MattPD)

Changes

Fix lowering of !$omp declare reduction for intrinsic operators applied
to user-defined derived types (e.g., + on type(t)). Previously, this
hit a TODO in ReductionProcessor::getReductionInitValue because the code
tried to compute an init value for a non-predefined type, when it should
instead use the initializer region from the DeclareReductionOp.

This fixes the issue #176278: [Flang][OpenMP] Compilation error when
type-list in declare reduction directive is derived type name.

The root cause was a naming mismatch: genOMP for
OpenMPDeclareReductionConstruct used a raw operator string (e.g., "Add")
as the reduction name, while processReductionArguments at the use site
computed a canonical name via getReductionName (e.g.,
"add_reduction_byref_rec__QFTt"). The lookupSymbol in
createDeclareReductionHelper never found the already-created op, so it
fell through to createDeclareReduction which called
getReductionInitValue with the derived type and hit the TODO.

The fix has four parts:

  1. Consistent names: In genOMP for OpenMPDeclareReductionConstruct,
    compute the reduction name using the same getReductionName scheme that
    processReductionArguments uses, so both sites produce identical symbol
    names. For intrinsic operators, this maps through ReductionIdentifier to
    get the canonical name. For user-defined named reductions, the raw symbol
    name is used directly, matching the existing custom-reduction lookup path.

  2. Reuse reduction: In processReductionArguments, when an intrinsic
    operator reduction is requested, check whether a user-defined declare
    reduction already exists under that canonical name before attempting to
    create a new one. If found, reuse it. This avoids calling
    createDeclareReduction (and thus getReductionInitValue) for types that
    have user-provided initializers.

  3. Reference semantics: Change doReductionByRef to return true for
    derived types. Previously it returned false for both trivial and derived
    types, treating derived types as by-val. This is incorrect for
    user-defined combiners that operate on components via side-effects (e.g.,
    omp_out%x = omp_out%x + omp_in%x): the combiner mutates omp_out in
    place and does not produce a whole-struct value, so convertExprToValue
    returns the component type (i32) rather than the struct type, causing a
    type mismatch in the omp.yield. By-ref is the correct model: the
    combiner stores into the lhs reference and yields it.

The combiner callback in processReductionCombiner is also updated to
handle the by-ref derived-type case: when the combiner result type does
not match the element type (as happens with component-level assignments),
the store is skipped since the assignment already wrote into omp_out as a
side-effect, and only the lhs reference is yielded.

  1. GPU codegen for non-descriptor by-ref reductions: The GPU reduction
    codegen in OMPIRBuilder unconditionally called the DataPtrPtrGen
    callback for all by-ref elements, assuming descriptor-based pointer
    indirection was always needed. For simple derived types (e.g., a struct
    with scalar integer members), there is no Fortran descriptor/box, so
    dataPtrPtrRegion is empty and DataPtrPtrGen is null. This caused a
    std::bad_function_call crash (or an assertion failure in
    getSingleElement when the empty region was inlined).

The fix returns a null DataPtrPtrGen callback from makeRefDataPtrGen
when dataPtrPtrRegion is empty, and guards all call sites in
OMPIRBuilder to skip descriptor indirection when DataPtrPtrGen is null.
For non-descriptor by-ref reductions, the pointer already references the
data directly--no indirection is needed.

Tests:

  • Update declare-reduction-intrinsic-op.f90 from a negative test
    (checking for the TODO error) to a positive test checking generated MLIR.
  • Update omp-declare-reduction-derivedtype.f90 CHECK lines to match the
    reference semantics fix.
  • Add declare-reduction-finalizer.f90 for derived type with init and
    finalization.

Remaining work: declare reduction without an initializer clause is not yet
supported.

Co-authored-by: Matt P. Dziubinski <matt-p.dziubinski@hpe.com>
Assisted-by: Claude Opus 4.6.


Patch is 43.69 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/190288.diff

11 Files Affected:

  • (modified) flang/include/flang/Lower/OpenMP/Clauses.h (+11)
  • (modified) flang/include/flang/Lower/Support/ReductionProcessor.h (+7-4)
  • (modified) flang/lib/Lower/OpenMP/ClauseProcessor.cpp (+13-12)
  • (modified) flang/lib/Lower/OpenMP/OpenMP.cpp (+151-30)
  • (modified) flang/lib/Lower/Support/PrivateReductionUtils.cpp (+26)
  • (modified) flang/lib/Lower/Support/ReductionProcessor.cpp (+25-10)
  • (added) flang/test/Lower/OpenMP/declare-reduction-finalizer.f90 (+86)
  • (modified) flang/test/Lower/OpenMP/declare-reduction-intrinsic-op.f90 (+25-2)
  • (modified) flang/test/Lower/OpenMP/omp-declare-reduction-derivedtype.f90 (+18-22)
  • (modified) llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (+56-50)
  • (modified) mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp (+1-1)
diff --git a/flang/include/flang/Lower/OpenMP/Clauses.h b/flang/include/flang/Lower/OpenMP/Clauses.h
index a325e74327240..f334374280c73 100644
--- a/flang/include/flang/Lower/OpenMP/Clauses.h
+++ b/flang/include/flang/Lower/OpenMP/Clauses.h
@@ -329,6 +329,17 @@ using UsesAllocators = tomp::clause::UsesAllocatorsT<TypeTy, IdTy, ExprTy>;
 using Weak = tomp::clause::WeakT<TypeTy, IdTy, ExprTy>;
 using When = tomp::clause::WhenT<TypeTy, IdTy, ExprTy>;
 using Write = tomp::clause::WriteT<TypeTy, IdTy, ExprTy>;
+
+DefinedOperator makeDefinedOperator(const parser::DefinedOperator &inp,
+                                    semantics::SemanticsContext &semaCtx);
+
+ProcedureDesignator
+makeProcedureDesignator(const parser::ProcedureDesignator &inp,
+                        semantics::SemanticsContext &semaCtx);
+
+ReductionOperator
+makeReductionOperator(const parser::OmpReductionIdentifier &inp,
+                      semantics::SemanticsContext &semaCtx);
 } // namespace clause
 
 using tomp::type::operator==;
diff --git a/flang/include/flang/Lower/Support/ReductionProcessor.h b/flang/include/flang/Lower/Support/ReductionProcessor.h
index bd0447360f089..bbc4879bbe352 100644
--- a/flang/include/flang/Lower/Support/ReductionProcessor.h
+++ b/flang/include/flang/Lower/Support/ReductionProcessor.h
@@ -40,9 +40,11 @@ namespace omp {
 
 class ReductionProcessor {
 public:
-  using GenInitValueCBTy =
-      std::function<mlir::Value(fir::FirOpBuilder &builder, mlir::Location loc,
-                                mlir::Type type, mlir::Value ompOrig)>;
+  // ompOrig: mold/original variable
+  // ompPriv: private allocation (may be null for by-value reductions)
+  using GenInitValueCBTy = std::function<mlir::Value(
+      fir::FirOpBuilder &builder, mlir::Location loc, mlir::Type type,
+      mlir::Value ompOrig, mlir::Value ompPriv)>;
   using GenCombinerCBTy = std::function<void(
       fir::FirOpBuilder &builder, mlir::Location loc, mlir::Type type,
       mlir::Value op1, mlir::Value op2, bool isByRef)>;
@@ -126,7 +128,8 @@ class ReductionProcessor {
   static DeclareRedType createDeclareReductionHelper(
       AbstractConverter &converter, llvm::StringRef reductionOpName,
       mlir::Type type, mlir::Location loc, bool isByRef,
-      GenCombinerCBTy genCombinerCB, GenInitValueCBTy genInitValueCB);
+      GenCombinerCBTy genCombinerCB, GenInitValueCBTy genInitValueCB,
+      const semantics::Symbol *sym = nullptr);
 
   /// Creates an OpenMP reduction declaration and inserts it into the provided
   /// symbol table. The declaration has a constant initializer with the neutral
diff --git a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
index 85493bf45453e..e856fdf5c0fad 100644
--- a/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
+++ b/flang/lib/Lower/OpenMP/ClauseProcessor.cpp
@@ -388,7 +388,8 @@ bool ClauseProcessor::processInitializer(
     ReductionProcessor::GenInitValueCBTy &genInitValueCB) const {
   if (auto *clause = findUniqueClause<omp::clause::Initializer>()) {
     genInitValueCB = [&, clause](fir::FirOpBuilder &builder, mlir::Location loc,
-                                 mlir::Type type, mlir::Value ompOrig) {
+                                 mlir::Type type, mlir::Value moldArg,
+                                 mlir::Value privArg) {
       lower::SymMapScope scope(symMap);
       mlir::Value ompPrivVar;
       const StylizedInstance &inst = clause->v.front();
@@ -396,9 +397,10 @@ bool ClauseProcessor::processInitializer(
       for (const Object &object :
            std::get<StylizedInstance::Variables>(inst.t)) {
         mlir::Value addr;
-        mlir::Type ompOrigType = ompOrig.getType();
+        std::string name = object.sym()->name().ToString();
+        mlir::Type moldArgType = moldArg.getType();
         // Check for unsupported dynamic-length character reductions
-        mlir::Type unwrappedType = fir::unwrapRefType(ompOrigType);
+        mlir::Type unwrappedType = fir::unwrapRefType(moldArgType);
         if (mlir::isa<fir::BoxCharType>(unwrappedType)) {
           TODO(loc, "OpenMP reduction allocation for dynamic length character");
         }
@@ -408,18 +410,20 @@ bool ClauseProcessor::processInitializer(
                  "OpenMP reduction allocation for dynamic length character");
           }
         }
-        // If ompOrig is already a reference, we can use it directly
-        if (fir::isa_ref_type(ompOrigType)) {
-          addr = ompOrig;
+        // For by-ref reductions, omp_priv maps to privArg (the private
+        // allocation) and omp_orig maps to moldArg (the original).
+        if (name == "omp_priv" && privArg) {
+          addr = privArg;
+        } else if (fir::isa_ref_type(moldArgType)) {
+          addr = moldArg;
         } else {
-          addr = builder.createTemporary(loc, ompOrigType);
-          fir::StoreOp::create(builder, loc, ompOrig, addr);
+          addr = builder.createTemporary(loc, moldArgType);
+          fir::StoreOp::create(builder, loc, moldArg, addr);
         }
         fir::FortranVariableFlagsEnum extraFlags = {};
         fir::FortranVariableFlagsAttr attributes =
             Fortran::lower::translateSymbolAttributes(
                 builder.getContext(), *object.sym(), extraFlags);
-        std::string name = object.sym()->name().ToString();
         // Get length parameters for types that need them (e.g., characters).
         // Note: DeclareOp requires exactly one type parameter for non-boxed
         // characters, unlike EmboxOp which doesn't allow them for constant-len.
@@ -451,9 +455,6 @@ bool ClauseProcessor::processInitializer(
               [&](const auto &expr) -> mlir::Value {
                 mlir::Value exprResult = fir::getBase(convertExprToValue(
                     loc, converter, initExpr, symMap, stmtCtx));
-                // Conversion can either give a value or a refrence to a value,
-                // we need to return the reduction type, so an optional load may
-                // be generated.
                 if (auto refType = llvm::dyn_cast<fir::ReferenceType>(
                         exprResult.getType()))
                   if (ompPrivVar.getType() == refType)
diff --git a/flang/lib/Lower/OpenMP/OpenMP.cpp b/flang/lib/Lower/OpenMP/OpenMP.cpp
index e2018add11206..5b449c8ba46fd 100644
--- a/flang/lib/Lower/OpenMP/OpenMP.cpp
+++ b/flang/lib/Lower/OpenMP/OpenMP.cpp
@@ -18,6 +18,7 @@
 #include "Decomposer.h"
 #include "Utils.h"
 #include "flang/Common/idioms.h"
+#include "flang/Evaluate/expression.h"
 #include "flang/Evaluate/tools.h"
 #include "flang/Evaluate/type.h"
 #include "flang/Lower/Bridge.h"
@@ -3796,14 +3797,27 @@ genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
 
 static ReductionProcessor::GenCombinerCBTy processReductionCombiner(
     lower::AbstractConverter &converter, lower::SymMap &symTable,
-    semantics::SemanticsContext &semaCtx, const clause::Combiner &combiner) {
+    semantics::SemanticsContext &semaCtx, const clause::Combiner &combiner,
+    const parser::OmpStylizedInstance &parserInst) {
+  // Extract the typed assignment from the parser-level instance, if
+  // the combiner is an assignment statement (as opposed to a call).
+  const evaluate::Assignment *assign = nullptr;
+  const auto &instance =
+      std::get<parser::OmpStylizedInstance::Instance>(parserInst.t);
+  if (const auto *assignStmt =
+          std::get_if<parser::AssignmentStmt>(&instance.u)) {
+    if (auto *wrapper = assignStmt->typedAssignment.get())
+      if (wrapper->v)
+        assign = &*wrapper->v;
+  }
   ReductionProcessor::GenCombinerCBTy genCombinerCB;
   const StylizedInstance &inst = combiner.v.front();
   semantics::SomeExpr evalExpr = std::get<StylizedInstance::Instance>(inst.t);
 
-  genCombinerCB = [&, evalExpr](fir::FirOpBuilder &builder, mlir::Location loc,
-                                mlir::Type type, mlir::Value lhs,
-                                mlir::Value rhs, bool isByRef) {
+  genCombinerCB = [&, evalExpr, assign](fir::FirOpBuilder &builder,
+                                        mlir::Location loc, mlir::Type type,
+                                        mlir::Value lhs, mlir::Value rhs,
+                                        bool isByRef) {
     lower::SymMapScope scope(symTable);
     mlir::Value ompOutVar;
     for (const Object &object : std::get<StylizedInstance::Variables>(inst.t)) {
@@ -3838,6 +3852,44 @@ static ReductionProcessor::GenCombinerCBTy processReductionCombiner(
       symTable.addVariableDefinition(*object.sym(), declareOp);
     }
 
+    // For derived types with a typed assignment available, use
+    // hlfir::AssignOp or user-defined assignment directly instead of
+    // trying to convert the expression to a value (which doesn't work
+    // for record types).  Only take this path when the assignment RHS
+    // itself is a derived type -- i.e. the combiner assigns to the whole
+    // derived-type variable (e.g. omp_out = mycombine(omp_out, omp_in)).
+    // When the combiner assigns to a component (e.g. omp_out%x = ...),
+    // the RHS is a scalar intrinsic type and the existing convertExprToValue
+    // path handles it correctly.
+    bool rhsIsDerived =
+        assign && assign->rhs.GetType() &&
+        assign->rhs.GetType()->category() == common::TypeCategory::Derived;
+    if (rhsIsDerived && isByRef &&
+        mlir::isa<fir::RecordType>(fir::unwrapRefType(lhs.getType()))) {
+      lower::StatementContext stmtCtx;
+      hlfir::Entity lhsEntity{ompOutVar};
+      hlfir::Entity rhsEntity = lower::convertExprToHLFIR(
+          loc, converter, assign->rhs, symTable, stmtCtx);
+      common::visit(
+          common::visitors{
+              [&](const evaluate::Assignment::Intrinsic &) {
+                hlfir::AssignOp::create(builder, loc, rhsEntity, lhsEntity);
+              },
+              [&](const evaluate::ProcedureRef &procRef) {
+                lower::convertUserDefinedAssignmentToHLFIR(
+                    loc, converter, procRef, lhsEntity, rhsEntity, symTable);
+              },
+              [&](const auto &) {
+                llvm_unreachable(
+                    "Unexpected assignment type in reduction combiner");
+              },
+          },
+          assign->u);
+      stmtCtx.finalizeAndPop();
+      mlir::omp::YieldOp::create(builder, loc, lhs);
+      return;
+    }
+
     lower::StatementContext stmtCtx;
     mlir::Value result = common::visit(
         common::visitors{
@@ -3845,6 +3897,10 @@ static ReductionProcessor::GenCombinerCBTy processReductionCombiner(
               convertCallToHLFIR(loc, converter, procRef, std::nullopt,
                                  symTable, stmtCtx);
               auto outVal = fir::LoadOp::create(builder, loc, ompOutVar);
+              if (isByRef) {
+                fir::StoreOp::create(builder, loc, outVal, lhs);
+                return mlir::Value{};
+              }
               return outVal;
             },
             [&](const auto &expr) -> mlir::Value {
@@ -3859,12 +3915,35 @@ static ReductionProcessor::GenCombinerCBTy processReductionCombiner(
                 if (expectedType == refType.getElementType())
                   exprResult = fir::LoadOp::create(builder, loc, exprResult);
               }
+              // For component-level derived-type combiners (e.g.
+              // omp_out%x = omp_out%x + omp_in%x), the assignment was
+              // not performed during expression lowering since
+              // convertExprToValue only evaluates the RHS value.
+              // The result type won't match the reduction variable type.
+              // Use the typed assignment LHS to store to the correct
+              // component, then skip the whole-variable store.
+              if (isByRef &&
+                  exprResult.getType() != fir::unwrapRefType(lhs.getType())) {
+                if (assign) {
+                  lower::StatementContext assignCtx;
+                  hlfir::Entity lhsEntity = lower::convertExprToHLFIR(
+                      loc, converter, assign->lhs, symTable, assignCtx);
+                  hlfir::AssignOp::create(builder, loc, exprResult, lhsEntity);
+                  assignCtx.finalizeAndPop();
+                } else {
+                  fir::StoreOp::create(builder, loc, exprResult, ompOutVar);
+                }
+                return mlir::Value{};
+              }
+              if (isByRef) {
+                fir::StoreOp::create(builder, loc, exprResult, lhs);
+                return mlir::Value{};
+              }
               return exprResult;
             }},
         evalExpr.u);
     stmtCtx.finalizeAndPop();
     if (isByRef) {
-      fir::StoreOp::create(builder, loc, result, lhs);
       mlir::omp::YieldOp::create(builder, loc, lhs);
     } else {
       mlir::omp::YieldOp::create(builder, loc, result);
@@ -3957,41 +4036,83 @@ static void genOMP(lower::AbstractConverter &converter, lower::SymMap &symTable,
   const auto &identifier =
       std::get<parser::OmpReductionIdentifier>(specifier.t);
 
-  std::string reductionNameStr = Fortran::common::visit(
-      common::visitors{
-          [](const parser::ProcedureDesignator &pd) -> std::string {
-            return std::get<parser::Name>(pd.u).ToString();
-          },
-          [](const parser::DefinedOperator &defOp) -> std::string {
-            return Fortran::common::visit(
-                common::visitors{
-                    [](const parser::DefinedOpName &opName) -> std::string {
-                      return opName.v.ToString();
-                    },
-                    [](parser::DefinedOperator::IntrinsicOperator intrOp)
-                        -> std::string {
-                      return std::string(
-                          parser::DefinedOperator::EnumToString(intrOp));
-                    },
-                },
-                defOp.u);
-          },
-      },
-      identifier.u);
+  // Convert the parser-level reduction identifier to the clause-level
+  // representation, then use ReductionProcessor to derive the canonical name.
+  clause::ReductionOperator redOp =
+      clause::makeReductionOperator(identifier, semaCtx);
+
+  // Get the parser-level combiner expression so we can pass each
+  // parser::OmpStylizedInstance to processReductionCombiner.
+  // The combiner expression's instances correspond 1:1 to typeNameList entries.
+  const auto *combinerExpr = parser::omp::GetCombinerExpr(specifier);
+  assert(combinerExpr && "Expecting combiner expression");
+  auto parserInstIt = combinerExpr->v.begin();
 
   for (const auto &typeSpec : typeNameList.v) {
     (void)typeSpec; // Currently unused
+
+    assert(parserInstIt != combinerExpr->v.end() &&
+           "Mismatched combiner instance count");
+    const parser::OmpStylizedInstance &parserInst = *parserInstIt++;
+
     mlir::Type reductionType = getReductionType(converter, specifier);
+    bool isByRef = ReductionProcessor::doReductionByRef(reductionType);
+    // Compute the canonical reduction name the same way
+    // processReductionArguments does.
+    std::string reductionNameStr = Fortran::common::visit(
+        common::visitors{
+            [&](const clause::DefinedOperator &defOp) -> std::string {
+              return Fortran::common::visit(
+                  common::visitors{
+                      [&](const clause::DefinedOperator::IntrinsicOperator
+                              &intrOp) -> std::string {
+                        ReductionProcessor::ReductionIdentifier redId =
+                            ReductionProcessor::getReductionType(intrOp);
+                        return ReductionProcessor::getReductionName(
+                            redId, converter.getFirOpBuilder().getKindMap(),
+                            reductionType, isByRef);
+                      },
+                      [&](const clause::DefinedOperator::DefinedOpName &opName)
+                          -> std::string {
+                        return opName.v.sym()->name().ToString();
+                      },
+                  },
+                  defOp.u);
+            },
+            [&](const clause::ProcedureDesignator &pd) -> std::string {
+              return pd.v.sym()->name().ToString();
+            },
+        },
+        redOp.u);
+
     ReductionProcessor::GenCombinerCBTy genCombinerCB =
-        processReductionCombiner(converter, symTable, semaCtx, combiner);
+        processReductionCombiner(converter, symTable, semaCtx, combiner,
+                                 parserInst);
     ReductionProcessor::GenInitValueCBTy genInitValueCB;
     ClauseProcessor cp(converter, semaCtx, clauses);
     cp.processInitializer(symTable, genInitValueCB);
-    bool isByRef = ReductionProcessor::doReductionByRef(reductionType);
+    mlir::Type redType =
+        isByRef
+            ? static_cast<mlir::Type>(fir::ReferenceType::get(reductionType))
+            : reductionType;
+
+    // Get the omp_out symbol from the combiner for finalization checks
+    // in populateByRefInitAndCleanupRegions.
+    const semantics::Symbol *reductionSym = nullptr;
+    const auto &declList =
+        std::get<std::list<parser::OmpStylizedDeclaration>>(parserInst.t);
+    for (const auto &decl : declList) {
+      const auto &name = std::get<parser::ObjectName>(decl.var.t);
+      if (name.ToString() == "omp_out") {
+        reductionSym = name.symbol;
+        break;
+      }
+    }
+
     ReductionProcessor::createDeclareReductionHelper<
         mlir::omp::DeclareReductionOp>(
-        converter, reductionNameStr, reductionType,
-        converter.getCurrentLocation(), isByRef, genCombinerCB, genInitValueCB);
+        converter, reductionNameStr, redType, converter.getCurrentLocation(),
+        isByRef, genCombinerCB, genInitValueCB, reductionSym);
   }
 }
 
diff --git a/flang/lib/Lower/Support/PrivateReductionUtils.cpp b/flang/lib/Lower/Support/PrivateReductionUtils.cpp
index d1a965d288cad..6314d8c389954 100644
--- a/flang/lib/Lower/Support/PrivateReductionUtils.cpp
+++ b/flang/lib/Lower/Support/PrivateReductionUtils.cpp
@@ -137,6 +137,20 @@ static void createCleanupRegion(Fortran::lower::AbstractConverter &converter,
     return;
   }
 
+  // Handle unboxed derived types that need finalization (e.g. types with
+  // FINAL subroutines). Embox the reference and call the runtime destroy.
+  if (fir::isa_derived(valTy) && mlir::isa<fir::ReferenceType>(argType)) {
+    mlir::Type boxTy = fir::BoxType::get(valTy);
+    mlir::Value box =
+        fir::EmboxOp::create(builder, loc, boxTy, block->getArgument(0));
+    fir::runtime::genDerivedTypeDestroy(builder, loc, box);
+    if (isDoConcurrent)
+      fir::YieldOp::create(builder, loc);
+    else
+      mlir::omp::YieldOp::create(builder, loc);
+    return;
+  }
+
   typeError();
 }
 
@@ -604,6 +618,18 @@ void PopulateInitAndCleanupRegionsHelper::initAndCleanupBoxchar(
 void PopulateInitAndCleanupRegionsHelper::initAndCleanupUnboxedDerivedType(
     bool needsInitialization) {
   builder.setInsertionPointToStart(initBlock);
+  // For reductions with a user-provided init value, store it into the
+  // private variable. Insert after the init value's defining op to
+  // maintain SSA dominance (the init value was generated by the
+  // callback before populateByRefInitAndCleanupRegions was called).
+  if (scalarInitValue && isReduction(kind)) {
+    mlir::OpBuilder::InsertionGuard guard(builder);
+    if (auto *defOp = scalarInitValue.getDefiningOp())
+      builder.setInsertionPointAfter(defOp);
+    else
+      builder.setInsertionPointToEnd(initBlock);
+    fir::StoreOp::create(builder, loc, scalarInitValue, allocatedPrivVarArg);
+  }
   mlir::Type boxedTy = fir::BoxType::get(valType);
   mlir::Value newBox =
       fir::EmboxOp::create(builder, loc, boxedTy, allocatedPrivVarArg);
diff --git a/flang/lib/Lower/Support/ReductionProcessor.cpp b/flang/lib/Lower/Support/ReductionProcessor.cpp
index e0cba4c512258..36bd1f1ef2397 100644
--- a/flang/lib/Lower/Support/ReductionProcessor.cpp
+++ b/flang/lib/Lower/Support/ReductionProcessor.cpp
@@ -502,7 +502,7 @@ template <typename OpType>
 static void createReductionAllocAndInitRegions(
     AbstractConverter &converter, mlir::Location loc, OpType &reductionDecl,
     ReductionProcessor::GenInitValueCBTy genInitValueCB, mlir::Type type,
-    bool isByRef) {
+    bool isByRef, const Fortran::semantics::Symbol *sym)...
[truncated]

@MattPD
Copy link
Copy Markdown
Member Author

MattPD commented Apr 3, 2026

This replaces the merged and reverted commit e80604a from PR #184897.

Part 4, GPU codegen for non-descriptor by-ref reductions, addresses the ICE encountered in the original PR: These are the changes in llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp (I did run git-clang-format, FWIW) and mlir/lib/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.cpp

(I can see 1 failing check, "CI Checks / Build and Test Linux AArch64 (pull_request)". The failure is an LLDB test:
TestSyntheticCapping.py - Fatal Python error: Segmentation fault. It's a Python segfault in an LLDB data-formatter test not related to this patch.)

@tblah
Copy link
Copy Markdown
Contributor

tblah commented Apr 7, 2026

Added some AMD people to review offloading. I'm happy with the original PR

@mjklemm mjklemm self-requested a review April 8, 2026 15:57
@dpalermo dpalermo requested review from jsjodin and kparzysz April 8, 2026 15:59
@MattPD
Copy link
Copy Markdown
Member Author

MattPD commented Apr 15, 2026

Just a gentle ping to ask whether there are any comments on the patch.

Copy link
Copy Markdown
Contributor

@kparzysz kparzysz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

Comment thread flang/lib/Lower/OpenMP/OpenMP.cpp Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove the explicit Fortran:: namespace for consistency with the rest of the code. In most cases we do using namespace Fortran at the beginning of the .cpp file. I think I've seen the Fortran:: used more than once.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, done!

…ypes

Fix lowering of `!$omp declare reduction` for intrinsic operators applied
to user-defined derived types (e.g., `+` on `type(t)`). Previously, this
hit a TODO in `ReductionProcessor::getReductionInitValue` because the code
tried to compute an init value for a non-predefined type, when it should
instead use the initializer region from the `DeclareReductionOp`.

This fixes the issue llvm#176278: [Flang][OpenMP] Compilation error when
type-list in declare reduction directive is derived type name.

The root cause was a naming mismatch: `genOMP` for
`OpenMPDeclareReductionConstruct` used a raw operator string (e.g., "Add")
as the reduction name, while `processReductionArguments` at the use site
computed a canonical name via `getReductionName` (e.g.,
"add_reduction_byref_rec__QFTt"). The `lookupSymbol` in
`createDeclareReductionHelper` never found the already-created op, so it
fell through to `createDeclareReduction` which called
`getReductionInitValue` with the derived type and hit the TODO.

The fix has four parts:

1. Consistent names: In `genOMP` for `OpenMPDeclareReductionConstruct`,
compute the reduction name using the same `getReductionName` scheme that
`processReductionArguments` uses, so both sites produce identical symbol
names. For intrinsic operators, this maps through `ReductionIdentifier` to
get the canonical name. For user-defined named reductions, the raw symbol
name is used directly, matching the existing custom-reduction lookup path.

2. Reuse reduction: In `processReductionArguments`, when an intrinsic
operator reduction is requested, check whether a user-defined declare
reduction already exists under that canonical name before attempting to
create a new one. If found, reuse it. This avoids calling
`createDeclareReduction` (and thus `getReductionInitValue`) for types that
have user-provided initializers.

3. Reference semantics: Change `doReductionByRef` to return true for
derived types. Previously it returned false for both trivial and derived
types, treating derived types as by-val. This is incorrect for
user-defined combiners that operate on components via side-effects (e.g.,
`omp_out%x = omp_out%x + omp_in%x`): the combiner mutates `omp_out` in
place and does not produce a whole-struct value, so `convertExprToValue`
returns the component type (`i32`) rather than the struct type, causing a
type mismatch in the `omp.yield`. By-ref is the correct model: the
combiner stores into the lhs reference and yields it.

The combiner callback in `processReductionCombiner` is also updated to
handle the by-ref derived-type case: when the combiner result type does
not match the element type (as happens with component-level assignments),
the store is skipped since the assignment already wrote into omp_out as a
side-effect, and only the lhs reference is yielded.

4. GPU codegen for non-descriptor by-ref reductions: The GPU reduction
codegen in OMPIRBuilder unconditionally called the `DataPtrPtrGen`
callback for all by-ref elements, assuming descriptor-based pointer
indirection was always needed. For simple derived types (e.g., a struct
with scalar integer members), there is no Fortran descriptor/box, so
`dataPtrPtrRegion` is empty and `DataPtrPtrGen` is null. This caused a
`std::bad_function_call` crash (or an assertion failure in
`getSingleElement` when the empty region was inlined).

The fix returns a null `DataPtrPtrGen` callback from `makeRefDataPtrGen`
when `dataPtrPtrRegion` is empty, and guards all call sites in
OMPIRBuilder to skip descriptor indirection when `DataPtrPtrGen` is null.
For non-descriptor by-ref reductions, the pointer already references the
data directly--no indirection is needed.

Tests:
- Update declare-reduction-intrinsic-op.f90 from a negative test
  (checking for the TODO error) to a positive test checking generated MLIR.
- Update omp-declare-reduction-derivedtype.f90 CHECK lines to match the
  reference semantics fix.
- Add declare-reduction-finalizer.f90 for derived type with init and
  finalization.

Remaining work: declare reduction without an initializer clause is not yet
supported.

Co-authored-by: Matt P. Dziubinski <matt-p.dziubinski@hpe.com>
Assisted-by: Claude Opus 4.6.
@MattPD MattPD force-pushed the flang_176278_declare-reduction-derived branch from 0607a8d to 6eb106d Compare April 20, 2026 21:05
@MattPD
Copy link
Copy Markdown
Member Author

MattPD commented Apr 20, 2026

Updated to include the reviewer's feedback. If this looks good please feel free to merge; thanks!

@Jason-Van-Beusekom Jason-Van-Beusekom merged commit 50d7c99 into llvm:main Apr 22, 2026
10 checks passed
yingopq pushed a commit to yingopq/llvm-project that referenced this pull request Apr 29, 2026
…ypes (llvm#190288)

Fix lowering of `!$omp declare reduction` for intrinsic operators
applied
to user-defined derived types (e.g., `+` on `type(t)`). Previously, this
hit a TODO in `ReductionProcessor::getReductionInitValue` because the
code
tried to compute an init value for a non-predefined type, when it should
instead use the initializer region from the `DeclareReductionOp`.

This fixes the issue llvm#176278: [Flang][OpenMP] Compilation error when
type-list in declare reduction directive is derived type name.

The root cause was a naming mismatch: `genOMP` for
`OpenMPDeclareReductionConstruct` used a raw operator string (e.g.,
"Add")
as the reduction name, while `processReductionArguments` at the use site
computed a canonical name via `getReductionName` (e.g.,
"add_reduction_byref_rec__QFTt"). The `lookupSymbol` in
`createDeclareReductionHelper` never found the already-created op, so it
fell through to `createDeclareReduction` which called
`getReductionInitValue` with the derived type and hit the TODO.

The fix has four parts:

1. Consistent names: In `genOMP` for `OpenMPDeclareReductionConstruct`,
compute the reduction name using the same `getReductionName` scheme that
`processReductionArguments` uses, so both sites produce identical symbol
names. For intrinsic operators, this maps through `ReductionIdentifier`
to
get the canonical name. For user-defined named reductions, the raw
symbol
name is used directly, matching the existing custom-reduction lookup
path.

2. Reuse reduction: In `processReductionArguments`, when an intrinsic
operator reduction is requested, check whether a user-defined declare
reduction already exists under that canonical name before attempting to
create a new one. If found, reuse it. This avoids calling
`createDeclareReduction` (and thus `getReductionInitValue`) for types
that
have user-provided initializers.

3. Reference semantics: Change `doReductionByRef` to return true for
derived types. Previously it returned false for both trivial and derived
types, treating derived types as by-val. This is incorrect for
user-defined combiners that operate on components via side-effects
(e.g.,
`omp_out%x = omp_out%x + omp_in%x`): the combiner mutates `omp_out` in
place and does not produce a whole-struct value, so `convertExprToValue`
returns the component type (`i32`) rather than the struct type, causing
a
type mismatch in the `omp.yield`. By-ref is the correct model: the
combiner stores into the lhs reference and yields it.

The combiner callback in `processReductionCombiner` is also updated to
handle the by-ref derived-type case: when the combiner result type does
not match the element type (as happens with component-level
assignments),
the store is skipped since the assignment already wrote into omp_out as
a
side-effect, and only the lhs reference is yielded.

4. GPU codegen for non-descriptor by-ref reductions: The GPU reduction
codegen in OMPIRBuilder unconditionally called the `DataPtrPtrGen`
callback for all by-ref elements, assuming descriptor-based pointer
indirection was always needed. For simple derived types (e.g., a struct
with scalar integer members), there is no Fortran descriptor/box, so
`dataPtrPtrRegion` is empty and `DataPtrPtrGen` is null. This caused a
`std::bad_function_call` crash (or an assertion failure in
`getSingleElement` when the empty region was inlined).

The fix returns a null `DataPtrPtrGen` callback from `makeRefDataPtrGen`
when `dataPtrPtrRegion` is empty, and guards all call sites in
OMPIRBuilder to skip descriptor indirection when `DataPtrPtrGen` is
null.
For non-descriptor by-ref reductions, the pointer already references the
data directly--no indirection is needed.

Tests:
- Update declare-reduction-intrinsic-op.f90 from a negative test
(checking for the TODO error) to a positive test checking generated
MLIR.
- Update omp-declare-reduction-derivedtype.f90 CHECK lines to match the
  reference semantics fix.
- Add declare-reduction-finalizer.f90 for derived type with init and
  finalization.

Remaining work: declare reduction without an initializer clause is not
yet
supported.

Co-authored-by: Matt P. Dziubinski <matt-p.dziubinski@hpe.com>
Assisted-by: Claude Opus 4.6.

Co-authored-by: Matt P. Dziubinski <matt-p.dziubinski@hpe.com>
KHicketts pushed a commit to KHicketts/llvm-project that referenced this pull request Apr 30, 2026
…ypes (llvm#190288)

Fix lowering of `!$omp declare reduction` for intrinsic operators
applied
to user-defined derived types (e.g., `+` on `type(t)`). Previously, this
hit a TODO in `ReductionProcessor::getReductionInitValue` because the
code
tried to compute an init value for a non-predefined type, when it should
instead use the initializer region from the `DeclareReductionOp`.

This fixes the issue llvm#176278: [Flang][OpenMP] Compilation error when
type-list in declare reduction directive is derived type name.

The root cause was a naming mismatch: `genOMP` for
`OpenMPDeclareReductionConstruct` used a raw operator string (e.g.,
"Add")
as the reduction name, while `processReductionArguments` at the use site
computed a canonical name via `getReductionName` (e.g.,
"add_reduction_byref_rec__QFTt"). The `lookupSymbol` in
`createDeclareReductionHelper` never found the already-created op, so it
fell through to `createDeclareReduction` which called
`getReductionInitValue` with the derived type and hit the TODO.

The fix has four parts:

1. Consistent names: In `genOMP` for `OpenMPDeclareReductionConstruct`,
compute the reduction name using the same `getReductionName` scheme that
`processReductionArguments` uses, so both sites produce identical symbol
names. For intrinsic operators, this maps through `ReductionIdentifier`
to
get the canonical name. For user-defined named reductions, the raw
symbol
name is used directly, matching the existing custom-reduction lookup
path.

2. Reuse reduction: In `processReductionArguments`, when an intrinsic
operator reduction is requested, check whether a user-defined declare
reduction already exists under that canonical name before attempting to
create a new one. If found, reuse it. This avoids calling
`createDeclareReduction` (and thus `getReductionInitValue`) for types
that
have user-provided initializers.

3. Reference semantics: Change `doReductionByRef` to return true for
derived types. Previously it returned false for both trivial and derived
types, treating derived types as by-val. This is incorrect for
user-defined combiners that operate on components via side-effects
(e.g.,
`omp_out%x = omp_out%x + omp_in%x`): the combiner mutates `omp_out` in
place and does not produce a whole-struct value, so `convertExprToValue`
returns the component type (`i32`) rather than the struct type, causing
a
type mismatch in the `omp.yield`. By-ref is the correct model: the
combiner stores into the lhs reference and yields it.

The combiner callback in `processReductionCombiner` is also updated to
handle the by-ref derived-type case: when the combiner result type does
not match the element type (as happens with component-level
assignments),
the store is skipped since the assignment already wrote into omp_out as
a
side-effect, and only the lhs reference is yielded.

4. GPU codegen for non-descriptor by-ref reductions: The GPU reduction
codegen in OMPIRBuilder unconditionally called the `DataPtrPtrGen`
callback for all by-ref elements, assuming descriptor-based pointer
indirection was always needed. For simple derived types (e.g., a struct
with scalar integer members), there is no Fortran descriptor/box, so
`dataPtrPtrRegion` is empty and `DataPtrPtrGen` is null. This caused a
`std::bad_function_call` crash (or an assertion failure in
`getSingleElement` when the empty region was inlined).

The fix returns a null `DataPtrPtrGen` callback from `makeRefDataPtrGen`
when `dataPtrPtrRegion` is empty, and guards all call sites in
OMPIRBuilder to skip descriptor indirection when `DataPtrPtrGen` is
null.
For non-descriptor by-ref reductions, the pointer already references the
data directly--no indirection is needed.

Tests:
- Update declare-reduction-intrinsic-op.f90 from a negative test
(checking for the TODO error) to a positive test checking generated
MLIR.
- Update omp-declare-reduction-derivedtype.f90 CHECK lines to match the
  reference semantics fix.
- Add declare-reduction-finalizer.f90 for derived type with init and
  finalization.

Remaining work: declare reduction without an initializer clause is not
yet
supported.

Co-authored-by: Matt P. Dziubinski <matt-p.dziubinski@hpe.com>
Assisted-by: Claude Opus 4.6.

Co-authored-by: Matt P. Dziubinski <matt-p.dziubinski@hpe.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang:openmp OpenMP related changes to Clang flang:fir-hlfir flang:openmp flang Flang issues not falling into any other category mlir:llvm mlir:openmp mlir

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants