Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[flang][hlfir] Fixed some finalization/deallocation issues. #67047

Merged
merged 7 commits into from
Sep 22, 2023

Conversation

vzakhari
Copy link
Contributor

@vzakhari vzakhari commented Sep 21, 2023

This set of commits resolves some of the issues with elemental calls producing
results that may require finalization, and also some memory leak issues due to
the missing deallocation of allocatable components of the temporary buffers
created by the bufferization pass.

I split it into 7 commits to make the review a little bit more convenient, though,
I am planning to merge it as a single squashed commit.

  • [flang][runtime] Expose Finalize API for derived types.

  • [flang][hlfir] Add 'finalize' attribute for DestroyOp.

  • [flang][hlfir] Postpone result finalization for elemental calls.

    The results of elemental calls generated inside hlfir.elemental must not
    be finalized/destructed before they are copied into the resulting
    array. The finalization must be done on the array as a whole
    (e.g. there might be different scalar and array finalization routines).
    The finalization work is left to the hlfir.destroy corresponding
    to this hlfir.elemental.

  • [flang][hlfir] Tighten requirements on hlfir.end_associate operand.

    If component deallocation might be required for the operand of
    hlfir.end_associate, we have to be able to get the variable
    shape/params to create a descriptor for calling the runtime.
    This commit adds verification that we can do so.

  • [flang][hlfir] Lower argument clean-ups using valid hlfir.end_associate.

    The operand must be a Fortran entity, when allocatable component
    deallocation may be required.

  • [flang][hlfir] Properly clean-up temporary buffers in bufferization pass.

    This commit combines changes for proper finalization and component
    deallocation of the temporary buffers. The finalization part
    relates to hlfir.destroy operations with 'finalize' attribute.
    The component deallocation might be invoked for both hlfir.destroy
    and hlfir.end_associate, if the operand is of a derived type
    with allocatable component(s).

    The changes are mostly in one function, so I decided not to split them.

  • [flang][hlfir] Disable optimizations for hlfir.elemental requiring finalization.

    If hlfir.elemental is coupled with hlfir.destroy with 'finalize' attribute,
    the temporary array result of hlfir.elemental needs to be created
    for the purpose of finalization. We cannot do certain optimizations
    on such hlfir.elemental operations.

    I was not able to come up with a test for the OptimizedBufferization pass,
    but I put the check there as well.

The results of elemental calls generated inside hlfir.elemental must not
be finalized/destructed before they are copied into the resulting
array. The finalization must be done on the array as a whole
(e.g. there might be different scalar and array finalization routines).
The finalization work is left to the hlfir.destroy corresponding
to this hlfir.elemental.
If component deallocation might be required for the operand of
hlfir.end_associate, we have to be able to get the variable
shape/params to create a descriptor for calling the runtime.
This commit adds verification that we can do so.
The operand must be a Fortran entity, when allocatable component
deallocation may be required.
…ass.

This commit combines changes for proper finalization and component
deallocation of the temporary buffers. The finalization part
relates to hlfir.destroy operations with 'finalize' attribute.
The component deallocation might be invoked for both hlfir.destroy
and hlfir.end_associate, if the operand is of a derived type
with allocatable component(s).

The changes are mostly in one function, so I decided not to split them.
…nalization.

If hlfir.elemental is coupled with hlfir.destroy with 'finalize' attribute,
the temporary array result of hlfir.elemental needs to be created
for the purpose of finalization. We cannot do certain optimizations
on such hlfir.elemental operations.

I was not able to come up with a test for the OptimizedBufferization pass,
but I put the check there as well.
@llvmbot llvmbot added flang:runtime flang Flang issues not falling into any other category flang:fir-hlfir labels Sep 21, 2023
@llvmbot
Copy link
Collaborator

llvmbot commented Sep 21, 2023

@llvm/pr-subscribers-flang-runtime

@llvm/pr-subscribers-flang-fir-hlfir

Changes

This set of commits resolves some of the issues with elemental calls producing
results that may require finalization, and also some memory leak issues due to
the missing deallocation of allocatable components of the temporary buffers
created by the bufferization pass.

I split it into 7 commits to make the review a little bit more convenient, though,
I am planning to merge it as a single squashed commit.

  • [flang][runtime] Expose Finalize API for derived types.
  • [flang][hlfir] Add 'finalize' attribute for DestroyOp.
  • [flang][hlfir] Postpone result finalization for elemental calls.
  • [flang][hlfir] Tighten requirements on hlfir.end_associate operand.
  • [flang][hlfir] Lower argument clean-ups using valid hlfir.end_associate.
  • [flang][hlfir] Properly clean-up temporary buffers in bufferization pass.
  • [flang][hlfir] Disable optimizations for hlfir.elemental requiring finalization.

Patch is 68.92 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/67047.diff

29 Files Affected:

  • (modified) flang/include/flang/Lower/ConvertCall.h (+3-1)
  • (modified) flang/include/flang/Optimizer/Builder/HLFIRTools.h (+7-31)
  • (modified) flang/include/flang/Optimizer/Builder/Runtime/Derived.h (+5)
  • (modified) flang/include/flang/Optimizer/HLFIR/HLFIRDialect.h (+38)
  • (modified) flang/include/flang/Optimizer/HLFIR/HLFIROps.td (+23-2)
  • (modified) flang/include/flang/Runtime/derived-api.h (+4)
  • (modified) flang/lib/Lower/ConvertCall.cpp (+55-9)
  • (modified) flang/lib/Lower/HlfirIntrinsics.cpp (+4)
  • (modified) flang/lib/Optimizer/Builder/HLFIRTools.cpp (+9)
  • (modified) flang/lib/Optimizer/Builder/Runtime/Derived.cpp (+12)
  • (modified) flang/lib/Optimizer/HLFIR/IR/HLFIRDialect.cpp (+5)
  • (modified) flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp (+36-1)
  • (modified) flang/lib/Optimizer/HLFIR/Transforms/BufferizeHLFIR.cpp (+104-18)
  • (modified) flang/lib/Optimizer/HLFIR/Transforms/InlineElementals.cpp (+5)
  • (modified) flang/lib/Optimizer/HLFIR/Transforms/OptimizedBufferization.cpp (+7)
  • (modified) flang/runtime/derived-api.cpp (+12)
  • (modified) flang/test/HLFIR/associate-codegen.fir (+1-1)
  • (added) flang/test/HLFIR/bufferize-destroy-for-derived.fir (+97)
  • (added) flang/test/HLFIR/bufferize-end-associate-for-derived.fir (+52)
  • (modified) flang/test/HLFIR/bufferize-poly-expr.fir (+1-1)
  • (modified) flang/test/HLFIR/destroy.fir (+16)
  • (modified) flang/test/HLFIR/elemental-codegen-nested.fir (+1-1)
  • (modified) flang/test/HLFIR/inline-elemental.fir (+37)
  • (modified) flang/test/HLFIR/invalid.fir (+16)
  • (modified) flang/test/HLFIR/mul_transpose.f90 (+2-1)
  • (added) flang/test/Lower/HLFIR/associate-for-args-with-alloc-components.f90 (+65)
  • (added) flang/test/Lower/HLFIR/elemental-call-with-finalization.f90 (+60)
  • (modified) flang/test/Lower/HLFIR/poly_expr_for_nonpoly_dummy.f90 (+2-2)
  • (modified) flang/test/Lower/HLFIR/polymorphic-expressions.f90 (+1-1)
diff --git a/flang/include/flang/Lower/ConvertCall.h b/flang/include/flang/Lower/ConvertCall.h
index 62ba229614d58c1..f8171236bb39d99 100644
--- a/flang/include/flang/Lower/ConvertCall.h
+++ b/flang/include/flang/Lower/ConvertCall.h
@@ -28,11 +28,13 @@ namespace Fortran::lower {
 /// the call and return the result. This function deals with explicit result
 /// allocation and lowering if needed. It also deals with passing the host
 /// link to internal procedures.
+/// \p isElemental must be set to true if elemental call is being produced.
+/// It is only used for HLFIR.
 fir::ExtendedValue genCallOpAndResult(
     mlir::Location loc, Fortran::lower::AbstractConverter &converter,
     Fortran::lower::SymMap &symMap, Fortran::lower::StatementContext &stmtCtx,
     Fortran::lower::CallerInterface &caller, mlir::FunctionType callSiteType,
-    std::optional<mlir::Type> resultType);
+    std::optional<mlir::Type> resultType, bool isElemental = false);
 
 /// If \p arg is the address of a function with a denoted host-association tuple
 /// argument, then return the host-associations tuple value of the current
diff --git a/flang/include/flang/Optimizer/Builder/HLFIRTools.h b/flang/include/flang/Optimizer/Builder/HLFIRTools.h
index 6d73ebc3a7e1d9c..393e70f772e5ecf 100644
--- a/flang/include/flang/Optimizer/Builder/HLFIRTools.h
+++ b/flang/include/flang/Optimizer/Builder/HLFIRTools.h
@@ -35,37 +35,6 @@ class ElementalOpInterface;
 class ElementalAddrOp;
 class YieldElementOp;
 
-/// Is this an SSA value type for the value of a Fortran procedure
-/// designator ?
-inline bool isFortranProcedureValue(mlir::Type type) {
-  return type.isa<fir::BoxProcType>() ||
-         (type.isa<mlir::TupleType>() &&
-          fir::isCharacterProcedureTuple(type, /*acceptRawFunc=*/false));
-}
-
-/// Is this an SSA value type for the value of a Fortran expression?
-inline bool isFortranValueType(mlir::Type type) {
-  return type.isa<hlfir::ExprType>() || fir::isa_trivial(type) ||
-         isFortranProcedureValue(type);
-}
-
-/// Is this the value of a Fortran expression in an SSA value form?
-inline bool isFortranValue(mlir::Value value) {
-  return isFortranValueType(value.getType());
-}
-
-/// Is this a Fortran variable?
-/// Note that by "variable", it must be understood that the mlir::Value is
-/// a memory value of a storage that can be reason about as a Fortran object
-/// (its bounds, shape, and type parameters, if any, are retrievable).
-/// This does not imply that the mlir::Value points to a variable from the
-/// original source or can be legally defined: temporaries created to store
-/// expression values are considered to be variables, and so are PARAMETERs
-/// global constant address.
-inline bool isFortranEntity(mlir::Value value) {
-  return isFortranValue(value) || isFortranVariableType(value.getType());
-}
-
 /// Is this a Fortran variable for which the defining op carrying the Fortran
 /// attributes is visible?
 inline bool isFortranVariableWithAttributes(mlir::Value value) {
@@ -442,6 +411,13 @@ hlfir::ElementalOp cloneToElementalOp(mlir::Location loc,
                                       fir::FirOpBuilder &builder,
                                       hlfir::ElementalAddrOp elementalAddrOp);
 
+/// Return true, if \p elemental must produce a temporary array,
+/// for example, for the purpose of finalization. Note that such
+/// ElementalOp's must be optimized with caution. For example,
+/// completely inlining such ElementalOp into another one
+/// would be incorrect.
+bool elementalOpMustProduceTemp(hlfir::ElementalOp elemental);
+
 } // namespace hlfir
 
 #endif // FORTRAN_OPTIMIZER_BUILDER_HLFIRTOOLS_H
diff --git a/flang/include/flang/Optimizer/Builder/Runtime/Derived.h b/flang/include/flang/Optimizer/Builder/Runtime/Derived.h
index 30998f8b0ea6558..d8b06f35b1da8a6 100644
--- a/flang/include/flang/Optimizer/Builder/Runtime/Derived.h
+++ b/flang/include/flang/Optimizer/Builder/Runtime/Derived.h
@@ -31,6 +31,11 @@ void genDerivedTypeInitialize(fir::FirOpBuilder &builder, mlir::Location loc,
 void genDerivedTypeDestroy(fir::FirOpBuilder &builder, mlir::Location loc,
                            mlir::Value box);
 
+/// Generate call to derived type finalization runtime routine
+/// to finalize \p box.
+void genDerivedTypeFinalize(fir::FirOpBuilder &builder, mlir::Location loc,
+                            mlir::Value box);
+
 /// Generate call to derived type destruction runtime routine to
 /// destroy \p box without finalization
 void genDerivedTypeDestroyWithoutFinalization(fir::FirOpBuilder &builder,
diff --git a/flang/include/flang/Optimizer/HLFIR/HLFIRDialect.h b/flang/include/flang/Optimizer/HLFIR/HLFIRDialect.h
index b76063fb7c53536..aa68d0811c4868a 100644
--- a/flang/include/flang/Optimizer/HLFIR/HLFIRDialect.h
+++ b/flang/include/flang/Optimizer/HLFIR/HLFIRDialect.h
@@ -78,6 +78,37 @@ inline bool isPolymorphicType(mlir::Type type) {
   return fir::isPolymorphicType(type);
 }
 
+/// Is this an SSA value type for the value of a Fortran procedure
+/// designator ?
+inline bool isFortranProcedureValue(mlir::Type type) {
+  return type.isa<fir::BoxProcType>() ||
+         (type.isa<mlir::TupleType>() &&
+          fir::isCharacterProcedureTuple(type, /*acceptRawFunc=*/false));
+}
+
+/// Is this an SSA value type for the value of a Fortran expression?
+inline bool isFortranValueType(mlir::Type type) {
+  return type.isa<hlfir::ExprType>() || fir::isa_trivial(type) ||
+         isFortranProcedureValue(type);
+}
+
+/// Is this the value of a Fortran expression in an SSA value form?
+inline bool isFortranValue(mlir::Value value) {
+  return isFortranValueType(value.getType());
+}
+
+/// Is this a Fortran variable?
+/// Note that by "variable", it must be understood that the mlir::Value is
+/// a memory value of a storage that can be reason about as a Fortran object
+/// (its bounds, shape, and type parameters, if any, are retrievable).
+/// This does not imply that the mlir::Value points to a variable from the
+/// original source or can be legally defined: temporaries created to store
+/// expression values are considered to be variables, and so are PARAMETERs
+/// global constant address.
+inline bool isFortranEntity(mlir::Value value) {
+  return isFortranValue(value) || isFortranVariableType(value.getType());
+}
+
 bool isFortranScalarNumericalType(mlir::Type);
 bool isFortranNumericalArrayObject(mlir::Type);
 bool isFortranNumericalOrLogicalArrayObject(mlir::Type);
@@ -94,6 +125,13 @@ bool isPolymorphicObject(mlir::Type);
 mlir::Value genExprShape(mlir::OpBuilder &builder, const mlir::Location &loc,
                          const hlfir::ExprType &expr);
 
+/// Return true iff `ty` may have allocatable component.
+/// TODO: this actually belongs to FIRType.cpp, but the method's implementation
+/// depends on HLFIRDialect component. FIRType.cpp itself is part of FIRDialect
+/// that cannot depend on HLFIRBuilder (there will be a cyclic dependency).
+/// This has to be cleaned up, when HLFIR is the default.
+bool mayHaveAllocatableComponent(mlir::Type ty);
+
 } // namespace hlfir
 
 #endif // FORTRAN_OPTIMIZER_HLFIR_HLFIRDIALECT_H
diff --git a/flang/include/flang/Optimizer/HLFIR/HLFIROps.td b/flang/include/flang/Optimizer/HLFIR/HLFIROps.td
index 1f584d6afd8fbcb..9fbf09331c099c0 100644
--- a/flang/include/flang/Optimizer/HLFIR/HLFIROps.td
+++ b/flang/include/flang/Optimizer/HLFIR/HLFIROps.td
@@ -705,6 +705,8 @@ def hlfir_EndAssociateOp : hlfir_Op<"end_associate", [MemoryEffects<[MemFree]>]>
 
   let description = [{
     Mark the end of life of a variable associated to an expression.
+    If the expression has a derived type that may contain allocatable
+    components, the variable operand must be a Fortran entity.
   }];
 
   let arguments = (ins AnyRefOrBoxLike:$var,
@@ -715,6 +717,7 @@ def hlfir_EndAssociateOp : hlfir_Op<"end_associate", [MemoryEffects<[MemFree]>]>
   }];
 
   let builders = [OpBuilder<(ins "hlfir::AssociateOp":$associate)>];
+  let hasVerifier = 1;
 }
 
 def hlfir_AsExprOp : hlfir_Op<"as_expr",
@@ -981,6 +984,11 @@ def hlfir_DestroyOp : hlfir_Op<"destroy", [MemoryEffects<[MemFree]>]> {
     Mark the last use of an hlfir.expr. This will be the point at which the
     buffer of an hlfir.expr, if any, will be deallocated if it was heap
     allocated.
+    If "finalize" attribute is set, the hlfir.expr value will be finalized
+    before the deallocation. Note that this implies that the hlfir.expr
+    is placed into a memory buffer, so that the library runtime
+    can be called on it. The element type of the hlfir.expr must be
+    derived type in this case.
     It is not required to create an hlfir.destroy operation for and hlfir.expr
     created inside an hlfir.elemental and returned in the hlfir.yield_element.
     The last use of such expression is implicit and an hlfir.destroy could
@@ -995,9 +1003,22 @@ def hlfir_DestroyOp : hlfir_Op<"destroy", [MemoryEffects<[MemFree]>]> {
     in bufferization instead.
   }];
 
-  let arguments = (ins hlfir_ExprType:$expr);
+  let arguments = (ins
+    hlfir_ExprType:$expr,
+    UnitAttr:$finalize
+  );
+
+  let assemblyFormat = [{
+    $expr (`finalize` $finalize^)? attr-dict `:` qualified(type($expr))
+  }];
+
+  let extraClassDeclaration = [{
+    bool mustFinalizeExpr() {
+      return getFinalize();
+    }
+  }];
 
-  let assemblyFormat = "$expr attr-dict `:` qualified(type($expr))";
+  let hasVerifier = 1;
 }
 
 def hlfir_CopyInOp : hlfir_Op<"copy_in", [MemoryEffects<[MemAlloc]>]> {
diff --git a/flang/include/flang/Runtime/derived-api.h b/flang/include/flang/Runtime/derived-api.h
index 3bf631bc4d0c763..decba9f686d9201 100644
--- a/flang/include/flang/Runtime/derived-api.h
+++ b/flang/include/flang/Runtime/derived-api.h
@@ -37,6 +37,10 @@ void RTNAME(Initialize)(
 // storage.
 void RTNAME(Destroy)(const Descriptor &);
 
+// Finalizes the object and its components.
+void RTNAME(Finalize)(
+    const Descriptor &, const char *sourceFile = nullptr, int sourceLine = 0);
+
 /// Deallocates any allocatable/automatic components.
 /// Does not deallocate the descriptor's storage.
 /// Does not perform any finalization.
diff --git a/flang/lib/Lower/ConvertCall.cpp b/flang/lib/Lower/ConvertCall.cpp
index 59d059e27cf1aae..788aac2d583f93d 100644
--- a/flang/lib/Lower/ConvertCall.cpp
+++ b/flang/lib/Lower/ConvertCall.cpp
@@ -148,7 +148,7 @@ fir::ExtendedValue Fortran::lower::genCallOpAndResult(
     mlir::Location loc, Fortran::lower::AbstractConverter &converter,
     Fortran::lower::SymMap &symMap, Fortran::lower::StatementContext &stmtCtx,
     Fortran::lower::CallerInterface &caller, mlir::FunctionType callSiteType,
-    std::optional<mlir::Type> resultType) {
+    std::optional<mlir::Type> resultType, bool isElemental) {
   fir::FirOpBuilder &builder = converter.getFirOpBuilder();
   using PassBy = Fortran::lower::CallerInterface::PassEntityBy;
   // Handle cases where caller must allocate the result or a fir.box for it.
@@ -435,7 +435,13 @@ fir::ExtendedValue Fortran::lower::genCallOpAndResult(
     std::optional<Fortran::evaluate::DynamicType> retTy =
         caller.getCallDescription().proc().GetType();
     bool cleanupWithDestroy = false;
-    if (!fir::isPointerType(funcType.getResults()[0]) && retTy &&
+    // With HLFIR lowering, isElemental must be set to true
+    // if we are producing an elemental call. In this case,
+    // the elemental results must not be destroyed, instead,
+    // the resulting array result will be finalized/destroyed
+    // as needed by hlfir.destroy.
+    if (!isElemental && !fir::isPointerType(funcType.getResults()[0]) &&
+        retTy &&
         (retTy->category() == Fortran::common::TypeCategory::Derived ||
          retTy->IsPolymorphic() || retTy->IsUnlimitedPolymorphic())) {
       if (retTy->IsPolymorphic() || retTy->IsUnlimitedPolymorphic()) {
@@ -692,6 +698,14 @@ struct PreparedDummyArgument {
     cleanups.emplace_back(
         CallCleanUp{CallCleanUp::ExprAssociate{tempVar, wasCopied}});
   }
+  void pushExprAssociateCleanUp(hlfir::AssociateOp associate) {
+    mlir::Value hlfirBase = associate.getBase();
+    mlir::Value firBase = associate.getFirBase();
+    cleanups.emplace_back(CallCleanUp{CallCleanUp::ExprAssociate{
+        hlfir::mayHaveAllocatableComponent(hlfirBase.getType()) ? hlfirBase
+                                                                : firBase,
+        associate.getMustFreeStrorageFlag()}});
+  }
 
   mlir::Value dummy;
   // NOTE: the clean-ups are executed in reverse order.
@@ -896,8 +910,7 @@ static PreparedDummyArgument preparePresentUserCallActualArgument(
           loc, builder, hlfir::Entity{copy}, storageType, "adapt.valuebyref");
       entity = hlfir::Entity{associate.getBase()};
       // Register the temporary destruction after the call.
-      preparedDummy.pushExprAssociateCleanUp(
-          associate.getFirBase(), associate.getMustFreeStrorageFlag());
+      preparedDummy.pushExprAssociateCleanUp(associate);
     } else if (mustDoCopyInOut) {
       // Copy-in non contiguous variables.
       assert(entity.getType().isa<fir::BaseBoxType>() &&
@@ -924,8 +937,7 @@ static PreparedDummyArgument preparePresentUserCallActualArgument(
     hlfir::AssociateOp associate = hlfir::genAssociateExpr(
         loc, builder, entity, storageType, "adapt.valuebyref");
     entity = hlfir::Entity{associate.getBase()};
-    preparedDummy.pushExprAssociateCleanUp(associate.getFirBase(),
-                                           associate.getMustFreeStrorageFlag());
+    preparedDummy.pushExprAssociateCleanUp(associate);
     if (mustSetDynamicTypeToDummyType) {
       // Rebox the actual argument to the dummy argument's type, and make
       // sure that we pass a contiguous entity (i.e. make copy-in,
@@ -1201,7 +1213,8 @@ genUserCall(Fortran::lower::PreparedActualArguments &loweredActuals,
   // arguments.
   fir::ExtendedValue result = Fortran::lower::genCallOpAndResult(
       loc, callContext.converter, callContext.symMap, callContext.stmtCtx,
-      caller, callSiteType, callContext.resultType);
+      caller, callSiteType, callContext.resultType,
+      callContext.isElementalProcWithArrayArgs());
 
   /// Clean-up associations and copy-in.
   for (auto cleanUp : callCleanUps)
@@ -1687,9 +1700,14 @@ class ElementalCallBuilder {
     mlir::Value elemental =
         hlfir::genElementalOp(loc, builder, elementType, shape, typeParams,
                               genKernel, !mustBeOrdered, polymorphicMold);
+    // If the function result requires finalization, then it has to be done
+    // for the array result of the elemental call. We have to communicate
+    // this via the DestroyOp's attribute.
+    bool mustFinalizeExpr = impl().resultMayRequireFinalization(callContext);
     fir::FirOpBuilder *bldr = &builder;
-    callContext.stmtCtx.attachCleanup(
-        [=]() { bldr->create<hlfir::DestroyOp>(loc, elemental); });
+    callContext.stmtCtx.attachCleanup([=]() {
+      bldr->create<hlfir::DestroyOp>(loc, elemental, mustFinalizeExpr);
+    });
     return hlfir::EntityWithAttributes{elemental};
   }
 
@@ -1743,6 +1761,26 @@ class ElementalUserCallBuilder
     return {};
   }
 
+  bool resultMayRequireFinalization(CallContext &callContext) const {
+    std::optional<Fortran::evaluate::DynamicType> retTy =
+        caller.getCallDescription().proc().GetType();
+    if (!retTy)
+      return false;
+
+    if (retTy->IsPolymorphic() || retTy->IsUnlimitedPolymorphic())
+      fir::emitFatalError(
+          callContext.loc,
+          "elemental function call with [unlimited-]polymorphic result");
+
+    if (retTy->category() == Fortran::common::TypeCategory::Derived) {
+      const Fortran::semantics::DerivedTypeSpec &typeSpec =
+          retTy->GetDerivedTypeSpec();
+      return Fortran::semantics::IsFinalizable(typeSpec);
+    }
+
+    return false;
+  }
+
 private:
   Fortran::lower::CallerInterface &caller;
   mlir::FunctionType callSiteType;
@@ -1804,6 +1842,14 @@ class ElementalIntrinsicCallBuilder
     return {};
   }
 
+  bool resultMayRequireFinalization(
+      [[maybe_unused]] CallContext &callContext) const {
+    // FIXME: need access to the CallerInterface's return type
+    // to check if the result may need finalization (e.g. the result
+    // of MERGE).
+    return false;
+  }
+
 private:
   const Fortran::evaluate::SpecificIntrinsic *intrinsic;
   const fir::IntrinsicArgumentLoweringRules *argLowering;
diff --git a/flang/lib/Lower/HlfirIntrinsics.cpp b/flang/lib/Lower/HlfirIntrinsics.cpp
index 62f63e376c815df..20e570044e8d4ae 100644
--- a/flang/lib/Lower/HlfirIntrinsics.cpp
+++ b/flang/lib/Lower/HlfirIntrinsics.cpp
@@ -330,6 +330,9 @@ std::optional<hlfir::EntityWithAttributes> Fortran::lower::lowerHlfirIntrinsic(
     const Fortran::lower::PreparedActualArguments &loweredActuals,
     const fir::IntrinsicArgumentLoweringRules *argLowering,
     mlir::Type stmtResultType) {
+  // If the result is of a derived type that may need finalization,
+  // we have to use DestroyOp with 'finalize' attribute for the result
+  // of the intrinsic operation.
   if (name == "sum")
     return HlfirSumLowering{builder, loc}.lower(loweredActuals, argLowering,
                                                 stmtResultType);
@@ -348,6 +351,7 @@ std::optional<hlfir::EntityWithAttributes> Fortran::lower::lowerHlfirIntrinsic(
   if (name == "dot_product")
     return HlfirDotProductLowering{builder, loc}.lower(
         loweredActuals, argLowering, stmtResultType);
+  // FIXME: the result may need finalization.
   if (name == "transpose")
     return HlfirTransposeLowering{builder, loc}.lower(
         loweredActuals, argLowering, stmtResultType);
diff --git a/flang/lib/Optimizer/Builder/HLFIRTools.cpp b/flang/lib/Optimizer/Builder/HLFIRTools.cpp
index dd62aa0e370122f..7034d6e893e7e9a 100644
--- a/flang/lib/Optimizer/Builder/HLFIRTools.cpp
+++ b/flang/lib/Optimizer/Builder/HLFIRTools.cpp
@@ -1021,3 +1021,12 @@ hlfir::cloneToElementalOp(mlir::Location loc, fir::FirOpBuilder &builder,
                                elementalAddrOp.getShape(), typeParams,
                                genKernel, !elementalAddrOp.isOrdered());
 }
+
+bool hlfir::elementalOpMustProduceTemp(hlfir::ElementalOp elemental) {
+  for (mlir::Operation *useOp : elemental->getUsers())
+    if (auto destroy = mlir::dyn_cast<hlfir::DestroyOp>(useOp))
+      if (destroy.mustFinalizeExpr())
+        return true;
+
+  return false;
+}
diff --git a/flang/lib/Optimizer/Builder/Runtime/Derived.cpp b/flang/lib/Optimizer/Builder/Runtime/Derived.cpp
index 8975656ffb38026..fe7e2d157ad9a6b 100644
--- a/flang/lib/Optimizer/Builder/Runtime/Derived.cpp
+++ b/flang/lib/Optimizer/Builder/Runtime/Derived.cpp
@@ -37,6 +37,18 @@ void fir::runtime::genDerivedTypeDestroy(fir::FirOpBuilder &builder,
   builder.create<fir::CallOp>(loc, func, args);
 }
 
+void fir::runtime::genDerivedTypeFinalize(fir::FirOpBuilder &builder,
+                                          mlir::Location loc, mlir::Value box) {
+  auto func = fir::runtime::getRuntimeFunc<mkRTKey(Finalize)>(loc, builder);
+  auto fTy = func.getFunctionType();
+  auto sourceFile = fir::factory::locationToFilename(builder, loc);
+  auto sourceLine =
+      fir::factory::locationToLineNo(builder, loc, fTy.getInput(2));
+  auto args = fir::runtime::createArguments(builder, loc, fTy, box, sourceFile,
+                                            sourceLine);
+  builder.create<fir::CallOp>(loc, func, args);
+}
+
 void fir::runtime::genDerivedTypeDestroyWithoutFinalization(
     fir::FirOpBuilder &builder, mlir::Location loc, mlir::Value box) {
   auto func = fir::runtime::getRuntimeFunc<mkRTKey(DestroyWithoutFinalization)>(
diff --git a/flang/lib/Optimizer/HLFIR/IR/HLFIRDialect.cpp b/flang/lib/Optimizer/HLFIR/IR/HLFIRDialect.cpp
index 7ca6108a31acbb6..d3a6fb305c19919 100644
--- a/flang/lib/Optimizer/HLFIR/IR/HLFIRDialect.cpp
+++ b/flang/lib/Optimizer/HLFIR/IR/HLFIRDialect.cpp
@@ -207,3 +207,8 @@ mlir::Value hlfir::genExprShape(mlir::OpBuilder &builder,
   fir::ShapeOp shape = builder.create<fir::ShapeOp>(loc, shapeTy, extents);
   return shape.getResult();
 }
+
+bool hlfir::mayHaveAllocatableComponent(mlir::Type ty) {
+  return fir::isPolymorphicType(ty) || fir::isUnlimitedPolymorphicType(ty) ||
+         fir::isRecordWithAllocatableMember(hlfir::getFortranElementType(ty));
+}
diff --git a/flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp b/flan...
[truncated]

@@ -772,6 +852,12 @@ struct ElementalOpConversion
// Assign the element value to the temp element for this iteration.
auto tempElement =
hlfir::getElementAt(loc, builder, temp, loopNest.oneBasedIndices);
// FIXME: if the elemental result is a function result temporary
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will do this in a separate PR, because this one already touches a few pieces.

Copy link
Contributor

@jeanPerier jeanPerier left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clean-up on this non trivial problem Slava!

builder.create<fir::FreeMemOp>(loc, var);

if (mustFinalize)
fir::runtime::genDerivedTypeFinalize(builder, loc, var);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Destroy may actually be a better runtime call here since automatic component should be deallocated if any. Currently, this does not make a difference since "automatic component" requires support for derived type with length parameter in lowering.
In general, the only place where we would want to use Finalize instead of Destroy is when the entity will be used again after finalization (I only see assignment LHS, and INTENT(OUT) dummies that fit this pattern), otherwise we will have memory leaks.
Now, we do not support length PDTs, so I am also fine with calling Finalize here and wait until we see how lengths PDTs are implemented.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the automatic components will be deallocated by DestroyWithoutFinalization call below. Of course, hlfir::mayHaveAllocatableComponent have to return true, if there is either allocatable or automatic component, which might be not the case right now.

We may also call Destroy if both mustFinalize and deallocComponent are true.

Comment on lines +449 to +451
if (auto destroy = mlir::dyn_cast<hlfir::DestroyOp>(useOp)) {
if (destroy.mustFinalizeExpr())
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While semantically correct, it is a bit annoying to introduce a copy instead of running the finalizers at the hlfir.end_associate.

Moving the finalization at the end associate needs a bit of thinking though since it is not the same if the associated buffer was modified (I think correct Fortran cannot modified a variable that we create with hlfir.associate, except maybe for VALUE dummies...), so I am OK with doing the copy right now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree! I think we will need finalize attribute for hlfir.end_associate to communicate the "theft" of the buffer that needs finalization.

I also agree that we have to be careful about stealing the buffer from the elemental op, when the buffer might be modified inside the association scope, e.g.:

module types
  type t
     integer :: i
   contains
     final :: finalize
  end type t
contains
  subroutine finalize(x)
    type(t), intent(inout) :: x
    print *, x%i
  end subroutine finalize
end module types
subroutine test(x)
  use types
  type(t) :: x(10)
  call callee(elem(x))
contains
  subroutine callee(x)
    type(t), value :: x(10)
    x%i = 10
  end subroutine callee
  impure elemental function elem(x)
    type(t), intent(in) :: x
    type(t) :: elem
    elem = x
  end function elem
end subroutine test

Since the callee might modify the buffer, the modification can be made visible via an impure finalization routine. I guess it may only be the case for impure elemental calls, so we can probably rely on the unordered attribute to decide if it is safe to steal the buffer.

Copy link
Contributor

@tblah tblah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp Show resolved Hide resolved
Copy link
Contributor Author

@vzakhari vzakhari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you both for the review!

I am going to merge this without further modifications, and will follow-up in additional PR(s).

flang/lib/Optimizer/HLFIR/IR/HLFIROps.cpp Show resolved Hide resolved
Comment on lines +449 to +451
if (auto destroy = mlir::dyn_cast<hlfir::DestroyOp>(useOp)) {
if (destroy.mustFinalizeExpr())
return false;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree! I think we will need finalize attribute for hlfir.end_associate to communicate the "theft" of the buffer that needs finalization.

I also agree that we have to be careful about stealing the buffer from the elemental op, when the buffer might be modified inside the association scope, e.g.:

module types
  type t
     integer :: i
   contains
     final :: finalize
  end type t
contains
  subroutine finalize(x)
    type(t), intent(inout) :: x
    print *, x%i
  end subroutine finalize
end module types
subroutine test(x)
  use types
  type(t) :: x(10)
  call callee(elem(x))
contains
  subroutine callee(x)
    type(t), value :: x(10)
    x%i = 10
  end subroutine callee
  impure elemental function elem(x)
    type(t), intent(in) :: x
    type(t) :: elem
    elem = x
  end function elem
end subroutine test

Since the callee might modify the buffer, the modification can be made visible via an impure finalization routine. I guess it may only be the case for impure elemental calls, so we can probably rely on the unordered attribute to decide if it is safe to steal the buffer.

builder.create<fir::FreeMemOp>(loc, var);

if (mustFinalize)
fir::runtime::genDerivedTypeFinalize(builder, loc, var);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the automatic components will be deallocated by DestroyWithoutFinalization call below. Of course, hlfir::mayHaveAllocatableComponent have to return true, if there is either allocatable or automatic component, which might be not the case right now.

We may also call Destroy if both mustFinalize and deallocComponent are true.

@vzakhari vzakhari merged commit ab1db26 into llvm:main Sep 22, 2023
5 checks passed
Guzhu-AMD pushed a commit to GPUOpen-Drivers/llvm-project that referenced this pull request Sep 28, 2023
Local branch amd-gfx 957cf8f Merged main:3510552df673 into amd-gfx:d975d0fd1040
Remote branch main ab1db26 [flang][hlfir] Fixed some finalization/deallocation issues. (llvm#67047)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flang:fir-hlfir flang:runtime flang Flang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants