[flang] Added LoopInvariantCodeMotion pass for [HL]FIR. #173438

vzakhari · 2025-12-24T01:09:05Z

The new pass allows hoisting some fir.load operations early
in MLIR. For example, many descriptor load might be hoisted
out of the loops, though it does not make much difference
in performance, because LLVM is able to optimize such loads
(which are lowered as llvm.memcpy into temporary descriptors),
given that proper TBAA information is generated by Flang.

Further hoisting improvements are possible in [HL]FIR LICM,
e.g. getting proper mod-ref results for Fortran runtime calls
may allow hoisting loads from global variables, which LLVM
cannot do due to lack of alias information.

This patch also contains improvements for FIR mod-ref analysis:
We may recurse into HasRecursiveMemoryEffects operations and
use getModRef recursively to get more precise results for
regions with fir.call operations.

This patch also modifies AliasAnalysis to set the instantiation
point for cases where the tracked data is accessed through a load
from !fir.ref<!fir.box<>>: without this change the mod-ref
analysis was not able to recognize user pointer/allocatable variables.

We may recurse into `HasRecursiveMemoryEffects` operations and use `getModRef` recursively to get more precise results for regions with `fir.call` operations. This patch also modifies `AliasAnalysis` to set the instantiation point for cases where the tracked data is accessed through a load from `!fir.ref<!fir.box<>>`: without this change the mod-ref analysis was not able to recognize user pointer/allocatable variables.

The new pass allows hoisting some `fir.load` operations early in MLIR. For example, many descriptor load might be hoisted out of the loops, though it does not make much difference in performance, because LLVM is able to optimize such loads (which are lowered as `llvm.memcpy` into temporary descriptors), given that proper TBAA information is generated by Flang. Further hoisting improvements are possible in [HL]FIR LICM, e.g. getting proper mod-ref results for Fortran runtime calls may allow hoisting loads from global variables, which LLVM cannot do due to lack of alias information.

llvmbot · 2025-12-24T01:09:33Z

@llvm/pr-subscribers-flang-driver

Author: Slava Zakharin (vzakhari)

Changes

The new pass allows hoisting some fir.load operations early
in MLIR. For example, many descriptor load might be hoisted
out of the loops, though it does not make much difference
in performance, because LLVM is able to optimize such loads
(which are lowered as llvm.memcpy into temporary descriptors),
given that proper TBAA information is generated by Flang.

Further hoisting improvements are possible in [HL]FIR LICM,
e.g. getting proper mod-ref results for Fortran runtime calls
may allow hoisting loads from global variables, which LLVM
cannot do due to lack of alias information.

This patch also contains improvements for FIR mod-ref analysis:
We may recurse into HasRecursiveMemoryEffects operations and
use getModRef recursively to get more precise results for
regions with fir.call operations.

This patch also modifies AliasAnalysis to set the instantiation
point for cases where the tracked data is accessed through a load
from !fir.ref<!fir.box<>>: without this change the mod-ref
analysis was not able to recognize user pointer/allocatable variables.

Patch is 198.99 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/173438.diff

16 Files Affected:

(modified) flang/include/flang/Optimizer/Transforms/Passes.td (+11)
(modified) flang/lib/Optimizer/Analysis/AliasAnalysis.cpp (+34-24)
(modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+4)
(modified) flang/lib/Optimizer/Transforms/CMakeLists.txt (+28-25)
(added) flang/lib/Optimizer/Transforms/LoopInvariantCodeMotion.cpp (+233)
(modified) flang/test/Analysis/AliasAnalysis/gen_mod_ref_test.py (+10)
(modified) flang/test/Analysis/AliasAnalysis/modref-call-dummies.f90 (+59-2)
(modified) flang/test/Analysis/AliasAnalysis/modref-call-globals.f90 (+22)
(added) flang/test/Analysis/AliasAnalysis/modref-call-recursive.fir (+62)
(modified) flang/test/Driver/bbc-mlir-pass-pipeline.f90 (+3)
(modified) flang/test/Driver/mlir-pass-pipeline.f90 (+3)
(modified) flang/test/Fir/basic-program.fir (+3)
(modified) flang/test/Lower/array-expression-assumed-size.f90 (+164-159)
(modified) flang/test/Lower/array-temp.f90 (+262-250)
(modified) flang/test/Lower/vector-subscript-io.f90 (+5-5)
(added) flang/test/Transforms/licm.fir (+1326)

diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index f50202784e2dc..b6b78ec7b21b0 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -596,4 +596,15 @@ def MIFOpConversion : Pass<"mif-convert", "mlir::ModuleOp"> {
   let dependentDialects = ["fir::FIROpsDialect", "mlir::LLVM::LLVMDialect"];
 }
 
+def LoopInvariantCodeMotion : Pass<"flang-licm", "::mlir::func::FuncOp"> {
+  let summary = "Hoist invariants from loops";
+  let description = [{
+    Hoist invariants from loops. This is a FIR-specific version of loop
+    invariant code motion, which relies on FIR types, operations (such as
+    fir.declare) and interfaces such as FortranObjectViewOpInterface.
+    The pass only moves existing operations, so there are no dependent
+    dialects.
+  }];
+}
+
 #endif // FLANG_OPTIMIZER_TRANSFORMS_PASSES
diff --git a/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp b/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp
index d2656fa7b8ea8..0dc9dda3aa0f5 100644
--- a/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp
+++ b/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp
@@ -468,7 +468,8 @@ static ModRefResult getCallModRef(fir::CallOp call, mlir::Value var) {
   // TODO: limit to Fortran functions??
   // 1. Detect variables that can be accessed indirectly.
   fir::AliasAnalysis aliasAnalysis;
-  fir::AliasAnalysis::Source varSrc = aliasAnalysis.getSource(var);
+  fir::AliasAnalysis::Source varSrc =
+      aliasAnalysis.getSource(var, /*getLastInstantiationPoint=*/true);
   // If the variable is not a user variable, we cannot safely assume that
   // Fortran semantics apply (e.g., a bare alloca/allocmem result may very well
   // be placed in an allocatable/pointer descriptor and escape).
@@ -498,6 +499,7 @@ static ModRefResult getCallModRef(fir::CallOp call, mlir::Value var) {
   // At that stage, it has been ruled out that local (including the saved ones)
   // and dummy cannot be indirectly accessed in the call.
   if (varSrc.kind != fir::AliasAnalysis::SourceKind::Allocate &&
+      varSrc.kind != fir::AliasAnalysis::SourceKind::Argument &&
       !varSrc.isDummyArgument()) {
     if (varSrc.kind != fir::AliasAnalysis::SourceKind::Global ||
         !isSavedLocal(varSrc))
@@ -523,19 +525,36 @@ static ModRefResult getCallModRef(fir::CallOp call, mlir::Value var) {
 /// flow analysis to come 2) Allocate and Free effects are considered
 /// modifying
 ModRefResult AliasAnalysis::getModRef(Operation *op, Value location) {
-  MemoryEffectOpInterface interface = dyn_cast<MemoryEffectOpInterface>(op);
-  if (!interface) {
-    if (auto call = llvm::dyn_cast<fir::CallOp>(op))
-      return getCallModRef(call, location);
-    return ModRefResult::getModAndRef();
-  }
+  if (auto call = llvm::dyn_cast<fir::CallOp>(op))
+    return getCallModRef(call, location);
 
   // Build a ModRefResult by merging the behavior of the effects of this
   // operation.
+  ModRefResult result = ModRefResult::getNoModRef();
+  MemoryEffectOpInterface interface = dyn_cast<MemoryEffectOpInterface>(op);
+  if (op->hasTrait<mlir::OpTrait::HasRecursiveMemoryEffects>()) {
+    for (mlir::Region &region : op->getRegions()) {
+      result = result.merge(getModRef(region, location));
+      if (result.isModAndRef())
+        break;
+    }
+
+    // In MLIR, RecursiveMemoryEffects can be combined with
+    // MemoryEffectOpInterface to describe extra effects on top of the
+    // effects of the nested operations.  However, the presence of
+    // RecursiveMemoryEffects and the absence of MemoryEffectOpInterface
+    // implies the operation has no other memory effects than the one of its
+    // nested operations.
+    if (!interface)
+      return result;
+  }
+
+  if (!interface || result.isModAndRef())
+    return ModRefResult::getModAndRef();
+
   SmallVector<MemoryEffects::EffectInstance> effects;
   interface.getEffects(effects);
 
-  ModRefResult result = ModRefResult::getNoModRef();
   for (const MemoryEffects::EffectInstance &effect : effects) {
 
     // Check for an alias between the effect and our memory location.
@@ -563,22 +582,6 @@ ModRefResult AliasAnalysis::getModRef(mlir::Region &region,
                                       mlir::Value location) {
   ModRefResult result = ModRefResult::getNoModRef();
   for (mlir::Operation &op : region.getOps()) {
-    if (op.hasTrait<mlir::OpTrait::HasRecursiveMemoryEffects>()) {
-      for (mlir::Region &subRegion : op.getRegions()) {
-        result = result.merge(getModRef(subRegion, location));
-        // Fast return is already mod and ref.
-        if (result.isModAndRef())
-          return result;
-      }
-      // In MLIR, RecursiveMemoryEffects can be combined with
-      // MemoryEffectOpInterface to describe extra effects on top of the
-      // effects of the nested operations.  However, the presence of
-      // RecursiveMemoryEffects and the absence of MemoryEffectOpInterface
-      // implies the operation has no other memory effects than the one of its
-      // nested operations.
-      if (!mlir::isa<mlir::MemoryEffectOpInterface>(op))
-        continue;
-    }
     result = result.merge(getModRef(&op, location));
     if (result.isModAndRef())
       return result;
@@ -674,6 +677,13 @@ AliasAnalysis::Source AliasAnalysis::getSource(mlir::Value v,
             isCapturedInInternalProcedure |=
                 boxSrc.isCapturedInInternalProcedure;
 
+            if (getLastInstantiationPoint) {
+              if (!instantiationPoint)
+                instantiationPoint = boxSrc.origin.instantiationPoint;
+            } else {
+              instantiationPoint = boxSrc.origin.instantiationPoint;
+            }
+
             global = llvm::dyn_cast<mlir::SymbolRefAttr>(boxSrc.origin.u);
             if (global) {
               type = SourceKind::Global;
diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp
index 103e736accca0..038ceb1d4bd75 100644
--- a/flang/lib/Optimizer/Passes/Pipelines.cpp
+++ b/flang/lib/Optimizer/Passes/Pipelines.cpp
@@ -206,6 +206,10 @@ void createDefaultFIROptimizerPassPipeline(mlir::PassManager &pm,
   pm.addPass(fir::createSimplifyRegionLite());
   pm.addPass(mlir::createCSEPass());
 
+  // Run LICM after CSE, which may reduce the number of operations to hoist.
+  if (pc.OptLevel.isOptimizingForSpeed())
+    pm.addPass(fir::createLoopInvariantCodeMotion());
+
   // Polymorphic types
   pm.addPass(fir::createPolymorphicOpConversion());
   pm.addPass(fir::createAssumedRankOpConversion());
diff --git a/flang/lib/Optimizer/Transforms/CMakeLists.txt b/flang/lib/Optimizer/Transforms/CMakeLists.txt
index 619f3adc67c85..fad6f34f478ba 100644
--- a/flang/lib/Optimizer/Transforms/CMakeLists.txt
+++ b/flang/lib/Optimizer/Transforms/CMakeLists.txt
@@ -1,43 +1,44 @@
 add_flang_library(FIRTransforms
   AbstractResult.cpp
   AddAliasTags.cpp
-  AffinePromotion.cpp
+  AddDebugInfo.cpp
   AffineDemotion.cpp
+  AffinePromotion.cpp
+  AlgebraicSimplification.cpp
   AnnotateConstant.cpp
+  ArrayValueCopy.cpp
   AssumedRankOpConversion.cpp
-  CharacterConversion.cpp
-  CompilerGeneratedNames.cpp
-  ConstantArgumentGlobalisation.cpp
-  ControlFlowConverter.cpp
   CUDA/CUFAllocationConversion.cpp
   CUFAddConstructor.cpp
+  CUFComputeSharedMemoryOffsetsAndSize.cpp
   CUFDeviceGlobal.cpp
-  CUFOpConversion.cpp
   CUFGPUToLLVMConversion.cpp
-  CUFComputeSharedMemoryOffsetsAndSize.cpp
-  ArrayValueCopy.cpp
+  CUFOpConversion.cpp
+  CharacterConversion.cpp
+  CompilerGeneratedNames.cpp
+  ConstantArgumentGlobalisation.cpp
+  ControlFlowConverter.cpp
+  ConvertComplexPow.cpp
+  DebugTypeGenerator.cpp
   ExternalNameConversion.cpp
   FIRToSCF.cpp
-  MemoryUtils.cpp
-  MemoryAllocation.cpp
-  StackArrays.cpp
+  FunctionAttr.cpp
+  GenRuntimeCallsForTest.cpp
+  LoopInvariantCodeMotion.cpp
+  LoopVersioning.cpp
+  MIFOpConversion.cpp
   MemRefDataFlowOpt.cpp
-  SimplifyRegionLite.cpp
-  AlgebraicSimplification.cpp
-  SimplifyIntrinsics.cpp
-  AddDebugInfo.cpp
+  MemoryAllocation.cpp
+  MemoryUtils.cpp
+  OptimizeArrayRepacking.cpp
   PolymorphicOpConversion.cpp
-  LoopVersioning.cpp
-  StackReclaim.cpp
-  VScaleAttr.cpp
-  FunctionAttr.cpp
-  DebugTypeGenerator.cpp
   SetRuntimeCallAttributes.cpp
-  GenRuntimeCallsForTest.cpp
   SimplifyFIROperations.cpp
-  OptimizeArrayRepacking.cpp
-  ConvertComplexPow.cpp
-  MIFOpConversion.cpp
+  SimplifyIntrinsics.cpp
+  SimplifyRegionLite.cpp
+  StackArrays.cpp
+  StackReclaim.cpp
+  VScaleAttr.cpp
 
   DEPENDS
   CUFAttrs
@@ -63,12 +64,14 @@ add_flang_library(FIRTransforms
 
   MLIR_LIBS
   MLIRAffineUtils
+  MLIRAnalysis
   MLIRFuncDialect
   MLIRGPUDialect
-  MLIRLLVMDialect
   MLIRLLVMCommonConversion
+  MLIRLLVMDialect
   MLIRMathTransforms
   MLIROpenACCDialect
   MLIROpenACCToLLVMIRTranslation
   MLIROpenMPDialect
+  MLIRTransformUtils
 )
diff --git a/flang/lib/Optimizer/Transforms/LoopInvariantCodeMotion.cpp b/flang/lib/Optimizer/Transforms/LoopInvariantCodeMotion.cpp
new file mode 100644
index 0000000000000..c033d5e278c8d
--- /dev/null
+++ b/flang/lib/Optimizer/Transforms/LoopInvariantCodeMotion.cpp
@@ -0,0 +1,233 @@
+//===- LoopInvariantCodeMotion.cpp ----------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+/// \file
+/// FIR-specific Loop Invariant Code Motion pass.
+/// The pass relies on FIR types and interfaces to prove the safety
+/// of hoisting invariant operations out of loop-like operations.
+/// It may be run on both HLFIR and FIR representations.
+//===----------------------------------------------------------------------===//
+
+#include "flang/Optimizer/Analysis/AliasAnalysis.h"
+#include "flang/Optimizer/Dialect/FortranVariableInterface.h"
+#include "flang/Optimizer/Transforms/Passes.h"
+#include "mlir/Pass/Pass.h"
+#include "mlir/Transforms/LoopInvariantCodeMotionUtils.h"
+#include "llvm/Support/DebugLog.h"
+
+namespace fir {
+#define GEN_PASS_DEF_LOOPINVARIANTCODEMOTION
+#include "flang/Optimizer/Transforms/Passes.h.inc"
+} // namespace fir
+
+#define DEBUG_TYPE "flang-licm"
+
+// Temporary engineering option for triaging LICM.
+static llvm::cl::opt<bool> disableFlangLICM(
+    "disable-flang-licm", llvm::cl::init(false), llvm::cl::Hidden,
+    llvm::cl::desc("Disable Flang's loop invariant code motion"));
+
+namespace {
+
+using namespace mlir;
+
+/// The pass tries to hoist loop invariant operations with only
+/// MemoryEffects::Read effects (MemoryEffects::Write support
+/// may be added later).
+/// The safety of hoisting is proven by:
+///   * Proving that the loop runs at least one iteration.
+///   * Proving that is is always safe to load from this location
+///     (see isSafeToHoistLoad() comments below).
+struct LoopInvariantCodeMotion
+    : fir::impl::LoopInvariantCodeMotionBase<LoopInvariantCodeMotion> {
+  void runOnOperation() override;
+};
+
+} // namespace
+
+/// 'location' is a memory reference used by a memory access.
+/// The type of 'location' defines the data type of the access
+/// (e.g. it is considered to be invalid to access 'i64'
+/// data using '!fir.ref<i32>`).
+/// For the given location, this function returns true iff
+/// the Fortran object being accessed is a scalar that
+/// may not be OPTIONAL.
+///
+/// Note that the '!fir.ref<!fir.box<>>' accesses are considered
+/// to be scalar, even if the underlying data is an array.
+///
+/// Note that an access of '!fir.ref<scalar>' may access
+/// an array object. For example:
+///   real :: x(:)
+///   do i=...
+///     = x(10)
+/// 'x(10)' accesses array 'x', and it may be unsafe to hoist
+/// it without proving that '10' is a valid index for the array.
+/// The fact that 'x' is not OPTIONAL does not allow hoisting
+/// on its own.
+static bool isNonOptionalScalar(Value location) {
+  while (true) {
+    LDBG() << "Checking location:\n" << location;
+    Type dataType = fir::unwrapRefType(location.getType());
+    if (!isa<fir::BaseBoxType>(location.getType()) &&
+        (!dataType ||
+         (!isa<fir::BaseBoxType>(dataType) && !fir::isa_trivial(dataType) &&
+          !fir::isa_derived(dataType)))) {
+      LDBG() << "Failure: data access is not scalar";
+      return false;
+    }
+    Operation *defOp = location.getDefiningOp();
+    if (!defOp) {
+      LDBG() << "Failure: no defining operation";
+      return false;
+    }
+    if (auto varIface = dyn_cast<fir::FortranVariableOpInterface>(defOp)) {
+      bool result = !varIface.isOptional();
+      if (result)
+        LDBG() << "Success: is non optional scalar";
+      else
+        LDBG() << "Failure: is not non optional scalar";
+      return result;
+    }
+    if (auto viewIface = dyn_cast<fir::FortranObjectViewOpInterface>(defOp)) {
+      location = viewIface.getViewSource(cast<OpResult>(location));
+    } else {
+      LDBG() << "Failure: unknown operation:\n" << *defOp;
+      return false;
+    }
+  }
+}
+
+/// Returns true iff it is safe to hoist the given load-like operation 'op',
+/// which access given memory 'locations', out of the operation 'loopLike'.
+/// The current safety conditions are:
+///   * The loop runs at least one iteration, OR
+///   * all the accessed locations are inside scalar non-OPTIONAL
+///     Fortran objects (Fortran descriptors are considered to be scalars).
+static bool isSafeToHoistLoad(Operation *op, ArrayRef<Value> locations,
+                              LoopLikeOpInterface loopLike,
+                              AliasAnalysis &aliasAnalysis) {
+  for (Value location : locations)
+    if (aliasAnalysis.getModRef(loopLike.getOperation(), location)
+            .isModAndRef()) {
+      LDBG() << "Failure: reads location:\n"
+             << location << "\nwhich is modified inside the loop";
+      return false;
+    }
+
+  // Check that it is safe to read from all the locations before the loop.
+  std::optional<llvm::APInt> tripCount = loopLike.getStaticTripCount();
+  if (tripCount && !tripCount->isZero()) {
+    // Loop executes at least one iteration, so it is safe to hoist.
+    LDBG() << "Success: loop has non-zero iterations";
+    return true;
+  }
+
+  // Check whether the access must always be valid.
+  return llvm::all_of(
+      locations, [&](Value location) { return isNonOptionalScalar(location); });
+  // TODO: consider hoisting under condition of the loop's trip count
+  // being non-zero.
+}
+
+/// Returns true iff the given 'op' is a load-like operation,
+/// and it can be hoisted out of 'loopLike' operation.
+static bool canHoistLoad(Operation *op, LoopLikeOpInterface loopLike,
+                         AliasAnalysis &aliasAnalysis) {
+  LDBG() << "Checking operation:\n" << *op;
+  if (auto effectInterface = dyn_cast<MemoryEffectOpInterface>(op)) {
+    SmallVector<MemoryEffects::EffectInstance> effects;
+    effectInterface.getEffects(effects);
+    if (effects.empty()) {
+      LDBG() << "Failure: not a load";
+      return false;
+    }
+    llvm::SetVector<Value> locations;
+    for (const MemoryEffects::EffectInstance &effect : effects) {
+      Value location = effect.getValue();
+      if (!isa<MemoryEffects::Read>(effect.getEffect())) {
+        LDBG() << "Failure: has unsupported effects";
+        return false;
+      } else if (!location) {
+        LDBG() << "Failure: reads from unknown location";
+        return false;
+      }
+      locations.insert(location);
+    }
+    return isSafeToHoistLoad(op, locations.getArrayRef(), loopLike,
+                             aliasAnalysis);
+  }
+  LDBG() << "Failure: has unknown effects";
+  return false;
+}
+
+void LoopInvariantCodeMotion::runOnOperation() {
+  if (disableFlangLICM) {
+    LDBG() << "Skipping [HL]FIR LoopInvariantCodeMotion()";
+    return;
+  }
+
+  LDBG() << "Enter [HL]FIR LoopInvariantCodeMotion()";
+
+  auto &aliasAnalysis = getAnalysis<AliasAnalysis>();
+  aliasAnalysis.addAnalysisImplementation(fir::AliasAnalysis{});
+
+  std::function<bool(Operation *, LoopLikeOpInterface loopLike)>
+      shouldMoveOutOfLoop = [&](Operation *op, LoopLikeOpInterface loopLike) {
+        if (isPure(op)) {
+          LDBG() << "Pure operation: " << *op;
+          return true;
+        }
+
+        // Handle RecursivelySpeculatable operations that have
+        // RecursiveMemoryEffects by checking if all their
+        // nested operations can be hoisted.
+        auto iface = dyn_cast<ConditionallySpeculatable>(op);
+        if (iface && iface.getSpeculatability() ==
+                         Speculation::RecursivelySpeculatable) {
+          if (op->hasTrait<OpTrait::HasRecursiveMemoryEffects>()) {
+            LDBG() << "Checking recursive operation:\n" << *op;
+            llvm::SmallVector<Operation *> nestedOps;
+            for (Region &region : op->getRegions())
+              for (Block &block : region)
+                for (Operation &nestedOp : block)
+                  nestedOps.push_back(&nestedOp);
+
+            bool result = llvm::all_of(nestedOps, [&](Operation *nestedOp) {
+              return shouldMoveOutOfLoop(nestedOp, loopLike);
+            });
+            LDBG() << "Recursive operation can" << (result ? "" : "not")
+                   << " be hoisted";
+
+            // If nested operations cannot be hoisted, there is nothing
+            // else to check. Also if the operation itself does not have
+            // any memory effects, we can return the result now.
+            // Otherwise, we have to check the operation itself below.
+            if (!result || !isa<MemoryEffectOpInterface>(op))
+              return result;
+          }
+        }
+        return canHoistLoad(op, loopLike, aliasAnalysis);
+      };
+
+  getOperation()->walk([&](LoopLikeOpInterface loopLike) {
+    moveLoopInvariantCode(
+        loopLike.getLoopRegions(),
+        /*isDefinedOutsideRegion=*/
+        [&](Value value, Region *) {
+          return loopLike.isDefinedOutsideOfLoop(value);
+        },
+        /*shouldMoveOutOfRegion=*/
+        [&](Operation *op, Region *) {
+          return shouldMoveOutOfLoop(op, loopLike);
+        },
+        /*moveOutOfRegion=*/
+        [&](Operation *op, Region *) { loopLike.moveOutOfLoop(op); });
+  });
+
+  LDBG() << "Exit [HL]FIR LoopInvariantCodeMotion()";
+}
diff --git a/flang/test/Analysis/AliasAnalysis/gen_mod_ref_test.py b/flang/test/Analysis/AliasAnalysis/gen_mod_ref_test.py
index ce7d9b1700bf7..953f1a2545781 100755
--- a/flang/test/Analysis/AliasAnalysis/gen_mod_ref_test.py
+++ b/flang/test/Analysis/AliasAnalysis/gen_mod_ref_test.py
@@ -7,11 +7,14 @@
  This will insert mod ref test hook:
    - to any fir.call to a function which name starts with "test_effect_"
    - to any hlfir.declare for variable which name starts with "test_var_"
+   - to any fir.box_addr - they are assigned box_addr_# hooks
 """
 
 import sys
 import re
 
+box_addr_counter=0
+
 for line in sys.stdin:
     line = re.sub(
         r"(fir.call @_\w*P)(test_effect_\w*)(\(.*) : ",
@@ -23,4 +26,11 @@
         r'\1\2", test.ptr ="\2"',
         line,
     )
+    line, count = re.subn(
+        r'(fir.box_addr.*) :',
+        rf'\1 {{test.ptr ="box_addr_{box_addr_counter}"}} :',
+        line,
+    )
+    if count > 0:
+        box_addr_counter += 1
     sys.stdout.write(line)
diff --git a/flang/test/Analysis/AliasAnalysis/modref-call-dummies.f90 b/flang/test/Analysis/AliasAnalysis/modref-call-dummies.f90
index a4c57cff70927..a81f266b87979 100644
--- a/flang/test/Analysis/AliasAnalysis/modref-call-dummies.f90
+++ b/flang/test/Analysis/AliasAnalysis/modref-call-dummies.f90
@@ -20,7 +20,7 @@ subroutine test_dummy(test_var_x)
   use somemod, only : may_capture
   implicit none
   real :: test_var_x
-  ! Capture is invalid after the call because test_var_xsaved does not have the
+  ! Capture is invalid after the call because test_var_x does not have the
   ! target attribute.
   call may_capture(test_var_x)
   call test_effect_external()
@@ -50,4 +50,61 @@ subroutine test_dummy_pointer(p)
   end associate
 end subroutine
 ! CHECK-LABEL: Testing : "_QPtest_dummy_pointer"
-! CHECK: test_effect_external -> test_var_p_target#0: ModRef
+! CHECK-DAG: test_effect_external ...
[truncated]

llvmbot · 2025-12-24T01:09:33Z

@llvm/pr-subscribers-flang-fir-hlfir

Author: Slava Zakharin (vzakhari)

Changes

The new pass allows hoisting some fir.load operations early
in MLIR. For example, many descriptor load might be hoisted
out of the loops, though it does not make much difference
in performance, because LLVM is able to optimize such loads
(which are lowered as llvm.memcpy into temporary descriptors),
given that proper TBAA information is generated by Flang.

Further hoisting improvements are possible in [HL]FIR LICM,
e.g. getting proper mod-ref results for Fortran runtime calls
may allow hoisting loads from global variables, which LLVM
cannot do due to lack of alias information.

This patch also contains improvements for FIR mod-ref analysis:
We may recurse into HasRecursiveMemoryEffects operations and
use getModRef recursively to get more precise results for
regions with fir.call operations.

This patch also modifies AliasAnalysis to set the instantiation
point for cases where the tracked data is accessed through a load
from !fir.ref<!fir.box<>>: without this change the mod-ref
analysis was not able to recognize user pointer/allocatable variables.

Patch is 198.99 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/173438.diff

16 Files Affected:

(modified) flang/include/flang/Optimizer/Transforms/Passes.td (+11)
(modified) flang/lib/Optimizer/Analysis/AliasAnalysis.cpp (+34-24)
(modified) flang/lib/Optimizer/Passes/Pipelines.cpp (+4)
(modified) flang/lib/Optimizer/Transforms/CMakeLists.txt (+28-25)
(added) flang/lib/Optimizer/Transforms/LoopInvariantCodeMotion.cpp (+233)
(modified) flang/test/Analysis/AliasAnalysis/gen_mod_ref_test.py (+10)
(modified) flang/test/Analysis/AliasAnalysis/modref-call-dummies.f90 (+59-2)
(modified) flang/test/Analysis/AliasAnalysis/modref-call-globals.f90 (+22)
(added) flang/test/Analysis/AliasAnalysis/modref-call-recursive.fir (+62)
(modified) flang/test/Driver/bbc-mlir-pass-pipeline.f90 (+3)
(modified) flang/test/Driver/mlir-pass-pipeline.f90 (+3)
(modified) flang/test/Fir/basic-program.fir (+3)
(modified) flang/test/Lower/array-expression-assumed-size.f90 (+164-159)
(modified) flang/test/Lower/array-temp.f90 (+262-250)
(modified) flang/test/Lower/vector-subscript-io.f90 (+5-5)
(added) flang/test/Transforms/licm.fir (+1326)

diff --git a/flang/include/flang/Optimizer/Transforms/Passes.td b/flang/include/flang/Optimizer/Transforms/Passes.td
index f50202784e2dc..b6b78ec7b21b0 100644
--- a/flang/include/flang/Optimizer/Transforms/Passes.td
+++ b/flang/include/flang/Optimizer/Transforms/Passes.td
@@ -596,4 +596,15 @@ def MIFOpConversion : Pass<"mif-convert", "mlir::ModuleOp"> {
   let dependentDialects = ["fir::FIROpsDialect", "mlir::LLVM::LLVMDialect"];
 }
 
+def LoopInvariantCodeMotion : Pass<"flang-licm", "::mlir::func::FuncOp"> {
+  let summary = "Hoist invariants from loops";
+  let description = [{
+    Hoist invariants from loops. This is a FIR-specific version of loop
+    invariant code motion, which relies on FIR types, operations (such as
+    fir.declare) and interfaces such as FortranObjectViewOpInterface.
+    The pass only moves existing operations, so there are no dependent
+    dialects.
+  }];
+}
+
 #endif // FLANG_OPTIMIZER_TRANSFORMS_PASSES
diff --git a/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp b/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp
index d2656fa7b8ea8..0dc9dda3aa0f5 100644
--- a/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp
+++ b/flang/lib/Optimizer/Analysis/AliasAnalysis.cpp
@@ -468,7 +468,8 @@ static ModRefResult getCallModRef(fir::CallOp call, mlir::Value var) {
   // TODO: limit to Fortran functions??
   // 1. Detect variables that can be accessed indirectly.
   fir::AliasAnalysis aliasAnalysis;
-  fir::AliasAnalysis::Source varSrc = aliasAnalysis.getSource(var);
+  fir::AliasAnalysis::Source varSrc =
+      aliasAnalysis.getSource(var, /*getLastInstantiationPoint=*/true);
   // If the variable is not a user variable, we cannot safely assume that
   // Fortran semantics apply (e.g., a bare alloca/allocmem result may very well
   // be placed in an allocatable/pointer descriptor and escape).
@@ -498,6 +499,7 @@ static ModRefResult getCallModRef(fir::CallOp call, mlir::Value var) {
   // At that stage, it has been ruled out that local (including the saved ones)
   // and dummy cannot be indirectly accessed in the call.
   if (varSrc.kind != fir::AliasAnalysis::SourceKind::Allocate &&
+      varSrc.kind != fir::AliasAnalysis::SourceKind::Argument &&
       !varSrc.isDummyArgument()) {
     if (varSrc.kind != fir::AliasAnalysis::SourceKind::Global ||
         !isSavedLocal(varSrc))
@@ -523,19 +525,36 @@ static ModRefResult getCallModRef(fir::CallOp call, mlir::Value var) {
 /// flow analysis to come 2) Allocate and Free effects are considered
 /// modifying
 ModRefResult AliasAnalysis::getModRef(Operation *op, Value location) {
-  MemoryEffectOpInterface interface = dyn_cast<MemoryEffectOpInterface>(op);
-  if (!interface) {
-    if (auto call = llvm::dyn_cast<fir::CallOp>(op))
-      return getCallModRef(call, location);
-    return ModRefResult::getModAndRef();
-  }
+  if (auto call = llvm::dyn_cast<fir::CallOp>(op))
+    return getCallModRef(call, location);
 
   // Build a ModRefResult by merging the behavior of the effects of this
   // operation.
+  ModRefResult result = ModRefResult::getNoModRef();
+  MemoryEffectOpInterface interface = dyn_cast<MemoryEffectOpInterface>(op);
+  if (op->hasTrait<mlir::OpTrait::HasRecursiveMemoryEffects>()) {
+    for (mlir::Region &region : op->getRegions()) {
+      result = result.merge(getModRef(region, location));
+      if (result.isModAndRef())
+        break;
+    }
+
+    // In MLIR, RecursiveMemoryEffects can be combined with
+    // MemoryEffectOpInterface to describe extra effects on top of the
+    // effects of the nested operations.  However, the presence of
+    // RecursiveMemoryEffects and the absence of MemoryEffectOpInterface
+    // implies the operation has no other memory effects than the one of its
+    // nested operations.
+    if (!interface)
+      return result;
+  }
+
+  if (!interface || result.isModAndRef())
+    return ModRefResult::getModAndRef();
+
   SmallVector<MemoryEffects::EffectInstance> effects;
   interface.getEffects(effects);
 
-  ModRefResult result = ModRefResult::getNoModRef();
   for (const MemoryEffects::EffectInstance &effect : effects) {
 
     // Check for an alias between the effect and our memory location.
@@ -563,22 +582,6 @@ ModRefResult AliasAnalysis::getModRef(mlir::Region &region,
                                       mlir::Value location) {
   ModRefResult result = ModRefResult::getNoModRef();
   for (mlir::Operation &op : region.getOps()) {
-    if (op.hasTrait<mlir::OpTrait::HasRecursiveMemoryEffects>()) {
-      for (mlir::Region &subRegion : op.getRegions()) {
-        result = result.merge(getModRef(subRegion, location));
-        // Fast return is already mod and ref.
-        if (result.isModAndRef())
-          return result;
-      }
-      // In MLIR, RecursiveMemoryEffects can be combined with
-      // MemoryEffectOpInterface to describe extra effects on top of the
-      // effects of the nested operations.  However, the presence of
-      // RecursiveMemoryEffects and the absence of MemoryEffectOpInterface
-      // implies the operation has no other memory effects than the one of its
-      // nested operations.
-      if (!mlir::isa<mlir::MemoryEffectOpInterface>(op))
-        continue;
-    }
     result = result.merge(getModRef(&op, location));
     if (result.isModAndRef())
       return result;
@@ -674,6 +677,13 @@ AliasAnalysis::Source AliasAnalysis::getSource(mlir::Value v,
             isCapturedInInternalProcedure |=
                 boxSrc.isCapturedInInternalProcedure;
 
+            if (getLastInstantiationPoint) {
+              if (!instantiationPoint)
+                instantiationPoint = boxSrc.origin.instantiationPoint;
+            } else {
+              instantiationPoint = boxSrc.origin.instantiationPoint;
+            }
+
             global = llvm::dyn_cast<mlir::SymbolRefAttr>(boxSrc.origin.u);
             if (global) {
               type = SourceKind::Global;
diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp
index 103e736accca0..038ceb1d4bd75 100644
--- a/flang/lib/Optimizer/Passes/Pipelines.cpp
+++ b/flang/lib/Optimizer/Passes/Pipelines.cpp
@@ -206,6 +206,10 @@ void createDefaultFIROptimizerPassPipeline(mlir::PassManager &pm,
   pm.addPass(fir::createSimplifyRegionLite());
   pm.addPass(mlir::createCSEPass());
 
+  // Run LICM after CSE, which may reduce the number of operations to hoist.
+  if (pc.OptLevel.isOptimizingForSpeed())
+    pm.addPass(fir::createLoopInvariantCodeMotion());
+
   // Polymorphic types
   pm.addPass(fir::createPolymorphicOpConversion());
   pm.addPass(fir::createAssumedRankOpConversion());
diff --git a/flang/lib/Optimizer/Transforms/CMakeLists.txt b/flang/lib/Optimizer/Transforms/CMakeLists.txt
index 619f3adc67c85..fad6f34f478ba 100644
--- a/flang/lib/Optimizer/Transforms/CMakeLists.txt
+++ b/flang/lib/Optimizer/Transforms/CMakeLists.txt
@@ -1,43 +1,44 @@
 add_flang_library(FIRTransforms
   AbstractResult.cpp
   AddAliasTags.cpp
-  AffinePromotion.cpp
+  AddDebugInfo.cpp
   AffineDemotion.cpp
+  AffinePromotion.cpp
+  AlgebraicSimplification.cpp
   AnnotateConstant.cpp
+  ArrayValueCopy.cpp
   AssumedRankOpConversion.cpp
-  CharacterConversion.cpp
-  CompilerGeneratedNames.cpp
-  ConstantArgumentGlobalisation.cpp
-  ControlFlowConverter.cpp
   CUDA/CUFAllocationConversion.cpp
   CUFAddConstructor.cpp
+  CUFComputeSharedMemoryOffsetsAndSize.cpp
   CUFDeviceGlobal.cpp
-  CUFOpConversion.cpp
   CUFGPUToLLVMConversion.cpp
-  CUFComputeSharedMemoryOffsetsAndSize.cpp
-  ArrayValueCopy.cpp
+  CUFOpConversion.cpp
+  CharacterConversion.cpp
+  CompilerGeneratedNames.cpp
+  ConstantArgumentGlobalisation.cpp
+  ControlFlowConverter.cpp
+  ConvertComplexPow.cpp
+  DebugTypeGenerator.cpp
   ExternalNameConversion.cpp
   FIRToSCF.cpp
-  MemoryUtils.cpp
-  MemoryAllocation.cpp
-  StackArrays.cpp
+  FunctionAttr.cpp
+  GenRuntimeCallsForTest.cpp
+  LoopInvariantCodeMotion.cpp
+  LoopVersioning.cpp
+  MIFOpConversion.cpp
   MemRefDataFlowOpt.cpp
-  SimplifyRegionLite.cpp
-  AlgebraicSimplification.cpp
-  SimplifyIntrinsics.cpp
-  AddDebugInfo.cpp
+  MemoryAllocation.cpp
+  MemoryUtils.cpp
+  OptimizeArrayRepacking.cpp
   PolymorphicOpConversion.cpp
-  LoopVersioning.cpp
-  StackReclaim.cpp
-  VScaleAttr.cpp
-  FunctionAttr.cpp
-  DebugTypeGenerator.cpp
   SetRuntimeCallAttributes.cpp
-  GenRuntimeCallsForTest.cpp
   SimplifyFIROperations.cpp
-  OptimizeArrayRepacking.cpp
-  ConvertComplexPow.cpp
-  MIFOpConversion.cpp
+  SimplifyIntrinsics.cpp
+  SimplifyRegionLite.cpp
+  StackArrays.cpp
+  StackReclaim.cpp
+  VScaleAttr.cpp
 
   DEPENDS
   CUFAttrs
@@ -63,12 +64,14 @@ add_flang_library(FIRTransforms
 
   MLIR_LIBS
   MLIRAffineUtils
+  MLIRAnalysis
   MLIRFuncDialect
   MLIRGPUDialect
-  MLIRLLVMDialect
   MLIRLLVMCommonConversion
+  MLIRLLVMDialect
   MLIRMathTransforms
   MLIROpenACCDialect
   MLIROpenACCToLLVMIRTranslation
   MLIROpenMPDialect
+  MLIRTransformUtils
 )
diff --git a/flang/lib/Optimizer/Transforms/LoopInvariantCodeMotion.cpp b/flang/lib/Optimizer/Transforms/LoopInvariantCodeMotion.cpp
new file mode 100644
index 0000000000000..c033d5e278c8d
--- /dev/null
+++ b/flang/lib/Optimizer/Transforms/LoopInvariantCodeMotion.cpp
@@ -0,0 +1,233 @@
+//===- LoopInvariantCodeMotion.cpp ----------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+/// \file
+/// FIR-specific Loop Invariant Code Motion pass.
+/// The pass relies on FIR types and interfaces to prove the safety
+/// of hoisting invariant operations out of loop-like operations.
+/// It may be run on both HLFIR and FIR representations.
+//===----------------------------------------------------------------------===//
+
+#include "flang/Optimizer/Analysis/AliasAnalysis.h"
+#include "flang/Optimizer/Dialect/FortranVariableInterface.h"
+#include "flang/Optimizer/Transforms/Passes.h"
+#include "mlir/Pass/Pass.h"
+#include "mlir/Transforms/LoopInvariantCodeMotionUtils.h"
+#include "llvm/Support/DebugLog.h"
+
+namespace fir {
+#define GEN_PASS_DEF_LOOPINVARIANTCODEMOTION
+#include "flang/Optimizer/Transforms/Passes.h.inc"
+} // namespace fir
+
+#define DEBUG_TYPE "flang-licm"
+
+// Temporary engineering option for triaging LICM.
+static llvm::cl::opt<bool> disableFlangLICM(
+    "disable-flang-licm", llvm::cl::init(false), llvm::cl::Hidden,
+    llvm::cl::desc("Disable Flang's loop invariant code motion"));
+
+namespace {
+
+using namespace mlir;
+
+/// The pass tries to hoist loop invariant operations with only
+/// MemoryEffects::Read effects (MemoryEffects::Write support
+/// may be added later).
+/// The safety of hoisting is proven by:
+///   * Proving that the loop runs at least one iteration.
+///   * Proving that is is always safe to load from this location
+///     (see isSafeToHoistLoad() comments below).
+struct LoopInvariantCodeMotion
+    : fir::impl::LoopInvariantCodeMotionBase<LoopInvariantCodeMotion> {
+  void runOnOperation() override;
+};
+
+} // namespace
+
+/// 'location' is a memory reference used by a memory access.
+/// The type of 'location' defines the data type of the access
+/// (e.g. it is considered to be invalid to access 'i64'
+/// data using '!fir.ref<i32>`).
+/// For the given location, this function returns true iff
+/// the Fortran object being accessed is a scalar that
+/// may not be OPTIONAL.
+///
+/// Note that the '!fir.ref<!fir.box<>>' accesses are considered
+/// to be scalar, even if the underlying data is an array.
+///
+/// Note that an access of '!fir.ref<scalar>' may access
+/// an array object. For example:
+///   real :: x(:)
+///   do i=...
+///     = x(10)
+/// 'x(10)' accesses array 'x', and it may be unsafe to hoist
+/// it without proving that '10' is a valid index for the array.
+/// The fact that 'x' is not OPTIONAL does not allow hoisting
+/// on its own.
+static bool isNonOptionalScalar(Value location) {
+  while (true) {
+    LDBG() << "Checking location:\n" << location;
+    Type dataType = fir::unwrapRefType(location.getType());
+    if (!isa<fir::BaseBoxType>(location.getType()) &&
+        (!dataType ||
+         (!isa<fir::BaseBoxType>(dataType) && !fir::isa_trivial(dataType) &&
+          !fir::isa_derived(dataType)))) {
+      LDBG() << "Failure: data access is not scalar";
+      return false;
+    }
+    Operation *defOp = location.getDefiningOp();
+    if (!defOp) {
+      LDBG() << "Failure: no defining operation";
+      return false;
+    }
+    if (auto varIface = dyn_cast<fir::FortranVariableOpInterface>(defOp)) {
+      bool result = !varIface.isOptional();
+      if (result)
+        LDBG() << "Success: is non optional scalar";
+      else
+        LDBG() << "Failure: is not non optional scalar";
+      return result;
+    }
+    if (auto viewIface = dyn_cast<fir::FortranObjectViewOpInterface>(defOp)) {
+      location = viewIface.getViewSource(cast<OpResult>(location));
+    } else {
+      LDBG() << "Failure: unknown operation:\n" << *defOp;
+      return false;
+    }
+  }
+}
+
+/// Returns true iff it is safe to hoist the given load-like operation 'op',
+/// which access given memory 'locations', out of the operation 'loopLike'.
+/// The current safety conditions are:
+///   * The loop runs at least one iteration, OR
+///   * all the accessed locations are inside scalar non-OPTIONAL
+///     Fortran objects (Fortran descriptors are considered to be scalars).
+static bool isSafeToHoistLoad(Operation *op, ArrayRef<Value> locations,
+                              LoopLikeOpInterface loopLike,
+                              AliasAnalysis &aliasAnalysis) {
+  for (Value location : locations)
+    if (aliasAnalysis.getModRef(loopLike.getOperation(), location)
+            .isModAndRef()) {
+      LDBG() << "Failure: reads location:\n"
+             << location << "\nwhich is modified inside the loop";
+      return false;
+    }
+
+  // Check that it is safe to read from all the locations before the loop.
+  std::optional<llvm::APInt> tripCount = loopLike.getStaticTripCount();
+  if (tripCount && !tripCount->isZero()) {
+    // Loop executes at least one iteration, so it is safe to hoist.
+    LDBG() << "Success: loop has non-zero iterations";
+    return true;
+  }
+
+  // Check whether the access must always be valid.
+  return llvm::all_of(
+      locations, [&](Value location) { return isNonOptionalScalar(location); });
+  // TODO: consider hoisting under condition of the loop's trip count
+  // being non-zero.
+}
+
+/// Returns true iff the given 'op' is a load-like operation,
+/// and it can be hoisted out of 'loopLike' operation.
+static bool canHoistLoad(Operation *op, LoopLikeOpInterface loopLike,
+                         AliasAnalysis &aliasAnalysis) {
+  LDBG() << "Checking operation:\n" << *op;
+  if (auto effectInterface = dyn_cast<MemoryEffectOpInterface>(op)) {
+    SmallVector<MemoryEffects::EffectInstance> effects;
+    effectInterface.getEffects(effects);
+    if (effects.empty()) {
+      LDBG() << "Failure: not a load";
+      return false;
+    }
+    llvm::SetVector<Value> locations;
+    for (const MemoryEffects::EffectInstance &effect : effects) {
+      Value location = effect.getValue();
+      if (!isa<MemoryEffects::Read>(effect.getEffect())) {
+        LDBG() << "Failure: has unsupported effects";
+        return false;
+      } else if (!location) {
+        LDBG() << "Failure: reads from unknown location";
+        return false;
+      }
+      locations.insert(location);
+    }
+    return isSafeToHoistLoad(op, locations.getArrayRef(), loopLike,
+                             aliasAnalysis);
+  }
+  LDBG() << "Failure: has unknown effects";
+  return false;
+}
+
+void LoopInvariantCodeMotion::runOnOperation() {
+  if (disableFlangLICM) {
+    LDBG() << "Skipping [HL]FIR LoopInvariantCodeMotion()";
+    return;
+  }
+
+  LDBG() << "Enter [HL]FIR LoopInvariantCodeMotion()";
+
+  auto &aliasAnalysis = getAnalysis<AliasAnalysis>();
+  aliasAnalysis.addAnalysisImplementation(fir::AliasAnalysis{});
+
+  std::function<bool(Operation *, LoopLikeOpInterface loopLike)>
+      shouldMoveOutOfLoop = [&](Operation *op, LoopLikeOpInterface loopLike) {
+        if (isPure(op)) {
+          LDBG() << "Pure operation: " << *op;
+          return true;
+        }
+
+        // Handle RecursivelySpeculatable operations that have
+        // RecursiveMemoryEffects by checking if all their
+        // nested operations can be hoisted.
+        auto iface = dyn_cast<ConditionallySpeculatable>(op);
+        if (iface && iface.getSpeculatability() ==
+                         Speculation::RecursivelySpeculatable) {
+          if (op->hasTrait<OpTrait::HasRecursiveMemoryEffects>()) {
+            LDBG() << "Checking recursive operation:\n" << *op;
+            llvm::SmallVector<Operation *> nestedOps;
+            for (Region &region : op->getRegions())
+              for (Block &block : region)
+                for (Operation &nestedOp : block)
+                  nestedOps.push_back(&nestedOp);
+
+            bool result = llvm::all_of(nestedOps, [&](Operation *nestedOp) {
+              return shouldMoveOutOfLoop(nestedOp, loopLike);
+            });
+            LDBG() << "Recursive operation can" << (result ? "" : "not")
+                   << " be hoisted";
+
+            // If nested operations cannot be hoisted, there is nothing
+            // else to check. Also if the operation itself does not have
+            // any memory effects, we can return the result now.
+            // Otherwise, we have to check the operation itself below.
+            if (!result || !isa<MemoryEffectOpInterface>(op))
+              return result;
+          }
+        }
+        return canHoistLoad(op, loopLike, aliasAnalysis);
+      };
+
+  getOperation()->walk([&](LoopLikeOpInterface loopLike) {
+    moveLoopInvariantCode(
+        loopLike.getLoopRegions(),
+        /*isDefinedOutsideRegion=*/
+        [&](Value value, Region *) {
+          return loopLike.isDefinedOutsideOfLoop(value);
+        },
+        /*shouldMoveOutOfRegion=*/
+        [&](Operation *op, Region *) {
+          return shouldMoveOutOfLoop(op, loopLike);
+        },
+        /*moveOutOfRegion=*/
+        [&](Operation *op, Region *) { loopLike.moveOutOfLoop(op); });
+  });
+
+  LDBG() << "Exit [HL]FIR LoopInvariantCodeMotion()";
+}
diff --git a/flang/test/Analysis/AliasAnalysis/gen_mod_ref_test.py b/flang/test/Analysis/AliasAnalysis/gen_mod_ref_test.py
index ce7d9b1700bf7..953f1a2545781 100755
--- a/flang/test/Analysis/AliasAnalysis/gen_mod_ref_test.py
+++ b/flang/test/Analysis/AliasAnalysis/gen_mod_ref_test.py
@@ -7,11 +7,14 @@
  This will insert mod ref test hook:
    - to any fir.call to a function which name starts with "test_effect_"
    - to any hlfir.declare for variable which name starts with "test_var_"
+   - to any fir.box_addr - they are assigned box_addr_# hooks
 """
 
 import sys
 import re
 
+box_addr_counter=0
+
 for line in sys.stdin:
     line = re.sub(
         r"(fir.call @_\w*P)(test_effect_\w*)(\(.*) : ",
@@ -23,4 +26,11 @@
         r'\1\2", test.ptr ="\2"',
         line,
     )
+    line, count = re.subn(
+        r'(fir.box_addr.*) :',
+        rf'\1 {{test.ptr ="box_addr_{box_addr_counter}"}} :',
+        line,
+    )
+    if count > 0:
+        box_addr_counter += 1
     sys.stdout.write(line)
diff --git a/flang/test/Analysis/AliasAnalysis/modref-call-dummies.f90 b/flang/test/Analysis/AliasAnalysis/modref-call-dummies.f90
index a4c57cff70927..a81f266b87979 100644
--- a/flang/test/Analysis/AliasAnalysis/modref-call-dummies.f90
+++ b/flang/test/Analysis/AliasAnalysis/modref-call-dummies.f90
@@ -20,7 +20,7 @@ subroutine test_dummy(test_var_x)
   use somemod, only : may_capture
   implicit none
   real :: test_var_x
-  ! Capture is invalid after the call because test_var_xsaved does not have the
+  ! Capture is invalid after the call because test_var_x does not have the
   ! target attribute.
   call may_capture(test_var_x)
   call test_effect_external()
@@ -50,4 +50,61 @@ subroutine test_dummy_pointer(p)
   end associate
 end subroutine
 ! CHECK-LABEL: Testing : "_QPtest_dummy_pointer"
-! CHECK: test_effect_external -> test_var_p_target#0: ModRef
+! CHECK-DAG: test_effect_external ...
[truncated]

github-actions · 2025-12-24T01:10:51Z

✅ With the latest revision this PR passed the Python code formatter.

LICM (llvm#173438) may insert new operations at the beginning of `fir.do_concurrent`'s block and they cannot be always hoisted to the alloca-block of the parent operation. This patch only moves `fir.alloca`s into the alloca-block, and moves all other operations right before fir.do_concurrent.

vzakhari · 2025-12-24T18:05:09Z

This should be merged after #173502 to avoid new test failures.

tblah

Interesting stuff - again!

First pass looking at this, I will have a second look this week.

tblah · 2025-12-29T10:51:07Z

flang/lib/Optimizer/Analysis/AliasAnalysis.cpp

  // An example of something currently problematic is the allocmem generated for
  // ALLOCATE of allocatable target. It currently does not have the target
  // attribute, which would lead this analysis to believe it cannot escape.
  if (!varSrc.isFortranUserVariable() || !isCallToFortranUserProcedure(call))


Does this mean that the above TODOs:

Limit to fortran functions

Skip compiler generated temps

Are complete and can be removed?

Not fully related to your patch: I am trying to understand how ready getCallModRef is for use.

Thanks for the review, Tom!

I do not think this TODO is resolved, so the check for isFortranUserVariable should be necessary still.

I would like to look further into isCallToFortranUserProcedure check so that we can handle Fortran runtime calls better (e.g. based on the memroy access attributes that we are able to figure out in SetRuntimeCallAttributesPass).

I beleive there is a room for improvement in the mod-ref analysis, but it already works for some cases that Jean addressed when he initially added the implementation.

tblah · 2025-12-29T11:02:37Z

flang/lib/Optimizer/Passes/Pipelines.cpp


+  // Run LICM after CSE, which may reduce the number of operations to hoist.
+  if (pc.OptLevel.isOptimizingForSpeed())
+    pm.addPass(fir::createLoopInvariantCodeMotion());


If this doesn't currently lead to performance improvements, do we need to enable it by default? I'm worried this might be a relatively expensive pass.

I'm curious what the motivation is here more than anything. If there is a plan and you haven't observed serious compile time regressions then I am happy to see this on by default.

I still need to measure performance, but my expectation is that this patch won't change anything. You have a good point about the compilation time, and I will measure it.

There is one example related to private copies of descriptors in OpenACC, which LICM can help to address (though, I guess this is not the only way to solve it). In general, I would like to be able to optimize code at MLIR level, which otherwise would require more detailed alias-info representation in LLVM IR. In addition, such early MLIR optimizations may help with enabling more MLIR optimizations (like the ones related to Affine transformations).

I agree with you that it is probably too early to enable this by default, and I will move it under an option.

Thanks.

I'm not really opposed to having this on by default so long as you don't find serious compilation time regressions (I agree this pass looks useful in the long term), but feel free to disable by default for now if you prefer.

LICM (#173438) may insert new operations at the beginning of `fir.do_concurrent`'s block and they cannot be always hoisted to the alloca-block of the parent operation. This patch only moves `fir.alloca`s into the alloca-block, and moves all other operations right before fir.do_concurrent.

tblah · 2026-01-02T12:10:25Z

flang/lib/Optimizer/Transforms/LoopInvariantCodeMotion.cpp

+        return canHoistLoad(op, loopLike, aliasAnalysis);
+      };
+
+  getOperation()->walk([&](LoopLikeOpInterface loopLike) {


Please could you check for mlir::omp::OutlinableOpenMPOpInterface and avoid LICM in that case. These are operations whose body will be outlined (and could be run in a different thread). I'm not sure that we will ever have a LoopLikeOp which is outlinable but it is better to be safe.

Similarly, please could you make sure not to outline operations out of loops and into something that is a mlir::omp::LoopWrapperInterface. The loop wrappers are assumed to contain only a nested loop wrapper or a loop, and a terminator. Any other operations would currently crash the compiler. For example in,

omp.private {type = private} @_QFtestEi_private_i32 : i32 func.func @_QPtest() { %0 = fir.dummy_scope : !fir.dscope %1 = fir.alloca i32 {bindc_name = "i", uniq_name = "_QFtestEi"} %2:2 = hlfir.declare %1 {uniq_name = "_QFtestEi"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>) %c1_i32 = arith.constant 1 : i32 %c100_i32 = arith.constant 100 : i32 %c1_i32_0 = arith.constant 1 : i32 omp.wsloop private(@_QFtestEi_private_i32 %2#0 -> %arg0 : !fir.ref<i32>) { omp.loop_nest (%arg1) : i32 = (%c1_i32) to (%c100_i32) inclusive step (%c1_i32_0) { %3:2 = hlfir.declare %arg0 {uniq_name = "_QFtestEi"} : (!fir.ref<i32>) -> (!fir.ref<i32>, !fir.ref<i32>) hlfir.assign %arg1 to %3#0 : i32, !fir.ref<i32> fir.call @_QPdo_something(%3#0) fastmath<contract> : (!fir.ref<i32>) -> () omp.yield } } return } func.func private @_QPdo_something(!fir.ref<i32>)

The omp.wsloop operation must contain only the omp.loop_nest and the (implicit) terminator. It would be valid to keep going and move to the first non-loop wrapper parent (beware nested loop wrappers). But again, make sure nothing is ever moved out of a outlinable operation.

tblah · 2026-01-02T12:20:01Z

flang/lib/Optimizer/Transforms/LoopInvariantCodeMotion.cpp

+/// Returns true iff it is safe to hoist the given load-like operation 'op',
+/// which access given memory 'locations', out of the operation 'loopLike'.
+/// The current safety conditions are:
+///   * The loop runs at least one iteration, OR


What about something like

integer, allocatable :: a integer, pointer :: p ! a is not allocated ! p is not associated do i=1,2 if (.false.) load_like(a) load_like(p) endif enddo

tblah · 2026-01-02T12:34:56Z

flang/lib/Optimizer/Transforms/LoopInvariantCodeMotion.cpp

+              return result;
+          }
+        }
+        return canHoistLoad(op, loopLike, aliasAnalysis);


Could be another PR, but looking at #62419, I think it could be useful to hoist operations with allocation effects too.

vzakhari added 2 commits December 23, 2025 16:35

vzakhari requested review from jeanPerier, rscottmanley and tblah December 24, 2025 01:09

llvmbot added flang:driver flang Flang issues not falling into any other category flang:fir-hlfir labels Dec 24, 2025

vzakhari added 2 commits December 23, 2025 17:12

clang-format

f94a101

format

2832238

vzakhari mentioned this pull request Dec 24, 2025

[flang] Fixed operations hoisting out of fir.do_concurrent. #173502

Merged

vzakhari mentioned this pull request Dec 24, 2025

[flang] Make fir.result Pure operation. #173508

Open

tblah reviewed Dec 29, 2025

View reviewed changes

tblah reviewed Jan 2, 2026

View reviewed changes

[flang] Added LoopInvariantCodeMotion pass for [HL]FIR. #173438

Are you sure you want to change the base?

[flang] Added LoopInvariantCodeMotion pass for [HL]FIR. #173438

Conversation

vzakhari commented Dec 24, 2025

Uh oh!

llvmbot commented Dec 24, 2025

Uh oh!

llvmbot commented Dec 24, 2025

Uh oh!

github-actions bot commented Dec 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

vzakhari commented Dec 24, 2025

Uh oh!

tblah left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

github-actions bot commented Dec 24, 2025 •

edited

Loading