[flang][StackArrays] skip analysis of very large functions #71047
The stack arrays pass uses data flow analysis to determine whether heap allocations are freed on all paths out of the function.
interp_domain_em_part2 in spec2017 wrf generates over 120k operations, including almost 5k fir.if operations and over 200 fir.do_loop operations, all in the same function. The MLIR data flow analysis framework cannot provide reasonable performance for such cases because there is a combinatorial explosion in the number of control flow paths through the function, all of which must be checked to determine if the heap allocations will be freed.
This patch skips the stack arrays pass for ridiculously large functions (defined as having more than 1000 fir.allocmem operations). This threshold is configurable at runtime with a command line argument.
With this patch, compiling this file is more than 80% faster.
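For a rough sense of the path explosion mentioned in the description, consider a simplified model: a straight chain of n independent two-way branches has 2^n distinct paths. The sketch below computes how many decimal digits that count has. This is only an illustration under that assumption; it does not model how the MLIR solver works internally, and `decimalDigitsOfPathCount` is a hypothetical name, not something in the patch.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper for illustration only (not part of the patch).
// Assumes a chain of `numBranches` independent two-way branches, which
// gives 2^numBranches paths. Returns the number of decimal digits of
// that count, using digits(2^n) = floor(n * log10(2)) + 1.
std::size_t decimalDigitsOfPathCount(std::size_t numBranches) {
  const double digits = std::floor(static_cast<double>(numBranches) * std::log10(2.0));
  return static_cast<std::size_t>(digits) + 1;
}
```

Under this model, 10 branches already give a 4-digit path count (1024), and a chain of 5000 branches gives a count with more than 1500 digits.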
@d-smirnov for some reason it didn't let me add you as a reviewer
@llvm/pr-subscribers-flang-fir-hlfir
Author: Tom Eccles (tblah)
Changes
The stack arrays pass uses data flow analysis to determine whether heap allocations are freed on all paths out of the function.
This patch skips the stack arrays pass for ridiculously large functions (defined as having more than 1000 fir.allocmem operations). This threshold is configurable at runtime with a command line argument. With this patch, compiling this file is more than 80% faster.
Full diff: https://github.com/llvm/llvm-project/pull/71047.diff
1 Files Affected:
diff --git a/flang/lib/Optimizer/Transforms/StackArrays.cpp b/flang/lib/Optimizer/Transforms/StackArrays.cpp
index 9b90aed5a17ae73..7b066ec7a2bfda6 100644
--- a/flang/lib/Optimizer/Transforms/StackArrays.cpp
+++ b/flang/lib/Optimizer/Transforms/StackArrays.cpp
@@ -42,6 +42,12 @@ namespace fir {
 
 #define DEBUG_TYPE "stack-arrays"
 
+static llvm::cl::opt<std::size_t> maxAllocsPerFunc(
+    "stack-arrays-max-allocs",
+    llvm::cl::desc("The maximum number of heap allocations to consider in one "
+                   "function before skipping (to save compilation time)"),
+    llvm::cl::init(1000), llvm::cl::Hidden);
+
 namespace {
 
 /// The state of an SSA value at each program point
@@ -411,6 +417,17 @@ void AllocationAnalysis::processOperation(mlir::Operation *op) {
 mlir::LogicalResult
 StackArraysAnalysisWrapper::analyseFunction(mlir::Operation *func) {
   assert(mlir::isa<mlir::func::FuncOp>(func));
+  size_t nAllocs = 0;
+  func->walk([&nAllocs](fir::AllocMemOp) { nAllocs++; });
+  // don't bother with the analysis if there are no heap allocations
+  if (nAllocs == 0)
+    return mlir::success();
+  if ((maxAllocsPerFunc != 0) && (nAllocs > maxAllocsPerFunc)) {
+    LLVM_DEBUG(llvm::dbgs() << "Skipping stack arrays for function with "
+                            << nAllocs << " heap allocations");
+    return mlir::success();
+  }
+
   mlir::DataFlowSolver solver;
   // constant propagation is required for dead code analysis, dead code analysis
   // is required to mark blocks live (required for mlir dense dfa)
  // don't bother with the analysis if there are no heap allocations
  if (nAllocs == 0)
    return mlir::success();
  if ((maxAllocsPerFunc != 0) && (nAllocs > maxAllocsPerFunc)) {
If maxAllocsPerFunc is 0, should the pass run?
I was intending this to follow the idiom of "set the limit to zero for unlimited"
I see. Maybe just document that.
LG with comment added.
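The gating logic discussed in this thread can be modelled in isolation as a small predicate. The sketch below is a standalone illustration, not code from the patch: `shouldRunAnalysis` and its parameters are hypothetical names, and it assumes the "set the limit to zero for unlimited" reading described above.

```cpp
#include <cstddef>

// Hypothetical standalone model of the early exits in analyseFunction
// (not part of the patch). Returns true when the data flow analysis
// would go ahead for a function with `nAllocs` heap allocations.
// A `maxAllocs` of 0 is treated as "no limit".
bool shouldRunAnalysis(std::size_t nAllocs, std::size_t maxAllocs) {
  // With no heap allocations there is nothing to move to the stack.
  if (nAllocs == 0)
    return false;
  // A non-zero limit that is exceeded means the function is skipped.
  if (maxAllocs != 0 && nAllocs > maxAllocs)
    return false;
  return true;
}
```

With the default limit of 1000, a function with exactly 1000 allocations is still analysed and one with 1001 is skipped; with a limit of 0, no function is skipped for size.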