3 changes: 3 additions & 0 deletions mlir/include/mlir/Analysis/DataFlow/LivenessAnalysis.h
@@ -102,6 +102,9 @@ struct RunLivenessAnalysis {

const Liveness *getLiveness(Value val);

/// Return the configuration of the solver used for this analysis.
const DataFlowConfig &getSolverConfig() const { return solver.getConfig(); }

private:
/// Stores the result of the liveness analysis that was run.
DataFlowSolver solver;
14 changes: 14 additions & 0 deletions mlir/include/mlir/IR/Visitors.h
@@ -39,6 +39,20 @@ struct ForwardIterator {
}
};

/// This iterator enumerates the elements in "backward" order.
struct BackwardIterator {
template <typename T>
static auto makeIterable(T &range) {
if constexpr (std::is_same<T, Operation>()) {
/// Make operations iterable: return the list of regions.
return range.getRegions();
} else {
/// Regions and block are already iterable.
return llvm::reverse(range);
}
}
};
Collaborator

Why do you need to do this? Are you somehow trying to make an assumption about the order in which the regions are executed at runtime?
The order of the regions on the op is not indicative of anything related to this.

Contributor Author

Hi @joker-eph,

> Why do you need to do this?

I changed the iterator here. My intention is to visit the basic blocks and the CFG in reverse order:
module->walk<WalkOrder::PostOrder, BackwardIterator>

> The order of the regions on the op is not indicative of anything related to this.

TBH, this is the part of MLIR I don't understand. Should we also reverse Operation::getRegions()? Yes, my understanding is that the order doesn't matter for regions.
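
To make the intent of the changed walk concrete, here is a minimal standalone sketch of a post-order walk parameterized by an iteration policy. It works on a toy tree, not on MLIR's `Operation`; `ToyOp`, `ForwardPolicy`, `BackwardPolicy`, `walkPostOrder`, and `makeExample` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an operation with nested operations.
struct ToyOp {
  std::string name;
  std::vector<ToyOp> nested;
};

// Visits nested ops in the order in which they are stored.
struct ForwardPolicy {
  template <typename Fn>
  static void forEach(const std::vector<ToyOp> &ops, Fn fn) {
    for (auto it = ops.begin(); it != ops.end(); ++it)
      fn(*it);
  }
};

// Visits nested ops in the reverse of the stored order.
struct BackwardPolicy {
  template <typename Fn>
  static void forEach(const std::vector<ToyOp> &ops, Fn fn) {
    for (auto it = ops.rbegin(); it != ops.rend(); ++it)
      fn(*it);
  }
};

// Post-order: all nested ops are visited before the op that contains them.
template <typename Policy>
void walkPostOrder(const ToyOp &op, std::vector<std::string> &visited) {
  Policy::forEach(op.nested, [&](const ToyOp &child) {
    walkPostOrder<Policy>(child, visited);
  });
  visited.push_back(op.name);
}

// module { func { a; b; call } }
ToyOp makeExample() {
  ToyOp func{"func", {}};
  func.nested.push_back(ToyOp{"a", {}});
  func.nested.push_back(ToyOp{"b", {}});
  func.nested.push_back(ToyOp{"call", {}});
  ToyOp top{"module", {}};
  top.nested.push_back(func);
  return top;
}
```

With the forward policy the visit order is a, b, call, func, module; with the backward policy it is call, b, a, func, module. The sketch only reverses the stored order, which is not necessarily an execution order.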

Collaborator

> My intention is to visit the basic blocks and the CFG in reverse order.

Reverse order from what? The order of basic blocks isn't indicative of any execution order. The only constraint is that the first basic block is the "entry" into the region.
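
As an illustration of this point, the following standalone sketch traverses a toy CFG by following successor edges from the entry block, so the result does not depend on how the non-entry blocks are stored. `ToyBlock`, `postOrderFromEntry`, and the two layout helpers are hypothetical names; the only assumption taken from the discussion is that the entry block comes first.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a block: a name plus indices of successor blocks.
struct ToyBlock {
  std::string name;
  std::vector<std::size_t> successors;
};

// Post-order over the CFG, starting at the entry block (index 0) and following
// successor edges instead of the order in which the blocks are stored.
std::vector<std::string> postOrderFromEntry(const std::vector<ToyBlock> &blocks) {
  std::vector<std::string> order;
  if (blocks.empty())
    return order;
  std::vector<bool> visited(blocks.size(), false);
  std::function<void(std::size_t)> visit = [&](std::size_t index) {
    visited[index] = true;
    for (std::size_t succ : blocks[index].successors)
      if (!visited[succ])
        visit(succ);
    order.push_back(blocks[index].name);
  };
  visit(0);
  return order;
}

// Diamond CFG: entry -> {then, else} -> exit, stored in "natural" order.
std::vector<ToyBlock> diamondLayoutA() {
  return {{"entry", {1, 2}}, {"then", {3}}, {"else", {3}}, {"exit", {}}};
}

// Same edges, but the non-entry blocks are stored in a different order.
std::vector<ToyBlock> diamondLayoutB() {
  return {{"entry", {3, 2}}, {"exit", {}}, {"else", {1}}, {"then", {1}}};
}
```

Both layouts yield the same post-order (exit, then, else, entry), whereas simply reversing the stored block list would give different results for the two layouts.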


/// A utility class to encode the current walk stage for "generic" walkers.
/// When walking an operation, we can either choose a Pre/Post order walker
/// which invokes the callback on an operation before/after all its attached
175 changes: 156 additions & 19 deletions mlir/lib/Transforms/RemoveDeadValues.cpp
@@ -33,6 +33,7 @@

#include "mlir/Analysis/DataFlow/DeadCodeAnalysis.h"
#include "mlir/Analysis/DataFlow/LivenessAnalysis.h"
#include "mlir/Dialect/Arith/IR/Arith.h"
#include "mlir/IR/Builders.h"
#include "mlir/IR/BuiltinAttributes.h"
#include "mlir/IR/Dialect.h"
@@ -118,8 +119,13 @@ struct RDVFinalCleanupList {
/// Return true iff at least one value in `values` is live, given the liveness
/// information in `la`.
static bool hasLive(ValueRange values, const DenseSet<Value> &nonLiveSet,
- RunLivenessAnalysis &la) {
+ const DenseSet<Value> &liveSet, RunLivenessAnalysis &la) {
for (Value value : values) {
+ if (liveSet.contains(value)) {
+ LDBG() << "Value " << value << " is marked live by CallOp";
+ return true;
+ }
+
if (nonLiveSet.contains(value)) {
LDBG() << "Value " << value << " is already marked non-live (dead)";
continue;
@@ -144,6 +150,7 @@ static bool hasLive(ValueRange values, const DenseSet<Value> &nonLiveSet,
/// Return a BitVector of size `values.size()` where its i-th bit is 1 iff the
/// i-th value in `values` is live, given the liveness information in `la`.
static BitVector markLives(ValueRange values, const DenseSet<Value> &nonLiveSet,
const DenseSet<Value> &liveSet,
RunLivenessAnalysis &la) {
BitVector lives(values.size(), true);

@@ -154,7 +161,9 @@ static BitVector markLives(ValueRange values, const DenseSet<Value> &nonLiveSet,
<< " is already marked non-live (dead) at index " << index;
continue;
}

if (liveSet.contains(value)) {
continue;
}
Comment on lines +165 to +166
Collaborator

Suggested change
- continue;
- }
+ LDBG() << "Value " << value
+ << " is already marked live at index " << index;
+ continue;
+ }

Should we align the logging with the case above? (Otherwise, remove the braces.)
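
For readers following the thread, the lookup order in `markLives` after this change can be sketched as below. This is a simplified standalone model, not the MLIR code: values are plain integers, and the analysis result is a hypothetical `std::map` in which a missing entry stands for "no liveness information" and is conservatively treated as live, as the existing comment in the pass describes.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

using ToyValue = int; // hypothetical stand-in for mlir::Value

// Returns one flag per value: true if the value is considered live.
std::vector<bool> markLivesSketch(const std::vector<ToyValue> &values,
                                  const std::set<ToyValue> &nonLiveSet,
                                  const std::set<ToyValue> &liveSet,
                                  const std::map<ToyValue, bool> &analysis) {
  std::vector<bool> lives(values.size(), true);
  for (std::size_t index = 0; index < values.size(); ++index) {
    ToyValue value = values[index];
    // 1. Values already recorded as dead by the pass win first.
    if (nonLiveSet.count(value)) {
      lives[index] = false;
      continue;
    }
    // 2. Values forced live (e.g. operands of calls to public functions)
    //    stay live regardless of what the analysis says.
    if (liveSet.count(value))
      continue;
    // 3. Otherwise consult the analysis; no entry means "unknown", which is
    //    treated as live.
    auto it = analysis.find(value);
    if (it != analysis.end() && !it->second)
      lives[index] = false;
  }
  return lives;
}
```

Note that in the diff `hasLive` consults `liveSet` before `nonLiveSet`, while `markLives` does the opposite, so the two would disagree on a value that ends up in both sets.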

const Liveness *liveness = la.getLiveness(value);
// It is important to note that when `liveness` is null, we can't tell if
// `value` is live or not. So, the safe option is to consider it live. Also,
@@ -259,8 +268,9 @@ static SmallVector<OpOperand *> operandsToOpOperands(OperandRange operands) {
/// - Return-like
static void processSimpleOp(Operation *op, RunLivenessAnalysis &la,
DenseSet<Value> &nonLiveSet,
- RDVFinalCleanupList &cl) {
- if (!isMemoryEffectFree(op) || hasLive(op->getResults(), nonLiveSet, la)) {
+ DenseSet<Value> &liveSet, RDVFinalCleanupList &cl) {
+ if (!isMemoryEffectFree(op) ||
+ hasLive(op->getResults(), nonLiveSet, liveSet, la)) {
LDBG() << "Simple op is not memory effect free or has live results, "
"preserving it: "
<< OpWithFlags(op, OpPrintingFlags().skipRegions());
@@ -288,7 +298,7 @@ static void processSimpleOp(Operation *op, RunLivenessAnalysis &la,
/// (6) Marking all its results as non-live values.
static void processFuncOp(FunctionOpInterface funcOp, Operation *module,
RunLivenessAnalysis &la, DenseSet<Value> &nonLiveSet,
- RDVFinalCleanupList &cl) {
+ DenseSet<Value> &liveSet, RDVFinalCleanupList &cl) {
LDBG() << "Processing function op: "
<< OpWithFlags(funcOp, OpPrintingFlags().skipRegions());
if (funcOp.isPublic() || funcOp.isExternal()) {
@@ -299,7 +309,7 @@ static void processFuncOp(FunctionOpInterface funcOp, Operation *module,

// Get the list of unnecessary (non-live) arguments in `nonLiveArgs`.
SmallVector<Value> arguments(funcOp.getArguments());
- BitVector nonLiveArgs = markLives(arguments, nonLiveSet, la);
+ BitVector nonLiveArgs = markLives(arguments, nonLiveSet, liveSet, la);
nonLiveArgs = nonLiveArgs.flip();

// Do (1).
@@ -352,7 +362,8 @@ static void processFuncOp(FunctionOpInterface funcOp, Operation *module,
for (SymbolTable::SymbolUse use : uses) {
Operation *callOp = use.getUser();
assert(isa<CallOpInterface>(callOp) && "expected a call-like user");
- BitVector liveCallRets = markLives(callOp->getResults(), nonLiveSet, la);
+ BitVector liveCallRets =
+ markLives(callOp->getResults(), nonLiveSet, liveSet, la);
nonLiveRets &= liveCallRets.flip();
}

@@ -379,6 +390,127 @@ static void processFuncOp(FunctionOpInterface funcOp, Operation *module,
}
}

// Create a cheaper value with the same type of oldVal in front of CallOp.
static Value createDummyArgument(CallOpInterface callOp, Value oldVal) {
OpBuilder builder(callOp.getOperation());
Type type = oldVal.getType();

// Create zero constant for any supported type
if (TypedAttr zeroAttr = builder.getZeroAttr(type)) {
return builder.create<arith::ConstantOp>(oldVal.getLoc(), type, zeroAttr);
}
return {};
}
Collaborator

Can we split this out into a follow-up PR? Keep this PR about fixing the bugs without introducing an "aggressive" optimization, and introduce the optimization on its own afterward.

// When you mark a call operand as live, also mark its definition chain, recursively.
// We handle RegionBranchOpInterface here. I think we should handle BranchOpInterface as well.
void propagateBackward(Value val, DenseSet<Value> &liveSet) {
if (liveSet.contains(val)) return;
liveSet.insert(val);

if (auto defOp = val.getDefiningOp()) {
// Mark operands of live results as live
for (Value operand : defOp->getOperands()) {
propagateBackward(operand, liveSet);
}

// Handle RegionBranchOpInterface specially
if (auto regionBranchOp = dyn_cast<RegionBranchOpInterface>(defOp)) {
// If this is a result of a RegionBranchOpInterface, we need to trace back
// through the control flow to find the sources that contribute to this result

OpResult result = cast<OpResult>(val);
unsigned resultIndex = result.getResultNumber();

// Find all possible sources that can contribute to this result
// by examining all regions and their terminators
for (Region &region : regionBranchOp->getRegions()) {
if (region.empty()) continue;

// Get the successors from this region
SmallVector<RegionSuccessor> successors;
regionBranchOp.getSuccessorRegions(RegionBranchPoint(&region), successors);

// Check if any successor can produce this result
for (const RegionSuccessor &successor : successors) {
if (successor.isParent()) {
// This region can return to the parent operation
ValueRange successorInputs = successor.getSuccessorInputs();
if (resultIndex < successorInputs.size()) {
// Find the terminator that contributes to this result
Operation *terminator = region.back().getTerminator();
if (auto regionBranchTerm =
dyn_cast<RegionBranchTerminatorOpInterface>(terminator)) {
OperandRange terminatorOperands =
regionBranchTerm.getSuccessorOperands(RegionBranchPoint::parent());
if (resultIndex < terminatorOperands.size()) {
// This terminator operand contributes to our result
propagateBackward(terminatorOperands[resultIndex], liveSet);
}
}
}
}
}

// Also mark region arguments as live if they might contribute to this result
// Find which operand of the parent operation corresponds to region arguments
Block &entryBlock = region.front();
for (BlockArgument arg : entryBlock.getArguments()) {
// Get entry successor operands - these are the operands that flow
// from the parent operation to this region
SmallVector<RegionSuccessor> entrySuccessors;
regionBranchOp.getSuccessorRegions(RegionBranchPoint::parent(), entrySuccessors);

for (const RegionSuccessor &entrySuccessor : entrySuccessors) {
if (entrySuccessor.getSuccessor() == &region) {
// Get the operands that are forwarded to this region
OperandRange entryOperands =
regionBranchOp.getEntrySuccessorOperands(RegionBranchPoint::parent());
unsigned argIndex = arg.getArgNumber();
if (argIndex < entryOperands.size()) {
propagateBackward(entryOperands[argIndex], liveSet);
}
break;
}
}
}
}
}
}
}
static void processCallOp(CallOpInterface callOp, Operation *module,
Collaborator

Can you document the API?

RunLivenessAnalysis &la, DenseSet<Value> &nonLiveSet,
DenseSet<Value> &liveSet) {
if (!la.getSolverConfig().isInterprocedural())
return;
Collaborator

That deserves a comment.


Operation *callableOp = callOp.resolveCallable();
auto funcOp = dyn_cast<FunctionOpInterface>(callableOp);
if (!funcOp || !funcOp.isPublic()) {
return;
}
Comment on lines +489 to +491
Collaborator

Suggested change
- if (!funcOp || !funcOp.isPublic()) {
- return;
- }
+ if (!funcOp || !funcOp.isPublic())
+ return;

Nit: no trivial braces.


LDBG() << "processCallOp to a public function: " << funcOp.getName();
// Get the list of unnecessary (non-live) arguments in `nonLiveArgs`.
SmallVector<Value> arguments(funcOp.getArguments());
BitVector nonLiveArgs = markLives(arguments, nonLiveSet, liveSet, la);
nonLiveArgs = nonLiveArgs.flip();

if (nonLiveArgs.count() > 0) {
LDBG() << funcOp.getName() << " contains NonLive arguments";
// The number of operands in the call op may not match the number of
// arguments in the func op.
SmallVector<OpOperand *> callOpOperands =
operandsToOpOperands(callOp.getArgOperands());

for (int index : nonLiveArgs.set_bits()) {
OpOperand *operand = callOpOperands[index];
LDBG() << "mark operand " << index << " live " << operand->get();
propagateBackward(operand->get(), liveSet);
}
}
}

/// Process a region branch operation `regionBranchOp` using the liveness
/// information in `la`. The processing involves two scenarios:
///
@@ -411,12 +543,14 @@ static void processFuncOp(FunctionOpInterface funcOp, Operation *module,
static void processRegionBranchOp(RegionBranchOpInterface regionBranchOp,
RunLivenessAnalysis &la,
DenseSet<Value> &nonLiveSet,
DenseSet<Value> &liveSet,
RDVFinalCleanupList &cl) {
LDBG() << "Processing region branch op: "
<< OpWithFlags(regionBranchOp, OpPrintingFlags().skipRegions());
// Mark live results of `regionBranchOp` in `liveResults`.
auto markLiveResults = [&](BitVector &liveResults) {
- liveResults = markLives(regionBranchOp->getResults(), nonLiveSet, la);
+ liveResults =
+ markLives(regionBranchOp->getResults(), nonLiveSet, liveSet, la);
};

// Mark live arguments in the regions of `regionBranchOp` in `liveArgs`.
@@ -425,7 +559,7 @@ static void processRegionBranchOp(RegionBranchOpInterface regionBranchOp,
if (region.empty())
continue;
SmallVector<Value> arguments(region.front().getArguments());
- BitVector regionLiveArgs = markLives(arguments, nonLiveSet, la);
+ BitVector regionLiveArgs = markLives(arguments, nonLiveSet, liveSet, la);
liveArgs[&region] = regionLiveArgs;
}
};
@@ -619,7 +753,7 @@ static void processRegionBranchOp(RegionBranchOpInterface regionBranchOp,
// attributed to something else.
// Do (1') and (2').
if (isMemoryEffectFree(regionBranchOp.getOperation()) &&
- !hasLive(regionBranchOp->getResults(), nonLiveSet, la)) {
+ !hasLive(regionBranchOp->getResults(), nonLiveSet, liveSet, la)) {
cl.operations.push_back(regionBranchOp.getOperation());
return;
}
@@ -698,7 +832,7 @@ static void processRegionBranchOp(RegionBranchOpInterface regionBranchOp,

static void processBranchOp(BranchOpInterface branchOp, RunLivenessAnalysis &la,
DenseSet<Value> &nonLiveSet,
- RDVFinalCleanupList &cl) {
+ DenseSet<Value> &liveSet, RDVFinalCleanupList &cl) {
LDBG() << "Processing branch op: " << *branchOp;
unsigned numSuccessors = branchOp->getNumSuccessors();

@@ -716,7 +850,7 @@ static void processBranchOp(BranchOpInterface branchOp, RunLivenessAnalysis &la,

// Do (2)
BitVector successorNonLive =
- markLives(operandValues, nonLiveSet, la).flip();
+ markLives(operandValues, nonLiveSet, liveSet, la).flip();
collectNonLiveValues(nonLiveSet, successorBlock->getArguments(),
successorNonLive);

@@ -876,26 +1010,29 @@ void RemoveDeadValues::runOnOperation() {
// Tracks values eligible for erasure - complements liveness analysis to
// identify "droppable" values.
DenseSet<Value> deadVals;
+ // mark outgoing arguments to a public function LIVE. We also propagate
+ // liveness backward.
+ DenseSet<Value> liveVals;

// Maintains a list of Ops, values, branches, etc., slated for cleanup at the
// end of this pass.
RDVFinalCleanupList finalCleanupList;

- module->walk([&](Operation *op) {
+ module->walk<WalkOrder::PostOrder, BackwardIterator>([&](Operation *op) {
if (auto funcOp = dyn_cast<FunctionOpInterface>(op)) {
- processFuncOp(funcOp, module, la, deadVals, finalCleanupList);
+ processFuncOp(funcOp, module, la, deadVals, liveVals, finalCleanupList);
} else if (auto regionBranchOp = dyn_cast<RegionBranchOpInterface>(op)) {
- processRegionBranchOp(regionBranchOp, la, deadVals, finalCleanupList);
+ processRegionBranchOp(regionBranchOp, la, deadVals, liveVals,
+ finalCleanupList);
} else if (auto branchOp = dyn_cast<BranchOpInterface>(op)) {
- processBranchOp(branchOp, la, deadVals, finalCleanupList);
+ processBranchOp(branchOp, la, deadVals, liveVals, finalCleanupList);
} else if (op->hasTrait<::mlir::OpTrait::IsTerminator>()) {
// Nothing to do here because this is a terminator op and it should be
// honored with respect to its parent
} else if (isa<CallOpInterface>(op)) {
- // Nothing to do because this op is associated with a function op and gets
- // cleaned when the latter is cleaned.
+ processCallOp(cast<CallOpInterface>(op), module, la, deadVals, liveVals);
} else {
- processSimpleOp(op, la, deadVals, finalCleanupList);
+ processSimpleOp(op, la, deadVals, liveVals, finalCleanupList);
}
});

37 changes: 37 additions & 0 deletions mlir/test/Transforms/remove-dead-values.mlir
@@ -569,6 +569,43 @@ module @return_void_with_unused_argument {
call @fn_return_void_with_unused_argument(%arg0, %unused) : (i32, memref<4xi32>) -> ()
return %unused : memref<4xi32>
}

// the function signature is immutable because it is public.
func.func public @immutable_fn_with_unused_argument(%arg0: i32, %arg1: memref<4xf32>) -> () {
return
}

// CHECK-LABEL: func.func @main2
// CHECK: %[[ONE:.*]] = arith.constant 1 : i32
// CHECK: %[[UNUSED:.*]] = arith.addi %[[ONE]], %[[ONE]] : i32
// CHECK: %[[MEM:.*]] = memref.alloc() : memref<4xf32>
// CHECK: call @immutable_fn_with_unused_argument(%[[UNUSED]], %[[MEM]]) : (i32, memref<4xf32>) -> ()
func.func @main2() -> () {
%one = arith.constant 1 : i32
%scalar = arith.addi %one, %one: i32
%mem = memref.alloc() : memref<4xf32>

call @immutable_fn_with_unused_argument(%scalar, %mem) : (i32, memref<4xf32>) -> ()
return
}

// CHECK-LABEL: func.func @main3
// CHECK: %[[UNUSED:.*]] = scf.if %arg0 -> (i32)
// CHECK: %[[MEM:.*]] = memref.alloc() : memref<4xf32>
// CHECK: call @immutable_fn_with_unused_argument(%[[UNUSED]], %[[MEM]]) : (i32, memref<4xf32>) -> ()
func.func @main3(%arg0: i1) {
%0 = scf.if %arg0 -> (i32) {
%c1_i32 = arith.constant 1 : i32
scf.yield %c1_i32 : i32
} else {
%c0_i32 = arith.constant 0 : i32
scf.yield %c0_i32 : i32
}
%mem = memref.alloc() : memref<4xf32>

call @immutable_fn_with_unused_argument(%0, %mem) : (i32, memref<4xf32>) -> ()
return
}
}
Collaborator

Can you add two CFG examples where the blocks are listed in a different order, to ensure you're not sensitive to the order in which the blocks are stored in memory?

Contributor Author

Hi @joker-eph,
I added a test case as you suggested. Then I realized that it is non-trivial to propagate liveness in RemoveDeadValues.

Here is a test case that only involves RegionBranchOpInterface. Because %0 is live at line 11, we need to mark %0 as live at line 2. After that, we also need to mark %c1_i32 at line 4 and %c0_i32 at line 7 as live. In other words, we need to walk function @main3 in pre-order and backward.

     1	  func.func @main3(%arg0: i1) {
     2	    %0 = scf.if %arg0 -> (i32) {
     3	      %c1_i32 = arith.constant 1 : i32
     4	      scf.yield %c1_i32 : i32
     5	    } else {
     6	      %c0_i32 = arith.constant 0 : i32
     7	      scf.yield %c0_i32 : i32
     8	    }
     9	    %mem = memref.alloc() : memref<4xf32>
    10	
    11	    call @immutable_fn_with_unused_argument(%0, %mem) : (i32, memref<4xf32>) -> ()
    12	    return
    13	  }

I managed to fix this in propagateBackward, but it pretty much redoes what the liveness analysis has already done. TBH, I don't think this is the right way to proceed. RemoveDeadValues should keep a single responsibility.
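
The backward propagation described above for @main3 can be modeled in a few lines. This is a standalone sketch under simplifying assumptions, not the code in the diff: values are plain strings, and `ToyDefs` is a hypothetical map from each value to the values its definition depends on, where a region-branch result such as %0 depends on the operands yielded from each region as well as on the op's own operand.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical def summary: value -> values that its definition depends on.
using ToyDefs = std::map<std::string, std::vector<std::string>>;

// Marks `value` and everything it transitively depends on as live.
void propagateBackwardSketch(const std::string &value, const ToyDefs &defs,
                             std::set<std::string> &liveSet) {
  std::vector<std::string> worklist{value};
  while (!worklist.empty()) {
    std::string current = worklist.back();
    worklist.pop_back();
    // Skip values that are already live; this also guards against cycles.
    if (!liveSet.insert(current).second)
      continue;
    auto it = defs.find(current);
    if (it == defs.end())
      continue; // e.g. a block argument or a constant with no operands
    for (const std::string &operand : it->second)
      worklist.push_back(operand);
  }
}

// Dependencies for @main3: %0 is the scf.if result, fed by the condition
// %arg0 and by the two yielded constants. %mem has no operands.
ToyDefs main3Defs() {
  return {{"%0", {"%arg0", "%c1_i32", "%c0_i32"}}, {"%mem", {}}};
}
```

Starting from %0, the sketch marks %0, %arg0, %c1_i32, and %c0_i32 as live and leaves %mem untouched. The worklist replaces the recursion used in the diff, which avoids deep call stacks on long definition chains.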

I took a step back and thought about why we ended up here. The reason we are trying to propagate liveness in this pass is that:

  1. The liveness result is immutable.
  2. We somehow need to update the non-live arguments of a public function.

How about we introduce a new pass, 'privatize-public-function', that runs right before 'remove-dead-values'?

  1. It separates the interface from the implementation.
  2. If nothing changes, we preserve the liveness analysis. Otherwise, we invalidate it and let remove-dead-values recompute it.

Here is a demo of what this pass does. I think we can skip a cost model because we don't clone the function body; we just create a thin wrapper.

    public void foo(int unused) {...}

    void main() {
      arg = compute();
      call foo(arg);
    }

=>

    public void foo(int unused) { // interface
      return __foo_impl(unused);
    }

    private void __foo_impl(int unused) { // implementation, new function
      ... // the function body of the original foo
    }

    void main() {
      arg = compute();
      call __foo_impl(arg);
    }

This is my prototype. Do you think this is a more feasible solution?
navyxliu@b73f537#diff-904855c22d662d8afbc11c40fe2906259836ba53e907e1cc899e6355358ec482

Collaborator

Seems reasonable, but this needs to be callable from RemoveDeadValues itself (the pass itself must not crash here).
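
For reference, the wrapper split proposed in this thread looks roughly like the following in plain C++. This is only an analogy for the IR transformation, not the prototype pass: `foo`, `fooImpl`, `compute`, `implCalls`, and the two caller functions are hypothetical names, and the anonymous namespace stands in for private symbol visibility.

```cpp
#include <cassert>

namespace {
int implCalls = 0; // counts how often the implementation runs (for the demo)

// Private implementation: every call site is visible in this translation
// unit, so its signature could later be changed (e.g. dropping `unused`).
void fooImpl(int unused) {
  (void)unused;
  ++implCalls;
}
} // namespace

// Public interface: the signature must stay as it is, so it only forwards.
void foo(int unused) { fooImpl(unused); }

int compute() { return 7; }

// Before the transformation, local callers go through the public function.
void callerBefore() { foo(compute()); }

// After the transformation, local callers target the private implementation.
void callerAfter() { fooImpl(compute()); }
```

Both callers end up running the same body exactly once per call. Only the call to `fooImpl` can have its argument removed later, because `foo` may still have callers outside the module.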


// -----