Description
The problem only appears when both passes run in a single mlir-opt invocation on the IR below. If we instead run convert-scf-to-cf first, copy the output IR to a file, and then apply test-print-liveness to that file, we don't observe the problem. (Please bear with me on the weird variable names; the IR is modified from Triton IR.)
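For reference, the two-step variant that does not hit the problem looks roughly like this (a sketch, assuming the IR below is saved as test.mlir and using the standalone pass flags):

mlir-opt -convert-scf-to-cf test.mlir -o lowered.mlir   # lower scf to cf, write the result to a file
mlir-opt -test-print-liveness lowered.mlir              # liveness on the lowered file works fine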
We found one possible explanation. Maybe isBeforeInBlock should add an edge case for other == this, because updateOrderIfNecessary is expected to be invoked only on blocks with more than one operation. Skipping updateOrderIfNecessary when other == this does make the problem go away, but that doesn't explain why splitting convert-scf-to-cf and test-print-liveness into separate invocations also avoids it.
llvm-project/mlir/lib/IR/Operation.cpp, line 274 in fd5d92e:
bool Operation::isBeforeInBlock(Operation *other) {

llvm-project/mlir/lib/IR/Operation.cpp, line 304 in fd5d92e:
assert(blockFront != blockBack && "expected more than one operation");
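A minimal sketch of the workaround we tried (not a proper patch; the surrounding structure of isBeforeInBlock is only paraphrased from the revision above):

// Sketch: bail out before any order bookkeeping when an operation is
// compared against itself, so updateOrderIfNecessary() (which asserts that
// the block holds more than one operation) is never reached for this query.
bool Operation::isBeforeInBlock(Operation *other) {
  assert(block && "Operations without parent blocks have no order.");
  assert(other && other->block == block &&
         "Expected other operation to have the same parent block.");
  if (other == this)
    return false; // an operation is never strictly before itself
  updateOrderIfNecessary();         // existing logic: refresh cached order indices
  other->updateOrderIfNecessary();
  return orderIndex < other->orderIndex;
}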
cc @ptillet
Tested with f50cad2
Reproduced with:
mlir-opt -pass-pipeline="builtin.module(func.func(convert-scf-to-cf), func.func(test-print-liveness))" ./test.mlir
func.func @for_if_for(%lb : index, %ub : index, %step : index, %i1 : i1) {
  %a_shared_init = arith.constant dense<0.00e+00> : tensor<128x32xf16>
  %b_shared_init = arith.constant dense<0.00e+00> : tensor<128x32xf16>
  %c_shared_init = arith.constant dense<0.00e+00> : tensor<128x32xf16>
  %c_blocked = arith.negf %c_shared_init : tensor<128x32xf16>
  %a_shared, %b_shared, %c_shared = scf.for %iv = %lb to %ub step %step iter_args(%a_shared = %a_shared_init, %b_shared = %b_shared_init, %c_shared = %c_shared_init) -> (tensor<128x32xf16>, tensor<128x32xf16>, tensor<128x32xf16>) {
    %c_shared_next_next = scf.if %i1 -> tensor<128x32xf16> {
      %cst0 = arith.constant dense<0.00e+00> : tensor<128x32xf16>
      scf.yield %cst0 : tensor<128x32xf16>
    } else {
      %c_shared_ = scf.for %jv = %lb to %ub step %step iter_args(%c_shared_next = %c_shared) -> (tensor<128x32xf16>) {
        %c_blocked_next = arith.negf %c_shared_next : tensor<128x32xf16>
        scf.yield %c_shared_next : tensor<128x32xf16>
      }
      scf.yield %c_shared_ : tensor<128x32xf16>
    }
    %b_blocked_next = arith.negf %b_shared : tensor<128x32xf16>
    scf.yield %a_shared, %b_shared, %c_shared_next_next : tensor<128x32xf16>, tensor<128x32xf16>, tensor<128x32xf16>
  }
  return
}