
Lowering of i1 types conflicts with detensoring #6756

Closed
ScottTodd opened this issue Aug 13, 2021 · 4 comments · Fixed by #6852
Assignees
Labels
bug 🐞 Something isn't working compiler/dialects Relating to the IREE compiler dialects (flow, hal, vm)

Comments

@ScottTodd
Collaborator

I'm encountering this while working on #1159. Compiling the tosa `if.mlir` file using `-iree-flow-enable-linalg-detensorize` produces `error: TensorRewriteAdaptor newValue is invalid type 'tensor<i8>'` during `ConvertToHALPass`, but there are deeper issues with how `i1` types are handled throughout compilation.

Here's the relevant section of IR through the compilation flow (the full IR dump is here):

// original
  builtin.func @if_true_test(%arg0: tensor<i1>, %arg1: tensor<i32>) -> tensor<i32> {
    %0 = "tosa.cond_if"(%arg0, %arg1) ( {

// -----// IR Dump After mlir::iree_compiler::IREE::ABI::WrapEntryPointsPass //----- //
// -----// IR Dump Before Canonicalizer //----- //
builtin.func @if_true_test(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view) -> !hal.buffer_view {
  %0 = hal.tensor.cast %arg0 : !hal.buffer_view -> tensor<i1>
  %2 = "tosa.cond_if"(%0, %1) ( {

// -----// IR Dump After TosaToSCF //----- //
// -----// IR Dump After TopLevelSCFToCFG //----- //
builtin.func @if_true_test(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view) -> !hal.buffer_view {
  %0 = hal.tensor.cast %arg0 : !hal.buffer_view -> tensor<i1>
  %2 = tensor.extract %0[] : tensor<i1>
  cond_br %2, ^bb1, ^bb2

// -----// IR Dump After PromoteTensorLoads //----- //
// Note: this runs twice - this is the first time!
builtin.func @if_true_test(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view) -> !hal.buffer_view {
  %0 = hal.tensor.cast %arg0 : !hal.buffer_view -> tensor<i1>
  %2 = zexti %0 : tensor<i1> to tensor<i8>
  %3 = flow.tensor.load %2 : tensor<i8>
  %4 = trunci %3 : i8 to i1
  cond_br %4, ^bb1, ^bb2

// -----// IR Dump After ConvertElementwiseToLinalg //----- //
// Note: i1 goes into the linalg region, handled by the HAL without detensoring
builtin.func @if_true_test(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view) -> !hal.buffer_view {
  %0 = hal.tensor.cast %arg0 : !hal.buffer_view -> tensor<i1>
  %2 = linalg.init_tensor [] : tensor<i8>
  %3 = linalg.generic {indexing_maps = [affine_map<() -> ()>, affine_map<() -> ()>], iterator_types = []} ins(%0 : tensor<i1>) outs(%2 : tensor<i8>) {
  ^bb0(%arg2: i1, %arg3: i8):  // no predecessors
    %10 = zexti %arg2 : i1 to i8
    linalg.yield %10 : i8
  } -> tensor<i8>
  %4 = flow.tensor.load %3 : tensor<i8>
  %5 = trunci %4 : i8 to i1
  cond_br %5, ^bb1, ^bb2(%1 : tensor<i32>)

// -----// IR Dump After LinalgDetensorize //----- //
// Note: detensoring pulls the i1 back out of its linalg region
builtin.func @if_true_test(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view) -> !hal.buffer_view {
  %0 = hal.tensor.cast %arg0 : !hal.buffer_view -> tensor<i1>
  %2 = tensor.extract %0[] : tensor<i1>
  %3 = zexti %2 : i1 to i8
  %4 = tensor.from_elements %3 : tensor<1xi8>
  %5 = linalg.tensor_collapse_shape %4 [] : tensor<1xi8> into tensor<i8>
  %6 = flow.tensor.load %5 : tensor<i8>
  %7 = trunci %6 : i8 to i1
  cond_br %7, ^bb1, ^bb2(%8 : i32)

// -----// IR Dump After ConvertTensorOps //----- //
// Note: the next 3 passes are cleanup after detensoring
builtin.func @if_true_test(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view) -> !hal.buffer_view {
  %c1 = constant 1 : index
  %0 = hal.tensor.cast %arg0 : !hal.buffer_view -> tensor<i1>
  %2 = tensor.extract %0[] : tensor<i1>
  %3 = zexti %2 : i1 to i8
  %4 = flow.tensor.splat %3 : tensor<1xi8>{%c1}
  %5 = linalg.tensor_collapse_shape %4 [] : tensor<1xi8> into tensor<i8>
  %6 = flow.tensor.load %5 : tensor<i8>
  %7 = trunci %6 : i8 to i1
  cond_br %7, ^bb1, ^bb2(%8 : i32)

// -----// IR Dump After ConvertLinalgTensorOps //----- //
builtin.func @if_true_test(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view) -> !hal.buffer_view {
  %c1 = constant 1 : index
  %0 = hal.tensor.cast %arg0 : !hal.buffer_view -> tensor<i1>
  %2 = tensor.extract %0[] : tensor<i1>
  %3 = zexti %2 : i1 to i8
  %4 = flow.tensor.splat %3 : tensor<1xi8>{%c1}
  %5 = flow.tensor.reshape %4 : tensor<1xi8> -> tensor<i8>
  %6 = flow.tensor.load %5 : tensor<i8>
  %7 = trunci %6 : i8 to i1
  cond_br %7, ^bb1, ^bb2(%8 : i32)

// -----// IR Dump After Canonicalizer //----- //
// Note: we're back to roughly where we started!
builtin.func @if_true_test(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view) -> !hal.buffer_view {
  %0 = hal.tensor.cast %arg0 : !hal.buffer_view -> tensor<i1>
  %2 = tensor.extract %0[] : tensor<i1>
  cond_br %2, ^bb1, ^bb2(%3 : i32)
// -----// IR Dump Before DispatchLinalgOnTensors //----- //

// -----// IR Dump After PromoteTensorLoads //----- //
// Note: second time this pass runs, and oops! We've discarded all the cleanup we did earlier...
builtin.func @if_true_test(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view) -> !hal.buffer_view {
  %0 = hal.tensor.cast %arg0 : !hal.buffer_view -> tensor<i1>
  %2 = zexti %0 : tensor<i1> to tensor<i8>
  %3 = flow.tensor.load %2 : tensor<i8>
  %4 = trunci %3 : i8 to i1
  cond_br %4, ^bb1, ^bb2(%5 : i32)

// -----// IR Dump Before mlir::iree_compiler::IREE::HAL::`anonymous-namespace'::ConvertToHALPass //----- //

D:\dev\projects\iree/../iree-data/tests/tosa_if.mlir:3:8: error: TensorRewriteAdaptor newValue is invalid type 'tensor<i8>'
  %2 = "tosa.cond_if"(%0, %1) ( {
       ^
D:\dev\projects\iree/../iree-data/tests/tosa_if.mlir:1:1: note: called from
func @if_true_test(%0: tensor<i1>, %1: tensor<i32>) -> tensor<i32> {
^
D:\dev\projects\iree/../iree-data/tests/tosa_if.mlir:3:8: error: 'flow.tensor.load' op cannot create adaptor for source
@ScottTodd ScottTodd added bug 🐞 Something isn't working compiler/dialects Relating to the IREE compiler dialects (flow, hal, vm) labels Aug 13, 2021
@ScottTodd ScottTodd self-assigned this Aug 13, 2021
@benvanik
Collaborator

PromoteTensorLoads seems broken/actively harmful. We should kill it.

When everything is working, flow should be fine with i1 operations (`flow.tensor.load` of i1/etc) - so we shouldn't need to do any manipulation in the flow dialect. It's only when we go to the HAL that i1s become bad (storage is incompatible).

The other thing we should add is a verification step at the end of flow that ensures there are no more operations acting on tensors besides the flow ops (`flow.dispatch`/`flow.tensor.*` etc) - the HAL issue you are seeing is because that `zexti` on a tensor type is still around when it can no longer be.
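The verification step described above could look roughly like this hedged sketch (not actual IREE code; ops are modeled as hypothetical `(name, types)` pairs rather than real MLIR operations):

```python
# Sketch of the proposed end-of-flow invariant: after flow transformations,
# every op that still touches a tensor-typed value must belong to the flow
# dialect (flow.dispatch, flow.tensor.*, etc.). Anything else is an error.

FLOW_PREFIX = "flow."

def verify_flow_tensor_ops(ops):
    """Return the names of ops that act on tensors but aren't flow ops."""
    offenders = []
    for name, types in ops:
        touches_tensor = any(t.startswith("tensor<") for t in types)
        if touches_tensor and not name.startswith(FLOW_PREFIX):
            offenders.append(name)
    return offenders

# The zexti-on-a-tensor from the IR dumps above would be flagged:
ops = [
    ("flow.tensor.load", ["tensor<i8>", "i8"]),
    ("std.zexti", ["tensor<i1>", "tensor<i8>"]),
    ("std.cond_br", ["i1"]),
]
print(verify_flow_tensor_ops(ops))  # -> ['std.zexti']
```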

@benvanik
Collaborator

Oh, and it's flow->hal that cares about i1 on the outside, but codegen will care on the inside - a `flow.dispatch.tensor.load`/`store` inside an executable can operate on i1, but when converting that to `hal.interface.*` it needs to be adjusted for the storage format.
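The storage-format adjustment amounts to the widen-on-store / narrow-on-load pattern visible as the `zexti`/`trunci` pair in the IR dumps above. A minimal sketch (plain Python, not IREE code) of legalizing i1 to byte-addressable i8 storage:

```python
# i1 has no byte-addressable storage, so a backend can legalize it by
# zero-extending to i8 when storing and truncating back when loading,
# mirroring the zexti/trunci pair from the compiler output above.

def store_i1(buffer, index, value):
    # zext i1 -> i8: True stores as 1, False as 0
    buffer[index] = 1 if value else 0

def load_i1(buffer, index):
    # trunc i8 -> i1: keep only the low bit
    return (buffer[index] & 1) == 1

buf = bytearray(4)
store_i1(buf, 2, True)
print(load_i1(buf, 2))  # -> True
print(load_i1(buf, 0))  # -> False
```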

@ScottTodd
Collaborator Author

👍 SGTM. I'll work on untangling this when I get back from vacation. This is another case where documentation (or verification steps) around various types would help (#5223).

@ScottTodd
Collaborator Author

(More notes when I get back, or if someone else wants to chip in...)

This also relates to #3102, and that issue has more information about storage formats and points during compilation that type conversions should be handled. The if.mlir file I've been working with is pretty simple, so it would still help to have a few more complex open source models that use flow control (with i1s/booleans) in meaningful ways to poke at.

ScottTodd added a commit that referenced this issue Aug 25, 2021
…6852)

Fixes #6756 (the [`tosa if.mlir`](https://github.com/google/iree/blob/main/iree/test/e2e/tosa_ops/if.mlir) file compiles successfully using `-iree-flow-enable-linalg-detensorize` with this change)

The `PromoteTensorLoads` pass was converting `i1` loads to `i8` loads using `ZeroExtendIOp` and `TruncateIOp`. That was producing weird cycles during compilation when detensoring was applied, and `flow` ops should be fine with i1 types. We still need to handle `i1` types when going to the HAL (since storage is incompatible) on the outside (external interface) and inside (codegen).