
Lowering of i1 types conflicts with detensoring #6756

Closed
ScottTodd opened this issue Aug 13, 2021 · 4 comments · Fixed by #6852
Assignees
Labels
bug 🐞 Something isn't working compiler/dialects Relating to the IREE compiler dialects (flow, hal, vm)

Comments

@ScottTodd
Collaborator

I'm encountering this while working on #1159. Compiling the tosa `if.mlir` file using `-iree-flow-enable-linalg-detensorize` produces `error: TensorRewriteAdaptor newValue is invalid type 'tensor<i8>'` during `ConvertToHALPass`, but there are deeper issues with how `i1` types are handled throughout compilation.

Here's the relevant section of IR through the compilation flow (the full IR dump is here):

// original
  builtin.func @if_true_test(%arg0: tensor<i1>, %arg1: tensor<i32>) -> tensor<i32> {
    %0 = "tosa.cond_if"(%arg0, %arg1) ( {

// -----// IR Dump After mlir::iree_compiler::IREE::ABI::WrapEntryPointsPass //----- //
// -----// IR Dump Before Canonicalizer //----- //
builtin.func @if_true_test(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view) -> !hal.buffer_view {
  %0 = hal.tensor.cast %arg0 : !hal.buffer_view -> tensor<i1>
  %2 = "tosa.cond_if"(%0, %1) ( {

// -----// IR Dump After TosaToSCF //----- //
// -----// IR Dump After TopLevelSCFToCFG //----- //
builtin.func @if_true_test(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view) -> !hal.buffer_view {
  %0 = hal.tensor.cast %arg0 : !hal.buffer_view -> tensor<i1>
  %2 = tensor.extract %0[] : tensor<i1>
  cond_br %2, ^bb1, ^bb2

// -----// IR Dump After PromoteTensorLoads //----- //
// Note: this runs twice - this is the first time!
builtin.func @if_true_test(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view) -> !hal.buffer_view {
  %0 = hal.tensor.cast %arg0 : !hal.buffer_view -> tensor<i1>
  %2 = zexti %0 : tensor<i1> to tensor<i8>
  %3 = flow.tensor.load %2 : tensor<i8>
  %4 = trunci %3 : i8 to i1
  cond_br %4, ^bb1, ^bb2

// -----// IR Dump After ConvertElementwiseToLinalg //----- //
// Note: i1 goes into the linalg region, handled by the HAL without detensoring
builtin.func @if_true_test(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view) -> !hal.buffer_view {
  %0 = hal.tensor.cast %arg0 : !hal.buffer_view -> tensor<i1>
  %2 = linalg.init_tensor [] : tensor<i8>
  %3 = linalg.generic {indexing_maps = [affine_map<() -> ()>, affine_map<() -> ()>], iterator_types = []} ins(%0 : tensor<i1>) outs(%2 : tensor<i8>) {
  ^bb0(%arg2: i1, %arg3: i8):  // no predecessors
    %10 = zexti %arg2 : i1 to i8
    linalg.yield %10 : i8
  } -> tensor<i8>
  %4 = flow.tensor.load %3 : tensor<i8>
  %5 = trunci %4 : i8 to i1
  cond_br %5, ^bb1, ^bb2(%1 : tensor<i32>)

// -----// IR Dump After LinalgDetensorize //----- //
// Note: detensoring pulls the i1 back out of its linalg region
builtin.func @if_true_test(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view) -> !hal.buffer_view {
  %0 = hal.tensor.cast %arg0 : !hal.buffer_view -> tensor<i1>
  %2 = tensor.extract %0[] : tensor<i1>
  %3 = zexti %2 : i1 to i8
  %4 = tensor.from_elements %3 : tensor<1xi8>
  %5 = linalg.tensor_collapse_shape %4 [] : tensor<1xi8> into tensor<i8>
  %6 = flow.tensor.load %5 : tensor<i8>
  %7 = trunci %6 : i8 to i1
  cond_br %7, ^bb1, ^bb2(%8 : i32)

// -----// IR Dump After ConvertTensorOps //----- //
// Note: the next 3 passes are cleanup after detensoring
builtin.func @if_true_test(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view) -> !hal.buffer_view {
  %c1 = constant 1 : index
  %0 = hal.tensor.cast %arg0 : !hal.buffer_view -> tensor<i1>
  %2 = tensor.extract %0[] : tensor<i1>
  %3 = zexti %2 : i1 to i8
  %4 = flow.tensor.splat %3 : tensor<1xi8>{%c1}
  %5 = linalg.tensor_collapse_shape %4 [] : tensor<1xi8> into tensor<i8>
  %6 = flow.tensor.load %5 : tensor<i8>
  %7 = trunci %6 : i8 to i1
  cond_br %7, ^bb1, ^bb2(%8 : i32)

// -----// IR Dump After ConvertLinalgTensorOps //----- //
builtin.func @if_true_test(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view) -> !hal.buffer_view {
  %c1 = constant 1 : index
  %0 = hal.tensor.cast %arg0 : !hal.buffer_view -> tensor<i1>
  %2 = tensor.extract %0[] : tensor<i1>
  %3 = zexti %2 : i1 to i8
  %4 = flow.tensor.splat %3 : tensor<1xi8>{%c1}
  %5 = flow.tensor.reshape %4 : tensor<1xi8> -> tensor<i8>
  %6 = flow.tensor.load %5 : tensor<i8>
  %7 = trunci %6 : i8 to i1
  cond_br %7, ^bb1, ^bb2(%8 : i32)

// -----// IR Dump After Canonicalizer //----- //
// Note: we're back to roughly where we started!
builtin.func @if_true_test(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view) -> !hal.buffer_view {
  %0 = hal.tensor.cast %arg0 : !hal.buffer_view -> tensor<i1>
  %2 = tensor.extract %0[] : tensor<i1>
  cond_br %2, ^bb1, ^bb2(%3 : i32)
// -----// IR Dump Before DispatchLinalgOnTensors //----- //

// -----// IR Dump After PromoteTensorLoads //----- //
// Note: second time this pass runs, and oops! We've discarded all the cleanup we did earlier...
builtin.func @if_true_test(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view) -> !hal.buffer_view {
  %0 = hal.tensor.cast %arg0 : !hal.buffer_view -> tensor<i1>
  %2 = zexti %0 : tensor<i1> to tensor<i8>
  %3 = flow.tensor.load %2 : tensor<i8>
  %4 = trunci %3 : i8 to i1
  cond_br %4, ^bb1, ^bb2(%5 : i32)

// -----// IR Dump Before mlir::iree_compiler::IREE::HAL::`anonymous-namespace'::ConvertToHALPass //----- //

D:\dev\projects\iree/../iree-data/tests/tosa_if.mlir:3:8: error: TensorRewriteAdaptor newValue is invalid type 'tensor<i8>'
  %2 = "tosa.cond_if"(%0, %1) ( {
       ^
D:\dev\projects\iree/../iree-data/tests/tosa_if.mlir:1:1: note: called from
func @if_true_test(%0: tensor<i1>, %1: tensor<i32>) -> tensor<i32> {
^
D:\dev\projects\iree/../iree-data/tests/tosa_if.mlir:3:8: error: 'flow.tensor.load' op cannot create adaptor for source
@ScottTodd ScottTodd added bug 🐞 Something isn't working compiler/dialects Relating to the IREE compiler dialects (flow, hal, vm) labels Aug 13, 2021
@ScottTodd ScottTodd self-assigned this Aug 13, 2021
@benvanik
Collaborator

PromoteTensorLoads seems broken/actively harmful. We should kill it.

When everything is working, flow should be fine with i1 operations (`flow.tensor.load` of i1/etc) - so we shouldn't need to do any manipulation in the flow dialect. It's only when we go to the HAL that i1s become bad (storage is incompatible).

The other thing we should add is a verification step at the end of flow that ensures there are no more operations acting on tensors besides the flow ops (`flow.dispatch`/`flow.tensor.*` etc) - the HAL issue you are seeing is because that `zexti` on a tensor type is still around when it can no longer be.
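The verification step described above could look roughly like this hedged sketch (not actual IREE code; ops are modeled as hypothetical `(name, types)` pairs rather than real MLIR operations):

```python
# Sketch of the proposed end-of-flow invariant: after flow transformations,
# every op that still touches a tensor-typed value must belong to the flow
# dialect (flow.dispatch, flow.tensor.*, etc.). Anything else is an error.

FLOW_PREFIX = "flow."

def verify_flow_tensor_ops(ops):
    """Return the names of ops that act on tensors but aren't flow ops."""
    offenders = []
    for name, types in ops:
        touches_tensor = any(t.startswith("tensor<") for t in types)
        if touches_tensor and not name.startswith(FLOW_PREFIX):
            offenders.append(name)
    return offenders

# The zexti-on-a-tensor from the IR dumps above would be flagged:
ops = [
    ("flow.tensor.load", ["tensor<i8>", "i8"]),
    ("std.zexti", ["tensor<i1>", "tensor<i8>"]),
    ("std.cond_br", ["i1"]),
]
print(verify_flow_tensor_ops(ops))  # -> ['std.zexti']
```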

@benvanik
Collaborator

Oh, and it's flow->hal that cares about i1 on the outside, but codegen will care on the inside - a `flow.dispatch.tensor.load`/`store` inside an executable can operate on i1, but when converting that to `hal.interface.*` it needs to be adjusted for the storage format.
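The storage-format adjustment amounts to the widen-on-store / narrow-on-load pattern visible as the `zexti`/`trunci` pair in the IR dumps above. A minimal sketch (plain Python, not IREE code) of legalizing i1 to byte-addressable i8 storage:

```python
# i1 has no byte-addressable storage, so a backend can legalize it by
# zero-extending to i8 when storing and truncating back when loading,
# mirroring the zexti/trunci pair from the compiler output above.

def store_i1(buffer, index, value):
    # zext i1 -> i8: True stores as 1, False as 0
    buffer[index] = 1 if value else 0

def load_i1(buffer, index):
    # trunc i8 -> i1: keep only the low bit
    return (buffer[index] & 1) == 1

buf = bytearray(4)
store_i1(buf, 2, True)
print(load_i1(buf, 2))  # -> True
print(load_i1(buf, 0))  # -> False
```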

@ScottTodd
Collaborator Author

👍 SGTM. I'll work on untangling this when I get back from vacation. This is another case where documentation (or verification steps) around various types would help (#5223).

@ScottTodd
Collaborator Author

(More notes when I get back, or if someone else wants to chip in...)

This also relates to #3102, and that issue has more information about storage formats and points during compilation that type conversions should be handled. The if.mlir file I've been working with is pretty simple, so it would still help to have a few more complex open source models that use flow control (with i1s/booleans) in meaningful ways to poke at.

ScottTodd added a commit that referenced this issue Aug 25, 2021
…6852)

Fixes #6756 (the [`tosa if.mlir`](https://github.com/google/iree/blob/main/iree/test/e2e/tosa_ops/if.mlir) file compiles successfully using `-iree-flow-enable-linalg-detensorize` with this change)

The `PromoteTensorLoads` pass was converting `i1` loads to `i8` loads using `ZeroExtendIOp` and `TruncateIOp`. That was producing weird cycles during compilation when detensoring was applied, and `flow` ops should be fine with i1 types. We still need to handle `i1` types when going to the HAL (since storage is incompatible) on the outside (external interface) and inside (codegen).