[compiler][stream] iree-stream-schedule-execution produces an invalid partitioning with multiple devices #18176

Closed
sogartar opened this issue Aug 9, 2024 · 2 comments
Labels
compiler/dialects Relating to the IREE compiler dialects (flow, hal, vm)

Comments


sogartar commented Aug 9, 2024

I ran into an input IR for the iree-stream-schedule-execution pass that causes a post-pass verification failure:

error: partition set out of order; value captured declared as escaping below ...

Full error

/home/bpetkant/ws/iree/experiments/stream-partitioning-affinity-bug/iree-stream-schedule-execution-input.mlir:19:3: error: partition set out of order; value captured declared as escaping below: %63 = "stream.async.dispatch"(%6, %10, %43#1, %6, %10, %4, %13, %4, %4, %4) <{affinity = #hal.device.affinity<@__device_1>, entry_points = [@main$async_dispatch_16::@main$async_dispatch_16_elementwise_2x8_f32], operandSegmentSizes = array<i32: 2, 3, 1, 1, 1, 1, 1>, tied_operands = [-1 : index]}> : (index, index, !stream.resource<external>, index, index, index, index, index, index, index) -> !stream.resource<transient>
  util.func public @main$async(%arg0: !hal.buffer_view, %arg1: !hal.buffer_view, %arg2: !hal.buffer_view, %arg3: !hal.buffer_view, %arg4: !hal.fence, %arg5: !hal.fence) -> (!hal.buffer_view, !hal.buffer_view) attributes {inlining_policy = #util.inline.never, iree.abi.model = "coarse-fences", iree.abi.stub} {
  ^
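
For readers unfamiliar with the message, here is a minimal Python sketch of the kind of ordering invariant it seems to describe: a partition should not capture a value that only escapes from a partition placed later in the set. This is an assumption based on the wording of the error, not IREE's actual verifier; `Partition`, `find_out_of_order`, and the value names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Partition:
    """Hypothetical stand-in for one partition of ops."""
    name: str
    ins: set[str] = field(default_factory=set)   # values captured from outside
    outs: set[str] = field(default_factory=set)  # values escaping the partition


def find_out_of_order(partitions: list[Partition]) -> list[tuple[str, str, str]]:
    """Return (consumer, value, producer) triples where a partition captures
    a value that escapes from a partition listed after it."""
    errors = []
    for i, consumer in enumerate(partitions):
        for later in partitions[i + 1:]:
            # Any captured value produced "below" violates the ordering.
            for value in sorted(consumer.ins & later.outs):
                errors.append((consumer.name, value, later.name))
    return errors


# Producer first, consumer second: no violations.
ordered = [
    Partition("p0", ins={"%arg"}, outs={"%a"}),
    Partition("p1", ins={"%a"}, outs={"%b"}),
]
# Same partitions in the wrong order: p1 captures %a before p0 produces it.
swapped = list(reversed(ordered))

print(find_out_of_order(ordered))
print(find_out_of_order(swapped))
```

In this toy model the swapped list reports that `p1` captures `%a`, which escapes from `p0` below it. The real pass also has to account for device affinities, which the sketch ignores.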

Invocation of the iree-stream-schedule-execution pass

iree-opt \
  --pass-pipeline="builtin.module(util.func(iree-stream-schedule-execution))" \
  iree-stream-schedule-execution-input.mlir

Compilation of the frontend input.

iree-compile \
    frontend-input.mlir \
    --iree-input-type=torch \
    --iree-hal-target-device=llvm-cpu[0] \
    --iree-hal-target-device=llvm-cpu[1] \
    -o output.vmfb

Here are the frontend input and the iree-stream-schedule-execution pass input.

The error appears on top of 643f719.

@sogartar sogartar added the compiler/dialects Relating to the IREE compiler dialects (flow, hal, vm) label Aug 9, 2024
@sogartar sogartar self-assigned this Aug 9, 2024

sogartar commented Aug 9, 2024

I would like to debug this. I tried to create a smaller IR that exhibits the problem, but have been unable to so far. I will explore the partitioning algorithm in more detail.

@sogartar
Contributor Author

PR #18217 attempts to fix this.
