Support Steering-Control-Based Dataflow Representation & Mapping #144

Merged: ShangkunLi merged 9 commits into coredac:main from ShangkunLi:steer-control on Oct 8, 2025

@ShangkunLi
Collaborator

In this PR, we extend our current predicate-based dataflow representation to support a steering-control-based dataflow representation by:

  • Introducing dedicated operations: true_steer, false_steer, invariant, carry, and merge
  • Implementing the transform-to-steer-control pass, which transforms our predicate-based dataflow IR into steering-control form
  • Implementing the remove-predicated-type pass, which converts predicated data types back to their underlying generic data types
  • Making minor changes in mapping_util.cpp to enable mapping of steering-control-based dataflow IR

For example, the source IR of a simple loop is:

module {
  func.func @simple_add_loop() -> i64 attributes {accelerator = "neura"} {
    %0 = "neura.constant"() <{predicate = true, value = 16 : i64}> : () -> !neura.data<i64, i1>
    %1 = "neura.grant_once"(%0) : (!neura.data<i64, i1>) -> !neura.data<i64, i1>
    %2 = "neura.constant"() <{predicate = true, value = 1 : i64}> : () -> !neura.data<i64, i1>
    %3 = "neura.grant_once"(%2) : (!neura.data<i64, i1>) -> !neura.data<i64, i1>
    %4 = "neura.constant"() <{predicate = true, value = 1 : i64}> : () -> !neura.data<i64, i1>
    %5 = "neura.grant_once"(%4) : (!neura.data<i64, i1>) -> !neura.data<i64, i1>
    %6 = "neura.constant"() <{predicate = true, value = 0 : i64}> : () -> !neura.data<i64, i1>
    %7 = "neura.grant_once"(%6) : (!neura.data<i64, i1>) -> !neura.data<i64, i1>
    %8 = neura.reserve : !neura.data<i64, i1>
    %9 = "neura.phi"(%8, %1) : (!neura.data<i64, i1>, !neura.data<i64, i1>) -> !neura.data<i64, i1>
    %10 = neura.reserve : !neura.data<i64, i1>
    %11 = "neura.phi"(%10, %5) : (!neura.data<i64, i1>, !neura.data<i64, i1>) -> !neura.data<i64, i1>
    %12 = neura.reserve : !neura.data<i64, i1>
    %13 = "neura.phi"(%12, %7) : (!neura.data<i64, i1>, !neura.data<i64, i1>) -> !neura.data<i64, i1>
    %14 = "neura.icmp"(%13, %9) <{cmpType = "slt"}> : (!neura.data<i64, i1>, !neura.data<i64, i1>) -> !neura.data<i1, i1>
    %15 = neura.grant_predicate %11, %14 : !neura.data<i64, i1>, !neura.data<i1, i1> -> !neura.data<i64, i1>
    %16 = neura.grant_predicate %13, %14 : !neura.data<i64, i1>, !neura.data<i1, i1> -> !neura.data<i64, i1>
    %17 = neura.grant_predicate %3, %14 : !neura.data<i64, i1>, !neura.data<i1, i1> -> !neura.data<i64, i1>
    %18 = neura.grant_predicate %1, %14 : !neura.data<i64, i1>, !neura.data<i1, i1> -> !neura.data<i64, i1>
    %19 = "neura.not"(%14) : (!neura.data<i1, i1>) -> !neura.data<i1, i1>
    %20 = neura.grant_predicate %11, %19 : !neura.data<i64, i1>, !neura.data<i1, i1> -> !neura.data<i64, i1>
    %21 = "neura.add"(%15, %15) : (!neura.data<i64, i1>, !neura.data<i64, i1>) -> !neura.data<i64, i1>
    %22 = "neura.add"(%16, %17) : (!neura.data<i64, i1>, !neura.data<i64, i1>) -> !neura.data<i64, i1>
    neura.ctrl_mov %22 -> %12 : !neura.data<i64, i1> !neura.data<i64, i1>
    neura.ctrl_mov %21 -> %10 : !neura.data<i64, i1> !neura.data<i64, i1>
    neura.ctrl_mov %18 -> %8 : !neura.data<i64, i1> !neura.data<i64, i1>
    "neura.return"(%20) : (!neura.data<i64, i1>) -> ()
  }
}

After the transformation:

module {
  func.func @simple_add_loop() -> i64 attributes {accelerator = "neura"} {
    %0 = neura.reserve : i64
    %1 = neura.reserve : i64
    %2 = neura.reserve : i1
    %3 = "neura.constant"() <{value = 16 : i64}> : () -> i64
    %4 = "neura.constant"() <{value = 1 : i64}> : () -> i64
    %5 = "neura.constant"() <{value = 1 : i64}> : () -> i64
    %6 = "neura.constant"() <{value = 0 : i64}> : () -> i64
    %7 = neura.invariant %4, %2 : i64, i1 -> i64
    %8 = neura.invariant %3, %2 : i64, i1 -> i64
    %9 = neura.carry %5, %2, %0 : i64, i1, i64 -> i64
    %10 = neura.carry %6, %2, %1 : i64, i1, i64 -> i64
    %11 = "neura.icmp"(%10, %8) <{cmpType = "slt"}> : (i64, i64) -> i1
    neura.ctrl_mov %11 -> %2 : i1 i1
    %12 = "neura.not"(%11) : (i1) -> i1
    %13 = "neura.add"(%9, %9) : (i64, i64) -> i64
    neura.ctrl_mov %13 -> %0 : i64 i64
    %14 = "neura.add"(%10, %7) : (i64, i64) -> i64
    neura.ctrl_mov %14 -> %1 : i64 i64
    "neura.return"(%9) : (i64) -> ()
  }
}

We can then map it onto a 4x4 CGRA:

module {
  func.func @simple_add_loop() -> i64 attributes {accelerator = "neura", mapping_info = {compiled_ii = 4 : i32, mapping_mode = "spatial-only", mapping_strategy = "heuristic", rec_mii = 2 : i32, res_mii = 1 : i32, x_tiles = 4 : i32, y_tiles = 4 : i32}} {
    %0 = neura.reserve : i64
    %1 = neura.reserve : i64
    %2 = neura.reserve : i1
    %3 = "neura.constant"() <{value = 16 : i64}> {mapping_locs = [{id = 0 : i32, resource = "tile", time_step = 0 : i32, x = 0 : i32, y = 0 : i32}]} : () -> i64
    %4 = "neura.constant"() <{value = 1 : i64}> {mapping_locs = [{id = 11 : i32, resource = "tile", time_step = 1 : i32, x = 3 : i32, y = 2 : i32}]} : () -> i64
    %5 = "neura.constant"() <{value = 1 : i64}> {mapping_locs = [{id = 5 : i32, resource = "tile", time_step = 1 : i32, x = 1 : i32, y = 1 : i32}]} : () -> i64
    %6 = "neura.constant"() <{value = 0 : i64}> {mapping_locs = [{id = 13 : i32, resource = "tile", time_step = 0 : i32, x = 1 : i32, y = 3 : i32}]} : () -> i64
    %7 = "neura.data_mov"(%4) {mapping_locs = [{id = 35 : i32, resource = "link", time_step = 1 : i32}]} : (i64) -> i64
    %8 = neura.invariant %7, %2 {mapping_locs = [{id = 10 : i32, resource = "tile", time_step = 2 : i32, x = 2 : i32, y = 2 : i32}]} : i64, i1 -> i64
    %9 = "neura.data_mov"(%3) {mapping_locs = [{id = 0 : i32, resource = "link", time_step = 0 : i32}]} : (i64) -> i64
    %10 = neura.invariant %9, %2 {mapping_locs = [{id = 1 : i32, resource = "tile", time_step = 1 : i32, x = 1 : i32, y = 0 : i32}]} : i64, i1 -> i64
    %11 = "neura.data_mov"(%5) {mapping_locs = [{id = 14 : i32, resource = "link", time_step = 1 : i32}]} : (i64) -> i64
    %12 = neura.carry %11, %2, %0 {mapping_locs = [{id = 6 : i32, resource = "tile", time_step = 2 : i32, x = 2 : i32, y = 1 : i32}]} : i64, i1, i64 -> i64
    %13 = "neura.data_mov"(%6) {mapping_locs = [{id = 42 : i32, resource = "link", time_step = 0 : i32}]} : (i64) -> i64
    %14 = neura.carry %13, %2, %1 {mapping_locs = [{id = 9 : i32, resource = "tile", time_step = 1 : i32, x = 1 : i32, y = 2 : i32}]} : i64, i1, i64 -> i64
    %15 = "neura.data_mov"(%14) {mapping_locs = [{id = 27 : i32, resource = "link", time_step = 1 : i32}, {id = 25 : i32, resource = "link", time_step = 2 : i32}]} : (i64) -> i64
    %16 = "neura.data_mov"(%10) {mapping_locs = [{id = 2 : i32, resource = "link", time_step = 1 : i32}, {id = 1 : i32, resource = "link", time_step = 2 : i32}]} : (i64) -> i64
    %17 = "neura.icmp"(%15, %16) <{cmpType = "slt"}> {mapping_locs = [{id = 4 : i32, resource = "tile", time_step = 3 : i32, x = 0 : i32, y = 1 : i32}]} : (i64, i64) -> i1
    neura.ctrl_mov %17 -> %2 {mapping_locs = [{id = 10 : i32, resource = "link", time_step = 3 : i32}, {id = 16 : i32, resource = "link", time_step = 4 : i32}, {id = 28 : i32, resource = "link", time_step = 5 : i32}]} : i1 i1
    %18 = "neura.data_mov"(%17) {mapping_locs = [{id = 12 : i32, resource = "link", time_step = 3 : i32}]} : (i1) -> i1
    %19 = "neura.not"(%18) {mapping_locs = [{id = 8 : i32, resource = "tile", time_step = 4 : i32, x = 0 : i32, y = 2 : i32}]} : (i1) -> i1
    %20 = "neura.data_mov"(%12) {mapping_locs = [{id = 19 : i32, resource = "link", time_step = 2 : i32}, {id = 64 : i32, resource = "register", time_step = 3 : i32}, {id = 64 : i32, resource = "register", time_step = 4 : i32}]} : (i64) -> i64
    %21 = "neura.data_mov"(%12) {mapping_locs = [{id = 17 : i32, resource = "link", time_step = 2 : i32}, {id = 15 : i32, resource = "link", time_step = 3 : i32}, {id = 3 : i32, resource = "link", time_step = 4 : i32}]} : (i64) -> i64
    %22 = "neura.add"(%20, %21) {mapping_locs = [{id = 2 : i32, resource = "tile", time_step = 5 : i32, x = 2 : i32, y = 0 : i32}]} : (i64, i64) -> i64
    neura.ctrl_mov %22 -> %0 {mapping_locs = [{id = 7 : i32, resource = "link", time_step = 5 : i32}]} : i64 i64
    %23 = "neura.data_mov"(%14) {mapping_locs = [{id = 30 : i32, resource = "link", time_step = 1 : i32}, {id = 41 : i32, resource = "link", time_step = 2 : i32}]} : (i64) -> i64
    %24 = "neura.data_mov"(%8) {mapping_locs = [{id = 34 : i32, resource = "link", time_step = 2 : i32}]} : (i64) -> i64
    %25 = "neura.add"(%23, %24) {mapping_locs = [{id = 14 : i32, resource = "tile", time_step = 3 : i32, x = 2 : i32, y = 3 : i32}]} : (i64, i64) -> i64
    neura.ctrl_mov %25 -> %1 {mapping_locs = [{id = 45 : i32, resource = "link", time_step = 3 : i32}, {id = 31 : i32, resource = "link", time_step = 4 : i32}]} : i64 i64
    %26 = "neura.data_mov"(%12) {mapping_locs = [{id = 18 : i32, resource = "link", time_step = 2 : i32}]} : (i64) -> i64
    "neura.return"(%26) {mapping_locs = [{id = 7 : i32, resource = "tile", time_step = 3 : i32, x = 3 : i32, y = 1 : i32}]} : (i64) -> ()
  }
}

@ShangkunLi ShangkunLi marked this pull request as ready for review October 7, 2025 11:40
@ShangkunLi ShangkunLi requested a review from tancheng October 7, 2025 13:24
@tancheng
Contributor

tancheng commented Oct 7, 2025

Can you please draw a diagram or slide that lists the before/after IRs side by side and highlights which dataflow IR ops correspond to which steering IR ops? Please also briefly explain the meaning of each new op, e.g., carry/merge.

Also, I didn't see true/false_steer in your example.

@ShangkunLi
Collaborator Author

Added the true/false_steer test in the latest commit.
[Two screenshots: steering operation semantics]

Our steering operation semantics follow the definition in RipTide, as shown in the above figures.

A correspondence between the predicate-based dataflow IR and the steering-based dataflow IR is shown below.

[Screenshot: correspondence between the predicate-based and steering-based dataflow IR]
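
For readers without the figures, the steering operators can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the carry/invariant behavior follows a RipTide-style state machine (emit the initial value once, forward the carried value while the condition is true, reset when it is false), `None` stands for "no token", and the class and function names are hypothetical rather than part of the Neura dialect. The `simple_add_loop` function mirrors the PR's loop example with its result taken through a false_steer.

```python
def true_steer(value, cond):
    """Forward value only when cond is true (None models 'no token')."""
    return value if cond else None


def false_steer(value, cond):
    """Forward value only when cond is false."""
    return None if cond else value


def merge(cond, val0, val1):
    """Select val0 when cond is true, otherwise val1."""
    return val0 if cond else val1


class Carry:
    """Loop-carried value (assumed state machine, not confirmed by the PR)."""

    def __init__(self, init):
        self.init = init
        self.in_loop = False

    def step(self, cond=None, carried=None):
        if not self.in_loop:      # initial state: pass the init value through
            self.in_loop = True
            return self.init
        if cond:                  # loop continues: pass the carried value
            return carried
        self.in_loop = False      # loop exits: reset and emit nothing
        return None


class Invariant(Carry):
    """A carry whose carried value is always its own initial value."""

    def step(self, cond=None, carried=None):
        return super().step(cond, self.init)


def simple_add_loop():
    bound, one = Invariant(16), Invariant(1)
    acc, i = Carry(1), Carry(0)
    cond = next_acc = next_i = None
    while True:
        b, s = bound.step(cond), one.step(cond)
        a, n = acc.step(cond, next_acc), i.step(cond, next_i)
        c = n < b                    # neura.icmp slt
        result = false_steer(a, c)   # fires only on loop exit
        if result is not None:
            return result
        next_acc, next_i, cond = a + a, n + s, c
```

Under these assumptions the accumulator is doubled 16 times before the condition turns false, so `simple_add_loop()` returns 65536.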

@tancheng
Contributor

tancheng commented Oct 8, 2025

  • So our ctrl_mov is basically the true_steer?
  • And how do you identify the invariant? Everything is a grant_predicate in the dataflow IR, so how do you distinguish carry from invariant?

@ShangkunLi
Collaborator Author

  • ctrl_mov and reserve remain structural, non-materialized operations that denote the data dependency between an operation and its backward users.
  • We detect phi (%reserve, %init_value) -> grant_predicate (%phi, %cond) -> ctrl_mov %grant -> %reserve patterns and transform each of them into invariant (%init_value, %cond).

@tancheng
Contributor

tancheng commented Oct 8, 2025

  • Then what pattern would become true_steer?
  • Thanks! I get how invariant is identified: it basically requires identifying a grant_once for the init_value, right?
    • Then what is needed to identify carry? And false_steer?

@ShangkunLi
Collaborator Author

  • At the beginning of the transformation, I remove all the grant_once ops, so the init_value is identified as a constant rather than a grant_once.
  • For carry, we identify the pattern phi -> grant_predicate -> other computations -> ctrl_mov -> reserve and create a carry (%init_value, %cond, %carried_value) for it. The %carried_value is the source value of the reserve op.
  • For true/false_steer, we match grant_predicate (%value, %cond) patterns. If %cond is produced by a not operation, we create a false_steer; otherwise, a true_steer. This step runs last, after the carry and invariant identification and transformation, so it avoids handling the grant_predicates in loops.
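
The identification rules above can be sketched as a small classifier. This is an illustration only, not the pass's actual code: the `LoopCycle` record, the helper names, and the string SSA names are hypothetical, and it assumes the distinction between invariant and carry is whether the ctrl_mov source is the grant_predicate result itself or some later computation. The false_steer case uses the operand of the not as its condition, matching the `neura.false_steer %9, %11` form in this PR's test output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopCycle:
    """One phi -> grant_predicate -> ... -> ctrl_mov -> reserve cycle (hypothetical record)."""
    init_value: str       # non-reserve operand of the phi
    cond: str             # predicate operand of the grant_predicate
    granted: str          # result of the grant_predicate
    ctrl_mov_source: str  # value that the ctrl_mov writes back
    reserve: str          # reserve that the ctrl_mov targets


def classify_cycle(cycle):
    """Return the steering op that replaces a loop cycle."""
    if cycle.ctrl_mov_source == cycle.granted:
        # The granted value flows straight back: nothing changes per iteration.
        return ("invariant", cycle.init_value, cycle.cond)
    # Some computation sits between the grant and the ctrl_mov.
    return ("carry", cycle.init_value, cycle.cond, cycle.reserve)


def classify_steer(value, cond, not_operand_of):
    """Classify a remaining grant_predicate(value, cond).

    not_operand_of maps the result of each not op to its operand.
    """
    if cond in not_operand_of:
        return ("false_steer", value, not_operand_of[cond])
    return ("true_steer", value, cond)


bound = LoopCycle("%c16", "%cond", "%g_bound", "%g_bound", "%r_bound")
acc = LoopCycle("%c1", "%cond", "%g_acc", "%sum", "%r_acc")
print(classify_cycle(bound))
print(classify_cycle(acc))
print(classify_steer("%acc", "%not_cond", {"%not_cond": "%cond"}))
```

Running the cycle classification before the steer classification mirrors the ordering described above, so the steer step only sees grant_predicates that were not consumed by a loop cycle.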

@ShangkunLi
Collaborator Author

@tancheng One problem: when I try to map the steering-based dataflow IR, the ALAP sort does not seem to respect the producer-consumer dependency, so I cannot map it.

The code is:

module {
  func.func @simple_add_loop() -> i64 attributes {accelerator = "neura", dataflow_mode = "steering"} {
    %0 = neura.reserve : i64
    %1 = neura.reserve : i64
    %2 = neura.reserve : i1
    %3 = "neura.constant"() <{value = 16 : i64}> : () -> i64
    %4 = "neura.constant"() <{value = 1 : i64}> : () -> i64
    %5 = "neura.constant"() <{value = 1 : i64}> : () -> i64
    %6 = "neura.constant"() <{value = 0 : i64}> : () -> i64
    %7 = neura.invariant %4, %2 : i64, i1 -> i64
    %8 = neura.invariant %3, %2 : i64, i1 -> i64
    %9 = neura.carry %5, %2, %0 : i64, i1, i64 -> i64
    %10 = neura.carry %6, %2, %1 : i64, i1, i64 -> i64
    %11 = "neura.icmp"(%10, %8) <{cmpType = "slt"}> : (i64, i64) -> i1
    neura.ctrl_mov %11 -> %2 : i1 i1
    %12 = neura.false_steer %9, %11 : i64, i1 -> i64
    %13 = "neura.add"(%9, %9) : (i64, i64) -> i64
    neura.ctrl_mov %13 -> %0 : i64 i64
    %14 = "neura.add"(%10, %7) : (i64, i64) -> i64
    neura.ctrl_mov %14 -> %1 : i64 i64
    "neura.return"(%12) : (i64) -> ()
  }
}

The ALAP level of %7 = neura.invariant %4, %2 : i64, i1 -> i64 is 3, but %11 = "neura.icmp"(%10, %8) <{cmpType = "slt"}> : (i64, i64) -> i1 gets level 2. So when we try to map the icmp, its backward user %7 is still unmapped, which causes an error.
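
The reported levels can be reproduced with a small ALAP computation over the forward edges of the IR above. This is a sketch under assumptions, not the mapper's actual sort: reserve and ctrl_mov are left out as non-materialized, operations without consumers are pinned to the last level, and the node names (`inv7`, `icmp11`, and so on) are hypothetical labels for the SSA values. The back edges list the users that read a reserve fed by each ctrl_mov.

```python
from collections import defaultdict
from functools import lru_cache

# Forward (same-iteration) producer -> consumer edges.
FORWARD_EDGES = [
    ("c3", "inv8"), ("c4", "inv7"), ("c5", "carry9"), ("c6", "carry10"),
    ("carry10", "icmp11"), ("inv8", "icmp11"),
    ("carry9", "fsteer12"), ("icmp11", "fsteer12"),
    ("carry9", "add13"),
    ("carry10", "add14"), ("inv7", "add14"),
    ("fsteer12", "ret"),
]

# Backward edges: ctrl_mov source -> users of the targeted reserve.
BACK_EDGES = [
    ("icmp11", "inv7"), ("icmp11", "inv8"),
    ("icmp11", "carry9"), ("icmp11", "carry10"),
    ("add13", "carry9"), ("add14", "carry10"),
]


def alap_levels(edges):
    """Compute as-late-as-possible levels on an acyclic forward graph."""
    preds, succs = defaultdict(list), defaultdict(list)
    nodes = set()
    for src, dst in edges:
        succs[src].append(dst)
        preds[dst].append(src)
        nodes.update((src, dst))

    @lru_cache(maxsize=None)
    def asap(node):
        return 1 + max(map(asap, preds[node])) if preds[node] else 0

    depth = max(asap(node) for node in nodes)  # length of the critical path

    @lru_cache(maxsize=None)
    def alap(node):
        return min(map(alap, succs[node])) - 1 if succs[node] else depth

    return {node: alap(node) for node in sorted(nodes)}


def late_backward_users(levels, back_edges):
    """Backward users that a level-ordered walk visits after their producer."""
    return [(p, u) for p, u in back_edges if levels[u] > levels[p]]


levels = alap_levels(FORWARD_EDGES)
print(levels["inv7"], levels["icmp11"])
print(late_backward_users(levels, BACK_EDGES))
```

With these assumptions the sketch gives level 3 for `inv7` and level 2 for `icmp11`, and `(icmp11, inv7)` is the only back edge whose user sits at a strictly later level than its producer, which is consistent with the failure described above.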

@tancheng
Contributor

tancheng commented Oct 8, 2025

Then why is there only a false_steer and no true_steer in your example? How is it distinguished from carry?

@tancheng
Contributor

tancheng commented Oct 8, 2025

Why does this happen with the ALAP sort? I think the current sorting is not pure ALAP but a mix of strategies, right? (We can disable the test and fix it in the next PR, though.)

@ShangkunLi
Collaborator Author

For true_steer, there is another case, test/neura/steer_ctrl/for_with_if.mlir (https://github.com/coredac/dataflow/pull/144/files#diff-24a95dd03aee90bcabee7106cdc76ac6f4301fe6b502e5f1a6b330af61d49f52), which uses the true_steer op.

@tancheng
Contributor

tancheng commented Oct 8, 2025

I mean in your simple example: why is there no true_steer? Is it because the pattern match for carry has higher priority?

@ShangkunLi
Collaborator Author

Yes, the transformation order is:

  1. Transform loop-control-related ops into carry and invariant
  2. Transform complementary grant_predicate + phi patterns into merge
    • Pattern: %0 = grant_predicate(%val0, %cond), %1 = grant_predicate(%val1, %not_cond), phi(%0, %1)
    • Transformed: %result = merge (%cond, %val0, %val1)
  3. Transform the remaining grant_predicates into true/false_steer
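
Step 2 can be illustrated with a small matcher for the complementary pattern. This is a sketch, not the pass implementation: `match_merge` and the dictionary-based IR encoding are hypothetical, and it assumes "complementary" means that one predicate is the not of the other, accepted in either operand order.

```python
def match_merge(phi_operands, grants, not_operand_of):
    """Try to rewrite phi(%a, %b) into a merge.

    grants maps each grant_predicate result to its (value, cond) operands.
    not_operand_of maps each not result to its operand.
    Returns ("merge", cond, value_if_true, value_if_false) or None.
    """
    a, b = phi_operands
    if a not in grants or b not in grants:
        return None  # both phi inputs must come from grant_predicates
    (val_a, cond_a), (val_b, cond_b) = grants[a], grants[b]
    if not_operand_of.get(cond_b) == cond_a:   # b is guarded by not(cond_a)
        return ("merge", cond_a, val_a, val_b)
    if not_operand_of.get(cond_a) == cond_b:   # a is guarded by not(cond_b)
        return ("merge", cond_b, val_b, val_a)
    return None  # predicates are unrelated: leave for the steer step


grants = {"%0": ("%val0", "%cond"), "%1": ("%val1", "%not_cond")}
not_operand_of = {"%not_cond": "%cond"}
print(match_merge(("%0", "%1"), grants, not_operand_of))
```

A phi whose inputs do not match falls through unchanged, which is why any leftover grant_predicates can still be handled by the true/false_steer step afterwards.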

@ShangkunLi ShangkunLi merged commit e05ca6a into coredac:main Oct 8, 2025
1 check passed
ShangkunLi added a commit that referenced this pull request Mar 12, 2026
Support Steering-Control-Based Dataflow Representation & Mapping