[Pipeline] Refactor pipeline dialect to be block-based #5332

mortbopet · 2023-06-07T09:11:44Z

This commit refactors the pipeline dialect to be block-based. This brings a major representational change:

1. The pipeline is no longer defined by a lexical ordering of operations with `pipeline.stagesep` operations inserted to separate stages. Instead, pipeline stages are defined by blocks.
2. Control flow between blocks is defined by `pipeline.stage` operations.
3. As in the current version, the pipeline can exist in register-dematerialized and register-materialized forms. In the dematerialized form, stages (`Block`s) have no arguments. In the materialized form, stages have arguments.
4. The `pipeline.stage` operation determines whether a value is registered or passed directly (i.e., as a wire) to the next stage.

Two top-level operations exist:

- `pipeline.unscheduled`: Unscheduled pipeline, essentially just a container of operations.
- `pipeline.scheduled`: Scheduled pipeline, containing pipeline stages.

The motivation for this change is to give the IR an explicit hierarchy instead of relying on lexical ordering. It also allows more natural traversal of stages (`Block`s) and makes dataflow analysis of the pipeline analogous to control-flow analysis. The one drawback is that adding a new pipeline stage becomes slightly more involved, since the control flow of the pipeline must be updated explicitly. This is minor: it is also how things work in the software world, and it is easily addressed by helper methods.

Likewise, this change removes the (old) `pipeline.stage` operations, which were introduced mainly to facilitate lowering. They are no longer needed with the block-based pipeline, since stage inputs and outputs are clearly denoted by block arguments and the (new) `pipeline.stage` operations.
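To illustrate the kind of helper method that keeps stage insertion simple, here is a minimal sketch using a toy model rather than the real MLIR/CIRCT classes. `Stage`, `Pipeline`, `addStage`, `insertStageAfter`, and `orderedStageNames` are all hypothetical names chosen for this example; `next` stands in for the successor named by a stage's terminator.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for a pipeline stage (a Block in the real dialect).
// `next` models the successor named by the stage's terminator;
// nullptr models a stage that ends in pipeline.return.
struct Stage {
  std::string name;
  Stage *next = nullptr;
};

// Toy stand-in for a scheduled pipeline. Stages are owned in creation
// order, which is deliberately independent of the successor order.
struct Pipeline {
  std::vector<std::unique_ptr<Stage>> stages;

  // Creates an unlinked stage; the caller is responsible for wiring it in.
  Stage *addStage(const std::string &name) {
    stages.push_back(std::make_unique<Stage>());
    stages.back()->name = name;
    return stages.back().get();
  }

  // Hypothetical helper: creates a stage and splices it in after `pred`,
  // updating both affected successor edges in one place.
  Stage *insertStageAfter(Stage *pred, const std::string &name) {
    assert(pred && "predecessor stage must exist");
    Stage *fresh = addStage(name);
    fresh->next = pred->next;
    pred->next = fresh;
    return fresh;
  }
};

// Follows successor edges from `entry` and returns the stage names in
// pipeline order.
std::vector<std::string> orderedStageNames(const Stage *entry) {
  std::vector<std::string> names;
  for (const Stage *s = entry; s; s = s->next)
    names.push_back(s->name);
  return names;
}
```

In this model, a caller that only uses `insertStageAfter` never has to touch successor edges directly, even though the storage order of the stages no longer matches the pipeline order.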

Old representation:

%out = pipeline.pipeline(%arg0, %arg1, %go) clock %clk reset %rst : (i32, i32, i1) -> (i32) {
  ^bb0(%a0 : i32, %a1: i32, %g : i1):
    %add0 = comb.add %a0, %a1 : i32
    %add1 = comb.add %add0, %a0 : i32
    %add2 = comb.add %add1, %add0 : i32
    pipeline.return %add2 valid %g : i32
}

// Schedules to
%out = pipeline.pipeline(%arg0, %arg1, %go) clock %clk reset %rst : (i32, i32, i1) -> (i32) {
^bb0(%a0 : i32, %a1: i32, %g : i1):
  %add0 = comb.add %a0, %a1 : i32

  %s0_valid = pipeline.stagesep enable %g
  %add1 = comb.add %add0, %a0 : i32 // %a0 is a block argument fed through a stage.

  %s1_valid = pipeline.stagesep enable %s0_valid
  %add2 = comb.add %add1, %add0 : i32 // %add0 crosses multiple stages.

  pipeline.return %add2 valid %s1_valid : i32
}

// materializes to
%0 = pipeline.pipeline(%arg0, %arg1, %go) clock %clk reset %rst : (i32, i32, i1) -> i32 {
^bb0(%a0: i32, %a1: i32, %g: i1):
  %1 = comb.add %a0, %a1 : i32

  %1_s0, %a0_s0, %valid = pipeline.stagesep.reg enable %g regs %1, %a0 : i32, i32
  %2 = comb.add %1_s0, %a0_s0 : i32

  %2_s1, %1_s1, %valid_3 = pipeline.stagesep.reg enable %valid regs %2, %1_s0 : i32, i32
  %3 = comb.add %2_s1, %1_s1 : i32 // %1 from the entry stage is chained through both stage 1 and 2.

  pipeline.return %3 valid %valid_3 : i32
}

// Lowers to
%0 = pipeline.pipeline(%arg0, %arg1, %go) clock %clk reset %rst : (i32, i32, i1) -> i32 {
^bb0(%a0: i32, %a1: i32, %g: i1):
  %outputs:2, %valid = pipeline.stage ins %a0, %a1 enable %g : (i32, i32, i1) -> (i32, i32) {
  ^bb0(%arg3: i32, %arg4: i32, %arg6: i1):
    %2 = comb.add %arg3, %arg4 : i32
    pipeline.stage.return regs %2, %arg3 valid %arg6 : (i32, i32)
  }
  %outputs_2:2, %valid_3 = pipeline.stage ins %outputs#0, %outputs#1 enable %valid : (i32, i32) -> (i32, i32) {
  ^bb0(%arg3: i32, %arg4: i32, %arg5: i1):
    %2 = comb.add %arg3, %arg4 : i32
    pipeline.stage.return regs %2, %arg3 valid %arg5 : (i32, i32)
  }
  %1 = comb.add %outputs_2#0, %outputs_2#1 : i32
  pipeline.return %1 valid %valid_3 : i32
}

New representation:

%out = pipeline.unscheduled(%arg0, %arg1, %go) clock %clk reset %rst : (i32, i32, i1) -> (i32) {
  ^bb0(%a0 : i32, %a1: i32, %g : i1):
    %add0 = comb.add %a0, %a1 : i32
    %add1 = comb.add %add0, %a0 : i32
    %add2 = comb.add %add1, %add0 : i32
    pipeline.return %add2 valid %g : i32
}

// schedules to
%out = pipeline.scheduled(%arg0, %arg1, %go) clock %clk reset %rst : (i32, i32, i1) -> (i32) {
^bb0(%a0 : i32, %a1: i32, %go : i1):
  %add0 = comb.add %a0, %a1 : i32
  pipeline.stage ^bb1 enable %go

^bb1:
  %add1 = comb.add %add0, %a0 : i32 // %a0 is a block argument fed through a stage.
  pipeline.stage ^bb2 enable %go

^bb2:
  %add2 = comb.add %add1, %add0 : i32 // %add0 crosses multiple stages.
  pipeline.return %add2 valid %go : i32 // %go crosses multiple stages
}

// Materializes to
%0 = pipeline.scheduled(%arg0, %arg1, %go) clock %clk reset %rst : (i32, i32, i1) -> i32 {
^bb0(%a0: i32, %a1: i32, %go: i1):
  %1 = comb.add %a0, %a1 : i32
  pipeline.stage ^bb1 regs (%1, %a0, %go) pass () enable %go

^bb1(%1_s0 : i32, %a0_s0 : i32, %go_s0 : i1):
  %2 = comb.add %1_s0, %a0_s0 : i32
  pipeline.stage ^bb2 regs (%2, %1_s0, %go_s0) pass () enable %go_s0

^bb2(%2_s1 : i32, %1_s1 : i32, %go_s1 : i1):
  %3 = comb.add %2_s1, %1_s1 : i32 // %1 from the entry stage is chained through both stage 1 and 2.
  pipeline.return %3 valid %go_s1 : i32 // and likewise with %go
}

// which can be directly lowered to hardware
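The step from the scheduled to the materialized form can be sketched as a small liveness computation: a value has to be carried across a stage boundary if it is defined at or before that boundary and used after it. The code below is a toy model under that assumption, not the algorithm of the actual register materialization pass. `StageInfo` and `valuesCrossingBoundaries` are hypothetical names, and every crossing value is treated as a register (the `pass` list is not modeled).

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Per-stage summary: SSA value names defined in the stage (entry block
// arguments count as definitions of stage 0) and names read in the stage.
struct StageInfo {
  std::set<std::string> defs;
  std::set<std::string> uses;
};

// result[i] holds the values that must be carried on the edge from
// stage i to stage i+1, assuming stages are listed in pipeline order.
std::vector<std::set<std::string>>
valuesCrossingBoundaries(const std::vector<StageInfo> &stages) {
  std::vector<std::set<std::string>> result;
  std::set<std::string> definedSoFar;
  for (std::size_t i = 0; i + 1 < stages.size(); ++i) {
    definedSoFar.insert(stages[i].defs.begin(), stages[i].defs.end());
    std::set<std::string> crossing;
    // A value crosses boundary i if any later stage reads it.
    for (std::size_t j = i + 1; j < stages.size(); ++j)
      for (const std::string &v : stages[j].uses)
        if (definedSoFar.count(v))
          crossing.insert(v);
    result.push_back(crossing);
  }
  return result;
}
```

Applied to the three-stage example above, the first boundary carries `%add0`, `%a0`, and `%go`, and the second carries `%add1`, `%add0`, and `%go`, which corresponds to the `regs (...)` lists in the materialized form.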

mikeurbach

I haven't reviewed the implementation, but I think the representational change makes sense based on the previous discussions.

teqdruid

I didn't make it to the stuff under 'lib', but had some high-level comments which I wanted you to see. May not get to the rest until tomorrow.

include/circt/Dialect/Pipeline/Pipeline.td

teqdruid · 2023-06-08T01:39:48Z

include/circt/Dialect/Pipeline/Pipeline.td

+    // Returns the last stage in the pipeline.
+    Block* getLastStage();
+
+    // Adds a new stage to this pipeline. It is the users responsibility to


Where? "Adds" -> "Append" would be clearer if I assume correctly.

I'd actually say 'add' is the correct term here; append would imply that this stage logically comes after all other stages, which is not true. Ordering/placement of the stage within the pipeline is up to the user.

include/circt/Dialect/Pipeline/Pipeline.td

teqdruid · 2023-06-08T01:45:06Z

docs/Dialects/Pipeline/RationalePipeline.md

  %1 = comb.add %a0, %a1 : i32
+  pipeline.stage ^bb1 regs (%1, %a0, %go) pass () enable %go


Is it critical to specify the next block? Can't we just use the lexical order? Would there ever be a use for anything but? I think not requiring lexical order could be dangerous: one could assume it (and use getStage(i) instead of getOrderedStage(i)) and be correct 99.9% of the time, even though that behavior is not guaranteed.

Personally, I'd prefer to use blocks with explicit terminator destinations, i.e. stages are a linked list. It makes it easier/safer to insert new stages into a pipeline and to query successor stages (Block::getSinglePredecessor, Block::getSuccessor,...), and I think lexical ordering really is only a benefit for human readability (which is not the goal of the IR). Stringing together stages with explicit next-stage destinations will be correct 100% of the time.

Yeah, but it's just so unintuitive to have block 3 be the 5th pipeline stage. I'm fine with the successor blocks being explicit as long as the verifier checks that it's explicitly pointing to the next block. Again, I'm very concerned about quiet/sleeper bugs like the one I discuss above. I'm also concerned about the readability of the asm output.

> stages are a linked list.

You realize that lists of blocks within a region are implemented as a literal linked list, yes? I know that's not your point, but I'm just sayin'...

> querying successor stages (Block::getSinglePredecessor, Block::getSuccessor,...)

I'm not entirely certain how those are implemented, but I would assume through some sort of OpInterface (BranchOpInterface?) on the terminators which we should absolutely implement. I suspect there are some potentially useful data flow analysis upstream which use it.

Plus, there's always Block::getNextNode() and Block::getPrevNode().

> Stringing together stages with explicit next-stage destinations will be correct 100% of the time.

Define "correct"... It's correct assuming the user strings it together correctly. If they do the intuitive thing and just add a block to the end, it won't be "correct". By the same token, using the numerical order is always "correct" assuming the user inserts the new block at the proper place. If they don't, it's easy to discover the mistake even after lowering to HW... In the case where a user just adds a block but doesn't modify the successors properly, I'd assume it just disappears when lowering to HW.
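The two positions in this thread can be made concrete with a toy model of a pipeline's blocks. In the sketch below, blocks are identified by their lexical index and `succ[i]` is the block targeted by block `i`'s terminator. `kReturn`, `orderedStages`, `successorsFollowLexicalOrder`, and `unreachableStages` are hypothetical names for this example only; they are not CIRCT APIs and do not claim to match what the verifier in this PR does.

```cpp
#include <cassert>
#include <vector>

// Marks a block whose terminator returns from the pipeline instead of
// branching to a next stage.
constexpr int kReturn = -1;

// Stage order obtained by following successor edges from the entry block
// (lexical index 0). Stops at a return or when a block repeats.
std::vector<int> orderedStages(const std::vector<int> &succ) {
  std::vector<int> order;
  std::vector<bool> seen(succ.size(), false);
  int cur = succ.empty() ? kReturn : 0;
  while (cur != kReturn) {
    assert(cur >= 0 && cur < static_cast<int>(succ.size()) &&
           "successor index out of range");
    if (seen[cur])
      break; // cycle guard
    seen[cur] = true;
    order.push_back(cur);
    cur = succ[cur];
  }
  return order;
}

// The check requested in review: every block branches to the lexically
// next block, and only the last block returns.
bool successorsFollowLexicalOrder(const std::vector<int> &succ) {
  const int n = static_cast<int>(succ.size());
  for (int i = 0; i < n; ++i) {
    const int expected = (i + 1 < n) ? i + 1 : kReturn;
    if (succ[i] != expected)
      return false;
  }
  return true;
}

// Blocks never reached from the entry, e.g. a block that was added
// without updating any successor.
std::vector<int> unreachableStages(const std::vector<int> &succ) {
  std::vector<bool> reached(succ.size(), false);
  for (int i : orderedStages(succ))
    reached[i] = true;
  std::vector<int> result;
  for (int i = 0; i < static_cast<int>(succ.size()); ++i)
    if (!reached[i])
      result.push_back(i);
  return result;
}
```

With `succ = {3, 2, kReturn, 1}`, block 3 is the second stage: indexing by lexical position and indexing by successor order disagree, and the lexical-order check fails. With `succ = {1, 2, kReturn, kReturn}`, block 3 was added but never linked, so it is reported as unreachable.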

teqdruid · 2023-06-08T01:56:24Z

include/circt/Dialect/Pipeline/Pipeline.td

  }];

+  let arguments = (ins Variadic<AnyType>:$registers, Variadic<AnyType>:$passthroughs, I1:$enable);
+  let successors = (successor AnySuccessor:$nextStage);


See my comment in the rationale.

teqdruid · 2023-06-08T02:02:12Z

include/circt/Dialect/Pipeline/PipelineInterfaces.td

+      "bool",
+      "isLatencyInsensitive", (ins),
+      /*methodBody=*/"",
+      /*defaultImplementation=*/[{


I know this is probably just the previous logic factored out, so this may not be relevant to this PR; but what if only some of the inputs/outputs are LI? In other words, what if it's a mix of LI and non-LI? Would it make sense to return false for both this and isLatencySensitive below?

That should in my mind be an illegal case - it's XOR for now. In general, all of this LI-interfaced stuff is extremely blurry to me and will have to be revised once we need it.

It's blurry for me as well. We might consider ripping it out. Add it back later on with dc.value interfaces.

I'd be in favor of that as well - for a follow-up PR, though!
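As a rough illustration of the mixed case discussed here, the sketch below classifies a pipeline's ports in a toy model. `PortKind`, `InterfaceKind`, `classifyInterface`, and `verifyInterface` are hypothetical; only the names `isLatencyInsensitive` and `isLatencySensitive` come from the interface under review, and their bodies here are an assumption rather than the actual default implementation. Treating a pipeline with no ports as latency-sensitive is likewise an arbitrary choice made for the example.

```cpp
#include <cassert>
#include <vector>

// Toy port classification: a plain value or a latency-insensitive one.
enum class PortKind { Plain, LatencyInsensitive };

enum class InterfaceKind { LatencySensitive, LatencyInsensitive, Mixed };

// Classifies the full set of input and output ports.
InterfaceKind classifyInterface(const std::vector<PortKind> &ports) {
  bool anyLI = false;
  bool anyPlain = false;
  for (PortKind p : ports) {
    if (p == PortKind::LatencyInsensitive)
      anyLI = true;
    else
      anyPlain = true;
  }
  if (anyLI && anyPlain)
    return InterfaceKind::Mixed;
  return anyLI ? InterfaceKind::LatencyInsensitive
               : InterfaceKind::LatencySensitive;
}

// Both queries return false for a mixed interface, as suggested above.
bool isLatencyInsensitive(const std::vector<PortKind> &ports) {
  return classifyInterface(ports) == InterfaceKind::LatencyInsensitive;
}

bool isLatencySensitive(const std::vector<PortKind> &ports) {
  return classifyInterface(ports) == InterfaceKind::LatencySensitive;
}

// The "it's XOR for now" rule: a mixed interface is rejected outright.
bool verifyInterface(const std::vector<PortKind> &ports) {
  return classifyInterface(ports) != InterfaceKind::Mixed;
}
```

The two views in the thread are compatible in this model: the queries stay well-defined for a mixed interface, while a verifier can still reject it.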

lib/Dialect/Pipeline/Transforms/ExplicitRegs.cpp

teqdruid · 2023-06-09T02:30:29Z

docs/Dialects/Pipeline/RationalePipeline.md

  %1 = comb.add %a0, %a1 : i32
+  pipeline.stage ^bb1 regs (%1, %a0, %go) pass () enable %go


> stages are a linked list.

You realize that lists of blocks within a region are implemented as a literal linked list, yes? I know that's not your point, but I'm just sayin'...

> querying successor stages (Block::getSinglePredecessor, Block::getSuccessor,...)

I'm not entirely certain how those are implemented, but I would assume through some sort of OpInterface (BranchOpInterface?) on the terminators which we should absolutely implement. I suspect there are some potentially useful data flow analysis upstream which use it.

Plus, there's always Block::getNextNode() and Block::getPrevNode().

> Stringing together stages with explicit next-stage destinations will be correct 100% of the time.

Define "correct"... It's correct assuming the user strings it together correctly. If they do the intuitive thing and just add a block to the end, it won't be "correct". By the same token, using the numerical order is always "correct" assuming the user inserts the new block at the proper place. If they don't, it's easy to discover the mistake even after lowering to HW... In the case where a user just adds a block but doesn't modify the successors properly, I'd assume it just disappears when lowering to HW.

mortbopet · 2023-06-09T07:56:00Z

> I'm not entirely certain how those are implemented, but I would assume through some sort of OpInterface (BranchOpInterface?) on the terminators which we should absolutely implement. I suspect there are some potentially useful data flow analysis upstream which use it.

This comes "for free" when blocks are declared in the `successors` part of an ODS definition.

Fair, "correctness" might not be the proper term. Regardless, I don't think lexical ordering is a compelling alternative when the current implementation has a lot more in common with CFGs, and thus may share analysis, traversal, ...

teqdruid

I only skimmed the PipelineToHW.cpp changes.

I still disagree on the block/stage ordering issue, but for the purposes of making forward progress I'll let it go.

lib/Dialect/Pipeline/PipelineOps.cpp

lib/Dialect/Pipeline/Transforms/ExplicitRegs.cpp

lib/Dialect/Pipeline/Transforms/ScheduleLinearPipeline.cpp

lib/Conversion/PipelineToHW/PipelineToHW.cpp

mortbopet added the Pipeline label Jun 7, 2023

mortbopet requested review from mikeurbach and teqdruid June 7, 2023 09:11

mortbopet force-pushed the dev/mpetersen/refactor_pipeline branch from 7b88525 to d61fc0a Compare June 7, 2023 09:12

tidy

ee449ac

mikeurbach reviewed Jun 7, 2023

teqdruid reviewed Jun 8, 2023

teqdruid reviewed Jun 9, 2023

review comments, fixes

075e046

teqdruid approved these changes Jun 12, 2023

Review comments

c4ec869

mortbopet merged commit d57b640 into main Jun 12, 2023

darthscsi deleted the dev/mpetersen/refactor_pipeline branch June 4, 2024 14:48
