Add Moore MIR dialect and move codegen to CIRCT #235

fabianschuiki · 2021-11-21T12:01:25Z

#234 implements code generation by linking against the CIRCT project and using MLIR to generate and emit assembly. As a first step towards moving more of Moore's code over to MLIR, we should add a "Moore MIR" dialect to CIRCT.

This dialect should aim to model the mir::Rvalue and mir::Lvalue representations, and add new MLIR ops to represent the remaining SV statements and declarations (modules, processes, instances, variables, nets, assigns, conditionals, loops -- basically everything that codegen.rs knows how to emit code for, but isn't currently captured as part of the mir module). This is likely to require adding a full implementation of svlog/ty.rs in CIRCT.

Once this dialect exists, we can raise the level of abstraction in codegen.rs and emit the Moore MIR dialect instead of the low-level HW/Comb/LLHD/Standard dialects. The translation from MIR to those low-level dialects can then move into the CIRCT project as a dedicated lowering pass. This is phenomenal because MIR being an MLIR dialect will allow us to write very concise tests that check code generation for specific SV features and semantics without the whole parser and type checker in the loop. The MIR then ends up representing a full SV design with all types and implicit operations resolved to explicit things -- making it essentially an SV semantics dialect.

Todo

Merge "Link against CIRCT and use it for codegen" (#234)
Merge "Remove legacy LLHD crate codegen and dependency" (#236)
Add "Moore" dialect to CIRCT, with an SV type system implementation (i.e. port src/svlog/ty.rs)
Add moore.mir.* operations to represent the Moore MIR
Emit Moore MIR dialect in codegen.rs instead of low-level dialects
Implement codegen as a lowering pass in CIRCT on the new dialect


fabianschuiki · 2021-11-21T12:01:39Z

@maerhart What do you think about this plan?

maerhart · 2021-11-24T08:51:23Z

Sounds great to me!

Just wondering whether we can use (part of) the SV dialect in the lowering chain from SV AST to HW/Comb/LLHD. And whether this would make sense at all since the SV dialect is targeted at printing and we are focussed on lowering.

What would such a Moore MIR dialect look like?
As far as I understand, the goal of MIR is to make all types and casts explicit. We would have an operation for each expression and statement in SV (including the constructs that are directly lowered from HIR to LLHD) with a strong type system. lvalues and rvalues are probably best represented in the type system (wrapping the real type inside an lvalue/rvalue type?). The assign ops then only accept lvalues as the first argument and only rvalues as the second argument. Do you already have concrete ideas on this?

What exactly is the goal of HIR?
Just getting rid of syntactic sugar and ambiguities?
As the amount of expressions/statements in SV is quite large, it could make sense to have most of them only in one dialect and use other dialects as extensions to reduce the amount of redundancy once we also want to port over HIR, e.g., have an HIR dialect that only models the syntactic sugar and ambiguities part. Would that be possible? Or do we need a complete IR to model and infer the unresolved types?

fabianschuiki · 2021-11-25T13:29:31Z

> Just wondering whether we can use (part of) the SV dialect in the lowering chain from SV AST to HW/Comb/LLHD. And whether this would make sense at all since the SV dialect is targeted at printing and we are focussed on lowering.

I think you're right and it'll probably be difficult to use the SV dialect directly from the start. That dialect is being driven largely by what is needed for good Verilog emission, which may be a use case very distinct from representing SV from a frontend point of view. I'm not saying that they shouldn't share as much as possible, but given the different design goals it sounds like a better approach to first build out the Moore-specific dialect, and then try to nudge that and the SV dialect in CIRCT closer together, and increase the reuse between them.

> We would have an operation for each expression and statement in SV (including the constructs that are directly lowered from HIR to LLHD) with a strong type system. lvalues and rvalues are probably best represented in the type system (wrapping the real type inside an lvalue/rvalue type?). The assign ops then only accept lvalues as the first argument and only rvalues as the second argument. Do you already have concrete ideas on this?

Yeah I was thinking about something like you describe. MIR should be the "final bastion" of SV and be a truthful representation of the semantics of the input file, with all ambiguities and implicitness removed, and all types fully known. So pretty much what you describe: operations for all the constructs, expressions, and statements that SV has to offer. This would include all the classes, verification craziness like properties and sequences, clocking blocks, programs, interfaces, packages, assertions, and much more.
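The lvalue/rvalue wrapping discussed here could be sketched roughly as follows. This is only an illustration written in Rust rather than as an MLIR type definition; `ValueTy` and `verify_assign` are hypothetical names that do not exist in Moore or CIRCT, and the data type is reduced to a bare bit width.

```rust
// Hypothetical sketch: these names are illustrative and not part of Moore or CIRCT.

/// A value type where "lvalue-ness" and "rvalue-ness" are part of the type itself.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ValueTy {
    /// A plain, fully resolved data type, reduced here to a bit width.
    Bits(u32),
    /// Something that can be assigned to, wrapping the real type.
    Lvalue(Box<ValueTy>),
    /// Something that can be read from, wrapping the real type.
    Rvalue(Box<ValueTy>),
}

/// Verifier for an assign op: the destination must be an lvalue, the source
/// must be an rvalue, and the wrapped types must agree (casts are explicit).
pub fn verify_assign(dst: &ValueTy, src: &ValueTy) -> Result<(), String> {
    match (dst, src) {
        (ValueTy::Lvalue(d), ValueTy::Rvalue(s)) if d == s => Ok(()),
        (ValueTy::Lvalue(d), ValueTy::Rvalue(s)) => {
            Err(format!("type mismatch: {:?} vs {:?}", d, s))
        }
        (ValueTy::Lvalue(_), _) => Err("source must be an rvalue".to_string()),
        _ => Err("destination must be an lvalue".to_string()),
    }
}
```

With this shape, an assign from an rvalue to an rvalue, or between different wrapped types, is rejected by the verifier instead of surfacing later during lowering.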

> What exactly is the goal of HIR? Just getting rid of syntactic sugar and ambiguities? As the amount of expressions/statements in SV is quite large, it could make sense to have most of them only in one dialect and use other dialects as extensions to reduce the amount of redundancy once we also want to port over HIR, e.g., have an HIR dialect that only models the syntactic sugar and ambiguities part. Would that be possible? Or do we need a complete IR to model and infer the unresolved types?

The initial goal in Moore was to get around a few limitations in the early days of the AST and query system. It was intended to offer a way to resolve syntactic ambiguities (for example the cast foo'(someExpr), where you don't know if foo is a type or an expression during AST construction). I'm not sure this is still needed. I have reworked the AST in the meantime, which is now much easier to work with and I've been pushing queries to work directly on the AST where possible, instead of HIR. There was also the addition of the RST (and the ambiguity resolution queries), which basically just map a few of the AST constructs (the ambiguous ones) to one of the concrete possibilities after names have been resolved (scope and name table construction runs on the AST).
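To make the `foo'(someExpr)` ambiguity concrete, here is a minimal sketch of mapping an ambiguous cast to one concrete possibility once names are known. It does not reflect the actual RST data structures in Moore; `Cast`, `SymbolKind`, and `disambiguate` are hypothetical, and the name table is assumed to be a plain map from identifiers to what they refer to.

```rust
use std::collections::HashMap;

// Hypothetical sketch: not the actual RST or query API in Moore.

/// What a name turned out to refer to after name resolution.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SymbolKind {
    Type,
    Value,
}

/// A cast `target'(arg)` as it moves from the ambiguous to a concrete form.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Cast {
    /// Produced by the parser: `target` may be a type or an expression.
    Ambiguous { target: String, arg: String },
    /// `target` named a type, so this is a type cast.
    ToType { ty: String, arg: String },
    /// `target` named a value, so this is a size cast.
    ToSize { size: String, arg: String },
}

/// Map an ambiguous cast to a concrete one using the resolved names.
/// Casts that are already concrete, or whose target is unknown, are returned unchanged.
pub fn disambiguate(cast: Cast, names: &HashMap<String, SymbolKind>) -> Cast {
    match cast {
        Cast::Ambiguous { target, arg } => {
            // Copy the kind out first so `target` can be moved into the result.
            let kind = names.get(&target).copied();
            match kind {
                Some(SymbolKind::Type) => Cast::ToType { ty: target, arg },
                Some(SymbolKind::Value) => Cast::ToSize { size: target, arg },
                None => Cast::Ambiguous { target, arg },
            }
        }
        resolved => resolved,
    }
}
```

The point of the sketch is that only the ambiguous node needs a second representation; everything else can pass through untouched.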

I could totally see operations survive all the way from the AST down to the MIR. In the Rust world I was very careful to prevent mutation of the ops in the different IRs, to enforce safety and make passes purely additive. But in MLIR with the mutation galore and rampant unsafety of C++, we can basically start to mutate operations as we see fit. For example, we might just have one single moore.expr.add operation to represent + in SV:

After parsing, this operation would be created from the AST, maybe with a !moore.unresolved marker type.
Name resolution would then update moore.expr.ident nodes to contain a pointer (symbol or something) to the thing they are referring to.
Type checking could then go in and update moore.expr.add with the correct resolved type, and also convert types from their AST construct (like logic [3:0] being a Type { name: Named("logic"), dims: Range(3, 0) }) to a corresponding type in the IR.
MIR lowering would then go through the expressions and insert casts where appropriate. For example, if a moore.expr.add operates on a 3-bit and a 4-bit value and is assigned to a 5-bit signal, it would have an operation type of 5 bits and need casts inserted for its operands.
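The type checking and cast insertion steps above can be sketched as in-place mutations of a single add node. This is a toy model in Rust, not MLIR or Moore code: `Ty`, `Expr`, `resolve_add`, and `insert_casts` are hypothetical names, types are reduced to bit widths, and the operation type is assumed to be the maximum of the operand widths and the width of the assignment target.

```rust
// Hypothetical sketch: a toy model of one op mutated by successive passes.

/// Type slot of an expression: a marker after parsing, a width after type checking.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Ty {
    Unresolved,
    Bits(u32),
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Expr {
    /// A leaf value such as a resolved identifier.
    Value { name: &'static str, ty: Ty },
    /// An explicit cast, inserted by the MIR lowering step.
    Cast { to: Ty, arg: Box<Expr> },
    /// The single `+` operation that survives every stage.
    Add { ty: Ty, lhs: Box<Expr>, rhs: Box<Expr> },
}

impl Expr {
    pub fn ty(&self) -> Ty {
        match self {
            Expr::Value { ty, .. } | Expr::Add { ty, .. } => *ty,
            Expr::Cast { to, .. } => *to,
        }
    }
}

fn width(ty: Ty) -> u32 {
    match ty {
        Ty::Bits(w) => w,
        Ty::Unresolved => 0,
    }
}

/// Type checking: set the operation type of an add in place, using the
/// width of the assignment target as context.
pub fn resolve_add(expr: &mut Expr, context_width: u32) {
    if let Expr::Add { ty, lhs, rhs } = expr {
        let widest = width(lhs.ty()).max(width(rhs.ty()));
        *ty = Ty::Bits(widest.max(context_width));
    }
}

/// MIR lowering: wrap every operand whose type differs from the operation
/// type in an explicit cast.
pub fn insert_casts(expr: &mut Expr) {
    if let Expr::Add { ty, lhs, rhs } = expr {
        let target = *ty;
        for operand in [lhs, rhs] {
            if operand.ty() != target {
                // Temporarily swap in a placeholder so the operand can be moved into the cast.
                let placeholder = Expr::Value { name: "", ty: Ty::Unresolved };
                let inner = std::mem::replace(&mut **operand, placeholder);
                **operand = Expr::Cast { to: target, arg: Box::new(inner) };
            }
        }
    }
}
```

For the example in the text, an add of a 3-bit and a 4-bit value resolved with a context width of 5 ends up with a 5-bit operation type, and both operands are wrapped in casts to 5 bits.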

You are totally right that replicating a lot of the ops just for the sake of providing a few restrictions on them (like "here the types need to be known") is probably wasteful. We could also just declare "MIR" as being a subset of all the Moore dialect operations, with certain additional restrictions on types.
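Treating "MIR" as a restricted subset of one dialect could amount to a check along these lines. Again a hedged Rust sketch rather than an MLIR verifier: `Op`, `OpTy`, and `mir_violations` are hypothetical, and the only restriction modelled is "every result type is resolved".

```rust
// Hypothetical sketch: "MIR" as a predicate over ops instead of a separate set of ops.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OpTy {
    Unresolved,
    Bits(u32),
}

#[derive(Debug)]
pub struct Op {
    pub name: &'static str,
    pub result_ty: OpTy,
    pub operands: Vec<Op>,
}

/// Collect the names of all ops (depth first, parent before operands) that
/// still carry an unresolved type and therefore are not valid "MIR" yet.
pub fn mir_violations(op: &Op) -> Vec<&'static str> {
    let mut found = Vec::new();
    if op.result_ty == OpTy::Unresolved {
        found.push(op.name);
    }
    for operand in &op.operands {
        found.extend(mir_violations(operand));
    }
    found
}
```

An empty result would mean the op tree satisfies the additional restrictions and can be handed to the lowering pass.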

fabianschuiki · 2021-11-25T13:34:24Z

My suggestion would be to start with the minimum that is needed to represent the MIR and move codegen over to CIRCT. I'm pretty sure this will already inform quite a few design decisions, and it requires implementing the fully resolved SV type system as a start. Then we can look into having implicit casts inserted on the CIRCT side, and start to move monomorphization over to CIRCT as well 😄

maerhart · 2021-11-26T21:30:10Z

Thank you for the detailed description! I completely agree with that.

I already started with a skeleton dialect for Moore MIR, some types and three ops forming a simple example plus lowering to HW/LLHD here.

fabianschuiki · 2021-11-27T08:44:18Z

Wow this is some seriously amazing work! I love it 🎉! Let me add some comments right to the commit itself. It's great that you went for a minimal working example. I would suggest that we try to merge this into upstream CIRCT as soon as possible, to keep the PRs small and easy for people to digest. Then it can evolve within CIRCT.

maerhart · 2021-11-28T16:33:59Z

Thanks for the quick feedback! I did some cleanup and addressed your comments. The diff against main is here. Let me know if there's anything else I should change or if it's ready for a PR. We just have to wait until the LLVM Submodule update PR is merged as I used the new type assembly format feature for convenience.

fabianschuiki · 2021-11-29T09:07:44Z

Cool thanks a lot, this looks great! Since we're only working with 3 types at the moment (and the LLVM submodule update upstream might take a while to get merged), would it make sense to just use the old-school manual type parsing approach instead to unblock this PR?

maerhart · 2021-11-30T21:17:28Z

Yeah that's actually an easy change. I rebased to the old LLVM version and opened the PR in CIRCT.

fabianschuiki added the labels L-vlog (Language: Verilog and SystemVerilog), C-enhancement (Category: Adding or improving on features), A-codegen (Area: Code generation), and P-high (Priority: High) on Nov 21, 2021
