Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for the SystemVerilog bind keyword #2752

Closed
wants to merge 1 commit into from

Conversation

rswarbrick
Copy link
Contributor

@rswarbrick rswarbrick commented May 13, 2021

EDIT: This used to be a large patch set, but it's now mostly merged. The remaining commit has the following commit message:

Implement bind elaboration in the hierarchy pass

This is surprisingly finicky, because of the difference between bind
directives that target modules and those that target specific
instances of those modules (i.e. specific cells).

In the first case, we can just operate on the module in place. In the
second, we have to make fresh copies of the target module and the
"path of cells" from the bind directive down to it. For instance, in
something like this:

bind $root.foo.bar_i.baz_i qux bound_i (.*);

(where bar_i and baz_i are instances of modules called bar and baz,
respectively), we need to make a fresh clone of baz (let's call it
baz1) into which we insert an instance of qux. But we can't just
modify bar to re-type its baz_i cell from baz to baz1: what if there
are other instances of bar in the design? So we then need to make a
fresh clone of bar (let's call it bar1). Since foo here refers to the
module, rather than a particular instance, we can now modify foo to
re-type it's bar_i cell from bar to bar1 (phew!).

To avoid some extra un-needed renames, we actually count the number of
times each cell type appears in the design (see get_cell_type_counts).
If the design in the example above didn't actually use bar except as
bar_i in module foo, we actually operate in place rather than
renaming.

@rswarbrick
Copy link
Contributor Author

@zachjs: Thanks for the reviews you've already done for the component parts. I'll rebase and force-push to this PR as things get merged (or abandoned!) to make it easier to track what's still pending.

@rswarbrick
Copy link
Contributor Author

Force-push rebases onto master and drops #2753 (abandoned).

@rswarbrick
Copy link
Contributor Author

Ping! It's been a month: is there anything I can do to help get this reviewed and merged? Thanks, @zachjs, for looking at the frontend changes.

@mmicko
Copy link
Member

mmicko commented Jun 15, 2021

@rswarbrick Can you please update copyright messages ? ( please take a look at #2817 to see what is changed) Thanks.

@rswarbrick
Copy link
Contributor Author

Yep, happily. I think this is just the copyright headers in the new modules that got added, right? If so, everything should be updated now.

@rswarbrick
Copy link
Contributor Author

Incidentally, for proof that this actually works, I just managed to convert the Ibex icache FPV stuff that I'd done to bind the testbench in. It was surprisingly painless and (astonishingly) did something sensible. See lowRISC/ibex#1382 for details.

@rswarbrick
Copy link
Contributor Author

Woo, just one commit to go! :-) Marking this PR as ready for review accordingly.

@rswarbrick
Copy link
Contributor Author

Hey all. It's been a couple of weeks. Is there anything I can do to help get this over the line?

Copy link
Collaborator

@zachjs zachjs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are just two things which jumped out at me on first glance. I'll do a full review tomorrow.

tests/bind/basic.sv Show resolved Hide resolved
kernel/rtlil.cc Outdated Show resolved Hide resolved
kernel/rtlil.cc Outdated Show resolved Hide resolved
kernel/binding.cc Show resolved Hide resolved
kernel/binding.cc Show resolved Hide resolved

// Otherwise, characters with indices {tgt_off .. dot_off - 1} should
// name a module and then {dot_off + 1 .. } should name the cell
// relative to the module.
Copy link
Collaborator

@zachjs zachjs Sep 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For target names which contain a dot, they must be a bind_target_instance, as bind_target_scope cannot contain a dot. So in this case, the target name must "specify a single instance into which the bind_instantiation should be inserted."

For bind directives within modules, when the target name has a dot, other tools do not allow the leading portion of the identifier to refer to a module rather than an instance. For bind directives in the compilation unit scope, when the target name has a dot, they disallow binding into a module which is instantiated twice. I believe these interpretations of a hierarchical identifier and the bind spec are accurate.

So, given the following modules

module foo;
  initial $display("hello from foo");
endmodule
module bar;
  initial $display("hello from bar");
endmodule
module act;
  initial $display("hello from act");
  foo f();
endmodule

I believe each of the following cases should be illegal:

module beep;
  act act1();
endmodule
module boop;
  beep b1();
  beep b2();
endmodule
bind act.f bar b();
module beep;
  act act1();
  act act2();
endmodule
module boop;
  beep b1();
endmodule
bind act.f bar b();
module beep;
  act act1();
  act act2();
  bind act.f bar b();
endmodule

Whereas each of the following cases are legal:

module beep;
  act act1();
endmodule
module boop;
  beep b1();
endmodule
bind act.f bar b();
module beep;
  act act1();
endmodule
module boop;
  beep b1();
endmodule
bind act bar b();
module beep;
  act act1();
  act act2();
endmodule
module boop;
  beep b1();
endmodule
bind beep.act1.f bar b();
module beep;
  act act1();
  act act2();
  bind act1.f bar b();
endmodule

Copy link
Contributor Author

@rswarbrick rswarbrick Sep 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For bind directives within modules, when the target name has a dot, other tools do not allow the leading portion of the identifier to refer to a module rather than an instance.

Hmm. If I understand the spec correctly, both are explicitly allowed. Page 709 of IEEE 1800-2017 (section 23.3.1) has (emphasis mine):

$root allows explicit access to the top of the instantiation tree. This is useful to disambiguate a local path (which takes precedence) from the rooted path. If $root is not specified, a hierarchical path is ambiguous. For example, A.B.C can mean the local A.B.C or the top-level A.B.C (assuming there is an instance A that contains an instance B at both the top level and in the current module). The ambiguity is resolved by giving priority to the local scope and thereby preventing access to the top-level path. $root allows explicit access to the top level in those cases in which the name of the top-level module is insufficient to uniquely identify the path.

In the find_tl_cells method (where this comment points!), we only handle the top-level case. But we only call it from insert_binding in hierarchy.cc if find_rel_cell failed (handling the local case). I think that this matches the spec?

Looking one-by-one at the cases you listed:

For (1) - (3), I think that bind act.f bar b(); parses the same in each case. act.f matches the hierarchical identifier in the BNF's bind_target_instance. The interpretation in (1) and (2) is forced to be "treat act as a module and point at the f instance in that top-level module". In (1), we get one copy and in (2), we get 2 copies. In (3), we perform a local search for act.f which fails and then fall back to the top-level interpretation. (A horrible design, but I think it's legal...)

For (4)-(7), I agree that they are legal (and I think we probably agree about what they mean).

Copy link
Collaborator

@zachjs zachjs Sep 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You have convinced me that properly hierarchical binds within a module may fall back to looking up a module rather than an instance. Hence, each of the following should and do work in this PR and in some commercial suites as well.

module beep;
  act act1();
  bind act.f bar b();
endmodule
module beep;
  act act1();
endmodule
module boop;
  beep b1();
  bind act.f bar b();
endmodule

In (1), we get one copy...

There are two instances of beep, so in that test case, there are two instances of act. Indeed, there are two instances of act in all of (1) thru (3). I think other tools are relying on the "specify a single instance into which the bind_instantiation should be inserted" language to avoid having to modify multiple instances in these cases. I do think it would be a bit strange for a single hierarchical_identifier to refer to multiple cells.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that I can't count :-) But, ignoring that, can you explain in what way you think this code's behaviour should change? At the moment, it matches how I've understood the spec: presumably we disagree on some point there, but I'm not quite sure which one!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The spec appears to state that a bind_target_instance is used to "specify a single instance." But in (1) thru (3) above, it is being used to specify multiple distinct instances. I have not been able to find any tool which accepts (1) thru (3), and I suspect the reason is because of this restriction.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think I understand. To implement the restriction, I guess we could propagate some "instantiation count" down from the top of the tree and only allow a hierarchical bind via module if that count happened to be 1. But it's pretty complicated and only enforces a restriction that's kind of arguable :-)

My reading of the spec's "The second form of bind syntax can be used to specify a single instance into which the bind_instantiation should be inserted." is, I think, slightly different from yours. Having used bind in Verilator (which doesn't support this form), the annoying problem is when you have multiple instances of a module and you only want to bind into some of them. I think this form is to support that use-case. For example, suppose you wanted to do something like "bind this protocol checker into the driver for each of my AXI blocks, but not into the driver when used in in my other interfaces". There, you want a target that's something like axi_block_module.driver_instance.

The "just one instance in the design" interpretation would mean that's not allowed if you instantiate axi_block_module more than once. Instead, you'd have to write something like $root.path.to.axi_block_instance0.driver_instance and $root.path.to.axi_block_instance1.driver_instance, which is a bit horrible: it encodes the number of AXI blocks (easy to forget if you add one!) and the exact design hierarchy (causing churn when path.to changes).

My take is:

  • The spec is slightly ambiguous here. It might really mean "just one instance in the design" or it might not.
  • If it doesn't mean that, the proposed code is correct.
  • If it does, the proposed code will work for any correct design and will do something reasonable if the design is invalid. Another tool might choose to choke on the invalid design, but there's no requirement for us to do so.

So I think I'd stick with the current code, which is in keeping with Yosys's error handling approach in general ("don't try too hard; this is a synthesis tool, not a simulator"). Of course, I wrote it, so I'm biased :-D If you disagree, do you have a suggestion for a simple way to enforce the restriction that you think should hold?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I meant to say: you wrote: "I have not been able to find any tool which accepts (1) thru (3), and I suspect the reason is because of this restriction.". When you tried it out, what were the error messages you saw? Were they of the form "No, you fool! There are multiple instances of X" or were they "Aargh. Internal error". In the former case, that's a strong argument. In the latter, not so much... :-)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm away from my computer at the moment, so I won't be able to re-run to grab those error messages. If you're interested, I often use edaplayground.com, which provides limited access to some commercial simulators for students, researchers, etc.

Regarding how to specify a limited set of modules (especially locally), rather than the following:

module beep;
  act act1();
  act act2();
  bind act.f bar b();
endmodule

The spec seems to require something like:

module beep;
  act act1();
  act act2();
  bind act1.f bar b();
  bind act2.f bar b();
endmodule

I think you may be right that this behavior wouldn't seem to affect definitely legal source, but I'd like to poke around a little more later today when I get the chance.

You described a rather complex strategy for implementing this restriction. I agree something that complex would be undesirable. Would it be possible to do it by erroring whenever we match multiple modules via a properly hierarchical identifier, i.e., one with dot(s)? That is the sort of restriction I had in my head, which seems like it might make certain pieces even simpler.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are some relevant test results run against three different commercial simulators.

module beep;
	act act1();
	bind act.f bar b();
endmodule

A) Hierarchical name component lookup failed for 'act' at 'beep'
B) Failed to find 'act' in hierarchical name act.f
C) (no complaint!)

module beep;
	act act1();
	act act2();
	bind act.f bar b();
endmodule

A) Hierarchical name component lookup failed for 'act' at 'beep'
B) Failed to find 'act' in hierarchical name act.f
C) Can only bind to modules or instances

module beep;
	act act1();
endmodule
bind act.f bar b();

A) Hierarchical name component lookup failed for 'act' at '$unit_0x4ccdf83b'
B) Failed to find 'act' in hierarchical name act.f
C) no error

module beep;
	act act1();
	act act2();
endmodule
bind act.f bar b();

A) Hierarchical name component lookup failed for 'act' at '$unit_0x4ccdf83b'
B) Failed to find 'act' in hierarchical name act.f
C) Can only bind to modules or instances

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahah! I think I've finally understood what's going on. It turns out that the important bit is the previous paragraph of section 23.3.1 (at the bottom of p708). The point is that my_module.a.b.c is only allowed if my_module is a "top-level module".

A top-level module is a module that isn't instantiated by any other module. In particular, this will guarantee that there's at most one instance of the thing that the hierarchical reference points to.

I'll fix up the code accordingly, and include your examples from above in the tests.

passes/hierarchy/hierarchy.cc Show resolved Hide resolved
passes/hierarchy/hierarchy.cc Show resolved Hide resolved
return did_something;
}

log_error("Couldn't find a target for %s.\n", desc.c_str());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the bind directive have a source location we could log with this error? If so, please update this and the other log_errors added elsewhere.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be nice to add some logger -expect error test cases for these errors.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good idea. I'll do that.

@rswarbrick
Copy link
Contributor Author

@zachjs: Thank you very much for the careful review. That must have taken quite some time: thanks! I've just force-pushed with the suggested simplification for how we construct to_visit (which can actually be simplified further, just iterating over design.modules() directly: I think I was making a worklist here originally, but it seems that's not needed).

I'll follow up with a nicer message for the "couldn't find a target" error and a test for it. I've responded with some queries for the other comments: when you have a sec, could you have a look?

@rswarbrick
Copy link
Contributor Author

I've just pushed up a version that I think should:

I'm going to mark those discussions resolved above, to try to avoid the PR growing without bound(!). If you don't think this resolves things properly, please feel free to un-resolve them!

Still to do:

I didn't rebase, so clicking on "force-pushed" above should give a useful diff.

@rswarbrick
Copy link
Contributor Author

Hi all. I've just added a source location to the error message and added a test showing what's going on, for the cell type check and for the "couldn't find a target for..." error.

I'm not sure that it makes so much sense for the "Cannot bind into module because it didn't originally come from RTL" error, so I've left that. There's also the "Some, but not all..." error. I think this is only possible to hit if you allow hierarchical references starting with module names, which is what we're discussing in #2752 (comment). I thought I'd wait until we reach a resolution there before working on that.

I'll leave #2752 (comment) unresolved for now so that I don't forget to come back to the "Some, but not all..." error (assuming we don't completely change that code).

As before, I haven't rebased, so the force-pushed link above should give a helpful diff.

@zachjs
Copy link
Collaborator

zachjs commented Oct 11, 2021

@rswarbrick Any update on this PR? I'm eager get things moving along again!

@rswarbrick
Copy link
Contributor Author

Hi there! Sorry, I've not had any time to spend on it recently. I think that the remaining technical issue is how we enforce a requirement that binds that go through hierarchical references can only hit one thing. I had a bit of a look, and I think we'd need to do another count in the elaboration pass to figure out how many targets there are. It ends up being "how many instances of this module are there in the elaborated design". Then, to work out whether bind to my_module.foo.bar.baz is allowed, you need to iterate over all specializations of my_module and add up the counts. If the result is exactly one, you're good.

This seems quite hard work(!), and I'm not convinced it provides any actual benefit. Can you think of a cleaner way to do things?

Copy link
Collaborator

@zachjs zachjs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I poked around some more and it looks like are indeed some cases where the input is valid, but the implementation is not behaving correctly. Please let me know if I'm mistaken!

If this change will sit idle for a while, I'd like to move forward with #2716, which also adds a new case where Verilog modules are reprocessed. To avoid the existing conflicts, I'd like to cherry-pick the ast.cc/h changes you've made here and use your new interface. Does that seem reasonable to you?


// If split_point == -1, the target module and all its parents appear
// just once, so we can operate in place.
return binding.bind_into(design, *orig_module);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the condition for this is missing. Let's also get some coverage for the intended code path below once this is fixed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both with and without this condition manually patched locally, the following script fails on each of the following sources, despite working as expected in other tools.

read_verilog -defer -sv foo.sv
hierarchy -top top
proc
flatten
opt
select -module top
sat -verify -seq 1 -prove-asserts -show-all
module mod(input wire clk);
    wire x;
    assign x = 1;
endmodule

module val(input wire i);
    always @* assert (i == 0); // should fail
endmodule

module foo(input wire clk);
    mod y(clk);
    bind y val v(x);
endmodule

module bar(input wire clk);
    mod z(clk);
    bind z val v(x);
endmodule

module top;
    wire clk;
    foo f(clk);
    bar b(clk);
endmodule
module mod(input wire clk);
    wire x;
    assign x = 1;
endmodule

module val(input wire i);
    always @* assert (i == 0); // should fail
endmodule

module foo(input wire clk);
    mod y(clk);
endmodule

module bar(input wire clk);
    mod z(clk);
endmodule

module top;
    wire clk;
    foo f(clk);
    bar b(clk);
    bind f.y val v(x);
    bind b.z val v(x);
endmodule

for (auto mod : design.modules()) {
counts.insert(std::make_pair(mod->name, 0));
get_cell_type_counts_worker(design, counts, *mod);
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After poking around some more, it seems this procedure can double count. Running hierarchy -top c with the below modules loaded, the count of a internally becomes 2 because b is itself visited twice.

module m;
endmodule
module a;
endmodule
module b;
    a x();
    bind x m z();
endmodule
module c;
    b y();
endmodule

@rswarbrick
Copy link
Contributor Author

If this change will sit idle for a while, I'd like to move forward with #2716, which also adds a new case where Verilog modules are reprocessed. To avoid the existing conflicts, I'd like to cherry-pick the ast.cc/h changes you've made here and use your new interface. Does that seem reasonable to you?

Sounds like a good plan. I'll try to get to this soon, but I'm very happy to rebase: no need to block on this!

This is surprisingly finicky, because of the difference between bind
directives that target modules and those that target specific
instances of those modules (i.e. specific cells).

In the first case, we can just operate on the module in place. In the
second, we have to make fresh copies of the target module and the
"path of cells" from the bind directive down to it. For instance, in
something like this:

    bind $root.foo.bar_i.baz_i qux bound_i (.*);

(where bar_i and baz_i are instances of modules called bar and baz,
respectively), we need to make a fresh clone of baz (let's call it
baz1) into which we insert an instance of qux. But we can't just
modify bar to re-type its baz_i cell from baz to baz1: what if there
are other instances of bar in the design? So we then need to make a
fresh clone of bar (let's call it bar1). Since foo here refers to the
module, rather than a particular instance, we *can* now modify foo to
re-type its bar_i cell from bar to bar1 (phew!).

To avoid some extra un-needed renames, we actually count the number of
times each cell type appears in the design (see get_cell_type_counts).
If the design in the example above didn't actually use bar except as
bar_i in module foo, we actually operate in place rather than
renaming.
@rswarbrick
Copy link
Contributor Author

Hi all! Sorry for the ridiculous delay in getting back to this. I've just force-pushed, but there are no interesting changes: it's just a rebase onto latest master. I'm going to go through the rest of the discussion now and try to move things forward, but I thought it might be worth rebasing first (if nothing else, it means that the force-push diffs that follow will be a bit more informative).

@rswarbrick
Copy link
Contributor Author

Ok, after a day's staring at the code (and adding test cases), I've realised there's another problem with it (other than the one that @zachjs found above). When we bind into a module, we implement this by taking a copy of the AST it came from, shoving in the bound-in cell and then calling process_and_replace_module. This isn't quite right! In particular, it's not going to handle parameterised targets properly, because we're going to forget our parameterisation.

I think I've got a plan for how to fix it, but it's going to take a bit more time. (Hopefully, measured in days rather than months this time!)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants