V3Split not splitting #1244

Closed
veripoolbot opened this issue Nov 16, 2017 · 4 comments

@veripoolbot veripoolbot commented Nov 16, 2017


Author Name: John Coiner (@jcoiner)
Original Redmine Issue: 1244 from https://www.veripool.org

Original Assignee: John Coiner (@jcoiner)


We have an input like this:

always @(posedge clk) begin
if ((rst_l == 0)) begin
reg1 <= 1'b0;
reg2 <= 1'b0;
// ... snip ...
reg10000 <= 1'b0;
end
else begin
reg1 <= new_reg1;
reg2 <= new_reg2;
// ... snip ...
reg10000 <= new_reg10000;
end
end

Essentially, it's thousands of resettable flops combined into a single always block.

It would be nice if V3Split were smart enough to split this up. There are a few bad consequences of not splitting this up:

  • It's bad for V3Gate runtime. We don't know why yet. Wilson saw V3Gate run orders of magnitude faster with the block split up.
  • It's bad for serial code scheduling. The code scheduler cannot reorder anything within this always block relative to anything else in the block. That hurts dcache behavior, because it limits Verilator's ability to place writes and reads of the same variable close together.
  • (On the threads-jcoiner branch) It's bad for the thread partitioner's runtime, which degrades when some nodes in the graph have a huge number of dependency edges, as this one will.
  • (On the threads-jcoiner branch) It's bad for the output of the thread partitioner. We should have 10000 items that are trivially parallelizable, but instead we get a single atom that the partitioner cannot extract any parallelism from. This becomes a bottleneck on the final partitioned graph.

What exactly V3Split should output for this case is TBD. We probably don't want to reference the reset signal inline with every assignment (that's more instructions to run, and also more instructions to store). We might want to break this large always block into several medium-size always blocks, each of which evaluates the condition once, so that total code size and CPU footprint won't grow much.
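
To make the "medium-size always blocks" idea concrete, here is a rough sketch of what such a split might look like, scaled down to four flops. The module name, port list, and the chunk size of two are assumptions made only for illustration; this is not what V3Split actually emits, and the right chunk size is part of what's TBD.

```verilog
// Hypothetical illustration only: four flops stand in for the 10000 above.
// The module name, ports, and chunk size are assumptions, not V3Split output.
module split_sketch (
   input  wire clk,
   input  wire rst_l,
   input  wire new_reg1, new_reg2, new_reg3, new_reg4,
   output reg  reg1, reg2, reg3, reg4
);

   // Chunk A: the reset condition is evaluated once for reg1 and reg2.
   always @(posedge clk) begin
      if (rst_l == 0) begin
         reg1 <= 1'b0;
         reg2 <= 1'b0;
      end
      else begin
         reg1 <= new_reg1;
         reg2 <= new_reg2;
      end
   end

   // Chunk B: independent of chunk A, so a scheduler or partitioner
   // could place it separately.
   always @(posedge clk) begin
      if (rst_l == 0) begin
         reg3 <= 1'b0;
         reg4 <= 1'b0;
      end
      else begin
         reg3 <= new_reg3;
         reg4 <= new_reg4;
      end
   end

endmodule
```

Each chunk still tests rst_l only once, so the extra cost is one condition per chunk rather than one per assignment, while the blocks no longer depend on each other.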

@veripoolbot veripoolbot commented Nov 16, 2017


Original Redmine Comment
Author Name: John Coiner (@jcoiner)
Original Date: 2017-11-16T18:08:43Z


Whoops, here's the above code sample with formatting:

always @(posedge clk) begin
  if ((rst_l == 0)) begin
     reg1 <= 1'b0;
     reg2 <= 1'b0;
     // ... snip ...
     reg10000 <= 1'b0;
  end
  else begin
     reg1 <= new_reg1;
     reg2 <= new_reg2;
     // ... snip ...
     reg10000 <= new_reg10000;
  end
end

@veripoolbot veripoolbot commented Nov 23, 2017


Original Redmine Comment
Author Name: Wilson Snyder (@wsnyder)
Original Date: 2017-11-23T16:30:54Z


Added some example tests, test_regress/t/t_alw_split_rst.v

@veripoolbot veripoolbot commented Feb 28, 2018


Original Redmine Comment
Author Name: Wilson Snyder (@wsnyder)
Original Date: 2018-02-28T12:01:31Z


John Coiner provided an excellent patch, which is merged into the git develop-v4 branch, towards version 4.000.

Thanks John!!

@veripoolbot veripoolbot commented Sep 16, 2018


Original Redmine Comment
Author Name: Wilson Snyder (@wsnyder)
Original Date: 2018-09-16T21:27:41Z


In 4.002.
