[DO NOT MERGE] asap7: add mock-registerfile and mock-aluregisterfile #1452


@oharboe
Collaborator

oharboe commented Sep 11, 2023

Study registerfile properties and ALU+registerfile properties.

The Verilog is now generated by LLVM CIRCT (firtool), hence some changes to the quality of results for the mock-alu.

The mock-aluregisterfile and mock-registerfile are not hooked up to Jenkins; testing mock-alu is enough.

Even with a flip-flop based SRAM, unsurprisingly, the number of ports completely dominates the area and speed.

image

@oharboe
Collaborator Author

oharboe commented Sep 11, 2023

TL;DR: merits further study.

mock-aluregisterfile is used to model, for instance, the megaboom register file: 128 registers of 64 bits each, with 8 read ports and 4 write ports, where I hook up 2 read ports and 1 write port to a 64-bit ADD operation (a mock ALU with a single operation). The mock-aluregisterfile is implemented with flip-flops. I'm getting clock period_min = 2550.94 for this testcase.
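
To give a feel for why the port count matters in a flip-flop implementation, here is a rough back-of-the-envelope sketch. It is not taken from the PR; the cost model (a full N:1 mux tree per bit per read port, counted in 2:1 mux equivalents, plus one select stage per write port per stored bit) and the function name `regfile_estimate` are my own assumptions. Real synthesis will share and restructure this logic, so treat the numbers as an order-of-magnitude illustration only.

```python
def regfile_estimate(entries, width, read_ports, write_ports):
    """Crude size model for a flip-flop based register file (assumed model)."""
    # One flip-flop per stored bit.
    flip_flops = entries * width
    # Assumption: each read port needs one entries:1 mux per bit,
    # and an N:1 mux decomposes into N - 1 two-input muxes.
    read_mux2 = read_ports * width * (entries - 1)
    # Assumption: each stored bit picks between its old value and the
    # data of each write port, i.e. one two-input mux per write port.
    write_mux2 = flip_flops * write_ports
    return {
        "flip_flops": flip_flops,
        "read_mux2": read_mux2,
        "write_mux2": write_mux2,
    }


if __name__ == "__main__":
    # Configuration quoted in the comment above: 128 x 64 bit, 8R/4W.
    for name, count in regfile_estimate(128, 64, 8, 4).items():
        print(f"{name:12s} {count}")
```

Under these assumptions the read and write muxing grows linearly with the number of ports while the storage stays fixed, which is at least consistent with the observation that ports dominate.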

I'm unsure if a fake asynchronous read SRAM would give more realistic figures as asynchronous read SRAMs don't normally have as many as 8 read ports and 4 write ports.

I forget where, but I've seen 1.5GHz figures for megaboom, i.e. a 666ps clock period, so mock-aluregisterfile should, with commercial tools, enough engineering effort, and a suitable implementation of the asynchronous SRAM, be able to reach a 666ps clock period. I don't know Berkeley BOOM and megaboom well enough to be quite sure that this is how the registerfile and ADD are hooked up, but I don't think I'm far off. If anything, the megaboom has even more logic per cycle.

As best I can understand, x86 does this at 200ps on 7nm, obviously with a lot more engineering effort, commercial tools, and PDKs more tailored to CPUs.
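
For reference, the period and frequency figures above can be cross-checked with a few lines of Python. This is only a sketch: it assumes period_min is reported in picoseconds (the unit is not stated above), and the helper names are made up for illustration.

```python
def period_ps_to_ghz(period_ps):
    """Convert a clock period in picoseconds to a frequency in GHz."""
    return 1000.0 / period_ps


def ghz_to_period_ps(freq_ghz):
    """Convert a frequency in GHz to a clock period in picoseconds."""
    return 1000.0 / freq_ghz


if __name__ == "__main__":
    measured_ps = 2550.94                  # period_min, assumed to be in ps
    target_ps = ghz_to_period_ps(1.5)      # the 1.5GHz megaboom figure
    print(f"measured: {period_ps_to_ghz(measured_ps) * 1000:.0f} MHz")
    print(f"target:   {target_ps:.1f} ps")
    print(f"gap:      {measured_ps / target_ps:.2f}x")
    print(f"200 ps:   {period_ps_to_ghz(200.0):.1f} GHz")
```

If the picosecond assumption holds, the measured result is a bit under 400 MHz, so roughly a 4x gap to the 1.5GHz figure.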

@oharboe
Collaborator Author

oharboe commented Sep 11, 2023

@maliberty There is no real change to the mock-alu here, just a slight difference in how the Verilog is generated. The rest, not in Jenkins, is to have a shared baseline for mock-registerfile and mock-aluregisterfile for investigations.

@rovinski
Member

@oharboe I've seen a few of your experiments and I just want to throw out that register retiming is actually very important for some of the Berkeley designs. I implemented a Rocket Core with FPU and it turned out that the FPU pipelining scheme was completely dependent on register retiming. Yosys doesn't support register retiming, so it may be difficult to push it forward with the tools alone.

@oharboe
Collaborator Author

oharboe commented Sep 11, 2023

> @oharboe I've seen a few of your experiments and I just want to throw out that register retiming is actually very important for some of the Berkeley designs. I implemented a Rocket Core with FPU and it turned out that the FPU pipelining scheme was completely dependent on register retiming. Yosys doesn't support register retiming, so it may be difficult to push it forward with the tools alone.

Thanks!

I am doing manual retiming for what I am working on.

None of the experiments that I am proposing to merge with ORFS rely on retiming.

I am cognizant that the reliance on retiming is endemic in RISC-V BOOM. It is documented, and it is also evident from the code.

@maliberty
Member

Designs that are not in the CI tend to rot. What value do these bring over the existing tests? What is the runtime for these designs?

@oharboe
Collaborator Author

oharboe commented Oct 4, 2023

@maliberty Running times are below; mock-registerfile is a bit faster than mock-aluregisterfile:

Log                       Elapsed seconds
1_1_yosys                        607
2_1_floorplan                     89
2_2_floorplan_io                   3
2_3_floorplan_tdms                 3
2_4_floorplan_macro                3
2_5_floorplan_tapcell              9
2_6_floorplan_pdn                 12
3_1_place_gp_skip_io             105
3_2_place_iop                      4
3_3_place_gp                    1225
3_4_place_resized                258
3_5_place_dp                     220
4_1_cts                          374
4_2_cts_fillcell                  10
5_1_grt                          249
5_2_route                       5457
6_1_merge                         64
6_report                        1229

Log                       Elapsed seconds
1_1_yosys                        967
2_1_floorplan                    140
2_2_floorplan_io                   4
2_3_floorplan_tdms                 4
2_4_floorplan_macro                4
2_5_floorplan_tapcell             10
2_6_floorplan_pdn                 12
3_1_place_gp_skip_io             129
3_2_place_iop                      4
3_3_place_gp                    1911
3_4_place_resized                423
3_5_place_dp                     317
4_1_cts                          573
4_2_cts_fillcell                  10
5_1_grt                          443
5_2_route                       8355
6_1_merge                         84
6_report                        1562
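
To answer the runtime question with a single number per design, the per-stage figures can be totalled. The sketch below copies the two listings verbatim; the names `FIRST`, `SECOND` and `parse_log` are just for illustration, and I'm not assuming which listing belongs to which design.

```python
FIRST = """
1_1_yosys 607
2_1_floorplan 89
2_2_floorplan_io 3
2_3_floorplan_tdms 3
2_4_floorplan_macro 3
2_5_floorplan_tapcell 9
2_6_floorplan_pdn 12
3_1_place_gp_skip_io 105
3_2_place_iop 4
3_3_place_gp 1225
3_4_place_resized 258
3_5_place_dp 220
4_1_cts 374
4_2_cts_fillcell 10
5_1_grt 249
5_2_route 5457
6_1_merge 64
6_report 1229
"""

SECOND = """
1_1_yosys 967
2_1_floorplan 140
2_2_floorplan_io 4
2_3_floorplan_tdms 4
2_4_floorplan_macro 4
2_5_floorplan_tapcell 10
2_6_floorplan_pdn 12
3_1_place_gp_skip_io 129
3_2_place_iop 4
3_3_place_gp 1911
3_4_place_resized 423
3_5_place_dp 317
4_1_cts 573
4_2_cts_fillcell 10
5_1_grt 443
5_2_route 8355
6_1_merge 84
6_report 1562
"""


def parse_log(text):
    """Turn 'stage seconds' lines into a {stage: seconds} dict."""
    stages = {}
    for line in text.split("\n"):
        fields = line.split()
        if len(fields) == 2:
            stages[fields[0]] = int(fields[1])
    return stages


if __name__ == "__main__":
    for label, text in (("first listing", FIRST), ("second listing", SECOND)):
        stages = parse_log(text)
        total = sum(stages.values())
        route_share = stages["5_2_route"] / total
        print(f"{label}: {total} s ({total / 3600:.2f} h), "
              f"5_2_route is {route_share:.0%} of the total")
```

In both listings 5_2_route alone accounts for more than half of the elapsed time.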

Study registerfile properties and ALU+registerfile
properties.

The verilog is now generated by LLVM CIRCT, firtool, hence
some changes to quality of results for the mock-alu.

The mock-aluregisterfile and mock-registerfile are
not hooked up to Jenkins, it's enough to test mock-alu.

Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
@oharboe
Collaborator Author

oharboe commented Oct 4, 2023

@maliberty I am leaving mock-registerfile and mock-aluregisterfile out of CI until they are fast enough to be part of a normal build.

For my part, this PR is ready to be merged.

@tspyrou
Contributor

tspyrou commented Oct 4, 2023

@maliberty I think it's good to merge this so we have a goal testcase for future improvements.

@maliberty
Member

Wouldn't register files normally be hard macros generated by a memory compiler? It appears you are generating them from flip-flops, which will be quite inefficient. Is this correct?

@maliberty
Member

Closed in favor of #1547

@maliberty maliberty closed this Oct 11, 2023
@oharboe oharboe changed the title asap7: add mock-registerfile and mock-aluregisterfile [DO NOT MERGE] asap7: add mock-registerfile and mock-aluregisterfile Oct 11, 2023
@oharboe oharboe reopened this Oct 11, 2023
@oharboe oharboe marked this pull request as draft October 11, 2023 20:44
@oharboe
Collaborator Author

oharboe commented Oct 11, 2023

I will update this PR when #1547 is done. Only the registerfile needs to be updated; otherwise the PR will be the same.

@oharboe oharboe closed this Dec 31, 2023
@oharboe oharboe deleted the mock-registerfile branch January 23, 2024 12:26