[DO NOT MERGE] asap7: add mock-registerfile and mock-aluregisterfile #1452
TL;DR: merits further study.

mock-aluregisterfile models, for instance, the megaboom register file: 128 64-bit registers with 8 read ports and 4 write ports, where I hook up 2 read ports and 1 write port to a 64-bit ADD operation (a mock ALU with a single operation). The mock-aluregisterfile is implemented with flip-flops.

I'm getting [result not captured in this page]. I'm unsure whether a fake asynchronous-read SRAM would give more realistic figures, as asynchronous-read SRAMs don't normally have as many as 8 read ports and 4 write ports.

I forget where, but I've seen 1.5 GHz figures for megaboom, i.e. a 666 ps clock period, so with commercial tools, enough engineering effort and a suitable implementation of the asynchronous SRAM, mock-aluregisterfile should be able to reach a 666 ps clock period.

I don't know Berkeley BOOM and megaboom well enough to be sure that this is how the register file and ADD are hooked up, but I don't think I'm far off. If anything, megaboom has even more logic per cycle.

As best I can understand, x86 does this at 200 ps on 7 nm, obviously with a lot more engineering effort, commercial tools and PDKs more tailored to CPUs.
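The figures quoted above can be checked with a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, and the flip-flop count assumes one flip-flop per stored bit, which is an assumption about the mock design and not something stated in the PR.

```python
def clock_period_ps(freq_ghz: float) -> float:
    """Convert a clock frequency in GHz to a clock period in picoseconds."""
    return 1000.0 / freq_ghz


def clock_freq_ghz(period_ps: float) -> float:
    """Convert a clock period in picoseconds to a frequency in GHz."""
    return 1000.0 / period_ps


def storage_flip_flops(num_registers: int, width_bits: int) -> int:
    """Storage bits in a flip-flop based register file (assumes one FF per bit)."""
    return num_registers * width_bits


if __name__ == "__main__":
    # 1.5 GHz corresponds to roughly 666 ps, as quoted in the comment.
    print(f"1.5 GHz -> {clock_period_ps(1.5):.1f} ps")
    # 200 ps corresponds to 5 GHz.
    print(f"200 ps  -> {clock_freq_ghz(200.0):.1f} GHz")
    # 128 registers x 64 bits.
    print(f"storage flip-flops: {storage_flip_flops(128, 64)}")
```

So the 666 ps target is about 3.3 times slower than the 200 ps figure mentioned for x86.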
@maliberty There is no real change to the mock-alu here, just a slight difference in how the Verilog is generated. The rest, which is not in Jenkins, is there to provide a shared baseline for mock-registerfile and mock-aluregisterfile investigations.
@oharboe I've seen a few of your experiments and I just want to throw out that register retiming is actually very important for some of the Berkeley designs. I implemented a Rocket Core with FPU and it turned out that the FPU pipelining scheme was completely dependent on register retiming. Yosys doesn't support register retiming, so it may be difficult to push it forward with the tools alone.
Thanks! I am doing manual retiming for what I am working on. None of the experiments that I am proposing to merge with ORFS rely on retiming. I am aware that reliance on retiming is endemic in RISC-V BOOM. It is documented as well as evident from the code.
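To illustrate why retiming matters for a pipeline written with its registers bunched at one end, here is a toy model. It is a sketch under strong assumptions: the stage delays are invented numbers, the logic is treated as arbitrarily divisible, and setup time, clock skew and wire delay are ignored. It is not how any particular tool performs retiming.

```python
def min_period_ps(stage_delays_ps):
    """Clock period is bounded by the slowest combinational stage."""
    return max(stage_delays_ps)


def ideal_retimed_period_ps(stage_delays_ps):
    """Best case if registers could be moved to balance the stages perfectly.

    Assumes the total logic delay can be split evenly across the same
    number of stages, which real gate-level logic only approximates.
    """
    return sum(stage_delays_ps) / len(stage_delays_ps)


if __name__ == "__main__":
    # Hypothetical two-stage pipeline: all logic in stage 1, registers after it.
    unbalanced = [900.0, 100.0]
    print("as written:", min_period_ps(unbalanced), "ps")
    print("ideally retimed:", ideal_retimed_period_ps(unbalanced), "ps")
```

Without retiming support in synthesis, the design runs at the "as written" period unless the registers are moved by hand in the source.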
Designs that are not in the CI tend to rot. What value do these bring over the existing tests? What is the runtime for these designs?
@maliberty Running time for mock-aluregisterfile, mock-registerfile is a bit faster:
Study registerfile properties and ALU+registerfile properties. The Verilog is now generated by LLVM CIRCT (firtool), hence some changes to quality of results for the mock-alu. The mock-aluregisterfile and mock-registerfile are not hooked up to Jenkins; it's enough to test mock-alu. Signed-off-by: Øyvind Harboe <oyvind.harboe@zylin.com>
@maliberty Leaving mock-registerfile and mock-aluregisterfile out of CI until they are fast enough to be part of a normal build. For my part, this PR is ready to be merged.
@maliberty I think it's good to merge this so we have a goal testcase for future improvements.
Wouldn't register files normally be hard macros generated by a memory compiler? It appears you are generating them from flip-flops, which will be quite inefficient. Is this correct?
Closed in favor of #1547
I will update this PR when #1547 is done. Only the registerfile needs to be updated, otherwise the PR will be the same.
Even with flip-flop based SRAM, unsurprisingly, the number of ports completely dominates the area and speed.
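A first-order gate count suggests why the port count dominates. The sketch below is an assumed model, not a result from this PR: it counts 2:1 mux equivalents for a flip-flop register file, assuming each read port is an N:1 mux tree per bit and each stored bit selects among the write ports or holds its value. Decoders, buffering and wiring are ignored, and the function names are hypothetical.

```python
def read_mux2_count(num_registers: int, width_bits: int, read_ports: int) -> int:
    """An N:1 mux built from 2:1 muxes needs N-1 of them, per bit, per read port."""
    return read_ports * width_bits * (num_registers - 1)


def write_mux2_count(num_registers: int, width_bits: int, write_ports: int) -> int:
    """Each stored bit picks one of W write inputs or holds: W 2:1 muxes per bit."""
    return num_registers * width_bits * write_ports


def total_mux2_count(num_registers, width_bits, read_ports, write_ports):
    return (read_mux2_count(num_registers, width_bits, read_ports)
            + write_mux2_count(num_registers, width_bits, write_ports))


if __name__ == "__main__":
    regs, width = 128, 64
    print("storage flip-flops:", regs * width)
    for r, w in [(1, 1), (2, 1), (8, 4)]:
        print(f"{r}R{w}W mux2 equivalents:", total_mux2_count(regs, width, r, w))
```

Under these assumptions the 8-read, 4-write configuration needs roughly twelve mux equivalents per storage flip-flop, so the storage itself is a small part of the total.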