Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

very large clocked process makes it impossible to bring out non-clocked signals #17

Closed
wallclimber21 opened this issue Aug 30, 2016 · 6 comments
Assignees

Comments

@wallclimber21
Copy link
Contributor

wallclimber21 commented Aug 30, 2016

In the current code, there are some huge clocked always clauses, like the one that starts on line 1168.

There is nothing fundamentally wrong with this as long as you don't need to bring out anything combinationally.

The problem is that this is what I'd like to do. :-)

The immediate issue is that, for one reason or the other, cpuregs is not detected by Quartus as a memory block, so it consumes tons of regular registers. (From your documentation, it seems that Vivado is detecting this just fine.)

I'd like to experiment by replacing constructs like this:
cpuregs[latched_rd] <= reg_pc + (latched_compr ? 2 : 4);

to
cpuregs_wr = 1'b1;
cpuregs_addr = latched_rd
cpuregs_wdata = reg_pc + (latched_compr ? 2 : 4);

and then have 1 place with a single cpuregs[cpuregs_addr] ... clause.

If that still doesn't do the trick, I'd even instantiate an Altera memory macro to force a memory block.

With the current clocked always block, such an experiment requires a full-out rewrite of the whole always block instead of a small surgical patch.

Would you be open to converting this clock process into 2 processes, a pure combinational one and a clock one that just contains the register? I'm willing to do the work if you just tell me the naming convention. My standard convention is:

always(*)
<var_name>_nxt = ....

always @(posedge clk)
<var_name> <= <var_name>_nxt;

But I'm willing to use whatever way you prefer.

(I'm NOT asking to make the change for the cpuregs isolation itself, that could be a forked branch on my side of you think it makes the code look too ugly.)

In addition to cpuregs experiments, such a change would also make it possible to later insert logic to scan out register contents via jtag etc.

@cliffordwolf cliffordwolf self-assigned this Aug 30, 2016
@cliffordwolf
Copy link
Collaborator

Hmm.. Does Quartus handle the memory fine if you use this pre-synthesized version of the design?
http://scratch.clifford.at/picorv32_syn.v

(PicoRV32 in its default configuration, generated with yosys -p 'prep -nomem -top picorv32; memory -nomap; opt' -o picorv32_syn.v picorv32.v.)

@cliffordwolf
Copy link
Collaborator

I don't have quartus installed to test this, but I think commit 82d837b might solve the issue. Please give it a try and let me know if it does.

@wallclimber21
Copy link
Contributor Author

wallclimber21 commented Aug 30, 2016

With our picorv32_syn.v file, Quartus infers not one but two cpuregs, a block memory not MLABs. They'r each dual ported too, 32-wide and 32-deep. I think that's progress. :-)

However the commit 82d837b synthesizes to FFs.

BTW: one other reasons to make cpuregs an explicit RAM is because it's the only way to go forward when using a non-FPGA synthesis flow. Those typically don't support automated RAM inference.

@wallclimber21
Copy link
Contributor Author

It's uplifting (or depressing, depending on once pov) that a tool like Yosys is better at this than Quartus!

Looks like I should install Yosys for experimentation.

@cliffordwolf
Copy link
Collaborator

Quartus infers not one but two cpuregs

That's to be expected. The PicoRV32 default configuration is to infer a register file with two read ports (ENABLE_REGS_DUALPORT), which is usually implemented from two block rams.

However commit 82d837b synthesizes to FFs.

What's about commit c9519df?

one other reasons to make cpuregs an explicit RAM is because it's the only way to go forward when using a non-FPGA synthesis flow.

Well... with Yosys you can infer custom RAM resources pretty easily. Also: on an ASIC the area difference between a memory generated by a memory compiler and the memory from FFs that the synth tool creates is nowhere near what you would have on an FPGA. Usually it's not worth the effort for a 1 kBit memory like the RV32 register file. Especially when considering that usually you have something like a program memory right next to the CPU, and that will have to hold much more than just 32 words.

@wallclimber21
Copy link
Contributor Author

What's about commit c9519df?

Success!

That's to be expected. The PicoRV32 default configuration is to infer a register file with two read ports (ENABLE_REGS_DUALPORT), which is usually implemented from two block rams.

From quick inspection of the code, I thought that there weren't cases where you read and write to the register files during the same cycle (cpu_state_fetch. In that case, a single dualported memory with 2 RW ports would have been sufficient. But I see now that this is not the case. Carry on. :-)

With this, the issue can be closed. If one day, I really need to hand-instantiate a small RAM, it's only a minor change in the code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants