
[sram_ctrl/rtl] Timing fixes #22497

Merged: 2 commits into lowRISC:master on Apr 16, 2024

Conversation

@GregAC (Contributor) commented on Apr 9, 2024

I plan to put together a hacky block synthesis for sram_ctrl so I can check I've eliminated all of the feedthroughs that were causing issues. Will work on that tomorrow.

Contributes to resolving #22462.

@GregAC requested a review from a team as a code owner on April 9, 2024 at 16:41
@GregAC requested review from matutem and removed the request for a team on April 9, 2024 at 16:41
@GregAC force-pushed the sram_ctrl_timing_fix branch 2 times, most recently from 79a4896 to e365526, on April 11, 2024 at 13:04
@GregAC (Contributor, Author) commented on Apr 12, 2024

Having got the block-level synth running, I've identified another feedthrough, via the error_internal signal in tlul_adapter_sram:

a_ready : (gnt_i | error_internal) & reqfifo_wready & sramreqfifo_wready

I think it'll be a shallower feedthrough than the ones I've eliminated, but it's best we get rid of them all. I can get rid of it, but it will be a little more work; in particular, it may involve some extra DV work.
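
For illustration, here is a minimal sketch of why folding error_internal into a_ready creates the feedthrough (this is not the real tlul_adapter_sram code; the error term below is made up): error_internal is a combinational function of the incoming request, so a_ready ends up depending on a_valid of the same cycle.

// Illustrative sketch only: an error_internal derived combinationally from the
// incoming request makes a_ready a function of a_valid (a valid -> ready path).
module a_ready_feedthrough_sketch (
  input  logic        a_valid_i,
  input  logic [31:0] a_address_i,
  input  logic        gnt_i,
  input  logic        reqfifo_wready_i,
  input  logic        sramreqfifo_wready_i,
  output logic        a_ready_o
);
  // Hypothetical error term: depends on the request being present and on its fields.
  logic error_internal;
  assign error_internal = a_valid_i & (a_address_i[1:0] != 2'b00);

  // Same shape as the expression quoted above; because error_internal depends
  // on a_valid_i, so does a_ready_o.
  assign a_ready_o = (gnt_i | error_internal) & reqfifo_wready_i & sramreqfifo_wready_i;
endmodule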

@vogelpi (Contributor) commented on Apr 12, 2024

As this second path is expected to be shallower and involves more DV work, should we first try to finalize the fix for the first one and see how this impacts the actual chip timing?

@GregAC (Contributor, Author) commented on Apr 12, 2024

In parallel I'm fixing up the CI issues for the current fixes, so they should be good to go once they've been reviewed. I'm happy to trial how those alone impact the chip timing, but I would like to continue with this other fix. Long feedthroughs starting at Ibex, going through the crossbar and then through the SRAM and all the way back to Ibex, generally aren't good. With them gone, the implementation flow may find it has more placement freedom and insert fewer buffers, giving better QoR. In particular, it could give us some modest area savings that we'll appreciate having.

This adds a new output to tlul_adapter_sram to signal when a
read-modify-write operation is pending. This is used in sram_ctrl to fix
a timing path that factored a TileLink write-enable signal into a
TileLink ready signal.

Signed-off-by: Greg Chadwick <gac@lowrisc.org>
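
As a rough illustration of the first commit's idea (a minimal sketch, not the actual tlul_adapter_sram interface; the port and signal names below are assumptions), the adapter exposes a registered flag that stays high while a read-modify-write to the SRAM is outstanding, so sram_ctrl can gate key_valid on that instead of on the combinational TileLink write enable:

// Minimal sketch only: a registered "RMW in progress" indicator of the kind
// the first commit describes. Names (rmw_in_progress_o, rmw_start_i, rmw_done_i)
// are illustrative, not the real tlul_adapter_sram ports.
module rmw_pending_sketch (
  input  logic clk_i,
  input  logic rst_ni,
  input  logic rmw_start_i,        // partial write accepted, read phase issued
  input  logic rmw_done_i,         // merged write-back completed
  output logic rmw_in_progress_o
);
  always_ff @(posedge clk_i or negedge rst_ni) begin
    if (!rst_ni) begin
      rmw_in_progress_o <= 1'b0;
    end else if (rmw_start_i) begin
      rmw_in_progress_o <= 1'b1;
    end else if (rmw_done_i) begin
      rmw_in_progress_o <= 1'b0;
    end
  end
endmodule

Because the flag is registered, it has no same-cycle combinational dependence on the incoming TileLink request, which is what keeps the valid -> ready path out of the downstream logic.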
Previously the grant signal from the SRAM primitive was used directly to
calculate a TileLink ready signal. As the grant factors in the incoming
request, this produces a valid -> ready timing path. This causes timing
issues and is not required.

This commit refactors the logic to avoid using the SRAM grant to
calculate the TileLink ready.

Signed-off-by: Greg Chadwick <gac@lowrisc.org>
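
Conceptually, the second commit boils down to something like the following sketch (simplified, not verbatim sram_ctrl RTL; the module wrapper is illustrative, although the grant expression matches the snippet quoted later in this PR): the SRAM grant is dropped from the TL-side grant, so TL ready no longer depends on the incoming request.

// Simplified sketch of the grant refactor.
module tlul_gnt_sketch (
  input  logic key_valid_i,
  input  logic init_req_i,
  input  logic sram_gnt_i,   // no longer used after the fix; kept to show the contrast
  output logic tlul_gnt_o
);
  // Before (conceptually): tlul_gnt_o = sram_gnt_i & key_valid_i & ~init_req_i;
  // sram_gnt_i is itself a function of the request, hence valid -> ready.
  // After: no dependence on the SRAM grant, so the path is gone.
  assign tlul_gnt_o = key_valid_i & ~init_req_i;
endmodule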
@GregAC (Contributor, Author) commented on Apr 12, 2024

After a bit more investigation, this new path may not be that much shallower than the previous ones anyway, so I think it's worth fixing before doing a new implementation trial.

  assign key_valid =
    (key_req_pending_q)         ? 1'b0 :
-   (reg2hw.status.escalated.q) ? (tl_gate_resp_pending & tlul_we)              : 1'b1;
+   (reg2hw.status.escalated.q) ? (tl_gate_resp_pending & sram_rmw_in_progress) : 1'b1;
A reviewer (Contributor) commented:
Do we really need to take tlul_we or sram_rmw_in_progress into account for key_valid in the escalated case? As far as I can see, in the escalated case key_valid is only used to make prim_ram_1p_scr grant requests -- can't we make it grant all requests, including reads?

The reason I'm asking is that the change would be much simpler, and it would also mean less code to maintain.

@GregAC (Contributor, Author) replied:

I guess it's defence in depth? With this setup we've got both the TLUL gate and the clearing of key_valid stopping a transaction from acting on the SRAM after escalation. If key_valid is always true other than when key_req_pending_q is set, then after an escalation any transaction that reaches tlul_adapter_sram will get acted upon.

I note there's an explicit clone of the escalation signal, with one clone setting the escalated bit (which causes key_valid to go low) and the other clone closing the TLUL gate:

  lc_tx_t [1:0] escalate_en;
  prim_lc_sync #(
    .NumCopies (2)
  ) u_prim_lc_sync (
    .clk_i,
    .rst_ni,
    .lc_en_i (lc_escalate_en_i),
    .lc_en_o (escalate_en)
  );

There's some discussion on these PRs: #11863 and #17072

There's also the argument that we want to change behaviour as little as possible.

The reviewer (Contributor) replied:

Thanks for the explanation. I think defense in depth and changing behavior as little as possible are sufficient arguments for making the change this way.

// a request from tlul_req will be granted regardless of whether a request exists. This is done
// for timing reasons so that the output TL ready isn't combinatorially connected to the incoming
// TL valid. In particular we must not use `sram_gnt` in the expression to avoid this.
assign tlul_gnt = key_valid & ~init_req;
A reviewer (Contributor) commented:

This change LGTM

@andreaskurth (Contributor) left a comment:

LGTM based on the discussions in this PR. Thanks @GregAC!

@vogelpi (Contributor) commented on Apr 16, 2024

Please don't merge this yet; AFAIK Greg found another path he is currently working on improving.

@GregAC (Contributor, Author) commented on Apr 16, 2024

Yes, I've got two more fixes inbound; I'm currently investigating some regression failures and will then update this PR with them.

@andreaskurth changed the title from "Sram ctrl timing fix" to "[sram_ctrl/rtl] Timing fixes" on Apr 16, 2024
@andreaskurth (Contributor) commented:
Since we are agreed on this, we could just merge this one for now and create a follow-up PR for the other improvements (tracked against #22462)?

@GregAC (Contributor, Author) commented on Apr 16, 2024

The regression issues I'm looking at may be related to these changes, so I'd prefer to at least understand those before we merge this.

@GregAC (Contributor, Author) commented on Apr 16, 2024

The failures look to be related to my new work, so I'm merging this now.

@GregAC merged commit 3cf650c into lowRISC:master on Apr 16, 2024
31 checks passed
@GregAC deleted the sram_ctrl_timing_fix branch on April 16, 2024 at 11:04