Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

target:riscv: optimize register accesses #842

Merged
merged 5 commits into from
May 25, 2023

Conversation

en-sc
Copy link
Collaborator

@en-sc en-sc commented Apr 24, 2023

These commits reduce the number of accesses to target's registers by better utilizing register cache.

Copy link
Collaborator

@timsifive timsifive left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I love this. Unnecessary register accesses annoy me every time I look at a log and I just haven't gotten around to fixing them.

I haven't done a proper review yet.

Can you add short comments to each register function that highlights the difference between all of them?

@sifiverobert
Copy link

I am not sure if using 'gdb_regno' in low-level debug access functions is appropriate.
These indices are rather defined by ISA (and DM ...) specification and NOT by GDB.
I know these GDB_REGNO_xxx were there before, but when we re-factor the code, we should make it more generic and not more (GDB) specific.

@en-sc
Copy link
Collaborator Author

en-sc commented Apr 25, 2023

I am not sure if using 'gdb_regno' in low-level debug access functions is appropriate. These indices are rather defined by ISA (and DM ...) specification and NOT by GDB. I know these GDB_REGNO_xxx were there before, but when we re-factor the code, we should make it more generic and not more (GDB) specific.

I also think this is a valid point of concern. However, the transition from int to gdb_regno, in my opinion, is justified for a handful of reasons:

  • There are quite a few places in which such a variable is used in the following way:
if (<var> >= GDB_REGNO_<XXX> && <var> <= GDB_REGNO_<YYY>)

In such uses it is silently expected that <var> is part of some enumeration.

  • In my opinion, it is better to use one enumeration consistently through the codebase, rather than supporting two enumerations, one representing gdb's point of view and the other representing HW's point, together with the tools to convert between them.

  • Furthermore, if it becomes necessary to create such HW regno enumeration, it would still be easier if the use of the currently present enumeration was consistent.

So, in my opinion, the creation of such an enumeration and conversion should be done in a separate commit and is somewhat out of scope of this PR.

@en-sc
Copy link
Collaborator Author

en-sc commented Apr 25, 2023

Can you add short comments to each register function that highlights the difference between all of them?

Addressed.

@en-sc en-sc force-pushed the en-sc/optimize-access branch 3 times, most recently from c1893cd to 82600fd Compare April 25, 2023 12:58
Copy link
Collaborator

@timsifive timsifive left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make sure to run make all in riscv-tests/debug? That has coverage for at least some of the scratch RAM cases.

This is a large change, and I haven't entirely grokked it. Can you give an example or two of cases where registers are now cached that they wouldn't be previously?

src/target/riscv/riscv-013.c Outdated Show resolved Hide resolved
src/target/riscv/riscv-013.c Outdated Show resolved Hide resolved
src/target/riscv/riscv-013.c Show resolved Hide resolved
src/target/riscv/riscv-013.c Show resolved Hide resolved
src/target/riscv/riscv-013.c Show resolved Hide resolved
src/target/riscv/riscv-013.c Show resolved Hide resolved
src/target/riscv/riscv.c Outdated Show resolved Hide resolved
src/target/riscv/riscv.c Show resolved Hide resolved
src/target/riscv/riscv.c Outdated Show resolved Hide resolved
src/target/riscv/riscv.c Outdated Show resolved Hide resolved
Copy link
Collaborator

@JanMatCodasip JanMatCodasip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes look really fine 👍

Thank you for taking the time to improve the code.

I have only smaller suggestions.

src/target/riscv/riscv-013.c Outdated Show resolved Hide resolved
src/target/riscv/riscv-013.c Outdated Show resolved Hide resolved
src/target/riscv/riscv-013.c Show resolved Hide resolved
src/target/riscv/riscv-013.c Outdated Show resolved Hide resolved
src/target/riscv/riscv-013.c Outdated Show resolved Hide resolved
src/target/riscv/riscv-013.c Show resolved Hide resolved
src/target/riscv/riscv.c Show resolved Hide resolved
src/target/riscv/riscv.h Outdated Show resolved Hide resolved
src/target/riscv/riscv.c Show resolved Hide resolved
src/target/riscv/riscv.c Show resolved Hide resolved
src/target/riscv/riscv.c Outdated Show resolved Hide resolved
src/target/riscv/riscv.c Show resolved Hide resolved
@en-sc
Copy link
Collaborator Author

en-sc commented May 2, 2023

This is a large change, and I haven't entirely grokked it. Can you give an example or two of cases where registers are now cached that they wouldn't be previously?

  • This is an annotated log of vector register write prior to the change (this is a second write in a row with mstatus set to 0x8000000a00000600 before accessing the registers):
Debug: 2400 469 riscv.c:4861 register_set(): [riscv.cpu] write 0xfffffffffffffffc0000000000000004 to v1 (valid=0)
...
//Reading mstatus from hart as this is a vector register access
...
Debug: 2412 470 riscv-013.c:1472 register_read_direct(): [riscv.cpu] mstatus = 0x8000000a00000600
...
//Reading mstatus from hart to prepare to access vtype
...
Debug: 2422 471 riscv-013.c:1472 register_read_direct(): [riscv.cpu] mstatus = 0x8000000a00000600
...
//Reading vtype
...
Debug: 2432 471 riscv-013.c:1472 register_read_direct(): [riscv.cpu] vtype = 0x18
...
//Reading mstatus from hart to prepare to read vl
...
Debug: 2442 472 riscv-013.c:1472 register_read_direct(): [riscv.cpu] mstatus = 0x8000000a00000600
...
//Reading vl
...
Debug: 2452 472 riscv-013.c:1472 register_read_direct(): [riscv.cpu] vl = 0x2
Debug: 2453 472 riscv-013.c:1293 register_write_direct(): [riscv.cpu] vtype <- 0x18
...
//Reading mstatus from hart to prepare to read vtype
...
Debug: 2463 473 riscv-013.c:1472 register_read_direct(): [riscv.cpu] mstatus = 0x8000000a00000600
...
//Wrote 0x18 to vtype
...
Debug: 2505 476 riscv-013.c:1293 register_write_direct(): [riscv.cpu] vl <- 0x2
...
//Reading mstatus from hart to prepare to write vl
...
Debug: 2515 477 riscv-013.c:1472 register_read_direct(): [riscv.cpu] mstatus = 0x8000000a00000600
...
//Wrote  0x2 to vl
...
//Writing the vector register
...
Debug: 2599 481 riscv-013.c:1293 register_write_direct(): [riscv.cpu] vtype <- 0x18
...
//Reading mstatus from hart to prepare to write vtype
...
Debug: 2609 481 riscv-013.c:1472 register_read_direct(): [riscv.cpu] mstatus = 0x8000000a00000600
...
//vtype restored 
...
Debug: 2651 483 riscv-013.c:1293 register_write_direct(): [riscv.cpu] vl <- 0x2
...
//Reading mstatus from hart to prepare to write vl
...
Debug: 2661 485 riscv-013.c:1472 register_read_direct(): [riscv.cpu] mstatus = 0x8000000a00000600
...
//vl restored
...
User : 2703 488 options.c:63 configuration_output_handler(): 0:v1 (/128): 0xfffffffffffffffc0000000000000004
  • And this is the same log after the change:
Debug: 2235 539 riscv.c:4951 register_set(): [riscv.cpu] write 0xfffffffffffffffc0000000000000004 to v1 (valid=0)
...
Debug: 2238 539 riscv-013.c:1085 prep_for_register_access(): [riscv.cpu] Preparing mstatus to access vl
Debug: 2239 539 riscv.c:4498 riscv_get_register(): [riscv.cpu] Read mstatus: 0x8000000a00000600 (cached)
Debug: 2240 539 riscv.c:4498 riscv_get_register(): [riscv.cpu] Read vtype: 0x18 (cached)
Debug: 2241 539 riscv.c:4498 riscv_get_register(): [riscv.cpu] Read vl: 0x2 (cached)
Debug: 2242 539 riscv.c:4443 riscv_set_or_write_register(): [riscv.cpu] Wrote 0x18 to vtype (valid=true, dirty=false)
Debug: 2243 539 riscv.c:4443 riscv_set_or_write_register(): [riscv.cpu] Wrote 0x2 to vl (valid=true, dirty=false)
...
//Reading the vector register
...
Debug: 2287 540 riscv.c:4443 riscv_set_or_write_register(): [riscv.cpu] Wrote 0x18 to vtype (valid=true, dirty=false)
Debug: 2288 540 riscv.c:4443 riscv_set_or_write_register(): [riscv.cpu] Wrote 0x2 to vl (valid=true, dirty=false)
Debug: 2289 540 riscv-013.c:1118 cleanup_after_register_access(): [riscv.cpu] Restoring mstatus to 0x8000000a00000600
Debug: 2290 540 riscv.c:4443 riscv_set_or_write_register(): [riscv.cpu] Wrote 0x8000000a00000600 to mstatus (valid=true, dirty=false)
User : 2291 540 options.c:63 configuration_output_handler(): 0:v1 (/128): 0xfffffffffffffffc0000000000000004

To sum up, after the change:

  • Writes to a register of the same value as the one present in the cache do not write to hart, in particular to a WARL register (MSTATUS, VL and VTYPE in the example above). This allows reading MSTATUS only ones, when no change is needed.

  • Using the riscv_get_register and the new riscv_write_register functions allows MSTATUS to be read at most twice and written once during preparation for vector register access (read to get the value, write to set it, read to get the updated value, since it is a WARL register).

  • Similarly, the number of accesses to VL and VTYPE is reduced.

@TommyMurphyTM1234
Copy link
Collaborator

Writes to a register of the same value as the one present in the cache do not write to hart

Is is definitely and always the case that there are no registers that cause side effects when the value already stored is written again and, for which, ignoring such writes will result in incorrect behaviour?

@en-sc
Copy link
Collaborator Author

en-sc commented May 2, 2023

Can you make sure to run make all in riscv-tests/debug? That has coverage for at least some of the scratch RAM cases.

I've run all the tests from make all. Tests on the spike-multy platform hang and does not finish. I've gotten the same behavior without my changes. This could be an issue with spike/gdb version. Can you please provide me with info on which versions these tests were passing before?

@en-sc
Copy link
Collaborator Author

en-sc commented May 2, 2023

Writes to a register of the same value as the one present in the cache do not write to hart

Is is definitely and always the case that there are no registers that cause side effects when the value already stored is written again and, for which, ignoring such writes will result in incorrect behaviour?

This question interests me as well. My understanding is that there isn't one, but I'm not sure.

@en-sc en-sc force-pushed the en-sc/optimize-access branch 2 times, most recently from 54be833 to ce6ab8b Compare May 2, 2023 15:40
@en-sc en-sc force-pushed the en-sc/optimize-access branch 2 times, most recently from b10689d to e16cfbf Compare May 3, 2023 07:38
@timsifive
Copy link
Collaborator

Thanks for showing the difference in vector accesses. That's a very nice improvement.

I've run all the tests from make all. Tests on the spike-multy platform hang and does not finish. I've gotten the same behavior without my changes. This could be an issue with spike/gdb version. Can you please provide me with info on which versions these tests were passing before?

Last I really looked, spike-multi passed for me but it was hugely slower than it used to be. I'm not sure what happened, and I've been ignoring the problem. They're really just to test connecting OpenOCD RV32 and RV64 targets. I don't think your code would have been affected by that.

Copy link
Collaborator

@JanMatCodasip JanMatCodasip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@en-sc Thank you for addressing the review feedback so far.

I have visually re-checked the merge request once again and it looks fine. In summary, I have only:

  • Few minor suggestions about comments & code style.
  • One doubt about the save & restore of target state in examine(). I think this part had been reworked in a better way already, but now this save & restore is back. Why? :)

src/target/riscv/riscv-013.c Outdated Show resolved Hide resolved
src/target/riscv/riscv-013.c Outdated Show resolved Hide resolved
src/target/riscv/riscv-013.c Show resolved Hide resolved
src/target/riscv/riscv-013.c Outdated Show resolved Hide resolved
src/target/riscv/riscv-013.c Show resolved Hide resolved
src/target/riscv/riscv-013.c Show resolved Hide resolved
src/target/riscv/riscv-013.c Show resolved Hide resolved
src/target/riscv/riscv.c Outdated Show resolved Hide resolved
src/target/riscv/riscv-013.c Show resolved Hide resolved
src/target/riscv/riscv.c Outdated Show resolved Hide resolved
JanMatCodasip
JanMatCodasip previously approved these changes May 22, 2023
Copy link
Collaborator

@JanMatCodasip JanMatCodasip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done one last visual check of the changes and it looks good to me. Thanks.

src/target/riscv/riscv.c Outdated Show resolved Hide resolved
src/target/riscv/riscv.c Outdated Show resolved Hide resolved
src/target/riscv/riscv.c Show resolved Hide resolved
JanMatCodasip
JanMatCodasip previously approved these changes May 22, 2023
@JanMatCodasip
Copy link
Collaborator

@en-sc Also please make sure your branch is rebased on top of current riscv trunk. That's because a merge from the upstream (openocd.org) has recently taken place.

en-sc added 4 commits May 22, 2023 11:55
Since writing a register can make some GPRs dirty (e.g. S0, S1), registers
should be flushed in reverse order, so GPRs are flushed last.

Change-Id: Ice352a4df4ae064619c0f9905db634a7b57e4711
Signed-off-by: Evgeniy Naydanov <evgeniy.naydanov@syntacore.com>
Change-Id: Ia476251e835fa5fd129ae6b679c6049c5c60c716
Signed-off-by: Evgeniy Naydanov <evgeniy.naydanov@syntacore.com>
This commit introduces a new function, which can be used to reduce number
of register accesses.

Change-Id: I125809726eb7797b11121175c3ad66bedb66dd0d
Signed-off-by: Evgeniy Naydanov <evgeniy.naydanov@syntacore.com>
Introduce riscv_write_register to prep_for_register/vector_access and
cleanup_after_register/vector_access.

Change-Id: I77a0a06ac6f12eceec309f0aff94aa77bd56ff55
Signed-off-by: Evgeniy Naydanov <evgeniy.naydanov@syntacore.com>
@en-sc
Copy link
Collaborator Author

en-sc commented May 22, 2023

@en-sc Also please make sure your branch is rebased on top of current riscv trunk. That's because a merge from the upstream (openocd.org) has recently taken place.

Done.

Change-Id: I45731d501f6261c4142c70afacf3fbbe42cf2806
Signed-off-by: Evgeniy Naydanov <evgeniy.naydanov@syntacore.com>
@en-sc
Copy link
Collaborator Author

en-sc commented May 23, 2023

Last change was a bugfix. Current version passes tests from riscv-tests/debug.

@en-sc
Copy link
Collaborator Author

en-sc commented May 23, 2023

@timsifive, can this PR be merged today, or do I need to do something?

@timsifive
Copy link
Collaborator

I'm giving @JanMatCodasip a final chance at commenting, but otherwise I was going to merge it tomorrow.

@timsifive timsifive merged commit c8e7e35 into riscv-collab:riscv May 25, 2023
@timsifive
Copy link
Collaborator

🎆

@en-sc en-sc deleted the en-sc/optimize-access branch August 14, 2024 18:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants