Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix loadmem bug in firesim_tsi.cc #1401

Merged
merged 2 commits into from Jan 26, 2023
Merged

Fix loadmem bug in firesim_tsi.cc #1401

merged 2 commits into from Jan 26, 2023

Conversation

jerryz123
Copy link
Contributor

The default chunk_align (32b) was less than the default host-DRAM interface (64b), which caused drops on some loadmem_writes. Fix this for now by setting chunk_align to 8.

This error would manifest only when the address written had (addr + len) % 8 == {5|6|7}.
For example, writing 0x6 bytes to 0x100 would result in a sequence of writes:

  1. write 2 bytes to 0x104
  2. write 4 bytes to 0x100

The second write would generate a 8-byte write to 0x100, clobbering the first write.

Related PRs / Issues

UI / API Impact

None

Verilog / AGFI Compatibility

Contributor Checklist

  • Is this PR's title suitable for inclusion in the changelog and have you added a changelog:<topic> label?
  • Did you add Scaladoc/docstring/doxygen to every public function/method?
  • Did you add at least one test demonstrating the PR?
  • Did you delete any extraneous prints/debugging code?
  • Did you state the UI / API impact?
  • Did you specify the Verilog / AGFI compatibility impact?
  • If applicable, did you regenerate and publicly share default AGFIs?
  • If applicable, did you apply the ci:fpga-deploy label?
  • If applicable, did you apply the Please Backport label?

Reviewer Checklist (only modified by reviewer)

Note: to run CI on PRs from forks, comment @Mergifyio copy main and manage the change from the new PR.

  • Is the title suitable for inclusion in the changelog and does the PR have a changelog:<topic> label?
  • Did you mark the proper release milestone?
  • Did you check whether all relevant Contributor checkboxes have been checked?

@jerryz123 jerryz123 added bug Something isn't working changelog:fixed Put PR title in 'Fixed' section of changelog labels Jan 22, 2023
@jerryz123 jerryz123 marked this pull request as ready for review January 22, 2023 10:05
Copy link
Contributor

@abejgonzalez abejgonzalez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this have an impact on performance?

@jerryz123
Copy link
Contributor Author

jerryz123 commented Jan 23, 2023

No. This only affects initial binary loading, and only for the tail case - the last 8 bytes in a section in a ELF.

Even so it doesn't actually change the number of host-DRAM requests - it only aligns the chunked requests properly.

@davidbiancolin
Copy link
Contributor

We've had other issues with handling the last bytes of loadmem in the past. In lieu of actually testing the infrastructure better, could we check in a few binaries that have elf sizes that trigger this problem?

@jerryz123
Copy link
Contributor Author

I think the right way to do this is to add some self-test in firesim_tsi_t. Checking some sequence of write/read commands can validate that the host-DRAM interface is sane before the target payload is loaded.

One of the test vectors should check this case. It is sufficient to check that:

write(0x8000_0000, 6, &data);
read(0x8000_0000, 6, &data);

behaves correctly.

@davidbiancolin
Copy link
Contributor

I think the right way to do this is to add some self-test in firesim_tsi_t. Checking some sequence of write/read commands can validate that the host-DRAM interface is sane before the target payload is loaded.

One of the test vectors should check this case. It is sufficient to check that:

write(0x8000_0000, 6, &data);
read(0x8000_0000, 6, &data);

behaves correctly.

That's a great idea so long as you pull in the address somewhere sane. Could you do that here?

@jerryz123
Copy link
Contributor Author

That's a great idea so long as you pull in the address somewhere sane. Could you do that here?

It turns out this is actually not so easy to do safely.

The problem is:

  • SerialBridge, which instantiates fesvr, doesn't know anything about the underlying implementation of target memory (it might not even be FPGA-DRAM)
  • LoadMem doesn't know the target address map, LoadMem talks directly to DRAM through the host address map

So the only way to add a FESVR-side self-test of the loadmem interface is to add assumptions on the relationship between the FESVR-accessible memory map and the LoadMem widget. I don't think this should be done.

I think there are more pressing issues with the current setup. I believe an ELF file with sections mapped to invalid memory regions can cause a bus hang, or even worse, generate writes to host DRAM regions allocated to unrelated bridges without any indication. This ought to be fixed by moving the loadmem widget behind the AXI4 Xbar in FPGATop (instead of in front of it).

In any case, I'd like this PR to get merged soon, as its blocking a chisel/rc/scala bump. I've improved this fix to pull hKey.dataBits from the generated header, instead of hardcoding it. Fixing the loadmem setup to be more robust should be a longer-term goal.

@jerryz123
Copy link
Contributor Author

Nvm, my better fix is no longer applicable now that the driver has been split apart more. Hardcoding chunk_align to 8 is the best temporary solution, but the longer term goal IMO should be to more tightly couple the loadmem path to fesvr.

chunk_align has to match the loadmem-DRAM interface width
Hardcode this for now
Copy link
Contributor

@abejgonzalez abejgonzalez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. You should also make an issue so we can track this there as well.

Copy link
Contributor

@davidbiancolin davidbiancolin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see the small comment change so this is more trackable, otherwise ship it

sim/firesim-lib/src/main/cc/fesvr/firesim_tsi.h Outdated Show resolved Hide resolved
Co-authored-by: David Biancolin <david.biancolin@sifive.com>
@jerryz123 jerryz123 merged commit 0b8289a into main Jan 26, 2023
@jerryz123 jerryz123 deleted the jerryz123-patch-3 branch January 26, 2023 00:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working changelog:fixed Put PR title in 'Fixed' section of changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants