Import the microwatt PowerPC core #245
I now have MW working locally with LiteDRAM, using a custom wishbone => LiteDRAM native bus adapter which also does the bus upsizing. At the moment, LiteDRAM is built with a built-in riscv for the memory inits. I'll be looking at hooking up the CSRs to the wishbone and porting the LiteX BIOS code in the next few days as travel & time permits. That way I can take out the riscv, uart and other gunk in the LiteDRAM core and save space on the Arty. I was going to look at proper LiteX integration next (I'm learning LiteX as I go). I've done my own wishbone adapter in vhdl mostly because the current LiteX ones have broken bus UpConverters, but that should be fixed eventually. |
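For readers unfamiliar with bus upsizing, here is a minimal behavioral sketch in Python of what such an adapter has to do. It is not the VHDL adapter described above. The 64-bit and 128-bit widths, the function names, and the little-endian lane ordering (lane 0 in the low bits) are all assumptions made for illustration.

```python
def upsize_write(wb_adr, wb_sel, wb_dat_w, wb_width=64, native_width=128):
    """Map one narrow Wishbone write onto a wider native-port word.

    wb_adr is a word address in units of wb_width bits.
    Returns (native_adr, native_we_mask, native_wdata).
    Lane 0 is assumed to sit in the least significant bits.
    """
    ratio = native_width // wb_width          # narrow words per wide word
    bytes_per_wb = wb_width // 8
    lane = wb_adr % ratio                     # which slice of the wide word
    native_adr = wb_adr // ratio
    native_we = wb_sel << (lane * bytes_per_wb)
    native_wdata = wb_dat_w << (lane * wb_width)
    return native_adr, native_we, native_wdata


def downsize_read(wb_adr, native_rdata, wb_width=64, native_width=128):
    """Pick the narrow word addressed by wb_adr out of a wide read word."""
    ratio = native_width // wb_width
    lane = wb_adr % ratio
    return (native_rdata >> (lane * wb_width)) & ((1 << wb_width) - 1)
```

With these assumed widths, Wishbone word 3 lands in the upper half of native word 1, and only the upper eight byte enables are asserted.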
Great, thanks, your approach is probably the best way to get into things progressively. Happy to help with creating the CPU wrapper. Can you give more information about the broken bus UpConverters? I'd like to have a look (or you can create another issue for this if you want). |
Mostly one of them won't do "both" (it's in the code), only one direction, and the other one uses FlipFlop() which seems to be deprecated.. |
(sorry in an airline lounge about to board my flight to europe so a bit terse :-) |
Microwatt has been integrated as a submodule, wrapped with a vhdl/migen wrapper and gateware has been integrated in LiteX. Minimal software support has also been added. The software and gateware compile fine. We now need to simulate a SoC with Microwatt CPU (we can't use LiteX simulator since Microwatt is in VHDL, unless we have a verilog model of it) and finish the software support. Any help is welcome :) |
GHDL now seems to be able to synthesize Microwatt: https://twitter.com/antonblanchard/status/1219448773333487616 This indeed seems to be working with the attached script/procedure:
Install GHDL:
Get Microwatt:
Synthesize Microwatt sources:
microwatt_ghdl_synth.py:
#!/usr/bin/env python3
import os
files = [
# Common / Types / Helpers
"decode_types.vhdl",
"wishbone_types.vhdl",
"utils.vhdl",
"common.vhdl",
"helpers.vhdl",
# Fetch
"fetch1.vhdl",
"fetch2.vhdl",
# Instruction/Data Cache
"cache_ram.vhdl",
"plru.vhdl",
"dcache.vhdl",
"icache.vhdl",
# Decode
"insn_helpers.vhdl",
"decode1.vhdl",
"gpr_hazard.vhdl",
"cr_hazard.vhdl",
"control.vhdl",
"decode2.vhdl",
# Register/CR File
"register_file.vhdl",
"crhelpers.vhdl",
"cr_file.vhdl",
# Execute
"ppc_fx_insns.vhdl",
"logical.vhdl",
"rotator.vhdl",
"countzero.vhdl",
"execute1.vhdl",
# Load/Store
"loadstore1.vhdl",
# Multiply/Divide
"multiply.vhdl",
"divider.vhdl",
# Writeback
"writeback.vhdl",
# Core
"core_debug.vhdl",
"core.vhdl",
]
for f in files:
    os.system("ghdl -a --std=08 ../{}".format(f))
os.system("ghdl --synth --std=08 core")
|
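One weakness of the script above is that os.system ignores failures, so a file that fails analysis does not stop the run. A possible variant, sketched here under the assumption that the same two ghdl invocations are wanted, builds the command list first and runs it with subprocess so that the first error aborts. The helper names ghdl_commands and run_all are hypothetical.

```python
import subprocess


def ghdl_commands(files, src_dir="..", top="core", std="08"):
    """Build the ghdl command lines used by the script above.

    One analysis command (ghdl -a) per source file, in order,
    followed by a single synthesis command for the top entity.
    """
    cmds = [["ghdl", "-a", "--std={}".format(std), "{}/{}".format(src_dir, name)]
            for name in files]
    cmds.append(["ghdl", "--synth", "--std={}".format(std), top])
    return cmds


def run_all(cmds):
    """Run each command in turn; raise CalledProcessError on the first failure."""
    for cmd in cmds:
        subprocess.run(cmd, check=True)
```

Calling run_all(ghdl_commands(files)) with the files list from the script should behave like the original loop, except that it stops at the first failing command.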
With 9bef218, Microwatt is now running on hardware. It will still be useful to support the GHDL-synth flow to ease simulations and use the FOSS toolchains. |
Install ghdl-yosys-plugin:
|
Generate the verilog (from ghdl-synthesis-test branch):
|
Looks great, I'll play with this and maybe integrate some of that into Microwatt's own makefiles, it will definitely be useful for simulating with litedram. BTW. What do you use on the DDR side for simulating litedram ? A micron model ? Or do you have your own ? |
@ozbenh: just for info, with this, GHDL/Yosys were able to convert Microwatt to verilog using the For the simulation, we have a DRAM model that we use with litex_sim: https://github.com/enjoy-digital/litedram/blob/master/litedram/phy/model.py. |
Thanks. Is there a way for LiteX to generate a verilog version of the DRAM model ? For the "standalone microwatt" case, I want to toy around with the user port interface to wishbone to do things like pipelining etc... and the easiest seems to be to do it in verilog using a little test bench, and throw the whole lot at verilator. I can then use that verilog in microwatt directly or convert it back to vhdl. As for running the converted microwatt, I'll give that a try asap. |
Hrn... thinking twice, that means I probably also need sim models of all the xilinx PLL etc... that won't be as easy as I initially thought... |
All right, had to hack/tweak a few things, I'll get back to you, I now got the sim running. I'll try to get to the bottom of it but it might take a while. I assume there's no way to get the report() statements out of ghdl.... Also note that --trace-fst and --trace-end xxx both generate errors when building the sim. |
@ozbenh: good, have you also been able to get the CPU/BIOS working in simulation? I could work on finishing the integration with litex_sim next week. I'll also look at --trace-fst/--trace-end. |
No I haven't yet. I can see the CPU fetching some instructions and I see them out of the icache but it stops doing that sanely pretty quickly. I haven't figured out why yet. Note: It's a very painful process, because microwatt stores everything in records and the ghdl-synth+yosys process turns all these into giant vectors :-( Also the vcd files coming out of litex are humongous :-) I wish instead the records would be broken in separate wire/vectors with something like recordname_wirename instead... Anyway, I'll continue digging as time permits. I also noticed a whole pile of warnings out of yosys (or maybe verilator?) about Case values overlap (example pattern 0x3). These seem to come from a whole bunch of those constructs in the generated verilog that do look bogus:
I'm pretty sure the "simplified vhdl" that ghdl spits out has all those "?" as "0" |
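As a debugging aid for the flattened records mentioned above, a small helper can slice a giant vector value (for example one read from a VCD) back into named fields. This is only a sketch: the field names and widths below are hypothetical, and it assumes the first record field ends up in the most significant bits, which would need to be checked against the actual ghdl-synth output.

```python
def split_record(value, fields):
    """Slice a flattened record value into named fields.

    fields is a list of (name, width) pairs in declaration order.
    Assumption: the first field occupies the most significant bits.
    """
    shift = sum(width for _, width in fields)
    out = {}
    for name, width in fields:
        shift -= width
        out[name] = (value >> shift) & ((1 << width) - 1)
    return out


# Hypothetical layout, for illustration only.
EXAMPLE_FIELDS = [("valid", 1), ("nia", 64), ("insn", 32)]
```

If the real ordering turns out to be reversed, passing the field list reversed gives the other convention.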
So there was a ghdl synth bug. I made a test case and Tristan fixed it (ghdl/ghdl#1319). It works in sim with the latest microwatt, though you probably want the patch below applied to microwatt (at least until Anton merges it ) and then wire the interrupt to the core to '0'. Note about interrupts: If we're ever going to run Linux on microwatt with LiteX we'll want the xics interrupt controller model, not the traditional LiteX one. Which probably means adding SW support for it as well to the LiteX BIOS.
|
Great! Thanks for looking at this, I'll reproduce your results and will do the LiteX integration to automate this when running |
@ozbenh: with a02077d, you now just have to set |
Great, thanks. Yes there are problems with how memories are inferred with Yosys still. |
By reducing the number of ICache/DCache lines to |
The GHDL-Yosys-plugin path can now be selected with Simulation with the verilog generated from GHDL-Yosys-plugin and Verilator:
Build on Arty with the VHDL files:
Build on Arty with the verilog generated from GHDL-Yosys-plugin:
Some improvements can still be done on the integration (add burst/irq support) but this could be discussed in more specific issues/PRs. |
Ok, that's going to be a problem for Microwatt. We rely heavily on it. Without LUT RAM things like cache tags, TLBs and register file will be orders of magnitude larger in the generated FPGA (and timing will go down the sink). Basically anything large that needs async read is a LUT RAM for us |
and doing sync reads would require adding even more pipeline stages/latency |
LUT RAM for ECP5 is fully supported in Yosys, there is only one configuration. |
Thanks Dave. As long as it infers a 2D array with synchronous writes and async reads as a LUT RAM we should be ok with microwatt. If I had an ECP5 board at hand I could give it more love (and generate litedram for it) but I don't at the moment and can't quite spare the funds right now. |
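To make the "synchronous writes and async reads" requirement concrete, here is a tiny behavioral model in Python. It is an illustration of the memory semantics only, not Microwatt or Yosys code, and the class and method names are hypothetical.

```python
class LutRamModel:
    """Behavioral model of a memory with sync write and async read."""

    def __init__(self, depth, width):
        self.mem = [0] * depth
        self.mask = (1 << width) - 1
        self._pending = None  # write captured for the next clock edge

    def read(self, addr):
        # Asynchronous read: the current contents are visible immediately,
        # with no clock edge in between.
        return self.mem[addr]

    def write(self, addr, data):
        # Synchronous write: nothing changes until clock() is called.
        self._pending = (addr, data & self.mask)

    def clock(self):
        # Rising clock edge: commit the pending write, if any.
        if self._pending is not None:
            addr, data = self._pending
            self.mem[addr] = data
            self._pending = None
```

A block RAM differs in that the read result would also only appear after a clock edge, which is where the extra latency discussed above comes from.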
As for block RAMs, we have wrappers for all of our use of it which could easily be replaced by explicit primitives if necessary. I did that to make it easier to either replace them or tweak them to match tool inference limitations. Note that our dcache does use the "output register" option of Xilinx block RAMs to help timing. |
@ozbenh Still having a real hard time getting Microwatt to actually fit on an ECP5-45 with any room left over for significant (>5% die area) peripherals, even with caches cut back to one line each. Any other thoughts on trimming Microwatt back somewhat and making it fit better? |
Not really. Someone who understands the toolchain should look into where most of the area is going, it's strange that it seems to be using 2 to 3 times more LUTs than on the Artix... |
@madscientist159: last time i tested on ECP5, Yosys still had issues with the caches and i had to reduce |
@enjoy-digital I'm already doing that, it's still sitting at a ridiculously high resource usage. Switching to the default RISC-V CPU (which we can't really use for other reasons -- not even set up to test / debug with it beyond synthesis) yields a drop from 95% usage to 50% usage on the ECP5, with all other peripherals etc. unchanged. |
That's it exactly, I did some digging and apparently the required TDP RAMs are not supported by Yosys, causing a ridiculous explosion in resources (something over 11k cells just for a two-line I cache and D cache). Relevant failure:
@ozbenh I wonder if we could add a mode to Microwatt that just disables the caches for now since a two-line cache isn't going to be useful in the first place, and the amount of resources sucked down are causing the design to be near worthless on real world FPGAs? |
What RAM object is this ... the main cache rams don't have async reads, they are SDP with one sync read and one sync write port... the tags however have async read, are they the problem here ? For 2 lines there should be only 2 tags and they should fit in LUT RAMs. The above log doesn't say which array/entity it is. We don't really have a design that runs with caches off at this point, we would have to change things potentially quite a bit, but I'd rather we fixed the above. |
BTW. How did you change the cache sizes ? Where did you edit the generic ? You need to change the values in core.vhdl not the defaults in icache.vhdl or dcache.vhdl |
Also.. the TLBs have async reads, so they would fit in LUT RAM. We might be able to make things smaller by having both tags and TLBs in block RAM but at the cost of some extra latency |
@ozbenh RAM object is "icache_32_2_2_64_12_56_5ba93c9db0cff93f52b521d7420e43f6eda2784f.\897:", there are a bunch of them that are similar. The entire I cache reports no BRAM usage and a ton of cells used:
Also, curiously, the rotator is using a ridiculous amount of resources:
Those two blocks alone account for 20% of the entire resource usage of the LiteX/Microwatt design, so something seems off. 😉 EDIT: Also, yes, defaults changed in core.vhdl. It literally won't fit at all even with a bare bones design if the caches aren't reduced significantly (I reduced them to two lines each as that seems to be as small as they will go). |
@ozbenh I suppose one approach could be to move the inferred block RAMs into their own module, so that those of us with toolchains that don't actually infer BRAMs (like the Yosys one) could manually insert a device-specific instantiation... |
The block RAMs for caches are already their own module: cache_ram. Which is why I wonder whether the problem you are having might have more to do with the TLBs which are async read. Double check you changed the cache sizes properly in core.vhdl though. As for the rotator I don't know what's up there, it might be worth talking to Paul, it could be that either ghdl-synth or yosys is doing a crap job of it... the Vivado synth summary looks like this (I don't know how to get more detailed cell level util for it):
|
I'm currently working on trying to get IRQ support enabled for Microwatt in LiteX. Before I dive too deeply into this, is this support already partially (or fully) implemented in anyone's development tree but not yet merged? EDIT: The approach I'm considering would be to instantiate the |
I've been working on this for the past couple of days, making decent progress. The current snag is the fact that POWER has a fixed location for its exception handlers, and with the current Microwatt implementation that exception handler table is located in read only data. Both the BIOS and the application need to use different exception handlers, so I'm currently working on rewriting and relocating at least part of the handler into RAM so that it can be patched by the loaded application. |
@madscientist159: thanks for the feedback. In case it could be useful for your tests, it's possible to make the embedded ROM writable by setting |
Due to how the POWER architecture works, you should really have your main RAM at 0... We could add some kind of microwatt specific hackery to offset the exception vectors but this would make it harder to do things like support Linux |
@ozbenh Good to know. I did get things working with a soft offset, basically I wrote some hardcoded exception entry asm that uses a reserved vector address variable in SRAM to jump to the generic C exception handler vector. I might push that up as at least a temporary enablement patch, since LiteX seems to make a lot of assumptions about what exists at address 0 (basically the BIOS, and to a large extent does expect it to be ROM, not RAM). |
@ozbenh One other question -- is the XICS controller actually documented anywhere? I pieced together enough from the existing source to write some basic test code and verified that the IPI and DEC interrupts work, but I haven't been having luck so far on external interrupts. I suspect I just haven't set up the controller properly, but since there are no public docs on XICS registers / operation that I've been able to find I'm stumbling around a bit in the dark here. |
Sadly it's not anywhere I'm aware of. Also while the presentation registers (ICP) are somewhat architected, the source control ones are not, they are generally hidden behind some firmware (RTAS or OPAL). I do have a patch somewhere for Linux to add support to the ones I made up for microwatt in my github tree. It's pretty trivial though, mostly the registers contain the priority of the interrupt for each source, with "ff" meaning masked. |
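The per-source priority scheme described above can be sketched in a few lines of Python. This is a toy model only, not the Microwatt register layout. The only fact taken from the comment is that each source has a priority and that 0xff means masked. The rule that a numerically lower value is more favored is an assumption based on the usual XICS convention, and the function name is hypothetical.

```python
MASKED = 0xFF  # per the comment above, "ff" means the source is masked


def pick_interrupt(priorities, pending):
    """Return the pending, unmasked source with the most favored priority.

    priorities: list indexed by source number, one priority byte each.
    pending: set of source numbers currently asserting an interrupt.
    Assumption: a numerically lower priority value is more favored.
    Returns None if nothing is deliverable.
    """
    best = None
    for src, prio in enumerate(priorities):
        if src not in pending or prio == MASKED:
            continue
        if best is None or prio < priorities[best]:
            best = src
    return best
```

In this model, unmasking a source amounts to writing any priority other than 0xff into its slot.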
@benh That was enough to get me pointed in the right direction...LiteX/Microwatt now has proper interrupt support in PR #677 ! If there's a desire to shuffle things around such that address 0 is RAM and some other address is ROM, I'd recommend that be a separate patch series. It's not just as simple as rearranging the RAM/ROM in the address space, partly because the LiteX BIOS assumes it can turn on interrupts for its own purposes. A follow-on series that moves RAM/ROM around would also have to add code to both set up the tables in RAM and avoid loading applications over them... |
Thanks all, i think we can now close this since Microwatt is now supported in LiteX and create separate issues for the remaining specific points, as #693 for example. |
@ozbenh While the IRQ system has been working quite nicely overall, I've noticed we get a single spurious interrupt with source 0 / IRQ 0 at startup. It seems harmless enough, and right now we're just ignoring anything with source 0, but I'd like to know what is generating the interrupt and what the purpose of source 0 is (already figured out source 2 is the IPI mechanism). Any hints? 😉 |
On Mon, 2020-11-09 at 14:01 -0800, Timothy Pearson wrote:
@ozbenh While the IRQ system has been working quite nicely overall,
I've noticed we get a spurious IRQ with source 0 / IRQ 0 at startup.
It seems harmless enough, and right now we're just ignoring anything
with source 0, but I'd like to know what is generating the interrupt
and what the purpose of source 0 is (already figured out source 2 is
the IPI mechanism). Any hints? 😉
0 is spurious ... not sure what's up, might need to wave it.
Cheers,
Ben.
|
I wanted to know whether there is any resource for interfacing Ethernet with Microwatt using LiteX.
https://github.com/antonblanchard/microwatt