Import the microwatt PowerPC core #245

Closed
mithro opened this issue Aug 22, 2019 · 63 comments

Comments

@mithro
Collaborator

mithro commented Aug 22, 2019

https://github.com/antonblanchard/microwatt

A tiny Open POWER ISA softcore written in VHDL 2008. It aims to be simple and easy to understand.

@mithro
Collaborator Author

mithro commented Aug 22, 2019

@shenki

@ozbenh
Contributor

ozbenh commented Sep 4, 2019

I now have MW working locally with LiteDRAM using a custom wishbone => LiteDRAM native bus adapter which also does the bus upsizing.

At the moment, LiteDRAM is built with a built-in riscv for the memory inits. I'll be looking at hooking up the CSRs to the wishbone and porting the LiteX BIOS code in the next few days as travel & time permits.

That way I can take out the riscv, uart and other gunk in the LiteDRAM core and save space on the Arty.

I was going to look at proper LiteX integration next (I'm learning LiteX as I go). I've done my own wishbone adapter in VHDL, mostly because the current LiteX ones have broken bus UpConverters, but that should be fixed eventually.
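
For readers unfamiliar with bus upsizing, here is a minimal behavioural sketch of what such an adapter has to do for a write: pick the wide word, place the narrow data in the right lane, and shift the byte enables to match. The 64-bit and 128-bit widths and the name `upsize_write` are assumptions for illustration only; the actual adapter is written in VHDL and is not shown in this thread.

```python
def upsize_write(wb_adr, wb_dat, wb_sel, wb_width=64, native_width=128):
    """Map one narrow Wishbone write onto a wider native-port write.

    wb_adr is a word address in units of wb_width bits.
    Returns (native_adr, native_dat, native_we), where native_we is the
    byte-enable mask of the wide word.
    All widths here are illustrative assumptions.
    """
    ratio = native_width // wb_width          # narrow words per wide word
    lane = wb_adr % ratio                     # slot inside the wide word
    native_adr = wb_adr // ratio              # address of the wide word
    native_dat = wb_dat << (lane * wb_width)  # move data into its lane
    native_we = wb_sel << (lane * (wb_width // 8))
    return native_adr, native_dat, native_we
```

Reads go the other way: the adapter fetches the wide word and returns only the selected lane to the Wishbone master.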

@enjoy-digital
Owner

Great thanks, your approach is probably the best to get into things progressively. Happy to help for creating the CPU wrapper.

Can you give more information about the broken bus UpConverters? I'd like to have a look (or you can create another issue for this if you want).

@ozbenh
Contributor

ozbenh commented Sep 5, 2019

Mostly, one of them won't do "both" (it's in the code), only one direction, and the other one uses FlipFlop(), which seems to be deprecated.

@ozbenh
Contributor

ozbenh commented Sep 5, 2019

(sorry in an airline lounge about to board my flight to europe so a bit terse :-)

@enjoy-digital
Owner

Microwatt has been integrated as a submodule, wrapped with a VHDL/Migen wrapper, and the gateware has been integrated in LiteX. Minimal software support has also been added. The software and gateware compile fine. We now need to simulate a SoC with the Microwatt CPU (we can't use the LiteX simulator since Microwatt is in VHDL, unless we have a Verilog model of it) and finish the software support. Any help is welcome :)

@enjoy-digital
Owner

GHDL now seems to be able to synthesize Microwatt: https://twitter.com/antonblanchard/status/1219448773333487616

This indeed seems to be working with the attached script/procedure:

Install GHDL

$ git clone https://github.com/ghdl/ghdl
$ cd ghdl
$ ./configure --enable-libghdl --enable-synth
$ make
$ make install

Get Microwatt:

git clone https://github.com/antonblanchard/microwatt
cd microwatt
git checkout ghdl-synthesis

Synthesize Microwatt sources:

  • create a build directory in microwatt.
  • copy microwatt_ghdl_synth.py script to it
  • execute:
./microwatt_ghdl_synth.py > microwatt.vhd

microwatt_ghdl_synth.py:

#!/usr/bin/env python3

import os

files = [
    # Common / Types / Helpers
    "decode_types.vhdl",
    "wishbone_types.vhdl",
    "utils.vhdl",
    "common.vhdl",
    "helpers.vhdl",

    # Fetch
    "fetch1.vhdl",
    "fetch2.vhdl",

    # Instruction/Data Cache
    "cache_ram.vhdl",
    "plru.vhdl",
    "dcache.vhdl",
    "icache.vhdl",

    # Decode
    "insn_helpers.vhdl",
    "decode1.vhdl",
    "gpr_hazard.vhdl",
    "cr_hazard.vhdl",
    "control.vhdl",
    "decode2.vhdl",

    # Register/CR File
    "register_file.vhdl",
    "crhelpers.vhdl",
    "cr_file.vhdl",

    # Execute
    "ppc_fx_insns.vhdl",
    "logical.vhdl",
    "rotator.vhdl",
    "countzero.vhdl",
    "execute1.vhdl",

    # Load/Store
    "loadstore1.vhdl",

    # Multiply/Divide
    "multiply.vhdl",
    "divider.vhdl",

    # Writeback
    "writeback.vhdl",

    # Core
    "core_debug.vhdl",
    "core.vhdl",
]

for f in files:
    os.system("ghdl -a --std=08 ../{}".format(f))

os.system("ghdl --synth --std=08 core")
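
The script above ignores the exit status of each `os.system` call, so a failed analysis step only shows up later as a confusing synthesis error. A hedged variant that stops at the first failure could look like the sketch below. It uses the same `ghdl -a` and `ghdl --synth` invocations as the original; the helper names `ghdl_commands` and `run_all` are made up for illustration.

```python
import subprocess


def ghdl_commands(files, top="core", std="08", src_dir=".."):
    """Build the argv lists: one analysis step per file, then synthesis."""
    cmds = [["ghdl", "-a", f"--std={std}", f"{src_dir}/{f}"] for f in files]
    cmds.append(["ghdl", "--synth", f"--std={std}", top])
    return cmds


def run_all(cmds):
    """Run each command in order and stop at the first failure."""
    for cmd in cmds:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(cmd, check=True)
```

Calling `run_all(ghdl_commands(files))` from the build directory and redirecting stdout to microwatt.vhd would then behave like the original script, except that it aborts as soon as GHDL reports an error.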

@enjoy-digital
Owner

With 9bef218, Microwatt is now running on hardware. It will still be useful to support the GHDL-synth flow to ease simulations and use the FOSS toolchains.

@enjoy-digital
Owner

Install ghdl-yosys-plugin:

git clone https://github.com/ghdl/ghdl-yosys-plugin
cd ghdl-yosys-plugin
make
sudo cp ghdl.so /usr/local/share/yosys/plugins/ghdl.so

@enjoy-digital
Owner

Generate the verilog (from ghdl-synthesis-test branch):

microwatt.ys:


ghdl --ieee=synopsys -fexplicit -frelaxed-rules --std=08 \
decode_types.vhdl \
wishbone_types.vhdl \
utils.vhdl \
common.vhdl \
helpers.vhdl \
fetch1.vhdl \
fetch2.vhdl \
cache_ram.vhdl \
plru.vhdl \
dcache.vhdl \
icache.vhdl \
insn_helpers.vhdl \
decode1.vhdl \
gpr_hazard.vhdl \
cr_hazard.vhdl \
control.vhdl \
decode2.vhdl \
register_file.vhdl \
crhelpers.vhdl \
cr_file.vhdl \
ppc_fx_insns.vhdl \
logical.vhdl \
rotator.vhdl \
countzero.vhdl \
execute1.vhdl \
loadstore1.vhdl \
multiply.vhdl \
divider.vhdl \
writeback.vhdl \
core_debug.vhdl \
core.vhdl \
microwatt_wrapper.vhdl \
-e microwatt_wrapper
write_verilog microwatt.v

Then run Yosys with the GHDL plugin loaded:

yosys -q -m ghdl microwatt.ys
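
Since the file list in microwatt.ys repeats the one from the earlier Python script, the Yosys script can also be generated rather than maintained by hand. The sketch below only reproduces the `ghdl` and `write_verilog` lines shown above; the function name `make_yosys_script` is an assumption for illustration.

```python
def make_yosys_script(files, top="microwatt_wrapper", out="microwatt.v"):
    """Return the text of a Yosys script that imports VHDL through GHDL."""
    flags = "--ieee=synopsys -fexplicit -frelaxed-rules --std=08"
    lines = ["ghdl " + flags + " \\"]
    lines += [f + " \\" for f in files]   # one source file per continued line
    lines.append("-e " + top)             # elaborate the top-level entity
    lines.append("write_verilog " + out)
    return "\n".join(lines) + "\n"
```

Writing the returned string to microwatt.ys keeps the simulation flow and the GHDL-only flow on the same list of sources.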

@ozbenh
Contributor

ozbenh commented May 16, 2020

Looks great, I'll play with this and maybe integrate some of that into Microwatt's own makefiles. It will definitely be useful for simulating with litedram.

BTW. What do you use on the DDR side for simulating litedram ? A micron model ? Or do you have your own ?

@enjoy-digital
Owner

@ozbenh: just for info, with this, GHDL/Yosys were able to convert Microwatt to Verilog using the ghdl-synthesis-test branch of Microwatt. I tried litex_sim: Verilator was able to compile and run it, but the BIOS did not show up and I haven't investigated. If you want to run the simulation, you can follow the previous steps to generate microwatt.v, then replace this: https://github.com/enjoy-digital/litex/blob/master/litex/soc/cores/cpu/microwatt/core.py#L105-L158 with platform.add_source("microwatt.v") and run: litex_sim --cpu-type=microwatt (you can add --trace to generate the simulation waveform and see what is going on).

For the simulation, we have a DRAM model that we use with litex_sim: https://github.com/enjoy-digital/litedram/blob/master/litedram/phy/model.py.

@ozbenh
Contributor

ozbenh commented May 16, 2020

Thanks. Is there a way for LiteX to generate a verilog version of the DRAM model ? For the "standalone microwatt" case, I want to toy around with the user port interface to wishbone to do things like pipelining etc... and the easiest seems to be to do it in verilog using a little test bench, and throw the whole lot at verilator. I can then use that verilog in microwatt directly or convert it back to vhdl.

As for running the converted microwatt, I'll give that a try asap.

@ozbenh
Contributor

ozbenh commented May 16, 2020

Hmm... thinking twice, that means I probably also need sim models of all the Xilinx PLLs etc... that won't be as easy as I initially thought...

@ozbenh
Contributor

ozbenh commented May 17, 2020

All right, I had to hack/tweak a few things (I'll get back to you on those), and I now have the sim running. I'll try to get to the bottom of it but it might take a while. I assume there's no way to get the report() statements out of ghdl.... Also note that --trace-fst and --trace-end xxx both generate errors when building the sim.

@enjoy-digital
Owner

@ozbenh: good, have you also been able to get the CPU/BIOS working in simulation? I could work on finishing the integration with litex_sim next week. I'll also look at --trace-fst/--trace-end.

@ozbenh
Contributor

ozbenh commented May 17, 2020

No I haven't yet. I can see the CPU fetching some instructions and I see them out of the icache but it stops doing that sanely pretty quickly. I haven't figured out why yet. Note: It's a very painful process, because microwatt stores everything in records and the ghdl-synth+yosys process turns all these into giant vectors :-( Also the vcd files coming out of litex are humongous :-)

I wish instead the records would be broken in separate wire/vectors with something like recordname_wirename instead...
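
Until the tools emit per-field wires, a small post-processing helper can at least decode the flattened vectors when reading a waveform dump. The sketch below splits one flattened value into `recordname_wirename` entries. The field list and the most-significant-first packing order are assumptions: the order GHDL actually uses for a given record has to be checked against the generated Verilog.

```python
def split_record(value, fields, prefix):
    """Split a flattened record value into named fields.

    fields is a list of (name, width) pairs, listed from the most
    significant bits down to the least significant bits (assumed order).
    Keys of the result follow the recordname_wirename convention.
    """
    total = sum(width for _, width in fields)
    out = {}
    shift = total
    for name, width in fields:
        shift -= width
        out[prefix + "_" + name] = (value >> shift) & ((1 << width) - 1)
    return out
```

For example, with a hypothetical three-field record `[("valid", 1), ("nia", 8), ("insn", 4)]` and prefix `f_out`, the 13-bit value is returned as `f_out_valid`, `f_out_nia` and `f_out_insn`.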

Anyway, I'll continue digging as time permits.

I also noticed a whole pile of warnings out of yosys (or maybe verilator?) about Case values overlap (example pattern 0x3). These seem to come from a whole bunch of those constructs in the generated verilog that do look bogus:

    input [2:0] a;
    input [23:0] b;
    input [7:0] s;
    (* parallel_case *)
    casez (s)
      8'b???????1:
        \8878  = b[2:0];
      8'b??????1?:
        \8878  = b[5:3];
      8'b?????1??:
        \8878  = b[8:6];
      8'b????1???:
        \8878  = b[11:9];
      8'b???1????:
        \8878  = b[14:12];
      8'b??1?????:
        \8878  = b[17:15];
      8'b?1??????:
        \8878  = b[20:18];
      8'b1???????:
        \8878  = b[23:21];
      default:
        \8878  = a;
    endcase
  endfunction

I'm pretty sure the "simplified vhdl" that ghdl spits out has all those "?" as "0"
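
To see why a tool would flag the construct above, the `casez` arms can be modelled in a few lines. This is only a behavioural sketch of first-match `casez` semantics with `?` as a don't-care; it says nothing about what GHDL or Yosys do internally. With these patterns, the selector value 0x3 matches two arms, which appears consistent with the "example pattern 0x3" in the warning, whereas a one-hot selector matches exactly one.

```python
def casez_match(pattern, value):
    """True if value matches a casez pattern such as "??????1?"."""
    width = len(pattern)
    bits = format(value, "0{}b".format(width))
    return all(p == "?" or p == b for p, b in zip(pattern, bits))


def first_match(patterns, value):
    """Index of the first matching arm, or None for the default arm."""
    for i, pattern in enumerate(patterns):
        if casez_match(pattern, value):
            return i
    return None


def overlapping_arms(patterns, value):
    """Indices of every arm that matches value."""
    return [i for i, p in enumerate(patterns) if casez_match(p, value)]


# The eight arms from the generated Verilog quoted above.
PATTERNS = ["???????1", "??????1?", "?????1??", "????1???",
            "???1????", "??1?????", "?1??????", "1???????"]
```

The `parallel_case` attribute asserts that at most one arm matches, which holds only if the selector is guaranteed to be one-hot or zero. If the `?` bits were `0` instead, as the "simplified vhdl" is said to have them, the arms would be mutually exclusive by construction.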

@ozbenh
Contributor

ozbenh commented May 18, 2020

So there was a ghdl synth bug. I made a test case and Tristan fixed it (ghdl/ghdl#1319). It works in sim with the latest microwatt, though you probably want the patch below applied to microwatt (at least until Anton merges it) and then tie the interrupt input of the core to '0'.

Note about interrupts: If we're ever going to run Linux on microwatt with LiteX we'll want the xics interrupt controller model, not the traditional LiteX one. Which probably means adding SW support for it as well to the LiteX BIOS.

 [PATCH] irq: Simplify xics->core irq input

Use a simple wire. common.vhdl types are better kept for things
local to the core. We can add more wires later if we need to for
HV irqs etc...

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---
 common.vhdl   | 5 -----
 core.vhdl     | 4 ++--
 execute1.vhdl | 4 ++--
 soc.vhdl      | 6 +++---
 xics.vhdl     | 4 ++--
 5 files changed, 9 insertions(+), 14 deletions(-)

diff --git a/common.vhdl b/common.vhdl
index ed97e0c..61252bd 100644
--- a/common.vhdl
+++ b/common.vhdl
@@ -316,11 +316,6 @@ package common is
     constant WritebackToCrFileInit : WritebackToCrFileType := (write_cr_enable => '0', write_xerc_enable => '0',
 							       write_xerc_data => xerc_init,
 							       others => (others => '0'));
-
-    type XicsToExecute1Type is record
-	irq : std_ulogic;
-    end record;
-
 end common;
 
 package body common is
diff --git a/core.vhdl b/core.vhdl
index 0664c73..f3806a3 100644
--- a/core.vhdl
+++ b/core.vhdl
@@ -34,7 +34,7 @@ entity core is
 	dmi_wr		: in std_ulogic;
 	dmi_ack	        : out std_ulogic;
 
-	xics_in		: in XicsToExecute1Type;
+	ext_irq		: in std_ulogic;
 
 	terminated_out   : out std_logic
         );
@@ -272,7 +272,7 @@ begin
             flush_out => flush,
 	    stall_out => ex1_stall_out,
             e_in => decode2_to_execute1,
-            i_in => xics_in,
+            ext_irq_in => ext_irq,
             l_out => execute1_to_loadstore1,
             f_out => execute1_to_fetch1,
             e_out => execute1_to_writeback,
diff --git a/execute1.vhdl b/execute1.vhdl
index 8286d30..fccba5e 100644
--- a/execute1.vhdl
+++ b/execute1.vhdl
@@ -24,7 +24,7 @@ entity execute1 is
 
 	e_in  : in Decode2ToExecute1Type;
 
-	i_in : in XicsToExecute1Type;
+	ext_irq_in : std_ulogic;
 
 	-- asynchronous
         l_out : out Execute1ToLoadstore1Type;
@@ -410,7 +410,7 @@ begin
 		ctrl_tmp.irq_nia <= std_logic_vector(to_unsigned(16#900#, 64));
 		report "IRQ valid: DEC";
 		irq_valid := '1';
-	    elsif i_in.irq = '1' then
+	    elsif ext_irq_in = '1' then
 		ctrl_tmp.irq_nia <= std_logic_vector(to_unsigned(16#500#, 64));
 		report "IRQ valid: External";
 		irq_valid := '1';
diff --git a/soc.vhdl b/soc.vhdl
index 841d72f..400b230 100644
--- a/soc.vhdl
+++ b/soc.vhdl
@@ -100,7 +100,7 @@ architecture behaviour of soc is
     signal wb_xics0_out  : wb_io_slave_out;
     signal int_level_in  : std_ulogic_vector(15 downto 0);
 
-    signal xics_to_execute1 : XicsToExecute1Type;
+    signal core_ext_irq  : std_ulogic;
 
     -- Main memory signals:
     signal wb_bram_in     : wishbone_master_out;
@@ -170,7 +170,7 @@ begin
 	    dmi_wr => dmi_wr,
 	    dmi_ack => dmi_core_ack,
 	    dmi_req => dmi_core_req,
-	    xics_in => xics_to_execute1
+	    ext_irq => core_ext_irq
 	    );
 
     -- Wishbone bus master arbiter & mux
@@ -512,7 +512,7 @@ begin
 	    wb_in => wb_xics0_in,
 	    wb_out => wb_xics0_out,
 	    int_level_in => int_level_in,
-	    e_out => xics_to_execute1
+	    core_irq_out => core_ext_irq
 	    );
 
     -- BRAM Memory slave
diff --git a/xics.vhdl b/xics.vhdl
index 421513a..4d3e9e5 100644
--- a/xics.vhdl
+++ b/xics.vhdl
@@ -35,7 +35,7 @@ entity xics is
 
 	int_level_in : in std_ulogic_vector(LEVEL_NUM - 1 downto 0);
 
-	e_out : out XicsToExecute1Type
+	core_irq_out : out std_ulogic
         );
 end xics;
 
@@ -80,7 +80,7 @@ begin
     wb_out.dat <= r.wb_rd_data;
     wb_out.ack <= r.wb_ack;
     wb_out.stall <= '0'; -- never stall wishbone
-    e_out.irq <= r.irq;
+    core_irq_out <= r.irq;
 
     comb : process(all)
 	variable v : reg_internal_t;

@enjoy-digital
Owner

Great! Thanks for looking at this, I'll reproduce your results and do the LiteX integration to automate this when running litex_sim --cpu=microwatt.

@enjoy-digital
Owner

@ozbenh: with a02077d, you now just have to set use_ghdl_yosys_synth to True to convert the Microwatt sources from VHDL to verilog automatically during the build. So if you want to use it in simulation, just do litex_sim --cpu-type=microwatt or with a target: target.py --cpu-type=microwatt --build (i haven't tested on hardware yet since it seems the caches are not inferred correctly and the resource usage explodes).

@ozbenh
Contributor

ozbenh commented May 18, 2020

Great, thanks. Yes there are problems with how memories are inferred with Yosys still.

@enjoy-digital
Owner

By reducing the number of ICache/DCache lines to 2 to avoid the resource usage explosion, the generated verilog is working fine on hardware and built with FOSS tools :) : https://twitter.com/enjoy_digital/status/1262701132012490754

[Image: IMG_5628]

@enjoy-digital
Owner

The GHDL-Yosys-plugin path can now be selected with --cpu-variant=standard+ghdl. We can now simulate and build Microwatt with vendors' or FOSS toolchains:

Simulation with the verilog generated from GHDL-Yosys-plugin and Verilator:

lxsim --cpu-type=microwatt --cpu-variant=standard+ghdl

Build on Arty with the VHDL files:

./arty.py --cpu-type=microwatt

Build on Arty with the verilog generated from GHDL-Yosys-plugin:

./arty.py --cpu-type=microwatt --cpu-variant=standard+ghdl 

Some improvements can still be made to the integration (adding burst/IRQ support), but these can be discussed in more specific issues/PRs.

@ozbenh
Contributor

ozbenh commented Jun 13, 2020

Ok, that's going to be a problem for Microwatt. We rely heavily on it. Without LUT RAM, things like cache tags, TLBs and the register file will be orders of magnitude larger in the generated FPGA design (and timing will go down the sink). Basically, anything large that needs async read is a LUT RAM for us.

@ozbenh
Contributor

ozbenh commented Jun 13, 2020

and doing sync reads would require adding even more pipeline stages/latency
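
The latency point can be illustrated with a tiny cycle-level model of the two memory styles. This is a behavioural sketch only; the class names are invented here and nothing in it is taken from the Microwatt sources.

```python
class AsyncReadRam:
    """Synchronous write, asynchronous (combinational) read: LUT RAM style."""

    def __init__(self, depth):
        self.mem = [0] * depth

    def read(self, addr):
        return self.mem[addr]          # data is visible in the same cycle

    def clock(self, we=False, waddr=0, wdata=0):
        if we:
            self.mem[waddr] = wdata


class SyncReadRam:
    """Synchronous write, registered read: block RAM style."""

    def __init__(self, depth):
        self.mem = [0] * depth
        self.rdata = 0                 # output register

    def clock(self, raddr, we=False, waddr=0, wdata=0):
        self.rdata = self.mem[raddr]   # captured at the clock edge, old data
        if we:
            self.mem[waddr] = wdata
```

With `AsyncReadRam`, a tag or register lookup can feed logic in the same cycle the address is produced. With `SyncReadRam`, the result is only available after the next clock edge, which is where the extra pipeline stage or latency comes from.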

@daveshah1
Collaborator

LUT RAM for ECP5 is fully supported in Yosys, there is only one configuration.

@ozbenh
Contributor

ozbenh commented Jun 13, 2020

Thanks Dave. As long as it infers a 2D array with synchronous writes and async reads as a LUT RAM, we should be ok with microwatt. If I had an ECP5 board at hand I could give it more love (and generate litedram for it) but I don't at the moment and can't quite spare the funds right now.

@ozbenh
Contributor

ozbenh commented Jun 13, 2020

As for block RAMs, we have wrappers for all of our uses of them, which could easily be replaced by explicit primitives if necessary. I did that to make it easier to either replace them or tweak them to match tool inference limitations. Note that our dcache does use the "output register" option of Xilinx block RAMs to help timing.

enjoy-digital reopened this Jun 13, 2020
@madscientist159

@ozbenh Still having a real hard time getting Microwatt to actually fit on an ECP5-45 with any room left over for significant (>5% die area) peripherals, even with caches cut back to one line each. Any other thoughts on trimming Microwatt back somewhat and making it fit better?

@ozbenh
Contributor

ozbenh commented Jul 14, 2020

Not really. Someone who understands the toolchain should look into where most of the area is going. It's strange that it seems to be using 2 to 3 times more LUTs than on the Artix...

@enjoy-digital
Owner

@madscientist159: last time i tested on ECP5, Yosys still had issues with the caches and i had to reduce NUM_LINES of the icache and dcache (tested with 2 instead of 64 as it was done in the initial GHDL Synth tests).

@madscientist159

@enjoy-digital I'm already doing that, it's still sitting at a ridiculously high resource usage. Switching to the default RISC-V CPU (which we can't really use for other reasons -- not even set up to test / debug with it beyond synthesis) yields a drop from 95% usage to 50% usage on the ECP5, with all other peripherals etc. unchanged.

@madscientist159

madscientist159 commented Jul 21, 2020

1959

That's it exactly, I did some digging and apparently the required TDP RAMs are not supported by Yosys, causing a ridiculous explosion in resources (something over 11k cells just for a two-line I cache and D cache). Relevant failure:

  Checking rule #4 for bram type $__ECP5_DP16KD (variant 1):
    Bram geometry: abits=10 dbits=18 wports=0 rports=0
    Estimated number of duplicates for more read ports: dups=1
    Metrics for $__ECP5_DP16KD: awaste=960 dwaste=8 bwaste=17792 waste=17792 efficiency=5
    Rule #4 for bram type $__ECP5_DP16KD (variant 1) accepted.
    Mapping to bram type $__ECP5_DP16KD (variant 1):
      Shuffle bit order to accommodate enable buckets of size 9..
      Results of bit order shuffling: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 4$
      Write port #0 is in clock domain \clk.
        Mapped to bram port A1.
      Read port #0 is in clock domain !~async~.
        Bram port B1.1 has incompatible clock type.
        Failed to map read port #0.
    Mapping to bram type $__ECP5_DP16KD failed.

@ozbenh I wonder if we could add a mode to Microwatt that just disables the caches for now since a two-line cache isn't going to be useful in the first place, and the amount of resources sucked down are causing the design to be near worthless on real world FPGAs?
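
When a full synthesis log is thousands of lines long, failures like the one quoted above are easy to miss. The sketch below scans a Yosys log for block RAM mapping attempts that end in "failed" and collects the reasons printed just before. It relies only on the line shapes visible in the excerpt above; other Yosys versions may word these messages differently, and the function name is made up for illustration.

```python
import re


def failed_bram_mappings(log_text):
    """Return (bram_type, reasons) for each failed mapping attempt."""
    failures = []
    reasons = []
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith("Mapping to bram type") and line.endswith(":"):
            reasons = []                       # a new attempt starts here
        elif "incompatible" in line or line.startswith("Failed to map"):
            reasons.append(line)
        else:
            m = re.match(r"Mapping to bram type (\S+) failed\.", line)
            if m:
                failures.append((m.group(1), reasons))
                reasons = []
    return failures
```

Run on the excerpt above, this reports `$__ECP5_DP16KD` together with the "incompatible clock type" and "Failed to map read port #0" lines, which points at an asynchronous read port as the reason the memory was not mapped to block RAM.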

@ozbenh
Contributor

ozbenh commented Jul 21, 2020

What RAM object is this ... the main cache rams don't have async reads, they are SDP with one sync read and one sync write port... the tags however have async read, are they the problem here ? For 2 lines there should be only 2 tags and they should fit in LUT RAMs. The above log doesn't say which array/entity it is.

We don't really have a design that runs with caches off at this point. We would potentially have to change things quite a bit, and I'd rather we fixed the above.

@ozbenh
Contributor

ozbenh commented Jul 21, 2020

BTW. How did you change the cache sizes ? Where did you edit the generic ? You need to change the values in core.vhdl not the defaults in icache.vhdl or dcache.vhdl

@ozbenh
Contributor

ozbenh commented Jul 21, 2020

Also.. the TLBs have async reads, so they would fit in LUT RAM. We might be able to make things smaller by having both tags and TLBs in block RAM but at the cost of some extra latency

@madscientist159

madscientist159 commented Jul 21, 2020

@ozbenh RAM object is "icache_32_2_2_64_12_56_5ba93c9db0cff93f52b521d7420e43f6eda2784f.\897:", there are a bunch of them that are similar. The entire I cache reports no BRAM usage and a ton of cells used:

=== icache_32_2_2_64_12_56_5ba93c9db0cff93f52b521d7420e43f6eda2784f ===

   Number of wires:               5777
   Number of wire bits:           9501
   Number of public wires:        5777
   Number of public wire bits:    9501
   Number of memories:               0
   Number of memory bits:            0
   Number of processes:              0
   Number of cells:               6802
     L6MUX21                      1018
     LUT4                         3750
     PFUMX                        1468
     TRELLIS_DPR16X4               112
     TRELLIS_FF                    450
     cache_ram_3_64_1489f923c4dca729178b3e3233458550d8dddf29      2
     plru_1                          2

Also, curiously, the rotator is using a ridiculous amount of resources:

=== rotator ===

   Number of wires:               6345
   Number of wire bits:           9665
   Number of public wires:        6345
   Number of public wire bits:    9665
   Number of memories:               0
   Number of memory bits:            0
   Number of processes:              0
   Number of cells:               7242
     CCU2C                         326
     L6MUX21                      1011
     LUT4                         4050
     PFUMX                        1855

Those two blocks alone account for 20% of the entire resource usage of the LiteX/Microwatt design, so something seems off. 😉

EDIT: Also, yes, defaults changed in core.vhdl. It literally won't fit at all even with a bare bones design if the caches aren't reduced significantly (I reduced them to two lines each as that seems to be as small as they will go).
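
A claim like "those two blocks account for 20%" is easy to check mechanically from a Yosys `stat` report. The sketch below extracts the "Number of cells" figure per module using the report layout shown above. It assumes a hierarchical (non-flattened) report in which each module's count excludes the contents of its submodules, and the helper names are made up for illustration.

```python
import re


def cells_per_module(stat_text):
    """Map each module name in a Yosys stat report to its cell count."""
    totals = {}
    module = None
    for line in stat_text.splitlines():
        header = re.match(r"\s*=== (\S+) ===", line)
        if header:
            module = header.group(1)
            continue
        count = re.match(r"\s*Number of cells:\s+(\d+)", line)
        if count and module is not None:
            totals[module] = int(count.group(1))
    return totals


def share(totals, names, design_total):
    """Fraction of design_total used by the named modules."""
    return sum(totals[n] for n in names) / design_total
```

Feeding the whole report through `cells_per_module` and sorting by count gives a quick ranking of which modules to investigate first.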

@madscientist159

@ozbenh I suppose one approach could be to move the inferred block RAMs into their own module, so that those of us with toolchains that don't actually infer BRAMs (like the Yosys one) could manually insert a device-specific instantiation...

@ozbenh
Contributor

ozbenh commented Jul 21, 2020

The block RAMs for caches are already their own module: cache_ram

Which is why I wonder whether the problem you are having might have more to do with the TLBs, which are async read, though that's a hell of a lot of LUTs... The TLB in the icache is direct mapped, so it shouldn't really take more than 1024 4-bit LUTs unless Yosys can't do LUT RAM. The TLB in the dcache is set associative and thus is going to be bigger.
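
As a rough sanity check on numbers like these, the storage cost of an array mapped to LUT RAM can be estimated from its geometry. The sketch below assumes a 16-entry by 4-bit primitive, reading the `TRELLIS_DPR16X4` cell name in the report above that way, and counts storage only. Read multiplexing, extra read ports and comparison logic are ignored, so real usage will be higher.

```python
import math


def dpr16x4_count(depth, width):
    """Number of 16-entry x 4-bit LUT RAM primitives for a depth x width array.

    Storage only: output multiplexers and additional read ports are not
    included in this estimate.
    """
    return math.ceil(depth / 16) * math.ceil(width / 4)
```

For a hypothetical 64-entry, 64-bit-wide array this gives 4 x 16 = 64 primitives. If a memory of that size shows up as thousands of LUT4 and mux cells instead, that is a hint it was not inferred as LUT RAM at all.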

Double check you changed the caches size properly in core.vhdl though.

As for the rotator I don't know what's up there, it might be worth talking to Paul, it could be that either ghdl-synth or yosys is doing a crap job of it... the Vivado synth summary looks like this (I don't know how to get more detailed cell level util for it):

Module rotator
Detailed RTL Component Info :
+---Adders :
           2 Input      6 Bit       Adders := 1
+---Muxes :
           4 Input     64 Bit        Muxes := 4
           2 Input     32 Bit        Muxes := 2
           2 Input      7 Bit        Muxes := 6
           2 Input      6 Bit        Muxes := 1
           2 Input      2 Bit        Muxes := 1
           2 Input      1 Bit        Muxes := 1

@madscientist159

madscientist159 commented Oct 11, 2020

I'm currently working on trying to get IRQ support enabled for Microwatt in LiteX. Before I dive too deeply into this, is this support already partially (or fully) implemented in anyone's development tree but not yet merged?

EDIT: The approach I'm considering would be to instantiate the xics_icp / xics_ics modules into the Microwatt Python data, and wire them up appropriately. Anything else I should be aware of / any "gotchas" with the current XICS HDL?

@madscientist159

I've been working on this for the past couple of days, making decent progress.

The current snag is the fact that POWER has a fixed location for its exception handlers, and with the current Microwatt implementation that exception handler table is located in read only data. Both the BIOS and the application need to use different exception handlers, so I'm currently working on rewriting and relocating at least part of the handler into RAM so that it can be patched by the loaded application.

@enjoy-digital
Owner

@madscientist159: thanks for the feedback. In case it could be useful for your tests, it's possible to make the embedded ROM writable by setting integrated_rom_mode to rw here https://github.com/enjoy-digital/litex/blob/master/litex/soc/integration/soc_core.py#L74.

@ozbenh
Contributor

ozbenh commented Oct 14, 2020

Due to how the POWER architecture works, you should really have your main RAM at 0... We could add some kind of microwatt specific hackery to offset the exception vectors but this would make it harder to do things like support Linux

@madscientist159

@ozbenh Good to know. I did get things working with a soft offset, basically I wrote some hardcoded exception entry asm that uses a reserved vector address variable in SRAM to jump to the generic C exception handler vector. I might push that up as at least a temporary enablement patch, since LiteX seems to make a lot of assumptions about what exists at address 0 (basically the BIOS, and to a large extent does expect it to be ROM, not RAM).
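
The "soft offset" idea can be modelled as a fixed stub at each architectural vector that loads a handler pointer from a reserved SRAM slot and jumps to it. The sketch below is a behavioural model only, not the actual assembly: the class and method names are invented, and the behaviour when no handler is installed is an assumption. The two vector addresses are the ones that appear in the execute1.vhdl patch earlier in this thread.

```python
# Fixed entry points taken from the execute1.vhdl patch earlier in the thread.
VEC_EXTERNAL = 0x500
VEC_DECREMENTER = 0x900


class Machine:
    def __init__(self):
        # One reserved SRAM slot holding the current C-level handler.
        self.sram_handler = None

    def install_handler(self, handler):
        """What the BIOS or a loaded application would do at start-up."""
        self.sram_handler = handler

    def take_exception(self, vector):
        """The ROM stub at a fixed vector: load the pointer and jump."""
        if self.sram_handler is None:
            return None          # assumed: nothing installed, nothing runs
        return self.sram_handler(vector)
```

Because only the SRAM slot changes, the ROM contents stay fixed while the BIOS and a later application can each install their own handler.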

@madscientist159

madscientist159 commented Oct 14, 2020

@ozbenh One other question -- is the XICS controller actually documented anywhere? I pieced together enough from the existing source to write some basic test code and verified that the IPI and DEC interrupts work, but I haven't been having luck so far on external interrupts. I suspect I just haven't set up the controller properly, but since there are no public docs on XICS registers / operation that I've been able to find I'm stumbling around a bit in the dark here.

@ozbenh
Contributor

ozbenh commented Oct 14, 2020

Sadly it's not anywhere I'm aware of. Also while the presentation registers (ICP) are somewhat architected, the source control ones are not, they are generally hidden behind some firmware (RTAS or OPAL). I do have a patch somewhere for Linux to add support to the ones I made up for microwatt in my github tree. It's pretty trivial though, mostly the registers contain the priority of the interrupt for each source, with "ff" meaning masked.
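
In the absence of documentation, the description above (one priority register per source, with "ff" meaning masked) can be captured in a small model. This is a sketch of that description only, not of the Microwatt XICS register map. The rule that numerically lower values are more favoured is an assumption, and the function name is made up.

```python
MASKED = 0xFF  # per the comment above: "ff" means the source is masked


def pending_source(priorities, levels):
    """Pick the source to present, or None if nothing is deliverable.

    priorities[i] is the priority register of source i; levels[i] is True
    when the interrupt line of source i is asserted.
    Assumption: numerically lower priority values are more favoured.
    """
    best = None
    for src, (prio, level) in enumerate(zip(priorities, levels)):
        if not level or prio == MASKED:
            continue             # idle or masked sources are skipped
        if best is None or prio < priorities[best]:
            best = src
    return best
```

In this model, an external interrupt that never arrives is most simply explained by its source register still holding the masked value.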

@madscientist159

madscientist159 commented Oct 16, 2020

@benh That was enough to get me pointed in the right direction...LiteX/Microwatt now has proper interrupt support in PR #677 !

If there's a desire to shuffle things around such that address 0 is RAM and some other address is ROM, I'd recommend that be a separate patch series. It's not just as simple as rearranging the RAM/ROM in the address space, partly because the LiteX BIOS assumes it can turn on interrupts for its own purposes. A follow-on series that moves RAM/ROM around would also have to add code to both set up the tables in RAM and avoid loading applications over them...

@enjoy-digital
Owner

Thanks all, i think we can now close this since Microwatt is now supported in LiteX and create separate issues for the remaining specific points, as #693 for example.

@madscientist159

madscientist159 commented Nov 9, 2020

@ozbenh While the IRQ system has been working quite nicely overall, I've noticed we get a single spurious interrupt with source 0 / IRQ 0 at startup. It seems harmless enough, and right now we're just ignoring anything with source 0, but I'd like to know what is generating the interrupt and what the purpose of source 0 is (already figured out source 2 is the IPI mechanism). Any hints? 😉

@ozbenh
Contributor

ozbenh commented Nov 10, 2020 via email

@PrithvirajChauhan1

PrithvirajChauhan1 commented May 29, 2021

I wanted to know whether there is any resource for interfacing Ethernet with Microwatt using LiteX.
