TordBoyau

A pipelined RISC-V processor

Instructions

The included Vivado project is configured for an ARTY A35T.

Load the project in Vivado, synthesize, create the bitstream, and send it to the device
The included firmware computes a raytraced image and displays it on the TTY using ANSI codes
Connect a terminal at 1000000 baud (see terminal.sh, adapt to your setup)

Instructions for other boards / Yosys-NextPNR

Plug in your board, then run BOARDS/run_<BOARD_NAME>.sh. Implemented for:

ulx3s

(other boards are coming)

Configuration

Several parameters can be configured in soc.v:

Name	Description
`CPU_FREQ`	CPU frequency; depending on the other options, timing validates at around 100-120 MHz
`CONFIG_PC_PREDICT`	Enables `D`-`F` path, used by branch prediction and return address stack
`CONFIG_RAS`	Enables return address stack
`CONFIG_GSHARE`	Enables the gshare branch predictor (static BTFNT is used if not set)
`CONFIG_RV32M`	Enables the RV32M instruction set (`MUL`, `DIV`, `REM`)
`CONFIG_DEBUG`	Enables built-in debugger/disassembler (used in simulation)
`CONFIG_INITIALIZE`	Initializes register file and BHT (required by Icarus and some synth tools)
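
To make the two prediction schemes named in the table more concrete, here is a toy software model of BTFNT and of a textbook gshare predictor. It is only a sketch: the table size, the history length, the initial counter values and all the names (`btfnt_predict`, `Gshare`, `index_bits`) are illustrative assumptions, not the parameters or the structure of the Verilog in TordBoyau5.v.

```python
# Toy model of static BTFNT and textbook gshare prediction.
# All sizes and names are assumptions made for illustration only.


def btfnt_predict(branch_offset: int) -> bool:
    """Backward Taken, Forward Not Taken: predict taken iff the branch goes backward."""
    return branch_offset < 0


class Gshare:
    def __init__(self, index_bits: int = 12):  # assumed table size
        self.mask = (1 << index_bits) - 1
        self.history = 0                       # global branch history register
        self.bht = [1] * (1 << index_bits)     # 2-bit counters, start at "weakly not taken"

    def _index(self, pc: int) -> int:
        # Instructions are 4-byte aligned here, so the two low PC bits are dropped
        # before XOR-ing the PC with the global history.
        return ((pc >> 2) ^ self.history) & self.mask

    def predict(self, pc: int) -> bool:
        # Counter values 2 and 3 mean "predict taken".
        return self.bht[self._index(pc)] >= 2

    def update(self, pc: int, taken: bool) -> None:
        i = self._index(pc)
        if taken:
            self.bht[i] = min(3, self.bht[i] + 1)  # saturate at 3
        else:
            self.bht[i] = max(0, self.bht[i] - 1)  # saturate at 0
        # Shift the actual outcome into the history register.
        self.history = ((self.history << 1) | int(taken)) & self.mask


if __name__ == "__main__":
    predictor = Gshare(index_bits=4)
    for _ in range(10):                 # a branch that is always taken
        predictor.update(0x100, True)
    print(predictor.predict(0x100))
    print(btfnt_predict(-16), btfnt_predict(16))
```

The XOR of PC and history is what distinguishes gshare from a plain per-PC table: the same branch can use different counters depending on the recent outcomes of other branches.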

Firmware

Firmware consists of two files: PROGROM.hex, which contains the code, and DATARAM.hex, which contains the initial values of variables. The included firmware computes a raytraced image and sends it to the TTY (1000000 baud). It also measures the average CPI and a 'raystones' performance score (pixels/s/MHz).
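
As a rough illustration of the two figures reported by the firmware, the sketch below derives them from raw counts. The function names and the input values are hypothetical, and how the firmware actually obtains its cycle and instruction counts is not described here; only the definitions (cycles per instruction, and pixels per second per MHz) come from the text above.

```python
# Sketch of the two metrics, computed from hypothetical raw counts.


def cpi(cycles: int, instructions: int) -> float:
    """Average number of clock cycles per retired instruction."""
    return cycles / instructions


def raystones(pixels: int, cycles: int, cpu_freq_mhz: float) -> float:
    """Pixels per second per MHz, following the unit given in the text."""
    seconds = cycles / (cpu_freq_mhz * 1e6)
    return pixels / seconds / cpu_freq_mhz


if __name__ == "__main__":
    # Made-up numbers, only to show the calls.
    print(cpi(cycles=150, instructions=100))
    print(raystones(pixels=1000, cycles=100_000_000, cpu_freq_mhz=100.0))
    print(raystones(pixels=1000, cycles=100_000_000, cpu_freq_mhz=50.0))
```

With this definition the clock frequency cancels out (the score equals pixels × 10^6 / cycles), which is why a per-MHz score can be compared between configurations that run at different frequencies.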

Some precompiled firmware images are available as PRECOMPILED_FIRMWARE/<arch>/<progname>/PROGROM.hex and DATARAM.hex. To use one of them, copy PROGROM.hex and DATARAM.hex into TordBoyau/ (the directory that contains soc.v) and re-synthesize (or launch the simulation).

Other firmware can be compiled; see the pipeline tutorial in learn-fpga for more details. PROGROM.hex and DATARAM.hex are portable between both projects, as long as you target the same instruction set (RV32I or RV32IM). You will also need to remove all the lines of zeroes after line 1024 in DATARAM.hex (the core in learn-fpga is configured with 64 kB of data RAM, whereas here it is 16 kB, which suffices for most examples).
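
The trimming step can be done by hand in a text editor, or scripted. The helper below is a minimal sketch, not a script shipped with the project: the name `trim_dataram` and its arguments are made up for the example. It keeps the first 1024 lines and refuses to write anything if a later line contains something other than zeroes, so that data is never dropped silently.

```python
# Hypothetical helper: keep the first 1024 lines of DATARAM.hex,
# after checking that everything beyond them is only zeroes.
from pathlib import Path


def trim_dataram(src: str, dst: str, keep_lines: int = 1024) -> int:
    """Write the first keep_lines lines of src to dst; return the number of lines dropped."""
    lines = Path(src).read_text().splitlines()
    head, tail = lines[:keep_lines], lines[keep_lines:]
    for offset, line in enumerate(tail):
        # Only the digit 0 and whitespace are accepted past the cut-off.
        if any(ch not in "0 \t" for ch in line):
            raise ValueError(
                f"line {keep_lines + offset + 1} is not all zeroes: {line!r}"
            )
    Path(dst).write_text("\n".join(head) + "\n")
    return len(tail)


if __name__ == "__main__":
    dropped = trim_dataram("DATARAM.hex", "DATARAM_trimmed.hex")
    print(f"dropped {dropped} all-zero lines")
```

If the check fails, the firmware uses more data RAM than the trimmed file would hold, and cutting the file is not enough.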

Performance (RV32I) (A35T/Vivado)

branch prediction	CoreMarks/MHz	DMips/MHz	Raystones	LUTs	FFs	MaxFreq
none	0.928	1.298	5.665	909	517	125 MHz
static (BTFNT)	1.118	1.488	6.633	938	516	125 MHz
static + RAS	1.147	1.528	6.795	1040	676	105 MHz
gshare	1.124	1.562	7.186	1297	547	120 MHz
gshare + RAS	1.153	1.606	7.375	1388	711	100 MHz
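
The per-MHz scores and MaxFreq pull in opposite directions, so it can help to multiply them. The sketch below does this arithmetic for the CoreMark column of the table above, assuming that each configuration runs exactly at its reported MaxFreq and that the score scales linearly with the clock; both are assumptions, not measurements. The names `ROWS` and `coremarks_at_maxfreq` are made up for the example.

```python
# Estimated absolute CoreMark score at MaxFreq for the RV32I configurations.
# Values are copied from the table above; the products are estimates only.

ROWS = [
    # (branch prediction, CoreMarks/MHz, MaxFreq in MHz)
    ("none", 0.928, 125),
    ("static (BTFNT)", 1.118, 125),
    ("static + RAS", 1.147, 105),
    ("gshare", 1.124, 120),
    ("gshare + RAS", 1.153, 100),
]


def coremarks_at_maxfreq(rows):
    """Map each configuration to CoreMarks/MHz multiplied by its MaxFreq."""
    return {name: per_mhz * mhz for name, per_mhz, mhz in rows}


if __name__ == "__main__":
    scores = coremarks_at_maxfreq(ROWS)
    for name, score in scores.items():
        print(f"{name:16s} {score:7.2f}")
    print("best:", max(scores, key=scores.get))
```

Under these assumptions the configuration with the best per-MHz score is not the one with the best estimated absolute score, because of its lower MaxFreq.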

Performance (RV32IM) (A35T/Vivado)

branch prediction	CoreMarks/MHz	DMips/MHz	Raystones	LUTs	FFs	MaxFreq
none	2.387	1.341	15.296	1368	681	< 80 MHz
static (BTFNT)	2.763	1.545	16.097	1363	680	< 80 MHz
static + RAS	2.790	1.579	16.476	1478	840	< 80 MHz
gshare	2.837	1.597	17.753	1760	711	< 80 MHz
gshare + RAS	2.866	1.634	18.215	1801	875	< 80 MHz

Vivado reports that timing is not met even at 80 MHz; this remains to be investigated.
In practice, however, the largest configuration (gshare + RAS) seems to work at 140 MHz: CoreMark and Dhrystone both validate correct operation, and RayStones generates the correct image.

Debugger / disassembler

Simulation can be started using BOARDS/run_verilator.sh. If CONFIG_DEBUG is set in soc.v, the simulation displays the content of the pipeline stages, the hazards, register forwarding, branch prediction and the return address stack. It is also possible to create "breakpoints" by defining the breakpoint signal in TordBoyau5.v (the default breakpoint is on TTY character display).

Sequential pipeline

A completely sequential version, TordBoyau5_sequential, is included. It has a state machine that executes each stage sequentially, with neither hazards nor data forwarding. Its purpose is to estimate an upper bound on the maxfreq one can expect on a given FPGA. On the ARTY, it validates at 150 MHz (and still works at 160 MHz).

Documentation on the design

Next steps / TODO

Try to validate RV32IM at 140 MHz or so
- Activating RAS makes maxfreq drop, to be investigated.
- Activating RV32M makes maxfreq drop dramatically, to be investigated.
- I don't have a Branch Target Buffer and always compute the branch target, which may not be ideal.
RAM is read at the end of the Execute stage and written in the Mem stage. This may not be ideal (especially if it uses two ports of the BRAM).
The register bank is read at the beginning of Execute instead of Decode, which is not the classical arrangement. On the positive side, the register forwarding muxes only need to be three-way. On the negative side, it probably lengthens the critical path.
Write Amaranth glue code for LiteX, so that we can run Doom on it. Doom already works on the simpler non-pipelined FemtoRV cores. Here we need to adapt the LiteX cache and plug it into PROGROM and DATARAM.
The alignment logic for loads and stores seems to play a role in the critical path. A 6-stage pipeline may perform better, to be tested.
Write scripts to synthesize using yosys and nextpnr-xilinx
Write scripts for other boards (ULX3S, orange crab, ...)
