
Zu9 board support (#98)

* Fixed scripts to work on zu9 MPSoC boards. Documented script interface.
jameshegarty committed Jan 30, 2018
1 parent fbebc49 commit 8cd97bd34fc4026362dc99b1685b5eaf45e5a385
@@ -88,8 +88,83 @@ A verbose debug mode can be activated by setting the environment variable `v`, i
v=1 make terra
Camera Test Rig
Build Scripts
---------------
Rigel hosts build scripts that automatically test streaming modules (e.g. those produced with Rigel) on multiple Xilinx Zynq SoC platforms. The scripts are not tied to Rigel, however: they can be used with any Verilog block that implements a streaming (Handshake) interface, similar to AXI-stream, and provides valid configuration options.
Each supported SoC platform has a folder in the *platform/* directory. Each platform is identified by a name in the format *CHIPCOMPILER*. Currently supported CHIP options are **zynq10** (Zynq 7010, such as the Zybo board), **zynq20** (Zynq 7020, such as the Zedboard), **zynq100** (Zynq 7100, such as the AES-MINI-ITX-7z100-G), and **zu9** (Zynq Ultrascale+ MPSOC XCZU9EG, such as the ZCU102 board). Currently supported COMPILER options are **ise** and **vivado**. For example, compiling for the Zedboard (which uses the 7020 chip) on Vivado would use the scripts in *platform/zynq20vivado*.
As the scripts only use internal wiring on the FPGA (no external IO), they should work unchanged on different FPGA boards that use the same chip. Different SKUs in the same chip series can typically be supported by simply changing the Xilinx 'part' option in the compile script. Pull requests for more parts would be appreciated.
Simulator platforms are also supported: *platform/verilator* contains scripts to test the target block using the open source verilator simulator.
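As an illustration of the *CHIPCOMPILER* naming scheme described above, here is a minimal Python sketch that forms a platform directory name from a chip and a compiler. The helper `platform_dir` is hypothetical and not part of the Rigel scripts. It only builds the name; it does not check that the directory exists, and not every chip/compiler combination is necessarily provided.

```python
# Hypothetical helper, not part of the Rigel repository.
# The option names are the ones listed in this README.
CHIPS = ("zynq10", "zynq20", "zynq100", "zu9")
COMPILERS = ("ise", "vivado")


def platform_dir(chip, compiler):
    """Return the platform directory name in the CHIPCOMPILER format."""
    if chip not in CHIPS:
        raise ValueError("unknown chip: " + chip)
    if compiler not in COMPILERS:
        raise ValueError("unknown compiler: " + compiler)
    # The name is simply the two options concatenated.
    return "platform/" + chip + compiler


if __name__ == "__main__":
    print(platform_dir("zynq20", "vivado"))
```
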
### Scripts ###
Within each platform directory, two scripts are provided:
**compile** VERILOG_FILE METADATA_FILE BUILDDIR OUTFILE
Wrap the input Verilog file with a test harness, and compile it to either a bitstream or executable (for simulators).
* *VERILOG_FILE*: path to the verilog file input (see description below for expected module interface).
* *METADATA_FILE*: path to the metadata file describing the Verilog module in VERILOG_FILE (see description below).
* *BUILDDIR*: Path to a folder where build temporaries, logs, and stats can be written. Each script will expect the directory *BUILDDIR* to exist in the working directory from which they are called.
* *OUTFILE*: Path where the final output file of the script should be written. For Zynq platforms this is a bitfile, for Verilator it is an executable.
**run** BITFILE METADATA_FILE OUTFILE PLATFORM_SPECIFIC
For FPGA platforms, this logs in to the FPGA board, programs the fabric, runs the tests, and transfers the output back to the host. For simulator platforms, this runs the simulator.
* *BITFILE*: Compiled FPGA bitfile to run (returned from the compile script). For simulator platforms, this is the compiled executable.
* *METADATA_FILE*: path to the metadata file describing the Verilog module in the bitfile (see description below).
* *OUTFILE*: Path where the final output file of the script should be written. This will be the output of the streaming module, written to a file as raw binary data.
* *PLATFORM_SPECIFIC*: (optional) platform dependent data field. For FPGA platforms, this is the IP address of the FPGA board (e.g. 192.168.1.10).
Each script will expect all paths to be relative to the working directory from which they are called.
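The argument order of the two scripts can be sketched in Python as follows. This is only a sketch: the helpers `compile_cmd` and `run_cmd` are hypothetical, and it assumes the scripts are invoked as *platform/NAME/compile* and *platform/NAME/run* from the repository root, which this README does not state explicitly. The functions only assemble argument lists; they do not run anything.

```python
# Hypothetical helpers, not part of the Rigel repository.
# Assumption: scripts live at platform/<name>/compile and platform/<name>/run
# and are called from the repository root.


def compile_cmd(platform, verilog_file, metadata_file, builddir, outfile):
    """Argument list for: compile VERILOG_FILE METADATA_FILE BUILDDIR OUTFILE."""
    script = "platform/" + platform + "/compile"
    return [script, verilog_file, metadata_file, builddir, outfile]


def run_cmd(platform, bitfile, metadata_file, outfile, platform_specific=None):
    """Argument list for: run BITFILE METADATA_FILE OUTFILE [PLATFORM_SPECIFIC]."""
    script = "platform/" + platform + "/run"
    cmd = [script, bitfile, metadata_file, outfile]
    # PLATFORM_SPECIFIC is optional; for FPGA platforms it is the board's IP.
    if platform_specific is not None:
        cmd.append(platform_specific)
    return cmd


if __name__ == "__main__":
    # All file names below are made up for illustration.
    print(compile_cmd("zynq20vivado", "out/top.v", "out/top.metadata.lua",
                      "out/build", "out/top.bit"))
    print(run_cmd("zynq20vivado", "out/top.bit", "out/top.metadata.lua",
                  "out/top.raw", "192.168.1.10"))
```

The resulting lists could be passed to `subprocess.run(cmd, check=True)`, provided *BUILDDIR* already exists and all paths are relative to the working directory.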
### Module Metadata File ###
The module metadata file describes the IO interface of your verilog module, and the test data to run. Module metadata is provided in lua table format:
return {topModule=[String], tapBits=[Number], tapValue=[String], inputImage=[String],
inputBitsPerPixel=[Number], inputV=[Number], inputWidth=[Number], inputHeight=[Number],
outputBitsPerPixel=[Number], outputV=[Number], outputWidth=[Number],
outputHeight=[Number]}
With the following definition:
* *topModule*: Name of verilog module to run as top.
* *tapBits*: Total size in bits of runtime configurable GPIO constants.
* *tapValue*: GPIO constants to use for test, in hex.
* *inputImage*: Path to test input data file (stored in raw binary format). Path should be relative to the working directory where the build scripts are run from.
* *bitsPerPixel*: Number of bits per pixel.
* *V*: Number of pixels your module expects per cycle. The data width of your Verilog module should be `V*bitsPerPixel` bits.
* *width*: image width in pixels.
* *height*: image height in pixels.
The options 'bitsPerPixel', 'V', 'width', and 'height' behave the same for input and output.
*Note:* the metadata file format contains redundant information so that the output of your module can be interpreted as an image. However, this is not a strict requirement, and the script works for raw binary data as well. If IO is not an image, simply set 'width=# of tokens', 'height=1', 'bitsPerPixel=bitwidth of module interface', and 'V=1'.
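To make the non-image case in the note above concrete, the following Python sketch fills in the metadata fields for a raw token stream (width = number of tokens, height = 1, V = 1) and renders them as a Lua table. The helpers `raw_stream_metadata` and `to_lua_table` are hypothetical, and the rendering assumes string values contain no quotes or backslashes.

```python
# Hypothetical helpers, not part of the Rigel repository.


def raw_stream_metadata(top_module, input_file, tap_bits, tap_value,
                        in_tokens, in_bits, out_tokens, out_bits):
    """Metadata for non-image IO: one 'row' of tokens, one token per cycle."""
    return {
        "topModule": top_module,
        "tapBits": tap_bits,
        "tapValue": tap_value,          # hex string
        "inputImage": input_file,       # raw binary input data
        "inputBitsPerPixel": in_bits,   # bit width of the input interface
        "inputV": 1,
        "inputWidth": in_tokens,        # number of input tokens
        "inputHeight": 1,
        "outputBitsPerPixel": out_bits,
        "outputV": 1,
        "outputWidth": out_tokens,
        "outputHeight": 1,
    }


def to_lua_table(md):
    """Render the dict as 'return {key=value, ...}'.

    Assumption: string values need no escaping.
    """
    def lua_value(v):
        if isinstance(v, str):
            return '"' + v + '"'
        return str(v)

    fields = ", ".join(k + "=" + lua_value(v) for k, v in md.items())
    return "return {" + fields + "}"


if __name__ == "__main__":
    # Module and file names are made up for illustration.
    md = raw_stream_metadata("Top", "in.raw", 0, "0", 1024, 32, 1024, 32)
    print(to_lua_table(md))
```
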
### Expected Verilog Module Interface ###
The scripts expect the Verilog top module being tested to have the following format:
module Top( input CLK, input reset, input [tapBits-1:0] taps,
input [inputBitsPerPixel*inputV:0] process_input, output ready,
output [outputBitsPerPixel*outputV:0] process_output, input ready_downstream );
With the following definition:
* *CLK*: module clock
* *reset*: synchronous module reset, resets when high.
* *taps*: Packed GPIO pins.
* *process_input*: Input data token. Valid bit is packed in top bit of this value.
* *ready*: upstream ready (for the module input)
* *process_output*: Output data token. Valid bit is packed in top bit of this value.
* *ready_downstream*: downstream ready (for the module output)
If your Verilog module does not conform to this interface, we suggest adding a small wrapper module to translate the interfaces, and marking that as top.
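The layout of *process_input* and *process_output* (data in the low `bitsPerPixel*V` bits, valid bit on top) can be modelled in a few lines of Python. This is a sketch with hypothetical helper names; it treats the data as one opaque integer and makes no assumption about how individual pixels are ordered inside it.

```python
# Hypothetical model of the token layout, not part of the Rigel repository.


def pack_token(data, valid, bits_per_pixel, v):
    """Pack data plus valid bit into a (bits_per_pixel*v + 1)-bit token."""
    width = bits_per_pixel * v
    if not 0 <= data < (1 << width):
        raise ValueError("data does not fit in %d bits" % width)
    # The valid bit sits at index `width`, i.e. the top bit of [width:0].
    return (int(bool(valid)) << width) | data


def unpack_token(token, bits_per_pixel, v):
    """Split a token into (data, valid)."""
    width = bits_per_pixel * v
    data = token & ((1 << width) - 1)
    valid = bool((token >> width) & 1)
    return data, valid


if __name__ == "__main__":
    token = pack_token(0xAB, True, 8, 1)
    print(hex(token), unpack_token(token, 8, 1))
```
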
### Caveats ###
While the scripts should be able to build bitstreams successfully for most boards, Zynq SoCs also contain a large software stack that is more variable. Non-standard Linux configurations may require significant changes to our scripts. We tested most run scripts with the default binaries from the [Xilinx binary releases](http://www.wiki.xilinx.com/Zynq+Releases).
Overview
========
@@ -225,7 +225,7 @@ function harnessTop(t)
local harnessOption = t.harness
if harnessOption==nil then harnessOption=1 end
local MD = {inputBitsPerPixel=R.extractData(iover):verilogBits()/(inputP), inputWidth=t.inSize[1], inputHeight=t.inSize[2], outputBitsPerPixel=oover:verilogBits()/(outputP), outputWidth=t.outSize[1], outputHeight=t.outSize[2], inputImage=t.inFile, topModule= fn.name, inputP=inputP, outputP=outputP, simCycles=t.simCycles, tapBits=tapBits, tapValue=tapValueString, harness=harnessOption, ramFile=t.ramFile}
local MD = {inputBitsPerPixel=R.extractData(iover):verilogBits()/(inputP), inputWidth=t.inSize[1], inputHeight=t.inSize[2], outputBitsPerPixel=oover:verilogBits()/(outputP), outputWidth=t.outSize[1], outputHeight=t.outSize[2], inputImage=t.inFile, topModule= fn.name, inputV=inputP, outputV=outputP, simCycles=t.simCycles, tapBits=tapBits, tapValue=tapValueString, harness=harnessOption, ramFile=t.ramFile}
if fn.sdfInput~=nil then
assert(#fn.sdfInput==1)
@@ -1,38 +1,3 @@
//-----------------------------------------------------------------------------
// system.v
//-----------------------------------------------------------------------------
// The axi bus expects the number of valid data items to exactly match the # of addresses we send.
// This module checks for underflow (too few valid data items). If there are too few, it inserts DEADBEEFs to make it correct.
// lengthOutput is in bytes
module UnderflowShim(input CLK, input RST, input [31:0] lengthOutput, input [63:0] inp, input inp_valid, output [63:0] out, output out_valid);
parameter WAIT_CYCLES = 2048;
reg [31:0] outCnt;
reg [31:0] outLen;
reg fixupMode;
reg [31:0] outClks = 0;
always@(posedge CLK) begin
if (RST) begin
outCnt <= 32'd0;
outLen <= lengthOutput;
fixupMode <= 1'b0;
outClks <= 32'd0;
end else begin
outClks <= outClks + 32'd1;
if(inp_valid || fixupMode) begin outCnt <= outCnt+32'd8; end // AXI does 8 bytes per clock
if(outClks > WAIT_CYCLES) begin fixupMode <= 1'b1; end
end
end
assign out = (fixupMode)?(64'hDEAD):(inp);
assign out_valid = (RST)?(1'b0):((fixupMode)?(outCnt<outLen):(inp_valid));
endmodule
module stage
(
inout [53:0] MIO,
@@ -131,7 +96,7 @@ module stage
assign CONFIG_READY = READER_READY && WRITER_READY;
Conf conf(
Conf #(.ADDR_BASE(32'h70000000)) conf(
.ACLK(FCLK0),
.ARESETN(ARESETN),
.S_AXI_ARADDR(PS7_ARADDR),
@@ -99,6 +99,8 @@ module Conf(
.M_AXI_WVALID(LITE_WVALID)
);
parameter ADDR_BASE = 32'd0;
parameter NREG = 4;
parameter W = 32;
@@ -113,7 +115,7 @@ reg [31:0] counter;
reg r_state;
wire [1:0] r_select;
assign r_select = LITE_ARADDR[3:2];
assign ar_good = {LITE_ARADDR[31:4], 2'b00, LITE_ARADDR[1:0]} == 32'h70000000;
assign ar_good = {LITE_ARADDR[31:4], 2'b00, LITE_ARADDR[1:0]} == ADDR_BASE;
assign LITE_ARREADY = (r_state == IDLE);
assign LITE_RVALID = (r_state == RWAIT);
always @(posedge ACLK) begin
@@ -142,7 +144,7 @@ reg w_wroteresp;
wire [1:0] w_select;
assign w_select = LITE_AWADDR[3:2];
assign aw_good = {LITE_AWADDR[31:4], 2'b00, LITE_AWADDR[1:0]} == 32'h70000000;
assign aw_good = {LITE_AWADDR[31:4], 2'b00, LITE_AWADDR[1:0]} == ADDR_BASE;
assign LITE_AWREADY = (w_state == IDLE);
assign LITE_WREADY = (w_state == RWAIT) && !w_wrotedata;
@@ -6,8 +6,13 @@ module stage
wire [3:0] fclk;
wire [3:0] fclkresetn;
wire FCLK0;
BUFG bufg(.I(fclk[0]),.O(FCLK0));
assign ARESETN = fclkresetn[0];
wire ARESETN;
//AA change here: removed buffer for now
BUFG_PS bufg(.I(fclk[0]),.O(FCLK0));
//assign FCLK0 = fclk[0];
assign ARESETN = 1'b1; //fclkresetn[0];
wire [31:0] PS7_ARADDR;
@@ -75,7 +80,7 @@ module stage
assign CONFIG_READY = READER_READY && WRITER_READY;
Conf conf(
Conf #(.ADDR_BASE(32'hA0000000)) conf(
.ACLK(FCLK0),
.ARESETN(ARESETN),
.S_AXI_ARADDR(PS7_ARADDR),
@@ -116,7 +121,7 @@ module stage
reg [31:0] clkcnt = 0;
// assign LED = clkcnt[20:13];
assign LED = clkcnt[28:21];
assign LED = CONFIG_SRC[15:8];//clkcnt[28:21];
always @(posedge FCLK0) begin
// if(ARESETN == 0)
@@ -321,51 +326,55 @@ module stage
.MAXIGP2RREADY (),
.MAXIGP2AWQOS (),
.MAXIGP2ARQOS (),
.SAXIGP0RCLK (),
.SAXIGP0WCLK (),
.SAXIGP0RCLK (FCLK0),
.SAXIGP0WCLK (FCLK0),
.SAXIGP0ARUSER (),
.SAXIGP0AWUSER (),
.SAXIGP0AWID (),
.SAXIGP0AWADDR (),
.SAXIGP0AWLEN (),
.SAXIGP0AWSIZE (),
.SAXIGP0AWBURST (),
.SAXIGP0AWADDR (M_AXI_AWADDR),
.SAXIGP0AWLEN (M_AXI_AWLEN),
.SAXIGP0AWSIZE (M_AXI_AWSIZE),
.SAXIGP0AWBURST (M_AXI_AWBURST),
.SAXIGP0AWLOCK (),
.SAXIGP0AWCACHE (),
.SAXIGP0AWPROT (),
.SAXIGP0AWVALID (),
.SAXIGP0AWREADY (),
.SAXIGP0WDATA (),
.SAXIGP0WSTRB (),
.SAXIGP0WLAST (),
.SAXIGP0WVALID (),
.SAXIGP0WREADY (),
.SAXIGP0AWVALID (M_AXI_AWVALID),
.SAXIGP0AWREADY (M_AXI_AWREADY),
.SAXIGP0WDATA (M_AXI_WDATA),
.SAXIGP0WSTRB (M_AXI_WSTRB),
.SAXIGP0WLAST (M_AXI_WLAST),
.SAXIGP0WVALID (M_AXI_WVALID),
.SAXIGP0WREADY (M_AXI_WREADY),
.SAXIGP0BID (),
.SAXIGP0BRESP (),
.SAXIGP0BVALID (),
.SAXIGP0BREADY (),
.SAXIGP0BRESP (M_AXI_BRESP),
.SAXIGP0BVALID (M_AXI_BVALID),
.SAXIGP0BREADY (M_AXI_BREADY),
.SAXIGP0ARID (),
.SAXIGP0ARADDR (),
.SAXIGP0ARLEN (),
.SAXIGP0ARSIZE (),
.SAXIGP0ARBURST (),
.SAXIGP0ARADDR (M_AXI_ARADDR),
.SAXIGP0ARLEN (M_AXI_ARLEN),
.SAXIGP0ARSIZE (M_AXI_ARSIZE),
.SAXIGP0ARBURST (M_AXI_ARBURST),
.SAXIGP0ARLOCK (),
.SAXIGP0ARCACHE (),
.SAXIGP0ARPROT (),
.SAXIGP0ARVALID (),
.SAXIGP0ARREADY (),
.SAXIGP0ARVALID (M_AXI_ARVALID),
.SAXIGP0ARREADY (M_AXI_ARREADY),
.SAXIGP0RID (),
.SAXIGP0RDATA (),
.SAXIGP0RRESP (),
.SAXIGP0RLAST (),
.SAXIGP0RVALID (),
.SAXIGP0RREADY (),
.SAXIGP0RDATA (M_AXI_RDATA),
.SAXIGP0RRESP (M_AXI_RRESP),
.SAXIGP0RLAST (M_AXI_RLAST),
.SAXIGP0RVALID (M_AXI_RVALID),
.SAXIGP0RREADY (M_AXI_RREADY),
.SAXIGP0AWQOS (),
.SAXIGP0ARQOS (),
.SAXIGP0RCOUNT (),
.SAXIGP0WCOUNT (),
.SAXIGP0RACOUNT (),
.SAXIGP0WACOUNT (),
.SAXIGP1RCLK (),
.SAXIGP1WCLK (),
.SAXIGP1ARUSER (),
@@ -636,46 +645,48 @@ module stage
.SAXIGP6WCOUNT (),
.SAXIGP6RACOUNT (),
.SAXIGP6WACOUNT (),
.SAXIACPACLK (FCLK0),
.SAXIACPAWADDR (M_AXI_AWADDR),
.SAXIACPACLK (),
.SAXIACPAWADDR (),
.SAXIACPAWID (),
.SAXIACPAWLEN (M_AXI_AWLEN),
.SAXIACPAWSIZE (M_AXI_AWSIZE),
.SAXIACPAWBURST (M_AXI_AWBURST),
.SAXIACPAWLEN (),
.SAXIACPAWSIZE (),
.SAXIACPAWBURST (),
.SAXIACPAWLOCK (),
.SAXIACPAWCACHE (),
.SAXIACPAWPROT (),
.SAXIACPAWVALID (M_AXI_AWVALID),
.SAXIACPAWREADY (M_AXI_AWREADY),
.SAXIACPAWVALID (),
.SAXIACPAWREADY (),
.SAXIACPAWUSER (),
.SAXIACPAWQOS (),
.SAXIACPWLAST (M_AXI_WLAST),
.SAXIACPWDATA (M_AXI_WDATA),
.SAXIACPWSTRB (M_AXI_WSTRB),
.SAXIACPWVALID (M_AXI_WVALID),
.SAXIACPWREADY (M_AXI_WREADY),
.SAXIACPBRESP (M_AXI_BRESP),
.SAXIACPWLAST (),
.SAXIACPWDATA (),
.SAXIACPWSTRB (),
.SAXIACPWVALID (),
.SAXIACPWREADY (),
.SAXIACPBRESP (),
.SAXIACPBID (),
.SAXIACPBVALID (M_AXI_BVALID),
.SAXIACPBREADY (M_AXI_BREADY),
.SAXIACPARADDR (M_AXI_ARADDR),
.SAXIACPBVALID (),
.SAXIACPBREADY (),
.SAXIACPARADDR (),
.SAXIACPARID (),
.SAXIACPARLEN (M_AXI_ARLEN),
.SAXIACPARSIZE (M_AXI_ARSIZE),
.SAXIACPARBURST (M_AXI_ARBURST),
.SAXIACPARLEN (),
.SAXIACPARSIZE (),
.SAXIACPARBURST (),
.SAXIACPARLOCK (),
.SAXIACPARCACHE (),
.SAXIACPARPROT (),
.SAXIACPARVALID (M_AXI_ARVALID),
.SAXIACPARREADY (M_AXI_ARREADY),
.SAXIACPARVALID (),
.SAXIACPARREADY (),
.SAXIACPARUSER (),
.SAXIACPARQOS (),
.SAXIACPRID (),
.SAXIACPRLAST (M_AXI_RLAST),
.SAXIACPRDATA (M_AXI_RDATA),
.SAXIACPRRESP (M_AXI_RRESP),
.SAXIACPRVALID (M_AXI_RVALID),
.SAXIACPRREADY (M_AXI_RREADY),
.SAXIACPRLAST (),
.SAXIACPRDATA (),
.SAXIACPRRESP (),
.SAXIACPRVALID (),
.SAXIACPRREADY (),
.PLACECLK (),
.SACEFPDAWVALID (),
.SACEFPDAWREADY (),
@@ -194,12 +194,27 @@ local globals = {}
if metadata.tapBits>0 then
globals[R.newGlobal("taps","input",types.bits(metadata.tapBits),0)] = 1
end
local hsfnorig = RM.liftVerilog( metadata.topModule, R.Handshake(types.bits(metadata.inputBitsPerPixel*metadata.inputP)), R.Handshake(types.bits(metadata.outputBitsPerPixel*metadata.outputP)), readAll(VERILOGFILE), globals, {{metadata.sdfInputN,metadata.sdfInputD}}, {{metadata.sdfOutputN,metadata.sdfOutputD}})
local hsfnSdfInput = {{1,1}}
local hsfnSdfOutput = {{1,1}}
if metadata.sdfInputN~=nil then
hsfnSdfInput = {{metadata.sdfInputN,metadata.sdfInputD}}
hsfnSdfOutput = {{metadata.sdfOutputN,metadata.sdfOutputD}}
end
-- hack: if user passed us no expected cycles info, just set expected cycles to a billion,
-- which should work for almost all pipelines, but still protect the bus (maybe?)
if metadata.sdfInputN==nil and metadata.earlyOverride==nil then
metadata.earlyOverride = 16*1024*1024*100
end
local hsfnorig = RM.liftVerilog( metadata.topModule, R.Handshake(types.bits(metadata.inputBitsPerPixel*metadata.inputV)), R.Handshake(types.bits(metadata.outputBitsPerPixel*metadata.outputV)), readAll(VERILOGFILE), globals, hsfnSdfInput, hsfnSdfOutput)
local hsfn = axiRateWrapper(hsfnorig)
local iRatio, oRatio = R.extractData(hsfn.inputType):verilogBits()/R.extractData(hsfnorig.inputType):verilogBits(), R.extractData(hsfn.outputType):verilogBits()/R.extractData(hsfnorig.outputType):verilogBits()
--print("IRATIO",iRatio,oRatio,metadata.inputP,metadata.outputP)
local inputCount = (metadata.inputWidth*metadata.inputHeight)/(iRatio*metadata.inputP)
local outputCount = (metadata.outputWidth*metadata.outputHeight)/(oRatio*metadata.outputP)
local inputCount = (metadata.inputWidth*metadata.inputHeight)/(iRatio*metadata.inputV)
local outputCount = (metadata.outputWidth*metadata.outputHeight)/(oRatio*metadata.outputV)
local inputBytes, outputBytes
hsfn, inputBytes, outputBytes = harnessAxi( hsfn, inputCount, outputCount, metadata.underflowTest, hsfn.inputType, metadata.earlyOverride )
------------------------------