## SoC Design Lab 4-2 Report

Team members: 柯駿驊 陳法諭 聶家任 莊詠翔

1. Design block diagram: datapath and control path



2. The interface protocol between firmware, user project and testbench

The firmware, user project, and testbench communicate over Wishbone (WB). To talk to our Verilog FIR, WB transactions are converted to AXI-Lite or AXI-Stream according to the WB address:

- 0x3000\_0000 to 0x3000\_007F: reads and writes are converted to AXI-Lite transactions.
- 0x3000\_0080 (send X[n]): a write is converted to an AXI-Stream master transfer that sends data to the Verilog FIR.
- 0x3000\_0084 (read Y[n]): a read is converted to an AXI-Stream slave transfer that reads Y[n] from the Verilog FIR.

3. Waveform and analysis of the hardware/software behavior.

Because our firmware sends data to the FIR engine in a for loop, the time between two consecutive outputs received over WB is lengthened by reads from the 0x3800.... region. As shown in the figure below, the interval from the first FIR output to the second is about 500 cycles.
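
The address decoding described in item 2 can be sketched as a small behavioral model. This is only an illustration in Python, not our Verilog: the names `Target` and `decode` are hypothetical, and the `UNMAPPED` result for any other address/direction combination is an assumption, since the report does not say how such accesses are handled.

```python
from enum import Enum


class Target(Enum):
    AXI_LITE = "AXI-Lite read/write"
    AXIS_MASTER_X = "AXI-Stream master, send X[n]"
    AXIS_SLAVE_Y = "AXI-Stream slave, read Y[n]"
    UNMAPPED = "not handled in this sketch (assumption)"


# Address map taken from item 2 of the report.
AXI_LITE_FIRST = 0x3000_0000
AXI_LITE_LAST = 0x3000_007F
X_ADDR = 0x3000_0080  # write-only: send X[n]
Y_ADDR = 0x3000_0084  # read-only: read Y[n]


def decode(addr: int, is_write: bool) -> Target:
    """Map one Wishbone access to the interface it is converted to."""
    if AXI_LITE_FIRST <= addr <= AXI_LITE_LAST:
        return Target.AXI_LITE
    if addr == X_ADDR and is_write:
        return Target.AXIS_MASTER_X
    if addr == Y_ADDR and not is_write:
        return Target.AXIS_SLAVE_Y
    return Target.UNMAPPED


if __name__ == "__main__":
    for addr, is_write in [(0x3000_0010, True), (X_ADDR, True), (Y_ADDR, False)]:
        kind = "write" if is_write else "read"
        print(f"{addr:#010x} {kind}: {decode(addr, is_write).value}")
```

The direction check matters for the two stream addresses: only a write to 0x3000\_0080 and only a read from 0x3000\_0084 map to a stream transfer in this model.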



4. What is the FIR engine's theoretical throughput, i.e. data rate? What is the actually measured throughput?

The theoretical throughput is 1/11 of the clock frequency, since the FIR engine needs 11 cycles to compute one output sample. However, because we store the required data in BRAM, and BRAM reads are synchronous, each output takes additional cycles. As the figure below shows, the measured throughput of our design is 1/24 of the clock frequency (cycle 21843 to cycle 21866).
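
To make the two ratios concrete, the short sketch below converts cycles per output into samples per second. The clock frequency `CLOCK_HZ` is a hypothetical value chosen only for illustration (the report does not state one), and `samples_per_second` is a helper name introduced here.

```python
from fractions import Fraction


def samples_per_second(cycles_per_output: int, clock_hz: int) -> Fraction:
    """Output rate when one sample is produced every `cycles_per_output` cycles."""
    return Fraction(clock_hz, cycles_per_output)


CLOCK_HZ = 10_000_000  # hypothetical clock, not taken from the report

theoretical = samples_per_second(11, CLOCK_HZ)  # 11 cycles per output
measured = samples_per_second(24, CLOCK_HZ)     # 24 cycles per output (measured)

if __name__ == "__main__":
    print(f"theoretical: {float(theoretical):.0f} samples/s")
    print(f"measured:    {float(measured):.0f} samples/s")
    print(f"measured / theoretical = {measured / theoretical}")
```

The ratio of measured to theoretical throughput is 11/24 regardless of the clock chosen, so the design reaches a little under half of the theoretical rate.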



5. What is the latency for firmware to feed data?

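Because our firmware sends data to the FIR engine in a for loop, the time between two consecutive outputs received over WB is lengthened by reads from the 0x3800... region. As shown in the figure below, the interval from the first FIR output to the second is about 500 cycles.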
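
As a rough back-of-the-envelope check, the firmware interval can be compared with the engine's own compute time. The sketch assumes the 24 cycles per output reported in item 4 and treats the approximate 500-cycle interval as exact; `engine_busy_fraction` is a hypothetical helper name.

```python
from fractions import Fraction


def engine_busy_fraction(engine_cycles: int, interval_cycles: int) -> Fraction:
    """Fraction of each firmware interval the FIR engine spends computing."""
    return Fraction(engine_cycles, interval_cycles)


ENGINE_CYCLES = 24      # measured cycles per output (item 4)
INTERVAL_CYCLES = 500   # approximate firmware interval between outputs

busy = engine_busy_fraction(ENGINE_CYCLES, INTERVAL_CYCLES)

if __name__ == "__main__":
    print(f"engine busy: {float(busy):.1%} of each interval")
```

Under these assumptions the engine is busy for only a small fraction of each interval, which suggests that end-to-end throughput is limited by the firmware loop rather than by the FIR engine.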





6. What techniques are used to improve the throughput? Does bram12 give better performance, and in what way?

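We can use bram12 to increase throughput. Since the FIR engine in this lab needs at most 11 multiply-accumulate operations per output, bram12 lets us pre-fetch: the data is prepared one cycle in advance, which raises the actual throughput.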
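
The benefit of pre-fetching from a synchronous-read memory can be illustrated with a toy cycle-count model. This is not a model of bram12 or of our design: it assumes read data becomes valid one cycle after the address is issued and that one multiply-accumulate fits in one cycle. Both function names are hypothetical.

```python
def cycles_without_prefetch(taps: int) -> int:
    """Each tap waits for its own read: issue address, then multiply-accumulate."""
    cycles = 0
    for _ in range(taps):
        cycles += 1  # issue read address; data is not valid yet
        cycles += 1  # data valid: multiply-accumulate
    return cycles


def cycles_with_prefetch(taps: int) -> int:
    """Issue the next address while the current data is being accumulated."""
    cycles = 1  # issue the first read address
    for _ in range(taps):
        cycles += 1  # accumulate current data; next address issued in parallel
    return cycles


if __name__ == "__main__":
    taps = 11  # multiply-accumulate operations per output, from the report
    print("without pre-fetch:", cycles_without_prefetch(taps))
    print("with pre-fetch:   ", cycles_with_prefetch(taps))
```

In this simplified model, overlapping the read of the next operand with the current multiply-accumulate hides the one-cycle read latency for every tap except the first.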

7. Can you suggest other methods to improve the performance?

During synthesis, retiming can be used to improve performance.