## FIR Workbook (lab\_3)

Student ID: 20726557

This lab involves designing an FIR filter that communicates through the AXI-Lite and AXI-Stream protocols. AXI-Lite handles the configuration registers and the reading and writing of the tap coefficients, while AXI-Stream handles the data flow of the FIR filter. The tap coefficients and the data for the FIR filter are stored in BRAM modules.

At the beginning of operation, ap\_idle is set. After the data length, the number of taps, and the tap coefficients have been configured, the FIR filter receives ap\_start through the AXI-Lite protocol and deasserts ap\_idle.

Once ap\_start is asserted, the FIR filter streams in the input data and streams out the computed FIR results through the AXI-Stream protocol. When the last data has been received and the last result has been output, ap\_done and ap\_idle are asserted, and the filter waits for the next FIR operation.

## **Block Diagram**



### **Operation**

#### **AXI Protocols**

For this lab, the AXI-Lite and the AXI-Stream Protocols are used to read/write data to the configuration registers and the BRAM.

#### **AXI-Lite Read**

The AXI-Lite Read protocol contains two processes: address and data.



To read data, the receiver asserts "arvalid" together with the address to be read, "araddr". The transmitter samples the address and asserts "arready", which tells the receiver that it may release "araddr". Once this address handshake has completed, the transmitter asserts "rvalid" together with the read data, "rdata". When the receiver is ready to accept the data, it asserts "rready". In the cycle where both "rvalid" and "rready" are asserted, the receiver captures the data from the transmitter, and both signals deassert in the next cycle.
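The read handshake described above can be sketched as a small cycle-by-cycle Python model. This is only an illustration under stated assumptions: the function name `axi_lite_read`, the `rready_delay` parameter, and the dictionary standing in for the register file are hypothetical and are not part of the lab design.

```python
def axi_lite_read(registers, araddr, rready_delay=0):
    """Cycle-level sketch of one AXI-Lite read. Returns (rdata, trace).

    registers    -- dict mapping address -> value (hypothetical register file)
    rready_delay -- cycles the receiver waits before asserting rready (assumption)
    """
    trace = []
    # Address phase: the receiver holds arvalid/araddr until arready is seen.
    trace.append({"arvalid": 1, "arready": 0, "rvalid": 0, "rready": 0})
    trace.append({"arvalid": 1, "arready": 1, "rvalid": 0, "rready": 0})
    latched_addr = araddr  # transmitter has sampled the address

    # Data phase: the transmitter holds rvalid/rdata until rready is seen.
    rdata = registers.get(latched_addr, 0)
    for _ in range(rready_delay):
        trace.append({"arvalid": 0, "arready": 0, "rvalid": 1, "rready": 0})
    trace.append({"arvalid": 0, "arready": 0, "rvalid": 1, "rready": 1})

    # Both rvalid and rready deassert in the cycle after the transfer.
    trace.append({"arvalid": 0, "arready": 0, "rvalid": 0, "rready": 0})
    return rdata, trace
```

Calling `axi_lite_read({0x10: 600}, 0x10, rready_delay=2)` returns the stored value together with a six-entry trace in which "rvalid" stays high for three cycles while waiting for "rready".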

#### **AXI-Lite Write**

Like the AXI-Lite Read protocol, the AXI-Lite Write protocol consists of an address process and a data process.



To write data, the transmitter asserts "awvalid" together with the destination address, "awaddr". When the receiver detects "awvalid", it responds with "awready" to notify the transmitter that it has sampled the address. Upon detecting "awready", the transmitter deasserts "awvalid" and releases "awaddr".

When the data is ready to be sent, the transmitter asserts "wvalid" and drives the data on "wdata". The receiver samples "wdata" and asserts "wready", after which the transmitter deasserts "wvalid" and releases "wdata". Once the receiver has both the address and the data, the data is written to the given address.

There is no ordering dependency between "awvalid" and "wvalid". The transmitter can assert both at the same time, or either one before the other.
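Because the address and data phases are independent, the receiver has to latch whichever arrives first and commit the write only when it holds both. A minimal Python sketch of that behaviour follows; the class name `AxiLiteWriteTarget` and its methods are hypothetical names chosen for illustration, not the lab's RTL.

```python
class AxiLiteWriteTarget:
    """Sketch of a receiver that accepts awaddr and wdata in any order."""

    def __init__(self):
        self.mem = {}        # stands in for the registers / tap BRAM
        self._addr = None    # latched awaddr, if the address phase is done
        self._data = None    # latched wdata, if the data phase is done

    def accept_address(self, awaddr):
        """Models the awvalid/awready handshake."""
        self._addr = awaddr
        self._commit()

    def accept_data(self, wdata):
        """Models the wvalid/wready handshake."""
        self._data = wdata
        self._commit()

    def _commit(self):
        # The write happens only when both address and data are held.
        if self._addr is not None and self._data is not None:
            self.mem[self._addr] = self._data
            self._addr = None
            self._data = None


# Address first, then data.
a = AxiLiteWriteTarget()
a.accept_address(0x40)
a.accept_data(7)

# Data first, then address: the result is the same.
b = AxiLiteWriteTarget()
b.accept_data(7)
b.accept_address(0x40)
```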

#### **AXI-Stream**

AXI-Stream is used to stream-in and stream-out a series of data.



To transmit data, the transmitter asserts "tvalid" to indicate that the data on "tdata" is valid. When the receiver is ready to accept the data on "tdata", it asserts "tready", which also tells the transmitter to present the next data in the stream. The cycle continues until the transmitter asserts "tlast" together with "tvalid" to mark the last data of the stream. After the data marked with "tlast" has been accepted with "tready", the transmitter halts the stream and sets "tvalid" to logic 0.
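The stream handshake can be illustrated with a short Python model in which data moves only in cycles where both "tvalid" and "tready" are high. The function name `axi_stream_transfer` and the repeating `tready_pattern` argument are assumptions made for this sketch.

```python
def axi_stream_transfer(samples, tready_pattern):
    """Sketch of an AXI-Stream burst with receiver back-pressure.

    samples        -- list of data words the transmitter wants to send
    tready_pattern -- repeating list of 0/1 values for the receiver's tready
    Returns (received, cycles), where received holds (tdata, tlast) pairs.
    """
    if 1 not in tready_pattern:
        raise ValueError("receiver never asserts tready")

    received = []
    cycles = 0
    idx = 0
    while idx < len(samples):
        tvalid = 1  # transmitter always has data until the stream ends
        tready = tready_pattern[cycles % len(tready_pattern)]
        tlast = 1 if idx == len(samples) - 1 else 0
        if tvalid and tready:
            # Transfer happens; transmitter moves on to the next word.
            received.append((samples[idx], tlast))
            idx += 1
        cycles += 1
    return received, cycles
```

With `tready_pattern=[0, 1]` the receiver accepts one word every second cycle, so three words take six cycles and only the final word carries "tlast".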

### **Configuration Registers and Tap Coefficient**

When the system first turns on, it is in the "reset" state because "axis\_rst\_n" is at logic low (active-low). The system enters the "idle" state once "axis\_rst\_n" is set to logic high. In this state, ap\_idle in the AP Configuration Register is logic 1, while ap\_start and ap\_done are logic 0.

Before operation starts, the data length, the number of taps, and the tap coefficients must be written to the system. The data length and the number of taps are saved in their respective Configuration Registers, while the tap coefficients are saved in a BRAM. The AXI-Lite protocol is used to write the data into these configuration registers and the BRAM. The system contains an AXI-Lite Finite State Machine (FSM) that controls the reads and writes according to the AXI-Lite protocol described above. The corresponding AXI-Lite addresses are as follows.

- 0x00 AP Configuration register
- 0x10-13 Data Length Configuration Register
- 0x14-17 Tap Number Configuration Register
- 0x40-7F Tap Coefficient BRAM



Each AXI-Lite operation only writes to one address at a time. Therefore, multiple AXI-Lite operations are needed to fully program the data into the system. When the data length, number of taps, and the tap coefficient are written to their respective locations, the system can start operation.
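The address map above can be expressed as a small decode function. The sketch below assumes each configuration register is a single 32-bit word (so the tap number register is treated as the four bytes starting at 0x14) and that tap coefficients are word-indexed within the BRAM; the function name `decode_axilite_addr` and the returned labels are hypothetical.

```python
def decode_axilite_addr(addr):
    """Map a byte address to (target, word_index). Illustrative sketch only."""
    if addr == 0x00:
        return ("ap_config", 0)
    if 0x10 <= addr <= 0x13:
        return ("data_length", 0)
    if 0x14 <= addr <= 0x17:      # assumption: one 32-bit word
        return ("tap_number", 0)
    if 0x40 <= addr <= 0x7F:
        # 64 bytes of tap BRAM -> 16 word-aligned coefficients.
        return ("tap_bram", (addr - 0x40) >> 2)
    return ("unmapped", 0)
```

For example, address 0x44 decodes to tap coefficient 1, and 0x7C decodes to the last word of the tap BRAM, index 15.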

#### **FIR Computation**

To start the operation, the ap\_start is generated by writing logic 1 through AXI-Lite Write. If ap\_idle is logic 0, the write is ignored by the configuration register as the system is already running. Writing ap\_start as logic 1 will also set ap\_idle to logic 0.

Setting ap\_start allows the FIR computation to begin and enables the AXI-Stream FSM, which streams data in and out through the AXI-Stream protocol. In this lab, the "Xn" stream-in interface is named "ss" and the "Yn" stream-out interface is named "sm". When the system detects that "ss-tready" is asserted, the AXI-Stream protocol starts streaming in data for the FIR computation. At the same time, the AP Configuration Register clears ap\_start to logic 0.



When the FIR computation starts, the data BRAM is first initialized by writing 0 to all addresses so that no unknown "x" state appears during computation.

The FIR computation follows the following formula:

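$$y[t] = \sum_{i=0}^{N-1} h[i] \cdot x[t - i]$$

Here N is the number of taps, h[i] is tap coefficient i, and x[t] is the stream-in data.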
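As a reference for checking results, the formula can be evaluated directly in a few lines of Python. This is a plain software sketch, not the hardware implementation; the function name `fir_reference` is hypothetical, and samples before the start of the stream are treated as 0, matching the zero-initialized data BRAM.

```python
def fir_reference(h, x):
    """Direct evaluation of y[t] = sum over i of h[i] * x[t - i]."""
    y = []
    for t in range(len(x)):
        acc = 0
        for i in range(len(h)):
            if t - i >= 0:          # x before the first sample counts as 0
                acc += h[i] * x[t - i]
        y.append(acc)
    return y
```

With taps `[1, 2, 3]` and input `[1, 2, 3, 4]`, the outputs are 1, 2 + 2·1 = 4, 3 + 2·2 + 3·1 = 10, and 4 + 2·3 + 3·2 = 16.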

To perform the FIR computation, the "Xn" data is streamed in through the AXI-Stream "ss-tdata". "ss-tready" is asserted so that the stream-in side can prepare the next data. At the same time, tap coefficient 0 is read from address[0] (0x40-43) of the Tap BRAM. The "Xn" stream-in data is multiplied by this tap coefficient, and the product is saved in a register.

After this multiplication, the data at address[0] is read from the data BRAM, and tap coefficient 1 is read from address[1] (0x44-47). In the next cycle, the stream-in data is written to data BRAM address[0], and the original data from data BRAM address[0] is multiplied by tap coefficient 1. This product is added to the previous result saved in the register. In each subsequent cycle, new data and a new tap coefficient are read from the next address, and the previously read data is written to that address, causing a "shift" in the data. The data is multiplied by the tap coefficient and added to the running sum of products.

When the address reaches the end (which depends on the number of taps), the last tap coefficient from the tap BRAM and the last data from the data BRAM are read out. The computed FIR result is then streamed out as "Yn" through "sm-tdata" using the AXI-Stream protocol. The data that was read out last is shifted out and removed from the data BRAM. The system repeats the steps above to compute the next FIR output by streaming in the next "Xn".
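The read-multiply-accumulate-and-shift sequence above can be modelled in Python to check that it produces the FIR result. The sketch below is a behavioural approximation, not cycle-accurate RTL: the function name `fir_bram_model` is hypothetical, and it assumes the data BRAM holds the previous N-1 samples, where N is the number of taps.

```python
def fir_bram_model(taps, stream_in):
    """Behavioural sketch of the BRAM-based FIR datapath.

    taps      -- tap coefficients as stored in the tap BRAM
    stream_in -- list of Xn samples
    Returns the list of Yn results.
    """
    n = len(taps)
    data_bram = [0] * (n - 1)       # initialized to 0 before computation
    stream_out = []

    for xn in stream_in:
        acc = taps[0] * xn          # stream-in data times tap coefficient 0
        carry = xn                  # value to write into the next address
        for k in range(1, n):
            old = data_bram[k - 1]  # read data from the current address
            acc += taps[k] * old    # multiply by tap k and accumulate
            data_bram[k - 1] = carry  # write the previous data: the "shift"
            carry = old
        # After the last address, the oldest sample (carry) is dropped.
        stream_out.append(acc)
    return stream_out
```

Tracing taps `[1, 2, 3]` with input `[1, 2, 3, 4]` through this model gives 1, 4, 10, 16, which is what the FIR formula yields for the same inputs.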



When the system streams in the last data (determined by the data length), "ss-tlast" is asserted and the final round of FIR computation is performed. When this last computation completes, "Yn" is streamed out together with "sm-tlast" to mark the last stream-out operation. The AXI-Stream FSM then sets "ap\_idle" and "ap\_done" to logic 1.

To check the status of the FIR engine, the system can read the AP Configuration Register. It samples "ap\_done" to check whether the FIR engine has completed. "ap\_done" is cleared to logic 0 once it has been read. The FIR engine can then be reconfigured and run again by setting "ap\_start" to logic 1.
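The behaviour of the AP Configuration Register described in this section can be summarized in a short Python model. The bit positions (bit 0 for ap\_start, bit 1 for ap\_done, bit 2 for ap\_idle) are an assumption, since the text does not state them, and the class and method names are hypothetical.

```python
class ApConfigRegister:
    """Sketch of the ap_start / ap_done / ap_idle behaviour."""

    # Assumed bit positions; not specified in the text.
    AP_START = 1 << 0
    AP_DONE = 1 << 1
    AP_IDLE = 1 << 2

    def __init__(self):
        # State after reset is released: idle, not started, not done.
        self.ap_start = 0
        self.ap_done = 0
        self.ap_idle = 1

    def write(self, value):
        # A write of ap_start is ignored unless the engine is idle.
        if (value & self.AP_START) and self.ap_idle:
            self.ap_start = 1
            self.ap_idle = 0

    def on_first_stream_in(self):
        self.ap_start = 0           # cleared once streaming begins

    def on_last_stream_out(self):
        self.ap_done = 1            # set by the AXI-Stream FSM
        self.ap_idle = 1

    def read(self):
        value = (self.ap_start * self.AP_START
                 | self.ap_done * self.AP_DONE
                 | self.ap_idle * self.AP_IDLE)
        self.ap_done = 0            # ap_done is cleared when read
        return value
```

In this model, a second read after completion no longer shows ap\_done, which mirrors the clear-on-read behaviour described above.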



# **Resource Usage**

FF, LUT



**2. Memory**

| Site Type      | Used | Fixed | Prohibited | Available | Util% |
|----------------|------|-------|------------|-----------|-------|
| Block RAM Tile | 0    | 0     | 0          | 140       | 0.00  |
| RAMB36/FIFO\*  | 0    | 0     | 0          | 140       | 0.00  |
| RAMB18         | 0    | 0     | 0          | 280       | 0.00  |

\* Note: Each Block RAM Tile only has one FIFO logic available and t… Block RAM Tile, that tile can still accommodate a RAMB18E1

# **Performance Report**

## Latency



### **Throughput**



```
-----Start simulation-----
Start FIR, Round
----Start the data input(AXI-Stream)----
----Start the data output(AXI-Stream)----
----Start the ap_done sampling(AXI-Lite)----
----Start the illegal tap sampling(AXI-Lite)----
AXI-Stream inputting data
AXI-Stream outputting data 0...
The throughput is 33 cycles
AXI-Stream inputting data
AXI-Stream outputting data 1...
The throughput is 33 cycles
AXI-Stream inputting data
AXI-Stream outputting data 2...
The throughput is 33 cycles
```
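As a rough conversion of this figure into time, assume the 33 cycles reported are the interval between consecutive outputs and that the design runs at the 10 ns clock period used as the timing constraint:

$$T_{\text{sample}} = 33 \times 10\,\text{ns} = 330\,\text{ns}, \qquad f_{\text{sample}} = \frac{1}{330\,\text{ns}} \approx 3.03\ \text{MSample/s}$$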

## **Timing Report**

## Frequency



## **Report Timing**



### **Max Delay Path**

```
Max Delay Paths
Slack (MET) :             0.260ns  (required time - arrival time)
  Source:                 axistream_mult_reg[2]/C
                            (rising edge-triggered cell FDCE clocked by axis_clk  {rise@0.000ns fall@5.000ns period=10.000ns})
  Destination:            axistream_mult_reg[31]/D
                            (rising edge-triggered cell FDCE clocked by axis_clk  {rise@0.000ns fall@5.000ns period=10.000ns})
  Path Group:             axis_clk
  Path Type:              Setup (Max at Slow Process Corner)
  Requirement:            10.000ns  (axis_clk rise@10.000ns - axis_clk rise@0.000ns)
  Data Path Delay:        9.604ns  (logic 4.158ns (43.294%)  route 5.446ns (56.706%))
  Logic Levels:           11  (CARRY4=6 LUT3=2 LUT5=2 LUT6=1)
  Clock Path Skew:        -0.145ns (DCD - SCD + CPR)
    Destination Clock Delay (DCD):    2.128ns = ( 12.128 - 10.000 )
    Source Clock Delay      (SCD):    2.456ns
    Clock Pessimism Removal (CPR):    0.184ns
  Clock Uncertainty:      0.035ns  ((TSJ^2 + TIJ^2)^1/2 + DJ) / 2 + PE
    Total System Jitter     (TSJ):    0.071ns
    Total Input Jitter      (TIJ):    0.000ns
    Discrete Jitter          (DJ):    0.000ns
    Phase Error              (PE):    0.000ns

    Location   Delay type                         Incr(ns)  Path(ns)    Netlist Resource(s)
               (clock axis_clk rise edge)            0.000     0.000 r
                                                     0.000     0.000 r  axis_clk (IN)
               net (fo=0)                            0.000     0.000    axis_clk
                                                                     r  axis_clk_IBUF_inst/I
               IBUF (Prop_ibuf_I_O)                  0.972     0.972 r  axis_clk_IBUF_inst/O
               net (fo=1, unplaced)                  0.800     1.771    axis_clk_IBUF
                                                                     r  axis_clk_IBUF_BUFG_inst/I
               BUFG (Prop_bufg_I_O)                  0.101     1.872 r  axis_clk_IBUF_BUFG_inst/O
               net (fo=260, unplaced)                0.584     2.456    axis_clk_IBUF_BUFG
               FDCE                                                  r  axistream_mult_reg[2]/C

               FDCE (Prop_fdce_C_Q)                  0.478     2.934 r  axistream_mult_reg[2]/Q
               net (fo=52, unplaced)                 1.048     3.982    axistream_mult[2]
                                                                     r  axistream_mult[23]_i_59/I0
               LUT6 (Prop_lut6_I0_O)                 0.295     4.277 r  axistream_mult[23]_i_59/O
               net (fo=2, unplaced)                  0.650     4.927    axistream_mult[23]_i_59_n_0
                                                                     r  axistream_mult_reg[23]_i_47/DI[1]
               CARRY4 (Prop_carry4_DI[1]_CO[3])      0.520     5.447 r  axistream_mult_reg[23]_i_47/CO[3]
               net (fo=1, unplaced)                  0.000     5.447    axistream_mult_reg[23]_i_47_n_0
                                                                     r  axistream_mult_reg[27]_i_66/CI
               CARRY4 (Prop_carry4_CI_O[2])          0.256     5.703 r  axistream_mult_reg[27]_i_66/O[2]
               net (fo=3, unplaced)                  0.923     6.626    axistream_mult_reg[27]_i_66_n_5
                                                                     r  axistream_mult[27]_i_52/I2
               LUT3 (Prop_lut3_I2_O)                 0.301     6.927 r  axistream_mult[27]_i_52/O
               net (fo=1, unplaced)                  0.639     7.566    axistream_mult[27]_i_52_n_0
                                                                     r  axistream_mult_reg[27]_i_18/DI[1]
               CARRY4 (Prop_carry4_DI[1]_CO[3])      0.520     8.086 r  axistream_mult_reg[27]_i_18/CO[3]
               net (fo=1, unplaced)                  0.000     8.086    axistream_mult_reg[27]_i_18_n_0
                                                                     r  axistream_mult_reg[31]_i_21/CI
               CARRY4 (Prop_carry4_CI_O[3])          0.331     8.417 r  axistream_mult_reg[31]_i_21/O[3]
               net (fo=5, unplaced)                  0.646     9.063    axistream_mult_reg[31]_i_21_n_4
                                                                     r  axistream_mult[27]_i_11/I0
               LUT3 (Prop_lut3_I0_O)                 0.307     9.370 r  axistream_mult[27]_i_11/O
               net (fo=1, unplaced)                  0.449     9.819    axistream_mult[27]_i_11_n_0
                                                                     r  axistream_mult[27]_i_3/I1
               LUT5 (Prop_lut5_I1_O)                 0.124     9.943 r  axistream_mult[27]_i_3/O
               net (fo=1, unplaced)                  0.473    10.416    axistream_mult[27]_i_3_n_0
                                                                     r  axistream_mult_reg[27]_i_2/DI[3]
               CARRY4 (Prop_carry4_DI[3]_CO[3])      0.396    10.812 r  axistream_mult_reg[27]_i_2/CO[3]
               net (fo=1, unplaced)                  0.000    10.812    axistream_mult_reg[27]_i_2_n_0
                                                                     r  axistream_mult_reg[31]_i_2/CI
               CARRY4 (Prop_carry4_CI_O[3])          0.331    11.143 r  axistream_mult_reg[31]_i_2/O[3]
               net (fo=1, unplaced)                  0.618    11.761    axistream_mult0[31]
                                                                     r  axistream_mult[31]_i_1/I0
               LUT5 (Prop_lut5_I0_O)                 0.299    12.060 r  axistream_mult[31]_i_1/O
               net (fo=1, unplaced)                  0.000    12.060    axistream_mult[31]_i_1_n_0
               FDCE                                                  r  axistream_mult_reg[31]/D
```

| Delay type                 | Incr(ns) | Path(ns) | Netlist Resource(s)         |
|----------------------------|----------|----------|-----------------------------|
| (clock axis_clk rise edge) | 10.000   | 10.000 r |                             |
|                            | 0.000    | 10.000 r | axis_clk (IN)               |
| net (fo=0)                 | 0.000    | 10.000   | axis_clk                    |
|                            |          | r        | axis_clk_IBUF_inst/I        |
| IBUF (Prop_ibuf_I_O)       | 0.838    | 10.838 r | axis_clk_IBUF_inst/O        |
| net (fo=1, unplaced)       | 0.760    | 11.598   |                             |
|                            |          | r        | axis_clk_IBUF_BUFG_inst/I   |
| BUFG (Prop_bufg_I_O)       | 0.091    | 11.689 r | axis_clk_IBUF_BUFG_inst/O   |
| net (fo=260, unplaced)     | 0.439    | 12.128   |                             |
| FDCE                       |          | r        | axistream_mult_reg[31]/C    |
| clock pessimism            | 0.184    | 12.311   |                             |
| clock uncertainty          | -0.035   | 12.276   |                             |
| FDCE (Setup_fdce_C_D)      | 0.044    | 12.320   | axistream_mult_reg[31]      |
| required time              |          | 12.320   |                             |
| arrival time               |          | -12.060  |                             |
| slack                      |          | 0.260    |                             |
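The slack follows directly from the required and arrival times in the report, and it gives a rough estimate of the shortest clock period this path could support. The estimate below assumes that the path delay and clock skew stay the same when the constraint is tightened, which is only an approximation:

$$\text{slack} = 12.320\,\text{ns} - 12.060\,\text{ns} = 0.260\,\text{ns}$$

$$T_{\min} \approx 10.000\,\text{ns} - 0.260\,\text{ns} = 9.740\,\text{ns}, \qquad f_{\max} \approx \frac{1}{9.740\,\text{ns}} \approx 102.7\,\text{MHz}$$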

## **Simulation Waveform**

## **Coefficient Program and Read Back**

Coefficient Program



#### Read Back



#### **Data Stream**

#### Data-in Stream-in



#### Data-out Stream-out



### **RAM Access Control**

Tap BRAM (AXI-Lite Write during ap\_idle)



Tap BRAM (AXI-Lite Read during ap\_idle)



### Tap BRAM (Not ap\_idle, drop write transaction, read invalid value 32'hffffffff)



#### Data BRAM (AXI-Stream in, Shift Data)



## Github

https://github.com/AnthonyGithub/EESM6000C-Lab-3