### Integrated Systems Architectures

# Lab 1: design and implementation of a digital filter

General description



## 1 Reference model development

Design a digital filter with a cut-off frequency of 2 kHz. Set the sampling frequency to 10 kHz. Whether you design a Finite Impulse Response (FIR) or an Infinite Impulse Response (IIR) filter depends on your group number, through the parameter p: if the group number is even, then p=1 and you have to design an FIR filter; otherwise p=0 and you have to design an IIR filter.

### 1.1 Filter design and coefficient quantization using Matlab/Octave

You can use the *fir1* or *butter* functions available in Matlab or Octave to design the FIR or IIR filter respectively. Please note that Octave is an open source project and that Matlab is available for free for Polito students: check the page "MathWorks - Total Academic Headcount" on https://www.areait.polito.it.

On Portale della didattica you can download an example showing you the use of both the fir1 and butter functions (myfir\_design.m and myiir\_design.m). These files require two parameters: i) the order of the filter, ii) the number of bits to represent the coefficients.

To set these two parameters use the following algorithm.

1. Reverse-order (alphabetically) the surnames of the members of your group: e.g. Palmer, Lake, Emerson.
2. Let x and y be the number of characters in the first two reverse-ordered surnames (e.g. x = 6 for Palmer and y = 4 for Lake), then obtain the order of the filter (N) and the number of bits $(n_b)$ as follows:

$$N = 2^{p} \cdot [(x \bmod 2) + 1] + 6 \cdot p, \tag{1}$$

$$n_b = (y \bmod 7) + 8, \tag{2}$$

where p has been defined at the beginning of Section 1.

Example: (Palmer, Lake, ...) leads to x = 6 and y = 4, so $N_{p=0} = 1$, $N_{p=1} = 8$ and $n_b = 12$.
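
As a quick check of Eq. (1) and Eq. (2), the two formulas can be written as a minimal C sketch. The function names `filter_order` and `coeff_bits` are hypothetical and are not part of the files distributed with the lab.

```c
#include <assert.h>

/* Filter order N from Eq. (1): N = 2^p * [(x mod 2) + 1] + 6 * p.
 * x: length of the first reverse-ordered surname.
 * p: 1 for an FIR filter, 0 for an IIR filter. */
int filter_order(int x, int p)
{
    assert(x > 0 && (p == 0 || p == 1));
    return (1 << p) * ((x % 2) + 1) + 6 * p;
}

/* Number of bits n_b from Eq. (2): n_b = (y mod 7) + 8.
 * y: length of the second reverse-ordered surname. */
int coeff_bits(int y)
{
    assert(y > 0);
    return (y % 7) + 8;
}
```

With the example surnames (x = 6, y = 4), `filter_order(6, 0)` gives 1, `filter_order(6, 1)` gives 8 and `coeff_bits(4)` gives 12, matching the values above.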

### 1.2 Testing the filter and fixed point implementation

#### 1.2.1 First step: Matlab/Octave pseudo-fixed-point

Now you have to test your filter on a signal. For this purpose, on *Portale della didattica* you can download the scripts *my\_fir\_filter.m* and *my\_iir\_filter.m*, which create a signal made of two sinusoids at different frequencies (one in band and one out of band). Each script i) calls the proper design function (*myfir\_design* or *myiir\_design*), ii) applies the filter and iii) displays the results. Moreover, each script saves both the input and the output samples as integer values (*samples.txt* and *resultsm.txt*), i.e. as quantized data with one bit for the integer part and $n_b-1$ bits for the fractional part. Check that $|b_j|<1$ and $|a_k|<1$ for every j and k (except $a_0$, which is always equal to 1), so that the quantized values (*bq* and *aq*) and the corresponding integer values (*bi* and *ai*) are correctly represented.
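
To make the relation between a real coefficient and its integer value concrete, here is a minimal C sketch of the conversion to $n_b$ bits with $n_b-1$ fractional bits. The function name `quantize_coeff` is hypothetical, and rounding towards minus infinity (floor) is an assumption: check the distributed Matlab/Octave scripts for the rounding mode they actually use.

```c
#include <assert.h>

/* Integer code of a real coefficient c (|c| < 1) on nb bits:
 * floor(c * 2^(nb-1)). Flooring is an assumption of this sketch. */
int quantize_coeff(double c, int nb)
{
    assert(nb >= 2 && nb <= 31);
    assert(c > -1.0 && c < 1.0);   /* otherwise the code does not fit on nb bits */
    double scaled = c * (double)(1L << (nb - 1));
    int q = (int)scaled;           /* the cast truncates towards zero */
    if ((double)q > scaled)        /* negative non-integers: step down to the floor */
        q -= 1;
    return q;
}
```

For example, with 12 bits the coefficient 0.5 maps to 1024, while 0.3 maps to 614 (0.3 · 2048 = 614.4). The precondition on `c` is the same check requested above for $b_j$ and $a_k$.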

#### 1.2.2 Second step: fixed-point C model

At this point, you have to write a fixed-point implementation of the filtering operation in C, namely:

$$y_i = \sum_j x_{i-j} \cdot b_j - \sum_k y_{i-k} \cdot a_k \tag{3}$$

where  $x_l$ ,  $y_l$ ,  $b_l$  and  $a_l$  are the input samples, output samples and filter coefficients respectively. There are several possibilities to implement (3), but **you must use direct form for FIR** filters and direct form II (canonical direct form) for IIR filters. Please note that on Portale della didattica you can download two examples (myfilter.c and myfilterII.c) where an IIR filter is implemented as fixed point direct form I (myfilter.c) and direct form II (myfilterII.c). Direct form for an FIR filter can be straightforwardly derived from myfilter.c. These programs i) help you in evaluating the performance of the fixed point implementation with respect to the Matlab/Octave one and ii) are the reference model you will use to develop, debug and test the digital filter as an hardware architecture (see Section 2). Please note that the two examples already reduce the number of bits after each multiplication, to reduce the bitwidth (and so the area) of your filter. Note: the notation used in this document, in the Matlab/Octave files and C fixed point model is the one used by Matlab, i.e.  $b_i$  are the coefficients of the moving-average part and  $a_i$  are the coefficients of the auto-regressive part, as in (3), see the help of the Matlab/Octave filter function for details.

Then, evaluate the Total Harmonic Distortion (THD) of the result y of your fixed-point implementation. You can use the *thd* function in Matlab to obtain it. Choose the number of bits to discard after each multiplication so as to minimize the area of your architecture while keeping the THD at or below -30 dB.

## 2 VLSI implementation

### 2.1 Starting architecture development

Draw the block scheme of the architecture and the corresponding timing diagram according to the requirements specified in Section 1 (use https://wavedrom.com). Develop in VHDL the architecture of the filter you designed and equip all the flip-flops and registers with an enable signal. Note: do not choose the architecture of adders and multipliers at this time. Use '\*' and '+' instead. The filter interface is shown in Fig. 1 (a): the samples (DIN) enter one per clock cycle together with a validation signal (VIN). When VIN='1' a new sample is valid and can be loaded into the architecture. The output DOUT contains the result of the filtering (y) and VOUT is a validation signal: VOUT='1' when DOUT is valid. Note: all the inputs and the outputs of the filter must be sampled by registers. All the data are represented as two's complement normalized fixed-point values: the weight of the most significant bit (MSB) is $-2^0$, the weight of the least significant bit (LSB) is $2^{-n_b+1}$. Note: use std\_logic\_vector, signed or unsigned for fixed-point data representation.
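
The data format described above can be illustrated with a short C sketch that converts an $n_b$-bit word into the real value it represents. The function name `fixed_to_real` is hypothetical and is used here only to show the bit weights.

```c
#include <assert.h>

/* Real value of an nb-bit two's complement word whose MSB weighs -2^0
 * and whose LSB weighs 2^-(nb-1). 'raw' carries the bit pattern in its
 * nb least significant bits. */
double fixed_to_real(unsigned raw, int nb)
{
    assert(nb >= 2 && nb <= 31);
    unsigned mask = (1u << nb) - 1u;
    unsigned sign = 1u << (nb - 1);
    long v = (long)(raw & mask);
    if (raw & sign)
        v -= (long)(1u << nb);        /* MSB set: subtract 2^nb to get the signed value */
    return (double)v / (double)sign;  /* scale by 2^-(nb-1) */
}
```

For $n_b = 12$, the word 0x800 is the most negative value (-1.0), 0x400 is 0.5 and 0xFFF is $-2^{-11}$, so every representable value lies in the interval [-1, 1).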

Finally, prepare a hierarchical test-bench, where the signal generation is implemented in VHDL, whereas the top of the hierarchy is a Verilog file (see Fig. 1 (b)). From *Portale della didattica* you can download an example of the files required to create a mixed VHDL/Verilog testbench.

**Note:** To ease the design flow we suggest avoiding the *generic* statement and user-defined types in the **top** entity of the filter. You can use the *generic* statement for the modules inside the design. If you want the top of the hierarchy to be parametric, then please use a package instead.

### 2.2 Simulation

Simulate and verify with Modelsim/Questasim the system you described. In particular, you have to compare the results produced by your VHDL with the ones you obtained with your fixed-point model developed in C (see Section 1). **Note: the results produced by the VHDL must be equal to the ones obtained in C.**

Suggestion: structure your design environment as follows. Create a working directory (e.g. lab1). In lab1 create:

- src as the directory containing the VHDL file/files of the filter;
- tb as the directory containing the VHDL and Verilog files of the test-bench;
- sim as the directory containing the project files created by Modelsim/Questasim, the text files with input data (e.g. samples.txt) and your simulation scripts (if any).

**Note:** the testbench must show that the circuit is working correctly even when *VIN* goes to zero, as shown in Fig. 2.



Figure 2
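
The requirement on *VIN* can be explored with a small behavioural C model before writing the testbench. This is only a sketch under stated assumptions: the names `fir_model`, `fir_clock` and `run_fir` are hypothetical, the filter has 3 taps, products are not truncated, and VOUT is raised in the same cycle as VIN (the latency of the real architecture is not modelled).

```c
#include <assert.h>

#define TAPS 3   /* hypothetical number of taps */

typedef struct {
    int x[TAPS];   /* sample registers, x[0] is the newest sample */
    int dout;      /* output register */
    int vout;      /* output validation flag */
} fir_model;

/* One clock cycle. All registers are enabled by vin, so a cycle with
 * vin == 0 leaves the state untouched and produces vout == 0. */
void fir_clock(fir_model *m, int din, int vin, const int *b)
{
    assert(m && b);
    m->vout = 0;
    if (!vin)
        return;                        /* hold: din is ignored */
    for (int k = TAPS - 1; k > 0; k--)
        m->x[k] = m->x[k - 1];
    m->x[0] = din;
    int acc = 0;
    for (int k = 0; k < TAPS; k++)
        acc += m->x[k] * b[k];
    m->dout = acc;
    m->vout = 1;
}

/* Feed n samples, optionally inserting one idle cycle (vin = 0) after
 * each sample, and collect the valid outputs. Returns their count. */
int run_fir(const int *in, int n, int with_gaps, const int *b, int *out)
{
    fir_model m = {{0}, 0, 0};
    int count = 0;
    for (int i = 0; i < n; i++) {
        fir_clock(&m, in[i], 1, b);
        if (m.vout) out[count++] = m.dout;
        if (with_gaps) {
            fir_clock(&m, 12345, 0, b);   /* arbitrary data on DIN while VIN = 0 */
            if (m.vout) out[count++] = m.dout;
        }
    }
    return count;
}
```

Running `run_fir` with and without gaps on the same input should give the same sequence of valid outputs, which is the property the testbench has to demonstrate on the VHDL.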

### 2.3 Implementation

#### 2.3.1 Logic synthesis

Synthesize the filter with Synopsys Design Compiler (see the documentation on *Portale della didattica*) according to the following specifications.

1. Check the messages generated by Design Compiler, especially the ones generated during the *analyze*, *elaborate* and *compile* phases, as they show potential problems such as incomplete sensitivity lists, inferred latches, combinational loops, ...
2. Find the maximum clock frequency your design achieves $(f_M)$, which corresponds to the minimum period with slack met and equal to zero; find the corresponding area.
3. Verify the netlist via simulation with a clock frequency equal to $f_M$. Check that the testbench is not violating setup and hold times and adjust it as needed. Measure on the waveform viewer the time distance between the first valid output and the last valid output. The output results must be the same as the ones obtained with the VHDL.
4. Obtain the switching activity to estimate the power consumption of your design (see the documentation on *Portale della didattica*).
5. Re-synthesize the design enabling clock gating and re-estimate the power consumption.

Suggestion: structure your design environment as follows. Create in lab1:

- syn as the directory containing the synthesis scripts and the working files produced by the logic synthesizer;
- netlist as the directory containing the output of the logic synthesizer.

**Remember** to remove the *work* directory and recreate it every time you run a new synthesis, so that Design Compiler starts from a clean folder.

#### 2.3.2 Place & Route

Place and route the filter (with active clock gating) with Cadence Innovus (see the documentation on *Portale della didattica*) according to the following specifications.

1. Check the messages generated by Innovus at each step (project setup, floorplanning, placement, routing, ...).
2. Set $f_{clk} = f_M/2$ by properly modifying the .sdc file and find the area of your design.
3. Verify the netlist via simulation, setting $f_{clk} = f_M/2$. Measure on the waveform viewer the time distance between the first valid output and the last valid output. The output results must be the same as the ones obtained with the VHDL.
4. Obtain the switching activity (with $f_{clk} = f_M/2$) to estimate the power consumption of your design (see the documentation on *Portale della didattica*).

Suggestion: structure your design environment as follows. Create in lab1:

- innovus as the directory containing the place and route files and results.

### 2.4 Advanced architecture development

Improve your architecture by applying the following techniques:

- L-unfolding and pipelining for FIR filters,
- J-look-ahead, retiming and pipelining (where possible) for IIR filters,

where L=3 and J=1. Choose the number of pipeline stages to achieve the maximum possible throughput without adding pipeline stages inside the arithmetic blocks: draw your architecture on paper, put registers between the arithmetic blocks and make experiments with Design Compiler. For the new architecture repeat all the steps detailed in Sections 2.1, 2.2 and 2.3.
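
To see what 3-unfolding means at the behavioural level, the sketch below processes three input samples per step and produces three outputs per step. It is an illustration under assumptions: the names `fir_sample`, `unf_state` and `fir3_unfolded_step` are hypothetical, the filter has only 3 taps (your FIR has more), and products are kept at full precision without discarding bits.

```c
#include <assert.h>

#define TAPS 3   /* hypothetical number of taps */

/* Reference, non-unfolded filter: y(n) = sum_j b[j] * x(n-j),
 * with x(n) = 0 for n < 0. */
int fir_sample(const int *x, int n, const int *b)
{
    assert(x && b && n >= 0);
    int acc = 0;
    for (int j = 0; j < TAPS; j++)
        if (n - j >= 0)
            acc += b[j] * x[n - j];
    return acc;
}

/* Registers that survive between unfolded steps: x(3k-1) and x(3k-2). */
typedef struct { int xm1, xm2; } unf_state;

/* One step of the 3-unfolded filter: consumes x(3k), x(3k+1), x(3k+2)
 * and produces y(3k), y(3k+1), y(3k+2). */
void fir3_unfolded_step(unf_state *s, const int x[3], const int b[3], int y[3])
{
    assert(s && x && b && y);
    y[0] = b[0] * x[0] + b[1] * s->xm1 + b[2] * s->xm2;
    y[1] = b[0] * x[1] + b[1] * x[0]   + b[2] * s->xm1;
    y[2] = b[0] * x[2] + b[1] * x[1]   + b[2] * x[0];
    s->xm2 = x[1];   /* becomes x(3(k+1)-2) at the next step */
    s->xm1 = x[2];   /* becomes x(3(k+1)-1) at the next step */
}
```

Each unfolded step corresponds to one clock cycle of the unfolded architecture, so three samples are processed per cycle. Comparing the outputs of `fir3_unfolded_step` with those of `fir_sample` on the same input is a simple way to check that the unfolded equations are correct before moving to VHDL.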

Note for FIR filters: the final architecture must have the interface shown in Fig. 3. You can use the *unfolded* version of the testbench files available on *Portale della didattica* to test the unfolded architecture.




Figure 3

Note for IIR filters: derive the J-look-ahead architecture from the **direct form II** representation. Moreover, with the look-ahead technique 'new' coefficients, such as $b_0 \cdot a_1$, are required. As a consequence, precision and bit-width could change with respect to the first architecture. Thus, you have to develop a new fixed-point C model to test the architecture. As in the first architecture, you have to choose the number of bits to discard after each multiplication to minimize the area of your architecture **with a maximum THD of -30 dB**.
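
As an illustration of the look-ahead idea with J=1, consider the recursive part of a first-order direct form II filter, $w_n = x_n - a_1 w_{n-1}$. Substituting $w_{n-1}$ gives $w_n = x_n - a_1 x_{n-1} + a_1^2 w_{n-2}$, where $a_1^2$ is one of the 'new' coefficients. The sketch below checks the equivalence in floating point; the function names are hypothetical and a first-order filter is assumed. In fixed point the two forms are no longer bit-identical, because the new coefficient is quantized separately, which is why a new C model is needed.

```c
#include <assert.h>

/* Original direct form II recursion (first order):
 *   w(n) = x(n) - a1 * w(n-1) */
void df2_recursion(const double *x, int n, double a1, double *w)
{
    assert(x && w && n >= 0);
    double w1 = 0.0;                       /* w(n-1) */
    for (int i = 0; i < n; i++) {
        w[i] = x[i] - a1 * w1;
        w1 = w[i];
    }
}

/* 1-look-ahead form, obtained by substituting w(n-1):
 *   w(n) = x(n) - a1 * x(n-1) + a1^2 * w(n-2)
 * The feedback loop now contains two delays instead of one. */
void df2_lookahead(const double *x, int n, double a1, double *w)
{
    assert(x && w && n >= 0);
    double x1 = 0.0, w1 = 0.0, w2 = 0.0;   /* x(n-1), w(n-1), w(n-2) */
    for (int i = 0; i < n; i++) {
        w[i] = x[i] - a1 * x1 + (a1 * a1) * w2;
        x1 = x[i];
        w2 = w1;
        w1 = w[i];
    }
}
```

The extra delay in the loop is what allows retiming and pipelining of the recursive part; the output $y_n$ is then obtained from $w_n$ and $w_{n-1}$ as in the original direct form II structure.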