## **Integrated Systems Architectures**

Lab 2: digital arithmetic



Figure 1

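The aim of this lab is to address digital arithmetic issues: how a logic synthesizer implements adders and multipliers, and how those choices affect the performance of a design.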

# 1 Digital arithmetic and logic synthesizers

## 1.1 Introduction and background

As you saw in the first lab, modern logic synthesizers (such as Synopsys Design Compiler) handle the behavioural description of adders and multipliers by directly inferring the corresponding component in the netlist. Design Compiler can do this thanks to DesignWare (DW), which is essentially a collection of ready-to-use blocks. In particular, the report\_resources command shows the arithmetic resources employed in your design and the architecture chosen for each of them. To access the complete Design Compiler documentation, first source the initialization script:

source /software/scripts/init\_synopsys

and then type sold, which stands for Synopsys On-Line Documentation.

As detailed in the documentation, DesignWare contains, among others:

- a parametric adder (*DW01\_add*), which can be implemented as ripple-carry (rpl), carry-look-ahead (cla) or parallel-prefix (pparch);
- a parametric multiplier (*DW02\_mult*), which can be implemented as carry-save (csa) or parallel-prefix (pparch).

For further details see

/software/synopsys/sold\_current/doc/online/dw\_ip/doc/dwf/intro.pdf.

You can force Design Compiler to use a specific architecture for each element (adder, multiplier, ...) in your design with the set\_implementation command. Issue it after specifying the clock constraints and before the compile command. The name of each cell appears in the report generated by the report\_resources command. To specify one architecture for all the adders at once, select the cells with the find command.

**example** Specify one architecture (e.g. ripple-carry) for adder cell add\_124:

set\_implementation DW01\_add/rpl add\_124

**example** Specify one architecture (e.g. ripple-carry) for all the adders:

set\_implementation DW01\_add/rpl [find cell \*add\_\*]
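
When several architectures have to be compared, the set\_implementation lines can be generated rather than typed by hand. The Python sketch below enumerates the adder and multiplier implementations listed above and prints one pair of commands per combination. The helper name `implementation_commands` and the cell pattern `*mult_*` for multipliers are assumptions made for illustration; check the actual cell names in the report\_resources output before using them.

```python
from itertools import product

ADDER_ARCHS = ["rpl", "cla", "pparch"]  # DW01_add implementations listed above
MULT_ARCHS = ["csa", "pparch"]          # DW02_mult implementations listed above


def implementation_commands(adder_arch, mult_arch,
                            adder_cells="*add_*", mult_cells="*mult_*"):
    """Return the two set_implementation lines for one combination.

    The cell patterns are assumptions: adapt them to the cell names
    reported by report_resources for your design.
    """
    return [
        f"set_implementation DW01_add/{adder_arch} [find cell {adder_cells}]",
        f"set_implementation DW02_mult/{mult_arch} [find cell {mult_cells}]",
    ]


# Every adder architecture paired with every multiplier architecture.
combinations = list(product(ADDER_ARCHS, MULT_ARCHS))

for adder_arch, mult_arch in combinations:
    print(f"# adders: {adder_arch}, multipliers: {mult_arch}")
    for line in implementation_commands(adder_arch, mult_arch):
        print(line)
```

Each printed pair can be pasted into a synthesis script after the clock constraints and before the compile command.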

### 1.1.1 Dealing with hierarchy

If your design is organized hierarchically, you have to work out the hierarchy in order to address the right cells in the set\_implementation command. This can be done by checking the report\_resources report. Since this can be difficult, you can instead first flatten the hierarchy and then force Design Compiler to use a DW component.

**example** Flatten the hierarchy and specify one architecture (e.g. ripple-carry) for all the adders:

```
ungroup -all -flatten
set_implementation DW01_add/rpl [find cell *add_*]
```

## 1.2 Pipelining

Instead of writing a pipelined multiplier or adder by hand, you can exploit Design Compiler's ability to optimize register positions (retiming). A simple example is the design of a pipelined multiplier: describe the multiplier with the behavioural operator "\*" and place a chain of registers at its output. Then force Design Compiler to re-compile the design with retiming by using the optimize\_registers command. During this operation the tool might also move input/output registers, which increases the input/output delay. You can keep input/output registers fixed with the set\_dont\_touch command. To make clear which registers are input/output registers and which are not, use a good naming convention, e.g. the suffix \_in\_reg for input registers and \_out\_reg for output registers. With this convention you can keep input/output registers fixed by issuing:

```
compile
set_dont_touch *_in_reg
set_dont_touch *_out_reg
optimize_registers
```
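
Retiming moves registers across the combinational logic but is not meant to change the cycle-level behaviour seen at the outputs: a multiplier followed by N registers still delivers each product N clock cycles after its operands. The Python sketch below is a minimal cycle-level model of that behaviour, useful as a reference when checking simulation results. The function name `pipelined_multiplier`, the reset value 0 and the sample data are assumptions made for illustration.

```python
from collections import deque


def pipelined_multiplier(a_stream, b_stream, stages):
    """Behavioural product followed by a chain of `stages` registers.

    Registers are assumed to reset to 0. Returns the value visible at
    the output of the register chain in each clock cycle.
    """
    pipe = deque([0] * stages)
    out = []
    for a, b in zip(a_stream, b_stream):
        pipe.append(a * b)          # product enters the register chain
        out.append(pipe.popleft())  # oldest value leaves the chain
    return out


# Hypothetical input samples, padded with zeros to flush the pipeline.
a = [3, -2, 5, 7, 0, 0]
b = [4, 6, -1, 2, 0, 0]

reference = [x * y for x, y in zip(a, b)]       # unpipelined products
delayed = pipelined_multiplier(a, b, stages=2)  # two output registers

# Same values as the reference, shifted by two cycles.
assert delayed[2:] == reference[:-2]
```

In a testbench this means the expected outputs of the pipelined design are those of the original design shifted by the number of added register stages.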

## 1.3 Optimization

Several optimization tools are enabled automatically by the *ultra* mode of Design Compiler. To enable *ultra* mode, add:

```
set_ultra_optimization true
```

before issuing the analyze command. Then, instead of using the compile command use compile\_ultra.

## 1.4 Assignment

Use the VHDL of the <u>last architecture</u> you developed for lab 1 and try the following. **Note:** do not instantiate DesignWare components in your design; use Design Compiler commands to infer them instead.

- 1. Force Design Compiler to use DesignWare adders and multipliers for <u>all</u> the cells of your design (Note: use the compile command). Report the results for <u>all</u> the possible combinations of DW adders and multipliers, forcing Design Compiler to achieve the maximum clock frequency in each case. Take the relevant information from the reports generated by the report\_resources, report\_timing and report\_area commands. Verify the correct behaviour of the whole design through simulation (as in lab 1).
- 2. Improve the performance of your design by adding pipeline registers to the multipliers. Note: do not force Design Compiler to infer a specific DW arithmetic block, leave it free to choose the best solution with the compile\_ultra command. Use the optimize\_registers command to exploit retiming and find the maximum clock frequency. Verify the correct behaviour of the whole design through simulation (as in lab 1). Note: FIR filters do not give any particular problem even with deeply-pipelined multipliers. On the other hand, pipelining in the autoregressive part of IIR filters may need adoption of look-ahead techniques.
- 3. Challenge: can you do better than Design Compiler? Design a Baugh-Wooley multiplier for two's complement data. The adder plane must rely on a Dadda tree. Then use the new multiplier, referred to as *version 1*, instead of the behavioural operator "\*". **Note:** Verify via simulation the correct behaviour of both the multiplier and the whole filter circuit, and show the results.
- 4. Idea: you can do better by reducing the precision. Modify (if needed) your digital filter so that only the most significant part of the multiplication result is used (see example in Fig. 1). Then, remove all the adders related to the 5 least significant bits. This solution will be referred to as *version 2*. Show via simulation the error you obtain on the result.
- 5. For the two versions of the multiplier defined above: synthesize the complete filter architecture with Synopsys Design Compiler, find the maximum clock frequency and compare/discuss the obtained results.
- 6. Optional part
  - Describe in VHDL the arithmetic circuit proposed in one of the following references [1, 2, 3]. You can find a copy of each paper on "Portale della didattica".
    - (a) Prepare a testbench and simulate the design. The testbench should highlight the specific behaviour of the circuit. Note: verify the correct behaviour of the circuit via extensive simulations.
    - (b) Synthesize the circuit with Synopsys Design Compiler. Find the maximum frequency and the corresponding area.
    - (c) Modify your digital filter (lab 1) to include instances of the arithmetic circuit you designed. Note: verify the correct behaviour of the whole architecture via extensive simulations. Finally, synthesize the complete architecture with Synopsys Design Compiler, find the maximum clock frequency and compare/discuss the obtained results.
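
For items 3 and 4, a bit-level software model can serve as a golden reference for the VHDL simulations. The Python sketch below builds the partial-product matrix of a Baugh-Wooley multiplier in its commonly used modified form (cross terms with exactly one sign bit complemented, plus the constants 2^n and 2^(2n-1)) and checks it exhaustively against the signed product. It then drops the five least significant columns to estimate the error of *version 2*. Several points are assumptions: the 6-bit operand width, the function names, and the reading of "the adders related to the 5 least significant bits" as the partial-product columns of weight 2^0 to 2^4. The Dadda tree itself is not modelled; a plain sum stands in for the adder plane.

```python
def to_bits(value, n):
    """Two's complement bits of `value` on n bits, LSB first."""
    return [(value >> i) & 1 for i in range(n)]


def baugh_wooley_columns(a, b, n):
    """Partial-product bits grouped by column weight (0 .. 2n-1)."""
    x, y = to_bits(a, n), to_bits(b, n)
    cols = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            bit = x[i] & y[j]
            # Terms involving exactly one sign bit are complemented.
            if (i == n - 1) != (j == n - 1):
                bit ^= 1
            cols[i + j].append(bit)
    cols[n].append(1)          # correction constant 2^n
    cols[2 * n - 1].append(1)  # correction constant 2^(2n-1)
    return cols


def sum_columns(cols, n, dropped=0):
    """Add the columns of weight >= `dropped`; result as signed 2n-bit value."""
    total = sum(sum(col) << w for w, col in enumerate(cols) if w >= dropped)
    total &= (1 << (2 * n)) - 1                      # keep 2n bits
    if total >> (2 * n - 1):                         # sign bit set
        total -= 1 << (2 * n)
    return total


n = 6  # assumed operand width, small enough for an exhaustive check
worst_error = 0
for a in range(-(1 << (n - 1)), 1 << (n - 1)):
    for b in range(-(1 << (n - 1)), 1 << (n - 1)):
        cols = baugh_wooley_columns(a, b, n)
        exact = sum_columns(cols, n)
        assert exact == a * b                        # version 1 is exact
        truncated = sum_columns(cols, n, dropped=5)  # version 2 estimate
        worst_error = max(worst_error, exact - truncated)

print("worst-case truncation error:", worst_error)
```

The same loop can write input and expected-output vectors to a file for the VHDL testbench, so that both multiplier versions are compared against one reference.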

## References

- [1] Y.-H. Chen. An accuracy-adjustment fixed-width Booth multiplier based on multilevel conditional probability. *IEEE Transactions on VLSI Systems*, 23(1):203–207, Jan 2015.
- [2] A. Cilardo, D. De Caro, N. Petra, F. Caserta, N. Mazzocca, E. Napoli, and A. G. M. Strollo. High speed speculative multipliers based on speculative carry-save tree. *IEEE Transactions on Circuits and Systems I*, 61(12):3426–3435, Dec 2014.
- [3] L. Qian, C. Wang, W. Liu, F. Lombardi, and J. Han. Design and evaluation of an approximate Wallace-Booth multiplier. In *International Symposium on Circuits and Systems*, pages 1974–1977, 2016.