

# Design of a 32-bit RISC-V Processor

Arvin Delavari – Summer 2023

Supervisor: Professor Mirzakuchaki, PhD

Electronics Research Center, Iran University of Science and Technology

# **Contents:**

- Error Configurable Approximate Multiplier Design
- Image Processing using Approximate Multiplier
- 32-bit RISC-V Processor Design
- Processor Physical Layout and Verification
- Assistant Software for Code Execution
- References



## Error Configurable Approximate Multiplier Design

- "A Low-Power High-Speed Accuracy-Controllable Approximate Multiplier Design" [1]
- Carry Maskable Adder (CMA)
- Proposal of a new approximate multiplier architecture
- Error Configurable Multiplier








Proposal of a new Error Configurable Adder (ECA):

| A | B | Cin | Cout (Er = 0) | Sum (Er = 0) | Cout (Er = 1) | Sum (Er = 1) | Er = 0 output ← exact value |
|---|---|-----|---------------|--------------|---------------|--------------|-----------------------------|
| 0 | 0 | 0   | 0             | 0            | 0             | 0            |                             |
| 0 | 0 | 1   | 0             | 1            | 0             | 1            |                             |
| 0 | 1 | 0   | 0             | 1            | 0             | 1            |                             |
| 0 | 1 | 1   | 0             | 1            | 1             | 0            | 1 ← 2                       |
| 1 | 0 | 0   | 0             | 1            | 0             | 1            |                             |
| 1 | 0 | 1   | 1             | 1            | 1             | 0            | 3 ← 2                       |
| 1 | 1 | 0   | 1             | 0            | 1             | 0            |                             |
| 1 | 1 | 1   | 1             | 1            | 1             | 1            |                             |
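The truth table can be checked with a short Python sketch. The Boolean expressions used for the Er = 0 mode were found by inspecting the table rows and are an assumption, not the gate-level circuit from the slides; the names `eca_bit`, `TABLE_ER0`, and `er0_mismatches` are hypothetical.

```python
from itertools import product


def eca_bit(a, b, cin, er):
    """Return (cout, sum) for one adder bit, following the truth table."""
    if er == 1:
        # Er = 1: behaves as an exact full adder.
        s = a ^ b ^ cin
        cout = (a & b) | (cin & (a ^ b))
    else:
        # Er = 0: expressions chosen by inspection so that they reproduce
        # the Er = 0 columns of the table (assumption, not the real netlist).
        s = (a ^ b) | cin
        cout = a & (b | cin)
    return cout, s


# Er = 0 columns of the table as (cout, sum), rows ordered 000 .. 111.
TABLE_ER0 = [(0, 0), (0, 1), (0, 1), (0, 1), (0, 1), (1, 1), (1, 0), (1, 1)]


def er0_mismatches():
    """List the inputs where the Er = 0 result differs from the exact sum."""
    mismatches = []
    for a, b, cin in product((0, 1), repeat=3):
        cout, s = eca_bit(a, b, cin, er=0)
        approx = 2 * cout + s
        exact = a + b + cin
        if approx != exact:
            mismatches.append(((a, b, cin), approx, exact))
    return mismatches
```

Calling `er0_mismatches()` returns the two rows annotated in the table: input `(0, 1, 1)` yields 1 in place of 2, and input `(1, 0, 1)` yields 3 in place of 2.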





<sup>\*</sup> Circuit performance is analyzed with the normalized gate delay models described in [2].




|      | NMED | MRED | ER    |
|------|------|------|-------|
| m_7b | 0.25 | 0.85 | 36.16 |
| m_6b | 0.26 | 0.99 | 43.46 |
| m_5b | 0.29 | 1.31 | 52.07 |
| m_4b | 0.35 | 1.93 | 61.05 |
| m_3b | 0.49 | 3.05 | 69.61 |
| m_2b | 0.71 | 4.57 | 74.93 |
| m_1b | 1.05 | 6.5  | 78.1  |
| m_0b | 1.64 | 9.02 | 80.02 |

|        | NMED | MRED | ER    |
|--------|------|------|-------|
| ecm_7b | 0.25 | 0.85 | 36.16 |
| ecm_6b | 0.25 | 0.85 | 36.16 |
| ecm_5b | 0.26 | 0.97 | 39.73 |
| ecm_4b | 0.28 | 1.33 | 44.33 |
| ecm_3b | 0.34 | 2.11 | 50.33 |
| ecm_2b | 0.48 | 3.59 | 57.02 |
| ecm_1b | 0.76 | 5.89 | 61.8  |
| ecm_0b | 1.25 | 8.94 | 65.07 |
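The three metrics in these tables can be computed by exhaustive simulation. The sketch below uses the commonly used definitions of error rate (ER), normalized mean error distance (NMED), and mean relative error distance (MRED); the slides do not state the exact definitions or scaling, so these are assumptions. `approx_mul` is a toy stand-in that simply clears low product bits; it is not the CMA- or ECA-based multiplier.

```python
def approx_mul(a, b, k):
    """Toy stand-in: exact product with the k lowest bits cleared."""
    return ((a * b) >> k) << k


def error_metrics(n_bits, k):
    """Exhaustively compare approx_mul with the exact n_bits x n_bits product."""
    max_val = (1 << n_bits) - 1
    max_product = max_val * max_val
    total = 0
    errors = 0
    sum_ed = 0      # accumulated error distance |approx - exact|
    sum_red = 0.0   # accumulated relative error distance
    for a in range(max_val + 1):
        for b in range(max_val + 1):
            exact = a * b
            ed = abs(approx_mul(a, b, k) - exact)
            total += 1
            if ed != 0:
                errors += 1
            sum_ed += ed
            if exact != 0:
                # Relative error is undefined for a zero product; skip it.
                sum_red += ed / exact
    return {
        "ER": errors / total,
        "NMED": sum_ed / total / max_product,
        "MRED": sum_red / total,
    }
```

For example, `error_metrics(4, 1)` gives an ER of 0.25, because clearing one bit only changes products of two odd operands. Multiply the returned fractions by 100 to express them as percentages.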

# MRED Comparison between multipliers implemented with CMA and ECA



# Error Rate Comparison between multipliers implemented with CMA and ECA





# Image Processing using Approximate Multiplier

#### PSNR (dB) for the image sharpening algorithm on the Lenna image

|     | M_7b  | M_6b  | M_5b  | M_4b  | M_3b  | M_2b  | M_1b  | M_0b  |
|-----|-------|-------|-------|-------|-------|-------|-------|-------|
| CMA | 46.32 | 44.78 | 42.94 | 38.93 | 33.63 | 28.15 | 24.81 | 15.90 |
| ECA | 46.32 | 46.32 | 43.99 | 41.78 | 37.51 | 31.27 | 26.52 | 21.82 |
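PSNR values like those in the table come from comparing a reference image with a test image. A minimal sketch of the standard PSNR formula, assuming 8-bit grayscale images stored as lists of rows, is shown below; the function name `psnr` and the choice of reference image are assumptions, since the slides do not say which image serves as the reference.

```python
import math


def psnr(reference, test, max_value=255):
    """PSNR in dB between two equally sized grayscale images (lists of rows)."""
    squared_error = 0
    count = 0
    for row_ref, row_test in zip(reference, test):
        for r, t in zip(row_ref, row_test):
            squared_error += (r - t) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the test image is closer to the reference.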

Figure panels: Original, Sharpened, Sharpened with ECA





### 32-bit RISC-V Processor Design

- Modular and extensible design
- Distributed control logic
- 5-stage pipeline
- Compatible with the GCC compiler [3] (C and assembly)
- RV32I base integer instruction set of the RISC-V ISA [4]
- Synthesizable and implementable on FPGA (UltraScale)
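As an illustration of the base integer instruction format, the sketch below splits a 32-bit instruction word into the fixed fields defined by the RISC-V specification. It is a software model for reference only and says nothing about how the processor's decoder is built; the function name `decode_fields` is hypothetical.

```python
def decode_fields(word):
    """Split a 32-bit RISC-V instruction word into its fixed bit fields."""
    word &= 0xFFFFFFFF
    imm_i = word >> 20          # bits 31:20, the I-type immediate
    if imm_i & 0x800:
        imm_i -= 0x1000         # sign-extend the 12-bit immediate
    return {
        "opcode": word & 0x7F,
        "rd": (word >> 7) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rs1": (word >> 15) & 0x1F,
        "rs2": (word >> 20) & 0x1F,
        "funct7": (word >> 25) & 0x7F,
        "imm_i": imm_i,
    }
```

For example, `decode_fields(0x00500093)` corresponds to `addi x1, x0, 5`: opcode `0x13`, `rd` 1, `rs1` 0, and an immediate of 5.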

# BLOCK DIAGRAM



# Processor Physical Layout and Verification





| Core specification  | Value   |
|---------------------|---------|
| Clock cycle time    | 4 ns    |
| CPI (R- and I-type) | 1.13    |
| Frequency           | 250 MHz |

| Module                        | Max Delay (ps) |
|-------------------------------|----------------|
| Address Generator             | 3844.84        |
| Arithmetic Logic Unit         | 3099.01        |
| Control Status Registers      | 747.689        |
| Hazard Forward Unit           | 1131.73        |
| Immediate Generator           | 1016.44        |
| Instruction Decoder           | 716.437        |
| Jump Branch Unit              | 243.115        |
| Register File                 | 695.34         |
| Normalized Memory Access Time | 10000 - 40000  |
| Fetch Unit                    | 308.907        |
| Load Store Unit               | 569.903        |
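The figures in the two tables can be cross-checked with simple arithmetic. The sketch below recomputes the clock frequency from the cycle time, finds the slowest listed module, and estimates execution time from the CPI. It assumes the module delays can be compared directly against the clock period, and it leaves out the memory access row because that entry is a normalized range; all names in the code are hypothetical.

```python
CYCLE_TIME_PS = 4000.0  # 4 ns clock cycle from the core specifications
CPI = 1.13              # reported for R- and I-type instructions

# Max delays in ps, copied from the module table (memory row omitted).
MODULE_DELAYS_PS = {
    "Address Generator": 3844.84,
    "Arithmetic Logic Unit": 3099.01,
    "Control Status Registers": 747.689,
    "Hazard Forward Unit": 1131.73,
    "Immediate Generator": 1016.44,
    "Instruction Decoder": 716.437,
    "Jump Branch Unit": 243.115,
    "Register File": 695.34,
    "Fetch Unit": 308.907,
    "Load Store Unit": 569.903,
}


def frequency_mhz(cycle_time_ps):
    """Clock frequency in MHz for a cycle time given in picoseconds."""
    return 1e6 / cycle_time_ps


def slowest_module(delays):
    """Return (name, delay) of the module with the largest delay."""
    return max(delays.items(), key=lambda item: item[1])


def execution_time_ns(instruction_count, cpi, cycle_time_ps):
    """Execution time = instruction count x CPI x cycle time, in ns."""
    return instruction_count * cpi * cycle_time_ps / 1000.0
```

With these numbers, `frequency_mhz(CYCLE_TIME_PS)` gives 250 MHz, the slowest listed module is the Address Generator at 3844.84 ps (about 155 ps below the 4000 ps period), and 1000 R- or I-type instructions would take roughly 4520 ns.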

- Implemented in TSMC 180 nm technology [5]
- Synthesis tool: Yosys [6]
- Layout: Magic VLSI [7]



#### **Assistant Software for Code Execution**

#### Windows

Venus Simulator (assembly code), used as a "Visual Studio Code" extension

Test flow:

Assembly output (.txt) $\rightarrow$ Python script $\rightarrow$ instruction memory HEX file $\rightarrow$ Testbench
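The conversion step in this flow could look like the following sketch. It assumes the assembly output text holds one 32-bit machine word per line in hexadecimal, with or without a `0x` prefix; the actual file layout and the original script are not shown in the slides, and the function names are hypothetical.

```python
def dump_to_hex_lines(dump_text):
    """Normalize a text dump of machine words to 8-digit hex strings."""
    lines = []
    for raw in dump_text.splitlines():
        token = raw.strip()
        if not token:
            continue  # skip blank lines
        if token.lower().startswith("0x"):
            token = token[2:]
        value = int(token, 16)
        if value > 0xFFFFFFFF:
            raise ValueError(f"not a 32-bit word: {raw!r}")
        lines.append(f"{value:08x}")
    return lines


def write_hex_file(dump_text, path):
    """Write one instruction word per line to the HEX file at path."""
    with open(path, "w") as handle:
        handle.write("\n".join(dump_to_hex_lines(dump_text)) + "\n")
```

If the testbench is written in Verilog, a file in this one-word-per-line form can be loaded into the instruction memory with `$readmemh`.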

#### Linux

RISC-V GCC compiler toolchain (C and assembly code)

Test flow:

C code $\rightarrow$ makefile $\rightarrow$ GCC compiler toolchain $\rightarrow$ instruction memory HEX file $\rightarrow$ Testbench
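One possible way to produce the instruction memory HEX file in this flow is sketched below. It assumes the toolchain has already emitted a raw little-endian binary image of the program (for example with `objcopy -O binary`); the slides do not describe the makefile, so this step and the function name `binary_to_hex_words` are assumptions.

```python
import struct


def binary_to_hex_words(image):
    """Convert a raw little-endian program image to 32-bit hex words."""
    # Pad with zero bytes so the length is a multiple of 4.
    padded = image + b"\x00" * (-len(image) % 4)
    return [f"{word:08x}" for (word,) in struct.iter_unpack("<I", padded)]
```

For example, the four bytes `93 00 50 00` become the single word `00500093`. Writing the returned strings one per line gives a file that a Verilog testbench could read with `$readmemh`.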





#### References

- [1] T. Yang, T. Ukezono, and T. Sato, "A Low-Power High-Speed Accuracy-Controllable Approximate Multiplier Design," 2018 23rd Asia and South Pacific Design Automation Conference (ASP-DAC), Jeju, Korea (South), 2018, pp. 605-610.
- [2] W.-C. Yeh and C.-W. Jen, "High-speed Booth encoded parallel multiplier design," IEEE Transactions on Computers, vol. 49, no. 7, pp. 692-701, July 2000.
- [3] https://github.com/riscv-collab/riscv-gnu-toolchain
- [4] https://riscv.org/technical/specifications/
- [5] https://www.tsmc.com/english/dedicatedFoundry/technology/logic/l\_018micron
- [6] https://www.yosyshq.com/open-source
- [7] http://opencircuitdesign.com/magic/
- [8] D. A. Patterson and J. L. Hennessy, "Computer Organization and Design: The Hardware/Software Interface," 5th ed., *Morgan Kaufmann*, 2013.
- [9] J. L. Hennessy and D. A. Patterson, "Computer Architecture: A Quantitative Approach," 6th ed., *Morgan Kaufmann*, 2017.

