**Names:** Christian Miranda and Zachary Huang

**Date:** 06/09/2025

## EE188 Assignment 3 - Hitachi SH-2 CPU Pipelined Design

A five-stage pipeline (instruction fetch, instruction decode, execute, memory access, and writeback) was implemented. The major changes are in sh2\_cpu.vhd and sh2\_control.vhd.
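As a rough illustration of how instructions overlap in a five-stage pipeline, the Python sketch below builds an idealized stage-occupancy schedule. It is not derived from the VHDL sources: it assumes no stalls, hazards, or branch penalties, and the names `pipeline_schedule` and `total_cycles` are hypothetical helpers introduced only for this example.

```python
# Idealized model of a 5-stage pipeline (no stalls, hazards, or branches).
# This is an illustrative sketch, not a model of the actual SH-2 design.
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_schedule(num_instructions):
    """Return {cycle: {stage: instruction_index}} for a stall-free pipeline."""
    schedule = {}
    for instr in range(num_instructions):
        for offset, stage in enumerate(STAGES):
            cycle = instr + offset  # instruction i enters IF in cycle i
            schedule.setdefault(cycle, {})[stage] = instr
    return schedule


def total_cycles(num_instructions):
    """Cycles needed to complete N instructions: N plus the fill latency."""
    if num_instructions == 0:
        return 0
    return num_instructions + len(STAGES) - 1


if __name__ == "__main__":
    sched = pipeline_schedule(4)
    for cycle in sorted(sched):
        cells = []
        for stage in STAGES:
            if stage in sched[cycle]:
                cells.append(f"{stage}:i{sched[cycle][stage]}")
            else:
                cells.append(f"{stage}:--")
        print(f"cycle {cycle}: " + " ".join(cells))
```

Under these assumptions, once the pipeline is full one instruction completes every clock, which is the basis for the "approximately 1 instruction per clock" figure used in the timing comparison.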

The design can be tested in GHDL by running `make test`. All tests continue to pass except the branch tests.

## Timing and Resource Usage

The pipelined and non-pipelined designs were implemented using Xilinx Vivado for the Xilinx Spartan 7 XC7S25-1CSGA225C.

The resource and timing reports for both designs are provided in the files titled report\_pipelined.pdf and reports\_unpipelined.pdf.

The clock period for the non-pipelined design is determined by a maximum path delay of 34.814 ns. This corresponds to a maximum clock speed of 28.72 MHz.

The clock period for the pipelined design is determined by a maximum path delay of 41.784 ns. This corresponds to a maximum clock speed of 23.93 MHz.

Thus, the maximum clock frequency decreased by about 5 MHz (roughly 17%). However, the pipelined design completes approximately 1 instruction per clock versus 1 instruction per 3 clocks in the non-pipelined design, so it executes 3 times as many instructions per clock. Accounting for the slower clock, the net instruction throughput is about 2.5 times higher (roughly 23.9 versus 9.6 million instructions per second).
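The throughput comparison can be checked with a few lines of arithmetic. The sketch below uses the two path delays reported above and assumes exactly 3 clocks per instruction for the non-pipelined design and 1 clock per instruction for the pipelined design (the approximate figures quoted in this report); the function names are hypothetical.

```python
# Throughput estimate from the reported critical-path delays.
# Assumption: CPI is exactly 3 (non-pipelined) and 1 (pipelined), which the
# report states only approximately; stalls and branches are ignored.
UNPIPELINED_DELAY_NS = 34.814
PIPELINED_DELAY_NS = 41.784


def fmax_mhz(path_delay_ns):
    """Maximum clock frequency in MHz for a given critical-path delay in ns."""
    return 1000.0 / path_delay_ns


def throughput_mips(path_delay_ns, clocks_per_instruction):
    """Millions of instructions per second at the maximum clock frequency."""
    return fmax_mhz(path_delay_ns) / clocks_per_instruction


unpipelined = throughput_mips(UNPIPELINED_DELAY_NS, 3.0)
pipelined = throughput_mips(PIPELINED_DELAY_NS, 1.0)
speedup = pipelined / unpipelined

if __name__ == "__main__":
    print(f"non-pipelined: {unpipelined:.2f} MIPS")
    print(f"pipelined:     {pipelined:.2f} MIPS")
    print(f"speedup:       {speedup:.2f}x")
```

With these inputs the speedup in instructions per second comes out to about 2.5, lower than the ideal factor of 3 because of the longer critical path in the pipelined design.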

The non-pipelined design produced the following resource utilization report:

| Site Type             | Used | Fixed | Prohibited | Available | Util% |
|-----------------------|------|-------|------------|-----------|-------|
| Slice LUTs            | 2150 | 0     | 0          | 14600     | 14.73 |
| LUT as Logic          | 2150 | 0     | 0          | 14600     | 14.73 |
| LUT as Memory         | 0    | 0     | 0          | 5000      | 0.00  |
| Slice Registers       | 975  | 0     | 0          | 29200     | 3.34  |
| Register as Flip Flop | 787  | 0     | 0          | 29200     | 2.70  |
| Register as Latch     | 188  | 0     | 0          | 29200     | 0.64  |
| F7 Muxes              | 257  | 0     | 0          | 7300      | 3.52  |
| F8 Muxes              | 128  | 0     | 0          | 3650      | 3.51  |

The pipelined design produced the following resource utilization report:

| Site Type             | Used | Fixed | Prohibited | Available | Util% |
|-----------------------|------|-------|------------|-----------|-------|
| Slice LUTs            | 23xx | 0     | 0          | 14600     | 15.95 |
| LUT as Logic          | 23xx | 0     | 0          | 14600     | 15.95 |
| LUT as Memory         | 0    | 0     | 0          | 5000      | 0.00  |
| Slice Registers       | 1232 | 0     | 0          | 29200     | 4.22  |
| Register as Flip Flop | 1067 | 0     | 0          | 29200     | 3.65  |
| Register as Latch     | 165  | 0     | 0          | 29200     | 0.57  |
| F7 Muxes              | 257  | 0     | 0          | 7300      | 3.52  |
| F8 Muxes              | 128  | 0     | 0          | 3650      | 3.51  |

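Therefore, the increase in resource utilization is marginal compared to the speedup: LUT utilization rises from 14.73% to 15.95% and register utilization from 3.34% to 4.22%, about one percentage point each.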
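To put the utilization change in perspective, the sketch below compares the Util% figures from the two reports, both as a percentage-point difference and as growth relative to the non-pipelined design. The dictionary and function names are hypothetical, and only the Slice LUT and Slice Register rows are included.

```python
# Util% values taken from the two utilization reports above:
# (non-pipelined, pipelined).
UTIL_PERCENT = {
    "Slice LUTs": (14.73, 15.95),
    "Slice Registers": (3.34, 4.22),
}


def utilization_change(before, after):
    """Return (percentage-point change, relative change in percent)."""
    delta = after - before
    return delta, delta / before * 100.0


if __name__ == "__main__":
    for site, (before, after) in UTIL_PERCENT.items():
        points, relative = utilization_change(before, after)
        print(f"{site}: +{points:.2f} points of the device, "
              f"+{relative:.1f}% relative to the non-pipelined design")
```

Measured against the whole device the increase is about one percentage point per resource type, although relative to the non-pipelined design the register count grows noticeably more than the LUT count, as expected from adding pipeline registers.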