# **CprE 381 Homework 7**

[Note: The first couple of questions are intended to help you familiarize yourself with pipelining. The third question will help you understand data hazards and how to mitigate them. The final question will help you prepare for the exam by looking through lecture notes and online quizzes and then thinking which questions I might ask.]

## 1. Pipeline Simulation

For each of the modules from Figure 4.51 (ZyBooks Figure 4.7.16) that are listed in the table below, specify what the inputs and outputs are in each cycle for the following code. You do not need to specify values for those modules, inputs, or outputs not listed in the table. Manually simulate (i.e., fill out the table) until the sw has completed (i.e., left the write-back stage). [Hint: the table already has the first couple of cycles filled out. If a value depends on an instruction before or after the code below, report it as X.]

| Cycle | Instruction Memory |          |               | ,<br>Register File |              |               | ALU |            |                             |            | Memto<br>Reg<br>MUX |   |   | PCSrc MUX |            |   |
|-------|--------------------|----------|---------------|--------------------|--------------|---------------|-----|------------|-----------------------------|------------|---------------------|---|---|-----------|------------|---|
|       | Addr               | Instr    | Read<br>reg 1 | Read data 1        | Write<br>reg | Write<br>data | А   | В          | Op<br>(e.g,<br>add,<br>sub) | ALU result | 1                   | 0 | s | 1         | 0          | s |
| 1     | 0x00000010         | addi<br> | Х             | х                  | Х            | Х             | Х   | Х          | Х                           | Х          | х                   | х | х | Х         | 0x00000014 | 0 |
| 2     | 0x00000014         | lui<br>  | 0x00          | 0x00000000         | Х            | Х             | Х   | Х          | х                           | х          | Х                   | Х | х | Х         | 0x00000018 | 0 |
| 3     | 0x00000018         | xor<br>  | 0x00          | 0x00000000         | Х            | Х             | 0x0 | 0x0000003F | addi                        | 0x0000003F | Х                   | Х | Х | Х         | 0x0000001c | 0 |

#### [Pipelined MIPS – Simulation Table]

```
# Assume that $a0 = 3, $a1 = 1024, $a2 = 1023, $a3 = -1
# at the start of your manual simulation.
# Assume that lui is supported by the lui operation in
# the ALU and that the value shifted for lui is the B
# input of the ALU (note that this is likely different
# than your project implementation and that's OK).
# The following instructions start at address 0x00000010:
addi $t7, $zero, 63
lui $s7, 0x1010
xor $t5, $a2, $a3
sub $t6 $a0, $a1
```

```
sl1 $zero, $zero, 0
ori $s7, $s7, 0x0040
beq $t7, $a3, Exit
sl1 $zero, $zero, 0
sl1 $zero, $zero, 0
sl1 $zero, $zero, 0
sw $t7, 0($s7)
...
Exit: # This label resolves to address 0x00000100.
```

2. Data Dependencies, Hazard Detection, and Hazard Avoidance

[**WARNING**: The datapath described and used here is intentionally different from your project's datapath. The goal is for you to be able to understand the relationship between datapath design and hazards.]

a. Identify all the read after write data dependencies in the following code. Use the format (reading instruction address, register read/written, writing instruction addr).

```
# The following code starts at address
0x00400010 addiu $t0, $zero, 0 # clear i ($t0)
j cond
loop:
addu $t1, $a0, $t0
1b $t1, 0($t1) # load value to histogram addu
$t1, $a3, $t1 # calculate histogram bin 1b $t2,
0($t1) # load bin value
addiu $t2, $t2, 1 # increment bin value
sb $t2, 0($t1) # store bin value
addiu $t0, $t0, 1 # increment i
sltu $t1, $t0, $a1 # loop condition check (N in $a1) bne
$t1, $zero, loop
check:
addu $t1, $a3, $a2 # check bin value at input ($a2)
location
1b $t1, 0($t1)
sltiu $t1, $t1, 100
bne $t1, $zero, Exit
jal detection # jump to function (not shown) Exit:
```

 Identify all the read after write data dependencies in the following code. Use the format (reading instruction address, register read/written, writing instruction



b. The 5-stage MIPS pipeline described in lecture (and used for your term project) is not the only possible pipeline design. Consider the six stages described in the table below (this ordering is actually similar to the original MIPS pipeline described in a 1981 research paper). Using your answer to part 3a., please identify all possible data hazards occurring in the eight instructions following the loop label assuming the register file reads the new value being written into a specific register in a given cycle. Justify your answer with a pipeline diagram that shows possible backwards dependencies.

#### In spreadsheet

[Note: Only compute any data hazards that occur from the instruction at address 0x00400018 through address 0x00400034.]

| Stage Order | Stage Abbreviation     | Stage Function                                  |
|-------------|------------------------|-------------------------------------------------|
| 1           | IF (Instruction Fetch) | Load Instruction<br>from M[PC];<br>increment PC |

| 2 | ID (Instruction<br>Decode) | Generate control signals for instruction; read register operands; branch and jump Next PCs determined |
|---|----------------------------|-------------------------------------------------------------------------------------------------------|
| 3 | AG (Address<br>Generation) | Calculate addresses<br>for memory<br>operations                                                       |
| 4 | M (Memory Access)          | Access memory                                                                                         |
| 5 | EX (Execution)             | Perform any ALU operations                                                                            |
| 6 | WB (Writeback)             | Write results (ALU or memory operations) back to registers                                            |

#### c. Pipeline Hazard Avoidance (Software)

Insert the minimum number of NOP instructions in order to allow the instruction sequence used in 3b to run correctly on the 6-stage pipeline described above without any hardware hazard detection or forwarding logic. NOTE: the instruction sequence used does not have any control flow instructions! In spread sheet 33 NOPS

### d. Pipeline Hazard Avoidance (Forwarding Logic)

Now we want to implement forwarding logic in HW to improve the CPI and # of instructions executed in our implementation. Specify the forwarding condition for the address generation module's input A (assume this is the base register input for the base offset address mode). Use the same format as we did in lecture (see Lec10.1) where the ForwardA select signal is 00 for the ID/AG pipeline register, 01 for the M/EX pipeline register, and 10 for the EX/WB pipeline register. Demonstrate that your forwarding condition holds true by drawing and annotating a pipeline diagram for the execution of the first six instructions after the loop label (addu, 1b, addu, 1b, addiu, sb). if (EX/MEM.RegWrite

and  $(EX/MEM.RegisterRd \neq 0)$ 

#### and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) ForwardA = 10

if (EX/MEM.RegWrite
and (EX/MEM.RegisterRd ≠ 0)
and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) ForwardB = 10



In spreadsheet

### e. Why such a pipeline?

What would be a possible advantage of having the above six-stage pipeline (think about this pipeline implementing a more CISC-like ISA)? What would your average CPI be? (answer qualitatively rather than quantitatively)

The hope would be that there would be a lower critical path as you're splitting up the processor more. This means you can increase the frequency to increase the instructions per second.

#### 3. Exam Question

Develop your own exam question (roughly 10-15 points) from MIPS arithmetic, single cycle processor design, performance analysis, pipelining, or data hazards. Your question shouldn't simply ask students to recall information, but should ask for an application of a concept or require understanding of a concept or need analysis of a processor/application. You must include a correct and complete solution to your question. This question should be your own work and not copied from a book or an old exam. *Post your question AND solution to the Exam 2 channel for others to use.* 

Consider the program: addi \$t0, \$zero, 4 xor \$t1, \$t0, \$zero

subi \$t3, \$t0, \$zero beq \$t4, \$zero, end addi \$t4, \$t4, 7 andi \$t5, \$t0, 5 xor \$t9, \$t8, \$t4

Assume we are using the 5 cycle multicycle pipelined processor in class without forwarding.

- a) Are there any dependencies? Where and what kind?
- b) Are there any hazards? List them.
- c) What is a change in the code you could make to avoid a hazard?
- d) Say you wanted to implement forwarding. What would the conditions be for the select line of your mux

#### Solutions:

a)

```
addi $t0, $zero, 4 - NONE
xor $t1, $t0, $zero - addi
subi $t3, $t0, $zero - addi
beq $t4, $zero, end - NONE
addi $t4, $t4, 7 - beq, addi(itself)
andi $t5, $t0, 5 - addi(first)
xor $t9, $t8, $t4 - addi(second), beq
```

b)

1a. EX/MEM.RegisterRd = ID/EX.RegisterRs
1b. EX/MEM.RegisterRd = ID/EX.RegisterRt
2a. MEM/WB.RegisterRd = ID/EX.RegisterRs
2b. MEM/WB.RegisterRd = ID/EX.RegisterRt

addi \$t0, \$zero, 4 - NONE xor \$t1, \$t0, \$zero - 1a subi \$t3, \$t0, \$zero - 2a beq \$t4, \$zero, end - NONE addi \$t4, \$t4, 7 - NONE andi \$t5, \$t0, 5 - NONE xor \$t9, \$t8, \$t4 - 2b c) move the last xor instruction a head of the addi instruction so that it doesn't have to wait for \$t4 to writeback.

d)

# 1. EX hazard:

```
if (EX/MEM.RegWrite
and (EX/MEM.RegisterRd ≠ 0)
and (EX/MEM.RegisterRd = ID/EX.RegisterRs)) ForwardA = 10
if (EX/MEM.RegWrite)
```

```
if (EX/MEM.RegWrite
and (EX/MEM.RegisterRd ≠ 0)
and (EX/MEM.RegisterRd = ID/EX.RegisterRt)) ForwardB = 10
```

#### 2. MEM hazard:

```
if (MEM/WB.RegWrite
and (MEM/WB.RegisterRd = 0)
and (MEM/WB.RegisterRd = ID/EX.RegisterRs)) ForwardA = 01
```

```
if (MEM/WB.RegWrite
and (MEM/WB.RegisterRd = 0)
and (MEM/WB.RegisterRd = ID/EX.RegisterRt)) ForwardB = 01
```