**CSE 140 Lab/HW#4 – 3/14 11:59PM**

# Pipelined Datapath and Data Hazards (30 pts + 40 pts)

1. Consider the following code.

addi $sp, $sp, -8 // I1 sw $ra, 4($sp) // I2

sw $a0, 0($sp) // I3 slti $t0, $a0, 1 // I4 addi $t0, $t0, 1 // I5

addi $sp, $sp, 8 // I6

* 1. Assume that we have a hazard detection unit in a 5-stage pipelined architecture as learned in the class. We do not have a forwarding unit. The hazard detection unit stalls instructions to resolve hazards. How many cycles would take to execute the code?
* 5-stage pipeline is the following: IF 🡪 ID 🡪 EX 🡪 MEM 🡪 WB

|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | **CC1** | **CC2** | **CC3** | **CC4** | **CC5** | **CC6** | **CC7** | **CC8** | **CC9** | **CC10** | **CC11** | **CC12** | **CC13** | **CC14** |
| **I1** | IF | ID | EX | MEM | WB |  |  |  |  |  |  |  |  |  |
| **I2** |  | IF | ID |  |  | EX | MEM | WB |  |  |  |  |  |  |
| **I3** |  |  | IF | ID |  |  |  |  | EX | MEM | WB |  |  |  |
| **I4** |  |  |  | IF | ID |  | EX | MEM | WB |  |  |  |  |  |
| **I5** |  |  |  |  | IF | ID |  |  |  | EX | MEM | WB |  |  |
| **I6** |  |  |  |  |  | IF | ID |  |  |  |  | EX | MEM | WB |

* The number of cycles it takes to execute these lines of code without a forwarding unit is a total of fourteen cycles.

* 1. Repeat a. but assume that we also have a forwarding unit that forwards data from EX/MEM to EXE and from MEM/WB to EXE.

|  |  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | **CC1** | **CC2** | **CC3** | **CC4** | **CC5** | **CC6** | **CC7** | **CC8** | **CC9** | **CC10** |
| **I1** | IF | ID | EX | MEM | WB |  |  |  |  |  |
| **I2** |  | IF | ID | EX | MEM | WB |  |  |  |  |
| **I3** |  |  | IF | ID | EX | MEM | WB |  |  |  |
| **I4** |  |  |  | IF | ID | EX | MEM | WB |  |  |
| **I5** |  |  |  |  | IF | ID | EX | MEM | WB |  |
| **I6** |  |  |  |  |  | IF | ID | EX | MEM | WB |

* The number of cycles it takes to execute it with a forwarding unit is a total of ten clock cycles.

1. Consider the following code.

|  |  |  |
| --- | --- | --- |
| I1: | LOOP: | lw $a0, 0($sp) |
| I2: |  | slti $t0, $a0, 1 |
| I3: |  | beq $t0, $zero, L1 |
| I4: |  | addi $v0, $zero, 1 |
| I5: |  | addi $sp, $sp, 8 |
| I6: | L1: | lw $ra, 4($sp) |
| I7: |  | addi $t4, $ra, 8 |
| I8: |  | sw $t4, 0($t1) |
| I9: |  | b LOOP |
| I10: |  | add $t4, $t4, $t3 |

Assume that we have data forwarding paths from EX/MEM to EXE stage and from MEM/WB to EXE stage. We do early branch determination (branch outcome is generated in ID stage and the PC value is updated in ID stage). Branches are predicted untaken but the actual outcomes are as shown in the table. Show the first 18-cycles of execution of the code in a timing diagram shown below. When an instruction is stalled in a pipeline stage, fill the pipeline stage’s name for the instruction until the end of stall (as shown on the page 44 of the lecture slide, “CSE140\_Lecture-4\_Processor-4”). Draw additional rows if needed. Show instruction id to the first column.

|  |  |  |
| --- | --- | --- |
| ID | Branch instructions | Branch outcome (NT: not taken, T: taken) |
| I3 | beq $t0, $zero, L1 | T at the first iteration  NT from the second iteration |
| I9 | b LOOP | Always T |

|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | **CC1** | **CC2** | **CC3** | **CC4** | **CC5** | **CC6** | **CC7** | **CC8** | **CC9** | **CC10** | **CC11** | **CC12** | **CC13** | **CC14** | **CC15** | **CC16** | **CC17** | **CC18** |
| **I1** | **IF** | **ID** | **EXE** | **MEM** | **WB** |  |  |  |  |  |  |  |  |  |  |  |  |  |
| **I2** |  | **IF** | **ID** | **ID** | **EXE** | **MEM** | **WB** |  |  |  |  |  |  |  |  |  |  |  |
| **I3** |  |  | **IF** | **IF** | **ID** | **EXE** | **MEM** | **WB** |  |  |  |  |  |  |  |  |  |  |
| **I4** |  |  |  |  | **IF** | **ID** | **EXE** | **MEM** | **WB** |  |  |  |  |  |  |  |  |  |
| **I5** |  |  |  |  |  | **IF** | **ID** | **EXE** | **MEM** | **WB** |  |  |  |  |  |  |  |  |
| **I6** |  |  |  |  |  |  | **IF** | **ID** | **EXE** | **MEM** | **WB** |  |  |  |  |  |  |  |
| **I7** |  |  |  |  |  |  |  | **IF** | **ID** | **EXE** | **MEM** | **WB** |  |  |  |  |  |  |
| **I8** |  |  |  |  |  |  |  |  | **IF** | **ID** | **EXE** | **MEM** | **WB** |  |  |  |  |  |
| **I9** |  |  |  |  |  |  |  |  |  | **IF** | **ID** | **ID** | **EXE** | **MEM** | **WB** |  |  |  |
| **I10** |  |  |  |  |  |  |  |  |  |  | **IF** | **IF** | **ID** | **EXE** | **MEM** | **WB** |  |  |

# Branch Predictor (30 pts)

3. Consider a branch instruction that has following outcomes:

NT, T, NT, T, NT, NT, NT, T, T, T, NT, T, NT, T

1. What is the misprediction rate of always-taken and always-not-taken predictors for this sequence of branch outcomes?

* Misprediction happens when a branch isnt able to be accepted.
* Always-taken: There are seven misprediction and there being a total of fourteen instructions we can calculate a misprediction rate of 0.5
* Always-not-taken: As before there are seven mispredictions and we get a misprediction rate of 7/14 = 0.5

1. What is the misprediction rate of one-bit branch predictor for this sequence of branch outcomes? The initial predictor state is 0.

* Since, we are finding the misprediction rate of one-bit branch predictor and we are beginning with 0 then we get 9 mispredictions. 9 mispredictions out of the 14 instructions will make a rate of 0.643.

1. What is the misprediction rate of two-bit branch predictor for this sequence of branch outcomes? The initial predictor state is 00.

* So, a 2-bit branch predictor has the following:

00 – strong not-taken 01 – weak not-taken

10 – weak taken 11 – strong taken

* The number of mispredictions are equal to five. So, five out of fourteen is equal to 0.357.

# Submission Guideline

* Submit your solution in an MS Word or a pdf format to the CatCourse.
* Deadline: **3/14 11:59PM** (All sections have the same deadline for this HW)