No branch delay slot.

`b` stands for bubble.

`-` means the instruction was squashed.

| Times          | T0  | T1  | T2  | T3  | T4  | T5  | T6  | T7  | T8  | T9  | T10 | T11 | T12 | T13 | T14 |
|----------------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| add r2 r2 r3   | IF0 | IF1 | ID  | X0  | WB  |     |     |     |     |     |     |     |     |     |     |
| add r3 r3 #1   |     | IF0 | IF1 | ID  | X0  | WB  |     |     |     |     |     |     |     |     |     |
| ld r4 [r3+r5]  |     |     | IF0 | IF1 | ID  | X0  | X1  | X2  | X3  | WB  |     |     |     |     |     |
| add r7 r7 r4   |     |     |     | IF0 | IF1 | ID  | ID  | ID  | ID  | X0  | WB  |     |     |     |     |
| cmp r3 r7      |     |     |     |     | IF0 | IF1 | IF1 | IF1 | IF1 | ID  | X0  | X1  | WB  |     |     |
| bne foo        |     |     |     |     |     | IF0 | IF0 | IF0 | IF0 | IF1 | ID  | ID  | ID  |     |     |
| ldr            |     |     |     |     |     |     | b   | b   | b   | IF0 | IF1 | IF1 | IF1 | -   |     |
| mov r0 #0      |     |     |     |     |     |     |     |     |     |     | IF0 | IF0 | IF0 | -   |     |
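
| Times          | T12 | T13 | T14 | T15 | T16 | T17 | T18 | T19 | T20 | T21 | T22 | T23 | T24 | T25 | T26 |
|----------------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|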
| str r2 [r3 #0] | b   | -   |     |     |     |     |     |     |     |     |     |     |     |     |     |
| add r2 r2 r3   |     | IF0 | IF1 | ID  | X0  | WB  |     |     |     |     |     |     |     |     |     |
| add r3 r3 #1   |     |     | IF0 | IF1 | ID  | X0  | WB  |     |     |     |     |     |     |     |     |
| ld r4 [r3+r5]  |     |     |     | IF0 | IF1 | ID  | X0  | X1  | X2  | X3  | WB  |     |     |     |     |
| add r7 r7 r4   |     |     |     |     | IF0 | IF1 | ID  | ID  | ID  | ID  | X0  | WB  |     |     |     |
| cmp r3 r7      |     |     |     |     |     | IF0 | IF1 | IF1 | IF1 | IF1 | ID  | X0  | X1  | WB  |     |
| bne foo        |     |     |     |     |     |     | IF0 | IF0 | IF0 | IF0 | IF1 | ID  | ID  | ID  |     |
| ldr            |     |     |     |     |     |     |     | b   | b   | b   | IF0 | IF1 | IF1 | IF1 | -   |

**3a)** Branch penalty: 4 cycles (IF1, ID, X0, X1)

Jump penalty: 3 cycles (IF1, ID, X0)
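The two penalties above can be counted mechanically. This is a minimal sketch, assuming the stage order shown in the load row of the diagram, and assuming (from the parenthesized stage lists) that branches resolve in X1 and jumps in X0. The name `redirect_penalty` is hypothetical.

```python
# Stage order read off the load row of the pipeline diagram.
STAGES = ["IF0", "IF1", "ID", "X0", "X1", "X2", "X3", "WB"]

def redirect_penalty(resolve_stage, stages=STAGES):
    """Count the stages from IF1 up to and including the resolving stage.

    Each of those cycles fetches an instruction that is later squashed.
    Assumption: there is no delay slot, as stated at the top of this answer.
    """
    return stages.index(resolve_stage)

branch_penalty = redirect_penalty("X1")  # IF1, ID, X0, X1
jump_penalty = redirect_penalty("X0")    # IF1, ID, X0
print(branch_penalty, jump_penalty)      # prints: 4 3
```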

CPI penalty: $0.1 \cdot 4 \cdot 0.7 + 0.1 \cdot 3 = 0.58$

The ideal CPI of a perfect pipeline is 1, so with this penalty the CPI is 1.58.
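The arithmetic can be checked with a short Python sketch. The function and parameter names are mine, and reading the first 0.1 as the branch frequency, 0.7 as the fraction of branches taken, and the second 0.1 as the jump frequency is an assumption inferred from the formula.

```python
def cpi_with_control_hazards(branch_freq, taken_frac, branch_penalty,
                             jump_freq, jump_penalty, ideal_cpi=1.0):
    """Return the ideal CPI plus the average stall cycles from branches and jumps."""
    branch_stalls = branch_freq * taken_frac * branch_penalty
    jump_stalls = jump_freq * jump_penalty
    return ideal_cpi + branch_stalls + jump_stalls

# Values assumed from the 3a formula: 0.1 * 4 * 0.7 + 0.1 * 3
cpi_3a = cpi_with_control_hazards(
    branch_freq=0.1, taken_frac=0.7, branch_penalty=4,
    jump_freq=0.1, jump_penalty=3,
)
print(round(cpi_3a, 2))  # prints: 1.58
```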

- **3b)** Fifty percent of branches have a useful delay slot, which scales the branch term from 3a by 0.5. CPI penalty: $0.1 \cdot 4 \cdot 0.5 \cdot 0.7 + 0.1 \cdot 3 = 0.44$, so the CPI is 1.44.
- **3c)** In this case there is no branch penalty, so the CPI penalty is $0.1 \cdot 3 = 0.3$. The ideal CPI plus the penalty is now 1.3.
- **3d)** Additional CPI penalties:

One cycle after LW/Other: $0.25 \cdot 4$

Two cycles after LW/Other: $0.125 \cdot 3$

$CPI = 1.58 + 0.25 \cdot 4 + 0.125 \cdot 3 = 2.955$
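The same sum as a small Python sketch. I read each term as (fraction of instructions affected) times (stall cycles), which is an assumption about what 0.25 and 0.125 stand for. The variable names are hypothetical.

```python
# CPI from 3a, which already includes the control-hazard penalty.
base_cpi = 1.58

# (frequency, stall cycles) pairs, assumed from the two terms above.
load_use_hazards = [
    (0.25, 4),   # dependent instruction one cycle after LW/Other
    (0.125, 3),  # dependent instruction two cycles after LW/Other
]

# Average data-hazard stall cycles per instruction.
data_stalls = sum(freq * cycles for freq, cycles in load_use_hazards)
cpi_3d = base_cpi + data_stalls
print(round(data_stalls, 3), round(cpi_3d, 3))  # prints: 1.375 2.955
```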

The takeaway: data hazards hurt far more than control hazards here, adding 1.375 to the CPI versus 0.58.

- **4a)** True. The mov instruction after bge executes all the way through, even though it should be stopped once bge reaches ID, since by then the pipeline knows it needs to jump to the add. There should be a nop there to prevent this.
- **4b)** True. Every type of instruction shown has the stages F1 F2 ID EX MM WB.
- **4c)** Unknown. The example above shows no situation where data forwarding would have to happen.
- **4d)** Unknown. Nothing in the example indicates whether it's a 32-bit architecture.
- **4e)** True. This is the control hazard referenced in part a.
- **4f)** False. Pipelining causes branch hazards. There will always be a bubble of at least size 2 because of the 2 fetch stages.