## **VE370 HW5 Sample Solution**

## **Problem 1**

1. Input: Equal, ID.Branch

   Output: IF.Flush, PCWrite

#### 2. Not moved:

|                  | 1   | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 |
|------------------|-----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
| lw R2,0(R1)      | IF  | ID | EX | ME | WB |    |    |    |    |    |    |    |    |    |    |    |    |    |
| beq R2,R0,Label2 |     | IF | *  | ID | EX | ME | WB |    |    |    |    |    |    |    |    |    |    |    |
| lw R3,0(R2)      |     |    |    | IF | ID | EX | ME | WB |    |    |    |    |    |    |    |    |    |    |
| beq R3,R0,Label1 |     |    |    |    | IF | *  | ID | EX | ME | WB |    |    |    |    |    |    |    |    |
| beq R2,R0,Label2 |     |    |    |    |    |    |    |    |    | IF | ID | EX | ME | WB |    |    |    |    |
| sw R1,0(R2)      |     |    |    |    |    |    |    |    |    |    |    |    |    | IF | ID | EX | ME | WB |

#### Moved:

|                  | 1  | 2  | 3  | 4  | 5  | 6  | 7  | 8  | 9  | 10 | 11 | 12 | 13 | 14 | 15 | 16 |
|------------------|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
| lw R2,0(R1)      | IF | ID | EX | ME | WB |    |    |    |    |    |    |    |    |    |    |    |
| beq R2,R0,Label2 |    | IF | *  | *  | ID | EX | ME | WB |    |    |    |    |    |    |    |    |
| lw R3,0(R2)      |    |    |    |    | IF | ID | EX | ME | WB |    |    |    |    |    |    |    |
| beq R3,R0,Label1 |    |    |    |    |    | IF | *  | *  | ID | EX | ME | WB |    |    |    |    |
| beq R2,R0,Label2 |    |    |    |    |    |    |    |    |    | IF | ID | EX | ME | WB |    |    |
| sw R1,0(R2)      |    |    |    |    |    |    |    |    |    |    |    | IF | ID | EX | ME | WB |

Not moved: 18 clock cycles

Moved: 16 clock cycles

Speedup: 18/16 = 1.125
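As a quick check of the cycle counts, the sketch below derives each total from the cycle in which the last instruction (`sw`) is fetched, assuming a 5-stage pipeline and that the last instruction itself is not stalled, as in both tables. The helper name `total_cycles` is hypothetical and only used for illustration.

```python
PIPELINE_STAGES = 5  # IF, ID, EX, ME, WB


def total_cycles(last_if_cycle, stages=PIPELINE_STAGES):
    """Cycle in which the last instruction finishes WB.

    Assumes the last instruction flows through the pipeline without stalls,
    so it occupies `stages` consecutive cycles starting at its IF cycle.
    """
    return last_if_cycle + stages - 1


# IF cycle of `sw R1,0(R2)` read off the two tables above.
not_moved = total_cycles(14)
moved = total_cycles(12)
speedup = not_moved / moved

print(not_moved, moved, speedup)
```

The two totals match the last column of each table, and their ratio gives the speedup.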

3. `lw R2, 0(R1)`

   `beq R2, R0, Label2`

   `lw R3, 0(R2)`

   `beq R3, R0, Label1`

4. Because lw writes the register file in the first half of the clock cycle and beq reads it in the second half of the same cycle, no forwarding path is actually needed once the 2 stall cycles are inserted. Hence, any answer to this question receives full points.

## **Problem 2**

1. $2 \times (1 - 0.45) \times 0.3 = 0.33$
2. Assume that branch outcomes are determined in the EX stage and jump instructions are detected in the ID stage, that there are no data hazards, and that no delay slots are used.

$$2 \times (1 - 0.55) \times 0.3 + 1 \times 0.05 = 0.32$$
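The two results above can be reproduced with a short sketch. The problem statement is not reproduced in this solution, so the labels here are assumptions read off the formulas: 2 is taken to be the branch misprediction penalty in cycles, 0.45 and 0.55 the predictor accuracies, 0.3 the branch fraction, and 0.05 the jump fraction with a 1-cycle penalty. The function name `extra_cpi` is hypothetical.

```python
def extra_cpi(branch_fraction, accuracy, branch_penalty,
              jump_fraction=0.0, jump_penalty=0):
    """Extra CPI from control hazards.

    Mispredicted branches cost `branch_penalty` cycles each, and every jump
    costs `jump_penalty` cycles. Parameter meanings are assumptions inferred
    from the formulas in this solution.
    """
    mispredict_rate = 1.0 - accuracy
    branch_stalls = branch_penalty * mispredict_rate * branch_fraction
    jump_stalls = jump_penalty * jump_fraction
    return branch_stalls + jump_stalls


part1 = extra_cpi(branch_fraction=0.3, accuracy=0.45, branch_penalty=2)
part2 = extra_cpi(branch_fraction=0.3, accuracy=0.55, branch_penalty=2,
                  jump_fraction=0.05, jump_penalty=1)

print(round(part1, 2), round(part2, 2))
```

Keeping the branch and jump terms separate makes it easy to see that part 2 differs from part 1 only in the accuracy and the added jump term.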

## **Problem 3**

| Pattern      | Predictor State | Prediction | Correctness |
|--------------|-----------------|------------|-------------|
| First round  |                 |            |             |
| T            | SNT             | NT         | Incorrect   |
| NT           | NT              | NT         | Correct     |
| NT           | SNT             | NT         | Correct     |
| T            | SNT             | NT         | Incorrect   |
| T            | NT              | NT         | Incorrect   |
| T            | T               | T          | Correct     |
| T            | ST              | T          | Correct     |
| NT           | ST              | T          | Incorrect   |
| Other rounds |                 |            |             |
| T            | T               | T          | Correct     |
| NT           | ST              | T          | Incorrect   |
| NT           | T               | T          | Incorrect   |
| T            | NT              | NT         | Incorrect   |
| T            | T               | T          | Correct     |
| T            | ST              | T          | Correct     |
| T            | ST              | T          | Correct     |
| NT           | ST              | T          | Incorrect   |

Predictor states: SNT is the "strong" predict-not-taken state and ST is the "strong" predict-taken state; NT and T are the corresponding weak states.

If this pattern repeats forever, the accuracy is 50%: each round after the first gives 4 correct predictions out of 8.
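The table can be reproduced with a small simulation. This is a minimal sketch that assumes the predictor behaves as a 2-bit saturating counter (SNT, NT, T, ST mapped to 0..3), which is consistent with every state transition in the table, and that it starts in SNT. The names `simulate` and `PATTERN` are hypothetical.

```python
STATES = ["SNT", "NT", "T", "ST"]  # assumed counter values 0..3


def simulate(pattern, state=0):
    """Run a 2-bit saturating-counter predictor over `pattern`.

    Returns the table rows (outcome, state name, prediction, correct?)
    and the counter value after the last branch.
    """
    rows = []
    for outcome in pattern:
        prediction = "T" if state >= 2 else "NT"
        rows.append((outcome, STATES[state], prediction, prediction == outcome))
        # Move one step toward the actual outcome, saturating at 0 and 3.
        if outcome == "T":
            state = min(state + 1, 3)
        else:
            state = max(state - 1, 0)
    return rows, state


# Branch outcomes for one round, read off the "Pattern" column.
PATTERN = ["T", "NT", "NT", "T", "T", "T", "T", "NT"]

first_round, state = simulate(PATTERN, state=0)
steady_round, end_state = simulate(PATTERN, state=state)
accuracy = sum(row[3] for row in steady_round) / len(steady_round)

for row in first_round + steady_round:
    print(row)
print("steady-state accuracy:", accuracy)
```

Because the second round ends in the same state it started in, every later round repeats it, so the steady-state accuracy equals the accuracy of that one round.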