#### 1. O1 to MUX6



#### 2. O0 to MUX7 and O1 to MUX6



#### 3. O0 to MUX8



#### 4. EX to MUX7



#### 5. Data cache to MUX9



#### 6. O0 to Zero detector



#### 7. Register File to MUX1



#### 8. Branch Target Buffer to MUX1



i) Clock cycles with ALU Forwarding Enabled = 10

Result of r1 = 15



ii)



iii) Clock cycles with No ALU Interlock = 10



Q3.

i) Clock cycles until halt = 51

Instructions executed = 39



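These two numbers are not equal because the first instruction requires 4 clock cycles. In addition, there are 2 ST instructions for every jump, and there are 4 jumps. Taken together, these account for the gap between the two figures: 4 + 2 × 4 = 12 = 51 − 39.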
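The arithmetic behind this explanation can be sketched in a few lines. This is only an illustrative model, not output from the simulator: it assumes the "4 clock cycles" are extra cycles spent before the first instruction completes, and that each jump costs 2 extra cycles. The function and parameter names are made up for this sketch.

```python
def total_cycles(instructions, fill_cycles, extra_per_jump, jumps):
    """Estimate clock cycles as one cycle per instruction plus overheads.

    Assumed model (not taken from the simulator itself):
    - fill_cycles: extra cycles needed by the first instruction
    - extra_per_jump: extra cycles lost on each jump
    """
    return instructions + fill_cycles + extra_per_jump * jumps


# Figures reported in part (i): 39 instructions, 4 jumps.
cycles = total_cycles(instructions=39, fill_cycles=4, extra_per_jump=2, jumps=4)
print(cycles)  # 51
```

Under these assumptions the model reproduces the 51 clock cycles reported above.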

ii) Clock cycles with Branch Interlock = 53



This number differs from part (i) because, with branch interlock, the branch target buffer is not used and a 1-tick delay is added instead. As a result, 2 additional clock cycles are needed (53 instead of 51).

iii) With Branch Prediction and the 2 shift instructions swapped.



This time, there are 9 fewer instructions executed and 2 fewer jumps, as there is a difference of 8 between instructions executed and clock cycles.

Execution time is shorter because fewer instructions are executed.
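As a rough check on the jump count, the same kind of overhead model can be run in reverse. This sketch assumes, as in part (i), 4 extra cycles for the first instruction and 2 extra cycles per jump. These are assumptions for illustration, and the helper name is hypothetical.

```python
def jumps_from_overhead(overhead, fill_cycles=4, extra_per_jump=2):
    """Infer the number of jumps from (clock cycles - instructions executed).

    Assumes overhead = fill_cycles + extra_per_jump * jumps.
    """
    remaining = overhead - fill_cycles
    if remaining < 0 or remaining % extra_per_jump != 0:
        raise ValueError("overhead does not fit the assumed model")
    return remaining // extra_per_jump


print(jumps_from_overhead(51 - 39))  # 4 jumps in part (i)
print(jumps_from_overhead(8))        # 2 jumps with the shifts swapped
```

A difference of 8 corresponds to 2 jumps under this model, which is 2 fewer than the 4 jumps in part (i).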