# Out-of-order Architectures

Andrea Bartolini <a.bartolini@unibo.it>

(Architettura dei) Calcolatori Elettronici, 2021/2022

# Concept of dynamic scheduling

```
fdiv.d f0, f2, f4
fadd.d f10, f0, f8
fsub.d f12, f8, f14
```

# Concept of dynamic scheduling

|                          | T1 | T2 | T3 | T4 | T5 | T6 | T7 | T8 | T9 | T10 | T11 |
|--------------------------|----|----|----|----|----|----|----|----|----|-----|-----|
| fdiv.d <b>f0</b>, f2, f4 | IF | ID | EX | EX |    |    |    |    |    |     |     |
| fadd.d f10, <b>f0</b>, f8 |    | IF | ID | ID |    |    |    |    |    |     |     |
| fsub.d f12, f8, f14      |    |    | IF | IF |    |    |    |    |    |     |     |

- True (RAW) dependency on f0: fadd.d must wait for the result of fdiv.d.
- fsub.d stalls even though it depends on neither fdiv.d nor fadd.d.
- What if we let fsub.d move around the stall and execute out of order?
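The observation above can be checked mechanically. The following is a minimal sketch, assuming each instruction is described by a hypothetical `(mnemonic, destination, sources)` tuple; it lists the RAW dependencies in the three-instruction sequence and shows that `fsub.d` has none.

```python
# Hypothetical encoding: (mnemonic, destination register, source registers).
program = [
    ("fdiv.d", "f0", ("f2", "f4")),
    ("fadd.d", "f10", ("f0", "f8")),
    ("fsub.d", "f12", ("f8", "f14")),
]

def raw_dependencies(program):
    """Return (consumer, producer, register) index triples for every RAW dependency."""
    deps = []
    for i, (_, _, sources) in enumerate(program):
        for src in sources:
            # The producer is the closest earlier instruction that writes `src`.
            for j in range(i - 1, -1, -1):
                if program[j][1] == src:
                    deps.append((i, j, src))
                    break
    return deps

deps = raw_dependencies(program)
# Instructions that consume no result of an earlier instruction.
independent = [program[i][0] for i in range(len(program))
               if not any(consumer == i for consumer, _, _ in deps)]
print(deps)
print(independent)
```

Only `fadd.d` appears as a consumer, so an in-order pipeline stalls `fsub.d` purely because of its position in the instruction stream.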

# Blocking / in-order pipelines

- ✓ Simple pipelines <u>fetch an instruction and issue it</u>
- ✗ <u>Unless there is a data dependence</u> between an instruction already in the pipeline and the fetched instruction that cannot be hidden with bypassing or forwarding.
  - Forwarding logic reduces the effective pipeline latency so that certain dependences do not result in hazards.
  - If the hazard is unavoidable, the hazard detection hardware stalls the pipeline.
  - No new instructions are fetched or issued until the dependence is cleared.
- Structural hazards and data hazards are checked in the ID stage.
- Static Scheduling:
  - The compiler reorders instructions to avoid/reduce stalls.
- Dynamic Scheduling:
  - The hardware rearranges the instruction execution to avoid/reduce stalls.
  - Handles dependencies not known at compile time
  - The same binary runs efficiently on multiple architectures

## Non-blocking / out-of-order pipelines

We want an <u>instruction to begin execution</u> as soon as its <u>operands are available</u>, even if a predecessor is stalled.

- ⇒ Separate the issue process into two parts:
  - checking for structural hazards
  - waiting for the absence of a data hazard.
- Instructions are then:
  1. decoded and issued in order, but
  2. executed as soon as their data operands are available
- ⇒ <u>out-of-order execution</u> ⇒ <u>out-of-order completion</u>.

# Non-blocking / out-of-order pipelines

The pipeline performs out-of-order execution, which implies out-of-order completion.

- ⇒ The ID stage is split into 2 stages:
  1. <u>Issue:</u> decode the instruction, check for structural hazards.
  2. <u>Read Operands:</u> wait until data hazards are resolved, then read the operands.
- ⇒ IF is unchanged.
- ⇒ EX is considered multicycle, with two events:
  - the instruction begins execution
  - the instruction completes execution

Between these two events the instruction is in execution.

# Scoreboarding

#### <u>Dynamically scheduled pipeline (in-order issue)</u>

- ⇒ All instructions pass through the issue stage in order, but
- ⇒ they can be stalled or bypass each other in the second stage (read operands) and thus enter execution out of order.

#### **Scoreboarding**

- ⇒ Allows instructions to execute out of order when there are sufficient resources and no data dependences.
- ⇒ Records data dependences:
  - i. this corresponds to instruction issue
  - ii. it replaces part of the ID step in the RISC-V pipeline
- ⇒ Determines when the instruction can read its operands and begin execution.
- ⇒ If the instruction cannot execute immediately, it monitors every change in the hardware and decides when the instruction can execute.
- ⇒ Controls when an instruction can write its result into the destination register.
- ⇒ All hazard detection and resolution are centralized in the scoreboard.

# Scoreboarding

- Keep track of instructions, functional units, and registers to handle hazards
  - Goal: CPI = 1
- "Pull out" independent instructions that come later in the "issue window".
- Wait queue after Issue to hold stalled instructions waiting for operands
  - It may not physically exist ⇒ it is implemented by the scoreboard!
- Multiple or pipelined functional units
  - recall: during issue, we stall on a structural hazard

## Scoreboard:

1. **Instruction status**: current step (of 4) for an instruction
2. **Register result status**: which FU writes a given register (used to set Qj, Qk when issuing)
3. **Functional unit (FU) status**, with the following fields:
   - i. Busy: unit busy or not
   - ii. Op: operation to perform (e.g., +, -)
   - iii. Fi: destination register
   - iv. Fj, Fk: source register numbers
   - v. Qj, Qk: FUs producing sources Fj, Fk
   - vi. Rj, Rk: flags indicating Fj, Fk are ready
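The three tables can be pictured as plain data structures. The sketch below is only an illustration, assuming Python dataclasses and dictionaries as containers (the names `FUStatus`, `fu_status`, `register_result` and `instruction_status` are hypothetical); the snapshot reproduces the moment in the example later in this section where the second load occupies the Integer unit and `fmul.d` waits for F2.

```python
from dataclasses import dataclass
from typing import Dict, Optional

@dataclass
class FUStatus:
    busy: bool = False
    op: Optional[str] = None   # operation to perform
    fi: Optional[str] = None   # destination register
    fj: Optional[str] = None   # first source register
    fk: Optional[str] = None   # second source register
    qj: Optional[str] = None   # FU producing Fj (None: no pending producer)
    qk: Optional[str] = None   # FU producing Fk
    rj: bool = False           # Fj ready and not yet read
    rk: bool = False           # Fk ready and not yet read

# Functional unit status: one entry per unit.
fu_status: Dict[str, FUStatus] = {
    name: FUStatus() for name in ("Integer", "Mult1", "Mult2", "Add", "Divide")
}
# Register result status: which FU will write each register (None: nobody).
register_result: Dict[str, Optional[str]] = {f"F{n}": None for n in range(0, 32, 2)}
# Instruction status: current step of each in-flight instruction.
instruction_status = {"ld.d F2": "Read Operands", "fmul.d F0": "Issue"}

# Snapshot: ld.d F2 in the Integer unit, fmul.d F0 in Mult1 waiting for F2.
fu_status["Integer"] = FUStatus(True, "Ld", "F2", None, "R3", None, None, False, True)
fu_status["Mult1"] = FUStatus(True, "Mult", "F0", "F2", "F4", "Integer", None, False, True)
register_result["F2"] = "Integer"
register_result["F0"] = "Mult1"

# Units that still wait for a producer are those with a non-empty Qj or Qk.
waiting = [name for name, fu in fu_status.items() if fu.qj or fu.qk]
print(waiting)
```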



# Stages of Scoreboard Control

## 1. Issue (I)

➤ decode the instruction, check for structural and WAW hazards, stall when necessary

## 2. Read Operands (RO)

➤ wait until there are no RAW hazards, then read the operands and send the operation to the FU

## 3. Execution (EX)

➤ the FU starts execution upon receiving the operands and notifies the scoreboard when it has completed

## 4. Write Result (WR)

➤ the scoreboard checks for WAR hazards and stalls the write to the register file to avoid them

## Hazard Detection:

## Structural hazards (I):

➤ stall issue while the required FU is busy (FU "reservation")

## WAW hazards (I):

➤ ensure that no previous active instruction has the same destination

## RAW hazards (RO):

➤ check that no previous active instruction writes a source register

## WAR hazards (WR):

➤ before writing a result, check if any previous instructions that haven't gone past RO need that register as a source

# Scoreboard pipeline control

| Status     | Wait until | Bookkeeping |
|------------|------------|-------------|
| Issue      | FU not busy && not Result(D) | Busy(FU) ← Y; Op(FU) ← op; Fi(FU) ← D; Fj(FU) ← S1; Fk(FU) ← S2; Qj ← Result(S1); Qk ← Result(S2); Rj ← !Qj; Rk ← !Qk; Result(D) ← FU |
| Read ops   | Rj && Rk | Rj ← N; Rk ← N |
| Executed   | FU done | |
| Write dest | ∀f((Fj(f) ≠ Fi(FU) \|\| Rj(f)=N) && (Fk(f) ≠ Fi(FU) \|\| Rk(f)=N)) | ∀f(if Qj(f)=FU, Rj(f) ← Y); ∀f(if Qk(f)=FU, Rk(f) ← Y); Result(Fi(FU)) ← 0; Busy(FU) ← N |
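The Issue and Read ops rows of the table translate almost literally into code. The following is a hedged sketch, assuming a dictionary per functional unit; the helper names (`new_fu`, `can_issue`, `issue`, `can_read`, `read_operands`) are hypothetical, and a source with no pending producer, including a missing operand, is simply treated as ready.

```python
def new_fu():
    return dict(busy=False, op=None, fi=None, fj=None, fk=None,
                qj=None, qk=None, rj=False, rk=False)

fus = {name: new_fu() for name in ("Integer", "Mult1", "Add")}
result = {}  # register result status: register -> FU that will write it

def can_issue(fu, dest):
    # Wait until: FU not busy (structural hazard) and no pending writer of D (WAW).
    return not fus[fu]["busy"] and result.get(dest) is None

def issue(fu, op, dest, s1, s2):
    # Bookkeeping of the Issue row.
    qj, qk = result.get(s1), result.get(s2)
    fus[fu].update(busy=True, op=op, fi=dest, fj=s1, fk=s2,
                   qj=qj, qk=qk, rj=qj is None, rk=qk is None)
    result[dest] = fu

def can_read(fu):
    # Wait until: Rj && Rk (no RAW hazard left).
    return fus[fu]["rj"] and fus[fu]["rk"]

def read_operands(fu):
    # Bookkeeping of the Read ops row.
    fus[fu].update(rj=False, rk=False)

issue("Integer", "Ld", "F2", None, "R3")       # ld.d   f2, 45(R3)
issue("Mult1", "Mult", "F0", "F2", "F4")       # fmul.d f0, f2, f4
structural_ok = can_issue("Integer", "F6")     # Integer unit is still busy
waw_ok = can_issue("Add", "F0")                # F0 already has a pending writer
free_ok = can_issue("Add", "F8")               # nothing prevents this issue
load_can_read = can_read("Integer")
mult_can_read = can_read("Mult1")              # must wait for F2 from Integer
print(structural_ok, waw_ok, free_ok, load_can_read, mult_can_read)
```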

## Write Destination

#### Wait until

$$\forall f\,\big((Fj(f) \neq Fi(FU) \lor Rj(f) = N) \land (Fk(f) \neq Fi(FU) \lor Rk(f) = N)\big)$$

#### Bookkeeping

$$\forall f\,(\text{if } Qj(f) = FU,\ Rj(f) \leftarrow Y);\quad \forall f\,(\text{if } Qk(f) = FU,\ Rk(f) \leftarrow Y);\quad Result(Fi(FU)) \leftarrow 0;\quad Busy(FU) \leftarrow N$$

#### The wait-until condition says:

- Fj(f) ≠ Fi(FU): does this FU write a register that another FU f uses as a source?
- Rj(f) = N: if so, f must no longer need the old value (it has already read the operand, or it is waiting for this very result). The same holds for Fk and Rk.

#### **The bookkeeping says:**

- if Qj(f) = FU, Rj(f) ← Y: mark the operand as ready for all consumer FUs (likewise for Qk and Rk)
- Result(Fi(FU)) ← 0: clear the entry in the register result table
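The WAR check and the write bookkeeping can be sketched as follows. This is an illustration under assumptions: the dictionary layout and the helpers `fu`, `can_write` and `write_result` are hypothetical, and the snapshot uses the situation from the example in which `fadd.d` (destination F6) has finished while `fdiv.d` has not yet read F6. Rj/Rk are set to N for units that have already read their operands, as the Read ops bookkeeping prescribes.

```python
def fu(busy=False, fi=None, fj=None, fk=None, qj=None, qk=None, rj=False, rk=False):
    return dict(busy=busy, fi=fi, fj=fj, fk=fk, qj=qj, qk=qk, rj=rj, rk=rk)

fus = {
    "Mult1":  fu(True, "F0", "F2", "F4"),                        # fmul.d, operands read
    "Add":    fu(True, "F6", "F8", "F2"),                        # fadd.d, operands read
    "Divide": fu(True, "F10", "F0", "F6", qj="Mult1", rk=True),  # fdiv.d, waits for F0
}
result = {"F0": "Mult1", "F6": "Add", "F10": "Divide"}

def can_write(name):
    # Wait until: no unit still has to read the old value of the destination (WAR).
    dest = fus[name]["fi"]
    return all((f["fj"] != dest or not f["rj"]) and (f["fk"] != dest or not f["rk"])
               for f in fus.values())

def write_result(name):
    # Bookkeeping: wake up consumers, clear the result entry, free the unit.
    for f in fus.values():
        if f["qj"] == name:
            f["rj"] = True
        if f["qk"] == name:
            f["rk"] = True
    result[fus[name]["fi"]] = None
    fus[name]["busy"] = False

add_blocked = can_write("Add")             # fdiv.d has not read F6 yet
mult_ok = can_write("Mult1")               # nobody needs the old F0
write_result("Mult1")                      # fmul.d writes F0, fdiv.d becomes ready
fus["Divide"].update(rj=False, rk=False)   # fdiv.d reads its operands
add_after = can_write("Add")               # now fadd.d may overwrite F6
print(add_blocked, mult_ok, add_after)
```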

## Scoreboard example

#### Code Sequence:

```
ld.d f6, 34(R2)
ld.d f2, 45(R3)
fmul.d f0, f2, f4
fsub.d f8, f6, f2
fdiv.d f10, f0, f6
fadd.d f6, f8, f2
```

#### **Functional Units:**

- 1 Integer, 2 FP multipliers, 1 FP adder, 1 FP divider

#### Pipeline Latencies:

- ld: 1 cycle ⇐ uses the Integer unit
- fmul: 10 cycles ⇐ uses an FP multiplier
- fsub, fadd: 2 cycles ⇐ use the FP adder
- fdiv: 40 cycles ⇐ uses the FP divider
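The cycle-by-cycle tables that follow can be cross-checked with a small simulator. What follows is a simplified sketch, not a hardware model: it assumes one issue per cycle, at most one stage transition per instruction per cycle, and that every decision in a cycle sees only events from earlier cycles. The `Instr` class and the `before` and `simulate` helpers are hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Instr:
    text: str
    unit: str                     # class of functional unit required
    dest: str
    srcs: Tuple[str, ...]         # source registers
    latency: int                  # execution cycles
    issue: Optional[int] = None
    read: Optional[int] = None
    finish: Optional[int] = None
    write: Optional[int] = None

def before(t, cycle):
    """True if the event stamped t happened in an earlier cycle."""
    return t is not None and t < cycle

def simulate(program, units):
    cycle = 0
    while any(i.write is None for i in program):
        cycle += 1
        # Issue: in order, one per cycle, stall on structural or WAW hazards.
        pending = [k for k, i in enumerate(program) if i.issue is None]
        if pending:
            k = pending[0]
            ins = program[k]
            active = [p for p in program[:k] if not before(p.write, cycle)]
            fu_free = sum(p.unit == ins.unit for p in active) < units[ins.unit]
            no_waw = all(p.dest != ins.dest for p in active)
            if fu_free and no_waw:
                ins.issue = cycle
        for k, ins in enumerate(program):
            earlier = program[:k]
            if before(ins.issue, cycle) and ins.read is None:
                # Read operands: the producer of each source must have written (RAW).
                producers = [next((p for p in reversed(earlier) if p.dest == s), None)
                             for s in ins.srcs]
                if all(p is None or before(p.write, cycle) for p in producers):
                    ins.read = cycle
                    ins.finish = cycle + ins.latency  # cycle in which execution ends
            elif before(ins.finish, cycle) and ins.write is None:
                # Write result: earlier readers of dest must have read it (WAR).
                if all(before(p.read, cycle) for p in earlier if ins.dest in p.srcs):
                    ins.write = cycle
    return [(i.issue, i.read, i.finish, i.write) for i in program]

program = [
    Instr("ld.d f6, 34(R2)", "Integer", "F6", ("R2",), 1),
    Instr("ld.d f2, 45(R3)", "Integer", "F2", ("R3",), 1),
    Instr("fmul.d f0, f2, f4", "Mult", "F0", ("F2", "F4"), 10),
    Instr("fsub.d f8, f6, f2", "Add", "F8", ("F6", "F2"), 2),
    Instr("fdiv.d f10, f0, f6", "Divide", "F10", ("F0", "F6"), 40),
    Instr("fadd.d f6, f8, f2", "Add", "F6", ("F8", "F2"), 2),
]
units = {"Integer": 1, "Mult": 2, "Add": 1, "Divide": 1}
timeline = simulate(program, units)
for ins, row in zip(program, timeline):
    print(f"{ins.text:20s}", row)
```

Under these assumptions the printed (Issue, Read, Finish, Write) tuples agree with the instruction status tables in this example up to cycle 21. The later values (for instance `fadd.d` writing only after `fdiv.d` has read F6) are what the sketch computes from the same rules, not figures stated in the tables shown here.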





**CLOCK: 1) Fill in FU status, mark FU producing results.** 

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    |        |       |
| ld.d   | F2  | 45+ | R3 |       |      |        |       |
| fmul.d | F0  | F2  | F4 |       |      |        |       |
| fsub.d | F8  | F6  | F2 |       |      |        |       |
| fdiv.d | F10 | F0  | F6 |       |      |        |       |
| fadd.d | F6  | F8  | F2 |       |      |        |       |

#### **Functional unit status**

| Time | Name    | Busy | Op | Fi | Fj | Fk | Qj | Qk | Rj | Rk  |
|------|---------|------|----|----|----|----|----|----|----|-----|
|      | Integer | Yes  | Ld | F6 |    | R2 |    |    |    | Yes |
|      | Mult1   | No   |    |    |    |    |    |    |    |     |
|      | Mult2   | No   |    |    |    |    |    |    |    |     |
|      | Add     | No   |    |    |    |    |    |    |    |     |
|      | Divide  | No   |    |    |    |    |    |    |    |     |

### Register result status

|    | F0 | F2 | F4 | F6  | F8 | F10 | F12 | ... | F30 |
|----|----|----|----|-----|----|-----|-----|-----|-----|
| FU |    |    |    | Int |    |     |     |     |     |

CLOCK: 2) Issue the 2nd ld?

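#### Instruction status

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      |       |
| ld.d   | F2  | 45+ | R3 |       |      |        |       |
| fmul.d | F0  | F2  | F4 |       |      |        |       |
| fsub.d | F8  | F6  | F2 |       |      |        |       |
| fdiv.d | F10 | F0  | F6 |       |      |        |       |
| fadd.d | F6  | F8  | F2 |       |      |        |       |

#### Functional unit status

| Time | Name    | Busy | Op | Fi | Fj | Fk | Qj | Qk | Rj | Rk  |
|------|---------|------|----|----|----|----|----|----|----|-----|
|      | Integer | Yes  | Ld | F6 |    | R2 |    |    |    | Yes |
|      | Mult1   | No   |    |    |    |    |    |    |    |     |
|      | Mult2   | No   |    |    |    |    |    |    |    |     |
|      | Add     | No   |    |    |    |    |    |    |    |     |
|      | Divide  | No   |    |    |    |    |    |    |    |     |

#### Register result status

|    | F0 | F2 | F4 | F6  | F8 | F10 | F12 | ... | F30 |
|----|----|----|----|-----|----|-----|-----|-----|-----|
| FU |    |    |    | Int |    |     |     |     |     |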

**CLOCK:** 3) Single cycle load completes. What if we have a multi-cycle load?

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 |       |      |        |       |
| fmul.d | F0  | F2  | F4 |       |      |        |       |
| fsub.d | F8  | F6  | F2 |       |      |        |       |
| fdiv.d | F10 | F0  | F6 |       |      |        |       |
| fadd.d | F6  | F8  | F2 |       |      |        |       |

#### **Functional unit status**

| Time | Name    | Busy | Op | Fi | Fj | Fk | Qj | Qk | Rj | Rk  |
|------|---------|------|----|----|----|----|----|----|----|-----|
|      | Integer | Yes  | Ld | F6 |    | R2 |    |    |    | Yes |
|      | Mult1   | No   |    |    |    |    |    |    |    |     |
|      | Mult2   | No   |    |    |    |    |    |    |    |     |
|      | Add     | No   |    |    |    |    |    |    |    |     |
|      | Divide  | No   |    |    |    |    |    |    |    |     |

### Register result status

|    | F0 | F2 | F4 | F6  | F8 | F10 | F12 | ... | F30 |
|----|----|----|----|-----|----|-----|-----|-----|-----|
| FU |    |    |    | Int |    |     |     |     |     |

CLOCK: 4) load completes, write result (WAR?).

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     |      |        |       |
| fmul.d | F0  | F2  | F4 |       |      |        |       |
| fsub.d | F8  | F6  | F2 |       |      |        |       |
| fdiv.d | F10 | F0  | F6 |       |      |        |       |
| fadd.d | F6  | F8  | F2 |       |      |        |       |

#### **Functional unit status**

| Time | Name    | Busy | Op | Fi | Fj | Fk | Qj | Qk | Rj | Rk  |
|------|---------|------|----|----|----|----|----|----|----|-----|
|      | Integer | Yes  | Ld | F2 |    | R3 |    |    |    | Yes |
|      | Mult1   | No   |    |    |    |    |    |    |    |     |
|      | Mult2   | No   |    |    |    |    |    |    |    |     |
|      | Add     | No   |    |    |    |    |    |    |    |     |
|      | Divide  | No   |    |    |    |    |    |    |    |     |

#### Register result status

|    | F0 | F2  | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|----|----|-----|----|----|----|-----|-----|-----|-----|
| FU |    | Int |    |    |    |     |     |     |     |

CLOCK: 5) structural hazard for Int unit resolved. 2<sup>nd</sup> load can issue, FMUL is fetched.

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     | 6    |        |       |
| fmul.d | F0  | F2  | F4 | 6     |      |        |       |
| fsub.d | F8  | F6  | F2 |       |      |        |       |
| fdiv.d | F10 | F0  | F6 |       |      |        |       |
| fadd.d | F6  | F8  | F2 |       |      |        |       |

#### Functional unit status

| Time | Name    | Busy | Op   | Fi | Fj | Fk | Qj  | Qk | Rj | Rk  |  |
|------|---------|------|------|----|----|----|-----|----|----|-----|--|
|      | Integer | Yes  | Ld   | F2 |    | R3 |     |    |    | Yes |  |
|      | Mult1   | Yes  | Mult | F0 | F2 | F4 | Int |    | No | Yes |  |
|      | Mult2   | No   |      |    |    |    |     |    |    |     |  |
|      | Add     | No   |      |    |    |    |     |    |    |     |  |
|      | Divide  | No   |      |    |    |    |     |    |    |     |  |

#### Register result status

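|    | F0    | F2  | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|----|-------|-----|----|----|----|-----|-----|-----|-----|
| FU | Mult1 | Int |    |    |    |     |     |     |     |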

**CLOCK**: 6) - LD operands available, issue FMUL. Can the FSUB issue on the next cycle?

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     | 6    | 7      |       |
| fmul.d | F0  | F2  | F4 | 6     |      |        |       |
| fsub.d | F8  | F6  | F2 | 7     |      |        |       |
| fdiv.d | F10 | F0  | F6 |       |      |        |       |
| fadd.d | F6  | F8  | F2 |       |      |        |       |

#### **Functional unit status**

| Time | Name    | Busy | Op   | Fi | Fj | Fk | Qj  | Qk  | Rj  | Rk  |
|------|---------|------|------|----|----|----|-----|-----|-----|-----|
|      | Integer | Yes  | Ld   | F2 |    | R3 |     |     |     | Yes |
|      | Mult1   | Yes  | Mult | F0 | F2 | F4 | Int |     | No  | Yes |
|      | Mult2   | No   |      |    |    |    |     |     |     |     |
|      | Add     | Yes  | Sub  | F8 | F6 | F2 |     | Int | Yes | No  |
|      | Divide  | No   |      |    |    |    |     |     |     |     |

### Register result status

|    | F0    | F2  | F4 | F6 | F8  | F10 | F12 | ... | F30 |
|----|-------|-----|----|----|-----|-----|-----|-----|-----|
| FU | Mult1 | Int |    |    | Add |     |     |     |     |

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     | 6    | 7      |       |
| fmul.d | F0  | F2  | F4 | 6     |      |        |       |
| fsub.d | F8  | F6  | F2 | 7     |      |        |       |
| fdiv.d | F10 | F0  | F6 | 8     |      |        |       |
| fadd.d | F6  | F8  | F2 |       |      |        |       |

### **Functional unit status**

| Time | Name    | Busy | Op   | Fi  | Fj | Fk | Qj  | Qk  | Rj  | Rk  |
|------|---------|------|------|-----|----|----|-----|-----|-----|-----|
|      | Integer | Yes  | Ld   | F2  |    | R3 |     |     |     | Yes |
|      | Mult1   | Yes  | Mult | F0  | F2 | F4 | Int |     | No  | Yes |
|      | Mult2   | No   |      |     |    |    |     |     |     |     |
|      | Add     | Yes  | Sub  | F8  | F6 | F2 |     | Int | Yes | No  |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | M1  |     | No  | Yes |

### Register result status

|    | F0    | F2  | F4 | F6 | F8  | F10 | F12 | ... | F30 |
|----|-------|-----|----|----|-----|-----|-----|-----|-----|
| FU | Mult1 | Int |    |    | Add | Div |     |     |     |

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     | 6    | 7      | 8     |
| fmul.d | F0  | F2  | F4 | 6     |      |        |       |
| fsub.d | F8  | F6  | F2 | 7     |      |        |       |
| fdiv.d | F10 | F0  | F6 | 8     |      |        |       |
| fadd.d | F6  | F8  | F2 |       |      |        |       |

#### **Functional unit status**

| Time | Name    | Busy | Op   | Fi  | Fj | Fk | Qj | Qk | Rj  | Rk  |
|------|---------|------|------|-----|----|----|----|----|-----|-----|
|      | Integer | No   |      |     |    |    |    |    |     |     |
|      | Mult1   | Yes  | Mult | F0  | F2 | F4 |    |    | Yes | Yes |
|      | Mult2   | No   |      |     |    |    |    |    |     |     |
|      | Add     | Yes  | Sub  | F8  | F6 | F2 |    |    | Yes | Yes |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | M1 |    | No  | Yes |

### Register result status

|    | F0    | F2 | F4 | F6 | F8  | F10 | F12 | ... | F30 |
|----|-------|----|----|----|-----|-----|-----|-----|-----|
| FU | Mult1 |    |    |    | Add | Div |     |     |     |

**CLOCK: 8(b) (the LD completes. FMUL and FADD ready.)** 

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     | 6    | 7      | 8     |
| fmul.d | F0  | F2  | F4 | 6     | 9    |        |       |
| fsub.d | F8  | F6  | F2 | 7     | 9    |        |       |
| fdiv.d | F10 | F0  | F6 | 8     |      |        |       |
| fadd.d | F6  | F8  | F2 |       |      |        |       |

#### Functional unit status

| Time | Name    | Busy | Op   | Fi  | Fj | Fk | Qj | Qk | Rj  | Rk  |
|------|---------|------|------|-----|----|----|----|----|-----|-----|
|      | Integer | No   |      |     |    |    |    |    |     |     |
| 10   | Mult1   | Yes  | Mult | F0  | F2 | F4 |    |    | Yes | Yes |
|      | Mult2   | No   |      |     |    |    |    |    |     |     |
| 2    | Add     | Yes  | Sub  | F8  | F6 | F2 |    |    | Yes | Yes |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | M1 |    | No  | Yes |

#### Register result status

|    | F0    | F2 | F4 | F6 | F8  | F10 | F12 | ... | F30 |
|----|-------|----|----|----|-----|-----|-----|-----|-----|
| FU | Mult1 |    |    |    | Add | Div |     |     |     |

#### Instruction status

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     | 6    | 7      | 8     |
| fmul.d | F0  | F2  | F4 | 6     | 9    |        |       |
| fsub.d | F8  | F6  | F2 | 7     | 9    | 11     |       |
| fdiv.d | F10 | F0  | F6 | 8     |      |        |       |
| fadd.d | F6  | F8  | F2 |       |      |        |       |

#### Functional unit status

| Time | Name    | Busy | Op   | Fi  | Fj | Fk | Qj | Qk | Rj  | Rk  |
|------|---------|------|------|-----|----|----|----|----|-----|-----|
|      | Integer | No   |      |     |    |    |    |    |     |     |
| 8    | Mult1   | Yes  | Mult | F0  | F2 | F4 |    |    | Yes | Yes |
|      | Mult2   | No   |      |     |    |    |    |    |     |     |
| 0    | Add     | Yes  | Sub  | F8  | F6 | F2 |    |    | Yes | Yes |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | M1 |    | No  | Yes |

### Register result status

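|    | F0    | F2 | F4 | F6 | F8  | F10 | F12 | ... | F30 |
|----|-------|----|----|----|-----|-----|-----|-----|-----|
| FU | Mult1 |    |    |    | Add | Div |     |     |     |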

#### Instruction status

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     | 6    | 7      | 8     |
| fmul.d | F0  | F2  | F4 | 6     | 9    |        |       |
| fsub.d | F8  | F6  | F2 | 7     | 9    | 11     | 12    |
| fdiv.d | F10 | F0  | F6 | 8     |      |        |       |
| fadd.d | F6  | F8  | F2 |       |      |        |       |

#### Functional unit status

| Time | Name    | Busy | Op   | Fi  | Fj | Fk | Qj | Qk | Rj  | Rk  |
|------|---------|------|------|-----|----|----|----|----|-----|-----|
|      | Integer | No   |      |     |    |    |    |    |     |     |
| 7    | Mult1   | Yes  | Mult | F0  | F2 | F4 |    |    | Yes | Yes |
|      | Mult2   | No   |      |     |    |    |    |    |     |     |
|      | Add     | No   |      |     |    |    |    |    |     |     |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | M1 |    | No  | Yes |

#### Register result status

|    | F0    | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|----|-------|----|----|----|----|-----|-----|-----|-----|
| FU | Mult1 |    |    |    |    | Div |     |     |     |

**CLOCK: 12) - FSUB completes before FMUL. Can we read operands for FDIV? What about FADD?** 

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     | 6    | 7      | 8     |
| fmul.d | F0  | F2  | F4 | 6     | 9    |        |       |
| fsub.d | F8  | F6  | F2 | 7     | 9    | 11     | 12    |
| fdiv.d | F10 | F0  | F6 | 8     |      |        |       |
| fadd.d | F6  | F8  | F2 | 13    |      |        |       |

#### Functional unit status

| Time | Name    | Busy | Op   | Fi  | Fj | Fk | Qj | Qk | Rj  | Rk  |
|------|---------|------|------|-----|----|----|----|----|-----|-----|
|      | Integer | No   |      |     |    |    |    |    |     |     |
| 6    | Mult1   | Yes  | Mult | F0  | F2 | F4 |    |    | Yes | Yes |
|      | Mult2   | No   |      |     |    |    |    |    |     |     |
|      | Add     | Yes  | Add  | F6  | F8 | F2 |    |    | Yes | Yes |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | M1 |    | No  | Yes |

### Register result status

|    | F0    | F2 | F4 | F6  | F8 | F10 | F12 | ... | F30 |
|----|-------|----|----|-----|----|-----|-----|-----|-----|
| FU | Mult1 |    |    | Add |    | Div |     |     |     |

**CLOCK: 13) - FADD structural hazard cleared.** 

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     | 6    | 7      | 8     |
| fmul.d | F0  | F2  | F4 | 6     | 9    |        |       |
| fsub.d | F8  | F6  | F2 | 7     | 9    | 11     | 12    |
| fdiv.d | F10 | F0  | F6 | 8     |      |        |       |
| fadd.d | F6  | F8  | F2 | 13    | 14   |        |       |

### Functional unit status

| Time | Name    | Busy | Op   | Fi  | Fj | Fk | Qj | Qk | Rj  | Rk  |
|------|---------|------|------|-----|----|----|----|----|-----|-----|
|      | Integer | No   |      |     |    |    |    |    |     |     |
| 5    | Mult1   | Yes  | Mult | F0  | F2 | F4 |    |    | Yes | Yes |
|      | Mult2   | No   |      |     |    |    |    |    |     |     |
| 2    | Add     | Yes  | Add  | F6  | F8 | F2 |    |    | Yes | Yes |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | M1 |    | No  | Yes |

### Register result status

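|    | F0    | F2 | F4 | F6  | F8 | F10 | F12 | ... | F30 |
|----|-------|----|----|-----|----|-----|-----|-----|-----|
| FU | Mult1 |    |    | Add |    | Div |     |     |     |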

**CLOCK: 14) - FADD reads operands.** 

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     | 6    | 7      | 8     |
| fmul.d | F0  | F2  | F4 | 6     | 9    |        |       |
| fsub.d | F8  | F6  | F2 | 7     | 9    | 11     | 12    |
| fdiv.d | F10 | F0  | F6 | 8     |      |        |       |
| fadd.d | F6  | F8  | F2 | 13    | 14   |        |       |

#### Functional unit status

| Time | Name    | Busy | Op   | Fi  | Fj | Fk | Qj | Qk | Rj  | Rk  |
|------|---------|------|------|-----|----|----|----|----|-----|-----|
|      | Integer | No   |      |     |    |    |    |    |     |     |
| 4    | Mult1   | Yes  | Mult | F0  | F2 | F4 |    |    | Yes | Yes |
|      | Mult2   | No   |      |     |    |    |    |    |     |     |
| 1    | Add     | Yes  | Add  | F6  | F8 | F2 |    |    | Yes | Yes |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | M1 |    | No  | Yes |

### Register result status

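|    | F0    | F2 | F4 | F6  | F8 | F10 | F12 | ... | F30 |
|----|-------|----|----|-----|----|-----|-----|-----|-----|
| FU | Mult1 |    |    | Add |    | Div |     |     |     |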

**CLOCK: 15)** 

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     | 6    | 7      | 8     |
| fmul.d | F0  | F2  | F4 | 6     | 9    |        |       |
| fsub.d | F8  | F6  | F2 | 7     | 9    | 11     | 12    |
| fdiv.d | F10 | F0  | F6 | 8     |      |        |       |
| fadd.d | F6  | F8  | F2 | 13    | 14   | 16     |       |

#### **Functional unit status**

| Time | Name    | Busy | Op   | Fi  | Fj | Fk | Qj | Qk | Rj  | Rk  |
|------|---------|------|------|-----|----|----|----|----|-----|-----|
|      | Integer | No   |      |     |    |    |    |    |     |     |
| 2    | Mult1   | Yes  | Mult | F0  | F2 | F4 |    |    | Yes | Yes |
|      | Mult2   | No   |      |     |    |    |    |    |     |     |
|      | Add     | Yes  | Add  | F6  | F8 | F2 |    |    | Yes | Yes |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | M1 |    | No  | Yes |

### Register result status

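|    | F0    | F2 | F4 | F6  | F8 | F10 | F12 | ... | F30 |
|----|-------|----|----|-----|----|-----|-----|-----|-----|
| FU | Mult1 |    |    | Add |    | Div |     |     |     |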

**CLOCK: 16) - FADD finishes.** 

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     | 6    | 7      | 8     |
| fmul.d | F0  | F2  | F4 | 6     | 9    |        |       |
| fsub.d | F8  | F6  | F2 | 7     | 9    | 11     | 12    |
| fdiv.d | F10 | F0  | F6 | 8     |      |        |       |
| fadd.d | F6  | F8  | F2 | 13    | 14   | 16     |       |

#### Functional unit status

| Time | Name    | Busy | Op   | Fi  | Fj | Fk | Qj | Qk | Rj  | Rk  |
|------|---------|------|------|-----|----|----|----|----|-----|-----|
|      | Integer | No   |      |     |    |    |    |    |     |     |
| 1    | Mult1   | Yes  | Mult | F0  | F2 | F4 |    |    | Yes | Yes |
|      | Mult2   | No   |      |     |    |    |    |    |     |     |
|      | Add     | Yes  | Add  | F6  | F8 | F2 |    |    | Yes | Yes |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | M1 |    | No  | Yes |

### Register result status

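|    | F0    | F2 | F4 | F6  | F8 | F10 | F12 | ... | F30 |
|----|-------|----|----|-----|----|-----|-----|-----|-----|
| FU | Mult1 |    |    | Add |    | Div |     |     |     |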

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     | 6    | 7      | 8     |
| fmul.d | F0  | F2  | F4 | 6     | 9    | 19     |       |
| fsub.d | F8  | F6  | F2 | 7     | 9    | 11     | 12    |
| fdiv.d | F10 | F0  | F6 | 8     |      |        |       |
| fadd.d | F6  | F8  | F2 | 13    | 14   | 16     |       |

#### **Functional unit status**

| Time | Name    | Busy | Op   | Fi  | Fj | Fk | Qj | Qk | Rj  | Rk  |
|------|---------|------|------|-----|----|----|----|----|-----|-----|
|      | Integer | No   |      |     |    |    |    |    |     |     |
| 0    | Mult1   | Yes  | Mult | F0  | F2 | F4 |    |    | Yes | Yes |
|      | Mult2   | No   |      |     |    |    |    |    |     |     |
|      | Add     | Yes  | Add  | F6  | F8 | F2 |    |    | Yes | Yes |
|      | Divide  | Yes  | Div  | F10 | F0 | F6 | M1 |    | No  | Yes |

### Register result status

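|    | F0    | F2 | F4 | F6  | F8 | F10 | F12 | ... | F30 |
|----|-------|----|----|-----|----|-----|-----|-----|-----|
| FU | Mult1 |    |    | Add |    | Div |     |     |     |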

**CLOCK: 19) - FMUL finishes after FSUB.** 

#### Instruction status

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     | 6    | 7      | 8     |
| fmul.d | F0  | F2  | F4 | 6     | 9    | 19     | 20    |
| fsub.d | F8  | F6  | F2 | 7     | 9    | 11     | 12    |
| fdiv.d | F10 | F0  | F6 | 8     |      |        |       |
| fadd.d | F6  | F8  | F2 | 13    | 14   | 16     |       |

#### Functional unit status

| Time | Name    | Busy | Op  | Fi  | Fj | Fk | Qj | Qk | Rj  | Rk  |
|------|---------|------|-----|-----|----|----|----|----|-----|-----|
|      | Integer | No   |     |     |    |    |    |    |     |     |
|      | Mult1   | No   |     |     |    |    |    |    |     |     |
|      | Mult2   | No   |     |     |    |    |    |    |     |     |
|      | Add     | Yes  | Add | F6  | F8 | F2 |    |    | Yes | Yes |
|      | Divide  | Yes  | Div | F10 | F0 | F6 |    |    | Yes | Yes |

#### Register result status

|    | F0 | F2 | F4 | F6  | F8 | F10 | F12 | ... | F30 |
|----|----|----|----|-----|----|-----|-----|-----|-----|
| FU |    |    |    | Add |    | Div |     |     |     |

**CLOCK: 20) - FMUL completes, FDIV ready to go** 

#### Instruction status

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     | 6    | 7      | 8     |
| fmul.d | F0  | F2  | F4 | 6     | 9    | 19     | 20    |
| fsub.d | F8  | F6  | F2 | 7     | 9    | 11     | 12    |
| fdiv.d | F10 | F0  | F6 | 8     | 21   |        |       |
| fadd.d | F6  | F8  | F2 | 13    | 14   | 16     |       |

#### Functional unit status

| Time | Name    | Busy | Op  | Fi  | Fj | Fk | Qj | Qk | Rj  | Rk  |
|------|---------|------|-----|-----|----|----|----|----|-----|-----|
|      | Integer | No   |     |     |    |    |    |    |     |     |
|      | Mult1   | No   |     |     |    |    |    |    |     |     |
|      | Mult2   | No   |     |     |    |    |    |    |     |     |
|      | Add     | Yes  | Add | F6  | F8 | F2 |    |    | Yes | Yes |
|      | Divide  | Yes  | Div | F10 | F0 | F6 |    |    | Yes | Yes |

#### Register result status

|    | F0 | F2 | F4 | F6  | F8 | F10 | F12 | ... | F30 |
|----|----|----|----|-----|----|-----|-----|-----|-----|
| FU |    |    |    | Add |    | Div |     |     |     |

**CLOCK: 21 - FDIV reads its operands. Can FADD write its result on the next cycle?**

### Instruction status

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     | 6    | 7      | 8     |
| fmul.d | F0  | F2  | F4 | 6     | 9    | 19     | 20    |
| fsub.d | F8  | F6  | F2 | 7     | 9    | 11     | 12    |
| fdiv.d | F10 | F0  | F6 | 8     | 21   |        |       |
| fadd.d | F6  | F8  | F2 | 13    | 14   | 16     | 22    |

### Functional unit status

| Time | Name    | Busy | Op  | Fi  | Fj | Fk | Qj | Qk | Rj  | Rk  |
|------|---------|------|-----|-----|----|----|----|----|-----|-----|
|      | Integer | No   |     |     |    |    |    |    |     |     |
|      | Mult1   | No   |     |     |    |    |    |    |     |     |
|      | Mult2   | No   |     |     |    |    |    |    |     |     |
|      | Add     | No   |     |     |    |    |    |    |     |     |
| 40   | Divide  | Yes  | Div | F10 | F0 | F6 |    |    | Yes | Yes |

### Register result status

|    | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|----|----|----|----|----|----|-----|-----|-----|-----|
| FU |    |    |    |    |    | Div |     |     |     |

**CLOCK: 22 - FDIV executes, FADD writes its result**

### Instruction status

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     | 6    | 7      | 8     |
| fmul.d | F0  | F2  | F4 | 6     | 9    | 19     | 20    |
| fsub.d | F8  | F6  | F2 | 7     | 9    | 11     | 12    |
| fdiv.d | F10 | F0  | F6 | 8     | 21   | 61     |       |
| fadd.d | F6  | F8  | F2 | 13    | 14   | 16     | 22    |

### Functional unit status

| Time | Name    | Busy | Op  | Fi  | Fj | Fk | Qj | Qk | Rj  | Rk  |  |
|------|---------|------|-----|-----|----|----|----|----|-----|-----|--|
|      | Integer | No   |     |     |    |    |    |    |     |     |  |
|      | Mult1   | No   |     |     |    |    |    |    |     |     |  |
|      | Mult2   | No   |     |     |    |    |    |    |     |     |  |
|      | Add     | No   |     |     |    |    |    |    |     |     |  |
| 0    | Divide  | Yes  | Div | F10 | F0 | F6 |    |    | Yes | Yes |  |

### Register result status

|    | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|----|----|----|----|----|----|-----|-----|-----|-----|
| FU |    |    |    |    |    | Div |     |     |     |

**CLOCK: 61 - FDIV finishes execution**

### Instruction status

| Op     | d   | j   | k  | Issue | Read | Finish | Write |
|--------|-----|-----|----|-------|------|--------|-------|
| ld.d   | F6  | 34+ | R2 | 1     | 2    | 3      | 4     |
| ld.d   | F2  | 45+ | R3 | 5     | 6    | 7      | 8     |
| fmul.d | F0  | F2  | F4 | 6     | 9    | 19     | 20    |
| fsub.d | F8  | F6  | F2 | 7     | 9    | 11     | 12    |
| fdiv.d | F10 | F0  | F6 | 8     | 21   | 61     | 62    |
| fadd.d | F6  | F8  | F2 | 13    | 14   | 16     | 22    |

### Functional unit status

| Time | Name    | Busy | Op | Fi | Fj | Fk | Qj | Qk | Rj | Rk |
|------|---------|------|----|----|----|----|----|----|----|----|
|      | Integer | No   |    |    |    |    |    |    |    |    |
|      | Mult1   | No   |    |    |    |    |    |    |    |    |
|      | Mult2   | No   |    |    |    |    |    |    |    |    |
|      | Add     | No   |    |    |    |    |    |    |    |    |
|      | Divide  | No   |    |    |    |    |    |    |    |    |

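### Register result status

|    | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|----|----|----|----|----|----|-----|-----|-----|-----|
| FU |    |    |    |    |    |     |     |     |     |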



**CLOCK: 62 - FDIV writes its result back**
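
The timing columns of the trace can be reproduced with a short Python sketch. It is a simplified timing model, not a description of the scoreboard hardware: instructions are processed in program order, and the issue, read-operands, and write-result conditions are applied as "no earlier than" constraints. The latencies (1 cycle for loads, 2 for add/subtract, 10 for multiply, 40 for divide) and the fixed instruction-to-unit mapping are assumptions inferred from the Read and Finish columns of the tables; the names `Instr`, `PROGRAM`, and `schedule` are illustrative.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumed execution latencies (Finish = Read + latency), inferred from the trace.
LATENCY = {"ld.d": 1, "fadd.d": 2, "fsub.d": 2, "fmul.d": 10, "fdiv.d": 40}


@dataclass
class Instr:
    op: str
    dest: str
    srcs: Tuple[str, ...]
    unit: str  # functional unit used (assumed fixed per instruction)


PROGRAM = [
    Instr("ld.d", "F6", ("R2",), "Integer"),
    Instr("ld.d", "F2", ("R3",), "Integer"),
    Instr("fmul.d", "F0", ("F2", "F4"), "Mult1"),
    Instr("fsub.d", "F8", ("F6", "F2"), "Add"),
    Instr("fdiv.d", "F10", ("F0", "F6"), "Divide"),
    Instr("fadd.d", "F6", ("F8", "F2"), "Add"),
]


def schedule(program):
    """Return (issue, read, finish, write) cycles for each instruction."""
    unit_free = {}   # unit -> cycle in which its previous instruction wrote
    last_write = {}  # register -> write cycle of its most recent producer
    last_read = {}   # register -> latest read cycle among earlier readers
    prev_issue = 0
    rows = []
    for ins in program:
        # Issue: in order; wait for a free unit (structural) and for any
        # pending write to the same destination (WAW).
        issue = max(prev_issue,
                    unit_free.get(ins.unit, 0),
                    last_write.get(ins.dest, 0)) + 1
        # Read operands: every producer must have written back (RAW, no forwarding).
        read = max([issue] + [last_write.get(s, 0) for s in ins.srcs]) + 1
        finish = read + LATENCY[ins.op]
        # Write result: earlier readers of the destination must have read it (WAR).
        write = max(finish, last_read.get(ins.dest, 0)) + 1

        unit_free[ins.unit] = write
        last_write[ins.dest] = write
        for s in ins.srcs:
            last_read[s] = max(last_read.get(s, 0), read)
        prev_issue = issue
        rows.append((issue, read, finish, write))
    return rows


for ins, row in zip(PROGRAM, schedule(PROGRAM)):
    print(f"{ins.op:7s} {ins.dest:4s} issue={row[0]:2d} read={row[1]:2d} "
          f"finish={row[2]:2d} write={row[3]:2d}")
```

Under these assumptions the model yields the same Issue, Read, Finish, and Write values as the final table, including the write of `fadd.d` at cycle 22, which waits for `fdiv.d` to read F6 at cycle 21.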

# Scoreboarding: Pros and Cons

### Structural hazards possible on buses

- The number of buses to and from the register file may be limited
- The scoreboard must ensure that the instructions reading operands in a given cycle never need more register-file buses than are available
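
As a rough illustration of the bus constraint, the sketch below counts how many register operands are read in each cycle of the example trace and reports the cycles that exceed a given number of read buses. The source registers and Read cycles are copied from the instruction status table; the bus counts passed to `oversubscribed` and the function names are hypothetical, since the slides do not state how many buses the machine has.

```python
from collections import defaultdict

# (op, source registers, Read-operands cycle) taken from the example trace.
TRACE = [
    ("ld.d", ("R2",), 2),
    ("ld.d", ("R3",), 6),
    ("fmul.d", ("F2", "F4"), 9),
    ("fsub.d", ("F6", "F2"), 9),
    ("fdiv.d", ("F0", "F6"), 21),
    ("fadd.d", ("F8", "F2"), 14),
]


def port_demand(trace):
    """Number of register operands read in each cycle."""
    demand = defaultdict(int)
    for _op, srcs, cycle in trace:
        demand[cycle] += len(srcs)
    return dict(demand)


def oversubscribed(trace, read_buses):
    """Cycles in which more operands are read than there are read buses."""
    return sorted(c for c, n in port_demand(trace).items() if n > read_buses)


print(port_demand(TRACE))
print(oversubscribed(TRACE, read_buses=2))  # hypothetical 2-bus register file
```

In the trace, `fmul.d` and `fsub.d` both read their operands at cycle 9, so four operands are requested at once. With fewer than four read buses, one of the two would have to wait a cycle.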

### No forwarding

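- Operands are read only when both are available in the register file, so a result cannot be passed directly from one functional unit to another
- Fortunately, a result is written to the register file as soon as its instruction completes (unless a WAR hazard delays the write), so the extra wait is short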

### Summary:

- Issue (I) stalls on WAW or structural hazards.
- Read operands (RO) stalls on RAW hazards.
- Write result (WR) stalls on WAR hazards.
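
The three stall conditions can be written as predicates over the scoreboard tables. This is a minimal sketch: the `Unit` record keeps only the Busy, Fi, Fj, Fk, Rj, and Rk fields (Qj and Qk are omitted), and it assumes the textbook convention that Rj and Rk are cleared once the operands have been read, which may differ from how the flags are displayed in the tables. All names are illustrative.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Unit:
    busy: bool = False
    fi: Optional[str] = None  # destination register
    fj: Optional[str] = None  # first source register
    fk: Optional[str] = None  # second source register
    rj: bool = False          # Fj ready and not yet read (assumed convention)
    rk: bool = False          # Fk ready and not yet read (assumed convention)


def can_issue(units: Dict[str, Unit], result: Dict[str, str],
              unit: str, dest: str) -> bool:
    """Issue stalls if the unit is busy (structural) or dest is pending (WAW)."""
    return not units[unit].busy and dest not in result


def can_read(unit: Unit) -> bool:
    """Read operands stalls until both sources are ready (RAW)."""
    return unit.rj and unit.rk


def can_write(units: Dict[str, Unit], name: str) -> bool:
    """Write result stalls while another unit still has to read Fi (WAR)."""
    dest = units[name].fi
    for other_name, other in units.items():
        if other_name == name or not other.busy:
            continue
        if (other.fj == dest and other.rj) or (other.fk == dest and other.rk):
            return False
    return True


# Situation from the example: Add wants to write F6, Divide still has to read F6.
units = {
    "Add": Unit(True, "F6", "F8", "F2", rj=False, rk=False),
    "Divide": Unit(True, "F10", "F0", "F6", rj=True, rk=True),
}
result = {"F6": "Add", "F10": "Div"}  # register result status

war_before = can_write(units, "Add")             # Divide has not read F6 yet
units["Divide"].rj = units["Divide"].rk = False  # Divide reads F0 and F6
war_after = can_write(units, "Add")
print(war_before, war_after)
```

The usage at the end mirrors clocks 21 and 22 of the example: `fadd.d` may not overwrite F6 until `fdiv.d` has read it.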