



# Computer Architecture

Xuehai Zhou (周学海), xhzhou@ustc.edu.cn, 0551-63606864, University of Science and Technology of China



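### Review: Basic Structure of a Predictor



- Select a state based on the branch history (and the PC)
- The state determines the predicted value (the output)
- Update the state information according to the actual outcomes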
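The three steps above (select a state, read the prediction from it, update it with the outcome) can be illustrated with a table of 2-bit saturating counters. This is only a minimal sketch: the class name `TwoBitPredictor`, the helper `run_loop_branch`, the table size, the 4-byte instruction assumption, and the initial counter value are all illustrative choices, not taken from the slides.

```python
class TwoBitPredictor:
    """Branch history table of 2-bit saturating counters, indexed by low PC bits."""

    def __init__(self, index_bits=4):
        self.mask = (1 << index_bits) - 1
        # Counter values 0..3: 0-1 predict not taken, 2-3 predict taken.
        # Assumed initial state: 1 (weakly not taken).
        self.table = [1] * (1 << index_bits)

    def _index(self, pc):
        # Select the state from the PC (assumes 4-byte instructions).
        return (pc >> 2) & self.mask

    def predict(self, pc):
        # The state determines the prediction.
        return self.table[self._index(pc)] >= 2

    def update(self, pc, taken):
        # The actual outcome updates the state, saturating at 0 and 3.
        i = self._index(pc)
        if taken:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


def run_loop_branch(predictor, pc, iterations, passes):
    """Count mispredictions for a loop branch taken iterations-1 times, then not taken."""
    misses = 0
    for _ in range(passes):
        for i in range(iterations):
            taken = i < iterations - 1
            if predictor.predict(pc) != taken:
                misses += 1
            predictor.update(pc, taken)
    return misses
```

With these assumptions, a loop branch costs one misprediction per loop exit once the counter has warmed up, because a single not-taken outcome does not flip a saturated counter.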



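### Review

#### Predictors based on a branch history table (BHT):

- Basic 2-bit predictor:
- Global predictor:
  - Each branch has multiple m-bit predictors
  - Each outcome pattern of the most recent n branches (of any branch) selects one of these predictors
- Local predictor:
  - Each branch has multiple m-bit predictors
  - Each outcome pattern of that branch's own most recent n executions selects one of these predictors
- Tournament predictor:
  - Chooses the appropriate prediction from the results of several predictors.
  - For example: a two-level global predictor combined with a two-level local predictor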
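As a rough illustration of the global predictor described above, the sketch below keeps, for each branch entry, one 2-bit counter per pattern of the last n global outcomes. The class name `GlobalHistoryPredictor`, the table sizes, and the initial counter value are assumptions made for this example only.

```python
class GlobalHistoryPredictor:
    """Each BHT row holds 2**n two-bit counters; the last n global outcomes pick one."""

    def __init__(self, history_bits=2, index_bits=4):
        self.history_bits = history_bits
        self.history = 0  # last n branch outcomes, newest in the low bit
        self.index_mask = (1 << index_bits) - 1
        rows = 1 << index_bits
        # Assumed initial counter value: 1 (weakly not taken).
        self.table = [[1] * (1 << history_bits) for _ in range(rows)]

    def _row(self, pc):
        return (pc >> 2) & self.index_mask  # assumes 4-byte instructions

    def predict(self, pc):
        return self.table[self._row(pc)][self.history] >= 2

    def update(self, pc, taken):
        row = self.table[self._row(pc)]
        counter = row[self.history]
        row[self.history] = min(3, counter + 1) if taken else max(0, counter - 1)
        # Shift the outcome into the global history register.
        mask = (1 << self.history_bits) - 1
        self.history = ((self.history << 1) | int(taken)) & mask
```

A strictly alternating branch (taken, not taken, taken, ...) defeats a single 2-bit counter, but this sketch learns it after a short warm-up because each history pattern gets its own counter.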

#### Optimizing instruction fetch bandwidth

- BTB-based branch predictor
- Return Address Stack
- **Integrated, standalone instruction fetch unit**



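### Return Address Predictors

- A challenge for speculative execution: predicting indirect jumps
  - The branch target address is only known at run time
- Most indirect jumps come from procedure returns
  - With a BTB, prediction accuracy for procedure returns is low
  - On the SPEC CPU95 benchmarks, the accuracy of this kind of prediction is below 60%
- Use a small buffer (a stack) to hold return addresses
  - On a procedure call, push the return address onto the stack
  - On a procedure return, pop the stack to obtain the target address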
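The push-on-call, pop-on-return behaviour can be sketched as follows. The class name `ReturnAddressStack`, the default depth, the 4-byte instruction size, and the overflow policy (drop the oldest entry) are assumptions for illustration; real designs differ in these details.

```python
class ReturnAddressStack:
    """Small hardware-style stack that predicts procedure return targets."""

    def __init__(self, depth=8):
        self.depth = depth
        self.entries = []  # last element is the top of the stack

    def on_call(self, call_pc, instr_bytes=4):
        # Push the address of the instruction following the call.
        if len(self.entries) == self.depth:
            self.entries.pop(0)  # assumed overflow policy: discard the oldest entry
        self.entries.append(call_pc + instr_bytes)

    def predict_return(self):
        # Pop the predicted return target; None means no prediction is available.
        return self.entries.pop() if self.entries else None
```

Nested calls are predicted correctly as long as the call depth does not exceed the stack depth; deeper nesting loses the oldest return addresses under the policy assumed here.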



### Instruction Fetch Unit





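## Chapter 5: Instruction-Level Parallelism

#### 5.1 Basic concepts of instruction-level parallelism and static instruction scheduling

ILP and its challenges; software approaches to exploiting instruction-level parallelism; instruction-level parallelism within a basic block

#### 5.2 Hardware approaches to exploiting instruction-level parallelism (4 class hours)

- 5.2-1 Dynamic instruction scheduling, method 1: Scoreboard
- 5.2-2 Dynamic instruction scheduling, method 2: Tomasulo

#### 5.3 Branch prediction

#### 5.4 Hardware-based speculation

- 5.5-1 Memory access conflict resolution
- 5.5-2 Multiple issue
- 5.6 Multithreading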



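### 5.4 Speculative Execution

#### Tomasulo with support for speculative execution

- 1. Machine organization with a ROB
- 2. Description of the four-stage algorithm

#### Code execution examples

- 1. A simple code example
- 2. A speculative execution example

#### Tomasulo summary

- 1. The role of the ROB
- 2. Dynamic memory disambiguation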

5/10/2021



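### A Machine Organization That Supports Speculative Execution



The machine organization required by a Tomasulo algorithm that supports hardware speculation and precise exception handling:

- 1. Includes a Reorder Buffer (ROB)
- 2. **Includes a BPB and a BTB**, so that control dependences can be resolved quickly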



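## Hardware Support for Speculative Execution and Precise Exceptions

### Hardware must buffer the results of uncommitted instructions: the reorder buffer (ROB)

- 3 fields: instruction type, destination, value
- The reorder buffer can serve as a source of operands, as if there were more registers (similar to the RS)
- Once an instruction finishes execution, its result is identified by its ROB entry number rather than by its RS number
- An additional instruction commit stage (Commit) is added
- The ROB supplies operands between the completion of execution and commit
- Once a result is committed, it is written to the register
- On a misprediction it is easy to undo speculatively executed instructions, and on an exception it is easy to restore state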
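The properties listed above (results buffered until commit, in-order retirement, cheap recovery) can be sketched with a small queue-based model. This is a simplified illustration, not the structure from the slides: the names `ROBEntry` and `ReorderBuffer` are hypothetical, stores and branches are not modelled beyond being skipped at register write-back, and `commit` retires as many ready entries as it can in one call, whereas a real machine is limited by its commit width.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class ROBEntry:
    kind: str             # instruction type, e.g. "alu", "load", "store", "branch"
    dest: str             # destination register name (or a memory address for stores)
    value: object = None  # result, valid once ready is True
    ready: bool = False


class ReorderBuffer:
    def __init__(self, size=4):
        self.size = size
        self.entries = deque()  # left end = head (oldest), right end = tail (newest)

    def allocate(self, kind, dest):
        """Reserve an entry at the tail; None signals a structural stall."""
        if len(self.entries) == self.size:
            return None
        entry = ROBEntry(kind, dest)
        self.entries.append(entry)
        return entry

    def commit(self, regs):
        """Retire ready entries from the head in program order; stop at the first unready one."""
        retired = 0
        while self.entries and self.entries[0].ready:
            entry = self.entries.popleft()
            if entry.kind not in ("store", "branch"):
                regs[entry.dest] = entry.value  # architectural state changes only here
            retired += 1
        return retired

    def flush(self):
        """Discard all uncommitted results, e.g. after a misprediction or exception."""
        self.entries.clear()
```

Because registers are written only in `commit`, a younger instruction that finishes early cannot change architectural state until every older instruction has retired, which is what makes flushing safe.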





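### The Four Stages of the Tomasulo Algorithm with Speculation

#### 1. Issue: get instruction from FP Op Queue

- Issue the instruction if both an RS entry and a ROB entry are free. If a source operand is available in the registers or in the ROB, send it to the RS; also send the ROB entry number of the destination to the RS

#### 2. Execution: operate on operands (EX)

- Begin execution once the operands are ready. If they are not, monitor the CDB and check for RAW dependences (note: CDB conflicts must be detected)

#### 3. Write result: finish execution (WB)

- Send the result over the CDB to all FUs waiting for it and to the ROB entry, and mark the RS as available

#### 4. Commit: update register with reorder result

- In ROB order, if the result is available, update the register (or memory) and remove the instruction from the ROB
- On a misprediction or an exception (interrupt), flush the ROB
- See p. 191, Figure 3.14 (English edition); p. 141, Figure 3-9 (Chinese edition)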



### Issue

| Status | Wait until | Action or bookkeeping |
|--------|------------|-----------------------|
| Issue: all instructions | Reservation station (r) and ROB (b) both available | <pre>if (RegisterStat[rs].Busy) /* in-flight instr. writes rs */ {h ← RegisterStat[rs].Reorder; if (ROB[h].Ready) /* Instr completed already */ {RS[r].Vj ← ROB[h].Value; RS[r].Qj ← 0;} else {RS[r].Qj ← h;} /* wait for instruction */ } else {RS[r].Vj ← Regs[rs]; RS[r].Qj ← 0;}; RS[r].Busy ← yes; RS[r].Dest ← b; ROB[b].Instruction ← opcode; ROB[b].Dest ← rd; ROB[b].Ready ← no;</pre> |
| Issue: FP operations and stores | | <pre>if (RegisterStat[rt].Busy) /* in-flight instr writes rt */ {h ← RegisterStat[rt].Reorder; if (ROB[h].Ready) /* Instr completed already */ {RS[r].Vk ← ROB[h].Value; RS[r].Qk ← 0;} else {RS[r].Qk ← h;} /* wait for instruction */ } else {RS[r].Vk ← Regs[rt]; RS[r].Qk ← 0;};</pre> |
| Issue: FP operations | | RegisterStat[rd].Reorder ← b; RegisterStat[rd].Busy ← yes; ROB[b].Dest ← rd; |
| Issue: loads | | RS[r].A ← imm; RegisterStat[rt].Reorder ← b;<br>RegisterStat[rt].Busy ← yes; ROB[b].Dest ← rt; |
| Issue: stores | | RS[r].A ← imm; |

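h: the ROB entry number of the instruction that the current instruction depends on;

b: the ROB entry number assigned to the current instruction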
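The Issue bookkeeping for an FP operation can be sketched in Python as below. The dictionaries standing in for `Regs`, `RegisterStat`, `RS`, and `ROB`, and the function names `read_operand` and `issue_fp_op`, are hypothetical conveniences for this example; the caller is assumed to have already found a free reservation station `r` and a free ROB entry `b`.

```python
def read_operand(reg, regs, register_stat, rob):
    """Return (V, Q) for a source register, following the Issue rules.

    Q == 0 means the value V is available; otherwise Q is the ROB entry to wait for.
    """
    stat = register_stat.get(reg)
    if stat is not None and stat["busy"]:     # an in-flight instruction writes reg
        h = stat["reorder"]
        if rob[h]["ready"]:                   # producer finished but has not committed
            return rob[h]["value"], 0
        return None, h                        # wait for ROB entry h on the CDB
    return regs[reg], 0                       # value comes from the register file


def issue_fp_op(op, rd, rs, rt, r, b, rs_table, rob, regs, register_stat):
    """Issue an FP operation into reservation station r and ROB entry b."""
    # Read both sources before renaming rd, in case rd is also a source.
    vj, qj = read_operand(rs, regs, register_stat, rob)
    vk, qk = read_operand(rt, regs, register_stat, rob)
    rs_table[r] = {"busy": True, "op": op, "vj": vj, "vk": vk,
                   "qj": qj, "qk": qk, "dest": b}
    rob[b] = {"instruction": op, "dest": rd, "ready": False, "value": None}
    register_stat[rd] = {"reorder": b, "busy": True}
```

For instance, issuing `MULT F0, F2, F4` while `F2` is still being produced by ROB entry 2 leaves `qj == 2` and takes `vk` directly from the register file.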



### Execute

| Status | Wait until | Action or bookkeeping |
|--------|------------|-----------------------|
| Execute: FP op | (RS[r].Qj == 0) and (RS[r].Qk == 0) | Compute results; operands are in Vj and Vk |
| Execute: load step 1 | (RS[r].Qj == 0) and there are no stores earlier in the queue | RS[r].A ← RS[r].Vj + RS[r].A; |
| Execute: load step 2 | Load step 1 done and all stores earlier in ROB have different address | Read from Mem[RS[r].A] |
| Execute: store | (RS[r].Qj == 0) and store at queue head | ROB[h].Address ← RS[r].Vj + RS[r].A; |
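The "load step 2" condition, that every earlier store in the ROB has a different address, might be checked roughly as follows. The function name `load_may_access_memory` and the list-of-dicts ROB layout are assumptions for this sketch, and treating a store whose address is still unknown as a conflict is a conservative choice made here, not something the table states.

```python
def load_may_access_memory(rob_entries, load_index, load_addr):
    """Decide whether the load at position load_index may read memory now.

    rob_entries is in program order (head first); each entry is a dict with
    'kind' and, for stores, 'address' (None while the address is not computed).
    """
    for entry in rob_entries[:load_index]:      # only older instructions matter
        if entry["kind"] != "store":
            continue
        if entry["address"] is None:            # unknown address: assume a conflict
            return False
        if entry["address"] == load_addr:       # same address: wait for the store
            return False
    return True
```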



### Write result & Commit

```
Write result (all but store)
  Wait until: execution done at r and CDB available
  Action:
    b ← RS[r].Dest; RS[r].Busy ← no;
    ∀x (if (RS[x].Qj == b) {RS[x].Vj ← result; RS[x].Qj ← 0});
    ∀x (if (RS[x].Qk == b) {RS[x].Vk ← result; RS[x].Qk ← 0});
    ROB[b].Value ← result; ROB[b].Ready ← yes;

Write result (store)
  Wait until: execution done at r and (RS[r].Qk == 0)
  Action:
    ROB[h].Value ← RS[r].Vk;

Commit
  Wait until: instruction is at the head of the ROB (entry h) and ROB[h].Ready == yes
  Action:
    d ← ROB[h].Dest; /* register dest, if exists */
    if (ROB[h].Instruction == Branch)
        {if (branch is mispredicted)
            {clear ROB[h], RegisterStat; fetch branch dest;};}
    else if (ROB[h].Instruction == Store)
        {Mem[ROB[h].Destination] ← ROB[h].Value;}
    else /* put the result in the register destination */
        {Regs[d] ← ROB[h].Value;};
    ROB[h].Busy ← no; /* free up ROB entry */
    /* free up dest register if no one else writing it */
    if (RegisterStat[d].Reorder == h) {RegisterStat[d].Busy ← no;};
```
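The write-result broadcast and the commit step can be sketched in Python as below. This is an illustrative model under stated assumptions: `write_result` and `commit` are hypothetical names, the tables are plain dictionaries, the caller is responsible for passing the ROB head as `h`, and a misprediction is signalled by a flag rather than detected.

```python
def write_result(r, result, rs_table, rob):
    """Broadcast a result from reservation station r on the CDB."""
    b = rs_table[r]["dest"]
    rs_table[r]["busy"] = False
    for station in rs_table.values():          # wake up every waiting station
        if station.get("qj") == b:
            station["vj"], station["qj"] = result, 0
        if station.get("qk") == b:
            station["vk"], station["qk"] = result, 0
    rob[b]["value"] = result
    rob[b]["ready"] = True


def commit(h, rob, regs, mem, register_stat, mispredicted=False):
    """Retire ROB entry h, which the caller guarantees is the head of the ROB."""
    entry = rob[h]
    if not entry["ready"]:
        return "stall"
    d = entry["dest"]
    if entry["instruction"] == "branch":
        if mispredicted:
            rob.clear()                        # discard all speculative work
            register_stat.clear()
            return "flush"
    elif entry["instruction"] == "store":
        mem[d] = entry["value"]
    else:
        regs[d] = entry["value"]               # result reaches the register file
    del rob[h]                                 # free the ROB entry
    # Free the destination register only if no later instruction renamed it.
    if d in register_stat and register_stat[d]["reorder"] == h:
        register_stat[d]["busy"] = False
    return "commit"
```

The final check matters when a later instruction writes the same register: its ROB number has already replaced `h` in `RegisterStat`, so the register must stay busy.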






## Example:

LD F6, 34(R2)
LD F2, 45(R3)
MULT F0, F2, F4
SUBD F8, F6, F2
DIVD F10, F0, F6
ADDD F6, F8, F2

Assumed latencies of the execution stage:

- LD: 1 cycle
- MULT: 10 cycles
- SUBD/ADDD: 2 cycles
- DIVD: 40 cycles



### Tomasulo With Reorder Buffer: Cycle 0

Reservation stations:

| Time | Name  | Busy | Op | Vj | Vk | Qj | Qk | Dest |
|------|-------|------|----|----|----|----|----|------|
| 0    | Add1  | No   |    |    |    |    |    |      |
| 0    | Add2  | No   |    |    |    |    |    |      |
| 0    | Add3  | No   |    |    |    |    |    |      |
| 0    | Mult1 | No   |    |    |    |    |    |      |
| 0    | Mult2 | No   |    |    |    |    |    |      |

Load buffers:

| Name  | Busy | Address |
|-------|------|---------|
| Load1 |      |         |
| Load2 |      |         |
| Load3 |      |         |

Instruction sequence: LD F6, 34(R2); LD F2, 45(R3); MULT F0, F2, F4; SUBD F8, F6, F2; DIVD F10, F0, F6; ADDD F6, F8, F2

Reorder buffer:

| Entry | Busy | Instruction | State | Destination | Value |
|-------|------|-------------|-------|-------------|-------|
| 1     |      |             |       |             |       |
| 2     |      |             |       |             |       |
| 3     |      |             |       |             |       |
| 4     |      |             |       |             |       |
| 5     |      |             |       |             |       |
| 6     |      |             |       |             |       |
| 7     |      |             |       |             |       |
| 8     |      |             |       |             |       |
| 9     |      |             |       |             |       |
| 10    |      |             |       |             |       |

Register status:

|          | F0 | F2 | F4 | F6 | F8 | F10 | F12 | … | F30 |
|----------|----|----|----|----|----|-----|-----|---|-----|
| Reorder# |    |    |    |    |    |     |     |   |     |
| Busy     | No | No | No | No | No | No  | No  |   | No  |



### Tomasulo With Reorder Buffer: Cycle 1

Reservation stations:

| Time | Name  | Busy | Op | Vj | Vk | Qj | Qk | Dest |
|------|-------|------|----|----|----|----|----|------|
| 0    | Add1  | No   |    |    |    |    |    |      |
| 0    | Add2  | No   |    |    |    |    |    |      |
| 0    | Add3  | No   |    |    |    |    |    |      |
| 0    | Mult1 | No   |    |    |    |    |    |      |
| 0    | Mult2 | No   |    |    |    |    |    |      |

Load buffers:

| Name  | Busy | Address     |
|-------|------|-------------|
| Load1 | Yes  | 34+Regs[R2] |
| Load2 |      |             |
| Load3 |      |             |

Reorder buffer:

|      | Entry | Busy | Instruction    | State | Destination | Value |
|------|-------|------|----------------|-------|-------------|-------|
| Head | 1     | Yes  | LD F6, 34(R2)  | Issue | F6          |       |
|      | 2     |      |                |       |             |       |
|      | 3     |      |                |       |             |       |
|      | 4     |      |                |       |             |       |
|      | 5     |      |                |       |             |       |
|      | 6     |      |                |       |             |       |
|      | 7     |      |                |       |             |       |
|      | 8     |      |                |       |             |       |
|      | 9     |      |                |       |             |       |
|      | 10    |      |                |       |             |       |

Register status:

|          | F0 | F2 | F4 | F6  | F8 | F10 | F12 | … | F30 |
|----------|----|----|----|-----|----|-----|-----|---|-----|
| Reorder# |    |    |    | #1  |    |     |     |   |     |
| Busy     | No | No | No | Yes | No | No  | No  |   | No  |



### Tomasulo With Reorder Buffer: Cycle 2

Reservation stations:

| Time | Name  | Busy | Op | Vj | Vk | Qj | Qk | Dest |
|------|-------|------|----|----|----|----|----|------|
| 0    | Add1  | No   |    |    |    |    |    |      |
| 0    | Add2  | No   |    |    |    |    |    |      |
| 0    | Add3  | No   |    |    |    |    |    |      |
| 0    | Mult1 | No   |    |    |    |    |    |      |
| 0    | Mult2 | No   |    |    |    |    |    |      |

Load buffers:

| Name  | Busy | Address     |
|-------|------|-------------|
| Load1 | Yes  | 34+Regs[R2] |
| Load2 | Yes  | 45+Regs[R3] |
| Load3 |      |             |

Reorder buffer:

|      | Entry | Busy | Instruction    | State | Destination | Value |
|------|-------|------|----------------|-------|-------------|-------|
| Head | 1     | Yes  | LD F6, 34(R2)  | Ex1   | F6          |       |
| Tail | 2     | Yes  | LD F2, 45(R3)  | Issue | F2          |       |
|      | 3     |      |                |       |             |       |
|      | 4     |      |                |       |             |       |
|      | 5     |      |                |       |             |       |
|      | 6     |      |                |       |             |       |
|      | 7     |      |                |       |             |       |
|      | 8     |      |                |       |             |       |
|      | 9     |      |                |       |             |       |
|      | 10    |      |                |       |             |       |

Register status:

|          | F0 | F2  | F4 | F6  | F8 | F10 | F12 | … | F30 |
|----------|----|-----|----|-----|----|-----|-----|---|-----|
| Reorder# |    | #2  |    | #1  |    |     |     |   |     |
| Busy     | No | Yes | No | Yes | No | No  | No  |   | No  |



### Tomasulo With Reorder Buffer: Cycle 3

Reservation stations:

| Time | Name  | Busy | Op   | Vj | Vk       | Qj | Qk | Dest |
|------|-------|------|------|----|----------|----|----|------|
| 0    | Add1  | No   |      |    |          |    |    |      |
| 0    | Add2  | No   |      |    |          |    |    |      |
| 0    | Add3  | No   |      |    |          |    |    |      |
| 0    | Mult1 | Yes  | Mult |    | Regs[F4] | #2 |    | #3   |
| 0    | Mult2 | No   |      |    |          |    |    |      |

Load buffers:

| Name  | Busy | Address     |
|-------|------|-------------|
| Load1 | No   |             |
| Load2 | Yes  | 45+Regs[R3] |
| Load3 |      |             |

Reorder buffer:

|      | Entry | Busy | Instruction     | State | Destination | Value      |
|------|-------|------|-----------------|-------|-------------|------------|
| Head | 1     | Yes  | LD F6, 34(R2)   | Write | F6          | Mem[load1] |
|      | 2     | Yes  | LD F2, 45(R3)   | Ex1   | F2          |            |
| Tail | 3     | Yes  | MULT F0, F2, F4 | Issue | F0          |            |
|      | 4     |      |                 |       |             |            |
|      | 5     |      |                 |       |             |            |
|      | 6     |      |                 |       |             |            |
|      | 7     |      |                 |       |             |            |
|      | 8     |      |                 |       |             |            |
|      | 9     |      |                 |       |             |            |
|      | 10    |      |                 |       |             |            |

Register status:

|          | F0  | F2  | F4 | F6  | F8 | F10 | F12 | … | F30 |
|----------|-----|-----|----|-----|----|-----|-----|---|-----|
| Reorder# | #3  | #2  |    | #1  |    |     |     |   |     |
| Busy     | Yes | Yes | No | Yes | No | No  | No  |   | No  |



### Tomasulo With Reorder Buffer: Cycle 4

Reservation stations:

| Time | Name  | Busy | Op   | Vj               | Vk               | Qj | Qk | Dest |
|------|-------|------|------|------------------|------------------|----|----|------|
| 2    | Add1  | Yes  | SUB  | Regs[F6]         | Mem[45+Regs[R3]] |    | #2 | #4   |
| 0    | Add2  | No   |      |                  |                  |    |    |      |
| 0    | Add3  | No   |      |                  |                  |    |    |      |
| 10   | Mult1 | Yes  | Mult | Mem[45+Regs[R3]] | Regs[F4]         |    |    | #3   |
| 0    | Mult2 | No   |      |                  |                  |    |    |      |

Load buffers:

| Name  | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | No   |         |
| Load3 |      |         |

Reorder buffer:

| Entry | Busy | Instruction     | State  | Dest. | Value      |
|-------|------|-----------------|--------|-------|------------|
| 1     | Yes  | LD F6, 34(R2)   | Commit | F6    | Mem[load1] |
| 2     | Yes  | LD F2, 45(R3)   | Write  | F2    | Mem[load2] |
| 3     | Yes  | MULT F0, F2, F4 | Issue  | F0    |            |
| 4     | Yes  | SUBD F8, F6, F2 | Issue  | F8    |            |
| 5     |      |                 |        |       |            |
| 6     |      |                 |        |       |            |
| 7     |      |                 |        |       |            |
| 8     |      |                 |        |       |            |
| 9     |      |                 |        |       |            |
| 10    |      |                 |        |       |            |

Register status:

|          | F0  | F2  | F4 | F6 | F8  | F10 | F12 | … | F30 |
|----------|-----|-----|----|----|-----|-----|-----|---|-----|
| Reorder# | #3  | #2  |    |    | #4  |     |     |   |     |
| Busy     | Yes | Yes | No | No | Yes | No  | No  |   | No  |



### Tomasulo With Reorder Buffer: Cycle 5

Reservation stations:

| Time | Name  | Busy | Op   | Vj               | Vk               | Qj | Qk | Dest |
|------|-------|------|------|------------------|------------------|----|----|------|
| 1    | Add1  | Yes  | SUB  | Regs[F6]         | Mem[45+Regs[R3]] |    |    | #4   |
| 0    | Add2  | No   |      |                  |                  |    |    |      |
| 0    | Add3  | No   |      |                  |                  |    |    |      |
| 9    | Mult1 | Yes  | Mult | Mem[45+Regs[R3]] | Regs[F4]         |    |    | #3   |
| 0    | Mult2 | Yes  | DIV  |                  | Regs[F6]         | #3 |    | #5   |

Load buffers:

| Name  | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | No   |         |
| Load3 |      |         |

Reorder buffer:

|      | Entry | Busy | Instruction      | State  | Dest. | Value      |
|------|-------|------|------------------|--------|-------|------------|
|      | 1     | Yes  | LD F6, 34(R2)    | Commit | F6    | Mem[load1] |
|      | 2     | Yes  | LD F2, 45(R3)    | Commit | F2    | Mem[load2] |
| Head | 3     | Yes  | MULT F0, F2, F4  | Ex1    | F0    |            |
|      | 4     | Yes  | SUBD F8, F6, F2  | Ex1    | F8    |            |
| Tail | 5     | Yes  | DIVD F10, F0, F6 | Issue  | F10   |            |
|      | 6     |      |                  |        |       |            |
|      | 7     |      |                  |        |       |            |
|      | 8     |      |                  |        |       |            |
|      | 9     |      |                  |        |       |            |
|      | 10    |      |                  |        |       |            |

Register status:

|          | F0  | F2 | F4 | F6 | F8  | F10 | F12 | … | F30 |
|----------|-----|----|----|----|-----|-----|-----|---|-----|
| Reorder# | #3  |    |    |    | #4  | #5  |     |   |     |
| Busy     | Yes | No | No | No | Yes | Yes | No  |   | No  |



### Tomasulo With Reorder Buffer: Cycle 6

Reservation stations:

| Time | Name  | Busy | Op   | Vj               | Vk               | Qj | Qk | Dest |
|------|-------|------|------|------------------|------------------|----|----|------|
| 0    | Add1  | Yes  | SUB  | Regs[F6]         | Mem[45+Regs[R3]] |    |    | #4   |
| 0    | Add2  | Yes  | ADD  |                  | Regs[F2]         | #4 |    | #6   |
| 0    | Add3  | No   |      |                  |                  |    |    |      |
| 8    | Mult1 | Yes  | MULT | Mem[45+Regs[R3]] | Regs[F4]         |    |    | #3   |
| 0    | Mult2 | Yes  | DIV  |                  | Regs[F6]         | #3 |    | #5   |

Load buffers:

| Name  | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | No   |         |
| Load3 |      |         |

Reorder buffer:

|      | Entry | Busy | Instruction      | State  | Dest. | Value      |
|------|-------|------|------------------|--------|-------|------------|
|      | 1     | Yes  | LD F6, 34(R2)    | Commit | F6    | Mem[load1] |
|      | 2     | Yes  | LD F2, 45(R3)    | Commit | F2    | Mem[load2] |
| Head | 3     | Yes  | MULT F0, F2, F4  | Ex2    | F0    |            |
|      | 4     | Yes  | SUBD F8, F6, F2  | Ex2    | F8    |            |
|      | 5     | Yes  | DIVD F10, F0, F6 | Issue  | F10   |            |
| Tail | 6     | Yes  | ADDD F6, F8, F2  | Issue  | F6    |            |
|      | 7     |      |                  |        |       |            |
|      | 8     |      |                  |        |       |            |
|      | 9     |      |                  |        |       |            |
|      | 10    |      |                  |        |       |            |

Register status:

|          | F0  | F2 | F4 | F6  | F8  | F10 | F12 | … | F30 |
|----------|-----|----|----|-----|-----|-----|-----|---|-----|
| Reorder# | #3  |    |    | #6  | #4  | #5  |     |   |     |
| Busy     | Yes | No | No | Yes | Yes | Yes | No  |   | No  |



### Tomasulo With Reorder Buffer: Cycle 7

Reservation stations:

| Time | Name  | Busy | Op   | Vj               | Vk       | Qj | Qk | Dest |
|------|-------|------|------|------------------|----------|----|----|------|
| 0    | Add1  | No   |      |                  |          |    |    |      |
| 2    | Add2  | Yes  | ADD  | #4               | Regs[F2] |    |    | #6   |
| 0    | Add3  | No   |      |                  |          |    |    |      |
| 7    | Mult1 | Yes  | MULT | Mem[45+Regs[R3]] | Regs[F4] |    |    | #3   |
| 0    | Mult2 | Yes  | DIV  |                  | Regs[F6] | #3 |    | #5   |

Load buffers:

| Name  | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | No   |         |
| Load3 |      |         |

Reorder buffer:

|      | Entry | Busy | Instruction      | State  | Dest. | Value      |
|------|-------|------|------------------|--------|-------|------------|
|      | 1     | Yes  | LD F6, 34(R2)    | Commit | F6    | Mem[load1] |
|      | 2     | Yes  | LD F2, 45(R3)    | Commit | F2    | Mem[load2] |
| Head | 3     | Yes  | MULT F0, F2, F4  | Ex3    | F0    |            |
|      | 4     | Yes  | SUBD F8, F6, F2  | Write  | F8    | F6-#2      |
|      | 5     | Yes  | DIVD F10, F0, F6 | Issue  | F10   |            |
| Tail | 6     | Yes  | ADDD F6, F8, F2  | Issue  | F6    |            |
|      | 7     |      |                  |        |       |            |
|      | 8     |      |                  |        |       |            |
|      | 9     |      |                  |        |       |            |
|      | 10    |      |                  |        |       |            |

Register status:

|          | F0  | F2 | F4 | F6  | F8  | F10 | F12 | … | F30 |
|----------|-----|----|----|-----|-----|-----|-----|---|-----|
| Reorder# | #3  |    |    | #6  | #4  | #5  |     |   |     |
| Busy     | Yes | No | No | Yes | Yes | Yes | No  |   | No  |



### Tomasulo With Reorder Buffer: Cycle 8

Reservation stations:

| Time | Name  | Busy | Op   | Vj               | Vk       | Qj | Qk | Dest |
|------|-------|------|------|------------------|----------|----|----|------|
| 0    | Add1  | No   |      |                  |          |    |    |      |
| 1    | Add2  | Yes  | ADD  | #4               | Regs[F2] |    |    | #6   |
| 0    | Add3  | No   |      |                  |          |    |    |      |
| 6    | Mult1 | Yes  | MULT | Mem[45+Regs[R3]] | Regs[F4] |    |    | #3   |
| 0    | Mult2 | Yes  | DIV  |                  | Regs[F6] | #3 |    | #5   |

Load buffers:

| Name  | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | No   |         |
| Load3 |      |         |

Reorder buffer:

|      | Entry | Busy | Instruction      | State  | Dest. | Value      |
|------|-------|------|------------------|--------|-------|------------|
|      | 1     | Yes  | LD F6, 34(R2)    | Commit | F6    | Mem[load1] |
|      | 2     | Yes  | LD F2, 45(R3)    | Commit | F2    | Mem[load2] |
| Head | 3     | Yes  | MULT F0, F2, F4  | Ex4    | F0    |            |
|      | 4     | Yes  | SUBD F8, F6, F2  | Write  | F8    | F6-#2      |
|      | 5     | Yes  | DIVD F10, F0, F6 | Issue  | F10   |            |
| Tail | 6     | Yes  | ADDD F6, F8, F2  | Ex1    | F6    |            |
|      | 7     |      |                  |        |       |            |
|      | 8     |      |                  |        |       |            |
|      | 9     |      |                  |        |       |            |
|      | 10    |      |                  |        |       |            |

Register status:

|          | F0  | F2 | F4 | F6  | F8  | F10 | F12 | … | F30 |
|----------|-----|----|----|-----|-----|-----|-----|---|-----|
| Reorder# | #3  |    |    | #6  | #4  | #5  |     |   |     |
| Busy     | Yes | No | No | Yes | Yes | Yes | No  |   | No  |



### Tomasulo With Reorder Buffer: Cycle 9

Reservation stations:

| Time | Name  | Busy | Op   | Vj               | Vk       | Qj | Qk | Dest |
|------|-------|------|------|------------------|----------|----|----|------|
| 0    | Add1  | No   |      |                  |          |    |    |      |
| 0    | Add2  | Yes  | ADD  | #4               | Regs[F2] |    |    | #6   |
| 0    | Add3  | No   |      |                  |          |    |    |      |
| 5    | Mult1 | Yes  | MULT | Mem[45+Regs[R3]] | Regs[F4] |    |    | #3   |
| 0    | Mult2 | Yes  | DIV  |                  | Regs[F6] | #3 |    | #5   |

Load buffers:

| Name  | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | No   |         |
| Load3 |      |         |

Reorder buffer:

|      | Entry | Busy | Instruction      | State  | Dest. | Value      |
|------|-------|------|------------------|--------|-------|------------|
|      | 1     | Yes  | LD F6, 34(R2)    | Commit | F6    | Mem[load1] |
|      | 2     | Yes  | LD F2, 45(R3)    | Commit | F2    | Mem[load2] |
| Head | 3     | Yes  | MULT F0, F2, F4  | Ex5    | F0    |            |
|      | 4     | Yes  | SUBD F8, F6, F2  | Write  | F8    | F6-#2      |
|      | 5     | Yes  | DIVD F10, F0, F6 | Issue  | F10   |            |
| Tail | 6     | Yes  | ADDD F6, F8, F2  | Ex2    | F6    |            |
|      | 7     |      |                  |        |       |            |
|      | 8     |      |                  |        |       |            |
|      | 9     |      |                  |        |       |            |
|      | 10    |      |                  |        |       |            |

Register status:

|          | F0  | F2 | F4 | F6  | F8  | F10 | F12 | … | F30 |
|----------|-----|----|----|-----|-----|-----|-----|---|-----|
| Reorder# | #3  |    |    | #6  | #4  | #5  |     |   |     |
| Busy     | Yes | No | No | Yes | Yes | Yes | No  |   | No  |



Reservation Station

| Time | Name  | Busy | Op   | Vj               | Vk       | Qj | Qk | Dest |
|------|-------|------|------|------------------|----------|----|----|------|
| 0    | Add1  | No   |      |                  |          |    |    |      |
| 0    | Add2  | No   |      |                  |          |    |    |      |
| 0    | Add3  | No   |      |                  |          |    |    |      |
| 4    | Mult1 | Yes  | MULT | Mem[45+Regs[R3]] | Regs[F4] |    |    | #3   |
| 0    | Mult2 | Yes  | DIV  |                  | Regs[F6] | #3 |    | #5   |

Reorder Buffer

|      | Entry | Busy | Instruction      | State  | Dest. | Value      |
|------|-------|------|------------------|--------|-------|------------|
|      | 1     | Yes  | LD F6, 34(R2)    | Commit | F6    | Mem[load1] |
|      | 2     | Yes  | LD F2, 45(R3)    | Commit | F2    | Mem[load2] |
| Head | 3     | Yes  | MULT F0, F2, F4  | Ex6    | F0    |            |
|      | 4     | Yes  | SUBD F8, F6, F2  | Write  | F8    | F6-#2      |
|      | 5     | Yes  | DIVD F10, F0, F6 | Issue  | F10   |            |
| Tail | 6     | Yes  | ADDD F6, F8, F2  | Write  | F6    | #4+F2      |
|      | 7     |      |                  |        |       |            |
|      | 8     |      |                  |        |       |            |
|      | 9     |      |                  |        |       |            |
|      | 10    |      |                  |        |       |            |

|       | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | No   |         |
| Load3 |      |         |

| Cycle 10 | F0  | F2 | F4 | F6  | F8  | F10 | F12 | … | F30 |
|----------|-----|----|----|-----|-----|-----|-----|---|-----|
| Reorder# | #3  |    |    | #6  | #4  | #5  |     |   |     |
| Busy     | Yes | No | No | Yes | Yes | Yes | No  |   | No  |



Reservation Station

| Time | Name  | Busy | Op   | Vj               | Vk       | Qj | Qk | Dest |
|------|-------|------|------|------------------|----------|----|----|------|
| 0    | Add1  | No   |      |                  |          |    |    |      |
| 0    | Add2  | No   |      |                  |          |    |    |      |
| 0    | Add3  | No   |      |                  |          |    |    |      |
| 3    | Mult1 | Yes  | MULT | Mem[45+Regs[R3]] | Regs[F4] |    |    | #3   |
| 0    | Mult2 | Yes  | DIV  |                  | Regs[F6] | #3 |    | #5   |

Reorder Buffer

|      | Entry | Busy | Instruction      | State  | Dest. | Value      |
|------|-------|------|------------------|--------|-------|------------|
|      | 1     | Yes  | LD F6, 34(R2)    | Commit | F6    | Mem[load1] |
|      | 2     | Yes  | LD F2, 45(R3)    | Commit | F2    | Mem[load2] |
| Head | 3     | Yes  | MULT F0, F2, F4  | Ex7    | F0    |            |
|      | 4     | Yes  | SUBD F8, F6, F2  | Write  | F8    | F6-#2      |
|      | 5     | Yes  | DIVD F10, F0, F6 | Issue  | F10   |            |
| Tail | 6     | Yes  | ADDD F6, F8, F2  | Write  | F6    | #4+F2      |
|      | 7     |      |                  |        |       |            |
|      | 8     |      |                  |        |       |            |
|      | 9     |      |                  |        |       |            |
|      | 10    |      |                  |        |       |            |

|       | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | No   |         |
| Load3 |      |         |

| Cycle 11 | F0  | F2 | F4 | F6  | F8  | F10 | F12 | … | F30 |
|----------|-----|----|----|-----|-----|-----|-----|---|-----|
| Reorder# | #3  |    |    | #6  | #4  | #5  |     |   |     |
| Busy     | Yes | No | No | Yes | Yes | Yes | No  |   | No  |



Reservation Station

| Time | Name  | Busy | Op   | Vj               | Vk       | Qj | Qk | Dest |
|------|-------|------|------|------------------|----------|----|----|------|
| 0    | Add1  | No   |      |                  |          |    |    |      |
| 0    | Add2  | No   |      |                  |          |    |    |      |
| 0    | Add3  | No   |      |                  |          |    |    |      |
| 2    | Mult1 | Yes  | MULT | Mem[45+Regs[R3]] | Regs[F4] |    |    | #3   |
| 0    | Mult2 | Yes  | DIV  |                  | Regs[F6] | #3 |    | #5   |

Reorder Buffer

|      | Entry | Busy | Instruction      | State  | Dest. | Value      |
|------|-------|------|------------------|--------|-------|------------|
|      | 1     | Yes  | LD F6, 34(R2)    | Commit | F6    | Mem[load1] |
|      | 2     | Yes  | LD F2, 45(R3)    | Commit | F2    | Mem[load2] |
| Head | 3     | Yes  | MULT F0, F2, F4  | Ex8    | F0    |            |
|      | 4     | Yes  | SUBD F8, F6, F2  | Write  | F8    | F6-#2      |
|      | 5     | Yes  | DIVD F10, F0, F6 | Issue  | F10   |            |
| Tail | 6     | Yes  | ADDD F6, F8, F2  | Write  | F6    | #4+F2      |
|      | 7     |      |                  |        |       |            |
|      | 8     |      |                  |        |       |            |
|      | 9     |      |                  |        |       |            |
|      | 10    |      |                  |        |       |            |

|       | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | No   |         |
| Load3 |      |         |

| Cycle 12 | F0  | F2 | F4 | F6  | F8  | F10 | F12 | … | F30 |
|----------|-----|----|----|-----|-----|-----|-----|---|-----|
| Reorder# | #3  |    |    | #6  | #4  | #5  |     |   |     |
| Busy     | Yes | No | No | Yes | Yes | Yes | No  |   | No  |



Reservation Station

| Time | Name  | Busy | Op   | Vj               | Vk       | Qj | Qk | Dest |
|------|-------|------|------|------------------|----------|----|----|------|
| 0    | Add1  | No   |      |                  |          |    |    |      |
| 0    | Add2  | No   |      |                  |          |    |    |      |
| 0    | Add3  | No   |      |                  |          |    |    |      |
| 1    | Mult1 | Yes  | MULT | Mem[45+Regs[R3]] | Regs[F4] |    |    | #3   |
| 0    | Mult2 | Yes  | DIV  |                  | Regs[F6] | #3 |    | #5   |

Reorder Buffer

|      | Entry | Busy | Instruction      | State  | Dest. | Value      |
|------|-------|------|------------------|--------|-------|------------|
|      | 1     | Yes  | LD F6, 34(R2)    | Commit | F6    | Mem[load1] |
|      | 2     | Yes  | LD F2, 45(R3)    | Commit | F2    | Mem[load2] |
| Head | 3     | Yes  | MULT F0, F2, F4  | Ex9    | F0    |            |
|      | 4     | Yes  | SUBD F8, F6, F2  | Write  | F8    | F6-#2      |
|      | 5     | Yes  | DIVD F10, F0, F6 | Issue  | F10   |            |
| Tail | 6     | Yes  | ADDD F6, F8, F2  | Write  | F6    | #4+F2      |
|      | 7     |      |                  |        |       |            |
|      | 8     |      |                  |        |       |            |
|      | 9     |      |                  |        |       |            |
|      | 10    |      |                  |        |       |            |

|       | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | No   |         |
| Load3 |      |         |

| Cycle 13 | F0  | F2 | F4 | F6  | F8  | F10 | F12 | … | F30 |
|----------|-----|----|----|-----|-----|-----|-----|---|-----|
| Reorder# | #3  |    |    | #6  | #4  | #5  |     |   |     |
| Busy     | Yes | No | No | Yes | Yes | Yes | No  |   | No  |



Reservation Station

| Time | Name  | Busy | Op   | Vj               | Vk       | Qj | Qk | Dest |
|------|-------|------|------|------------------|----------|----|----|------|
| 0    | Add1  | No   |      |                  |          |    |    |      |
| 0    | Add2  | No   |      |                  |          |    |    |      |
| 0    | Add3  | No   |      |                  |          |    |    |      |
| 0    | Mult1 | Yes  | MULT | Mem[45+Regs[R3]] | Regs[F4] |    |    | #3   |
| 0    | Mult2 | Yes  | DIV  |                  | Regs[F6] | #3 |    | #5   |

Reorder Buffer

|      | Entry | Busy | Instruction      | State  | Dest. | Value      |
|------|-------|------|------------------|--------|-------|------------|
|      | 1     | Yes  | LD F6, 34(R2)    | Commit | F6    | Mem[load1] |
|      | 2     | Yes  | LD F2, 45(R3)    | Commit | F2    | Mem[load2] |
| Head | 3     | Yes  | MULT F0, F2, F4  | Ex10   | F0    |            |
|      | 4     | Yes  | SUBD F8, F6, F2  | Write  | F8    | F6-#2      |
|      | 5     | Yes  | DIVD F10, F0, F6 | Issue  | F10   |            |
| Tail | 6     | Yes  | ADDD F6, F8, F2  | Write  | F6    | #4+F2      |
|      | 7     |      |                  |        |       |            |
|      | 8     |      |                  |        |       |            |
|      | 9     |      |                  |        |       |            |
|      | 10    |      |                  |        |       |            |

|       | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | No   |         |
| Load3 |      |         |

| Cycle 14 | F0  | F2 | F4 | F6  | F8  | F10 | F12 | … | F30 |
|----------|-----|----|----|-----|-----|-----|-----|---|-----|
| Reorder# | #3  |    |    | #6  | #4  | #5  |     |   |     |
| Busy     | Yes | No | No | Yes | Yes | Yes | No  |   | No  |
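
The Time column of Mult1 in the cycle 10 to cycle 14 snapshots counts down the execution cycles that remain for the MULT. The following is a minimal Python sketch of that countdown. It assumes, as the snapshots suggest, that the counter drops by one per cycle and that the result is written on the CDB in the cycle after the counter reaches zero. The function name `countdown` is hypothetical.

```python
def countdown(start_cycle, remaining):
    """Yield (cycle, remaining_time) pairs until the counter reaches zero."""
    cycle = start_cycle
    while remaining >= 0:
        yield cycle, remaining
        cycle += 1
        remaining -= 1

# Mult1 shows Time = 4 in cycle 10 (ROB state Ex6).
trace = list(countdown(10, 4))
last_exec_cycle = trace[-1][0]      # cycle in which Time reaches 0
write_cycle = last_exec_cycle + 1   # assumption: Write result one cycle later

print(trace)
print(write_cycle)
```

Under these assumptions the counter reaches zero in cycle 14, matching state Ex10 in the table above.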



Reservation Station

| Time | Name  | Busy | Op  | Vj          | Vk       | Qj | Qk | Dest |
|------|-------|------|-----|-------------|----------|----|----|------|
| 0    | Add1  | No   |     |             |          |    |    |      |
| 0    | Add2  | No   |     |             |          |    |    |      |
| 0    | Add3  | No   |     |             |          |    |    |      |
| 0    | Mult1 | No   |     |             |          |    |    |      |
| 40   | Mult2 | Yes  | DIV | #2*Regs[F4] | Regs[F6] |    |    | #5   |

Reorder Buffer

|      | Entry | Busy | Instruction      | State  | Dest. | Value      |
|------|-------|------|------------------|--------|-------|------------|
|      | 1     | Yes  | LD F6, 34(R2)    | Commit | F6    | Mem[load1] |
|      | 2     | Yes  | LD F2, 45(R3)    | Commit | F2    | Mem[load2] |
| Head | 3     | Yes  | MULT F0, F2, F4  | Write  | F0    | #2*F4      |
|      | 4     | Yes  | SUBD F8, F6, F2  | Write  | F8    | F6-#2      |
|      | 5     | Yes  | DIVD F10, F0, F6 | Issue  | F10   |            |
| Tail | 6     | Yes  | ADDD F6, F8, F2  | Write  | F6    | #4+F2      |
|      | 7     |      |                  |        |       |            |
|      | 8     |      |                  |        |       |            |
|      | 9     |      |                  |        |       |            |
|      | 10    |      |                  |        |       |            |

|       | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | No   |         |
| Load3 |      |         |

| Cycle 15 | F0  | F2 | F4 | F6  | F8  | F10 | F12 | … | F30 |
|----------|-----|----|----|-----|-----|-----|-----|---|-----|
| Reorder# | #3  |    |    | #6  | #4  | #5  |     |   |     |
| Busy     | Yes | No | No | Yes | Yes | Yes | No  |   | No  |
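
In cycle 15 the MULT result is broadcast on the CDB with tag #3: the ROB entry records the value, and Mult2, which was waiting with Qj = #3, captures it as Vj. The sketch below illustrates that Write-result step in Python. The class `RSEntry` and the function `broadcast` are hypothetical names, values are kept as symbolic strings as in the tables, and freeing the producing station at broadcast time is an assumption based on step 3 of the algorithm description.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RSEntry:
    name: str
    busy: bool = False
    op: str = ""
    vj: Optional[str] = None   # operand values (symbolic)
    vk: Optional[str] = None
    qj: Optional[int] = None   # ROB number that will produce Vj
    qk: Optional[int] = None
    dest: Optional[int] = None # ROB number that receives the result

def broadcast(stations, rob_values, tag, value):
    """Write result: deliver (tag, value) to the ROB and to waiting stations."""
    rob_values[tag] = value
    for rs in stations:
        if not rs.busy:
            continue
        if rs.qj == tag:             # waiting operand j arrives
            rs.vj, rs.qj = value, None
        if rs.qk == tag:             # waiting operand k arrives
            rs.vk, rs.qk = value, None
        if rs.dest == tag:           # producing station is released
            rs.busy = False

mult1 = RSEntry("Mult1", True, "MULT", "Mem[45+Regs[R3]]", "Regs[F4]", dest=3)
mult2 = RSEntry("Mult2", True, "DIV", None, "Regs[F6]", qj=3, dest=5)
rob_values = {}
broadcast([mult1, mult2], rob_values, 3, "#2*Regs[F4]")
print(mult2.vj, mult2.qj, mult1.busy)
```

After the call, Mult2 holds both operands and can begin executing in the next cycle, which is the state the cycle 15 table shows.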



Reservation Station

| Time | Name  | Busy | Op  | Vj          | Vk       | Qj | Qk | Dest |
|------|-------|------|-----|-------------|----------|----|----|------|
| 0    | Add1  | No   |     |             |          |    |    |      |
| 0    | Add2  | No   |     |             |          |    |    |      |
| 0    | Add3  | No   |     |             |          |    |    |      |
| 0    | Mult1 | No   |     |             |          |    |    |      |
| 39   | Mult2 | Yes  | DIV | #2*Regs[F4] | Regs[F6] |    |    | #5   |

Reorder Buffer

|      | Entry | Busy | Instruction      | State  | Dest. | Value      |
|------|-------|------|------------------|--------|-------|------------|
|      | 1     | Yes  | LD F6, 34(R2)    | Commit | F6    | Mem[load1] |
|      | 2     | Yes  | LD F2, 45(R3)    | Commit | F2    | Mem[load2] |
|      | 3     | Yes  | MULT F0, F2, F4  | Commit | F0    | #2*F4      |
| Head | 4     | Yes  | SUBD F8, F6, F2  | Write  | F8    | F6-#2      |
|      | 5     | Yes  | DIVD F10, F0, F6 | Ex1    | F10   |            |
| Tail | 6     | Yes  | ADDD F6, F8, F2  | Write  | F6    | #4+F2      |
|      | 7     |      |                  |        |       |            |
|      | 8     |      |                  |        |       |            |
|      | 9     |      |                  |        |       |            |
|      | 10    |      |                  |        |       |            |

|       | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | No   |         |
| Load3 |      |         |

| Cycle 16 | F0 | F2 | F4 | F6  | F8  | F10 | F12 | … | F30 |
|----------|----|----|----|-----|-----|-----|-----|---|-----|
| Reorder# |    |    |    | #6  | #4  | #5  |     |   |     |
| Busy     | No | No | No | Yes | Yes | Yes | No  |   | No  |



Reservation Station

| Time | Name  | Busy | Op  | Vj          | Vk       | Qj | Qk | Dest |
|------|-------|------|-----|-------------|----------|----|----|------|
| 0    | Add1  | No   |     |             |          |    |    |      |
| 0    | Add2  | No   |     |             |          |    |    |      |
| 0    | Add3  | No   |     |             |          |    |    |      |
| 0    | Mult1 | No   |     |             |          |    |    |      |
| 38   | Mult2 | Yes  | DIV | #2*Regs[F4] | Regs[F6] |    |    | #5   |

Reorder Buffer

|      | Entry | Busy | Instruction      | State  | Dest. | Value      |
|------|-------|------|------------------|--------|-------|------------|
|      | 1     | Yes  | LD F6, 34(R2)    | Commit | F6    | Mem[load1] |
|      | 2     | Yes  | LD F2, 45(R3)    | Commit | F2    | Mem[load2] |
|      | 3     | Yes  | MULT F0, F2, F4  | Commit | F0    | #2*F4      |
|      | 4     | Yes  | SUBD F8, F6, F2  | Commit | F8    | F6-#2      |
| Head | 5     | Yes  | DIVD F10, F0, F6 | Ex2    | F10   |            |
| Tail | 6     | Yes  | ADDD F6, F8, F2  | Write  | F6    | #4+F2      |
|      | 7     |      |                  |        |       |            |
|      | 8     |      |                  |        |       |            |
|      | 9     |      |                  |        |       |            |
|      | 10    |      |                  |        |       |            |

|       | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | No   |         |
| Load3 |      |         |

| Cycle 17 | F0 | F2 | F4 | F6  | F8 | F10 | F12 | … | F30 |
|----------|----|----|----|-----|----|-----|-----|---|-----|
| Reorder# |    |    |    | #6  |    | #5  |     |   |     |
| Busy     | No | No | No | Yes | No | Yes | No  |   | No  |
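
Between cycles 16 and 17 the SUBD at the head of the ROB commits: its value goes to F8 and the register status entry for F8 is cleared, while F6 keeps its tag #6 because a later instruction (ADDD) still owns that register. The sketch below illustrates this Commit step in Python. The function `commit_head` and the dictionary layout are hypothetical, values are symbolic strings as in the tables, and ROB wrap-around, stores and branches are ignored.

```python
def commit_head(rob, head, regs, reg_status):
    """Try to commit the ROB entry at `head`; return the new head index."""
    entry = rob[head]
    if entry["state"] != "Write":
        return head                      # head not finished: nothing commits
    dest = entry["dest"]
    regs[dest] = entry["value"]          # update the architectural register
    if reg_status.get(dest) == head:     # clear Busy only if this entry still owns it
        del reg_status[dest]
    entry["state"] = "Commit"
    return head + 1

rob = {
    4: {"instr": "SUBD F8, F6, F2", "state": "Write", "dest": "F8", "value": "F6-#2"},
    5: {"instr": "DIVD F10, F0, F6", "state": "Ex1", "dest": "F10", "value": None},
    6: {"instr": "ADDD F6, F8, F2", "state": "Write", "dest": "F6", "value": "#4+F2"},
}
regs = {}
reg_status = {"F6": 6, "F8": 4, "F10": 5}   # register -> ROB number (cycle 16)

head = commit_head(rob, 4, regs, reg_status)            # SUBD commits
head_again = commit_head(rob, head, regs, reg_status)   # DIVD is still executing
print(head, head_again, reg_status)
```

The second call leaves everything unchanged, which is why the ADDD in entry 6 stays in the Write state even though its result is ready.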



Reservation Station

| Time | Name  | Busy | Op  | Vj          | Vk       | Qj | Qk | Dest |
|------|-------|------|-----|-------------|----------|----|----|------|
| 0    | Add1  | No   |     |             |          |    |    |      |
| 0    | Add2  | No   |     |             |          |    |    |      |
| 0    | Add3  | No   |     |             |          |    |    |      |
| 0    | Mult1 | No   |     |             |          |    |    |      |
| 37   | Mult2 | Yes  | DIV | #2*Regs[F4] | Regs[F6] |    |    | #5   |

Reorder Buffer

|      | Entry | Busy | Instruction      | State  | Dest. | Value      |
|------|-------|------|------------------|--------|-------|------------|
|      | 1     | Yes  | LD F6, 34(R2)    | Commit | F6    | Mem[load1] |
|      | 2     | Yes  | LD F2, 45(R3)    | Commit | F2    | Mem[load2] |
|      | 3     | Yes  | MULT F0, F2, F4  | Commit | F0    | #2*F4      |
|      | 4     | Yes  | SUBD F8, F6, F2  | Commit | F8    | F6-#2      |
| Head | 5     | Yes  | DIVD F10, F0, F6 | Ex3    | F10   |            |
| Tail | 6     | Yes  | ADDD F6, F8, F2  | Write  | F6    | #4+F2      |
|      | 7     |      |                  |        |       |            |
|      | 8     |      |                  |        |       |            |
|      | 9     |      |                  |        |       |            |
|      | 10    |      |                  |        |       |            |

|       | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | No   |         |
| Load3 |      |         |

| Cycle 18 | F0 | F2 | F4 | F6  | F8 | F10 | F12 | … | F30 |
|----------|----|----|----|-----|----|-----|-----|---|-----|
| Reorder# |    |    |    | #6  |    | #5  |     |   |     |
| Busy     | No | No | No | Yes | No | Yes | No  |   | No  |



Continue for 37 more cycles …



Reservation Station

| Time | Name  | Busy | Op  | Vj          | Vk       | Qj | Qk | Dest |
|------|-------|------|-----|-------------|----------|----|----|------|
| 0    | Add1  | No   |     |             |          |    |    |      |
| 0    | Add2  | No   |     |             |          |    |    |      |
| 0    | Add3  | No   |     |             |          |    |    |      |
| 0    | Mult1 | No   |     |             |          |    |    |      |
| 0    | Mult2 | Yes  | DIV | #2*Regs[F4] | Regs[F6] |    |    | #5   |

Reorder Buffer

|      | Entry | Busy | Instruction      | State  | Dest. | Value      |
|------|-------|------|------------------|--------|-------|------------|
|      | 1     | Yes  | LD F6, 34(R2)    | Commit | F6    | Mem[load1] |
|      | 2     | Yes  | LD F2, 45(R3)    | Commit | F2    | Mem[load2] |
|      | 3     | Yes  | MULT F0, F2, F4  | Commit | F0    | #2*F4      |
|      | 4     | Yes  | SUBD F8, F6, F2  | Commit | F8    | F6-#2      |
| Head | 5     | Yes  | DIVD F10, F0, F6 | Ex40   | F10   | #3/F6      |
| Tail | 6     | Yes  | ADDD F6, F8, F2  | Write  | F6    | #4+F2      |
|      | 7     |      |                  |        |       |            |
|      | 8     |      |                  |        |       |            |
|      | 9     |      |                  |        |       |            |
|      | 10    |      |                  |        |       |            |

|       | Busy | Address |
|-------|------|---------|
| Load1 | No   |         |
| Load2 | No   |         |
| Load3 |      |         |

| Cycle 55 | F0 | F2 | F4 | F6  | F8 | F10 | F12 | … | F30 |
|----------|----|----|----|-----|----|-----|-----|---|-----|
| Reorder# |    |    |    | #6  |    | #5  |     |   |     |
| Busy     | No | No | No | Yes | No | Yes | No  |   | No  |



## Tomasulo With Reorder Buffer-Cycle 56

| Time | Name  | Busy | 0p | Vј | Vk | Qj | Qk | Dest |
|------|-------|------|----|----|----|----|----|------|
| 0    | Add1  | No   |    |    |    |    |    |      |
| 0    | Add2  | No   |    |    |    |    |    |      |
| 0    | Add3  | No   |    |    |    |    |    |      |
| 0    | Mult1 | No   |    |    |    |    |    |      |
| 0    | Mult2 | No   |    |    |    |    |    |      |

Reservation Station

| Entry | Busy | Instruction      | State  | Dest. | Value      |
|-------|------|------------------|--------|-------|------------|
| 1     | Yes  | LD F6, 34 (R2)   | Commit | F6    | Mem[load1] |
| 2     | Yes  | LD F2, 45 (R3)   | Commit | F2    | Mem[1oad2] |
| 3     | Yes  | MULT F0, F2, F4  | Commit | F0    | #2*F4      |
| 4     | Yes  | SUBD F8, F6, F2  | Commit | F8    | F6-#2      |
| 5     | Yes  | DIVD F10, F0, F6 | Write  | F10   | #3/F6      |
| 6     | Yes  | ADDD F6, F8, F2  | Write  | F6    | #4+F2      |
| 7     |      |                  |        |       |            |
| 8     |      |                  |        |       |            |
| 9     |      |                  |        |       |            |
| 10    |      |                  |        |       |            |

Reorder Buffer

| Load buffer | Busy | Address |
|-------------|------|---------|
| Load1       | No   |         |
| Load2       | No   |         |
| Load3       |      |         |

Cycle: 56

|          | F0 | F2 | F4 | F6  | F8 | F10 | F12 | ... | F30 |
|----------|----|----|----|-----|----|-----|-----|-----|-----|
| Reorder# |    |    |    | #6  |    | #5  |     |     |     |
| Busy     | No | No | No | Yes | No | Yes | No  |     | No  |



## Tomasulo With Reorder Buffer-Cycle 57

| Time | Name  | Busy | Op | Vj | Vk | Qj | Qk | Dest |
|------|-------|------|----|----|----|----|----|------|
| 0    | Add1  | No   |    |    |    |    |    |      |
| 0    | Add2  | No   |    |    |    |    |    |      |
| 0    | Add3  | No   |    |    |    |    |    |      |
| 0    | Mult1 | No   |    |    |    |    |    |      |
| 0    | Mult2 | No   |    |    |    |    |    |      |

Reservation Station

| Entry | Busy | Instruction      | State  | Dest. | Value      |
|-------|------|------------------|--------|-------|------------|
| 1     | Yes  | LD F6, 34 (R2)   | Commit | F6    | Mem[load1] |
| 2     | Yes  | LD F2, 45 (R3)   | Commit | F2    | Mem[load2] |
| 3     | Yes  | MULT F0, F2, F4  | Commit | F0    | #2*F4      |
| 4     | Yes  | SUBD F8, F6, F2  | Commit | F8    | F6-#2      |
| 5     | Yes  | DIVD F10, F0, F6 | Commit | F10   | #3/F6      |
| 6     | Yes  | ADDD F6, F8, F2  | Write  | F6    | #4+F2      |
| 7     |      |                  |        |       |            |
| 8     |      |                  |        |       |            |
| 9     |      |                  |        |       |            |
| 10    |      |                  |        |       |            |

Reorder Buffer

| Load buffer | Busy | Address |
|-------------|------|---------|
| Load1       | No   |         |
| Load2       | No   |         |
| Load3       |      |         |

Cycle: 57

|          | F0 | F2 | F4 | F6  | F8 | F10 | F12 | ... | F30 |
|----------|----|----|----|-----|----|-----|-----|-----|-----|
| Reorder# |    |    |    | #6  |    |     |     |     |     |
| Busy     | No | No | No | Yes | No | No  | No  |     | No  |



## Tomasulo With Reorder Buffer-Cycle 58

| Time | Name  | Busy | Op | Vj | Vk | Qj | Qk | Dest |
|------|-------|------|----|----|----|----|----|------|
| 0    | Add1  | No   |    |    |    |    |    |      |
| 0    | Add2  | No   |    |    |    |    |    |      |
| 0    | Add3  | No   |    |    |    |    |    |      |
| 0    | Mult1 | No   |    |    |    |    |    |      |
| 0    | Mult2 | No   |    |    |    |    |    |      |

Reservation Station

| Entry | Busy | Instruction      | State  | Dest. | Value      |
|-------|------|------------------|--------|-------|------------|
| 1     | Yes  | LD F6, 34 (R2)   | Commit | F6    | Mem[load1] |
| 2     | Yes  | LD F2, 45 (R3)   | Commit | F2    | Mem[load2] |
| 3     | Yes  | MULT F0, F2, F4  | Commit | F0    | #2*F4      |
| 4     | Yes  | SUBD F8, F6, F2  | Commit | F8    | F6-#2      |
| 5     | Yes  | DIVD F10, F0, F6 | Commit | F10   | #3/F6      |
| 6     | Yes  | ADDD F6, F8, F2  | Commit | F6    | #4+F2      |
| 7     |      |                  |        |       |            |
| 8     |      |                  |        |       |            |
| 9     |      |                  |        |       |            |
| 10    |      |                  |        |       |            |

Reorder Buffer

| Load buffer | Busy | Address |
|-------------|------|---------|
| Load1       | No   |         |
| Load2       | No   |         |
| Load3       |      |         |

Cycle: 58

|          | F0 | F2 | F4 | F6 | F8 | F10 | F12 | ... | F30 |
|----------|----|----|----|----|----|-----|-----|-----|-----|
| Reorder# |    |    |    |    |    |     |     |     |     |
| Busy     | No | No | No | No | No | No  | No  |     | No  |



## Tomasulo With Reorder Buffer-Summary

| Instruction      | Issue | Exec Comp | WriteBack | Commit |
|------------------|-------|-----------|-----------|--------|
| LD F6, 34 (R2)   | 1     | 2         | 3         | 4      |
| LD F2, 45 (R3)   | 2     | 3         | 4         | 5      |
| MULT F0, F2, F4  | 3     | 5~14      | 15        | 16     |
| SUBD F8, F6, F2  | 4     | 5~6       | 7         | 17     |
| DIVD F10, F0, F6 | 5     | 16~55     | 56        | 57     |
| ADDD F6, F8, F2  | 6     | 8~9       | 10        | 58     |

In-order Issue/Commit; Out-of-Order Execution/WriteBack
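
The table can be reproduced with a small dataflow calculation. The sketch below is illustrative only and is not the simulator behind these slides. It assumes one instruction issues per cycle, that reservation stations, ROB entries and the CDB are never a bottleneck, that write-back happens in the cycle after execution ends, and that the latencies (1 cycle for LD, 2 for ADDD/SUBD, 10 for MULT, 40 for DIVD) are the ones implied by the Exec Comp column. The names `PROGRAM` and `schedule` are hypothetical.

```python
# (instruction, destination, source registers, execution latency in cycles)
# Latencies are read off the Exec Comp column above (an assumption).
PROGRAM = [
    ("LD F6, 34(R2)",    "F6",  [],           1),
    ("LD F2, 45(R3)",    "F2",  [],           1),
    ("MULT F0, F2, F4",  "F0",  ["F2", "F4"], 10),
    ("SUBD F8, F6, F2",  "F8",  ["F6", "F2"], 2),
    ("DIVD F10, F0, F6", "F10", ["F0", "F6"], 40),
    ("ADDD F6, F8, F2",  "F6",  ["F8", "F2"], 2),
]


def schedule(program):
    """Return (name, issue, exec_start, exec_end, write_back, commit) per instruction."""
    producer_wb = {}   # register -> write-back cycle of its most recent producer
    prev_commit = 0
    rows = []
    for issue, (name, dest, srcs, latency) in enumerate(program, start=1):
        # Operands are usable the cycle after their producer writes the CDB.
        ready = max((producer_wb.get(r, 0) for r in srcs), default=0)
        start = max(issue, ready) + 1
        end = start + latency - 1
        wb = end + 1
        # In-order commit: after our own write-back and after the previous commit.
        commit = max(wb, prev_commit) + 1
        prev_commit = commit
        producer_wb[dest] = wb   # later readers of dest now wait for this instruction
        rows.append((name, issue, start, end, wb, commit))
    return rows


for row in schedule(PROGRAM):
    print(row)
```

Because `producer_wb` is updated only after an instruction's sources have been looked up, DIVD reads F6 from the first LD rather than from the later ADDD, which is the effect of renaming through ROB entry numbers.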



| Label | Instruction      |
|-------|------------------|
| Loop: | L.S F0, 0(R1)    |
|       | L.S F1, 0(R2)    |
|       | ADD.S F2, F1, F0 |
|       | S.S F2, 0(R1)    |
|       | ADDI R1, R1, #4  |
|       | ADDI R2, R2, #4  |
|       | SUBI R3, R3, #1  |
|       | BNEZ R3, Loop    |

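#### · Assumptions:

- Load and store units: computing the memory address takes 2 cycles; accessing the cache takes 1 cycle
- Floating-point ADD: execution takes 6 cycles
- A store is split internally into two operations: S.S-A computes the memory address; S.S-D accesses the cache
- Other integer operations: execution takes 2 cycles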



# Tomasulo algorithm execution example (no prediction)

|     | Instruction     | Issue | Exe Start | Exe End | Cache | CDB  | Notes         |
|-----|-----------------|-------|-----------|---------|-------|------|---------------|
| I1  | L.S F0, 0(R1)   | 1     | 2         | 3       | (4)   | (5)  |               |
| I2  | L.S F1, 0(R2)   | 2     | 3         | 4       | (5)   | (6)  |               |
| I3  | ADD.S F2,F1,F0  | 3     | 7         | 12      |       | (13) | Waits for F1  |
| I4  | S.S-A F2, 0(R1) | 4     | 5         | 6       |       |      |               |
| I5  | S.S-D F2, 0(R1) | 5     | 14        | 15      | (16)  |      | Waits for F2  |
| I6  | ADDI R1, R1, #4 | 6     | 7         | 8       |       | (9)  |               |
| I7  | ADDI R2, R2, #4 | 7     | 8         | 9       |       | (10) |               |
| I8  | SUBI R3, R3, #1 | 8     | 9         | 10      |       | (11) |               |
| I9  | BNEZ R3, Loop   | 9     | 12        | 13      |       | (14) | Waits for R3  |
| I10 | L.S F0, 0(R1)   | 15    | 16        | 17      | (18)  | (19) | Waits for I9  |
| I11 | L.S F1, 0(R2)   | 16    | 17        | 18      | (19)  | (20) |               |
| I12 | ADD.S F2,F1,F0  | 17    | 21        | 26      |       | (27) | Waits for F1  |



# Tomasulo algorithm execution example (with prediction)

|     | Instruction     | Issue | Exe Start | Exe End | Cache | CDB  | Commit | Notes                                                                   |
|-----|-----------------|-------|-----------|---------|-------|------|--------|-------------------------------------------------------------------------|
| I1  | L.S F0, 0(R1)   | 1     | 2         | 3       | 4     | (5)  | 6      |                                                                         |
| I2  | L.S F1, 0(R2)   | 2     | 3         | 4       | 5     | (6)  | 7      |                                                                         |
| I3  | ADD.S F2,F1,F0  | 3     | 7         | 12      |       | (13) | 14     | Waits for F1                                                            |
| I4  | S.S-A F2, 0(R1) | 4     | 5         | 6       |       |      |        |                                                                         |
| I5  | S.S-D F2, 0(R1) | 5     | 14        | 15      | 16    |      | (17)   | Waits for F2                                                            |
| I6  | ADDI R1, R1, #4 | 6     | 7         | 8       |       | (9)  | (18)   |                                                                         |
| I7  | ADDI R2, R2, #4 | 7     | 8         | 9       |       | (10) | (19)   |                                                                         |
| I8  | SUBI R3, R3, #1 | 8     | 9         | 10      |       | (11) | (20)   |                                                                         |
| I9  | BNEZ R3, Loop   | 9     | 14        | 15      | -     | (16) | (21)   | Waits for R3; CDB contention in the write-result stage with I10 and I11 |
| I10 | L.S F0, 0(R1)   | 10    | 11        | 12      | 13    | (14) | (22)   |                                                                         |
| I11 | L.S F1, 0(R2)   | 11    | 12        | 13      | 14    | (15) | (23)   |                                                                         |
| I12 | ADD.S F2,F1,F0  | 12    | 16        | 21      |       | (22) | (24)   | Waits for F1                                                            |
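
The effect of prediction on issue timing can be checked with a rough timing model. The sketch below is an assumption-laden illustration, not the tool used to produce the two tables: it uses the latencies stated in the assumptions, treats S.S-D as a 2-cycle operation (as the tables imply), takes `ADDI R1,R1,#4` from the loop listing, and ignores CDB contention and commit. Without prediction it stalls issue until the branch result is on the CDB. The names `LOOP`, `TWO_ITERATIONS` and `schedule` are hypothetical.

```python
# (label, kind, destination, sources, execution latency in cycles)
LOOP = [
    ("L.S F0,0(R1)",   "load",   "F0", ["R1"],       2),
    ("L.S F1,0(R2)",   "load",   "F1", ["R2"],       2),
    ("ADD.S F2,F1,F0", "alu",    "F2", ["F1", "F0"], 6),
    ("S.S-A F2,0(R1)", "store",  None, ["R1"],       2),
    ("S.S-D F2,0(R1)", "store",  None, ["F2"],       2),
    ("ADDI R1,R1,#4",  "alu",    "R1", ["R1"],       2),
    ("ADDI R2,R2,#4",  "alu",    "R2", ["R2"],       2),
    ("SUBI R3,R3,#1",  "alu",    "R3", ["R3"],       2),
    ("BNEZ R3,Loop",   "branch", None, ["R3"],       2),
]
TWO_ITERATIONS = LOOP + LOOP[:3]   # I1..I12 as in the tables


def schedule(instrs, predict):
    """Return (label, issue, exe_start, exe_end, cdb) rows; cdb is None for stores."""
    ready_at = {}     # register -> cycle its newest value appears on the CDB
    next_issue = 1
    rows = []
    for label, kind, dest, srcs, latency in instrs:
        issue = next_issue
        operands = max((ready_at.get(r, 0) for r in srcs), default=0)
        start = max(issue, operands) + 1
        end = start + latency - 1
        cdb = None
        if kind == "load":
            cdb = end + 2          # one cache-access cycle, then the CDB
        elif kind in ("alu", "branch"):
            cdb = end + 1
        if dest is not None:
            ready_at[dest] = cdb
        next_issue = issue + 1
        if kind == "branch" and not predict:
            next_issue = cdb + 1   # wait for the branch outcome before issuing more
        rows.append((label, issue, start, end, cdb))
    return rows


no_pred = schedule(TWO_ITERATIONS, predict=False)
pred = schedule(TWO_ITERATIONS, predict=True)
print("I12 CDB without prediction:", no_pred[-1][4])
print("I12 CDB with prediction:   ", pred[-1][4])
```

Under these assumptions the model matches the no-prediction table row for row. With prediction it matches I10 to I12, but it places BNEZ at 12/13/14 instead of 14/15/16, because the CDB contention noted for I9 is not modeled.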






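## Using the ROB to maintain precise machine state

### · The ROB maintains precise machine state and allows speculative execution

- An instruction enters the commit stage only after it is confirmed to raise no exception
- An instruction enters the commit stage only after the branch prediction it depends on is confirmed correct
- On an exception or a misprediction
  - Flush the ROB, the RS, and the register result status table

### · Memory operations use a similar approach

- Memory Ordering Buffer (MOB)
  - The result of a store is first held in the MOB; in the commit stage, stores are committed in program order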
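
The commit rule above can be sketched in a few lines. This is a minimal illustration under simplifying assumptions, not a description of any real processor: it retires one instruction per call, models only branch mispredictions (an exception would take the same flush path), and leaves out the RS and the register result status table. The class `ReorderBuffer` and its method names are hypothetical.

```python
from collections import deque


class ReorderBuffer:
    def __init__(self):
        self.entries = deque()   # oldest instruction (head) is at the left
        self.regs = {}           # architectural registers, updated only at commit

    def issue(self, dest, is_branch=False):
        """Allocate an entry at the tail in program order."""
        entry = {"dest": dest, "value": None, "done": False,
                 "is_branch": is_branch, "mispredicted": False}
        self.entries.append(entry)
        return entry

    def write_result(self, entry, value=None, mispredicted=False):
        """Record a result; this may happen out of program order."""
        entry["value"] = value
        entry["done"] = True
        entry["mispredicted"] = mispredicted

    def commit(self):
        """Try to retire the head entry; return 'stall', 'commit' or 'flush'."""
        if not self.entries or not self.entries[0]["done"]:
            return "stall"
        head = self.entries.popleft()
        if head["is_branch"] and head["mispredicted"]:
            self.entries.clear()   # discard every younger, speculative entry
            return "flush"
        if head["dest"] is not None:
            self.regs[head["dest"]] = head["value"]
        return "commit"


rob = ReorderBuffer()
a = rob.issue("R1")
b = rob.issue(None, is_branch=True)
c = rob.issue("R2")                  # issued speculatively after the branch
rob.write_result(c, 99)              # finishes first, but must not commit yet
print(rob.commit())                  # head is not done
rob.write_result(a, 7)
print(rob.commit(), rob.regs)
rob.write_result(b, mispredicted=True)
print(rob.commit(), rob.regs)        # R2 never reaches the register file
```

Keeping speculative values in the buffer until commit is what makes recovery cheap: a flush only discards entries and never has to undo a register write.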



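## Memory Disambiguation: handling data dependences through memory references

Question: given an instruction sequence, are a store and a load dependent? That is, does the following code contain a dependence?
 E.g.: st 0(R2),R5 followed by ld R6,0(R3)

- · Can we start the ld early?
  - The store address may not be available until much later.
  - We may want to start both operations in the same cycle.
- Two approaches:
  - No speculation: do not perform the load until we are sure that the addresses 0(R2) ≠ 0(R3)
  - Speculation: assume the two are either dependent or independent (called "dependence speculation"); if the guess turns out to be wrong, recover through the ROB
- Reference: Gonzalez, A., et al. (2011). "Processor
  Microarchitecture: An Implementation Perspective." Synthesis
  Lectures on Computer Architecture #12, Morgan & Claypool
  Publishers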



## Memory Disambiguation

**TABLE 6.1:** Memory disambiguation schemes.

| NAME                            | SPECULATIVE | DESCRIPTION                                                                                                                                |
|---------------------------------|-------------|--------------------------------------------------------------------------------------------------------------------------------------------|
| Total Ordering                  | No          | All memory accesses are processed in order.                                                                                                |
| Partial Ordering                | No          | All stores are processed in order, but loads execute out of order as long as all previous stores have computed their address.              |
| Load Ordering<br>Store Ordering | No          | Execution between loads and stores is out of order, but all loads execute in order among them, and all stores execute in order among them. |
| Store Ordering                  | Yes         | Stores execute in order, but loads execute completely out of order.                                                                        |

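· Basic rule of the non-speculative schemes: the current memory instruction may perform its memory operation only after the store instructions ahead of it have computed their memory addresses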
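
The difference between the non-speculative rule and the speculative Store Ordering scheme can be sketched as two small checks. This is an illustrative simplification rather than the scheme from the referenced book: addresses are plain integers, an unresolved store address is represented by `None`, and a load whose address matches an older store simply waits (store-to-load forwarding is not modeled). Both function names are hypothetical.

```python
def may_execute_nonspeculative(load_addr, older_store_addrs):
    """Non-speculative check: every older store address is known and differs."""
    return all(a is not None and a != load_addr for a in older_store_addrs)


def find_violations(store_addr, younger_executed_loads):
    """Speculative check, run when a store address resolves.

    younger_executed_loads holds (name, address) pairs for loads that already
    executed ahead of this store; any match read stale data and must be redone.
    """
    return [name for name, addr in younger_executed_loads if addr == store_addr]


# Non-speculative: the load waits while an older store address is unknown.
print(may_execute_nonspeculative(0x100, [0x200, None]))
print(may_execute_nonspeculative(0x100, [0x200, 0x300]))

# Speculative: loads ran early; the store address later resolves to 0x100.
print(find_violations(0x100, [("ld1", 0x100), ("ld2", 0x104)]))
```

In the speculative case, the loads returned by `find_violations` and everything after them would be squashed and re-executed through the ROB recovery mechanism.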



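## Summary: Tomasulo #1/3

#### · Reservation stations: register renaming and buffering of source operands

- Keep the register file from becoming a bottleneck
- Avoid the WAR and WAW hazards that the Scoreboard cannot resolve
- Allow the hardware to perform loop unrolling
- Not limited to basic blocks (the integer unit runs ahead and resolves control dependences)

#### · Reorder Buffer:

- Provides a mechanism for undoing instruction execution
- Instructions are held in the ROB in issue order
- Instructions commit in order

#### · Branch prediction is essential for performance

- Speculative execution: start executing before the control dependence has been resolved
- Speculative execution relies on the ROB's mechanism for undoing instructions
  - On a misprediction, the speculatively executed instructions are undone
- BHT-based branch prediction
- BTB-based branch prediction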



## Summary: Tomasulo #2/3

#### · Contributions

- Dynamic scheduling
- Register renaming
- Load/store disambiguation
- · After the 360/91, the Pentium II, PowerPC 604, MIPS R10000, HP-PA 8000, and Alpha 21264 used this technique
- · Drawbacks:
  - Too many value copy operations
    - Register File → RS → ROB → Register File
  - Too many muxes/busses (CDB)
    - Values are from everywhere to everywhere else!
  - Reservation Stations mix values (data) and tags (control)
    - Slow down max clock frequency



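## Summary: Tomasulo #3/3

### · Memory access disambiguation

- Non-speculative disambiguation
  - Total Ordering
  - Partial Ordering
    - Once the stores ahead of a load have computed their addresses, the load may access memory out of order
  - Load Ordering, Store Ordering
    - Once the memory instructions ahead of a load have computed their addresses, the load at the head of the load queue may access memory before a store does
- Speculative execution
  - Store Ordering
  - Assume that a load is independent of earlier stores whose effective addresses have not yet been computed.
- · Question: rank the four schemes by their ability to exploit parallelism.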



## Acknowledgements

- These slides contain material developed and copyrighted by:
  - John Kubiatowicz (UCB)
  - Krste Asanovic (UCB)
  - John Hennessy (Stanford) and David Patterson (UCB)
  - Chenxi Zhang (Tongji)
  - Muhamed Mudawar (KFUPM)
- UCB material derived from courses CS152, CS252, CS61C
- KFUPM material derived from courses COE501, COE502