1. Run the following code on a scoreboard whose functional units include two FP multipliers, one FP divider, one FP adder that performs floating-point add and subtract operations, and two integer units that are responsible for memory access, integer ALU operations, and branches.

(1) Fill in the following tables with the scoreboard status at the end of the 5<sup>th</sup> clock cycle, when the first L.D instruction has just finished its WB stage. The scoreboard status includes the instruction status, the functional unit status, and the FP register status.

L.D F0, 0(R1) ; $F0 \leftarrow MEM[R1+0]$

L.D F4, 0(R2) ; $F4 \leftarrow MEM[R2+0]$

MUL.D F6, F0, F4 ; $F6 \leftarrow F0 \times F4$

ADD.D F8, F0, F2 ; $F8 \leftarrow F0 + F2$

S.D F6, 0(R3) ; $MEM[R3+0] \leftarrow F6$

S.D F8, 0(R4) ; $MEM[R4+0] \leftarrow F8$

| Instruction producing result | Instruction consuming result | Latency (in clock cycles) |
|------------------------------|------------------------------|---------------------------|
| FP operation                 | FP operation                 | 3                         |
| FP operation                 | FP store (S.D)               | 2                         |
| FP load (L.D)                | FP operation                 | 1                         |
| FP load (L.D)                | FP store (S.D)               | 0                         |

| Instructions   | Issue      | Read operands (RO) | Execute (EXE) | Write result (WB) |
|----------------|------------|----------|----------|---------|
|                | clock      | clock    | clock    | clock   |
| L.D F0, 0(R1)  | 1          | 2        | 3        | 5       |
| L.D F4, 0(R2)  | 2          | 3        | 4        |         |
| MUL.D F6,F0,F4 | 3          |          |          |         |
| ADD.D F8,F0,F2 | 4          |          |          |         |
| S.D F6, 0(R3)  |            |          |          |         |
| S.D F8, 0(R4)  |            |          |          |         |

|                |      | Function Unit Status |    |    |    |    |    |     |     |         |
|----------------|------|----------------------|----|----|----|----|----|-----|-----|---------|
| Function units | Busy | Op                   | Fi | Fj | Fk | Qj | Qk | Rj  | Rk  | Address |
| Integer1       | no   | L.D                  | F0 | R1 |    |    |    | no  |     | 0+R1    |
| Integer2       | yes  | L.D                  | F4 | R2 |    |    |    | no  |     | 0+R2    |
| FPMult1        | yes  | MUL.D                | F6 | F0 | F4 |    |    | yes | no  |         |
| FPMult2        |      |                      |    |    |    |    |    |     |     |         |
| FPDivider      |      |                      |    |    |    |    |    |     |     |         |
| FPAdder        | yes  | ADD.D                | F8 | F0 | F2 |    |    | yes | yes |         |

|    |    | Register Status |          |         |         |     |     |       |     |  |  |
|----|----|-----------------|----------|---------|---------|-----|-----|-------|-----|--|--|
|    | F0 | F2              | F4       | F6      | F8      | F10 | F12 | ••••• | F30 |  |  |
| FU |    |                 | Integer2 | FPMult1 | FPAdder |     |     |       |     |  |  |

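Note: the scoreboard passes operands through the register file, so a value written in one cycle can be read only in the next cycle.

(2) Fill in the following table with the instruction status at the end of the **6th** clock cycle, as in the first line of the table.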

| Instructions   | Issue | Read operands (RO)                              | Execute (EXE) | Write result (WB) |
|----------------|-------|-------------------------------------------------|---------------|-------------------|
| L.D F0, 0(R1)  | 1     | 2                                               | 3             | 5                 |
| L.D F4, 0(R2)  | 2     | 3                                               | 4             | 6                 |
| MUL.D F6,F0,F4 | 3     | (F4 WB in CLK 6)                                |               |                   |
| ADD.D F8,F0,F2 | 4     | 6 (F0 WB in CLK 5, i.e. in the previous cycle)  |               |                   |
| S.D F6, 0(R3)  | 6     |                                                 |               |                   |
| S.D F8, 0(R4)  |       |                                                 |               |                   |

|                | Function Unit Status |      |    |    |    |       |    |          |     |      |
|----------------|----------------------|------|----|----|----|-------|----|----------|-----|------|
| Function units | Busy                 | Op   | Fi | Fj | Fk | Qj    | Qk | Rj       | Rk  | A    |
| Integer1       | yes                  | S.D  |    | F6 |    | MUL.D |    | no       |     | 0+R3 |
| Integer2       | no                   |      |    |    |    |       |    |          | yes | 0+R2 |
| FPMult1        | yes                  | MUL.D | F6 | F0 | F4 |       |    | no       | no  |      |
| FPMult2        |                      |      |    |    |    |       |    |          |     |      |
| FPDivider      |                      |      |    |    |    |       |    |          |     |      |
| FPAdder        | yes                  | ADD.D | F8 | F0 | F2 |       |    | no       | no  |      |

|    |    | Register Status |    |         |          |     |     |       |     |  |  |
|----|----|-----------------|----|---------|----------|-----|-----|-------|-----|--|--|
|    | F0 | F2              | F4 | F6      | F8       | F10 | F12 | ••••• | F30 |  |  |
| FU |    |                 |    | FPMult1 | FPAdder  |     |     |       |     |  |  |
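The read-operands timing used in the scoreboard tables above (a result written back in one cycle can be read by a waiting instruction only in the following cycle) can be sketched in a few lines of Python. This is a minimal illustration, not a full scoreboard: the function name `earliest_read_cycle` and its arguments are hypothetical, and it assumes that the only things delaying the RO stage are the issue cycle and the WB cycles of the instructions producing the source operands.

```python
def earliest_read_cycle(issue_cycle, producer_wb_cycles):
    """Earliest cycle in which an instruction can read its operands.

    Assumption (taken from the worksheet's note): an operand written
    back in cycle c is readable from cycle c + 1 on, and RO cannot
    happen before the cycle after issue.
    """
    candidates = [issue_cycle] + list(producer_wb_cycles)
    return max(candidates) + 1


# ADD.D F8,F0,F2 issues in cycle 4; F0 is written back in cycle 5,
# F2 has no pending producer.
add_d_ro = earliest_read_cycle(4, [5])      # 6, as entered in the table

# MUL.D F6,F0,F4 issues in cycle 3; F0 is written back in cycle 5
# and F4 in cycle 6.
mul_d_ro = earliest_read_cycle(3, [5, 6])   # 7 under this assumption

print(add_d_ro, mul_d_ro)
```

Under this rule the two loads, which have no pending producers, read their operands in the cycle after issue (cycles 2 and 3), which agrees with the instruction status table.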

2. (30 points) For the following instruction sequence:

L.D F6, 34(R2)

L.D F2, 45(R3)

MUL.D F0, F2, F4

SUB.D F8, F2, F6

DIV.D F10, F0, F6

ADD.D F6, F8, F2

(1) Fill in the tables used by Tomasulo's algorithm at the point when the first instruction completes and finishes writing its result. Assume the memory access unit needs two clock cycles for execution: one for address calculation and one for memory access.

Time

|   | Instruction status |         |              |
|---|--------------------|---------|--------------|
|   | ISSUE              | EXECUTE | WRITE RESULT |
| 1 | 1                  | 3       | 4            |
| 2 | 2                  | 4       |              |
| 3 | 3                  |         |              |
| 4 | 4                  |         |              |
| 5 |                    |         |              |
| 6 |                    |         |              |

|       |      |       | Reservation stations |          |       |    |       |
|-------|------|-------|----------------------|----------|-------|----|-------|
| NAME  | BUSY | Op    | Vj                   | Vk       | Qj    | Qk | A     |
| Load1 | no   | L.D   | [R2]                 |          |       |    | 34+R2 |
| Load2 | yes  | L.D   | [R3]                 |          |       |    | 45+R3 |
| Add1  | yes  | SUB.D |                      | M[34+R2] | Load2 |    |       |
| Add2  |      |       |                      |          |       |    |       |
| Add3  |      |       |                      |          |       |    |       |
| Mult1 | yes  | MUL.D |                      | [F4]     | Load2 |    |       |
| Mult2 |      |       |                      |          |       |    |       |

| Register status |       |       |    |    |      |     |     |   |     |
|-----------------|-------|-------|----|----|------|-----|-----|---|-----|
| Field           | F0    | F2    | F4 | F6 | F8   | F10 | F12 | … | F30 |
| Qi              | Mult1 | Load2 |    |    | Add1 |     |     |   |     |
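The way the reservation station and register status entries for part (1) come about can be sketched as follows. This is a simplified illustration rather than a complete Tomasulo implementation: the helper names `issue` and `broadcast` and the dictionary layout are hypothetical, station allocation is done by hand, and execution timing is ignored. It only shows how source operands become either a value (V) or a tag (Q), and how a result broadcast clears matching tags.

```python
def issue(stations, reg_status, name, op, dest, srcs):
    """Place one instruction in reservation station `name` (simplified)."""
    entry = {"op": op, "V": [None, None], "Q": [None, None]}
    for i, reg in enumerate(srcs):
        tag = reg_status.get(reg)
        if tag is None:
            entry["V"][i] = f"[{reg}]"   # value already in the register file
        else:
            entry["Q"][i] = tag          # wait for the producing station
    stations[name] = entry
    reg_status[dest] = name              # rename the destination register


def broadcast(stations, reg_status, tag, value):
    """Write result: deliver `value` to every consumer waiting on `tag`."""
    for entry in stations.values():
        for i in range(2):
            if entry["Q"][i] == tag:
                entry["Q"][i] = None
                entry["V"][i] = value
    for reg in [r for r, t in reg_status.items() if t == tag]:
        del reg_status[reg]              # register no longer waits on `tag`
    stations.pop(tag, None)              # the station becomes free


stations, reg_status = {}, {}
issue(stations, reg_status, "Load1", "L.D", "F6", ())
issue(stations, reg_status, "Load2", "L.D", "F2", ())
issue(stations, reg_status, "Mult1", "MUL.D", "F0", ("F2", "F4"))
issue(stations, reg_status, "Add1", "SUB.D", "F8", ("F2", "F6"))
broadcast(stations, reg_status, "Load1", "M[34+R2]")  # first L.D writes result

print(reg_status)
print(stations["Add1"], stations["Mult1"])
```

After the broadcast, Add1 holds M[34+R2] in Vk and still waits on Load2 in Qj, Mult1 holds [F4] in Vk and waits on Load2, and the register status maps F0, F2, and F8 to Mult1, Load2, and Add1, in line with the tables above.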


Note: once Vj and Vk have been obtained from a write-back (WB), execution (EX) starts in the next cycle.

(2) Assume the following latencies: load is 1 clock cycle; add is 2 clock cycles; multiply is 10 clock cycles; divide is 40 clock cycles. Fill in the tables at the point when the MUL.D instruction is about to write its result (that is, it will write its result in the next cycle).

Time


|   | Instruction status |                |              |
|---|--------------------|----------------|--------------|
|   | ISSUE              | EXECUTE finish | WRITE RESULT |
|   | Clock              | Clock          | Clock        |
| 1 | 1                  | 3              | 4            |
| 2 | 2                  | 4              | 5            |
| 3 | 3                  | 6-16           |              |
| 4 | 4                  | 6-8            | 9            |
| 5 | 5                  |                |              |
| 6 | 6                  | 10-12          | 13           |

|       | Reservation stations |       |          |          |       |    |       |
|-------|----------------------|-------|----------|----------|-------|----|-------|
| NAME  | BUSY                 | Op    | Vj       | Vk       | Qj    | Qk | A     |
| Load1 | no                   | L.D   | [R2]     |          |       |    | 34+R2 |
| Load2 | no                   | L.D   | [R3]     |          |       |    | 45+R3 |
| Add1  | no                   | SUB.D | M[45+R3] | M[34+R2] |       |    |       |
| Add2  | no                   | ADD.D | [F8]     | M[45+R3] |       |    |       |
| Add3  |                      |       |          |          |       |    |       |
| Mult1 | yes                  | MUL.D | M[45+R3] | [F4]     |       |    |       |
| Mult2 | yes                  | DIV.D |          | M[34+R2] | Mult1 |    |       |

| Register status |       |    |    |    |    |       |     |  |     |
|-----------------|-------|----|----|----|----|-------|-----|--|-----|
| Field           | F0    | F2 | F4 | F6 | F8 | F10   | F12 |  | F30 |
| Qi              | Mult1 |    |    |    |    | Mult2 |     |  |     |
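The clock values entered in the instruction status table for part (2) appear to follow one simple rule, which the sketch below reproduces. It is an inference from the worksheet entries, not a general Tomasulo simulator. The assumptions are: one instruction issues per cycle in program order with no structural stalls; execution starts in the cycle after both issue and the last operand write-back; the "EXECUTE finish" cycle is counted as start plus latency (the convention the handwritten ranges such as 6-8 suggest); write result follows one cycle later; and there is no contention for the common data bus. The names `Instr`, `LATENCY`, and `tomasulo_timeline` are hypothetical.

```python
from dataclasses import dataclass

# Latencies given in part (2) of the question.
LATENCY = {"L.D": 1, "ADD.D": 2, "SUB.D": 2, "MUL.D": 10, "DIV.D": 40}


@dataclass
class Instr:
    op: str
    dest: str
    srcs: tuple  # FP source registers only


PROGRAM = [
    Instr("L.D", "F6", ()),
    Instr("L.D", "F2", ()),
    Instr("MUL.D", "F0", ("F2", "F4")),
    Instr("SUB.D", "F8", ("F2", "F6")),
    Instr("DIV.D", "F10", ("F0", "F6")),
    Instr("ADD.D", "F6", ("F8", "F2")),
]


def tomasulo_timeline(program, latency):
    """Return (op, issue, exec_start, exec_finish, write_result) per instruction."""
    producer_wb = {}  # register -> WB cycle of its most recent producer
    rows = []
    for issue, ins in enumerate(program, start=1):
        # Cycle in which the last source operand is written back (0 if none pending).
        ready = max((producer_wb.get(r, 0) for r in ins.srcs), default=0)
        start = max(issue, ready) + 1          # EX starts the cycle after
        finish = start + latency[ins.op]       # worksheet's counting convention
        wb = finish + 1
        rows.append((ins.op, issue, start, finish, wb))
        producer_wb[ins.dest] = wb
    return rows


rows = tomasulo_timeline(PROGRAM, LATENCY)
for row in rows:
    print(row)
```

With these assumptions the sketch gives 6-16 for MUL.D, 6-8 with write result in cycle 9 for SUB.D, and 10-12 with write result in cycle 13 for ADD.D, which agrees with the entries in the table. Because the instructions are processed in program order, SUB.D sees the first L.D as the producer of F6 even though ADD.D later overwrites F6.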
