## HWY

ID: 11062107 Name: 2131 41 1.

(a) sd, ld, beq (since they need an immediate)

(b) add, beq

(c) Branch
Mem Read

RegWrite

Mem Read

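Note: If Branch=1, the PC is updated to PC+imm instead of PC+4; since imm=0, PC=PC+0, which causes an infinite loop. If MemRead=1, then MemRead and MemWrite are both 1, so the memory is read and written at the same time and ReadData is output from address x12+0. With RegWrite=1, ReadData would be written into the Register File, but instruction[11:7]=0 and x0 cannot be written, so this has no effect.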

2.

$\mathrm{01EEA523}_{16}$

sw x30, 10(x29)
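The decoding above can be checked with a small sketch. It assumes the standard RISC-V S-type field layout (opcode in bits 6:0, imm[4:0] in bits 11:7, funct3 in bits 14:12, rs1 in bits 19:15, rs2 in bits 24:20, imm[11:5] in bits 31:25); the function name `decode_s_type` is hypothetical and only for illustration.

```python
def decode_s_type(word: int) -> dict:
    """Split a 32-bit RISC-V S-type (store) instruction into its fields."""
    opcode = word & 0x7F            # bits 6:0
    imm_lo = (word >> 7) & 0x1F     # bits 11:7  -> imm[4:0]
    funct3 = (word >> 12) & 0x7     # bits 14:12
    rs1 = (word >> 15) & 0x1F       # bits 19:15 (base address register)
    rs2 = (word >> 20) & 0x1F       # bits 24:20 (register being stored)
    imm_hi = (word >> 25) & 0x7F    # bits 31:25 -> imm[11:5]
    imm = (imm_hi << 5) | imm_lo
    if imm & 0x800:                 # sign-extend the 12-bit immediate
        imm -= 0x1000
    return {"opcode": opcode, "funct3": funct3, "rs1": rs1, "rs2": rs2, "imm": imm}


fields = decode_s_type(0x01EEA523)
print(fields)
```

With opcode 0100011 and funct3 010, the fields correspond to a `sw` with rs2 = x30, rs1 = x29, and an immediate of 10.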

(a) Branch: 0

MemRead: 0

MemtoReg: X

ALUOp: 00

MemWrite: 1

ALUSrc: 1

RegWrite: 0

(b) Reg[29], 10 (= 0000 0000 0000 000A hex)

3.

(a) "add" is the least time-consuming instruction; it needs 600 ps.

add: 25 (PC read) + 235 (I-Mem) + 130 (Register File) + 15 (Mux) + 170 (ALU) + 15 (Mux) + 10 (Register setup)

= 600 ps

ld: 820 ps

sd: 25 (PC read) + 235 (I-Mem) + 130 (Register File) + 170 (ALU) + 235 (D-Mem) = 795 ps

beq: 25 (PC read) + 235 (I-Mem) + 130 (Register File) + 170 (ALU) + 15 (Mux) + 15 (Mux) + 5 (Single gate) + 10 (Register setup)

= 605 ps

(b) 820 ps

Since the clock period must be long enough for every instruction to complete, and the longest instruction, "ld", takes 820 ps, the clock period should be 820 ps.
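The per-instruction sums can be re-checked with a short sketch. The component latencies are the ones used in the sums above; the dictionary keys are made-up labels, and the `ld` path in particular is an assumption chosen to be consistent with the stated 820 ps total, since its breakdown is not written out.

```python
# Component latencies in ps (values taken from the sums in this answer).
LATENCY_PS = {
    "pc_read": 25, "i_mem": 235, "reg_file": 130, "mux": 15,
    "alu": 170, "d_mem": 235, "single_gate": 5, "reg_setup": 10,
}

# Components on each instruction's path.
PATHS = {
    "add": ["pc_read", "i_mem", "reg_file", "mux", "alu", "mux", "reg_setup"],
    # ASSUMPTION: this ld path is a guess that reproduces the stated 820 ps.
    "ld": ["pc_read", "i_mem", "reg_file", "alu", "d_mem", "mux", "reg_setup"],
    "sd": ["pc_read", "i_mem", "reg_file", "alu", "d_mem"],
    "beq": ["pc_read", "i_mem", "reg_file", "alu", "mux", "mux",
            "single_gate", "reg_setup"],
}

# Total latency per instruction is the sum of the components on its path.
totals = {name: sum(LATENCY_PS[c] for c in path) for name, path in PATHS.items()}
fastest = min(totals, key=totals.get)
clock_period = max(totals.values())  # single-cycle clock must fit the slowest one

print(totals, fastest, clock_period)
```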

4.

Original time

$1200 \times 4 / (4 \times 10^{9})$

$= 12 \times 10^{-7}$ (sec)

Improved time

\# of clocks = 1200 + 4

= 1204

$1204 / (2 \times 10^{9}) = 602 \times 10^{-9}$

speedup = $\frac{12 \times 10^{-7}}{602 \times 10^{-9}}$

= 1.99 #
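The arithmetic can be reproduced with a few lines. This sketch assumes the reading used in the expressions above: 1200 instructions at 4 cycles each on a 4 GHz clock for the original design, and 1200 + 4 cycles on a 2 GHz clock for the improved one. The variable names are illustrative only.

```python
instructions = 1200

# Original design: 4 cycles per instruction at 4 GHz (assumed reading).
original_cycles = instructions * 4
original_time = original_cycles / 4e9      # seconds

# Improved design: 1200 + 4 cycles at 2 GHz (assumed reading).
improved_cycles = instructions + 4
improved_time = improved_cycles / 2e9      # seconds

speedup = original_time / improved_time
print(original_time, improved_time, round(speedup, 2))
```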

5.

(a) RAW

① add x29, x6, x7

sub x30, x28, x29

RAW

② sub x30, x28, x29

11/X) 0 (XII)

load-use

③ ld x28, 4(x5)

••••

sub x30, x28, x29

(b) No, since the load-use hazard will not occur.
ld goes through IF ID EX MEM WB,

so there is no hazard that requires the pipeline to stall.



4 NOPs #

6.

(a)

always-not-taken

$$accuracy = \frac{5}{8} = 0.625 = 62.5\%$$

always-taken

$$accuracy = \frac{3}{8} = 0.375 = 37.5\%$$

(b) starts at T

| Ground truth | NT | T  | T | NT | NT | NT | NT | T  |
|--------------|----|----|---|----|----|----|----|----|
| State        | T  | NT | T | T  | NT | NT | NT | NT |
| Decision     | T  | NT | T | T  | NT | NT | NT | NT |
| Correctness  | ✗  | ✗  | ✓ | ✗  | ✓  | ✓  | ✓  | ✗  |

$$accuracy = \frac{4}{8} = 50 \%$$

(c)

| Ground truth | NT | T  | T  | NT | NT | NT | NT | T  |
|--------------|----|----|----|----|----|----|----|----|
| State        | ST | WT | ST | ST | WT | WN | SN | SN |
| Decision     | T  | T  | T  | T  | T  | NT | NT | NT |
| Correctness  | ✗  | ✓  | ✓  | ✗  | ✗  | ✓  | ✓  | ✗  |

ST: Strongly predict taken

WT: Weakly predict taken

WN: Weakly predict not taken

SN: Strongly predict not taken

$$accuracy = \frac{4}{8} = 50 \%$$
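The accuracies for the static, 1-bit, and 2-bit predictors can be replayed with a small simulation. This is a minimal sketch: the function names are hypothetical, the 1-bit predictor starts at T and the 2-bit predictor starts at ST as in the tables, and the 2-bit state is encoded as a saturating counter (3 = ST, 2 = WT, 1 = WN, 0 = SN).

```python
OUTCOMES = ["NT", "T", "T", "NT", "NT", "NT", "NT", "T"]


def static_accuracy(outcomes, guess):
    """Accuracy of always predicting `guess` ("T" or "NT")."""
    return sum(o == guess for o in outcomes) / len(outcomes)


def one_bit_accuracy(outcomes, state="T"):
    """1-bit predictor: predict the last outcome seen."""
    correct = 0
    for outcome in outcomes:
        correct += (state == outcome)
        state = outcome                      # remember the latest outcome
    return correct / len(outcomes)


def two_bit_accuracy(outcomes, counter=3):
    """2-bit saturating counter: 3=ST, 2=WT, 1=WN, 0=SN."""
    correct = 0
    for outcome in outcomes:
        prediction = "T" if counter >= 2 else "NT"
        correct += (prediction == outcome)
        if outcome == "T":
            counter = min(counter + 1, 3)    # move toward strongly taken
        else:
            counter = max(counter - 1, 0)    # move toward strongly not taken
    return correct / len(outcomes)


print(static_accuracy(OUTCOMES, "NT"), static_accuracy(OUTCOMES, "T"))
print(one_bit_accuracy(OUTCOMES), two_bit_accuracy(OUTCOMES))
```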

(1+4)

(a) 11 clock cycles, No #

beq x11, x12, Label

sd x15, 0(x23)

ld x15, 0(x24)

add x11, x6, x12

NOP

NOP

sub x11, x13, x12

If NOP instructions were added instead, the NOPs would also reside in memory, so fetching a NOP would cause the same structural hazard.
∴ No #

(b) beq x11, x12, Label

add x11, x6, x12

sub x11, x13, x12

sd x15, 0(x23)

ld x15, 0(x24)

9 cycles (5+4) #
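The effect of reordering can be illustrated with a rough model of a 5-stage pipeline that has a single memory port shared by instruction fetch and the MEM stage. This is a sketch under simplifying assumptions that are not stated in the answer: the branch is not taken, there are no data-hazard stalls, and only `ld` and `sd` use the port in MEM. The function name `total_cycles` is hypothetical.

```python
def total_cycles(program, stages=5, mem_stage=4):
    """Count cycles when a fetch must wait whenever ld/sd occupies MEM."""
    mem_busy = set()     # cycles in which the single memory port is taken by MEM
    cycle = 0
    for op in program:
        cycle += 1                    # earliest cycle this instruction can be fetched
        while cycle in mem_busy:      # structural hazard: delay the fetch
            cycle += 1
        if op in ("ld", "sd"):
            # This instruction reaches MEM (mem_stage - 1) cycles after its fetch.
            mem_busy.add(cycle + mem_stage - 1)
    return cycle + stages - 1         # the last instruction still has to drain


original_order = ["beq", "sd", "ld", "add", "sub"]
reordered = ["beq", "add", "sub", "sd", "ld"]
print(total_cycles(original_order), total_cycles(reordered))
```

In the original order the fetch of `sub` collides with the MEM stages of `sd` and `ld`, which costs two stall cycles; moving the memory instructions to the end removes the overlap.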

(c) $\frac{5+4+3}{5+4+1} = \frac{12}{10} = 1.2$ #

If the branch outcome is determined in ID, 1 NOP should be added between beq and sd.

If it is determined in MEM, 3 NOPs should be added between beq and sd.

(d) The original pipeline takes 5+5-1=9 cycles.

The new pipeline with only 4 stages takes 5+4-1=8 cycles. Speedup = $\frac{9}{8}=1.125$ #

(e) The clock period of the original pipeline is 220.

The new pipeline has a clock period of max(220, 180) + 27 = 247. The speedup is $\frac{220 \times 9}{247 \times 8}$
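The cycle counts and speedups in (c) through (e) follow one pattern, sketched below. The helper `pipeline_cycles` is a hypothetical name; it assumes n instructions plus any inserted NOPs need n + nops + stages - 1 cycles. For (e) the sketch uses the period 247 obtained from max(220, 180) + 27, which is an assumption about the intended value.

```python
def pipeline_cycles(n_instructions, stages, nops=0):
    """Cycles to run n instructions (plus NOPs) through an ideal pipeline."""
    return n_instructions + nops + stages - 1


# (c) branch resolved in MEM (3 NOPs) versus in ID (1 NOP), 5-stage pipeline
speedup_c = pipeline_cycles(5, 5, nops=3) / pipeline_cycles(5, 5, nops=1)

# (d) 5-stage versus 4-stage pipeline, cycle counts only
speedup_d = pipeline_cycles(5, 5) / pipeline_cycles(5, 4)

# (e) same comparison, but weighting each cycle count by its clock period
old_period = 220
new_period = max(220, 180) + 27   # ASSUMPTION: 247, as computed in the text
speedup_e = (old_period * pipeline_cycles(5, 5)) / (new_period * pipeline_cycles(5, 4))

print(speedup_c, speedup_d, speedup_e)
```

Under these assumptions the longer clock period in (e) cancels almost all of the gain from the shorter pipeline.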



|         | IF                                         | ID                  | EX                | MEM               | WB              |
|---------|--------------------------------------------|---------------------|-------------------|-------------------|-----------------|
| Cycle 1 | add x12, x6, x5                            | -                   | -                 | -                 | -               |
| Cycle 2 | sub x10, x11, x12                          | add x12, x6, x5     | -                 | -                 | -               |
| Cycle 3 | beq x11, x12, Label                        | sub x10, x11, x12   | add x12, x6, x5   | -                 | -               |
| Cycle 4 | sd x11, 0(x15)                             | beq x11, x12, Label | sub x10, x11, x12 | add x12, x6, x5   | -               |
| Cycle 5 | first instruction of the exception handler | NOP                 | NOP               | sub x10, x11, x12 | add x12, x6, x5 |