## CS 60003: High Performance Computer Architecture CLASS-TEST - 1 [Spring 2023-2024] Model Solution

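1) (a) $T_{\text{pipelined, min}} = T_{\text{longest stage}} + d_{\text{register}} = 2\,\text{ns} + 0.1\,\text{ns} = 2.1\,\text{ns}$

(b) In the steady state, every 5 clocks, 4 instructions would be completed.

$\text{CPI}_{\text{pipelined}} = \dfrac{5 \text{ clocks}}{4 \text{ instructions}} = 1.25$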
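The arithmetic in (a) and (b) can be checked with a short sketch. The function names are illustrative only, and the inputs assume a 2 ns longest stage and a 0.1 ns register delay, with all times in ns.

```python
def pipelined_cycle_time(longest_stage_ns: float, register_delay_ns: float) -> float:
    """Minimum clock period: slowest stage plus pipeline-register overhead."""
    return longest_stage_ns + register_delay_ns


def steady_state_cpi(clocks: int, instructions: int) -> float:
    """Average clocks per instruction over a repeating steady-state window."""
    return clocks / instructions


if __name__ == "__main__":
    # Assumed values: 2 ns longest stage, 0.1 ns register delay.
    print(f"T_pipelined = {pipelined_cycle_time(2.0, 0.1):.1f} ns")
    # 4 instructions complete every 5 clocks in the steady state.
    print(f"CPI_pipelined = {steady_state_cpi(5, 4)}")
```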

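(c) Note that in a non-pipelined machine, there is no possibility of any stall. Suppose a program consists of $I$ instructions.

∴ Time taken to execute the program on the non-pipelined machine

$= I \times \text{CPI} \times T_{\text{non-pipelined}}$

Similarly, time taken on the pipelined machine $= I \times 1.25 \times T_{\text{pipelined}}$

∴ Speedup $= \dfrac{I \times \text{CPI} \times T_{\text{non-pipelined}}}{I \times 1.25 \times T_{\text{pipelined}}}$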
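As a rough sketch of the comparison in (c), the ratio of the two execution times can be computed as below. The instruction count cancels out. The non-pipelined cycle time used in the example call (7.0 ns) is a hypothetical placeholder, not a value from this solution, and the non-pipelined CPI of 1.0 is likewise an assumption for illustration.

```python
def execution_time_ns(instructions: int, cpi: float, cycle_time_ns: float) -> float:
    """Execution time = I * CPI * T."""
    return instructions * cpi * cycle_time_ns


def speedup(cpi_nonpipe: float, t_nonpipe_ns: float,
            cpi_pipe: float, t_pipe_ns: float,
            instructions: int = 1) -> float:
    """Ratio of non-pipelined to pipelined execution time (I cancels)."""
    t_np = execution_time_ns(instructions, cpi_nonpipe, t_nonpipe_ns)
    t_p = execution_time_ns(instructions, cpi_pipe, t_pipe_ns)
    return t_np / t_p


if __name__ == "__main__":
    # Hypothetical inputs: CPI 1.0 and 7.0 ns for the non-pipelined machine.
    # From the solution: CPI 1.25 and 2.1 ns for the pipelined machine.
    print(f"speedup = {speedup(1.0, 7.0, 1.25, 2.1):.3f}")
```

Passing `cpi_pipe=1.0` instead gives the variant that ignores the extra stall cycles.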

(d) If the # of pipeline stages becomes infinite, then the delay of the pipelined machine is only the delay of the registers, i.e.,

# of stages $\to \infty \implies T_{\text{pipelined}} = d_{\text{register}} = 0.1\,\text{ns}$

Both answers could be accepted.

(considering extra stall cycles)

This answer also makes sense since the clock cycle time period is almost zero (because of the infinite # of pipeline stages). Hence, the only pipeline delay now is because of the register delay.
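The limiting behaviour in (d) can be illustrated numerically. This sketch assumes the logic delay is split perfectly evenly across the stages, which is an idealisation, and the 10 ns total logic delay is a hypothetical value chosen only for the demonstration.

```python
def cycle_time_ns(total_logic_ns: float, stages: int,
                  register_delay_ns: float = 0.1) -> float:
    """Clock period for an evenly balanced pipeline: logic/stages + register delay."""
    if stages < 1:
        raise ValueError("stages must be at least 1")
    return total_logic_ns / stages + register_delay_ns


if __name__ == "__main__":
    # Hypothetical 10 ns of total logic; the period approaches the register delay.
    for n in (1, 5, 50, 500, 5_000_000):
        print(f"{n:>9} stages -> {cycle_time_ns(10.0, n):.6f} ns")
```

As the stage count grows, the per-stage logic term shrinks toward zero and only the register delay remains.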

2) (a) When there is no branch misprediction, the only stalls are because of RAW hazards, which cannot be resolved (∵ there is no forwarding), except those which can be resolved because of the nature of the register file (where WB → ID register file reading is possible).

The loop:

- `Loop: ld a1, 0(a2)`
- `addi a1, a1, <imm>`
- `sd a1, 0(a2)`
- `addi a2, a2, <imm>`
- `sub a4, a3, a2`
- `bnez a4, Loop`

Dependencies:

- (1): Cannot be resolved (even if there had been forwarding).
- (2): Cannot be resolved (∵ there is no forwarding). However, by delaying the instruction fetch for `sd` (see the pipeline timing diagram below), they do not effectively affect the performance (WB → ID "forwarding through the register file" helps).
- (3): Same as for #2.
- (4): Same as for #2.

Note that since branch outcomes are known in the EX stage, there is a 2-cycle penalty in case of branch mispredictions.

→ denotes "forwarding through the register file".

[Pipeline timing diagram: clock cycles 1 to 18 for `ld`, `addi`, `sd`, `addi`, `sub`, `bnez`, with stages F, D, X, M, W and stall slots S.]

(b) In the 1st iteration, assuming the 1-bit branch predictor is in the Not Taken state, there is a misprediction ⟹ 2 clock cycles penalty because of the pipeline flush. Again, in the last iteration, there is similarly a 2 clock cycle penalty.

The loop runs a total of 99 times. The middle 99 − 1 − 1 = 97 iterations take 16 clock cycles each (because of the RAW stalls described in (a)).

∴ total # of clock cycles = 97 × 16 + 20 + 18 = 1590 clock cycles.

Note: if the 2-cycle stall due to the pipeline flush in the last iteration is ignored, then # of clock cycles = 1588.

Note: if initially the 1-bit branch predictor is in the Taken state, then there is no misprediction in the 1st iteration.

∴ total # of clock cycles = 97 × 16 + 18 + 18 = 1588 (1586 if the flush in the last iteration is also ignored).
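The cycle count in (b) is simple enough to evaluate directly. The sketch below takes the per-iteration figures used in the solution (16 cycles for a middle iteration, 18 and 20 for the two boundary iterations) as inputs; the function and parameter names are illustrative.

```python
def total_cycles(iterations: int = 99, steady: int = 16,
                 first: int = 18, last: int = 20) -> int:
    """Total clocks = middle iterations * steady-state cost + boundary iterations."""
    if iterations < 2:
        raise ValueError("need at least a first and a last iteration")
    middle = iterations - 2  # 99 - 1 - 1 = 97
    return middle * steady + first + last


if __name__ == "__main__":
    # Predictor initially Not Taken: 97 * 16 + 20 + 18
    print(total_cycles())
    # Ignoring the 2-cycle flush in the last iteration
    print(total_cycles(last=18))
    # Predictor initially Taken, written as 97 * 16 + 18 + 18
    print(total_cycles(first=18, last=18))
```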

[Tables: scoreboard state at clock cycle #4 and at clock cycle #13, showing instruction status (Issue, Read, Ex., Write), functional unit status (Busy, Op, Fi, Fj, Fk, Qj, Qk, Rj, Rk) for the Add/Sub and Div units, and register result status. Annotations: structural hazard on `fadd.d`; division ends execution (clock #45).]