# **CS 2410 Computer Architecture Project Report**

# -Yoshita Buthalapalli

-yrb3@pitt.edu

Input instruction and Instructions to execute are specified in the README file. The simulator output is the final register values, memory values, statistics and the tomasulo table.

### **Comparative analysis:**

1) Study the effect of changing the issue and commit width to 2. That is setting NW=NB=2 rather than 4.

When we reduced NB from 4 to 2, the CDB utilization jumped from 46% to 73%. This was because the busses were being used to their max potential by WB and Commit. When NB was 4, few of the buses we left idle whenever instruction with longer latencies were being executed. When NB was made 2, the pace of outputs generated by EX stage, matched with the WB and C stage.

When NW was reduced to 2, the number of cycles taken to finish the execution increased from 29 to 37. This was because, even though the EX units were available, they couldn't be utilized as only 2 instructions could be issued. As shown below, Instruction at address 8 finished early when NW was 4. Also, as NW reduced, the Reservation stalls also reduced. This was obvious because if less instructions are issued then reservation stations wont get filled as quickly as when NW was 4.

|               | ВС | W] | Е | I | D | F | Addr |
|---------------|----|----|---|---|---|---|------|
| When NW was 4 | 5  | 4  | 3 | 2 | 2 | 1 | 0    |
|               | 6  | 5  | 4 | 2 | 2 | 1 | 4    |
|               | 6  | 5  | 3 | 2 | 2 | 1 | 8    |
|               |    |    |   |   |   |   |      |
| When NW was 2 | 5  | 4  | 3 | 2 | 2 | 1 | 0    |
|               | 6  | 5  | 4 | 2 | 2 | 1 | 4    |
|               | 8  | 7  | 4 | 3 | 2 | 1 | 8    |

2) Study the effect of changing the fetch/decode width. That is setting NF = 2 rather than 4.

Ans. When NF is reduced to 2, the stalls due to full Reservation stations, the stalls due to full Reorder buffer and CDB usage reduces. This is because fewer instructions are fetched every cycle, so fewer a decoded and issued at every cycle. The cycles taken to complete execution didn't increase by much (in the test case that I tried) because the latency of operations like fmul and fewer stalls compensated for low fetch rate.

3) Study the effect of changing the number of reorder buffer entries. That is setting NR = 4, 8, and 32

#### Ans.

With the increase in NR, the number of stalls due to full ROB reduces. This is obvious too because larger the ROB capacity, lesser the chance of it getting filled up quickly and causing stall. Also, the Cycles taken to finish total execution reduced because the ROB stalls reduced.

The CDB utilization and Reservation stalls increased gradually with the increase in NR. This was because:

As the ROB stalls reduced, more instructions could be issued sooner, hence RS stalls and CDB usage increased slightly.

NF = Number of instructions that can be fetched every cycle. NW = Number of instructions issued every clock cycle to reservation stations NR = Number of entries in Reorder Buffer NB = Number of Common Data Busses available

Detailed result outputs are provided at the end of the report.

| NF | NW | NR | NB | Cycles<br>taken | CDB usage | RS Stalls | ROB Stalls |
|----|----|----|----|-----------------|-----------|-----------|------------|
| 4  | 4  | 16 | 4  | 29              | 46.5      | 4         | 5          |
| 4  | 2  | 16 | 2  | 37              | 72.9      | 2         | 5          |
| 2  | 4  | 16 | 4  | 30              | 45        | 2         | 4          |
| 4  | 4  | 4  | 4  | 47              | 28.7      | 0         | 37         |
| 4  | 4  | 8  | 4  | 35              | 38.5      | 3         | 19         |
| 4  | 4  | 32 | 4  | 29              | 46.5      | 8         | 0          |

### **Project Design:**

This project consists of 5 code files and 5 test files:

- Simulator.h
- Simulator.c
- Stages.c
- Makefile
- CircularQueue.h contains circularqueue and dictionary implementation.



Processor struct is used to store and keep track of all the global variables like reorder buffer(ROB), integer registers, float registers, number of stalls, total commits, BTB etc.

stages.c consists of code for all the stages.

### Fetch Stage:

- Fetches upto NF instructions from *instructionmemory dict*, with address PC as key.
- Each instruction is added to fetch/decode buffer called fdpipelineres which is a queue.
- Check the Branch target buffer for this instruction address/PC. BTB(instrAddr,TargetAddr) is as hash disctionary. If the instruction address is found in the BTB keys, then PC is set to the corresponding value i.e the target address. Else, PC will be incremented by 4.

#### Decode Stage:

- In this stage, we start decoding the instructions that were fetched in the previous cycle. Hence, we dequeue each instruction from *fdpipeline* and not fdpipelineres. At the end of every cycle, the contents of fdpipelineres are copied into fdpipeline.
- Parse the instructions using string tokenizer and store each instruction as *Instruction* struct into another queue *dipipeline*.

#### Issue Stage:

- Can issue upto NW instructions from the dipipeline. Decode and issue happens in same cycle.
- check if we shouldn't issue this instruction(stall) because a branch instruction before this hasn't been resolved yet.
- Check if ROB is full, if yes, then increment stall counter for ROB and return;
- Check if the reservation station of the corresponding instruction unit is full; if yes, then increment the counter.
- Start filling the Reservation Status table entry based on Tomasulo algorithm and rename the destination registers from the available p0-p32 physical registers. RenameRegister Table is also a hash dictionary with ROB index as *key* and register value as the *value*. Only 32 entries can be in RenameRegister table. If it is full, then the instruction is not issued/stalled.
- Update BTB as needed.
- Enqueue the instruction entry into ROB table inorder and dequeue this instruction from dipipeline. Each entry in ROB stores the instruction, its destination register, its destination renamed register and if the instruction is ready to execute/waiting.

#### **Execute Stage:**

- Read the reservation stations in fifo order for each unit. If the instruction is ready to execute then,
- Perform the corresponding operation if the latency cycles have been completed. For
  implementing latency, I used a circular queue of size equal to the latency of that unit. The queue
  has one result node and all other are null nodes. Foreach cycle, the front node is pushed back.
  Once the result node is found, the latency ends and the instruction is sent to mem stage in next
  cycle.

• The outputs of the units are stored in a buffer2. Mem and WB reads from buffer1. At the end of cycle buffer2 values are transferred to buffer1. This is done because we WB the outputs from previous cycle.

# Mem stage:

• Reads the data in memory location into register for FLD.

#### Commit stage:

- Executed commit before writeback to avoid having a pipeline between the two.
- Reads the topmost entry in the ROB circular queue because commit must maintain in-order.
- If the instruction has finished execution, then, update the data memory and free renamed registers. Else, return.

## Writeback stage:

- Can write back only (NB-number of commits) instructions in this cycle.
- For each instruction in writebackbuffer, if it is fsd, then store the value into the memory location.
- Update the ROB entries. Whichever ROB entry/row had this instruction ROB index in its qj or qk in the reservation station table must be updated.



### Test cases: I tried 3 more test cases to test different scenarios

# Test2.dat: for testing RAW and WAR dependency

0,0

22,1.1

11,2.2

fld F2, 22(R0)

fld F1, 11(R0)

fadd F3,F2,F1

fsub F4,F3,F1

fmul F5,F4,F1

## After simulation, the output was:

Final Memory values:

Mem[0] = 0.00

Mem[22] = 1.10

Mem[11] = 2.20

## Registers:

$$R0 = 0$$
  $F0 = 0.00$ 

$$R1 = 0$$
  $F1 = 2.20$ 

$$R4 = 0$$
  $F4 = 1.10$ 

$$R5 = 0$$
  $F5 = 2.42$ 

Total instructions executed: 5

Stalls because the reservation stations are occupied: 0

Stalls because the Reorder Buffer is full: 0

Average CDB utilization: 12.500000

Total cycles taken: 20

Tomasulo Table:

Instr address | FETCH| DECODE | ISSUE | EX | WB | COMMIT

0 1 2 2 3 5 6

4 1 2 2 4 6 7

8 1 2 2 7 10 11

12 1 2 2 11 14 15

16 2 3 3 15 19 20

# Test3.dat: testing load/store forwarding

1,1

2,2

0,0

fld F1,1(R0)

fld F2,2(R0)

fadd F4,F2,F1

fsub F5,F4,F1

fmul F4,F5,F1

fsd F4,1(R0)

fld F6,1(R0)

## After simulation, the output was:

Final Memory values:

Mem[2] = 2.00

Mem[0] = 0.00

Mem[1] = 2.00

### Registers:

$$R0 = 0$$
  $F0 = 0.00$ 

$$R2 = 0$$
  $F2 = 2.00$ 

$$R3 = 0$$
  $F3 = 0.00$ 

$$R4 = 0$$
  $F4 = 2.00$ 

$$R6 = 0$$
  $F6 = 2.00$ 

Total instructions executed: 7

Stalls because the reservation stations are occupied: 3

Stalls because the Reorder Buffer is full: 0

Average CDB utilization: 14.000000

Total cycles taken: 25

Tomasulo Table:

Instr address | FETCH | DECODE | ISSUE | EX | WB | COMMIT

Test4.dat: simple branch loop where first and last predictions are incorrect

1,1

2,2

0,0

```
addi R1,R0,4
fld F1,1(R0)
fld F2,2(R0)
loop: fadd F6,F6,F1
addi R1,R1,-1
bne R0,R1,loop
fsub F8,F2,F1
After simulation, the output was:
Final Memory values:
Mem[1] = 1.00
Mem[2] = 2.00
Mem[0] = 0.00
Registers:
R0 = 0
         F0 = 0.00
R1 = 0 F1 = 1.00
R2 = 0 F2 = 2.00
R6 = 0 F6 = 4.00
R8 = 0
         F8 = 1.00
Total instructions executed: 16
Stalls because the reservation stations are occupied: 0
Stalls because the Reorder Buffer is full: 0
Average CDB utilization: 25.806452
Total cycles taken: 31
Tomasulo Table:
Instr address | FETCH| DECODE| ISSUE | EX | WB | COMMIT
```

0 1 2 2 3 4

5

```
2
             2
4
    1
                 3
                      5
                          6
8
    1
        2
             2
                 4
                      6
                          7
12
    1
         2
             2
                  6
                      9
                           10
16
    2
         3
             3
                  5
                      6
                           10
20
    2
         3
             3
                  7
                      8
                           10
12
    9
         10
              10
                   11
                        14
                             15
16
    9
         10
              10
                   11
                        12
                             15
20
    9
         10
              10
                   13
                        14
                             15
12
    9
         10
              15
                   16
                        19
                             20
16
    10
         11
              15
                   16
                        17
                             20
         11
20
    10
              15
                   18
                        19
                             20
                   21
12
     10
         11
              20
                        24
                             25
16
    10
         11
              20
                   21
                        22
                             25
20
                   23
                             25
    11
         12
              20
                        24
24
    25
         26
              26
                   27
                        30
                             31
```

bash-4.2\$

## **RESULTS for comparative analysis on Test benchmark:**

Enter NF: 4 Enter NW: 4 Enter NR: 16 Enter NB: 4

```
Final Memory values:

Mem[0] = 111.00

Mem[8] = 14.00

Mem[16] = 5.00

Mem[24] = 10.00

Mem[100] = 2.00

Mem[200] = 12.00

Mem[124] = 128.00

Mem[116] = 63.00

Mem[116] = 63.00

Mem[108] = 195.00
```

```
Registers:
R0 = 0
          F0 = 195.00
R1 = 0
          F1 = 0.00
R2 = 100
               F2 = 12.00
R3 = 0
          F3 = 0.00
R4 = 0
          F4 = 27.00
************************************
Total instructions executed: 27
Stalls because the reservation stations are occupied: 4
Stalls because the Reorder Buffer is full: 5
Average CDB utilization: 46.551723
Total cycles taken: 29
Tomasulo Table:
Instr address | FETCH | DECODE | ISSUE | EX | WB | COMMIT
   0
        1
            2
                 2
                     3
                          4
                              5
   4
        1
            2
                 2
                     4
                          5
                              6
   8
        1
            2
                 2
                     3
                          5
                              6
             2
                 2
                          7
                               8
   12
        1
                      5
   16
        2
             3
                 3
                      5
                          6
                               8
    20
        2
             3
                 3
                      8
                          12 13
             3
   24
        2
                 6
                      7
                          9
                               13
             3
    28
        2
                 6
                      13
                          16
                               17
             4
                 6
   32
        3
                      17
                          18
                                19
   36
        3
             4
                 6
                      7
                          8
                               19
   40
        3
             4
                 7
                      8
                          9
                               19
        8
             9
                 9
                      10
                          12
                                19
   12
        8
             9
                 9
   16
                      10
                          11
                                20
   20
        8
             9
                 9
                      13
                           17
                                20
   24
        8
             9
                 10
                      11
                           13
                                 20
   28
        9
             10
                                 22
                 10
                       18
                           21
   32
        9
             10
                       22
                                 24
                  10
                            23
   36
        9
             10
                  10
                       11
                            12
                                24
   40
        9
             10
                                 24
                  11
                       12
                            13
   12
        10
              11
                  14
                       15
                             17
                                 24
   16
        10
              11
                  14
                       15
                             16
                                 25
   20
        10
              11
                   14
                       18
                             22
                                25
   24
        10
              11
                   14
                       16
                             20
                                 25
                                 27
   28
        11
              12
                   18
                       23
                             26
   32
        11
              12
                   20
                       27
                             28
                                 29
                             22
                                  29
   36
        11
              12
                   20
                        21
   40
        11
              12
                   20
                        21
                             22
                                  29
```

bash-4.2\$ ./simulator Test1.dat

```
Total instructions executed: 27
Stalls because the reservation stations are occupied: 2
Stalls because the Reorder Buffer is full: 5
Average CDB utilization: 72.972977
Total cycles taken: 37
Tomasulo Table:
Instr address | FETCH | DECODE | ISSUE | EX | WB | COMMIT
          2
              2
                  3
   4
       1
          2
              2
                 4
                     5
                         6
   8
          2
              3
                     7
                         8
       1
                  4
   12
      1
          2
             3 5
                     7
                        8
             4 5
      2
           3
                     6
                         9
   16
                     13 14
   20
      2
           3
             4 8
      2
           3
             7 8
   24
                      10
                          14
      2
             7
   28
          3
                 14
                     18
                         19
   32
      3
           4 8 19 20
                          21
       3
           4
              8
                  9
   36
                      10
                          21
              9
   40
       3
           4
                  10
                     11
                           22
           11 11
                   12
                        15
                            22
   12
       10
   16
       10
           11
              11
                   12
                        13 23
   20
       10
           11
               12
                   16
                        23
                            24
   24
       10
           11
               12
                   13
                        15 24
   28
       11
           12
               13
                   24
                        27
                           28
   32
           12
                        29 30
       11
               13
                   28
           12
               14
                        16 30
   36
       11
                   15
   40
       11
           12
              14
                   15
                        16 31
   12
       12
           13
               15
                   16
                        18 31
   16
       12
           13
              15
                   16
                        17 32
   20
                   22
                        26 32
       12
           13
              16
   24
       12
                        19 33
           13
               16
                   17
   28
       13
           14
              20
                   28
                        33 34
   32
       13
           14 22
                        35 36
                   34
   36
       13
           14 22
                   23
                        25 36
   40
       13
           14
               23
                   24
                        25
                           37
bash-4.2$
```

Enter NF: 2 Enter NW: 4 Enter NR: 16 Enter NB: 4

```
Total instructions executed: 27
Stalls because the reservation stations are occupied: 2
Stalls because the Reorder Buffer is full: 4
Average CDB utilization: 45.000000
Total cycles taken: 30
Tomasulo Table:
Instr address | FETCH | DECODE | ISSUE | EX | WB | COMMIT
             2
                 2
   0
        1
                      3
                           4
                                5
             2
                 2
                           5
                               6
    4
        1
                      4
    8
        2
             3
                 3
                      4
                           6
                               7
         2
             3
                           7
                                8
    12
                  3
                       5
                       5
    16
         3
             4
                  4
                           6
                                8
                       8
    20
         3
             4
                  4
                           12
                                13
    24
         4
             5
                  7
                       8
                           10
                                 13
         4
             5
                  7
                       13
    28
                            16
                                 17
                            18
             6
                  7
    32
         5
                       17
                                 19
    36
         5
             6
                  7
                       8
                           9
                                19
   40
             7
                  8
                       9
                           10
                                 19
         6
         9
    12
              10
                  10
                        11
                             13
                                   19
    16
         9
              10
                   10
                             12
                                   20
                        11
    20
         10
              11
                   11
                        14
                              18
                                  20
    24
         10
              11
                   11
                         12
                              14
                                   20
                                   23
    28
         11
              12
                   12
                         19
                              22
    32
         11
              12
                   12
                         23
                              24
                                   25
    36
         12
              13
                   13
                         14
                              15
                                   25
   40
         12
              13
                   13
                         14
                              15
                                   25
              14
                              17
                                   25
    12
         13
                   14
                         15
    16
         13
              14
                   14
                         15
                              16
                                   26
    20
         14
              15
                   15
                         19
                              23
                                   26
                                   26
    24
         14
                              20
              15
                   15
                         16
    28
         15
                              27
                                   28
              16
                   18
                         24
    32
         15
                              29
                                   30
              16
                   20
                         28
    36
         16
              17
                   20
                         21
                              22
                                   30
    40
         16
              17
                   20
                         21
                              22
                                   30
bash-4.2$
```

Enter NF: 4 Enter NW: 4 Enter NR: 4 Enter NB: 4

```
************************************
Total instructions executed: 27
Stalls because the reservation stations are occupied: 0
Stalls because the Reorder Buffer is full: 37
Average CDB utilization: 28.723404
Total cycles taken: 47
Tomasulo Table:
Instr address | FETCH | DECODE | ISSUE | EX | WB | COMMIT
            2
                2
                    3
   4
       1
            2
               2
                    4
                        5
                            6
            2
   8
       1
                2
                    3
                        5
                            6
                   5
   12
       1
            2
               2
                        7
                           8
                   7
   16
       2
            3
                6
                        8
                             9
   20
        2
            3
               7
                   8
                        12 13
        2
            3
                7
   24
                    8
                        10
                             13
        2
            3
   28
                9
                    13
                        16
                             17
   32
        3
            4
                10
                    17 18
                              19
        3
            4
   36
                14
                     15 16
                              19
   40
        3
            4
                14
                     15
                         16
                              19
                           21 22
   12
        15
            16
                18
                     19
   16
        15
             16
                20
                      21
                           22 23
   20
        15
             16
                 20
                      22
                           26
                               27
   24
        15
             16
                 20
                      21
                           23
                              27
   28
        16
             17
                 23
                      27
                           30
                               31
   32
        16
             17
                 24
                           32
                               33
                      31
   36
        16
             17
                 28
                      29
                           30
                               33
   40
        16
             17
                 28
                      29
                           30
                              33
   12
        17
             18
                 32
                      33
                           35
                               36
   16
        17
             18
                 34
                      35
                           36
                               37
   20
                34
                           40
                               41
        17
             18
                      36
   24
        17
                           37
             18
                 34
                      35
                               41
   28
                               45
        18
             19
                 37
                      41
                           44
   32
        18
             19
                 38
                      45
                               47
                           46
   36
        18
             19
                 42
                      43
                           44
                               47
                               47
   40
        18
             19
                 42
                      43
                           44
```

Enter NF: 4 Enter NW: 4 Enter NR: 8 Enter NB: 4

```
Total instructions executed: 27
Stalls because the reservation stations are occupied: 3
Stalls because the Reorder Buffer is full: 19
Average CDB utilization: 38.571430
Total cycles taken: 35
Tomasulo Table:
Instr address | FETCH | DECODE | ISSUE | EX | WB | COMMIT
           2
              2
                  3
                      4
                          5
   4
       1
           2
              2
                  4
                      5
                          6
                      5
   8
       1
           2
              2
                  3
                          6
   12 1
           2
              2
                  5
                      7
                          8
              3
                  5
      2
           3
                      6
                          8
   16
       2
           3
              3
   20
                   8
                       12 13
      2
           3
               6
                  7
                       9
                          13
   24
           3
   28
       2
               6
                      16 17
                   13
   32
       3
           4
               6
                   17
                      18 19
   36
       3
           4
               7
                   8
                       9
                           19
   40
      3
           4
               7
                   8
                       9
                          19
       8
           9
               9
                   10
                      12
                            19
   12
           9
               9
   16
       8
                   10
                       11
                            20
           9
               14
   20
       8
                   15
                       20
                            21
   24
           9
       8
               14
                   15
                       17
                            21
   28
       9
           10
                   21
                       24
                            25
               18
       9
               20
                    25
                            27
   32
           10
                        26
       9
                            27
   36
           10
               20
                    21
                        22
   40
       9
           10
               20
                    21
                        22
                            27
                        23 27
   12
       10
           11
               20
                    21
   16
       10
            11
               21
                    22
                         23
                            28
   20
       10
            11
                22
                    24
                         28 29
   24
       10
                22
                    23
                         25
                             29
            11
   28
                    29
                         32 33
       11
            12
                26
   32
       11
            12
               28
                    33
                         34 35
   36
       11
            12
                28
                    29
                         30 35
   40
       11
            12
                28
                    29
                         30
                             35
```

Enter NF: 4 Enter NW: 4 Enter NR: 32 Enter NB: 4

```
************************************
Total instructions executed: 27
Stalls because the reservation stations are occupied: 8
Stalls because the Reorder Buffer is full: 0
Average CDB utilization: 46.551723
Total cycles taken: 29
Tomasulo Table:
Instr address | FETCH | DECODE | ISSUE | EX | WB | COMMIT
            2
                2
                    3
                             5
   4
        1
            2
                2
                    4
                         5
                             6
   8
            2
                2
        1
                    3
                         5
                             6
            2
                2
                    5
                        7
   12
       1
                             8
                     5
   16
       2
            3
                3
                         6
                             8
   20
        2
            3
                3
                     8
                         12 13
        2
            3
                 6
                     7
                         9
   24
                              13
        2
            3
   28
                 6
                     13
                         16 17
   32
        3
            4
                 6
                     17
                         18
                             19
        3
            4
                     7
                             19
   36
                 6
                         8
                 7
                         9
                              19
   40
        3
            4
                     8
                 9
   12
        8
            9
                         12
                              19
                     10
            9
        8
                 9
   16
                     10
                         11
                               20
            9
                 9
   20
        8
                     13
                          17
                               20
        8
            9
                 10
                               20
   24
                     11
                          13
   28
        9
            10
                10
                      18
                          21
                               22
   32
        9
            10
                 10
                      22
                           23
                                24
   36
        9
            10
                 10
                      11
                           12
                                24
   40
        9
            10
                 11
                      12
                           13
                                24
   12
        10
             11
                 14
                      15
                           17
                               24
   16
        10
             11
                  14
                      15
                            16
                                25
   20
                            22
                                25
        10
             11
                  14
                      18
   24
                            20
                                25
        10
             11
                  14
                      16
```