# COMPUTER ORGANIZATION AND ARCHITECTURE COA LAB ASSIGNMENT – 4

Name: MATTA ROHIT Roll No: 21CS01022

#### Plot of IPC vs Configurations:

I have calculated the values of IPC for the following configurations:

- SM2\_GTX480
- SM3\_KEPLER\_TITAN
- SM6\_TitanX
- SM7\_QV100
- SM7\_TITANV
- SM75\_RTX2060

I have calculated the IPC for the following Warp Schedulers:

- GTO (Greedy-Then-Oldest)
- LRR (Loose Round Robin)
- TL (Two Level)

Q1. Example: Plot showing the IPC on Y-axis and application name on X-axis, legend: different warp schedulers.

#### SM2-GTX480:



| Application | Scheduler        | IPS     | CPS   | IPC         | RUN_TIME | L1D MR | L2 MR  |
|-------------|------------------|---------|-------|-------------|----------|--------|--------|
| BFS         | GTO              | 140474  | 2397  | 58.60408844 | 219      | 0.8225 | 0.1651 |
| BFS         | LRR              | 212164  | 3712  | 57.15625    | 145      | 0.823  | 0.2493 |
| BFS         | TWO_LEVEL_ACTIVE | 146494  | 2549  | 57.47116516 | 210      | 0.8209 | 0.2537 |
| PathFinder  | GTO              | 1031370 | 2205  | 467.7414966 | 61       | 0.5671 | 0.7951 |
| PathFinder  | LRR              | 1398080 | 2584  | 541.0526316 | 45       | 0.567  | 0.7946 |
| PathFinder  | TWO_LEVEL_ACTIVE | 911791  | 1870  | 487.5887701 | 69       | 0.567  | 0.7956 |
| NN          | GTO              | 600526  | 1811  | 331.5991165 | 2        | 0.6009 | 0.3304 |
| NN          | LRR              | 600526  | 2059  | 291.6590578 | 2        | 0.6    | 0.3309 |
| NN          | TWO_LEVEL_ACTIVE | 600526  | 1911  | 314.2469911 | 2        |        | 0.3255 |
| NW          | GTO              | 58284   | 9428  | 6.182011031 | 107      | 0.7509 | 0.3192 |
| NW          | LRR              | 62364   | 10083 | 6.185063969 | 100      | 0.7509 | 0.3207 |
| NW          | TWO_LEVEL_ACTIVE | 43009   | 6966  | 6.174131496 | 145      | 0.7509 | 0.3205 |
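The IPC values in the SM2-GTX480 table above are consistent with dividing IPS by CPS. The following is a minimal Python sketch of that check, using the three BFS rows copied from the table. The function name `ipc` and the dictionary layout are illustrative choices, and the relation IPC = IPS / CPS is inferred from the numbers rather than stated in the assignment.

```python
# BFS rows copied from the SM2_GTX480 table:
# scheduler -> (IPS, CPS, IPC as reported in the table)
bfs_rows = {
    "GTO": (140474, 2397, 58.60408844),
    "LRR": (212164, 3712, 57.15625),
    "TWO_LEVEL_ACTIVE": (146494, 2549, 57.47116516),
}


def ipc(ips, cps):
    """Instructions per cycle, assuming IPC = IPS / CPS."""
    return ips / cps


for scheduler, (ips, cps, reported) in bfs_rows.items():
    computed = ipc(ips, cps)
    # Print both values so any mismatch with the table is easy to spot.
    print(f"{scheduler}: computed {computed:.6f}, reported {reported:.6f}")
```

The same function can be applied to the rows of the other configurations to cross-check their IPC columns.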

#### SM7-QV100:



| Application | Scheduler        | IPS    | CPS  | IPC         | RUN_TIME | L1D MR | L2 MR  |
|-------------|------------------|--------|------|-------------|----------|--------|--------|
| BFS         | GTO              | 136123 | 803  | 169.5180573 | 226      | 0.4894 | 0      |
| BFS         | LRR              | 175793 | 1039 | 169.1944177 | 175      | 0.4881 | 0      |
| BFS         | TWO_LEVEL_ACTIVE | 112688 | 666  | 169.2012012 | 273      | 0.4895 | 0      |
| PathFinder  | GTO              | 547074 | 699  | 782.6523605 | 115      | 1      | 0.0123 |
| PathFinder  | LRR              | 796374 | 1024 | 777.7089844 | 79       | 1      | 0.0123 |
| PathFinder  | TWO_LEVEL_ACTIVE | 452615 | 585  | 773.7008547 | 139      | 1      | 0.0123 |
| NN          | GTO              | 300263 | 1513 | 198.4553866 | 4        | 0.6    | 0.3334 |
| NN          | LRR              | 300263 | 1524 | 197.0229659 | 4        | 0.6    | 0.3334 |
| NN          | TWO_LEVEL_ACTIVE | 200175 | 1017 | 196.8289086 | 6        | 0.6    | 0.3334 |
| NW          | GTO              | 18024  | 2502 | 7.20383693  | 346      | 0.8661 | 0      |
| NW          | LRR              | 19013  | 2639 | 7.204622963 | 328      | 0.8661 | 0      |
| NW          | TWO_LEVEL_ACTIVE | 13074  | 1817 | 7.195376995 | 477      | 0.8661 | 0      |

#### SM7-TITANV:



| Application | Scheduler        | IPS    | CPS  | IPC         | RUN_TIME | L1D MR | L2 MR  |
|-------------|------------------|--------|------|-------------|----------|--------|--------|
| BFS         | GTO              | 133755 | 444  | 301.25      | 230      | 0.4931 | 0.0317 |
| BFS         | LRR              | 221322 | 735  | 301.1183673 | 139      | 0.4936 | 0.0301 |
| BFS         | TWO_LEVEL_ACTIVE | 116972 | 388  | 301.4742268 | 263      | 0.493  | 0.031  |
| PathFinder  | GTO              | 367915 | 368  | 999.7690217 | 171      | 1      | 0.282  |
| PathFinder  | LRR              | 911791 | 924  | 986.7867965 | 69       | 1      | 0.282  |
| PathFinder  | TWO_LEVEL_ACTIVE | 487702 | 493  | 989.2535497 | 129      | 1      | 0.282  |
| NN          | GTO              | 400350 | 529  | 756.805293  | 3        | 0.6    | 0.3334 |
| NN          | LRR              | 400350 | 518  | 772.8764479 | 3        | 0.6    | 0.3334 |
| NN          | TWO_LEVEL_ACTIVE | 300263 | 403  | 745.0694789 | 4        | 0.6    | 0.3334 |
| NW          | GTO              | 30721  | 2720 | 11.29448529 | 203      | 0.8661 | 0      |
| NW          | LRR              | 32146  | 2846 | 11.29515109 | 194      | 0.8661 | 0      |
| NW          | TWO_LEVEL_ACTIVE | 21140  | 1876 | 11.26865672 | 295      | 0.8661 | 0      |

#### SM75-RTX2060:



| Application | Scheduler        | IPS     | CPS  | IPC         | RUN_TIME | L1D MR | L2 MR  |
|-------------|------------------|---------|------|-------------|----------|--------|--------|
| BFS         | GTO              | 201070  | 1650 | 121.8606061 | 153      | 0.5869 | 0      |
| BFS         | LRR              | 287512  | 2388 | 120.39866   | 107      | 0.5865 | 0      |
| BFS         | TWO_LEVEL_ACTIVE | 184214  | 1506 | 122.3200531 | 167      | 0.5875 | 0      |
| PathFinder  | GTO              | 786420  | 1196 | 657.541806  | 80       | 1      | 0.824  |
| PathFinder  | LRR              | 1143883 | 1791 | 638.6839754 | 55       | 1      | 0.824  |
| PathFinder  | TWO_LEVEL_ACTIVE | 706894  | 1100 | 642.6309091 | 89       | 1      | 0.824  |
| NN          | GTO              | 400350  | 3003 | 133.3166833 | 3        | 0.6    | 0.3334 |
| NN          | LRR              | 400350  | 3074 | 130.2374756 | 3        | 0.6    | 0.3334 |
| NN          | TWO_LEVEL_ACTIVE | 300263  | 2252 | 133.3317052 | 4        | 0.6    | 0.3334 |
| NW          | GTO              | 43308   | 6089 | 7.112497947 | 144      | 0.8661 | 0      |
| NW          | LRR              | 44866   | 6308 | 7.112555485 | 139      | 0.8661 | 0      |
| NW          | TWO_LEVEL_ACTIVE | 31338   | 4393 | 7.133621671 | 199      | 0.8661 | 0      |

Q2. Example: Plot showing the following on Y-axis and application name on X-axis, legend: different warp schedulers.

#### (i) Plot of L1D Miss Rates:

#### SM2-GTX480:

#### SM7-QV100:

#### SM7-TITANV:

#### SM75-RTX2060:

#### (ii) Plot of L2 Miss Rates:

#### SM2-GTX480:

#### SM7-QV100:

#### SM7-TITANV:

#### SM75-RTX2060:



Q3. Categorize the applications w.r.t. the L1D and L2 cache hit rates. What changes do you observe w.r.t. L1D and L2 cache hit rates when the L1D cache size is increased from 32 KB to 8 MB? (use warp scheduler GTO)

#### For 32 KB of L1D cache:

Configuration: SM2_GTX480

| Application | Scheduler | L1D MR | L2 MR  |
|-------------|-----------|--------|--------|
| BFS         | GTO       | 0.8225 | 0.1651 |
| NN          | GTO       | 0.6009 | 0.3304 |
| NW          | GTO       | 0.7509 | 0.3192 |

#### For 8 MB of L1D cache:

Configuration: SM2_GTX480

| Application | Scheduler | L1D MR | L2 MR  |
|-------------|-----------|--------|--------|
| BFS         | GTO       | 0.3257 | 0.4795 |
| NN          | GTO       | 0.6    | 0.3309 |
| NW          | GTO       | 0.7481 | 0.3214 |
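Since the question is phrased in terms of hit rates, the miss rates in the two tables above can be converted with hit rate = 1 - miss rate. A small Python sketch of that conversion and of the change between the two cache sizes follows. The helper names `hit_rate` and `hit_rate_change` are illustrative, and the numbers are copied from the tables.

```python
# (L1D miss rate, L2 miss rate) for SM2_GTX480 with GTO, copied from the tables.
miss_rates = {
    "BFS": {"32KB": (0.8225, 0.1651), "8MB": (0.3257, 0.4795)},
    "NN": {"32KB": (0.6009, 0.3304), "8MB": (0.6, 0.3309)},
    "NW": {"32KB": (0.7509, 0.3192), "8MB": (0.7481, 0.3214)},
}


def hit_rate(miss_rate):
    """Hit rate is the complement of the miss rate."""
    return 1.0 - miss_rate


def hit_rate_change(app):
    """Return the (L1D, L2) hit-rate change when going from 32 KB to 8 MB."""
    l1_small, l2_small = miss_rates[app]["32KB"]
    l1_large, l2_large = miss_rates[app]["8MB"]
    return (
        hit_rate(l1_large) - hit_rate(l1_small),
        hit_rate(l2_large) - hit_rate(l2_small),
    )


for app in miss_rates:
    d_l1, d_l2 = hit_rate_change(app)
    print(f"{app}: L1D hit rate {d_l1:+.4f}, L2 hit rate {d_l2:+.4f}")
```

A positive value means the hit rate improved with the larger L1D cache, and a negative value means it got worse.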

Increasing the L1D cache size by 256 times (from 32 KB to 8 MB) leads to the following observations for the L1D and L2 miss rates:

Miss Rate = No. of Misses / No. of Accesses

1. **L1D Miss Rate has decreased.** A larger cache can hold more data in L1D, so there are fewer misses and more hits. The drop is large for BFS (0.8225 to 0.3257) and very small for NN (0.6009 to 0.6) and NW (0.7509 to 0.7481).
2. **L2 Miss Rate is impacted.** The number of L2 accesses certainly decreases, because fewer requests miss in L1D, but the L2 miss rate depends on several other factors. Here the L2 miss rate has increased, most visibly for BFS (0.1651 to 0.4795). A likely reason is that the number of L2 accesses falls by a larger factor than the number of L2 misses.
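The second observation can be illustrated with a small Python sketch based on the miss-rate formula above. All access and miss counts below are hypothetical values chosen only to show the effect; they were not measured in this lab. The sketch also assumes that every L1D miss becomes exactly one L2 access.

```python
def miss_rate(misses, accesses):
    """Miss rate = number of misses / number of accesses."""
    return misses / accesses


# Hypothetical counts (NOT measured in this lab), for illustration only.
l1d_accesses = 1000

# Small L1D: many requests miss in L1D and reach L2.
l1d_misses_small = 800  # assumed to equal the number of L2 accesses
l2_misses_small = 160

# Large L1D: far fewer requests reach L2, but L2 misses shrink less.
l1d_misses_large = 300  # assumed to equal the number of L2 accesses
l2_misses_large = 120

l2_mr_small = miss_rate(l2_misses_small, l1d_misses_small)
l2_mr_large = miss_rate(l2_misses_large, l1d_misses_large)

print(f"L1D miss rate: {miss_rate(l1d_misses_small, l1d_accesses):.2f} -> "
      f"{miss_rate(l1d_misses_large, l1d_accesses):.2f}")
print(f"L2 miss rate:  {l2_mr_small:.2f} -> {l2_mr_large:.2f}")
```

In this made-up case the absolute number of L2 misses falls from 160 to 120, yet the L2 miss rate doubles because the denominator (L2 accesses) shrinks much faster.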

Q4. What percentage of power is consumed by Execution units, DRAM, Register Files in each application run? Do you notice any correlation between the L1D cache hit rates observed in Question 3 and the Power consumption between different applications?

#### For 32 KB:

| Application | Component | Power     | Total Average Power |
|-------------|-----------|-----------|---------------------|
| BFS         | RFP       | 7.15      | 62.6126             |
| BFS         | DRAMP     | 0.0948    |                     |
| BFS         | L1P       | 3.8962    |                     |
| BFS         | L2P       | 0.923     |                     |
| NN          | RFP       | 9.312     | 79.2563             |
| NN          | DRAMP     | 0.0579366 |                     |
| NN          | L1P       | 5.1959    |                     |
| NN          | L2P       | 10.0307   |                     |

#### For 8 MB:

| Application | Component | Power     | Total Average Power |
|-------------|-----------|-----------|---------------------|
| BFS         | RFP       | 7.15057   | 94.2833             |
| BFS         | DRAMP     | 0.094     |                     |
| BFS         | L1P       | 3.895     |                     |
| BFS         | L2P       | 0.923     |                     |
| NN          | RFP       | 9.31267   | 79.2728             |
| NN          | DRAMP     | 0.0579366 |                     |
| NN          | L1P       | 5.1959    |                     |
| NN          | L2P       | 10.0307   |                     |

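Increasing the L1D cache size by 256 times leaves the reported component powers (RFP, DRAMP, L1P, L2P) almost unchanged, while the total average power rises from 62.6126 to 94.2833 for BFS and from 79.2563 to 79.2728 for NN. As a result, the percentage of the total power consumed by each component decreases, clearly for BFS and only marginally for NN.

A larger L1D cache stores more data close to the cores, which reduces memory access latency. This matches the L1D results from Question 3: BFS, whose L1D miss rate drops the most, also shows the largest drop in each component's share of the total power, whereas NN, whose L1D miss rate is almost unchanged, shows almost no change.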
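The percentage asked for in the question can be computed as component power divided by total average power. The sketch below does this in Python with the values copied from the two power tables above. The dictionary layout and the function name `shares` are illustrative, and the component abbreviations are kept exactly as they appear in the tables.

```python
# Power values copied from the 32 KB and 8 MB tables above.
power = {
    ("BFS", "32KB"): {"RFP": 7.15, "DRAMP": 0.0948, "L1P": 3.8962,
                      "L2P": 0.923, "total": 62.6126},
    ("BFS", "8MB"): {"RFP": 7.15057, "DRAMP": 0.094, "L1P": 3.895,
                     "L2P": 0.923, "total": 94.2833},
    ("NN", "32KB"): {"RFP": 9.312, "DRAMP": 0.0579366, "L1P": 5.1959,
                     "L2P": 10.0307, "total": 79.2563},
    ("NN", "8MB"): {"RFP": 9.31267, "DRAMP": 0.0579366, "L1P": 5.1959,
                    "L2P": 10.0307, "total": 79.2728},
}


def shares(entry):
    """Return each component's percentage of the total average power."""
    total = entry["total"]
    return {name: 100.0 * value / total
            for name, value in entry.items() if name != "total"}


for (app, size), entry in power.items():
    formatted = ", ".join(f"{name} {pct:.2f}%"
                          for name, pct in shares(entry).items())
    print(f"{app} @ {size}: {formatted}")
```

Comparing the 32 KB and 8 MB lines for the same application shows how much each component's share moves when the L1D cache is enlarged.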