## CS2610 A5

Aswin Ramesh (cs19b007), March 2021

| System configuration  | Intel i5 6th gen, 2.3 GHz processor<br>frequency, 3-level cache hierarchy<br>(64K, 256K, 3072K) |
|-----------------------|------------------------------------------------------------------------------------------------|
| Scenario 1            |                                                                                                |
| instructions          | 7,53,97,58,816                                                                                 |
| branch-instructions   | 1,07,75,66,741                                                                                 |
| cache-references      | 21,83,31,833                                                                                   |
| cache-misses          | 7,53,94,522                                                                                    |
| L1-dcache-loads       | 3,23,12,34,843                                                                                 |
| L1-dcache-load-misses | 1,28,60,86,505                                                                                 |
| dTLB-loads            | 3,22,49,26,547                                                                                 |
| dTLB-load-misses      | 26,20,043                                                                                      |
| LLC-loads             | 8,22,30,131                                                                                    |
| LLC-load-misses       | 2,55,02,392                                                                                    |
| Average runtime       | 1,856                                                                                          |
| CPU-Cycles            | 4,95,57,46,859                                                                                 |
| Scenario 2            |                                                                                                |
| instructions          | 4,14,37,54,833                                                                                 |
| branch-instructions   | 27,47,67,543                                                                                   |
| cache-references      | 15,07,73,701                                                                                   |
| cache-misses          | 9,18,02,058                                                                                    |
| L1-dcache-loads       | 80,41,76,889                                                                                   |
| L1-dcache-load-misses | 7,15,75,321                                                                                    |
| dTLB-loads            | 81,29,80,612                                                                                   |
| dTLB-load-misses      | 21,22,057                                                                                      |
| LLC-loads             | 51,95,374                                                                                      |
| LLC-load-misses       | 22,16,729                                                                                      |
| Average runtime       | 595                                                                                            |
| CPU-Cycles            | 1,59,25,95,035                                                                                 |
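
The raw counts are easier to compare as ratios. The following is a minimal Python sketch, not part of the original measurements, that derives instructions per cycle (IPC) and the L1 data-cache load miss rate from the counters in the table. The function name `derived_metrics` and the dictionary keys are hypothetical, chosen only for this illustration; the numbers are the table values with the digit grouping removed.

```python
# Counter values copied from the table above (digit grouping removed).
counters = {
    "Scenario 1": {
        "instructions": 7_539_758_816,
        "cycles": 4_955_746_859,
        "l1_loads": 3_231_234_843,
        "l1_load_misses": 1_286_086_505,
    },
    "Scenario 2": {
        "instructions": 4_143_754_833,
        "cycles": 1_592_595_035,
        "l1_loads": 804_176_889,
        "l1_load_misses": 71_575_321,
    },
}


def derived_metrics(c):
    """Return (instructions per cycle, L1 data-cache load miss rate)."""
    ipc = c["instructions"] / c["cycles"]
    l1_miss_rate = c["l1_load_misses"] / c["l1_loads"]
    return ipc, l1_miss_rate


for name, c in counters.items():
    ipc, miss = derived_metrics(c)
    print(f"{name}: IPC = {ipc:.2f}, L1-dcache load miss rate = {miss:.1%}")
```

With these inputs, Scenario 1 comes out at an IPC of about 1.52 with roughly 39.8% of L1 data-cache loads missing, against about 2.60 and 8.9% for Scenario 2.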

The column-major version takes fewer cycles and executes fewer instructions. When B is stored in column-major order, the elements of a column of B are consecutive in main memory, so reading the values B[j][k] down a column does not have to switch between different blocks of data.
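
The block-switching argument can be illustrated with a small model. The sketch below is not the assignment's program, which is not shown here; it is a hypothetical Python model that walks down every column of an n × n matrix B and counts how often two consecutive accesses fall in different memory blocks. The function name `block_switches`, the `layout` values, and the default of 8 elements per block (assuming 64-byte blocks and 8-byte elements) are all assumptions made for illustration.

```python
def block_switches(n, layout, block_elems=8):
    """Count block changes between consecutive accesses while reading
    B[0][k], B[1][k], ..., B[n-1][k] for every column k in turn.

    layout: "row" stores B[j][k] at offset j*n + k (row-major),
            "col" stores B[j][k] at offset k*n + j (column-major).
    block_elems: elements per memory block (assumed: 64-byte block, 8-byte values).
    """
    switches = 0
    previous_block = None
    for k in range(n):        # each column of B
        for j in range(n):    # walk down the column
            offset = j * n + k if layout == "row" else k * n + j
            block = offset // block_elems
            if previous_block is not None and block != previous_block:
                switches += 1
            previous_block = block
    return switches


print("row-major   :", block_switches(64, "row"))
print("column-major:", block_switches(64, "col"))
```

For a 64 × 64 matrix under these assumptions, the row-major layout changes block on every one of the 4095 steps, while the column-major layout changes block only 511 times, once per 8 elements. The model counts block changes only; it does not simulate cache capacity or replacement, so it indicates the direction of the effect rather than predicting the measured miss counts.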