# COMPUTER ORGANIZATION CACHE SIMULATION

Andrew Kee Ryan Montoya ECEN 4593

#### INTRODUCTION

A memory hierarchy has been simulated that operates with 48-bit addresses and consists of an L1 instruction cache, an L1 data cache, an L2 cache, and main memory.
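As a rough illustration of how a 48-bit address maps onto a cache, the sketch below splits an address into tag, index, and block offset. The function name `split_address` and the example block size and set count are assumptions made for illustration; they are not taken from the simulator described in this report.

```python
def split_address(addr, block_size, num_sets, addr_bits=48):
    """Split an address into (tag, index, offset).

    block_size and num_sets must be powers of two. The values used
    in the example below are hypothetical, not the report's settings.
    """
    if not 0 <= addr < (1 << addr_bits):
        raise ValueError("address does not fit in addr_bits")
    for value in (block_size, num_sets):
        if value <= 0 or value & (value - 1):
            raise ValueError("block_size and num_sets must be powers of two")

    offset_bits = block_size.bit_length() - 1  # log2(block_size)
    index_bits = num_sets.bit_length() - 1     # log2(num_sets)

    offset = addr & (block_size - 1)
    index = (addr >> offset_bits) & (num_sets - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


# Hypothetical example: 32-byte blocks, 256 sets.
tag, index, offset = split_address(0x123456789ABC, block_size=32, num_sets=256)
print(hex(tag), index, offset)
```

With `num_sets=1` the index is always 0, which corresponds to a fully associative cache where only the tag is compared.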

#### RESULTS

An initial way to view the performance of the traces under specific configurations is to look at the execution times for those configurations. Fig. 1 shows the execution time of each trace for each configuration, alongside all the other traces.

#### **Execution Time for All Traces and Configurations**



Figure 1: Execution times of all traces for all configurations

The execution time of some traces varies very little with changes in cache configuration, while other traces are strongly affected by it. As seen in Fig. 1, libquantum has a very steady execution time that does not seem to depend on the cache configuration. A trace like omnetpp, however, is very dependent on the cache configuration: its execution times range from 40 billion cycles to 120 billion cycles.

Another way to compare the performance of each trace is to look at the Cycles per Instruction (CPI). This is plotted in Fig. 2.

## 

Figure 2: CPI of all traces for all configurations

Similar to the execution times in Fig. 1, Fig. 2 shows that the configuration of the cache can dramatically affect the CPI.
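The relationship between execution time and CPI can be sketched as follows. The cycle counts reuse the 40 billion and 120 billion cycle extremes quoted for omnetpp, while the instruction count is a made-up placeholder, so the resulting CPI values are illustrative only.

```python
def cycles_per_instruction(total_cycles, instruction_count):
    """CPI = total execution cycles / number of instructions executed."""
    if instruction_count <= 0:
        raise ValueError("instruction_count must be positive")
    return total_cycles / instruction_count


# Hypothetical instruction count; the real trace length is not given here.
INSTRUCTIONS = 10_000_000_000

best_cpi = cycles_per_instruction(40_000_000_000, INSTRUCTIONS)
worst_cpi = cycles_per_instruction(120_000_000_000, INSTRUCTIONS)

print(f"best CPI:  {best_cpi:.1f}")
print(f"worst CPI: {worst_cpi:.1f}")
print(f"ratio:     {worst_cpi / best_cpi:.1f}")
```

Because a given trace executes the same instructions under every configuration, its CPI is proportional to its execution time in cycles, which is why the two figures show similar trends.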

Figs. 1 and 2 show that the fully associative configurations give the fastest results, but at what cost?



Figure 3: Cost of all the different Configurations

Fig. 3 shows the total cost of each configuration. The fully associative configurations are much more expensive than the other configurations. Taken together, Figs. 2 and 3 show that higher cost translates roughly proportionally into higher performance.

The next step is to find the performance of the configurations with respect to the cheapest and slowest configuration. L1 Small was found to be the cheapest and, on average, the slowest configuration. Fig. 4 shows the relative performance per dollar with respect to the L1 Small configuration.



Figure 4: Relative performance with respect to L1 Small

Fig. 4 uses Instructions per Cycle to calculate relative performance with respect to L1 Small.
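One plausible way to compute such a metric is sketched below: the IPC of each configuration is normalized to the baseline IPC, and that ratio is divided by the cost normalized to the baseline cost. This normalization is an assumption, since the exact formula behind Fig. 4 is not stated, and the IPC and cost values are hypothetical placeholders rather than measured results.

```python
def relative_performance_per_dollar(ipc, cost, base_ipc, base_cost):
    """Relative IPC divided by relative cost, both against a baseline.

    Assumed formula: (ipc / base_ipc) / (cost / base_cost).
    The baseline configuration scores exactly 1.0 by construction.
    """
    if min(ipc, cost, base_ipc, base_cost) <= 0:
        raise ValueError("all inputs must be positive")
    relative_performance = ipc / base_ipc
    relative_cost = cost / base_cost
    return relative_performance / relative_cost


# Hypothetical (IPC, cost in dollars) pairs; not data from the report.
configs = {
    "L1 Small": (0.20, 100.0),
    "Hypothetical A": (0.30, 200.0),
}

base_ipc, base_cost = configs["L1 Small"]
for name, (ipc, cost) in configs.items():
    score = relative_performance_per_dollar(ipc, cost, base_ipc, base_cost)
    print(f"{name}: {score:.2f}")
```

Under this definition, a score below 1.0 means a configuration buys less performance per dollar than L1 Small, even if it is faster in absolute terms.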



#### DISCUSSION