Computer Assignment 2
Fall 2022

Deadline: Friday, Dec. 2, 11:55 PM

(Upload it to Gradescope.)

Here are some general guidelines:

- You should do all your work in the C++ programming language. Your code should be written using only the standard libraries. You may or may not decide to use classes and/or structs to write your code. Regardless of the design, your code should be modular with well-defined functions and clear comments.
- You are free to use as many helper functions/class/definitions as needed in your project.
- There is no restriction on what data type (e.g., int, string, array, vector, etc.) you want to use for each parameter.
- Unfortunately, experience has shown that there is a very high chance that there are errors in this project description. The online version will be updated as errors are discovered, or if something needs to be described better. It is your responsibility to check the website often and read new versions of this project description as they become available.
- Sharing of code between students is viewed as cheating and will receive appropriate action in accordance with University policy. You are allowed to compare your results with others or discuss how to design your system. You are also allowed to ask questions and have discussions on Campuswire as long as no code is shared.
- You have to follow the directions specified in this document and turn in the files and reports that are asked for.
- This project has a bonus part. You need to follow the instructions shown in red to add and submit your bonus part.

# **Project Description**

## Overview

In computer assignment 2, we want to design a cache/memory controller. You are responsible for designing the memory controller.

The input to your memory is a 32-bit address. Your code is responsible for loading the data from the memory (if it is an LW) or storing the value at the correct address if it is an SW. A skeleton code (memory\_driver.cpp) is provided to you for initializing the variables, creating the cache object and main memory array, reading a trace, and

passing the relevant information to your memory hierarchy. You are free to change this skeleton as you see fit!

Your memory hierarchy <u>has two levels of cache and the main memory</u> (i.e., the same array/vector you used in Project 1).

Your L1 cache should be a <u>direct-mapped cache</u> with 1 block per line with a total of 16 lines/sets. Each block in our cache should be 4 bytes (i.e., an int). It is your responsibility to correctly calculate the index, block offset, and tag in your code.

Your L2 cache should be an <u>8-way set-associative</u> cache. Each block should be still 4 bytes and there should be 1 block per line (same as L1). There should be 16 sets for L2. Same as above, your code should correctly compute tag, index, and block offset. Your main memory has 4096 lines each one 4 bytes.

Since L2 is using the SA cache, you should use the *LRU replacement policy* (L1 is DM, so no replacement policy is needed).

Your cache design should be exclusive/write-no-allocate/write-through. You should assume that your cache is initially randomly initialized with all valid bits equal to zero.

### Trace

Each line in a trace has four values (MemR, MemW, adr, data). The first two are either zero or one indicating whether this is a LOAD instruction or STORE (always only one of them will be equal to one). The address is a number between 0 and 4095 (you can safely assume this), and the data is a signed integer value. The code already read the trace line by line and stores the values in an array called myTrace.

In the main loop of the driver code, each entry is read one-by-one and should be handled by your memory hierarchy.

As mentioned before, you can assume that your cache and memory are initially empty (with random values) and we first store things before reading anything.

We have provided one simple trace file, but we will add more soon. Same as CA1, your code will be tested with multiple different traces on Gradescope. We strongly recommend creating your own traces and testing your code using those.

# Design

The entry for your code is your memory controller (controller) which is called in a loop in the main function of memory\_driver.cpp. For each item in the trace, this controller function should be called, and appropriate actions should be taken. Apart from these actions, you should also update stats in this controller. We strongly recommend adding more functions to your cache object for each action (e.g., search, update, evict, etc.)

The controller should first check whether this request is load or store and then follow an algorithm for each. For load, the controller should first search L1, then L2. If data is not found in either, data should be brought from memory (remember that there is no *miss* in main memory and the address provided in the trace is the index for the main memory array).

Once data is found, you should make sure to (1) Update the stats for each cache, and (2) Update data, tag, and LRU positions if needed.

If data is in L1, you just need to update the stats. If it is in L2, you should bring data to L1 (and remove it from L2) and update LRU, data, and tag. If it is in neither, data should be brought from memory and installed in L1 (update tag and data), the evicted data from L1 should be placed in L2 (should be put as most recently used), and evicted data from L2 should be removed.

If it is a store, you should do similar things, but remember that we use a write-no-allocate write-through strategy (i.e., if data is in both cache and memory, both should be updated. If it is not in the cache, only memory should be updated, but update the stats for each level.) Remember that you need to search both caches for the store the same as the load, and you have to update the stats in both cases.

### Cache Driver Code and Benchmarks:

The entry point of your project is "memory\_driver.cpp". The program should run like this:

```
"./memory_driver <inputfile.txt>",
```

Your program should print the miss-rate for L1 and L2, as well as the average access time (AAT) of the system in the terminal. For AAT, you can assume that the hit time for L1 is 1 cycle, for L2 is 8 cycles, and for main memory is 100 cycles. You are free to add more stats in your code if needed.

We will provide an additional trace file (or maybe two) with the ground truth.

Same as the previous project, it is your choice how you want to structure your code (e.g., whether you want to have separate objects for each class, or you want to instantiate an object within another class, or even not use any class at all and utilize functions and structs, etc.).

#### Bonus

For the bonus part, you need to add a victim cache to your L1 cache! Note that, to be able to grade the bonus and regular parts separately, you have to submit two separate projects. Our suggestion is to first finish the regular part and submit it. Once it works and passes the test, start working on the bonus part (using separate .cpp and .h files copied from your regular project).

The victim cache has to have four entries and has to be fully-associative with 1 block per line and 4B per block (same as DM). The victim cache has to use LRU as the replacement policy.

Data evicted from the L1 cache should be installed in the victim cache (remember there is only one victim cache from the entire L1 cache so any set from L1 can install their line into the victim cache. The oldest line in the victim cache should be then replaced by this new line and the evicted line from the victim cache should be installed in L2 (and the evicted line from L2 should be removed).

On an L1 miss, you have to first search your Victim cache. If it is there, you should update the victim cache hit stat (a new stat), and bring the data back to L1 (swap it with the corresponding line in L1). You should update the LRU states for the victim (with the new line as the most recent).

If data is not in the Victim cache, you should search L2 similar to before. Make sure to update the Victim Access stat. If data is in L2 or memory, it should be installed in L1 (and not the victim cache).

For printing the output, your code should still print L1 and L2 miss rates. Note the L2 miss rate will change now since you have a victim cache. For AAT, consider the hit time of the victim cache the same as L1 (i.e., the hit in L1 or victim cache takes an equal amount of time). You have to adjust your AAT calculation accordingly.

### What to submit.

You need to submit the following files on Gradescope. If you are submitting the bonus part, there is a separate submission link for it.

- 1. Your well-commented code (all the files DO NOT put it under a folder). Note that we will use different traces to test your code's correctness. Your code should be compiled with the following command: "g++ \*.cpp -o memory\_driver". If the code fails to compile, you will lose points. Your code should produce the results exactly as hard-coded in your skeleton code. Failing to create that format will result in losing points.
- 2. For the bonus part, submit a separate set of codes (again all codes and comments) on Gradescope (different submission link). It should be compiled similarly, and it should follow the same format for printing. If you didn't do the bonus part, you don't need to submit anything for this link.