

#### **Introduction to Computer Architectures (Fall 2019)**

# **PA3: Cache Simulator**

Prof. Jinkyu Jeong (jinkyu@skku.edu)

TA – Minwoo Ahn (<u>minwoo.ahn@csi.skku.edu</u>)

TA – Sunghwan Kim (<u>sunghwan.kim@csi.skku.edu</u>)

Computers Systems and Intelligence Laboratory (<a href="http://csi.skku.edu">http://csi.skku.edu</a>)
2019. 12. 03.

#### Overview: What to do?

- Goal
  - To help you understand the internal of cache
- Implement cache simulator in C language
  - Skelaton code is provided
- Environment
  - Recommend to implement your code on Linux



#### **Overview: Given Files**

- Configuration (cache.config)
  - Configuration of cache is listed
  - level, size, way, entry, replacement, write policy, access cycle
- Input (cache.input)
  - Random (or sequential) read and write request stream
- Skelaton (main.c, mem.h)
  - You have to implement your cache in these files
  - Do not modify original code

#### **Input Files**

- Input files are set of [read | write] operation with 32bit physical address (file must end with 'H', halt)
  - Read example (R 22f7c744)
  - Write example (W 22f7c54c)
  - Every read and write operation requests 4 bytes word data



#### **Configuration File**

- 1st line: # of cache level, memory access time (cycle)
- 2<sup>nd</sup> line (level 1 cache): cache size (byte), # of entries, # of ways, replacement policy, write policy, cache access time (cycle)
- 3<sup>rd</sup> line (level 2 cache): same configuration as the second line
- Replacement policy

- LRU: Least Recently Used

FIFO: First-in-First-out

#### Write policy

- WB: Write Back

WT: Write Through

```
1 400
16 256 2 FIF0 WB 2
```

(a) 1 level cache config

```
2 400
16 256 2 FIF0 WB 2
64 512 4 LRU WT 10
```

(b) 2 level cache config



#### **Skelaton Code**

- mem.h
  - struct cache\_entry
    - Valid bit, dirty bit, tag
    - You can add new variables for your own purpose
  - struct cache
    - Variable: size, num\_entry, num\_ways, replace\_policy, write\_policy, access\_cycle
    - Usage: way\_cache[way #][entry #]

```
cache entry {
       char valid;
       char dirty;
      unsigned int tag;
      /* You can add variables for your purpose */
truct cache {
       struct cache entry **way cache;
       int size;
       int num entry;
       int num ways;
       int block per entry;
                                       // # of blocks / entry
       int bytes per block;
                                       // # of bytes / block
       enum replace policy replace;
      enum write policy write;
      int access cycle;
```

#### main.c - main()

- Read configuration file
- Read input file
- Cache operation
- Print result

```
Start Simulation */
while(1) {}
        fscanf(fd input, "%c", &op);
        switch(op) {
                                //Read Operation
                case
                        fscanf(fd input, "%x", &phy addr);
                        read op(phy addr);
                        break;
                                 //Write Operation
                case
                        fscanf(fd input, "%x", &phy addr);
                        write_op(phy_addr);
                        break;
                default:
                                 //HALT
                        goto end;
       };
```

### main.c - initialize()

```
initialize() {
  int n, i, j;
  printf("Initialize Your Cache...\n");
  /* cache.way cache[way #][entry #]*/
  for(n = 0; n < cache level; n++) {
          cache[n].way cache = (struct cache entry **)malloc(sizeof(struct cache entry *) * cache[n].num ways);
          for (i = 0; i < cache[n].num ways; i++)
                 cache[n].way cache[i] = (struct cache entry *)malloc(sizeof(struct cache entry) * cache[n].num entry);
          for (i = 0; i < cache[n].num ways; i++) {
                 for (j = 0; j < cache[n].num entry; j++) {
                         /* Initialize other variables in your cache entry structure! */
                        /* e.g. valid, dirty, tag, etc.
                                                                             If you added new variables in
                        cache[n].way cache[i][j].valid = 0;
                        cache[n].way cache[i][j].tag = 0;
                                                                             the struct cache, you must
                        cache[n].way cache[i][j].dirty = 0;
                        _____initialize the value here!
  printf("Finished to Initialize Your Cache!\n\n");
```

#### main.c - read\_op(), write\_op()

- You must implement these three functions
  - read\_op(), write\_op()
    - You should count the hit and miss each level of cache
    - At the miss, you should implement cache load and eviction
    - If evicted entry is dirty, write back or write through the entry
    - You should count the total access time (cycle)

Direct Map (baseline)



Cache Associativity (2-way)



#### **PA3 – Cache Simulator**

- Implement cache simulator in C language
  - read\_op(), write\_op()
- L1 cache & L2 cache must satisfy the inclusion property
  - No addition cycle
- L1 cache & L2 cache have same number of data blocks
- Write policy with write-allocation
- No write buffer



#### **2 Level Cache**

L1 Cache Access time: 2

L2 Cache Access time: 6

Memory Access time: 400

• L1 hit: 2 cycles

• L1 miss, L2 hit: 2+6 cycles

L1 miss, L2 miss: 2+6+400 cycles

 Write through or dirty bit entry eviction need cycle addition

#### **Submission**

- Compress your main.c, and mem.h files only
  - Do not change file name
  - Without subdirectories
  - Submitted file name should be YourStudentID.zip
    - ex) 2018000000.zip
  - Please follow this format
- Upload your zip file to I-Campus Assignments bulletin
- DO NOT COPY
  - If you have any questions please use your TAs
- Due date: TBD
  - -20% per day for delayed submission



# Tips

- Execution
  - To run the code, type \$make command under skeleton directory
  - "simulator" executable file will be made
  - Run the executable without argument
    - \$./simulator
- Lecture notes will help your assignment

cache.config

2 400 16 2 1 LRU WT 2 32 2 2 LRU WB 10

| Tag    | Index | Block<br>offset | Byte<br>offset |
|--------|-------|-----------------|----------------|
| 28 bit | 1 bit | 1 bit           | 2 bit          |

| Index | Valid | Dirty | Tag | Cach  | ne line |   |
|-------|-------|-------|-----|-------|---------|---|
|       |       |       |     | Data1 | Data2   |   |
| 0     | 0     | 0     | 0   | 0     | 0       | V |
| 1     | 0     | 0     | 0   | 0     | 0       |   |

WT L1 cache

| Tag    | Index | Block<br>offset | Byte<br>offset |
|--------|-------|-----------------|----------------|
| 28 bit | 1 bit | 1 bit           | 2 bit          |

| ı | V | D | Т | Cache | Cache line |   | V | D | Т | Cache | line  |
|---|---|---|---|-------|------------|---|---|---|---|-------|-------|
|   |   |   |   | Data1 | Data2      |   |   |   |   | Data1 | Data2 |
| 0 | 0 | 0 | 0 | 0     | 0          | 0 | 0 | 0 | 0 | 0     | 0     |
| 1 | 0 | 0 | 0 | 0     | 0          | 1 | 0 | 0 | 0 | 0     | 0     |

cache.config

2 400

16 2 1 LRU WT 2 32 2 2 LRU WB 10 1. W 22F7CF44

| Index | Valid | Dirty | Tag     | Cach  | ne line |
|-------|-------|-------|---------|-------|---------|
|       |       |       |         | Data1 | Data2   |
| 0     | 1     | 0     | 22F7CF4 | XX    | aa      |
| 1     | 0     | 0     | 0       | 0     | 0       |

WT L1 cache

L1 miss + L2 miss = 2 + 10 + 400 cycles

| 1 | V | D | Т       | Cache | Cache line |   | V | D | Т | Cache | line  |
|---|---|---|---------|-------|------------|---|---|---|---|-------|-------|
|   |   |   |         | Data1 | Data2      |   |   |   |   | Data1 | Data2 |
| 0 | 1 | 0 | 22F7CF4 | XX    | XX         | 0 | 0 | 0 | 0 | 0     | 0     |
| 1 | 0 | 0 | 0       | 0     | 0          | 1 | 0 | 0 | 0 | 0     | 0     |



cache.config

2 400 16 2 1 LRU WT 2 32 2 2 LRU WB 10 1. W 22F7CF44

| Index | Valid | Dirty | Tag     | Cache line |       |  |  |
|-------|-------|-------|---------|------------|-------|--|--|
|       |       |       |         | Data1      | Data2 |  |  |
| 0     | 1     | 0     | 22F7CF4 | XX         | aa    |  |  |
| 1     | 0     | 0     | 0       | 0          | 0     |  |  |

WT L1 cache

Write Through = 10 cycles

| ı | V | D | Т       | Cache | line  |   | V | D | Т | Cache | line  |
|---|---|---|---------|-------|-------|---|---|---|---|-------|-------|
|   |   |   |         | Data1 | Data2 |   |   |   |   | Data1 | Data2 |
| 0 | 1 | 0 | 22F7CF4 | XX    | aa    | 0 | 0 | 0 | 0 | 0     | 0     |
| 1 | 0 | 0 | 0       | 0     | 0     | 1 | 0 | 0 | 0 | 0     | 0     |



cache.config

32 2 2 LRU WB 10

2 400 16 2 1 LRU WT 2 2. R 22F7CF40

| Index | Valid | Dirty | Tag     | Cach  | e line |
|-------|-------|-------|---------|-------|--------|
|       |       |       |         | Data1 | Data2  |
| 0     | 1     | 0     | 22F7CF4 | XX    | aa     |
| 1     | 0     | 0     | 0       | 0     | 0      |

WT L1 cache

L1 hit = 2 cycles

| ı | V | D | Т       | Cache | line  |   | V | D | Т | Cache | line  |
|---|---|---|---------|-------|-------|---|---|---|---|-------|-------|
|   |   |   |         | Data1 | Data2 |   |   |   |   | Data1 | Data2 |
| 0 | 1 | 1 | 22F7CF4 | XX    | aa    | 0 | 0 | 0 | 0 | 0     | 0     |
| 1 | 0 | 0 | 0       | 0     | 0     | 1 | 0 | 0 | 0 | 0     | 0     |



3. R 22F7CA3C

2 400 16 2 1 LRU WT 2 32 2 2 LRU WB 10

| Index | Valid | Dirty | Tag     | Cach  | he line |  |  |
|-------|-------|-------|---------|-------|---------|--|--|
|       |       |       |         | Data1 | Data2   |  |  |
| 0     | 1     | 0     | 22F7CF4 | XX    | aa      |  |  |
| 1     | 1     | 0     | 22F7CA3 | xx    | XX      |  |  |

WT L1 cache

L1 miss + L2 miss = 2 + 10 + 400 cycles

| 1 | V | D | Т       | Cache | Cache line |   | V | D | Т | Cache | line  |
|---|---|---|---------|-------|------------|---|---|---|---|-------|-------|
|   |   |   |         | Data1 | Data2      |   |   |   |   | Data1 | Data2 |
| 0 | 1 | 1 | 22F7CF4 | XX    | aa         | 0 | 0 | 0 | 0 | 0     | 0     |
| 1 | 1 | 0 | 22F7CA3 | XX    | XX         | 1 | 0 | 0 | 0 | 0     | 0     |

4. W 22F7CA30

2 400 16 2 1 LRU WT 2 32 2 2 LRU WB 10

| Index | Valid | Dirty | Tag     | Cacl  |       |         |
|-------|-------|-------|---------|-------|-------|---------|
|       |       |       |         | Data1 | Data2 |         |
| 0     | 1     | 0     | 22F7CA3 | bb    | XX    | WT L1 c |
| 1     | 1     | 0     | 22F7CA3 | xx    | xx    |         |

L1 miss + L2 miss = 2 + 10 + 400 cycles

|   | V | D | Т       | Cache line |       |   | V | D | Т       | Cache | line  |
|---|---|---|---------|------------|-------|---|---|---|---------|-------|-------|
|   |   |   |         | Data1      | Data2 |   |   |   |         | Data1 | Data2 |
| 0 | 1 | 1 | 22F7CF4 | XX         | aa    | 0 | 1 | 0 | 22F7CA3 | XX    | XX    |
| 1 | 1 | 0 | 22F7CA3 | XX         | XX    | 1 | 0 | 0 | 0       | 0     | 0     |



4. W 22F7CA30

2 400 16 2 1 LRU WT 2 32 2 2 LRU WB 10

|          | ne line | Cach  | Tag     | Dirty | Valid | Index |
|----------|---------|-------|---------|-------|-------|-------|
|          | Data2   | Data1 |         |       |       |       |
| WT L1 ca | XX      | bb    | 22F7CA3 | 0     | 1     | 0     |
|          | xx      | XX    | 22F7CA3 | 0     | 1     | 1     |

ache

Write Through = 10 cycles

| ı | V | D | Т       | Cache line |       |   | V | D | Т       | Cache | line  |
|---|---|---|---------|------------|-------|---|---|---|---------|-------|-------|
|   |   |   |         | Data1      | Data2 |   |   |   |         | Data1 | Data2 |
| 0 | 1 | 1 | 22F7CF4 | xx         | aa    | 0 | 1 | 0 | 22F7CA3 | bb    | XX    |
| 1 | 1 | 0 | 22F7CA3 | XX         | XX    | 1 | 0 | 0 | 0       | 0     | 0     |



5. R 22F7CF40

2 400 16 2 1 LRU WT 2 32 2 2 LRU WB 10

| Index | Valid | Dirty | Tag     | Cache line |       |   |  |
|-------|-------|-------|---------|------------|-------|---|--|
|       |       |       |         | Data1      | Data2 |   |  |
| 0     | 1     | 0     | 22F7CF4 | XX         | aa    | V |  |
| 1     | 1     | 0     | 22F7CA3 | xx         | xx    |   |  |

WT L1 cache

L1 miss + L2 hit = 2 + 10 cycles

| I | V | D | Т       | Cache line |       |   | V | D | Т       | Cache line |       |
|---|---|---|---------|------------|-------|---|---|---|---------|------------|-------|
|   |   |   |         | Data1      | Data2 |   |   |   |         | Data1      | Data2 |
| 0 | 1 | 1 | 22F7CF4 | XX         | aa    | 0 | 1 | 1 | 22F7CA3 | bb         | XX    |
| 1 | 1 | 0 | 22F7CA3 | XX         | XX    | 1 | 0 | 0 | 0       | 0          | 0     |

6. W 22F7C420

2 400 16 2 1 LRU WT 2 32 2 2 LRU WB 10

| Index | Valid | Dirty | Tag     | Cacl  |       |             |
|-------|-------|-------|---------|-------|-------|-------------|
|       |       |       |         | Data1 | Data2 |             |
| 0     | 1     | 0     | 22F7CF4 | XX    | aa    | WT L1 cache |
| 1     | 1     | 0     | 22F7CA3 | XX    | xx    |             |

L1 miss + L2 miss + L2 replacement = 2 + 10 + 400 + 400 cycles

|   | V | D | Т       | Cache line |       |   | V | D | Т       | Cache | line       |      |
|---|---|---|---------|------------|-------|---|---|---|---------|-------|------------|------|
|   |   |   |         | Data1      | Data2 |   |   |   |         | Data1 | Pata2ite k | oack |
| 0 | 1 | 1 | 22F7CF4 | XX         | aa    | 0 | 1 | 1 | 22F7CA3 | bb    | XX         | Juck |
| 1 | 1 | 0 | 22F7CA3 | XX         | XX    | 1 | 0 | 0 | 0       | 0     | 0          |      |



6. W 22F7C420

2 400 16 2 1 LRU WT 2 32 2 2 LRU WB 10

| Index | Valid | Dirty | Tag     | Cacl  | ne line |
|-------|-------|-------|---------|-------|---------|
|       |       |       |         | Data1 | Data2   |
| 0     | 1     | 0     | 22F7C42 | aa    | xx      |
| 1     | 1     | 0     | 22F7CA3 | xx    | xx      |

WT L1 cache

L1 miss + L2 miss + L2 replacement = 2 + 10 + 400 + 400 cycles

| ı | V | D | Т       | Cache line |       |   | V | D | Т       | Cache line |       |
|---|---|---|---------|------------|-------|---|---|---|---------|------------|-------|
|   |   |   |         | Data1      | Data2 |   |   |   |         | Data1      | Data2 |
| 0 | 1 | 1 | 22F7CF4 | xx         | aa    | 0 | 1 | 0 | 22F7C42 | XX         | XX    |
| 1 | 1 | 0 | 22F7CA3 | XX         | XX    | 1 | 0 | 0 | 0       | 0          | 0     |



6. W 22F7C420

2 400 16 2 1 LRU WT 2 32 2 2 LRU WB 10

| Index | ndex Valid Dirty |   | Tag     | Cach  | e line |    |
|-------|------------------|---|---------|-------|--------|----|
|       |                  |   |         | Data1 | Data2  |    |
| 0     | 1                | 0 | 22F7C42 | aa    | XX     | WT |
| 1     | 1                | 0 | 22F7CA3 | xx    | xx     |    |

cache

Write through = 10 cycles

| ı | V | D | Т       | Cache line |       |   | V | D | Т       | Cache line |       |
|---|---|---|---------|------------|-------|---|---|---|---------|------------|-------|
|   |   |   |         | Data1      | Data2 |   |   |   |         | Data1      | Data2 |
| 0 | 1 | 1 | 22F7CF4 | xx         | aa    | 0 | 1 | 0 | 22F7C42 | aa         | XX    |
| 1 | 1 | 0 | 22F7CA3 | XX         | xx    | 1 | 0 | 0 | 0       | 0          | 0     |



7. W 22F7C420

2 400 16 2 1 LRU WT 2 32 2 2 LRU WB 10

| Index | Valid | Dirty | Tag     | Cacl  | ne line |             |
|-------|-------|-------|---------|-------|---------|-------------|
|       |       |       |         | Data1 | Data2   |             |
| 0     | 1     | 0     | 22F7C42 | bb    | XX      | WT L1 cache |
| 1     | 1     | 0     | 22F7CA3 | xx    | xx      |             |

L1 hit + Write through= 2 + 10 cycles

| ı | V | D | Т       | Cache line |       |   | V | D | Т       | Cache line |       |
|---|---|---|---------|------------|-------|---|---|---|---------|------------|-------|
|   |   |   |         | Data1      | Data2 |   |   |   |         | Data1      | Data2 |
| 0 | 1 | 1 | 22F7CF4 | xx         | aa    | 0 | 1 | 1 | 22F7C42 | bb         | XX    |
| 1 | 1 | 0 | 22F7CA3 | XX         | XX    | 1 | 0 | 0 | 0       | 0          | 0     |



8. R 22F7C424

2 400 16 2 1 LRU WT 2 32 2 2 LRU WB 10

| Index | Valid | Dirty | Tag     | Cache line |       |   |
|-------|-------|-------|---------|------------|-------|---|
|       |       |       |         | Data1      | Data2 | _ |
| 0     | 1     | 0     | 22F7C42 | bb         | XX    | W |
| 1     | 1     | 0     | 22F7CA3 | xx         | xx    |   |

WT L1 cache

L1 hit = 2 cycles

|   | V | D | Т       | Cache line |       |   | V | D | Т       | Cache line |       |
|---|---|---|---------|------------|-------|---|---|---|---------|------------|-------|
|   |   |   |         | Data1      | Data2 |   |   |   |         | Data1      | Data2 |
| 0 | 1 | 1 | 22F7CF4 | xx         | aa    | 0 | 1 | 1 | 22F7C42 | bb         | XX    |
| 1 | 1 | 0 | 22F7CA3 | XX         | XX    | 1 | 0 | 0 | 0       | 0          | 0     |



#### **Example - result**

```
W 22f7cf44
R 22f7cf40
R 22f7ca3c
W 22f7ca30
R 22f7cf40
W 22f7c420
W 22f7c420
R 22f7c424
H
```

cache.input

```
Level 1 Cache
Hit Count : 3
Miss Count : 5
Hit Ratio : 0.375

Level 2 Cache
Hit Count : 1
Miss Count : 4
Hit Ratio : 0.200

Total Hit Ratio : 0.500
Total cycle: 2106
```

cache.output

#### Questions

- You are free to ask questions to TAs
  - Please send email before visit the office
  - Email: minwoo.ahn@csi.skku.edu / sunghwan.kim@csi.skku.edu
  - Office: Semiconductor Building #400509