CS224 - SPRING 2023 -

Preliminary Design Report

Lab-06

Section 5

Tolga Han Arslan

22003061

22.05.2023

**1. (5 points: With 3 or more errors you get 0 points. Otherwise full point.)** Fill in the empty cells of the following table. Assume that main memory size is 4GB. **Index Size**: No. of bits needed to express the set number in an address, **Block Offset**: No. of bits needed to indicate the word offset in a block, **Byte Offset**: No. of bits needed to indicate the byte offset in a word. **Block Replacement Policy Needed**: Indicate if a block replacement policy such as FIFO, Random etc. is needed (yes) or not (no). If some combinations are not possible mark them.

|  |  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| No. | Cache  Size  KB | N  way  cache | Word  Size | Block  Size  (no. of  words) | No.  of  Sets | Tag  Size  in  bits | Index  Size  (Set No.) in  bits | Word  Block  Offset  Size in bits | Byte  Offset  Size in  bits | Block  Replacement  Policy Needed  (Yes/No) |
| 1 | 64 | 1 | 32 bits | 4 | 212 | 16 | 12 | 2 | 2 | NO |
| 2 | 64 | 2 | 32 bits | 4 | 211 | 17 | 11 | 2 | 2 | YES |
| 3 | 64 | 4 | 32 bits | 8 | 29 | 18 | 9 | 3 | 2 | YES |
| 4 | 64 | Full | 32 bits | 8 | 1 | 27 | 0 | 3 | 2 | YES |
| 9 | 128 | 1 | 16 bits | 4 | 214 | 15 | 14 | 2 | 1 | NO |
| 10 | 128 | 2 | 16 bits | 4 | 213 | 16 | 13 | 2 | 1 | YES |
| 11 | 128 | 4 | 16 bits | 16 | 210 | 17 | 10 | 4 | 1 | YES |
| 12 | 128 | Full | 16 bits | 16 | 1 | 27 | 0 | 4 | 1 | YES |

**2**. **(5 points: With 3 or more errors you get 0 points. Otherwise full point.)** Consider the following MIPS code segment. Cache capacity is 8 words, Block size: 2 words, N= 1.

addi $t0, $0, 5

loop: beq $t0, $0, done

lw $t1, 0x4($0)

lw $t2, 0xC($0)

lw $t3, 0x8($0)

addi $t0, $t0, -1

j loop

done:

**a.** In the following table indicate the type of miss, if any: Compulsory, Conflict, Capacity.

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| Instruction | Iteration No. | | | | |
| **1** | **2** | **3** | **4** | **5** |
| lw $t1, 0x4($0) | Compulsory | Hit | Hit | Hit | Hit |
| lw $t2, 0xC($0) | Compulsory | Hit | Hit | Hit | Hit |
| lw $t3, 0x8($0) | Hit | Hit | Hit | Hit | Hit |

**b.** What is the total cache memory size in number of bits? Include the V bit your calculations. Show the details of your calculation.

Cache memory with capacity 8 words, block size 2 words, N = 1 can be represented as:

|  |  |  |  |
| --- | --- | --- | --- |
| V | Tag | Data | Data |
| 1 bit | 27 bits | 32 bits | 32 bits | Set 0 |
| 1 bit | 27 bits | 32 bits | 32 bits | Set 1 |
| 1 bit | 27 bits | 32 bits | 32 bits | Set 2 |
| 1 bit | 27 bits | 32 bits | 32 bits | Set 3 |

Byte offset = 2 bits (32 bit data)

Set = log24 = 2bits

Block offset = log22 = 1 bit

Tag = 32 – 2 – 2 – 1 = 27 bits

Hence, total cache memory size in number of bits = 4 (1 + 27 + 32 + 32) = 368 bits

**c.** State the number of AND and OR gates, EQUALITY COMPARATORs and MULTIPLEXERs needed to implement the cache memory.

One equality comparator needed for comparing the tags (since N = 1). One 2-1 multiplexer needed for selecting the word. 1 AND gate is needed for determining the hit (v bit and result of equality comparator).

**(5 points: With 3 or more errors you get 0 points. Otherwise full point.)** Consider the above MIPS code segment. The cache capacity is 2 words, block size is 1 word. There is only 1 set. The block replacement policy is LRU.

**a.** In the following table indicate the type of miss, if any: Compulsory, Conflict, Capacity.

|  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- |
| Instruction | Iteration No. | | | | |
| **1** | **2** | **3** | **4** | **5** |
| lw $t1, 0x4($0) | Compulsory | Capacity | Capacity | Capacity | Capacity |
| lw $t2, 0xC($0) | Compulsory | Capacity | Capacity | Capacity | Capacity |
| lw $t3, 0x8($0) | Capacity | Capacity | Capacity | Capacity | Capacity |

**b.** How many bits are needed for the implementation of LRU policy? What is the total cache memory size in number of bits? Include the V bit and the bit(s) used for LRU in your calculations. Show the details of your calculation.

Cache memory with capacity 2 words, block size 1 word, only 1 set and LRU replacement can be represented as:

|  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- |
| V | U | Tag | Data | V | Tag | Data |
| 1 bit | 1 bit | 30 bits | 32 bits | 1 bit | 30 bits | 32 bits | Set 0 |

Byte offset = 2 bits (32 bit data)

Set = log21 = 0 bit (only 1 set)

Block ofset = log21 = 0 bit (block size 1)

Tag = 32 – 2 = 30 bits

1 use bit is needed for LRU replacement since there are 2 words in the set.

Hence, total cache memory size in number of bits = 1+ 1 + 30 + 32 + 1 + 30 + 32 = 127 bits

**c.** State the number of AND and OR gates, EQUALITY COMPARATORs and MULTIPLEXERs needed to implement the cache memory.

Two equality comparator needed for comparing tags (since N = 2), one 2-1 multiplexer needed for selecting the word, 2 AND and 1 OR gate needed for determining the hit (V bits and equality comparators result to AND gates and their results to OR gate).

**4**. **(5 points)** Consider a three-level memory: L1 and L2 are for cache memory and the third level is for the main memory. Access time for L1 is 1 clock cycle, the access time for L2 is 4 times more than L1 and main memory access time is 10 times more than L2. The miss rate for L1 is 20% and the miss rate for L2 is 5%. What is the effective clock cycle for memory access (AMAT in number of clock cycles)?

With 4 GHz clock rate how much time is needed for a program with 1012 instructions to execute? Show your work briefly.

* (I assumed that L2 is 4 times more than L1 means that L2 = 5 L1 etc.)

Access times TL1 = 1 cycle, TL2 = 5 cycles, TMM = 55 cycles

AMAT = TL1 + MissL1 (TL2 + MissL2 (TMM)) = 1 cycle + 0.2 (5 cycles + 0.05 (55 cycles)) = 2.55 cycles

4 GHz clock rate means that clock period is 1/(4 x 109) = 25 x 10-11 seconds

Hence, execution time of 1012 instructions = 1012 x 25 x 10-11 x 2.55 = 637.5 seconds