Malcom Chiaji

SCT212-0063/2021

### Lab4 E1: Cache Tag and Speedup

a. Tag Size:

* Address size: 32 bits
* Block size: 64 bytes → 6 bits for offset
* Cache size: 256 KB = 2¹⁸ bytes
* 4-way set associative: 2¹⁸ / (4 × 64) = 1024 sets → 10 bits for index
* Tag size: 32 - 10 - 6 = 16 bits
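The field widths above can be checked with a short script. This is a minimal sketch, assuming every size is a power of two; the function name `cache_fields` and its parameters are illustrative and not part of the lab handout.

```python
def cache_fields(address_bits, cache_bytes, block_bytes, ways):
    """Return (offset_bits, index_bits, tag_bits) for a set-associative cache.

    Assumes block_bytes and the resulting number of sets are powers of two.
    """
    # log2 of a power of two, computed with integers only
    offset_bits = block_bytes.bit_length() - 1
    num_sets = cache_bytes // (ways * block_bytes)
    index_bits = num_sets.bit_length() - 1
    tag_bits = address_bits - index_bits - offset_bits
    return offset_bits, index_bits, tag_bits


# 32-bit addresses, 256 KB cache, 64-byte blocks, 4-way set associative
offset, index, tag = cache_fields(32, 256 * 1024, 64, 4)
print(offset, index, tag)  # 6 10 16
```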

b. Speedup if all accesses hit:

CPI = CPIexecution + StallCyclesPerInstruction

For a computer that always hits, CPI = 1

For the computer with a non-zero miss rate, we first compute StallCyclesPerInstruction

StallCyclesPerInstruction = (Memory accesses per instruction) \* miss rate \* miss penalty

Memory accesses per instruction = 1 + 0.5 (1 instruction access + 0.5 data access)

StallCyclesPerInstruction = 1.5 \* 0.02 \* 25 = 0.75

Therefore, CPI = 1 + 0.75 = 1.75

The computer with no cache misses is 1.75 times faster.
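The same arithmetic can be written as a small sketch. The values (base CPI of 1, 1.5 memory accesses per instruction, 2% miss rate, 25-cycle miss penalty) are taken from the calculation above; the function name `cpi_with_misses` is only illustrative.

```python
def cpi_with_misses(base_cpi, accesses_per_instr, miss_rate, miss_penalty):
    """CPI = base CPI + memory stall cycles per instruction."""
    stall_cycles = accesses_per_instr * miss_rate * miss_penalty
    return base_cpi + stall_cycles


ideal_cpi = 1.0                                   # cache that always hits
real_cpi = cpi_with_misses(1.0, 1.5, 0.02, 25)    # cache with 2% miss rate
speedup = real_cpi / ideal_cpi                    # same clock and instruction count

print(f"CPI with misses: {real_cpi:.2f}")
print(f"Speedup of the always-hit machine: {speedup:.2f}")
```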

### E2: Memory Bandwidth Usage

Assumptions:

* Cache block = 2 words
* Access rate = 10⁹ words/sec → 0.5×10⁹ blocks/sec
* Miss rate = 5%

a. Write-through:

* Read misses: 0.75×10⁹ × 5% = 0.0375×10⁹ → read 2 words = 0.075×10⁹ words
* Writes (25%):
  + Write hits: 0.2375×10⁹ → write 1 word each
  + Write misses: 0.0125×10⁹ → read 2 words + write 1 = 0.0375×10⁹ words
* Total: 0.075 + 0.2375 + 0.0375 = 0.35×10⁹ words/sec = 35% bandwidth
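The write-through total can be reproduced with a short sketch. It assumes, as in the figures above, that 25% of accesses are writes, that a miss fetches the whole 2-word block, and that memory bandwidth is 10⁹ words/sec; the function name is illustrative.

```python
def write_through_words_per_access(write_frac, miss_rate, block_words):
    """Average words moved between cache and memory per processor access."""
    read_frac = 1.0 - write_frac
    # Read miss: fetch the whole block
    read_miss = read_frac * miss_rate * block_words
    # Write hit: write one word through to memory
    write_hit = write_frac * (1.0 - miss_rate) * 1
    # Write miss: fetch the block, then write one word through
    write_miss = write_frac * miss_rate * (block_words + 1)
    return read_miss + write_hit + write_miss


words = write_through_words_per_access(write_frac=0.25, miss_rate=0.05, block_words=2)
access_rate = 1e9        # words/sec requested by the processor
memory_bandwidth = 1e9   # words/sec the memory can supply
fraction_used = words * access_rate / memory_bandwidth

print(f"Fraction of bandwidth used: {fraction_used:.2f}")
```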

b. Write-back:

Assuming 30% of replaced lines are modified (dirty), the cases are:

On a read hit there is no memory access

On a read miss:

1. If the replaced line is modified, the cache must send two words to memory, and then memory must send two words to the cache.

2. If the replaced line is clean, memory must send two words to the cache.

On a write hit there is no memory access

On a write miss:

1. If the replaced line is modified, the cache must send two words to memory, and then memory must send two words to the cache.

2. If the replaced line is clean, memory must send two words to the cache.

Thus:

Average words transferred = 0.7125 ∗ 0 + 0.0375 ∗ (0.7 ∗ 2 + 0.3 ∗ 4) + 0.2375 ∗ 0 + 0.0125 ∗ (0.7 ∗ 2 + 0.3 ∗ 4) = 0.13

Average bandwidth used = 0.13 ∗ 10⁹ words/sec

Fraction of bandwidth used = 0.13 ∗ 10⁹ / 10⁹ = 0.13

The write-through cache uses more than twice the cache-memory bandwidth of the write-back cache (0.35 / 0.13 ≈ 2.7).
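The write-back case can be sketched the same way. Hits cost nothing, so only misses contribute; the sketch assumes a 5% miss rate, 2-word blocks, and 30% of replaced lines dirty, as used in the calculation above. The function name is illustrative, and the 0.35 figure is the write-through result from part a.

```python
def write_back_words_per_access(miss_rate, dirty_frac, block_words):
    """Average words moved between cache and memory per processor access."""
    clean_cost = block_words        # fetch the new block only
    dirty_cost = 2 * block_words    # write back the old block, then fetch the new one
    words_per_miss = (1.0 - dirty_frac) * clean_cost + dirty_frac * dirty_cost
    # Read and write misses cost the same, so the read/write split drops out
    return miss_rate * words_per_miss


wb_words = write_back_words_per_access(miss_rate=0.05, dirty_frac=0.3, block_words=2)
wt_words = 0.35  # write-through result from part a
ratio = wt_words / wb_words

print(f"Write-back fraction of bandwidth: {wb_words:.2f}")
print(f"Write-through / write-back: {ratio:.2f}")
```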

### E3: Write-Through vs Write-Back Performance

Assumptions:

* Miss penalties = 50 cycles
* Loads = 26%, Stores = 9%, total = 35% of instructions
* I-cache miss rate = 0.5%, D-cache miss rate = 1%
* Write buffer absorbs all write-through stalls

CPU performance equation: CPUTime = IC ∗ CPI ∗ ClockTime

CPI = CPIexecution + StallCyclesPerInstruction

We know:

Instruction miss penalty is 50 cycles

Data read hit takes 1 cycle

Data write hit takes 2 cycles

Data miss penalty is 50 cycles for write through cache

Data miss penalty for the write back cache is 50 cycles if the replaced block is clean, or 100 cycles if it is dirty and must be written back first

Miss rate is 1% for data cache and 0.5% for instruction cache

50% of cache blocks are dirty in the write back cache

26% of all instructions are loads

9% of all instructions are stores

Then:

CPIexecution = 0.26 ∗ 1 + 0.09 ∗ 2 + 0.65 ∗ 1 = 1.09

Write through

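StallCyclesPerInstruction = MRI ∗ 50 + MRD ∗ (0.26 ∗ 50 + 0.09 ∗ 50) = 0.005 ∗ 50 + 0.01 ∗ 17.5 = 0.425

where MRI is the instruction cache miss rate and MRD is the data cache miss rate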

so:

CPI = 1.09 + 0.425 = 1.515

Write back

StallCyclesPerInstruction = MRI ∗ 50 + MRD ∗ (0.26 ∗ (0.5 ∗ 50 + 0.5 ∗ 100) + 0.09 ∗ (0.5 ∗ 50 + 0.5 ∗ 100)) = 0.5125

so:

CPI = 1.09 + 0.5125 = 1.6025

Comparing the two CPIs (1.6025 / 1.515 ≈ 1.058), the system with the write back cache is about 6% slower.
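Both CPI figures can be recomputed with a short sketch. It uses only the numbers listed above (50-cycle instruction miss penalty, 0.5% and 1% miss rates, 26% loads, 9% stores, 50% dirty blocks); the constant and function names are illustrative.

```python
I_MISS_RATE = 0.005    # instruction cache miss rate
D_MISS_RATE = 0.01     # data cache miss rate
LOADS, STORES = 0.26, 0.09
INSTR_PENALTY = 50     # cycles per instruction miss


def cpi(data_miss_penalty):
    """CPI = execution CPI + stall cycles per instruction."""
    # Loads take 1 cycle, stores take 2, all other instructions take 1
    cpi_execution = LOADS * 1 + STORES * 2 + (1 - LOADS - STORES) * 1
    stalls = (I_MISS_RATE * INSTR_PENALTY
              + D_MISS_RATE * (LOADS + STORES) * data_miss_penalty)
    return cpi_execution + stalls


cpi_write_through = cpi(data_miss_penalty=50)
# Write back: half the misses replace a clean block (50), half a dirty one (100)
cpi_write_back = cpi(data_miss_penalty=0.5 * 50 + 0.5 * 100)
slowdown = cpi_write_back / cpi_write_through - 1

print(f"Write-through CPI: {cpi_write_through:.4f}")
print(f"Write-back CPI:    {cpi_write_back:.4f}")
print(f"Write-back slowdown: {slowdown:.1%}")
```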