Malcom Chiaji

SCT212-0063/2021

Lab5

a. Cache Misses in Original Code

* Cache size: 16 KB
* Line size: 64 B → 16 KB / 64 B = 256 cache lines
* Each array element: 4 B → 64 B / 4 B = 16 elements per line
* X and Y arrays: 4096 elements each = 16 KB each
* Direct-mapped cache
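The parameters above can be checked with a short sketch. It assumes, hypothetically, that X starts at byte address 0 and that Y is stored immediately after X; the names `X_BASE`, `Y_BASE` and `cache_index` are illustrative only. Under that layout, X[i] and Y[i] always land on the same cache line index, which is what makes the two arrays evict each other.

```python
CACHE_BYTES = 16 * 1024   # 16 KB cache
LINE_BYTES = 64           # 64 B lines
ELEM_BYTES = 4            # 4 B array elements
N = 4096                  # elements per array

NUM_LINES = CACHE_BYTES // LINE_BYTES       # 256 cache lines
ELEMS_PER_LINE = LINE_BYTES // ELEM_BYTES   # 16 elements per line

# Assumed layout (not stated in the lab text): X at address 0, Y right after X.
X_BASE = 0
Y_BASE = X_BASE + N * ELEM_BYTES


def cache_index(addr):
    """Return the direct-mapped cache line index for a byte address."""
    return (addr // LINE_BYTES) % NUM_LINES


# True when X[i] and Y[i] compete for the same cache line for every i.
always_collide = all(
    cache_index(X_BASE + i * ELEM_BYTES) == cache_index(Y_BASE + i * ELEM_BYTES)
    for i in range(N)
)
print(NUM_LINES, ELEMS_PER_LINE, always_collide)  # 256 16 True
```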

Misses Breakdown:

Cold misses: 2 × (4096/16) = 512

Conflict misses: 4096 + (15/16) × 4096 = 4096 + 3840 = 7936

Every store to X misses because its line was displaced by the preceding load from Y, and the loads of the remaining 15 elements of Y in each line miss because that line was displaced by the preceding store to X.

Total: 512 + 7936 = 8448 misses

So:

Miss rate: 8448/12288 = 0.6875 (68.75%), where 12288 = 3 × 4096 is the total number of accesses
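The counts for part (a) can be reproduced with a small direct-mapped cache model. This is only a sketch under stated assumptions: each iteration is taken to perform load X[i], load Y[i], store X[i] (inferred from the three accesses per iteration and the displacement order described above), stores allocate a line on a miss, and Y is placed immediately after X in memory. The function `count_misses` is a hypothetical helper, not part of the lab material.

```python
def count_misses(trace, cache_bytes=16 * 1024, line_bytes=64):
    """Count (cold, conflict) misses for byte addresses in a direct-mapped cache."""
    num_lines = cache_bytes // line_bytes
    resident = [None] * num_lines   # memory block currently held by each line
    seen = set()                    # blocks that have been loaded at least once
    cold = conflict = 0
    for addr in trace:
        block = addr // line_bytes
        index = block % num_lines
        if resident[index] != block:        # miss
            if block in seen:
                conflict += 1               # block was evicted earlier
            else:
                cold += 1                   # first touch of this block
                seen.add(block)
            resident[index] = block         # assumed write-allocate on stores
    return cold, conflict


N, ELEM = 4096, 4
X_BASE, Y_BASE = 0, N * ELEM    # assumed layout: Y directly after X

trace = []
for i in range(N):
    x_addr = X_BASE + i * ELEM
    y_addr = Y_BASE + i * ELEM
    trace += [x_addr, y_addr, x_addr]       # assumed order: load X, load Y, store X

cold, conflict = count_misses(trace)
miss_rate = (cold + conflict) / len(trace)
print(cold, conflict, cold + conflict, miss_rate)  # 512 7936 8448 0.6875
```

Non-cold misses are all counted as conflict misses here, following the classification used in the breakdown above.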

b. Software Solution: Array Merging

Merge arrays X and Y by interleaving their elements, so that 8 elements of each array fit in the same cache line.

Cold misses: 4096/8 = 512, as every 8th iteration misses on the load to X

Conflict misses: 0, as X[i] and Y[i] now share a cache line

Total: 512 misses

Miss rate: 512/12288 ≈ 0.0417 (4.17%)
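The merged layout can be checked the same way. The sketch below assumes a hypothetical interleaved array in which X[i] sits at byte offset 8·i and Y[i] at 8·i + 4, starting at a line-aligned address 0, with the same assumed access order (load X, load Y, store X) and write-allocate policy. `count_misses` is again an illustrative helper.

```python
def count_misses(trace, cache_bytes=16 * 1024, line_bytes=64):
    """Count (cold, conflict) misses for byte addresses in a direct-mapped cache."""
    num_lines = cache_bytes // line_bytes
    resident = [None] * num_lines   # memory block currently held by each line
    seen = set()                    # blocks that have been loaded at least once
    cold = conflict = 0
    for addr in trace:
        block = addr // line_bytes
        index = block % num_lines
        if resident[index] != block:        # miss
            if block in seen:
                conflict += 1
            else:
                cold += 1
                seen.add(block)
            resident[index] = block         # assumed write-allocate on stores
    return cold, conflict


N, ELEM = 4096, 4
PAIR = 2 * ELEM   # assumed merged record: X[i] followed by Y[i], 8 bytes

trace = []
for i in range(N):
    x_addr = i * PAIR           # X[i] at the start of the pair
    y_addr = i * PAIR + ELEM    # Y[i] directly after X[i]
    trace += [x_addr, y_addr, x_addr]       # load X, load Y, store X

cold, conflict = count_misses(trace)
miss_rate = (cold + conflict) / len(trace)
print(cold, conflict, f"{miss_rate:.4f}")  # 512 0 0.0417
```

The merged array is 32 KB, twice the cache size, but each line is used for 8 consecutive iterations and never revisited, so the wrap-around causes no conflict misses.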

c. Hardware Solution:

Double the cache size to 32 KB, so that the two 16 KB arrays can occupy different cache lines.

Cold misses: 2 × (4096/16) = 512, as every 16th iteration misses on the load to X and on the load to Y

Conflict misses: 0

Total: 512 misses

So:

Miss rate: 512/12288 ≈ 0.0417 (4.17%)
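As a rough check of the 32 KB case, the sketch below keeps the original separate arrays and only changes the cache size. It assumes Y is stored immediately after X (so the arrays fall into different halves of the larger cache), the access order load X, load Y, store X, and write-allocate stores. `count_misses` is a hypothetical helper.

```python
def count_misses(trace, cache_bytes, line_bytes=64):
    """Count (cold, conflict) misses for byte addresses in a direct-mapped cache."""
    num_lines = cache_bytes // line_bytes
    resident = [None] * num_lines   # memory block currently held by each line
    seen = set()                    # blocks that have been loaded at least once
    cold = conflict = 0
    for addr in trace:
        block = addr // line_bytes
        index = block % num_lines
        if resident[index] != block:        # miss
            if block in seen:
                conflict += 1
            else:
                cold += 1
                seen.add(block)
            resident[index] = block         # assumed write-allocate on stores
    return cold, conflict


N, ELEM = 4096, 4
X_BASE, Y_BASE = 0, N * ELEM    # assumed layout: Y directly after X

trace = []
for i in range(N):
    x_addr = X_BASE + i * ELEM
    y_addr = Y_BASE + i * ELEM
    trace += [x_addr, y_addr, x_addr]       # load X, load Y, store X

cold, conflict = count_misses(trace, cache_bytes=32 * 1024)
miss_rate = (cold + conflict) / len(trace)
print(cold, conflict, f"{miss_rate:.4f}")  # 512 0 0.0417
```

With 512 lines, X occupies line indices 0 to 255 and Y occupies 256 to 511 under this layout, so neither array evicts the other.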