I don't quite understand your solutions to this exercise for sumB and sumC.
The array_t is more than twice the cache size (it is 4C), and in sumB we read it column by column. The first 16 iterations of the inner loop touch rows that span the first quarter of the matrix, which is all the cache can hold; every later row maps onto the same sets and evicts them. So by the time we wrap around to the first row for the next column, all blocks from the first quarter of the matrix have been evicted.
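To make the eviction argument concrete, here is a minimal sketch of the set-index calculation. It assumes a direct-mapped 4096-byte cache with 16-byte blocks, 4-byte ints, N=64, and an array based at address 0; these values are inferred from the "4C" and "per 4 ints" remarks in this post rather than stated in it, and `set_index` is a hypothetical helper.

```python
# Assumed parameters (inferred from the post, not quoted from the exercise).
CACHE_BYTES = 4096
BLOCK_BYTES = 16
INT_BYTES = 4
NUM_SETS = CACHE_BYTES // BLOCK_BYTES  # 256 sets in a direct-mapped cache


def set_index(i, j, n):
    """Cache set that a[i][j] maps to, for an n x n int array based at address 0."""
    addr = (i * n + j) * INT_BYTES
    return (addr // BLOCK_BYTES) % NUM_SETS


# Rows of column 0 that compete for the same set as a[0][0] when N=64.
rows_sharing_set = [i for i in range(64) if set_index(i, 0, 64) == set_index(0, 0, 64)]
print(rows_sharing_set)  # [0, 16, 32, 48]
```

Under these assumptions, rows 0, 16, 32 and 48 all land in the same set, so a column walk replaces row 0's block three times before the next column comes back to it.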
sumB has a 100% miss rate, not 25%.
sumC is similar to sumB (it also reads column-wise); the only difference is that we read two items out of each block once it's loaded into the cache, so two of every four accesses (one per loaded block of 4 ints) are hits.
sumC has a 50% miss rate, not 25%.
There is no difference between N=64 and N=60, because both lead to a multiple of 16 and an array that is more than twice the cache size.
Since the stride is larger than the block size, sumB and sumC could have smaller miss rates only if the whole array_t were less than twice the cache size. For example, if the entire array could fit into the cache, then they would have a 25% miss rate.
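These numbers can be checked with a small simulation. The sketch below assumes a direct-mapped 4096-byte cache with 16-byte blocks, 4-byte ints, and an array based at address 0. It also assumes sumB walks `a[i][j]` with `i` in the inner loop, and that sumC steps both indices by 2 and reads `a[i][j]`, `a[i+1][j]`, `a[i][j+1]`, `a[i+1][j+1]` in that order. None of this is quoted from the exercise, and all function names are hypothetical.

```python
def miss_rate(accesses, cache_bytes=4096, block_bytes=16):
    """Miss rate of a direct-mapped cache over a sequence of byte addresses."""
    num_sets = cache_bytes // block_bytes
    resident = [None] * num_sets  # block number currently held by each set
    misses = total = 0
    for address in accesses:
        block = address // block_bytes
        s = block % num_sets
        total += 1
        if resident[s] != block:  # cold or conflict miss: load the block
            misses += 1
            resident[s] = block
    return misses / total


def addr(i, j, n, int_bytes=4):
    """Byte address of a[i][j] in a row-major n x n int array based at 0."""
    return (i * n + j) * int_bytes


def sum_b_trace(n):
    """Assumed sumB access pattern: column by column."""
    for j in range(n):
        for i in range(n):
            yield addr(i, j, n)


def sum_c_trace(n):
    """Assumed sumC access pattern: 2x2 tiles, walked column-wise."""
    for j in range(0, n, 2):
        for i in range(0, n, 2):
            yield addr(i, j, n)
            yield addr(i + 1, j, n)
            yield addr(i, j + 1, n)
            yield addr(i + 1, j + 1, n)


for n in (64, 60):
    print(n, miss_rate(sum_b_trace(n)), miss_rate(sum_c_trace(n)))
# 64 1.0 0.5
# 60 0.25 0.25
```

Under these assumptions the sketch reproduces the 100% and 50% figures for N=64. For N=60 it reports 25% for both loops: a 60-int row spans 15 blocks, so rows no longer collide every 16 rows, which suggests the N=60 case is worth rechecking.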
Agreed. In sumB, reading down a column always evicts the block previously mapped to that set, so the miss rate should be 100%. In sumC, the first two accesses of each iteration always miss and the last two always hit, so sumC's miss rate should be 50%.