Holden Sandlar

Advanced Computer Architecture

Assignment 5

**Problem 5.1**

c.

P3.B0: (M, 120, 0080)

d.

P0.B2: (S, 110, 0030)

P1.B2: (S, 110, 0030) reads 0030

P3.B2: (S, 110, 0030)

e.

P0.B1: (M, 108, 0048)

P3.B1: (I, 108, 0008)
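
The part e result can be reproduced with a minimal sketch of a write-invalidate (MSI) snooping step. The initial cache contents and the triggering operation (P0 writes 48 to address 108) are assumptions inferred from the answer above, not taken from the problem statement; the helper `write` is hypothetical.

```python
# Minimal MSI write-invalidate sketch.
# ASSUMPTION: both P0.B1 and P3.B1 start as (S, 108, 0008).
caches = {
    "P0": {"B1": ["S", 0x108, 0x0008]},
    "P3": {"B1": ["S", 0x108, 0x0008]},
}

def write(proc, block, addr, value):
    """Write `value` to `addr` from `proc`, invalidating all other copies."""
    for other, blocks in caches.items():
        if other == proc:
            continue
        for line in blocks.values():
            if line[1] == addr and line[0] != "I":
                line[0] = "I"  # snooped invalidate; the stale data stays in place
    caches[proc][block] = ["M", addr, value]  # writer now holds the only valid copy

write("P0", "B1", 0x108, 0x0048)
for proc, blocks in caches.items():
    for name, (state, addr, value) in blocks.items():
        print(f"{proc}.{name}: ({state}, {addr:03X}, {value:04X})")
# P0.B1: (M, 108, 0048)
# P3.B1: (I, 108, 0008)
```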

**Problem 5.9 (assuming P0,1 means Processor 0 on Chip 1)**

c.

L1 miss and L2 miss will replace B1 in L1 and B1 in L2, which currently hold address 108

L1 will have 128 in B1 (modified value = 78), L2 will have it (DM, P0,0)

Memory directory entry for 108 will become <DS,C1>

Memory directory entry for 128 will become <DM,C0>

d.

L1 miss and L2 miss will replace B0 in L1 and B0 in L2

L1 will have 120 in B0, L2 will have it (DS, P0,0;E)

Memory directory entry for 120 will become <DS, C0,C1>

Memory directory entry for 100 will become <DI, - >

f.

For P0,0 read 120 part d will occur.

Now, P1,0 write 120 <-- 80 will cause the following:

We don't know the state of the cache for P1 on chip 0, so we assume an L1 miss

L1 miss but L2 hit for address 120

L1 will have 120 in B0 (modified value = 80)

L2 will have 120 in B0 (caused by read) with state (DM, P1,0)

L1 of P0,0 will invalidate its copy of 120 and upon another read would need to retrieve the updated value from L2

Also, Chip 1 would have the following changes:

P3,1 would invalidate B0

L2$,1 B0 would be invalidated

Directory Entry for 120 would become <DM, C0, 80>

g.

P0,0 write 120 <-- 80

L1 and L2 cache miss for P0,0 on Chip 0

P0,0 B0 would become (M, 120, 80)

L2$,0 block 0 would become (DM, P0,0)

P3,1 B0 would become invalid

L2$,1 B0 would become (DI, - )

Directory Entry for 120 would become <DM, C0, 80>

P1,0 read 120

L1 cache miss for P1,0

L2 cache hit

P1,0 B0 would become (S, 120, 80)

P0,0 B0 would become (S, 120, 80)

L2$,0 block 0 would become (DS, P0,0, P1,0)

Directory Entry for 120 would become <DS, C0, 80>
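
The chip-level directory transitions written in parts d and g can be traced with a small sketch. This only mirrors the answers above; it is not a full protocol. The class `DirectoryEntry`, the starting state <DS, C1>, and the omission of the data value (the 80 in the entries above) are all assumptions made for illustration.

```python
class DirectoryEntry:
    """Hypothetical chip-level directory entry with state DI, DS, or DM."""

    def __init__(self, state="DI", sharers=()):
        self.state = state
        self.sharers = set(sharers)

    def read(self, chip):
        # A read leaves the block shared; any previous owner remains a sharer.
        self.sharers.add(chip)
        self.state = "DS"

    def write(self, chip):
        # A write invalidates every other chip and makes `chip` the sole owner.
        self.sharers = {chip}
        self.state = "DM"

    def __str__(self):
        chips = ", ".join(sorted(self.sharers)) or "-"
        return f"<{self.state}, {chips}>"

entry = DirectoryEntry("DS", {"C1"})  # ASSUMPTION: 120 starts shared by chip 1 only
entry.read("C0")
print(entry)   # <DS, C0, C1>  (part d)
entry.write("C0")
print(entry)   # <DM, C0>      (part g, after the write)
entry.read("C0")
print(entry)   # <DS, C0>      (part g, after the read)
```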

**Problem 5.10**

c.

P0,0: write 118 <-- 90

Write miss seen by P0,0

Invalidate received by P0,1

d.

P1,0: write 128 <-- 98

Write miss seen by P1,0

**Problem 5.20**

Clock rate = 3.3 GHz

Base CPI of app with all references hitting in the cache is 0.5

Assume 0.2% of instructions involve a remote communication reference

Cost of a remote comm reference is (100 + 10h) ns, where h is the number of comm hops to get to the remote processor memory and back.
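
As a quick sketch, the cost model can be written as two small functions. The names `remote_cost_ns` and `remote_cost_cycles` are hypothetical; the conversion to cycles assumes 3.3 GHz means 3.3 cycles per nanosecond.

```python
CLOCK_GHZ = 3.3  # cycles per nanosecond

def remote_cost_ns(h):
    """Cost of one remote reference in ns, for h hops (there and back)."""
    return 100 + 10 * h

def remote_cost_cycles(h):
    """Same cost expressed in processor clock cycles."""
    return remote_cost_ns(h) * CLOCK_GHZ

print(remote_cost_ns(10))                 # 200
print(f"{remote_cost_cycles(10):.1f}")    # 660.0
```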

a.

Ring:

Each side of the ring (drawn as a square) is 16 procs. Worst case we will go to the diagonally opposite corner.

Worst case comm hops one way = 16\*2 (excludes originating proc, and double count on corner) = 32

Round trip h = 2 \* 32 = 64

Communication cost = (100 + 10 \* 64) = 740 ns

|  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| **o** | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| **1** |  |  |  |  |  |  |  |  |  |  |  |  |  |  | 1 |
| **1** |  |  |  |  |  |  |  |  |  |  |  |  |  |  | 1 |
| **1** |  |  |  |  |  |  |  |  |  |  |  |  |  |  | 1 |
| **1** |  |  |  |  |  |  |  |  |  |  |  |  |  |  | 1 |
| **1** |  |  |  |  |  |  |  |  |  |  |  |  |  |  | 1 |
| **1** |  |  |  |  |  |  |  |  |  |  |  |  |  |  | 1 |
| **1** |  |  |  |  |  |  |  |  |  |  |  |  |  |  | 1 |
| **1** |  |  |  |  |  |  |  |  |  |  |  |  |  |  | 1 |
| **1** |  |  |  |  |  |  |  |  |  |  |  |  |  |  | 1 |
| **1** |  |  |  |  |  |  |  |  |  |  |  |  |  |  | 1 |
| **1** |  |  |  |  |  |  |  |  |  |  |  |  |  |  | 1 |
| **1** |  |  |  |  |  |  |  |  |  |  |  |  |  |  | 1 |
| **1** |  |  |  |  |  |  |  |  |  |  |  |  |  |  | 1 |
| **1** |  |  |  |  |  |  |  |  |  |  |  |  |  |  | 1 |
| **1** | **1** | **1** | **1** | **1** | **1** | **1** | **1** | **1** | **1** | **1** | **1** | **1** | **1** | **1** | **e** |
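
The ring figure can be checked with a short sketch. It assumes 64 processors on a bidirectional ring, so the farthest node is halfway around; `ring_worst_case_hops` is a hypothetical helper.

```python
def ring_worst_case_hops(n):
    """Largest shortest-path distance from node 0 on a bidirectional ring of n nodes."""
    return max(min(d, n - d) for d in range(n))

one_way = ring_worst_case_hops(64)   # ASSUMPTION: 64 processors in the ring
round_trip = 2 * one_way             # h counts the hops there and back
cost_ns = 100 + 10 * round_trip

print(one_way, round_trip, cost_ns)  # 32 64 740
```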

8x8 Grid:

|  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- |
| **o** | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 1 | **1** | 1 | 1 | 1 | 1 | 1 | 1 |
| 1 | 1 | **1** | 1 | 1 | 1 | 1 | 1 |
| 1 | 1 | 1 | **1** | 1 | 1 | 1 | 1 |
| 1 | 1 | 1 | 1 | **1** | 1 | 1 | 1 |
| 1 | 1 | 1 | 1 | 1 | **1** | 1 | 1 |
| 1 | 1 | 1 | 1 | 1 | 1 | **1** | 1 |
| 1 | 1 | 1 | 1 | 1 | 1 | 1 | **e** |

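7 hops worst case one way, following the diagonal from o to e (this assumes a diagonal move counts as one hop)

Round trip h = 2 \* 7 = 14

Remote comms cost = (100 + 10\*14) = 240 ns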
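
The grid number depends on whether diagonal moves are allowed, so a small sketch that computes both cases may help. The helper `grid_worst_case_hops` is hypothetical and measures distance from the corner node o; the diagonal case matches the path marked in the table above, while the no-diagonal case is shown only for comparison.

```python
def grid_worst_case_hops(side, diagonal_links):
    """Farthest distance from corner (0, 0) on a side x side grid."""
    worst = 0
    for r in range(side):
        for c in range(side):
            # With diagonal links one hop can change row and column together.
            d = max(r, c) if diagonal_links else r + c
            worst = max(worst, d)
    return worst

for diagonal in (True, False):
    one_way = grid_worst_case_hops(8, diagonal)
    cost_ns = 100 + 10 * (2 * one_way)   # h counts the hops there and back
    print(diagonal, one_way, cost_ns)
# True 7 240
# False 14 380
```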

b.

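Ring CPI = 0.5 + 0.002 \* (740 ns \* 3.3 cycles/ns) = 0.5 + 0.002 \* 2442 = 5.384

Grid CPI = 0.5 + 0.002 \* (240 ns \* 3.3 cycles/ns) = 0.5 + 0.002 \* 792 = 2.084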
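
A minimal sketch for the part b arithmetic, assuming each remote reference simply adds its latency in cycles to the base CPI (CPI = base CPI + remote fraction × remote cost in cycles). The function name `cpi` is hypothetical; the 740 ns and 240 ns inputs are the part a results.

```python
def cpi(base_cpi, remote_fraction, cost_ns, clock_ghz):
    """Effective CPI when a fraction of instructions stall for cost_ns each."""
    stall_cycles = cost_ns * clock_ghz      # ns * (cycles per ns)
    return base_cpi + remote_fraction * stall_cycles

ring = cpi(0.5, 0.002, 740, 3.3)
grid = cpi(0.5, 0.002, 240, 3.3)
print(f"Ring CPI = {ring:.3f}")   # Ring CPI = 5.384
print(f"Grid CPI = {grid:.3f}")   # Grid CPI = 2.084
```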