### LE14.3.1 Set-associative cache

1/1 point (ungraded)

Consider a **2-way set-associative** cache that has **16 cache lines** in each of the two direct-mapped subcaches and a **block size of 2 words** (i.e., a miss brings in an even-odd word pair from main memory).

1. The Beta produces 32-bit byte addresses, A[31:0]. To ensure the best cache performance for the 2-way set associative cache described above, which address bits should be used for selecting the cache line? For matching with the tag field?





### Address bits used for matching with tag field: A[



#### Explanation

Looking at the address bits from right to left, the bottom 2 bits of the address are 00 for word alignment. After that we want to have the block offset bits so that you can benefit from locality by bringing more than one word in per block. Next comes the cache line selector, and finally, the tag.

Having a block size of 2 words means that 1 bit is used for the block offset, so bit 2 is used for the block offset. To address 16 lines of cache, we needs 4 bits to select the line, so the next 4 bits A[6:3] are used to select the cache line. The remaining bits are all used for the tag, A[31:7].

2. Using the 2-way set-associative cache with a block size of 2 described above, estimate the approximate long-term hit ratio for the following program. Assume that the cache is empty before execution (all the valid bits are 0) and that an LRU replacement strategy is used. Remember the cache is used for both instruction and data (LD) accesses.

. = 0x2000
 CMOVE(0x1000,R0)
loop: LD(R0,0,R1)
 SUBC(R0,4,R0)
 BNE(R0,loop)
 HALT()

### **Approximate long-term hit ratio:**



### Explanation

In steady state, the loop has 4 memory accesses: 3 instruction fetches and 1 data fetch. The instruction fetches will all hit. If the block size was 1 word, then the data fetches would all be misses. However, because the block is of size 2. That means that every other data fetch will be a hit. So you have 1 miss per 8 memory accesses, so your hit ratio is 7/8.

Submit

Answers are displayed within the problem

### LE14.3.2 Fully-associative cache

0.0/1.0 point (ungraded)

A Beta processor (generating 32-bit byte addresses) is connected to a fully associative cache having four cache lines each containing one 32-bit data word as well as dirty and valid bits. The cache uses Least Recently Used (LRU) replacement. Access times are 5 ns on a hit and a total of 50ns on a miss (including the 5ns cache access time).

| 1. | What hit ratio is nece             | ssary to | yield a | n average | access t | ime of | 14ns? |
|----|------------------------------------|----------|---------|-----------|----------|--------|-------|
|    | Hit ratio for 14ns $t_{av_{ m s}}$ | g:       |         |           |          |        |       |
|    |                                    |          |         |           |          |        |       |

2. The Beta produces 32-bit byte addresses, **A[31:0]**. Which of these bits are compared with tags stored in the cache?

Bits compared with stored tags: A[

| _ |  |  |
|---|--|--|
|   |  |  |
|   |  |  |
|   |  |  |

1

| 3. How many | such | comparisor | is are | performed | simulta | neously | for eac | ch mer | nory |
|-------------|------|------------|--------|-----------|---------|---------|---------|--------|------|
| read?       |      |            |        |           |         |         |         |        |      |

Submit

## LE14.3.3 Set-associative cache parameters

0.0/1.0 point (ungraded)

Consider the universe of caches constructed from one or more static RAM lookup tables organized as shown on the right, i.e. as table allowing access to a single entry (line) at a time, each line having a tag portion as well as 2<sup>B</sup> data entries.

Combining multiple of these devices along with necessary logic, comparators, and replacement strategy, we can build a parameterized family of set-associative caches. Each cache in this family implements  $2^S$  sets, where each set comprises  $2^N$  lines and each line has a tag as well as  $2^B$  consecutive words of data from main memory as depicted in the "big picture" view below:





For each of the following questions, assume a machine that uses **A** bit addresses (to keep it simple, we avoid the byte addressing of the Beta; thus consecutive addresses differ by one). You may answer each question by a number, or a formula involving the parameters A, B, N, and S.

1. What is the total number of data words that can be held in the cache? **Formula for total cache size, in words:** 

| AM | 6.004.2x Courseware   edX                                                               |
|----|-----------------------------------------------------------------------------------------|
|    |                                                                                         |
|    |                                                                                         |
| 2. | What constraint on the above parameters characterizes a direct-mapped cache?            |
|    | $\bigcirc A = 0$                                                                        |
|    | $\bigcirc B=0$                                                                          |
|    | N=0                                                                                     |
|    | $\bigcirc$ $S=0$                                                                        |
|    | What constraint on the above parameters characterizes a <b>fully-associative</b> cache? |
|    | $\bigcirc A = 0$                                                                        |
|    | $\bigcirc B=0$                                                                          |
|    |                                                                                         |
|    | N=0                                                                                     |
|    | $\bigcirc$ $S=0$                                                                        |
|    |                                                                                         |

4. What is the minimum number of bits required in the tag portion of each cache line?

Formula for size of each tag:

Submit

# LE14.3.4 Cache comparisons

0.0/1.0 point (ungraded)

Three otherwise identical Beta systems have slightly different cache configurations. Recall that the Beta supplies 32-bit byte addresses when accessing memory. Each cache has a total of 8 lines caching a single 32-bit data word; a single cache is used when responding to both instruction and data fetches. However, the caches differ in their associativity as follows:

- Cache C1: 8-line direct mapped.
- Cache C2: 2-way set associative (4 sets of 2 lines), LRU replacement.
- Cache C3: 8-line fully associative, LRU replacement.
  - 1. How many bits are there in the TAG field of each line in cache C1? Which address bits from the Beta are used by the cache's comparator to determine if there is a cache hit?

| Beta addr                  | as shown ir                 | e <b>d in hit l</b> o | ogic: A | <b>[</b> Is for a while the tag and data fields of the |
|----------------------------|-----------------------------|-----------------------|---------|--------------------------------------------------------|
| ;<br>]                     | Beta with th<br>as shown ir | e C2 cac              |         |                                                        |
| ;<br>]                     | Beta with th<br>as shown ir | e C2 cac              |         |                                                        |
| ;<br>]                     | Beta with th<br>as shown ir | e C2 cac              |         |                                                        |
| ;<br>]                     | Beta with th<br>as shown ir | e C2 cac              |         |                                                        |
| :<br>]<br>2. After the I   | as shown ir                 |                       | he runs | e for a while the tag and data fields of the           |
| . ] 2. After the I         | as shown ir                 |                       | he runs | e for a while the tag and data fields of the           |
| ]<br>2. After the l        | as shown ir                 |                       | he runs | e for a while the tag and data fields of the           |
| <b>]</b><br>2. After the l | as shown ir                 |                       | he runs | s for a while the tag and data fields of the           |
| z. After the i             | as shown ir                 |                       | ne runs | TOT 3 WHILD THE TAN AND DATA HOIDE OF THE              |
|                            |                             | i the tabl            |         | v. Assume that any unspecified bits are 0              |
|                            | ıll cache ent               |                       |         | or each of the given Beta read requests to             |
|                            |                             | -                     |         | hit or miss in the cache and the data                  |
|                            | on a hit (ente              | er CAN'T              | TELL f  | or the data returned if there is a cache               |
| miss).<br>Line # Ta        | ıq Data                     |                       | Tag     | Data                                                   |
|                            | .g 2414<br><40 0×7391       | F0083                 | •       | 0×73FF0121                                             |
|                            | <00 0×73Fi                  |                       |         | 0×619F0044                                             |
|                            | <00 0×73FI                  |                       |         | 0×515F003C                                             |
|                            | <41 0×627I                  |                       |         | 0×73FF02C3                                             |
|                            | dress 0×00                  |                       | 000     | 0470110200                                             |
| HIT                        |                             |                       |         |                                                        |
|                            |                             |                       |         |                                                        |
| MIS                        | SS                          |                       |         |                                                        |
|                            |                             |                       |         |                                                        |
| Data at lo                 | cation 0x0                  | 0C or "C/             | זד דימע | ELL" for "MISS": 0x                                    |
|                            |                             | J                     |         | 131 IVII-00 · 0/A                                      |

| Hit for a                       |                                                                                                                                                                                                                                                                                                                                                                        | s: 0×400 <sup>4</sup>                                        | ?                       |          |        |           |   |   |  |  |  |  |
|---------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------|-------------------------|----------|--------|-----------|---|---|--|--|--|--|
| M                               | MISS                                                                                                                                                                                                                                                                                                                                                                   |                                                              |                         |          |        |           |   |   |  |  |  |  |
| Data at                         | locatio                                                                                                                                                                                                                                                                                                                                                                | on 0×400                                                     | or "C                   | AN'T TEL | L" for | "MISS": 0 | x |   |  |  |  |  |
| address<br>an arrow<br>delivere | Consider the short benchmark program given below. The reference string of word addresses generated as the benchmark runs is shown below the benchmark with an arrow indicating how the access pattern repeats. Please indicate the hit ratio delivered by the each of the caches after the benchmark has run for a while, e.g., during the 10th iteration of the loop. |                                                              |                         |          |        |           |   |   |  |  |  |  |
| L00P:                           | LD(R3<br>LD(R3<br>LD(R3<br>SUBC(                                                                                                                                                                                                                                                                                                                                       | (100, R1)<br>1, 1024,<br>1, 1024+4<br>1, 1024+8<br>R1, 1, R1 | R0)<br>I, R0)<br>B, R0) |          |        |           |   |   |  |  |  |  |
| 0                               | 1<br><b>^</b>                                                                                                                                                                                                                                                                                                                                                          | 256                                                          | 2                       | 257      | 3      | 258       | 4 | 5 |  |  |  |  |
| Hit Ratio                       | o? for                                                                                                                                                                                                                                                                                                                                                                 | C1:                                                          |                         |          |        |           |   |   |  |  |  |  |
| Hit Ratio                       | ? for                                                                                                                                                                                                                                                                                                                                                                  | C2:                                                          |                         |          |        |           |   |   |  |  |  |  |

| Hit | Rat | io? | for | C3: |  |
|-----|-----|-----|-----|-----|--|
|     |     |     |     |     |  |
|     |     |     |     |     |  |
|     |     |     |     |     |  |

Submit

# Discussion

**Hide Discussion** 

Topic: 14. Caches and the Memory Hierarchy / LE14.3

#### Add a Post

| Show all posts 🗸 by recent acti                                                                                                                                             | vity 🗸   |
|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|
|                                                                                                                                                                             | 3        |
| Question about execise B of LE14.3.1 SET-ASSOCIATIVE CACHE The cache description is as the following Consider a 2-way set-associative cache tha                             | . 3      |
| About the hit rate<br>"If the block size was 1 word, then the data fetches would all be misses. However, because the blo                                                    | 2        |
| <u>✓ LE14.3.1</u> What is "word alignment" at the bottom 2 bits? and why it is 00?                                                                                          | 2        |
| Memory access time solution? In the LE14.3.2 Problem A, the solution states that **average access time = (hit rate * hit time) + (                                          | 4        |
| CMOVE(0×1000,R0) I have no idea how I got through the rest of the course without knowing this, but: ADDC(R31, c , Rc                                                        | . 2      |
| <u>LE14.3.1.b Does cache load memory addresses in ascending order?</u> As I understand over the long term, the cache will retain the instruction that are in Memory locatio | 3        |
| CACHE COMPARISONS In the last problem, shouldn't the description of cache c2 be "2 sets of 4 lines"?                                                                        | 3        |
| 2-way SA<br>During a miss, which way does the program overwrite? Is this a relevant question?                                                                               | 2        |
| Implementing "waits" on CPU The Beta design seems predicated on an assumption that we will always have the next instruction                                                 | <b>2</b> |