# Computer Architecture (ECS 154A) Study Guide

# Contents

- 1 Boolean Algebra
  - 1.1 Operations
  - 1.2 Equivalence Laws
  - 1.3 Truth Table
    - 1.3.1 NOT
    - 1.3.2 AND
    - 1.3.3 OR
    - 1.3.4 XOR
  - 1.4 Canonical Normal Form
  - 1.5 Karnaugh Maps
  - 1.6 Quine-McCluskey Algorithm
- 2 Combinational Logic Circuits
  - 2.1 Gates
  - 2.2 Timing Diagrams
  - 2.3 Multiplexers, Decoders, Shifters
  - 2.4 Adders and Subtracters
  - 2.5 Designing Combinational Logic Circuits
- 3 Finite State Automata
  - 3.1 Moore Model
  - 3.2 Mealy Model
- 4 Sequential Logic Circuit
  - 4.1 Latches
  - 4.2 Flip Flops
  - 4.3 Registers and Counters
  - 4.4 Designing Sequential Logic Circuits
- 5 Single Cycle CPU Design
- 6 Cache
  - 6.1 Memory Hierarchy
  - 6.2 Direct Mapped Cache
    - 6.2.1 Format
    - 6.2.2 Example
  - 6.3 Fully Associative
    - 6.3.1 Format
    - 6.3.2 Example
  - 6.4 Set Associative
    - 6.4.1 Format
    - 6.4.2 Example
    - 6.4.3 Associativity Remarks
- 7 Virtual Memory
- 8 Multi-Cycle CPU Design
- 9 Pipeline CPU Design

# 1 Boolean Algebra

## 1.1 Operations

| Operation | Symbol   | Example      |
|-----------|----------|--------------|
| NOT       | $\bar{\ }$ (overbar) | $\bar{A}$    |
|           | !        | !A           |
|           | $\neg$   | $\neg A$     |
|           | $\sim$   | $\sim A$     |
| AND       | $\wedge$ | $A \wedge B$ |
|           | *        | A*B          |
|           | (juxtaposition) | AB    |
| OR        | $\vee$   | $A \vee B$   |
|           | +        | A + B        |
| XOR       | $\oplus$ | $A \oplus B$ |

## 1.2 Equivalence Laws

| Name         | OR Version                             | AND Version                            |
|--------------|----------------------------------------|----------------------------------------|
| Commutative  | A + B = B + A                          | A * B = B * A                          |
| Associative  | (A+B) + C = A + (B+C)                  | (A*B)*C = A*(B*C)                      |
| Distributive | A + (B * C) = (A + B) * (A + C)        | A * (B + C) = (A * B) + (A * C)        |
| Identity     | A + 0 = A                              | A*1=A                                  |
| Annulment    | A+1=1                                  | A * 0 = 0                              |
| Idempotent   | A + A = A                              | A * A = A                              |
| Complement   | $A + \bar{A} = 1$                      | $A * \bar{A} = 0$                      |
| De Morgan's  | $\overline{(A+B)} = \bar{A} * \bar{B}$ | $\overline{(A*B)} = \bar{A} + \bar{B}$ |
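
Because each variable can only be 0 or 1, any of the laws above can be checked by brute force over every input combination. The following is a minimal sketch of that idea, not a proof technique from the course; the helper names `NOT`, `LAWS`, and `law_holds` are made up for illustration, and Python's bitwise `|` and `&` stand in for `+` and `*`.

```python
from itertools import product

def NOT(a):
    """Boolean NOT on a 0/1 value (hypothetical helper)."""
    return 1 - a

# Each entry pairs the left-hand side of a law with its right-hand side.
# All functions take (a, b, c) so they can be checked uniformly.
LAWS = {
    "distributive (OR version)": (lambda a, b, c: a | (b & c),
                                  lambda a, b, c: (a | b) & (a | c)),
    "distributive (AND version)": (lambda a, b, c: a & (b | c),
                                   lambda a, b, c: (a & b) | (a & c)),
    "De Morgan's (OR version)": (lambda a, b, c: NOT(a | b),
                                 lambda a, b, c: NOT(a) & NOT(b)),
    "De Morgan's (AND version)": (lambda a, b, c: NOT(a & b),
                                  lambda a, b, c: NOT(a) | NOT(b)),
    "complement (OR version)": (lambda a, b, c: a | NOT(a),
                                lambda a, b, c: 1),
    "complement (AND version)": (lambda a, b, c: a & NOT(a),
                                 lambda a, b, c: 0),
}

def law_holds(lhs, rhs):
    """Return True if lhs and rhs agree on all 2^3 input combinations."""
    return all(lhs(a, b, c) == rhs(a, b, c)
               for a, b, c in product((0, 1), repeat=3))

results = {name: law_holds(lhs, rhs) for name, (lhs, rhs) in LAWS.items()}
for name, ok in results.items():
    print(f"{name}: {'holds' if ok else 'FAILS'}")
```

The same pattern extends to the remaining rows of the table by adding more entries to `LAWS`.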

## 1.3 Truth Table

A truth table lists every combination of input values alongside the resulting value of the function. It has $2^N$ rows, where $N$ is the number of inputs.
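
As a rough illustration of the $2^N$ row count, a truth table can be generated by enumerating every 0/1 input combination. This is only a sketch; the function name `truth_table` is an assumption made for this example.

```python
from itertools import product

def truth_table(func, num_inputs):
    """Return a list of (inputs, output) rows, one per input combination."""
    return [(bits, func(*bits)) for bits in product((0, 1), repeat=num_inputs)]

# Two-input XOR, using Python's bitwise ^ on 0/1 values.
xor_rows = truth_table(lambda a, b: a ^ b, 2)
for bits, out in xor_rows:
    print(*bits, "|", out)

# A three-input function has 2^3 rows.
and3_rows = truth_table(lambda a, b, c: a & b & c, 3)
print(len(and3_rows), "rows")
```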

### 1.3.1 NOT

$$\begin{array}{c|c}
A & f(A) = !A \\
\hline
0 & 1 \\
1 & 0
\end{array}$$

### 1.3.2 AND

| A | B | f(A,B) = A * B |
|---|---|----------------|
| 0 | 0 | 0              |
| 0 | 1 | 0              |
| 1 | 0 | 0              |
| 1 | 1 | 1              |

### 1.3.3 OR

| A | B | f(A,B) = A + B |
|---|---|----------------|
| 0 | 0 | 0              |
| 0 | 1 | 1              |
| 1 | 0 | 1              |
| 1 | 1 | 1              |

### 1.3.4 XOR

$$\begin{array}{c|c|c} A & B & f(A,B) = A \oplus B \\ \hline 0 & 0 & 0 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \\ \end{array}$$

- 1.4 Canonical Normal Form
- 1.5 Karnaugh Maps
- 1.6 Quine-McCluskey Algorithm
- 2 Combinational Logic Circuits
- 2.1 Gates
- 2.2 Timing Diagrams
- 2.3 Multiplexers, Decoders, Shifters
- 2.4 Adders and Subtracters
- 2.5 Designing Combinational Logic Circuits
- 3 Finite State Automata
- 3.1 Moore Model
- 3.2 Mealy Model
- 4 Sequential Logic Circuit
- 4.1 Latches
- 4.2 Flip Flops
- 4.3 Registers and Counters
- 4.4 Designing Sequential Logic Circuits
- 5 Single Cycle CPU Design

# 6 Cache

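A cache's effectiveness rests on information reuse, which takes two forms: temporal locality (data used recently is likely to be used again soon) and spatial locality (data located near recently used data is likely to be used soon).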

## 6.1 Memory Hierarchy

Goal: Make memory perform as if it were made of the fastest and most expensive type, but cost as if it were made of the cheapest type.

1. Fast, Hot, Expensive
2. Static RAM
3. Dynamic RAM
4. Disk

A cache is smaller than main memory and is composed of many cache lines. Each cache line consists of flags (a valid bit and a dirty bit), a tag, and one or more blocks of data. If the dirty bit is set, the line has been modified, and its data must be written back to main memory when the line is freed. The valid bit indicates whether the line actually holds loaded data. It is needed because storage always holds some bit pattern: when the computer first starts up, the cache is logically empty, yet a line whose bits are all 0 looks the same as a line that was loaded with zeros. From here on, the size of a cache line means the size of its data blocks only, excluding the flags and tag.

| Flags | Tag | Block(s) |
|-------|-----|----------|
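
One way to picture this layout is as a small record type. The sketch below is an illustrative model only, not how hardware stores a line; the class name `CacheLine`, the method `needs_write_back`, and the 16-byte `LINE_SIZE` are assumptions chosen for the example.

```python
from dataclasses import dataclass, field

LINE_SIZE = 16  # assumed number of data bytes per cache line

@dataclass
class CacheLine:
    valid: bool = False   # does this line hold loaded data?
    dirty: bool = False   # has the data been modified since it was loaded?
    tag: int = 0          # identifies which memory region the data came from
    data: bytearray = field(default_factory=lambda: bytearray(LINE_SIZE))

    def needs_write_back(self) -> bool:
        """A line must be written to main memory only if it is valid and dirty."""
        return self.valid and self.dirty

# A freshly created line has tag 0 and all-zero data, yet it is NOT valid:
# only the valid bit distinguishes "empty" from "loaded with zeros".
empty = CacheLine()
print(empty.valid, empty.tag, bytes(empty.data) == bytes(LINE_SIZE))
```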

## 6.2 Direct Mapped Cache

A given address is partitioned into three components: tag, line number, and offset. The line number selects a specific cache line directly. The cache then compares the tag from the address with the tag stored in that line. If the tags match (and the line is valid), it is a cache hit; otherwise, it is a cache miss. On a hit, the offset selects which block within the line to read or write. The blocks of data in a cache line can be thought of as an array, with the offset as an index into that array.
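
The lookup described above can be sketched with shifts and masks. This is a simplified model under assumed field widths (4 offset bits and 8 line-number bits); the names `split_address`, `cache`, and `read` are hypothetical, and an invalid line is represented by `None` rather than a separate valid bit.

```python
OFFSET_BITS = 4  # assumed: 16 one-byte blocks per line
LINE_BITS = 8    # assumed: 256 cache lines

def split_address(addr):
    """Partition an address into (tag, line number, offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    line = (addr >> OFFSET_BITS) & ((1 << LINE_BITS) - 1)
    tag = addr >> (OFFSET_BITS + LINE_BITS)
    return tag, line, offset

# cache[line] is None (invalid) or a (stored_tag, data_blocks) pair.
cache = [None] * (1 << LINE_BITS)

def read(addr):
    """Return the addressed block on a hit, or None on a miss."""
    tag, line, offset = split_address(addr)
    entry = cache[line]            # the line number indexes the cache directly
    if entry is not None and entry[0] == tag:
        return entry[1][offset]    # hit: the offset indexes into the blocks
    return None                    # miss: invalid line or tag mismatch
```

Note that two addresses with the same line number but different tags compete for the same cache line, which is the main weakness of direct mapping.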

### 6.2.1 Format

| Tag | Line Number | Offset |
|-----|-------------|--------|

### 6.2.2 Example

A CPU uses 24-bit addresses, and memory is byte-addressable. Each cache line holds 16 bytes of data, and each block is 1 byte. Given that the tag is 12 bits wide, find the following: the number of lines in the cache, the size of the cache, and the size and bit range of each partition in the format.

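$$\begin{array}{rcl}
{\rm Size(Cache\ Line)} &=& 16\ {\rm bytes} \\
\\
{\rm Bit\_Width(Offset)} &=& {\rm Number\ of\ bits\ needed\ to\ address\ 16\ blocks} \\
&=& \log_2 16 \\
&=& 4\ {\rm bits} \\
\\
{\rm Bit\_Width(Line\ Number)} &=& {\rm Size(Address)} - {\rm Bit\_Width(Tag)} - {\rm Bit\_Width(Offset)} \\
&=& 24\ {\rm bits} - 12\ {\rm bits} - 4\ {\rm bits} \\
&=& 8\ {\rm bits} \\
\\
{\rm Number\ of\ Cache\ Lines} &=& 2^{{\rm Bit\_Width(Line\ Number)}} \\
&=& 2^8 \\
&=& 256
\end{array}$$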

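$$\begin{array}{rcl}
{\rm Size(Cache)} &=& {\rm Number\ of\ Cache\ Lines} \times {\rm Size(Cache\ Line)} \\
&=& 2^8 \times 2^4\ {\rm bytes} \\
&=& 2^{12}\ {\rm bytes} = 4096\ {\rm bytes} = 4\ {\rm KB}
\end{array}$$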

Format Partition Sizes and Bit Range

|           | Tag     | Line Number | Offset |
|-----------|---------|-------------|--------|
| Size      | 12 Bits | 8 Bits      | 4 Bits |
| Bit Range | [12,23] | [4,11]      | [0,3]  |

The bit range gives the bit positions that each partition occupies. For example, the offset occupies bits 0, 1, 2, and 3: the four least-significant bits.
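
The arithmetic in this example can be double-checked with a few lines of Python. This is just a sanity-check sketch; the constant and helper names (`ADDRESS_BITS`, `bit_ranges`, and so on) are invented for the illustration.

```python
ADDRESS_BITS = 24  # given
TAG_BITS = 12      # given
LINE_SIZE = 16     # bytes per cache line, given (1-byte blocks)

# For a power of two n, n.bit_length() - 1 equals log2(n).
offset_bits = LINE_SIZE.bit_length() - 1
line_bits = ADDRESS_BITS - TAG_BITS - offset_bits
num_lines = 2 ** line_bits
cache_size = num_lines * LINE_SIZE  # in bytes

def bit_ranges(widths):
    """Map each field to (low bit, high bit); widths run from least to most significant."""
    ranges, low = {}, 0
    for name, width in widths:
        ranges[name] = (low, low + width - 1)
        low += width
    return ranges

ranges = bit_ranges([("offset", offset_bits),
                     ("line number", line_bits),
                     ("tag", TAG_BITS)])

print(offset_bits, line_bits, num_lines, cache_size)
print(ranges)
```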

## 6.3 Fully Associative

A given address is partitioned into two components: tag and offset. No line number is needed because a block may be placed in any cache line: the cache searches through all cache lines (conceptually a linear search) looking for a cache hit, an unused cache line, or a cache line to replace. Because placement is unrestricted, a fully associative cache generally has the lowest miss rate for a given size. The tag and offset work the same way as in direct mapping.
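
The search can be sketched as a loop over every line. This is a simplified model that assumes 3 offset bits (8-byte lines) and represents an unused line as `None`; the function name `lookup` is hypothetical, and replacement on a miss is left out.

```python
OFFSET_BITS = 3  # assumed: 8 one-byte blocks per line

def lookup(lines, addr):
    """Search every line for the address's tag.

    `lines` is a list whose entries are None (unused) or (tag, data) pairs.
    Returns (line index, block) on a hit, or None on a miss.
    """
    tag = addr >> OFFSET_BITS                  # everything above the offset
    offset = addr & ((1 << OFFSET_BITS) - 1)
    for index, line in enumerate(lines):
        if line is not None and line[0] == tag:
            return index, line[1][offset]      # hit
    return None                                # miss: caller picks a line to fill
```

Any line can hold any tag, so the line index returned on a hit carries no information about the address itself.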

### 6.3.1 Format

| Tag | Offset |
|-----|--------|

### 6.3.2 Example

A CPU is using 11-bit addresses and is byte-addressable. The cache size is 128 bytes and the size of each cache line is 8 bytes. Find the following: Bit width of offset, bit width of tag, number of cache lines, and sizes and bit ranges of the partitions.

$$\begin{array}{rcl}
{\rm Bit\_Width(Offset)} &=& {\rm Number\ of\ bits\ needed\ to\ address\ 8\ blocks} \\
&=& \log_2 8 \\
&=& 3\ {\rm bits} \\
\\
{\rm Bit\_Width(Tag)} &=& {\rm Size(Address)} - {\rm Bit\_Width(Offset)} \\
&=& 11\ {\rm bits} - 3\ {\rm bits} \\
&=& 8\ {\rm bits} \\
\\
{\rm Number\ of\ Cache\ Lines} &=& {\rm Size(Cache)} / {\rm Size(Cache\ Line)} \\
&=& 128\ {\rm bytes} / 8\ {\rm bytes} \\
&=& 16\ {\rm Cache\ Lines}
\end{array}$$

Format Partition Sizes and Bit Range

|           | Tag    | Offset |
|-----------|--------|--------|
| Size      | 8 Bits | 3 Bits |
| Bit Range | [3,10] | [0,2]  |

## 6.4 Set Associative

Set associative caching is a hybrid of direct-mapped and fully associative caching. A given address is partitioned into three components: tag, set number, and offset. Unlike a direct-mapped cache, where an address maps to a single cache line, here an address maps to a specific set of cache lines. Within that set, the cache performs a linear search, as in a fully associative cache. The tag and offset work the same way as in the direct-mapped and fully associative caches. Important terminology: an N-way set associative cache has N cache lines per set; N is also known as the associativity.
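
The two-step lookup (index by set number, then search within the set) might look like the sketch below. The field widths and associativity are assumed values, the names `split_address`, `sets`, and `find_way` are hypothetical, and only tags are stored to keep the model small.

```python
OFFSET_BITS = 10  # assumed: 1024-byte lines
SET_BITS = 5      # assumed: 32 sets
WAYS = 3          # assumed associativity (cache lines per set)

def split_address(addr):
    """Partition an address into (tag, set number, offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    set_number = (addr >> OFFSET_BITS) & ((1 << SET_BITS) - 1)
    tag = addr >> (OFFSET_BITS + SET_BITS)
    return tag, set_number, offset

# sets[s][w] holds the tag stored in way w of set s, or None if unused.
sets = [[None] * WAYS for _ in range(1 << SET_BITS)]

def find_way(addr):
    """Return the way that holds addr's tag within its set, or None on a miss."""
    tag, set_number, _ = split_address(addr)
    for way, stored_tag in enumerate(sets[set_number]):  # search only this set
        if stored_tag == tag:
            return way
    return None
```

Only `WAYS` tags are compared per access, rather than one (direct-mapped) or all of them (fully associative).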

### 6.4.1 Format

| Tag | Set Number | Offset |
|-----|------------|--------|

### 6.4.2 Example

A CPU is using 64-bit addresses and is byte-addressable. It also uses a 3-way set associative cache with a cache size of 98,304 bytes and 32 sets. Find the following: Number of lines per set, size of each set, size of cache line, bit width of offset, bit width of set number, bit width of tag, and partition sizes and bit ranges.

$$\begin{array}{rcl}
{\rm Number\ of\ Lines\ per\ Set} &=& 3\ {\rm (given)} \\
\\
{\rm Size(Set)} &=& {\rm Size(Cache)} / {\rm Number\ of\ Sets} \\
&=& 98{,}304\ {\rm bytes} / 32 \\
&=& 3072\ {\rm bytes} \\
\\
{\rm Size(Cache\ Line)} &=& {\rm Size(Set)} / {\rm Number\ of\ Lines\ per\ Set} \\
&=& 3072\ {\rm bytes} / 3 \\
&=& 1024\ {\rm bytes} \\
\\
{\rm Bit\_Width(Offset)} &=& {\rm Number\ of\ bits\ needed\ to\ address\ 1024\ bytes} \\
&=& \log_2 {\rm Size(Cache\ Line)} \\
&=& \log_2 1024 \\
&=& 10\ {\rm bits} \\
\\
{\rm Bit\_Width(Set\ Number)} &=& {\rm Number\ of\ bits\ needed\ to\ represent\ 32\ sets} \\
&=& \log_2 {\rm Number\ of\ Sets} \\
&=& \log_2 32 \\
&=& 5\ {\rm bits} \\
\\
{\rm Bit\_Width(Tag)} &=& {\rm Size(Address)} - {\rm Bit\_Width(Set\ Number)} - {\rm Bit\_Width(Offset)} \\
&=& 64\ {\rm bits} - 5\ {\rm bits} - 10\ {\rm bits} \\
&=& 49\ {\rm bits}
\end{array}$$

Format Partition Sizes and Bit Range

|           | Tag     | Set Number | Offset  |
|-----------|---------|------------|---------|
| Size      | 49 Bits | 5 Bits     | 10 Bits |
| Bit Range | [15,63] | [10,14]    | [0,9]   |
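
The values in this example can be re-derived mechanically from the four given quantities. The snippet below is a small sanity check, with variable names chosen only for illustration.

```python
ADDRESS_BITS = 64    # given
CACHE_SIZE = 98_304  # bytes, given
NUM_SETS = 32        # given
WAYS = 3             # lines per set, given (3-way)

set_size = CACHE_SIZE // NUM_SETS  # bytes per set
line_size = set_size // WAYS       # bytes per cache line

# For a power of two n, n.bit_length() - 1 equals log2(n).
offset_bits = line_size.bit_length() - 1
set_bits = NUM_SETS.bit_length() - 1
tag_bits = ADDRESS_BITS - set_bits - offset_bits

# Bit ranges, listed from the least-significant field upward.
offset_range = (0, offset_bits - 1)
set_range = (offset_bits, offset_bits + set_bits - 1)
tag_range = (offset_bits + set_bits, ADDRESS_BITS - 1)

print(set_size, line_size, offset_bits, set_bits, tag_bits)
print(tag_range, set_range, offset_range)
```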

### 6.4.3 Associativity Remarks

Direct-mapped and fully associative caches are in fact special cases of the set associative cache. Suppose that our cache has N lines and is 1-way set associative. Each set then consists of a single cache line, so there are N sets and the set number is exactly the line number. Therefore, a 1-way set associative cache is a direct-mapped cache.

Now suppose that our cache has N lines and is N-way set associative. This implies that there is a single set containing all N lines. Recall the set associative format: tag, set number, and offset. How many bits are required to select among one set? $\log_2 1 = 0$, so no set-number bits are needed, and the format reduces to tag and offset. Recall also that a set associative cache first uses the set number to select a set and then searches the cache lines within that set. With only one set, the search covers the entire cache by default. Therefore, an N-way set associative cache with N lines is a fully associative cache.
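
Both limiting cases can be seen by computing the address format for a general set associative cache and varying only the associativity. The helper `field_widths` below is a hypothetical sketch that assumes every size is a power of two; the two calls reuse the parameters of the direct-mapped and fully associative examples from earlier sections.

```python
def field_widths(address_bits, num_lines, line_size, ways):
    """Return (tag bits, set-number bits, offset bits) for a set associative cache."""
    num_sets = num_lines // ways
    offset_bits = line_size.bit_length() - 1  # log2 of a power of two
    set_bits = num_sets.bit_length() - 1      # log2(1) == 0 when there is one set
    tag_bits = address_bits - set_bits - offset_bits
    return tag_bits, set_bits, offset_bits

# 1-way: every set is one line, so the set number acts as a line number.
direct_mapped = field_widths(24, num_lines=256, line_size=16, ways=1)

# N-way with N lines: a single set, so the set-number field disappears.
fully_associative = field_widths(11, num_lines=16, line_size=8, ways=16)

print(direct_mapped)
print(fully_associative)
```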

- 7 Virtual Memory
- 8 Multi-Cycle CPU Design
- 9 Pipeline CPU Design