## Cache
Cache capacity = (# of sets) * (# of blocks per set) * (block size)
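
As a quick check of the formula above, here is a minimal sketch. The function name and the example cache geometry are made up for illustration, and the result counts data bytes only (no tag, valid, or dirty bits).

```python
def cache_capacity_bytes(num_sets: int, blocks_per_set: int, block_size_bytes: int) -> int:
    """Data capacity of a cache: sets * blocks per set * block size."""
    return num_sets * blocks_per_set * block_size_bytes


# Hypothetical 4-way set-associative cache with 128 sets and 64-byte blocks.
capacity = cache_capacity_bytes(num_sets=128, blocks_per_set=4, block_size_bytes=64)
print(capacity)  # 32768 bytes (32 KiB)
```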

### Replacement Policy:
* Direct mapped: no choice
* Set associative
  * Prefer non-valid entry, if there is one
  * Otherwise, choose among entries in the set
* Optimal policy: Replace the block that is accessed furthest in the future
  * Requires knowing the future
* Predict the future from looking at the past
  * If a block has not been used recently, it’s often less likely to be accessed in the near future (a locality argument)
* Least-recently used (LRU)
  * Choose the one unused for the longest time
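
The LRU rule above can be sketched for a single cache set. This is only an illustrative software model (the `LRUSet` class and its method names are assumptions, not how hardware tracks recency): it keeps tags ordered from least to most recently used.

```python
from collections import OrderedDict


class LRUSet:
    """One cache set with `ways` entries, replaced in LRU order."""

    def __init__(self, ways: int):
        self.ways = ways
        self.blocks = OrderedDict()  # tag -> None, least recently used first

    def access(self, tag: int) -> bool:
        """Return True on a hit, False on a miss (the block is then loaded)."""
        if tag in self.blocks:
            self.blocks.move_to_end(tag)  # mark as most recently used
            return True
        if len(self.blocks) >= self.ways:
            self.blocks.popitem(last=False)  # set is full: evict the LRU block
        # If the set was not full, a non-valid (empty) way is used instead.
        self.blocks[tag] = None
        return False


s = LRUSet(ways=2)
hits = [s.access(tag) for tag in [1, 2, 1, 3, 2]]
print(hits)  # [False, False, True, False, False]
```

In the trace, accessing tag 3 evicts tag 2 (tag 1 was used more recently), so the final access to tag 2 misses again.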


### Write Policy:
* Write-through:
  * CPU writes are cached, but also written to main memory immediately
  * Stalls the CPU until the write completes
  * Simple, but slow
* Write-back:
  * CPU writes are cached, but not written to main memory until we replace the block
  * Commonly implemented in current systems
  * Fast, more complex


### Write-Back with 'Dirty' Bits
* Add 1 bit per block to record whether block has been written to.
* Only write back dirty blocks.
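
A rough sketch of the dirty-bit idea, under simplifying assumptions that are not in the notes: a direct-mapped cache, one word per block, and write-allocate on a write miss. The class `WriteBackCache` and its fields are hypothetical names.

```python
class WriteBackCache:
    """Direct-mapped cache of one-word blocks with a dirty bit per line."""

    def __init__(self, num_lines: int, memory: dict):
        self.num_lines = num_lines
        self.memory = memory       # backing store: address -> value
        self.lines = {}            # index -> [address, value, dirty]
        self.memory_writes = 0     # counts write-backs to main memory

    def _fill(self, addr: int) -> list:
        index = addr % self.num_lines
        line = self.lines.get(index)
        if line is not None and line[0] == addr:
            return line                        # hit
        if line is not None and line[2]:       # evicting a dirty block
            self.memory[line[0]] = line[1]     # write it back first
            self.memory_writes += 1
        line = [addr, self.memory.get(addr, 0), False]
        self.lines[index] = line
        return line

    def read(self, addr: int) -> int:
        return self._fill(addr)[1]

    def write(self, addr: int, value: int) -> None:
        line = self._fill(addr)  # write-allocate (an assumption of this sketch)
        line[1] = value
        line[2] = True           # mark dirty: main memory is now stale


mem = {}
cache = WriteBackCache(num_lines=2, memory=mem)
cache.write(0, 10)
cache.write(0, 11)  # repeated writes stay in the cache
cache.read(2)       # same line as address 0, so the dirty block is written back
print(cache.memory_writes, mem[0])  # 1 11
```

Two CPU writes cost only one memory write here; clean blocks are dropped on eviction without touching memory.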


## Virtual Memory
* Software caches:
  * Same objective: the illusion of a large, fast, and cheap memory
  * Conceptually similar to hardware caches
  * Different implementations

<br>

* Use main memory as a "cache" for secondary (disk) storage
  * Managed jointly by CPU hardware and the operating system (OS)
* Programs share main memory
  * Each gets a private virtual address space holding its frequently used code and data
  * Protected from other programs

<br>

* CPU and OS translate virtual addresses to physical addresses
  * VM "block" is called a page
  * VM translation "miss" is called a page fault

<br>

* Two kinds of addresses:
  1. CPU (also programs) uses virtual addresses
  1. Main memory uses physical addresses
* Hardware translates virtual addresses to physical addresses via an OS-managed table
  * Called the page map or page table
* The price of VM is address translation

<br>

* Assume we have 1 GB main memory: How many bits are required to represent a physical address?
  * 30 bits: 1 GB = $2^{30}$ bytes
* Unlike a cache address, a virtual address has no index field
  * Instead it is split into a page number and a page offset
* Virtual address: Virtual page number + offset bits

### Virtual address: Virtual page number + offset bits
### Physical address: Physical page number + offset bits

Virtual address: (v+p) bits, v: virtual page # bits, p: page offset bits  
Physical address: (m+p) bits, m: physical page # bits, p: page offset bits
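
The field split can be illustrated with a few bit operations. This is a sketch assuming $p = 12$ (4 KB pages); the function names are made up for illustration.

```python
PAGE_OFFSET_BITS = 12  # p: assumes 4 KB pages


def split_virtual_address(vaddr: int, p: int = PAGE_OFFSET_BITS) -> tuple:
    """Return (virtual page number, page offset)."""
    vpn = vaddr >> p                 # upper v bits
    offset = vaddr & ((1 << p) - 1)  # lower p bits
    return vpn, offset


def make_physical_address(ppn: int, offset: int, p: int = PAGE_OFFSET_BITS) -> int:
    """Concatenate a physical page number with the unchanged page offset."""
    return (ppn << p) | offset


vpn, offset = split_virtual_address(0x12345ABC)
print(hex(vpn), hex(offset))                   # 0x12345 0xabc
print(hex(make_physical_address(0x3F, offset)))  # 0x3fabc
```

Only the page number is translated; the offset bits pass through unchanged.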

1. How many virtual pages: $2^v$
2. How many physical pages: $2^m$
3. How many bytes per page: $2^p$
4. How many bytes in physical memory: $2^{m+p}$
5. How many bits in the page table: $2^v \times (m + 3)$, where each entry holds $m$ physical page number bits plus 1 valid bit, 1 dirty bit, and 1 reference bit
   1. For every virtual page number, we need to provide a translation

### Page Table
* Stores placement information
  * An array of page table entries (PTE), indexed by a virtual page number
  * Page table register in CPU points to page table
* Before using a PTE, check its valid bit to tell whether it holds a usable translation
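
A minimal sketch of a page table lookup, assuming 4 KB pages and a flat array of entries. The `PTE` fields, the `PageFault` exception, and the tiny 16-entry table are illustrative choices, not a real OS layout.

```python
from dataclasses import dataclass

PAGE_OFFSET_BITS = 12  # assumes 4 KB pages


@dataclass
class PTE:
    valid: bool = False  # does this entry hold a usable translation?
    ppn: int = 0         # physical page number


class PageFault(Exception):
    """Raised when the referenced page has no valid translation."""


def translate(page_table: list, vaddr: int, p: int = PAGE_OFFSET_BITS) -> int:
    vpn = vaddr >> p
    offset = vaddr & ((1 << p) - 1)
    pte = page_table[vpn]          # the table is indexed by virtual page number
    if not pte.valid:              # check the valid bit first
        raise PageFault(f"no valid mapping for virtual page {vpn}")
    return (pte.ppn << p) | offset


page_table = [PTE() for _ in range(16)]  # tiny table for illustration
page_table[2] = PTE(valid=True, ppn=7)
print(hex(translate(page_table, 0x2ABC)))  # 0x7abc
```

Accessing an address in any other page, for example `translate(page_table, 0x3000)`, raises `PageFault` in this sketch.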

### Page fault
* Replacement: To reduce page fault rate, prefer least-recently used (LRU) replacement
  * Reference bit (aka use bit) in the PTE is set to 1 on access to the page
  * Periodically cleared to 0 by the OS
  * A page with reference bit = 0 has not been used recently
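
The reference-bit scheme approximates LRU rather than implementing it exactly. A small sketch follows; the `Frame` class and helper names are hypothetical, and the fallback to the first frame when every bit is set is an arbitrary choice made here.

```python
class Frame:
    """A resident page together with its reference (use) bit."""

    def __init__(self, vpn: int):
        self.vpn = vpn
        self.referenced = False


def touch(frame: Frame) -> None:
    frame.referenced = True  # hardware sets the bit on every access


def clear_reference_bits(frames: list) -> None:
    for frame in frames:     # the OS does this periodically
        frame.referenced = False


def choose_victim(frames: list) -> Frame:
    """Prefer a page not used since the last clear; else fall back to the first."""
    for frame in frames:
        if not frame.referenced:
            return frame
    return frames[0]


frames = [Frame(vpn) for vpn in (10, 11, 12)]
for frame in frames:
    touch(frame)              # every page was used at some point
clear_reference_bits(frames)  # periodic clear by the OS
touch(frames[0])
touch(frames[2])              # pages 10 and 12 used since the clear
print(choose_victim(frames).vpn)  # 11
```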

* v + p: bits in virtual address
* m + p: bits in physical address
* 2^v: # of virtual pages
* 2^m: # of physical pages
* 2^p: bytes per page
* 2^(m+p): bytes in physical memory
* (m+3) * 2^v: bits in the page table

Suppose:
* 32-bit virtual address
* Page size: $2^{12}$ bytes (4 KB)
* Max RAM: $2^{30}$ bytes (1 GB)

Then:
* p = 12
* v + p=32
* v = 20
* m + p = 30
* m = 18

1. \# Physical pages = 2^18
2. \# Virtual pages = 2^20
3. \# PTE (page table entries) = 2^20
4. \# bits in page table = (18+3) * 2^20
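
The arithmetic in this example can be reproduced with a short script. It is a sketch: the function `vm_sizes` and its dictionary keys are made up here, and the 3 status bits are the valid, dirty, and reference bits from the formula above.

```python
def vm_sizes(virtual_addr_bits: int, page_offset_bits: int,
             physical_addr_bits: int, status_bits: int = 3) -> dict:
    v = virtual_addr_bits - page_offset_bits   # virtual page number bits
    m = physical_addr_bits - page_offset_bits  # physical page number bits
    return {
        "v": v,
        "m": m,
        "virtual_pages": 2 ** v,
        "physical_pages": 2 ** m,
        "page_table_bits": (m + status_bits) * 2 ** v,
    }


sizes = vm_sizes(virtual_addr_bits=32, page_offset_bits=12, physical_addr_bits=30)
print(sizes["v"], sizes["m"])                 # 20 18
print(sizes["physical_pages"])                # 262144 (2^18)
print(sizes["virtual_pages"])                 # 1048576 (2^20)
print(sizes["page_table_bits"])               # 22020096 (21 * 2^20)
print(sizes["page_table_bits"] / 8 / 2 ** 20)  # 2.625 (MiB)
```

At roughly 2.6 MiB, the flat page table is already large for a dedicated on-chip memory.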

### Where to store the page table?
* Small page tables can use dedicated SRAM
* But that is expensive for big ones
* Solution: move the page table to main memory
* Problem: each memory reference now takes 2 accesses

* Exam on the 7th
* HW due Wednesday



* Suppose:
    * 32-bit virtual address
    * Page size: 4 KB ($2^{12}$ bytes)
    * RAM max: 1 GB ($2^{30}$ bytes)
* Then:
    * \# Physical pages: $2^{18}$ (256K)
    * \# Virtual pages: $2^{20}$
    * \# PTE (page table entries): $2^{20}$
    * \# bits in page table: $(18+3) \times 2^{20}$
* Use SRAM for the page table? It would need 21 Mbit ≈ 3 MB
 


### Speed up translation with a TLB
* Problem: 2 accesses for each memory reference
* Solution: Cache the page table entries
* Translation Lookaside Buffer (TLB)
  * Small fully associative or set-associative hardware cache in the MMU
  * Lookup by VPN (virtual page number)
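
A toy model of a TLB sitting in front of the page table. It is a sketch only: it assumes a fully associative TLB with LRU replacement (the notes do not specify a policy), and the page table is a plain dictionary from VPN to PPN.

```python
from collections import OrderedDict


class TLB:
    """Small fully associative cache of VPN -> PPN translations."""

    def __init__(self, entries: int):
        self.entries = entries
        self.map = OrderedDict()  # vpn -> ppn, least recently used first
        self.hits = 0
        self.misses = 0

    def lookup(self, vpn: int, page_table: dict) -> int:
        if vpn in self.map:
            self.hits += 1
            self.map.move_to_end(vpn)     # refresh recency
            return self.map[vpn]
        self.misses += 1
        ppn = page_table[vpn]             # the extra main-memory access
        if len(self.map) >= self.entries:
            self.map.popitem(last=False)  # evict the LRU translation
        self.map[vpn] = ppn
        return ppn


page_table = {0: 5, 1: 9, 2: 3}
tlb = TLB(entries=2)
for vpn in [0, 0, 1, 0, 2, 1]:
    tlb.lookup(vpn, page_table)
print(tlb.hits, tlb.misses)  # 2 4
```

Only the misses pay for the second memory access; with good locality most lookups hit in the TLB.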

### Multi-Level Page Tables
* Problem: the $2^{20}$ virtual pages are not necessarily all valid or in use, yet all $2^{20}$ PTEs always sit in main memory
* Common solution: multi-level page table
* Example: 2-level page table

Example:
* 2-level page table: the outer table is indexed by 10 bits of the VPN, so it holds $2^{10} = 1024$ pointers, and each inner table has 1024 entries
* Each outer-table pointer refers to a separate inner table
* Together they can cover $1024 \times 1024 = 2^{20}$ PTEs
  * Inner tables for unused regions of the address space need not be allocated
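
A sketch of a two-level lookup, assuming the 20-bit VPN from the earlier example is split into a 10-bit outer index and a 10-bit inner index, with a 12-bit page offset. The representation (Python lists, with `None` for unallocated tables and unmapped pages) and the function name are illustrative.

```python
INDEX_BITS = 10   # bits per level (assumed 10 + 10 split of a 20-bit VPN)
OFFSET_BITS = 12  # assumes 4 KB pages


def translate_two_level(outer: list, vaddr: int) -> int:
    offset = vaddr & ((1 << OFFSET_BITS) - 1)
    vpn = vaddr >> OFFSET_BITS
    inner_index = vpn & ((1 << INDEX_BITS) - 1)  # low 10 bits of the VPN
    outer_index = vpn >> INDEX_BITS              # high 10 bits of the VPN
    inner = outer[outer_index]
    if inner is None or inner[inner_index] is None:
        raise LookupError("page fault: no mapping for this virtual page")
    return (inner[inner_index] << OFFSET_BITS) | offset


outer = [None] * (1 << INDEX_BITS)      # 1024 pointers, all unallocated
outer[1] = [None] * (1 << INDEX_BITS)   # allocate only the inner table in use
outer[1][2] = 0x55                      # map one virtual page to PPN 0x55

vaddr = (1 << 22) | (2 << 12) | 0x123
print(hex(translate_two_level(outer, vaddr)))  # 0x55123
```

Only one inner table exists here, so the structure stores about 2048 entries instead of $2^{20}$.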