1. Last time
2. Page faults: intro + mechanics
3. Page faults: uses (of paging)
4. Page faults: costs
5. Page replacement policies
6. Thrashing

2. Page faults: intro + mechanics

Concept: illegal virtual memory reference:

- hardware thinks it's illegal (though it might be valid for the process)
- OS has to get involved

Mechanics:

- processor constructs a trap frame and transfers execution to an interrupt or trap handler



3. Uses of page faults
- Classic example: overcommitting physical memory

prog: 64 GB; h/w: 16 GB RAM; 8 GB

- Copy on write

- Accounting



- demand paging

- growing the stack

- BSS page allocation



- Shared text (code)

- Shared libraries

- Shared memory

Randy Pausch time management

4. Page faults: costs

look at AMAT (avg. memory access time)

AMAT = (1-p) * (mem access time) + p * (page fault time)

p is probability (or frequency) of a page fault.

mem access time: $t_M$ ~ 100 ns

disk access time: $t_D$ ~ 10 ms ($10^7$ ns)

QUESTION: what is p s.t. paging hurts performance by less than 10%?

without paging: $t_M$

with paging: $1.1\, t_M$

$1.1\, t_M = (1-p)\, t_M + p\, t_D$

$0.1\, t_M = p\,(-t_M) + p\, t_D$

$p = \frac{0.1\, t_M}{t_D - t_M} \approx \frac{10 \text{ ns}}{10 \text{ ms}} = \frac{1}{10^{6}}$

$p = 10^{-6}$ ~ 1 in 1,000,000
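The AMAT question can be checked numerically. The sketch below plugs in the round numbers from these notes (about 100 ns per memory access, about 10 ms per page fault); the function names `amat_ns` and `max_fault_rate` are made up for illustration.

```python
def amat_ns(p, t_mem_ns=100.0, t_fault_ns=10e6):
    """AMAT in ns for page-fault probability p (10 ms = 10e6 ns)."""
    return (1 - p) * t_mem_ns + p * t_fault_ns


def max_fault_rate(slowdown, t_mem_ns=100.0, t_fault_ns=10e6):
    """Largest p for which AMAT <= (1 + slowdown) * t_mem.

    Obtained by solving (1 + slowdown) * t_M = (1 - p) * t_M + p * t_D for p.
    """
    return slowdown * t_mem_ns / (t_fault_ns - t_mem_ns)


p_max = max_fault_rate(0.10)
print(f"p_max = {p_max:.3e}")
print(f"AMAT at p_max = {amat_ns(p_max):.1f} ns")
```

With these inputs p_max comes out at roughly one fault per million accesses, so even a tiny fault rate dominates the average cost.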

5. Page replacement policies

[Figure: RAM, PPN=46]



- FIFO: eject oldest

- MIN (OPT): eject entry that won't be referenced for the longest time

input:
reference string
cache size

output: number of evictions

FIFO: reference string (partly illegible): ... C A B D ...; phys_slot S1 S2 S3; 7 evictions, 4 hits

MIN (OPT): phys_slot S1 S2 S3; 5 evictions, 6 hits

LRU: phys_slot S1 S2 S3; 5 evictions, 6 hits
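The input/output framing above (reference string and cache size in, number of evictions out) maps directly onto a small simulator. The sketch below is one way to write it, not the course's code. The reference string `ABCABDADBCB` is an assumption: the string in the notes is only partly legible, and this one was picked because it reproduces the counts recorded there. The simulator counts every miss, including the initial fills of empty slots, which appears to be what the notes call an "eviction".

```python
from collections import OrderedDict


def simulate_fifo(refs, size):
    """Return (misses, hits) for FIFO replacement."""
    resident, order = set(), []   # order[0] is the oldest resident page
    misses = hits = 0
    for page in refs:
        if page in resident:
            hits += 1
            continue
        misses += 1
        if len(resident) == size:
            resident.remove(order.pop(0))   # eject oldest
        resident.add(page)
        order.append(page)
    return misses, hits


def simulate_lru(refs, size):
    """Return (misses, hits) for LRU replacement."""
    resident = OrderedDict()      # least recently used page comes first
    misses = hits = 0
    for page in refs:
        if page in resident:
            hits += 1
            resident.move_to_end(page)      # mark as most recently used
            continue
        misses += 1
        if len(resident) == size:
            resident.popitem(last=False)    # eject least recently used
        resident[page] = True
    return misses, hits


def simulate_min(refs, size):
    """Return (misses, hits) for MIN (OPT), which looks at future references."""
    resident = set()
    misses = hits = 0
    for i, page in enumerate(refs):
        if page in resident:
            hits += 1
            continue
        misses += 1
        if len(resident) == size:
            future = refs[i + 1:]

            def next_use(q):
                # Distance to the next reference; infinity if never used again.
                return future.index(q) if q in future else float("inf")

            resident.remove(max(resident, key=next_use))
        resident.add(page)
    return misses, hits


refs = list("ABCABDADBCB")        # assumed reference string (see lead-in)
for name, sim in [("FIFO", simulate_fifo), ("MIN", simulate_min), ("LRU", simulate_lru)]:
    misses, hits = sim(refs, 3)
    print(f"{name}: {misses} misses, {hits} hits")
```

On the assumed string with 3 slots this gives 7 misses and 4 hits for FIFO, and 5 misses and 6 hits for both MIN and LRU.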

A B C D A B C D

phys_slot S1 S2 S3

8 evictions, 0 hits

FIFO

3 entries: A B C D A B E A B C D E

phys_slot S1 S2 S3

4 entries: A B C D A B E A B C D E

phys_slot S1 S2 S3 S4

Belady's anomaly: with FIFO, adding a slot can increase the number of evictions.
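The anomaly is easy to reproduce by running FIFO on the reference string above with 3 and then 4 entries. A minimal sketch follows; `fifo_misses` is a made-up helper that counts every miss, including the initial fills.

```python
from collections import deque


def fifo_misses(refs, size):
    """Count misses under FIFO replacement with `size` physical slots."""
    resident = deque()            # oldest page sits at the left end
    misses = 0
    for page in refs:
        if page in resident:
            continue              # hit: FIFO order is not updated
        misses += 1
        if len(resident) == size:
            resident.popleft()    # eject oldest
        resident.append(page)
    return misses


refs = list("ABCDABEABCDE")
for size in (3, 4):
    print(f"{size} entries: {fifo_misses(refs, size)} misses")
```

With 3 entries FIFO takes 9 misses on this string; with 4 entries it takes 10, so the larger cache does worse.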

OPT ~ LRU: we are motivated to implement LRU

CLOCK
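CLOCK is commonly described as a cheap approximation of LRU that relies on a per-page reference bit, of the kind hardware sets on access. The sketch below shows one common variant and is not necessarily the version covered in lecture. Two details are assumptions: the reference bit is set when a page is loaded, and the hand advances past a slot after filling it.

```python
def simulate_clock(refs, size):
    """Return (misses, hits) for a simple CLOCK (second-chance) policy."""
    frames = [None] * size        # page held by each physical slot
    ref_bit = [0] * size          # per-slot reference bit
    hand = 0                      # next slot the clock hand will inspect
    misses = hits = 0
    for page in refs:
        if page in frames:
            hits += 1
            ref_bit[frames.index(page)] = 1   # models hardware setting the bit
            continue
        misses += 1
        # Sweep: a slot whose bit is set gets a second chance (bit cleared).
        while frames[hand] is not None and ref_bit[hand] == 1:
            ref_bit[hand] = 0
            hand = (hand + 1) % size
        frames[hand] = page       # replace the first slot found with bit 0
        ref_bit[hand] = 1
        hand = (hand + 1) % size
    return misses, hits


misses, hits = simulate_clock(list("ABCDABEABCDE"), 3)
print(f"CLOCK: {misses} misses, {hits} hits")
```

The sweep always terminates: after at most one full rotation every bit has been cleared, so the hand finds a victim.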

## **Core i7 Page Table Translation**



# **Review of Symbols**

#### Basic Parameters

- N = 2<sup>n</sup>: Number of addresses in virtual address space
- M = 2<sup>m</sup>: Number of addresses in physical address space
- **P = 2**<sup>p</sup> : Page size (bytes)

#### Components of the virtual address (VA)

- TLBI: TLB index
- TLBT: TLB tag
- VPO: Virtual page offset
- VPN: Virtual page number

#### Components of the physical address (PA)

- PPO: Physical page offset (same as VPO)
- PPN: Physical page number
- **CO**: Byte offset within cache line
- CI: Cache index
- CT: Cache tag
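The virtual-address fields listed above are bit slices of the address, which a few shifts and masks make concrete. In this sketch the 12-bit page offset follows from 4 KB pages. The 4-bit TLB index (a 16-set TLB) and the sample addresses are assumptions chosen for illustration.

```python
# Assumed parameters (not stated in the symbol list itself):
P_BITS = 12      # p: 4 KB pages, so VPO and PPO are 12 bits wide
TLBI_BITS = 4    # hypothetical TLB with 2**4 = 16 sets


def split_va(va):
    """Split a virtual address into VPN, VPO, TLBT and TLBI."""
    vpo = va & ((1 << P_BITS) - 1)
    vpn = va >> P_BITS
    tlbi = vpn & ((1 << TLBI_BITS) - 1)   # low bits of the VPN
    tlbt = vpn >> TLBI_BITS               # remaining high bits of the VPN
    return {"VPN": vpn, "VPO": vpo, "TLBT": tlbt, "TLBI": tlbi}


def make_pa(ppn, vpo):
    """Form a physical address: PPN concatenated with PPO (same as VPO)."""
    return (ppn << P_BITS) | vpo


fields = split_va(0x7FFF12345678)
print({name: hex(value) for name, value in fields.items()})
print(hex(make_pa(0x1ABCD, fields["VPO"])))
```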

# **Core i7 Level 1-3 Page Table Entries**



#### Each entry references a 4K child page table. Significant fields:

**P:** Child page table present in physical memory (1) or not (0).

**R/W:** Read-only or read-write access permission for all reachable pages.

**U/S:** User or supervisor (kernel) mode access permission for all reachable pages.

**WT:** Write-through or write-back cache policy for the child page table.

**A:** Reference bit (set by MMU on reads and writes, cleared by software).

**PS:** Page size: if bit set, we have 2 MB or 1 GB pages (bit can be set in Level 2 and 3 PTEs only).

**Page table physical base address:** 40 most significant bits of physical page table address (forces page tables to be 4KB aligned)

**XD:** Disable or enable instruction fetches from all pages reachable from this PTE.

# **Core i7 Level 4 Page Table Entries**



When P=0: the remaining bits are available for the OS (for example, to record the page's location on disk).

#### Each entry references a 4K child page. Significant fields:

**P:** Child page is present in memory (1) or not (0)

**R/W:** Read-only or read-write access permission for this page

**U/S:** User or supervisor mode access

**WT:** Write-through or write-back cache policy for this page

**A:** Reference bit (set by MMU on reads and writes, cleared by software)

**D:** Dirty bit (set by MMU on writes, cleared by software)

**Page physical base address:** 40 most significant bits of physical page address (forces pages to be 4KB aligned)

**XD:** Disable or enable instruction fetches from this page.
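The fields above can be pulled out of a 64-bit entry with masks. The slide text gives the field meanings but not the bit positions. The positions below are the standard x86-64 ones, recalled from memory rather than taken from this slide: P = bit 0, R/W = bit 1, U/S = bit 2, WT = bit 3, A = bit 5, D = bit 6, base address = bits 12 to 51, XD = bit 63. Treat them as an assumption to check against the figure. The sample entry value is made up.

```python
def decode_level4_pte(pte):
    """Decode the fields of a 64-bit Level 4 PTE (bit positions assumed)."""
    return {
        "P": bool(pte & (1 << 0)),      # page present in memory
        "R/W": bool(pte & (1 << 1)),    # 1 = writable, 0 = read-only
        "U/S": bool(pte & (1 << 2)),    # 1 = user mode may access
        "WT": bool(pte & (1 << 3)),     # 1 = write-through caching
        "A": bool(pte & (1 << 5)),      # reference bit
        "D": bool(pte & (1 << 6)),      # dirty bit
        "XD": bool(pte & (1 << 63)),    # 1 = instruction fetches disabled
        # Bits 12..51 hold the 40-bit base; masking the low 12 bits off
        # yields a 4 KB-aligned physical address.
        "base": pte & 0x000FFFFFFFFFF000,
    }


entry = decode_level4_pte(0x8000000012345067)   # hypothetical entry
print(entry["P"], entry["D"], hex(entry["base"]))
```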

### **End-to-end Core i7 Address Translation**
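A toy walk through four levels of page tables, assuming the usual Core i7 layout: a 48-bit virtual address whose 36-bit VPN splits into four 9-bit indices (VPN1 to VPN4), followed by a 12-bit offset. Nested dicts stand in for the page tables and a missing key models P = 0. The names `vpn_indices`, `walk`, and `root` are hypothetical, and real entries hold physical addresses rather than Python objects.

```python
def vpn_indices(va):
    """Return (VPN1, VPN2, VPN3, VPN4): 9 bits each, VPN1 most significant."""
    return tuple((va >> shift) & 0x1FF for shift in (39, 30, 21, 12))


def walk(root, va):
    """Translate va by walking nested dicts that stand in for page tables."""
    table = root                   # stands for the Level 1 table (via CR3)
    for index in vpn_indices(va):
        if index not in table:     # models an entry with P = 0
            raise LookupError("page fault")
        table = table[index]
    ppn = table                    # the Level 4 entry yields the PPN
    return (ppn << 12) | (va & 0xFFF)   # PPN concatenated with the offset


va = 0x7FFF12345678
i1, i2, i3, i4 = vpn_indices(va)
root = {i1: {i2: {i3: {i4: 0x1ABCD}}}}  # map exactly one page
print(hex(walk(root, va)))
```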



## **Cute Trick for Speeding Up L1 Access**



#### Observation

- Bits that determine CI identical in virtual and physical address
- Can index into cache while address translation taking place
- Cache carefully sized to make this possible: 64 sets, 64-byte cache blocks
- Means 6 bits for cache index, 6 for cache offset
- That's 12 bits; matches VPO, PPO $\rightarrow$ one reason pages are $2^{12}$ bytes = 4 KB
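The observation can be checked directly: with 6 offset bits and 6 index bits, the cache index is taken from address bits 6 to 11, which lie inside the 12-bit page offset and therefore survive translation unchanged. A small sketch, with arbitrary sample PPNs and a made-up `translate` helper:

```python
CO_BITS = 6      # 64-byte cache blocks
CI_BITS = 6      # 64 sets
PAGE_BITS = 12   # 4 KB pages


def cache_index(addr):
    """CI: the 6 bits just above the 6-bit block offset."""
    return (addr >> CO_BITS) & ((1 << CI_BITS) - 1)


def translate(va, ppn):
    """Replace the VPN with a PPN, keeping the page offset (VPO = PPO)."""
    return (ppn << PAGE_BITS) | (va & ((1 << PAGE_BITS) - 1))


va = 0x7FFF12345678
for ppn in (0x1, 0x1ABCD, 0xFFFFF):       # arbitrary physical page numbers
    pa = translate(va, ppn)
    # Same set index whichever physical page the VPN maps to.
    assert cache_index(va) == cache_index(pa)
print(hex(cache_index(va)))
```

Because the index is known before translation finishes, the set lookup can start in parallel with the TLB access, and only the tag comparison needs the PPN.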

## Virtual Address Space of a Linux Process





Figure 4-12. Page-Fault Error Code
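The figure itself is missing here, so the sketch below decodes only the low five bits of the error code as they are laid out in the Intel manual, to the best of my recollection: bit 0 = P, bit 1 = W/R, bit 2 = U/S, bit 3 = RSVD, bit 4 = I/D. The dictionary keys are made-up names, and newer processors define further bits that are ignored here.

```python
def decode_page_fault_error(code):
    """Decode the low bits of an x86 page-fault error code (layout assumed)."""
    return {
        # 0: fault on a non-present page; 1: page-level protection violation
        "protection_violation": bool(code & 0x01),
        "write": bool(code & 0x02),             # 0: read access, 1: write access
        "user_mode": bool(code & 0x04),         # 0: supervisor, 1: user mode
        "reserved_bit_set": bool(code & 0x08),  # reserved bit set in an entry
        "instruction_fetch": bool(code & 0x10), # fault on an instruction fetch
    }


# Example: user-mode write to a page that is not present.
print(decode_page_fault_error(0x6))
```

A handler can use these bits to tell, for example, a copy-on-write fault (write to a present, read-only page) from a demand-paging fault (access to a non-present page).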