### A3.1.2. Translation Lookaside Buffer

$$
\text{Physical Address} = \text{PageTable}[\text{VPN}].\text{PFN} \;\|\; \text{Offset}
$$

where VPN is the virtual page number, PFN the physical frame number, and $\|$ denotes concatenation.
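As a minimal sketch of the formula, assuming 4 KB pages (12 offset bits) and a flat page table held in a Python dict: the `page_table` contents and the `translate` helper below are hypothetical and exist only to illustrate the concatenation step.

```python
OFFSET_BITS = 12                      # assumption: 4 KB pages
OFFSET_MASK = (1 << OFFSET_BITS) - 1

# Hypothetical single-level page table: VPN -> PFN (made-up values).
page_table = {0x7FFFABCD1: 0x1A2B3}


def translate(virtual_address: int) -> int:
    """Return PageTable[VPN].PFN concatenated with the page offset."""
    vpn = virtual_address >> OFFSET_BITS
    offset = virtual_address & OFFSET_MASK
    pfn = page_table[vpn]             # a KeyError here stands in for a page fault
    return (pfn << OFFSET_BITS) | offset


physical = translate(0x7FFFABCD1234)
print(f"Physical address: 0x{physical:X}")
```

Shifting the PFN left by the offset width and OR-ing in the offset is the integer equivalent of the $\|$ concatenation.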

**Explanation:**

The **Translation Lookaside Buffer (TLB)** is a hardware cache that stores recent virtual-to-physical address translations, avoiding the cost of walking the page table on every memory access.

**Virtual Memory Translation:**

1. CPU issues a virtual address.
2. The MMU splits it into **virtual page number (VPN)** and **page offset**.
3. VPN is looked up in the TLB.
4. **TLB hit** → physical frame number returned in ~1 cycle.
5. **TLB miss** → hardware page table walker traverses the multi-level page table (~10–100 cycles), then fills the TLB.
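The steps above can be sketched as a toy software model. This is an illustration under simplifying assumptions, not how any particular MMU is built: the `ToyTLB` class is hypothetical, the TLB is treated as fully associative with LRU replacement, and a plain dict stands in for the multi-level page table.

```python
from collections import OrderedDict

OFFSET_BITS = 12                      # assumption: 4 KB pages
OFFSET_MASK = (1 << OFFSET_BITS) - 1


class ToyTLB:
    """Hypothetical fully associative TLB with LRU replacement."""

    def __init__(self, capacity, page_table):
        self.capacity = capacity
        self.page_table = page_table  # dict standing in for the page table
        self.entries = OrderedDict()  # VPN -> PFN, least recently used first
        self.hits = 0
        self.misses = 0

    def translate(self, virtual_address):
        # Step 2: split the address into VPN and page offset.
        vpn = virtual_address >> OFFSET_BITS
        offset = virtual_address & OFFSET_MASK
        # Step 3: look the VPN up in the TLB.
        if vpn in self.entries:
            # Step 4: hit, refresh the entry's LRU position.
            self.hits += 1
            self.entries.move_to_end(vpn)
        else:
            # Step 5: miss, "walk" the page table and fill the TLB.
            self.misses += 1
            pfn = self.page_table[vpn]
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)  # evict the LRU entry
            self.entries[vpn] = pfn
        return (self.entries[vpn] << OFFSET_BITS) | offset


# Made-up mapping: VPN n -> PFN n + 0x100.
page_table = {vpn: vpn + 0x100 for vpn in range(8)}
tlb = ToyTLB(capacity=2, page_table=page_table)

for va in (0x0000, 0x0008, 0x1000, 0x0010, 0x2000, 0x1004):
    tlb.translate(va)

print(f"hits={tlb.hits}, misses={tlb.misses}")
```

With only two entries, the third distinct page evicts the least recently used translation, so the later access to page 1 misses again even though it was translated earlier.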

**Typical TLB Sizes:**

| TLB | Entries | Page Size | Coverage |
|-----|---------|-----------|----------|
| L1 dTLB | 64 | 4 KB | 256 KB |
| L2 sTLB | 1536 | 4 KB | 6 MB |
| L1 dTLB (huge) | 32 | 2 MB | 64 MB |

**Performance Implications:**

- **Large working sets** that span many pages exhaust TLB entries, causing frequent misses.
- **Huge pages** (2 MB or 1 GB) cover more memory per TLB entry, reducing miss rate.
- **TLB shootdown**: when a page mapping changes, every core that may hold the stale translation must invalidate it, typically via an inter-processor interrupt (IPI), which is expensive.
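To put rough numbers on the first two points, here is a back-of-the-envelope estimate. It rests on loud assumptions: accesses are uniformly random over the working set (so the hit rate is approximated as entries divided by pages), a hit costs 1 cycle, and a page walk costs 50 cycles, an arbitrary value picked from the ~10–100 range quoted earlier. The function name `avg_translation_cycles` is hypothetical.

```python
def avg_translation_cycles(working_set_bytes, page_size, tlb_entries,
                           hit_cycles=1, walk_cycles=50):
    """Estimate mean translation cost under uniformly random access.

    Assumption: hit rate ~= min(1, tlb_entries / pages_in_working_set).
    A miss pays the TLB lookup plus the page walk.
    """
    pages = -(-working_set_bytes // page_size)   # ceiling division
    hit_rate = min(1.0, tlb_entries / pages)
    miss_cost = hit_cycles + walk_cycles
    return hit_rate * hit_cycles + (1.0 - hit_rate) * miss_cost


GIB = 1024 ** 3
cost_4k = avg_translation_cycles(GIB, 4 * 1024, tlb_entries=1536)
cost_2m_l1 = avg_translation_cycles(GIB, 2 * 1024 ** 2, tlb_entries=32)

print(f"4 KB pages, 1536 entries: {cost_4k:.1f} cycles per translation")
print(f"2 MB pages,   32 entries: {cost_2m_l1:.1f} cycles per translation")
```

Real workloads have locality, so measured miss rates are usually far lower than this uniform-access model suggests; the sketch only shows how coverage relative to the working set drives the trend.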

**Example:**

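A program touching 1 GB of data with 4 KB pages needs 262,144 page translations. A 1536-entry L2 TLB covers only ~6 MB of that, so accesses to the rest miss in the TLB. With 2 MB huge pages, the same 1 GB needs only 512 translations. That is still more than the 32-entry L1 huge-page dTLB holds, but it fits comfortably in a 1536-entry L2 TLB, provided that TLB can also hold 2 MB entries (as the shared second-level TLB on some recent x86 cores does).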

```python
# Page sizes in bytes.
PAGE_SIZE_4K = 4 * 1024
PAGE_SIZE_2M = 2 * 1024 * 1024
PAGE_SIZE_1G = 1 * 1024 * 1024 * 1024

# Entry counts from the "Typical TLB Sizes" table.
DTLB_L1_ENTRIES = 64
STLB_L2_ENTRIES = 1536
HUGE_TLB_ENTRIES = 32

# Coverage = number of entries * page size.
coverage_l1_4k = DTLB_L1_ENTRIES * PAGE_SIZE_4K
coverage_l2_4k = STLB_L2_ENTRIES * PAGE_SIZE_4K
coverage_huge_2m = HUGE_TLB_ENTRIES * PAGE_SIZE_2M

print("TLB Coverage:")
print(f"  L1 dTLB (4 KB pages): {coverage_l1_4k / 1024:.0f} KB")
print(f"  L2 sTLB (4 KB pages): {coverage_l2_4k / (1024**2):.1f} MB")
print(f"  Huge TLB (2 MB pages): {coverage_huge_2m / (1024**2):.0f} MB")

# Translations needed to map a 1 GB working set.
working_set_bytes = 1 * 1024 * 1024 * 1024
pages_needed_4k = working_set_bytes // PAGE_SIZE_4K
pages_needed_2m = working_set_bytes // PAGE_SIZE_2M

print(f"\nWorking set: {working_set_bytes / (1024**3):.0f} GB")
print(f"  Pages needed (4 KB): {pages_needed_4k:,}")
print(f"  Pages needed (2 MB): {pages_needed_2m:,}")
print(f"  L2 TLB covers {STLB_L2_ENTRIES}/{pages_needed_4k:,} pages at 4 KB")
print(f"  Huge TLB covers {HUGE_TLB_ENTRIES}/{pages_needed_2m} pages at 2 MB")

# Split a virtual address into VPN and page offset (4 KB pages -> 12 offset bits).
virtual_address = 0x00007FFF_ABCD1234
offset_bits = 12
offset_mask = (1 << offset_bits) - 1

vpn = virtual_address >> offset_bits
page_offset = virtual_address & offset_mask

print("\nAddress translation (4 KB pages):")
print(f"  Virtual address: 0x{virtual_address:016X}")
print(f"  VPN: 0x{vpn:013X}")
print(f"  Page offset: 0x{page_offset:03X} ({page_offset} bytes)")
```

**References:**

[📘 Hennessy, J. & Patterson, D. (2019). *Computer Architecture: A Quantitative Approach (6th ed.).* Morgan Kaufmann.](https://www.elsevier.com/books/computer-architecture/hennessy/978-0-12-811905-1)

[📘 Drepper, U. (2007). *What Every Programmer Should Know About Memory.* Red Hat.](https://people.freebsd.org/~lstewart/articles/cpumemory.pdf)

---

[⬅️ Previous: Cache Hierarchy](./01_cache_hierarchy.ipynb) | [Next: Branch Prediction ➡️](./03_branch_prediction.ipynb)