# The Global Descriptor Table (GDT)

*Three eight-byte entries that tell the CPU how to interpret every memory access in protected mode.*

To enter [protected mode](protected-mode.md), the CPU needs a table that defines
the segments it will use: where each one starts, how large it is, and what may
be done with it. That table is the **Global Descriptor Table (GDT)**.
MyOS-Simple installs a tiny, hand-written GDT in
[`boot.asm`](stage-2-c-protected-mode.md) with just three entries, the minimum
for a flat 32-bit kernel.

This page explains the *concept* of the GDT and walks through the project's
table. For the exhaustive, bit-by-bit field reference (every flag in the access
and granularity bytes), see
[`reference/gdt-descriptor-format.md`](gdt-descriptor-format.md).

## What a descriptor is

In real mode, a segment register directly holds a value that is multiplied by 16
to form a base address. In protected mode, a segment register instead holds a
**selector**, essentially an *index* into the GDT. The GDT entry it points to,
called a **segment descriptor**, is an 8-byte structure that encodes:

- **Base**: the linear start address of the segment (split awkwardly across the
  descriptor for backward-compatibility reasons).
- **Limit**: the size of the segment, scaled by the granularity flag.
- **Access byte**: present bit, privilege level, executable/data type, and
  read/write permissions.
- **Flags**: granularity, default operand size (16- vs 32-bit), and the 64-bit
  long-mode flag.

The CPU reads the descriptor when a selector is loaded into a segment register,
caches it inside the CPU, and then validates and translates every memory access
against that cached copy. In the flat model the check never rejects an address
and the translation adds a base of zero, but the descriptors still have to exist
and be correct.

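
To make the "split awkwardly" remark concrete, here is a minimal C sketch of how
the four logical fields land in the eight bytes. The function name `gdt_encode`
and its signature are hypothetical and not part of MyOS-Simple, which writes its
descriptors by hand in assembly; the byte positions follow the standard x86
descriptor layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from MyOS-Simple): pack base, limit, access and
 * flags into the 8-byte descriptor layout, returned as one 64-bit value
 * whose least significant byte is descriptor byte 0. */
uint64_t gdt_encode(uint32_t base, uint32_t limit, uint8_t access, uint8_t flags)
{
    assert(limit <= 0xFFFFFu); /* the limit field is only 20 bits wide */
    assert(flags <= 0xFu);     /* the flags field is only 4 bits wide  */

    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xFFFFu);            /* bytes 0-1: limit 15:0      */
    d |= (uint64_t)(base & 0xFFFFFFu) << 16;     /* bytes 2-4: base 23:0       */
    d |= (uint64_t)access << 40;                 /* byte 5: access byte        */
    d |= (uint64_t)((limit >> 16) & 0xFu) << 48; /* byte 6, low nibble: limit 19:16 */
    d |= (uint64_t)(flags & 0xFu) << 52;         /* byte 6, high nibble: flags */
    d |= (uint64_t)((base >> 24) & 0xFFu) << 56; /* byte 7: base 31:24         */
    return d;
}
```

Called with base `0`, limit `0xFFFFF`, access `0x9A`, and flags `0xC` (the values
this project uses for its code segment), the sketch returns
`0x00CF9A000000FFFF`.
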
## The project's three-entry table

Here is the complete GDT from [`boot.asm:107-134`](stage-2-c-protected-mode.md):

```asm
; GDT
gdt_start:
    dd 0x0
    dd 0x0

gdt_code:
    dw 0xffff       ; Limit
    dw 0x0          ; Base
    db 0x0          ; Base
    db 10011010b    ; Access
    db 11001111b    ; Flags
    db 0x0          ; Base

gdt_data:
    dw 0xffff
    dw 0x0
    db 0x0
    db 10010010b
    db 11001111b
    db 0x0

gdt_end:

gdt_descriptor:
    dw gdt_end - gdt_start - 1
    dd gdt_start

CODE_SEG equ gdt_code - gdt_start
DATA_SEG equ gdt_data - gdt_start
```

### Entry 0: the null descriptor

```asm
gdt_start:
    dd 0x0
    dd 0x0
```

The first GDT entry is reserved: the processor never uses it to address memory,
and by convention it is filled with zeros. The x86 architecture reserves
selector `0x00` (the "null selector") as a deliberate trap: loading it into a
data segment register and then dereferencing it raises a fault. This catches the
classic bug of using an uninitialized segment register. The null descriptor
occupies the slot but is never used to address anything.

### Entry 1: the code segment

```asm
gdt_code:
    dw 0xffff       ; Limit (bits 0..15)
    dw 0x0          ; Base (bits 0..15)
    db 0x0          ; Base (bits 16..23)
    db 10011010b    ; Access = 0x9A
    db 11001111b    ; Flags+limit-high = 0xCF
    db 0x0          ; Base (bits 24..31)
```

This describes the segment from which the CPU **fetches instructions**. Base is
`0`, the limit field is `0xFFFFF` (the low 16 bits from the first word, the top
4 bits from the low nibble of `0xCF`), and with 4 KiB granularity (see below) it
spans the full 4 GiB. The access byte `10011010b` (`0x9A`) marks it **present,
ring 0, a code segment, executable, non-conforming, and readable**.

### Entry 2: the data segment

```asm
gdt_data:
    dw 0xffff
    dw 0x0
    db 0x0
    db 10010010b    ; Access = 0x92
    db 11001111b    ; Flags+limit-high = 0xCF
    db 0x0
```

Byte-for-byte identical to the code segment **except the access byte**, which is
`10010010b` (`0x92`): **present, ring 0, a data segment, grows-up, writable**.
This is the segment used for `DS`, `SS`, `ES`, `FS`, and `GS`. It overlaps the
code segment completely (both cover all 4 GiB), which is exactly what "flat
model" means.

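
The two descriptors can also be read back field by field. The following is a
rough C sketch, assuming the eight raw bytes are available in memory order; the
struct `gdt_fields` and the functions `gdt_decode` and `gdt_last_offset` are
illustrative names, not code from the project.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical decoded view of one descriptor (names are illustrative). */
struct gdt_fields {
    uint32_t base;   /* 32-bit linear base address          */
    uint32_t limit;  /* raw 20-bit limit field, unscaled    */
    uint8_t  access; /* descriptor byte 5                   */
    uint8_t  flags;  /* high nibble of byte 6: G, D/B, L, AVL */
};

/* Reassemble the scattered fields from the 8 bytes in memory order. */
struct gdt_fields gdt_decode(const uint8_t raw[8])
{
    struct gdt_fields f;
    f.limit  = (uint32_t)raw[0]
             | ((uint32_t)raw[1] << 8)
             | ((uint32_t)(raw[6] & 0x0Fu) << 16);
    f.base   = (uint32_t)raw[2]
             | ((uint32_t)raw[3] << 8)
             | ((uint32_t)raw[4] << 16)
             | ((uint32_t)raw[7] << 24);
    f.access = raw[5];
    f.flags  = (uint8_t)(raw[6] >> 4);
    return f;
}

/* Highest valid offset in the segment. With G set (flags bit 3), the limit
 * counts 4 KiB pages, so the low 12 bits of the result are all ones. */
uint32_t gdt_last_offset(struct gdt_fields f)
{
    assert(f.limit <= 0xFFFFFu);
    if (f.flags & 0x8u)
        return (f.limit << 12) | 0xFFFu;
    return f.limit;
}
```

Feeding in the bytes of `gdt_code` (`FF FF 00 00 00 9A CF 00`) yields base `0`,
limit `0xFFFFF`, and a last valid offset of `0xFFFFFFFF`, which is the top of
the 4 GiB address space.
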
> 💡 **Tidbit:** The code and data segments differ by a single bit (the
> *executable* bit, E). `0x9A` vs `0x92` is the difference between "the CPU may
> jump here and run it" and "the CPU may read and write here." Everything else
> about the two segments is identical, which is why they can occupy the same
> address range.

## Decoding the two interesting bytes

You do not need to memorize these
([`reference/gdt-descriptor-format.md`](gdt-descriptor-format.md) has the full
tables), but seeing them once builds intuition.

### The access byte

Bits, from 7 down to 0, are `P | DPL[1] DPL[0] | S | E | DC | RW | A`:

| Value | Bits | Meaning |
|-------|------|---------|
| `0x9A` (code) | `1 00 1 1 0 1 0` | Present, ring 0, code/data type, executable, non-conforming, readable, not accessed |
| `0x92` (data) | `1 00 1 0 0 1 0` | Present, ring 0, code/data type, **not** executable (data), grows-up, writable, not accessed |

`DPL = 00` is what makes this a **ring 0** kernel, the most privileged level.

### The flags / limit-high byte: `0xCF` (`11001111b`)

The high nibble is the **flags**, the low nibble is **limit bits 19:16**:

| Nibble | Value | Meaning |
|--------|-------|---------|
| Flags (high) | `1100` = `0xC` | G=1 (4 KiB granularity), D/B=1 (32-bit), L=0 (not 64-bit), AVL=0 |
| Limit[19:16] (low) | `1111` = `0xF` | top 4 bits of the 20-bit limit |

> 💡 **Tidbit:** How do you get a 4 GiB segment from a 20-bit limit field? The
> **granularity bit (G)** scales the limit by 4 KiB. With the full limit field
> `0xFFFFF` and G=1, the segment size is `(0xFFFFF + 1) * 4 KiB = 0x100000 * 0x1000
> = 4 GiB`. Without granularity, the same field would top out at just 1 MiB.

## How the descriptor is registered: `gdt_descriptor`

```asm
gdt_descriptor:
    dw gdt_end - gdt_start - 1    ; limit = (size of table) - 1
    dd gdt_start                  ; base = linear address of the table
```

The `lgdt` instruction loads the **GDT register (GDTR)** from this 6-byte
structure: a 16-bit limit followed by a 32-bit base.
The limit is `gdt_end - gdt_start - 1`: for three 8-byte entries that is
`24 - 1 = 23`.

> 💡 **Tidbit:** The limit is **size minus one** by design. A limit value of `0`
> has to mean "1 byte is valid," not "0 bytes," so the field always encodes the
> *largest valid offset* rather than the count. This same minus-one convention
> appears in the segment descriptors' own limit fields.

## Selectors: turning a label into a segment register value

```asm
CODE_SEG equ gdt_code - gdt_start
DATA_SEG equ gdt_data - gdt_start
```

A **selector** is the value you actually load into a segment register, and it is
simply the **byte offset** of the descriptor within the GDT:

- `CODE_SEG = gdt_code - gdt_start = 8` → `0x08`
- `DATA_SEG = gdt_data - gdt_start = 16` → `0x10`

This is why the [protected-mode switch](protected-mode.md) far-jumps to
`CODE_SEG:init_pm` (loading `CS` with `0x08`) and then loads `0x10` into every
data segment register. The low 3 bits of a selector are not part of the offset:
bits 1:0 hold the requested privilege level (RPL) and bit 2 is the table
indicator (TI). Because descriptors are 8 bytes apart, and this project only
uses ring 0 and only the GDT, those bits are all zero, so the offsets
`0x08`/`0x10` are the selectors verbatim.

> ⚠️ **Caveat:** This GDT is *static*: it is assembled into the boot sector and
> never modified at runtime. A fuller OS would later add a **Task State Segment
> (TSS)** descriptor (needed for ring transitions and hardware task switching) and
> separate user-mode (ring 3) code/data descriptors. MyOS-Simple needs none of
> that because it runs entirely in ring 0 with a single flat address space.

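
The selector and GDTR-limit arithmetic from this page can be sketched in C as
follows. All function names here are hypothetical helpers for illustration;
MyOS-Simple itself computes these values at assembly time with `equ`.

```c
#include <assert.h>
#include <stdint.h>

#define GDT_ENTRY_SIZE 8u /* every descriptor is 8 bytes */

/* Build a GDT selector (TI = 0) for entry `index` at privilege level `rpl`. */
uint16_t gdt_selector(uint16_t index, uint8_t rpl)
{
    assert(index < 8192u); /* the index field is 13 bits wide */
    assert(rpl <= 3u);     /* privilege levels are 0..3       */
    return (uint16_t)((index << 3) | rpl);
}

/* Split a selector back into its three fields. */
uint16_t selector_index(uint16_t sel) { return (uint16_t)(sel >> 3); }
uint8_t  selector_ti(uint16_t sel)    { return (uint8_t)((sel >> 2) & 1u); }
uint8_t  selector_rpl(uint16_t sel)   { return (uint8_t)(sel & 3u); }

/* GDTR limit for a table of `entries` descriptors: size in bytes minus one. */
uint16_t gdtr_limit(uint16_t entries)
{
    assert(entries >= 1u && entries <= 8192u);
    return (uint16_t)(entries * GDT_ENTRY_SIZE - 1u);
}
```

With these helpers, `gdt_selector(1, 0)` gives `0x08`, `gdt_selector(2, 0)`
gives `0x10`, and `gdtr_limit(3)` gives `23`, matching `CODE_SEG`, `DATA_SEG`,
and the limit word in `gdt_descriptor`. A ring-3 selector for a hypothetical
fourth entry would be `gdt_selector(3, 3)`, which is `0x1B`; this project
defines no such entry.
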
## See also

- [`reference/gdt-descriptor-format.md`](gdt-descriptor-format.md): the complete bit-field reference for every descriptor field
- [Protected mode](protected-mode.md): why and how the GDT is loaded
- [Real mode](real-mode.md): the `segment * 16` model the GDT replaces
- [Memory map](memory-map.md): where the GDT lives in memory
- [Stage 2: C in protected mode](stage-2-c-protected-mode.md): the stage that introduces the GDT
- [Home](Home.md)