06. Global Descriptor Table

Jose edited this page Aug 18, 2021 · 12 revisions

This chapter introduces the concept of segmentation, x86 protected mode, and the global descriptor table (GDT). An x86 processor boots in real mode, but only in protected mode does it reveal its full power, providing features such as physical memory access control and native support for paging.

We implement our GDT data structures, learn how to enter protected mode after boot, and prepare our Hux kernel for memory management with paging. Paging itself is introduced in later chapters dedicated to memory management.

Main References of This Chapter

Scan through them before going forth:

Segmentation on Ancient 8086

The concept of segmentation dates back to the ancient 8086 architecture, where a 16-bit register could address only 64 KiB of memory, which was far too small to be useful. Intel designed the 8086 to support 1 MiB of memory (20 address lines), and enabled this by combining two values, a segment register and an offset, to represent a single memory address. A typical notation is:

segment:[offset]

where segment holds the upper 16 bits of the segment's 20-bit base address (the lowest 4 bits of the base are implicitly 0, so every segment starts on a 16-byte boundary) and offset can thus address up to 64 KiB from that base.
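The calculation above can be sketched in C. This is only an illustration: `real_mode_address` is a hypothetical helper, not part of Hux, and the final mask models an 8086 with 20 address lines, where results past 1 MiB wrap around.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not part of Hux): the 20-bit physical address an
 * 8086 forms from a segment:offset pair.
 */
static inline uint32_t
real_mode_address(uint16_t segment, uint16_t offset)
{
    /* Shifting left by 4 appends four zero bits, i.e. multiplies by 16. */
    uint32_t base = (uint32_t) segment << 4;

    /* Only 20 address lines exist, so the sum wraps around at 1 MiB. */
    return (base + offset) & 0xFFFFF;
}
```

For instance, `real_mode_address(0xB800, 0x0000)` yields `0xB8000`. Note that many different segment:offset pairs map to the same physical address.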

Segment registers include CS (code segment), SS (stack segment), DS (data segment), and ES (extra segment, which serves as a more flexible data segment for programmers to use). Instructions reside in the code segment, stack grows from top to bottom in the stack segment with SP (stack pointer) as the offset, and static data reside in the data segment.

Below is a demonstration cropped from here:

And the address calculation from figure 4.6 here:

80x86 Protected Mode & Global Descriptor Table (GDT)

When the architecture evolved to the 32-bit 80386, registers were extended to 32 bits (the lower 16 bits remain accessible under their old names), two more extra segment registers FS & GS were added, and, thanks to 32-bit addressing, we can now address up to 4 GiB of memory.

IA32 x86 registers look like:

The processor can work under one of the four different modes:

  1. Virtual 8086 mode: to be compatible with old 8086 programs
  2. System management mode (SMM): can only enter through BIOS; not visible to programmers
  3. Real mode: memory access follows exactly the same rules described in the section above (segment base + offset)
  4. Protected mode: uses global descriptors and enables the full power of x86 CPU

Global Descriptor Table (GDT)

The problem with real mode execution is that it does not provide access control over different physical memory segments. To enable access control, Intel introduced protected mode, which uses segment descriptors. Each segment descriptor is 8 bytes with the following definition:

where the Access Byte and Flags fields are:

The above is the definition of a GDT entry. It defines the base address and length limit of a segment, as well as the access control information for that segment.

These descriptor entries need to be stored in a block of memory as well, called the global descriptor table (GDT). The start address and boundary (= length in bytes - 1) of the GDT are recorded in a special 48-bit register, GDTR:

GDTR := |47   GDT Base Addr   16|15   Boundary   0|

We can infer that the length of GDT is at most 2^16 bytes / 8 bytes per descriptor = 8,192 descriptors. The 0-th entry of GDT is unused, so we have to insert a null entry there.
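The arithmetic can be written out as a short sketch. The macro names and `gdt_boundary` are hypothetical, introduced here only for illustration.

```c
#include <stdint.h>

/* Hypothetical names, for illustration only. */
#define GDT_ENTRY_SIZE   8u            /* bytes per segment descriptor      */
#define GDT_MAX_BYTES    (1u << 16)    /* the boundary field is 16 bits     */
#define GDT_MAX_ENTRIES  (GDT_MAX_BYTES / GDT_ENTRY_SIZE)

/*
 * Boundary value to store in GDTR for a table of `num_entries`
 * descriptors, where 1 <= num_entries <= GDT_MAX_ENTRIES.
 */
static inline uint16_t
gdt_boundary(uint32_t num_entries)
{
    return (uint16_t) (num_entries * GDT_ENTRY_SIZE - 1);
}
```

A table of 6 entries, for example, occupies 48 bytes and so has a boundary of 47.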

Protected Mode

In protected mode, a segment register no longer contains a base address. Instead, it contains the index of the corresponding segment's descriptor in the GDT. Address translation now follows the procedure ✭:

Only in protected mode does an x86 processor reveal its full power, bringing in modern features such as paging support. We will talk about memory management & paging in later chapters.
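As a rough model of protected-mode segment translation, the sketch below looks up a descriptor by selector and adds the offset to its base. Everything here is a simplification under stated assumptions: `seg_desc`, `make_selector`, `selector_index`, and `translate` are hypothetical names, the descriptor is shown already unpacked with its limit in bytes, and a failed check simply returns 0 where a real CPU would raise a fault. The selector layout used (index in bits 3 to 15, with the low 3 bits holding the table indicator and requested privilege level) is the standard x86 one.

```c
#include <stdint.h>

/* Simplified, already-unpacked descriptor (assumption for illustration). */
struct seg_desc {
    uint32_t base;   /* segment base address              */
    uint32_t limit;  /* highest valid offset, in bytes    */
};

/* Build a selector for a GDT entry: index << 3, TI = 0, RPL in bits 0-1. */
static inline uint16_t
make_selector(uint16_t index, uint16_t rpl)
{
    return (uint16_t) ((index << 3) | (rpl & 0x3));
}

static inline uint16_t
selector_index(uint16_t selector)
{
    return selector >> 3;
}

/*
 * Translate selector:offset into a linear address.
 * Returns 1 and stores the result on success, 0 if the selector is null,
 * out of the table, or the offset exceeds the segment limit.
 */
static inline int
translate(const struct seg_desc *table, uint16_t num_entries,
          uint16_t selector, uint32_t offset, uint32_t *linear)
{
    uint16_t idx = selector_index(selector);

    if (idx == 0 || idx >= num_entries)   /* entry 0 is the null entry */
        return 0;
    if (offset > table[idx].limit)
        return 0;

    *linear = table[idx].base + offset;
    return 1;
}
```

With 8-byte descriptors, shifting the index left by 3 also gives the descriptor's byte offset into the table, which is why entry 1 corresponds to selector 0x08 and entry 2 to 0x10.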

We are targeting an 80386 machine. By default, the processor enters real mode after being powered on, and continues running in real mode unless we instruct it to switch to protected mode. Modern operating systems, however, all operate in protected mode since they need memory protection and virtualization.

On 386 or higher architectures, there exist several control registers named CR<x>. CR0 contains various essential control flags that modify the basic operation of the processor. Bit 0 of CR0 is the "Protected Mode Enable" flag: a value of 1 makes the processor operate in protected mode (the default after reset is 0, which means real mode). Switching to protected mode essentially means setting this bit to 1.
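In terms of bit manipulation, this amounts to the following sketch. The names `CR0_PE`, `cr0_enable_protected_mode`, and `cr0_is_protected_mode` are hypothetical. Actually reading and writing CR0 requires privileged `mov` instructions executed by the kernel, which are deliberately left out here; the helpers only operate on a copy of the register value.

```c
#include <stdint.h>

/* Bit 0 of CR0: Protected Mode Enable (hypothetical macro name). */
#define CR0_PE (1u << 0)

/* Return the given CR0 value with the PE flag set. */
static inline uint32_t
cr0_enable_protected_mode(uint32_t cr0)
{
    return cr0 | CR0_PE;
}

/* Return nonzero if the given CR0 value has the PE flag set. */
static inline int
cr0_is_protected_mode(uint32_t cr0)
{
    return (cr0 & CR0_PE) != 0;
}
```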

Here is the standard procedure of switching to protected mode:

  1. Disable interrupts
  2. Enable the A20 line (this is a historical issue; you may skim through the linked Wikipedia page if you'd like to know more)
  3. Set up the GDTR correctly
  4. Set the lowest bit of CR0 register

GRUB has already taken care of steps 1 & 2 for us before transferring control to the _start function of our kernel. We will do step 3 in this chapter. Since we directly switch to use paging when entering protected mode, we will do step 4 in later chapters when we set up the virtual memory system with paging.

GDT Implementation

First, following the GDT entry format in the above sections, we pack them into structs @ src/memory/gdt.h:

/**
 * GDT entry format.
 * Check out https://wiki.osdev.org/Global_Descriptor_Table
 * for detailed anatomy of fields.
 */
struct gdt_entry {
    uint16_t limit_lo;          /** Limit 0:15. */
    uint16_t base_lo;           /** Base 0:15. */
    uint8_t  base_mi;           /** Base 16:23. */
    uint8_t  access;            /** Access Byte. */
    uint8_t  limit_hi_flags;    /** Limit 16:19 | Flags. */
    uint8_t  base_hi;           /** Base 24:31. */
} __attribute__((packed));
typedef struct gdt_entry gdt_entry_t;


/**
 * 48-bit GDTR address register format.
 * Used for loading the GDT table with `lgdt` instruction.
 */
struct gdt_register {
    uint16_t boundary;  /** Boundary = length in bytes - 1. */
    uint32_t base;      /** GDT base address. */
} __attribute__((packed));
typedef struct gdt_register gdt_register_t;


/** List of segments registered in GDT. */
#define SEGMENT_UNUSED 0x0
#define SEGMENT_KCODE  0x1
#define SEGMENT_KDATA  0x2
#define SEGMENT_UCODE  0x3
#define SEGMENT_UDATA  0x4
#define SEGMENT_TSS    0x5      // Will be explained in user mode execution.

#define NUM_SEGMENTS 6


void gdt_init();

For a more detailed explanation of what each bit in Access Byte and Flags fields stands for, check out the "Global Descriptor Table" page.
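Because the CPU reads these structures byte by byte, their sizes must match the hardware layout exactly. One way to guard against accidental padding is a compile-time check, sketched below. The struct definitions repeat the ones above so that the snippet stands alone; the `static_assert` lines are an optional addition and assume a C11 compiler such as a recent GCC.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>

struct gdt_entry {
    uint16_t limit_lo;          /* Limit 0:15. */
    uint16_t base_lo;           /* Base 0:15. */
    uint8_t  base_mi;           /* Base 16:23. */
    uint8_t  access;            /* Access Byte. */
    uint8_t  limit_hi_flags;    /* Limit 16:19 | Flags. */
    uint8_t  base_hi;           /* Base 24:31. */
} __attribute__((packed));
typedef struct gdt_entry gdt_entry_t;

struct gdt_register {
    uint16_t boundary;  /* Boundary = length in bytes - 1. */
    uint32_t base;      /* GDT base address. */
} __attribute__((packed));
typedef struct gdt_register gdt_register_t;

/* Compilation fails if the compiler inserts any padding. */
static_assert(sizeof(gdt_entry_t) == 8, "GDT entry must be 8 bytes");
static_assert(sizeof(gdt_register_t) == 6, "GDTR image must be 48 bits");
```

Without the `packed` attribute, `struct gdt_register` would typically be padded to 8 bytes, and `lgdt` would read a wrong base address.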

GDT Loading Routines

The next step is to implement some initialization routines for setting up and loading the GDT. Add these @ src/memory/gdt.c:

/**
 * The GDT table. We maintain 5 entries for now:
 *   0. a null entry;
 *   1. kernel mode code segment;
 *   2. kernel mode data segment;
 *   3. user mode code segment;
 *   4. user mode data segment;
 */
static gdt_entry_t gdt[NUM_SEGMENTS];

/** GDTR address register. */
static gdt_register_t gdtr;


/**
 * Setup one GDT entry.
 * Here, BASE and LIMIT are in their complete version, ACCESS represents
 * the access byte, and FLAGS has the 4-bit granularity flags field in its
 * higher 4 bits.
 */
static void
gdt_set_entry(int idx, uint32_t base, uint32_t limit, uint8_t access,
              uint8_t flags)
{
    gdt[idx].limit_lo       =  (uint16_t) (limit & 0xFFFF);
    gdt[idx].base_lo        =  (uint16_t) (base & 0xFFFF);
    gdt[idx].base_mi        =  (uint8_t) ((base >> 16) & 0xFF);
    gdt[idx].access         =  (uint8_t) access;
    gdt[idx].limit_hi_flags =  (uint8_t) ((limit >> 16) & 0x0F);
    gdt[idx].limit_hi_flags |= (uint8_t) (flags & 0xF0);
    gdt[idx].base_hi        =  (uint8_t) ((base >> 24) & 0xFF);
}


/** Extern our load routine written in ASM `gdt-load.s`. */
extern void gdt_load(uint32_t gdtr_ptr, uint32_t data_selector_offset,
                                        uint32_t code_selector_offset);
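To see what the packing in `gdt_set_entry` produces, here is a standalone sketch that applies the same bit operations but returns the entry by value so it can be inspected. `gdt_pack_entry` is a hypothetical helper written for this illustration, not a function in Hux.

```c
#include <stdint.h>

struct gdt_entry {
    uint16_t limit_lo;
    uint16_t base_lo;
    uint8_t  base_mi;
    uint8_t  access;
    uint8_t  limit_hi_flags;
    uint8_t  base_hi;
} __attribute__((packed));

/* Hypothetical variant of gdt_set_entry that returns the packed entry. */
static inline struct gdt_entry
gdt_pack_entry(uint32_t base, uint32_t limit, uint8_t access, uint8_t flags)
{
    struct gdt_entry e;

    e.limit_lo       =  (uint16_t) (limit & 0xFFFF);
    e.base_lo        =  (uint16_t) (base & 0xFFFF);
    e.base_mi        =  (uint8_t) ((base >> 16) & 0xFF);
    e.access         =  access;
    e.limit_hi_flags =  (uint8_t) ((limit >> 16) & 0x0F);  /* low nibble  */
    e.limit_hi_flags |= (uint8_t) (flags & 0xF0);          /* high nibble */
    e.base_hi        =  (uint8_t) ((base >> 24) & 0xFF);

    return e;
}
```

For base 0, limit 0xFFFFF, access 0x9A, and flags 0xC0, the fields come out as `limit_lo = 0xFFFF`, `limit_hi_flags = 0xCF`, `access = 0x9A`, and all three base fields 0.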

The gdt_load function should be written as a separate piece of assembly code that executes the lgdt instruction, @ src/memory/gdt-load.s:

/**
 * Load the GDT.
 * 
 * The `gdt_load` function will receive a pointer to the `gdtr` value. We
 * will then load the `gdtr` value into GDTR register by the `lgdt`
 * instruction.
 *
 * Assumes flat memory model. The first argument is pointer to `gdtr`, the
 * second argument is the offset of data segment selector and the third
 * argument is the offset of code segment selector.
 */


.global gdt_load
.type gdt_load, @function
gdt_load:
    
    /** Get first two arguments. */
    movl 4(%esp), %edx
    movl 8(%esp), %eax

    /** Load the GDT. */
    lgdt (%edx)

    /** Load all data segment selectors. */
    movw %ax, %ds
    movw %ax, %es
    movw %ax, %fs
    movw %ax, %gs

    /** Load stack segment as well. */
    movw %ax, %ss

    
    /** Load code segment by doing a far jump. */
    pushl 12(%esp)  /** Push code segment selector (third argument). */
    pushl $.setcs   /** Push offset of the jump target. */
    ljmp *(%esp)    /** Far jump through the offset:selector pair on stack. */

.setcs:

    addl $8, %esp   /** Restore stack. */
    
    ret

Flat Model Segmentation

We will later be using paging as our main memory management scheme. With paging, we use a flat model in which all valid segments have a base address of 0x00000000 and a limit of 0xFFFFF. Because we set the page-granularity flag, the limit is counted in 4 KiB units, so each segment spans the whole 4 GiB address space ✭. Thus, we initialize all the following four segments:

  • Kernel mode code segment
  • Kernel mode data segment
  • User mode code segment
  • User mode data segment

all with base address of 0x00000000 and limit of 0xFFFFF.
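The reason a 20-bit limit covers 4 GiB is the granularity flag. The sketch below works through the arithmetic; `segment_span_bytes` and `GRANULE_4K` are hypothetical names used only for this illustration.

```c
#include <stdint.h>

#define GRANULE_4K 4096ull   /* unit size when the granularity flag is 1 */

/*
 * Number of bytes covered by a segment with the given 20-bit limit.
 * The limit is the highest valid unit index, hence the "+ 1".
 */
static inline uint64_t
segment_span_bytes(uint32_t limit, int page_granular)
{
    uint64_t units = (uint64_t) (limit & 0xFFFFF) + 1;

    return page_granular ? units * GRANULE_4K : units;
}
```

With limit 0xFFFFF and page granularity, the span is 2^20 units of 2^12 bytes, i.e. 2^32 bytes = 4 GiB. With byte granularity, the same limit would cover only 1 MiB.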

Implement our initialization function, which is exposed to the main function, @ src/memory/gdt.c:

/**
 * Initialize the global descriptor table (GDT) by setting up the 5 entries
 * of GDT, setting the GDTR register to point to our GDT address, and then
 * (through assembly `lgdt` instruction) load our GDT.
 */
void
gdt_init()
{
    /**
     * First, see https://wiki.osdev.org/Global_Descriptor_Table for a
     * detailed anatomy of Access Byte and Flags fields.
     * 
     * Access Byte -
     *   - Pr    = 1: present, must be 1 for valid selectors
     *   - Privl = ?: ring level, 0 for kernel and 3 for user mode
     *   - S     = 1: should be 1 for all non-system segments
     *   - Ex    = ?: executable, 1 for code and 0 for data segment
     *   - DC    =
     *     - Direction bit for data selectors,  = 0: segment spans up
     *     - Conforming bit for code selectors, = 0: can only be executed
     *                                               from ring level set
     *                                               in `Privl` field
     *   - RW    =
     *     - Readable bit for code selectors, = 1: allow reading
     *     - Writable bit for data selectors, = 1: allow writing
     *   - Ac    = 0: accessed bit, CPU sets it to 1 when accessing it
     *   Hence, the four values used below.
     * 
     * Flags -
     *   - Gr = 1: using page granularity
     *   - Sz = 1: in 32-bit protected mode
     *   Hence, 0b1100 -> 0xC for all these four segments. 
     */
    gdt_set_entry(SEGMENT_UNUSED, 0u, 0u, 0u, 0u);  /** 0-th entry unused. */
    gdt_set_entry(SEGMENT_KCODE, 0u, 0xFFFFF, 0x9A, 0xC0);
    gdt_set_entry(SEGMENT_KDATA, 0u, 0xFFFFF, 0x92, 0xC0);
    gdt_set_entry(SEGMENT_UCODE, 0u, 0xFFFFF, 0xFA, 0xC0);
    gdt_set_entry(SEGMENT_UDATA, 0u, 0xFFFFF, 0xF2, 0xC0);

    /** Setup the GDTR register value. */
    gdtr.boundary = (sizeof(gdt_entry_t) * NUM_SEGMENTS) - 1;
    gdtr.base     = (uint32_t) &gdt;

    /**
     * Load the GDT.
     * Passing pointer to `gdtr` as unsigned integer. Each GDT entry takes 8
     * bytes, therefore kernel data selector is at 0x10 and kernel code
     * selector is at 0x08.
     */
    gdt_load((uint32_t) &gdtr, SEGMENT_KDATA << 3, SEGMENT_KCODE << 3);
}

Read the comments for explanation of the access byte and flags values we are using.
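To make the four access byte values less magical, the sketch below assembles them from the individual bits listed in the comments. `gdt_access_byte` is a hypothetical helper for illustration; it hard-codes the choices made in this chapter (Pr = 1, S = 1, DC = 0, RW = 1, Ac = 0) and varies only the privilege level and the executable bit.

```c
#include <stdint.h>

/*
 * Hypothetical helper: build an Access Byte for a flat code or data
 * segment, using the bit positions from the comment in gdt_init().
 */
static inline uint8_t
gdt_access_byte(unsigned privl, unsigned executable)
{
    return (uint8_t) ((1u << 7)                  /* Pr    = 1           */
                    | ((privl & 0x3u) << 5)      /* Privl = ring level  */
                    | (1u << 4)                  /* S     = 1           */
                    | ((executable & 0x1u) << 3) /* Ex    = code / data */
                    | (0u << 2)                  /* DC    = 0           */
                    | (1u << 1)                  /* RW    = 1           */
                    | (0u << 0));                /* Ac    = 0           */
}
```

Ring 0 code gives 0x9A, ring 0 data gives 0x92, ring 3 code gives 0xFA, and ring 3 data gives 0xF2, matching the values passed to `gdt_set_entry`.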

I mentioned paging a lot in this chapter, since we will be using paging as our main memory management scheme and the GDT serves as the starting point of memory management in protected mode. We will come back to virtual memory & paging in later chapters.

Progress So Far

In summary, we learned a bit of shallow theory on x86 protected mode, x86 segmentation, and the global descriptor table (GDT). We implemented initialization routines to set up and load the GDT.

Current repo structure:

hux-kernel
├── Makefile
├── scripts
│   ├── gdb_init
│   ├── grub.cfg
│   └── kernel.ld
├── src
│   ├── boot
│   │   ├── boot.s
│   │   ├── elf.h
│   │   └── multiboot.h
│   ├── common
│   │   ├── debug.c
│   │   ├── debug.h
│   │   ├── port.c
│   │   ├── port.h
│   │   ├── printf.c
│   │   ├── printf.h
│   │   ├── string.c
│   │   ├── string.h
│   │   ├── types.c
│   │   └── types.h
│   ├── display
│   │   ├── terminal.c
│   │   ├── terminal.h
│   │   └── vga.h
│   ├── memory
│   │   ├── gdt.c
│   │   ├── gdt.h
│   │   └── gdt-load.s
│   └── kernel.c