People say there are things that are complex and there are things that are just complicated. Complexity is considered interesting, complicatedness is considered harmful. The process of setting up an x86_64 CPU is mostly complicated.
I’ll describe one way to go from a boot sector loaded by the BIOS with the CPU in 16-bit real mode to the CPU set up in 64-bit long mode. The setup is pretty bare-bones and there’s tons more to do.
I was surprised by how readable some of the Intel manual is. The initial chapters in volume 1 do a really good job at providing an overview of the system and explaining the terms used throughout the other volumes. But volume 3: System Programming Guide is most relevant to this discussion. There is an overview of all the operating modes in volume 3, section 2.2 Modes of Operation. The path we’re taking is highlighted in red.
Starting Point: BIOS
After a reset, the x86 CPU is in “real mode”. That mode has a default operand size of 16 bits. You get a 20-bit address space and thus the ability to address 1MB of memory by using segmentation. Real mode is pretty much a backward compatibility mode for the Intel 8086 chip from 1978.
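To make the 20-bit figure concrete, here is a small C sketch of how real mode forms an address: the 16-bit segment is shifted left by four bits and added to the 16-bit offset. The function name `real_mode_linear` is made up for illustration, and the wrap-around at 1 MB assumes the A20 line is disabled, as on the original 8086.

```c
#include <stdint.h>

/* Hypothetical helper: linear address of segment:offset in real mode.
 * Masking to 20 bits models an 8086 (or a PC with the A20 line disabled). */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return (((uint32_t)segment << 4) + offset) & 0xFFFFFu;
}
```

Both 0x0000:0x7c00 and 0x07c0:0x0000 name the same byte at linear address 0x7c00, which is why boot sectors can be written against either convention.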
After the BIOS has finished, the first code that runs is the code in the boot sector. The BIOS searches the system for a disk whose first sector ends in the magic number 0xaa55 (i.e., the byte 0x55 followed by the byte 0xaa). It loads that “boot sector” into memory at address 0x7c00.
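The byte order of the magic number is easy to get wrong, so here is a minimal C sketch of the check described above. The function name `has_boot_signature` is hypothetical; the sector size of 512 bytes is the one the BIOS hands us.

```c
#include <stdbool.h>
#include <stdint.h>

#define SECTOR_SIZE 512

/* Returns true if the sector ends in the bytes 0x55, 0xaa,
 * i.e. the little-endian 16-bit value 0xaa55. */
bool has_boot_signature(const uint8_t sector[SECTOR_SIZE])
{
    return sector[SECTOR_SIZE - 2] == 0x55 && sector[SECTOR_SIZE - 1] == 0xaa;
}
```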
So the BIOS gives us 512 bytes to work with. We need to use these bytes in order to bootstrap the rest of the bootloader. One can fit a surprising amount of stuff in 512 bytes, but it’s easiest to just load some more data from disk first. Fortunately, routines defined by the BIOS remain available to us as long as we’re in real mode.
Boot Sector Setup
Let’s set up a simple boot sector. It will just print a message to the screen using BIOS routines and then hang. This way, we know that the tooling works.
;; Uses the BIOS to print a null-terminated string. The address of the
;; string is found in the bx register.
print_string:
    pusha
    mov ah, 0x0e ; BIOS "display character" function
The linker script linker.ld is important because it makes sure that the code in our boot sector is relocated to the right address in the final image. Specifically, the BIOS loads the boot sector to address 0x7c00 in memory, so that’s the base address to relocate the boot sector to. In addition, the linker will add the magic number at the end of the boot sector. Other guides I’ve seen set both the offset and the magic number inside the boot sector assembly source file by using features of the assembler, but that’s somewhat hackish.
Running make boot should result in a QEMU window and the “Hello, World!” message should be displayed.
Stage 1 – Loading Stage 2 From Disk
We can split the bootloader into two stages. Stage 1 is the code in the boot sector. It is everything that the BIOS loads for us. The sole purpose of stage 1 is to load stage 2 into memory, which it does by using BIOS-provided routines.
In stage 2, we’ll switch from 16-bit real mode to 32-bit protected mode. In protected mode, we can’t use BIOS routines anymore. Without BIOS routines, loading sectors from a disk would become much more involved. So we’ll load a number of sectors from disk into memory and hope for the best. Of course, this is an unsafe technique, but it works for now.
This is how one can access the disk using BIOS. There’s an osdev.org page on this.
disk_address_packet:
    db 0x10 ; Size of packet
    db 0 ; Reserved, always 0
dap_sectors_num:
    dw READ_SECTORS_NUM ; Number of sectors read
    dd (BOOT_LOAD_ADDR + SECTOR_SIZE) ; Destination address
    dq 1 ; Sector to start at (0 is the boot sector)
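For readers who think more easily in structs, the same packet can be sketched in C. This is only an illustration of the layout: the struct and the `make_stage2_dap` constructor are hypothetical names, BOOT_LOAD_ADDR is assumed to be 0x7c00 and SECTOR_SIZE 512, and the 32-bit destination is split into its offset and segment halves, which is how the BIOS interprets it.

```c
#include <stddef.h>
#include <stdint.h>

#define BOOT_LOAD_ADDR 0x7c00u /* assumed: where the BIOS puts the boot sector */
#define SECTOR_SIZE    512u

/* Mirrors the db/db/dw/dd/dq layout of disk_address_packet. The dd
 * destination is split into its offset and segment halves here. */
struct disk_address_packet {
    uint8_t  size;         /* always 0x10 */
    uint8_t  reserved;     /* always 0 */
    uint16_t sectors;      /* number of sectors to read */
    uint16_t dest_offset;  /* low word of the dd */
    uint16_t dest_segment; /* high word of the dd */
    uint64_t start_lba;    /* 0 is the boot sector */
};

_Static_assert(sizeof(struct disk_address_packet) == 0x10,
               "the packet must be 16 bytes");

/* Hypothetical constructor: read `sectors` sectors that follow the boot
 * sector and place them directly behind it in memory. */
struct disk_address_packet make_stage2_dap(uint16_t sectors)
{
    struct disk_address_packet dap = {
        .size = sizeof dap,
        .reserved = 0,
        .sectors = sectors,
        .dest_offset = BOOT_LOAD_ADDR + SECTOR_SIZE,
        .dest_segment = 0,
        .start_lba = 1,
    };
    return dap;
}
```

With these assumptions stage 2 lands at 0x7e00, immediately after the 512 bytes of the boot sector.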
I just copied the print_string function so we can test if the jump works. Because this specific function only works with BIOS in real mode, it won’t be of any use to stage 2 once we have switched to protected mode.
32-bit Protected Mode
Next, we’ll switch the CPU from real mode (16-bit) to protected mode (32-bit). In protected mode, segmentation is used by default to implement memory protection. Before switching to protected mode, you need to define a Global Descriptor Table (GDT) that contains segment descriptors for all the segments you want to define. Usually, paging is used in favor of segmentation. In fact, in 64-bit long mode, you need to use paging. But for the initial switch to protected mode, segmentation is required.
The Intel manual describes the “flat model” as a very simple segmentation model that can be implemented in the GDT. The “flat model” comprises a code segment and a data segment. Both of these segments are mapped to the entire linear address space (their base addresses and limits are identical). Using the simplest of all models is fine, since we just want to get to long mode and abandon segmentation in favor of paging.
The GDT is defined as a contiguous structure in memory. You fill a chunk of memory with the right data and give the CPU the address and the length of the memory chunk. The format of the GDT structure is described in the Intel manual.
From section “3.4.5 Segment Descriptors”:
The GDT is just an array of segment descriptors with a “null descriptor” at the start that’s used to catch invalid translations. The fields in the segment descriptor are described in detail in section “3.4.5 Segment Descriptors” of volume 3 of the Intel manual.
We define the GDT like this:
;; include/gdt32.s
;; Base address of GDT should be aligned on an eight-byte boundary
align 8
gdt32_start:
    ;; 8-byte null descriptor (index 0).
    ;; Used to catch translations with a null selector.
    dd 0x0
    dd 0x0
gdt32_code_segment:
    ;; 8-byte code segment descriptor (index 1).
    ;; First 16 bits of segment limit
    dw 0xffff
    ;; First 24 bits of segment base address
    dw 0x0000
    db 0x00
    ;; 0-3: segment type that specifies an execute/read code segment
    ;; 4: descriptor type flag indicating that this is a code/data segment
    ;; 5-6: Descriptor privilege level 0 (most privileged)
    ;; 7: Segment present flag set indicating that the segment is present
    db 10011010b
    ;; 0-3: last 4 bits of segment limit
    ;; 4: unused (available for use by system software)
    ;; 5: 64-bit code segment flag cleared, so the segment doesn't contain 64-bit code
    ;; 6: default operation size of 32 bits
    ;; 7: granularity of 4 kilobyte units
    db 11001111b
    ;; Last 8 bits of segment base address
    db 0x00
gdt32_data_segment:
    ;; Only differences are explained ...
    dw 0xffff
    dw 0x0000
    db 0x00
    ;; 0-3: segment type that specifies a read/write data segment
    db 10010010b
    db 11001111b
    ;; Last 8 bits of segment base address (a single byte, so the descriptor stays 8 bytes long)
    db 0x00
gdt32_end:
;; Value for GDTR register that describes the above GDT
gdt32_pseudo_descriptor:
    ;; A limit value of 0 results in one valid byte. So, the limit value of our
    ;; GDT is its length in bytes minus 1.
    dw gdt32_end - gdt32_start - 1
    ;; Start address of the GDT
    dd gdt32_start
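The individual bytes in the listing are easier to check when the descriptor is assembled from its fields. The following C sketch packs base, limit, access byte, and flag nibble into the 8-byte layout from section 3.4.5. The function names are made up for this example and are not part of the bootloader.

```c
#include <stdint.h>

/* Packs the fields of an 8-byte segment descriptor into one
 * little-endian 64-bit value. */
uint64_t gdt_descriptor(uint32_t base, uint32_t limit, uint8_t access, uint8_t flags)
{
    uint64_t d = 0;
    d |= (uint64_t)(limit & 0xffffu);            /* limit bits 0-15  */
    d |= (uint64_t)(base & 0xffffffu) << 16;     /* base bits 0-23   */
    d |= (uint64_t)access << 40;                 /* type, S, DPL, P  */
    d |= (uint64_t)((limit >> 16) & 0xfu) << 48; /* limit bits 16-19 */
    d |= (uint64_t)(flags & 0xfu) << 52;         /* AVL, L, D/B, G   */
    d |= (uint64_t)((base >> 24) & 0xffu) << 56; /* base bits 24-31  */
    return d;
}

/* GDTR limit: table length in bytes minus 1 */
uint16_t gdt_limit(unsigned descriptor_count)
{
    return (uint16_t)(descriptor_count * 8u - 1u);
}
```

Calling `gdt_descriptor(0, 0xfffff, 0x9a, 0xc)` reproduces the code segment above, and `gdt_limit(3)` gives the limit for the three descriptors (null, code, data).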
Switching to protected mode is very easy now. We load the GDT pseudo-descriptor into the GDTR register so that the base address and length of our GDT are known to the system, and then set the PE bit in cr0. Lastly, we do a far jump to flush the instruction pipeline and load the cs register with the new code segment.
    ;; Load GDT and switch to protected mode
    cli ; Can't have interrupts during the switch
    lgdt [gdt32_pseudo_descriptor]
    ;; Setting cr0.PE (bit 0) enables protected mode
    mov eax, cr0
    or eax, 1
    mov cr0, eax
    ;; The far jump into the code segment from the new GDT flushes
    ;; the CPU pipeline removing any 16-bit decoded instructions
    ;; and updates the cs register with the new code segment.
    jmp CODE_SEG32:start_prot_mode
[bits 32]
start_prot_mode:
    ;; Old segments are now meaningless
    mov ax, DATA_SEG32
    mov ds, ax
    mov ss, ax
    mov es, ax
    mov fs, ax
    mov gs, ax
;; ...
%include "include/gdt32.s"
Interrupts are disabled during the switch. After the entire setup is complete, interrupts can be enabled again. This would require extra setup work.
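The definitions of CODE_SEG32 and DATA_SEG32 are not shown in the listing. They are segment selectors, and assuming they refer to descriptors 1 and 2 of the GDT at privilege level 0, their values can be derived as in this C sketch (`segment_selector` is a hypothetical helper).

```c
#include <stdint.h>

/* Builds a segment selector: bits 3-15 hold the descriptor index, bit 2 is
 * the table indicator (0 = GDT, 1 = LDT), bits 0-1 are the requested
 * privilege level. */
uint16_t segment_selector(uint16_t index, unsigned use_ldt, unsigned rpl)
{
    return (uint16_t)(((unsigned)index << 3) | ((use_ldt & 1u) << 2) | (rpl & 3u));
}
```

Because every descriptor is 8 bytes long, a ring 0 GDT selector is numerically equal to the byte offset of its descriptor, which is why such constants are often computed as a label difference like gdt32_code_segment - gdt32_start.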
Now that we’re in protected mode, we can’t use the BIOS routines anymore. To print text, we can write straight to the VGA buffer instead.
;; src/stage2.s
;; ...
;; Writes a null-terminated string straight to the VGA buffer.
;; The address of the string is found in the bx register.
print_string32:
    pusha
Best print something so that we know the switch worked. Note the message in the top left corner of the screenshot.
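Writing to the VGA buffer comes down to two small calculations, sketched here in C. The sketch assumes the BIOS left the card in the default 80x25 color text mode, where the buffer starts at 0xb8000 and each cell takes two bytes. The helper names are invented for this example.

```c
#include <stdint.h>

#define VGA_TEXT_BASE 0xb8000u /* start of the VGA text buffer */
#define VGA_COLS      80u      /* assumes the default 80x25 text mode */

/* One screen cell: character in the low byte, color attribute in the high byte */
uint16_t vga_cell(char ch, uint8_t attr)
{
    return (uint16_t)((uint16_t)attr << 8 | (uint8_t)ch);
}

/* Physical address of the cell at (row, col) */
uint32_t vga_cell_addr(unsigned row, unsigned col)
{
    return VGA_TEXT_BASE + 2u * (row * VGA_COLS + col);
}
```

A message in the top left corner starts at `vga_cell_addr(0, 0)`, and an attribute of 0x0f gives white text on a black background.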
64-bit Long Mode
For this part, refer to “10.8.5 Initializing IA-32e Mode”. Note that Intel calls the 64-bit mode “IA-32e” while AMD refers to it as “long mode” in the AMD64 manual.
Before switching to long mode, the CPU must be in protected mode and paging must be enabled. We have protected mode now, but we are missing paging.
I love paging. It’s just very cool. But I’d do a poor job at explaining the concept itself. Philipp Oppermann’s Introduction to Paging from the “Writing an OS in Rust” blog was really useful for me personally. OSTEP also talks about paging starting chapter 18, although it doesn’t go into the specifics of paging on x86 like Philipp Oppermann’s post does.
In long mode with Physical Address Extension enabled (PAE, we’ll do that below), a four-level page table is used. The code below generates such a page table at a given address.
;; src/stage2.s
;; Builds a 4 level page table starting at the address that's passed in ebx.
build_page_table:
    pusha
    PAGE64_PAGE_SIZE equ 0x1000
    PAGE64_TAB_SIZE equ 0x1000
    PAGE64_TAB_ENT_NUM equ 512
    ;; Initialize all four tables to 0. If the present flag is cleared, all other bits in any
    ;; entry are ignored. So by filling all entries with zeros, they are all "not present".
    ;; Each repetition zeros four bytes at once. That's why a number of repetitions equal to
    ;; the size of a single page table is enough to zero all four tables.
    mov ecx, PAGE64_TAB_SIZE ; ecx stores the number of repetitions
    mov edi, ebx ; edi stores the base address
    xor eax, eax ; eax stores the value
    rep stosd
    ;; Link first entry in PML4 table to the PDP table
    mov edi, ebx
    lea eax, [edi + (PAGE64_TAB_SIZE | 11b)] ; Set read/write and present flags
    mov dword [edi], eax
    ;; Link first entry in PDP table to the PD table
    add edi, PAGE64_TAB_SIZE
    add eax, PAGE64_TAB_SIZE
    mov dword [edi], eax
    ;; Link the first entry in the PD table to the page table
    add edi, PAGE64_TAB_SIZE
    add eax, PAGE64_TAB_SIZE
    mov dword [edi], eax
    ;; Initialize only a single page on the lowest (page table) layer in
    ;; the four level page table.
    add edi, PAGE64_TAB_SIZE
    mov ebx, 11b
    mov ecx, PAGE64_TAB_ENT_NUM
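Since only the first entry of each upper-level table is linked, it helps to see which addresses end up behind those entries. This C sketch splits a virtual address into the four 9-bit table indices and the 12-bit page offset used with 4 KiB pages. The struct and function names are hypothetical. The listing above is cut off, so the 2 MiB figure assumes that all PAGE64_TAB_ENT_NUM entries of the single page table get filled.

```c
#include <stdint.h>

struct page_indices {
    unsigned pml4, pdpt, pd, pt; /* 9-bit indices, one per level */
    unsigned offset;             /* 12-bit offset into the 4 KiB page */
};

/* Splits a virtual address the way the CPU walks a four-level page table */
struct page_indices split_virtual_address(uint64_t vaddr)
{
    struct page_indices ix = {
        .pml4   = (unsigned)((vaddr >> 39) & 0x1ff),
        .pdpt   = (unsigned)((vaddr >> 30) & 0x1ff),
        .pd     = (unsigned)((vaddr >> 21) & 0x1ff),
        .pt     = (unsigned)((vaddr >> 12) & 0x1ff),
        .offset = (unsigned)(vaddr & 0xfff),
    };
    return ix;
}

/* Bytes covered by one fully populated page table: 512 entries of 4 KiB */
uint64_t bytes_mapped_by_one_table(void)
{
    return 512ull * 0x1000ull;
}
```

Every address below 2 MiB has index 0 in the three upper levels, so the boot code at 0x7c00 and the VGA buffer at 0xb8000 are both reachable through the single chain of tables built above.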
Paging supersedes segmentation for managing virtual address spaces, permissions, etc. A Global Descriptor Table with segment descriptors is still needed though, and the segment descriptors must be modified slightly to enable long mode-specific features.
This is another GDT that also implements the flat model. It’s almost identical to the GDT for protected mode. Just two bits were changed.
gdt64_code_segment:
    dw 0xffff
    dw 0x0000
    db 0x00
    db 10011010b
    ;; 5: 64-bit code segment flag indicates that this segment contains 64-bit code
    ;; 6: must be zero if L bit (bit 5) is set
    db 10101111b
    db 0x00
gdt64_data_segment:
    dw 0xffff
    dw 0x0000
    db 0x00
    ;; 0-3: segment type that specifies a read/write data segment
    db 10010010b
    db 10101111b
    ;; Last 8 bits of segment base address (a single byte, so the descriptor stays 8 bytes long)
    db 0x00
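To see which two bits differ between the protected mode and long mode descriptors, the flag bytes can be compared with an XOR. This is a small C sketch with made-up names; the bit positions are the ones given in the comments of the GDT listings.

```c
#include <stdint.h>

#define GDT_FLAG_L  (1u << 5) /* 64-bit code segment */
#define GDT_FLAG_DB (1u << 6) /* default operation size (32-bit when set) */

/* Returns the bits that differ between two descriptor flag bytes */
uint8_t changed_bits(uint8_t before, uint8_t after)
{
    return (uint8_t)(before ^ after);
}
```

Comparing 11001111b (0xcf) with 10101111b (0xaf) yields exactly the L and D/B bits: L is switched on and D/B is switched off.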
Setting up an x86 CPU in 64-bit mode
https://thasso.xyz/2024/07/13/setting-up-an-x86-cpu.html
Thassilo Schulze
13 Jul 2024
To follow along, you need the Intel 64 and IA-32 Architectures Software Developer’s Manual, an assembler (I used nasm), and QEMU. If you don’t have an x86_64 CPU, you should still be able to run everything I describe by emulating an x86 CPU in QEMU. I assume you know x86 assembly and the syntax that nasm uses. I like the nasm tutorial by Ray Toal for getting started.
For everything up to 32-bit mode, take a look at “Writing a Simple Operating System – from Scratch”. It’s unfinished but still very good.
The boot sector setup needs three files: the assembly source, a Makefile, and the linker script linker.ld.
The disk address packet data goes at the end of boot_sector.s. Lastly, we need a stage 2 to jump to, and we need to update the linker script. The Makefile remains unchanged.
With the page table and the GDT in place, the switch from protected mode to long mode can be performed.
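The order of the steps matters here. As a rough illustration, this C sketch models the registers involved as plain struct fields and applies the steps in the order given in the Intel manual section referenced above. It is a toy model, not hardware access: the struct, the function names, and the PML4 address used in any call are assumptions for this example, and real code would use mov to the control registers and rdmsr/wrmsr for EFER.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_PE   (1u << 0)   /* protected mode enable */
#define CR0_PG   (1u << 31)  /* paging */
#define CR4_PAE  (1u << 5)   /* physical address extension */
#define EFER_MSR 0xC0000080u /* number of the IA32_EFER MSR */
#define EFER_LME (1u << 8)   /* long mode enable */

/* Toy model of the registers involved; not real hardware access */
struct cpu_regs {
    uint32_t cr0, cr4;
    uint64_t cr3, efer;
};

/* Applies the steps in order, starting from protected mode
 * with paging disabled. */
void enter_long_mode(struct cpu_regs *r, uint64_t pml4_phys)
{
    r->cr4  |= CR4_PAE;   /* 1. enable PAE */
    r->cr3   = pml4_phys; /* 2. point cr3 at the PML4 table */
    r->efer |= EFER_LME;  /* 3. set LME in the EFER MSR */
    r->cr0  |= CR0_PG;    /* 4. enable paging, which activates long mode */
}

/* Long mode is active once all four bits are set */
bool long_mode_active(const struct cpu_regs *r)
{
    return (r->cr0 & CR0_PE) && (r->cr0 & CR0_PG) &&
           (r->cr4 & CR4_PAE) && (r->efer & EFER_LME);
}
```

After the last step the CPU is still executing 32-bit code in compatibility mode. A far jump into the code segment of the 64-bit GDT is what finally switches to 64-bit code, much like the far jump used when entering protected mode.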
Again, the “success message” should show up in the top left corner. Write a small VGA driver if this annoys you.
Using C
C code can easily be integrated into this setup. E.g., this might become an OS kernel.
This requires updating src/stage2.s and the linker script. Lastly, the Makefile needs to change.
Cool if you actually came along this far. The code is on GitHub.
via thasso.xyz
July 15, 2024 at 03:12PM