https://notebooklm.google.com/notebook/508f5265-b7ac-4d56-9345-03ae517eaa1c
This document explains how to write, assemble, and run a "Hello, World!" program in x86-64 assembly on a Linux system.
section .data
hello db 'Hello, World!', 0ah
section .text
global _start
_start:
; write(1, hello, 13)
mov rax, 1
mov rdi, 1
mov rsi, hello
mov rdx, 13
syscall
; exit(0)
mov rax, 60
xor rdi, rdi
syscall
- `section .data`: This section is for declaring initialized data.
  - `hello db 'Hello, World!', 0ah`: This defines a byte string named `hello` containing "Hello, World!" and a newline character (`0ah`).
- `section .text`: This section contains the program's code.
  - `global _start`: This makes the `_start` label visible to the linker, marking it as the program's entry point.
  - `_start:`: This is where the program begins execution.
- Printing "Hello, World!": This part of the code uses a system call to write the string to the console.
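  - `mov rax, 1`: Sets `rax` to 1, the syscall number for `write`.
  - `mov rdi, 1`: Sets `rdi` to 1, the file descriptor for `stdout`.
  - `mov rsi, hello`: Sets `rsi` to the memory address of our `hello` string.
  - `mov rdx, 13`: Sets `rdx` to 13, the number of bytes to write. That covers the 13 characters of "Hello, World!" but not the trailing newline byte; pass 14 to print the newline as well.
  - `syscall`: Executes the `write` system call.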
- Exiting the program: This part of the code uses a system call to terminate the program.
  - `mov rax, 60`: Sets `rax` to 60, the syscall number for `exit`.
  - `xor rdi, rdi`: Sets `rdi` to 0 (the exit code) by XORing it with itself.
  - `syscall`: Executes the `exit` system call.
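As a variation on the program above, the byte count can be computed by the assembler instead of being typed by hand. This is only a sketch: the constant name `hello_len` is made up for illustration, and `$` stands for the assembler's current position, so `$ - hello` is the number of bytes defined since the `hello` label.

```nasm
section .data
hello       db 'Hello, World!', 0ah   ; 13 characters plus one newline byte
hello_len   equ $ - hello             ; hypothetical name; evaluates to 14

section .text
global _start
_start:
    ; write(1, hello, hello_len)
    mov rax, 1              ; syscall number for write
    mov rdi, 1              ; file descriptor 1 (stdout)
    mov rsi, hello          ; address of the string
    mov rdx, hello_len      ; byte count, newline included
    syscall

    ; exit(0)
    mov rax, 60             ; syscall number for exit
    xor rdi, rdi            ; exit code 0
    syscall
```

With the length derived from the data itself, editing the string does not require updating a separate number.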
To run the assembly program, you need to assemble it into an object file and then link it to create an executable.
- Assemble with `nasm`: `nasm -f elf64 hello.asm -o hello.o`
- Link with `ld`: `ld hello.o -o hello`
- Run the executable: `./hello`
- `nasm -f elf64 hello.asm -o hello.o`: This command assembles the `hello.asm` file.
  - `nasm`: The Netwide Assembler.
  - `-f elf64`: Specifies the output format as ELF64, the standard for 64-bit Linux executables.
  - `hello.asm`: The input assembly file.
  - `-o hello.o`: Specifies the output object file name.
- `ld hello.o -o hello`: This command links the `hello.o` object file.
  - `ld`: The GNU linker.
  - `hello.o`: The input object file.
  - `-o hello`: Specifies the output executable file name.
- `./hello`: This command executes the `hello` program.
Hey! Here is a simple explanation for how the order of lines in the .data section of an assembly file can change the program's output. It all comes down to how the assembler reads the code.
Think of the assembler as a program that reads your .asm file one line at a time, from top to bottom. As it reads, it calculates memory addresses and defines values.
The most important symbol for this explanation is the dollar sign: $
$: This special symbol represents the current memory address where the assembler is working. Its value changes as the assembler moves down the file and allocates space for data.
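To make the moving value of `$` concrete, here is a small data-only sketch. The labels (`first`, `second`, `third`, `here1` and so on) are invented for illustration, and the offsets in the comments are counted from `first`, assuming it is the first item in the section.

```nasm
section .data
first   db 'AB'             ; 2 bytes, at offsets 0 and 1
here1   equ $ - first       ; $ is now at offset 2, so here1 = 2
second  dw 0x1234           ; one 2-byte word, at offsets 2 and 3
here2   equ $ - first       ; $ is now at offset 4, so here2 = 4
third   times 3 db 0        ; 3 zero bytes, at offsets 4 to 6
here3   equ $ - first       ; $ is now at offset 7, so here3 = 7
```

The three `equ` lines use the same expression, yet each one yields a different constant because of where it sits in the file. An `equ` line emits no bytes itself, so it does not move `$`.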
Here is the simple program we are looking at. It's designed to print a message to the screen.
section .text
global _start
_start:
mov edx, len ; The length of the message
mov ecx, msg ; The message to write
mov ebx, 1 ; File descriptor (1 for stdout)
mov eax, 4 ; System call number (sys_write)
int 0x80 ; Call the kernel
xor ebx, ebx ; Exit status 0 (ebx still holds 1 from the write call)
mov eax, 1 ; System call number (sys_exit)
int 0x80 ; Call the kernel
section .data
msg db "12345", 0xa
; The next lines are what we will change
The key instruction is `len equ $ - msg`. This tells the assembler: "Create a constant called `len` and set its value to the current address (`$`) minus the starting address of `msg`."
In this case, we calculate the length immediately after the first message.
section .data
msg db "12345", 0xa
; --- The important part ---
len equ $ - msg
msg2 times 5 db "6", 0xa
- The assembler reads `msg db "12345", 0xa` and allocates 6 bytes of memory.
- On the very next line, it sees `len equ $ - msg`. At this exact moment, the `$` symbol points to the memory address right after the `msg` string.
- The assembler calculates `len` as (address after `msg`) - (address of `msg`), which equals 6.
- Even though `msg2` is defined on the next line, it doesn't matter because the value of `len` has already been set in stone.
- The program runs and prints 6 bytes starting from `msg`.
12345
In this case, we move the length calculation to the very end.
section .data
msg db "12345", 0xa
msg2 times 5 db "6", 0xa
; --- The important part ---
len equ $ - msg
- The assembler reads `msg db "12345", 0xa` (6 bytes).
- It then continues and reads `msg2 times 5 db "6", 0xa` (10 bytes).
- Only after processing both `msg` and `msg2` does it see `len equ $ - msg`. Now, the `$` symbol points to the memory address after all the `msg2` data.
- The assembler calculates `len` as (address after `msg2`) - (address of `msg`), which is the total combined length: 16 bytes.
- The program runs and prints 16 bytes starting from `msg`. Since `msg2` is located right after `msg` in memory, it prints all of `msg` and then continues, printing all of `msg2` as well.
12345
6
6
6
6
6
You understood this perfectly. The key takeaway is that the assembler is not magic; it's a sequential program. The value of $ is dynamic during assembly time, and the position of the len equ $ - msg line is what determines the final, constant value of len that gets baked into your executable. Your analysis of both cases was spot on.
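One way to avoid depending on line order is to give each string its own length constant, placed directly after the data it measures. The sketch below follows the same 32-bit `int 0x80` style as the program above. The name `len2` is invented for illustration, and the build is assumed to be 32-bit (for example `nasm -f elf32` followed by `ld -m elf_i386`).

```nasm
section .data
msg     db "12345", 0xa
len     equ $ - msg         ; 6: measured right after msg
msg2    times 5 db "6", 0xa
len2    equ $ - msg2        ; 10: measured right after msg2 (hypothetical name)

section .text
global _start
_start:
    mov edx, len            ; length of the first message
    mov ecx, msg            ; address of the first message
    mov ebx, 1              ; file descriptor (1 for stdout)
    mov eax, 4              ; system call number (sys_write)
    int 0x80

    mov edx, len2           ; length of the second message
    mov ecx, msg2           ; address of the second message
    mov ebx, 1              ; file descriptor (1 for stdout)
    mov eax, 4              ; system call number (sys_write)
    int 0x80

    xor ebx, ebx            ; exit status 0
    mov eax, 1              ; system call number (sys_exit)
    int 0x80
```

Each constant is anchored to its own label, so adding more data after `msg2` leaves both lengths unchanged.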
That's a great question. You've touched on the other half of data definition.
Here’s the key difference:
- `DB`, `DW`, `DD`, etc. Define Data. They allocate space and put a specific value into it.
- `RESB`, `RESW`, `RESD`, etc. Reserve Space. They only allocate empty space. They are for creating uninitialized variables that your program will fill in later.
Think of it like this:
- `DB`: `my_message: db 'Hello!'` ➡ Puts the bytes for 'H', 'e', 'l', 'l', 'o', '!' into the file.
- `RESB`: `user_input: resb 100` ➡ Skips 100 bytes of space, leaving an empty buffer. Your code can later read keyboard input and store it at the `user_input` address.
The correct way to use these RESB directives is in a special section called .bss.
Your binary file has different sections:
- `.text`: For your code (like `mov`, `jmp`).
- `.data`: For initialized data (like `db 'Hello'`). This data is saved inside your executable file.
- `.bss`: For uninitialized data (like `resb 100`). This data is not saved in the file. It's just a note that says, "When you load this program, please set aside 100 bytes of empty memory for it." This keeps your executable file small.
Here is how you would use it in a program.
section .data
; --- Initialized data ---
prompt_msg: db 'What is your name? ', 0
section .bss
; --- Reserved (uninitialized) space ---
; Reserve 100 bytes of space to store the user's name
user_name: resb 100
; Reserve 1 word (2 bytes) to store the user's age
user_age: resw 1
; Reserve 1 doubleword (4 bytes) for a counter
login_attempts: resd 1
section .text
global _start
_start:
; ... your code would go here ...
; For example, you might:
; 1. Call a function to print 'prompt_msg'
; 2. Call a function to read keyboard input
; 3. Store the result at the 'user_name' address
; ... more code ...
In this example:
- `prompt_msg` is created using `db` because we know its value ("What is your name?") from the start.
- `user_name` is created using `resb 100` because we don't know the user's name yet. We are just saving an empty 100-byte buffer for it.
- `user_age` is created using `resw 1` to hold a 2-byte number that we'll get from the user later.
- `login_attempts` is created using `resd 1` to hold a 4-byte number that our code will increment or reset.
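The placeholder steps in the skeleton could be filled in roughly as follows. This is a minimal sketch, assuming 64-bit Linux and the `syscall` numbers used earlier (`write` is 1, `exit` is 60) plus `read`, which is 0. The constant `prompt_len` and the local label `.done` are invented for illustration, and the trailing `0` byte is left off the prompt because `write` takes an explicit length.

```nasm
section .data
prompt_msg  db 'What is your name? '
prompt_len  equ $ - prompt_msg      ; hypothetical name; length of the prompt

section .bss
user_name   resb 100                ; empty 100-byte buffer for the input

section .text
global _start
_start:
    ; 1. Print the prompt: write(1, prompt_msg, prompt_len)
    mov rax, 1
    mov rdi, 1
    mov rsi, prompt_msg
    mov rdx, prompt_len
    syscall

    ; 2. Read keyboard input: read(0, user_name, 100)
    ;    The kernel stores the bytes at user_name and returns the count in rax.
    mov rax, 0
    mov rdi, 0
    mov rsi, user_name
    mov rdx, 100
    syscall

    ; 3. Echo the input back, unless read failed or hit end of input
    cmp rax, 0
    jle .done
    mov rdx, rax                    ; number of bytes actually read
    mov rax, 1
    mov rdi, 1
    mov rsi, user_name
    syscall

.done:
    ; exit(0)
    mov rax, 60
    xor rdi, rdi
    syscall
```

The `resb 100` buffer takes no space in the executable file; the 100 bytes only exist once the program is loaded, and `read` is what fills them.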
| Directive | Purpose | Analogy |
|---|---|---|
| `DB`, `DW`, `DD` | Define Data | Puts a pre-filled box on the shelf. |
| `RESB`, `RESW`, `RESD` | Reserve Space | Puts an empty box on the shelf (labeled). |