# 1. journey

Awesome. Let’s go step-by-step and see how a simple C program like:

```c
int main() {
    open("file.txt", 0);
    return 0;
}
```

gets compiled down to assembly, and eventually includes a **trap instruction** like `syscall`.

We’ll explore:

1. ✅ Writing and compiling the program
2. 🔍 Disassembling it with `objdump`
3. 🧠 Seeing the actual system call and trap
4. 🛠 Bonus: Using `strace` to see runtime syscalls

---

## ✅ 1. Write a C Program

Create a file called `demo.c`:

```c
#include <fcntl.h>  // for open()

int main() {
    open("file.txt", 0);  // O_RDONLY = 0
    return 0;
}
```

---

## 🛠️ 2. Compile the Program (No Optimization)

To make it easy to see raw instructions, disable optimization:

```bash
gcc -no-pie -O0 -o demo demo.c
```

* `-O0`: No optimization
* `-no-pie`: Makes it easier to analyze addresses (optional for now)

---

## 🔍 3. Disassemble the Binary with `objdump`

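```bash
objdump -d -M intel demo | less
```

`-M intel` selects Intel syntax, which the listings here use; without it, `objdump` prints AT&T syntax. Now look for the `main` function. You might see something like this (simplified; the addresses are illustrative and vary by compiler and glibc version):

```asm
0000000000401136 <main>:
    ...
    mov    esi, 0x0                ; arg2: O_RDONLY
    mov    edi, 0x402004           ; arg1: address of "file.txt"
    mov    eax, 0x0                ; variadic call: no vector-register args
    call   401040 <open@plt>       ; call glibc's open() through the PLT
    ...
```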

This shows:

* Your `open()` call compiles to a call to `open@plt`, a small stub in your binary's Procedure Linkage Table (PLT) that jumps to the real `open()` in the shared glibc.
* The real `syscall` happens **inside glibc**, not in your binary (unless you make a static binary).

---

## ✅ 4. Find the Actual `syscall` Inside glibc (Optional Static Build)

To see the actual `syscall` instruction, build **statically**:

```bash
gcc -static -O0 -o demo_static demo.c
```

Now disassemble that one:

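```bash
objdump -d -M intel demo_static | grep -A40 -E '<(__libc_)?open(64)?>:'
```

The label `objdump` prints for the function depends on the glibc version (commonly `__libc_open64`), so the pattern matches several aliases. You may see something like this (simplified). Recent glibc versions implement `open()` with the `openat` system call, so the number loaded is 257 (`0x101`); older builds may instead load 2, the number for `open`:

```asm
mov    edi, 0xffffff9c   ; arg1: AT_FDCWD (-100), i.e. relative to the current directory
mov    rsi, ...          ; arg2: pointer to "file.txt"
mov    edx, ...          ; arg3: flags (O_RDONLY = 0)
mov    eax, 0x101        ; syscall number for openat
syscall                  ; ← actual trap instruction
```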

> This is what the CPU really runs: a `syscall` instruction that traps into the kernel.

---

## 🔎 5. Bonus: Use `strace` to Observe at Runtime

Instead of disassembling, you can just watch your program as it runs using `strace`:

```bash
strace ./demo
```

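Output (trimmed to the relevant line; `strace` also prints startup calls such as `execve` and `mmap`). On recent glibc the call shows up as `openat`:

```text
openat(AT_FDCWD, "file.txt", O_RDONLY) = -1 ENOENT (No such file or directory)
```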

This confirms: your C code → glibc → `syscall` → kernel → system call handler → back

---

## ✅ Summary of the Journey

| Step                  | What's Happening                          |
| --------------------- | ----------------------------------------- |
| `open("file.txt", 0)` | C library function call                   |
| `glibc`               | Loads registers, then executes `syscall`  |
| `syscall`             | CPU instruction → trap into kernel        |
| Kernel                | Executes file open logic, returns result  |
| Back to user mode     | Control returns after `syscall` completes |

---

Would you like to try a minimal version **without glibc**, where you write your own `syscall` in inline assembly? It’s fun and shows everything explicitly.


# 2. assembly vs c

Fantastic follow-up — this shows you're thinking deeply about **what really gets executed by the CPU**, and **how high-level code turns into actual machine behavior**. Let's break it down step by step.

---

## ✅ Question:

> “My C code is converted into a trap instruction — is that assembly? Or some format the CPU understands? Does the CPU understand the concept of a ‘trap instruction’?”

---

## 🔹 1. Your C Code Is Compiled to Machine Code

When you write this C code:

```c
int fd = open("file.txt", O_RDONLY);
```

That call to `open()` goes to a **function provided by the C library**, which is **glibc** on most Linux systems. (`open()` is a POSIX function, not part of the ISO C standard library.)

So what really happens?

---

## 🔸 Step-by-Step Translation

### ✅ a. Your C code calls the glibc wrapper:

```text
open(...) in your code   →   open() inside glibc (a thin wrapper around the system call)
```

### ✅ b. The glibc wrapper contains **assembly instructions**

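Under the hood, glibc runs something like this on x86-64 Linux. The classic `open` system call is shown because it is the simplest; recent glibc versions use `openat` (number 257) instead, which takes one extra directory argument: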

```asm
mov rax, 2        ; syscall number for sys_open
mov rdi, pathname ; 1st arg to syscall
mov rsi, flags    ; 2nd arg
mov rdx, mode     ; 3rd arg (if needed)
syscall           ; special CPU instruction
```

* `rax` holds the **system call number** (e.g., `2` for `open`)
* `rdi`, `rsi`, `rdx` hold **arguments**
* `syscall` is a **CPU instruction** defined in x86-64 architecture
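
The register convention above can be written out by hand with GCC inline assembly. This is a minimal sketch, assuming x86-64 Linux and a GCC-compatible compiler; the function name `raw_open` is made up for illustration and is not how glibc itself is written. It issues the older `open` system call (number 2), which the kernel still accepts on x86-64, so only three argument registers are needed.

```c
#include <errno.h>   /* ENOENT, for callers that check the raw result */

/* Hypothetical helper, x86-64 Linux only. Issues the open system call
 * (number 2 on this ABI) directly, bypassing glibc. Returns the kernel's
 * raw result: a file descriptor (>= 0) or a negated errno such as -ENOENT. */
long raw_open(const char *pathname, long flags, long mode)
{
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)                /* result comes back in rax         */
        : "a"(2L),                 /* rax = system call number (open)  */
          "D"(pathname),           /* rdi = 1st argument               */
          "S"(flags),              /* rsi = 2nd argument               */
          "d"(mode)                /* rdx = 3rd argument               */
        : "rcx", "r11", "memory"   /* syscall overwrites rcx and r11   */
    );
    return ret;
}
```

Unlike the glibc wrapper, this returns the kernel's value untouched, so a missing file yields `-ENOENT` rather than `-1` with `errno` set.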

---

## 🔸 c. The syscall instruction is **native machine code**

The instruction `syscall` is not a C-level construct — it is **part of the CPU’s instruction set**.

* On x86, `int 0x80` (the legacy 32-bit mechanism) and `syscall` (the x86-64 one) both tell the CPU, in effect:

  > “Switch to kernel mode, remember where to return, and jump to the handler the OS registered.”

✅ So yes — the **CPU understands the concept of a "trap instruction"** — just like it understands `add`, `jmp`, or `mov`.

It’s just one of many **hardware-level instructions**.

---

## 🔹 2. So where does the conversion happen?

Here’s the full path:

```text
Your C code
   ↓
glibc wrapper function (C + inline assembly)
   ↓
Assembly instructions (e.g., mov + syscall)
   ↓
Compiler + assembler + linker turn this into:
   ↓
Raw machine code (binary instructions the CPU understands)
```

So **yes**, the trap instruction was **assembled into machine code** when glibc was built, and the **CPU executes it directly** when your program calls the wrapper.

---

## 🔹 3. Example of actual machine code

The `syscall` instruction corresponds to the machine code bytes:

```asm
0x0f 0x05
```

So in a compiled program, the binary might contain:

```
b8 02 00 00 00    ; mov eax, 2 (syscall number)
0f 05             ; syscall
```

This is what the CPU sees and executes — not C, not glibc — just **raw instructions**.

---

## ✅ Final Summary

| Concept                                         | Explanation                                                                                                                                      |
| ----------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------ |
| Does the CPU understand "trap"?                 | Yes — `syscall` and `int 0x80` are actual instructions in the CPU’s instruction set.                                                             |
| How does your C code become a trap instruction? | glibc (the C library) contains assembly that executes `syscall`; it was assembled into machine code when glibc was built.                        |
| Is it assembly?                                 | The trap starts as assembly and becomes raw binary instructions that the CPU executes.                                                           |
| Who defines what `syscall` does?                | The CPU hardware: `syscall` saves the return address in `rcx` and the flags in `r11`, switches to kernel mode, and jumps to a handler address the OS stored in a model-specific register. The kernel's entry code saves the rest of the state. |

---

Would you like to see a real compiled example of `open()` turned into raw assembly and machine code with `objdump` or `strace`?
