<span style="color:#ADD8E6; font-size:60px; font-weight:bold">C Program Compilation Process</span>

---
---

## 🛠️<span style="color:#ADD8E6; font-size:40px; font-weight:bold">Compilation Stages Overview</span>

> We'll consider a Linux system here, where the C source code (a `.c` file) is compiled into an executable. Linux does not require any particular extension: GCC names its output `a.out` by default, and `.elf` or `.out` are naming conventions (`.elf` is common in embedded projects).

>The `.c` file contains human-readable code written in the C programming language. It must be converted into machine code, which consists of binary instructions (0s and 1s) that a CPU can decode and execute.

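>The name and format of the final output file depend on the operating system and toolchain:

+ Linux: ELF format; no extension required (`a.out` by default, `.elf` or `.out` by convention)

+ Windows: PE format; `.exe`

+ macOS: Mach-O format; `a.out` by default, typically just an executable with no extension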

#### 💻**What Happens During Compilation?**
>Compilation is the process of translating human-readable C code into binary instructions. This transformation is necessary because CPUs don't understand text; they only execute binary-encoded instructions.

#### 💻 **How the CPU Executes the Binary**
>`Loading into Memory`: The operating system loads the compiled executable binary into the system’s memory (RAM).

>`Execution`: The CPU directly reads and executes binary instructions, carrying out tasks such as arithmetic operations, data transfers, and memory access.



---
---

## 💡<span style="color:#ADD8E6; font-size:40px; font-weight:bold">Detailed Compilation Steps</span>

>Building a C program from source code (`main.c`) into an executable binary (`output.elf`) involves four stages: `preprocessing`, `compilation`, `assembly`, and `linking`. The first three each produce an intermediate file (`.i`, `.s`, and `.o`, respectively). In a one-step build, GCC creates these as temporary files and deletes them unless told to keep them (e.g., with `-save-temps`), so only the final executable (`a.out` if no `-o` name is given) remains. One-step compilation (`gcc source.c -o output`) runs all four stages implicitly.

Below is a flowchart illustrating the different stages of the compilation process, along with the associated intermediate files created at each step:

<div style="display: flex; flex-direction: column; align-items: center; justify-content: center; text-align: center; font-family: Arial, sans-serif; margin: 20px auto;">
    <div style="font-size: 25px; color: black; font-weight: bold; border: 5px solid black; padding: 3px; width: 150px; margin-bottom: 10px; background-color: white;">main.c</div>
    <div style="margin: 5px 0; font-size: 20px;">▼</div>
    <div>Preprocessing: <span style="font-size: 18px; color: black; font-family: Courier, monospace; background-color: rgba(244, 244, 244, 0.69); padding: 1px 5px; border-radius: 4px; margin: 5px 0;">gcc -E main.c -o main.i</span></div>
    <div style="margin: 5px 0; font-size: 20px;">▼</div>
    <div style="font-size: 25px; color: black; font-weight: bold; border: 5px solid black; padding: 3px; width: 150px; margin-bottom: 10px; background-color: white;">main.i</div>
    <div style="margin: 5px 0; font-size: 20px;">▼</div>
    <div>Compilation: <span style="font-size: 18px; color: black; font-family: Courier, monospace; background-color: rgba(244, 244, 244, 0.69); padding: 1px 5px; border-radius: 4px; margin: 5px 0;">gcc -S main.i -o main.s</span></div>
    <div style="margin: 5px 0; font-size: 20px;">▼</div>
    <div style="font-size: 25px; color: black; font-weight: bold; border: 5px solid black; padding: 3px; width: 150px; margin-bottom: 10px; background-color: white;">main.s</div>
    <div style="margin: 5px 0; font-size: 20px;">▼</div>
    <div>Assembling: <span style="font-size: 18px; color: black; font-family: Courier, monospace; background-color: rgba(244, 244, 244, 0.69); padding: 1px 5px; border-radius: 4px; margin: 5px 0;">gcc -c main.s -o main.o</span></div>
    <div style="margin: 5px 0; font-size: 20px;">▼</div>
    <div style="font-size: 25px; color: black; font-weight: bold; border: 5px solid black; padding: 3px; width: 150px; margin-bottom: 10px; background-color: white;">main.o</div>
    <div style="margin: 5px 0; font-size: 20px;">▼</div>
    <div>Linking: <span style="font-size: 18px; color: black; font-family: Courier, monospace; background-color: rgba(244, 244, 244, 0.69); padding: 1px 5px; border-radius: 4px; margin: 5px 0;">gcc main.o -o output.elf "-Wl,-Map=output.map"</span></div>
    <div style="margin: 5px 0; font-size: 20px;">▼</div>
    <div style="font-size: 25px; color: black; font-weight: bold; border: 5px solid black; padding: 3px; width: 150px; margin-bottom: 10px; background-color: white;">output.elf</div>
    <div style="margin: 5px 0; font-size: 20px;">▼</div>
    <div>Generating map file (Optional)</div>
    <div style="margin: 5px 0; font-size: 20px;">▼</div>
    <div style="font-size: 25px; color: black; font-weight: bold; border: 5px solid black; padding: 3px; width: 150px; margin-bottom: 10px; background-color: white;">output.map</div>
</div>

--- 
---


Here is a basic C program saved as main.c, which prints 'Hello, world!' to the terminal:

In [15]:
%%file main.c
#include <stdio.h>

int main() {
    printf("Hello, world!\n");
    return 0;
}

Writing main.c


---



### 🔹 1. **Preprocessing (`*.c` → `*.i`)**
- **What happens:**
  - Removes all comments (`//`, `/* */`)
  - Processes `#include`, `#define`, `#ifdef`, and the other preprocessor directives
  - Expands macros and inserts the full contents of each included header
  - Generates a single file of **pure C code** with no directives left, only line markers that record where each line came from


In [16]:
%%bash
gcc -E main.c -o main.i
# cat main.i




---

### 🔹 2. **Compilation (`*.i` → `*.s`)**
- **What happens:**
  - Translates the preprocessed C code into assembly for the target CPU architecture.
  - `main.s` is a human-readable text file of assembly instructions (specific to the machine's architecture).
  - Performs syntax and semantic checks, reporting errors and warnings.
  - Applies optimizations (if flags like `-O1`, `-O2`, or `-O3` are enabled).


In [17]:
%%bash
gcc -S main.i -o main.s




---

### 🔹 3. **Assembly (`*.s` → `*.o`)**
- **What happens:**
  - The assembler turns the assembly code into machine code (`object code`) and stores it in a `.o` file, a binary file holding instructions the CPU can execute.
  - Mnemonics (e.g., `mov`, `add`, `call`) are translated into binary instructions.
  - The object file includes a symbol table that keeps track of labels, variables, and external references.
  - The resulting `.o` file is not executable: references to external functions or variables (such as `printf`) are still unresolved.


In [None]:
%%bash
gcc -c main.s -o main.o
# objdump -d main.o




---

### 🔹 4. **Linking (`*.o` → `*.elf`)**
- **What happens:**
  - Combines the object file with other object files and libraries (like libc) into one executable.
  - Resolves external symbols (like `printf`) against the C standard library and any other linked libraries.
  - Assigns final memory addresses.
  - Generates a fully linked executable binary (ELF format on Linux).


In [None]:
%%bash
gcc main.o -o output.elf




---

### 🔹 5. **Map File Generation (Optional)**
- **What happens:**
  - Creates a `memory layout report` as a side product of linking.
  - Shows symbol addresses, section sizes, memory sections (`.text`, `.data`, `.bss`, etc.), and each object file's contribution to each section.
  - Especially useful for embedded systems, where memory is limited.


In [None]:
%%bash
gcc main.o -o output.elf -Wl,-Map=output.map



After successful compilation, the generated executable (e.g., `output.elf`) can be run to see the output in the terminal:

In [None]:
%%bash
./output.elf



---
## 📘<span style="color:#ADD8E6; font-size:40px; font-weight:bold">Extras</span>

1. **View Expanded Headers:**
   - Use `cat main.i` to see expanded headers after preprocessing.

2. **Assembly Differences:**
   - Compilers generate different assembly code for different CPU architectures. C source is portable, but machine code isn't.

3. **Disassembly Insight:**
   - Use `objdump -d main.o` to see the disassembly after assembly.

4. **Key Linking Tools:**
   - Important tools in linking include `ld` (the linker) and `ar` (the archiver that builds static libraries).

5. **Generate Intermediate Files:**
   - To see all intermediate files, use:
     ```bash
     gcc -save-temps main.c -o output
     ```
     This keeps the `.i`, `.s`, and `.o` files instead of deleting them; the exact file names vary by GCC version.

6. **Compiler Optimizations:**
   **🔍 Optimization Levels (`-O0` vs `-O3`):**
   - `-O0` disables optimizations, resulting in **verbose, straightforward assembly** that closely matches the original C code, which is useful for debugging and learning.
   - `-O3` enables aggressive optimizations such as inlining functions, unrolling loops, and removing redundant code. The result is usually **faster but harder-to-read assembly**, and it is not necessarily smaller than the `-O0` output.
   - Comparing the two helps visualize how compilers transform source code for performance.


7. **Cross-Compiling:**
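   - With Clang, use the `--target=<triple>` flag to select the target architecture (e.g., ARM vs x86). GCC has no such flag; it uses a separately built cross-compiler for each target, such as `arm-none-eabi-gcc`.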




---