| 1 | +### 1. High-Level View: A Pluggable `rustc` Backend |
| 2 | + |
| 3 | +The Rust compiler (`rustc`) is broadly split into two main parts: |
| 4 | + |
| 5 | +1. **The Frontend:** This part parses your Rust code, runs type checking and borrow checking, and lowers the program to an intermediate representation called **MIR** (Mid-level IR). It is shared by all compilation targets and all backends. |
| 6 | +2. **The Backend:** This part takes the MIR from the frontend and is responsible for turning it into the final machine code for the target CPU. |
| 7 | + |
| 8 | +`rustc` is designed to have a "pluggable" backend. While the default backend uses **LLVM**, `rustc` can be instructed (on nightly, via the unstable `-Zcodegen-backend` flag) to load a different backend from a shared library (`.so`, `.dylib`, or `.dll`). |
| 9 | + |
| 10 | +`rustc_codegen_gcc` is exactly that: an alternative backend. It is loaded by `rustc`, receives the MIR, and uses **GCC** (via a library called `libgccjit`) to perform the final code generation and optimization, instead of LLVM. |
| 11 | + |
| 12 | +``` |
| 13 | ++----------------+ +--------------------------+ +-----------------+ |
| 14 | +| Your Rust Code | ---> | rustc Frontend | ---> | MIR | |
| 15 | ++----------------+ +--------------------------+ +-----------------+ |
| 16 | + | |
| 17 | + | |
| 18 | ++------------------------------------------------------------------+ |
| 19 | +| |
| 20 | +| +--------------------------------------------+ |
| 21 | +| | rustc_codegen_gcc | |
| 22 | +| | (Loaded by the Frontend as a Plugin) | |
| 23 | +| | | |
| 24 | +| | +--------------------------------------+ | +----------------+ |
| 25 | +| | | Translates MIR into libgccjit | ---> | GCC | |
| 26 | +| | +--------------------------------------+ | +----------------+ |
| 27 | +| | | | |
| 28 | +| +--------------------------------------------+ | |
| 29 | +| V |
| 30 | +| +---------------+ |
| 31 | +| | Machine Code | |
| 32 | +| | (.o, .rlib) | |
| 33 | +| +---------------+ |
| 34 | +``` |
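The pluggable design above can be sketched as a trait object chosen at run time. This is a toy model only: `Mir`, `Backend`, `LlvmLike`, `GccLike`, and `select_backend` are hypothetical names invented for illustration. The real compiler loads the backend from a shared library and talks to it through its own `CodegenBackend` trait, which has a much larger surface.

```rust
/// A tiny stand-in for MIR: just a list of textual statements.
struct Mir {
    statements: Vec<String>,
}

/// Hypothetical, heavily simplified counterpart of the compiler's backend trait.
trait Backend {
    fn name(&self) -> &'static str;
    /// Turn MIR into "object code" (here: only a descriptive string).
    fn codegen(&self, mir: &Mir) -> String;
}

struct LlvmLike;
struct GccLike;

impl Backend for LlvmLike {
    fn name(&self) -> &'static str {
        "llvm"
    }
    fn codegen(&self, mir: &Mir) -> String {
        format!("[llvm] {} statement(s) lowered", mir.statements.len())
    }
}

impl Backend for GccLike {
    fn name(&self) -> &'static str {
        "gcc"
    }
    fn codegen(&self, mir: &Mir) -> String {
        format!("[gcc] {} statement(s) lowered", mir.statements.len())
    }
}

/// The "frontend" only knows the trait, never a concrete backend type.
fn select_backend(requested: Option<&str>) -> Box<dyn Backend> {
    match requested {
        Some("gcc") => Box::new(GccLike),
        _ => Box::new(LlvmLike), // default backend
    }
}

fn main() {
    let mir = Mir {
        statements: vec!["_3 = Add(_1, _2)".to_string()],
    };
    let backend = select_backend(Some("gcc"));
    println!("{}: {}", backend.name(), backend.codegen(&mir));
}
```

The point of the sketch is that the frontend code in `main` stays identical whichever backend is selected; only the object behind the trait changes.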
| 35 | + |
| 36 | +### 2. Internal Architecture |
| 37 | + |
| 38 | +Internally, the crate follows a structured process to translate MIR into a format that GCC understands. The compilation of a crate is broken down into smaller "Codegen Units" (CGUs), which can be processed in parallel. |
| 39 | + |
| 40 | +Here is the flow for a single Codegen Unit: |
| 41 | + |
| 42 | +1. **Entry Point (`src/lib.rs`)** |
| 43 | + * `rustc` loads this library and obtains a `GccCodegenBackend`, which implements the `CodegenBackend` trait required by the compiler. This is the main entry point that kicks off the compilation for a crate. The process is orchestrated by `rustc_codegen_ssa`, the backend-independent codegen framework that this crate is built upon. |
| 44 | + |
| 45 | +2. **Context Setup (`src/context.rs` & `src/base.rs`)** |
| 46 | + * For each CGU, a `CodegenCx` (Codegen Context) is created. This is a central data structure that holds everything needed for the translation, most importantly: |
| 47 | + * The `rustc` type context (`TyCtxt`). |
| 48 | + * A `gccjit::Context`, which is the object we will use to describe our program to GCC. |
| 49 | + |
| 50 | +3. **MIR-to-GCC Translation (A Two-Pass Process)** |
| 51 | + The core of the work is describing the MIR to GCC through the `libgccjit` API, which builds GCC's own internal representation (GCC later lowers it to GIMPLE for optimization). This happens in two passes over all the functions and statics in the CGU. |
| 52 | + |
| 53 | + * **Pass 1: Pre-definition (`src/mono_item.rs`)** |
| 54 | + The first pass quickly iterates through every function and static variable. It *declares* them within the `gccjit::Context` but does not define their content. This creates an empty shell (e.g., the function signature) so that other code can reference it before it's fully translated. This is essential for handling forward references and recursion. |
| 55 | + |
| 56 | + * **Pass 2: Definition (`src/builder.rs`)** |
| 57 | + This is where the main translation happens. The shared `rustc_codegen_ssa` code walks the MIR of each function, block by block and statement by statement, and drives the `Builder` through its trait methods. For each MIR statement (e.g., `_1 = _2 + _3;`), the `Builder` calls the corresponding function in the `libgccjit` API to build an equivalent representation in GCC's IR. This is the heart of the codegen process. |
| 58 | + |
| 59 | +4. **Type Mapping (`src/type_of.rs`)** |
| 60 | + * Throughout the translation, whenever the `Builder` encounters a Rust type (like a `struct`, `u64`, or `&str`), it uses the `LayoutGccExt::gcc_type` method. This crucial step converts Rust's internal type and layout information into a `gccjit::Type` that GCC can understand, ensuring correct memory layout, alignment, and function ABI. |
| 61 | + |
| 62 | +5. **Final Object File Generation (`src/back/write.rs`)** |
| 63 | + * Once the `gccjit::Context` has been completely filled with all the translated functions and statics for the CGU, this module takes over. |
| 64 | + * It makes a single call: `Context::compile_to_file()`. This hands control over to the GCC compiler, which: |
| 65 | + 1. Runs its optimization passes on the IR. |
| 66 | + 2. Generates the final, native machine code. |
| 67 | + 3. Writes the result to an object file (`.o`). |
| 68 | + |
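To illustrate why step 3 needs two passes, here is a minimal, self-contained sketch. It does not use `libgccjit`; `FnItem`, `FnHandle`, `ToyContext`, `predefine`, `define`, and `compile_cgu` are hypothetical names standing in for the real machinery. It only shows that declaring every function first lets mutually recursive functions refer to each other when their bodies are translated.

```rust
use std::collections::HashMap;

/// Toy "MIR" for one function: its name and the names of the functions it calls.
struct FnItem {
    name: &'static str,
    calls: Vec<&'static str>,
}

/// Opaque handle to a declared function (stand-in for a backend function object).
#[derive(Clone, Copy, Debug, PartialEq)]
struct FnHandle(usize);

#[derive(Default)]
struct ToyContext {
    declared: HashMap<&'static str, FnHandle>,
    /// Defined bodies: for each function index, the handles it calls.
    bodies: HashMap<usize, Vec<FnHandle>>,
}

impl ToyContext {
    /// Pass 1: record the symbol only; no body yet.
    fn predefine(&mut self, item: &FnItem) {
        let handle = FnHandle(self.declared.len());
        self.declared.insert(item.name, handle);
    }

    /// Pass 2: translate the body; every callee must already be declared.
    fn define(&mut self, item: &FnItem) -> Result<(), String> {
        let mut callees = Vec::new();
        for callee in &item.calls {
            let handle = self
                .declared
                .get(callee)
                .copied()
                .ok_or_else(|| format!("`{}` calls undeclared `{}`", item.name, callee))?;
            callees.push(handle);
        }
        let own = self.declared[item.name];
        self.bodies.insert(own.0, callees);
        Ok(())
    }
}

/// Run both passes over every item in a (toy) codegen unit.
fn compile_cgu(items: &[FnItem]) -> Result<ToyContext, String> {
    let mut cx = ToyContext::default();
    for item in items {
        cx.predefine(item);
    }
    for item in items {
        cx.define(item)?;
    }
    Ok(cx)
}

fn main() {
    // `is_even` and `is_odd` call each other, so neither can be fully
    // translated before the other has at least been declared.
    let items = vec![
        FnItem { name: "is_even", calls: vec!["is_odd"] },
        FnItem { name: "is_odd", calls: vec!["is_even"] },
    ];
    let cx = compile_cgu(&items).expect("all callees were predeclared");
    println!("defined {} function bodies", cx.bodies.len());
}
```

If `define` were called on `is_even` before `is_odd` had been predefined, the lookup would fail; that is the forward-reference problem the pre-definition pass avoids.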
| 69 | +In summary, `rustc_codegen_gcc` acts as a **translator**, converting Rust's MIR into a series of C-like API calls that describe the program to GCC, which then handles the heavy lifting of optimization and machine code generation. |
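To make the "series of C-like API calls" concrete, the sketch below lowers a few toy statements into recorded call strings instead of invoking a real library. Everything here is an assumption made for illustration: `Operand`, `Statement`, and `RecordingBuilder` are hypothetical, and the recorded call names are only loosely modeled on `libgccjit` C entry points such as `gcc_jit_context_new_binary_op` and `gcc_jit_block_add_assignment`.

```rust
/// Toy MIR operands (hypothetical, far simpler than rustc's).
#[derive(Clone, Copy)]
enum Operand {
    Local(usize),
    Const(i64),
}

/// Toy MIR statements.
enum Statement {
    /// `_dest = lhs + rhs`
    AssignAdd { dest: usize, lhs: Operand, rhs: Operand },
    /// `return _local`
    Return(usize),
}

/// Records the C-like calls a backend would make, rather than making them.
#[derive(Default)]
struct RecordingBuilder {
    calls: Vec<String>,
}

impl RecordingBuilder {
    /// Render an operand as the value expression the backend would build.
    fn operand(&self, op: Operand) -> String {
        match op {
            Operand::Local(i) => format!("local_{i}"),
            Operand::Const(c) => format!("const_int({c})"),
        }
    }

    /// Lower one statement into one recorded backend call.
    fn lower(&mut self, stmt: &Statement) {
        match stmt {
            Statement::AssignAdd { dest, lhs, rhs } => {
                let sum = format!(
                    "binary_op(PLUS, {}, {})",
                    self.operand(*lhs),
                    self.operand(*rhs)
                );
                self.calls
                    .push(format!("block.add_assignment(local_{dest}, {sum})"));
            }
            Statement::Return(local) => {
                self.calls
                    .push(format!("block.end_with_return(local_{local})"));
            }
        }
    }
}

fn main() {
    // Roughly: _3 = _1 + _2; _4 = _3 + 1; return _4
    let body = [
        Statement::AssignAdd { dest: 3, lhs: Operand::Local(1), rhs: Operand::Local(2) },
        Statement::AssignAdd { dest: 4, lhs: Operand::Local(3), rhs: Operand::Const(1) },
        Statement::Return(4),
    ];
    let mut builder = RecordingBuilder::default();
    for stmt in &body {
        builder.lower(stmt);
    }
    for call in &builder.calls {
        println!("{call}");
    }
}
```

Each statement maps to one call that describes the operation to the backend; nothing is executed or optimized at this stage, which mirrors the division of labor described in the summary.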