A from-scratch C compiler implemented in C++, demonstrating the full compilation pipeline from source code to target assembly.
The project emphasizes modular design, educational clarity, and correctness, making it suitable for academic use and compiler fundamentals study.
This compiler processes C source code through all standard compilation phases:
- Lexical Analysis – Token generation
- Syntax Analysis – Parsing and AST construction
- Semantic Analysis – Type checking and scope validation
- Intermediate Code Generation – Three Address Code (TAC)
- Target Code Generation – MIPS-like assembly output
The implementation mirrors the phase structure of production compilers while remaining readable and extensible.
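
As an illustration of the lexical analysis phase, here is a minimal sketch of a tokenizer that records the line and column of each token. The `TokenKind`, `Token`, and `tokenize` names are hypothetical and are not taken from this project's source; the keyword list is an assumption based on the supported features.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical token kinds; the project's real token set may differ.
enum class TokenKind { Identifier, Number, Keyword, Symbol };

struct Token {
    TokenKind kind;
    std::string text;
    int line;    // 1-based source line
    int column;  // 1-based column of the token's first character
};

// Splits source text into tokens, tracking line and column for diagnostics.
std::vector<Token> tokenize(const std::string& src) {
    // Assumed keyword list, for illustration only.
    static const std::vector<std::string> keywords = {
        "int", "float", "char", "void", "if", "else",
        "while", "for", "do", "return"};
    std::vector<Token> tokens;
    int line = 1, column = 1;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (c == '\n') { ++line; column = 1; ++i; continue; }
        if (std::isspace(c)) { ++column; ++i; continue; }

        std::size_t start = i;
        TokenKind kind = TokenKind::Symbol;
        if (std::isalpha(c) || c == '_') {
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
            kind = TokenKind::Identifier;
        } else if (std::isdigit(c)) {
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            kind = TokenKind::Number;
        } else {
            ++i;  // every other character becomes a one-character symbol
        }
        assert(i > start);  // each branch consumes at least one character

        std::string text = src.substr(start, i - start);
        if (kind == TokenKind::Identifier) {
            for (const std::string& kw : keywords) {
                if (text == kw) { kind = TokenKind::Keyword; break; }
            }
        }
        tokens.push_back(Token{kind, text, line, column});
        column += static_cast<int>(i - start);
    }
    return tokens;
}
```

This sketch ignores comments, string literals, floating-point literals, and multi-character operators such as `==`, all of which a real lexer for C has to handle.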
- Complete multi-phase compiler pipeline
- Recursive descent parser with operator precedence
- Hierarchical symbol table with scope management
- Semantic analysis with detailed error reporting
- Three Address Code (TAC) intermediate representation
- MIPS-style assembly code generation
- Clear diagnostics with source line and column tracking
- Modular and extensible architecture
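
To make the idea of a hierarchical symbol table concrete, the following is a minimal sketch using a stack of hash maps, one per scope. The `Symbol` and `SymbolTable` names and their members are hypothetical; the project's actual data structures may be organized differently.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical symbol record; a real entry would carry more type detail.
struct Symbol {
    std::string name;
    std::string type;
    int scopeDepth;  // 0 is the global scope
};

class SymbolTable {
public:
    SymbolTable() { enterScope(); }  // start with the global scope

    void enterScope() { scopes_.emplace_back(); }

    void exitScope() {
        assert(scopes_.size() > 1 && "cannot exit the global scope");
        scopes_.pop_back();
    }

    // Declares a name in the innermost scope.
    // Returns false if the name is already declared in that same scope.
    bool declare(const std::string& name, const std::string& type) {
        auto& current = scopes_.back();
        if (current.count(name) != 0) return false;
        int depth = static_cast<int>(scopes_.size()) - 1;
        current.emplace(name, Symbol{name, type, depth});
        return true;
    }

    // Searches from the innermost scope outward; returns nullptr if not found.
    // The pointer is only valid until the table is next modified.
    const Symbol* lookup(const std::string& name) const {
        for (auto it = scopes_.rbegin(); it != scopes_.rend(); ++it) {
            auto found = it->find(name);
            if (found != it->end()) return &found->second;
        }
        return nullptr;
    }

private:
    std::vector<std::unordered_map<std::string, Symbol>> scopes_;
};
```

With this layout, a semantic analyzer could call `enterScope()` at each `{`, `exitScope()` at each `}`, report a redeclaration error when `declare` returns false, and report an undeclared identifier when `lookup` returns `nullptr`.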
- Data types: int, float, char, void
- Variable declarations and assignments
- Arithmetic, relational, and logical expressions
- Control flow: if / else, while, for, do-while
- Functions:
  - Definition
  - Parameters
  - Return values
  - Recursion
- Basic I/O: printf, scanf
- Arrays (basic support)
- Pointers (limited)
- Structs, unions, typedef
- Preprocessor directives
C Source Code
↓
Lexical Analyzer
↓
Syntax Analyzer (Parser)
↓
Semantic Analyzer
↓
Intermediate Code Generator (TAC)
↓
Target Code Generator
↓
MIPS-like Assembly Output
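
The parser and TAC stages of this pipeline can be sketched together for arithmetic expressions. The example below is a minimal recursive descent parser with one function per precedence level that emits Three Address Code directly instead of building an AST. The `ExprToTac` class, the `t1`, `t2`, ... temporary naming, and the textual TAC format are all assumptions made for illustration, not this project's actual implementation.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Sketch: parses + - * / and parentheses, emitting one TAC line per operator.
// Errors are reported with assert only to keep the example short.
class ExprToTac {
public:
    explicit ExprToTac(const std::string& src) : src_(src) {}

    // Parses the whole expression; returns the name that holds its value.
    std::string parse() {
        std::string result = parseAdditive();
        skipSpaces();
        assert(pos_ == src_.size() && "unexpected trailing input");
        return result;
    }

    const std::vector<std::string>& code() const { return code_; }

private:
    // Lowest precedence: left-associative + and -.
    std::string parseAdditive() {
        std::string left = parseMultiplicative();
        while (true) {
            char op = peek();
            if (op != '+' && op != '-') return left;
            ++pos_;
            left = emit(left, op, parseMultiplicative());
        }
    }

    // Higher precedence: left-associative * and /.
    std::string parseMultiplicative() {
        std::string left = parsePrimary();
        while (true) {
            char op = peek();
            if (op != '*' && op != '/') return left;
            ++pos_;
            left = emit(left, op, parsePrimary());
        }
    }

    // Identifiers, integer literals, and parenthesized expressions.
    std::string parsePrimary() {
        if (peek() == '(') {
            ++pos_;
            std::string inner = parseAdditive();
            char close = peek();
            assert(close == ')' && "expected ')'");
            (void)close;
            ++pos_;
            return inner;
        }
        std::size_t start = pos_;
        while (pos_ < src_.size() &&
               (std::isalnum(static_cast<unsigned char>(src_[pos_])) || src_[pos_] == '_')) {
            ++pos_;
        }
        assert(pos_ > start && "expected identifier or number");
        return src_.substr(start, pos_ - start);
    }

    void skipSpaces() {
        while (pos_ < src_.size() && std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
    }

    // Skips whitespace and returns the next character, or '\0' at end of input.
    char peek() {
        skipSpaces();
        return pos_ < src_.size() ? src_[pos_] : '\0';
    }

    // Appends "tN = left op right" and returns the new temporary's name.
    std::string emit(const std::string& left, char op, const std::string& right) {
        std::string temp = "t" + std::to_string(++tempCount_);
        code_.push_back(temp + " = " + left + " " + op + " " + right);
        return temp;
    }

    std::string src_;
    std::size_t pos_ = 0;
    int tempCount_ = 0;
    std::vector<std::string> code_;
};
```

For the input `a + b * (c - 2)`, this sketch emits `t1 = c - 2`, `t2 = b * t1`, and `t3 = a + t2`, so multiplication binds tighter than addition and parentheses override both.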
git clone https://github.com/Yam1nX/compiler.git
cd compiler
g++ compiler.cpp -o compiler
./compiler input.c

The compiler produces:
- Token stream
- Three Address Code (TAC)
- Generated assembly code
int main() {
    int a = 10;
    int b = 20;
    int sum = a + b;
    printf("Sum = %d\n", sum);
    return 0;
}
- Tokens generated
- TAC instructions
- MIPS-style assembly code
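
To give a feel for the last step, here is a minimal sketch of how a single TAC instruction such as `sum = a + b` from the program above might be lowered to MIPS-style assembly. The `TacInstr` struct, the fixed use of `$t0` to `$t2`, and the load/store-by-label style are assumptions for illustration; the compiler's real output format and register allocation are not shown in this README and may differ.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical TAC shape: dest = lhs op rhs
struct TacInstr {
    std::string dest;
    std::string lhs;
    char op;
    std::string rhs;
};

// Loads an operand into a register: integer literals use li, variables use lw.
std::string loadOperand(const std::string& reg, const std::string& operand) {
    assert(!operand.empty());
    bool isImmediate = std::isdigit(static_cast<unsigned char>(operand[0])) != 0;
    return std::string(isImmediate ? "li " : "lw ") + reg + ", " + operand;
}

// Maps a TAC operator to a MIPS-style mnemonic.
std::string mnemonicFor(char op) {
    switch (op) {
        case '+': return "add";
        case '-': return "sub";
        case '*': return "mul";
        case '/': return "div";
    }
    assert(false && "unsupported operator");
    return "";
}

// Naive lowering: load both operands, compute, store the result.
// No register allocation is attempted; every instruction reuses $t0-$t2.
std::vector<std::string> lowerToMips(const TacInstr& instr) {
    std::vector<std::string> out;
    out.push_back(loadOperand("$t0", instr.lhs));
    out.push_back(loadOperand("$t1", instr.rhs));
    out.push_back(mnemonicFor(instr.op) + " $t2, $t0, $t1");
    out.push_back("sw $t2, " + instr.dest);
    return out;
}
```

Under these assumptions, `sum = a + b` lowers to `lw $t0, a`, `lw $t1, b`, `add $t2, $t0, $t1`, and `sw $t2, sum`. Storing every result back to memory is simple but slow, which is why better register allocation is a common follow-up improvement.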
- 30+ test programs covering:
  - Expressions
  - Control flow
  - Functions
  - Semantic errors
- Lexical, syntax, and semantic error validation
- Comparison against reference compiler outputs
- Full support for arrays, pointers, structs, and unions
- Advanced optimizations:
  - Constant folding
  - Dead code elimination
  - Better register allocation
- Support for additional targets (x86-64, ARM, WebAssembly)
- Basic linker and preprocessor support
- Enhanced diagnostics and warnings
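
As a sketch of what the planned constant folding pass could look like, the example below rewrites TAC instructions whose operands are both integer literals into a single copy of the computed value. The `TacInstr` struct, the use of `'='` to mark a copy, and the function names are hypothetical; this is one possible design, not the project's implementation.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical TAC shape: dest = lhs op rhs, or dest = lhs when op is '='.
struct TacInstr {
    std::string dest;
    std::string lhs;
    char op;
    std::string rhs;
};

// True for non-empty strings made only of decimal digits.
bool isIntLiteral(const std::string& s) {
    if (s.empty()) return false;
    for (char c : s) {
        if (!std::isdigit(static_cast<unsigned char>(c))) return false;
    }
    return true;
}

// An instruction is foldable when both operands are literals, the operator
// is known, and folding would not divide by zero.
// Assumes literals fit in a long; std::stol throws otherwise.
bool canFold(const TacInstr& instr) {
    bool knownOp = instr.op == '+' || instr.op == '-' ||
                   instr.op == '*' || instr.op == '/';
    if (!knownOp || !isIntLiteral(instr.lhs) || !isIntLiteral(instr.rhs)) return false;
    return !(instr.op == '/' && std::stol(instr.rhs) == 0);
}

long evaluate(char op, long a, long b) {
    switch (op) {
        case '+': return a + b;
        case '-': return a - b;
        case '*': return a * b;
        case '/': return a / b;
    }
    assert(false && "canFold should have rejected this operator");
    return 0;
}

// Returns a copy of the code with foldable instructions replaced by copies.
std::vector<TacInstr> foldConstants(const std::vector<TacInstr>& code) {
    std::vector<TacInstr> out;
    for (const TacInstr& instr : code) {
        if (!canFold(instr)) {
            out.push_back(instr);
            continue;
        }
        long value = evaluate(instr.op, std::stol(instr.lhs), std::stol(instr.rhs));
        out.push_back(TacInstr{instr.dest, std::to_string(value), '=', ""});
    }
    return out;
}
```

For example, `t1 = 2 * 3` becomes `t1 = 6`, while `t2 = a + t1` is left unchanged because this sketch does not propagate the folded value into later instructions. Division by a literal zero is also left untouched so that the error can be reported elsewhere.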
- Name: Ashabul Yamin Tuhin
- Course: Compiler Lab
- Department: Computer Science and Engineering