This is a C compiler project that I developed to deepen my understanding of compiler design and low-level programming. My compiler supports basic arithmetic operations and demonstrates the core phases of compilation from lexical analysis to code generation.
The project implements key compiler components:
- Lexer: Tokenizes input C source code
- Parser: Builds Abstract Syntax Trees (AST) from tokens
- AST: Data structures representing program structure
- Code Generator: Converts AST to machine-level instructions
Designed for educational purposes, the modular source files enable study and extension of each compiler phase.
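As a rough illustration of the first phase, the sketch below tokenizes an arithmetic expression such as `2 + 3`. The names `TokenKind`, `Token`, `next_token`, and `tokenize` are hypothetical and chosen for this example; the actual definitions in lexer.h may differ.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds; the project's lexer.h may define them differently. */
typedef enum {
    TOK_NUM, TOK_PLUS, TOK_MINUS, TOK_STAR, TOK_SLASH, TOK_EOF, TOK_ERROR
} TokenKind;

typedef struct {
    TokenKind kind;
    int value; /* only meaningful for TOK_NUM */
} Token;

/* Reads one token starting at *src and advances *src past it. */
Token next_token(const char **src) {
    const char *p = *src;
    Token tok = { TOK_EOF, 0 };

    /* Skip whitespace between tokens. */
    while (isspace((unsigned char)*p)) p++;

    if (*p == '\0') {
        tok.kind = TOK_EOF;
    } else if (isdigit((unsigned char)*p)) {
        /* Accumulate a decimal integer literal (no overflow check in this sketch). */
        tok.kind = TOK_NUM;
        while (isdigit((unsigned char)*p)) {
            tok.value = tok.value * 10 + (*p - '0');
            p++;
        }
    } else {
        switch (*p) {
            case '+': tok.kind = TOK_PLUS;  break;
            case '-': tok.kind = TOK_MINUS; break;
            case '*': tok.kind = TOK_STAR;  break;
            case '/': tok.kind = TOK_SLASH; break;
            default:  tok.kind = TOK_ERROR; break;
        }
        p++;
    }

    *src = p;
    return tok;
}

/* Fills out[] with at most max tokens and returns how many were written.
   Stops after the first TOK_EOF or TOK_ERROR token. */
size_t tokenize(const char *src, Token *out, size_t max) {
    size_t n = 0;
    while (n < max) {
        Token t = next_token(&src);
        out[n++] = t;
        if (t.kind == TOK_EOF || t.kind == TOK_ERROR) break;
    }
    return n;
}
```

Calling `tokenize("2 + 3", toks, 8)` on an array of eight `Token` values would fill in a number, a plus sign, a number, and an end-of-input token. Keywords, identifiers, and punctuation such as braces are left out to keep the sketch short.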
- lexer.c / lexer.h: Lexer implementation for tokenizing input
- parser.c / parser.h: Parser building the AST
- ast.c / ast.h: AST node definitions and utilities
- codegen.c / codegen.h: Code generator producing executable code
- main.c: Entry point controlling the compilation process
- test.c: Sample file to test the compiler
- Supports basic integer arithmetic operations: +, -, *, /
- Tokenizes simple C syntax
- Builds and traverses AST
- Modular design for compiler components
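To make "builds and traverses AST" concrete, here is a minimal sketch of an expression tree with a recursive traversal that evaluates it. The `Node` layout and the helpers `new_num`, `new_binary`, `eval`, and `free_tree` are assumptions made for illustration, not the definitions from ast.h.

```c
#include <stdlib.h>

/* Hypothetical node kinds for integer arithmetic. */
typedef enum { NODE_NUM, NODE_ADD, NODE_SUB, NODE_MUL, NODE_DIV } NodeKind;

typedef struct Node {
    NodeKind kind;
    int value;          /* used only when kind == NODE_NUM */
    struct Node *lhs;   /* left operand, NULL for numbers */
    struct Node *rhs;   /* right operand, NULL for numbers */
} Node;

/* Allocates a leaf node holding an integer literal. Returns NULL on failure. */
Node *new_num(int value) {
    Node *n = calloc(1, sizeof *n);
    if (n == NULL) return NULL;
    n->kind = NODE_NUM;
    n->value = value;
    return n;
}

/* Allocates an operator node that takes ownership of both operands. */
Node *new_binary(NodeKind kind, Node *lhs, Node *rhs) {
    Node *n = calloc(1, sizeof *n);
    if (n == NULL) return NULL;
    n->kind = kind;
    n->lhs = lhs;
    n->rhs = rhs;
    return n;
}

/* Post-order traversal: evaluate both children, then apply the operator.
   The caller must pass a fully built tree and avoid division by zero. */
int eval(const Node *n) {
    if (n->kind == NODE_NUM) return n->value;

    int l = eval(n->lhs);
    int r = eval(n->rhs);
    switch (n->kind) {
        case NODE_ADD: return l + r;
        case NODE_SUB: return l - r;
        case NODE_MUL: return l * r;
        case NODE_DIV: return l / r;
        default:       return 0;
    }
}

/* Frees a tree bottom-up; accepts NULL. */
void free_tree(Node *n) {
    if (n == NULL) return;
    free_tree(n->lhs);
    free_tree(n->rhs);
    free(n);
}
```

For `2 + 3 * 4`, a parser that respects precedence would build `new_binary(NODE_ADD, new_num(2), new_binary(NODE_MUL, new_num(3), new_num(4)))`. A code generator can follow the same post-order walk, emitting instructions instead of computing values.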
Use the MSYS2 MinGW64 toolchain, then follow these steps:
- Clone the repo:
git clone https://github.com/razancodes/C-Compiler.git
cd C-Compiler
- Edit test.c so it contains the expression you want to compile.
- Compile the compiler's C source files:
gcc -o razancompiler lexer.c parser.c ast.c codegen.c main.c
- Run the compiler on test.c to generate output.s:
./razancompiler test.c
- Assemble the generated output.s into an object file
as -o output.o output.s
- Link the object file into an executable
gcc -o rznprogram output.o
- Run the compiled program:
./rznprogram
- Print the program's return value:
echo $?
The compiler currently handles simple arithmetic in C like:
test.c: int main() {return 2+3;}
You can compile similar code with this compiler while studying how basic expressions are processed.
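One common way to process an expression like `2+3` is stack-style code generation: push each operand, pop both, apply the operator, and push the result. The sketch below assumes x86-64 assembly in AT&T syntax for the GNU assembler, along with a hypothetical `Node` type and `gen_program` function; the instructions that codegen.c actually writes to output.s may look different.

```c
#include <stdio.h>

/* Hypothetical AST node, mirroring a typical arithmetic expression tree. */
typedef enum { NODE_NUM, NODE_ADD, NODE_SUB, NODE_MUL, NODE_DIV } NodeKind;

typedef struct Node {
    NodeKind kind;
    int value;                 /* used only when kind == NODE_NUM */
    const struct Node *lhs;
    const struct Node *rhs;
} Node;

/* Emits code that leaves the value of the expression on top of the stack. */
static void gen_expr(const Node *n, FILE *out) {
    if (n->kind == NODE_NUM) {
        fprintf(out, "    pushq $%d\n", n->value);
        return;
    }

    gen_expr(n->lhs, out);                 /* left operand is pushed first */
    gen_expr(n->rhs, out);                 /* right operand ends up on top */
    fprintf(out, "    popq %%rcx\n");      /* rcx = right operand */
    fprintf(out, "    popq %%rax\n");      /* rax = left operand */

    switch (n->kind) {
        case NODE_ADD: fprintf(out, "    addq %%rcx, %%rax\n");  break;
        case NODE_SUB: fprintf(out, "    subq %%rcx, %%rax\n");  break;
        case NODE_MUL: fprintf(out, "    imulq %%rcx, %%rax\n"); break;
        case NODE_DIV:
            fprintf(out, "    cqto\n");    /* sign-extend rax into rdx:rax */
            fprintf(out, "    idivq %%rcx\n");
            break;
        default: break;
    }
    fprintf(out, "    pushq %%rax\n");     /* push the result */
}

/* Emits a main function whose return value is the expression result. */
void gen_program(const Node *root, FILE *out) {
    fprintf(out, "    .globl main\n");
    fprintf(out, "main:\n");
    gen_expr(root, out);
    fprintf(out, "    popq %%rax\n");      /* the return value goes in rax */
    fprintf(out, "    ret\n");
}
```

For the tree representing `2+3`, this sketch emits two pushes, two pops, an `addq`, and a final `popq %rax` before `ret`, so the program would return 5 under these assumptions. The approach trades efficiency for simplicity: every intermediate value goes through the stack, which avoids register allocation entirely.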
This project provided me with practical insights into:
- Compiler internals such as tokenization, parsing, and code generation.
- Abstract Syntax Tree (AST) design and traversal.
- Building a minimal but functional compilation pipeline.
- Low-level programming concepts underlying high-level languages.
- Only basic arithmetic operations are supported.
- No advanced C features like pointers, control flow, or functions.
- Minimal error handling and diagnostic messages.
- Code optimizations are not implemented.
Potential extensions to improve this compiler:
- Support for control flow constructs (if, while, for).
- Support for function definitions and calls.
- Enhanced type checking and semantic analysis.
- More extensive error reporting.
- Improvements to the generated assembly or machine code.
Contributions that enhance the compiler or extend its supported features are welcome. Feel free to open an issue or submit a pull request if something doesn't work.
Open source for educational and experimental use.
razancodes