A statically-typed compiler implementing a subset of C, built with Flex and Bison. This project demonstrates fundamental compiler construction techniques including lexical analysis, parsing, semantic analysis, symbol table management, and type checking.
This compiler implements a C-like language with static type checking, supporting fundamental operations, control flow structures, and comprehensive error reporting. The implementation showcases core compiler design principles while maintaining a clean, educational codebase.
- Basic data types (int, bool)
- Variable declarations and assignments
- Arithmetic operations (+, -, *, /)
- Logical operations (and, or, not)
- Comparison operations (>, <, >=, <=, ==, !=)
- Control structures (if-else, while)
- Print and return statements
- Static type checking
- Type verification for operations
- Error reporting with line numbers
program -> statement_list | ε
statement_list -> statement
| statement_list statement
statement -> declaration ;
| assignment ;
| if_statement
| while_statement
| print_statement ;
| return_statement ;
declaration -> type_specifier IDENTIFIER
| type_specifier IDENTIFIER = expression
type_specifier -> int | bool
assignment -> IDENTIFIER = expression
if_statement -> IF ( expression ) { statement_list }
| IF ( expression ) { statement_list } ELSE { statement_list }
while_statement -> WHILE ( expression ) { statement_list }
```bnf
## Example Programs
### Valid Programs
#### 1. Control Structures (inputs/valid/control_flow.txt)
```c
int x = 5;
int y = 10;
bool flag = true;
if (x < y) {
print x; // Will print 5
} else {
print y;
}
while (flag) {
x = x + 1;
if (x >= 10) {
flag = false;
}
}
print x; // Will print 10
```c
This example demonstrates:
- Variable initialization
- If-else statements
- While loops
- Boolean conditions
- Arithmetic operations
#### 2. Operations Test (inputs/valid/operations_test.txt)
```c
// Arithmetic Operations
int a = 15;
int b = 3;
int result;
result = a + b;
print result; // Prints 18
result = a - b;
print result; // Prints 12
result = a * b;
print result; // Prints 45
result = a / b;
print result; // Prints 5
// Boolean Operations
bool test = true;
bool other = false;
bool final;
final = test && other;
print final; // Prints false
final = test || other;
print final; // Prints true
```c
This example shows:
- All arithmetic operations
- Logical operations
- Mixed declarations and assignments
- Print statements
### Invalid Programs
#### 1. Type Mismatch (inputs/invalid/type_mismatch.txt)
```c
int x = 5;
bool y = x; // Error: Cannot initialize bool with int
Error: Type mismatch in initialization - Cannot assign integer value to boolean variable
int a = 5;
b = 10; // Error: b not declaredError: Variable 'b' has not been defined
bool flag = true;
bool other = false;
bool result = flag + other; // Error: Cannot add booleansError: Cannot perform addition between boolean operands
int x = 5;
int y = 0;
int result = x / y; // Error: Division by zero
```c
Error: Division by zero
#### 5. Variable Redeclaration (inputs/invalid/var_redeclaration.txt)
```c
int x = 5;
int x = 10; // Error: x already declared
```c
Error: Variable 'x' already declared
#### 6. Invalid Token (inputs/invalid/invalid_token.txt)
```c
gibberish // Error: Unrecognized token
```c
Error: Lexical error - Unrecognized character
### Directory Structure for input filesinputs/ ├── valid/ │ ├── control_flow.txt │ └── operations_test.txt └── invalid/ ├── type_mismatch.txt ├── undefined_var.txt ├── invalid_operation.txt ├── div_by_zero.txt ├── var_redeclaration.txt └── invalid_token.txt
## Valid Input Examples
```c
// Variable declarations and initialization
int x = 5;
bool flag = true;
// Control structures
if (x > 0) {
print x;
} else {
x = 0;
}
while (flag) {
x = x + 1;
if (x > 10) {
flag = false;
}
}
```c
## Type Rules and Restrictions
### Type Rules
- Arithmetic operations (+, -, *, /) only work with integers
- Logical operations (and, or, not) only work with booleans
- Comparison operations return boolean values
- Variables must be declared before use
### Invalid Operations
```c
x = 5 + true; // Cannot mix int and bool
bool result = 5 or 5; // Logical operations need booleans
int res = true + true; // Cannot add booleans
print (12 and true); // Invalid types for AND
```c
## Building and Running
### Prerequisites
#### For Unix/Linux/MacOS
1. Install the required tools using your package manager:
```bash
# For Ubuntu:
sudo apt-get update
sudo apt-get install flex bison gcc make
# For MacOS with Homebrew:
brew install flex bison gcc make
-
Install GCC (MinGW):
- Download MinGW installer from https://sourceforge.net/projects/mingw/
- Add
C:\MinGW\binto your PATH
-
Install Flex and Bison:
- Download from https://gnuwin32.sourceforge.net/packages/flex.htm
- Download from https://gnuwin32.sourceforge.net/packages/bison.htm
- Install both packages
- Add
C:\GnuWin32\binto your PATH
# Using make
make clean
make
# Manual build
flex lexer.l
bison -d parser.y
gcc lex.yy.c parser.tab.c symbol_table.c operations.c -o compiler# Using make
mingw32-make clean
mingw32-make
# Manual build
flex lexer.l
bison -d parser.y
gcc lex.yy.c parser.tab.c symbol_table.c operations.c -o compiler.exe# Interactive mode
./compiler
# With input file
./compiler < input_file.txt# Interactive mode
compiler.exe
# With input file
compiler.exe < input_file.txt- "Variable already declared" - Attempting to redeclare a variable
- "Undefined variable" - Using a variable before declaration
- "Type mismatch" - Invalid type in operation or assignment
- "Division by zero" - Attempting to divide by zero
-
Lexical Analyzer (lexer.l)
- Tokenizes input
- Handles comments and whitespace
-
Parser (parser.y)
- Implements grammar rules
- Performs type checking
- Handles error reporting
-
Symbol Table
- Dynamic linked list implementation
- Stores variable information
- Supports lookup and insertion
-
Operations
- Type-checked arithmetic operations
- Logical operations
- Comparison operations
- Implementation: Linked list with O(n) lookup
- Scope: Single global scope
- Features: Duplicate detection, type storage, value management
- Static typing: All type errors caught at compile-time
- Strict enforcement: No implicit conversions
- Comprehensive error messages: Line numbers and context for all errors
- Bottom-up parsing: LALR(1) parser generated by Bison
- Operator precedence: Properly handles expression evaluation order
- Semantic actions: Embedded type checking and evaluation
- Memory management: Proper cleanup with symbol table deallocation
- Error recovery: Clear, actionable error messages with line numbers
- Extensible design: Modular operations system for easy feature additions
- Cross-platform: Works on Windows, Linux, and MacOS
simple-c-compiler/
├── lexer.l # Flex lexical analyzer specification
├── parser.y # Bison parser specification with grammar rules
├── symbol_table.h/c # Symbol table implementation
├── operations.h/c # Type-checked operation implementations
├── makefile # Build automation
├── grammar.md # Formal grammar specification
├── inputs/ # Test cases
│ ├── valid/ # Valid programs for testing
│ └── invalid/ # Error case demonstrations
└── README.md # This file
Potential areas for extension:
- Function declarations and calls
- Array support and pointer arithmetic
- Struct/union types
- Multiple scopes with scope stack
- Code generation (assembly/bytecode output)
- Optimization passes
- Additional data types (float, char, string)
This is an educational project demonstrating compiler construction fundamentals. Feel free to fork and extend with additional features.