A MiniJava semantic analyser built on top of JavaCC (lexer + parser, generated from minijava.jj) and JTB (Java Tree Builder — produces an AST and a visitor scaffold). Two custom visitors handle the actual semantic work.
Second homework of the Compilers course at the University of Athens (Department of Informatics & Telecommunications). Solo project.
Given a MiniJava source file, the analyser:
- Builds the symbol table for all classes, methods, and fields.
- Type-checks every expression, method call, return type, and assignment.
- Validates inheritance (parent class must be defined; method overrides must match signatures; types narrow correctly).
- Reports semantic errors with location info, or computes field/method offsets when the program is well-typed.
java Main path/to/program.javainput_files/minijava-examples-new/ ships with the standard MiniJava test programs (BinaryTree, BubbleSort, Factorial, LinearSearch, LinkedList, MoreThan4, QuickSort, TreeVisitor) plus their *-error.java counterparts to verify the analyser rejects bad code. offset-examples/ contains the reference offset outputs that well-typed programs should produce.
I split the work across two AST traversals so each pass has a clean responsibility.
Walks the tree once, building the symbol-table hierarchy:
- A global scope holding
ClassSymbols. - Each
ClassSymbolhas its own scope withVariableSymbols (fields) andMethodSymbols. - If a class inherits another, its scope's parent is the parent class's scope; otherwise it points to the global scope.
- Each method gets its own scope. Method arguments are stored in an ordered list (so we can match positional arguments at call sites later) and in the local-variable scope (matches Java's semantics — slight duplication, but it makes collision checks trivial).
Some semantic errors are caught here already:
- Name collisions (two classes with the same name; two fields with the same name within a class).
- An inheritance reference to a class that hasn't been defined yet.
A variable referencing a class defined later is fine — that gets resolved by Pass 2.
The full type checker. Every node returns its computed type as a String, which keeps the visitor cleanly decoupled from JTB's internals — no need to subclass or rewrite GJDepthFirst. Responsibilities:
- Every type referenced actually exists (checked against the symbol table).
- Method calls receive arguments matching the declared parameter types.
- Variables are assigned compatible types.
- Methods return what their signature claims.
- A subclass
BextendingAis usable whereverAis expected.
| File | Role |
|---|---|
symbol_table/Symbol.java |
Common base |
symbol_table/ClassSymbol.java |
Class with its own scope, parent pointer for inheritance |
symbol_table/MethodSymbol.java |
Method with parameter list + local scope |
symbol_table/VariableSymbol.java |
Field/local with declared type |
symbol_table/Scope.java |
Lookup with parent-chain resolution |
- Fields are implicitly
protected; methods are implicitlypublic. - Even within a method of the same class, fields cannot be accessed via
this.fieldorobj.fieldsyntax — the language spec doesn't permit it. The only legal dot access is.lengthonint[]andboolean[]. - Method scopes are flat. No nested blocks / inner scopes — keeps Pass 1 simple.
The Identifier node ended up handling a lot of context-dependent logic — distinguishing whether an identifier is a field reference, a local variable, a parameter, a class name, or a method name. That branching turned messier than I'd like, and I leaned on a few global "where am I in the tree" flags (e.g., the current argument index when parsing a method call) instead of threading proper state through the visitor.
If the assignment had asked for richer features — nested scopes inside methods, generics, real protected access semantics — I'd refactor away from the global-flag pattern and toward an explicit traversal-context object passed through the visitor. As-is for the assignment scope, "if it works, don't touch it" applied.
Part of a two-piece Compilers arc:
- compilers-hw1 — LL(1) parser construction + JFlex/JavaCUP translator
- compilers-hw2 (you are here) — MiniJava semantic analysis (JavaCC/JTB, two-visitor design)
MIT — applies to my own code in this repo. Generated parser code (MiniJavaParser*.java, Token*.java, JavaCharStream.java) comes from JavaCC and is governed by its license. Vendored jars (jar_files/javacc5.jar, jar_files/jtb132di.jar) retain their original licenses.