Skip to content

ulissestecca/smallPython

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Custom Python-like Language Interpreter

A Java-based, tree-walk interpreter for a custom programming language. The language features a Python-like, indentation-based syntax, dynamic typing, and standard control flow structures.

This project implements a full evaluator, including lexical analysis (with indentation tracking), recursive descent parsing, Abstract Syntax Tree (AST) generation, and a visitor-based interpreter.

Language Features

  • Indentation-based blocks: Uses significant whitespace (indentation/dedentation) to define scope, similar to Python.
  • Dynamic Typing: Supports numbers (doubles), strings, and booleans with dynamic evaluation.
  • Control Flow: if/else conditionals and while loops.
  • Logical Operators: and, or, and not.
  • Relational & Arithmetic Operations: Standard mathematical and comparison operators.
  • Built-in Print: print() statement supporting multiple arguments.

Grammar Specifications

The language is built on the following Context-Free Grammar:

CompilationUnit := { Statement }
Statement       := (IfStatement | WhileStatement | Expression | PrintStatement) NEWLINE
IfStatement     := 'if' Expression ':' BlockStatement [ 'else' ':' BlockStatement ]
WhileStatement  := 'while' Expression ':' BlockStatement

PrintStatement  := 'print' '(' [ Expression { ',' Expression } ] ')'
BlockStatement  := NEWLINE INDENT { Statement } DEDENT
Expression      := E1 { 'or' E1 }
E1              := E2 { 'and' E2 }
E2              := [ 'not' ] E3
E3              := E4 { ('<' | '<=' | '>' | '>=' | '==' | '!=') E4 }
E4              := E5 { ('+' | '-') E5 }
E5              := E6 { ('*' | '/' | '%') E6 }
E6              := String | Number | Identifier { '=' Expression } | '(' Expression ')'

Identifier      := (Letter | '_') { Letter | Digit | '_' }
Number          := Digit { Digit } [ '.' { Digit } ]
String          := '"' { AnyCharExceptDoubleQuoteOrNewline } '"'

A Note on Assignments as Expressions

You may notice that the grammar allows assignments to be evaluated as expressions (e.g., Identifier { '=' Expression } inside the E6 production). This deviates from the original Python language, which strictly treats assignments as statements.

This design choice ensures the grammar remains strictly LL(1). If Assignment were defined as a separate Statement production, the recursive descent parser would encounter a structural ambiguity when reading an Identifier. It would not be able to decide whether it was evaluating a standard mathematical/logical expression starting with a variable or parsing an assignment without utilizing a two-token lookahead, which would make the grammar LL(2).

Project Architecture

  • Lexer.java & Token.java: Scans the raw source code and converts it into a sequence of tokens. It maintains a stack to track indentation levels, generating synthetic INDENT and DEDENT tokens to pass structural block information to the parser.
  • Parser.java: A recursive descent parser that strictly follows the LL(1) grammar above. It reads tokens and constructs the Abstract Syntax Tree.
  • AST.java: Defines the node structures for the Abstract Syntax Tree (Expressions and Statements) and sets up the interfaces for the Visitor design pattern.
  • Interpreter.java: Walks the AST using the Visitor pattern to evaluate expressions and execute statements. It manages a runtime environment (a HashMap) to keep track of variable states and handles truthiness, equality, and arithmetic logic.
  • Main.java: The entry point that orchestrates the pipeline: File Reading -> Lexer -> Parser -> Interpreter.

About

A Java-based, tree-walk interpreter for a custom Python-like programming language. The project implements a complete pipeline including a stateful Lexer, a recursive descent LL(1) Parser, and a Visitor-pattern AST Interpreter.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages