nuPython Interpreter & Debugger

A complete interpreter and source-level debugger for nuPython, a Python-like programming language. This project implements the full execution pipeline from parsing to runtime, along with an interactive debugging environment similar to GDB.

Overview

This project consists of four major components that work together to parse, execute, and debug nuPython programs:

Program Execution Engine - Interprets and executes nuPython code
Memory Management System - Simulates RAM for variable storage
Interactive Debugger - Provides step-by-step execution and inspection
Program Graph Representation - Internal AST-like structure for code

What It Does

The interpreter can execute nuPython programs that include:

Variable assignments (integers, floats, strings, booleans)
Arithmetic operations (+, -, *, /, %, **)
Comparison operations (==, !=, <, <=, >, >=)
String concatenation
Control flow (if/elif/else statements, while loops)
Built-in functions (print(), input(), int(), float())
Pointer operations (address-of &, dereference *)

The debugger provides GDB-like functionality:

Step-by-step execution
Variable inspection and modification
Memory state visualization
Execution history tracking
Breakpoint-style program navigation

Core Components I Built

1. Execution Engine (`execute.c` / `execute.h`)

The heart of the interpreter - executes the parsed program graph statement by statement.

Key Features:

Expression Evaluation: Handles unary and binary expressions with proper type coercion
- Automatic int/float conversion for mixed-type arithmetic
- String concatenation support
- Boolean comparison operations for all types
Statement Execution:
- Assignment statements (including pointer dereference assignments)
- Function calls (print, input, int, float)
- Control flow (if/elif/else, while loops)
Type System: Manages multiple data types with proper error handling
- Integers, floats, strings, booleans, pointers, None
- Runtime type checking and conversion
Error Handling: Comprehensive semantic error detection
- Undefined variables
- Invalid operand types
- Division/modulus by zero
- Type conversion failures

Notable Implementation Details:

execute_expr(): Unified expression evaluator that handles both simple values and complex binary operations
execute_binary_expression(): Performs arithmetic/comparison operations with automatic type promotion
Helper functions for type-specific operations (int, real, string)
Proper memory management for dynamically allocated strings in concatenation

2. Memory Management System (`ram.c` / `ram.h`)

A simulated RAM implementation that stores Python variables and their values.

Architecture:

Dual Data Structure Design:
- cells[]: Dynamic array storing actual variable values
- map[]: Alphabetically sorted array mapping variable names to cell addresses
Dynamic Growth: Automatically doubles capacity when full (starts at 4, grows to 8, 16, 32...)

Core Operations:

ram_write_cell_by_name(): Inserts variables in sorted order (O(log n) search + O(n) insertion)
ram_read_cell_by_name(): Binary search lookup (O(log n))
ram_read_cell_by_addr(): Direct address access (O(1))
ram_get_addr(): Retrieves memory address for a variable

Features:

Binary search for efficient variable lookup
Deep copying for safe value retrieval
String duplication for proper memory ownership
Comprehensive cleanup in destructor

Design Decisions:

Kept map sorted alphabetically for fast binary search
Separated name-to-address mapping from value storage
Addresses never change once assigned (variables have stable memory locations)

3. Interactive Debugger (`debugger.cpp` / `debugger.h`)

A GDB-style debugging interface for nuPython programs.

Debugging Commands:

s - Step: Execute current statement and advance
w - Where: Show next statement to execute
p varname - Print variable value and type
c varname newvalue - Change variable value
sm - Show memory: Display all variables
e - Execution history: List executed line numbers
pg - Print program graph: Show entire program structure
q - Quit debugger

Implementation Highlights:

Program Graph Manipulation:

Maintains "current" and "next" statement pointers
Unlinks program graph to execute one statement at a time
Properly handles control flow (if statements, while loops with multiple next pointers)
Relinks graph after execution to maintain program structure

State Management:

Tracks execution history (line numbers executed)
Detects execution completion vs. execution failure
Prevents stepping beyond program end or after errors

Variable Inspection/Modification:

Type-aware variable printing (shows both type and value)
Dynamic variable modification during execution
Proper type conversion for new values (handles strings, ints, floats, booleans)

Advanced Features:

Handles while loops by evaluating conditions and choosing between loop body or next statement
Supports if/elif/else by evaluating conditions and following true/false paths
Preserves program graph integrity for re-execution or printing

4. Main Program (`main.c`)

Orchestrates the complete execution pipeline.

Pipeline:

Input handling (file or keyboard)
Parsing (validates syntax using provided parser)
Program graph construction
RAM initialization
Program execution
Memory state display
Cleanup

Project Structure

├── main.c                 # Main execution pipeline
├── execute.c/h            # Program execution engine 
├── ram.c/h                # Memory management system 
├── debugger.cpp/h         # Interactive debugger 
├── debugger_run.cpp       # Debugger main program
├── programgraph.h         # Program representation 
├── parser.h               # Syntax parser 
├── scanner.h              # Lexical scanner 
├── token.h                # Token definitions 
├── tokenqueue.h           # Token queue structure 
├── nupython.*.o           # Pre-compiled parser/scanner for program execution
├── nupython_e.*.o         # Pre-compiled parser/scanner for debugger
└── test-*.py              # Example nuPython programs

Technical Highlights

Expression Evaluation Strategy

The interpreter uses a recursive approach for expression evaluation:

Determine if expression is unary or binary
For binary: evaluate left-hand side, right-hand side, then operator
Perform automatic type coercion (int→float when needed)
Return result as dynamically allocated RAM_VALUE

Memory Organization

Variables are stored with stable addresses:

First assigned variable gets address 0
Second gets address 1, etc.
Addresses never change (important for pointer operations)
Map keeps sorted by name for fast lookup, but addresses remain sequential

Control Flow Handling

The debugger implements control flow by:

Unlinking the program graph at the current statement
Evaluating conditions when needed (for if/while)
Choosing appropriate next statement based on condition
Relinking after execution
Advancing to the chosen next statement

Error Handling Philosophy

Runtime errors are detected and reported immediately:

Undefined variable access
Type mismatches in operations
Division/modulus by zero
Invalid pointer operations
Execution stops at first error with descriptive message

Example Usage

Running a Program (Interpreter Mode)

./a.out test-while.py

Output:

**parsing successful, valid syntax
**building program graph...
**executing...
Enter an integer> 5
project:a
project:aa
project:aaa
project:aaaa
project:aaaaa
**done
**MEMORY PRINT**
Size: 4
Capacity: 4
Contents:
 N: int, 5
 i: int, 6
 result: str, 'project:aaaaa'
 s: str, '5'
**END PRINT**

Interactive Input (Interpreter Mode)

./a.out
nuPython input (enter $ when you're done)>
x = 10
print(x ** 2)
$

Output:

**parsing successful, valid syntax
**building program graph...
**executing...
100
**done
**MEMORY PRINT**
Size: 1
Capacity: 4
Contents:
 x: int, 10
**END PRINT**

Debugging a Program (Debugger Mode)

./debugger test-while.py

Interactive Session:

** Simple gdb for nuPython **

Enter a command, h for help > 
s
Enter an integer> 5

Enter a command, h for help > 
w
1: result = "project:"

Enter a command, h for help > 
p N
N (int): 5

Enter a command, h for help > 
sm
**MEMORY PRINT**
Size: 2
Capacity: 4
Contents:
 N: int, 5
 s: str, '5'
**END PRINT**

Enter a command, h for help > 
s

Enter a command, h for help > 
p result
result (str): project:

Enter a command, h for help > 
q

** Done **

Building the Project

Setup

IMPORTANT: Before building, you must copy the correct pre-compiled object files for your system architecture.

For Intel-based systems (x86/x64/AMD):

cp nupython.x86.o nupython.o
cp nupython_e.x86.o nupython_e.o

For Apple Silicon Macs (M1/M2/M3):

cp nupython.arm.o nupython.o
cp nupython_e.arm.o nupython_e.o

Compilation & Running

Interpreter Mode

The interpreter (main.c) runs nuPython programs directly and shows the final memory state.

# Build the interpreter
make build

# Run with keyboard input (type your program, end with $)
make run

# Run with a file
./a.out test-expr.py

# Check for memory leaks with valgrind
make valgrind file=test-while.py

Debugger Mode

The debugger (debugger_run.cpp) provides step-by-step execution with GDB-like commands.

# Build the debugger
make build-debugger

# Run the debugger with a file
./debugger test-funct.py

# Or use the make command
make run-debugger

# Check for memory leaks with valgrind
make valgrind-debugger file=test-if.py

Key Differences:

Interpreter runs the entire program and shows final state
Debugger lets you step through line-by-line, inspect/modify variables, and view execution history

Test Programs

The project includes several test programs demonstrating different features:

test-expr.py: Expressions with integers, floats, strings, and comparisons
test-funct.py: Built-in functions (input, int, float)
test-if.py: If/elif/else statements
test-while.py: While loops and string concatenation

Learning Outcomes

This project demonstrated:

Low-level systems programming in C/C++
Memory management with malloc/free and dynamic arrays
Data structure design (sorted arrays, binary search)
Interpreter architecture (scanning → parsing → execution)
Debugger implementation (program graph manipulation, state tracking)
Type systems and runtime type checking
Error handling and semantic analysis

The most challenging aspects were:

Managing the program graph unlinking/relinking for single-step execution
Implementing proper type coercion for mixed-type expressions
Handling string memory allocation during concatenation
Coordinating control flow between if/while conditions and statement advancement

Key Implementation Insights

Why Binary Search for Variables?

With variables stored alphabetically, binary search provides O(log n) lookup time instead of O(n) linear search. While insertion is still O(n) due to shifting elements, lookups are far more frequent in typical programs.

Why Unlink the Program Graph?

The debugger needs to execute exactly one statement at a time. By unlinking the current statement from the next, we can call the existing execute() function (which normally runs entire programs) to execute just that single statement. After execution, we relink the graph and manually advance to the next statement.

Why Dynamic String Allocation?

When concatenating strings like "hello" + " world", we can't know the final length at compile time. The interpreter allocates exactly enough memory for the result, copies both strings into it, and stores the pointer in RAM. This memory is properly freed when the variable is reassigned or when RAM is destroyed.

Why Separate Execution from Parsing?

The program graph serves as an intermediate representation between raw source code and execution. This separation allows:

The debugger to traverse the program structure
Error messages with accurate line numbers
Potential for optimization passes (not implemented here)
Clean separation of concerns (parsing vs. execution)

Credits

Built by Jay Rao for CS 211 at Northwestern University.

My Work: execute.c, ram.c, debugger.cpp, main.c
Provided: Parser, scanner, token system, program graph structure
Course: CS 211 - Fundamentals of Computer Programming II
Instructor: Prof. Joe Hummel

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

nuPython Interpreter & Debugger

Overview

What It Does

Core Components I Built

1. Execution Engine (`execute.c` / `execute.h`)

2. Memory Management System (`ram.c` / `ram.h`)

3. Interactive Debugger (`debugger.cpp` / `debugger.h`)

4. Main Program (`main.c`)

Project Structure

Technical Highlights

Expression Evaluation Strategy

Memory Organization

Control Flow Handling

Error Handling Philosophy

Example Usage

Running a Program (Interpreter Mode)

Interactive Input (Interpreter Mode)

Debugging a Program (Debugger Mode)

Building the Project

Setup

Compilation & Running

Interpreter Mode

Debugger Mode

Test Programs

Learning Outcomes

Key Implementation Insights

Why Binary Search for Variables?

Why Unlink the Program Graph?

Why Dynamic String Allocation?

Why Separate Execution from Parsing?

Credits

About

Uh oh!

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
.gitattributes		.gitattributes
README.md		README.md
a.out		a.out
debugger		debugger
debugger.cpp		debugger.cpp
debugger.h		debugger.h
debugger_execute.h		debugger_execute.h
debugger_run.cpp		debugger_run.cpp
execute.c		execute.c
execute.h		execute.h
main.c		main.c
makefile		makefile
nupython.arm.o		nupython.arm.o
nupython.x86.o		nupython.x86.o
nupython_e.arm.o		nupython_e.arm.o
nupython_e.x86.o		nupython_e.x86.o
parser.h		parser.h
programgraph.h		programgraph.h
ram.c		ram.c
ram.h		ram.h
scanner.h		scanner.h
test-expr.py		test-expr.py
test-funct.py		test-funct.py
test-if.py		test-if.py
test-while.py		test-while.py
token.h		token.h
tokenqueue.h		tokenqueue.h

jayrao213/nuPython

Folders and files

Latest commit

History

Repository files navigation

nuPython Interpreter & Debugger

Overview

What It Does

Core Components I Built

1. Execution Engine (execute.c / execute.h)

2. Memory Management System (ram.c / ram.h)

3. Interactive Debugger (debugger.cpp / debugger.h)

4. Main Program (main.c)

Project Structure

Technical Highlights

Expression Evaluation Strategy

Memory Organization

Control Flow Handling

Error Handling Philosophy

Example Usage

Running a Program (Interpreter Mode)

Interactive Input (Interpreter Mode)

Debugging a Program (Debugger Mode)

Building the Project

Setup

Compilation & Running

Interpreter Mode

Debugger Mode

Test Programs

Learning Outcomes

Key Implementation Insights

Why Binary Search for Variables?

Why Unlink the Program Graph?

Why Dynamic String Allocation?

Why Separate Execution from Parsing?

Credits

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

1. Execution Engine (`execute.c` / `execute.h`)

2. Memory Management System (`ram.c` / `ram.h`)

3. Interactive Debugger (`debugger.cpp` / `debugger.h`)

4. Main Program (`main.c`)

Packages