Maze

Grammar-constrained code generation using vLLM + llguidance.

What It Does

Uses Lark grammars during LLM code generation to enforce syntactic validity.
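
The idea can be illustrated with a toy sketch of token masking. This is not llguidance's implementation: the vocabulary, the scores, and the `is_valid_prefix` helper are all hypothetical, and the "grammar" is hard-coded as `return ` followed by digits. At each step, tokens that cannot lead to a valid string are dropped before the highest-scoring one is picked.

```python
import re

# Hypothetical toy vocabulary with made-up scores (higher = preferred by the "model").
VOCAB_SCORES = {"print(": 0.9, "return ": 0.6, "4": 0.5, "2": 0.4, "x": 0.3}

# Toy target language: "return " followed by one or more digits.
FULL = re.compile(r"return [0-9]+")
HEAD = "return "


def is_valid_prefix(text: str) -> bool:
    """True if `text` can still be extended into a string matching FULL."""
    if len(text) <= len(HEAD):
        return HEAD.startswith(text)
    return text.startswith(HEAD) and re.fullmatch(r"[0-9]+", text[len(HEAD):]) is not None


def constrained_greedy(max_steps: int = 3) -> str:
    """Greedy decoding where tokens that break the prefix check are masked out."""
    out = ""
    for _ in range(max_steps):
        allowed = [tok for tok in VOCAB_SCORES if is_valid_prefix(out + tok)]
        if not allowed:
            break
        out += max(allowed, key=VOCAB_SCORES.get)
    return out


print(constrained_greedy())  # return 44
```

Without the mask, greedy selection would pick `print(` at every step because it has the highest score; with the mask, only grammar-compatible tokens are ever candidates.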

Status

Working:

Python, TypeScript, Rust completion grammars
Modal deployment (vLLM 0.11.0 + llguidance + Qwen2.5-Coder-32B)
Smart grammar selection (auto-detects completion vs full generation)
Validated constraint enforcement (13 passing tests)

Not yet implemented:

Go and Zig grammars (need more work)
Type-aware generation (type system exists but not integrated)
Full benchmark validation (HumanEval, MBPP)

Quick Start

git clone https://github.com/rand/maze.git
cd maze
uv pip install -e ".[dev]"

# Run fast unit tests
uv run pytest tests/unit/test_core/test_types.py -v

# Run validation tests (requires Modal endpoint)
export MODAL_ENDPOINT_URL=https://rand--maze-inference-mazeinferenceserver-fastapi-app.modal.run
uv run pytest tests/validation/test_constraint_enforcement.py::TestComplexScenarios -v

How It Works

Grammar selection: Detects whether the prompt is a completion (e.g. def foo():) or a full-generation request (e.g. "write a function") and picks the matching grammar
Constraint enforcement: llguidance masks invalid tokens during generation
Validation: The output is parsed with the language's own tooling (ast.parse, tsc, rustc)
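
The grammar selection step could be sketched as follows. This is an assumed heuristic for Python prompts only, not Maze's actual detection logic; the function names, the regex, and the returned labels are hypothetical.

```python
import re

# Assumed heuristic: a prompt ending in an open definition header
# (optionally followed by whitespace) is treated as a completion.
_OPEN_DEF = re.compile(r"(?:^|\n)[ \t]*(?:async\s+def|def|class)\s+\w+[^\n]*:\s*$")


def looks_like_completion(prompt: str) -> bool:
    """True if the prompt ends with an unfinished def/class header."""
    return bool(_OPEN_DEF.search(prompt))


def select_grammar_kind(prompt: str) -> str:
    """Return a hypothetical label for the grammar that should be used."""
    return "function_body" if looks_like_completion(prompt) else "full_function"


print(select_grammar_kind("def foo():"))            # function_body
print(select_grammar_kind("write a sum function"))  # full_function
```

A real detector would need per-language rules (TypeScript, Rust) and would have to handle prompts that mix prose and code.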

Example:

from maze.orchestrator.providers.modal import ModalProviderAdapter
from maze.orchestrator.providers import GenerationRequest

adapter = ModalProviderAdapter()

request = GenerationRequest(
    prompt="def get_answer():\n    ",
    grammar="start: simple\nsimple: \"return \" NUMBER\nNUMBER: /[0-9]+/",
    max_tokens=16,
    temperature=0.0,
)

response = adapter.generate(request)
# Output: "return 42069420694206"
# Parses successfully: ast.parse("def get_answer():\n    return 42...")
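
The validation step for Python can be reproduced locally with the standard library. A minimal sketch, assuming the completion is simply appended to the prompt; the helper name `is_valid_python` is hypothetical.

```python
import ast


def is_valid_python(prompt: str, completion: str) -> bool:
    """Return True if prompt + completion parses as Python source."""
    try:
        ast.parse(prompt + completion)
    except SyntaxError:
        return False
    return True


prompt = "def get_answer():\n    "
print(is_valid_python(prompt, "return 42069420694206"))  # True
print(is_valid_python(prompt, "return ("))               # False
```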

Performance

Measured on Modal (vLLM 0.11.0, Qwen2.5-Coder-32B, A100-80GB):

Metric	Value
Latency (with grammar)	1.2 s avg
Latency (unconstrained)	0.4 s
Syntax validity (constrained)	100% (3/3)
Syntax validity (unconstrained)	0% (0/3)
Overhead	~3x latency (1.2 s vs 0.4 s), accepted as the cost of syntactic validity

Critical Requirements

Grammar syntax - llguidance doesn't support Lark's inline-rule prefix (?), so write rule names without it:

# ❌ WRONG
?start: function_body

# ✅ CORRECT
start: function_body
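
A grammar written for Lark can be adapted mechanically. The sketch below strips the `?` prefix from rule definitions; the helper name is hypothetical and the regex only covers definitions that start a line. In Lark the prefix affects the shape of the parse tree, not the set of accepted strings, so removing it should not change what the grammar allows.

```python
import re

# Matches a "?" that directly prefixes a rule name at the start of a line.
_INLINE_PREFIX = re.compile(r"^([ \t]*)\?(?=[A-Za-z_])", re.MULTILINE)


def strip_inline_markers(grammar: str) -> tuple[str, int]:
    """Remove leading '?' from rule definitions; return (new_grammar, count)."""
    return _INLINE_PREFIX.subn(r"\1", grammar)


fixed, count = strip_inline_markers('?start: function_body\nfunction_body: "pass"')
print(count)  # 1
print(fixed)
```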

Completion vs full generation - use the right grammar:

# Completion (prompt has partial code)
prompt = "def foo():"
grammar = PYTHON_FUNCTION_BODY  # Starts with body only

# Full generation (prompt is description)
prompt = "write a sum function"
grammar = PYTHON_FUNCTION  # Starts with "def"

See docs/GRAMMAR_CONSTRAINTS.md for details.

Repository Structure

src/maze/
├── core/           # Types, constraints, pipeline
├── synthesis/      # Grammar templates (python.py, typescript.py, rust.py)
├── orchestrator/   # Provider adapters (modal.py)
├── indexer/        # Language indexers (extract symbols, types)
├── validation/     # Syntax/type checking
└── repair/         # Error correction (not fully integrated)

tests/
├── unit/           # Fast unit tests (core types, config)
├── validation/     # Constraint enforcement tests (require Modal)
└── integration/    # Integration tests (require external services)

deployment/modal/   # vLLM + llguidance on Modal.com
docs/               # Documentation

Documentation

GRAMMAR_CONSTRAINTS.md: Complete technical guide
QUICK_REFERENCE.md: One-page critical rules
TEST_RESULTS_SUMMARY.md: Test validation evidence
CONTRIBUTING.md: Development guidelines

Modal Deployment

# Deploy
modal deploy deployment/modal/modal_app.py

# Endpoint
curl https://rand--maze-inference-mazeinferenceserver-fastapi-app.modal.run/health

# Generate with grammar
curl -X POST https://rand--maze-inference-mazeinferenceserver-fastapi-app.modal.run/generate \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "def test():\n    ",
    "grammar": "start: simple\nsimple: \"return \" NUMBER\nNUMBER: /[0-9]+/",
    "max_tokens": 16,
    "temperature": 0.0
  }'
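
The same request can be built from Python with only the standard library. This is a sketch: `build_generate_request` is a hypothetical helper, the payload fields mirror the curl call above, and the response schema is not documented here, so the example stops at constructing the request.

```python
import json
import urllib.request


def build_generate_request(base_url: str, prompt: str, grammar: str,
                           max_tokens: int = 16,
                           temperature: float = 0.0) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /generate endpoint."""
    payload = {
        "prompt": prompt,
        "grammar": grammar,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_generate_request(
    "https://rand--maze-inference-mazeinferenceserver-fastapi-app.modal.run",
    prompt="def test():\n    ",
    grammar='start: simple\nsimple: "return " NUMBER\nNUMBER: /[0-9]+/',
)
# To send it: urllib.request.urlopen(req, timeout=60).read()
```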

Known Issues

Complex grammars with INDENT/DEDENT can fail (use simple grammars)
Left-recursive grammars cause incomplete generation
Examples require Modal endpoint (skip in CI)
Type system exists but not integrated into generation yet
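
For the left-recursion issue, a rough pre-flight check can flag rules whose first symbol is the rule itself. This is a simplified sketch, not part of Maze: it only finds direct left recursion, and it splits alternatives on every `|`, so a `|` inside a string or regex literal will confuse it. A common workaround, assumed here rather than confirmed by the project docs, is to rewrite such rules with repetition, e.g. `expr: NUMBER ("+" NUMBER)*`.

```python
import re

# Start of a rule or terminal definition, e.g. "expr:", "?expr:", "NUMBER:".
_DEF_HEAD = re.compile(r"^[?!]?([A-Za-z_]\w*)(?:\.\d+)?\s*:", re.MULTILINE)


def directly_left_recursive(grammar: str) -> list[str]:
    """Return names of rules with an alternative that starts with the rule itself."""
    heads = list(_DEF_HEAD.finditer(grammar))
    found = []
    for i, match in enumerate(heads):
        name = match.group(1)
        end = heads[i + 1].start() if i + 1 < len(heads) else len(grammar)
        body = grammar[match.end():end]
        for alternative in body.split("|"):
            symbols = alternative.split()
            if symbols and symbols[0] == name:
                found.append(name)
                break
    return found


left = 'start: expr\nexpr: expr "+" NUMBER | NUMBER\nNUMBER: /[0-9]+/'
right = 'start: expr\nexpr: NUMBER ("+" NUMBER)*\nNUMBER: /[0-9]+/'
print(directly_left_recursive(left))   # ['expr']
print(directly_left_recursive(right))  # []
```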

Testing

# Core unit tests (fast, no external deps)
uv run pytest tests/unit/test_core/test_types.py -v
uv run pytest tests/unit/test_config.py -v

# Constraint enforcement (requires Modal)
export MODAL_ENDPOINT_URL=https://...
uv run pytest tests/validation/test_constraint_enforcement.py::TestComplexScenarios -v

License

MIT - see LICENSE
