A self-evolving autonomous coding agent built on the pi.dev harness. MaxCode combines multi-model orchestration, MemGPT-style 3-tier memory, hyperbolic embeddings, and a Karpathy-style autoresearch ratchet into a single system that gets better the more you use it.
- Autoresearch Ratchet: Every code change is automatically verified (typecheck, tests). If verification fails, the change is reverted. If it passes, the agent looks for further improvements. Quality only goes UP — never down.
- 3-Tier Memory (MemGPT-style): Working memory (hot, in-context), session memory (warm, LanceDB vectors), and archival memory (cold, persistent store). Entities are auto-extracted and relationships tracked.
- Multi-Brain Architecture: Different LLMs for different roles — an architect for planning, a builder for implementation, a sentinel for code review, and local embeddings for search.
- Skill Accumulation: Successful task patterns are automatically extracted and reused. The agent literally learns from its own work.
- Self-Evolution: A genetic algorithm mutates agent parameters (temperature, context window, prompt strategies) and keeps only mutations that improve benchmark scores.
┌─────────────────────────────────────────────┐
│ Agent Loop │
│ observe → think → act → validate │
├─────────────────────────────────────────────┤
│ Context Builder │ LLM Router │ Tools │
├───────────────────┼──────────────┼──────────┤
│ Working Memory │ Session Mem │ Archival │
│ (Map, instant) │ (LanceDB) │ (disk) │
├───────────────────┴──────────────┴──────────┤
│ Evolution Engine │
│ benchmark → mutate → evaluate → keep │
└─────────────────────────────────────────────┘
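The loop in the diagram can be sketched as follows. This is an illustrative TypeScript outline only; the interface and function names are assumptions, not the actual `agent-loop.ts` API:

```typescript
// Hypothetical sketch of the observe → think → act → validate loop.
// The real agent-loop.ts wires in the LLM router and tool executor.
type StepResult = { action: string; output: string; verified: boolean };

interface Agent {
  observe(): string;                  // gather context (files, memory, errors)
  think(observation: string): string; // pick the next action via the LLM
  act(action: string): string;        // run a tool (edit, bash, search)
  validate(output: string): boolean;  // typecheck + tests (the ratchet gate)
}

function runLoop(agent: Agent, maxSteps = 10): StepResult[] {
  const trace: StepResult[] = [];
  for (let i = 0; i < maxSteps; i++) {
    const observation = agent.observe();
    const action = agent.think(observation);
    if (action === "done") break;     // agent decides the task is complete
    const output = agent.act(action);
    const verified = agent.validate(output);
    trace.push({ action, output, verified });
  }
  return trace;
}
```

Failed validations stay in the trace, so the next `observe()` call can feed the error back to the model.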
core/ # Agent loop, LLM router, tool executor, context builder
agent-loop.ts # Main ReAct loop with THE RATCHET (auto-verify + auto-improve)
llm-router.ts # Routes requests to the right model by role
tool-executor.ts # Executes tools (read, write, edit, bash, search, embed)
context-builder.ts # Assembles the context window with token budgeting
minimax-client.ts # MiniMax API client (builder model)
ollama-client.ts # Ollama client (sentinel + embeddings)
config.ts # Environment config loader
types.ts # Shared type definitions
memory/ # 3-tier memory system
working-memory.ts # Hot: in-context Map with temporal decay
archival-store.ts # Cold: LanceDB vector store for long-term facts
entity-extractor.ts # Auto-extract entities and relationships from conversations
reflection-engine.ts # Generate insights from accumulated memories
skill-accumulator.ts # Extract reusable skills from successful tasks
conflict-resolver.ts # Handle contradictory memories
index.ts # Unified memory interface
types.ts # Memory type definitions
evolution/ # Self-improvement ratchet
evolution.ts # Genetic algorithm over agent configs
benchmark.ts # Task benchmarks for measuring improvement
agent-config.ts # Mutable agent parameters (the genome)
embeddings/ # Vector search
lance-store.ts # LanceDB vector store wrapper
hybrid-search.ts # BM25 + vector hybrid retrieval
anti-slop/ # Output quality filters
scripts/ # Utility scripts
tests/ # Test suite
docs/ # Documentation
- Node.js ≥ 20
- pi.dev — coding agent harness (provides the architect LLM)
- Ollama — local LLM server for sentinel + embeddings
- GPU — ≥12GB VRAM recommended (RTX 3060+ or equivalent)
ollama pull qwen2.5-coder:14b # Sentinel (code review, verification)
ollama pull nomic-embed-text  # Embeddings (vector search)
- MiniMax API key — for the builder model (or swap in any OpenAI-compatible API)
# 1. Clone this repo
git clone https://github.com/initiatesofzeus-design/maxcode.git
cd maxcode
# 2. Install dependencies
npm install
# 3. Configure environment
cp .env.example .env
# Edit .env with your API keys
# 4. Start Ollama (if not running)
ollama serve
# 5. Verify the build
node verify-build.mjs
# 6. Run tests
npm test
# 7. Run with pi.dev
pi  # From the maxcode directory
Every code change goes through automatic verification:
- Agent proposes a change (edit, write)
- TypeScript type-checker runs automatically
- Tests run automatically
- If either fails → change is reverted, error fed back to the agent
- If both pass → agent looks for further improvements (simplify, optimize)
- Improvements only kept if they STILL pass verification
Code quality only goes UP. That's the ratchet.
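The steps above can be sketched as a guard around every write. This is an illustrative TypeScript sketch; `typecheck`, `runTests`, `snapshot`, and `restore` are hypothetical helpers, not the project's actual API:

```typescript
// Illustrative ratchet: apply a change, verify it, revert on failure.
interface Verifier {
  typecheck(): boolean; // hypothetical: run tsc and report success
  runTests(): boolean;  // hypothetical: run the test suite and report success
}

function applyWithRatchet(
  apply: () => void,            // the agent's proposed change
  snapshot: () => string,       // capture the current state
  restore: (s: string) => void, // roll back to a captured state
  v: Verifier,
): boolean {
  const before = snapshot();
  apply();
  if (v.typecheck() && v.runTests()) return true; // keep: quality went up
  restore(before);                                // revert: quality never goes down
  return false;
}
```

A passing change becomes the new baseline; any follow-up "improvement" runs through the same gate before it is kept.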
Inspired by MemGPT's tiered memory architecture:
| Tier | Storage | Access Speed | Contents |
|---|---|---|---|
| Working | In-context Map | Instant | Current task, recent entities |
| Session | LanceDB vectors | ~10ms | Conversation history, extracted facts |
| Archival | Disk + LanceDB | ~50ms | Long-term knowledge, learned skills |
Entities and relationships are automatically extracted from conversations. The reflection engine periodically generates insights from accumulated memories.
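A tiered lookup like the table above might be sketched as follows. This is a hypothetical outline, with the session and archival tiers stubbed as async lookups rather than real LanceDB calls:

```typescript
// Hypothetical tiered recall: check the hot Map first, then fall back
// to the warm and cold stores, promoting hits into working memory.
type Fact = { key: string; value: string; tier: "working" | "session" | "archival" };
type ColdLookup = (key: string) => Promise<string | null>;

class TieredMemory {
  private working = new Map<string, string>(); // hot, in-context
  constructor(
    private session: ColdLookup,  // warm, e.g. a LanceDB vector query
    private archival: ColdLookup, // cold, disk-backed store
  ) {}

  remember(key: string, value: string): void {
    this.working.set(key, value);
  }

  async recall(key: string): Promise<Fact | null> {
    const hot = this.working.get(key);
    if (hot !== undefined) return { key, value: hot, tier: "working" };
    const warm = await this.session(key);
    if (warm !== null) {
      this.working.set(key, warm); // promote on access
      return { key, value: warm, tier: "session" };
    }
    const cold = await this.archival(key);
    return cold !== null ? { key, value: cold, tier: "archival" } : null;
  }
}
```

Promoting a warm hit into the working Map is one simple way to keep frequently used facts at instant-access speed; the real system adds temporal decay on top.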
| Role | Default Model | Purpose |
|---|---|---|
| Architect | Via pi.dev | High-level planning, architecture decisions |
| Builder | minimax-m2.7 | Implementation, code generation |
| Sentinel | qwen2.5-coder:14b | Code review, verification, security checks |
| Embeddings | nomic-embed-text | Semantic search, memory retrieval |
You can swap any model — just update .env.
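The role-to-model mapping could look like this. The env var names here (`MAXCODE_*_MODEL`) are assumptions mirroring the "just update .env" idea, not the project's actual config keys:

```typescript
// Hypothetical role → model routing with env-var overrides.
type Role = "architect" | "builder" | "sentinel" | "embeddings";

const DEFAULTS: Record<Role, string> = {
  architect: "pi.dev",            // provided by the harness
  builder: "minimax-m2.7",
  sentinel: "qwen2.5-coder:14b",
  embeddings: "nomic-embed-text",
};

// In practice `env` would come from process.env after loading .env.
function modelFor(role: Role, env: Record<string, string | undefined>): string {
  // e.g. MAXCODE_BUILDER_MODEL=some-other-model swaps the builder
  // without any code changes.
  return env[`MAXCODE_${role.toUpperCase()}_MODEL`] ?? DEFAULTS[role];
}
```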
The evolution engine treats agent configuration as a genome:
- Temperature, max tokens, system prompt fragments, tool preferences
- A genetic algorithm mutates parameters
- Each mutation is benchmarked against coding tasks
- Only improvements survive
Run it with: npm run evolve
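The mutate-and-keep cycle can be sketched as follows. This is an illustrative TypeScript sketch: the genome is trimmed to two parameters, and `score` stands in for the real benchmark suite:

```typescript
// Illustrative genome loop: mutate one parameter, benchmark the mutant,
// and keep it only if its score improves.
type Genome = { temperature: number; maxTokens: number };

function mutate(g: Genome, rand: () => number): Genome {
  // nudge temperature within [0, 1]; a real mutator would touch more genes
  const t = Math.min(1, Math.max(0, g.temperature + (rand() - 0.5) * 0.2));
  return { ...g, temperature: t };
}

function evolve(
  start: Genome,
  score: (g: Genome) => number, // stand-in for benchmark runs
  rand: () => number,
  generations = 20,
): Genome {
  let best = start;
  let bestScore = score(best);
  for (let i = 0; i < generations; i++) {
    const candidate = mutate(best, rand);
    const s = score(candidate);
    if (s > bestScore) { // only improvements survive: the ratchet again
      best = candidate;
      bestScore = s;
    }
  }
  return best;
}
```

Because a mutant is only kept when its score strictly improves, the best genome's benchmark score is monotonically non-decreasing across generations.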
See the detailed planning docs:
- CODING-TODO.md — Implementation plan with task checklist
- EVOLUTION-TODO.md — Frontier ideas (hyperbolic embeddings, dream consolidation, curiosity-driven exploration)
- RESEARCH-TODO.md — What we know and what needs investigation
This is an experimental project. If you want to contribute:
- Fork the repo
- Make changes on a branch
- Ensure npm test and npm run typecheck pass
- Open a PR
MIT