Autonomous project builder for Claude Code that doesn't forget what it's doing.
Built 61 features for a Rust API server without human intervention. No degradation. No loops. No "I've lost track of what we're building."
AI coding agents degrade over long sessions. Context windows fill with stale information, the model loses track of decisions, and you end up babysitting what should be autonomous work.
This happens because most agent setups treat context like a pile - just keep adding until it breaks.
Context Engine uses a four-layer memory architecture based on research from Google, Stanford, and Anthropic:
| Layer | Purpose | Lifecycle |
|---|---|---|
| Working Context | Current task only | Rebuilt each session |
| Episodic Memory | Recent decisions, patterns | Rolling window |
| Semantic Memory | Project knowledge, architecture | Persistent |
| Procedural Memory | What worked, what failed | Append-only |
Each session starts fresh with computed context instead of accumulated garbage.
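To make "computed context" concrete, here is an illustrative sketch: the function name, the keyword scoring, and the note layout are assumptions for illustration, not the harness's actual API. The idea is that each session's prompt is assembled from the persistent layers on demand, rather than carried forward as a growing transcript.

```python
from pathlib import Path

def compile_working_context(project: Path, feature: dict, k: int = 5) -> str:
    """Build a fresh working context for one session.

    Pulls only the memory notes that mention the current feature's
    keywords, rather than concatenating everything ever written.
    (Hypothetical: scans .agent/memory/*/*.md markdown notes.)
    """
    keywords = set(feature["description"].lower().split())
    scored = []
    for note in sorted(project.glob(".agent/memory/*/*.md")):
        text = note.read_text()
        score = sum(word in text.lower() for word in keywords)
        if score:
            scored.append((score, note.name, text))
    scored.sort(reverse=True)  # most relevant notes first
    sections = [f"## {name}\n{text}" for _, name, text in scored[:k]]
    return f"# Task: {feature['description']}\n\n" + "\n\n".join(sections)
```

The key property is that context size is bounded by `k` relevant notes per session, not by project age.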
PatchForge - An agentless Linux patch management system in Rust:
- 61 features
- Fully autonomous (no human intervention after init)
- 158+ tests passing
- Subagents verifying each feature
- Zero context degradation
The loop runs overnight. Each session: compile context → implement feature → run tests → subagent review → commit → exit. Repeat until done.
```bash
# Clone
git clone https://github.com/zeddy89/context-engine.git
cd context-engine

# Install (just copies scripts)
./install.sh

# Create a new project
python3 ~/tools/context-engine/orchestrator.py --new ~/projects/my-app --model opus

# Or run fully autonomous after init
~/tools/context-engine/loop-runner.py ~/projects/my-app --model opus
```

Requirements:

- Claude Code CLI installed
- Python 3.10+
- Git
```bash
python3 orchestrator.py --new ~/projects/my-app --model opus
```

This will:

- Create the project directory
- Initialize git
- Set up the context-engineered harness
- Open a shell for you to add MCPs (type `exit` when done)
- Run Session 1 to generate `feature_list.json`
- Start the autonomous implementation loop
During setup (or anytime), add MCPs via Claude Code CLI:
```bash
# Documentation lookup (recommended - get API key from ref.tools)
claude mcp add --transport http Ref https://api.ref.tools/mcp --header "x-ref-api-key: YOUR_KEY"

# Alternative docs (no key needed)
claude mcp add context7

# Browser automation for UI testing
claude mcp add playwright npx @playwright/mcp@latest

# Verify what's configured
claude mcp list
```

MCPs are registered with Claude Code directly - no config files needed.
After initialization:
```bash
./loop-runner.py ~/projects/my-app --model opus
```

This will:
- Pick the next incomplete feature
- Compile fresh context
- Run Claude Code session
- Use MCP tools for documentation lookup
- Run tests
- Invoke subagents for review
- Commit and repeat
Stop it anytime with Ctrl+C. Resume later - it picks up where it left off.
```bash
# Continue an existing project
python3 orchestrator.py --project ~/projects/my-app --model opus

# Check project status
python3 orchestrator.py --status --project ~/projects/my-app
```

After initialization, your project gets:
```
my-app/
├── .agent/
│   ├── AGENT_RULES.md       # Memory model instructions
│   ├── working-context/     # Rebuilt each session
│   ├── memory/              # Persistent knowledge
│   │   ├── strategies/      # What worked, what failed
│   │   ├── entities/        # Discovered models, services
│   │   └── constraints/     # Project constraints
│   ├── artifacts/           # Large outputs by reference
│   ├── hooks/               # Context compilation scripts
│   └── workflows/           # Init, implement, debug workflows
├── .claude/
│   └── agents/              # Subagents (code-reviewer, test-runner)
├── feature_list.json        # Atomic features with status
└── CLAUDE.md                # Instructions for Claude Code
```
```
┌───────────────────────────────────────────────────────┐
│ Session Start                                         │
├───────────────────────────────────────────────────────┤
│ 1. Compile fresh working context                      │
│    └─ Pull relevant memory, not everything            │
│ 2. Check failure log                                  │
│    └─ Don't repeat past mistakes                      │
│ 3. Look up docs via MCP (Ref, Context7)               │
│ 4. Implement single feature                           │
│ 5. Run tests (mandatory)                              │
│ 6. Subagent review (@code-reviewer, @test-runner)     │
│ 7. Update feature_list.json                           │
│ 8. Commit with "session: completed {feature_id}"      │
│ 9. Exit cleanly                                       │
└───────────────────────────────────────────────────────┘
                            │
                            ▼
              Loop runner starts next session
```
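Stripped to its essentials, the outer loop looks something like this sketch. It is simplified: the real loop-runner.py also counts failures, syncs with git, and handles interrupts, and `run_session` here is a stand-in for launching a fresh Claude Code CLI process per feature.

```python
import json
from pathlib import Path
from typing import Callable

def run_loop(project: Path, run_session: Callable[[dict], None]) -> list[str]:
    """Drive one fresh session per feature until every feature passes.

    `run_session` is expected to launch a new Claude Code process for the
    given feature and update feature_list.json before it returns. Because
    each iteration rereads the file, every session starts from recomputed
    state, not accumulated conversation.
    """
    completed: list[str] = []
    while True:
        data = json.loads((project / "feature_list.json").read_text())
        pending = [f for f in data["features"] if not f.get("passes")]
        if not pending:
            return completed  # all features done
        run_session(pending[0])
        completed.append(pending[0]["id"])
```

(The real runner also caps retries so a stuck feature cannot loop forever.)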
Features are defined in feature_list.json:
```json
{
  "features": [
    {
      "id": "F001",
      "description": "Project scaffold with Cargo.toml",
      "priority": 1,
      "passes": true
    },
    {
      "id": "F002",
      "description": "Database connection pool",
      "priority": 2,
      "dependencies": ["F001"],
      "passes": false
    }
  ]
}
```

The loop runner:
- Respects dependencies
- Skips completed features
- Marks blocked features after repeated failures
- Syncs with git history (recovers from missed updates)
```bash
--model sonnet   # Faster, good for most features (default)
--model opus     # Better for complex architecture decisions
```

| Flag | Description |
|---|---|
| `--new PATH` | Create new project at path |
| `--project PATH` | Continue existing project |
| `--model MODEL` | `sonnet` or `opus` |
| `--mcp-preset NAME` | Suggest MCPs for preset (rust/python/node/web) |
| `--debug` | Show debug output |
| `--status` | Show project status |
The harness includes specialized subagents that Claude Code invokes during implementation:

| Agent | Purpose |
|---|---|
| `@code-reviewer` | Reviews changes for issues |
| `@test-runner` | Runs tests and analyzes failures |
| `@feature-verifier` | End-to-end feature verification |
| `@debugger` | Analyzes errors and suggests fixes |
The harness automatically detects feature complexity and adjusts subagent usage:

- **HIGH** (security, crypto, auth, ssh, credentials, permissions): all three subagents (code-reviewer, test-runner, feature-verifier)
- **MEDIUM** (api, endpoint, database, repository, patch, service, handler): test-runner only
- **LOW** (refactor, rename, cleanup, docs, simple changes): runs tests directly, no subagents

This cuts simple features from ~10 minutes to ~4-5 minutes while keeping thorough review for security-sensitive code. The complexity shows in the output, e.g. `🔧 Implementing: patch-002 [MEDIUM] - description...`
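A naive keyword classifier along these lines could implement that routing. The keyword sets come from the text above, but the function itself and its exact-word matching are illustrative, not the harness's actual code.

```python
HIGH_KEYWORDS = {"security", "crypto", "auth", "ssh", "credentials", "permissions"}
MEDIUM_KEYWORDS = {"api", "endpoint", "database", "repository", "patch", "service", "handler"}

def classify_complexity(description: str) -> str:
    """Map a feature description to HIGH / MEDIUM / LOW, which decides
    how many review subagents get invoked for the feature."""
    words = set(description.lower().replace("-", " ").split())
    if words & HIGH_KEYWORDS:
        return "HIGH"    # code-reviewer + test-runner + feature-verifier
    if words & MEDIUM_KEYWORDS:
        return "MEDIUM"  # test-runner only
    return "LOW"         # run tests directly, no subagents
```

Exact-word matching is deliberately conservative here; a real implementation might stem words so "credential" also triggers HIGH.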
MCPs give Claude access to documentation, databases, and tools during sessions.
```bash
# Documentation (pick one or both)
claude mcp add --transport http Ref https://api.ref.tools/mcp --header "x-ref-api-key: YOUR_KEY"
claude mcp add context7

# UI testing
claude mcp add playwright npx @playwright/mcp@latest

# Database access
claude mcp add postgres   # if using PostgreSQL
```

Without MCP, Claude guesses at APIs based on training data (potentially outdated). With the Ref MCP, Claude looks up current documentation before writing code: fewer hallucinated APIs, fewer test failures.
The init session didn't complete properly. Run:

```bash
cd ~/projects/my-app
claude --model opus -p "Read .agent/workflows/init.md and create feature_list.json"
```

The harness auto-syncs with git history. If a commit says "session: completed F003" but feature_list.json shows `passes: false`, the next loop iteration fixes it.
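That git-history sync can be pictured like this sketch, which parses the `session: completed {feature_id}` commit convention. The function name is hypothetical, and `log` stands in for output like `git log --pretty=%s`.

```python
import re

def sync_with_git_log(features: list[dict], log: str) -> list[dict]:
    """Heal feature_list.json entries that missed their update.

    If a commit subject already recorded "session: completed <id>" but
    the feature still shows passes: false, mark it passed. Recovers from
    sessions that committed but crashed before updating the JSON.
    """
    completed = set(re.findall(r"session: completed (\S+)", log))
    for f in features:
        if f["id"] in completed and not f.get("passes"):
            f["passes"] = True
    return features
```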
MCPs must be added via `claude mcp add`, not a config file:

```bash
cd ~/projects/my-app
claude mcp add context7
claude mcp list   # verify it's there
```

The harness enforces tests, but Claude might skip them. Check:

- The test framework is set up (`cargo test` / `pytest` / `npm test` works manually)
- CLAUDE.md mentions test requirements
- Run with `--debug` to see what's happening
See docs/ARCHITECTURE.md for the full explanation of:
- Four-layer memory model
- Context compilation algorithm
- Artifact reference system
- Feedback capture loop
Issues and PRs welcome. This is actively developed.
Key areas:
- More MCP integrations
- Better test enforcement
- Support for other AI coding tools (Cursor, Aider, etc.)
- Parallel feature implementation
MIT - 2025 Sheldon Lewis
- Research foundation from MemGPT (UC Berkeley), Stanford's Generative Agents, and Anthropic's context management research
- Nate's Substack for the "computed context" mental model
- Built for use with Claude Code
