Describe it. Ralph builds it.
From a single task description to tested, committed, production-ready code,
with human approval of the spec and the task list before any code is written.
Ralph Loop takes a plain-English task description and autonomously builds the entire project — specification, task breakdown, code, tests, QA review, and git commits. You approve the spec and task list before any coding begins. Every change is reviewed by a separate QA agent. If something fails, a healer agent fixes it automatically.
The result: tested, committed code with clean git history, delivered in minutes.
| What you provide | What Ralph delivers |
|---|---|
These are actual runs with real API calls — not benchmarks, not mocks.
| Project | Tasks | Tests Generated | Coverage | Cost | Time |
|---|---|---|---|---|---|
| Todo API: FastAPI + SQLite + CRUD + validation | 10/10 | 47 pass | - | $2.48 | 20 min |
| URL Shortener: cache + rate limiting + click tracking | 6/6 | 35 pass | - | $2.81 | 20 min |
| Unit Converter: CLI + 3 unit types + registry pattern | 12/12 | 66 pass | 98% | $5.73 | 30 min |
| Existing Codebase: add search to Todo API (zero regressions) | 2/2 | 58 pass | - | $0.89 | 9 min |
35 out of 35 real API tasks completed. 158 framework tests passing.
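
To compare the four runs listed in the table, it can help to normalize cost by task count. The snippet below is only a small arithmetic sketch over the figures copied from the table; the names `RUNS`, `cost_per_task`, and `totals` are made up for this illustration and are not part of Ralph.

```python
# Figures copied from the results table: (project, tasks, cost in USD, minutes).
RUNS = [
    ("Todo API", 10, 2.48, 20),
    ("URL Shortener", 6, 2.81, 20),
    ("Unit Converter", 12, 5.73, 30),
    ("Existing Codebase", 2, 0.89, 9),
]


def cost_per_task(cost_usd: float, tasks: int) -> float:
    """Average spend per completed task, rounded to cents."""
    return round(cost_usd / tasks, 2)


def totals(runs) -> tuple:
    """Total cost (USD) and total wall-clock minutes across the listed runs."""
    total_cost = round(sum(cost for _name, _tasks, cost, _minutes in runs), 2)
    total_minutes = sum(minutes for _name, _tasks, _cost, minutes in runs)
    return total_cost, total_minutes


if __name__ == "__main__":
    for name, tasks, cost, _minutes in RUNS:
        print(f"{name}: ${cost_per_task(cost, tasks):.2f} per task")
    print(totals(RUNS))
```

Per-task cost is a rough yardstick only: tasks differ in size, so a cheaper average does not by itself mean a more efficient run.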

    ┌─────────────────────────────────────────────────────────────────┐
    │                                                                 │
    │  You: "Build a REST API with FastAPI for managing todo items"   │
    │                                                                 │
    │         │                                                       │
    │         ▼                                                       │
    │  ┌─────────────┐                                                │
    │  │ SPEC GEN    │  LLM writes spec.md                            │
    │  └──────┬──────┘  (architecture, models, API, tests)            │
    │         │                                                       │
    │         ▼                                                       │
    │  ┌─────────────┐                                                │
    │  │ YOU REVIEW  │  Full-screen markdown viewer                   │
    │  │ & APPROVE   │  Edit, download, or reject                     │
    │  └──────┬──────┘                                                │
    │         │                                                       │
    │         ▼                                                       │
    │  ┌─────────────┐                                                │
    │  │ TASK SPLIT  │  spec.md → atomic tasks (prd.json)             │
    │  └──────┬──────┘  Each with acceptance criteria                 │
    │         │                                                       │
    │         ▼                                                       │
    │  ┌─────────────┐  For each task:                                │
    │  │ CODE LOOP   │  Code → Test → QA Review → Heal → Commit       │
    │  │             │  Fresh context per iteration                   │
    │  └──────┬──────┘  Separate QA sentinel per task                 │
    │         │                                                       │
    │         ▼                                                       │
    │  ┌─────────────┐                                                │
    │  │ DELIVERED   │  All tests pass. Clean git. Analytics.         │
    │  └─────────────┘                                                │
    │                                                                 │
    └─────────────────────────────────────────────────────────────────┘

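
The per-task cycle in the diagram (Code → Test → QA Review → Heal → Commit) can be sketched roughly as follows. This is not Ralph's actual `loop.py`: the `Task` fields, the injected step callables, and the `max_heals` retry limit are all assumptions made for the illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Task:
    """One atomic task, loosely modeled on a prd.json entry (field names assumed)."""
    id: str
    description: str
    status: str = "pending"  # becomes "done" or "failed"


def run_task(
    task: Task,
    code: Callable[[Task], None],
    test: Callable[[Task], bool],
    qa_review: Callable[[Task], bool],
    heal: Callable[[Task], None],
    commit: Callable[[Task], None],
    max_heals: int = 2,  # hypothetical retry limit, not a documented setting
) -> bool:
    """Code once, then test + QA; heal and retry on failure; commit on success."""
    code(task)
    for attempt in range(max_heals + 1):
        if test(task) and qa_review(task):
            commit(task)
            task.status = "done"
            return True
        if attempt < max_heals:
            heal(task)  # give the healer a chance before the next check
    task.status = "failed"
    return False


def run_loop(tasks: Iterable[Task], **steps) -> List[Task]:
    """Run every pending task in order and return the ones that failed."""
    failed = []
    for task in tasks:
        if task.status == "pending" and not run_task(task, **steps):
            failed.append(task)
    return failed
```

Passing the steps in as callables keeps the control flow testable without any LLM calls; in a real run each callable would wrap an agent session.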
| Requirement | Notes |
|---|---|
| Python 3.12+ | Runtime |
| Claude Code CLI | `npm install -g @anthropic-ai/claude-code` |
| Anthropic API key | Or Azure Foundry endpoint, or OpenAI key |
| Node.js 18+ | Only if modifying the web dashboard |

    git clone https://github.com/fnusatvik07/autonomous-coding-ralph-loop.git
    cd autonomous-coding-ralph-loop

    # With uv (recommended)
    uv pip install -e ".[web]"

    # Or with pip
    pip install -e ".[web]"

Drop `[web]` if you only want the CLI without the dashboard.

    cp .env.example .env

Then set your API key in `.env`:

Option A: Anthropic API (simplest)

    ANTHROPIC_API_KEY=sk-ant-your-key-here

Option B: Azure Foundry

    CLAUDE_CODE_USE_FOUNDRY=1
    ANTHROPIC_FOUNDRY_API_KEY=your-foundry-key
    ANTHROPIC_FOUNDRY_BASE_URL=https://your-endpoint.azure.com/anthropic/
    ANTHROPIC_DEFAULT_SONNET_MODEL=claude-opus-4-6

Option C: OpenAI (via Deep Agents)

    OPENAI_API_KEY=sk-proj-your-key-here
    RALPH_PROVIDER=deep-agents
    RALPH_MODEL=openai:gpt-4o

Check the installation:

    ralph --version
    ralph --help

Run from the command line:

    ralph run "Build a REST API with FastAPI for a todo app"
    ralph run "Build a CLI tool" -m claude-opus-4-20250514   # specific model
    ralph run "Build something" --budget 10.00               # budget cap
    ralph run "Add auth" -w ./my-project                     # existing project
    ralph resume -w ./my-project                             # continue previous run
    ralph status -w ./my-project                             # check progress
    ralph analytics -w ./my-project                          # cost breakdown

Or launch the web dashboard:

    ralph web                    # opens http://localhost:8420
    ralph web -w ./my-project    # point at specific workspace
    ralph web -p 9000            # custom port

The dashboard walks you through: task input → spec review → task approval → live coding terminal → results browser.
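
As a rough illustration of how the three `.env` options (A, B, C) could map to a backend, here is a minimal sketch. The function `detect_provider`, its return labels, the precedence order, and the idea that Option C requires `OPENAI_API_KEY` are all assumptions for this example; Ralph's real `config.py` may resolve things differently.

```python
import os
from typing import Mapping, Optional


def detect_provider(env: Optional[Mapping[str, str]] = None) -> str:
    """Guess which backend an environment selects (precedence is assumed)."""
    env = os.environ if env is None else env

    # Option C: an explicit RALPH_PROVIDER wins in this sketch.
    if env.get("RALPH_PROVIDER") == "deep-agents":
        if not env.get("OPENAI_API_KEY"):
            raise ValueError("Option C needs OPENAI_API_KEY")
        return "deep-agents"

    # Option B: Azure Foundry needs both a key and a base URL.
    if env.get("CLAUDE_CODE_USE_FOUNDRY") == "1":
        required = ("ANTHROPIC_FOUNDRY_API_KEY", "ANTHROPIC_FOUNDRY_BASE_URL")
        missing = [name for name in required if not env.get(name)]
        if missing:
            raise ValueError("Option B needs " + ", ".join(missing))
        return "azure-foundry"

    # Option A: plain Anthropic API key.
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"

    raise ValueError("No API credentials found in the environment")


if __name__ == "__main__":
    print(detect_provider({"ANTHROPIC_API_KEY": "sk-ant-your-key-here"}))
```

Failing fast on a half-filled configuration (for example, the Foundry flag without a base URL) tends to be friendlier than an authentication error halfway through a run.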
| 2-Step Spec Flow | QA Sentinel | Healer Loop |
|---|---|---|
| Multi-Model Routing | Reflexion | Git Checkpoints |
| Budget Control | Full Observability | Safety |
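
Budget Control, exposed on the command line as `--budget`, can be pictured as a running total checked after each session. The sketch below is one simple way to do that and is not taken from Ralph's source; `BudgetTracker` and `BudgetExceeded` are hypothetical names.

```python
class BudgetExceeded(RuntimeError):
    """Raised when recorded spend passes the configured cap."""


class BudgetTracker:
    """Accumulates per-session cost and stops the run once the cap is passed."""

    def __init__(self, cap_usd: float) -> None:
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def record(self, cost_usd: float) -> None:
        """Add one session's cost; raise if the total now exceeds the cap."""
        self.spent_usd += cost_usd
        if self.spent_usd > self.cap_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.2f} of ${self.cap_usd:.2f} budget"
            )

    @property
    def remaining_usd(self) -> float:
        """Budget left, never negative."""
        return max(self.cap_usd - self.spent_usd, 0.0)
```

Note that a check made after each session can only stop the next one: the session that crosses the cap has already been paid for.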

    ralph/
    ├── cli.py              # CLI commands
    ├── config.py           # Configuration
    ├── loop.py             # Main orchestrator
    ├── models.py           # Data models
    ├── providers/
    │   ├── claude_sdk.py   # Claude Agent SDK
    │   └── deep_agents.py  # Deep Agents SDK (any LLM)
    ├── prompts/
    │   └── templates.py    # All prompt templates
    ├── spec/
    │   └── generator.py    # spec.md → prd.json
    ├── qa/
    │   ├── sentinel.py     # Quality gate
    │   └── healer.py       # Fix loop
    ├── routing.py          # Model routing by complexity
    ├── reflexion.py        # Failure analysis
    ├── checkpoint.py       # Git checkpoints
    ├── observability.py    # Logging + analytics
    ├── web/
    │   ├── server.py       # FastAPI + WebSocket
    │   ├── runner.py       # WebRalphLoop
    │   └── events.py       # Event bus
    └── memory/
        ├── progress.py     # Iteration log
        └── guardrails.py   # Failure memory
    frontend/               # React + TypeScript + Tailwind
    tests/                  # 158 tests, 20 files
    .claude/skills/         # /spec, /code, /qa, /status

When Ralph runs, it creates .ralph/ in the project directory:
| File | Purpose |
|---|---|
| `spec.md` | Application specification (human-readable) |
| `prd.json` | Task queue with status tracking |
| `progress.md` | Iteration log with learnings |
| `guardrails.md` | Failure signs for future iterations |
| `reflections.md` | LLM failure analysis |
| `sessions.jsonl` | Per-session cost, duration, tools |
| `ralph.log` | Structured debug log |
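
Because `sessions.jsonl` is a JSON Lines file (one JSON object per line), it is easy to post-process outside of `ralph analytics`. The sketch below sums cost and duration; the keys `cost_usd` and `duration_s` are assumed for illustration, so check a real log for the actual field names first.

```python
import json
from typing import Iterable


def summarize_sessions(lines: Iterable[str]) -> dict:
    """Sum cost and duration over JSON Lines records.

    The "cost_usd" and "duration_s" keys are assumptions for this sketch,
    not a documented schema.
    """
    summary = {"sessions": 0, "cost_usd": 0.0, "duration_s": 0.0}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        record = json.loads(line)
        summary["sessions"] += 1
        summary["cost_usd"] += float(record.get("cost_usd", 0.0))
        summary["duration_s"] += float(record.get("duration_s", 0.0))
    return summary
```

Since a file object iterates line by line, usage could look like `with open(".ralph/sessions.jsonl", encoding="utf-8") as f: print(summarize_sessions(f))`.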
| Command | Description |
|---|---|
| `ralph run "task"` | Start the coding loop |
| `ralph run -f task.md` | Task from a file |
| `ralph resume` | Continue from existing PRD |
| `ralph status` | Show task progress |
| `ralph analytics` | Cost and session analytics |
| `ralph web` | Launch web dashboard |
| `ralph progress` | Iteration log |
| `ralph guardrails` | Failure memory |
| `ralph index` | Codebase index |
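
For readers curious how a command surface like this fits together, here is a minimal `argparse` sketch covering a few of the commands and only the flags shown in this README (`-f`, `-m`, `-w`, `--budget`). It is an illustration, not Ralph's actual `cli.py`, which may use a different CLI library and different defaults; the `dest` names and the default workspace `"."` are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring part of the command table (illustrative only)."""
    parser = argparse.ArgumentParser(prog="ralph")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="Start the coding loop")
    run.add_argument("task", nargs="?", help="Plain-English task description")
    run.add_argument("-f", dest="file", help="Read the task from a file")
    run.add_argument("-m", dest="model", help="Model to use")
    run.add_argument("-w", dest="workspace", default=".", help="Project directory")
    run.add_argument("--budget", type=float, help="Budget cap in USD")

    # Commands that only need a workspace in this sketch.
    for name, help_text in [
        ("resume", "Continue from existing PRD"),
        ("status", "Show task progress"),
        ("analytics", "Cost and session analytics"),
    ]:
        cmd = sub.add_parser(name, help=help_text)
        cmd.add_argument("-w", dest="workspace", default=".", help="Project directory")

    return parser


if __name__ == "__main__":
    args = build_parser().parse_args(["run", "Build a CLI tool", "--budget", "10.00"])
    print(args.command, args.task, args.budget)
```

Giving each subcommand its own parser keeps flags such as `--budget` scoped to `run`, so `ralph status --budget 5` is rejected rather than silently ignored.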

    python -m pytest tests/ -v   # 158 tests across 20 files

License: MIT