Ralph Loop

Autonomous Coding Agent

Describe it. Ralph builds it.

From a single task description to tested, committed, production-ready code,
with human approval of the spec and the task list before any coding begins.

Overview

Ralph Loop takes a plain-English task description and autonomously builds the entire project: specification, task breakdown, code, tests, QA review, and git commits. You approve the spec and task list before any coding begins. Every change is reviewed by a separate QA agent. If QA fails, a healer agent attempts a fix, retrying up to 5 times before rolling back to the last known-good state.

The result: tested, committed code with clean git history, delivered in minutes.

At a Glance

What you provide

A task description in plain English
Your API key
Budget limit (optional)

What Ralph delivers

Application specification (spec.md)
Atomic task breakdown (prd.json)
Working code with tests
Clean git history (1 commit per task)
Cost and analytics dashboard

Proven Results

These are actual runs with real API calls — not benchmarks, not mocks.

Project	Description	Tasks	Tests Generated	Coverage	Cost	Time
Todo API	FastAPI + SQLite + CRUD + validation	10/10	47 pass	not reported	$2.48	20 min
URL Shortener	Cache + rate limiting + click tracking	6/6	35 pass	not reported	$2.81	20 min
Unit Converter	CLI + 3 unit types + registry pattern	12/12	66 pass	98%	$5.73	30 min
Existing Codebase	Add search to Todo API (zero regressions)	2/2	58 pass	not reported	$0.89	9 min

35 out of 35 real API tasks completed. 158 framework tests passing.
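As a quick sanity check on the table above, per-task cost and time can be derived from its rows. This is a minimal sketch that uses only the four listed rows, so its totals cover the table only.

```python
# Rows copied from the results table: (project, tasks, cost in USD, minutes)
runs = [
    ("Todo API", 10, 2.48, 20),
    ("URL Shortener", 6, 2.81, 20),
    ("Unit Converter", 12, 5.73, 30),
    ("Existing Codebase", 2, 0.89, 9),
]

total_tasks = sum(tasks for _, tasks, _, _ in runs)
total_cost = round(sum(cost for _, _, cost, _ in runs), 2)
total_minutes = sum(minutes for _, _, _, minutes in runs)

for name, tasks, cost, minutes in runs:
    # Average cost and wall-clock time per completed task in this run
    print(f"{name:<18} ${cost / tasks:.2f} per task, {minutes / tasks:.1f} min per task")

print(f"Listed rows: {total_tasks} tasks, ${total_cost:.2f}, {total_minutes} min")
```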

How It Works

┌─────────────────────────────────────────────────────────────────┐
│                                                                   │
│   You: "Build a REST API with FastAPI for managing todo items"    │
│                                                                   │
│         │                                                         │
│         ▼                                                         │
│   ┌─────────────┐                                                │
│   │ SPEC GEN    │  LLM writes spec.md                            │
│   └──────┬──────┘  (architecture, models, API, tests)            │
│          │                                                        │
│          ▼                                                        │
│   ┌─────────────┐                                                │
│   │ YOU REVIEW  │  Full-screen markdown viewer                   │
│   │ & APPROVE   │  Edit, download, or reject                     │
│   └──────┬──────┘                                                │
│          │                                                        │
│          ▼                                                        │
│   ┌─────────────┐                                                │
│   │ TASK SPLIT  │  spec.md → atomic tasks (prd.json)             │
│   └──────┬──────┘  Each with acceptance criteria                 │
│          │                                                        │
│          ▼                                                        │
│   ┌─────────────┐  For each task:                                │
│   │ CODE LOOP   │  Code → Test → QA Review → Heal → Commit      │
│   │             │  Fresh context per iteration                    │
│   └──────┬──────┘  Separate QA sentinel per task                 │
│          │                                                        │
│          ▼                                                        │
│   ┌─────────────┐                                                │
│   │ DELIVERED   │  All tests pass. Clean git. Analytics.         │
│   └─────────────┘                                                │
│                                                                   │
└─────────────────────────────────────────────────────────────────┘
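The per-task part of the diagram (Code → Test → QA Review → Heal → Commit) can be sketched as a small control loop. This is an illustration only, not Ralph's actual orchestrator: `run_task`, `TaskResult`, and the callables passed in are hypothetical names, and the retry cap of 5 is taken from the Healer Loop description under Key Features.

```python
from dataclasses import dataclass
from typing import Callable

MAX_HEAL_ATTEMPTS = 5  # "up to 5 times" per the Key Features section


@dataclass
class TaskResult:
    task_id: str
    committed: bool
    heal_attempts: int


def run_task(
    task_id: str,
    write_code: Callable[[str], None],
    qa_passes: Callable[[str], bool],
    heal: Callable[[str], None],
    commit: Callable[[str], None],
    rollback: Callable[[str], None],
) -> TaskResult:
    """Run one task through code, QA, heal, and commit (hypothetical helper)."""
    write_code(task_id)
    attempts = 0
    while not qa_passes(task_id):
        if attempts == MAX_HEAL_ATTEMPTS:
            # Out of retries: restore the last known-good state and give up
            rollback(task_id)
            return TaskResult(task_id, committed=False, heal_attempts=attempts)
        heal(task_id)
        attempts += 1
    commit(task_id)  # one commit per task
    return TaskResult(task_id, committed=True, heal_attempts=attempts)
```

Passing the stages in as callables keeps the sketch independent of any provider; in a real run each one would start a fresh LLM session.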

Setup

Prerequisites

Requirement	Notes
Python 3.12+	Runtime
Claude Code CLI	Install with `npm install -g @anthropic-ai/claude-code`
Anthropic API key	Or an Azure Foundry endpoint, or an OpenAI key
Node.js 18+	Required by npm for the Claude Code CLI install, and for modifying the web dashboard

Install

git clone https://github.com/fnusatvik07/autonomous-coding-ralph-loop.git
cd autonomous-coding-ralph-loop

# With uv (recommended)
uv pip install -e ".[web]"

# Or with pip
pip install -e ".[web]"

Drop [web] if you only want the CLI without the dashboard.

Configure

cp .env.example .env

Then set your API key in .env:

Option A — Anthropic API (simplest)

ANTHROPIC_API_KEY=sk-ant-your-key-here

Option B — Azure Foundry

CLAUDE_CODE_USE_FOUNDRY=1
ANTHROPIC_FOUNDRY_API_KEY=your-foundry-key
ANTHROPIC_FOUNDRY_BASE_URL=https://your-endpoint.azure.com/anthropic/
ANTHROPIC_DEFAULT_SONNET_MODEL=claude-opus-4-6

Option C — OpenAI (via Deep Agents)

OPENAI_API_KEY=sk-proj-your-key-here
RALPH_PROVIDER=deep-agents
RALPH_MODEL=openai:gpt-4o
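To check which of the three options a given .env selects, a small helper like the one below can be useful. It is a sketch under assumptions: `detect_provider` is a hypothetical function, and the precedence order (Option C, then B, then A) is a guess, not Ralph's actual resolution logic. Only the variable names come from the options above.

```python
import os
from typing import Mapping


def detect_provider(env: Mapping[str, str] = os.environ) -> str:
    """Guess which documented option an environment selects.

    Hypothetical helper: the precedence below is an assumption.
    """
    if env.get("RALPH_PROVIDER") == "deep-agents" and env.get("OPENAI_API_KEY"):
        return "openai-via-deep-agents"  # Option C
    if env.get("CLAUDE_CODE_USE_FOUNDRY") == "1":
        missing = [
            name
            for name in ("ANTHROPIC_FOUNDRY_API_KEY", "ANTHROPIC_FOUNDRY_BASE_URL")
            if not env.get(name)
        ]
        if missing:
            raise ValueError(f"Foundry enabled but missing: {', '.join(missing)}")
        return "azure-foundry"  # Option B
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"  # Option A
    raise ValueError("No API key found; set one of the options in .env")
```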

Verify

ralph --version
ralph --help

Usage

CLI

ralph run "Build a REST API with FastAPI for a todo app"

ralph run "Build a CLI tool" -m claude-opus-4-20250514     # specific model
ralph run "Build something" --budget 10.00                  # budget cap
ralph run "Add auth" -w ./my-project                        # existing project

ralph resume -w ./my-project                                # continue previous run
ralph status -w ./my-project                                # check progress
ralph analytics -w ./my-project                             # cost breakdown
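When driving Ralph from a script, the commands above can be assembled as an argument list. The sketch below uses only the flags shown in this section (`-m`, `--budget`, `-w`); `build_run_command` is a hypothetical helper, not part of the package.

```python
import shlex
from typing import Optional


def build_run_command(
    task: str,
    model: Optional[str] = None,
    budget: Optional[float] = None,
    workspace: Optional[str] = None,
) -> list[str]:
    """Build an argv list for `ralph run` from the documented flags."""
    argv = ["ralph", "run", task]
    if model is not None:
        argv += ["-m", model]
    if budget is not None:
        argv += ["--budget", f"{budget:.2f}"]
    if workspace is not None:
        argv += ["-w", workspace]
    return argv


cmd = build_run_command("Add auth", budget=10, workspace="./my-project")
print(shlex.join(cmd))  # shell-quoted form, safe to copy into a terminal
# To launch it for real: subprocess.run(cmd, check=True)
```

Passing a list rather than a single string avoids shell quoting problems when the task description contains spaces or quotes.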

Web Dashboard

ralph web                    # opens http://localhost:8420
ralph web -w ./my-project    # point at specific workspace
ralph web -p 9000            # custom port

The dashboard walks you through five stages: task input → spec review → task approval → live coding terminal → results browser.

Key Features

2-Step Spec Flow: Task → spec.md → human review → prd.json → human review → code. Nothing runs without approval.
QA Sentinel: A separate LLM session reviews every code change. Blocks on failing tests, security issues, or missing coverage.
Healer Loop: When QA fails, a debugging specialist iterates up to 5 times. Auto-rollback on final failure.
Multi-Model Routing: Haiku for scaffolding. Sonnet for features. Opus for architecture. 60% cost reduction.
Reflexion: LLM analyzes why it failed and stores the lesson. Future iterations read these before starting.
Git Checkpoints: Tags before each task. Rollback to last known-good state on failure. Clean squash on success.
Budget Control: Set a max spend with --budget. Warning at 80%. Hard stop when exceeded.
Full Observability: sessions.jsonl with cost/duration per session. Structured logging. Web analytics dashboard.
Safety: 15 regex patterns blocking dangerous shell commands. acceptEdits permission model. Env filtering.
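The Budget Control behavior (warning at 80%, hard stop when exceeded) can be illustrated with a small tracker. This is a sketch of the idea, not Ralph's implementation: `BudgetTracker` and `BudgetExceeded` are hypothetical names, and treating "exceeded" as strictly greater than the limit is an assumption.

```python
from dataclasses import dataclass


class BudgetExceeded(RuntimeError):
    """Raised when cumulative spend passes the configured cap."""


@dataclass
class BudgetTracker:
    limit_usd: float
    warn_ratio: float = 0.80  # "Warning at 80%"
    spent_usd: float = 0.0
    warned: bool = False

    def record(self, cost_usd: float) -> None:
        """Add one session's cost, warn once at the threshold, stop past the cap."""
        self.spent_usd += cost_usd
        if self.spent_usd > self.limit_usd:
            raise BudgetExceeded(
                f"spent ${self.spent_usd:.2f} of ${self.limit_usd:.2f}"
            )
        if not self.warned and self.spent_usd >= self.warn_ratio * self.limit_usd:
            self.warned = True
            print(f"warning: {self.spent_usd / self.limit_usd:.0%} of budget used")
```

A loop would call `record()` after every LLM session and let `BudgetExceeded` end the run.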

Project Structure

ralph/
├── cli.py                # CLI commands
├── config.py             # Configuration
├── loop.py               # Main orchestrator
├── models.py             # Data models
├── providers/
│   ├── claude_sdk.py     # Claude Agent SDK
│   └── deep_agents.py    # Deep Agents SDK (any LLM)
├── prompts/
│   └── templates.py      # All prompt templates
├── spec/
│   └── generator.py      # spec.md → prd.json
├── qa/
│   ├── sentinel.py       # Quality gate
│   └── healer.py         # Fix loop
├── routing.py            # Model routing by complexity
├── reflexion.py          # Failure analysis
├── checkpoint.py         # Git checkpoints
├── observability.py      # Logging + analytics
├── web/
│   ├── server.py         # FastAPI + WebSocket
│   ├── runner.py         # WebRalphLoop
│   └── events.py         # Event bus
├── memory/
│   ├── progress.py       # Iteration log
│   └── guardrails.py     # Failure memory
frontend/                 # React + TypeScript + Tailwind
tests/                    # 158 tests, 20 files
.claude/skills/           # /spec, /code, /qa, /status
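For orientation, routing.py is described as routing models by complexity, with Haiku for scaffolding, Sonnet for features, and Opus for architecture. The sketch below shows one way such a mapping could look. The keyword heuristic and all names (`Complexity`, `classify`, `pick_model`) are hypothetical, and model tiers are given by family name only because this README does not list the exact model IDs used for routing.

```python
from enum import Enum


class Complexity(Enum):
    SCAFFOLDING = "scaffolding"
    FEATURE = "feature"
    ARCHITECTURE = "architecture"


# Tier names follow the Key Features description.
MODEL_TIER = {
    Complexity.SCAFFOLDING: "haiku",
    Complexity.FEATURE: "sonnet",
    Complexity.ARCHITECTURE: "opus",
}

# Hypothetical keyword heuristic, for illustration only.
_KEYWORDS = {
    Complexity.ARCHITECTURE: ("architecture", "design", "schema", "refactor"),
    Complexity.SCAFFOLDING: ("scaffold", "setup", "boilerplate"),
}


def classify(task_title: str) -> Complexity:
    """Map a task title to a complexity level by keyword match."""
    text = task_title.lower()
    for level, words in _KEYWORDS.items():
        if any(word in text for word in words):
            return level
    return Complexity.FEATURE  # default tier when nothing matches


def pick_model(task_title: str) -> str:
    """Return the model family for a task title."""
    return MODEL_TIER[classify(task_title)]
```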

Workspace Output

When Ralph runs, it creates .ralph/ in the project directory:

File	Purpose
`spec.md`	Application specification (human-readable)
`prd.json`	Task queue with status tracking
`progress.md`	Iteration log with learnings
`guardrails.md`	Failure signs for future iterations
`reflections.md`	LLM failure analysis
`sessions.jsonl`	Per-session cost, duration, tools
`ralph.log`	Structured debug log
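Because sessions.jsonl is a JSON Lines file (one JSON object per line), it is easy to post-process outside the dashboard. The sketch below sums cost and duration; the field names `cost_usd` and `duration_s` are assumptions, so check a real file for the actual keys before relying on it.

```python
import json
from pathlib import Path


def summarize_sessions(path: Path) -> dict:
    """Sum cost and duration over a JSON Lines file.

    The keys `cost_usd` and `duration_s` are assumed, not confirmed.
    """
    total_cost = 0.0
    total_seconds = 0.0
    sessions = 0
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            total_cost += float(record.get("cost_usd", 0.0))
            total_seconds += float(record.get("duration_s", 0.0))
            sessions += 1
    return {
        "sessions": sessions,
        "cost_usd": round(total_cost, 2),
        "duration_s": round(total_seconds, 1),
    }
```

Typical use would be `summarize_sessions(Path(".ralph/sessions.jsonl"))` from the project directory.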

CLI Reference

Command	Description
`ralph run "task"`	Start the coding loop
`ralph run -f task.md`	Task from a file
`ralph resume`	Continue from existing PRD
`ralph status`	Show task progress
`ralph analytics`	Cost and session analytics
`ralph web`	Launch web dashboard
`ralph progress`	Iteration log
`ralph guardrails`	Failure memory
`ralph index`	Codebase index
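Commands such as `ralph resume` and `ralph status` work from the task queue in prd.json. The sketch below shows the general idea of picking the next unfinished task and reporting progress. The JSON layout (`tasks`, `id`, `title`, `status`) is a hypothetical schema made up for illustration; the real prd.json may be structured differently.

```python
import json
from typing import Optional

# Hypothetical prd.json content; the real schema may differ.
EXAMPLE_PRD = """
{
  "tasks": [
    {"id": "T1", "title": "Create project skeleton", "status": "done"},
    {"id": "T2", "title": "Add Todo model", "status": "pending"},
    {"id": "T3", "title": "Add CRUD endpoints", "status": "pending"}
  ]
}
"""


def next_pending_task(prd: dict) -> Optional[dict]:
    """Return the first task not yet marked done, or None when all are done."""
    for task in prd["tasks"]:
        if task["status"] != "done":
            return task
    return None


def progress(prd: dict) -> str:
    """Format completed/total, in the same style as the results table."""
    done = sum(1 for task in prd["tasks"] if task["status"] == "done")
    return f"{done}/{len(prd['tasks'])}"


prd = json.loads(EXAMPLE_PRD)
print(progress(prd), "next:", next_pending_task(prd)["id"])
```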

Tests

python -m pytest tests/ -v       # 158 tests across 20 files

License

MIT
