Autonomous agent loop harness for AI coding agents.
Ralph manages features through a structured lifecycle: specify, plan, decompose
into a DAG of tasks, then iteratively invoke an agent to complete them one at a
time. Task state lives in a SQLite database, enabling dependency tracking,
automatic unblocking, and hierarchical task relationships. Ralph works with any
ACP-compliant agent binary -- the default is Claude Code,
but you can swap in any agent via the --agent flag, the RALPH_AGENT
environment variable, or the [agent] section in .ralph.toml.
Warning
Ralph can (and possibly WILL) destroy anything you have access to, according
to the whims of the LLM. Use ralph run --limit=1 to test before unleashing
unattended loops.
Install the latest release with the shell installer (macOS/Linux):
curl --proto '=https' --tlsv1.2 -LsSf https://github.com/Studio-Sasquatch/ralph/releases/latest/download/ralph-installer.sh | sh
Pre-built binaries for all platforms are also available on the GitHub releases page.
To install from source:
cargo install --path .
Or build manually:
cargo build --release
./target/release/ralph --help
Ralph organizes work around features that progress through a defined
lifecycle: draft -> planned -> ready -> running -> done/failed.
# 1. Initialize a Ralph project
ralph init
# 2. Create a feature (interactive spec → plan → task DAG)
ralph feature create auth
# 3. Run the agent loop
ralph run auth
The feature create command runs the entire lifecycle in one shot: a spec
interview (you collaborate with the agent to refine a specification), an
automated review pass, a plan interview (you refine the implementation plan),
another review pass, and finally DAG decomposition into concrete tasks. Each
phase skips if its output file already exists on disk, so you can resume an
interrupted feature create without losing progress.
For quick one-off work, create standalone tasks instead:
ralph task add "Fix the login bug" # Non-interactive, scriptable
ralph task create # Interactive, Claude-assisted
ralph task list # See what you have
ralph run t-abc123 # Run a specific task by ID
Ralph now uses a ratatui interface by default when running in a TTY.
- ralph run <target> opens a live run cockpit (iteration/model/task state, tool activity, stream output)
- Interactive authoring flows (ralph feature create, ralph task create) use in-app multiline modals
- Non-JSON browse commands (feature list, task list/show/tree, task deps list) open explorer views
- Destructive task actions (task delete/done/fail/reset) request confirmation in UI mode; pass --yes to bypass
Use --no-ui to force plain text output, or set RALPH_UI=0.
In non-interactive contexts (CI, pipes, redirected stdout/stderr), Ralph auto-falls back to plain output.
ralph auth always delegates to claude auth login and runs in plain terminal mode.
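The UI selection rules above can be sketched as a small decision function. This is only an illustration: the function name use_tui is hypothetical, and two behaviours are assumptions rather than documented facts, namely that --no-ui takes precedence over RALPH_UI and that RALPH_UI=1/on still falls back to plain output when no TTY is attached.

```python
from typing import Optional


def use_tui(no_ui_flag: bool, ralph_ui: Optional[str], is_tty: bool) -> bool:
    """Decide whether to start the ratatui interface or fall back to plain output."""
    if no_ui_flag:
        # --no-ui forces plain text output.
        return False
    # RALPH_UI accepts auto (default), 1/on, 0/off; unset is treated as auto.
    value = (ralph_ui or "auto").strip().lower()
    if value in ("0", "off"):
        return False
    # Assumption: "1"/"on" and "auto" both still require an interactive terminal,
    # since non-interactive contexts fall back to plain output.
    return is_tty
```

A caller would typically pass os.environ.get("RALPH_UI") and sys.stdout.isatty() and sys.stderr.isatty() as the last two arguments.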
flowchart TD
subgraph "Feature Lifecycle"
create[ralph feature create] --> dag[(SQLite DAG)]
end
subgraph "Run Loop"
run[ralph run] --> ready{Ready tasks?}
ready -- yes --> claim[Claim task]
claim --> model[Model strategy selects model]
model --> agent[Spawn agent]
agent --> sigils{Parse sigils}
sigils -- task-done --> verify{Verify?}
sigils -- task-failed --> retry{Retries left?}
verify -- pass --> done[Complete task]
verify -- fail --> retry
retry -- yes --> claim
retry -- no --> fail[Fail task]
done --> auto[Auto-transitions]
fail --> auto
auto -- unblock dependents --> resolved{All resolved?}
auto -- auto-complete parent --> resolved
auto -- auto-fail parent --> resolved
resolved -- no --> ready
resolved -- yes --> exit0[Exit 0]
ready -- "no, but incomplete" --> exit2[Exit 2 Blocked]
end
dag --> run
- ralph feature create <name> runs an interactive session to craft a specification, then creates an implementation plan, then decomposes the plan into a DAG of tasks stored in .ralph/progress.db -- all in one command
- ralph run <target> picks the next ready task (by priority, then creation time), claims it, and invokes the agent
- The agent works on the assigned task and signals the result:
  - <task-done>{task_id}</task-done> -- task completed, triggers auto-unblocking of dependents
  - <task-failed>{task_id}</task-failed> -- task failed, retried up to --max-retries times
- After each task completion, a verification agent (a read-only agent session) checks the work. On failure, the task is retried.
- Ralph checks if all tasks are resolved; if not and the limit has not been reached, it picks the next ready task and loops
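To make the result signalling above concrete, here is a minimal sketch that pulls <task-done> and <task-failed> sigils out of agent output with a regular expression. The function name parse_task_sigils and the exact pattern are assumptions for illustration; Ralph's own parser may differ.

```python
import re

# Hypothetical pattern: matches <task-done>ID</task-done> or
# <task-failed>ID</task-failed>. The backreference \1 requires the
# closing tag to match the opening tag.
SIGIL_RE = re.compile(r"<(task-done|task-failed)>\s*([^<\s]+)\s*</\1>")


def parse_task_sigils(output: str) -> list[tuple[str, str]]:
    """Return (sigil, task_id) pairs in the order they appear in the output."""
    return [(m.group(1), m.group(2)) for m in SIGIL_RE.finditer(output)]
```

Everything outside the sigils is ignored, which matches the idea that only the tags carry the result and the surrounding prose is not interpreted.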
.ralph.toml                  # Project configuration ([execution], [agent])
.ralph/
  progress.db                # SQLite DAG database (gitignored)
  features/                  # Feature specs and plans
    <name>/
      spec.md                # Feature specification
      plan.md                # Implementation plan
  knowledge/                 # Project knowledge entries
    <entry-name>.md          # Tagged markdown knowledge file
.claude/
  skills/                    # Reusable agent skills
    <name>/
      SKILL.md               # Skill definition with YAML frontmatter
The .ralph.toml file controls project-level defaults:
[execution]
# max_retries = 3
# verify = true
[agent]
# command = "claude"
Tasks are stored in a SQLite database with:
- Hierarchical relationships -- parent/child tasks with derived parent status
- Dependencies -- blocker/blocked relationships with cycle detection
- Status transitions -- pending -> in_progress -> done/failed, with auto-transitions (completing a task unblocks its dependents; completing all children auto-completes the parent)
- Claim system -- each running Ralph agent gets a unique ID (agent-{8 hex}) and claims tasks atomically
- Feature scoping -- tasks belong to features and are queried by feature context during execution
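The dependency and claim behaviour described above can be sketched against an in-memory SQLite database. The table and column names below are hypothetical (the real schema in .ralph/progress.db is not documented here), and the sketch assumes that a lower priority number means "run first".

```python
import secrets
import sqlite3

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE tasks (
    id TEXT PRIMARY KEY,
    status TEXT NOT NULL DEFAULT 'pending',
    priority INTEGER NOT NULL DEFAULT 0,
    created_at INTEGER NOT NULL,
    claimed_by TEXT
);
CREATE TABLE dependencies (
    blocker_id TEXT NOT NULL REFERENCES tasks(id),
    blocked_id TEXT NOT NULL REFERENCES tasks(id)
);
"""


def new_agent_id() -> str:
    """Build an ID of the form agent-{8 hex}."""
    return f"agent-{secrets.token_hex(4)}"


def claim_next_ready(conn: sqlite3.Connection, agent_id: str):
    """Claim the next pending task whose blockers are all done; return its ID or None."""
    row = conn.execute(
        """
        SELECT t.id FROM tasks t
        WHERE t.status = 'pending' AND t.claimed_by IS NULL
          AND NOT EXISTS (
              SELECT 1 FROM dependencies d
              JOIN tasks b ON b.id = d.blocker_id
              WHERE d.blocked_id = t.id AND b.status != 'done'
          )
        ORDER BY t.priority, t.created_at
        LIMIT 1
        """
    ).fetchone()
    if row is None:
        return None
    # The "claimed_by IS NULL" guard keeps the claim atomic: if another
    # agent claimed the task first, no row is updated.
    cur = conn.execute(
        "UPDATE tasks SET status = 'in_progress', claimed_by = ? "
        "WHERE id = ? AND claimed_by IS NULL",
        (agent_id, row[0]),
    )
    conn.commit()
    return row[0] if cur.rowcount == 1 else None
```

Because readiness is derived from the blockers' status at query time, marking a blocker done is enough to unblock its dependents; no separate "unblock" write is needed in this sketch.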
- Feature-driven -- Work is organized into features with specifications, plans, and task DAGs. The lifecycle provides structure and traceability.
- DAG-first -- All work is tracked as tasks in a SQLite DAG. No work happens outside the DAG.
- One task per iteration -- Each agent invocation works on exactly one claimed task, keeping context focused.
- Signal-driven -- The agent communicates results via sigils (<task-done>, <task-failed>, <promise>, <next-model>). Ralph never interprets the agent's prose.
- Auto-transitions -- The DAG manages cascading state changes: completing a task unblocks dependents; completing all children auto-completes the parent; failing a child auto-fails the parent.
- Verify then trust -- A read-only verification agent checks each completed task before accepting it.
- Agent-agnostic -- Ralph works with any ACP-compliant agent binary, configurable via --agent flag, RALPH_AGENT env var, or [agent].command in .ralph.toml. The default agent is claude.
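As an illustration of the agent-agnostic setting, the sketch below resolves the agent command from the three sources named above. The precedence (flag, then environment variable, then .ralph.toml, then the claude default) is an assumption based on the order in which they are listed, and resolve_agent is a hypothetical name; config stands for the already-parsed contents of .ralph.toml.

```python
import os
from typing import Mapping, Optional

DEFAULT_AGENT = "claude"


def resolve_agent(
    cli_agent: Optional[str],
    config: Mapping[str, Mapping[str, str]],
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the agent command; assumed precedence: flag > env > config > default."""
    env = os.environ if env is None else env
    if cli_agent:
        return cli_agent  # --agent flag
    if env.get("RALPH_AGENT"):
        return env["RALPH_AGENT"]  # RALPH_AGENT environment variable
    # [agent].command in .ralph.toml, falling back to the default agent.
    return config.get("agent", {}).get("command") or DEFAULT_AGENT
```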
After each task completion, Ralph spawns a read-only agent session that:
- Reads the relevant source files
- Runs applicable tests
- Checks acceptance criteria from the task description
- Emits <verify-pass/> or <verify-fail>reason</verify-fail>
Failed verifications trigger a retry (up to --max-retries). Disable
verification with --no-verify.
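The verify-and-retry decision can be sketched as two small functions. The names are hypothetical, and two details are assumptions: a <verify-fail> takes priority if both sigils appear, and output with no verification sigil at all is treated as a failure.

```python
import re
from typing import Optional, Tuple

PASS_RE = re.compile(r"<verify-pass\s*/>")
FAIL_RE = re.compile(r"<verify-fail>(.*?)</verify-fail>", re.DOTALL)


def parse_verdict(output: str) -> Tuple[bool, Optional[str]]:
    """Return (passed, reason) from the verification agent's output."""
    fail = FAIL_RE.search(output)
    if fail:
        # Assumption: an explicit failure wins over a pass sigil.
        return False, fail.group(1).strip()
    if PASS_RE.search(output):
        return True, None
    # Assumption: a missing sigil counts as a failed verification.
    return False, "no verification sigil found"


def next_action(passed: bool, retries_used: int, max_retries: int) -> str:
    """Map a verdict to 'complete', 'retry', or 'fail'."""
    if passed:
        return "complete"
    return "retry" if retries_used < max_retries else "fail"
```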
Ralph maintains two complementary memory systems that feed context into each iteration's system prompt:
- Run Journal -- Each iteration writes a journal entry to SQLite (outcome, model, duration, cost, files modified, notes from <journal> sigils). Smart selection combines recent entries from the current run with FTS5 full-text search matches from prior runs, within a 3000-token budget.
- Project Knowledge -- Reusable knowledge entries stored as tagged markdown files in .ralph/knowledge/. The agent emits <knowledge> sigils to create entries. Discovery scans the directory and scores entries by tag relevance to the current task, feature, and recently modified files. Entries support [[Title]] references for zettelkasten-style cross-linking; link expansion pulls in related entries not directly matched by tags. Rendered within a 2000-token budget.
Both systems are always active -- there is no toggle to disable them.
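A rough sketch of how knowledge discovery could work: score entries by tag overlap, expand [[Title]] links, then keep entries while they fit the budget. Everything here is an assumption made for illustration, including the Entry type, the scoring by simple tag overlap, and the crude estimate of four characters per token; Ralph's actual scoring and token counting are not specified above.

```python
import re
from dataclasses import dataclass

LINK_RE = re.compile(r"\[\[([^\[\]]+)\]\]")


@dataclass
class Entry:
    title: str
    tags: set
    body: str


def estimate_tokens(text: str) -> int:
    # Crude assumption: roughly four characters per token.
    return max(1, len(text) // 4)


def select_entries(entries, task_tags, budget=2000):
    """Return titles of entries to render, best matches first, within the budget."""
    by_title = {e.title: e for e in entries}
    # Entries that share at least one tag, ordered by overlap size.
    matched = sorted(
        (e for e in entries if e.tags & task_tags),
        key=lambda e: len(e.tags & task_tags),
        reverse=True,
    )
    chosen = list(matched)
    seen = {e.title for e in chosen}
    # Link expansion: pull in entries referenced via [[Title]] that the
    # tag match did not already select.
    for e in matched:
        for title in LINK_RE.findall(e.body):
            linked = by_title.get(title)
            if linked is not None and linked.title not in seen:
                chosen.append(linked)
                seen.add(linked.title)
    titles, used = [], 0
    for e in chosen:
        cost = estimate_tokens(e.body)
        if used + cost <= budget:
            titles.append(e.title)
            used += cost
    return titles
```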
Ralph swaps between Claude models (opus, sonnet, haiku) across loop
iterations to optimize cost and capability. Use --model-strategy to select a
strategy, or --model to pin a specific model.
ralph run auth --model=opus # Always use opus (fixed)
ralph run auth --model-strategy=cost-optimized # Default: pick model by progress signals
ralph run auth --model-strategy=escalate # Start at haiku, escalate on errors
ralph run auth --model-strategy=plan-then-execute # Opus for iteration 1, sonnet after
- cost-optimized (default) -- Picks the cheapest model likely to succeed. Defaults to sonnet; escalates to opus on error/failure signals; drops to haiku when tasks are completing cleanly.
- fixed -- Always uses the model from --model. No swapping.
- escalate -- Starts at haiku. On failure signals (errors, stuck, panics), escalates to sonnet then opus. Never auto-de-escalates; only a Claude hint can step back down.
- plan-then-execute -- Uses opus for the first iteration (planning), then sonnet for all subsequent iterations (execution).
Claude can override the strategy for the next iteration by emitting a
<next-model> sigil in its output:
<next-model>opus</next-model>
<next-model>sonnet</next-model>
<next-model>haiku</next-model>
Hints always override the strategy's choice, apply to the next iteration only, and are optional.
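The escalate strategy and the hint override can be sketched in a few lines. This is a simplified model, not Ralph's implementation: the function name is hypothetical, and "failure signal" is reduced to a single boolean.

```python
from typing import Optional

# Cheapest to most capable.
LADDER = ["haiku", "sonnet", "opus"]


def next_model_escalate(current: str, failed: bool, hint: Optional[str] = None) -> str:
    """Choose the model for the next iteration under the escalate strategy."""
    if hint in LADDER:
        # A <next-model> hint always overrides the strategy, in either direction.
        return hint
    if failed:
        # Step up one rung on failure, stopping at the top of the ladder.
        idx = LADDER.index(current)
        return LADDER[min(idx + 1, len(LADDER) - 1)]
    # Never de-escalate automatically.
    return current
```

Since a hint applies to the next iteration only, a caller would pass the hint once and then go back to passing None.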
ralph [--no-ui] init Initialize a new Ralph project
ralph [--no-ui] feature create <name> Create feature: spec → plan → task DAG
ralph [--no-ui] feature list List all features and their status
ralph [--no-ui] feature delete <name> [-y] Delete a feature (UI confirm unless -y)
ralph [--no-ui] task add <TITLE> Add a standalone task (scriptable)
ralph [--no-ui] task create Interactively create a task (Claude-assisted)
ralph [--no-ui] task list List tasks
ralph [--no-ui] task delete <id> [-y] Delete task (UI confirm unless -y)
ralph [--no-ui] task done <id> [-y] Mark task done (UI confirm unless -y)
ralph [--no-ui] task fail <id> [-y] Mark task failed (UI confirm unless -y)
ralph [--no-ui] task reset <id> [-y] Reset task to pending (UI confirm unless -y)
ralph [--no-ui] run <target> Run the agent loop on a feature or task
ralph [--no-ui] auth Authenticate with the agent
Global option:
--no-ui -- disable ratatui and force plain terminal output
ralph run [OPTIONS] <TARGET>
Arguments:
<TARGET> Feature name or task ID (t-...)
Options:
--limit <N> Maximum iterations (0 = unlimited)
--model <MODEL> Model: opus, sonnet, haiku (implies --model-strategy=fixed)
--model-strategy <STRATEGY>
Strategy: fixed, cost-optimized, escalate, plan-then-execute
[default: cost-optimized]
--max-retries <N> Maximum retries for failed tasks
--no-verify Disable autonomous verification
--agent <CMD> Agent command (env: RALPH_AGENT, default: claude)
-h, --help Print help
The create subcommand accepts --model <MODEL> and --agent <CMD> flags.
| Variable | Description |
|---|---|
| RALPH_LIMIT | Default iteration limit |
| RALPH_MODEL | Default model (opus/sonnet/haiku) |
| RALPH_MODEL_STRATEGY | Default model strategy |
| RALPH_AGENT | Agent command (default: claude) |
| RALPH_UI | UI mode: auto (default), 1/on, 0/off |
| RALPH_ITERATION | Current iteration (for resume) |
| RALPH_TOTAL | Total iterations (for display) |
| Exit Code | Outcome | Meaning |
|---|---|---|
| 0 | Complete | All tasks resolved |
| 0 | LimitReached | Iteration limit hit (not an error) |
| 1 | Failure | Critical failure (FAILURE sigil or error) |
| 2 | Blocked | No ready tasks but incomplete tasks remain |
| 3 | NoPlan | DAG is empty -- run ralph feature create |
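For scripts and CI jobs that wrap Ralph, the exit codes in the table can be mapped to readable outcomes. The sketch below is one way to do that; the enum and function names are hypothetical, and run_feature assumes ralph is on PATH.

```python
import subprocess
from enum import IntEnum


class RalphExit(IntEnum):
    OK = 0        # Complete, or LimitReached (not an error)
    FAILURE = 1   # Critical failure
    BLOCKED = 2   # No ready tasks, but incomplete tasks remain
    NO_PLAN = 3   # DAG is empty


def describe_exit(code: int) -> str:
    """Map a ralph run exit status to a short message for logs or CI output."""
    messages = {
        RalphExit.OK: "all tasks resolved or iteration limit reached",
        RalphExit.FAILURE: "critical failure",
        RalphExit.BLOCKED: "blocked: no ready tasks but incomplete tasks remain",
        RalphExit.NO_PLAN: "no plan: run ralph feature create first",
    }
    try:
        return messages[RalphExit(code)]
    except ValueError:
        return f"unexpected exit code {code}"


def run_feature(target: str, limit: int = 1) -> str:
    """Run the loop in plain-output mode and describe how it ended."""
    result = subprocess.run(["ralph", "--no-ui", "run", target, f"--limit={limit}"])
    return describe_exit(result.returncode)
```

Note that exit code 0 covers both Complete and LimitReached, so a wrapper cannot tell the two apart from the status alone.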
Requires a Rust toolchain. With Nix:
nix develop
cargo build
cargo test
Pull requests and pushes to main run the ci-smoke workflow
(.github/workflows/ci-smoke.yml), which:
- Builds mock agent binaries (--features test-mock-agents --examples)
- Runs cargo test (unit + integration)
- Runs TTY smoke tests via expect (tests/smoke/tty_smoke.sh)
- Runs non-TTY fallback assertions (tests/smoke/non_tty_smoke.sh)
To run smoke tests locally (requires expect):
cargo build --features test-mock-agents --examples
bash tests/smoke/tty_smoke.sh
bash tests/smoke/non_tty_smoke.sh
Releases are built by cargo-dist and published via GitHub Actions
when a v* tag is pushed. To cut a release: bump the version in Cargo.toml,
commit, tag vX.Y.Z, and push the tag. The CI produces platform tarballs,
an installer script, and checksums.
Heavily inspired by Chris Barrett's shell-based ralph harness.
MIT