Rust CLI that orchestrates autonomous coding agents in a managed development loop. Spawns agents, assigns features, verifies results, accumulates context, repeats. Single binary, zero external dependencies.
Forge runs a loop:
┌──────────────────────────────────────────┐
│ 1. Pick next unblocked feature │
│ 2. Spawn agent (Claude / Codex) │
│ 3. Agent implements + commits │
│ 4. CLI runs verify script (exit code) │
│ 5. Pass → done. Fail → reopen. │
│ 6. Orchestrating agent reviews session │
│ 7. Git sync, repeat │
└──────────────────────────────────────────┘
Key ideas:
- Verify scripts are the trust layer — agents claim "done", scripts confirm with exit codes
- Context accumulates in git — decisions, gotchas, patterns, and references persist across sessions
- Skills are markdown — four skills (planning, protocol, orchestrating, adjusting) installed as
.claude/skills/ - CLI never calls an LLM — all intelligence is in the skills, CLI is pure orchestration
- Multi-agent via git worktrees — parallel agents work in isolated branches, merged back after each round
cargo install --path .Requires Claude Code or Codex CLI installed and authenticated.
# 1. Initialize a forge project
forge init "My REST API"
# 2. Plan features (interactive, inside Claude Code)
# Run /forge-planning in a Claude Code session
# 3. Run the autonomous loop
forge runAfter forge init, your project has:
forge.toml # project config
features.json # task list (fill via /forge-planning)
CLAUDE.md # agent instructions (~40 lines)
AGENTS.md # same, for non-Claude agents
context/ # decisions/, gotchas/, patterns/, references/
feedback/ # verify reports, session reviews
scripts/verify/ # one script per feature (exit 0 = pass)
.claude/skills/ # 4 skills installed
forge init <description> # scaffold project
forge run # start development loop (1 agent)
forge run --agents 3 # parallel agents with git worktrees
forge run --max-sessions 10 # cap iterations
forge verify # run all verify scripts
forge status # show feature progress + context counts
forge stop # graceful stop after current session
forge logs agent-1 # tail agent log
forge logs agent-1 -t 100 # last 100 linesforge.toml:
[project]
name = "my-app"
stack = "Rust, axum, sqlx"
# Each role picks its own backend + model
[forge.roles.protocol] # executor: implements features
backend = "claude"
model = "sonnet"
[forge.roles.orchestrating] # reviewer: post-session feedback
backend = "claude"
model = "haiku"
[forge.roles.planning] # architect: feature decomposition
backend = "codex"
model = "o3"
[principles]
readability = "Code understood in one read after an all nighter"
proof = "Tests prove code works, not test that it works"
style = "Follow a style even in private projects"
boundaries = "Divide at abstraction boundaries. APIs guide communication."
[scopes.data-model]
owns = ["src/models/", "src/schema/"]
[scopes.auth]
owns = ["src/auth/"]
upstream = ["data-model"]Supported backends: claude (Claude Code), codex (OpenAI Codex CLI), or any binary name for custom backends.
features.json — the task list agents work from:
{
"features": [
{
"id": "f001",
"type": "implement",
"scope": "data-model",
"description": "Create User struct with validation",
"verify": "./scripts/verify/f001.sh",
"depends_on": [],
"priority": 1,
"status": "pending",
"claimed_by": null,
"blocked_reason": null
}
]
}Statuses: pending → claimed → done (or blocked).
Single agent (forge run):
- Load
features.json, find highest-priority unblocked pending feature - Spawn agent subprocess (
claude --printorcodex exec) - Agent reads CLAUDE.md, claims feature, implements, runs verify, commits
- CLI runs all verify scripts, writes
feedback/last-verify.json - Failed features get reopened automatically
- Git pull to sync
- Orchestrating agent reviews the session, writes
feedback/session-review.mdand context entries - Next iteration
Multi-agent (forge run --agents N):
- Pick up to N claimable features
- Create git worktrees (one per agent, isolated branches)
- Spawn N agents in parallel
- Wait for all to finish
- Merge branches back into main (conflicts → abort + retry next round)
- Verify, orchestrate, repeat
Four markdown skills installed in .claude/skills/:
| Skill | Mode | Purpose |
|---|---|---|
forge-planning |
Interactive | Design doc → feature decomposition |
forge-protocol |
Automated | Agent executor: claim → implement → verify → commit |
forge-orchestrating |
Automated | Post-session review: feedback + context writing |
forge-adjusting |
Interactive | Replan based on new context |
cargo test # 66 tests
cargo build # debug build
cargo clippy # lint