Skip to content

Luke-Bradford/autonomy-engine

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

36 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

autonomy-engine

Repo-agnostic engine for running Claude Code (and, in future, other CLI agents) autonomy loops against any target repo, from one operator's account.

Quickstart

One guided entry point chains the whole onboarding sequence:

bin/quickstart.sh /path/to/target-repo

It scaffolds the .autonomy/ pack, walks you through the minimum config (board owner/title, model, merge-gate strategy — Enter keeps the shown value; writes are comment-preserving), runs the doctor report, offers to create the dedicated worktree + launchd plist, offers to register the repo for the dashboard, and prints the exact go-live commands. It never runs launchctl itself — loading the supervisor stays a deliberate operator step.

Idempotent: re-run it any time; pressing Enter everywhere changes nothing. Every prompt has a flag twin (--board-owner, --board-title, --model, --merge-gate, --worktree yes|no, --register yes|no) for non-interactive use.

Under the hood it runs the same tools you can drive by hand:

# 1. Scaffold a new target repo's pack:
bin/onboard.sh /path/to/target-repo
# edit /path/to/target-repo/.autonomy/config.yaml

# 2. Check it's ready:
bin/doctor.sh /path/to/target-repo

# 3. Create its dedicated worktree + launchd plist:
bin/setup_worktree.sh /path/to/target-repo

# 4. Load it (see setup_worktree.sh's own printed next-steps for the exact commands)

The .autonomy/ pack contract

Every target repo needs a .autonomy/ directory with exactly these three files:

  • loop_prompt.md — the standing task, passed as the primary prompt (claude -p).
  • hard_rules.md — non-negotiable safety rules, appended to the session's system prompt.
  • config.yaml — project policy (see schema below). bin/onboard.sh scaffolds all three from templates/autonomy-pack/.

.autonomy/config.yaml existing and parsing is the engine's hard requirement for treating a directory as a valid target repo — doctor.sh/supervisor.sh both refuse to proceed without it.

config.yaml schema

board:
  owner: <github-user-or-org>
  project_title: "<Projects v2 board title>"

engine:
  label: <slug>                # optional; disambiguates two repos sharing a basename
  requires_claude_md: <bool>    # hard-fail (not just warn) if .claude/CLAUDE.md is missing
  account_key: <key>            # optional; which account's shared usage-limit marker this
                                # repo participates in (default: "default" -- all repos on
                                # this machine share one account's rate-limit state)

agent:
  type: claude                  # claude | codex (only claude has an adapter implemented)
  model:
    primary: <model-id>
    fallback: <model-id>
  config: {}                    # opaque, adapter-owned pass-through (unused by the claude adapter)

merge_gate:
  strategy: manual | ci_only | bot_comment | gh_review
  author_login: <string>        # bot_comment only
  marker: <string>               # bot_comment only
  doc_only_extensions: [<ext>]   # bot_comment only, e.g. [".md"]
  reviewer_login: <string>       # gh_review only

worktree:
  default_path: "../.{repo-slug}-autonomy"  # worktree path: positional arg overrides, {repo-slug} substituted

Every value is either optional-with-an-engine-default or required-only-for-the-strategy-that-uses- it. Nothing in the engine hardcodes any one repo's actual values.

{repo-slug} = engine.label if set, else the target repo's directory basename, lowercased, non-alphanumeric runs collapsed to -.

Merge-gate strategies

CI-green is checked first (any failing/pending check refuses; a gh API failure itself refuses, never treated as green; ci_only additionally refuses on zero configured checks). Then:

Strategy What it checks
manual (default) Nothing further — never auto-merges. PRs stay open for a human.
ci_only Nothing further — CI green is the whole gate.
bot_comment Latest matching issue comment (by author_login + marker) postdates the head commit and reads APPROVE, no BLOCKING/REQUEST CHANGES. Includes a doc-only fast path for PRs where every changed file matches doc_only_extensions.
gh_review Latest GitHub Review object from reviewer_login postdates the head commit and its state == APPROVED.

bin/safe_merge.sh <pr-number> is the only sanctioned merge path — the loop must never call gh pr merge directly.

bin/ reference

Script Purpose
supervisor.sh --repo <path> [--agent-type] [--model] [--fallback-model] [--label] The main loop launchd runs
quickstart.sh <target-repo> [flags] Guided single-entry onboarding: onboard → minimum config → doctor → optional worktree → optional dashboard registration → printed go-live commands (never runs launchctl)
control.sh list | register | unregister | start | stop | pause | resume Multi-repo control unit over ~/.config/autonomy/repos: loop states from the supervisor's own lock/sentinel, start/stop via launchctl against the installed plists, graceful pause/resume via the sentinel. --all fans out. Never provisions
onboard.sh <target-repo> Scaffold .autonomy/ (idempotent)
doctor.sh <target-repo> Full readiness report (network calls; diagnostic-only, never provisions)
setup_worktree.sh <target-repo> [worktree-path] Create/reuse the dedicated worktree + install the launchd plist
worktree_gc.sh --repo <path> Prune stale worktrees + merged branches
safe_merge.sh <pr-number> The only sanctioned merge path
board.sh status <issue#> "<status>" | add <issue#> Best-effort GitHub Projects v2 board updates
unblock_dependents.sh <merged-pr-number> Post-merge "blocked by #X" notifier
dashboard.py --repo <path> [--repo …] [--port 8787] Control-room page. Stdlib HTTP+SSE, binds 127.0.0.1 only. Renders the engine's emitted artifacts (session logs, supervisor.log, git/gh, config, quota); lifecycle controls (start / graceful-stop / hard-stop / resume) via a token-guarded POST
agents/claude.sh The Claude Code agent adapter (only one implemented)

Lifecycle: start / graceful-stop / hard-stop

Three levers, distinct on purpose:

  • Start (hard) / stop (hard): launchctl bootstrap / launchctl bootout the plist setup_worktree.sh installed. Hard-stop kills the supervisor process; a mid-session hard-stop interrupts the running agent.
  • Graceful-stop (pause): create the sentinel file <target-repo>/var/autonomy-logs/autonomy-PAUSE. The supervisor checks it at the top of the loop, so the current session always finishes — never a mid-session kill — then idles (polling every PAUSE_POLL=30s). It logs the pause once to supervisor.log.
  • Resume / start-if-stopped: remove the sentinel; the supervisor logs "resuming" and continues. (Under launchd KeepAlive=true, exiting on pause would just be relaunched — idling is the only stop that holds, which is why graceful-stop pauses rather than exits.)

The dashboard's graceful-stop / resume controls (issue #10) drive this sentinel.

All three levers are also available across every registered repo at once through bin/control.sh (start/stop wrap launchctl against the installed plist; pause/resume wrap the sentinel; --all fans out over ~/.config/autonomy/repos).

Usage-limit backoff (account-shared)

When a session hits the account rate limit, the agent adapter extracts the API-reported reset epoch and the supervisor persists it (that split is an invariant) — to the repo-local var/autonomy-logs/.last_usage_reset and to an account-shared marker ~/.config/autonomy/usage-reset.<account_key>, so parallel supervisors on the same account back off together instead of each rediscovering the wall. Waits use the latest valid epoch across both files; garbage/stale/torn markers are ignored per-file (fail-safe), and a clean session clears both. Repos on different accounts set engine.account_key to keep their markers separate.

Control room (P1 dashboard)

bin/dashboard.py --repo /path/to/worktree [--repo /another] [--port 8787]
# open http://127.0.0.1:8787/

--repo can be omitted: discovery falls back to the AUTONOMY_DASHBOARD_REPOS environment variable (newline-separated paths), then to ~/.config/autonomy/repos (one path per line — quickstart.sh's register step appends here). CLI --repo always wins.

A single self-contained local page — stdlib HTTP + SSE, localhost-bind only, no build step. It exposes what the engine already emits, nothing invented:

  • Now — each repo's current worker: working / idle / paused / stopped (working-vs-idle is the freshness of the session log, not the lock pid), current step, in-flight ticket, elapsed.
  • Repos & roles — the multi-role roster (Coder live; PM/QA/Researcher shown per docs/agent-org-design.md, rendered even before they're built).
  • Activity — tree / timeline / tally over the session's stream-json tool calls (subagents nest by parent_tool_use_id).
  • Account quota — real 5-hour and weekly utilization from rate_limit_events.
  • Throughput — server-sampled output tok/min over wall-clock (flatlines when idle).
  • Supervisor voice — the loop's own decisions, tailing supervisor.log.
  • Git in flight — open PRs (CI / review / mergeable) + recently-merged tickets.

Lifecycle controls (per running/stopped repo, in the Now cards): start / graceful-stop / hard-stop / resume. Behind POST /api/control, which requires a per-process token embedded in the served page (defeats cross-origin / DNS-rebinding drive-by) and only ever acts on a managed repo — lifecycle only, never a target repo's trade/order path. Hard-stop and start prompt for confirmation; start never auto-fires. graceful-stop/resume drive the autonomy-PAUSE sentinel; hard-stop/start drive launchctl bootout/bootstrap on the repo's installed plist.

Design + the research behind it: docs/dashboard-design.md, docs/control-room-research.md. Dark/light toggle.

Agent adapters

bin/agents/<type>.sh, dispatched by agent.type. Each implements two functions:

  • agent_invoke(prompt_file, safety_file, model, fallback_model, log_file) -> exit code
  • agent_classify_outcome(log_file, exit_code) -> "success" | "usage_limit [epoch]" | "error"

Only claude.sh exists today. A codex.sh is a real future possibility (Codex's CLI differs structurally — no system-prompt-append flag, its own JSONL schema, no native fallback-model support) but is not built or tested here.

Testing

bash tests/run_all.sh

About

Repo-agnostic engine for running Claude Code autonomy loops against any target repo

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors