yusugomori/agent-loop

Agent Loop

Run Codex as a self-correcting local agent loop.

Agent Loop turns one Codex task into a local, inspectable workflow: requirements, research, planning, implementation, verification, review, and summary. Each phase writes durable artifacts into your repository, so a run can be followed, resumed, reviewed, or committed from the same local state.

The CLI is the source of truth. The Electron desktop companion watches those same runs, opens diffs, resumes work, and can switch into PET mode when you want a visual status monitor.

Start in the terminal. Agent Loop streams the phase loop and writes durable run artifacts.

Watch runs, inspect diffs, scroll through details, and switch PET mode on in the desktop companion.

The public website is deployed with Cloudflare Pages at https://agent-loop.dev. The site is a static bundle under docs/site: leave the Pages build command empty, set the build output directory to docs/site, and keep website assets under docs/site/assets/.

License

Agent Loop is licensed under the Apache License, Version 2.0.

Copyright 2026 Yusuke Sugomori.

Agent Loop Model

One command starts a local workflow of specialized agents:

  1. The Requirements agent turns the task into acceptance criteria and constraints.
  2. The Researcher reads the repository and records concrete facts, patterns, risks, and likely tests.
  3. The Planner writes an executable plan, and the Plan Reviewer gates it before code changes begin.
  4. The Implementer edits the worktree, then the Verifier runs deterministic checks and risk gates.
  5. The Reviewer inspects the implementation and decides the next loop action: finish, retry implementation, revise the plan, redo research, revise requirements, or stop.
  6. The Summarizer writes the final human-facing run summary and patch context.

In workstream mode, a Lane Planner decomposes implementation into isolated writer worktrees, then the Integrator combines passed lane patches into one final diff before final verification and review.

Autonomous does not mean opaque or unbounded. The run stays local, final diffs remain under your control, and .agent-loop/runs/<run-id>/ is the recovery contract for status, resume, review, commit, and cleanup.

Quickstart

Agent Loop is not yet published to npm, so the verified way to run it is from this source checkout.

corepack enable
pnpm install
pnpm build
pnpm agent-loop run --repo . --task "Fix the failing tests and review the diff."

Use --requirements path/to/task.md instead of --task when the task lives in a Markdown file. Exactly one of --task or --requirements is required.

Inspect and continue runs:

pnpm agent-loop status --repo .
pnpm agent-loop status .agent-loop/runs/<run-id> --events 20
pnpm agent-loop diff .agent-loop/runs/<run-id> --summary
pnpm agent-loop resume .agent-loop/runs/<run-id>
pnpm agent-loop commit .agent-loop/runs/<run-id>
pnpm agent-loop clean .agent-loop/runs/<run-id> --dry-run

Use The CLI From Anywhere

The pnpm agent-loop commands above run the source checkout's development script. To make agent-loop available from any directory on your machine, build and link this checkout into your global pnpm bin directory:

pnpm build
pnpm link --global

Then run Agent Loop from the repository you want it to operate on:

agent-loop run --repo . --task "Fix the failing tests and review the diff."

The examples below keep using pnpm agent-loop so they work from this checkout; after linking, use the same arguments with agent-loop.

If pnpm reports that the global bin directory is not configured, run pnpm setup, restart your shell, and retry pnpm link --global. After changing the TypeScript source, rerun pnpm build; the global link will keep pointing at this checkout's built CLI.

The package also exposes an agent-loop binary in built and packed installs. Use the local smoke script before publishing or documenting package install commands:

npm run smoke:npm-install

Desktop Companion

Run the Electron companion from source:

pnpm app:dev

Build or package it locally:

pnpm app:build
pnpm app:package:mac

The desktop app can choose a repository, list existing runs, open run detail, start a new run, resume or force resume, show diff views, preview cleanup, clean runs, and commit completed work. It observes CLI-started runs through .agent-loop/runs/<run-id>/ and uses the same workflow APIs for GUI-started runs.

Electron is optional. The CLI does not require a desktop process, daemon, local web server, or long-lived background service.

Workflow

flowchart LR
  Start["CLI run\nor desktop New Run"] --> Req["Requirements agent\nshape task"]
  Req --> Research["Researcher agent\nread code, docs, history"]
  Research --> Plan["Planner agent\nwrite executable plan"]
  Plan --> Gate["Plan reviewer\ngate scope and fit"]
  Gate -- "approved" --> Implement["Implementer agent\nedit scoped worktrees"]
  Gate -- "revise plan" --> Plan
  Gate -- "needs research" --> Research

  Implement --> Verify["Verifier\nchecks + risk gates"]
  Verify -- "blocking risk" --> Stop["Stopped\npolicy or limit"]
  Verify -- "report" --> Review["Reviewer agent\ninspect diff + evidence"]
  Review -- "approved" --> Ship["Ready diff\nsummary + commit"]
  Review -- "changes requested" --> Implement
  Review -- "needs plan" --> Plan
  Review -- "needs research" --> Research
  Review -- "needs requirements" --> Req
  Review -- "stop" --> Stop

  Start -. "creates" .-> Runs[".agent-loop/runs/<run-id>\nstate, events, artifacts, patches"]
  Req -. "updates" .-> Runs
  Research -. "updates" .-> Runs
  Plan -. "updates" .-> Runs
  Gate -. "updates" .-> Runs
  Implement -. "updates" .-> Runs
  Verify -. "updates" .-> Runs
  Review -. "updates" .-> Runs
  Stop -. "updates" .-> Runs
  1. Start a run from the terminal or the desktop app.
  2. Agent Loop turns the task into a chain of phase-agent artifacts.
  3. Verification and review decide whether the run is done or which earlier phase should run again.
  4. The CLI streams phase logs while the desktop companion follows the same run directory.
  5. Inspect status, artifacts, summaries, and diffs, then resume, clean, or commit when the run is ready.
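
The loop-back edges leaving the Reviewer in the flowchart can be read as a small transition function. The following is a minimal sketch under stated assumptions: the decision labels are copied from the diagram, but the type names, the function name, and the rule that a loop-back is refused once the iteration budget is spent are illustrative and are not taken from Agent Loop's source.

```typescript
// Decision labels mirror the flowchart edges. Everything else (type names,
// function name, the iteration-budget rule) is hypothetical.
type Phase = "requirements" | "research" | "plan" | "implement";

type ReviewDecision =
  | "approved"
  | "changes requested"
  | "needs plan"
  | "needs research"
  | "needs requirements"
  | "stop";

type NextStep =
  | { kind: "done" }
  | { kind: "rerun"; phase: Phase }
  | { kind: "stopped"; reason: string };

function nextStepAfterReview(
  decision: ReviewDecision,
  iteration: number,
  maxIterations: number,
): NextStep {
  if (decision === "approved") return { kind: "done" };
  if (decision === "stop") {
    return { kind: "stopped", reason: "reviewer requested stop" };
  }
  // Assumption: looping back is refused once max_iterations is reached.
  if (iteration >= maxIterations) {
    return { kind: "stopped", reason: "max_iterations reached" };
  }
  const target: Record<Exclude<ReviewDecision, "approved" | "stop">, Phase> = {
    "changes requested": "implement",
    "needs plan": "plan",
    "needs research": "research",
    "needs requirements": "requirements",
  };
  return { kind: "rerun", phase: target[decision] };
}
```

Modelling the outcome as a tagged union keeps "done", "rerun", and "stopped" distinct, which matches how the diagram separates the Ready diff, loop-back, and Stopped nodes.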

PET Mode

PET mode turns run status dots into animated sprites. Agent Loop can load custom pets from $CODEX_HOME/pets and, on macOS, import compatible pets from Codex.app when its app archive is available.

The launch media uses Miso as a local custom pet. Miso is demo-only here; Agent Loop does not require or bundle Miso.

Architecture

flowchart TB
  subgraph Execute["Execution"]
    CLI["CLI\nagent-loop run / resume"] --> Workflow["Agent loop orchestrator\nrunAgentLoop / resumeAgentLoop"]
    GUIStart["Desktop\nNew Run / Resume"] --> Workflow
    Workflow --> Codex["Codex app-server\nphase-agent turns"]
    Workflow --> Telemetry["RunTelemetry"]
    Workflow --> Worktrees[".agent-loop/worktrees/<run-id>\noptional isolated changes"]
    Telemetry --> Disk[".agent-loop/runs/<run-id>\nstate.json\nevents.jsonl\nartifacts\nfinal.patch\nsummary.md"]
  end

  subgraph Read["Read and observe"]
    Disk --> RunStore["RunStore"]
    Disk --> Observer["FileRunObserver\nwatch + polling fallback"]
    Disk --> CLIRead["CLI\nstatus / diff / commit / clean"]
    RunStore --> Services["App services\nview models"]
    Observer --> Services
    Services --> Desktop["Electron renderer\nruns, detail, diff, PET"]
  end

  Workflow -. "live GUI-started events" .-> LiveHub["InProcessRunEventHub"]
  LiveHub --> Desktop

Workflow execution is owned by runAgentLoop and resumeAgentLoop. During a run, RunTelemetry updates the durable run directory with state.json, events.jsonl, artifacts, patches, and summaries under .agent-loop/runs/<run-id>/. RunStore and the observation layer are the supported read side for status screens, tools, and UI clients.

codex app-server remains an execution detail behind the workflow APIs. GUI status comes from Agent Loop workflow state, events, and artifacts, not from direct inspection of app-server protocol traffic.

The current observer backend is file-based: it reads snapshots with RunStore, tails events.jsonl, watches run directories, and uses polling fallback when filesystem events are unavailable. This lets Electron discover CLI-started runs even if they began before Electron opened.
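
Tailing events.jsonl comes down to splitting newly appended text into complete lines and holding back any partial trailing record until the next read. The sketch below illustrates that idea only; it is not FileRunObserver's implementation, and the RunEvent shape (a string type field plus arbitrary extras) is an assumption, since the event schema is not described here.

```typescript
// Hypothetical event shape: only a string `type` field is assumed.
interface RunEvent {
  type: string;
  [key: string]: unknown;
}

interface TailState {
  // Text after the last newline seen so far; may be a partial record.
  remainder: string;
}

function consumeChunk(
  state: TailState,
  chunk: string,
): { events: RunEvent[]; state: TailState } {
  const lines = (state.remainder + chunk).split("\n");
  // The last element is "" when the chunk ended on a newline,
  // otherwise it is an incomplete line that must wait for more data.
  const remainder = lines.pop() ?? "";
  const events: RunEvent[] = [];
  for (const line of lines) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const parsed: unknown = JSON.parse(trimmed);
      if (
        typeof parsed === "object" &&
        parsed !== null &&
        typeof (parsed as { type?: unknown }).type === "string"
      ) {
        events.push(parsed as RunEvent);
      }
    } catch {
      // Skip malformed lines instead of aborting the tail.
    }
  }
  return { events, state: { remainder } };
}
```

Because the function is pure, the same logic can sit behind either a filesystem watcher or a polling loop: both only need to pass in whatever text was appended since the previous read.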

An optional daemon can be added as another observer backend for multi-client coordination, but the default CLI will not require it. The stable recovery contract remains the durable artifacts under .agent-loop/runs/<run-id>/: state.json, events.jsonl, and artifacts/.

Reference

The remaining sections are operator and developer reference material.

Development

corepack enable
pnpm install
pnpm build
pnpm format
pnpm lint
pnpm typecheck
pnpm test
pnpm app:dev

Git Hooks

This repository uses Lefthook for local quality gates.

pnpm install
pnpm lefthook run pre-commit
pnpm lefthook run pre-push

pnpm install installs the local Git hooks through the prepare script. Run pnpm hooks:install to reinstall them manually. To bypass hooks for an intentional exceptional case, use Git's --no-verify flag. These local hooks do not replace CI.

pnpm format applies Biome formatting and safe fixes. pnpm lint checks formatting, lint diagnostics, and import organization without writing. pre-commit runs pnpm lint and pnpm typecheck; pre-push runs pnpm test.

This repository uses pnpm for deterministic development installs and commits only pnpm-lock.yaml. If you prefer ni, the equivalent commands are ni, nr build, nr format, nr lint, nr typecheck, and nr test.

Live Progress Logs

run and resume write phase-centric progress logs to stderr by default. Stdout stays reserved for final command output such as Status and Run directory.

pnpm agent-loop run --repo . --task "Update test/fixtures/minimal-repo/README.md greeting to 'Hello from agent-loop.'" --dry-run --no-worktree

The default pretty output highlights the current phase first, then indents meaningful activity under it. In an interactive terminal, status lines are colorized so OK, warning, running, and error states are easier to scan. Piped output, redirected output, and CI logs stay plain.

[run] 20260507-233320-update-greeting
      repo=/workspaces/agent-loop

[phase] Requirements
        phase=requirements agent=agent-loop-requirements thread=019e04c9...
  - wrote requirements.json
[ok] Requirements completed in 10.7s

[attempt 1] Verification
        phase=verifier agent=agent-loop-verifier attempt=1
  - running: npm test
  - passed: npm test in 18.2s

Long-running phases show a live-updating progress row in an interactive terminal. When output is not attached to a TTY, they emit periodic heartbeat lines instead:

[still] Implementation running for 2m30s
        latest=editing src/workflow/runWorkflow.ts

Use these options to tune output:

pnpm agent-loop run --repo . --task "Update test/fixtures/minimal-repo/README.md greeting to 'Hello from agent-loop.'" --log-level debug
pnpm agent-loop run --repo . --task "Update test/fixtures/minimal-repo/README.md greeting to 'Hello from agent-loop.'" --log-format json
pnpm agent-loop run --repo . --task "Update test/fixtures/minimal-repo/README.md greeting to 'Hello from agent-loop.'" --quiet

--log-level debug shows lower-level app-server turn and activity details that are hidden in the default pretty output. --log-format json keeps stderr as structured NDJSON with event type fields; persisted .agent-loop/runs/<run-id>/events.jsonl and OTel span events remain event-centric.

Set NO_COLOR=1 to disable color in interactive pretty output. Color and live progress are terminal presentation only; run artifacts such as events.jsonl, JSON reports, patches, and Markdown/text artifacts do not contain ANSI color or live-progress control characters.

Use status --events to inspect the same high-level events after the run:

pnpm agent-loop status .agent-loop/runs/<run-id> --events 50

Config File

By default, the CLI reads .agent-loop/config.yml from the target repository. Pass --config path/to/config.yml to use a different file.

You can start from the checked-in example at .agent-loop/config.yml.example.

Minimal config:

mode: default
max_iterations: 3
max_plan_revisions: 3
sandbox:
  network: false
tests:
  default_commands:
    - npm test
parallelism:
  global_max_active_agents: 6
  fail_on_cap_exceeded: true
research:
  subagent_count: 3
  max_subagent_count: 4
review:
  subagent_count: 2
  max_subagent_count: 3
  target: uncommittedChanges
  delivery: detached
implementation:
  lane_count: 2
  max_lane_count: 3
  max_lane_attempts: 2
  max_allowed_touch_points_per_lane: 5
integration:
  max_fix_attempts: 1
limits:
  max_diff_lines: 800
  stop_on_new_dependency: true
  stop_on_lockfile_change: true
  stop_on_migration: true
  stop_on_auth_or_payment_change: true
  stop_on_large_diff: true
  stop_on_forbidden_path: true
  forbidden_paths:
    - .env
    - .env.
    - .npmrc
    - .pypirc
    - .agent-loop/
    - .ssh/
logging:
  level: info
  format: pretty
  debug_protocol: false
  redact: true
otel:
  enabled: false
  service_name: agent-loop
  endpoint: null
  protocol: http/protobuf
  logs: false
  console: false

The default per-phase sandbox is externalSandbox, with network access restricted unless --network or sandbox.network: true is set. This is the practical default when the orchestrator itself is already running inside a devcontainer or another sandbox.

implementation.lane_count: 2 requests the workstream path by default. Agent Loop may reduce the effective lane count for very small plans, and implementation.lane_count: 1 forces the sequential implementation loop. Non-dry-run workstream mode requires git worktrees; --no-worktree is compatible only with lane_count: 1.
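
One reading of the compatibility rule above, as a sketch: the option names are hypothetical, and the sketch does not model Agent Loop reducing the effective lane count for very small plans.

```typescript
type ImplementationMode = "sequential" | "workstream";

// Hypothetical input shape; field names are illustrative.
interface ModeInput {
  laneCount: number; // implementation.lane_count
  noWorktree: boolean; // --no-worktree
  dryRun: boolean; // --dry-run
}

function resolveImplementationMode(input: ModeInput): ImplementationMode {
  if (Math.floor(input.laneCount) !== input.laneCount || input.laneCount < 1) {
    throw new Error("lane_count must be a positive integer");
  }
  const mode: ImplementationMode =
    input.laneCount === 1 ? "sequential" : "workstream";
  // Assumption: only non-dry-run workstream runs need git worktrees,
  // so a dry run with --no-worktree is allowed for any lane count.
  if (mode === "workstream" && input.noWorktree && !input.dryRun) {
    throw new Error("--no-worktree requires implementation.lane_count: 1");
  }
  return mode;
}
```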

To opt into Codex-managed sandboxing on hosts that support it, set phase sandboxes explicitly:

sandbox:
  requirements: readOnly
  researcher: readOnly
  planner: readOnly
  plan_reviewer: readOnly
  lane_planner: readOnly
  implementer: workspaceWrite
  verifier: readOnly
  integrator: workspaceWrite
  reviewer: readOnly
  summarizer: readOnly
  network: false

YOLO Mode

mode: default keeps the normal per-phase sandbox behavior described above. mode: yolo is an explicit dangerous mode for environments where the entire agent-loop process is already isolated by a trusted external sandbox.

Enable YOLO mode in the config file:

mode: yolo

When YOLO mode is enabled, every Codex app-server turn is sent the turn policy sandboxPolicy: { type: "dangerFullAccess" } with approvalPolicy: "never". This is equivalent in intent to Codex CLI's --dangerously-bypass-approvals-and-sandbox: Codex turns do not prompt for approval and are not constrained by the normal per-phase sandbox settings.

You can also pass --yolo or --mode yolo to run and resume:

pnpm agent-loop run --repo . --task "Update the README greeting." --yolo
pnpm agent-loop resume .agent-loop/runs/<run-id> --yolo

YOLO mode does not bypass deterministic verifier risk gates, review decisions, max iteration limits, worktree behavior, or commit safeguards. It only changes Codex app-server agent turn policy. Pretty progress logs emit a warning when YOLO mode is active.

Risk Gate

After each implementation attempt, the verifier inspects the local git diff and records risk evidence in verification-<attempt>.json.

Detected risk flags are:

  • new_dependency
  • lockfile_changed
  • migration_changed
  • auth_changed
  • payment_changed
  • large_diff
  • forbidden_path_changed
  • secrets_detected

The report separates blocking_risk_flags from nonblocking_risk_flags and includes risk_findings with the file path, reason, and blocking status. secrets_detected is always blocking and cannot be downgraded through config.

Use limits to intentionally allow selected classes of changes while still recording them:

limits:
  max_diff_lines: 800
  stop_on_new_dependency: false
  stop_on_lockfile_change: false
  stop_on_migration: false
  stop_on_auth_or_payment_change: true
  stop_on_large_diff: true
  stop_on_forbidden_path: true
  forbidden_paths:
    - .env
    - .env.*
    - .npmrc
    - .pypirc
    - .agent-loop/
    - .ssh/
    - secrets/**

Blocking risk flags stop the workflow deterministically before Reviewer execution. Nonblocking risk flags remain visible in artifacts and summaries.
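
The split between blocking and nonblocking flags can be sketched as a lookup from each flag to the limits key that governs it. The flag names and limits keys come from this section. The pairing between them is inferred from their names and is an assumption, as are the function names.

```typescript
type RiskFlag =
  | "new_dependency"
  | "lockfile_changed"
  | "migration_changed"
  | "auth_changed"
  | "payment_changed"
  | "large_diff"
  | "forbidden_path_changed"
  | "secrets_detected";

interface Limits {
  stop_on_new_dependency: boolean;
  stop_on_lockfile_change: boolean;
  stop_on_migration: boolean;
  stop_on_auth_or_payment_change: boolean;
  stop_on_large_diff: boolean;
  stop_on_forbidden_path: boolean;
}

// Assumed flag-to-limit pairing, inferred from the key names.
function isBlocking(flag: RiskFlag, limits: Limits): boolean {
  switch (flag) {
    case "secrets_detected":
      return true; // always blocking; config cannot downgrade it
    case "new_dependency":
      return limits.stop_on_new_dependency;
    case "lockfile_changed":
      return limits.stop_on_lockfile_change;
    case "migration_changed":
      return limits.stop_on_migration;
    case "auth_changed":
    case "payment_changed":
      return limits.stop_on_auth_or_payment_change;
    case "large_diff":
      return limits.stop_on_large_diff;
    case "forbidden_path_changed":
      return limits.stop_on_forbidden_path;
  }
}

function splitRiskFlags(flags: readonly RiskFlag[], limits: Limits) {
  return {
    blocking_risk_flags: flags.filter((f) => isBlocking(f, limits)),
    nonblocking_risk_flags: flags.filter((f) => !isBlocking(f, limits)),
  };
}
```

With the limits shown above, a diff that adds a dependency and touches auth code would report new_dependency as nonblocking and auth_changed as blocking under this sketch.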

Verifier Command Selection

The verifier always runs git diff --check, then selects the first non-empty command source in this order:

  1. CLI --test
  2. Workstream lane executable verification hints
  3. Plan phases[].automated_verification
  4. Requirements test_commands
  5. Research test_commands
  6. Config tests.default_commands
  7. Inferred project commands

Inference only runs when no explicit or configured commands exist. It reads root project files and can infer package scripts (test, typecheck, lint) using pnpm-lock.yaml, yarn.lock, or package-lock.json to choose the package manager. It can also infer cargo test from Cargo.toml and go test ./... from go.mod.

verification-<attempt>.json records command_sources next to executed command results, so each command can be traced to builtin, cli, lane, plan, requirements, research, config, or inferred.
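
The "first non-empty source wins" rule can be sketched as follows. The source labels match the command_sources values listed above; the function and type names are hypothetical, and inference is modelled simply as another candidate list that is only reached when every earlier source is empty.

```typescript
type SelectableSource =
  | "cli"
  | "lane"
  | "plan"
  | "requirements"
  | "research"
  | "config"
  | "inferred";

interface SourcedCommand {
  command: string;
  source: "builtin" | SelectableSource;
}

// Priority order copied from the numbered list above.
const SOURCE_ORDER: readonly SelectableSource[] = [
  "cli",
  "lane",
  "plan",
  "requirements",
  "research",
  "config",
  "inferred",
];

function selectVerifierCommands(
  candidates: Partial<Record<SelectableSource, string[]>>,
): SourcedCommand[] {
  // The built-in check always runs first.
  const selected: SourcedCommand[] = [
    { command: "git diff --check", source: "builtin" },
  ];
  for (const source of SOURCE_ORDER) {
    const commands = (candidates[source] ?? []).filter((c) => c.trim() !== "");
    if (commands.length > 0) {
      for (const command of commands) selected.push({ command, source });
      break; // lower-priority sources are ignored, not merged
    }
  }
  return selected;
}
```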

Command handling is source-aware.

Trusted sources run exactly as written:

  • CLI --test
  • Config tests.default_commands
  • Built-in git diff --check
  • Inferred project commands

Agent-generated sources are checked by verifier command policy before execution:

  • Workstream lane executable verification hints
  • Plan phases[].automated_verification
  • Requirements test_commands
  • Research test_commands

Agent-generated commands may be accepted, normalized, or rejected. Supported fixed-string file checks such as rg -q --fixed-strings -- <text> <path>, grep -qF -- <text> <path>, grep -Fq -- <text> <path>, and git grep -q --fixed-strings -- <text> -- <path> are normalized to Node-based file checks. Supported repo-native verification commands include package test, typecheck, and lint scripts, plus cargo test and go test ./... when their manifests exist.

Three kinds of agent-generated command are rejected before shell execution: unsupported shell forms; inspection-only commands such as jq, sed, cat, ls, find, git status, and git diff --name-only; and unsupported package-manager actions. Rejections are recorded as synthetic verifier failures with exit code 126, and details appear in command_policy_results.

Use CLI --test or config tests.default_commands for arbitrary project-specific commands that the operator intentionally wants to run as-is. Put file/text assertions in manual_verification when they are evidence checks rather than repo-native automated verification.
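
To make the accept-or-reject behaviour concrete, here is a deliberately simplified allowlist check for agent-generated commands. It is a sketch, not the verifier's actual policy: the regular expressions, the treatment of shell metacharacters as "unsupported shell forms", and every name in the code are assumptions. The normalization of fixed-string rg and grep checks to Node-based file checks is omitted entirely.

```typescript
// Hypothetical view of which manifests exist at the repository root.
interface RepoManifests {
  hasPackageJson: boolean;
  hasCargoToml: boolean;
  hasGoMod: boolean;
}

type PolicyResult =
  | { decision: "accepted"; command: string }
  | { decision: "rejected"; command: string; exitCode: 126; reason: string };

// Inspection-only commands named in the text above.
const INSPECTION_ONLY: readonly RegExp[] = [
  /^jq\b/, /^sed\b/, /^cat\b/, /^ls\b/, /^find\b/,
  /^git status\b/, /^git diff --name-only\b/,
];

// Assumed spelling of "package test, typecheck, and lint scripts".
const PACKAGE_SCRIPT =
  /^(?:npm (?:test|run (?:test|typecheck|lint))|(?:pnpm|yarn) (?:run )?(?:test|typecheck|lint))$/;

function checkAgentCommand(raw: string, repo: RepoManifests): PolicyResult {
  const command = raw.trim().replace(/\s+/g, " ");
  const accept = (): PolicyResult => ({ decision: "accepted", command });
  const reject = (reason: string): PolicyResult => ({
    decision: "rejected",
    command,
    exitCode: 126,
    reason,
  });

  // Assumption: pipes, chaining, redirection, and substitution are unsupported.
  if (/[;&|<>`$]/.test(command)) return reject("unsupported shell form");
  if (INSPECTION_ONLY.some((re) => re.test(command))) {
    return reject("inspection-only command");
  }
  if (PACKAGE_SCRIPT.test(command)) {
    return repo.hasPackageJson ? accept() : reject("package.json not found");
  }
  if (command === "cargo test") {
    return repo.hasCargoToml ? accept() : reject("Cargo.toml not found");
  }
  if (command === "go test ./...") {
    return repo.hasGoMod ? accept() : reject("go.mod not found");
  }
  return reject("not a supported repo-native verification command");
}
```

An allowlist that rejects by default fits the intent described above: anything the operator genuinely needs beyond it belongs in a trusted source such as --test or tests.default_commands.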

Manual Verification Evidence

Plan phases may include manual_verification items. These do not pause or stop autonomous development by default. The verifier may substitute a small set of safe deterministic checks and records the result in verification-<attempt>.json as manual_verification_results.

Supported substitutions are intentionally narrow:

  • Confirm <path> contains "<text>".
  • Verify <path> includes "<text>".
  • Check <path> contains `<text>`.
  • Open <path> and confirm ... reads exactly "<text>".
  • Confirm <path> does not contain "<text>".
  • Verify <path> excludes "<text>".
  • Confirm the diff shows only <path> changed.
  • Confirm agent-loop <subcommand> --help includes "<text>".
  • Verify agent-loop <subcommand> --help contains "<text>".

Only run, status, diff, commit, clean, and resume are accepted for the agent-loop <subcommand> --help substitution. Unsafe paths, visual checks, browser checks, external services, and unrecognized wording remain unresolved.

Manual verification substitute results are separate from automated commands, command_sources, and failed_commands. A failed_substitute or unresolved manual item does not change verification.ok by itself. Substituted, failed, and unresolved manual verification items are listed in summary.md for follow-up.
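
The narrow substitution idea can be illustrated for the contains and does-not-contain wordings. This is a sketch under assumptions: the regular expressions cover only the double-quoted forms listed above, "unsafe path" is taken to mean an absolute path or one containing a parent-directory segment, and the result label substituted is a guess (failed_substitute appears in the text). None of the names come from Agent Loop's source.

```typescript
interface Substitute {
  kind: "contains" | "excludes";
  path: string;
  text: string;
}

const CONTAINS = /^(?:Confirm|Verify) (\S+) (?:contains|includes) "(.+)"\.$/;
const EXCLUDES =
  /^(?:Confirm|Verify) (\S+) (?:does not contain|excludes) "(.+)"\.$/;

// Assumption: absolute paths and ".." segments count as unsafe.
function isSafeRelativePath(path: string): boolean {
  return path.charAt(0) !== "/" && path.split("/").indexOf("..") === -1;
}

// Returns null when the item stays unresolved.
function parseManualItem(item: string): Substitute | null {
  const trimmed = item.trim();
  const patterns: Array<[Substitute["kind"], RegExp]> = [
    ["contains", CONTAINS],
    ["excludes", EXCLUDES],
  ];
  for (const [kind, pattern] of patterns) {
    const match = pattern.exec(trimmed);
    const path = match?.[1];
    const text = match?.[2];
    if (path === undefined || text === undefined) continue;
    return isSafeRelativePath(path) ? { kind, path, text } : null;
  }
  return null; // unrecognized wording is left for a human
}

function evaluateSubstitute(
  sub: Substitute,
  fileContent: string,
): "substituted" | "failed_substitute" {
  const found = fileContent.indexOf(sub.text) !== -1;
  const passed = sub.kind === "contains" ? found : !found;
  return passed ? "substituted" : "failed_substitute";
}
```

Keeping the parser strict means that any wording it does not recognize falls through to unresolved rather than being guessed at, which is consistent with manual items never changing verification.ok on their own.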

Worktree Behavior

By default, each run creates a git worktree under:

.agent-loop/worktrees/<run-id>/

Unless --base-branch is set, the worktree is created from the target repository's current HEAD commit. Uncommitted changes, including staged changes, are not copied into the run worktree.

Run with --no-worktree to operate directly in the target repository working tree. The Implementer is the only writer phase; requirements, research, planning, plan gate, review, and summary phases are read-only.

Because workstream mode needs isolated lane and integration worktrees, combine --no-worktree with implementation.lane_count: 1.

The tool does not automatically push or merge. The final diff stays in the run worktree until you inspect, commit, or clean it.

In workstream mode, implementation lanes are isolated writer worktrees and the final result is produced in a separate integration worktree:

.agent-loop/worktrees/<run-id>/workstream-1/
.agent-loop/worktrees/<run-id>/workstream-2/
.agent-loop/worktrees/<run-id>/integration/

Researcher and Reviewer may use Codex subagents for read-only decomposition. Implementation lanes are orchestrator-managed worktrees, not competing candidate implementations. The Integrator combines passed lane patches into one final implementation.
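
The workstream directory layout shown above can be expressed as a small path builder. The directory names are copied from the listing; the function and interface names are illustrative only.

```typescript
// Hypothetical helper that mirrors the documented directory layout.
interface WorkstreamLayout {
  lanes: string[];
  integration: string;
}

function workstreamLayout(runId: string, laneCount: number): WorkstreamLayout {
  const root = `.agent-loop/worktrees/${runId}`;
  const lanes: string[] = [];
  // Lanes are numbered from 1, as in workstream-1/ and workstream-2/.
  for (let lane = 1; lane <= laneCount; lane++) {
    lanes.push(`${root}/workstream-${lane}/`);
  }
  return { lanes, integration: `${root}/integration/` };
}
```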

Artifacts

Each run writes artifacts under:

.agent-loop/runs/<run-id>/
  input.md
  state.json
  events.jsonl
  protocol-events.jsonl   # only when logging.debug_protocol or --debug-protocol is enabled
  final.patch
  summary.md
  artifacts/
    requirements.json
    research.json
    plan.json
    plan.md
    plan-revision-1.json
    plan-revision-1.md
    plan-review-initial.json
    plan-review-revision-1.json
    execution-policy-initial.json
    execution-policy-revision-1.json
    implementation-1.json
    verification-1.json
    review-1.json
    lane-verification-policy.json
    lane-verification-policy-revision-1.json
    lane-plan.json
    lane-plan-revision-1.json
    lane-evaluation.json
    integration.json
    verification-final.json
    review-final.json
    lanes/
      workstream-1/
        implementation.json
        implementation-1.json
        verification.json
        verification-1.json
        review.json
        review-1.json
        scope.json
        scope-1.json
        state.json
        patch.diff

Some revision and retry artifacts are written only when a phase needs repair or another attempt. summary.md is the human-facing overview. final.patch is the final uncommitted diff captured from the sequential worktree or, for workstream runs, from the integration worktree only.

events.jsonl is the canonical high-level event stream and is redacted by default. Raw app-server JSON-RPC traffic is written to protocol-events.jsonl only when --debug-protocol or logging.debug_protocol: true is enabled; that file can include prompts and should be treated as sensitive.

Inspecting A Run

Show the latest run for a repository:

pnpm agent-loop status --repo .

Show a specific run:

pnpm agent-loop status .agent-loop/runs/<run-id>

Machine-readable output:

pnpm agent-loop status .agent-loop/runs/<run-id> --json

Include recent event summaries:

pnpm agent-loop status .agent-loop/runs/<run-id> --events 20

The status output also includes Current phase, Current agent, Current activity, and Last event from state.json for quick progress inspection.

For workstream runs, status also shows lane states, integration state, final verification/review availability, and lane artifact availability.

OpenTelemetry

OTel trace export is opt-in. When enabled, agent-loop creates a root agent_loop.run span, phase child spans, Codex turn spans, verifier command spans, and structured span events using attributes such as agent_loop.run_id, agent_loop.phase, agent_loop.agent, agent_loop.attempt, thread id, and turn id.

OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4318 \
pnpm agent-loop run --repo . --task "Update test/fixtures/minimal-repo/README.md greeting to 'Hello from agent-loop.'" --otel

pnpm agent-loop run --repo . --task "Update test/fixtures/minimal-repo/README.md greeting to 'Hello from agent-loop.'" \
  --otel --otel-endpoint http://localhost:4318 --otel-service-name agent-loop-dev

Supported protocols are http/protobuf and grpc:

pnpm agent-loop run --repo . --task "Update test/fixtures/minimal-repo/README.md greeting to 'Hello from agent-loop.'" \
  --otel --otel-endpoint http://localhost:4317 --otel-protocol grpc

OTEL_SERVICE_NAME, OTEL_EXPORTER_OTLP_ENDPOINT, OTEL_EXPORTER_OTLP_TRACES_ENDPOINT, OTEL_EXPORTER_OTLP_PROTOCOL, and OTEL_EXPORTER_OTLP_TRACES_PROTOCOL are used when OTel is enabled. Use AGENT_LOOP_OTEL=true to enable OTel from the environment.

OpenTelemetry log records are not emitted by default; the stable OTel path is traces with span events. --otel-logs is accepted as an experimental opt-in marker while local events.jsonl remains the authoritative log stream.

For local instrumentation debugging without a collector, write OTel-style span records to stderr:

pnpm agent-loop run --repo . --task "Update test/fixtures/minimal-repo/README.md greeting to 'Hello from agent-loop.'" --otel-console

Viewing Diffs

Show the saved final patch if it exists, otherwise show the current worktree diff:

pnpm agent-loop diff .agent-loop/runs/<run-id>

Show a risk-oriented summary:

pnpm agent-loop diff .agent-loop/runs/<run-id> --summary

Show changed file names from the current recorded worktree:

pnpm agent-loop diff .agent-loop/runs/<run-id> --current --name-only

Show a stored lane patch or the lane's current diff:

pnpm agent-loop diff .agent-loop/runs/<run-id> --lane 1

Show the integration worktree diff:

pnpm agent-loop diff .agent-loop/runs/<run-id> --integration --current

Read only final.patch:

pnpm agent-loop diff .agent-loop/runs/<run-id> --final

Resuming A Run

A dry run stops after requirements, research, planning, and plan gate:

pnpm agent-loop run --repo . --task "Update test/fixtures/minimal-repo/README.md greeting to 'Hello from agent-loop.'" --dry-run

Continue that same run later:

pnpm agent-loop resume .agent-loop/runs/<run-id>

Resume creates fresh app-server threads and passes the persisted artifacts back into the workflow. It does not depend on old app-server thread ids.

Stopped workstream runs may still contain useful lane patches and integration diagnostics. Resume infers whether to continue failed lanes, integration, final verification, or final review from state.json and artifacts.

Committing A Run

After reviewing a finished run, create a local commit from the recorded worktree:

pnpm agent-loop commit .agent-loop/runs/<run-id>

Use an explicit message:

pnpm agent-loop commit .agent-loop/runs/<run-id> --message "Update greeting"

commit only allows done runs by default, refuses runs with no changes, and never pushes. Use --auto-commit on run or resume to commit automatically only when the workflow finishes with done.

For workstream runs, commit requires status: done and a completed integration state, and commits only from final_worktree_path. --allow-stopped does not allow committing incomplete workstream integrations.
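
The commit preconditions can be summarized as a single eligibility check. The sketch below is one reading of the rules above, not the CLI's implementation: the snapshot fields are hypothetical, and it assumes --allow-stopped permits committing a stopped sequential run, which the text implies but does not state outright.

```typescript
// Hypothetical snapshot of the fields the check needs.
interface RunSnapshot {
  status: "running" | "done" | "stopped";
  hasChanges: boolean;
  // Defined only for workstream runs.
  integrationCompleted?: boolean;
}

type CommitCheck = { ok: true } | { ok: false; reason: string };

function checkCommit(run: RunSnapshot, allowStopped: boolean): CommitCheck {
  const isWorkstream = run.integrationCompleted !== undefined;
  if (isWorkstream) {
    // --allow-stopped never relaxes the workstream requirements.
    if (run.status !== "done" || run.integrationCompleted !== true) {
      return {
        ok: false,
        reason: "workstream runs need status done and a completed integration",
      };
    }
  } else if (
    run.status !== "done" &&
    !(allowStopped && run.status === "stopped")
  ) {
    return { ok: false, reason: "run is not done" };
  }
  if (!run.hasChanges) return { ok: false, reason: "run has no changes" };
  return { ok: true };
}
```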

Cleaning Runs And Worktrees

Preview cleanup:

pnpm agent-loop clean .agent-loop/runs/<run-id> --dry-run

Remove a run directory and any agent-loop-created worktree:

pnpm agent-loop clean .agent-loop/runs/<run-id>

Clean older runs:

pnpm agent-loop clean --repo . --older-than-days 14 --dry-run

For --no-worktree runs, clean removes run artifacts but does not remove the repository working tree.

For workstream runs, clean removes every recorded lane worktree and the integration worktree, subject to the same safety checks.

Fixture Smoke Test

The repository includes a fake app-server fixture so the CLI can be smoke-tested without calling a model:

TMP_REPO="$(mktemp -d)"
cp -R test/fixtures/minimal-repo/. "$TMP_REPO"/
cat > "$TMP_REPO/.agent-loop/config.yml" <<EOF
mode: default
max_iterations: 3
max_plan_revisions: 3
sandbox:
  network: false
tests:
  default_commands:
    - npm test
limits:
  max_diff_lines: 800
  stop_on_new_dependency: true
  stop_on_lockfile_change: true
  stop_on_migration: true
  stop_on_auth_or_payment_change: true
  stop_on_large_diff: true
  stop_on_forbidden_path: true
  forbidden_paths:
    - .env
    - .env.
    - .npmrc
    - .pypirc
    - .agent-loop/
    - .ssh/
app_server:
  fake_script: $(pwd)/test/fixtures/fake-app-server/dry-run.json
EOF

git -C "$TMP_REPO" init
git -C "$TMP_REPO" config user.email test@example.com
git -C "$TMP_REPO" config user.name Test
git -C "$TMP_REPO" add .
git -C "$TMP_REPO" commit -m initial

pnpm agent-loop run --repo "$TMP_REPO" --task "Update README.md greeting to 'Hello from agent-loop.'" --dry-run --no-worktree

To smoke-test YOLO protocol wiring with the same fake app-server, enable raw protocol logging:

pnpm agent-loop run --repo "$TMP_REPO" \
  --task "Update README.md greeting to 'Hello from agent-loop.'" \
  --dry-run --no-worktree --yolo --debug-protocol

Take the path printed after Run directory: and inspect protocol-events.jsonl in it. Each turn/start request should include sandboxPolicy.type set to dangerFullAccess and approvalPolicy set to never:

RUN_DIR=/path/printed/run-directory
rg '"turn/start"|"dangerFullAccess"|"approvalPolicy":"never"' "$RUN_DIR/protocol-events.jsonl"

Current Limits

  • No mandatory daemon, hosted service, web server, database, or local task queue.
  • No hosted dashboard; the Electron app is a local companion that reads the same run artifacts as the CLI.
  • No GitHub PR creation or GitHub integration.
  • No automatic push or merge.
  • No adaptive large-plan splitting beyond the configured workstream lanes; oversized tasks still need human scoping.
  • No local multiple-task queue or batch runner.
  • No candidate mode where several agents implement the same scope and one result is selected.
  • No network access for Implementer unless --network is passed or sandbox.network: true is set.
  • No bundled PET packages; PET mode uses custom pets or compatible imports.
  • npm publication is still gated while the package remains private.
