Loop Watchdog is a runaway-agent kill switch for looped AI coding sessions.
It sits in front of an OpenAI-compatible model endpoint, watches the session for repeated fix-break behavior, pauses the next model call before more credits burn, and pushes a structured alert to Slack, email, or your own control plane.
The first version in this repository ships two real pieces:
- A local Python watchdog proxy that coding agents can target as their `base_url`
- A Cloudflare Worker control plane that ingests incidents, stores them in D1, and fans them out to Slack or email
The failure mode is simple:
- An agent edits a file
- Tests fail in a similar way
- The agent retries with nearly the same patch
- Tokens keep burning while the session makes no meaningful progress
Loop Watchdog turns that failure mode into an explicit product boundary. It tracks repeated request patterns, repeated file churn, repeating error families, and edit-error oscillation. When the score crosses a threshold, the session is paused and an incident is raised.
```
IDE agent / CLI wrapper
        |
        v
Loop Watchdog proxy (FastAPI)
        |
        +--> in-memory session graph
        |       |
        |       +--> loop detector
        |       +--> pause / resume / kill state
        |
        +--> incident dispatcher
        |       |
        |       +--> Cloudflare Worker
        |               |
        |               +--> D1 incident store
        |               +--> Slack webhook
        |               +--> Resend email
        |
        +--> upstream provider (Gemini / OpenAI-compatible endpoint)
```
The detector watches for:

- High-overlap retries across adjacent agent requests
- The same files being churned repeatedly with no success signal
- Repeating failure signatures after normalization
- Edit -> error -> edit -> error oscillation on the same files
- Growing request volume without a passing test or explicit recovery event
The detector is intentionally rule-based for v1. It is deterministic, cheap to run locally, and easy to tune from real operator feedback.
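As a rough illustration of how such a rule-based scorer can work, here is a minimal sketch, assuming simplified event dicts and made-up weights; the real detector and its tuning live in the proxy source:

```python
# Minimal sketch of a rule-based loop scorer over session events.
# Event shapes and weights here are illustrative assumptions, not the
# proxy's actual implementation.
from collections import Counter

PAUSE_SCORE_THRESHOLD = 4.8  # mirrors LOOP_WATCHDOG_PAUSE_SCORE_THRESHOLD

def loop_score(events: list[dict]) -> float:
    score = 0.0

    # Repeated file churn: the same file edited again and again.
    churn = Counter(f for e in events if e["kind"] == "file_edit" for f in e["files"])
    score += sum(1.0 for n in churn.values() if n >= 3)

    # Repeating failure signatures (assume summaries are already normalized).
    failures = Counter(e["summary"] for e in events if e["kind"] == "test_failure")
    score += sum(1.5 for n in failures.values() if n >= 2)

    # Edit -> error -> edit -> error oscillation.
    kinds = [e["kind"] for e in events]
    flips = sum(1 for a, b in zip(kinds, kinds[1:]) if {a, b} == {"file_edit", "test_failure"})
    score += 0.5 * flips

    return score

def should_pause(events: list[dict]) -> bool:
    return loop_score(events) >= PAUSE_SCORE_THRESHOLD
```

Because every term is a named signal, the pause reason can be surfaced to the operator directly rather than as an opaque score.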
For a Codex user, the easiest path is:
```
loop-watchdog start codex
```

Run that from the root of the repo you want to work in. Loop Watchdog will start the local server if needed, launch Codex through the proxy, and keep watching while you build.
```
python -m venv .venv
. .venv/Scripts/activate
pip install -e .
```

For Codex with the default OpenAI upstream, you can often skip this entirely. The proxy already defaults to `https://api.openai.com` and forwards the incoming Codex auth header upstream.
You only need extra configuration when you want custom alerting, a non-default upstream, or different detector tuning.
Create `.env` from the example values below:
```
LOOP_WATCHDOG_UPSTREAM_BASE_URL=https://generativelanguage.googleapis.com
LOOP_WATCHDOG_UPSTREAM_API_KEY=replace-me
LOOP_WATCHDOG_UPSTREAM_AUTH_MODE=x-api-key
LOOP_WATCHDOG_ALERT_WEBHOOK_URL=https://your-worker.example.workers.dev/api/incidents
LOOP_WATCHDOG_ALERT_HMAC_SECRET=replace-me
LOOP_WATCHDOG_PERSISTENCE_PATH=.loop_watchdog/state.json
LOOP_WATCHDOG_PAUSE_SCORE_THRESHOLD=4.8
```

Then launch from the project you want to work in:

```
cd path/to/your-project
loop-watchdog start codex
```

That single command:
- starts the local watchdog server if it is not already running
- points Codex at `http://127.0.0.1:8787/v1`
- preserves Codex's OpenAI authentication flow
- generates a stable session id like `repo:user:branch`
- injects that id as `X-Loop-Session`
- emits lightweight `file_edit` events while Codex is active
You can still pass normal Codex arguments through the wrapper:
loop-watchdog start codex "Fix the failing parser test"
loop-watchdog start codex exec "Investigate why totals are still rounding wrong"
loop-watchdog start codex -m gpt-5.5If you want to inspect the exact Codex command before launch:
loop-watchdog start codex --dry-runIf Codex is installed but Windows path lookup is flaky, pass the executable explicitly:
loop-watchdog start codex --codex-executable "C:\Users\you\.vscode\extensions\openai.chatgpt-...\bin\windows-x86_64\codex.exe"http://127.0.0.1:8787/dashboard
The dashboard shows live sessions, paused incidents, recent event timelines, and operator controls for acknowledge, resume, reset, archive, and kill actions.
If you want to start fresh, use the dashboard's Clear Local History button before your next run.
If you prefer the older two-terminal setup, you can still run the server yourself:
```
loop-watchdog serve --host 127.0.0.1 --port 8787
```

The landing page is also still available at:

```
http://127.0.0.1:8787/
```
Any other tool that supports an OpenAI-compatible `base_url` can target `http://127.0.0.1:8787/v1`.
For best detection, send a stable session identifier on each request:
```
X-Loop-Session: repo-name:user-or-agent:branch
```
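With the official OpenAI Python client, for example, the proxy and session header can be wired up like this (the model name and session id are placeholders):

```python
# Point an OpenAI-compatible client at the local proxy and tag every
# request with a stable session id so the watchdog can correlate activity.
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8787/v1",
    # API key comes from OPENAI_API_KEY; the proxy forwards it upstream.
    default_headers={"X-Loop-Session": "my-repo:alice:main"},
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder; use whatever your upstream serves
    messages=[{"role": "user", "content": "Fix the failing parser test"}],
)
print(resp.choices[0].message.content)
```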
If a resumed session requires a changed plan, send it on the next request as either:
```
X-Loop-Plan: rewritten plan text
```
or:
```json
{
  "metadata": {
    "loop_watchdog_plan": "rewritten plan text"
  }
}
```
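As a sketch, a resumed session could send the changed plan on its next model request like this (plain `requests` rather than a full client; the plan text is illustrative):

```python
# Sketch: resume after a pause by attaching the rewritten plan as a header
# on the next model request through the proxy.
import requests

resp = requests.post(
    "http://127.0.0.1:8787/v1/chat/completions",
    headers={
        "X-Loop-Session": "repo:user:branch",
        "X-Loop-Plan": "Stop patching parser.py; bisect the fixture data first",
    },
    json={
        "model": "gpt-4o-mini",  # placeholder model name
        "messages": [{"role": "user", "content": "Continue with the new plan"}],
    },
)
print(resp.status_code)
```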
The `loop-watchdog start codex` launcher already emits basic `file_edit` events automatically. Detection gets much better when an IDE extension or wrapper also reports structured test failures and richer edit context:

```
POST /v1/watchdog/events
Content-Type: application/json

{
  "session_id": "repo:user:branch",
  "kind": "file_edit",
  "summary": "Retrying parser fix after test failure",
  "files": ["src/parser.py", "tests/test_parser.py"],
  "metadata": {
    "diff_excerpt": "@@ parse_user @@",
    "attempt": 4
  }
}
```
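The same event can also be posted from a script; a minimal sketch with `requests`, mirroring the payload above:

```python
# Post a structured file_edit event to the watchdog's event API.
import requests

event = {
    "session_id": "repo:user:branch",
    "kind": "file_edit",
    "summary": "Retrying parser fix after test failure",
    "files": ["src/parser.py", "tests/test_parser.py"],
    "metadata": {"diff_excerpt": "@@ parse_user @@", "attempt": 4},
}

resp = requests.post("http://127.0.0.1:8787/v1/watchdog/events", json=event)
resp.raise_for_status()
```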
If you want to test the actual "stop the retry before more credits burn" path without wiring a full client first:

- Start the watchdog server.
- Open the dashboard and clear local history.
- Run:
```
powershell -ExecutionPolicy Bypass -File .\scripts\test-loop-watchdog.ps1
```

The script posts repeated `file_edit` and `test_failure` events for one session, then sends a model request through `/v1/chat/completions`. The expected result is an HTTP 409 pause response, which means the watchdog blocked the next model call before it could be forwarded upstream.
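A client can check for that pause response directly; a sketch (the exact response body is not assumed here, only the 409 status):

```python
# Sketch: detect the watchdog's 409 pause response from a plain HTTP client.
import requests

resp = requests.post(
    "http://127.0.0.1:8787/v1/chat/completions",
    headers={"X-Loop-Session": "repo:user:branch"},
    json={
        "model": "gpt-4o-mini",  # placeholder model name
        "messages": [{"role": "user", "content": "Retry the same patch"}],
    },
)
if resp.status_code == 409:
    # The watchdog paused the session instead of forwarding upstream.
    print("Paused by Loop Watchdog:", resp.text)
else:
    resp.raise_for_status()
```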
For the closest v1 product test with a human and Codex in the loop:
- Install `loop-watchdog` once.
- Open the repo you actually want to work in.
- Run `loop-watchdog start codex`.
- Open `http://127.0.0.1:8787/dashboard`.
- Work normally in Codex.
- Keep the dashboard open to watch requests, file churn, pauses, and incident reasons.
This is the easiest real-user path in the repo today. It removes the manual `base_url`, server startup, and session-header setup, but automatic test-pass and test-failure telemetry is still a next-step integration.
The proxy exposes the following endpoints:

- `GET /`
- `GET /dashboard`
- `GET /v1/watchdog/dashboard`
- `POST /v1/watchdog/demo/guided-trial`
- `POST /v1/watchdog/history/clear`
- `POST /v1/chat/completions`
- `POST /v1/responses`
- `POST /v1/watchdog/events`
- `GET /v1/watchdog/sessions`
- `GET /v1/watchdog/sessions/{session_id}/events`
- `GET /v1/watchdog/status/{session_id}`
- `POST /v1/watchdog/sessions/{session_id}/acknowledge`
- `POST /v1/watchdog/sessions/{session_id}/resume`
- `POST /v1/watchdog/sessions/{session_id}/archive`
- `POST /v1/watchdog/sessions/{session_id}/kill`
- `GET /healthz`
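Operator tooling can drive the same endpoints over HTTP; for example, a sketch that polls one session and resumes it (response fields are not assumed, the raw JSON is just printed):

```python
# Sketch: poll a session's watchdog status, then acknowledge and resume it.
import requests

BASE = "http://127.0.0.1:8787"
session_id = "repo:user:branch"

status = requests.get(f"{BASE}/v1/watchdog/status/{session_id}")
print(status.json())

# After reviewing the incident, acknowledge it and let the session spend again.
requests.post(f"{BASE}/v1/watchdog/sessions/{session_id}/acknowledge").raise_for_status()
requests.post(f"{BASE}/v1/watchdog/sessions/{session_id}/resume").raise_for_status()
```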
The Worker receives incidents from the proxy, stores them in D1, and can fan them out to Slack and email.
```
cd apps/control-plane
npm install
npm run db:migrate
npm run deploy
```

Environment variables are documented in `apps/control-plane/.dev.vars.example`.
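To smoke-test the ingest path, something like the sketch below can post a signed test incident. Both the payload fields and the `X-Signature` header name are hypothetical; check the proxy's incident dispatcher for the real signing scheme.

```python
# Sketch: post an HMAC-signed test incident to the Worker.
# The payload shape and the X-Signature header are illustrative assumptions.
import hashlib
import hmac
import json

import requests

SECRET = b"replace-me"  # LOOP_WATCHDOG_ALERT_HMAC_SECRET
URL = "https://your-worker.example.workers.dev/api/incidents"

body = json.dumps({
    "session_id": "repo:user:branch",
    "reason": "edit-error oscillation",
    "score": 5.1,
}).encode()

signature = hmac.new(SECRET, body, hashlib.sha256).hexdigest()
resp = requests.post(URL, data=body, headers={
    "Content-Type": "application/json",
    "X-Signature": signature,  # hypothetical header name
})
print(resp.status_code, resp.text)
```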
```
python -m pytest
python -m compileall src tests
```

- The proxy is intentionally local-first. A developer can run it without creating a cloud account.
- Incidents are serialized with enough context to power a future dashboard without reworking the schema.
- The detector is built around explainable signals so the pause decision can be surfaced to humans without hand-waving.
- Session state persists locally by default, so incidents and operator notes survive a restart.
- The dashboard can enforce a changed-plan token before a resumed session is allowed to spend again.
- The landing page keeps the live dashboard clean by default, while guided trial can create a realistic paused session on demand.
The next product layer after this repo is a native editor wrapper that emits richer diff and test telemetry automatically. This codebase is designed so that layer can plug into the existing event API without replacing the core.

