Problem
After the 16-PR self-bootstrap sprint, forge-loop ships ~9,500 LOC of source
but only ~30% of it has been exercised by a real operator. The rest is
scaffolding: Redis queue + leader election, async runner, multi-repo,
Slack/Discord/webhook adapters, the dashboard, Prometheus + OTel exporters,
time-travel replay, pipeline DAG. Each one looks like a feature but has
zero downstream consumer, costs maintenance, and pulls third-party deps
into the default install.
The product is one operator on one box. We should reflect that in the
default surface and document the rest as opt-in experiments.
Acceptance criteria
- Stability matrix in README that clearly marks each module:
  - STABLE: SDK worker (Opus 4.7), typed critic, retry+cooldown,
    externalized briefs, attempts ledger, watchdog, runner sync,
    maintenance pass, PO spec-expander.
  - EXPERIMENTAL: multirepo, runner_async, dashboard, integrations
    (slack/discord/webhook), observability (prometheus/otel), replay,
    pipeline DAG, queue backends other than the default.
- pyproject.toml extras:
  - default install pulls ONLY what STABLE needs: claude_agent_sdk and
    pyyaml. duckdb is borderline: keep it if the attempts ledger uses
    it, else drop it.
  - `pip install forge-loop[experimental]` pulls fastapi, uvicorn,
    redis, prometheus_client, opentelemetry-*, jinja2.
  - the experimental modules use lazy imports gated on the extras being
    installed, with a clear ImportError at CLI invocation time if a user
    hits e.g. `forge-loop dashboard` without the extra.
- Replace Redis backend with SQLite (WAL) as the durable embedded queue:
  - keep `Queue` interface + `InMemoryQueue` (the test default).
  - new `SQLiteQueue` is the production default (durable across crashes,
    zero infra, ACID).
  - DELETE `redis_backend.py` + `cluster/election.py` (premature
    distribution; re-add when a real 2+ host operator shows up).
- Critic prompt patch — add the `product` category + the
no-scaffold-theatre rule: a PR adding a configurable backend or an
integration adapter with NO downstream consumer in the same repo
gets a sev2 product finding. (Pin in critic.md.tmpl.)
- README: add a "running on a Claude subscription" section saying
per-token budget tracking is not supported; the operator runs in
flat-fee mode.
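A minimal sketch of the lazy-import gate described in the pyproject.toml criterion. `require_extra` and `dashboard_command` are hypothetical names used for illustration, not existing forge-loop functions, and the error wording is only a suggestion.

```python
import importlib


def require_extra(module_name: str, extra: str = "experimental"):
    """Import module_name on demand, or raise an ImportError naming the extra.

    Hypothetical helper: the real gate may live elsewhere or look different.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is not installed; this command needs an optional "
            f"extra. Install it with: pip install 'forge-loop[{extra}]'"
        ) from exc


def dashboard_command():
    """Hypothetical entry point for `forge-loop dashboard`."""
    # fastapi is imported only when the command actually runs, so the
    # default install never touches it at module import time.
    fastapi = require_extra("fastapi")
    return fastapi
```

Because the import happens inside the command function, `forge-loop run` never triggers it, and a user without the extra sees the install hint instead of a bare `ModuleNotFoundError`.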
Test matrix
- unit: `pip install forge-loop` in a clean venv → `forge-loop run`
works without any `[experimental]` dep being importable.
- unit: `forge-loop dashboard` without [experimental] → clear
ImportError naming the extra to install.
- unit: `SQLiteQueue` round-trip (push N, pop N, assert order + ack).
- unit: `SQLiteQueue` durability — write, close, reopen, items still
there.
- integration: a full `forge-loop run` tick using SQLiteQueue on disk;
events.jsonl appended; no Redis import attempted anywhere in the call
graph.
- adversarial: the experimental modules SHOULD fail to import in the
default install; the test asserts this is the case (regression guard
against a future PR re-adding them to the core requirements).
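A rough sketch of what `SQLiteQueue` could look like, to make the round-trip and durability cases above concrete. The `push` / `pop` / `ack` method names, the table schema, and the `claimed` flag are assumptions; the actual `Queue` interface in the repo may differ.

```python
import sqlite3
from typing import Optional, Tuple


class SQLiteQueue:
    """FIFO queue backed by a SQLite file in WAL mode (illustrative only)."""

    def __init__(self, path: str) -> None:
        # isolation_level=None means autocommit; transactions are explicit below.
        self._db = sqlite3.connect(path, isolation_level=None)
        self._db.execute("PRAGMA journal_mode=WAL")
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS items ("
            " id INTEGER PRIMARY KEY AUTOINCREMENT,"
            " payload TEXT NOT NULL,"
            " claimed INTEGER NOT NULL DEFAULT 0)"
        )

    def push(self, payload: str) -> int:
        cur = self._db.execute("INSERT INTO items (payload) VALUES (?)", (payload,))
        return cur.lastrowid

    def pop(self) -> Optional[Tuple[int, str]]:
        """Claim and return the oldest unclaimed item as (id, payload), or None."""
        # BEGIN IMMEDIATE takes the write lock up front, so two processes
        # cannot claim the same row.
        self._db.execute("BEGIN IMMEDIATE")
        try:
            row = self._db.execute(
                "SELECT id, payload FROM items WHERE claimed = 0 ORDER BY id LIMIT 1"
            ).fetchone()
            if row is not None:
                self._db.execute("UPDATE items SET claimed = 1 WHERE id = ?", (row[0],))
            self._db.execute("COMMIT")
        except Exception:
            self._db.execute("ROLLBACK")
            raise
        return row

    def ack(self, item_id: int) -> None:
        # Acknowledged items are deleted; unacked claims stay on disk.
        self._db.execute("DELETE FROM items WHERE id = ?", (item_id,))

    def close(self) -> None:
        self._db.close()
```

In this sketch an item that was popped but never acked stays in the table with `claimed = 1` after a crash. A real implementation would need some policy for re-queuing stale claims, which this ticket does not specify.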
Out of scope
- Re-architecting the runner's tick loop (separate ticket if needed).
- A new web dashboard (current dashboard stays as experimental until
someone has a use case for it).
- Replacing duckdb if it's the analytics dep — leave for a follow-up.
File pointers
- README.md (Stability matrix + subscription-mode note)
- pyproject.toml (split default vs [experimental] extras)
- src/forge_loop/queue/__init__.py (interface stays)
- src/forge_loop/queue/in_memory.py (stays)
- src/forge_loop/queue/sqlite.py (NEW — replaces redis)
- src/forge_loop/queue/redis_backend.py (DELETE)
- src/forge_loop/cluster/election.py (DELETE)
- src/forge_loop/multirepo/ (keep, gate imports on extras)
- src/forge_loop/runner_async.py (keep, gate)
- src/forge_loop/dashboard/ (keep, gate)
- src/forge_loop/integrations/ (keep, gate)
- src/forge_loop/observability/ (keep, gate)
- src/forge_loop/replay.py (keep, gate)
- src/forge_loop/briefs/critic.md.tmpl (add product category rule)
- tests/test_install_surface.py (NEW — assert default-install surface)
- tests/test_queue_sqlite.py (NEW)
Why `loop:blocked`
Big deletion + reorganization. Operator should sanity-check the
boundaries before a worker shotguns the cleanup. Flip to `loop:ready`
when you're ready to dispatch.