EPIC: ship a 'stable surface' — quarantine experimental features into [extras], swap Redis→SQLite, document the matrix #39

@hadamrd

Problem

After the 16-PR self-bootstrap sprint, forge-loop ships ~9,500 LOC of source
but only ~30% of it has been exercised by a real operator. The rest is
scaffolding: Redis queue + leader election, async runner, multi-repo,
Slack/Discord/webhook adapters, the dashboard, Prometheus + OTel exporters,
time-travel replay, pipeline DAG. Each one looks like a feature but has
zero downstream consumer, costs maintenance, and pulls third-party deps
into the default install.

The product is one operator on one box. We should reflect that in the
default surface and document the rest as opt-in experiments.

Acceptance criteria

  1. Stability matrix in README that marks each module as one of:
    • STABLE: SDK worker (Opus 4.7), typed critic, retry+cooldown,
      externalized briefs, attempts ledger, watchdog, runner sync,
      maintenance pass, PO spec-expander.
    • EXPERIMENTAL: multirepo, runner_async, dashboard, integrations
      (slack/discord/webhook), observability (prometheus/otel), replay,
      pipeline DAG, queue backends other than the default.
  2. pyproject.toml extras:
    • default install pulls ONLY what STABLE needs: claude_agent_sdk and
      pyyaml. duckdb is borderline: keep it if the attempts ledger uses
      it, else drop it.
    • `pip install forge-loop[experimental]` pulls fastapi, uvicorn,
      redis, prometheus_client, opentelemetry-*, jinja2.
    • the experimental modules use lazy imports gated on the extras being
      installed; if a user runs e.g. `forge-loop dashboard` without the
      extra, the CLI raises a clear ImportError at invocation time.
  3. Replace Redis backend with SQLite (WAL) as the durable embedded queue:
    • keep `Queue` interface + `InMemoryQueue` (the test default).
    • new `SQLiteQueue` is the production default (durable across crashes,
      zero infra, ACID).
    • DELETE `redis_backend.py` + `cluster/election.py` (premature
      distribution — re-add when a real 2+ host operator shows up).
  4. Critic prompt patch — add the `product` category + the
    no-scaffold-theatre rule: a PR adding a configurable backend or an
    integration adapter with NO downstream consumer in the same repo
    gets a sev2 product finding. (Pin in critic.md.tmpl.)
  5. README: add a "running on a Claude subscription" section saying
    per-token budget tracking is not supported; the operator runs in
    flat-fee mode.
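
A minimal sketch of the lazy-import gate from criterion 2, in case it helps pin down the expected behaviour. The `require_extra` helper and the `EXPERIMENTAL_DEPS` mapping are hypothetical names, and which subcommand needs which package is an assumption; only the package names and the `[experimental]` extra come from this issue.

```python
import importlib
from types import ModuleType
from typing import List

# Hypothetical mapping from CLI subcommand to the optional packages it needs.
# The real split between subcommands and packages is not specified in this issue.
EXPERIMENTAL_DEPS = {
    "dashboard": ("fastapi", "uvicorn", "jinja2"),
}


def require_extra(command: str, extra: str = "experimental") -> List[ModuleType]:
    """Import the optional deps for `command`, or raise an ImportError naming the extra."""
    modules = []
    for name in EXPERIMENTAL_DEPS.get(command, ()):
        try:
            # The import happens only when the subcommand is actually invoked.
            modules.append(importlib.import_module(name))
        except ImportError as exc:
            raise ImportError(
                f"`forge-loop {command}` needs the optional dependency {name!r}. "
                f"Install it with: pip install 'forge-loop[{extra}]'"
            ) from exc
    return modules
```

Calling `require_extra("dashboard")` at the top of the subcommand handler, rather than at module import time, keeps `forge-loop run` free of experimental imports.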

Test matrix

  • unit: `pip install forge-loop` in a clean venv → `forge-loop run`
    works without any `[experimental]` dep being importable.
  • unit: `forge-loop dashboard` without [experimental] → clear
    ImportError naming the extra to install.
  • unit: `SQLiteQueue` round-trip (push N, pop N, assert order + ack).
  • unit: `SQLiteQueue` durability — write, close, reopen, items still
    there.
  • integration: a full `forge-loop run` tick using SQLiteQueue on disk;
    events.jsonl appended; no Redis import attempted anywhere in the call
    graph.
  • adversarial: the experimental modules SHOULD fail to import in the
    default install; the test asserts this is the case (regression guard
    against a future PR re-adding them to the core requirements).
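
For the two `SQLiteQueue` unit tests, a rough sketch of what the backend could look like, using only the standard-library `sqlite3` module. The `push`/`pop`/`ack` method names follow the wording of the round-trip test, but the signatures, the table schema, and the `state` column are assumptions; the real `Queue` interface may differ.

```python
import sqlite3
from typing import Optional, Tuple


class SQLiteQueue:
    """Illustrative FIFO queue on SQLite in WAL mode (single process assumed)."""

    def __init__(self, path: str) -> None:
        self._db = sqlite3.connect(path)
        self._db.execute("PRAGMA journal_mode=WAL")
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS queue ("
            " id INTEGER PRIMARY KEY AUTOINCREMENT,"
            " payload TEXT NOT NULL,"
            " state TEXT NOT NULL DEFAULT 'pending')"
        )
        self._db.commit()

    def push(self, payload: str) -> int:
        with self._db:  # commits on success, rolls back on error
            cur = self._db.execute(
                "INSERT INTO queue (payload) VALUES (?)", (payload,)
            )
        return cur.lastrowid

    def pop(self) -> Optional[Tuple[int, str]]:
        """Claim the oldest pending item; the row stays on disk until ack()."""
        with self._db:
            row = self._db.execute(
                "SELECT id, payload FROM queue"
                " WHERE state = 'pending' ORDER BY id LIMIT 1"
            ).fetchone()
            if row is None:
                return None
            self._db.execute(
                "UPDATE queue SET state = 'claimed' WHERE id = ?", (row[0],)
            )
        return row

    def ack(self, item_id: int) -> None:
        with self._db:
            self._db.execute("DELETE FROM queue WHERE id = ?", (item_id,))

    def close(self) -> None:
        self._db.close()
```

The durability test would then push items, call `close()`, reopen the same path, and check that `pop()` returns them in insertion order. This sketch does not make `pop()` safe across multiple processes; that would need an explicit `BEGIN IMMEDIATE` or similar.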

Out of scope

  • Re-architecting the runner's tick loop (separate ticket if needed).
  • A new web dashboard (current dashboard stays as experimental until
    someone has a use case for it).
  • Replacing duckdb if it's the analytics dep — leave for a follow-up.

File pointers

  • README.md (Stability matrix + subscription-mode note)
  • pyproject.toml (split default vs [experimental] extras)
  • src/forge_loop/queue/__init__.py (interface stays)
  • src/forge_loop/queue/in_memory.py (stays)
  • src/forge_loop/queue/sqlite.py (NEW — replaces redis)
  • src/forge_loop/queue/redis_backend.py (DELETE)
  • src/forge_loop/cluster/election.py (DELETE)
  • src/forge_loop/multirepo/ (keep, gate imports on extras)
  • src/forge_loop/runner_async.py (keep, gate)
  • src/forge_loop/dashboard/ (keep, gate)
  • src/forge_loop/integrations/ (keep, gate)
  • src/forge_loop/observability/ (keep, gate)
  • src/forge_loop/replay.py (keep, gate)
  • src/forge_loop/briefs/critic.md.tmpl (add product category rule)
  • tests/test_install_surface.py (NEW — assert default-install surface)
  • tests/test_queue_sqlite.py (NEW)

Why `loop:blocked`

Big deletion + reorganization. Operator should sanity-check the
boundaries before a worker shotguns the cleanup. Flip to `loop:ready`
when you're ready to dispatch.

Labels

  • epic (Multi-PR umbrella tracking a major theme)
  • loop:ready (Loop runner will autonomously attempt this issue)
  • priority:p1 (Important, near-term)
  • type:refactor (Code reorg, no behaviour change)
