65 changes: 65 additions & 0 deletions README.md
@@ -28,6 +28,71 @@ merges, and redeploys.
Every Nth tick is a maintenance pass: a PM agent triages/retitles/dedupes
the backlog. Risk-gated issues skip auto-merge.

## Stability matrix

forge-loop's surface is split into a **stable** core and a quarantined
**experimental** ring (issue #39). The default `pip install forge-loop`
ships ONLY the stable surface; experimental modules require the
`[experimental]` extra, or the `FORGE_LOOP_EXPERIMENTAL=1` override
during local development.

### STABLE (default install — supported)

| Module | What it does |
|---|---|
| `forge_loop.worker` | Claude Agent SDK worker (Opus 4.7) — the main dispatch path |
| `forge_loop.critic` | Typed CriticReport + per-finding gating |
| `forge_loop.attempts` | Per-issue attempts ledger w/ retry + cooldown |
| `forge_loop.briefs/*` | Externalised worker / PO / critic brief templates |
| `forge_loop.po` | PO spec-expander for thin tickets |
| `forge_loop.maintenance` | Periodic maintenance pass (triage / dedupe) |
| `forge_loop.watchdog` | Liveness watchdog + idle-kill |
| `forge_loop.runner` | Synchronous tick loop (PO → workers → critics) |
| `forge_loop.queue.in_memory` | Default queue (test / in-process) |
| `forge_loop.queue.sqlite` | Durable embedded queue (WAL) — production default |

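As a rough illustration of what "durable embedded queue (WAL)" can mean in practice, here is a minimal sketch built on the standard-library `sqlite3` module. The `jobs` table, its columns, and the helpers `open_queue`, `enqueue`, and `claim_next` are hypothetical names chosen for this example; they are not taken from `forge_loop.queue.sqlite`.

```python
import sqlite3


def open_queue(path: str) -> sqlite3.Connection:
    """Open (or create) a queue database file and switch it to WAL journaling."""
    conn = sqlite3.connect(path)
    # WAL lets readers keep working while a single writer appends to the log.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " payload TEXT NOT NULL,"
        " claimed INTEGER NOT NULL DEFAULT 0)"
    )
    conn.commit()
    return conn


def enqueue(conn: sqlite3.Connection, payload: str) -> int:
    """Append one job and return its row id."""
    with conn:  # commits on success, rolls back on an exception
        cur = conn.execute("INSERT INTO jobs (payload) VALUES (?)", (payload,))
    return cur.lastrowid


def claim_next(conn: sqlite3.Connection):
    """Claim the oldest unclaimed job, or return None if the queue is empty."""
    with conn:
        # Take the write lock up front so the SELECT and UPDATE form one unit.
        conn.execute("BEGIN IMMEDIATE")
        row = conn.execute(
            "SELECT id, payload FROM jobs WHERE claimed = 0 ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        conn.execute("UPDATE jobs SET claimed = 1 WHERE id = ?", (row[0],))
    return row
```

Because the state lives in a single file, a restarted process can reopen the same path and continue from the unclaimed rows; that is the property an in-memory queue cannot offer.
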
### EXPERIMENTAL (gated, requires `pip install 'forge-loop[experimental]'`)

| Module | Why it's experimental |
|---|---|
| `forge_loop.multirepo` | One loop serves N repos — zero downstream operator yet |
| `forge_loop.runner_async` | Three-stage asyncio orchestrator — sync path is the supported one |
| `forge_loop.dashboard` | Stdlib HTTP `/metrics` + `/healthz` — no scrape consumer yet |
| `forge_loop.integrations.*` | Slack / Discord / generic-webhook adapters |
| `forge_loop.observability.*` | Prometheus + OpenTelemetry exporters |
| `forge_loop.replay` | Time-travel re-run of a past tick with a new brief |
| `forge_loop.pipeline` | Declarative role-chain pipeline (`.forge/pipeline.yaml`) |

Each experimental module's top-level import calls
`forge_loop._extras.require_experimental()`, which raises a clear
`ImportError` naming the extra to install if the gate is not satisfied.
Set `FORGE_LOOP_EXPERIMENTAL=1` to bypass the gate during local development.

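Since the gate surfaces as an ordinary `ImportError`, a caller can treat an experimental module as optional. The sketch below assumes nothing beyond that behaviour; `load_optional` is a hypothetical helper, not part of forge-loop.

```python
from __future__ import annotations

import importlib
from types import ModuleType


def load_optional(module_name: str) -> ModuleType | None:
    """Import a module that may be gated; return None if it is unavailable."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # The gate's message names the extra to install, so pass it through.
        print(f"{module_name} disabled: {exc}")
        return None


# Falls back cleanly whether the gate refuses or forge-loop is not installed.
dashboard = load_optional("forge_loop.dashboard")
if dashboard is None:
    print("continuing on the stable surface only")
```

Catching `ImportError` also covers `ModuleNotFoundError`, which is its subclass, so the same branch handles a missing package and a refused gate.
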
### REMOVED in #39

| Module | Reason |
|---|---|
| `forge_loop.queue.redis_backend` | Premature distribution — no real 2+ host operator. Replaced by `SQLiteQueue`. |
| `forge_loop.cluster` (election + coordinator) | Same reason — single-host is the supported surface. |

## Running on a Claude subscription (flat-fee mode)

forge-loop is designed for an operator running on a Claude Pro / Max /
Team subscription, not a metered API key. **Per-token budget tracking is
not supported** in this mode — the loop's spend gauges will read $0 and
the `LOOP_DAILY_BUDGET_USD` / `LOOP_TICK_BUDGET_USD` knobs are no-ops.

What you get instead:

* The watchdog wall-clock budget (`LOOP_WORKER_TIMEOUT_S`) still applies
per worker — it's the actual safety net under a flat fee.
* `forge-loop status` reports tick counts, PRs merged, failures.
* Cooldowns + attempts ledger still gate retries, so a stuck issue
can't burn unbounded wall time.

If you DO have a metered key and want token accounting, set up a
separate billing scrape — the current default is "operator pays a flat
fee, we count work done not tokens spent".

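The retry gating mentioned in the list above (cooldowns plus an attempts ledger) can be pictured with a small sketch. Everything here is an assumption made for illustration: the class name `AttemptsLedger`, the default limits, and the method names do not come from `forge_loop.attempts`.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class AttemptsLedger:
    """Hypothetical per-issue ledger: caps retries and enforces a cooldown."""

    max_attempts: int = 3
    cooldown_s: float = 600.0
    # issue number -> timestamps (in seconds) of past attempts
    _attempts: dict[int, list[float]] = field(default_factory=dict)

    def record(self, issue: int, now: float) -> None:
        """Note that a worker was dispatched for this issue at time `now`."""
        self._attempts.setdefault(issue, []).append(now)

    def may_dispatch(self, issue: int, now: float) -> bool:
        """Return True only if the issue has retries left and has cooled down."""
        history = self._attempts.get(issue, [])
        if len(history) >= self.max_attempts:
            return False  # retry budget exhausted
        if history and now - history[-1] < self.cooldown_s:
            return False  # still cooling down after the last attempt
        return True
```

Passing `now` explicitly keeps the ledger deterministic, so the gating rules can be checked without sleeping.
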
## Quickstart

```sh
40 changes: 31 additions & 9 deletions pyproject.toml
@@ -4,6 +4,9 @@ version = "0.1.0"
description = "Autonomous multi-worker dispatcher for Claude Code — picks up GitHub issues by label, dispatches parallel workers in git worktrees, watches PRs, merges, and redeploys."
readme = "README.md"
requires-python = ">=3.11"
# STABLE surface only (issue #39). Anything heavier — fastapi, uvicorn,
# prometheus_client, opentelemetry, jinja2, redis — lives in the
# [experimental] extra and is gated by ``forge_loop._extras``.
dependencies = [
"mcp >=1.0",
"pyyaml >=6.0",
@@ -14,26 +17,32 @@ dependencies = [
]

[project.optional-dependencies]
observability = [
# All quarantined / experimental features (multirepo, dashboard, async
# orchestrator, slack/discord/webhook adapters, prometheus + otel
# exporters, replay, pipeline DAG) live behind this extra. Install with:
# pip install 'forge-loop[experimental]'
experimental = [
"fastapi >=0.110",
"uvicorn >=0.30",
"prometheus_client >=0.20",
"opentelemetry-api >=1.25",
"opentelemetry-sdk >=1.25",
"opentelemetry-exporter-otlp >=1.25",
]
# Optional Redis-backed queue + leader election for multi-host runners
# (issue #19). Default stays in-memory; install with `pip install
# 'forge-loop[redis]'` to enable `forge-loop run --queue redis://...`.
redis = [
"redis >=5.0",
"jinja2 >=3.1",
]
dev = [
"pytest >=8.0",
"pytest-cov >=5.0",
"ruff >=0.6",
"mypy >=1.10",
"types-PyYAML >=6.0",
"fakeredis >=2.20",
"redis >=5.0",
"fastapi >=0.110",
"uvicorn >=0.30",
"prometheus_client >=0.20",
"opentelemetry-api >=1.25",
"opentelemetry-sdk >=1.25",
"opentelemetry-exporter-otlp >=1.25",
"jinja2 >=3.1",
]

[project.scripts]
@@ -62,6 +71,19 @@ target-version = "py311"
select = ["E", "F", "I", "B", "UP", "SIM"]
ignore = ["E501"] # line-too-long handled by format

[tool.ruff.lint.per-file-ignores]
# Experimental modules (issue #39) call the extras gate before any
# other import so an unprepared environment fails fast with a clear
# error rather than crashing partway through a chain of imports. That
# forces E402 / I001 violations that are by-design.
"src/forge_loop/multirepo/__init__.py" = ["E402", "I001"]
"src/forge_loop/dashboard/__init__.py" = ["E402", "I001"]
"src/forge_loop/integrations/__init__.py" = ["E402", "I001"]
"src/forge_loop/observability/__init__.py" = ["E402", "I001"]
"src/forge_loop/pipeline/__init__.py" = ["E402", "I001"]
"src/forge_loop/replay.py" = ["E402", "I001"]
"src/forge_loop/runner_async.py" = ["E402", "I001"]

[tool.mypy]
python_version = "3.11"
strict = true
70 changes: 70 additions & 0 deletions src/forge_loop/_extras.py
@@ -0,0 +1,70 @@
"""Extras gate — central place to declare 'this module is experimental'.

The stable surface of forge-loop is documented in the README's stability
matrix. Anything outside that surface lives behind an extras gate: the
default install of forge-loop deliberately does NOT pull the third-party
dependencies the experimental modules need (fastapi, prometheus_client,
opentelemetry, jinja2, …), and the modules themselves refuse to import
unless those deps are present.

This file owns the detection logic so each experimental module can be
one-line-gated:

    from forge_loop._extras import require_experimental
    require_experimental("dashboard")

If none of the sentinel deps from the ``[experimental]`` extra is importable,
we raise ``ImportError`` with a single clear remediation step: install the extra.
The override env var ``FORGE_LOOP_EXPERIMENTAL=1`` lets us run tests of
experimental modules without forcing the extras on every dev box; the
production gate stays loud.
"""

from __future__ import annotations

import importlib
import os

# Sentinel dependencies that ship inside the [experimental] extra. If
# ANY one of them is importable, we treat the extra as installed (we do
# not require all of them — operators sometimes carve out a subset, and
# the message remains actionable either way).
_EXPERIMENTAL_SENTINELS = (
    "prometheus_client",
    "fastapi",
    "uvicorn",
    "opentelemetry",
    "jinja2",
    "redis",
)


def experimental_installed() -> bool:
    """Return True if any sentinel dep from [experimental] is importable.

    The ``FORGE_LOOP_EXPERIMENTAL=1`` override also counts as installed.
    """

    # Local-dev override: skip sentinel detection entirely.
    if os.environ.get("FORGE_LOOP_EXPERIMENTAL") == "1":
        return True
    for name in _EXPERIMENTAL_SENTINELS:
        try:
            importlib.import_module(name)
            return True
        except ImportError:
            continue
    return False


def require_experimental(feature: str) -> None:
    """Refuse to import an experimental module unless the extra is present.

    Raises ``ImportError`` (the canonical signal for "this module is
    unavailable in your environment") with a single remediation line.
    """

    if experimental_installed():
        return
    raise ImportError(
        f"forge-loop feature {feature!r} is experimental and is not "
        f"available in the default install. Install with: "
        f"pip install 'forge-loop[experimental]' "
        f"(or export FORGE_LOOP_EXPERIMENTAL=1 to bypass for local dev)."
    )
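
The detection above imports each sentinel, which runs that package's import-time code. One possible alternative, shown here only as a sketch and not as what this PR does, is to look the packages up with `importlib.util.find_spec`. The function name `experimental_available` and its injectable `sentinels` and `env` parameters are hypothetical, added so the logic can be exercised without touching the real environment.

```python
from __future__ import annotations

import importlib.util
import os
from collections.abc import Iterable, Mapping

# Same sentinel names as the gate above, repeated so the sketch is standalone.
SENTINELS = ("prometheus_client", "fastapi", "uvicorn", "opentelemetry", "jinja2", "redis")


def experimental_available(
    sentinels: Iterable[str] = SENTINELS,
    env: Mapping[str, str] | None = None,
) -> bool:
    """Return True if the override is set or any sentinel can be located."""
    env = os.environ if env is None else env
    if env.get("FORGE_LOOP_EXPERIMENTAL") == "1":
        return True
    # find_spec locates a top-level package without executing its module body.
    return any(importlib.util.find_spec(name) is not None for name in sentinels)
```

The trade-off is that a locatable package can still fail when it is actually imported, so this variant answers "is it installed?" rather than "does it import cleanly?".
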
16 changes: 14 additions & 2 deletions src/forge_loop/briefs/critic.md.tmpl
@@ -15,7 +15,19 @@ DO:
 - sev2 = meaningful concern (untested error path, weak assertion,
 scope creep affecting reviewers). Worth fixing before merge.
 - sev3 = nit / suggestion. Non-blocking.
-Categories: correctness | security | style | tests | docs.
+Categories: correctness | security | style | tests | docs | product.
+
+NO-SCAFFOLD-THEATRE RULE (issue #39):
+A PR that adds a configurable backend, an integration adapter, a
+dashboard / metrics exporter, or any other "plug-in" surface WITHOUT
+a downstream consumer wired up in the same repo on the same PR is
+scaffold theatre. Emit a sev2 finding with category="product" naming
+the orphan code; the operator must either wire it up or quarantine
+it behind the [experimental] extra before merge. Examples:
+* adding a redis/postgres queue backend with no operator using it
+* adding a Slack adapter with no event in the runner that sends to it
+* adding a Prometheus exporter no scrape job consumes
+Default surface stays minimal; experiments live in extras.

 DO NOT:
 - push code or edit files.
@@ -26,7 +38,7 @@ FINAL OUTPUT (one JSON line, no prose after it, no markdown fence):
 {{"overall": "approve|request_changes|block",
 "findings": [
 {{"severity": "sev1|sev2|sev3",
-"category": "correctness|security|style|tests|docs",
+"category": "correctness|security|style|tests|docs|product",
 "file": "path/to/file" or null,
 "line": 42 or null,
 "message": "what's wrong and what to do"}}
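
A consumer of this output format has to accept the new `product` category alongside the existing ones. The sketch below shows one way such a check might look; `parse_critic_output` and the three constant sets are hypothetical names for this example and are not the PR's `forge_loop.critic` implementation.

```python
import json

OVERALL = {"approve", "request_changes", "block"}
SEVERITIES = {"sev1", "sev2", "sev3"}
CATEGORIES = {"correctness", "security", "style", "tests", "docs", "product"}


def parse_critic_output(raw: str) -> dict:
    """Parse the final non-empty line of critic output and validate its enums."""
    lines = [line for line in raw.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty critic output")
    # The brief asks for one JSON line with no prose after it.
    report = json.loads(lines[-1])
    if not isinstance(report, dict):
        raise ValueError("critic report must be a JSON object")
    if report.get("overall") not in OVERALL:
        raise ValueError(f"unknown overall verdict: {report.get('overall')!r}")
    for finding in report.get("findings", []):
        if finding.get("severity") not in SEVERITIES:
            raise ValueError(f"unknown severity: {finding.get('severity')!r}")
        if finding.get("category") not in CATEGORIES:
            raise ValueError(f"unknown category: {finding.get('category')!r}")
    return report
```

Malformed JSON also ends up as a `ValueError`, because `json.JSONDecodeError` is a subclass of it, so a caller only needs one `except` clause.
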
68 changes: 11 additions & 57 deletions src/forge_loop/cli.py
@@ -44,65 +44,19 @@ def _cmd_run(args: argparse.Namespace) -> int:


 def _cmd_cluster_status(args: argparse.Namespace) -> int:
-    """Enumerate live runners via the shared registry.
+    """Deprecated: multi-host cluster mode was removed in #39.

-    Reads ``<prefix>:runners:*`` keys from Redis and prints each runner's
-    host, in-flight count, leader flag, and seconds since last heartbeat.
-    Falls back to a clear error if the broker is unreachable — operator
-    needs an actionable signal, not a Python traceback.
+    The subcommand is kept as a stub so old scripts get a clear, actionable
+    error instead of a silent no-op or AttributeError.
     """

-    import json as _json
-    import time as _time
-
-    from forge_loop.cluster import RunnerRegistry
-    from forge_loop.queue import QueueUnavailable, default_host_id
-    from forge_loop.queue.redis_backend import _load_redis  # noqa: PLC2701
-
-    try:
-        redis = _load_redis()
-        client = redis.Redis.from_url(args.queue, decode_responses=True)
-        client.ping()
-    except QueueUnavailable as exc:
-        sys.stderr.write(f"cluster status: {exc}\n")
-        return 2
-    except Exception as exc:  # noqa: BLE001
-        sys.stderr.write(f"cluster status: cannot reach {args.queue}: {exc}\n")
-        return 2
-
-    # The registry is read-only here — runner_id/host_id don't matter
-    # for ``list_runners``; pass placeholders so we don't fight the API.
-    registry = RunnerRegistry(client, runner_id="cli", host_id=default_host_id())
-    runners = registry.list_runners()
-    now = _time.time()
-    if args.json:
-        sys.stdout.write(
-            _json.dumps(
-                [
-                    {
-                        "runner_id": r.runner_id,
-                        "host_id": r.host_id,
-                        "in_flight": r.in_flight,
-                        "is_leader": r.is_leader,
-                        "age_s": round(now - r.last_heartbeat, 2),
-                    }
-                    for r in runners
-                ]
-            )
-            + "\n"
-        )
-        return 0
-    if not runners:
-        sys.stdout.write("no live runners\n")
-        return 0
-    sys.stdout.write(f"{'RUNNER':<28} {'HOST':<24} {'LEAD':<5} {'INFLIGHT':<10} AGE\n")
-    for r in runners:
-        sys.stdout.write(
-            f"{r.runner_id:<28} {r.host_id:<24} "
-            f"{'yes' if r.is_leader else '-':<5} {r.in_flight:<10} "
-            f"{now - r.last_heartbeat:.1f}s\n"
-        )
-    return 0
+    _ = args
+    sys.stderr.write(
+        "cluster status: multi-host cluster mode was removed in #39 "
+        "(premature distribution; one-operator-one-box is the supported "
+        "surface). Use 'forge-loop status' and 'forge-loop events' instead.\n"
+    )
+    return 2


 def _cmd_status(_args: argparse.Namespace) -> int:
@@ -843,7 +797,7 @@ def main(argv: list[str] | None = None) -> int:
         "--queue",
         default=None,
         help="Queue backend URL. Default: in-memory (single host). "
-        "Pass redis://host:port/db to share work across multiple runner processes.",
+        "Pass sqlite:///path/to/queue.db for the durable embedded backend.",
     )
     p_run.set_defaults(func=_cmd_run)
     sub.add_parser("status", help="Print current state file").set_defaults(func=_cmd_status)
16 changes: 0 additions & 16 deletions src/forge_loop/cluster/__init__.py

This file was deleted.
