feat(maintenance): stuck-issue sweep — demote loop:ready issues with N exhausted iteration attempts #129

@hadamrd

Why

CTO question while dogfooding the brainstormer epic: "how come maintenance didn't detect the issue?"

The current maintenance.py daemon is an LLM-driven backlog GROOMER (closes dupes, retitles, adds loop:ready). It does NOT scan for stuck loop:ready issues that have repeatedly hit worker_iterations_exhausted.

Result: even after the iteration loop bailed on a broken branch, the issue stayed loop:ready (until #128 fixed the label drop). And even with that fix, transient bugs that don't trip escalate_to_human cleanly will still leave issues stuck.

What

A stuck_sweep.py health-check that runs every tick (or every N ticks) and:

  1. Reads the last 100 events from loop-runner-events.jsonl
  2. Counts worker_iterations_exhausted events per issue
  3. Any issue with ≥ 2 exhausted attempts AND still loop:ready → demote to loop:needs-human + comment with the last failure state
  4. Any issue stuck in a pushed_no_pr / committed_not_pushed state for > 3 ticks → escalate similarly
  5. Emit a typed stuck_sweep_demoted event
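Steps 1 to 3 could be sketched roughly as below. This is only an illustration: the event field names `type` and `issue` are assumptions about the loop-runner-events.jsonl schema, and `count_exhausted` / `stuck_candidates` are hypothetical helper names, not existing code.

```python
import json
from collections import Counter, deque
from pathlib import Path

EXHAUSTED = "worker_iterations_exhausted"


def count_exhausted(events_file: Path, window: int = 100) -> Counter:
    """Count exhausted events per issue over the last `window` lines."""
    with events_file.open(encoding="utf-8") as fh:
        # A bounded deque keeps only the last `window` lines of the file.
        tail = deque(fh, maxlen=window)

    counts: Counter = Counter()
    for line in tail:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip a corrupt or partially written line
        if not isinstance(event, dict):
            continue
        # ASSUMPTION: events carry "type" and "issue" keys.
        if event.get("type") == EXHAUSTED and "issue" in event:
            counts[event["issue"]] += 1
    return counts


def stuck_candidates(counts: Counter, threshold: int = 2) -> list[int]:
    """Issues whose exhausted count meets the threshold, sorted."""
    return sorted(issue for issue, n in counts.items() if n >= threshold)
```

The candidates still need the "still loop:ready" check against GitHub before any demotion.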

Acceptance

  • src/forge_loop/stuck_sweep.py (new) with sweep(events_file, gh_client) -> SweepReport
  • Wired into tick loop: runs after the iteration loop on every tick, before dispatching the next batch
  • Configurable via settings.maintenance.stuck_threshold_attempts (default 2)
  • Tests: fixtures of events.jsonl with mixed stuck/healthy issues; assert correct demotions
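A possible shape for the demotion half of the sweep, as a sketch only. The `GhClient` protocol methods (`get_labels`, `relabel`, `comment`), the `SweepReport` fields, and the `demote_stuck` name are all assumptions made for illustration; the real client and report may look different. It takes precomputed per-issue counts rather than the events file to keep the example short.

```python
import logging
from dataclasses import dataclass, field
from typing import Protocol

log = logging.getLogger("stuck_sweep")


class GhClient(Protocol):
    # HYPOTHETICAL interface; the real GhClient may differ.
    def get_labels(self, issue: int) -> set[str]: ...
    def relabel(self, issue: int, remove: str, add: str) -> None: ...
    def comment(self, issue: int, body: str) -> None: ...


@dataclass
class SweepReport:
    demoted: list[int] = field(default_factory=list)
    failed: list[int] = field(default_factory=list)


def demote_stuck(counts: dict[int, int], gh: GhClient, threshold: int = 2) -> SweepReport:
    """Demote loop:ready issues whose exhausted count meets the threshold."""
    report = SweepReport()
    for issue, attempts in sorted(counts.items()):
        if attempts < threshold:
            continue
        try:
            if "loop:ready" not in gh.get_labels(issue):
                continue  # already demoted or otherwise handled
            gh.relabel(issue, remove="loop:ready", add="loop:needs-human")
            gh.comment(issue, f"Demoted by stuck sweep after {attempts} exhausted attempts.")
        except Exception:
            # One failing issue is logged and recorded; the sweep continues.
            log.exception("stuck sweep: demotion failed for issue #%s", issue)
            report.failed.append(issue)
            continue
        report.demoted.append(issue)
    return report
```

Catching per issue, rather than around the whole loop, is what keeps a single GhClient failure from aborting the rest of the sweep.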

Test matrix

  • Issue with 2 exhausted events → demoted
  • Issue with 1 exhausted + 1 success after → NOT demoted (it recovered)
  • Issue with 0 exhausted events → NOT demoted
  • Demotion fires the typed event + correct labels
  • GhClient failure during demotion → caught, logged, doesn't crash sweep
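One way to read the "it recovered" row is to count exhausted events only since the most recent success. The sketch below does that over one issue's event types in chronological order. The success event name `worker_succeeded` is hypothetical (the issue does not name it), as are the helper names.

```python
EXHAUSTED = "worker_iterations_exhausted"
SUCCESS = "worker_succeeded"  # HYPOTHETICAL event name, not from the codebase


def exhausted_streak(event_types: list[str]) -> int:
    """Exhausted events since the most recent success (input is oldest first)."""
    streak = 0
    for event_type in event_types:
        if event_type == EXHAUSTED:
            streak += 1
        elif event_type == SUCCESS:
            streak = 0  # a later success clears earlier failures
    return streak


def is_stuck(event_types: list[str], threshold: int = 2) -> bool:
    return exhausted_streak(event_types) >= threshold


# The first three rows of the matrix, expressed as checks:
assert is_stuck([EXHAUSTED, EXHAUSTED])
assert not is_stuck([EXHAUSTED, SUCCESS])
assert not is_stuck([])
```

With the default threshold of 2, a plain count would also leave the second row alone; the reset only changes the outcome when several exhausted events precede a success.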

File pointers

  • src/forge_loop/stuck_sweep.py (new)
  • src/forge_loop/events.py — register StuckSweepDemotedEvent
  • src/forge_loop/settings.py (add maintenance.stuck_threshold_attempts)
  • src/forge_loop/runner/tick.py — call sweep per tick
  • tests/test_stuck_sweep.py (new)
