
feat(engine): bound claude -p phases with a timeout, cost cap, and heartbeat #53

Merged
ecukalla merged 1 commit into main from feature/ISSUE-48-phase-timeouts
May 30, 2026

Conversation

@ecukalla
Owner

Summary

Closes #48. Bounds every token-spending claude -p phase so a throttled or wedged hosted
call can no longer hang a feature-loop run indefinitely, and makes a slow-but-progressing
phase distinguishable from a hung one.

What changed

Per-phase timeout (FL_PHASE_TIMEOUT, default 1200s) — every claude -p phase (build,
the test/security gates, simplify, retrospective) runs under GNU timeout -k 30.

  • A timed-out gate has its failure file synthesized, so the loop treats it as failed
    and retries — rather than mis-reading "no failure file written" as a pass.
  • A timed-out post-green /code-simplify discards its partial edits and ships the
    already-green tip (matching the recommendation in #48 itself).
  • Where timeout isn't on PATH (e.g. a bare macOS host), calls run unbounded rather than
    failing — and the prefix array is guarded so an empty expansion can't abort under
    set -u on macOS's Bash 3.2.
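
A minimal sketch of the timeout behaviour described above, not the code from this PR. The helper names (build_timeout_prefix, run_bounded, run_gate) and the failure-file message are hypothetical. It shows the guarded prefix array, the unbounded fallback when timeout is missing, and a failure file being synthesized when GNU timeout reports a timeout (exit status 124, or 137 if the -k grace period ended in SIGKILL).

```shell
#!/usr/bin/env bash
set -u

# Hypothetical helper: fill TIMEOUT_PREFIX with the timeout invocation,
# or leave it empty when `timeout` is not on PATH (e.g. a bare macOS host).
build_timeout_prefix() {
  TIMEOUT_PREFIX=()
  if command -v timeout >/dev/null 2>&1; then
    TIMEOUT_PREFIX=(timeout -k 30 "${FL_PHASE_TIMEOUT:-1200}")
  fi
}

# Hypothetical helper: run "$@" under the prefix, or unbounded without it.
run_bounded() {
  build_timeout_prefix
  # ${arr[@]+"${arr[@]}"} expands to nothing for an empty array, which
  # avoids the "unbound variable" abort under `set -u` on Bash 3.2.
  ${TIMEOUT_PREFIX[@]+"${TIMEOUT_PREFIX[@]}"} "$@"
}

# Hypothetical helper: run a gate; on timeout, write the failure file
# ourselves so a missing file is never mistaken for a pass.
run_gate() {
  local failure_file=$1
  shift
  local rc=0
  run_bounded "$@" || rc=$?
  if [ "$rc" -eq 124 ] || [ "$rc" -eq 137 ]; then
    printf 'gate timed out after %ss\n' "${FL_PHASE_TIMEOUT:-1200}" > "$failure_file"
  fi
  return "$rc"
}
```

A caller would then treat the presence of the failure file as "gate failed, retry", regardless of whether the gate wrote it or the wrapper did.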

Cost cap (FL_MAX_BUDGET_USD, opt-in) — when set, passed as --max-budget-usd to each
phase, bounding a runaway agentic loop by cost. The issue proposed --max-turns, but the
pinned CLI (2.1.156) exposes --max-budget-usd, not --max-turns; this implements the
bound with the flag the CLI actually offers. Unset = no cap, so a guessed default can't
truncate a legitimate multi-step phase.
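
The opt-in pass-through can be sketched as follows. This is an illustration under assumptions, not the engine's code: build_budget_args and phase_argv are hypothetical names, the position of the flag after the prompt is a guess, and the function prints the argument vector instead of invoking claude so it can be run anywhere.

```shell
#!/usr/bin/env bash
set -u

# Hypothetical helper: BUDGET_ARGS carries the flag only when the
# variable is set and non-empty; unset means no cap at all.
build_budget_args() {
  BUDGET_ARGS=()
  if [ -n "${FL_MAX_BUDGET_USD:-}" ]; then
    BUDGET_ARGS=(--max-budget-usd "$FL_MAX_BUDGET_USD")
  fi
}

# Hypothetical helper: print the command line a phase would use.
phase_argv() {
  build_budget_args
  # Same empty-array guard as elsewhere, for Bash 3.2 under `set -u`.
  set -- claude -p "$1" ${BUDGET_ARGS[@]+"${BUDGET_ARGS[@]}"}
  printf '%s\n' "$*"
}
```

With FL_MAX_BUDGET_USD unset, phase_argv build prints the bare command; with FL_MAX_BUDGET_USD=5 the flag and its value are appended.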

Progress heartbeat (FL_HEARTBEAT_SECS, default 60) — off-TTY (piped/CI/Docker), where
the spinner is a no-op, long phases emit a plain … still running (Nm) tick, and STATUS.md
stamps each running phase's start time. (Mitigation #4 in the issue — structured 429
parsing — is intentionally not added: the CLI logs no structured rate-limit signal to key
on; the elapsed tick + the timed-out phase's captured log tail surface throttling instead.)
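
One way the off-TTY tick could work is sketched below. It assumes the tick goes to stderr and that "off-TTY" is detected with [ -t 2 ]; with_heartbeat is a hypothetical name, and the STATUS.md start-time stamp is not modelled.

```shell
#!/usr/bin/env bash
set -u

# Hypothetical helper: run "$@" and, when stderr is not a terminal,
# print an elapsed-time tick every FL_HEARTBEAT_SECS seconds.
with_heartbeat() {
  local interval=${FL_HEARTBEAT_SECS:-60}
  local start=$SECONDS
  local hb="" rc=0

  "$@" &
  local pid=$!

  if [ ! -t 2 ]; then
    (
      while sleep "$interval"; do
        echo "… still running ($(( (SECONDS - start) / 60 ))m)" >&2
      done
    ) &
    hb=$!
  fi

  # Propagate the phase's own exit status to the caller.
  wait "$pid" || rc=$?

  if [ -n "$hb" ]; then
    kill "$hb" 2>/dev/null
    wait "$hb" 2>/dev/null || true
  fi
  return "$rc"
}
```

Because the ticker is a separate background subshell, a phase that stops producing output still yields ticks, which is what lets a reader of the log tell "slow" from "silent". The in-flight sleep may outlive the ticker by up to one interval.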

Test plan

  • New bats coverage: a timed-out gate is marked failed (no hang), a simplify timeout ships
    the green tip, --max-budget-usd is passed through only when set, and a slow phase emits
    a heartbeat tick. The two timeout tests skip gracefully where timeout is unavailable.
  • Full suite green on macOS Bash 3.2 (46 ok). Tests 27/28 (PTY/spinner, from #49) fail only
    on this macOS host, where script(1) differs from Linux, and fail identically on main;
    they pass in CI.

Provenance

Implemented by an autonomous feature-loop run on #48 (dogfooding), then rebased onto
v0.4.2, squashed to a single clean commit, AI-attribution trailers stripped (per repo
convention), and reviewed by hand — including a Bash 3.2 portability pass and confirming
--max-budget-usd is a real CLI flag.

…artbeat

Every token-spending claude -p phase (build, the test/security gates, simplify, retrospective) now runs under a per-phase wall-clock timeout (FL_PHASE_TIMEOUT, default 1200s) via GNU timeout, so a throttled or wedged hosted call can no longer hang the run indefinitely. A timed-out gate has its failure file synthesized so the loop retries instead of mis-reading the unfinished gate as a pass; a timed-out post-green /code-simplify discards its partial edits and ships the already-green tip. Where timeout is unavailable the calls run unbounded rather than failing.

Adds opt-in FL_MAX_BUDGET_USD, passed as --max-budget-usd to each phase, to bound a runaway agentic loop by cost (the plan proposed --max-turns, but the pinned CLI exposes --max-budget-usd). Adds an off-TTY progress heartbeat: long phases emit a plain "still running (Nm)" tick every FL_HEARTBEAT_SECS (default 60) and STATUS.md stamps each running phase's start time, so a slow-but-progressing phase is distinguishable from a hung one.

Adds bats coverage: a timed-out gate is marked failed (no hang), a simplify timeout ships the green tip, --max-budget-usd is passed through only when set, and a slow phase emits a heartbeat tick.

Closes #48.
ecukalla force-pushed the feature/ISSUE-48-phase-timeouts branch from 8913ad2 to a9d0b1a (May 30, 2026 16:16)
ecukalla merged commit 6703842 into main (May 30, 2026); 5 checks passed
ecukalla deleted the feature/ISSUE-48-phase-timeouts branch (May 31, 2026 07:00)

Development

Successfully merging this pull request may close these issues.

engine: claude -p phases have no timeout or progress heartbeat — one stalled ~29 min (ISSUE-43 run)
