[FORGE-22] feat(orchestrator): gc reconciler + retry policy (P2-T09)#203
Merged
Conversation
… + failure_reason discriminator
[FORGE-22] P2-T09 units 1+2 — foundation for the gc reconciler.
src/orchestrator/retry.ts (new, pure)
backoffMs(), nextEligibleAt(), nextRetryState(), classifyError().
Symphony formula: min(1000 * 2^(attempt-1), retry_backoff_ms_max).
Defaults: retry_attempts=10, retry_backoff_ms_max=300_000 (matches
agents.* in settings.yaml — verified against the existing keys).
src/schemas/task-state.ts
Adds optional failure_reason: 'decision_key_budget' | 'retries_exhausted'
| 'fatal' with zod refinement: present iff state === 'failed'.
Schema-additive; existing state.json files round-trip unchanged.
src/core/errors.ts
Adds DECISION_KEY_EXHAUSTED (consumer wiring only — producer site
deferred to follow-up per FORGE-22 plan §Scope).
Adds LEASE_IDENTITY_MISMATCH + LEASE_STATE_NOT_TERMINAL for the
new admin-release primitive.
src/orchestrator/leases.ts
Adds adminReleaseLeaseByIdentity({forgeDir, taskId, expectedClaimId,
expectedGeneration, expectedOwnerRunId, expectedPath,
requireTerminalState, reason}). Identity-gated, NOT a generic
ownership-bypass: caller supplies exact lease identity captured at
snapshot time, executor re-reads + confirms (claim_id, generation,
owner_run_id) match before unlink. Row-14 path additionally
re-reads state.json and confirms state ∈ TERMINAL_TASK_STATES.
Records event: 'admin_released' in claim-history.jsonl with reason.
Codex 2nd-pass #2 resolution.
Tests: +30 net (16 retry + 9 admin-release + 5 task-state invariant).
Full suite: 1307 pass / 0 fail / 19 skipped.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…onciler
[FORGE-22] P2-T09 units 3+4 — the deterministic reconciler.
src/orchestrator/gc.ts (new, pure)
planGc(snapshot) → GcPlan. Discriminated union on `action`:
mark_terminal | mark_abandoned | mark_unclaimed |
archive_question | reverify_verdict |
release_lease_admin | prune_branch | report_orphan.
14 row detectors mirror spec/ORCHESTRATOR.md §gc reconciliation rules.
CHEAP_ROW_IDS = {2, 5, 8, 11, 12, 13, 14} — local-only checks for
detect-and-warn auto-gc consumers (Unit 5).
Deterministic ordering: (rowId, taskId). assertNeverGcRow() locks
exhaustiveness at the executor's switch.
Row 4 auto-recovery DEFERRED (Codex 2nd-pass #1 — Issue lacks claim
metadata); planner emits report_orphan with kind:
tracker_claimed_no_local for detection.
src/cli/orchestrate/gc.ts (refactored)
Two-phase verb:
Phase 0 — legacy v1 question-tree migration (FORGE-73, preserved
verbatim; 6 existing tests still pass).
Phase 1 — buildSnapshot() → planGc() → format (dry-run) or
executeRow().
Snapshot scans local state: state.json, lease.json* (multi-lease
for row 13), attempts/ (verdict, question, answer files).
Tracker integration deferred — empty trackerIssues means rows
1/3/4/6/7 don't fire in this PR.
Executor dispatches: release_lease_admin via adminReleaseLeaseByIdentity;
archive_question via mkdir+rename. Other actions (mark_terminal,
mark_abandoned, mark_unclaimed, reverify_verdict, prune_branch)
emit stderr warnings — deferred to follow-up, NOT errors.
exitCode is non-zero only on actual mutation failure.
src/cli/orchestrate/index.ts
gcHandler synopsis updated to reflect reconciler scope.
Tests:
test/orchestrator/gc.test.ts (new, 30 tests):
14 row detectors with fixture snapshots, cheap-set partition,
determinism, no-op alignment, GcPlanRow exhaustiveness check.
test/cli/orchestrate/gc-reconciler.test.ts (new, 8 tests):
Clean tree, dry-run vs apply, row 14 end-to-end, row 13
duplicate-lease release, row 11 archive, multi-row plan,
idempotence, legacy + reconciler coexist.
Full suite: 1345 pass / 0 fail / 19 skipped (+38 from base 1307).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
[FORGE-22] P2-T09 units 5+6 — auto-gc consumer wiring + spec amendment. src/cli/orchestrate/gc.ts Adds detectCheapDivergences(forgeDir, err, now) — reusable helper that builds a 'cheap' snapshot, runs planGc, and emits one stderr warning per row: `[gc] <task>: row <N> (<action>) — run \`forge orchestrate gc\` to apply`. Best-effort; any I/O failure short-circuits silently so the host verb never breaks. src/cli/orchestrate/phases.ts + status.ts Both read-band verbs invoke detectCheapDivergences() at the start of the --ready path / status assembly. Warnings to stderr only — stdout JSON contract unchanged. Read-band invariant preserved (no mutation from read verbs). spec/ORCHESTRATOR.md (2-line amendment) Line 601: clarify that read-band invocations DETECT (not "trigger mutations"); explicit `gc` is the only mutator. Line 640: replace "lightweight reconcile runs on every …" with "lightweight reconcile DETECTS the cheap rows …; read-band verbs NEVER mutate." Stderr warning format documented inline. Codex 2nd-pass #3 (spec amendment scope) + #4 (read-band mutation pattern break) resolved per FORGE-22 plan rev-3. Tests: test/cli/orchestrate/phases-autogc.test.ts (8 tests): unit tests of detectCheapDivergences (row 14, row 2, clean, no-mutation, best-effort I/O, expensive-row exclusion) + wiring smoke tests confirming phases.ts and status.ts both import the helper. Full suite: 1353 pass / 0 fail / 19 skipped (+8 from previous commit). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…on safety [FORGE-22] Codex 3rd-pass review applied (diff against rev-3 implementation, 7 findings — 2 BLOCK, 4 IMPROVEMENT/CONSIDER fixed, 1 doc clarification). BLOCK 1 (confidence 10) — `adminReleaseLeaseByIdentity` could unlink a freshly-renewed lease. heartbeat() preserves (claim_id, generation, owner_run_id) and only advances expires_at + last_heartbeat_at, so a heartbeat firing between snapshot and unlink slipped through the 3-field identity check. Fix: identity now requires a fourth field `expectedExpiresAt` (caller captures at snapshot time, executor confirms). Defense-in-depth: verify-before-unlink re-reads the lease one final time and re-confirms all four fields match. Row 14 additionally re-reads state.json immediately before unlink and re-confirms terminal state — complete.ts can transition shipped → running on changes_requested without a legal-transition gate, so the stale terminal check was racey. src/orchestrator/leases.ts: AdminReleaseByIdentityOptions extended; step 4 (verify lease) + step 4b (re-verify state) added between identity check and unlink. src/orchestrator/gc.ts: GcPlanRow.release_lease_admin payload schema extended; row 13 + row 14 planners capture lease.expires_at. BLOCK 2 (confidence 9) — Row 13 tie sort could target the canonical lease. readdirSync order is FS-dependent; equal-generation leases sorted by generation alone could put the canonical file in the to-release set. Fix: tie-breaker prefers isCanonical=true. Path order as final fallback for deterministic behavior under all-corruption scenarios. CONSIDER 3 (confidence 9) — schema-iff refinement could break adopter state.json files with state=failed and no failure_reason. Fix: loosen to asymmetric invariant — failure_reason IMPLIES state === 'failed' (not iff). Adopter state.json with state=failed and no reason continues to parse; readers default to treating absent reason as 'fatal' classification. Migration-safe. CONSIDER 4 (confidence 9) — OrchestrateGcResult couldn't distinguish "no divergence detected" from "all detected rows deferred to follow-up." Fix: add `reconcilerDeferred?: readonly GcPlanRow[]` field. Apply mode populates both `reconcilerRows` (applied) and `reconcilerDeferred` (deferred — stderr-warned but not mutated). Programmatic callers can now tell the difference. IMPROVEMENT 5 (confidence 8) — detectCheapDivergences swallowed snapshot-build failures silently. Fix: catch path now writes "[gc] auto-detect failed (continuing): <reason>" to stderr before returning. Operator sees corrupted .forge/orchestrator trees instead of mysterious empty output. CONSIDER 6 (confidence 8) — row-2 tracker-side variant gap not explicitly noted in code. Fix: detectRow2 doc comment now explicitly says the tracker-side sub-case is DEFERRED (the CLI snapshot passes empty trackerIssues). Non-issues 7 & 8 (confidence 8 each) — discriminated-union exhaustiveness intact; --dry-run does not mutate either phase. Confirmed by Codex. Tests: +1 in leases.test.ts: heartbeat-race scenario — heartbeat fires between snapshot and admin-release; identity mismatch on expires_at refuses the unlink. Updated task-state schema tests: state='failed' without failure_reason now ACCEPTED (asymmetric invariant); failure_reason without state='failed' still rejected. All 9 existing admin-release call sites updated to pass expectedExpiresAt. Full suite: 1355 pass / 0 fail / 19 skipped (+2 from previous commit). Typecheck clean. Build OK. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 20, 2026
firatcand
added a commit
that referenced
this pull request
May 20, 2026
Brings in FORGE-22 (gc reconciler + retry policy, PR #203) and FORGE-89 (/codex → /second-opinion rename, PR #205). Resolved conflicts in spec/CONTEXT.md + spec/PRD.md by combining our structural admonitions (deferred-to-v0.5 boxes, Decision 10/11 strikethrough+redirect, settings.yaml decisions-block annotation) with their /codex → /second-opinion renames in the preserved v0.5 prose. No semantic loss either side; both hygiene threads compose cleanly.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Ships the gc reconciler + retry policy from
spec/ORCHESTRATOR.md§"gc reconciliation rules". Pure planner (src/orchestrator/gc.ts) drives 14 row detectors; CLI shim (src/cli/orchestrate/gc.ts) builds the snapshot, formats--dry-run, and executes the plan. Read-band verbs (phases --ready,status) now emit detect-and-warn auto-gc on stderr without mutating.src/orchestrator/retry.ts(pure): Symphony backoffmin(1000 * 2^(attempt-1), retry_backoff_ms_max),retry_attemptscap,failure_reasondiscriminator onTaskStateRecord(decision_key_budget|retries_exhausted|fatal).mark_terminal | mark_abandoned | mark_unclaimed | archive_question | reverify_verdict | release_lease_admin | prune_branch | report_orphan). Cheap-rows set frozen as{2, 5, 8, 11, 12, 13, 14}.adminReleaseLeaseByIdentityinsrc/orchestrator/leases.ts. Four-field identity check (claim_id,generation,owner_run_id,expires_at) + verify-before-unlink + row-14 terminal-state re-check immediately before unlink. Closes the heartbeat-renewal TOCTOU.detectCheapDivergences()invoked fromphases --readyandstatus. Detect-only; warnings to stderr in form[gc] <task>: row <N> (<action>) — run \forge orchestrate gc` to apply`. Spec §640 amended to document this.Why
Phase 2 / P0. Unblocks Phase 3 (FORGE-25 doctor checks, FORGE-26 perf-test harness, FORGE-27 cross-host CI matrix) which all depend on a deterministic gc primitive. Also unblocks FORGE-116 (
/wrap-upskill) which needs gc as its end-of-task housekeeping engine.Plan went through two Codex reviews (8 + 7 findings, all addressed); implementation went through one more Codex pass (7 findings — 2 BLOCK on TOCTOU, all fixed in
164800b).Test plan
npm run typecheck— cleannpm test— 1355 pass / 0 fail / 19 skipped (+78 from base 1277)npm run build— ESM 369.93 kB / CJS 403.04 kBgitleaks detect— no leaksheartbeat()between snapshot and admin-release; identity mismatch correctly refuses the unlinkManual smoke:
node dist/bin/forge.cjs orchestrate gc --dry-runon an empty.forge/orchestrator→ "no divergences found";orchestrate gc --helpsynopsis reflects new scope.Deferred to follow-up tickets
src/trackers/types.ts:Issuewith claim metadata → unblocks gc row 4 auto-recovery and full row-2 tracker-side variantDECISION_KEY_EXHAUSTEDfrom the question-channelmax_attemptssite (ties to FORGE-65) → activates the consumer wiring shipped herelast_failed_attoTaskStateRecord+ writer incomplete.ts→ activates retry-eligibility filter inphases --readymark_terminal/mark_abandoned/mark_unclaimed/reverify_verdict/prune_branch(currently emitreconcilerDeferredwarnings)Closes FORGE-22.
🤖 Generated with Claude Code