auto_gate: max_iterations is the cap, not max_iterations+1 by gabehamilton · Pull Request #3 · difflabai/forge

gabehamilton · 2026-05-21T22:19:00Z

Summary

`AutoGateHandler::execute` used `force_pass = next > max_iterations` so `max_iterations=N` permitted N+1 gate visits — N findings → refine cycles plus one final force-pass evaluation. Change `>` to `>=` so the Nth visit force-passes when the policy is still unsatisfied; total gate visits is now bounded by `max_iterations`.

Concrete impact

The `findings_empty` policy reads `findings_count` from the runtime context, but no agent-driven box stage currently writes that key (there's no stdout-parse or output-schema bridge yet — see doc comment update). So for agent-driven loops the policy effectively reduces to "loop until max_iterations." With each refine subprocess taking 3–5 minutes in real bootstrap runs, the extra cycle was burning 5+ minutes per gate.

Repro from a real ai-strategy-builder bootstrap run on `pipelines/bootstrap.dot` (spec_review has `max_iterations=3`):

{"...,"node_id":"spec_review","notes":"auto gate: findings detected (iteration 1/3)"}
{"...,"node_id":"refine_specs",...}
{"...,"node_id":"spec_review","notes":"auto gate: findings detected (iteration 2/3)"}
{"...,"node_id":"refine_specs",...}
{"...,"node_id":"spec_review","notes":"auto gate: findings detected (iteration 3/3)"}
{"...,"node_id":"refine_specs",...}   ← 4th cycle still running; 1-hour timeout fired before the 5th gate evaluation

After this patch, visit #3 force-passes, the pipeline reaches `scaffold_evals`, and the user gets through bootstrap inside the timeout budget.

Tests

Adds `auto_gate_at_nth_visit_with_findings_force_passes` covering the `prior=2, max=3` edge that previously looped.
All 12 existing `auto_gate` tests still pass — the boundary they exercised (`prior=N, max=N`) produces force-pass under both `>` and `>=` semantics, so no test was load-bearing on the wrong behavior.

Documentation

Updates the `AutoGateHandler` doc comment to spell out the new semantic (`max_iterations=N` means "at most N evaluations") and to call out the structural limitation around `findings_count` not being written by agent stages today.

Depends on

#2 (claude-code adapter: --dangerously-skip-permissions opt-in) — this PR is branched off `claude-code-skip-permissions` because that branch was needed to actually reach the `spec_review` loop and observe the bug in a real run. Merge order: #2 first, then this.

🤖 Generated with Claude Code

Was `next > max_iterations`. With max_iterations=3 a fresh gate would visit 4 times — 3 [F] Findings cycles plus a 4th force-pass evaluation, which was off-by-one for what users intuitively expect from "max_iterations". Concrete impact: a `findings_empty` policy that never sees a 0 count (no agent stage writes findings_count to context yet) would burn 3 full refine subprocesses before finally force-passing. With each refine taking ~5 minutes in real bootstrap runs, that's a 15-minute floor for every auto-gate the pipeline traverses. Change `>` to `>=` so the Nth visit (when max_iterations=N) force-passes if the policy still hasn't been satisfied. Adds regression test `auto_gate_at_nth_visit_with_findings_force_passes` that asserts visit 3 of a max=3 gate force-passes rather than looping. All previous tests still pass (the boundary they exercised — prior=N, max=N — still produces force-pass under both semantics). Documents the structural limitation in the handler doc comment: agent box stages have no current path to write `findings_count` to the runtime context, so for those gates the policy effectively reduces to "loop until max_iterations". Fixing that requires a separate change (parse agent stdout, or surface output_schema into context). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

gemini-code-assist

Code Review

This pull request updates the AutoGateHandler to fix an off-by-one error in the iteration limit logic, ensuring that the gate force-passes exactly on the Nth evaluation by changing the comparison from > to >=. The documentation has been updated to clarify this behavior and note current limitations with agent-driven stages, and a regression test has been added to verify the fix. I have no feedback to provide.

gemini-code-assist Bot reviewed May 21, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

auto_gate: max_iterations is the cap, not max_iterations+1#3

auto_gate: max_iterations is the cap, not max_iterations+1#3
gabehamilton wants to merge 1 commit into
claude-code-skip-permissionsfrom
auto-gate-max-iterations-off-by-one

gabehamilton commented May 21, 2026

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

gabehamilton commented May 21, 2026

Summary

Concrete impact

Tests

Documentation

Depends on

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant