
SPIR safety_ceiling=1 force-advances high-risk phases past real defects — proposal to make it risk-calibrated #960

@amrmelsayed

Problem

In SPIR, safety_ceiling=1 is the current default cap on iterations of each phase's implement-consult cycle. With that default, any iter-1 REQUEST_CHANGES verdict triggers an automatic force-advance: porch marks the phase complete and moves to the next phase.

For high-risk architectural / concurrency / correctness phases — exactly the phases where rigorous review catches the most — this default produces a structural problem:

  • The phase's first consult turns up a real defect (very common when codex is in the panel; we've seen 8 consecutive sessions where codex caught real defects the other reviewers missed).
  • Architect rules the fix in-scope; builder lands it.
  • Builder needs to re-run consult to verify the fix, but porch has already force-advanced and no longer accepts a phase_N consult.
  • The remaining iterations (often 3–5 more rounds before codex returns APPROVE) happen off-protocol via direct architect rulings, builder commits, and standalone rebuttal docs.
  • Porch's state machine and the audit trail diverge: porch thinks the phase was REJECTED-then-advanced; reality is that the phase was iteratively closed via architect-directed off-protocol consult.
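To make the failure mode concrete, here is a minimal sketch of the behavior described above. It is not porch's actual code: PhaseState, record_verdict, and the status strings are hypothetical names, and the rule "force-advance once the iteration count reaches safety_ceiling" is my reading of the current behavior.

```python
from dataclasses import dataclass

APPROVE = "APPROVE"
REQUEST_CHANGES = "REQUEST_CHANGES"


@dataclass
class PhaseState:
    # Hypothetical stand-in for porch's per-phase record.
    phase_id: str
    iteration: int = 1
    status: str = "in_progress"  # in_progress | approved | force_advanced


def record_verdict(state: PhaseState, verdict: str, safety_ceiling: int = 1) -> PhaseState:
    """Apply one consult verdict to a phase (assumed semantics, see lead-in)."""
    if verdict not in (APPROVE, REQUEST_CHANGES):
        raise ValueError(f"unknown verdict: {verdict}")
    if state.status != "in_progress":
        # This is the step that blocks the builder's re-run after a force-advance.
        raise ValueError(f"{state.phase_id} is closed; consult no longer accepted")
    if verdict == APPROVE:
        state.status = "approved"
    elif state.iteration >= safety_ceiling:
        # Ceiling reached on a REQUEST_CHANGES verdict: the phase is closed anyway.
        state.status = "force_advanced"
    else:
        # Room left under the ceiling: stay in the phase for another round.
        state.iteration += 1
    return state


# Default ceiling: the first REQUEST_CHANGES closes the phase (status "force_advanced").
default_run = record_verdict(PhaseState("phase_1"), REQUEST_CHANGES)

# Raised ceiling: the same verdict keeps the phase open at iteration 2.
raised_run = record_verdict(PhaseState("phase_1"), REQUEST_CHANGES, safety_ceiling=5)
```

Under these assumptions, any later verdict recorded against default_run raises, which corresponds to the point where the verification consult has to move off-protocol.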

Concrete recent case

SPIR #1190 Phase 1 (cluesmith/shannon): iter-1 closed with REQUEST_CHANGES and porch force-advanced. Architect-directed iter-2 through iter-6 then happened off-protocol:

  • iter-3 KEY-2: persistence-idempotence — real codex catch, fixed
  • iter-4 KEY-1: broadcast/persistence id desync — real codex catch, fixed
  • iter-5 KEY-1: contract test substring-matching gap — real codex catch, fixed
  • iter-6: pending verification

All 6 iters' work is captured in git + standalone rebuttal docs, but porch's state machine never tracked iter-2 through iter-6 as part of phase_1.

Proposed fixes (two options to surface)

Option A — Per-phase safety_ceiling via plan front-matter

High-risk phases declare a non-default ceiling in the plan:

plan_phases:
  - id: phase_1_user_frame_id
    title: ...
    safety_ceiling: 5    # explicit, plan-author decides risk
  - id: phase_2_low_risk
    title: ...
    # implicit safety_ceiling = 1

Pro: Risk-calibrated; default stays conservative for routine phases.
Con: Plan-authors need to know which phases warrant the lift.
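A rough sketch of how the per-phase value could be resolved once the front-matter is parsed. This assumes each plan_phases entry arrives as a plain dict; effective_ceiling and DEFAULT_SAFETY_CEILING are hypothetical names, not existing porch APIs.

```python
DEFAULT_SAFETY_CEILING = 1  # assumed to mirror today's default


def effective_ceiling(phase: dict, default: int = DEFAULT_SAFETY_CEILING) -> int:
    """Return the phase's declared safety_ceiling, or the default if none is declared."""
    value = phase.get("safety_ceiling", default)
    # Reject booleans, non-integers, and values below 1 so a typo in the plan
    # fails loudly instead of silently changing review depth.
    if isinstance(value, bool) or not isinstance(value, int) or value < 1:
        raise ValueError(f"invalid safety_ceiling for {phase.get('id')}: {value!r}")
    return value


# Parsed form of the plan_phases example above (titles omitted).
plan_phases = [
    {"id": "phase_1_user_frame_id", "safety_ceiling": 5},
    {"id": "phase_2_low_risk"},
]

ceilings = {p["id"]: effective_ceiling(p) for p in plan_phases}
# phase_1_user_frame_id resolves to 5; phase_2_low_risk falls back to 1.
```

Validating at plan-load time rather than at consult time would surface a bad value before any phase starts, but that is a design choice for discussion, not a requirement of Option A.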

Option B — Document architect-directed off-protocol iteration as legitimate

Add a protocol section explicitly authorizing the architect to run off-protocol iter-N consults when porch has force-advanced after a real-defect catch at iter-1. The pattern (rebuttal docs + git commits + afx messages = canonical trail) becomes blessed instead of unofficial.

Pro: Zero porch-state-machine changes; formalizes existing practice.
Con: Two parallel audit trails (porch + rebuttal-doc series); future readers have to know which is canonical for each phase.

Recommendation

Option A is the cleaner long-term fix. Option B is the pragmatic interim option if A is too disruptive to ship soon. Both are worth considering; this issue is to open the discussion, not pre-commit to either.

Out of scope

  • Changes to PIR's max_iterations=1 design — PIR is intentionally single-pass and the trade-off is different (PRs through PIR are smaller / more bounded than SPIR phases). This issue is SPIR-specific.

Related

  • cluesmith/shannon: SPIR #1190 (the concrete case study cited above)
  • The "8 consecutive codex-catches" pattern documented across this work session, which supports the claim that high-risk phases reliably benefit from multi-iter consult cycles
