
RLCR loop: add diminishing-returns trip, severity ladder, and bookkeeping-postcondition guards #198

@chenzongyao200127

Context

Observed in a 24-round RLCR loop terminated by the stagnation
circuit breaker. Findings are derived from a sanitised methodology
analysis; no project identifiers are included.

Methodology improvement suggestions

1. Add an "environment-blocked" terminal state

When the reviewer names the same operationally gated dependency as
the only open work in two consecutive rounds, auto-transition the
loop to an operator-handoff state. The current circuit breaker fires
only on a stagnation pattern; it should also fire when the remaining
work is outside the loop's reach, which is detectable from the
reviewer naming the same environment dependency across consecutive
rounds.
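
A minimal sketch of that detection rule, assuming each review can be reduced to a list of open items plus the subset that is environment-gated. `ReviewRound`, `sole_env_blocker` and `is_environment_blocked` are hypothetical names for illustration, not part of the existing loop.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ReviewRound:
    # Hypothetical record: the open work items the reviewer listed, and the
    # subset of them that is gated on the environment (outside the loop's reach).
    open_items: list[str]
    env_gated: set[str] = field(default_factory=set)


def sole_env_blocker(review: ReviewRound) -> Optional[str]:
    """Return the dependency name if it is the only open work, else None."""
    if len(review.open_items) == 1 and review.open_items[0] in review.env_gated:
        return review.open_items[0]
    return None


def is_environment_blocked(history: list[ReviewRound]) -> bool:
    """True when the last two rounds name the same sole env-gated blocker."""
    if len(history) < 2:
        return False
    prev = sole_env_blocker(history[-2])
    last = sole_env_blocker(history[-1])
    return last is not None and prev == last
```

When `is_environment_blocked` returns true, the loop would move to operator handoff instead of starting another round.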

2. Severity ladder for reviewer verdicts

Reviewers reliably promoted wording mismatches to "blocking" side
issues. Introduce three explicit severities: (a) behaviour-incorrect,
(b) interface-contract drift, (c) wording/cosmetic. Only (a) should
be allowed to block a round; (c) should be batched into periodic
cleanup passes.
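
One way the ladder could be encoded, as a sketch only. The enum member names and the `triage` helper are hypothetical; the suggestion above does not say how severity (b) is handled, so this sketch simply treats it as non-blocking and leaves it out of the cleanup batch.

```python
from enum import Enum


class Severity(Enum):
    BEHAVIOUR_INCORRECT = "a"
    INTERFACE_CONTRACT_DRIFT = "b"
    WORDING_COSMETIC = "c"


def triage(findings: list[tuple[Severity, str]]) -> tuple[list[str], list[str]]:
    """Split findings into (blocking, deferred-to-cleanup) notes.

    Only behaviour-incorrect findings block the round; wording/cosmetic
    findings are queued for a periodic cleanup pass. Interface-contract
    drift is neither blocking nor batched here (an assumption).
    """
    blocking = [note for sev, note in findings if sev is Severity.BEHAVIOUR_INCORRECT]
    deferred = [note for sev, note in findings if sev is Severity.WORDING_COSMETIC]
    return blocking, deferred


def round_is_blocked(findings: list[tuple[Severity, str]]) -> bool:
    blocking, _ = triage(findings)
    return bool(blocking)
```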

3. Mirror-surface enumeration in the contract template

When the same defect class exists in N parallel files, the contract
should require enumerating all mirrors up front and fixing them in
one commit. A search-budget heuristic in the contract template can
enforce this: before editing any wording, run a project-wide search
for sibling phrasings and include all hits.
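
The up-front search could be as simple as the following sketch. `find_mirrors` is a hypothetical helper; it only finds case-insensitive, whitespace-insensitive repeats of one phrase within a line, so genuinely different sibling phrasings would still need their own queries. The default file suffixes are also an assumption.

```python
import re
from pathlib import Path


def find_mirrors(root: str, phrase: str,
                 suffixes: tuple[str, ...] = (".md", ".txt")) -> list[tuple[str, int]]:
    """Return (relative path, line number) for every line containing the phrase."""
    # Join the phrase's words with \s+ so runs of spaces or tabs do not hide a hit.
    pattern = re.compile(r"\s+".join(re.escape(w) for w in phrase.split()),
                         re.IGNORECASE)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not (path.is_file() and path.suffix in suffixes):
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path.relative_to(root).as_posix(), lineno))
    return hits
```

The contract would then list every returned location as in scope for the same commit.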

4. Postconditions must target shipping artefacts, not bookkeeping files

The meta-housekeeping cascade in the final two rounds was caused by
a brittle exact-grep postcondition on a summary file whose lines the
writer was free to re-wrap. Three rules:

  • Bookkeeping artefacts get no grep-style postconditions.
  • Prose postconditions must use whitespace-tolerant matching.
  • Two consecutive rounds whose mainline objective is repairing the
    loop's own bookkeeping should auto-trigger the stagnation breaker.
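
As an illustration of the second rule, a whitespace-tolerant check can be built by joining the expected words with `\s+`, so re-wrapped or re-indented prose still matches. `prose_contains` is a hypothetical helper name.

```python
import re


def prose_contains(text: str, expected: str) -> bool:
    """Check that `expected` occurs in `text`, ignoring how whitespace is laid out."""
    words = expected.split()
    if not words:
        return True
    # Any run of spaces, tabs or newlines between words is accepted.
    pattern = r"\s+".join(re.escape(w) for w in words)
    return re.search(pattern, text) is not None


wrapped = "All acceptance\n    criteria   were verified."
print("acceptance criteria were verified." in wrapped)           # exact match fails
print(prose_contains(wrapped, "acceptance criteria were verified."))  # tolerant match succeeds
```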

5. Audit-only rounds require explicit authorisation

Regression scrubs should be inlined into the validation step of the
next substantive round rather than consuming a full contract /
implement / review cycle of their own. A round that does nothing but
audit should run only when explicitly authorised.

6. Track acceptance-criteria status as a time series

The headline AC metric was constant for 15 consecutive rounds.
Escalate when the metric is unchanged for K rounds (K=3), and report
this as a separate signal from "no defects found".
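
A sketch of the time-series check, assuming the metric is recorded as one integer per round. "Unchanged for K rounds" is read here as K consecutive rounds that leave the previous reading untouched, i.e. the last K+1 readings are equal; `metric_stalled` and `round_signals` are hypothetical names.

```python
def metric_stalled(series: list[int], k: int = 3) -> bool:
    """True when the last k rounds left the metric exactly where it was."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if len(series) < k + 1:
        return False  # not enough history to judge
    window = series[-(k + 1):]
    return all(value == window[0] for value in window)


def round_signals(ac_series: list[int], defects_found: int, k: int = 3) -> dict[str, bool]:
    """Keep 'metric stalled' and 'no defects found' as two separate signals."""
    return {
        "ac_stalled": metric_stalled(ac_series, k),
        "no_defects_found": defects_found == 0,
    }
```

A round can report no defects while the metric is still stalled, which is the case that should escalate.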

7. Contract complexity should scale to edit size

Late rounds had 10–11 success criteria for two-line edits, most
vacuously satisfied. A vacuous-criteria ratio above 50% in a freshly
generated contract is itself a busywork signal.
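
The ratio itself is simple to compute once each criterion has been flagged. How a criterion is judged vacuous is left open above; this sketch assumes a flag per criterion (for example, "already satisfied before any edit") and uses hypothetical function names.

```python
def vacuous_ratio(vacuous_flags: list[bool]) -> float:
    """Fraction of success criteria flagged as vacuously satisfied."""
    if not vacuous_flags:
        return 0.0
    return sum(vacuous_flags) / len(vacuous_flags)


def is_busywork_contract(vacuous_flags: list[bool], threshold: float = 0.5) -> bool:
    """Flag a freshly generated contract whose vacuous ratio is above the threshold."""
    return vacuous_ratio(vacuous_flags) > threshold
```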

8. Split review verdicts on two axes

Distinguish "contract met this round" from "global progress
changed". Fourteen consecutive contract-met verdicts co-existed
with zero global movement — the vocabulary hid the divergence.
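
A two-axis verdict might look like the sketch below. `Verdict` and `divergence_streak` are hypothetical names; the streak counts trailing rounds where the contract was met but nothing moved globally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Verdict:
    contract_met: bool      # did this round satisfy its own contract?
    global_progress: bool   # did the overall acceptance status change?


def divergence_streak(verdicts: list[Verdict]) -> int:
    """Count the most recent consecutive rounds with contract met but no global progress."""
    streak = 0
    for verdict in reversed(verdicts):
        if verdict.contract_met and not verdict.global_progress:
            streak += 1
        else:
            break
    return streak
```

With both axes recorded, a long streak becomes visible in the verdicts themselves rather than only in a retrospective.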

9. Cap summary length proportional to edit size

Late summaries reached 400–500 lines for two-line changes by
re-quoting prior context. Move recurring boilerplate to an evidence
ledger that summaries cite by reference.
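
A possible shape for such a cap, as a sketch. The multiplier and the floor are illustrative assumptions, not values proposed in this issue, and the floor means the cap is only proportional above very small edits.

```python
def summary_line_cap(edit_lines: int, lines_per_edit_line: int = 10, floor: int = 20) -> int:
    """Maximum summary length for an edit of the given size (hypothetical constants)."""
    return max(floor, edit_lines * lines_per_edit_line)


def summary_within_cap(summary_lines: int, edit_lines: int) -> bool:
    return summary_lines <= summary_line_cap(edit_lines)
```

Under these assumed constants, a 450-line summary for a two-line edit would be rejected.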

10. Cross-cutting "diminishing returns" trip

A single piece of machinery should monitor three signals: (a) the AC
metric is unchanged for K rounds, (b) edit volume is trending to
zero, and (c) the reviewer's only remaining gap is the same
environment-bound task. When all three are true, auto-transition to
operator handoff. In this session those signals were all true 12
rounds before the stagnation breaker finally fired.
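
The combined trip could be sketched as follows. `RoundStats`, `diminishing_returns` and the `small_edit` threshold are hypothetical; "trending to zero" is approximated here as a non-increasing edit volume over the last K rounds that ends at or below `small_edit`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RoundStats:
    ac_met: int                   # headline acceptance-criteria metric
    lines_changed: int            # edit volume for the round
    sole_env_gap: Optional[str]   # set when the only remaining gap is environment-bound


def diminishing_returns(history: list[RoundStats], k: int = 3, small_edit: int = 5) -> bool:
    """True when all three signals hold over the last k rounds."""
    if len(history) < k + 1:
        return False
    baseline, recent = history[-(k + 1)], history[-k:]
    # (a) AC metric unchanged for k rounds
    ac_flat = all(r.ac_met == baseline.ac_met for r in recent)
    # (b) edit volume non-increasing and ending near zero
    volumes = [r.lines_changed for r in recent]
    edits_fading = (all(b <= a for a, b in zip(volumes, volumes[1:]))
                    and volumes[-1] <= small_edit)
    # (c) the same environment-bound task is the only remaining gap
    gap = recent[0].sole_env_gap
    same_gap = gap is not None and all(r.sole_env_gap == gap for r in recent)
    return ac_flat and edits_fading and same_gap
```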
