RLCR: 10 methodology improvements from a wall-clock-gated session

 ## Summary

  Ran a 9-round RLCR session against a plan whose central deliverable required multi-hour
   external compute. The loop terminated by user intervention via the stagnation circuit
  breaker. The mechanical iteration was healthy; the structural weakness was a mismatch
  between wall-clock-gated plan shape and per-round closure semantics. Below are 10
  concrete methodology suggestions derived from a sanitized post-session analysis.

  ## Key observation

  The reviewer's strict "did this round close a new mainline task?" metric returned
  negative for every round between launch and completion of the external compute job,
  regardless of what the executor actually did. This pushed the executor toward smaller
  and smaller cleanup deliverables to "earn" forward motion, sometimes introducing
  collateral bugs the next round had to revert.

  ## Improvement suggestions (pattern → fix)

  ### 1. Wall-clock-gated plans collide with per-round closure semantics
  Add a plan-level annotation for tasks gated on external time (compute jobs, human
  approvals, scheduled events, etc.). When set and unresolved, evaluate the round against
   an alternate criterion ("preserve readiness + capture available evidence") rather than
   strict closure. Reviewer-controlled, not executor-controlled.

  ### 2. Verdicts conflate side-issue progress with true stagnation
  Split the verdict into two orthogonal axes: mainline-closure (yes/no) and round-quality
   (advanced / repaired / cleanup-only / regressed / circular). The stagnation circuit
  breaker fires on the round-quality axis, not on closure.
  should auto-escalate to the user before the standard circuit breaker.

  ### 4. Executor over-claims that reviewer must correct
  Require summaries to include a machine-checkable claims block (artifact X exists at
  path Y, commit Z touches files A/B/C, tracker row N reads ...). A pre-review hook
  mechanically verifies those specific claims and rejects the summary back to the
  executor before invoking the reviewer.

  ### 5. "Cleanup" edits that change behavior get classified as cleanup
  When a round describes itself as cleanup-only or behavior-preserving, require a
  diff-based check that flags schema renames, interface changes, or public-symbol
  renames. Force explicit reclassification as a behavior change rather than letting it
  pass as cleanup.

  ### 6. Paperwork volume scales with rounds, not with progress
  When a round inherits a stable wall-clock blocker from the prior round, allow a
  delta-only paperwork mode: short summary referencing the previous full summary by hash,
   plus only what changed. Full re-baseline when the blocker clears, the scope changes
  materially, or every K rounds.

  ### 7. No first-class "plan-unattainable" terminal state
  Add an explicit terminal state distinct from STALLED / REGRESSED for "plan success
  criteria are now permanently unreachable on this attempt" (artifact destroyed,
  permission revoked, deadline passed, etc.). A round can request this state with
  evidence; the reviewer confirms or rejects. Confirmed plan-unattainable produces clean
  documented closure rather than indefinite stall.

  ### 8. Self-referential commit-hash placeholders
  Pre-commit / round-finalization check that scans round artifacts for self-referential
  placeholders matching common patterns (literal "this commit", "to be filled",
  angle-bracket placeholders). Reject the round or mark artifacts as pending post-commit
  fixup so the issue is tracked instead of silently shipped.

  ### 9. Tracker as ground truth worked; summaries did not
  Codify the asymmetry observed in this session: the tracker (especially with reviewer
  write access) is the canonical state of record; summaries are narrative commentary that
   must reference but cannot contradict the tracker. Auto-flag any summary claim that
  conflicts with the tracker.

  ### 10. "Produce nothing this round" should be acceptable when blocked
  Make a no-op round explicitly acceptable when a wall-clock-gated task is the only
  mainline work. Pair with a longer scheduling delay before the next round (rather than
  firing immediately and demanding new output). This reduces the busywork attractor and
  aligns loop cadence with plan cadence.

  ## What worked well

  - Goal-tracker discipline (especially with reviewer write access) stayed coherent
  across all rounds and was the most reliable artifact.
  - The reviewer caught at least three substantive bugs the executor missed: a schema
  rename masquerading as cleanup, a factual transcription error in a results table, and
  an over-claim of artifact landing. Direct evidence of verification value beyond
  self-review.
  - Communication was detailed and traceable to file lines; the cost was volume.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RLCR: 10 methodology improvements from a wall-clock-gated session #136

Summary

Key observation

Improvement suggestions (pattern → fix)

1. Wall-clock-gated plans collide with per-round closure semantics

2. Verdicts conflate side-issue progress with true stagnation

4. Executor over-claims that reviewer must correct

5. "Cleanup" edits that change behavior get classified as cleanup

6. Paperwork volume scales with rounds, not with progress

7. No first-class "plan-unattainable" terminal state

8. Self-referential commit-hash placeholders

9. Tracker as ground truth worked; summaries did not

10. "Produce nothing this round" should be acceptable when blocked

What worked well

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

RLCR: 10 methodology improvements from a wall-clock-gated session #136

Description

Summary

Key observation

Improvement suggestions (pattern → fix)

1. Wall-clock-gated plans collide with per-round closure semantics

2. Verdicts conflate side-issue progress with true stagnation

4. Executor over-claims that reviewer must correct

5. "Cleanup" edits that change behavior get classified as cleanup

6. Paperwork volume scales with rounds, not with progress

7. No first-class "plan-unattainable" terminal state

8. Self-referential commit-hash placeholders

9. Tracker as ground truth worked; summaries did not

10. "Produce nothing this round" should be acceptable when blocked

What worked well

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions