fix(cc-254): harness assert_* no auto-pass + 3 process artifacts from PR-B.2 retro #149

Merged
screenleon merged 3 commits into main from cc-249-prb1-amend-harness-no-implicit-pass
May 23, 2026

@screenleon
Owner

Summary

Two things in one PR, both rooted in the CC-249 PR-B.2 dispatch incident (2026-05-24):

  1. Fix (CC-254) — amend PR-B.1 (feat(cc-249): PR-B.1 — add 4 unified assert_* helpers to test-harness #148): remove auto-pass from the 4 harness assert_* helpers + update self-tests + document the new contract
  2. Process artifacts — capture the lesson so future spikes / additive PRs / Explore surveys don't repeat the gap

What broke (and why)

PR-B.1 helpers auto-called pass "$name" on success. The PR-B.2 dispatch surfaced a hidden contract: all 13 consumer test scripts already follow assert_X "$name" ...; pass "$name" (the assert is only a check; the consumer calls pass). A pure rename would therefore have double-counted PASS. Codex defensively shadowed the harness by re-defining the 4 helpers in 9 of the 13 consumer files, explicitly violating spike Q3 ("no shim"). The dispatch hit timeout 124 with 9 apply_patch failures in 4 complex files.

Root cause = PR-B.1 helper API, not codex's behavior or PR-B.2's brief. Spike never inspected consumer call-sites, only declarations.
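
The double-count can be reproduced with a minimal sketch. Everything here is an assumption made for illustration: the `pass`/`fail` counters are stand-ins for whatever scripts/lib/test-harness.sh actually does, and `assert_file_contains_autopass` is a hypothetical name for a PR-B.1-style helper.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the harness counters; the real
# scripts/lib/test-harness.sh may track results differently.
PASS_COUNT=0
FAIL_COUNT=0
pass() { PASS_COUNT=$((PASS_COUNT + 1)); echo "PASS: $1"; }
fail() { FAIL_COUNT=$((FAIL_COUNT + 1)); echo "FAIL: $1"; }

# PR-B.1-style helper (illustrative): checks AND records a pass on success.
assert_file_contains_autopass() {
  local name="$1" file="$2" needle="$3"
  if grep -qF -- "$needle" "$file"; then
    pass "$name"
  else
    fail "$name"
    return 1
  fi
}

# Consumer written against the existing "assert, then pass" convention.
tmp="$(mktemp)"
echo "hello world" > "$tmp"
assert_file_contains_autopass "greeting" "$tmp" "hello"; pass "greeting"
rm -f "$tmp"

echo "PASS_COUNT=$PASS_COUNT"   # one logical test, recorded twice
```

One logical test ends up with two PASS records, which is the accounting error a pure rename would have introduced in every consumer.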

Fix (CC-254)

scripts/lib/test-harness.sh:

  • Strip pass "$name" from each of the 4 helpers (assert_exit, assert_file_contains, assert_file_matches, assert_string_contains)
  • Add header comment documenting the new contract: success returns 0 silently; consumer is responsible for && pass "$name" if PASS accounting is desired
  • Helpers still call fail on failure (unchanged)
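
A minimal sketch of what a helper under the new contract might look like. The argument order (name, file, needle) is inferred from the call shape shown in this PR; the `pass`/`fail` bodies and the `return 1` after `fail` are assumptions, not the harness's actual code.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the harness counters.
PASS_COUNT=0
FAIL_COUNT=0
pass() { PASS_COUNT=$((PASS_COUNT + 1)); echo "PASS: $1"; }
fail() { FAIL_COUNT=$((FAIL_COUNT + 1)); echo "FAIL: $1"; }

# New contract (sketch): success returns 0 silently; failure calls fail.
# Returning non-zero after fail is an assumption made so that `&& pass`
# is skipped on the failure path.
assert_file_contains() {
  local name="$1" file="$2" needle="$3"
  grep -qF -- "$needle" "$file" && return 0
  fail "$name"
  return 1
}

tmp="$(mktemp)"
echo "hello world" > "$tmp"

# The consumer opts in to PASS accounting explicitly.
assert_file_contains "greeting" "$tmp" "hello" && pass "greeting"
# On failure the helper records FAIL and the `&& pass` never runs.
assert_file_contains "missing" "$tmp" "goodbye" && pass "missing"

rm -f "$tmp"
echo "PASS_COUNT=$PASS_COUNT FAIL_COUNT=$FAIL_COUNT"
```

Keeping the helper free of side effects on success means the same function works for consumers that want PASS accounting and for those that only need the check.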

scripts/test-test-harness.sh:

  • 4 pass-path self-tests amended: explicit && pass '<name>' after each assert
  • 4 fail-path self-tests unaffected
  • 30/30 passes (no regression)
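
As an illustration of the amended self-test shape, the sketch below checks that an assert alone leaves the PASS counter untouched and that the explicit `&& pass` is what records the result. The counters, the `assert_string_contains` argument order (name, haystack, needle) and its body are assumptions; the real self-tests in scripts/test-test-harness.sh may be structured differently.

```shell
#!/usr/bin/env bash
# Hypothetical stand-ins for the harness counters.
PASS_COUNT=0
FAIL_COUNT=0
pass() { PASS_COUNT=$((PASS_COUNT + 1)); echo "PASS: $1"; }
fail() { FAIL_COUNT=$((FAIL_COUNT + 1)); echo "FAIL: $1"; }

# Assumed argument order: name, haystack, needle.
assert_string_contains() {
  local name="$1" haystack="$2" needle="$3"
  case "$haystack" in
    *"$needle"*) return 0 ;;
  esac
  fail "$name"
  return 1
}

# Pass path: the assert alone must not change the PASS counter.
before=$PASS_COUNT
assert_string_contains "silent check" "hello world" "hello"
silent_delta=$((PASS_COUNT - before))

# The explicit `&& pass` is what records the result.
assert_string_contains "explicit pass" "hello world" "hello" && pass 'explicit pass'
explicit_delta=$((PASS_COUNT - before))

# Fail path: the helper records the failure itself; `|| true` keeps the
# script going if it is run under `set -e`.
assert_string_contains "expected failure" "hello world" "goodbye" || true

echo "silent_delta=$silent_delta explicit_delta=$explicit_delta FAIL_COUNT=$FAIL_COUNT"
```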

After this lands, PR-B.2 reverts to pure rename — each assert_contains "$n" $f "$x" → assert_file_contains "$n" $f "$x"; existing pass "$n" stays untouched. No double-count, no shim.
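
A rough sketch of that pure rename applied to a single consumer snippet. The consumer content and the `sed` expression are illustrative assumptions, not the actual PR-B.2 migration; the pattern is anchored to the start of the line so that it only touches direct calls.

```shell
#!/usr/bin/env bash
# Hypothetical consumer snippet following the "assert, then pass" convention.
consumer="$(mktemp)"
cat > "$consumer" <<'EOF'
assert_contains "$n" $f "$x"
pass "$n"
EOF

# Rename only the call; the following `pass "$n"` line is left untouched.
renamed="$(sed -E 's/^([[:space:]]*)assert_contains /\1assert_file_contains /' "$consumer")"
rm -f "$consumer"

printf '%s\n' "$renamed"
```

Because the helper no longer records a pass, the renamed call and the untouched `pass "$n"` line together still produce exactly one PASS per test.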

Process artifacts

| Artifact | Where |
| --- | --- |
| Memory [[feedback_spike_pilot_required]] | ~/.claude/projects/.../memory/ (separate repo, already committed) |
| Memory [[feedback_explore_call_site_context]] | same |
| 3 new brief-authoring rules in agents/project-pm.md | this PR |
| Amendment section in docs/spikes/CC-249.md Open Risks | this PR |

The 3 brief-authoring rules added to agents/project-pm.md:

  1. Spike-pilot rule: every API-design spike MUST include ## Pilot walkthrough (1 representative consumer, verbatim before/after diff). If walkthrough can't be written cleanly → spike not yet mature.
  2. Additive-PR pilot rule: every PR shipping new API MUST migrate ≥ 1 real consumer in the SAME PR.
  3. Explore call-site-context rule: Explore symbol surveys must capture BOTH declarations AND call-sites with 2-line context.
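
One way the third rule could be approximated by hand is a `grep` survey that lists declarations and call-sites separately, with two lines of context around each call. The `survey_symbol` function, its regular expressions, and the sample files are hypothetical; this is not the Explore prompt template referenced above.

```shell
#!/usr/bin/env bash
# Hypothetical survey helper: declarations first, then call-sites with
# 2 lines of context before and after each match.
survey_symbol() {
  local symbol="$1" dir="$2"
  echo "== declarations =="
  grep -rnE "^[[:space:]]*(function[[:space:]]+)?${symbol}[[:space:]]*\(\)" "$dir"
  echo "== call-sites (2 lines of context) =="
  grep -rn -C 2 -E "^[[:space:]]*${symbol}[[:space:]]" "$dir"
}

# Tiny illustrative tree: one library file, one consumer file.
dir="$(mktemp -d)"
cat > "$dir/lib.sh" <<'EOF'
assert_contains() {
  grep -qF -- "$3" "$2"
}
EOF
cat > "$dir/consumer.sh" <<'EOF'
n="has greeting"
assert_contains "$n" "$f" "hello"
pass "$n"
EOF

out="$(survey_symbol assert_contains "$dir")"
rm -rf "$dir"
printf '%s\n' "$out"
```

The context lines are what expose the usage pattern: the `pass "$n"` directly under the call is visible in the call-site section, which a declaration-only listing would not show.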

Verification

  • bash scripts/test-test-harness.sh → 30/30 (no regression after auto-pass removal + self-test amend)
  • bash scripts/test-run-all-tests.sh → 13/13 integration
  • bash pm/scripts/validate.sh BACKLOG.md → parity 30 (CC-228 baseline)
  • /pr-gate express → Final: GO, critic + qa-tester both [pass], zero findings

Out of scope

  • PR-B.2 re-dispatch — separate work after this merges (pure rename now possible)
  • CC-244 typed spike_v1 schema adding pilot_walkthrough: field — long-term automation; ticket exists
  • Tail commit flips CC-254 row pr:TBD → pr:#<this> after gh pr create

🤖 Generated with Claude Code

screenleon and others added 3 commits May 24, 2026 01:25
PR-B.1 (#148) shipped assert_* helpers that auto-called `pass "$name"`
on success. Spike Q1-Q5 assumed pure-rename consumer migration would
work. PR-B.2 (consumer migration) dispatch 2026-05-24 surfaced a
design conflict: 13 consumer test scripts ALL follow the
`assert_X "$name" ...; pass "$name"` pattern — `assert_*` is a check;
the consumer separately calls `pass` to increment the counter.

Pure rename would double-count PASS (harness implicit pass + consumer
explicit pass). Codex defensively shadowed the harness by re-defining
the 4 helpers locally in 9 of 13 consumer files — Q3 explicitly
rejected that "shim". Dispatch hit timeout 124 with 9 apply_patch
failures across 4 complex files.

Root cause = PR-B.1 helper API, not codex's interpretation or PR-B.2's
brief.

Fix:
- Strip `pass "$name"` from each of the 4 helpers (`assert_exit`,
  `assert_file_contains`, `assert_file_matches`,
  `assert_string_contains`). Helpers still call `fail` on failure.
- Header comment in `scripts/lib/test-harness.sh` documents the new
  contract: success returns 0 without side-effects; consumer
  responsible for `&& pass "$name"` if accounting is desired.
- 4 pass-path self-tests in `scripts/test-test-harness.sh` amended to
  call `pass` explicitly after the assert. Fail-path self-tests
  unaffected.

After this lands, PR-B.2 reverts to pure rename — each
`assert_contains "$n" $f "$x"` becomes `assert_file_contains "$n" $f "$x"`
and the existing `pass "$n"` underneath stays unchanged. No double-
count, no shim.

Verification:
- bash scripts/test-test-harness.sh — 30/30 (no regression)
- bash scripts/test-run-all-tests.sh — 13/13 integration
- bash pm/scripts/validate.sh BACKLOG.md — parity 30 (CC-228 baseline)

BACKLOG row CC-254 files this amendment; will close as ✅ pr:#NNN in
tail commit after gh pr create.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three process improvements captured from the CC-249 PR-B.2 design-gap
incident (see commit 9869034 + spike Amendment):

1. **`agents/project-pm.md`** — adds 3 new rules to "Writing a brief
   for codex-executor":
   - Spike-pilot rule: every API-design spike MUST include a `## Pilot
     walkthrough` section in the output spike doc (1 representative
     consumer, verbatim before/after diff applying every spike
     decision). If walkthrough can't be written cleanly, spike's API
     decision is not yet mature.
   - Additive-PR pilot rule: every PR shipping a new API MUST migrate
     at least 1 real consumer in the SAME PR.
   - Explore call-site-context rule: Explore symbol surveys must
     capture BOTH declarations AND call-site context (raw line + 2
     before/after). Declaration-only surveys miss usage-pattern
     conflicts.

2. **`docs/spikes/CC-249.md`** — appends Amendment section to Open
   Risks documenting:
   - The hidden contract gap (consumers follow `assert_X ...; pass`
     pattern; spike never inspected call-sites)
   - Symptoms (PR-B.2 dispatch timeout + shadow-shim 9/13 files)
   - Recovery (CC-254 amendment)
   - Cross-links to the 2 new memory entries

3. Memory (separate repo, already committed):
   - `[[feedback_spike_pilot_required]]` — full rule + CC-249 retro
     evidence + cross-links
   - `[[feedback_explore_call_site_context]]` — full rule + Explore
     prompt template fragment + CC-249 retro evidence

These artifacts make the lesson reusable across projects. Future
spike + additive PR briefs are expected to apply the patterns; future
Explore surveys for symbol divergence include the call-site context
template fragment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Per CC-250/CC-251/CC-252 tail-commit pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

screenleon merged commit c62b884 into main on May 23, 2026
16 checks passed
screenleon deleted the cc-249-prb1-amend-harness-no-implicit-pass branch on May 23, 2026 16:37