feat(cc-249): PR-B.1 — add 4 unified assert_* helpers to test-harness#148
Merged
Conversation
CC-249 spike (docs/spikes/CC-249.md, PR #146) decided the unified assert_* API. PR-B.1 adds the 4 helpers + shared fail-message formatter to scripts/lib/test-harness.sh, PURE ADDITIVE — zero consumer changes; PR-B.2 (consumer migration) follows separately. Helpers (per spike Q1/Q2): - `assert_exit <name> <actual> <expected>` — exit-code check; majority arg order (value-under-test first) - `assert_file_contains <name> <file> <literal>` — grep -Fq for literal string in file (current variant A; 6+ consumers will adopt in PR-B.2) - `assert_file_matches <name> <file> <regex>` — grep -qE for regex pattern in file (current variant B; 2 consumers) - `assert_string_contains <name> <haystack> <needle>` — bash string containment via [[ != *needle* ]] (current variant C; 2 consumers). Also absorbs assert_output_contains in PR-B.2 Plus shared `_th_assert_fail_msg` formatter — each helper calls it on the fail path; message contains helper-name + relevant args + failed condition (debuggable from output alone). All 4 helpers call existing harness `pass` / `fail` fns on success/failure → FORMAT-aware output preserved + counter increments go through the same path as every other case. Test coverage: 8 new self-test cases in test-test-harness.sh (1 pass + 1 fail per helper) using the file's existing local pass_case/fail_case + run_harness_probe pattern (probe subshell, not harness's own counters — these are tests OF the harness): - assert-exit-{pass,fail} - assert-file-contains-{pass,fail} - assert-file-matches-{pass,fail} - assert-string-contains-{pass,fail} Verification: - bash scripts/test-test-harness.sh — 30/30 (was 22; +8 new) - bash scripts/test-run-all-tests.sh — 13/13 integration - bash pm/scripts/validate.sh BACKLOG.md — parity 30 (CC-228 baseline) - git diff scripts/lib/test-harness.sh — only additions after line 124 (th_summary close); no deletions; existing fns byte-identical - expected_head_sha 03555fe verified by codex before first edit BACKLOG CC-249 row: stays `🟢 someday` (PR-B.2 still pending); pr-ref `pr:TBD-PRB1` placeholder (tail commit flips to actual PR# after `gh pr create`). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CC-249 row stays 🟢 someday (PR-B.2 consumer migration still pending). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4 tasks
screenleon
added a commit
that referenced
this pull request
May 23, 2026
… PR-B.2 retro (#149) * fix(cc-254): harness assert_* helpers no auto-pass (amend PR-B.1 #148) PR-B.1 (#148) shipped assert_* helpers that auto-called `pass "$name"` on success. Spike Q1-Q5 assumed pure-rename consumer migration would work. PR-B.2 (consumer migration) dispatch 2026-05-24 surfaced a design conflict: 13 consumer test scripts ALL follow the `assert_X "$name" ...; pass "$name"` pattern — `assert_*` is a check; the consumer separately calls `pass` to increment the counter. Pure rename would double-count PASS (harness implicit pass + consumer explicit pass). Codex defensively shadowed the harness by re-defining the 4 helpers locally in 9 of 13 consumer files — Q3 explicitly rejected that "shim". Dispatch hit timeout 124 with 9 apply_patch failures across 4 complex files. Root cause = PR-B.1 helper API, not codex's interpretation or PR-B.2's brief. Fix: - Strip `pass "$name"` from each of the 4 helpers (`assert_exit`, `assert_file_contains`, `assert_file_matches`, `assert_string_contains`). Helpers still call `fail` on failure. - Header comment in `scripts/lib/test-harness.sh` documents the new contract: success returns 0 without side-effects; consumer responsible for `&& pass "$name"` if accounting is desired. - 4 pass-path self-tests in `scripts/test-test-harness.sh` amended to call `pass` explicitly after the assert. Fail-path self-tests unaffected. After this lands, PR-B.2 reverts to pure rename — each `assert_contains "$n" $f "$x"` becomes `assert_file_contains "$n" $f "$x"` and the existing `pass "$n"` underneath stays unchanged. No double- count, no shim. Verification: - bash scripts/test-test-harness.sh — 30/30 (no regression) - bash scripts/test-run-all-tests.sh — 13/13 integration - bash pm/scripts/validate.sh BACKLOG.md — parity 30 (CC-228 baseline) BACKLOG row CC-254 files this amendment; will close as ✅ pr:#NNN in tail commit after gh pr create. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(cc-254): process artifacts from PR-B.2 retro Three process improvements captured from the CC-249 PR-B.2 design-gap incident (see commit 9869034 + spike Amendment): 1. **`agents/project-pm.md`** — adds 3 new rules to "Writing a brief for codex-executor": - Spike-pilot rule: every API-design spike MUST include a `## Pilot walkthrough` section in the output spike doc (1 representative consumer, verbatim before/after diff applying every spike decision). If walkthrough can't be written cleanly, spike's API decision is not yet mature. - Additive-PR pilot rule: every PR shipping a new API MUST migrate at least 1 real consumer in the SAME PR. - Explore call-site-context rule: Explore symbol surveys must capture BOTH declarations AND call-site context (raw line + 2 before/after). Declaration-only surveys miss usage-pattern conflicts. 2. **`docs/spikes/CC-249.md`** — appends Amendment section to Open Risks documenting: - The hidden contract gap (consumers follow `assert_X ...; pass` pattern; spike never inspected call-sites) - Symptoms (PR-B.2 dispatch timeout + shadow-shim 9/13 files) - Recovery (CC-254 amendment) - Cross-links to the 2 new memory entries 3. Memory (separate repo, already committed): - `[[feedback_spike_pilot_required]]` — full rule + CC-249 retro evidence + cross-links - `[[feedback_explore_call_site_context]]` — full rule + Explore prompt template fragment + CC-249 retro evidence These artifacts make the lesson reusable across projects. Future spike + additive PR briefs are expected to apply the patterns; future Explore surveys for symbol divergence include the call-site context template fragment. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(cc-254): backlog pr-ref TBD → pr:#149 (tail commit) Per CC-250/CC-251/CC-252 tail-commit pattern. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
PR-B.1 of the CC-249 sequence — pure additive harness changes per spike (
docs/spikes/CC-249.mdfrom PR #146). Adds 4 unifiedassert_*helpers + shared fail-message formatter toscripts/lib/test-harness.sh. Zero consumer changes; PR-B.2 (14-file consumer migration) follows separately.Helpers added (per spike Q1/Q2)
assert_exit<name> <actual> <expected>assert_file_contains<name> <file> <literal_substring>grep -Fqfor literal string in fileassert_file_matches<name> <file> <regex>grep -qEfor regex pattern in fileassert_string_contains<name> <haystack_string> <needle>[[ haystack != *needle* ]]containmentPlus
_th_assert_fail_msg— shared fail-path formatter each helper calls; message contains helper-name + relevant args + failed condition (debuggable from output alone, no need to re-read helper source).All 4 helpers call existing harness
pass/failfns on success/failure → FORMAT-aware output preserved + counter increments go through the standard path. No direct printf bypass.What's NOT in this PR (deferred to PR-B.2)
scripts/test-*.shfiles keep their localassert_*definitions for now)assert_output_containsdeletion (lives in 2 consumers; deleted in PR-B.2)Test coverage
8 new self-test cases in
scripts/test-test-harness.sh(1 pass-path + 1 fail-path per helper):assert-exit-{pass,fail}assert-file-contains-{pass,fail}assert-file-matches-{pass,fail}assert-string-contains-{pass,fail}Each case invokes the new helper in a probe subshell (per file's existing
run_harness_probepattern — outer test uses localpass_case/fail_case, not the harness's ownpass/fail, to avoid polluting the harness's counters during a self-test).Fail-path cases assert: (a) helper's
failwas called in the probe (FAIL line + counter increment); (b) the captured probe stderr/stdout contains the helper name AND the failing args (debuggability check).Test plan
bash scripts/test-test-harness.sh— 30/30 (was 22; +8 new)bash scripts/test-run-all-tests.sh— 13/13 integrationbash pm/scripts/validate.sh BACKLOG.md— parity 30 (CC-228 baseline)git diff scripts/lib/test-harness.sh— only additions afterth_summaryclose; no deletions; existingth_init/pass/fail/should_run/th_summarybyte-identical/pr-gate standard— Final: GO, critic + qa-tester + architecture-reviewer all approve, zero findingsshellcheck --severity=styleon test-harness.sh + test-test-harness.sh — not installed locally; CI will runBrief discipline applied (per
[[feedback_codex_brief_discipline]]CC-251)expected_head_sha: 03555feverified by codex before first edit ✓apply_patchretry-cap constraint included (no debug-loop hang) ✓verbatim-as-attached-fileN/A (< 50 lines literal embedded)PR-B.2 brief will trigger ALL 3 patterns (14 files > 4-files trigger; per-file migration matrix > 50 lines → attached as
/tmp/cc-249-migration/<file>.md).Next (PR-B.2, after this merges)
Consumer migration in lockstep — replace each consumer's local
assert_*defs with calls to the now-installed harness helpers, delete the local defs, validate stdout byte-identity per consumer. Estimated 14 files × 5–15 LoC delta each, net –50 to –150 LoC.🤖 Generated with Claude Code