Skip to content

feat(cc-249): PR-B.1 — add 4 unified assert_* helpers to test-harness#148

Merged
screenleon merged 2 commits into
mainfrom
cc-249-pr-b1-harness-asserts
May 23, 2026
Merged

feat(cc-249): PR-B.1 — add 4 unified assert_* helpers to test-harness#148
screenleon merged 2 commits into
mainfrom
cc-249-pr-b1-harness-asserts

Conversation

@screenleon
Copy link
Copy Markdown
Owner

Summary

PR-B.1 of the CC-249 sequence — pure additive harness changes per spike (docs/spikes/CC-249.md from PR #146). Adds 4 unified assert_* helpers + shared fail-message formatter to scripts/lib/test-harness.sh. Zero consumer changes; PR-B.2 (14-file consumer migration) follows separately.

Helpers added (per spike Q1/Q2)

Helper Signature Behavior
assert_exit <name> <actual> <expected> Exit-code equality (majority arg order; value-under-test first)
assert_file_contains <name> <file> <literal_substring> grep -Fq for literal string in file
assert_file_matches <name> <file> <regex> grep -qE for regex pattern in file
assert_string_contains <name> <haystack_string> <needle> bash [[ haystack != *needle* ]] containment

Plus _th_assert_fail_msg — shared fail-path formatter each helper calls; message contains helper-name + relevant args + failed condition (debuggable from output alone, no need to re-read helper source).

All 4 helpers call existing harness pass / fail fns on success/failure → FORMAT-aware output preserved + counter increments go through the standard path. No direct printf bypass.

What's NOT in this PR (deferred to PR-B.2)

  • Consumer migrations (14 scripts/test-*.sh files keep their local assert_* definitions for now)
  • assert_output_contains deletion (lives in 2 consumers; deleted in PR-B.2)

Test coverage

8 new self-test cases in scripts/test-test-harness.sh (1 pass-path + 1 fail-path per helper):

  • assert-exit-{pass,fail}
  • assert-file-contains-{pass,fail}
  • assert-file-matches-{pass,fail}
  • assert-string-contains-{pass,fail}

Each case invokes the new helper in a probe subshell (per file's existing run_harness_probe pattern — outer test uses local pass_case/fail_case, not the harness's own pass/fail, to avoid polluting the harness's counters during a self-test).

Fail-path cases assert: (a) helper's fail was called in the probe (FAIL line + counter increment); (b) the captured probe stderr/stdout contains the helper name AND the failing args (debuggability check).

Test plan

  • bash scripts/test-test-harness.sh30/30 (was 22; +8 new)
  • bash scripts/test-run-all-tests.sh — 13/13 integration
  • bash pm/scripts/validate.sh BACKLOG.md — parity 30 (CC-228 baseline)
  • git diff scripts/lib/test-harness.sh — only additions after th_summary close; no deletions; existing th_init/pass/fail/should_run/th_summary byte-identical
  • /pr-gate standardFinal: GO, critic + qa-tester + architecture-reviewer all approve, zero findings
  • shellcheck --severity=style on test-harness.sh + test-test-harness.sh — not installed locally; CI will run

Brief discipline applied (per [[feedback_codex_brief_discipline]] CC-251)

  • expected_head_sha: 03555fe verified by codex before first edit ✓
  • apply_patch retry-cap constraint included (no debug-loop hang) ✓
  • verbatim-as-attached-file N/A (< 50 lines literal embedded)

PR-B.2 brief will trigger ALL 3 patterns (14 files > 4-files trigger; per-file migration matrix > 50 lines → attached as /tmp/cc-249-migration/<file>.md).

Next (PR-B.2, after this merges)

Consumer migration in lockstep — replace each consumer's local assert_* defs with calls to the now-installed harness helpers, delete the local defs, validate stdout byte-identity per consumer. Estimated 14 files × 5–15 LoC delta each, net –50 to –150 LoC.

🤖 Generated with Claude Code

screenleon and others added 2 commits May 24, 2026 00:22
CC-249 spike (docs/spikes/CC-249.md, PR #146) decided the unified
assert_* API. PR-B.1 adds the 4 helpers + shared fail-message
formatter to scripts/lib/test-harness.sh, PURE ADDITIVE — zero
consumer changes; PR-B.2 (consumer migration) follows separately.

Helpers (per spike Q1/Q2):
- `assert_exit <name> <actual> <expected>` — exit-code check; majority
  arg order (value-under-test first)
- `assert_file_contains <name> <file> <literal>` — grep -Fq for literal
  string in file (current variant A; 6+ consumers will adopt in PR-B.2)
- `assert_file_matches <name> <file> <regex>` — grep -qE for regex
  pattern in file (current variant B; 2 consumers)
- `assert_string_contains <name> <haystack> <needle>` — bash string
  containment via [[ != *needle* ]] (current variant C; 2 consumers).
  Also absorbs assert_output_contains in PR-B.2

Plus shared `_th_assert_fail_msg` formatter — each helper calls it on
the fail path; message contains helper-name + relevant args + failed
condition (debuggable from output alone).

All 4 helpers call existing harness `pass` / `fail` fns on
success/failure → FORMAT-aware output preserved + counter increments
go through the same path as every other case.

Test coverage: 8 new self-test cases in test-test-harness.sh (1 pass +
1 fail per helper) using the file's existing local pass_case/fail_case
+ run_harness_probe pattern (probe subshell, not harness's own
counters — these are tests OF the harness):
- assert-exit-{pass,fail}
- assert-file-contains-{pass,fail}
- assert-file-matches-{pass,fail}
- assert-string-contains-{pass,fail}

Verification:
- bash scripts/test-test-harness.sh — 30/30 (was 22; +8 new)
- bash scripts/test-run-all-tests.sh — 13/13 integration
- bash pm/scripts/validate.sh BACKLOG.md — parity 30 (CC-228 baseline)
- git diff scripts/lib/test-harness.sh — only additions after line 124
  (th_summary close); no deletions; existing fns byte-identical
- expected_head_sha 03555fe verified by codex before first edit

BACKLOG CC-249 row: stays `🟢 someday` (PR-B.2 still pending); pr-ref
`pr:TBD-PRB1` placeholder (tail commit flips to actual PR# after
`gh pr create`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CC-249 row stays 🟢 someday (PR-B.2 consumer migration still pending).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@screenleon screenleon merged commit b5e5902 into main May 23, 2026
16 checks passed
@screenleon screenleon deleted the cc-249-pr-b1-harness-asserts branch May 23, 2026 15:26
screenleon added a commit that referenced this pull request May 23, 2026
… PR-B.2 retro (#149)

* fix(cc-254): harness assert_* helpers no auto-pass (amend PR-B.1 #148)

PR-B.1 (#148) shipped assert_* helpers that auto-called `pass "$name"`
on success. Spike Q1-Q5 assumed pure-rename consumer migration would
work. PR-B.2 (consumer migration) dispatch 2026-05-24 surfaced a
design conflict: 13 consumer test scripts ALL follow the
`assert_X "$name" ...; pass "$name"` pattern — `assert_*` is a check;
the consumer separately calls `pass` to increment the counter.

Pure rename would double-count PASS (harness implicit pass + consumer
explicit pass). Codex defensively shadowed the harness by re-defining
the 4 helpers locally in 9 of 13 consumer files — Q3 explicitly
rejected that "shim". Dispatch hit timeout 124 with 9 apply_patch
failures across 4 complex files.

Root cause = PR-B.1 helper API, not codex's interpretation or PR-B.2's
brief.

Fix:
- Strip `pass "$name"` from each of the 4 helpers (`assert_exit`,
  `assert_file_contains`, `assert_file_matches`,
  `assert_string_contains`). Helpers still call `fail` on failure.
- Header comment in `scripts/lib/test-harness.sh` documents the new
  contract: success returns 0 without side-effects; consumer
  responsible for `&& pass "$name"` if accounting is desired.
- 4 pass-path self-tests in `scripts/test-test-harness.sh` amended to
  call `pass` explicitly after the assert. Fail-path self-tests
  unaffected.

After this lands, PR-B.2 reverts to pure rename — each
`assert_contains "$n" $f "$x"` becomes `assert_file_contains "$n" $f "$x"`
and the existing `pass "$n"` underneath stays unchanged. No double-
count, no shim.

Verification:
- bash scripts/test-test-harness.sh — 30/30 (no regression)
- bash scripts/test-run-all-tests.sh — 13/13 integration
- bash pm/scripts/validate.sh BACKLOG.md — parity 30 (CC-228 baseline)

BACKLOG row CC-254 files this amendment; will close as ✅ pr:#NNN in
tail commit after gh pr create.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(cc-254): process artifacts from PR-B.2 retro

Three process improvements captured from the CC-249 PR-B.2 design-gap
incident (see commit 9869034 + spike Amendment):

1. **`agents/project-pm.md`** — adds 3 new rules to "Writing a brief
   for codex-executor":
   - Spike-pilot rule: every API-design spike MUST include a `## Pilot
     walkthrough` section in the output spike doc (1 representative
     consumer, verbatim before/after diff applying every spike
     decision). If walkthrough can't be written cleanly, spike's API
     decision is not yet mature.
   - Additive-PR pilot rule: every PR shipping a new API MUST migrate
     at least 1 real consumer in the SAME PR.
   - Explore call-site-context rule: Explore symbol surveys must
     capture BOTH declarations AND call-site context (raw line + 2
     before/after). Declaration-only surveys miss usage-pattern
     conflicts.

2. **`docs/spikes/CC-249.md`** — appends Amendment section to Open
   Risks documenting:
   - The hidden contract gap (consumers follow `assert_X ...; pass`
     pattern; spike never inspected call-sites)
   - Symptoms (PR-B.2 dispatch timeout + shadow-shim 9/13 files)
   - Recovery (CC-254 amendment)
   - Cross-links to the 2 new memory entries

3. Memory (separate repo, already committed):
   - `[[feedback_spike_pilot_required]]` — full rule + CC-249 retro
     evidence + cross-links
   - `[[feedback_explore_call_site_context]]` — full rule + Explore
     prompt template fragment + CC-249 retro evidence

These artifacts make the lesson reusable across projects. Future
spike + additive PR briefs are expected to apply the patterns; future
Explore surveys for symbol divergence include the call-site context
template fragment.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(cc-254): backlog pr-ref TBD → pr:#149 (tail commit)

Per CC-250/CC-251/CC-252 tail-commit pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant