feat(cc-247,cc-248): th_init --format=<preset> + --fail-fast harness options#142
Merged
…arness

Closed 5-preset enum + orthogonal fail-fast boolean. Default behavior unchanged (colon-flat + collect-all).

Presets (one per pre-existing per-file override profile):
- colon-flat: `PASS: %s\n` always / `FAIL: %s: %s\n` inline
- colon-mixed: `PASS: %s\n` always / `  FAIL ` 2sp + 8sp detail
- indent-1sp: ` PASS ` 1sp VERBOSE-only / ` FAIL ` 1sp + 8sp detail
- indent-2sp: `  PASS ` 2sp always / `  FAIL ` 2sp + 8sp detail
- indent-2sp-quiet: `  PASS ` 2sp VERBOSE-only / `  FAIL ` 2sp + no-indent detail

Unknown --format value rejected at th_init parse-time with stderr listing all 5 valid names. --fail-fast triggers th_summary (which exits 1 on non-zero FAIL count) right after fail()'s print body.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
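The preset dispatch described in this commit can be sketched as follows. This is a hypothetical reconstruction for illustration, not the code in `scripts/lib/test-harness.sh`: the variable names (`TH_FORMAT`, `TH_FAIL_FAST`, `TH_PASS`, `TH_FAIL`), the `th_verbose` helper, the error and summary wording, and the reading of "VERBOSE-only" as `VERBOSE=1` are all assumptions. Only the preset names, the format strings, and the exit-1 behavior come from the commit message.

```shell
#!/usr/bin/env bash
# Hypothetical sketch only; NOT the real scripts/lib/test-harness.sh.
# All TH_* variable names and th_verbose are invented for illustration.

TH_FORMAT="colon-flat"   # default preset
TH_FAIL_FAST=0           # default: collect all failures
TH_PASS=0
TH_FAIL=0

th_init() {
  local arg
  for arg in "$@"; do
    case "$arg" in
      --format=*)
        TH_FORMAT="${arg#--format=}"
        case "$TH_FORMAT" in
          colon-flat|colon-mixed|indent-1sp|indent-2sp|indent-2sp-quiet) ;;
          *)
            # Closed enum: reject unknown names and list the 5 valid ones.
            echo "th_init: unknown --format '$TH_FORMAT' (valid: colon-flat colon-mixed indent-1sp indent-2sp indent-2sp-quiet)" >&2
            exit 1
            ;;
        esac
        ;;
      --fail-fast) TH_FAIL_FAST=1 ;;
    esac
  done
}

# Assumption: "VERBOSE-only" means VERBOSE=1.
th_verbose() { [ "${VERBOSE:-}" = "1" ]; }

pass() {
  TH_PASS=$((TH_PASS + 1))
  case "$TH_FORMAT" in
    colon-flat|colon-mixed) printf 'PASS: %s\n' "$1" ;;
    indent-1sp)       if th_verbose; then printf ' PASS %s\n' "$1"; fi ;;
    indent-2sp)       printf '  PASS %s\n' "$1" ;;
    indent-2sp-quiet) if th_verbose; then printf '  PASS %s\n' "$1"; fi ;;
  esac
}

th_summary() {
  # Summary wording is made up; only the exit-1-on-failure rule is from the PR.
  printf '%d passed, %d failed\n' "$TH_PASS" "$TH_FAIL"
  if [ "$TH_FAIL" -ne 0 ]; then exit 1; fi
}

fail() {
  TH_FAIL=$((TH_FAIL + 1))
  case "$TH_FORMAT" in
    colon-flat)             printf 'FAIL: %s: %s\n' "$1" "$2" ;;
    colon-mixed|indent-2sp) printf '  FAIL %s\n        %s\n' "$1" "$2" ;;
    indent-1sp)             printf ' FAIL %s\n        %s\n' "$1" "$2" ;;
    indent-2sp-quiet)       printf '  FAIL %s\n%s\n' "$1" "$2" ;;
  esac
  # Fail-fast is orthogonal to the preset: summarize and exit after the print.
  if [ "$TH_FAIL_FAST" -eq 1 ]; then th_summary; fi
}
```

Under these assumptions, a consumer would call something like `th_init --format=indent-2sp --fail-fast` once near the top of the script and then use `pass`/`fail` without defining its own versions.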
…er-file overrides

Per-file pass()/fail() overrides DELETED from 5 harness consumers; each switches to th_init --format=<preset> (+ --fail-fast where applicable):

| File                       | Preset            | Fail-fast   |
|----------------------------|-------------------|-------------|
| test-commands.sh           | indent-1sp        |             |
| test-usage-weekly.sh       | indent-2sp        | --fail-fast |
| test-usage-tracker.sh      | indent-2sp        | --fail-fast |
| test-hooks.sh              | indent-2sp-quiet  |             |
| test-skill-refine.sh       | colon-mixed       | --fail-fast |

Each consumer's stdout is byte-identical pre- vs post-migration under both VERBOSE unset and VERBOSE=1 (validated via git-stash round-trip diff; only non-deterministic noise, i.e. mktemp dir suffix and performance timing, varies).

Bonus: test-run-all-tests.sh renames its local `pass`/`fail` to `pass_case`/`fail_case`. The file uses its own counters (does not source the harness), but the rename avoids name collision with the harness's preset-aware `pass`/`fail` symbols. Output format `PASS: %s\n` / `FAIL: %s: %s\n` unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
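The PR's test plan confirms the overrides are gone with `grep -E '^(pass|fail)\(\)' scripts/test-*.sh`. A minimal sketch of wrapping that same pattern in a reusable check is shown below; `count_overrides` is a hypothetical helper name and is not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not from the repo: count lines in DIR/test-*.sh that
# define pass() or fail() at column 0 (the same regex the test plan uses).
count_overrides() {
  local dir="$1"
  # grep exits 1 when nothing matches; `|| true` keeps that from aborting
  # callers that run under `set -e`. tr strips padding some wc builds emit.
  { grep -hE '^(pass|fail)\(\)' "$dir"/test-*.sh || true; } | wc -l | tr -d ' '
}
```

A call such as `count_overrides scripts` would be expected to print `0` once every consumer has been migrated; the directory argument here is only an example.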
…test-harness
New cases (12 total, all green): preset-{colon-flat,colon-mixed,
indent-1sp-{verbose,off},indent-2sp,indent-2sp-quiet-{verbose,off}},
preset-default-matches-colon-flat, preset-unknown, fail-fast-{on,off,
no-failures,orthogonal}, filter-still-works, list-still-works.
Also: run_harness_probe now calls pass_case on success (was silent —
suite under-reported coverage; visible count rose from 11 to 22 with
no new failures).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lder)

CC-247 and CC-248 index rows' `pr:` field flipped from `pr:TBD` to `pr:TBD-PRA`. Real PR number applied in a follow-up commit after `gh pr create` returns. Validator parity preserved at 30 pre-existing E-codes (CC-228 baseline).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Tail commit per PR-A's special_instruction_pr_ref: flip placeholder to actual PR number after `gh pr create`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 23, 2026
screenleon added a commit that referenced this pull request on May 23, 2026
…close cc-250 (#145)

* docs(cc-251): brief-authoring discipline for multi-file dispatches + close cc-250

CC-251: 3 patterns added to prevent codex apply_patch debug-loop hang on > 4 files OR > 50 lines verbatim briefs:

1. **apply_patch retry-cap** (constraint text): HALT after 2nd consecutive verification failure on the same file; no 3rd retry. Codex has no internal retry-cap → without this, debug loop runs until dispatch timeout (1800s).
2. **Verbatim-as-attached-file** (pattern): write embedded content (override-policy paragraphs, BACKLOG rows, brief-template fragments) to /tmp/<task>-content/*.md BEFORE dispatch; brief references path + says "copy verbatim, do NOT paraphrase". Eliminates the hallucinate-when-retyping failure mode (CC-250 stderr observed `pass/fail print-format` → `print-format`; "pass/fail" dropped).
3. **`expected_head_sha` state pin** (schema field): 40-char sha in brief metadata + `git rev-parse HEAD == <sha>` self_verify check. Catches "wrong branch / branch advanced / file changed by another process" before any patch is attempted.

Documented in:
- `agents/project-pm.md`: "Multi-file brief discipline" prose added to the "Writing a brief for codex-executor" section
- `docs/dispatch-brief.md`: `expected_head_sha` as Optional section with usage example
- Memory `[[feedback_codex_brief_discipline]]` (separate repo): retro evidence from CC-247/248 #142 + CC-250 #144 dispatch hangs

For briefs touching > 8 files OR > 200 lines verbatim, ALSO split the dispatch into 2–3 smaller ones (each 2–4 files); split alone without the 3 patterns above doesn't fully prevent the hang.

Also closes **CC-250** as a tail-cleanup (status flip + `pr:TBD` → `pr:#144` + body Outcome / See blocks; mirrors the pattern PR #144 brief explicitly deferred to a follow-up commit). MILESTONES.md M1 prerequisite sub-table flips CC-250 ⏳ → ✅ (#144) and adds CC-251 ⏳.

Long-term resolution: CC-235 (tiered-lifecycle-gate) enforces split mechanically; CC-244 (typed pipeline) turns verbatim into schema fields; CC-215 (pmctl) may add `--expect-head <sha>` wrapper flag.

Validator parity preserved at 30 (CC-228 baseline). No code change; discipline is brief-authoring time, not runtime.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(cc-251): normalize verbatim-length threshold to 50 lines

PR-gate critic [low]: intro states `> 50 lines` trigger but bullet 2 said `> 30 lines`. Standardize to 50 across both. Matches the memory copy at feedback_codex_brief_discipline.md (also updated 30→50).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
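The `expected_head_sha` state pin (pattern 3) amounts to comparing `git rev-parse HEAD` against a 40-char sha before any patch is attempted. A minimal sketch of such a self_verify step follows; the function name `check_head_sha`, its exit codes, its messages, and the optional second argument (added here so the comparison can be exercised without a repository) are assumptions, not the schema or tooling documented in `docs/dispatch-brief.md`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of an expected_head_sha check; names and exit codes
# are invented for illustration.
# Usage: check_head_sha EXPECTED_SHA [ACTUAL_SHA]
#   ACTUAL_SHA defaults to the output of `git rev-parse HEAD`.
check_head_sha() {
  local expected="$1"
  local actual="${2:-}"

  # The brief pins a full 40-char sha; reject abbreviations and typos early.
  if ! printf '%s' "$expected" | grep -qE '^[0-9a-f]{40}$'; then
    echo "expected_head_sha must be a 40-char lowercase hex sha" >&2
    return 2
  fi

  if [ -z "$actual" ]; then
    actual="$(git rev-parse HEAD)" || return 2
  fi

  # A mismatch means wrong branch, advanced branch, or a concurrent change.
  if [ "$actual" != "$expected" ]; then
    echo "HEAD is $actual, brief expects $expected; halting before any patch" >&2
    return 1
  fi
}
```

Run inside the working tree as `check_head_sha <sha-from-brief>`, a non-zero status would be the signal to halt the dispatch rather than attempt a patch.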
Summary
PR-A of the CC-247/CC-248/CC-249 sequence (PR-0 docs landed in #141; PR-B for CC-249 follows after a `/pre-impl` spike).

Adds 5 named print-format presets + orthogonal `--fail-fast` boolean to `scripts/lib/test-harness.sh`, then migrates 5 harness consumers to drop their per-file `pass`/`fail` overrides. All consumer stdout byte-identical pre vs post.

API
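Presets (closed enum; unknown value exits 1 with stderr listing all 5):

| Preset               | PASS line          | FAIL line           | FAIL detail          |
|----------------------|--------------------|---------------------|----------------------|
| colon-flat (default) | `PASS: %s\n`       | `FAIL: %s: %s\n`    | (inline)             |
| colon-mixed          | `PASS: %s\n`       | `FAIL %s\n` (2sp)   | `%s\n` (8sp)         |
| indent-1sp           | `PASS %s\n` (1sp)  | `FAIL %s\n` (1sp)   | `%s\n` (8sp)         |
| indent-2sp           | `PASS %s\n` (2sp)  | `FAIL %s\n` (2sp)   | `%s\n` (8sp)         |
| indent-2sp-quiet     | `PASS %s\n` (2sp)  | `FAIL %s\n` (2sp)   | `%s\n` (no indent)   |

`--fail-fast`: after `fail()` prints, calls `th_summary` (which exits 1 on non-zero FAIL).

Consumer migration (5 files)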
| File                            | Preset           |
|---------------------------------|------------------|
| `scripts/test-commands.sh`      | indent-1sp       |
| `scripts/test-usage-weekly.sh`  | indent-2sp       |
| `scripts/test-usage-tracker.sh` | indent-2sp       |
| `scripts/test-hooks.sh`         | indent-2sp-quiet |
| `scripts/test-skill-refine.sh`  | colon-mixed      |

`scripts/test-run-all-tests.sh` is NOT a harness consumer (uses its own `pass_case`/`fail_case`); only locally renamed to avoid name collision with harness symbols. Output format unchanged.

Test plan
- `bash scripts/test-test-harness.sh`: 22/22 (was 11/11; +12 new preset/fail-fast cases + visibility fix to `run_harness_probe`)
- `bash scripts/test-run-all-tests.sh` integration: 13/13
- […] (`mktemp` random suffix + timing measurements differ; non-deterministic noise)
- `grep -E '^(pass|fail)\(\)' scripts/test-*.sh` → 0 matches (all overrides cleaned)
- `bash scripts/lint-scripts.sh`: 52 OK
- `pm/scripts/validate.sh` parity preserved at 30 (CC-228 baseline)
- `shellcheck --severity=style`: not installed locally; CI will run

PR-gate
`/pr-gate` standard tier: Final: GO. critic + qa-tester + architecture-reviewer all approve, only low-severity confirmations, zero blocks. (security-reviewer + risk-reviewer not in standard tier; no sensitive paths in diff.)

Notes for reviewers
- `pr:` field for CC-247 + CC-248 will be flipped from `TBD-PRA` placeholder to actual PR# in a tail commit immediately after this PR opens (avoids the codex-doesn't-know-its-own-PR-number problem cleanly).
- […] (`assert_*` helpers) stays deferred until the `/pre-impl` spike resolves the 3-way `assert_contains` signature divergence.
- […] `pass_case` call to `run_harness_probe`; full self_verify clean after that.

🤖 Generated with Claude Code