Skip to content

feat(v0.1.3): resolve 26 ce-review findings + unify audience to kebab-case#27

Merged
brettdavies merged 1 commit intodevfrom
feat/v013-ce-review-fixes
Apr 22, 2026
Merged

feat(v0.1.3): resolve 26 ce-review findings + unify audience to kebab-case#27
brettdavies merged 1 commit intodevfrom
feat/v013-ce-review-fixes

Conversation

@brettdavies
Copy link
Copy Markdown
Owner

@brettdavies brettdavies commented Apr 22, 2026

Summary

Resolves all 26 findings from the ce-review of commit dc5d74168577 and unifies audience JSON values to kebab-case (a
post-review decision that eliminates a mixed-casing asymmetry flagged during review).

Changelog

Added

  • audience_reason field on scorecard JSON — populated only when audience is null, with "suppressed" (signal check masked by --audit-profile) or "insufficient_signal" (signal check never produced) so consumers can see why the classifier withheld a label.
  • audit_profiles array in coverage/matrix.json — each entry carries {name, description, suppresses[]}, letting agents and site renderers enumerate the four --audit-profile categories and what each one suppresses without scraping --help.

Changed

  • Suppressed and errored results[].label values now show the check's human-readable label (e.g., "Respects NO_COLOR") instead of falling back to the check id.

Documentation

  • README.md, AGENTS.md, and CLAUDE.md updated to describe the shipped v0.1.3 scorecard surface: audience + audience_reason + audit_profile field semantics, the --audit-profile flag with examples, and the audit_profiles section of coverage/matrix.json as the programmatic source for category enumeration.

Type of Change

  • feat: New feature (non-breaking change which adds functionality)
  • fix: Bug fix (non-breaking change which fixes an issue)
  • refactor: Code refactoring (no functional changes)
  • test: Adding or updating tests
  • docs: Documentation update

Related Issues/Stories

  • Plan (CLI-side): docs/plans/2026-04-21-v013-handoff-5-cli-audience-classifier.md — Implementation Log captures the
    post-review kebab-case deviation
  • Plan (site-side coordination): agentnative-site/docs/plans/2026-04-21-v013-handoff-6-site-leaderboard-launch.md
    Unit 0.5 covers the kebab-case absorption that must land before the v0.1.3 scorecard regen wave
  • Compounded learning: docs/solutions/best-practices/module-directory-promotion-pattern-2026-04-22.md — canonical git mv recipe for foo.rsfoo/mod.rs promotions (documented off the back of this PR's help_probe extraction)
  • Compounded-trilogy cross-links: docs/solutions/best-practices/rust-module-splitting-srp-not-loc-20260327.md,
    docs/solutions/best-practices/rust-pub-crate-fields-for-cross-module-impl-pattern-2026-04-20.md
  • Related PR (base commit under review): feat(v0.1.3): audience classifier + --audit-profile suppression + p1-env-hints Pattern 2 #26
  • CE-review artifacts: .context/compound-engineering/ce-review/20260422-134229-565fc6d7/ (10 reviewer JSONs +
    metadata.json; local-only by convention)

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing completed
  • All tests passing

Test Summary:

  • Unit tests: 402 passing (net +8 new across scorecard::audience, scorecard::tests, principles::registry,
    principles::matrix, cli::tests, and runner::help_probe::env_hints_bash::tests)
  • Integration tests: 50 passing (net +3 new: concrete non-null audience on self-dogfood, --principle forces audience: null, full-shape JSON snapshot lock)
  • cargo fmt --check: clean
  • cargo clippy --all-targets -- -Dwarnings: clean
  • cargo deny check: advisories, bans, licenses, sources all OK
  • cargo run -- generate coverage-matrix --check: no drift
  • scripts/hooks/pre-push (full local-CI mirror): all checks passed
  • GitHub CI on the PR head (7a384b8): all 5 required checks green
  • Dogfood without profile: anc check . --output jsonschema_version: "1.1", audience: "agent-optimized",
    audit_profile: null, audience_reason absent (omitted via skip_serializing_if).
  • Dogfood with profile: anc check . --audit-profile human-tuiaudience: null, audience_reason: "suppressed",
    coverage_summary.must.verified drops 17 → 15 (suppressed P1 checks no longer inflate the metric).

Files Modified

Modified:

  • AGENTS.md, CLAUDE.md, README.md — v0.1.3 scorecard surface sync
  • coverage/matrix.json — regenerated (adds top-level audit_profiles array)
  • src/check.rs — new abstract label(&self) -> &'static str trait method
  • src/checks/**/*.rs (40 files) — mechanical label() impl + run() body migration from label: "…".into() literals
    to label: self.label().into()
  • src/cli.rsAuditProfileExceptionCategory parity drift tests
  • src/main.rsSUPPRESSION_EVIDENCE_PREFIX usage + check.label() in error and --audit-profile suppression
    branches
  • src/principles/registry.rsSUPPRESSION_EVIDENCE_PREFIX constant, ALL_EXCEPTION_CATEGORIES slice + compile-time
    variant guard, ExceptionCategory::description(), doc additions on SUPPRESSION_TABLE (trust boundary, drift-test
    scope)
  • src/principles/matrix.rsAuditProfileEntry struct, audit_profiles section + per-variant coverage test,
    matrix_json_includes_audit_profiles_section
  • src/scorecard/audience.rs — kebab-case classify(), new classify_reason() helper, is_audit_profile_suppression
    promoted to pub(crate), debug_assert! against duplicate signal IDs, module doc + suppression-prefix references
  • src/scorecard/mod.rsaudience_reason: Option<String> field (derived in build_scorecard, skip_serializing_if
    when audience has a label), build_coverage_summary filter, Scorecard struct doc, exit_code trust-boundary doc,
    snapshot + casing + duplicate-signal regression tests
  • tests/integration.rs — convention test blocking CheckResult { … } outside fn run() in src/checks/**, non-null
    audience dogfood, --principle + audience interaction, full-shape JSON snapshot, tightened diagnostic-only
    assertion, Pattern 2 bracketed-rejection and mixed-case negative fixtures

Created:

  • src/runner/help_probe/env_hints_bash.rs — Pattern 2 submodule extracted from the 931-line help_probe.rs; holds
    parse_env_hints_bash_style(), extract_env_tokens(), find_env_section(), is_section_header_line(),
    strip_clap_env_annotations(), SHELL_ENV_BLACKLIST, PATTERN2_WINDOW, and 9 Pattern 2 tests (including 3 new for
    PAGER, section-header exclusion, and mixed-case / placeholder negatives).

Renamed:

  • src/runner/help_probe.rssrc/runner/help_probe/mod.rs (via git mv; git detects rename at 58% similarity, blame
    preserved).

Key Features

  • Single source of truth for the suppression evidence string (SUPPRESSION_EVIDENCE_PREFIX): producer (main.rs),
    consumer (audience::is_audit_profile_suppression), and filter (build_coverage_summary) all reference the same
    constant so a rename can't silently desync the three sites.
  • Check::label() on the trait means every emitted CheckResult — including the runtime layer's error and suppression
    branches — now uses a single source of truth for each check's human label. No more "scorecard row shows the cryptic
    id" when a check errors out or gets masked by a profile.
  • EnvHintSource enum (clap_annotation / proximity / env_section) tags every emitted EnvHint with the pattern
    that matched, giving agents a structured signal when debugging a p1-env-hints false positive. Pattern 1 wins on
    duplicates; proximity loses to env_section when both apply.
  • Compile-time guard (_all_categories_covers_every_variant) + runtime drift tests
    (audit_profile_and_exception_category_variants_are_isomorphic, suppression_table_check_ids_exist_in_catalog)
    triangulate the AuditProfile (CLI) ↔ ExceptionCategory (registry) ↔ SUPPRESSION_TABLE (registry) relationship so
    a fifth category can't land via partial edits.

Benefits

  • Correctness of published metrics. Before this PR, every v0.1.3 scorecard emitted with --audit-profile overstated
    coverage_summary.verified — the site's H6 leaderboard would have rendered inflated coverage numbers that didn't
    match the results-level evidence. Shipped the fix before the site's regen wave.
  • JSON consistency. Unifying audience to kebab-case closes the only document-level casing asymmetry in the
    scorecard shape, so consumers no longer need two-rule case logic for one document.
  • Agent discoverability. Both the audit_profiles matrix section and the expanded AGENTS.md "Agent-facing JSON
    surface" section turn previously help-text-scraped knowledge into first-class, committed, machine-readable artifacts.
  • Regression surface. Eight new positive / negative / drift tests pin intentional behavior that future refactors
    might unknowingly break: exit-code-under-suppression, JSON casing contract, CheckResult construction scope,
    bracketed-env-var rejection, AuditProfile↔ExceptionCategory parity, signal-ID uniqueness, full-shape JSON snapshot,
    --principle + audience interaction.

Breaking Changes

  • No breaking changes for v0.1.2 consumers (which always received audience: null and audit_profile: null).
  • Breaking for pre-release v0.1.3 consumers that had pinned on snake_case audience values. The only such consumer
    is in-house (agentnative-site's H6 leaderboard), which hasn't shipped yet. Unit 0.5 of the site's H6 plan covers the
    adaptation and must land before the v0.1.3 scorecard regen wave.

Deployment Notes

  • No special deployment steps required for this PR into dev.

Release-branch mechanics per RELEASES.md happen in a follow-up PR:
branch off origin/main, cherry-pick these commits (plus 6fdca00 and dc5d741 from dev), bump Cargo.toml 0.1.2
0.1.3, regenerate changelog via generate-changelog.sh, PR → main. The Homebrew-tap finalize-release dispatch will
land on the wrong repo per todo 006 in brettdavies/.github — the manual recovery command is pre-staged in the v0.1.3
plan's operational notes section.

Checklist

  • Code follows project conventions and style guidelines
  • Commit messages follow Conventional Commits
  • Self-review of code completed (via ce-review 10-reviewer pass against dc5d741; this PR resolves all findings)
  • Tests added/updated and passing (net +11 new)
  • No new warnings or errors introduced (-Dwarnings clean)
  • Changes are backward compatible for v0.1.2 consumers (see Breaking Changes for the v0.1.3 pre-release note)

Additional Context

The ce-review process produced 10 per-reviewer JSON artifacts under
.context/compound-engineering/ce-review/20260422-134229-565fc6d7/ capturing every finding's full detail-tier fields
(why_it_matters + evidence[]). These are local-only by convention and useful for debugging which finding drove which
change.

The module-directory-promotion-pattern-2026-04-22.md solution doc published to docs/solutions/best-practices/ as
part of this session (not this PR — lives in the shared solutions-docs repo) compounds the
three-instances-of-the-same-refactor evidence (v0.1.2 H4 runner.rs split, v0.1.3 H5 scorecard.rs split, and this
PR's help_probe.rs split) into a canonical recipe. The companion trilogy cross-links across two older Rust
module-organization docs landed in the same solutions-docs push.

…-case

Applies every finding from the post-merge ce-review of commit dc5d741:
the P1 coverage-summary inflation fix, the full Check::label() trait
migration, the Pattern 2 extraction, the audience_reason field, the
audit_profiles matrix section, the codified regression guards, and the
audience kebab-case unification (post-review decision — see the v0.1.3 H5
plan's Implementation Log for rationale).

## Changelog

### Changed

- Scorecard JSON `audience` values now serialize as kebab-case
  (`agent-optimized` / `mixed` / `human-primary`) to match `audit_profile`'s
  kebab-case within the same document. Consumers keying on the v0.1.2
  `null` fallback are unaffected; any v0.1.3 consumer must pin on the
  kebab-case shape.
- `coverage_summary.{must,should,may}.verified` no longer counts
  `--audit-profile`-suppressed Skips. Suppression means the requirement
  was not verified, so the metric now matches what the site leaderboard
  renders per tool.

### Added

- `audience_reason` field on scorecard JSON, populated only when
  `audience` is `null`: `"suppressed"` (signal check masked by
  `--audit-profile`) or `"insufficient_signal"` (signal check never
  produced, e.g. source-only run). Additive to v1.1; omitted when
  `audience` has a label.
- Top-level `audit_profiles` array in `coverage/matrix.json` — each entry
  carries `{name, description, suppresses[]}`, letting agents enumerate
  the four categories and their per-category suppressions without
  scraping `--help`.
- `Check::label()` trait method; every check now exposes its human label
  at the trait level. `--audit-profile`-suppressed and errored results
  now show the human label (e.g., "Respects NO_COLOR") instead of
  falling back to the check id.
- `EnvHintSource` enum (`clap_annotation` / `proximity` / `env_section`)
  on every emitted `EnvHint` so agents debugging a Pattern 2 false
  positive can see which branch matched. `audit_profile` descriptions in
  the coverage-matrix artifact explain each suppression category.

### Fixed

- `p1-env-hints` Pattern 2 no longer matches `$PAGER` in flag prose
  (added to `SHELL_ENV_BLACKLIST`; the doc comment always cited it as
  the archetypal exclusion but the slice omitted it). Also no longer
  captures uppercase-underscored section-header lines such as
  `DOCKER_CONFIG:` as env hints.

## Type of Change

- [x] `feat`: New feature (non-breaking for v1.1 consumers that
  feature-detected `audience`/`audit_profile` as null)

## Testing

- Unit tests: 402 passing (net +8 new vs. v0.1.3-rc)
- Integration tests: 50 passing (net +3 new)
- `cargo clippy --all-targets -- -Dwarnings`: clean
- `cargo fmt --check`: clean
- `cargo deny check`: clean
- `scripts/hooks/pre-push` (full local CI mirror): all checks passed
- `anc generate coverage-matrix --check`: no drift
- Dogfood: `anc check . --output json` returns
  `schema_version: "1.1"`, `audience: "agent-optimized"`,
  `audit_profile: null`, with `audience_reason` omitted.
- Dogfood under profile: `anc check . --audit-profile human-tui` returns
  `audience: null`, `audience_reason: "suppressed"`,
  `coverage_summary.must.verified` drops from 17 → 15 (no longer counts
  the suppressed P1 checks).

## Key ce-review findings resolved

- **P1 #1** — `build_coverage_summary` excluded suppressed Skips via
  `audience::is_audit_profile_suppression`; regression test added.
- **P2 #2** — `PAGER` added to `SHELL_ENV_BLACKLIST`; negative test
  covers the full trio (`$PATH`, `$HOME`, `$PAGER`).
- **P2 #3** — `SUPPRESSION_EVIDENCE_PREFIX` extracted in
  `src/principles/registry.rs`; producer (`main.rs`), consumer
  (`scorecard::audience::is_audit_profile_suppression`), and filter
  (`build_coverage_summary`) all reference the single constant.
- **P2 #4, #5** — README.md and AGENTS.md rewritten for the shipped
  v0.1.3 JSON surface, including the `--audit-profile` flag example and
  the committed `audit_profiles` matrix pointer.
- **P2 #6** — `Check::label()` abstract trait method + full migration
  across all 40 check impls + `main.rs` error/suppression branches
  + `FakeCheck` stubs in matrix.rs and scorecard/mod.rs test modules.
- **P2 #7 / P3 #11** — `help_probe.rs` (931 LoC) extracted to
  `help_probe/mod.rs` (~540) + `help_probe/env_hints_bash.rs` (~440)
  via `git mv` (history preserved); `EnvHintSource` enum introduced at
  parent scope so both patterns can tag emitted hints.
- **P2 #8, #9, #10** — tightened `test_audit_profile_diagnostic_does_not_panic_on_self`
  to assert p5-dry-run suppression; added `test_audience_non_null_on_self_dogfood`
  + two `AuditProfile ↔ ExceptionCategory` parity drift tests.
- **P3 #12** — `audit_profiles` section in `coverage/matrix.json`.
- **P3 #13** — `audience_reason` field + `classify_reason()` helper.
- **P3 #15** — shared `is_section_header_line` predicate used by both
  `find_env_section` and `extract_env_tokens`.
- **P3 #18** — `test_scorecard_json_has_stable_top_level_keys` locks
  the v1.1 JSON shape bidirectionally.
- **P3 #23** — `debug_assert!` in `classify()` against duplicate signal
  IDs; `#[should_panic]` test pins the invariant.
- **P3 #25** — `test_principle_filter_forces_audience_null` covers
  `--principle` + audience interaction.
- **P3 #26** — Pattern 2 negative fixtures reject `$Path`, `[FILES]`,
  `<TEMPLATE>`, `CamelCase`, `MACROname`.
- **#16, #17, #20, #22** — codified tests for intentional exit-code
  behavior under suppression, kebab-case JSON contract, `fn run()`-only
  `CheckResult` construction in `src/checks/**`, and Pattern 2's
  bracketed-env-var rejection window.
- Several advisories (#14, #16, #17, #19, #20, #21, #22, #24) landed as
  targeted doc comments on `SUPPRESSION_TABLE`, `exit_code`, `Scorecard`,
  `extract_env_tokens`, and CLAUDE.md's Source Check Convention section.

## Files Modified

**Renamed:** `src/runner/help_probe.rs` → `src/runner/help_probe/mod.rs`
(detected as rename by git; blame preserved).

**New:** `src/runner/help_probe/env_hints_bash.rs`.

**Rust sources (45 files):** `src/check.rs`, `src/cli.rs`, `src/main.rs`,
40 check files under `src/checks/**`, `src/principles/{matrix.rs,registry.rs}`,
`src/runner/help_probe/{mod.rs,env_hints_bash.rs}`,
`src/scorecard/{audience.rs,mod.rs}`, `tests/integration.rs`.

**Artifacts:** `coverage/matrix.json` regenerated (adds `audit_profiles`).

**Docs:** `README.md`, `AGENTS.md`, `CLAUDE.md`.

## Breaking Changes

- [x] No breaking changes for v0.1.2 consumers (emitted `audience: null`
  and `audit_profile: null`).
- [x] Kebab-case audience values require v0.1.3 consumers that pinned on
  snake_case to update. The site's H6 leaderboard hasn't shipped; see
  `agentnative-site/docs/plans/2026-04-21-v013-handoff-6-site-leaderboard-launch.md`
  Unit 0.5 for the pre-regen site-side adaptation plan.

## Deployment Notes

- [x] Release-branch mechanics per `RELEASES.md` happen in the follow-up
  PR: branch off `origin/main`, cherry-pick these commits, bump
  `Cargo.toml` `0.1.2 → 0.1.3`, regenerate changelog, PR to `main`.
@brettdavies brettdavies merged commit ceb0678 into dev Apr 22, 2026
6 checks passed
@brettdavies brettdavies deleted the feat/v013-ce-review-fixes branch April 22, 2026 23:21
brettdavies added a commit that referenced this pull request Apr 22, 2026
…-case (#27)

## Summary

Resolves all 26 findings from the ce-review of commit `dc5d74168577` and unifies `audience` JSON values to kebab-case (a
post-review decision that eliminates a mixed-casing asymmetry flagged during review).

## Changelog

### Added

- Add `audience_reason` field on scorecard JSON — populated only when `audience` is `null`, with `"suppressed"` (signal
  check masked by `--audit-profile`) or `"insufficient_signal"` (signal check never produced) so consumers can see why
  the classifier withheld a label.
- Add `audit_profiles` array in `coverage/matrix.json` — each entry carries `{name, description, suppresses[]}`, letting
  agents and site renderers enumerate the four `--audit-profile` categories and what each one suppresses without
  scraping `--help`.

### Changed

- Change `audience` JSON values from snake_case to kebab-case (`agent-optimized` / `mixed` / `human-primary`) so both
  `audience` and `audit_profile` share a single casing within the same scorecard document. Consumers pinning on v0.1.2's
  `null` fallback are unaffected; v0.1.3 consumers must match the kebab-case shape.
- Change `coverage_summary.{must,should,may}.verified` to exclude `--audit-profile`-suppressed checks. Suppression now
  correctly reads as "not verified" in the metric, so site leaderboards no longer overstate per-tool coverage under
  audit profiles.
- Change suppressed and errored `results[].label` values to show the check's human-readable label (e.g., "Respects
  NO_COLOR") instead of falling back to the check id.

### Fixed

- Fix `p1-env-hints` Pattern 2 treating `$PAGER` as a flag-bound env reference. `PAGER` was always named in the
  `SHELL_ENV_BLACKLIST` doc comment as the archetypal exclusion, but was missing from the slice; tools like `git`, `gh`,
  and `man` that document `$PAGER` no longer produce a false-positive `p1-env-hints` warn.
- Fix Pattern 2 capturing uppercase-underscored section-header lines (e.g., `DOCKER_CONFIG:`) as env hints. Section
  headers are now excluded via a shared predicate before token extraction.

### Documentation

- Update README.md, AGENTS.md, and CLAUDE.md to describe the shipped v0.1.3 scorecard surface: `audience` +
  `audience_reason` + `audit_profile` field semantics, the `--audit-profile` flag with examples, and the
  `audit_profiles` section of `coverage/matrix.json` as the programmatic source for category enumeration.

## Type of Change

- [x] `feat`: New feature (non-breaking change which adds functionality)
- [x] `fix`: Bug fix (non-breaking change which fixes an issue)
- [x] `refactor`: Code refactoring (no functional changes)
- [x] `test`: Adding or updating tests
- [x] `docs`: Documentation update

## Related Issues/Stories

- Plan (CLI-side): `docs/plans/2026-04-21-v013-handoff-5-cli-audience-classifier.md` — Implementation Log captures the
  post-review kebab-case deviation
- Plan (site-side coordination): `agentnative-site/docs/plans/2026-04-21-v013-handoff-6-site-leaderboard-launch.md` —
  Unit 0.5 covers the kebab-case absorption that must land before the v0.1.3 scorecard regen wave
- Compounded learning: `docs/solutions/best-practices/module-directory-promotion-pattern-2026-04-22.md` — canonical `git
  mv` recipe for `foo.rs` → `foo/mod.rs` promotions (documented off the back of this PR's `help_probe` extraction)
- Compounded-trilogy cross-links: `docs/solutions/best-practices/rust-module-splitting-srp-not-loc-20260327.md`,
  `docs/solutions/best-practices/rust-pub-crate-fields-for-cross-module-impl-pattern-2026-04-20.md`
- Related PR (base commit under review): #26
- CE-review artifacts: `.context/compound-engineering/ce-review/20260422-134229-565fc6d7/` (10 reviewer JSONs +
  `metadata.json`; local-only by convention)

## Testing

- [x] Unit tests added/updated
- [x] Integration tests added/updated
- [x] Manual testing completed
- [x] All tests passing

**Test Summary:**

- Unit tests: 402 passing (net +8 new across `scorecard::audience`, `scorecard::tests`, `principles::registry`,
  `principles::matrix`, `cli::tests`, and `runner::help_probe::env_hints_bash::tests`)
- Integration tests: 50 passing (net +3 new: concrete non-null audience on self-dogfood, `--principle` forces `audience:
  null`, full-shape JSON snapshot lock)
- `cargo fmt --check`: clean
- `cargo clippy --all-targets -- -Dwarnings`: clean
- `cargo deny check`: advisories, bans, licenses, sources all OK
- `cargo run -- generate coverage-matrix --check`: no drift
- `scripts/hooks/pre-push` (full local-CI mirror): all checks passed
- GitHub CI on the PR head (`7a384b8`): all 5 required checks green
- Dogfood without profile: `anc check . --output json` → `schema_version: "1.1"`, `audience: "agent-optimized"`,
  `audit_profile: null`, `audience_reason` absent (omitted via `skip_serializing_if`).
- Dogfood with profile: `anc check . --audit-profile human-tui` → `audience: null`, `audience_reason: "suppressed"`,
  `coverage_summary.must.verified` drops 17 → 15 (suppressed P1 checks no longer inflate the metric).

## Files Modified

**Modified:**

- `AGENTS.md`, `CLAUDE.md`, `README.md` — v0.1.3 scorecard surface sync
- `coverage/matrix.json` — regenerated (adds top-level `audit_profiles` array)
- `src/check.rs` — new abstract `label(&self) -> &'static str` trait method
- `src/checks/**/*.rs` (40 files) — mechanical `label()` impl + `run()` body migration from `label: "…".into()` literals
  to `label: self.label().into()`
- `src/cli.rs` — `AuditProfile` ↔ `ExceptionCategory` parity drift tests
- `src/main.rs` — `SUPPRESSION_EVIDENCE_PREFIX` usage + `check.label()` in error and `--audit-profile` suppression
  branches
- `src/principles/registry.rs` — `SUPPRESSION_EVIDENCE_PREFIX` constant, `ALL_EXCEPTION_CATEGORIES` slice + compile-time
  variant guard, `ExceptionCategory::description()`, doc additions on `SUPPRESSION_TABLE` (trust boundary, drift-test
  scope)
- `src/principles/matrix.rs` — `AuditProfileEntry` struct, `audit_profiles` section + per-variant coverage test,
  `matrix_json_includes_audit_profiles_section`
- `src/scorecard/audience.rs` — kebab-case `classify()`, new `classify_reason()` helper, `is_audit_profile_suppression`
  promoted to `pub(crate)`, `debug_assert!` against duplicate signal IDs, module doc + suppression-prefix references
- `src/scorecard/mod.rs` — `audience_reason: Option<String>` field (derived in `build_scorecard`, `skip_serializing_if`
  when `audience` has a label), `build_coverage_summary` filter, Scorecard struct doc, `exit_code` trust-boundary doc,
  snapshot + casing + duplicate-signal regression tests
- `tests/integration.rs` — convention test blocking `CheckResult { … }` outside `fn run()` in `src/checks/**`, non-null
  audience dogfood, `--principle` + audience interaction, full-shape JSON snapshot, tightened `diagnostic-only`
  assertion, Pattern 2 bracketed-rejection and mixed-case negative fixtures

**Created:**

- `src/runner/help_probe/env_hints_bash.rs` — Pattern 2 submodule extracted from the 931-line `help_probe.rs`; holds
  `parse_env_hints_bash_style()`, `extract_env_tokens()`, `find_env_section()`, `is_section_header_line()`,
  `strip_clap_env_annotations()`, `SHELL_ENV_BLACKLIST`, `PATTERN2_WINDOW`, and 9 Pattern 2 tests (including 3 new for
  PAGER, section-header exclusion, and mixed-case / placeholder negatives).

**Renamed:**

- `src/runner/help_probe.rs` → `src/runner/help_probe/mod.rs` (via `git mv`; git detects rename at 58% similarity, blame
  preserved).

## Key Features

- Single source of truth for the suppression evidence string (`SUPPRESSION_EVIDENCE_PREFIX`): producer (`main.rs`),
  consumer (`audience::is_audit_profile_suppression`), and filter (`build_coverage_summary`) all reference the same
  constant so a rename can't silently desync the three sites.
- `Check::label()` on the trait means every emitted `CheckResult` — including the runtime layer's error and suppression
  branches — now uses a single source of truth for each check's human label. No more "scorecard row shows the cryptic
  id" when a check errors out or gets masked by a profile.
- `EnvHintSource` enum (`clap_annotation` / `proximity` / `env_section`) tags every emitted `EnvHint` with the pattern
  that matched, giving agents a structured signal when debugging a p1-env-hints false positive. Pattern 1 wins on
  duplicates; proximity loses to env_section when both apply.
- Compile-time guard (`_all_categories_covers_every_variant`) + runtime drift tests
  (`audit_profile_and_exception_category_variants_are_isomorphic`, `suppression_table_check_ids_exist_in_catalog`)
  triangulate the `AuditProfile` (CLI) ↔ `ExceptionCategory` (registry) ↔ `SUPPRESSION_TABLE` (registry) relationship so
  a fifth category can't land via partial edits.

## Benefits

- **Correctness of published metrics.** Before this PR, every v0.1.3 scorecard emitted with `--audit-profile` overstated
  `coverage_summary.verified` — the site's H6 leaderboard would have rendered inflated coverage numbers that didn't
  match the results-level evidence. Shipped the fix before the site's regen wave.
- **JSON consistency.** Unifying `audience` to kebab-case closes the only document-level casing asymmetry in the
  scorecard shape, so consumers no longer need two-rule case logic for one document.
- **Agent discoverability.** Both the `audit_profiles` matrix section and the expanded AGENTS.md "Agent-facing JSON
  surface" section turn previously help-text-scraped knowledge into first-class, committed, machine-readable artifacts.
- **Regression surface.** Eight new positive / negative / drift tests pin intentional behavior that future refactors
  might unknowingly break: exit-code-under-suppression, JSON casing contract, `CheckResult` construction scope,
  bracketed-env-var rejection, AuditProfile↔ExceptionCategory parity, signal-ID uniqueness, full-shape JSON snapshot,
  `--principle` + audience interaction.

## Breaking Changes

- [x] No breaking changes for v0.1.2 consumers (which always received `audience: null` and `audit_profile: null`).
- [x] Breaking for pre-release v0.1.3 consumers that had pinned on snake_case `audience` values. The only such consumer
  is in-house (agentnative-site's H6 leaderboard), which hasn't shipped yet. Unit 0.5 of the site's H6 plan covers the
  adaptation and must land before the v0.1.3 scorecard regen wave.

## Deployment Notes

- [x] No special deployment steps required for this PR into `dev`.

Release-branch mechanics per `RELEASES.md` happen in a follow-up PR:
branch off `origin/main`, cherry-pick these commits (plus `6fdca00` and `dc5d741` from dev), bump `Cargo.toml` `0.1.2` →
`0.1.3`, regenerate changelog via `generate-changelog.sh`, PR → `main`. The Homebrew-tap finalize-release dispatch will
land on the wrong repo per todo `006` in `brettdavies/.github` — the manual recovery command is pre-staged in the v0.1.3
plan's operational notes section.

## Checklist

- [x] Code follows project conventions and style guidelines
- [x] Commit messages follow [Conventional Commits](https://www.conventionalcommits.org/)
- [x] Self-review of code completed (via ce-review 10-reviewer pass against `dc5d741`; this PR resolves all findings)
- [x] Tests added/updated and passing (net +11 new)
- [x] No new warnings or errors introduced (`-Dwarnings` clean)
- [x] Changes are backward compatible for v0.1.2 consumers (see Breaking Changes for the v0.1.3 pre-release note)

## Additional Context

The ce-review process produced 10 per-reviewer JSON artifacts under
`.context/compound-engineering/ce-review/20260422-134229-565fc6d7/` capturing every finding's full detail-tier fields
(`why_it_matters` + `evidence[]`). These are local-only by convention and useful for debugging which finding drove which
change.

The `module-directory-promotion-pattern-2026-04-22.md` solution doc published to `docs/solutions/best-practices/` as
part of this session (not this PR — lives in the shared solutions-docs repo) compounds the
three-instances-of-the-same-refactor evidence (v0.1.2 H4 `runner.rs` split, v0.1.3 H5 `scorecard.rs` split, and this
PR's `help_probe.rs` split) into a canonical recipe. The companion trilogy cross-links across two older Rust
module-organization docs landed in the same solutions-docs push.
@brettdavies brettdavies mentioned this pull request Apr 23, 2026
16 tasks
brettdavies added a commit that referenced this pull request Apr 23, 2026
## Summary

Release branch for v0.1.3. Cherry-picked from `dev`:

- **PR #26** — `audience` classifier shipped (`agent-optimized` /
`mixed` / `human-primary`), `--audit-profile` flag (`human-tui`,
`file-traversal`, `posix-utility`, `diagnostic-only`) with suppression
wiring, `p1-env-hints` Pattern 2 for bash-style `$FOO` / `TOOL_FOO`
references near flag definitions, plus `audience_reason` +
`audit_profiles` matrix section.
- **PR #27** — 26 ce-review findings resolved, `audience` snake_case →
kebab-case unification, `$PAGER` / section-header false-positive fixes
in Pattern 2, suppressed/errored `results[].label` now shows human
labels, `coverage_summary.verified` correctly excludes
`--audit-profile`-suppressed checks.

Plus standard release mechanics: `Cargo.toml` bump to `0.1.3`,
`Cargo.lock` refresh, regenerated `CHANGELOG.md` (git-cliff from
squash-commit `## Changelog` bodies). No completions drift — PR #26
already regenerated them.

## Changelog

<!-- Release PR itself has no user-facing content — the upstream PRs #26
and #27 carry the `## Changelog` sections that git-cliff extracted into
CHANGELOG.md. Leaving this section empty per release-branch convention.
-->

## Type of Change

- [x] `feat`: New feature (non-breaking change which adds functionality)

## Related Issues/Stories

- Story: v0.1.3 H5 — audience classifier + audit-profile suppression
- Related PRs: #26 (feature), #27 (review resolutions)
- Architecture: see `docs/plans/v0.1.3-h5-plan.md` on `dev`

## Testing

- [x] Unit tests added/updated
- [x] Integration tests added/updated
- [x] Manual testing completed
- [x] All tests passing

**Test Summary:**

- Pre-push hook green on release branch: fmt / clippy `-Dwarnings` /
test / cargo-deny / Windows compat
- CI green on PR head: `ci / Fmt, clippy, test`, `ci / Package check`,
`ci / Security audit (advisories)`, `ci / Security audit (bans licenses
sources)`, `ci / Windows check`, `ci / Changelog`, `guard-docs /
check-forbidden-docs`, `guard-provenance / check-provenance`,
`guard-release / check-release-branch-name`
- Coverage matrix drift check passes against committed artifacts
- Smoke-tested `anc check .` (dogfood) with and without each
`--audit-profile` category

## Files Modified

**Modified:**

- `Cargo.toml`, `Cargo.lock` — version bump 0.1.2 → 0.1.3
- `CHANGELOG.md` — regenerated via `scripts/generate-changelog.sh`
- See cherry-pick commits `4a18051` (PR #26) and `db11e97` (PR #27) for
the full code-change surface (56 files, +2954 / −533).

**Created:**

- `src/scorecard/audience.rs`
- `src/runner/help_probe/env_hints_bash.rs`

**Deleted:**

- None (files renamed via `git mv` as part of module reorganization —
see cherry-pick bodies).

## Key Features

- Populated `audience` scorecard field with kebab-case labels
(`agent-optimized` / `mixed` / `human-primary`) when all four signal
behavioral checks ran; `null` with `audience_reason` when any are
missing.
- `--audit-profile <category>` flag on `anc check` suppresses checks
that don't apply to specific tool classes (human-TUI, file-traversal
utilities, POSIX utilities, diagnostic-only tools).
- `p1-env-hints` Pattern 2 recognizes bash-style `$FOO` / `TOOL_FOO` env
references near flag definitions — tools like `ripgrep` and `aider` that
document env bindings in free prose now Pass instead of Warn.
- `audit_profiles[]` section in `coverage/matrix.json` — consumers can
enumerate profiles programmatically instead of scraping `--help`.

## Benefits

- Scorecards now carry enough metadata for agents to route tools by
audience without running every check.
- `--audit-profile` prevents false-negative "fails" for tools that
legitimately don't satisfy a given principle (e.g., a file-traversal
utility isn't expected to output structured JSON).
- Pattern 2 removes a whole class of false positives on non-clap Rust
CLIs and non-Rust CLIs that document env vars in help prose.

## Breaking Changes

- [x] No breaking changes

Scorecard schema stays at `1.1`. New fields (`audience_reason`,
`audit_profile`, `audit_profiles[]` in matrix) are additive. `audience`
JSON values move from snake_case to kebab-case — v0.1.2 consumers only
ever saw `null` for this field, so consumers pinning on the v0.1.2 shape
are unaffected; v0.1.3 consumers must match the kebab-case form.

## Deployment Notes

- [ ] No special deployment steps required
- [x] Deployment steps documented below:

Post-merge:

1. Annotated tag + push — `git tag -a -m "Release v0.1.3" v0.1.3 && git
push origin main --tags`.
2. Tag push triggers `release.yml` → `cargo publish` (Trusted
Publishing) → GitHub Release (draftless, `make_latest: false` during
bottle window) → Homebrew dispatch → `finalize-release` flips
`make_latest: true`.
3. Follow-on on `agentnative-site`: install v0.1.3, regenerate
scorecards (new `audience` kebab-case + `audit_profile` +
`audience_reason` fields will populate), run
`scripts/sync-coverage-matrix.sh` to pull the new `audit_profiles`
section into `src/data/coverage-matrix.json`.

## Screenshots/Recordings

N/A — CLI-only change.

## Checklist

- [x] Code follows project conventions and style guidelines
- [x] Commit messages follow [Conventional
Commits](https://www.conventionalcommits.org/)
- [x] Self-review of code completed
- [x] Tests added/updated and passing
- [x] No new warnings or errors introduced
- [x] Changes are backward compatible (or breaking changes documented)
- [x] Cherry-picks touched no guarded paths (`docs/plans/`,
`docs/solutions/`, `docs/brainstorms/`, `docs/reviews/`)
- [x] `Cargo.toml` version matches the tag this branch will produce

## Additional Context

The two upstream feature PRs (#26 and #27) each carry their own `##
Changelog` sections — those are what `scripts/generate-changelog.sh`
extracts into `CHANGELOG.md`. This release PR's empty `## Changelog` is
intentional: duplicating the upstream bullets here would double-count
them in the next release cycle.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant