feat(v0.1.3): resolve 26 ce-review findings + unify audience to kebab-case#27
Merged
brettdavies merged 1 commit intodevfrom Apr 22, 2026
Merged
feat(v0.1.3): resolve 26 ce-review findings + unify audience to kebab-case#27brettdavies merged 1 commit intodevfrom
brettdavies merged 1 commit intodevfrom
Conversation
…-case Applies every finding from the post-merge ce-review of commit dc5d741: the P1 coverage-summary inflation fix, the full Check::label() trait migration, the Pattern 2 extraction, the audience_reason field, the audit_profiles matrix section, the codified regression guards, and the audience kebab-case unification (post-review decision — see the v0.1.3 H5 plan's Implementation Log for rationale). ## Changelog ### Changed - Scorecard JSON `audience` values now serialize as kebab-case (`agent-optimized` / `mixed` / `human-primary`) to match `audit_profile`'s kebab-case within the same document. Consumers keying on the v0.1.2 `null` fallback are unaffected; any v0.1.3 consumer must pin on the kebab-case shape. - `coverage_summary.{must,should,may}.verified` no longer counts `--audit-profile`-suppressed Skips. Suppression means the requirement was not verified, so the metric now matches what the site leaderboard renders per tool. ### Added - `audience_reason` field on scorecard JSON, populated only when `audience` is `null`: `"suppressed"` (signal check masked by `--audit-profile`) or `"insufficient_signal"` (signal check never produced, e.g. source-only run). Additive to v1.1; omitted when `audience` has a label. - Top-level `audit_profiles` array in `coverage/matrix.json` — each entry carries `{name, description, suppresses[]}`, letting agents enumerate the four categories and their per-category suppressions without scraping `--help`. - `Check::label()` trait method; every check now exposes its human label at the trait level. `--audit-profile`-suppressed and errored results now show the human label (e.g., "Respects NO_COLOR") instead of falling back to the check id. - `EnvHintSource` enum (`clap_annotation` / `proximity` / `env_section`) on every emitted `EnvHint` so agents debugging a Pattern 2 false positive can see which branch matched. `audit_profile` descriptions in the coverage-matrix artifact explain each suppression category. ### Fixed - `p1-env-hints` Pattern 2 no longer matches `$PAGER` in flag prose (added to `SHELL_ENV_BLACKLIST`; the doc comment always cited it as the archetypal exclusion but the slice omitted it). Also no longer captures uppercase-underscored section-header lines such as `DOCKER_CONFIG:` as env hints. ## Type of Change - [x] `feat`: New feature (non-breaking for v1.1 consumers that feature-detected `audience`/`audit_profile` as null) ## Testing - Unit tests: 402 passing (net +8 new vs. v0.1.3-rc) - Integration tests: 50 passing (net +3 new) - `cargo clippy --all-targets -- -Dwarnings`: clean - `cargo fmt --check`: clean - `cargo deny check`: clean - `scripts/hooks/pre-push` (full local CI mirror): all checks passed - `anc generate coverage-matrix --check`: no drift - Dogfood: `anc check . --output json` returns `schema_version: "1.1"`, `audience: "agent-optimized"`, `audit_profile: null`, with `audience_reason` omitted. - Dogfood under profile: `anc check . --audit-profile human-tui` returns `audience: null`, `audience_reason: "suppressed"`, `coverage_summary.must.verified` drops from 17 → 15 (no longer counts the suppressed P1 checks). ## Key ce-review findings resolved - **P1 #1** — `build_coverage_summary` excluded suppressed Skips via `audience::is_audit_profile_suppression`; regression test added. - **P2 #2** — `PAGER` added to `SHELL_ENV_BLACKLIST`; negative test covers the full trio (`$PATH`, `$HOME`, `$PAGER`). - **P2 #3** — `SUPPRESSION_EVIDENCE_PREFIX` extracted in `src/principles/registry.rs`; producer (`main.rs`), consumer (`scorecard::audience::is_audit_profile_suppression`), and filter (`build_coverage_summary`) all reference the single constant. - **P2 #4, #5** — README.md and AGENTS.md rewritten for the shipped v0.1.3 JSON surface, including the `--audit-profile` flag example and the committed `audit_profiles` matrix pointer. - **P2 #6** — `Check::label()` abstract trait method + full migration across all 40 check impls + `main.rs` error/suppression branches + `FakeCheck` stubs in matrix.rs and scorecard/mod.rs test modules. - **P2 #7 / P3 #11** — `help_probe.rs` (931 LoC) extracted to `help_probe/mod.rs` (~540) + `help_probe/env_hints_bash.rs` (~440) via `git mv` (history preserved); `EnvHintSource` enum introduced at parent scope so both patterns can tag emitted hints. - **P2 #8, #9, #10** — tightened `test_audit_profile_diagnostic_does_not_panic_on_self` to assert p5-dry-run suppression; added `test_audience_non_null_on_self_dogfood` + two `AuditProfile ↔ ExceptionCategory` parity drift tests. - **P3 #12** — `audit_profiles` section in `coverage/matrix.json`. - **P3 #13** — `audience_reason` field + `classify_reason()` helper. - **P3 #15** — shared `is_section_header_line` predicate used by both `find_env_section` and `extract_env_tokens`. - **P3 #18** — `test_scorecard_json_has_stable_top_level_keys` locks the v1.1 JSON shape bidirectionally. - **P3 #23** — `debug_assert!` in `classify()` against duplicate signal IDs; `#[should_panic]` test pins the invariant. - **P3 #25** — `test_principle_filter_forces_audience_null` covers `--principle` + audience interaction. - **P3 #26** — Pattern 2 negative fixtures reject `$Path`, `[FILES]`, `<TEMPLATE>`, `CamelCase`, `MACROname`. - **#16, #17, #20, #22** — codified tests for intentional exit-code behavior under suppression, kebab-case JSON contract, `fn run()`-only `CheckResult` construction in `src/checks/**`, and Pattern 2's bracketed-env-var rejection window. - Several advisories (#14, #16, #17, #19, #20, #21, #22, #24) landed as targeted doc comments on `SUPPRESSION_TABLE`, `exit_code`, `Scorecard`, `extract_env_tokens`, and CLAUDE.md's Source Check Convention section. ## Files Modified **Renamed:** `src/runner/help_probe.rs` → `src/runner/help_probe/mod.rs` (detected as rename by git; blame preserved). **New:** `src/runner/help_probe/env_hints_bash.rs`. **Rust sources (45 files):** `src/check.rs`, `src/cli.rs`, `src/main.rs`, 40 check files under `src/checks/**`, `src/principles/{matrix.rs,registry.rs}`, `src/runner/help_probe/{mod.rs,env_hints_bash.rs}`, `src/scorecard/{audience.rs,mod.rs}`, `tests/integration.rs`. **Artifacts:** `coverage/matrix.json` regenerated (adds `audit_profiles`). **Docs:** `README.md`, `AGENTS.md`, `CLAUDE.md`. ## Breaking Changes - [x] No breaking changes for v0.1.2 consumers (emitted `audience: null` and `audit_profile: null`). - [x] Kebab-case audience values require v0.1.3 consumers that pinned on snake_case to update. The site's H6 leaderboard hasn't shipped; see `agentnative-site/docs/plans/2026-04-21-v013-handoff-6-site-leaderboard-launch.md` Unit 0.5 for the pre-regen site-side adaptation plan. ## Deployment Notes - [x] Release-branch mechanics per `RELEASES.md` happen in the follow-up PR: branch off `origin/main`, cherry-pick these commits, bump `Cargo.toml` `0.1.2 → 0.1.3`, regenerate changelog, PR to `main`.
brettdavies
added a commit
that referenced
this pull request
Apr 22, 2026
…-case (#27) ## Summary Resolves all 26 findings from the ce-review of commit `dc5d74168577` and unifies `audience` JSON values to kebab-case (a post-review decision that eliminates a mixed-casing asymmetry flagged during review). ## Changelog ### Added - Add `audience_reason` field on scorecard JSON — populated only when `audience` is `null`, with `"suppressed"` (signal check masked by `--audit-profile`) or `"insufficient_signal"` (signal check never produced) so consumers can see why the classifier withheld a label. - Add `audit_profiles` array in `coverage/matrix.json` — each entry carries `{name, description, suppresses[]}`, letting agents and site renderers enumerate the four `--audit-profile` categories and what each one suppresses without scraping `--help`. ### Changed - Change `audience` JSON values from snake_case to kebab-case (`agent-optimized` / `mixed` / `human-primary`) so both `audience` and `audit_profile` share a single casing within the same scorecard document. Consumers pinning on v0.1.2's `null` fallback are unaffected; v0.1.3 consumers must match the kebab-case shape. - Change `coverage_summary.{must,should,may}.verified` to exclude `--audit-profile`-suppressed checks. Suppression now correctly reads as "not verified" in the metric, so site leaderboards no longer overstate per-tool coverage under audit profiles. - Change suppressed and errored `results[].label` values to show the check's human-readable label (e.g., "Respects NO_COLOR") instead of falling back to the check id. ### Fixed - Fix `p1-env-hints` Pattern 2 treating `$PAGER` as a flag-bound env reference. `PAGER` was always named in the `SHELL_ENV_BLACKLIST` doc comment as the archetypal exclusion, but was missing from the slice; tools like `git`, `gh`, and `man` that document `$PAGER` no longer produce a false-positive `p1-env-hints` warn. - Fix Pattern 2 capturing uppercase-underscored section-header lines (e.g., `DOCKER_CONFIG:`) as env hints. Section headers are now excluded via a shared predicate before token extraction. ### Documentation - Update README.md, AGENTS.md, and CLAUDE.md to describe the shipped v0.1.3 scorecard surface: `audience` + `audience_reason` + `audit_profile` field semantics, the `--audit-profile` flag with examples, and the `audit_profiles` section of `coverage/matrix.json` as the programmatic source for category enumeration. ## Type of Change - [x] `feat`: New feature (non-breaking change which adds functionality) - [x] `fix`: Bug fix (non-breaking change which fixes an issue) - [x] `refactor`: Code refactoring (no functional changes) - [x] `test`: Adding or updating tests - [x] `docs`: Documentation update ## Related Issues/Stories - Plan (CLI-side): `docs/plans/2026-04-21-v013-handoff-5-cli-audience-classifier.md` — Implementation Log captures the post-review kebab-case deviation - Plan (site-side coordination): `agentnative-site/docs/plans/2026-04-21-v013-handoff-6-site-leaderboard-launch.md` — Unit 0.5 covers the kebab-case absorption that must land before the v0.1.3 scorecard regen wave - Compounded learning: `docs/solutions/best-practices/module-directory-promotion-pattern-2026-04-22.md` — canonical `git mv` recipe for `foo.rs` → `foo/mod.rs` promotions (documented off the back of this PR's `help_probe` extraction) - Compounded-trilogy cross-links: `docs/solutions/best-practices/rust-module-splitting-srp-not-loc-20260327.md`, `docs/solutions/best-practices/rust-pub-crate-fields-for-cross-module-impl-pattern-2026-04-20.md` - Related PR (base commit under review): #26 - CE-review artifacts: `.context/compound-engineering/ce-review/20260422-134229-565fc6d7/` (10 reviewer JSONs + `metadata.json`; local-only by convention) ## Testing - [x] Unit tests added/updated - [x] Integration tests added/updated - [x] Manual testing completed - [x] All tests passing **Test Summary:** - Unit tests: 402 passing (net +8 new across `scorecard::audience`, `scorecard::tests`, `principles::registry`, `principles::matrix`, `cli::tests`, and `runner::help_probe::env_hints_bash::tests`) - Integration tests: 50 passing (net +3 new: concrete non-null audience on self-dogfood, `--principle` forces `audience: null`, full-shape JSON snapshot lock) - `cargo fmt --check`: clean - `cargo clippy --all-targets -- -Dwarnings`: clean - `cargo deny check`: advisories, bans, licenses, sources all OK - `cargo run -- generate coverage-matrix --check`: no drift - `scripts/hooks/pre-push` (full local-CI mirror): all checks passed - GitHub CI on the PR head (`7a384b8`): all 5 required checks green - Dogfood without profile: `anc check . --output json` → `schema_version: "1.1"`, `audience: "agent-optimized"`, `audit_profile: null`, `audience_reason` absent (omitted via `skip_serializing_if`). - Dogfood with profile: `anc check . --audit-profile human-tui` → `audience: null`, `audience_reason: "suppressed"`, `coverage_summary.must.verified` drops 17 → 15 (suppressed P1 checks no longer inflate the metric). ## Files Modified **Modified:** - `AGENTS.md`, `CLAUDE.md`, `README.md` — v0.1.3 scorecard surface sync - `coverage/matrix.json` — regenerated (adds top-level `audit_profiles` array) - `src/check.rs` — new abstract `label(&self) -> &'static str` trait method - `src/checks/**/*.rs` (40 files) — mechanical `label()` impl + `run()` body migration from `label: "…".into()` literals to `label: self.label().into()` - `src/cli.rs` — `AuditProfile` ↔ `ExceptionCategory` parity drift tests - `src/main.rs` — `SUPPRESSION_EVIDENCE_PREFIX` usage + `check.label()` in error and `--audit-profile` suppression branches - `src/principles/registry.rs` — `SUPPRESSION_EVIDENCE_PREFIX` constant, `ALL_EXCEPTION_CATEGORIES` slice + compile-time variant guard, `ExceptionCategory::description()`, doc additions on `SUPPRESSION_TABLE` (trust boundary, drift-test scope) - `src/principles/matrix.rs` — `AuditProfileEntry` struct, `audit_profiles` section + per-variant coverage test, `matrix_json_includes_audit_profiles_section` - `src/scorecard/audience.rs` — kebab-case `classify()`, new `classify_reason()` helper, `is_audit_profile_suppression` promoted to `pub(crate)`, `debug_assert!` against duplicate signal IDs, module doc + suppression-prefix references - `src/scorecard/mod.rs` — `audience_reason: Option<String>` field (derived in `build_scorecard`, `skip_serializing_if` when `audience` has a label), `build_coverage_summary` filter, Scorecard struct doc, `exit_code` trust-boundary doc, snapshot + casing + duplicate-signal regression tests - `tests/integration.rs` — convention test blocking `CheckResult { … }` outside `fn run()` in `src/checks/**`, non-null audience dogfood, `--principle` + audience interaction, full-shape JSON snapshot, tightened `diagnostic-only` assertion, Pattern 2 bracketed-rejection and mixed-case negative fixtures **Created:** - `src/runner/help_probe/env_hints_bash.rs` — Pattern 2 submodule extracted from the 931-line `help_probe.rs`; holds `parse_env_hints_bash_style()`, `extract_env_tokens()`, `find_env_section()`, `is_section_header_line()`, `strip_clap_env_annotations()`, `SHELL_ENV_BLACKLIST`, `PATTERN2_WINDOW`, and 9 Pattern 2 tests (including 3 new for PAGER, section-header exclusion, and mixed-case / placeholder negatives). **Renamed:** - `src/runner/help_probe.rs` → `src/runner/help_probe/mod.rs` (via `git mv`; git detects rename at 58% similarity, blame preserved). ## Key Features - Single source of truth for the suppression evidence string (`SUPPRESSION_EVIDENCE_PREFIX`): producer (`main.rs`), consumer (`audience::is_audit_profile_suppression`), and filter (`build_coverage_summary`) all reference the same constant so a rename can't silently desync the three sites. - `Check::label()` on the trait means every emitted `CheckResult` — including the runtime layer's error and suppression branches — now uses a single source of truth for each check's human label. No more "scorecard row shows the cryptic id" when a check errors out or gets masked by a profile. - `EnvHintSource` enum (`clap_annotation` / `proximity` / `env_section`) tags every emitted `EnvHint` with the pattern that matched, giving agents a structured signal when debugging a p1-env-hints false positive. Pattern 1 wins on duplicates; proximity loses to env_section when both apply. - Compile-time guard (`_all_categories_covers_every_variant`) + runtime drift tests (`audit_profile_and_exception_category_variants_are_isomorphic`, `suppression_table_check_ids_exist_in_catalog`) triangulate the `AuditProfile` (CLI) ↔ `ExceptionCategory` (registry) ↔ `SUPPRESSION_TABLE` (registry) relationship so a fifth category can't land via partial edits. ## Benefits - **Correctness of published metrics.** Before this PR, every v0.1.3 scorecard emitted with `--audit-profile` overstated `coverage_summary.verified` — the site's H6 leaderboard would have rendered inflated coverage numbers that didn't match the results-level evidence. Shipped the fix before the site's regen wave. - **JSON consistency.** Unifying `audience` to kebab-case closes the only document-level casing asymmetry in the scorecard shape, so consumers no longer need two-rule case logic for one document. - **Agent discoverability.** Both the `audit_profiles` matrix section and the expanded AGENTS.md "Agent-facing JSON surface" section turn previously help-text-scraped knowledge into first-class, committed, machine-readable artifacts. - **Regression surface.** Eight new positive / negative / drift tests pin intentional behavior that future refactors might unknowingly break: exit-code-under-suppression, JSON casing contract, `CheckResult` construction scope, bracketed-env-var rejection, AuditProfile↔ExceptionCategory parity, signal-ID uniqueness, full-shape JSON snapshot, `--principle` + audience interaction. ## Breaking Changes - [x] No breaking changes for v0.1.2 consumers (which always received `audience: null` and `audit_profile: null`). - [x] Breaking for pre-release v0.1.3 consumers that had pinned on snake_case `audience` values. The only such consumer is in-house (agentnative-site's H6 leaderboard), which hasn't shipped yet. Unit 0.5 of the site's H6 plan covers the adaptation and must land before the v0.1.3 scorecard regen wave. ## Deployment Notes - [x] No special deployment steps required for this PR into `dev`. Release-branch mechanics per `RELEASES.md` happen in a follow-up PR: branch off `origin/main`, cherry-pick these commits (plus `6fdca00` and `dc5d741` from dev), bump `Cargo.toml` `0.1.2` → `0.1.3`, regenerate changelog via `generate-changelog.sh`, PR → `main`. The Homebrew-tap finalize-release dispatch will land on the wrong repo per todo `006` in `brettdavies/.github` — the manual recovery command is pre-staged in the v0.1.3 plan's operational notes section. ## Checklist - [x] Code follows project conventions and style guidelines - [x] Commit messages follow [Conventional Commits](https://www.conventionalcommits.org/) - [x] Self-review of code completed (via ce-review 10-reviewer pass against `dc5d741`; this PR resolves all findings) - [x] Tests added/updated and passing (net +11 new) - [x] No new warnings or errors introduced (`-Dwarnings` clean) - [x] Changes are backward compatible for v0.1.2 consumers (see Breaking Changes for the v0.1.3 pre-release note) ## Additional Context The ce-review process produced 10 per-reviewer JSON artifacts under `.context/compound-engineering/ce-review/20260422-134229-565fc6d7/` capturing every finding's full detail-tier fields (`why_it_matters` + `evidence[]`). These are local-only by convention and useful for debugging which finding drove which change. The `module-directory-promotion-pattern-2026-04-22.md` solution doc published to `docs/solutions/best-practices/` as part of this session (not this PR — lives in the shared solutions-docs repo) compounds the three-instances-of-the-same-refactor evidence (v0.1.2 H4 `runner.rs` split, v0.1.3 H5 `scorecard.rs` split, and this PR's `help_probe.rs` split) into a canonical recipe. The companion trilogy cross-links across two older Rust module-organization docs landed in the same solutions-docs push.
brettdavies
added a commit
that referenced
this pull request
Apr 23, 2026
## Summary Release branch for v0.1.3. Cherry-picked from `dev`: - **PR #26** — `audience` classifier shipped (`agent-optimized` / `mixed` / `human-primary`), `--audit-profile` flag (`human-tui`, `file-traversal`, `posix-utility`, `diagnostic-only`) with suppression wiring, `p1-env-hints` Pattern 2 for bash-style `$FOO` / `TOOL_FOO` references near flag definitions, plus `audience_reason` + `audit_profiles` matrix section. - **PR #27** — 26 ce-review findings resolved, `audience` snake_case → kebab-case unification, `$PAGER` / section-header false-positive fixes in Pattern 2, suppressed/errored `results[].label` now shows human labels, `coverage_summary.verified` correctly excludes `--audit-profile`-suppressed checks. Plus standard release mechanics: `Cargo.toml` bump to `0.1.3`, `Cargo.lock` refresh, regenerated `CHANGELOG.md` (git-cliff from squash-commit `## Changelog` bodies). No completions drift — PR #26 already regenerated them. ## Changelog <!-- Release PR itself has no user-facing content — the upstream PRs #26 and #27 carry the `## Changelog` sections that git-cliff extracted into CHANGELOG.md. Leaving this section empty per release-branch convention. --> ## Type of Change - [x] `feat`: New feature (non-breaking change which adds functionality) ## Related Issues/Stories - Story: v0.1.3 H5 — audience classifier + audit-profile suppression - Related PRs: #26 (feature), #27 (review resolutions) - Architecture: see `docs/plans/v0.1.3-h5-plan.md` on `dev` ## Testing - [x] Unit tests added/updated - [x] Integration tests added/updated - [x] Manual testing completed - [x] All tests passing **Test Summary:** - Pre-push hook green on release branch: fmt / clippy `-Dwarnings` / test / cargo-deny / Windows compat - CI green on PR head: `ci / Fmt, clippy, test`, `ci / Package check`, `ci / Security audit (advisories)`, `ci / Security audit (bans licenses sources)`, `ci / Windows check`, `ci / Changelog`, `guard-docs / check-forbidden-docs`, `guard-provenance / check-provenance`, `guard-release / check-release-branch-name` - Coverage matrix drift check passes against committed artifacts - Smoke-tested `anc check .` (dogfood) with and without each `--audit-profile` category ## Files Modified **Modified:** - `Cargo.toml`, `Cargo.lock` — version bump 0.1.2 → 0.1.3 - `CHANGELOG.md` — regenerated via `scripts/generate-changelog.sh` - See cherry-pick commits `4a18051` (PR #26) and `db11e97` (PR #27) for the full code-change surface (56 files, +2954 / −533). **Created:** - `src/scorecard/audience.rs` - `src/runner/help_probe/env_hints_bash.rs` **Deleted:** - None (files renamed via `git mv` as part of module reorganization — see cherry-pick bodies). ## Key Features - Populated `audience` scorecard field with kebab-case labels (`agent-optimized` / `mixed` / `human-primary`) when all four signal behavioral checks ran; `null` with `audience_reason` when any are missing. - `--audit-profile <category>` flag on `anc check` suppresses checks that don't apply to specific tool classes (human-TUI, file-traversal utilities, POSIX utilities, diagnostic-only tools). - `p1-env-hints` Pattern 2 recognizes bash-style `$FOO` / `TOOL_FOO` env references near flag definitions — tools like `ripgrep` and `aider` that document env bindings in free prose now Pass instead of Warn. - `audit_profiles[]` section in `coverage/matrix.json` — consumers can enumerate profiles programmatically instead of scraping `--help`. ## Benefits - Scorecards now carry enough metadata for agents to route tools by audience without running every check. - `--audit-profile` prevents false-negative "fails" for tools that legitimately don't satisfy a given principle (e.g., a file-traversal utility isn't expected to output structured JSON). - Pattern 2 removes a whole class of false positives on non-clap Rust CLIs and non-Rust CLIs that document env vars in help prose. ## Breaking Changes - [x] No breaking changes Scorecard schema stays at `1.1`. New fields (`audience_reason`, `audit_profile`, `audit_profiles[]` in matrix) are additive. `audience` JSON values move from snake_case to kebab-case — v0.1.2 consumers only ever saw `null` for this field, so consumers pinning on the v0.1.2 shape are unaffected; v0.1.3 consumers must match the kebab-case form. ## Deployment Notes - [ ] No special deployment steps required - [x] Deployment steps documented below: Post-merge: 1. Annotated tag + push — `git tag -a -m "Release v0.1.3" v0.1.3 && git push origin main --tags`. 2. Tag push triggers `release.yml` → `cargo publish` (Trusted Publishing) → GitHub Release (draftless, `make_latest: false` during bottle window) → Homebrew dispatch → `finalize-release` flips `make_latest: true`. 3. Follow-on on `agentnative-site`: install v0.1.3, regenerate scorecards (new `audience` kebab-case + `audit_profile` + `audience_reason` fields will populate), run `scripts/sync-coverage-matrix.sh` to pull the new `audit_profiles` section into `src/data/coverage-matrix.json`. ## Screenshots/Recordings N/A — CLI-only change. ## Checklist - [x] Code follows project conventions and style guidelines - [x] Commit messages follow [Conventional Commits](https://www.conventionalcommits.org/) - [x] Self-review of code completed - [x] Tests added/updated and passing - [x] No new warnings or errors introduced - [x] Changes are backward compatible (or breaking changes documented) - [x] Cherry-picks touched no guarded paths (`docs/plans/`, `docs/solutions/`, `docs/brainstorms/`, `docs/reviews/`) - [x] `Cargo.toml` version matches the tag this branch will produce ## Additional Context The two upstream feature PRs (#26 and #27) each carry their own `## Changelog` sections — those are what `scripts/generate-changelog.sh` extracts into `CHANGELOG.md`. This release PR's empty `## Changelog` is intentional: duplicating the upstream bullets here would double-count them in the next release cycle.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Resolves all 26 findings from the ce-review of commit
dc5d74168577and unifiesaudienceJSON values to kebab-case (apost-review decision that eliminates a mixed-casing asymmetry flagged during review).
Changelog
Added
audience_reasonfield on scorecard JSON — populated only whenaudienceisnull, with"suppressed"(signal check masked by--audit-profile) or"insufficient_signal"(signal check never produced) so consumers can see why the classifier withheld a label.audit_profilesarray incoverage/matrix.json— each entry carries{name, description, suppresses[]}, letting agents and site renderers enumerate the four--audit-profilecategories and what each one suppresses without scraping--help.Changed
results[].labelvalues now show the check's human-readable label (e.g., "Respects NO_COLOR") instead of falling back to the check id.Documentation
audience+audience_reason+audit_profilefield semantics, the--audit-profileflag with examples, and theaudit_profilessection ofcoverage/matrix.jsonas the programmatic source for category enumeration.Type of Change
feat: New feature (non-breaking change which adds functionality)fix: Bug fix (non-breaking change which fixes an issue)refactor: Code refactoring (no functional changes)test: Adding or updating testsdocs: Documentation updateRelated Issues/Stories
docs/plans/2026-04-21-v013-handoff-5-cli-audience-classifier.md— Implementation Log captures thepost-review kebab-case deviation
agentnative-site/docs/plans/2026-04-21-v013-handoff-6-site-leaderboard-launch.md—Unit 0.5 covers the kebab-case absorption that must land before the v0.1.3 scorecard regen wave
docs/solutions/best-practices/module-directory-promotion-pattern-2026-04-22.md— canonicalgit mvrecipe forfoo.rs→foo/mod.rspromotions (documented off the back of this PR'shelp_probeextraction)docs/solutions/best-practices/rust-module-splitting-srp-not-loc-20260327.md,docs/solutions/best-practices/rust-pub-crate-fields-for-cross-module-impl-pattern-2026-04-20.md.context/compound-engineering/ce-review/20260422-134229-565fc6d7/(10 reviewer JSONs +metadata.json; local-only by convention)Testing
Test Summary:
scorecard::audience,scorecard::tests,principles::registry,principles::matrix,cli::tests, andrunner::help_probe::env_hints_bash::tests)--principleforcesaudience: null, full-shape JSON snapshot lock)cargo fmt --check: cleancargo clippy --all-targets -- -Dwarnings: cleancargo deny check: advisories, bans, licenses, sources all OKcargo run -- generate coverage-matrix --check: no driftscripts/hooks/pre-push(full local-CI mirror): all checks passed7a384b8): all 5 required checks greenanc check . --output json→schema_version: "1.1",audience: "agent-optimized",audit_profile: null,audience_reasonabsent (omitted viaskip_serializing_if).anc check . --audit-profile human-tui→audience: null,audience_reason: "suppressed",coverage_summary.must.verifieddrops 17 → 15 (suppressed P1 checks no longer inflate the metric).Files Modified
Modified:
AGENTS.md,CLAUDE.md,README.md— v0.1.3 scorecard surface synccoverage/matrix.json— regenerated (adds top-levelaudit_profilesarray)src/check.rs— new abstractlabel(&self) -> &'static strtrait methodsrc/checks/**/*.rs(40 files) — mechanicallabel()impl +run()body migration fromlabel: "…".into()literalsto
label: self.label().into()src/cli.rs—AuditProfile↔ExceptionCategoryparity drift testssrc/main.rs—SUPPRESSION_EVIDENCE_PREFIXusage +check.label()in error and--audit-profilesuppressionbranches
src/principles/registry.rs—SUPPRESSION_EVIDENCE_PREFIXconstant,ALL_EXCEPTION_CATEGORIESslice + compile-timevariant guard,
ExceptionCategory::description(), doc additions onSUPPRESSION_TABLE(trust boundary, drift-testscope)
src/principles/matrix.rs—AuditProfileEntrystruct,audit_profilessection + per-variant coverage test,matrix_json_includes_audit_profiles_sectionsrc/scorecard/audience.rs— kebab-caseclassify(), newclassify_reason()helper,is_audit_profile_suppressionpromoted to
pub(crate),debug_assert!against duplicate signal IDs, module doc + suppression-prefix referencessrc/scorecard/mod.rs—audience_reason: Option<String>field (derived inbuild_scorecard,skip_serializing_ifwhen
audiencehas a label),build_coverage_summaryfilter, Scorecard struct doc,exit_codetrust-boundary doc,snapshot + casing + duplicate-signal regression tests
tests/integration.rs— convention test blockingCheckResult { … }outsidefn run()insrc/checks/**, non-nullaudience dogfood,
--principle+ audience interaction, full-shape JSON snapshot, tighteneddiagnostic-onlyassertion, Pattern 2 bracketed-rejection and mixed-case negative fixtures
Created:
src/runner/help_probe/env_hints_bash.rs— Pattern 2 submodule extracted from the 931-linehelp_probe.rs; holdsparse_env_hints_bash_style(),extract_env_tokens(),find_env_section(),is_section_header_line(),strip_clap_env_annotations(),SHELL_ENV_BLACKLIST,PATTERN2_WINDOW, and 9 Pattern 2 tests (including 3 new forPAGER, section-header exclusion, and mixed-case / placeholder negatives).
Renamed:
src/runner/help_probe.rs→src/runner/help_probe/mod.rs(viagit mv; git detects rename at 58% similarity, blamepreserved).
Key Features
SUPPRESSION_EVIDENCE_PREFIX): producer (main.rs),consumer (
audience::is_audit_profile_suppression), and filter (build_coverage_summary) all reference the sameconstant so a rename can't silently desync the three sites.
Check::label()on the trait means every emittedCheckResult— including the runtime layer's error and suppressionbranches — now uses a single source of truth for each check's human label. No more "scorecard row shows the cryptic
id" when a check errors out or gets masked by a profile.
EnvHintSourceenum (clap_annotation/proximity/env_section) tags every emittedEnvHintwith the patternthat matched, giving agents a structured signal when debugging a p1-env-hints false positive. Pattern 1 wins on
duplicates; proximity loses to env_section when both apply.
_all_categories_covers_every_variant) + runtime drift tests(
audit_profile_and_exception_category_variants_are_isomorphic,suppression_table_check_ids_exist_in_catalog)triangulate the
AuditProfile(CLI) ↔ExceptionCategory(registry) ↔SUPPRESSION_TABLE(registry) relationship soa fifth category can't land via partial edits.
Benefits
--audit-profileoverstatedcoverage_summary.verified— the site's H6 leaderboard would have rendered inflated coverage numbers that didn'tmatch the results-level evidence. Shipped the fix before the site's regen wave.
audienceto kebab-case closes the only document-level casing asymmetry in thescorecard shape, so consumers no longer need two-rule case logic for one document.
audit_profilesmatrix section and the expanded AGENTS.md "Agent-facing JSONsurface" section turn previously help-text-scraped knowledge into first-class, committed, machine-readable artifacts.
might unknowingly break: exit-code-under-suppression, JSON casing contract,
CheckResultconstruction scope,bracketed-env-var rejection, AuditProfile↔ExceptionCategory parity, signal-ID uniqueness, full-shape JSON snapshot,
--principle+ audience interaction.Breaking Changes
audience: nullandaudit_profile: null).audiencevalues. The only such consumeris in-house (agentnative-site's H6 leaderboard), which hasn't shipped yet. Unit 0.5 of the site's H6 plan covers the
adaptation and must land before the v0.1.3 scorecard regen wave.
Deployment Notes
dev.Release-branch mechanics per
RELEASES.mdhappen in a follow-up PR:branch off
origin/main, cherry-pick these commits (plus6fdca00anddc5d741from dev), bumpCargo.toml0.1.2→0.1.3, regenerate changelog viagenerate-changelog.sh, PR →main. The Homebrew-tap finalize-release dispatch willland on the wrong repo per todo
006inbrettdavies/.github— the manual recovery command is pre-staged in the v0.1.3plan's operational notes section.
Checklist
dc5d741; this PR resolves all findings)-Dwarningsclean)Additional Context
The ce-review process produced 10 per-reviewer JSON artifacts under
.context/compound-engineering/ce-review/20260422-134229-565fc6d7/capturing every finding's full detail-tier fields(
why_it_matters+evidence[]). These are local-only by convention and useful for debugging which finding drove whichchange.
The
module-directory-promotion-pattern-2026-04-22.mdsolution doc published todocs/solutions/best-practices/aspart of this session (not this PR — lives in the shared solutions-docs repo) compounds the
three-instances-of-the-same-refactor evidence (v0.1.2 H4
runner.rssplit, v0.1.3 H5scorecard.rssplit, and thisPR's
help_probe.rssplit) into a canonical recipe. The companion trilogy cross-links across two older Rustmodule-organization docs landed in the same solutions-docs push.