feat(v0.1.2): behavioral check expansion + HelpOutput cache #24
Merged
brettdavies merged 6 commits into dev on Apr 21, 2026
Raises required_approving_review_count from 0 to 1 on the main branch ruleset so PRs targeting main require at least one approval before merge. Complements required_status_checks + required_linear_history already in place.
Introduce `Confidence { High, Medium, Low }` enum so behavioral checks
can signal how much they trust their own verdict. Direct probes keep
the default `High`; the two new heuristic P1/P6 behavioral checks in
this release report `Medium`.
The additive `confidence` field in each scorecard result does not bump
`schema_version` — v1.1 consumers feature-detect and tolerate missing
keys. Existing checks (37 construction sites, plus test helpers) now
pass `Confidence::High` explicitly; no semantic change.
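
As a rough illustration of the commit above, the enum and its additive field could look like the sketch below. The trimmed-down `CheckResult`, the `as_str` helper, and the `confidence_label` consumer function with its `"unspecified"` fallback are hypothetical; only the `Confidence { High, Medium, Low }` shape, the `High` default, and the lowercase labels come from this PR.

```rust
/// How much a check trusts its own verdict. `High` is the default,
/// matching the PR's statement that direct probes keep the default.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum Confidence {
    #[default]
    High,
    Medium,
    Low,
}

impl Confidence {
    /// Lowercase label as it might appear in a scorecard (assumed helper).
    pub fn as_str(self) -> &'static str {
        match self {
            Confidence::High => "high",
            Confidence::Medium => "medium",
            Confidence::Low => "low",
        }
    }
}

/// Hypothetical, trimmed-down result type; the real `CheckResult` has more fields.
#[derive(Debug)]
pub struct CheckResult {
    pub id: &'static str,
    pub confidence: Confidence,
}

/// Consumer-side feature detection (assumed behavior): a scorecard written
/// before this change has no `confidence` key, which is tolerated, not an error.
pub fn confidence_label(raw: Option<&str>) -> &str {
    raw.unwrap_or("unspecified")
}
```

Because the field is additive, a consumer that ignores `confidence` keeps working, which is presumably why `schema_version` does not need a bump.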
New src/runner/help_probe.rs exposes HelpOutput, a once-per-target probe of <binary> --help with lazy, cached parse views: flags(), env_hints(), subcommands(). Checks that consume the help surface share the same HelpOutput so the binary is never re-spawned within a single run.

Parsers are English-only on purpose. Localized help is documented as a named exception in docs/coverage-matrix.md; consumers skip (not warn) when the English surface is absent.

Unit tests cover the ripgrep/clap/bare/non-English fixtures plus caching idempotence.

src/runner.rs is promoted to src/runner/mod.rs so the new submodule can live alongside BinaryRunner without touching unrelated code.
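
The lazy, cached parse views described above might be structured roughly as follows. This is a minimal sketch, not the contents of `help_probe.rs`: the field layout, the whitespace tokenizer in `flags()`, and the handling of an optional `=value` suffix inside `[env: ...]` are all assumptions, and `subcommands()` is omitted.

```rust
use std::sync::OnceLock;

/// Sketch of a help-surface holder: raw text captured once, views parsed lazily.
pub struct HelpOutput {
    raw: String,
    flags: OnceLock<Vec<String>>,
    env_hints: OnceLock<Vec<String>>,
}

impl HelpOutput {
    pub fn new(raw: impl Into<String>) -> Self {
        Self { raw: raw.into(), flags: OnceLock::new(), env_hints: OnceLock::new() }
    }

    /// Tokens that look like flags (`-y`, `--yes`); parsed on first call, then cached.
    pub fn flags(&self) -> &[String] {
        self.flags.get_or_init(|| {
            self.raw
                .split_whitespace()
                // Drop trailing punctuation such as the comma in "-y, --yes".
                .map(|tok| tok.trim_end_matches(|c: char| c == ',' || c == '.'))
                .filter(|tok| {
                    tok.starts_with('-') && tok.chars().any(|c| c.is_ascii_alphanumeric())
                })
                // Keep only the flag name from forms like "--color=<WHEN>".
                .map(|tok| tok.split('=').next().unwrap_or(tok).to_string())
                .collect()
        })
    }

    /// Variable names from `[env: FOO]` annotations (an `=value` suffix is ignored).
    pub fn env_hints(&self) -> &[String] {
        self.env_hints.get_or_init(|| {
            let mut out = Vec::new();
            let mut rest = self.raw.as_str();
            while let Some(start) = rest.find("[env: ") {
                let after = &rest[start + "[env: ".len()..];
                let Some(end) = after.find(']') else { break };
                let name = after[..end].split('=').next().unwrap_or("").trim();
                if !name.is_empty() {
                    out.push(name.to_string());
                }
                rest = &after[end + 1..];
            }
            out
        })
    }
}
```

Each view is computed at most once per `HelpOutput`, so repeated calls from different checks return the same cached slice.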
Checks that need to inspect the --help surface now call `project.help_output()`; the OnceLock initializer probes `<binary> --help` exactly once and all subsequent calls return the same `HelpOutput`. No churn to the `Check` trait signature — existing checks continue to compile unchanged.
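
The once-per-run guarantee can be sketched with `std::sync::OnceLock` as below. The `probe` method here returns canned text and the `probes` counter exists only to make the single invocation observable; both are stand-ins, since the real initializer spawns `<binary> --help`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Stand-in for the parsed help surface.
pub struct HelpOutput {
    pub raw: String,
}

pub struct Project {
    help: OnceLock<HelpOutput>,
    /// Counts probe invocations; illustration only, not part of the PR.
    probes: AtomicUsize,
}

impl Project {
    pub fn new() -> Self {
        Self { help: OnceLock::new(), probes: AtomicUsize::new(0) }
    }

    /// Hypothetical probe; the real one would run `<binary> --help`.
    fn probe(&self) -> HelpOutput {
        self.probes.fetch_add(1, Ordering::Relaxed);
        HelpOutput { raw: "Usage: demo [OPTIONS]".to_string() }
    }

    /// The first call runs the probe; later calls return the cached value.
    pub fn help_output(&self) -> &HelpOutput {
        self.help.get_or_init(|| self.probe())
    }

    pub fn probe_count(&self) -> usize {
        self.probes.load(Ordering::Relaxed)
    }
}
```

Because the accessor lives on `Project` and takes `&self`, checks can reach the cache through the project reference they already receive, which fits the claim that the `Check` trait signature is untouched.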
…oral

Three new behavioral checks land as a set, all consuming the shared HelpOutput cache. Registry linkage is via covers(); no new requirement IDs. Each check gets the canonical unit test shape: happy path, skip-applicability, warn-missing, non-English fixture.

- p1-flag-existence (confidence: high, covers p1-must-no-interactive): second behavioral proof alongside p1-non-interactive. Passes when the help surface advertises any of the canonical non-interactive flags (--no-interactive, --batch, --headless, -y, --yes, -p, --print, --no-input, --assume-yes). Skips when the target already satisfies P1 via help-on-bare-invocation or stdin-clean-exit.
- p1-env-hints (confidence: medium, covers p1-must-env-var): behavioral layer for env-var coverage previously source-only via p1-env-flags-source. Reads clap-style `[env: FOO]` annotations from the parsed help surface.
- p6-no-pager-behavioral (confidence: medium, covers p6-must-no-pager): behavioral layer for pager coverage previously source-only via p6-no-pager. Passes when `--no-pager` is advertised. Skips when no pager signal present. Warns when pager is referenced but the escape hatch is missing.

Coverage matrix regeneration lands in the next commit.
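
The pass/skip/warn rules for two of these checks can be sketched as plain functions over an already-parsed flag list. The `Verdict` enum, the function names and signatures, and the whole-token matching of pager signals are assumptions made for illustration. The flag list comes from the commit message and the pager signal list from the PR description; the warn-on-missing outcome for `p1-flag-existence` is inferred from the "warn-missing" test shape.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Skip,
    Warn,
}

/// Canonical non-interactive flags listed in the commit message.
const NON_INTERACTIVE_FLAGS: [&str; 9] = [
    "--no-interactive", "--batch", "--headless", "-y", "--yes",
    "-p", "--print", "--no-input", "--assume-yes",
];

/// Pager signals listed in the PR description.
const PAGER_SIGNALS: [&str; 4] = ["less", "more", "$PAGER", "--pager"];

/// Sketch of p1-flag-existence: skip when P1 is already satisfied another way,
/// pass when any canonical flag is advertised, warn otherwise.
pub fn flag_existence(flags: &[&str], already_satisfied: bool) -> Verdict {
    if already_satisfied {
        return Verdict::Skip;
    }
    if flags.iter().any(|f| NON_INTERACTIVE_FLAGS.contains(f)) {
        Verdict::Pass
    } else {
        Verdict::Warn
    }
}

/// Sketch of p6-no-pager-behavioral. Whole-token matching is naive: a help
/// text containing the ordinary word "more" would count as a pager signal.
pub fn no_pager(help: &str, flags: &[&str]) -> Verdict {
    if flags.contains(&"--no-pager") {
        return Verdict::Pass;
    }
    let mentions_pager = help
        .split_whitespace()
        .any(|tok| PAGER_SIGNALS.contains(&tok));
    if mentions_pager { Verdict::Warn } else { Verdict::Skip }
}
```

The naive signal matching is one plausible reason such a check would report medium rather than high confidence.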
Regenerated docs/coverage-matrix.md and coverage/matrix.json to pick up covers() linkage for the three new behavioral checks. Three requirements move from single-layer to dual-layer coverage:

- p1-must-no-interactive +p1-flag-existence (behavioral)
- p1-must-env-var +p1-env-hints (behavioral)
- p6-must-no-pager +p6-no-pager-behavioral (behavioral)

MatrixSummary gains a `dual_layer` count so the site /coverage page and the committed markdown surface the signal directly. JSON shape addition is additive; schema_version stays at 1.0 and consumers feature-detect missing fields.

Also silences the dead_code warnings for HelpOutput::subcommands() (reserved for future P3/P6 checks) and tightens Flag::matches() to drop the redundant branch that clippy flagged.
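
One way to derive a `dual_layer` count is to group verifiers by requirement and count the requirements whose verifiers span two distinct layers. The sketch below assumes a two-variant `Layer` enum and a flat list of `(requirement, layer)` pairs; neither is taken from `src/principles/matrix.rs`.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical layer tag for a verifier.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Layer {
    Behavioral,
    Source,
}

/// Counts requirements that have verifiers in at least two distinct layers.
/// Two verifiers in the same layer do not make a requirement dual-layer.
pub fn dual_layer_count(verifiers: &[(&str, Layer)]) -> usize {
    let mut layers: BTreeMap<&str, BTreeSet<Layer>> = BTreeMap::new();
    for (requirement, layer) in verifiers {
        layers.entry(*requirement).or_default().insert(*layer);
    }
    layers.values().filter(|set| set.len() >= 2).count()
}
```

Under this model, adding a behavioral verifier to a requirement that previously had only a source verifier raises the count by one, which is the kind of movement the commit describes.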
brettdavies added a commit that referenced this pull request on Apr 21, 2026
## Summary

Release branch for v0.1.2. Cherry-picked from `dev`:

- **PR #24** (v0.1.2 feature): 3 new behavioral checks (`p1-flag-existence`, `p1-env-hints`, `p6-no-pager-behavioral`) + shared `HelpOutput` cache + `Confidence` field on `CheckResult` + `dual_layer` stat in coverage matrix + `main` ruleset review count 0 → 1.
- **PR #23**: post-v0.1.1 README + AGENTS sync that never made it into the v0.1.1 release branch.

Plus the standard release-branch mechanics: `Cargo.toml` bump to `0.1.2`, `Cargo.lock` refresh, regenerated `CHANGELOG.md` (git-cliff from squash-commit `## Changelog` bodies). Completions had no drift this cycle; the CLI surface is unchanged.

## Type of Change

- [x] `feat`: new user-visible capabilities (three behavioral checks + `confidence` field + dual-layer coverage stat)

## Testing

- [x] All tests passing on dev head before cherry-pick (41 / 41)
- [x] Pre-push hook green on release branch: fmt / clippy `-Dwarnings` / test / cargo-deny / Windows compat
- [x] Coverage matrix drift check passes against the committed artifacts
- [x] Smoke-tested against `rg` / `bat` / `bird` / `xr` / `anc` itself

## Files Modified

See the two cherry-pick bodies (commits `72ca148` and `2fcb08d`) plus `CHANGELOG.md` + `Cargo.toml` + `Cargo.lock` for the release mechanics.

## Deployment Notes

Post-merge:

1. Annotated tag + push: `git tag -a -m "Release v0.1.2" v0.1.2 && git push origin main --tags`.
2. Tag push triggers `release.yml` → `cargo publish` (Trusted Publishing) → GitHub Release (draftless, `make_latest: false` during bottle window) → Homebrew dispatch → `finalize-release` flips `make_latest: true`.
3. Follow-on on `agentnative-site`: install v0.1.2, regenerate the 10 committed scorecards (H3's proven workflow), run `scripts/sync-coverage-matrix.sh` to pull `dual_layer: 7` into `src/data/coverage-matrix.json`.

## Breaking Changes

- [x] No breaking changes. Scorecard schema stays at `1.1`; `confidence` on results and `dual_layer` on matrix summary are additive.

## Checklist

- [x] CI-equivalent gates green locally
- [x] Cherry-picks touched no guarded paths (`docs/plans/`, `docs/solutions/`, `docs/brainstorms/`, `docs/reviews/`)
- [x] `Cargo.toml` version matches the tag this branch will produce
- [x] CHANGELOG entries reflect user-facing changes only
## Summary

Ships the three behavioral checks and shared `HelpOutput` cache promised for v0.1.2 H4:

- `p1-flag-existence`: second behavioral proof of `p1-must-no-interactive` alongside `p1-non-interactive`
- `p1-env-hints`: behavioral layer over source-only `p1-env-flags-source`
- `p6-no-pager-behavioral`: behavioral layer over source-only `p6-no-pager`

All three consume a single `HelpOutput` per target, so the binary's `--help` is spawned once per run regardless of how many checks inspect it. Each check declares `covers()` against an existing registry ID; no new requirement IDs land in this PR, only new verifiers. Three requirements move from single-layer to dual-layer coverage (dual-layer count: 4 → 7).

Also carries an unrelated chore: raising `required_approving_review_count` from 0 → 1 on the `main` branch ruleset, committed as the first commit of the branch.

## Changelog

### Added

- `p1-flag-existence` behavioral check: passes when `--help` advertises a non-interactive gate flag (`--no-interactive`, `--batch`, `--headless`, `-y`, `--yes`, `-p`, `--print`, `--no-input`, `--assume-yes`). Skips when the target already satisfies P1 via help-on-bare-invocation or stdin-primary.
- `p1-env-hints` behavioral check: passes when `--help` exposes clap-style `[env: FOO]` bindings for flags. Emits medium confidence; the heuristic covers the canonical but not the only env-binding format.
- `p6-no-pager-behavioral` behavioral check: passes when `--no-pager` is advertised in `--help`. Skips when no pager signal (`less` / `more` / `$PAGER` / `--pager`) appears. Emits medium confidence.
- `confidence` field to every scorecard result (`high` / `medium` / `low`). Additive; v1.1 consumers feature-detect.
- `dual_layer` count to the coverage matrix summary so the headline prose surfaces how many covered requirements have verifiers in two layers.

### Changed

- Required approving review count on the `main` branch from 0 to 1.

### Documentation

- Regenerated `docs/coverage-matrix.md` + `coverage/matrix.json` to pick up the three new behavioral verifiers.

## Type of Change

## Testing

Test Summary:

- Smoke-tested against `rg`, `bat`, `bird`, `xr` via `anc check --command`; all new checks produce sensible verdicts.
- `anc check .` on the agentnative repo produces Pass or Skip (no Warn) for all three new checks.

## Files Modified

Created:

- `src/runner/help_probe.rs`: `HelpOutput` cache with lazy `flags()`, `env_hints()`, `subcommands()` parsers.
- `src/checks/behavioral/flag_existence.rs`
- `src/checks/behavioral/env_hints.rs`
- `src/checks/behavioral/no_pager_behavioral.rs`

Modified:

- `src/runner.rs` → `src/runner/mod.rs` (promoted to a module directory so `help_probe` can live alongside `BinaryRunner`).
- `src/project.rs`: lazy `help_output()` accessor on `Project` (`OnceLock`, same pattern as `parsed_files`).
- `src/types.rs`: `Confidence` enum + `CheckResult::confidence` field.
- `src/scorecard.rs`: serialize `confidence` on every result view.
- `src/principles/matrix.rs`: `dual_layer` stat + surfaced in summary prose.
- `src/checks/behavioral/mod.rs`: register the three new checks.
- `docs/coverage-matrix.md` + `coverage/matrix.json`: regenerated.
- `.github/rulesets/protect-main.json`: raise required approving review count on `main`.
- Existing checks gain `confidence: Confidence::High,` in their `CheckResult` construction; mechanical, no semantic change.

## Key Features

- One `--help` probe per target regardless of how many checks consume it. Runner caching means `p1-flag-existence` and `p6-no-pager-behavioral` share state without explicit plumbing.

## Breaking Changes

## Deployment Notes

- Regenerate the committed scorecards on `agentnative-site` (H3's proven workflow) once v0.1.2 is released to crates.io + Homebrew. Matrix sync (`scripts/sync-coverage-matrix.sh`) picks up the new `dual_layer` field automatically.

## Checklist