feat(v0.1.2): behavioral check expansion + HelpOutput cache#24

Merged
brettdavies merged 6 commits into dev from feat/v012-behavioral-check-expansion
Apr 21, 2026

@brettdavies
Owner

Summary

Ships the three behavioral checks and shared HelpOutput cache promised for v0.1.2 H4:

  • p1-flag-existence — second behavioral proof of p1-must-no-interactive alongside p1-non-interactive
  • p1-env-hints — behavioral layer over source-only p1-env-flags-source
  • p6-no-pager-behavioral — behavioral layer over source-only p6-no-pager

All three consume a single HelpOutput per target, so the binary's --help is spawned once per run regardless of how many checks inspect it. Each check declares covers() against an existing registry ID — no new requirement IDs land in this PR, only new verifiers. Three requirements move from single-layer to dual-layer coverage (dual-layer count: 4 → 7).

Also carries an unrelated chore: raising required_approving_review_count from 0 to 1 on the main branch ruleset, committed as the first commit of the branch.

Changelog

Added

  • Add p1-flag-existence behavioral check — passes when --help advertises a non-interactive gate flag (--no-interactive, --batch, --headless, -y, --yes, -p, --print, --no-input, --assume-yes). Skips when the target already satisfies P1 via help-on-bare-invocation or stdin-primary.
  • Add p1-env-hints behavioral check — passes when --help exposes clap-style [env: FOO] bindings for flags. Emits medium confidence; the heuristic covers the canonical but not the only env-binding format.
  • Add p6-no-pager-behavioral behavioral check — passes when --no-pager is advertised in --help. Skips when no pager signal (less / more / $PAGER / --pager) appears. Emits medium confidence.
  • Add confidence field to every scorecard result (high / medium / low). Additive; v1.1 consumers feature-detect.
  • Add dual_layer count to the coverage matrix summary so the headline prose surfaces how many covered requirements have verifiers in two layers.
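
The `[env: FOO]` detection described for p1-env-hints can be sketched roughly as follows. This is a minimal illustration, not the shipped parser: it assumes clap renders bindings as `[env: NAME]`, `[env: NAME=]`, or `[env: NAME=value]`, and the function body is hypothetical even though the name `env_hints` echoes the parser named in this PR.

```rust
/// Collects variable names from clap-style `[env: NAME]` / `[env: NAME=value]`
/// annotations in `--help` text. Hypothetical sketch, not the shipped parser.
pub fn env_hints(help: &str) -> Vec<String> {
    const MARKER: &str = "[env: ";
    let mut names = Vec::new();
    let mut rest = help;
    while let Some(start) = rest.find(MARKER) {
        let after = &rest[start + MARKER.len()..];
        // An unterminated annotation ends the scan.
        let Some(end) = after.find(']') else { break };
        // Keep only the name; anything after `=` is the displayed value.
        let name = after[..end].split('=').next().unwrap_or("").trim();
        if !name.is_empty() {
            names.push(name.to_string());
        }
        rest = &after[end + 1..];
    }
    names
}
```

A check built on this would pass when the returned list is non-empty. Other env-binding formats are not recognized, which is consistent with the medium confidence noted above.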

Changed

  • Raise required approving review count on main branch from 0 to 1.

Documentation

  • Regenerate docs/coverage-matrix.md + coverage/matrix.json to pick up the three new behavioral verifiers.

Type of Change

  • `feat`: New feature (non-breaking change which adds functionality)

Testing

  • Unit tests added/updated
  • Integration tests added/updated
  • Manual testing completed
  • All tests passing

Test Summary:

  • Unit + integration tests: 41 passing (up from 35; the 6 added tests cover the new checks and the HelpOutput parsers).
  • Pre-push hook (mirrors CI): fmt, clippy -Dwarnings, test, cargo-deny, Windows compat — all green.
  • Smoke-tested against rg, bat, bird, xr via anc check --command; all new checks produce sensible verdicts.
  • Dogfood: anc check . on the agentnative repo produces Pass or Skip (no Warn) for all three new checks.

Files Modified

Created:

  • src/runner/help_probe.rs: HelpOutput cache with lazy flags(), env_hints(), subcommands() parsers.
  • src/checks/behavioral/flag_existence.rs
  • src/checks/behavioral/env_hints.rs
  • src/checks/behavioral/no_pager_behavioral.rs

Modified:

  • src/runner.rs → src/runner/mod.rs (promoted to a module directory so help_probe can live alongside BinaryRunner).
  • src/project.rs — lazy help_output() accessor on Project (OnceLock, same pattern as parsed_files).
  • src/types.rs: Confidence enum + CheckResult::confidence field.
  • src/scorecard.rs — serialize confidence on every result view.
  • src/principles/matrix.rs: dual_layer stat, surfaced in summary prose.
  • src/checks/behavioral/mod.rs — register the three new checks.
  • docs/coverage-matrix.md + coverage/matrix.json — regenerated.
  • .github/rulesets/protect-main.json — raise required approving review count on main.
  • 34 existing check files gain confidence: Confidence::High, in their CheckResult construction — mechanical, no semantic change.

Key Features

  • One --help probe per target regardless of how many checks consume it. Runner caching means p1-flag-existence and p6-no-pager-behavioral share state without explicit plumbing.
  • Dual-layer coverage growth: three requirements now have both behavioral and source/project verifiers, halving the single-layer bucket for P1 and P6 MUSTs.

Breaking Changes

  • No breaking changes

Deployment Notes

  • Post-merge, regenerate the 10 committed scorecards on agentnative-site (H3's proven workflow) once v0.1.2 is released to crates.io + Homebrew. Matrix sync (scripts/sync-coverage-matrix.sh) picks up the new dual_layer field automatically.

Checklist

  • Code follows project conventions and style guidelines
  • Commit messages follow Conventional Commits
  • Self-review of code completed
  • Tests added/updated and passing
  • No new warnings or errors introduced
  • Changes are backward compatible

Raises required_approving_review_count from 0 to 1 on the main branch
ruleset so PRs targeting main require at least one approval before
merge. Complements required_status_checks + required_linear_history
already in place.

Introduce `Confidence { High, Medium, Low }` enum so behavioral checks
can signal how much they trust their own verdict. Direct probes keep
the default `High`; the two new heuristic P1/P6 behavioral checks in
this release report `Medium`.
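
A minimal sketch of what such an enum could look like is shown below. The variant names come from this commit message; the `Default` derive, the `as_str` helper, and the trimmed-down `CheckResult` are assumptions for illustration and may not match the real definitions in src/types.rs.

```rust
/// How much a check trusts its own verdict (variant names from the commit message).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum Confidence {
    /// Assumed default for direct probes.
    #[default]
    High,
    Medium,
    Low,
}

impl Confidence {
    /// Lowercase label, matching the `high` / `medium` / `low` values
    /// the changelog lists for the scorecard field.
    pub fn as_str(self) -> &'static str {
        match self {
            Confidence::High => "high",
            Confidence::Medium => "medium",
            Confidence::Low => "low",
        }
    }
}

/// Hypothetical, trimmed-down result type; the real `CheckResult` has more fields.
#[derive(Debug)]
pub struct CheckResult {
    pub id: &'static str,
    pub confidence: Confidence,
}
```

Under these assumptions a heuristic check would build its result with `confidence: Confidence::Medium`, while existing checks spell out `Confidence::High`.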

The additive `confidence` field in each scorecard result does not bump
`schema_version` — v1.1 consumers feature-detect and tolerate missing
keys. Existing checks (37 construction sites, plus test helpers) now
pass `Confidence::High` explicitly; no semantic change.

New src/runner/help_probe.rs exposes HelpOutput, a once-per-target
probe of <binary> --help with lazy, cached parse views:
flags(), env_hints(), subcommands(). Checks that consume the help
surface share the same HelpOutput so the binary is never re-spawned
within a single run.
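
The lazy, cached parse view can be illustrated with a sketch like the one below. It is an assumption-laden stand-in for the real `HelpOutput`: only a `flags()` view is modeled, the tokenizer is deliberately crude, and `advertises_any` and `GATE_FLAGS` are hypothetical names (the flag spellings themselves are the ones listed in the PR description for p1-flag-existence).

```rust
use std::collections::BTreeSet;
use std::sync::OnceLock;

/// Hypothetical sketch of a help surface with one lazily parsed, cached view.
pub struct HelpOutput {
    raw: String,
    flags: OnceLock<BTreeSet<String>>,
}

impl HelpOutput {
    pub fn new(raw: impl Into<String>) -> Self {
        Self { raw: raw.into(), flags: OnceLock::new() }
    }

    /// Parses flag tokens (`-y`, `--yes`, ...) on first use and caches the set.
    pub fn flags(&self) -> &BTreeSet<String> {
        self.flags.get_or_init(|| {
            self.raw
                .split(|c: char| {
                    c.is_whitespace() || matches!(c, ',' | '=' | '[' | ']' | '<' | '>' | '|')
                })
                .filter(|tok| is_flag(tok))
                .map(str::to_string)
                .collect()
        })
    }

    /// True when any of the given spellings is advertised.
    pub fn advertises_any(&self, candidates: &[&str]) -> bool {
        candidates.iter().any(|c| self.flags().contains(*c))
    }
}

/// A token counts as a flag if it looks like `-x` or `--long-name`.
fn is_flag(tok: &str) -> bool {
    match tok.strip_prefix("--") {
        Some(long) => {
            long.chars().next().is_some_and(|c| c.is_ascii_alphanumeric())
                && long.chars().all(|c| c.is_ascii_alphanumeric() || c == '-')
        }
        None => {
            let mut chars = tok.chars();
            chars.next() == Some('-')
                && chars.next().is_some_and(|c| c.is_ascii_alphanumeric())
                && chars.next().is_none()
        }
    }
}

/// Gate flags listed in the PR description for p1-flag-existence.
pub const GATE_FLAGS: &[&str] = &[
    "--no-interactive", "--batch", "--headless", "-y", "--yes",
    "-p", "--print", "--no-input", "--assume-yes",
];
```

Repeated calls to `flags()` return a reference to the same set, so several checks can query it without re-parsing the text.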

Parsers are English-only on purpose. Localized help is documented
as a named exception in docs/coverage-matrix.md; consumers skip
(not warn) when the English surface is absent. Unit tests cover the
ripgrep/clap/bare/non-English fixtures plus caching idempotence.

src/runner.rs is promoted to src/runner/mod.rs so the new submodule
can live alongside BinaryRunner without touching unrelated code.

Checks that need to inspect the --help surface now call
`project.help_output()`; the OnceLock initializer probes
`<binary> --help` exactly once and all subsequent calls return
the same `HelpOutput`. No churn to the `Check` trait signature —
existing checks continue to compile unchanged.
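
The once-per-target accessor described here could be sketched as follows. `Project` and `HelpOutput` below are hypothetical reductions of the real types, and treating a failed spawn as empty help text is an assumption made only to keep the example short.

```rust
use std::process::Command;
use std::sync::OnceLock;

/// Hypothetical stand-in for the PR's `HelpOutput`; only the raw text is modeled.
#[derive(Debug)]
pub struct HelpOutput {
    pub raw: String,
}

/// Hypothetical stand-in for `Project`, reduced to the fields the cache needs.
pub struct Project {
    binary: String,
    help: OnceLock<HelpOutput>,
}

impl Project {
    pub fn new(binary: impl Into<String>) -> Self {
        Self { binary: binary.into(), help: OnceLock::new() }
    }

    /// Spawns `<binary> --help` on the first call; later calls reuse the cached value.
    pub fn help_output(&self) -> &HelpOutput {
        self.help.get_or_init(|| {
            let raw = Command::new(&self.binary)
                .arg("--help")
                .output()
                .map(|out| String::from_utf8_lossy(&out.stdout).into_owned())
                .unwrap_or_default(); // assumption: a failed spawn yields empty help text
            HelpOutput { raw }
        })
    }
}
```

Because the accessor takes `&self`, checks that already hold a shared reference to the project can call it without any change to their signatures.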
…oral

Three new behavioral checks land as a set, all consuming the shared
HelpOutput cache. Registry linkage is via covers() — no new
requirement IDs. Each check gets the canonical unit test shape:
happy path, skip-applicability, warn-missing, non-English fixture.

p1-flag-existence (confidence: high, covers p1-must-no-interactive) —
  second behavioral proof alongside p1-non-interactive. Passes when
  the help surface advertises any of the canonical non-interactive
  flags (--no-interactive, --batch, --headless, -y, --yes, -p,
  --print, --no-input, --assume-yes). Skips when the target already
  satisfies P1 via help-on-bare-invocation or stdin-clean-exit.

p1-env-hints (confidence: medium, covers p1-must-env-var) —
  behavioral layer for env-var coverage previously source-only via
  p1-env-flags-source. Reads clap-style `[env: FOO]` annotations
  from the parsed help surface.

p6-no-pager-behavioral (confidence: medium, covers p6-must-no-pager) —
  behavioral layer for pager coverage previously source-only via
  p6-no-pager. Passes when `--no-pager` is advertised. Skips when
  no pager signal is present. Warns when a pager is referenced but the
  escape hatch is missing.
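
The Pass / Skip / Warn split for p6-no-pager-behavioral can be sketched as below. This is a hypothetical reduction operating on raw help text: the `Verdict` enum and `no_pager_verdict` function are illustrative names, and whole-word matching on signals such as `more` is crude and will produce false positives on ordinary prose.

```rust
/// Outcome of a check, reduced to the three verdicts named above.
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Pass,
    Skip,
    Warn,
}

/// Pager signals listed in the PR description.
const PAGER_SIGNALS: &[&str] = &["less", "more", "$PAGER", "--pager"];

/// Hypothetical sketch of the p6-no-pager-behavioral decision over raw help text.
pub fn no_pager_verdict(help: &str) -> Verdict {
    let words: Vec<&str> = help
        .split(|c: char| {
            c.is_whitespace() || matches!(c, ',' | '=' | '(' | ')' | '[' | ']' | '.')
        })
        .collect();
    if words.contains(&"--no-pager") {
        return Verdict::Pass; // escape hatch advertised
    }
    if words.iter().any(|w| PAGER_SIGNALS.contains(w)) {
        return Verdict::Warn; // pager referenced, but no way to turn it off
    }
    Verdict::Skip // no pager signal: the requirement does not apply
}
```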

Coverage matrix regeneration lands in the next commit.

Regenerated docs/coverage-matrix.md and coverage/matrix.json to pick
up covers() linkage for the three new behavioral checks. Three
requirements move from single-layer to dual-layer coverage:

- p1-must-no-interactive  +p1-flag-existence (behavioral)
- p1-must-env-var         +p1-env-hints (behavioral)
- p6-must-no-pager        +p6-no-pager-behavioral (behavioral)

MatrixSummary gains a `dual_layer` count so the site /coverage page
and the committed markdown surface the signal directly. JSON shape
addition is additive — schema_version stays at 1.0 and consumers
feature-detect missing fields.
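
One way a `dual_layer` count could be derived from covers() linkage is sketched below. The `Layer` enum, the input shape, and the "at least two distinct layers" rule are all assumptions for illustration; the real `MatrixSummary` computation in src/principles/matrix.rs may differ.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Verifier layers; the names are assumptions based on the PR's wording
/// ("behavioral", "source", "project").
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Layer {
    Behavioral,
    Source,
    Project,
}

/// Counts requirements whose verifiers span at least two distinct layers.
/// Each entry pairs a requirement ID with the layer of one verifier covering it.
pub fn dual_layer_count(covers: &[(&str, Layer)]) -> usize {
    let mut layers_by_req: BTreeMap<&str, BTreeSet<Layer>> = BTreeMap::new();
    for (req, layer) in covers {
        layers_by_req.entry(*req).or_default().insert(*layer);
    }
    layers_by_req
        .values()
        .filter(|layers| layers.len() >= 2)
        .count()
}
```

Two verifiers in the same layer do not make a requirement dual-layer under this rule, since the set only records distinct layers.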

Also silences the dead_code warnings for HelpOutput::subcommands()
(reserved for future P3/P6 checks) and tightens Flag::matches() to
drop the redundant branch that clippy flagged.

brettdavies merged commit f969f8c into dev Apr 21, 2026 (6 checks passed)
brettdavies deleted the feat/v012-behavioral-check-expansion branch April 21, 2026 22:03
brettdavies mentioned this pull request Apr 21, 2026 (10 tasks)
brettdavies added a commit that referenced this pull request Apr 21, 2026:

## Summary

Release branch for v0.1.2. Cherry-picked from `dev`:

- **PR #24** — v0.1.2 feature: 3 new behavioral checks
(`p1-flag-existence`, `p1-env-hints`, `p6-no-pager-behavioral`) + shared
`HelpOutput` cache + `Confidence` field on `CheckResult` + `dual_layer`
stat in coverage matrix + `main` ruleset review count 0 → 1.
- **PR #23** — post-v0.1.1 README + AGENTS sync that never made it into
the v0.1.1 release branch.

Plus the standard release-branch mechanics: `Cargo.toml` bump to
`0.1.2`, `Cargo.lock` refresh, regenerated `CHANGELOG.md` (git-cliff
from squash-commit `## Changelog` bodies). Completions had no drift this
cycle — CLI surface unchanged.

## Type of Change

- [x] `feat`: new user-visible capabilities (three behavioral checks +
`confidence` field + dual-layer coverage stat)

## Testing

- [x] All tests passing on dev head before cherry-pick (41 / 41)
- [x] Pre-push hook green on release branch: fmt / clippy `-Dwarnings` /
test / cargo-deny / Windows compat
- [x] Coverage matrix drift check passes against the committed artifacts
- [x] Smoke-tested against `rg` / `bat` / `bird` / `xr` / `anc` itself

## Files Modified

See the two cherry-pick bodies (commits `72ca148` and `2fcb08d`) plus
`CHANGELOG.md` + `Cargo.toml` + `Cargo.lock` for the release mechanics.

## Deployment Notes

Post-merge:

1. Annotated tag + push — `git tag -a -m "Release v0.1.2" v0.1.2 && git
push origin main --tags`.
2. Tag push triggers `release.yml` → `cargo publish` (Trusted
Publishing) → GitHub Release (draftless, `make_latest: false` during
bottle window) → Homebrew dispatch → `finalize-release` flips
`make_latest: true`.
3. Follow-on on `agentnative-site`: install v0.1.2, regenerate the 10
committed scorecards (H3's proven workflow), run
`scripts/sync-coverage-matrix.sh` to pull `dual_layer: 7` into
`src/data/coverage-matrix.json`.

## Breaking Changes

- [x] No breaking changes. Scorecard schema stays at `1.1`; `confidence`
on results and `dual_layer` on matrix summary are additive.

## Checklist

- [x] CI-equivalent gates green locally
- [x] Cherry-picks touched no guarded paths (`docs/plans/`,
`docs/solutions/`, `docs/brainstorms/`, `docs/reviews/`)
- [x] `Cargo.toml` version matches the tag this branch will produce
- [x] CHANGELOG entries reflect user-facing changes only