Goal
Ship the first end-to-end usable rustmanifest check command. After this PR, a user can run:
rustmanifest check src/
rustmanifest check src/ --format sarif > findings.sarif
rustmanifest check src/ --format tty
against any Rust source tree and get back deterministic, machine-readable findings drawn from the 5 tier-1 rules shipped in Phase 1A.
The CLI is the first deliverable that proves the rules pack works end-to-end; everything until now (Phase 0 + 1A) is foundation.
Production-grade requirements (2026 baseline)
Engine (rustmanifest-engine)
Source newtype wrapping (path, content) with byte-range helpers.
PatternAnalyzer implementing the Analyzer trait for tier-1 (regex) rules.
- Regex pre-compiled once per rule, cached.
- Returns Findings with byte ranges (not line/column — locations live as byte offsets for editor-agnostic precision).
- Honors exclude_globs via globset.
Orchestrator running every analyzer against every file in parallel via rayon.
- Deterministic output: findings sorted by (file, byte_start, byte_end, rule_id), the same key listed under Determinism.
- Cancellation via Arc<AtomicBool> for clean shutdown (MCP server use case in Phase 2).
- Memory budget: configurable per-file size limit (default 10 MiB). Oversized files yield a single Skipped finding instead of being read.
- File discovery via the ignore crate: .gitignore-aware, parallel walk, glob-aware. Binary files skipped.
- Pragma suppression: // rustmanifest: allow(RM-RULE-ID) reason="..." on the same or immediately preceding line. The reason="..." is mandatory; pragmas without a reason are themselves a finding (RM-META-001, reserved — emitted by the engine, not a rules-pack rule).
- Tracing instrumentation: tracing spans on Orchestrator::run, PatternAnalyzer::analyze, file reads. Per-rule timing emitted as event fields.
EngineError via thiserror covering: IO errors, regex compile errors (build-time only), oversized files, walker errors.
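The pragma rule above (same or immediately preceding line, mandatory reason) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the engine's parser: the names Pragma, parse_pragma, and is_suppressed are hypothetical, the sketch works on lines rather than the byte ranges the engine uses, and it treats an empty reason="" as missing, which the text above does not specify.

```rust
/// Outcome of scanning one source line for a suppression pragma.
/// (Hypothetical type; the real engine API may look different.)
#[derive(Debug, PartialEq, Eq)]
pub enum Pragma {
    /// Well-formed pragma: suppresses `rule_id` for the stated reason.
    Allow { rule_id: String, reason: String },
    /// Pragma present but the mandatory reason is missing or empty;
    /// the engine would report this as RM-META-001.
    MissingReason { rule_id: String },
}

const MARKER: &str = "// rustmanifest: allow(";

/// Parses `// rustmanifest: allow(RM-RULE-ID) reason="..."` anywhere in `line`.
/// Returns `None` when the line carries no pragma at all.
pub fn parse_pragma(line: &str) -> Option<Pragma> {
    let start = line.find(MARKER)? + MARKER.len();
    let rest = &line[start..];
    let close = rest.find(')')?;
    let rule_id = rest[..close].trim().to_string();
    if rule_id.is_empty() {
        return None;
    }
    // Everything after `)` must begin with reason="..." to count as a reason.
    let tail = rest[close + 1..].trim_start();
    let reason = tail
        .strip_prefix("reason=\"")
        .and_then(|r| r.find('"').map(|end| r[..end].trim().to_string()))
        .filter(|r| !r.is_empty());
    Some(match reason {
        Some(reason) => Pragma::Allow { rule_id, reason },
        None => Pragma::MissingReason { rule_id },
    })
}

/// True if a finding for `rule_id` on 0-based line `line_idx` is covered by a
/// well-formed pragma on the same line or the immediately preceding one.
pub fn is_suppressed(lines: &[&str], line_idx: usize, rule_id: &str) -> bool {
    let first = line_idx.saturating_sub(1);
    lines
        .iter()
        .take(line_idx + 1)
        .skip(first)
        .filter_map(|l| parse_pragma(l))
        .any(|p| match p {
            Pragma::Allow { rule_id: id, .. } => id == rule_id,
            Pragma::MissingReason { .. } => false,
        })
}
```

In this sketch a pragma without a reason never suppresses anything; it only surfaces as MissingReason so the caller can emit RM-META-001.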
Renderers (rustmanifest-report)
Three renderers; same Finding input, three outputs:
JsonRenderer — canonical JSON, emitted either as newline-delimited Finding objects or as a single document, depending on a flag (default: a single pretty-printed array).
SarifRenderer — SARIF 2.1.0 compliant. Populates runs[].tool.driver.{name,version,informationUri,rules} and runs[].results[].{ruleId,level,message,locations[].physicalLocation.{artifactLocation.uri,region.byteOffset,region.byteLength}}. Level mapping: error→error, warning→warning, info→note, hint→none.
TtyRenderer — human-readable for terminals. Color via anstyle; respects NO_COLOR env var and --no-color CLI flag; auto-detects TTY via anstream. Shows file path, severity-colored badge, rule ID, message, optional source context lines.
All renderers implement the existing Renderer trait with a concrete Error type.
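The SARIF level mapping and byte-based region described above are small enough to sketch directly. The Severity enum and both function names are hypothetical stand-ins for whatever the existing crates define, and the sketch assumes byte_end is exclusive, which the text does not state.

```rust
/// Hypothetical severity type; declaration order gives Hint < Info < Warning < Error.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Hint,
    Info,
    Warning,
    Error,
}

/// Maps a finding severity to the SARIF 2.1.0 `level` string,
/// following the table above: error→error, warning→warning, info→note, hint→none.
pub fn sarif_level(severity: Severity) -> &'static str {
    match severity {
        Severity::Error => "error",
        Severity::Warning => "warning",
        Severity::Info => "note",
        Severity::Hint => "none",
    }
}

/// Renders the `region` object for a byte range as a JSON fragment.
/// Assumes `byte_end` is exclusive, so the length is `byte_end - byte_start`.
pub fn sarif_region_json(byte_start: usize, byte_end: usize) -> String {
    let byte_length = byte_end.saturating_sub(byte_start);
    format!("{{\"byteOffset\":{byte_start},\"byteLength\":{byte_length}}}")
}
```

A real renderer would more likely build the document with a JSON library than with format!; the string form is used here only to keep the sketch dependency-free.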
CLI (rustmanifest-cli)
rustmanifest check [PATHS...]:
- Positional PATHS: one or more files or directories. Default: . (current directory).
- --format <json|sarif|tty>: output format. Default tty on TTY, json when stdout is piped.
- --severity-filter <error|warning|info|hint>: minimum severity to report. Default hint (all).
- --no-color: force no color even on TTY.
- --max-file-size <bytes>: override per-file memory budget. Default 10 MiB.
- --threads <N>: rayon thread pool size. Default: num_cpus.
- --verbose / -v: increase tracing verbosity (stacks: warn → info → debug → trace).
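Two of the defaults above involve a small decision: the output format depends on whether stdout is a terminal, and -v stacks through four levels. The following is a minimal sketch of both, using only the standard library (std::io::IsTerminal, stable since Rust 1.70). The Format enum and the function names are hypothetical, and capping -v at trace for three or more occurrences is an assumption.

```rust
use std::io::IsTerminal;

/// Hypothetical mirror of the `--format <json|sarif|tty>` values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Json,
    Sarif,
    Tty,
}

/// An explicit `--format` always wins; otherwise tty on a terminal, json when piped.
pub fn resolve_format(explicit: Option<Format>, stdout_is_tty: bool) -> Format {
    match explicit {
        Some(format) => format,
        None if stdout_is_tty => Format::Tty,
        None => Format::Json,
    }
}

/// Convenience wrapper that inspects the real stdout.
pub fn resolve_format_for_stdout(explicit: Option<Format>) -> Format {
    resolve_format(explicit, std::io::stdout().is_terminal())
}

/// Maps the number of `-v` flags to a tracing level name:
/// none → warn, -v → info, -vv → debug, -vvv or more → trace.
pub fn verbosity_level(v_count: u8) -> &'static str {
    match v_count {
        0 => "warn",
        1 => "info",
        2 => "debug",
        _ => "trace",
    }
}
```

Keeping the TTY check out of resolve_format makes the decision testable without a real terminal.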
Exit codes:
0 — no findings at or above --severity-filter.
1 — findings present at or above the threshold.
2 — operational error (IO, walker, unparseable args).
tracing-subscriber initialized at startup with env-var override (RUST_LOG).
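The exit-code contract above reduces to one function over the run outcome and the --severity-filter threshold. This sketch assumes a hypothetical Severity enum ordered Hint < Info < Warning < Error and a hypothetical exit_code helper; the error type is left generic because the CLI's concrete error type is not shown here.

```rust
/// Hypothetical severity type; derive(Ord) follows declaration order,
/// so Hint < Info < Warning < Error.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Severity {
    Hint,
    Info,
    Warning,
    Error,
}

/// Computes the process exit code:
/// 2 on an operational error, 1 if any finding is at or above `threshold`, else 0.
pub fn exit_code<E>(outcome: &Result<Vec<Severity>, E>, threshold: Severity) -> u8 {
    match outcome {
        Err(_) => 2,
        Ok(found) if found.iter().any(|s| *s >= threshold) => 1,
        Ok(_) => 0,
    }
}
```

With the default threshold of hint, any finding at all yields exit code 1; raising the filter to error lets a run with only warnings exit 0.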
Quality gates
- Unit tests in each crate: ≥80% line coverage on new code in rustmanifest-engine and rustmanifest-report.
- Integration golden tests:
  - rustmanifest-engine/tests/golden.rs — orchestrator on the in-tree fixtures from Phase 1A. Every fail.rs produces exactly one finding for its rule; every pass.rs produces zero.
  - rustmanifest-cli/tests/cli.rs via assert_cmd — output formats and exit codes, including exit codes per severity filter.
- Snapshot tests in rustmanifest-report via insta: golden TTY/JSON/SARIF outputs for a canonical finding set. Regenerable via cargo insta review.
- Benchmarks in rustmanifest-engine/benches/orchestrator.rs via criterion: orchestrator throughput on synthesized N-file workloads. Performance is not gated in CI (gating comes in a later perf-focused PR), but the benchmarks are compiled in CI as a smoke check.
- No unwrap/expect/panic in non-test code; per-file #![allow(...)] with reason = "..." only inside tests/ and benches/.
- unsafe_code = forbid stays workspace-wide.
Determinism
- Findings sorted by (file, byte_start, byte_end, rule_id) before rendering.
- No HashMap iteration in output paths (use BTreeMap or sort).
- No std::time::Instant reads in finding metadata.
- No randomized rayon scheduling artifacts: collect into Vec, sort, render.
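The "collect into Vec, sort, render" step can be sketched as a single comparator over the four-part key listed above. The Finding struct here is a reduced, hypothetical stand-in that carries only the sort key; the real type has more fields.

```rust
/// Reduced, hypothetical finding carrying only the fields used for ordering.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Finding {
    pub file: String,
    pub byte_start: usize,
    pub byte_end: usize,
    pub rule_id: String,
}

impl Finding {
    pub fn new(file: &str, byte_start: usize, byte_end: usize, rule_id: &str) -> Self {
        Self {
            file: file.to_string(),
            byte_start,
            byte_end,
            rule_id: rule_id.to_string(),
        }
    }
}

/// Sorts findings by (file, byte_start, byte_end, rule_id) so the output
/// does not depend on the order in which parallel workers produced them.
pub fn sort_findings(findings: &mut [Finding]) {
    findings.sort_by(|a, b| {
        a.file
            .cmp(&b.file)
            .then(a.byte_start.cmp(&b.byte_start))
            .then(a.byte_end.cmp(&b.byte_end))
            .then_with(|| a.rule_id.cmp(&b.rule_id))
    });
}
```

Because the key covers every field of this reduced struct, two findings that compare equal are identical, so the sorted result is the same for any input order.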
New dependencies
Runtime (rustmanifest-engine):
rayon (data parallelism)
ignore (gitignore-aware walker)
globset (exclude glob matching; transitively pulled by ignore anyway)
tracing
thiserror
Runtime (rustmanifest-report):
anstyle (color codes)
anstream (TTY detection + auto-disable on non-TTY)
Runtime (rustmanifest-cli):
tracing-subscriber (with env-filter feature)
anyhow (CLI-level errors)
Dev (engine, report, cli):
insta (snapshot tests)
assert_cmd + predicates (CLI integration tests)
tempfile (test fixture trees)
Dev (engine):
criterion (benchmarks)
All MIT or Apache-2.0, well within deny.toml allowlist. multiple-versions = "deny" should remain satisfied; if a transitive duplicate appears the PR addresses it explicitly (no blanket skips).
Out of scope (deferred)
- AST tier-2 analyzers (Phase 1C).
- rustmanifest.toml config loading and profile resolution (Phase 1D).
- explain <rule-id> CLI subcommand (Phase 1D, needs methodology resources).
- Eval corpus and precision/recall measurement (Phase 1E).
- Persistent result cache (Phase 1F).
- unused-pragma finding when a pragma references a rule that didn't fire (Phase 1D).
Acceptance criteria
- cargo +nightly fmt --all -- --check clean.
- cargo clippy --workspace --all-targets --all-features -- -D warnings clean.
- cargo test --workspace passes. New tests:
  - Engine: orchestrator + pattern analyzer + pragma parser + walker.
  - Report: 3 renderers via snapshots.
  - CLI: check exit codes, format selection, severity filtering.
- cargo build --workspace --release succeeds on all 4 supported OS in CI.
- cargo bench --workspace --no-run succeeds (compile-only gate).
- reuse lint 100%.
- cargo deny check clean.
- cargo tree --workspace --duplicates is empty, or each duplicate is justified inline in deny.toml with rationale.
- Schema drift gate green.
- All 13 CI jobs green on the PR.
- Manual smoke: cargo run -p rustmanifest-cli -- check crates/rustmanifest-rules-core/rules/RM-SEC-001/fail.rs returns exit code 1 and a finding for RM-SEC-001 in the chosen format.
Risks and mitigations
- Output non-determinism under rayon: explicit sort step before rendering.
- SARIF schema correctness: validate snapshot against published SARIF 2.1.0 schema in a dedicated test (load the schema JSON, validate generated output via jsonschema); if jsonschema is too heavy, fall back to structural equality with a hand-curated golden.
- ignore walker pulls a large dep tree: documented; mitigated by reusing globset instead of pulling a separate glob crate.
- criterion is a heavy build-only dep: gated to [dev-dependencies], no impact on the release binary.
- Workspace lints break dev deps: tests/ and benches/ files get file-level #![allow(clippy::unwrap_used, clippy::expect_used, clippy::panic, reason = "test/bench code")] — never workspace-wide.