Skip to content

feat(docs): warn-or-allowlist for non-rivet files in scanned dirs (Task #56)#224

Merged
avrabe merged 1 commit intomainfrom
feat/docs-warn-or-allowlist
Apr 27, 2026
Merged

feat(docs): warn-or-allowlist for non-rivet files in scanned dirs (Task #56)#224
avrabe merged 1 commit intomainfrom
feat/docs-warn-or-allowlist

Conversation

@avrabe
Copy link
Copy Markdown
Contributor

@avrabe avrabe commented Apr 26, 2026

User directive (verbatim)

"rivet generates a lot of docs being generated with no front matter and other things and rivet is just silent in here even if the directory we choose has them instead of giving warnings or asking them to whitelist these as not rivet part and in the linkage"

Design choice: warn + allowlist hybrid

The docs scanner walks every docs: directory in rivet.yaml. Files that are not valid rivet docs (no YAML front-matter, malformed front-matter) used to be silently dropped via log::info! on a logger that is off by default — so the user had no signal that anything was being hidden, and artifact IDs referenced in those files never reached the link graph.

This PR makes the scanner loud by default and gives the user one knob to opt out:

  1. Loud by default. Every declined file emits a stderr warning of the shape:

    warning: rivet doc scanner skipping ./docs/foo.md: no YAML frontmatter
      hint: if this is intentional, add the path to docs[].exclude in rivet.yaml
    

    rivet validate and rivet docs check print a one-line summary at the end of the scan: <loaded> loaded, <warned> skipped, <excluded> excluded by allowlist.

  2. Allowlist via docs[].exclude. rivet.yaml now accepts both forms:

    docs:
      - docs                                   # legacy: bare path
      - path: arch                             # detailed: path + opt-out
        exclude:
          - "generated/**"                     # `**` matches subtree
          - "*.draft.md"                       # bare patterns match basename

    Files matching an exclude: glob are dropped silently — that is the user's "yes, I know, this isn't a rivet doc" signal.

The new DocsEntry enum on model::ProjectConfig is #[serde(untagged)], so legacy docs: [docs, arch] configs continue to parse with no migration step.

Self-audit findings

Running cargo run --release -- validate against rivet's own repo with this PR applied surfaces 7 markdown files that were silently skipped under the old behavior:

warning: rivet doc scanner skipping ./docs/feature-model-bindings.md: no YAML frontmatter
warning: rivet doc scanner skipping ./docs/feature-model-schema.md: no YAML frontmatter
warning: rivet doc scanner skipping ./docs/getting-started.md: no YAML frontmatter
warning: rivet doc scanner skipping ./docs/oracles.md: no YAML frontmatter
warning: rivet doc scanner skipping ./docs/pure-variants-comparison.md: no YAML frontmatter
warning: rivet doc scanner skipping ./docs/schemas.md: no YAML frontmatter
warning: rivet doc scanner skipping ./docs/what-is-rivet.md: no YAML frontmatter
rivet docs: 6 loaded, 7 skipped (warnings above), 0 excluded by allowlist

This PR does not modify rivet.yaml — it only surfaces the situation. The follow-up decision (back-fill front-matter vs. add exclude: entries) is left to the user, intentionally; pinning the current behavior is the conservative move.

Linkage implication (deferred, documented)

The user flagged that silently-skipped docs may contain artifact IDs in prose that should participate in the link graph. This PR does not invent a fix for that — it pins the current behavior (skipped = no link-graph contribution) and lets the warning act as the trigger that makes the decision visible. A separate follow-up can consider whether un-frontmattered files should still be parsed for [[ID]] references, with their own fragment-only "anonymous source" handling. Tracked as a TODO; not bundled here to keep the diff focused.

Migration

  • docs: [docs, arch] → continues to work, identical to the old behavior plus the new warnings.
  • docs: block with mixed legacy and detailed entries → fully supported; verified by the unit + integration tests.
  • No existing rivet.yaml in the wild needs to change. Auditors who want the warnings silenced can add the relevant exclude: patterns.

Where the change lives

  • rivet-core/src/model.rs — new DocsEntry enum, ProjectConfig.docs: Vec<DocsEntry>.
  • rivet-core/src/document.rsload_documents_with_report, ScanReport, glob_to_regex helper.
  • rivet-core/src/doc_check.rsDocScanRoot, ScanSummary, collect_docs_with_summary.
  • rivet-cli/src/main.rs (3 call sites) and rivet-cli/src/serve/mod.rs — wired to the new entry/exclude shape and surface the summary line.
  • rivet-cli/src/docs.rs — embedded help-doc updated to document the new YAML surface and the loud-by-default behavior.

Test plan

  • cargo test -p rivet-core --lib document::tests — 40 tests pass, 5 new unit tests covering glob_to_regex (*/**/? semantics), load_documents_with_report warn paths, exclude-glob silencing, invalid-pattern resilience, and DocsEntry serde round-trip.
  • cargo test -p rivet-core --lib — all 826 lib tests pass.
  • cargo test -p rivet-cli --test docs_scanner_warnings — 3 new integration tests pass: warn on missing front-matter, silent under exclude glob, legacy bare-path schema still warns.
  • cargo test -p rivet-cli --tests — all integration tests pass (no regressions in cli_commands, init, serve, variant, etc.).
  • cargo fmt --check — clean.
  • cargo clippy -p rivet-core -p rivet-cli --all-targets — no new warnings.
  • Self-audit: ./target/release/rivet validate --format json against rivet's own repo emits the expected 7 warnings + summary line, exit code unchanged.

🤖 Generated with Claude Code

The docs scanner used to silently skip every `.md` file under a
configured `docs:` directory that lacked YAML front-matter (or whose
front-matter failed to parse) — `log::info!` on a logger that is off
by default. Side effect: artifact IDs in those files never made it
into the link graph, and the user had no signal that anything was
hidden.

Loud-by-default behavior:
- `document::load_documents_with_report` and
  `doc_check::collect_docs_with_summary` now print a stderr warning
  per declined file ("rivet doc scanner skipping <path>: <reason>"
  with a hint pointing at the new `docs[].exclude` knob), plus a
  one-line summary tally per scan.
- A new `DocsEntry` enum on `ProjectConfig` accepts both the legacy
  bare-path form and a `{path, exclude}` table whose globs (`*`,
  `**`, `?`) silence the warning for files the user has explicitly
  marked as out of scope. `docs[].exclude` is matched against the
  path *relative to the docs root*; bare patterns (no `/`) match the
  basename at any depth.
- All existing rivet.yaml shapes stay valid — the new `exclude:` is
  opt-in, and untagged serde keeps `docs: [docs, arch]` working.

Self-audit on rivet's own repo finds 7 files that were silently
skipped before this change:
  docs/feature-model-bindings.md, docs/feature-model-schema.md,
  docs/getting-started.md, docs/oracles.md,
  docs/pure-variants-comparison.md, docs/schemas.md,
  docs/what-is-rivet.md.
This PR pins the *current* behavior — the user gets to decide
whether to back-fill front-matter or add an `exclude:` entry.

The "do silently-skipped docs contain artifact IDs the link graph
should pick up?" question is intentionally left for a follow-up;
the warning is the trigger that makes that decision visible.

Tests: 5 new unit tests (glob_to_regex + load_documents_with_report
behavior across missing/malformed front-matter, exclude globs,
invalid pattern handling, DocsEntry serde round-trip) + 3 integration
tests (`rivet validate` end-to-end stderr assertions for the warn
path, the exclude path, and the legacy bare-path schema).

Implements: REQ-010
Verifies: REQ-004
@github-actions
Copy link
Copy Markdown

📐 Rivet artifact delta

No artifact changes in this PR. Code-only changes (renderer, CLI wiring, tests) don't touch the artifact graph.

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 26, 2026

Codecov Report

❌ Patch coverage is 63.77358% with 96 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
rivet-core/src/doc_check.rs 0.00% 75 Missing ⚠️
rivet-core/src/document.rs 91.37% 15 Missing ⚠️
rivet-core/src/model.rs 62.50% 6 Missing ⚠️

📢 Thoughts on this report? Let us know!

@avrabe avrabe merged commit 28333f1 into main Apr 27, 2026
23 of 28 checks passed
@avrabe avrabe deleted the feat/docs-warn-or-allowlist branch April 27, 2026 03:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant