
Wire IndicatorValidator + RecordStructureValidator at strict_marc (bd-0x73.1) #175

Merged: dchud merged 1 commit into main from feat/bd-0x73.1-validators on May 8, 2026

dchud commented May 8, 2026

Summary

Resolves bd-0x73.1 — the four unwired validator structs from the v0.8
audit. Decisions per validator:

  • IndicatorValidator → wired at validation_level="strict_marc" (per-tag MARC 21 indicator rules; IsbnValidator/EncodingValidator left opt-in).
  • RecordStructureValidator → wired at validation_level="strict_marc" (MARC 21 leader-byte semantics).
  • IsbnValidator → opt-in callable API; documented in new docs/reference/validators.md.
  • EncodingValidator → opt-in callable API; documented alongside IsbnValidator. Heuristic detection deliberately stays out of strict_marc since strict_marc must be deterministic.

The split is principled: format-semantic checks run automatically, while content/heuristic checks stay opt-in. It is not a backward-compatibility carve-out, since the project is still pre-1.0.
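
A minimal sketch of that classification, written as a standalone Python model rather than the crate's real API. The two sets and the `runs_automatically` helper are hypothetical names introduced for illustration; only the validator names and the `strict_marc` level come from this PR.

```python
# Illustrative model only; not the library's API.
# Classification of the four validators as described in this PR.
FORMAT_SEMANTIC = {"IndicatorValidator", "RecordStructureValidator"}
CONTENT_HEURISTIC = {"IsbnValidator", "EncodingValidator"}


def runs_automatically(validator: str, validation_level: str) -> bool:
    """Return True if the validator is auto-run at the given level.

    Format-semantic validators run at "strict_marc"; content/heuristic
    validators never auto-run and must be called explicitly.
    """
    if validator not in FORMAT_SEMANTIC | CONTENT_HEURISTIC:
        raise ValueError(f"unknown validator: {validator}")
    return validator in FORMAT_SEMANTIC and validation_level == "strict_marc"
```

Under this model, `runs_automatically("IsbnValidator", "strict_marc")` is false: the heuristic helpers stay out of the deterministic path at every level.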

Design call worth flagging

Per-tag indicator violations and leader-byte semantic violations reuse existing error codes E201 and E002 with tag- or position-specific expected: strings, rather than minting new variants. Granularity belongs at validation_level, not at the error-code dimension — same conceptual concern, same code, different sub-rule. This keeps the public error surface from fragmenting as more validators get wired.
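
To make the code reuse concrete, here is a hedged Python sketch of two sub-rules sharing existing codes. The `ValidationIssue` type, the function names, and the exact wording of the `expected` strings are assumptions for illustration, not the crate's types. The rule values (245 first indicator in {0, 1}, record status in {a, c, d, n, p}) are taken from the commit message, and the record status sits at leader position 05 in MARC 21.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ValidationIssue:
    code: str      # existing public error code, e.g. "E201" or "E002"
    expected: str  # sub-rule detail carried as text, not as a new code
    found: str


# Per-tag rules: tag -> allowed first-indicator values (only 245 shown).
IND1_RULES = {"245": {"0", "1"}}
RECORD_STATUS = {"a", "c", "d", "n", "p"}


def check_ind1(tag: str, ind1: str) -> Optional[ValidationIssue]:
    """Per-tag indicator rule; tags without a rule are not checked here."""
    allowed = IND1_RULES.get(tag)
    if allowed is None or ind1 in allowed:
        return None
    return ValidationIssue("E201", f"{tag} ind1 in {sorted(allowed)}", ind1)


def check_record_status(leader: str) -> Optional[ValidationIssue]:
    """Leader-byte rule for position 05 (record status)."""
    if len(leader) != 24:
        return ValidationIssue("E002", "leader of 24 characters", str(len(leader)))
    status = leader[5]
    if status in RECORD_STATUS:
        return None
    return ValidationIssue("E002", f"leader[5] in {sorted(RECORD_STATUS)}", status)
```

Both checks return the same shape, so a caller that already handles E201 and E002 needs no new branches; the `expected` text is what distinguishes one sub-rule from another.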

Both new failure shapes are recoverable in lenient/permissive, consistent with bd-0x73.10's single rule.
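
The recoverability rule can be pictured with a small Python sketch. It is a model under stated assumptions, not the crate's implementation: `ValidationError`, `report`, and the non-recovering mode name `"strict"` used in the usage note are hypothetical; only `lenient` and `permissive` are named in this PR.

```python
class ValidationError(Exception):
    """Hypothetical error type carrying a public error code."""

    def __init__(self, code: str, message: str):
        super().__init__(f"{code}: {message}")
        self.code = code


# Modes in which the two new failure shapes are recoverable.
RECOVERING_MODES = {"lenient", "permissive"}


def report(error: ValidationError, recovery_mode: str, collected: list) -> None:
    """Record the error and continue in a recovering mode; otherwise raise."""
    if recovery_mode in RECOVERING_MODES:
        collected.append(error)
        return
    raise error
```

With `recovery_mode="lenient"` the error lands in `collected` and parsing can continue; with any other mode (for example an assumed `"strict"`) the same error propagates to the caller.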

RecordStructureValidator::validate_leader now fires InvalidLeader (E002) instead of the misleading InvalidField it built before (drive-by fix).

Test plan

  • cargo test --test validation_level_matrix — two new fixtures (e201_per_tag_indicator_245.bin, e002_invalid_record_status.bin) exercising the 2×3 {validation_level} × {recovery_mode} matrix.
  • tests/error_coverage.toml — both cases registered with validation_level = "strict_marc"; harness reports 20/20 wired (was 18/18).
  • .cargo/check.sh (full): 538 lib/integration + 32 doc tests pass; clippy clean (core + python); ruff clean; maturin builds; mkdocs builds.
  • Python suite: 731 passed / 8 skipped.

Docs

  • New docs/reference/validators.md page covering all four validators with auto-run vs opt-in classification, code examples, and the format-semantic / content-heuristic principle.
  • docs/reference/error-codes.md — E002 and E201 entries describe both failure shapes.
  • docs/reference/error-handling.md — validation-level table extended with the per-tag and leader-semantic rows.
  • CHANGELOG.md [Unreleased] updated.

🤖 Generated with Claude Code

…-0x73.1)

Resolves the four unwired validator structs from the v0.8 audit per
bd-0x73.1: format-semantic checks (`IndicatorValidator`,
`RecordStructureValidator`) now run automatically at
`validation_level="strict_marc"`; content/heuristic helpers
(`IsbnValidator`, `EncodingValidator`) remain user-callable opt-in
APIs documented in `docs/reference/validators.md`.

Per-tag MARC 21 indicator semantics (e.g., 245 ind1 ∈ {0,1}) and
leader-byte semantics (e.g., record_status ∈ {a,c,d,n,p}) reuse
existing error codes E201 and E002 with tag- or position-specific
`expected:` strings — granularity belongs at validation_level, not
at the error-code dimension. Both are recoverable in
`lenient`/`permissive`, consistent with bd-0x73.10's single rule.

`RecordStructureValidator::validate_leader` now fires `InvalidLeader`
(E002) instead of the misleading `InvalidField` it built before.

Test matrix extends `validation_level_matrix.rs` with two new
fixtures (per-tag indicator failure on 245, invalid record_status
in leader); `error_coverage.toml` carries both with
`validation_level = "strict_marc"`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codspeed-hq Bot commented May 8, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks
🆕 2 new benchmarks
⏩ 60 skipped benchmarks [1]

Performance Changes

| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| 🆕 | Simulation | parse_10k_clean_strict | N/A | 90.5 ms | N/A |
| 🆕 | Simulation | parse_10k_bad_indicators_lenient | N/A | 90.3 ms | N/A |

Comparing feat/bd-0x73.1-validators (e60895f) with main (deacc24)


Footnotes

  1. 60 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them in CodSpeed to remove them from the performance reports.

dchud self-assigned this May 8, 2026
dchud merged commit 9ebb39e into main May 8, 2026
49 of 50 checks passed
dchud deleted the feat/bd-0x73.1-validators branch May 8, 2026 22:53