Skip to content

E007 positional context propagation — enrich IoError at all io-fallible parse sites #210

@dchud

Description

@dchud

Context

Phase 1 finding (bd-0x73.21 comment #97, E007 section + cross-cutting #5): impl From<std::io::Error> for MarcError at src/error.rs:1449 constructs MarcError::IoError with all positional fields None. The impl's own comment says "without surrounding context." Every ? on an io::Error strips positional context — for multi-record streams, users see "IO error" with no record_index, byte_offset, or source.

This is the largest-blast-radius bundle in Phase 2 because it touches every io-fallible call site in every reader.

Method

  1. Inventory. Find every site in src/ where ? is used on a Result<_, io::Error> (or equivalent) in a context where ParseContext is available or could be made available. Approximate scope:

    • src/iso2709.rs, src/iso2709_skeleton.rs — main parse loops.
    • src/reader.rs, src/authority_reader.rs, src/holdings_reader.rs — reader entry points.
    • src/recovery.rs — recovery-path io.
    • Possibly more discovered during the audit.
  2. Replace bare ? with .map_err(|e| ctx.err_io(e)) at each site. Use the LIVE ParseContext::err_io helper at src/iso2709.rs:373.

  3. Where ParseContext isn't directly available (e.g., low-level utility functions called from many places): thread it in if practical, OR document the site as From-impl fallback (acceptable for utility code that doesn't know the stream context).

  4. Update the From impl (src/error.rs:1449) to add a doc-comment warning: callers in parse paths should prefer ctx.err_io(...) for context. Keep the From impl for backwards-compat and for non-parse io contexts.

  5. Update manifest E007 entry. Currently expected_context = []. After enrichment, expected_context should include record_index and source_name for parse-path triggers. Either expand the existing entry's expected_context (if all triggers now carry context) or add separate manifest entries for parse-path vs raw-io triggers.

  6. Add regression test. Parse a multi-record stream where one record's read fails partway (e.g., via a mock Reader that errors after N bytes). Assert the surfaced E007 carries record_index matching the failing record.

Files likely touched

Wide. Every io-fallible call site in readers: src/iso2709.rs, src/iso2709_skeleton.rs, src/reader.rs, src/authority_reader.rs, src/holdings_reader.rs, src/recovery.rs, possibly src/leader.rs. Plus src/error.rs (From impl doc update), tests/error_coverage.toml (E007 entry expansion), tests/data/error_fixtures/ (possibly new fixtures), regression tests in tests/ and tests/python/, CHANGELOG.

Risk

Largest diff in Phase 2. Take care to:

  • Not change semantics — only enrich context.
  • Verify each call site has ctx in scope; some may not (use From impl fallback there).
  • Run full test suite to catch any subtle changes.
  • Verify Python OSError surface still works (E007 fires as Python OSError per docs; the wrapping should preserve that).

Done when

  • All io-fallible call sites in readers use ctx.err_io(...) (or are documented as From-impl-only with reason).
  • E007 fired from a parse path carries record_index and source_name (verified by regression test).
  • Manifest E007 entry's expected_context updated to match.
  • Python OSError surface unchanged for end users.

Reference

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions