Enrich E007 IoError with positional context on mid-record reads by dchud · Pull Request #222 · dchud/mrrc

dchud · 2026-05-28T00:03:30Z

What

MarcError::IoError (E007) raised while reading a record's data area now carries positional context — record_index, byte_offset, and source_name — instead of the context-free From<std::io::Error> fallback that left all three None. Before this, a read failure partway through a multi-record stream surfaced as a bare "IO error" with no indication of which record or byte position failed.

How

read_record_data (src/iso2709.rs) now constructs the error via ParseContext::err_io(e) rather than e.into(). At that point begin_record has run and the leader has been consumed, so the context is fully positioned for the in-progress record. All three ISO 2709 readers (bibliographic, authority, holdings) share the parse_iso2709_record skeleton, so this single site covers every reader's mid-record path.

The leader read at a record boundary (read_leader_bytes) deliberately keeps the context-free From fallback: it runs before begin_record, and a clean EOF there legitimately means there is no next record, so the context's record_index still names the previous record. Enriching there would misattribute a boundary-read failure. This reasoning is now documented on both read_leader_bytes and the From<io::Error> impl.

Python's OSError surface is unchanged — E007 still maps to built-in OSError; only the typed Rust variant gained fields.

Tests

New tests/e007_io_positional_context.rs: a multi-record stream whose second record's data read fails asserts the surfaced IoError carries record_index == 2, the reader's source_name, and a byte_offset past record 1. A companion test locks the leader-boundary fallback (no record_index).
Error-coverage manifest: the E007 entry is split into a raw-io case (io_error, leader boundary, no context) and a parse-path case (io_error_parse_path, mid-record, expected_context = [record_index, byte_offset, source_name]), with a matching source_name → source probe mapping and a Python harness skip branch.

Closes #210

Bead: bd-0x73.21.6

read_record_data now constructs MarcError::IoError via ParseContext::err_io rather than the context-free From<io::Error> conversion, so an I/O failure while reading a record's data area carries record_index, byte_offset, and source_name. The leader read at a record boundary keeps the From fallback by design: begin_record has not run there, so enriching would misattribute the failure to the previous record (or none, on the first). Documented on both read_leader_bytes and the From impl. All three ISO 2709 readers (bibliographic, authority, holdings) share the parse_iso2709_record skeleton, so the single read_record_data fix covers every reader's mid-record path. Python's OSError surface is unchanged. Adds a regression test (multi-record stream, the second record's data read fails) asserting record_index and source_name, plus a companion test locking the leader-boundary context-free fallback. Splits the error-coverage manifest E007 entry into raw-io (leader boundary, no context) and parse-path (mid-record, enriched) cases. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

codspeed-hq · 2026-05-28T00:06:54Z

Merging this PR will improve performance by 19.81%

⚠️

Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️

Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 11 improved benchmarks
✅ 49 untouched benchmarks
⏩ 18 skipped benchmarks¹

Performance Changes

	Mode	Benchmark	`BASE`	`HEAD`	Efficiency
⚡	WallTime	`test_parallel_read_4x_1k`	110.1 ms	90.7 ms	+21.48%
⚡	WallTime	`test_threaded_with_title_extraction_2x_10k`	359.3 ms	300.6 ms	+19.56%
⚡	WallTime	`test_file_parallel_4x_10k`	1.2 s	1 s	+14.14%
⚡	WallTime	`test_file_parallel_4x_10k_with_extraction`	1,217 ms	998.4 ms	+21.9%
⚡	WallTime	`test_parallel_read_with_extract_4x_1k`	116.1 ms	98.6 ms	+17.72%
⚡	WallTime	`test_threaded_reading_4x_1k`	111.3 ms	89.9 ms	+23.89%
⚡	WallTime	`test_threaded_reading_4x_10k`	1,114.9 ms	924 ms	+20.67%
⚡	WallTime	`test_parallel_read_4x_10k`	1,135.7 ms	949.3 ms	+19.63%
⚡	WallTime	`test_threading_speedup_4x_10k`	1,153.1 ms	961.1 ms	+19.97%
⚡	WallTime	`test_threaded_with_title_extraction_4x_10k`	1,147.7 ms	955.1 ms	+20.16%
⚡	WallTime	`test_parallel_read_with_extract_4x_10k`	1.2 s	1 s	+19.11%

Tip

Curious why this is faster? Comment @codspeedbot explain why this is faster on this PR, or directly use the CodSpeed MCP with your agent.

_{Comparing fix/e007-positional-context-bd-0x73.21.6 (9627258) with main (15bf24c)}

18 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Three ParseContext helpers had no callers: err_record_length_invalid, err_base_address_invalid, and err_base_address_not_found. Their error firing sites live in Leader::from_bytes and Leader::validate_for_reading, which are low-level Leader methods with no ParseContext in scope. The ISO 2709 / JSON / marcjson parse paths enrich those errors after the fact via e.with_position(ctx).with_bytes_near(...), so the helpers are redundant rather than refactorable into the firing sites. Decision: delete (option b). err_io, originally listed as dead, is now live after the E007 enrichment work (PR #222), so it stays. The two u32::try_from checks in MarcWriter::write_record are redundant runtime guards: check_iso2709_size caps both record_length and base_address at 99_999, far below u32::MAX, so the error arms are unreachable from the public API. Decision: keep with a comment (option c) — they still prevent a silent usize->u32 truncation if a future refactor reaches the leader-population step without routing through check_iso2709_size first. Bead: bd-0x73.21.4 Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

dchud self-assigned this May 28, 2026

dchud merged commit e423469 into main May 28, 2026
48 checks passed

dchud deleted the fix/e007-positional-context-bd-0x73.21.6 branch May 28, 2026 00:34

dchud added a commit that referenced this pull request May 28, 2026

chore(beads): close bd-0x73.21.6 after #222 merge

2150151

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

dchud mentioned this pull request May 28, 2026

Remove dead leader err_* helpers, annotate writer u32 guards #227

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enrich E007 IoError with positional context on mid-record reads#222

Enrich E007 IoError with positional context on mid-record reads#222
dchud merged 1 commit into
mainfrom
fix/e007-positional-context-bd-0x73.21.6

dchud commented May 28, 2026

Uh oh!

codspeed-hq Bot commented May 28, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

dchud commented May 28, 2026

What

How

Tests

Uh oh!

codspeed-hq Bot commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging this PR will improve performance by 19.81%

Performance Changes

Footnotes

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

codspeed-hq Bot commented May 28, 2026 •

edited

Loading