v0.4.3

@PCfVW tagged this 01 May 07:41
Phase 4.7 (Remote-only NPZ inspection via reader-generic API) +
audit-driven follow-on work:

- inspect_npz_from_reader<R: Read + Seek> — Phase 4.7 deliverable.
- parse() now memory-maps safetensors (~3236x faster on an 11.6 GiB
  shard, audit finding #2).
- TensorEntry::num_elements + GGUF CLI overflow guards (audit #12).
- New perf-discipline infrastructure: CLAUDE.md "Performance Changes"
  rule, docs/perf-experiments.md case-study log, ad-hoc bench
  harnesses for dequant + parse paths.
- v0.4.0 GGUF refactor re-validated: Q4_0 win confirmed at ~8 %
  (smaller than originally claimed 10-15 %), Q8_0 measured as a
  ~6 % regression (opposite direction of CHANGELOG claim) — recorded
  in docs/perf-experiments.md entry #3, code unchanged.
- Two perf-claim experiments measured and discarded (NPZ memset
  elimination -33 %, FP8 chunked extend -23 %), recorded in
  docs/perf-experiments.md entries #1 and #2.
- memmap2 promoted to mandatory dependency.
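
For readers unfamiliar with the kind of overflow guard mentioned in the
TensorEntry::num_elements item, here is a minimal sketch of the general
technique: multiply shape dimensions with checked arithmetic so that a
hostile or corrupt header yields None instead of a wrapped size. The names
checked_num_elements and checked_byte_len are hypothetical and are not
taken from the crate; the actual signature of TensorEntry::num_elements is
not shown in these notes and may differ.

```rust
/// Hypothetical helper (not the crate's API): element count for `shape`,
/// or `None` if the product of the dimensions overflows `usize`.
/// An empty shape is treated as a scalar and yields `Some(1)`.
fn checked_num_elements(shape: &[usize]) -> Option<usize> {
    shape
        .iter()
        // `try_fold` stops at the first `None`, i.e. at the first overflow.
        .try_fold(1usize, |acc, &dim| acc.checked_mul(dim))
}

/// Hypothetical helper: total byte length of a tensor, guarding both the
/// element-count product and the final multiplication by the element width.
fn checked_byte_len(shape: &[usize], bytes_per_element: usize) -> Option<usize> {
    checked_num_elements(shape)?.checked_mul(bytes_per_element)
}
```

Under these assumptions a caller such as a CLI can report an error when
either helper returns None, rather than allocating or indexing with a
silently wrapped size.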

CHANGELOG consolidates the [Unreleased] entries into one [0.4.3]
section in standard Keep a Changelog order (Added -> Changed ->
Fixed). Cargo.lock refreshed via cargo check.

Local publish dry-run gauntlet green:
- cargo fmt --check
- cargo clippy --all-targets -- -D warnings
- cargo clippy --all-targets --all-features -- -D warnings
- cargo test --all-features  (320 unit + every integration suite)
- cargo doc --all-features --no-deps  (RUSTDOCFLAGS=-D warnings)
- cargo publish --dry-run --allow-dirty  (106 files, 7.2 MiB)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>