Releases: PaytonWebber/model2vec-zig

v0.2.0

12 Jun 01:06
ed5b443

Added

  • Zero-copy model loading: on little-endian targets, the matrix points directly into the file or embedded bytes when alignment allows. Load time on potion-base-8M drops from 36 ms to 21 ms (f32), and @embedFile consumers no longer pay RSS for a second copy of the matrix.
  • tq4 files carry a tq4_version metadata field; the reader rejects unknown versions.
  • The safetensors parser is fuzzed: a randomized harness runs on every zig build test, plus a zig build test --fuzz entry point.
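
To illustrate the zero-copy condition described in the first item above, here is a minimal sketch of the kind of check involved. It is not the library's code: `viewAsF32` is a hypothetical helper, and the fallback (an aligned copy, with byte swapping on big-endian targets) is assumed rather than shown.

```zig
const std = @import("std");
const builtin = @import("builtin");

/// Hypothetical helper (not the library's API): view raw little-endian f32
/// bytes as a []const f32 without copying. Returns null when the caller
/// must fall back to making its own aligned copy.
fn viewAsF32(bytes: []const u8) ?[]const f32 {
    // The stored values are little-endian, so only little-endian targets
    // can reinterpret them in place.
    if (builtin.cpu.arch.endian() != .little) return null;
    if (bytes.len % @sizeOf(f32) != 0) return null;
    // The bytes may start at any address inside a file or an @embedFile blob.
    if (!std.mem.isAligned(@intFromPtr(bytes.ptr), @alignOf(f32))) return null;

    const ptr: [*]const f32 = @ptrCast(@alignCast(bytes.ptr));
    return ptr[0 .. bytes.len / @sizeOf(f32)];
}

test "aligned bytes are viewed in place, misaligned bytes are not" {
    const values = [_]f32{ 1.0, 2.0, 3.0 };
    const floats: []const f32 = &values;
    const bytes = std.mem.sliceAsBytes(floats);

    // On a big-endian target the helper always declines, so skip.
    const view = viewAsF32(bytes) orelse return error.SkipZigTest;
    try std.testing.expectEqual(@intFromPtr(floats.ptr), @intFromPtr(view.ptr));
    try std.testing.expectEqual(@as(usize, 3), view.len);

    // Starting one byte in breaks the 4-byte alignment, so no view is handed out.
    try std.testing.expect(viewAsF32(bytes[1..5]) == null);
}
```

Because the returned slice aliases the input, the input bytes must stay alive for as long as the view is used, which is the same lifetime rule the breaking-change note for Model.loadFromBytes states.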

Changed (breaking)

  • safetensors.parse now takes an options argument, and Matrix slices are const.
  • Model.loadFromBytes borrows the safetensors bytes, which must outlive the Model.
  • tq4 files written by 0.1.0 are rejected; re-run m2v-quantize --tq4.

Fixed

  • Three parser overflow bugs against crafted files: a wrapped header-length bounds check (crash), wrap-prone tensor offset checks, and an overflowing shape product that could hand out a matrix backed by no data.
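
The three bug classes above share one cause: arithmetic on untrusted sizes that wraps before it is compared. The sketch below shows wrap-safe versions of each check under stated assumptions. The function and error names are hypothetical, not the library's; the only format fact relied on is that a safetensors file starts with an 8-byte little-endian header length.

```zig
const std = @import("std");

// Hypothetical error set for this sketch.
const ParseError = error{ Truncated, HeaderOutOfBounds, Overflow, BadTensorExtent };

/// Returns the offset at which the tensor data section begins.
fn dataSectionStart(file: []const u8) ParseError!usize {
    if (file.len < 8) return error.Truncated;
    const header_len = std.mem.readInt(u64, file[0..8], .little);
    // Compare against the bytes that remain instead of computing
    // 8 + header_len, which wraps for a crafted length near maxInt(u64).
    if (header_len > file.len - 8) return error.HeaderOutOfBounds;
    return 8 + @as(usize, @intCast(header_len));
}

/// Byte size of a tensor with the given shape, using checked multiplication.
fn tensorByteLen(shape: []const u64, elem_size: usize) ParseError!usize {
    var n: usize = elem_size;
    for (shape) |dim| {
        const d = std.math.cast(usize, dim) orelse return error.Overflow;
        n = try std.math.mul(usize, n, d);
    }
    return n;
}

/// Validates a [begin, end) extent without adding untrusted values together.
fn checkExtent(data_len: usize, begin: usize, end: usize, expected: usize) ParseError!void {
    if (begin > end or end > data_len) return error.BadTensorExtent;
    if (end - begin != expected) return error.BadTensorExtent;
}

test "crafted inputs are rejected instead of wrapping" {
    // Header length of maxInt(u64): a naive `8 + header_len` wraps to 7.
    const file = ([_]u8{0xff} ** 8) ++ ([_]u8{0} ** 4);
    try std.testing.expectError(error.HeaderOutOfBounds, dataSectionStart(&file));

    // 2^32 x 2^32 f32 elements: the byte count does not fit in 64 bits.
    const shape = [_]u64{ 1 << 32, 1 << 32 };
    try std.testing.expectError(error.Overflow, tensorByteLen(&shape, 4));
}
```

With wrapping multiplication, the second shape would yield a byte count of zero, which matches an empty extent and is how a matrix can end up backed by no data.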

Full details in CHANGELOG.md. Requires Zig 0.16.0.

v0.1.0

11 Jun 04:32

model2vec/potion inference in pure Zig: WordPiece tokenization, mean pooling, L2 normalization, parity-tested against the Python reference (max abs diff under 1e-5).

  • f32 and i8 safetensors; i8 quantizer output is byte-identical to the reference implementation's
  • TurboQuant-style 4-bit format (m2v-quantize --tq4): 8x smaller than f32, costs 0.0020 mean NDCG@10 on the MTEB(eng, v2) retrieval suite (see docs/turboquant.md)
  • Model.loadFromBytes for @embedFile-shipped models, Model.fingerprint() for keying persisted vectors
  • 4.1 µs per embed of a 17-token text on potion-base-8M (x86_64, ReleaseFast)
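
For readers new to this model family, the pooling step named above (mean pooling followed by L2 normalization) can be sketched as follows. This is an illustration only: `embedTokens`, its parameters, and the row-major matrix layout are assumptions, not the library's API, and tokenization is taken as already done.

```zig
const std = @import("std");

/// Hypothetical sketch: average the embedding rows selected by `token_ids`,
/// then scale the result to unit length. `matrix` is assumed row-major with
/// `dim` columns; `out` receives the embedding.
fn embedTokens(matrix: []const f32, dim: usize, token_ids: []const u32, out: []f32) void {
    std.debug.assert(out.len == dim);
    @memset(out, 0);
    if (token_ids.len == 0) return;

    // Sum the rows for each token.
    for (token_ids) |id| {
        const row = matrix[@as(usize, id) * dim ..][0..dim];
        for (out, row) |*o, r| o.* += r;
    }

    // Mean pooling: divide by the token count, accumulating the squared norm.
    const inv_n = 1.0 / @as(f32, @floatFromInt(token_ids.len));
    var sum_sq: f32 = 0;
    for (out) |*o| {
        o.* *= inv_n;
        sum_sq += o.* * o.*;
    }

    // L2 normalization; an all-zero vector is left as is.
    const norm = @sqrt(sum_sq);
    if (norm == 0) return;
    for (out) |*o| o.* /= norm;
}

test "two tokens pool to a unit vector" {
    // Rows [3, 0] and [0, 4]: mean is [1.5, 2.0], norm 2.5.
    const matrix = [_]f32{ 3, 0, 0, 4 };
    var out: [2]f32 = undefined;
    embedTokens(&matrix, 2, &[_]u32{ 0, 1 }, &out);
    try std.testing.expectApproxEqAbs(@as(f32, 0.6), out[0], 1e-6);
    try std.testing.expectApproxEqAbs(@as(f32, 0.8), out[1], 1e-6);
}
```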

Requires Zig 0.16.0.