v3.12.0

@wpferrell wpferrell released this 19 May 03:17
· 48 commits to main since this release

v3.12.0 ships binary index encoding for fast tensor lookup on large models, plus a dedup-regression test that pins the existing v2.2.0 tied-tensor feature against future refactors.

Step 0 storage analysis

Two of the three improvements in the spec turned out to be either already shipped or not applicable to the current architecture:

| Spec item | Status | Why |
|---|---|---|
| Tensor deduplication | Already shipped (v2.2.0) | Handled by the tied_ref codec plus duplicate_map. Step 0 confirmed that zero models in the local set have tied tensors; none of the LLMs tested tie embed/lm_head. The feature now has a regression test on a synthetic tied model. |
| Layer-aligned shard splitting | Not implemented | BigSmall inherits the safetensors shard layout from the source HF model. Re-sharding belongs in a separate "reshard" tool and is out of immediate scope. |
| Binary index encoding | Shipped | New bigsmall.index.bin is written alongside the JSON index for models with ≥ 100 tensors. |

Added

  • bigsmall.hub_index.write_binary_index(directory, shard_paths) — 30-byte fixed-width records per tensor + shared name/codec tables. Magic BSIX, version 1.
  • bigsmall.hub_index.read_binary_index(path) — same shape as read_index() plus a binary field with full per-tensor offset/codec records.
  • bigsmall.hub_index.maybe_read_binary_index(directory) — graceful None fallback.
  • Auto-write of .bin in compress_for_hub when tensor count ≥ 100.
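
To illustrate why fixed-width records make lookup cheap, here is a minimal sketch of a BSIX-style index. Only the magic (BSIX), the version (1), and the 30-byte record size come from these notes. The header layout, the record fields (name_id, codec_id, offset, stored_len, raw_len), and the function names are hypothetical and are not BigSmall's actual on-disk format.

```python
import struct

MAGIC = b"BSIX"
VERSION = 1

# Hypothetical header: magic (4 bytes), version (u16), record count (u32).
HEADER = struct.Struct("<4sHI")
# Hypothetical record: name_id (u32), codec_id (u16), offset (u64),
# stored_len (u64), raw_len (u64). Little-endian, unpadded: 30 bytes.
RECORD = struct.Struct("<IHQQQ")


def pack_index(records):
    """Serialize (name_id, codec_id, offset, stored_len, raw_len) tuples."""
    records = list(records)
    out = bytearray(HEADER.pack(MAGIC, VERSION, len(records)))
    for rec in records:
        out += RECORD.pack(*rec)
    return bytes(out)


def read_header(blob):
    """Validate magic and version, then return the record count."""
    magic, version, count = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError(f"bad magic: {magic!r}")
    if version != VERSION:
        raise ValueError(f"unsupported version: {version}")
    return count


def lookup(blob, i):
    """Fetch record i by seeking straight to its offset, without a full parse."""
    count = read_header(blob)
    if not 0 <= i < count:
        raise IndexError(i)
    return RECORD.unpack_from(blob, HEADER.size + i * RECORD.size)
```

Because every record has the same width, record i always starts at HEADER.size + i * RECORD.size, so a reader can fetch one tensor's entry without decoding the rest of the index. In this sketch, name_id and codec_id would be indices into the shared name and codec tables, which are omitted here.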

Tests

  • 6 new tests in tests/test_opt_step9.py: binary index roundtrip, threshold behaviour, missing-file fallback, bad-magic rejection, synthetic-tied-model dedup regression. 142 passed / 0 skipped / 4 deselected (up from 136).

Compatibility

  • Zero changes to existing file formats.
  • Older .bs shards work unchanged; binary index is purely optional.
  • read_index() still defaults to JSON; binary path is opt-in via maybe_read_binary_index().
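
The opt-in pattern can be sketched as a small helper that prefers the binary index and falls back to JSON when no binary index is present. This is an assumption-laden illustration: load_index is a hypothetical name, and the two readers are passed in as callables because the arguments of read_index() are not shown in these notes.

```python
def load_index(directory, read_binary, read_json):
    """Prefer the binary index; fall back to the JSON index.

    read_binary stands in for maybe_read_binary_index, which is described
    as returning None when no binary index is available. read_json stands
    in for a wrapper around read_index (hypothetical call shape).
    """
    index = read_binary(directory)
    if index is not None:
        return index
    return read_json(directory)
```

Callers that never invoke the binary reader keep the existing JSON behaviour, which matches the claim that the binary index is purely optional.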

Install: pip install bigsmall==3.12.0