Release v3.12.0 · wpferrell/Bigsmall

v3.12.0 ships binary index encoding for fast tensor lookup on large models, plus a dedup-regression test that pins the existing v2.2.0 tied-tensor feature against future refactors.

Step 0 storage analysis

Two of the three improvements in the spec turned out to be either already shipped or not applicable to the current architecture:

Spec item	Status	Why
Tensor deduplication	Already shipped (v2.2.0)	`tied_ref` codec + `duplicate_map`. Step 0 confirmed zero models in the local set have tied tensors — modern LLMs don't tie embed/lm_head. Now has a regression test on a synthetic tied model.
Layer-aligned shard splitting	Not implemented	BigSmall inherits the safetensors shard layout from the source HF model. Re-sharding is a separate "reshard" tool — out of immediate scope.
Binary index encoding	Shipped	New `bigsmall.index.bin` written alongside the JSON for models with ≥ 100 tensors.
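
For readers unfamiliar with tied tensors, the idea behind a duplicate map can be sketched as follows. This is an illustration only, not BigSmall's `tied_ref` or `duplicate_map` implementation: the function name `build_duplicate_map` and the use of raw bytes plus SHA-256 are assumptions made for the example.

```python
import hashlib

def build_duplicate_map(tensors):
    """Map each duplicate tensor name to the first name with identical bytes.

    `tensors` is a dict of name -> raw bytes (hypothetical input shape).
    """
    first_seen = {}      # content digest -> first tensor name with that content
    duplicate_map = {}   # duplicate name -> canonical name
    for name, data in tensors.items():
        digest = hashlib.sha256(data).hexdigest()
        if digest in first_seen:
            duplicate_map[name] = first_seen[digest]
        else:
            first_seen[digest] = name
    return duplicate_map
```

A model whose embedding and output head share weights would yield one entry pointing the second name at the first; a model with no tied tensors yields an empty map, which matches what Step 0 reported for the local set.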

Added

bigsmall.hub_index.write_binary_index(directory, shard_paths): writes one 30-byte fixed-width record per tensor, plus shared name and codec tables. Magic BSIX, version 1.
bigsmall.hub_index.read_binary_index(path): returns the same shape as read_index(), plus a binary field holding the full per-tensor offset/codec records.
bigsmall.hub_index.maybe_read_binary_index(directory): returns None instead of raising when the binary index file is missing.
Auto-write of .bin in compress_for_hub when tensor count ≥ 100.
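
Fixed-width records are what make lookup fast: the i-th record sits at a computable offset, so no parsing of earlier entries is needed. A minimal sketch follows, assuming a hypothetical layout. Only the BSIX magic, version 1, and the 30-byte record size come from these notes; the header shape and the field names (`name_id`, `codec_id`, `offset`, `stored_len`, `raw_len`) are invented for illustration and are not the actual `bigsmall.index.bin` format.

```python
import struct

MAGIC = b"BSIX"
VERSION = 1

# Hypothetical layouts; "<" means little-endian with no padding.
HEADER = struct.Struct("<4sHI")   # magic, version, record count
RECORD = struct.Struct("<IHQQQ")  # name_id, codec_id, offset, stored_len, raw_len (30 bytes)

def pack_index(records):
    """Serialize a list of 5-tuples into header + fixed-width records."""
    out = bytearray(HEADER.pack(MAGIC, VERSION, len(records)))
    for rec in records:
        out += RECORD.pack(*rec)
    return bytes(out)

def lookup(blob, i):
    """Return the i-th record by seeking directly to its offset."""
    magic, version, count = HEADER.unpack_from(blob, 0)
    if magic != MAGIC:
        raise ValueError("bad magic: not a BSIX index")
    if version != VERSION:
        raise ValueError(f"unsupported index version {version}")
    if not 0 <= i < count:
        raise IndexError(i)
    return RECORD.unpack_from(blob, HEADER.size + i * RECORD.size)
```

In this sketch, names and codecs are stored once in shared tables and referenced by integer id, which is how a record can stay fixed-width even though tensor names vary in length.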

Tests

6 new tests in tests/test_opt_step9.py cover binary index roundtrip, threshold behaviour, missing-file fallback, bad-magic rejection, and the synthetic-tied-model dedup regression. Suite result: 142 passed / 0 skipped / 4 deselected (up from 136 passed).

Compatibility

Zero changes to existing file formats.
Older .bs shards work unchanged; binary index is purely optional.
read_index() still defaults to JSON; binary path is opt-in via maybe_read_binary_index().
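
The opt-in pattern can be sketched as a small wrapper that prefers the binary index and falls back to JSON. The helper `load_index` is hypothetical, and the readers are passed in as callables because these notes do not state the argument list of read_index(); treat the wiring as an assumption.

```python
def load_index(directory, read_binary, read_json):
    """Prefer the binary index when available, otherwise use the JSON index.

    `read_binary` is expected to return None when no binary index exists,
    as maybe_read_binary_index() is described as doing.
    """
    index = read_binary(directory)
    if index is not None:
        return index
    return read_json(directory)
```

With BigSmall installed, a caller might pass `bigsmall.hub_index.maybe_read_binary_index` as `read_binary` and a function wrapping `read_index()` as `read_json`, assuming both accept the model directory.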

Install: pip install bigsmall==3.12.0
