v3.12.0
v3.12.0 ships binary index encoding for fast tensor lookup on large models, plus a dedup-regression test that pins the existing v2.2.0 tied-tensor feature against future refactors.
Step 0 storage analysis
Two of the three improvements in the spec turned out to be either already shipped or not applicable to the current architecture:
| Spec item | Status | Why |
|---|---|---|
| Tensor deduplication | Already shipped (v2.2.0) | tied_ref codec + duplicate_map. Step 0 confirmed zero models in the local set have tied tensors — modern LLMs don't tie embed/lm_head. Now has a regression test on a synthetic tied model. |
| Layer-aligned shard splitting | Not implemented | BigSmall inherits the safetensors shard layout from the source HF model. Re-sharding is a separate "reshard" tool — out of immediate scope. |
| Binary index encoding | Shipped | New bigsmall.index.bin written alongside the JSON for models with ≥ 100 tensors. |
Added
- bigsmall.hub_index.write_binary_index(directory, shard_paths): 30-byte fixed-width records per tensor plus shared name/codec tables. Magic BSIX, version 1.
- bigsmall.hub_index.read_binary_index(path): same shape as read_index() plus a binary field with full per-tensor offset/codec records.
- bigsmall.hub_index.maybe_read_binary_index(directory): graceful None fallback.
- Auto-write of .bin in compress_for_hub when tensor count ≥ 100.
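
For readers unfamiliar with fixed-width binary indexes, the idea can be sketched as below. This is an illustration only, not the BigSmall on-disk format: the header layout, the field order inside the 30-byte record, the helper names pack_index and unpack_index, and the "zstd" codec label in the usage example are all assumptions. Only the BSIX magic, version 1, the 30-byte record width, the shared name/codec tables, and the tied_ref codec come from the notes above.

```python
import struct

MAGIC = b"BSIX"
VERSION = 1

# Hypothetical header: magic, version, name count, codec count, record count.
HEADER = struct.Struct("<4sHIII")
# Hypothetical 30-byte record: name_id, codec_id, offset, stored_size, raw_size.
RECORD = struct.Struct("<IHQQQ")
assert RECORD.size == 30


def _pack_table(strings):
    """Encode strings as length-prefixed (uint16) UTF-8 entries."""
    out = bytearray()
    for s in strings:
        raw = s.encode("utf-8")
        out += struct.pack("<H", len(raw)) + raw
    return bytes(out)


def _unpack_table(buf, pos, count):
    """Decode `count` length-prefixed strings starting at `pos`."""
    items = []
    for _ in range(count):
        (n,) = struct.unpack_from("<H", buf, pos)
        pos += 2
        items.append(buf[pos:pos + n].decode("utf-8"))
        pos += n
    return items, pos


def pack_index(entries):
    """entries: list of (name, codec, offset, stored_size, raw_size)."""
    names = sorted({e[0] for e in entries})
    codecs = sorted({e[1] for e in entries})
    name_id = {n: i for i, n in enumerate(names)}
    codec_id = {c: i for i, c in enumerate(codecs)}
    body = b"".join(
        RECORD.pack(name_id[n], codec_id[c], off, stored, raw)
        for n, c, off, stored, raw in entries
    )
    header = HEADER.pack(MAGIC, VERSION, len(names), len(codecs), len(entries))
    return header + _pack_table(names) + _pack_table(codecs) + body


def unpack_index(buf):
    """Parse a blob produced by pack_index; reject bad magic or version."""
    magic, version, n_names, n_codecs, n_records = HEADER.unpack_from(buf, 0)
    if magic != MAGIC:
        raise ValueError("bad magic")
    if version != VERSION:
        raise ValueError("unsupported version")
    names, pos = _unpack_table(buf, HEADER.size, n_names)
    codecs, pos = _unpack_table(buf, pos, n_codecs)
    entries = []
    for i in range(n_records):
        # Fixed-width records allow jumping straight to record i.
        nid, cid, off, stored, raw = RECORD.unpack_from(buf, pos + i * RECORD.size)
        entries.append((names[nid], codecs[cid], off, stored, raw))
    return entries


if __name__ == "__main__":
    demo = [("embed.weight", "zstd", 0, 10, 40),
            ("lm_head.weight", "tied_ref", 10, 0, 40)]
    print(unpack_index(pack_index(demo)) == demo)
```

Because every record has the same width, a reader can seek directly to record i without parsing the records before it, which is where the lookup speedup over a large JSON index would come from.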
Tests
- 6 new tests in tests/test_opt_step9.py: binary index roundtrip, threshold behaviour, missing-file fallback, bad-magic rejection, synthetic-tied-model dedup regression. 142 passed / 0 skipped / 4 deselected (up from 136).
Compatibility
- Zero changes to existing file formats.
- Older .bs shards work unchanged; binary index is purely optional.
- read_index() still defaults to JSON; binary path is opt-in via maybe_read_binary_index().
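
The opt-in pattern might look like the sketch below. The two readers are passed in as parameters so that the example makes no claim about the real signatures. It assumes only that the binary reader returns None when no usable .bin file exists, as described above, and that both readers accept a directory, which is an assumption for read_index(). The load_index name is hypothetical.

```python
def load_index(directory, read_binary, read_json):
    """Prefer the binary index; fall back to JSON when it is absent.

    read_binary: callable returning an index or None (for example
                 maybe_read_binary_index).
    read_json:   callable returning the JSON index (for example read_index;
                 taking a directory argument is an assumption here).
    """
    index = read_binary(directory)
    if index is not None:
        return index
    return read_json(directory)


if __name__ == "__main__":
    # Stand-in readers for demonstration; they are not part of bigsmall.
    print(load_index("model_dir", lambda d: None, lambda d: {"source": "json"}))
```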
Install: pip install bigsmall==3.12.0