v3.1.0
v3.1.0 ships the V4 Session B codec infrastructure: two new lossless candidate codecs (fp2_residual_v1, cross_layer_delta group API) are implemented, registered, and gated behind the auto_select_codec safety net.
Both codecs lose to bf16_se_ac on real transformer attention/MLP tensors — the V4 Session A entropy bound was based on a lossy BF16-rounded FP32-subtraction proxy that cannot be realised under a strict-lossless contract. The codecs are kept in the registry because (a) they are correctly lossless and tested as such, and (b) they provide infrastructure hooks for future V4 quantize-plus-residual / cross-layer work.
Added
- `fp2_residual_v1` codec: FP2 + lossless BF16 residual + XOR correction stream
- `cross_layer_delta` group + pair APIs: pure-byte XOR transform with `delta_from` extras key
- Container v2 stamping when either new codec is selected
- `enable_fp2_residual` opt-out flag on `auto_select_codec`
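To illustrate why a pure-byte XOR transform is lossless by construction, here is a minimal sketch. The helper names `xor_delta` and `xor_restore` are hypothetical and are not the bigsmall API; the byte values are made-up stand-ins for two layers' BF16 data.

```python
def xor_delta(base: bytes, target: bytes) -> bytes:
    """Return target XOR base, byte by byte (hypothetical helper)."""
    if len(base) != len(target):
        raise ValueError("XOR delta needs equal-length buffers")
    return bytes(b ^ t for b, t in zip(base, target))


def xor_restore(base: bytes, delta: bytes) -> bytes:
    """XOR is its own inverse, so restoring reuses the same operation."""
    return xor_delta(base, delta)


# Two made-up 4-byte buffers standing in for adjacent layers' raw bytes.
layer0 = bytes([0x3F, 0x80, 0x3F, 0x81])
layer1 = bytes([0x3F, 0x80, 0x3F, 0x83])

delta = xor_delta(layer0, layer1)
restored = xor_restore(layer0, delta)
assert restored == layer1  # bit-exact round trip
```

When the two buffers are similar, the delta is mostly zero bytes, which an entropy coder can exploit; when they are not, the delta is no easier to compress than the original.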
Tests
- 11 new tests, 74 passed / 2 skipped total
Empirical findings
- FP2+residual averages 90.249% of raw vs 65.707% for `bf16_se_ac` on Phi-3.5-mini shard 1 (loses 86/86 tensors). Safety net keeps file size at v3.0.0 baseline exactly.
- Cross-layer XOR delta wins ~1-1.4% on tiny norm-layer groups, loses on MLP/attention. Aggregate impact <0.0001%.
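The safety-net behaviour described above can be pictured as a smallest-output selection with a fixed baseline. The sketch below is an assumption about the general idea, not the actual `auto_select_codec` implementation: `select_smallest` is a hypothetical name, and `zlib` plus a padding lambda stand in for the real codecs.

```python
import zlib
from typing import Callable, Dict, Tuple

Encoder = Callable[[bytes], bytes]


def select_smallest(
    payload: bytes, encoders: Dict[str, Encoder], baseline: str
) -> Tuple[str, bytes]:
    """Pick the encoder with the smallest output; ties go to the baseline."""
    best_name, best_blob = baseline, encoders[baseline](payload)
    for name, encode in encoders.items():
        if name == baseline:
            continue
        blob = encode(payload)
        # A candidate replaces the baseline only if it is strictly smaller.
        if len(blob) < len(best_blob):
            best_name, best_blob = name, blob
    return best_name, best_blob


# Stand-in encoders: zlib plays the baseline, the candidate never shrinks data.
encoders: Dict[str, Encoder] = {
    "baseline": lambda data: zlib.compress(data, 9),
    "candidate": lambda data: data + b"\x00" * 8,
}

payload = b"weights" * 512
name, blob = select_smallest(payload, encoders, baseline="baseline")
```

Because a candidate is chosen only when it is strictly smaller, registering a codec that loses on every tensor leaves the output size equal to the baseline's.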
Backwards compatibility
- Files written by 3.0.0 are read identically by 3.1.0
- Files using `fp2_residual_v1` require bigsmall >= 3.1.0 to decode
Install: `pip install bigsmall==3.1.0`