v3.8.0

@wpferrell wpferrell released this 19 May 00:28

v3.8.0 is a research-only release. Two compression-ratio improvement ideas from OPT_STEP5_CLAUDE.md were investigated; both decision gates failed. No code changes ship. The findings are documented so future sessions don't re-investigate.

Idea 1: per-tensor custom exponent remapping → REJECTED

The spec predicted a 0.2-0.5 pp gain. The measured aggregate header saving was 0.00078 pp of raw size, roughly 125x below the 0.1 pp gate. Across 209 BF16 tensors from Phi-3.5-mini and Qwen3-8B shard 1, the mean number of exponent values in use per tensor was 22.

Why the prediction was off by ~250x: remapping is a bijection on symbols, so H(remapped) = H(original). The Categorical AC coder already codes at essentially the Shannon limit regardless of symbol labels, so only a reduction in header overhead could help, and at ~330 B/tensor that is negligible.
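The bijection argument can be checked numerically. The sketch below is illustrative only: the exponent data is synthetic (22 distinct values with a made-up skewed distribution), and `entropy_bits` is a hypothetical helper, not part of bigsmall.

```python
import math
import random
from collections import Counter


def entropy_bits(symbols):
    """Empirical Shannon entropy in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


rng = random.Random(0)

# Synthetic stand-in for one tensor's exponent bytes (an assumption):
# 22 distinct values out of 256, drawn with a skewed distribution.
used_values = rng.sample(range(256), 22)
weights = [1.0 / (rank + 1) for rank in range(22)]
exponents = rng.choices(used_values, weights=weights, k=100_000)

# Per-tensor remap: send each used value to a dense index 0..K-1.
# This is a bijection on the set of values that actually occur.
remap = {value: index for index, value in enumerate(sorted(set(exponents)))}
remapped = [remap[e] for e in exponents]

h_original = entropy_bits(exponents)
h_remapped = entropy_bits(remapped)

print(f"H(original) = {h_original:.6f} bits/symbol")
print(f"H(remapped) = {h_remapped:.6f} bits/symbol")
```

Relabelling changes which integers appear but not how often each one appears, so an entropy coder that adapts to the symbol frequencies has nothing to gain from it.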

Idea 2: cross-model family pooled entropy → REJECTED

| Family | Models | KL penalty | Header saving | Net |
|--------|--------|------------|---------------|-----|
| Qwen | Qwen2.5-32B + Qwen3-8B | 27.4 MB | 494 B | −27.4 MB |
| Gemma | gemma-3-{4b,12b,27b}-it | 6.6 MB | 1558 B | −6.6 MB |

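This is the same pattern found in A2 (cross-tensor shared tables within one model) from the V4 lossless arc: the KL penalty from coding with a shared table is orders of magnitude larger than the header bytes saved.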
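A minimal sketch of why pooling loses, assuming "KL penalty" means the extra coded bytes incurred when each model is coded with the pooled distribution instead of its own. The histograms, the helper names, and the 500-byte table size are all hypothetical, chosen only to show the shape of the trade-off.

```python
import math


def normalize(counts):
    """Turn a histogram of counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_bits(p, q):
    """KL divergence D(p || q) in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def pooling_penalty_bytes(histograms):
    """Extra bytes spent when every histogram is coded with the pooled table.

    An ideal entropy coder using distribution q on data drawn from p spends
    D(p || q) extra bits per symbol, so the penalty scales with symbol count.
    """
    pooled_counts = [sum(column) for column in zip(*histograms)]
    q = normalize(pooled_counts)
    penalty_bits = 0.0
    for counts in histograms:
        p = normalize(counts)
        penalty_bits += sum(counts) * kl_bits(p, q)
    return penalty_bits / 8


# Toy symbol histograms for two models in one "family" (made-up numbers).
model_a = [700_000, 200_000, 100_000]
model_b = [500_000, 300_000, 200_000]

penalty = pooling_penalty_bytes([model_a, model_b])
header_saving = 500  # hypothetical: bytes saved by storing one table, not two
net = header_saving - penalty

print(f"KL penalty:    {penalty:,.0f} B")
print(f"Header saving: {header_saving:,} B")
print(f"Net:           {net:,.0f} B")
```

The header saving is a fixed number of bytes per shared table, while the penalty grows linearly with the number of coded symbols, so even a small per-symbol divergence dominates at model scale. The penalty is zero only when the pooled distributions are identical.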

Why ship anyway

Per spec: "still bump to 3.8.0 with research findings documented. The measurement work is valuable even if nothing ships." The CHANGELOG entry is the documented answer for anyone tempted to try either approach again.

Compatibility

Zero code changes outside __version__ and CHANGELOG. 119 tests pass / 2 skipped. bigsmall==3.8.0 is functionally identical to 3.7.0; the only additions are the version bump and this CHANGELOG entry.

Install: pip install bigsmall==3.8.0