v3.8.0
v3.8.0 is a research-only release. Two compression-ratio improvement ideas from OPT_STEP5_CLAUDE.md were investigated; both decision gates failed. No code changes ship. The findings are documented so future sessions don't re-investigate.
Idea 1: per-tensor custom exponent remapping → REJECTED
The spec predicted a 0.2-0.5pp gain. The measured aggregate header saving was 0.00078pp of raw size, roughly 125x below the 0.1pp gate. Across 209 BF16 tensors from Phi-3.5-mini and Qwen3-8B shard 1, the mean number of exponent values in use was 22.
Why the prediction was off by ~250x: remapping is a bijection, so H(remapped) = H(original). The Categorical AC coder already codes at essentially the Shannon bound, which depends only on symbol probabilities and not on symbol labels. Only a reduction in header overhead could help, and at ~330 B/tensor that is negligible.
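The label-invariance argument can be checked with a minimal sketch. The data below is synthetic (22 exponent values drawn with a skewed distribution), and `shannon_entropy_bits` is a helper written for illustration, not part of bigsmall.

```python
import math
import random
from collections import Counter


def shannon_entropy_bits(symbols):
    """Empirical Shannon entropy of a symbol sequence, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


rng = random.Random(0)

# Synthetic stand-in for one tensor: 22 of the 256 possible BF16 exponent
# values are in use, with a skewed (Zipf-like) frequency profile.
used = rng.sample(range(256), 22)
weights = [1.0 / (rank + 1) for rank in range(22)]
exponents = rng.choices(used, weights=weights, k=100_000)

# Bijective remap: each used exponent value gets a dense index 0..21.
remap = {value: index for index, value in enumerate(sorted(used))}
remapped = [remap[e] for e in exponents]

h_original = shannon_entropy_bits(exponents)
h_remapped = shannon_entropy_bits(remapped)

# The relabeling changes the alphabet, not the probabilities,
# so the two entropies agree up to floating-point error.
print(f"H(original) = {h_original:.6f} bits/symbol")
print(f"H(remapped) = {h_remapped:.6f} bits/symbol")
```

Since an entropy coder's payload size tracks the entropy, the only bytes a remap can save are those spent describing the symbol table in the header.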
Idea 2: cross-model family pooled entropy → REJECTED
| Family | Models | KL penalty | Header saving | Net |
|---|---|---|---|---|
| Qwen | Qwen2.5-32B + Qwen3-8B | 27.4 MB | 494 B | −27.4 MB |
| Gemma | gemma-3-{4b,12b,27b}-it | 6.6 MB | 1558 B | −6.6 MB |
This is the same pattern found for A2 (cross-tensor shared tables within one model) in the V4 lossless arc: the KL penalty of a shared table far outweighs the header bytes it saves.
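The trade-off in the table can be sketched as follows, assuming the pooled table is the count-weighted mixture of the per-model distributions. The symbol counts and the header saving are hypothetical toy values, and `pooled_penalty_bytes` is an illustrative helper, not bigsmall's measurement code.

```python
import math


def normalize(counts):
    """Turn raw symbol counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_bits(p, q):
    """KL divergence D(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def pooled_penalty_bytes(counts_per_model):
    """Extra payload bytes from coding every model with one pooled table.

    Coding n symbols from distribution p with a table built for q costs
    about n * D(p || q) bits more than coding them with p's own table.
    """
    pooled_counts = [sum(column) for column in zip(*counts_per_model)]
    q = normalize(pooled_counts)
    extra_bits = 0.0
    for counts in counts_per_model:
        extra_bits += sum(counts) * kl_bits(normalize(counts), q)
    return extra_bits / 8


# Hypothetical symbol counts for two models over a 3-symbol alphabet.
counts_a = [700_000, 200_000, 100_000]
counts_b = [400_000, 400_000, 200_000]
header_saving_bytes = 500  # hypothetical saving from dropping one table

penalty = pooled_penalty_bytes([counts_a, counts_b])
net_bytes = header_saving_bytes - penalty
print(f"KL penalty: {penalty:.0f} B, net: {net_bytes:.0f} B")
```

The penalty scales with the number of coded symbols while the header saving is a fixed per-table cost, so pooling only pays off when the distributions are nearly identical.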
Why ship anyway
Per spec: "still bump to 3.8.0 with research findings documented. The measurement work is valuable even if nothing ships." The CHANGELOG entry is the documented answer for anyone tempted to try either approach again.
Compatibility
Zero code changes outside __version__ and CHANGELOG. 119 tests pass / 2 skipped. pip install bigsmall==3.8.0 is functionally identical to 3.7.0 with this CHANGELOG entry attached.
Install: pip install bigsmall==3.8.0