1.3.0 (2026-03-31)
Features
kv-cache: add sliding window attention bypass for Gemma models (#53) (ee5300e)
triton: add head_dim 64/96 kernel support with non-pow2 padding (#52) (44edd60)
verify: add head_dim 256 support and validate Gemma-2/Gemma-3 (#55) (e0f5d45)
verify: validate Phi-3-mini compression quality (#54) (f5fec46)
verify: validate Qwen2.5-3B and Phi-4 compression quality (#51) (48243c9)
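The head_dim 64/96 entry above mentions non-pow2 padding. Triton requires the extent of `tl.arange` to be a power of two, so a kernel for head_dim 96 would typically round the block size up to 128 and mask the extra lanes. The following is a minimal plain-Python sketch of that rounding and masking idea only; it is not the project's kernel code, and the function names `next_power_of_two` and `pad_head_dim` are hypothetical.

```python
def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1 << (n - 1).bit_length()


def pad_head_dim(vec: list[float]) -> tuple[list[float], list[bool]]:
    """Zero-pad one head vector to a power-of-two length.

    Returns the padded vector and a mask that is True for original
    elements and False for padding, mirroring how a kernel would mask
    loads and stores on the padded lanes. (Illustrative helper only.)
    """
    padded_len = next_power_of_two(len(vec))
    pad = padded_len - len(vec)
    return vec + [0.0] * pad, [True] * len(vec) + [False] * pad


if __name__ == "__main__":
    padded, mask = pad_head_dim([1.0] * 96)
    print(len(padded), sum(mask))
```

Under this scheme head_dim 64 needs no padding, while head_dim 96 carries 32 masked lanes per vector, which is one plausible source of the throughput penalty mentioned under Bug Fixes.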
Bug Fixes
benchmark: add head_dim detection division guard (#56) (380a7eb)
triton: add defensive assertions and throughput penalty docs (44edd60)
vllm: guard forward() against kv_cache=None during profiling (#57) (3c5fdde)
Performance Improvements
experiments: add experiment 022 clip duration comparison (#47) (55503a9)
experiments: add experiment 023 frame count sweep (#49) (d025208)
experiments: add experiment 024 zero-change model probe (#50) (d11fc3a)
Documentation
roadmap: add experiment 023 frame count sweep findings (d025208)