v0.8.23

Latest

EricLBuehler released this 25 Jun 06:39 · 14 commits to master since this release · 8744ceb

What's Changed

feat(cuda): support CUDA 13.3 by @EricLBuehler in #2275
chore: bump candle dep by @EricLBuehler in #2276
feat(cli): improve --quant and --isq docs in cli by @EricLBuehler in #2277
feat(gemma4): do not load projections for shared kv layers by @EricLBuehler in #2281
feat(quant): add isq executor and planning by @EricLBuehler in #2283
feat: add Hunyuan v1 dense and MoE support by @ASheng1019 in #2268
feat(install): fix handling when there are preexisting installs by @EricLBuehler in #2284
feat(gdn): add isq support by @EricLBuehler in #2285
Fix reversed FCFS priority in PagedAttentionScheduler preemption by @pjdurden in #2250
Validate GGUF special token ids against vocab to prevent OOB panic by @pjdurden in #2282

New Contributors

@ASheng1019 made their first contribution in #2268
@pjdurden made their first contribution in #2250

Full Changelog: v0.8.22...v0.8.23

Contributors

pjdurden, ASheng1019, and EricLBuehler
