ternative v1.0.0 - First release. Features: runtime LoRA merge, I2_S support, an OpenAI-compatible server, and GPU decode at ~6-7 tok/s on an RTX 3050. See the README for full details.