v0.1.4 - Slim Runtime Source Release
vLLM 2080 Ti Definitive Edition v0.1.4
This release trims the public repository to the focused SM75 runtime: the source tree, a one-click build entry point, an interactive launcher, validated profiles, and project documentation.
Changes:
- Slims the public repository to the focused SM75 runtime source tree, launcher scripts, validated profiles, and project documentation.
- Adds the interactive start.sh service manager and one-click build.sh source build entry point.
- Keeps Docker artifacts out of this source release; Docker packaging remains a separate future deployment layer.
- Keeps active profiles in profiles/ and experimental snippets under profiles/experimental/.
- Keeps docs/model-profile-routes.md limited to tested stable FP16 routes; speed-mode and quantized-KV routes will be added after fresh validation.
- Restores the Qwen/Gemma feature matrix to the v0.1.3 capability-view wording while preserving the v0.1.4 naming cleanup (TurboQuant KV and the 256K/512K labels).
- Carries forward the v0.1.3 graph-safety runtime fixes while removing upstream CI/docs/test bulk from the public source tree.