kubeai 0.9.0

github-actions released this 19 Oct 05:39


3c37aed

Highlights

Autoscaling now works for any engine, including Ollama and FasterWhisper
Added the ability to cache models on shared filesystems (Filestore, EFS, etc.)
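
As a rough illustration of the first highlight, request-based autoscaling can be sketched as follows. This is a minimal sketch, not KubeAI's actual implementation: the function name `desired_replicas` and its parameters (`target_per_replica`, `min_replicas`, `max_replicas`) are hypothetical, and the rule shown (active requests divided by a per-replica target, rounded up and clamped) is an assumed, generic scaling policy. Because it depends only on a count of in-flight requests, a policy of this kind is independent of which engine serves the model.

```python
import math


def desired_replicas(
    active_requests: float,
    target_per_replica: int,
    min_replicas: int = 0,
    max_replicas: int = 3,
) -> int:
    """Return a replica count for the observed number of active requests.

    Hypothetical helper for illustration only; the names and the policy
    are assumptions, not taken from the KubeAI source.
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    # Enough replicas that each one handles at most `target_per_replica` requests.
    raw = math.ceil(active_requests / target_per_replica)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 250 in-flight requests with a target of 100 per replica.
    print(desired_replicas(250, target_per_replica=100, max_replicas=5))
```

With a minimum of zero replicas, the same rule scales an idle model down to nothing; a nonzero minimum keeps at least that many replicas running.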

What's Changed

Autoscale based on KubeAI OpenTelemetry active requests metrics by @nstogner in #261
Add resourceProfiles and 405B on A100 80GB by @samos123 in #264
Refactor e2e tests by @nstogner in #263
Add Autoscaler State ConfigMap by @nstogner in #268
Add TPU quota to GKE install guide and use values-gke.yaml by @samos123 in #271
Update vLLM images to 0.6.3 by @samos123 in #273
Shared filesystem caching by @nstogner in #272
Add manual test of vLLM on GPU and TPU by @samos123 in #279

Full Changelog: v0.8.0...v0.9.0

Contributors

samos123 and nstogner
