Release vllm-stack-0.1.11 · vllm-project/production-stack

The stack deployment of vLLM

What's Changed

feat(helm) add PDB per deployment by @enneitex in #851
Add production-ready vLLM CoreWeave CKS terraform stack by @brokedba in #834
[Feat][Router] Add disaggregated prefill orchestrated routing by @yahavb in #777
[bugfix] deprecate disable log request by @ruizhang0101 in #885
feat(helm): add configurable NodePort to router service by @keyuchen21 in #875
fix(benchmark/multi-round-qa): fix TTFT None Type crash caused by reasoning models (reasoning_content) by @brokedba in #873
[bugfix] fix cache server start command by @ruizhang0101 in #872
feat(helm) add monitoring conf as a sub chart by @enneitex in #860
[Bugfix] Forward backend Content-Type in StreamingResponse by @shernshiou in #880
fix(service_discovery): correctly return 503 on missing endpoints by @nejch in #889
bugfix: omit replicas field when autoscaling is enabled by @Isakgicu in #891
[Feat][Operator] Add prefixaware and kvaware routing options to VLLMRouter CRD by @keyuchen21 in #881
[Bugfix] Reduce RBAC permissions for secrets to least privilege by @EzgiTastan in #894
[CI/Build] Add .dockerignore to exclude test files from Docker builds by @EzgiTastan in #895
[Feat] Add generic cache-server resources support for InfiniBand/RDMA by @happytreees in #898
[Feat][Router] make healthcheck values configurable by @max-wittig in #906
[Router] Add reply and heartbeat port options for KV-aware routing by @can-sun in #908
[helm] document every value, update json schema and various fixes by @enneitex in #886
[BugFix] Omit .spec.replicas when KEDA is enabled to prevent field ownership conflict by @lriverawong in #907
[Feat] Support KEDA Auto Scaling in Production Stack Operator by @aeon-x in #903
[Feat] Helm: add support for per-model tolerations by @AlexanderSing in #897
[CI/Build] Pin GitHub Actions to commit SHAs by @xiaotian-yu in #909
[CI] remove local registry and add runner cleanup by @ruizhang0101 in #922
[Feat] Implement OpenAI external provider by @shernshiou in #902
[Minor Improvements] Vllmruntime Autoscaling in Operator by @aeon-x in #918
[Bugfix][Router] Fix router auth for transcription proxy by @yzhan1 in #914
fix(helm) fix default values for cache deployment by @enneitex in #917
[Bugfix] fix(vllm-router): keep roundrobin state per endpoint set / model by @Killusions in #916
Feat/implement streaming path in audio transcription by @WaelRabah in #926
[Feat][Router] Add per-model request latency histogram by @banlor in #940
[Bugfix] Fix shared storage not working with dynamic PV provisioning by @NiccoloTosato in #933
[Bugfix][Router] Preserve full backend model metadata in /v1/models by @yzhan1 in #927
[Misc] Bump chart version to 0.1.11 by @ruizhang0101 in #942

New Contributors

@yahavb made their first contribution in #777
@Isakgicu made their first contribution in #891
@EzgiTastan made their first contribution in #894
@happytreees made their first contribution in #898
@can-sun made their first contribution in #908
@lriverawong made their first contribution in #907
@aeon-x made their first contribution in #903
@AlexanderSing made their first contribution in #897
@xiaotian-yu made their first contribution in #909
@yzhan1 made their first contribution in #914
@Killusions made their first contribution in #916
@WaelRabah made their first contribution in #926
@banlor made their first contribution in #940
@NiccoloTosato made their first contribution in #933

Full Changelog: vllm-stack-0.1.10...vllm-stack-0.1.11
