Release vllm-stack-0.1.10 · vllm-project/production-stack

The stack deployment of vLLM

What's Changed

Add servingEngineSpec environment variable by @shernshiou in #799
[Fix] Handle missing max_tokens in disaggregated prefill requests by @keyuchen21 in #797
[Router]: add routes for Image and Audio API by @nmiguel in #820
[Router][Fix]: fixed name of images/edits endpoint by @nmiguel in #822
Update contact information in README.md by @ruizhang0101 in #821
Fix OCI OKE deployment script (entry_point.sh) — end-to-end tested by @fede-kamel in #811
mention resources at the values.yaml as valid option by @eladmotola in #806
[Doc] Update README for global env on servingEngineSpec by @shernshiou in #814
feat(helm): add standard Kubernetes labels to deployments and services by @keyuchen21 in #810
[BugFix][Feat]: fix serviceEngineSpec probe field and improve probe management in helm template by @emanuelecassese in #809
[Bugfix] Increase router default memory size by @ruizhang0101 in #804
[FEAT] Add per-model token and error Prometheus metrics (part of #699) by @ardecode in #813
[CI/CD] Add stable router image by @ruizhang0101 in #823
[Feat] Add toleration for vllmRunTimes by @mahmoudk1000 in #825
[Feat] Operator : add GPUType for resources to replace "nvidia.com/gpu" in vllmruntime by @dotmobo in #829
[Bugfix] Update aiohttp and python-multipart by @shernshiou in #831
fix: make --log-level CLI argument actually control router log levels by @keyuchen21 in #832
fix: Exclude content-length from response headers in route_general_transcriptions by @fidoriel in #733
[Feat] Reorder hfTokenSecret for vllmRunTimes by @mahmoudk1000 in #826
feat(router): add initial support for anthropic messages endpoint by @nejch in #775
[Feat] Add token redaction for logger debug by @shernshiou in #824
refactor: replace logging.getLogger() with init_logger() across codebase by @keyuchen21 in #835
[CI/CD] add ci/cd for production stack operator by @ruizhang0101 in #843
fix: filter hop-by-hop headers from streaming responses by @keyuchen21 in #836
fix: upgrade h11 to 0.16.0 to resolve GHSA-vqfr-h8mv-ghfj by @keyuchen21 in #837
Increase timeout values in e2e test workflow by @ruizhang0101 in #848
[Feat][Router] Add request migration with configurable failover reroute attempts by @ikaadil in #839
feat(helm) add support for extra manifests and annotation on pvc by @enneitex in #847
feat: add --root-path CLI option for hosting router under a subpath by @keyuchen21 in #844
[Misc] Expose LMCache log level as configurable Helm value and default to INFO. by @NargiT in #846
[Feat] Add --log-format json option for structured logging by @keyuchen21 in #849
[Router]: image edit routes multi-part form request by @nmiguel in #850
[Docs] Update readme by @ruizhang0101 in #856
Bump chart version to 0.1.10 by @ruizhang0101 in #859
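
For readers unfamiliar with the header fixes in #733 and #836, the following is a minimal Python sketch of what filtering hop-by-hop headers from a proxied response can look like. It is not the router's actual code: the function name `filter_response_headers`, the `drop_content_length` parameter, and the plain-dict header representation are all assumptions made for illustration. The header set is the one conventionally treated as hop-by-hop (RFC 2616 section 13.5.1, plus `Trailer`).

```python
# Illustrative only: names and structure are hypothetical, not taken from
# the production-stack router.

# Headers that apply to a single transport connection and should not be
# forwarded by a proxy.
HOP_BY_HOP = frozenset({
    "connection",
    "keep-alive",
    "proxy-authenticate",
    "proxy-authorization",
    "te",
    "trailer",
    "trailers",
    "transfer-encoding",
    "upgrade",
})


def filter_response_headers(headers, drop_content_length=True):
    """Return a copy of `headers` without hop-by-hop entries.

    `headers` is assumed to be a plain dict of header name -> value.
    """
    # Any header named in the Connection header is also hop-by-hop.
    connection_tokens = set()
    for name, value in headers.items():
        if name.lower() == "connection":
            connection_tokens.update(
                token.strip().lower()
                for token in value.split(",")
                if token.strip()
            )

    excluded = HOP_BY_HOP | connection_tokens
    if drop_content_length:
        # When a body is re-streamed, the upstream Content-Length may no
        # longer describe what the client actually receives.
        excluded = excluded | {"content-length"}

    return {
        name: value
        for name, value in headers.items()
        if name.lower() not in excluded
    }
```

Matching is done on lowercased names because HTTP header names are case-insensitive, while the returned dict keeps the original casing.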

New Contributors

@nmiguel made their first contribution in #820
@emanuelecassese made their first contribution in #809
@dotmobo made their first contribution in #829
@fidoriel made their first contribution in #733
@nejch made their first contribution in #775
@enneitex made their first contribution in #847

Full Changelog: vllm-stack-0.1.9...vllm-stack-0.1.10
