Support to serve vLLM on Kubernetes with LWS (#4829)
Signed-off-by: kerthcet <kerthcet@gmail.com>
kerthcet committed May 16, 2024
1 parent 9a31a81 commit 8e7fb5d
Showing 2 changed files with 13 additions and 0 deletions.
12 changes: 12 additions & 0 deletions docs/source/serving/deploying_with_lws.rst
@@ -0,0 +1,12 @@
.. _deploying_with_lws:

Deploying with LWS
============================

LeaderWorkerSet (LWS) is a Kubernetes API that aims to address common deployment patterns of AI/ML inference workloads.
A major use case is for multi-host/multi-node distributed inference.

vLLM can be deployed with `LWS <https://github.com/kubernetes-sigs/lws>`_ on Kubernetes for distributed model serving.

Please see `this guide <https://github.com/kubernetes-sigs/lws/tree/main/docs/examples/vllm>`_ for more details on
deploying vLLM on Kubernetes using LWS.
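
The linked guide walks through a complete setup. As a rough illustration only (not part of this commit), a minimal LeaderWorkerSet manifest for a two-node vLLM deployment might look like the sketch below; the image name, resource sizes, and replica counts are placeholders:

.. code-block:: yaml

   apiVersion: leaderworkerset.x-k8s.io/v1
   kind: LeaderWorkerSet
   metadata:
     name: vllm            # placeholder name
   spec:
     replicas: 1           # number of leader+worker groups
     leaderWorkerTemplate:
       size: 2             # pods per group: 1 leader + 1 worker
       leaderTemplate:
         spec:
           containers:
           - name: vllm-leader
             image: <your-vllm-image>   # placeholder image
       workerTemplate:
         spec:
           containers:
           - name: vllm-worker
             image: <your-vllm-image>   # placeholder image

Each group forms one distributed inference unit: the leader pod runs the serving endpoint while worker pods join the same Ray/distributed backend, as detailed in the guide above.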
1 change: 1 addition & 0 deletions docs/source/serving/integrations.rst
Expand Up @@ -8,4 +8,5 @@ Integrations
deploying_with_kserve
deploying_with_triton
deploying_with_bentoml
deploying_with_lws
serving_with_langchain
