docs: k8s quickstart and observability with k8s #225
Conversation
Needs further testing; leaving it as a draft.
@JaredforReal this is great! Would you be available to follow up with another PR to add a GHA to deploy k8s and run validation? A quick solution could be to use the kind-action to create an env.
@rootfs Sure, I’ll work on it |
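For anyone following along, a minimal local equivalent of that CI validation might look like the sketch below; the manifest path, deployment name, and timeout are assumptions for illustration, not taken from this repo or PR.

```bash
# Create a throwaway cluster (in CI, the kind-action performs this step).
kind create cluster --name semantic-router-ci

# Apply the semantic router manifests; the kustomize path is an assumption.
kubectl apply -k deploy/kubernetes/

# Wait for the deployment to become ready, then do a quick sanity check.
kubectl wait --for=condition=Available deployment/semantic-router --timeout=300s
kubectl get pods -o wide

# Clean up once validation finishes.
kind delete cluster --name semantic-router-ci
```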
Added some review comments.
@@ -0,0 +1,224 @@
# Deployment Quickstart
I suggest renaming this page to "Containerized Deployment Quickstart". Since there is already an "Install in Local" guide and this one is specifically about running the semantic router in a container (with Docker or Kubernetes), it's better to make the distinction clear and signal that this guide is about containerized installation.
This is a good idea
# Deployment Quickstart

This unified guide helps you quickly run Semantic Router locally (Docker Compose) or in a cluster (Kubernetes) and explains when to choose each path. Both share the same configuration concepts: Docker is ideal for rapid iteration and demos, while Kubernetes is suited for long‑running workloads, elasticity, and upcoming Operator / CRD scenarios.
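As a rough sketch of the two paths that paragraph describes (compose invocation, namespace, and manifest path are assumptions, not quoted from the guide):

```bash
# Path 1: local iteration and demos with Docker Compose.
docker compose up -d --build

# Path 2: long-running cluster deployment with Kubernetes.
kubectl create namespace vllm-semantic-router --dry-run=client -o yaml | kubectl apply -f -
kubectl apply -k deploy/kubernetes/ -n vllm-semantic-router
kubectl rollout status deployment/semantic-router -n vllm-semantic-router
```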
This guide describes deployment of the Semantic Router as a containerized component using either Docker Compose or a Kubernetes cluster. It does not cover deployment of the Envoy router or the LLM endpoints inside the same containerized environment (for instance, the same Kubernetes cluster). To follow this guide, you must still deploy the Envoy gateway and LLM endpoints separately, following the instructions in the [`Install in Local`](../installation.md) guide. Future guides will cover additional deployment scenarios, including ones where other components such as the Envoy gateway also run in the Kubernetes cluster and use different types of controllers such as Istio or the Gateway API.
@JaredforReal BTW, I am working on issue #39, which covers the case of running with the Envoy gateway also in Kubernetes and using controllers like Istio or the Gateway API, so I will add documentation for those scenarios as part of the PR for that issue. In this PR, your documentation can cover the case where just the semantic router is in Kubernetes, while the rest of the deployment is the same as described in the "Install in Local" guide.
Thanks for the clarification! I’ll keep the scope of this PR to the case where only the semantic router is deployed in Kubernetes, with Envoy and the LLM endpoints still following the steps in the Install in Local guide.
But for Docker Compose, we’ve already automated the Envoy setup and also provide a testing profile with a mock vLLM, so that developers can get a lightweight but complete experience out of the box. I’ll continue improving the developer experience for both setups in future PRs, and I look forward to aligning with the work in issue #39 once it’s ready.
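A hedged sketch of that out-of-the-box flow, assuming the profile is named `testing` and the bundled Envoy listens on a local port (both assumptions based on the description above, not verified against the repo):

```bash
# Start the full local stack, including the mock vLLM, via the testing profile.
docker compose --profile testing up -d

# Send a sample request through the bundled Envoy listener
# (port and model name are placeholders).
curl -s http://localhost:8801/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "auto", "messages": [{"role": "user", "content": "ping"}]}'

# Tear the stack down when done.
docker compose --profile testing down
```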
Sorry to trouble you, but please move the docs to the new path: https://vllm-semantic-router.com/docs/installation
Signed-off-by: JaredforReal <w13431838023@gmail.com>
@JaredforReal thanks for writing this up! We have questions in Slack about the Envoy proxy install; would you please follow up with instructions too? Thanks!
@rootfs Sure, will work on it
* fix typo & add k8s quickstart doc
* change docker to deploy quickstart
* refactor deploy-quickstart.md
* declare k8s needs separate llm endpoint and envoy set up
* add some reference in k8s requirement
* change docker to deploy quickstart

Signed-off-by: JaredforReal <w13431838023@gmail.com>
Signed-off-by: liuhy <liuhongyu@apache.org>
What type of PR is this?
docs: k8s quickstart and observability with k8s
What this PR does / why we need it:
* fix: typo in `docker-compose.yml`.
* fix: command error in `tools/mock-vllm/Dockerfile`; COPY needs 2 parameters.
* docs: change `docker-quickstart.md` to `deploy-quickstart.md`, add k8s quickstart to it.
* docs: add k8s observability in `observability.md` (see the sketch after this list).
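For context on the observability item above, a quick way to spot-check metrics on Kubernetes might look like this; the service name, namespace, and port are assumptions, not taken from `observability.md`:

```bash
# Forward the router's metrics port to the local machine (names and port assumed).
kubectl port-forward svc/semantic-router 9190:9190 -n vllm-semantic-router &

# Confirm Prometheus-format metrics are exposed.
curl -s http://localhost:9190/metrics | head -n 20
```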
Which issue(s) this PR fixes:
A preparation for #48.
Thoughts on Docker Compose and k8s:
I will try to clarify this difference in the docs and improve both experiences in future PRs.
Love to hear any suggestions from the community!