Contributor

@JaredforReal JaredforReal commented Sep 25, 2025

What type of PR is this?
docs: k8s quickstart and observability with k8s

What this PR does / why we need it:
fix: typo in docker-compose.yml.
fix: command error in tools/mock-vllm/Dockerfile; COPY requires two arguments (source and destination).
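
For context on the `COPY` fix, here is a minimal sketch of the rule. The base image, file names, and paths are hypothetical and are not taken from `tools/mock-vllm/Dockerfile`; the sketch only illustrates that `COPY` takes at least one source and a destination.

```dockerfile
# Hypothetical example, not the actual contents of tools/mock-vllm/Dockerfile.
FROM python:3.11-slim
WORKDIR /app

# Invalid: COPY with a single argument fails the build,
# because the instruction needs both a source and a destination.
# COPY app.py

# Valid: source path (relative to the build context), then the destination.
COPY app.py /app/app.py

CMD ["python", "/app/app.py"]
```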

docs: change docker-quickstart.md to deploy-quickstart.md, add k8s quickstart to it.
docs: add k8s observability in observability.md.

Which issue(s) this PR fixes:
Preparation for #48

Thoughts on Docker Compose and k8s:

  • Keep Docker Compose lightweight and complete for developers: (Envoy + semantic-router + mock-vllm + observability)
  • Use Istio and more advanced techniques to make k8s production-ready.

I will try to clarify this difference in the docs and improve both experiences in future PRs.
I'd love to hear any suggestions from the community!

netlify bot commented Sep 25, 2025

Deploy Preview for vllm-semantic-router ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 9b7bada |
| 🔍 Latest deploy log | https://app.netlify.com/projects/vllm-semantic-router/deploys/68d661fc627b170008f48da2 |
| 😎 Deploy Preview | https://deploy-preview-225--vllm-semantic-router.netlify.app |

github-actions bot commented Sep 25, 2025

👥 vLLM Semantic Team Notification

The following members have been identified for the changed files in this PR and have been automatically assigned:

📁 website

Owners: @Xunzhuo
Files changed:

  • website/docs/installation/deploy-quickstart.md
  • website/docs/tutorials/observability/observability.md
  • website/sidebars.js

📁 Root Directory

Owners: @rootfs, @Xunzhuo
Files changed:

  • docker-compose.yml

📁 tools

Owners: @yuluo-yx, @rootfs, @Xunzhuo
Files changed:

  • tools/mock-vllm/Dockerfile

🎉 Thanks for your contributions!

This comment was automatically generated based on the OWNER files in the repository.

@JaredforReal
Contributor Author

This needs further testing, so I am leaving it as a draft.

@rootfs
Collaborator

rootfs commented Sep 25, 2025

@JaredforReal this is great! Would you be available to follow up with another PR to add a GHA that deploys k8s and runs validation? A quick solution could be to use the kind-action to create an env.

@JaredforReal
Contributor Author

> @JaredforReal this is great! Would you be available to follow up with another PR to add a GHA that deploys k8s and runs validation? A quick solution could be to use the kind-action to create an env.

@rootfs Sure, I’ll work on it

@srampal srampal left a comment

Added some review comments.

@@ -0,0 +1,224 @@
# Deployment Quickstart

I suggest renaming this web page to "Containerized Deployment Quickstart". Since there is already an "Install in Local" guide and this guide is specifically about running the semantic router in a container (with Docker or Kubernetes), it's better to make that clear and distinguish this guide as covering containerized installs.

Contributor Author

This is a good idea

# Deployment Quickstart

This unified guide helps you quickly run Semantic Router locally (Docker Compose) or in a cluster (Kubernetes) and explains when to choose each path. Both share the same configuration concepts: Docker is ideal for rapid iteration and demos, while Kubernetes is suited for long‑running workloads, elasticity, and upcoming Operator / CRD scenarios.

Suggested change
This guide describes deploying the Semantic Router as a containerized component using either Docker Compose or a Kubernetes cluster. It does not cover deploying the Envoy router or the LLM endpoints inside the same containerized environment (for instance, the same Kubernetes cluster). To follow this guide, you must still deploy the Envoy gateway and LLM endpoints separately, following the instructions in the [`Install in Local`](../installation.md) guide. Future guides will cover additional deployment scenarios, including those where other components such as the Envoy gateway also run in the Kubernetes cluster and use different types of controllers such as Istio or Gateway API.

@JaredforReal BTW, I am working on issue #39, which covers running with the Envoy gateway also in Kubernetes and using controllers like Istio or Gateway API, so I will add documentation for those scenarios as part of the PR for that issue. In this PR, your documentation can cover the case where just the semantic router is in Kubernetes and the rest of the deployment is the same as described in the "Install in Local" guide.

Contributor Author

Thanks for the clarification! I’ll keep the scope of this PR to the case where only the semantic router is deployed in Kubernetes, with Envoy and the LLM endpoints still following the steps in the Install in Local guide.
But for Docker Compose, we’ve already automated the Envoy setup and also provide a testing profile with a mock vLLM, so that developers can get a lightweight but complete experience out of the box. I’ll continue improving the developer experience for both setups in future PRs, and I look forward to aligning with the work in issue #39 once it’s ready.

@Xunzhuo
Member

Xunzhuo commented Sep 26, 2025

Sorry to trouble you, but please move the docs into the new paths: https://vllm-semantic-router.com/docs/installation
I did a refactor of the layout.

Signed-off-by: JaredforReal <w13431838023@gmail.com>
@JaredforReal JaredforReal marked this pull request as ready for review September 26, 2025 11:57
@rootfs
Collaborator

rootfs commented Sep 26, 2025

@JaredforReal thanks for writing this up! We have had queries in Slack about the Envoy proxy install. Would you please follow up with instructions for that too? Thanks!

@rootfs rootfs merged commit 56712af into vllm-project:main Sep 26, 2025
9 checks passed
@JaredforReal
Contributor Author

@rootfs Sure, will work on it

@JaredforReal JaredforReal deleted the k8s branch September 26, 2025 18:15
Aias00 pushed a commit to Aias00/semantic-router that referenced this pull request Oct 4, 2025
* fix typo & add k8s quickstart doc

Signed-off-by: JaredforReal <w13431838023@gmail.com>

* change docker to deploy quickstart

Signed-off-by: JaredforReal <w13431838023@gmail.com>

* refactor deploy-quickstart.md

Signed-off-by: JaredforReal <w13431838023@gmail.com>

* declare k8s needs separate llm endpoint and envoy set up

Signed-off-by: JaredforReal <w13431838023@gmail.com>

* add some reference in k8s requirement

Signed-off-by: JaredforReal <w13431838023@gmail.com>

* change docker to deploy quickstart

Signed-off-by: JaredforReal <w13431838023@gmail.com>

---------

Signed-off-by: JaredforReal <w13431838023@gmail.com>
Signed-off-by: liuhy <liuhongyu@apache.org>
4 participants