[k8s][docs] Add feature cards for users and admins #3582
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,5 @@ | ||
.. _ai-gallery: | ||
|
||
AI Gallery | ||
==================== | ||
|
||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,27 +3,70 @@ | |
Running on Kubernetes | ||
============================= | ||
|
||
.. note:: | ||
Kubernetes support is under active development. `Please share your feedback <https://forms.gle/KmAtyNhEysiw2ZCR7>`_ | ||
or `directly reach out to the development team <http://slack.skypilot.co>`_ | ||
for feature requests and more. | ||
|
||
SkyPilot tasks can be run on your private on-prem or cloud Kubernetes clusters. | ||
The Kubernetes cluster gets added to the list of "clouds" in SkyPilot and SkyPilot | ||
tasks can be submitted to your Kubernetes cluster just like any other cloud provider. | ||
|
||
**Benefits of using SkyPilot to run jobs on your Kubernetes cluster:** | ||
Why use SkyPilot on Kubernetes? | ||
------------------------------- | ||
|
||
* Get SkyPilot features (setup management, job execution, queuing, logging, SSH access) on your Kubernetes resources | ||
* Replace complex Kubernetes manifests with simple SkyPilot tasks | ||
* Seamlessly "burst" jobs to the cloud if your Kubernetes cluster is congested | ||
* Retain observability and control over your cluster with your existing Kubernetes tools | ||
.. tab-set:: | ||
|
||
**Supported Kubernetes deployments:** | ||
.. tab-item:: For AI Developers | ||
:sync: why-ai-devs-tab | ||
|
||
* Hosted Kubernetes services (EKS, GKE) | ||
* On-prem clusters (Kubeadm, Rancher) | ||
* Local development clusters (KinD, minikube) | ||
.. grid:: 2 | ||
:gutter: 3 | ||
|
||
.. grid-item-card:: ✅ Ease of Use | ||
:text-align: center | ||
|
||
No complex Kubernetes manifests - write a simple SkyPilot YAML and run ``sky launch``. | |
|
||
.. grid-item-card:: 📋 Interactive development on Kubernetes | ||
:text-align: center | ||
|
||
:ref:`SSH access to pods <dev-ssh>`, :ref:`VSCode integration <dev-vscode>`, :ref:`job management <managed-jobs>`, :ref:`model serving <sky-serve>`, :ref:`autodown idle pods <auto-stop>` and more. | ||
|
||
|
||
.. grid-item-card:: ☁️ Burst to the cloud | ||
:text-align: center | ||
|
||
Kubernetes cluster is full? SkyPilot seamlessly gets resources on the cloud to get your job running sooner. | ||
|
||
|
||
.. grid-item-card:: 🖼 Run popular models on Kubernetes | ||
:text-align: center | ||
|
||
Train and serve `Llama-3 <https://skypilot.readthedocs.io/en/latest/gallery/llms/llama-3.html>`_, `Mixtral <https://skypilot.readthedocs.io/en/latest/gallery/llms/mixtral.html>`_, and more on your Kubernetes cluster with ready-to-use recipes from the :ref:`AI gallery <ai-gallery>`. | |
|
||
|
||
.. tab-item:: For Infrastructure Admins | ||
:sync: why-admins-tab | ||
|
||
.. grid:: 2 | ||
:gutter: 3 | ||
|
||
.. grid-item-card:: ☁️ Burst to the cloud | ||
:text-align: center | ||
|
||
Scale out to capacity across clouds and regions without requiring manual intervention. | ||
|
||
|
||
.. grid-item-card:: 🚯️ Minimize resource wastage | ||
:text-align: center | ||
|
||
SkyPilot can automatically terminate idle pods to free up resources for other users. | ||
Should we mention that it respects the user's existing Kubernetes scheduling config and additionally does autodown?
Rephrased as |
||
|
||
.. grid-item-card:: 👀 Observability | ||
:text-align: center | ||
|
||
Use your existing tools, such as the :ref:`Kubernetes Dashboard <kubernetes-observability>`, to monitor SkyPilot pods. | ||
|
||
|
||
.. grid-item-card:: 🍽️ Self-serve infra for your teams | ||
:text-align: center | ||
|
||
.. | ||
This point should maybe talk about quotas + sharing through kueue/native k8s quotas. | ||
|
||
Reduce operational overhead by letting your teams provision their own resources on Kubernetes, while you retain control over the cluster. | ||
|
||
|
||
Kubernetes Cluster Requirements | ||
|
@@ -34,6 +77,12 @@ To connect and use a Kubernetes cluster, SkyPilot needs: | |
* An existing Kubernetes cluster running Kubernetes v1.20 or later. | ||
* A `Kubeconfig <https://kubernetes.io/docs/concepts/configuration/organize-cluster-access-kubeconfig/>`_ file containing access credentials and namespace to be used. | ||
|
||
**Supported Kubernetes deployments:** | ||
|
||
* Hosted Kubernetes services (EKS, GKE) | ||
* On-prem clusters (Kubeadm, Rancher, K3s) | ||
* Local development clusters (KinD, minikube) | ||
|
||
|
||
In a typical workflow: | ||
|
||
1. A cluster administrator sets up a Kubernetes cluster. Detailed admin guides for | ||
|
@@ -218,6 +267,10 @@ FAQs | |
|
||
For isolation, you can create separate Kubernetes namespaces and set them in the kubeconfig distributed to users. SkyPilot will use the namespace set in the kubeconfig for running all tasks. | ||
|
||
* **How do I view the pods created by SkyPilot on my Kubernetes cluster?** | ||
|
||
You can use your existing observability tools to filter resources with the label :code:`parent=skypilot`. As an example, follow the instructions :ref:`here <kubernetes-observability>` to deploy the Kubernetes Dashboard on your cluster. | ||
|
||
|
||
* **How can I specify custom configuration for the pods created by SkyPilot?** | ||
|
||
You can override the pod configuration used by SkyPilot by setting the :code:`pod_config` key in :code:`~/.sky/config.yaml`. | ||
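
As a rough sketch of what such an override might look like (an assumption, not taken from this diff: ``pod_config`` is nested under a top-level ``kubernetes`` key and accepts standard Kubernetes pod fields; the label and secret names below are hypothetical placeholders):

.. code-block:: yaml

    # ~/.sky/config.yaml
    # Illustrative only: the nesting and the values are assumptions.
    kubernetes:
      pod_config:
        metadata:
          labels:
            team: ml-research          # hypothetical label added to SkyPilot pods
        spec:
          imagePullSecrets:
            - name: my-registry-secret # hypothetical secret for a private registry

Fields set here would be merged into the pods SkyPilot creates, so anything not overridden should keep its default value.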
|
Link for an example comparing the Kubernetes manifests vs the SkyPilot YAML?
I think that requires a little more work, since we need to compress our existing YAMLs (in terms of lines of code, #3594) to get a good comparison against Kubernetes manifests. E.g., the k8s vLLM Gemma manifest is 65 lines, while the SkyPilot Gemma YAML is 38 lines. We should do it as part of the K8s blog post; I've marked it as a TODO for now.