Add NVIDIA MPS documentation to doc index #2205

Merged
merged 2 commits on Mar 29, 2023
1 change: 1 addition & 0 deletions docs/contents.rst
@@ -16,6 +16,7 @@
model_zoo
request_envelopes
server
+ mps
snapshot
sphinx/requirements
torchserve_on_win_native
1 change: 1 addition & 0 deletions docs/index.md
@@ -49,3 +49,4 @@ TorchServe is a performant, flexible and easy to use tool for serving PyTorch ea
* [TorchServe on Kubernetes](https://github.com/pytorch/serve/blob/master/kubernetes/README.md#torchserve-on-kubernetes) - Demonstrates a Torchserve deployment in Kubernetes using Helm Chart supported in both Azure Kubernetes Service and Google Kubernetes service
* [mlflow-torchserve](https://github.com/mlflow/mlflow-torchserve) - Deploy mlflow pipeline models into TorchServe
* [Kubeflow pipelines](https://github.com/kubeflow/pipelines/tree/master/samples/contrib/pytorch-samples) - Kubeflow pipelines and Google Vertex AI Managed pipelines
+ * [NVIDIA MPS](mps.md) - Use NVIDIA MPS to optimize multi-worker deployment on a single GPU
2 changes: 1 addition & 1 deletion docs/mps.md
@@ -1,4 +1,4 @@
- # Enabling NVIDIA MPS in TorchServe
+ # Running TorchServe with NVIDIA MPS
In order to deploy ML models, TorchServe spins up each worker in a separate process, thus isolating each worker from the others.
Each process creates its own CUDA context to execute its kernels and access the allocated memory.

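For context on the document being linked: because each TorchServe worker owns its own CUDA context, MPS is enabled on the host before TorchServe starts so that those contexts can share the GPU. A minimal sketch of that host-side setup is shown below; the exact steps, supported GPU generations, and caveats are covered in the full mps.md document.

```bash
# Minimal sketch: enable NVIDIA MPS on a Linux host before starting TorchServe.
# Put the GPU into exclusive-process compute mode (recommended for MPS),
# then start the MPS control daemon so that CUDA processes launched afterwards
# (e.g. TorchServe workers) share a single GPU context through MPS.
sudo nvidia-smi -c 3
nvidia-cuda-mps-control -d

# Tear MPS down again once the workers have been stopped.
echo quit | nvidia-cuda-mps-control
```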