runner: README Modal instructions

livepeer · Mar 5, 2024 · 907b8c2
1 parent 62931de
commit 907b8c2
Showing 1 changed file with 23 additions and 0 deletions.
diff --git a/runner/README.md b/runner/README.md
@@ -125,6 +125,29 @@ The file can be re-generated by running:
 python gen_openapi.py
 ```
 
+## Deploy on Modal
+
+The runner can be deployed on [Modal](https://modal.com/), a serverless GPU platform.
+
+The `modal_app.py` file contains all the logic for deploying Modal apps for a set of pipeline and model ID combinations.
+
+Before deploying, make sure to do the following:
+
+- Do a dry-run in a dev [environment](https://modal.com/docs/reference/cli/environment).
+- Create an `api-auth-token` [secret](https://modal.com/docs/guide/secrets#secrets) with the `AUTH_TOKEN` environment variable under "Secrets" in the dashboard.
+- Run `modal volume create models` to create a network [volume](https://modal.com/docs/guide/volumes#volumes) that will store model weights.
+- Run `modal run modal_app.py::download_model --model-id <MODEL_ID>` for each of the model IDs referenced in `modal_app.py`.
+  - For gated Hugging Face models (e.g. `stabilityai/stable-video-diffusion-img2vid-xt-1-1`), create a `huggingface` secret under "Secrets" in the dashboard
+  with the `HF_TOKEN` environment variable set to your Hugging Face access token.
+
+Then deploy the apps:
+
+```
+modal deploy modal_app.py
+```
+
+The web endpoints for each of the apps will be visible in your dashboard.
+
 ## Credits
 
 Based off of [this repo](https://github.com/huggingface/api-inference-community/tree/main/docker_images/diffusers).
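
The pre-deploy checklist added in this commit could be scripted roughly as follows. This is a minimal sketch, not part of the commit: `DRY_RUN`, `MODEL_IDS`, and the `run` helper are hypothetical names, and the single model ID is only the example mentioned in the README (the real list lives in `modal_app.py`). The `modal` commands themselves are the ones quoted in the README. The script assumes the `api-auth-token` and `huggingface` secrets have already been created in the dashboard, and by default it only prints the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of the pre-deploy steps from the README section above.
# DRY_RUN, MODEL_IDS and run() are hypothetical names used for illustration.
set -eu

# 1 (default): print each command; 0: actually execute it.
DRY_RUN="${DRY_RUN:-1}"

# Space-separated model IDs; replace with the IDs referenced in modal_app.py.
MODEL_IDS="stabilityai/stable-video-diffusion-img2vid-xt-1-1"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "+ $*"
  else
    "$@"
  fi
}

# Create the network volume that stores model weights.
run modal volume create models

# Download the weights for every model ID into the volume.
for model_id in $MODEL_IDS; do
  run modal run modal_app.py::download_model --model-id "$model_id"
done

# Deploy the apps.
run modal deploy modal_app.py
```

Running the script as-is prints the four commands for review; setting `DRY_RUN=0` executes them against whichever Modal environment is currently active.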