runner: README Modal instructions
yondonfu committed Mar 5, 2024
1 parent 62931de commit 907b8c2
Showing 1 changed file with 23 additions and 0 deletions.
23 changes: 23 additions & 0 deletions runner/README.md
@@ -125,6 +125,29 @@ The file can be re-generated by running:
python gen_openapi.py
```

## Deploy on Modal

The runner can be deployed on [Modal](https://modal.com/), a serverless GPU platform.

The `modal_app.py` file contains all the logic for deploying Modal apps for a set of pipelines and model IDs.

Before deploying, make sure to do the following:

- Do a dry-run in a dev [environment](https://modal.com/docs/reference/cli/environment).
- Create an `api-auth-token` [secret](https://modal.com/docs/guide/secrets#secrets) with the `AUTH_TOKEN` environment variable under "Secrets" in the dashboard.
- Run `modal volume create models` to create a network [volume](https://modal.com/docs/guide/volumes#volumes) that will store model weights.
- Run `modal run modal_app.py::download_model --model-id <MODEL_ID>` for each of the model IDs referenced in `modal_app.py`.
- For gated Hugging Face models (e.g. `stabilityai/stable-video-diffusion-img2vid-xt-1-1`), create a `huggingface` secret under "Secrets" in the dashboard with the `HF_TOKEN` environment variable set to your Hugging Face access token.
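
The per-model download step can be scripted. The following is a minimal sketch, assuming a POSIX shell and the Modal CLI on `PATH`. The `MODEL_IDS` list, the `download_cmd` helper, and the `DRY_RUN` switch are illustrative names that are not part of the runner, and the single model ID shown is only a placeholder for the IDs actually referenced in `modal_app.py`.

```shell
#!/bin/sh
set -eu

# Hypothetical list (space-separated): replace with the model IDs
# referenced in modal_app.py.
MODEL_IDS="stabilityai/stable-video-diffusion-img2vid-xt-1-1"

# Print the download command for one model ID (helper name is illustrative).
download_cmd() {
  printf 'modal run modal_app.py::download_model --model-id %s\n' "$1"
}

# DRY_RUN=1 (the default in this sketch) only prints the commands;
# DRY_RUN=0 actually invokes the Modal CLI.
DRY_RUN="${DRY_RUN:-1}"

for id in $MODEL_IDS; do
  if [ "$DRY_RUN" = "1" ]; then
    download_cmd "$id"
  else
    modal run modal_app.py::download_model --model-id "$id"
  fi
done
```

With the default `DRY_RUN=1` the script only prints the commands so they can be reviewed first. Setting `DRY_RUN=0` runs them against the currently active Modal environment.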

Then deploy the apps:

```
modal deploy modal_app.py
```

The web endpoints for each of the apps will be visible in your dashboard.

## Credits

Based on [this repo](https://github.com/huggingface/api-inference-community/tree/main/docker_images/diffusers).
