diff --git a/docs/user-guides/server.md b/docs/user-guides/server.md
index 964c2c462..c8609c6cf 100644
--- a/docs/user-guides/server.md
+++ b/docs/user-guides/server.md
@@ -58,38 +58,42 @@ One may wonder where is this `onnx-flow.yml` or `tensorrt-flow.yml` come from. M
 The procedure and UI of ONNX and TensorRT runtime would look the same as Pytorch runtime.
-
 
 ## Model support
 
 The various `CLIP` models implemented in the [OpenAI](https://github.com/openai/CLIP), [OpenCLIP](https://github.com/mlfoundations/open_clip), and [MultilingualCLIP](https://github.com/FreddeFrallan/Multilingual-CLIP) are supported.
+`ViT-B-32::openai` is used as the default model in all runtimes.
 
 Due to the limitation of some runtimes, not every runtime supports all models. Please also note that **different models give different sizes of output dimensions**. This will affect your downstream applications. For example, switching the model from one to another make your embedding incomparable, which breaks the downstream applications. Below is a list of supported models of each runtime and its corresponding size. We include the disk usage (in delta) and the peak RAM and VRAM usage (in delta) when running on a single Nvidia TITAN RTX GPU (24GB VRAM) for a series of text and image encoding tasks with `batch_size=8` using PyTorch runtime.
 
 | Model | PyTorch | ONNX | TensorRT | Output Dimension | Disk Usage (MB) | Peak RAM Usage (GB) | Peak VRAM Usage (GB) |
 |---------------------------------------|---------|------|----------|------------------|-----------------|---------------------|----------------------|
-| RN50 | ✅ | ✅ | ✅ | 1024 | 244 | 2.99 | 1.36 |
-| RN101 | ✅ | ✅ | ✅ | 512 | 278 | 3.05 | 1.40 |
-| RN50x4 | ✅ | ✅ | ✅ | 640 | 402 | 3.23 | 1.63 |
-| RN50x16 | ✅ | ✅ | ❌ | 768 | 631 | 3.63 | 2.02 |
-| RN50x64 | ✅ | ✅ | ❌ | 1024 | 1291 | 4.08 | 2.98 |
-| ViT-B-32 | ✅ | ✅ | ✅ | 512 | 338 | 3.20 | 1.40 |
-| ViT-B-16 | ✅ | ✅ | ✅ | 512 | 335 | 3.20 | 1.44 |
-| ViT-B-16-plus-240 | ✅ | ✅ | 🚧 | 640 | 795 | 3.03 | 1.59 |
-| ViT-L-14 | ✅ | ✅ | ❌ | 768 | 890 | 3.66 | 2.04 |
-| ViT-L-14-336 | ✅ | ✅ | ❌ | 768 | 891 | 3.74 | 2.23 |
+| RN50::openai | ✅ | ✅ | ✅ | 1024 | 244 | 2.99 | 1.36 |
+| RN50::yfcc15m | ✅ | ✅ | ✅ | 1024 | 389 | 2.86 | 1.36 |
+| RN50::cc12m | ✅ | ✅ | ✅ | 1024 | 389 | 2.84 | 1.36 |
+| RN101::openai | ✅ | ✅ | ✅ | 512 | 278 | 3.05 | 1.40 |
+| RN101::yfcc15m | ✅ | ✅ | ✅ | 512 | 457 | 2.88 | 1.40 |
+| RN50x4::openai | ✅ | ✅ | ✅ | 640 | 402 | 3.23 | 1.63 |
+| RN50x16::openai | ✅ | ✅ | ❌ | 768 | 631 | 3.63 | 2.02 |
+| RN50x64::openai | ✅ | ✅ | ❌ | 1024 | 1291 | 4.08 | 2.98 |
+| ViT-B-32::openai | ✅ | ✅ | ✅ | 512 | 338 | 3.20 | 1.40 |
+| ViT-B-32::laion2b_e16 | ✅ | ✅ | ✅ | 512 | 577 | 2.93 | 1.40 |
+| ViT-B-32::laion400m_e31 | ✅ | ✅ | ✅ | 512 | 577 | 2.93 | 1.40 |
+| ViT-B-32::laion400m_e32 | ✅ | ✅ | ✅ | 512 | 577 | 2.94 | 1.40 |
+| ViT-B-16::openai | ✅ | ✅ | ✅ | 512 | 335 | 3.20 | 1.44 |
+| ViT-B-16::laion400m_e31 | ✅ | ✅ | ✅ | 512 | 571 | 2.93 | 1.44 |
+| ViT-B-16::laion400m_e32 | ✅ | ✅ | ✅ | 512 | 571 | 2.94 | 1.44 |
+| ViT-B-16-plus-240::laion400m_e31 | ✅ | ✅ | 🚧 | 640 | 795 | 3.03 | 1.59 |
+| ViT-B-16-plus-240::laion400m_e32 | ✅ | ✅ | 🚧 | 640 | 795 | 3.03 | 1.59 |
+| ViT-L-14::openai | ✅ | ✅ | ❌ | 768 | 890 | 3.66 | 2.04 |
+| ViT-L-14::laion400m_e31 | ✅ | ✅ | ❌ | 768 | 1631 | 3.43 | 2.03 |
+| ViT-L-14::laion400m_e32 | ✅ | ✅ | ❌ | 768 | 1631 | 3.42 | 2.03 |
+| ViT-L-14-336::openai | ✅ | ✅ | ❌ | 768 | 891 | 3.74 | 2.23 |
 | M-CLIP/XLM-Roberta-Large-Vit-B-32 | ✅ | 🚧 | 🚧 | 512 | 4284 | 5.37 | 1.68 |
 | M-CLIP/XLM-Roberta-Large-Vit-L-14 | ✅ | 🚧 | ❌ | 768 | 4293 | 4.30 | 4.97 |
 | M-CLIP/XLM-Roberta-Large-Vit-B-16Plus | ✅ | 🚧 | 🚧 | 640 | 4293 | 4.30 | 4.13 |
 | M-CLIP/LABSE-Vit-L-14 | ✅ | 🚧 | ❌ | 768 | 3609 | 4.30 | 4.70 |
 
-✅ = First class support — 🚧 = Unsupported, working in progress
-
-`ViT-B-32::openai` is used as the default model in all runtimes. To use specific pretrained models provided by `open_clip`, please use `::` to separate model name and pretrained weight name, e.g. `ViT-B-32::laion2b_e16`.
-Full list of open_clip models and weights can be found [here](https://github.com/mlfoundations/open_clip#pretrained-model-interface).
-
-```{note}
-For model definition with `-quickgelu` postfix, please use non `-quickgelu` model name.
-```
+✅ = Supported — 🚧 = Work in progress — ❌ = Not supported
 
 ### Use custom model for onnx
 
 You can also use your own model in ONNX runtime by specifying the model name and the path to ONNX model directory in YAML file.
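The `model::pretrained` identifiers added in this diff are selected through the server's Flow YAML. As a rough illustration (not part of the diff itself), the sketch below assumes the PyTorch executor accepts the model via a `name` parameter, mirroring the stock `torch-flow.yml`; treat the port and executor module path as placeholders for your own setup.

```yaml
# Sketch of a Flow YAML selecting an OpenCLIP checkpoint by its
# `model::pretrained` identifier (assumes the executor's `name` parameter
# picks the model, as in the stock torch-flow.yml).
jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_t
    uses:
      jtype: CLIPEncoder
      with:
        name: ViT-B-32::laion2b_e16  # 512-dim output per the table above
      metas:
        py_modules:
          - clip_server.executors.clip_torch
```

Because the output dimension follows the model (see the Output Dimension column), changing `name` makes previously stored embeddings incomparable with new ones.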
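For the custom ONNX model mentioned under "Use custom model for onnx", a similar sketch is shown below. The source text only says the YAML takes a model name plus the path to the ONNX model directory; the `model_path` parameter name and the directory layout here are assumptions for illustration, not confirmed by this diff.

```yaml
# Sketch only: `model_path` is an assumed parameter name pointing at a local
# directory containing the exported ONNX text and vision models.
jtype: Flow
version: '1'
with:
  port: 51000
executors:
  - name: clip_o
    uses:
      jtype: CLIPEncoder
      with:
        name: ViT-B-32::openai
        model_path: /path/to/onnx/model_dir
      metas:
        py_modules:
          - clip_server.executors.clip_onnx
```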