From 92269f80d191f57ef765a08ff4daef7b7a582b13 Mon Sep 17 00:00:00 2001
From: Zhuohan Li
Date: Tue, 22 Aug 2023 07:24:46 +0000
Subject: [PATCH] Update Supported Model List

---
 README.md                               | 3 +++
 docs/source/models/supported_models.rst | 9 +++++++++
 2 files changed, 12 insertions(+)

diff --git a/README.md b/README.md
index 7c4892a8f788..d221053ff208 100644
--- a/README.md
+++ b/README.md
@@ -42,6 +42,7 @@ vLLM is flexible and easy to use with:
 
 vLLM seamlessly supports many Huggingface models, including the following architectures:
 
+- Aquila (`BAAI/Aquila-7B`, `BAAI/AquilaChat-7B`, etc.)
 - Baichuan (`baichuan-inc/Baichuan-7B`, `baichuan-inc/Baichuan-13B-Chat`, etc.)
 - BLOOM (`bigscience/bloom`, `bigscience/bloomz`, etc.)
 - Falcon (`tiiuae/falcon-7b`, `tiiuae/falcon-40b`, `tiiuae/falcon-rw-7b`, etc.)
@@ -49,9 +50,11 @@ vLLM seamlessly supports many Huggingface models, including the following archit
 - GPT BigCode (`bigcode/starcoder`, `bigcode/gpt_bigcode-santacoder`, etc.)
 - GPT-J (`EleutherAI/gpt-j-6b`, `nomic-ai/gpt4all-j`, etc.)
 - GPT-NeoX (`EleutherAI/gpt-neox-20b`, `databricks/dolly-v2-12b`, `stabilityai/stablelm-tuned-alpha-7b`, etc.)
+- InternLM (`internlm/internlm-7b`, `internlm/internlm-chat-7b`, etc.)
 - LLaMA & LLaMA-2 (`meta-llama/Llama-2-70b-hf`, `lmsys/vicuna-13b-v1.3`, `young-geng/koala`, `openlm-research/open_llama_13b`, etc.)
 - MPT (`mosaicml/mpt-7b`, `mosaicml/mpt-30b`, etc.)
 - OPT (`facebook/opt-66b`, `facebook/opt-iml-max-30b`, etc.)
+- Qwen (`Qwen/Qwen-7B`, `Qwen/Qwen-7B-Chat`, etc.)
 
 Install vLLM with pip or [from source](https://vllm.readthedocs.io/en/latest/getting_started/installation.html#build-from-source):
 
diff --git a/docs/source/models/supported_models.rst b/docs/source/models/supported_models.rst
index eb9da233d4ec..96228546c3f1 100644
--- a/docs/source/models/supported_models.rst
+++ b/docs/source/models/supported_models.rst
@@ -14,6 +14,9 @@ Alongside each architecture, we include some popular models that use it.
   * - Architecture
     - Models
     - Example HuggingFace Models
+  * - :code:`AquilaForCausalLM`
+    - Aquila
+    - :code:`BAAI/Aquila-7B`, :code:`BAAI/AquilaChat-7B`, etc.
   * - :code:`BaiChuanForCausalLM`
     - Baichuan
     - :code:`baichuan-inc/Baichuan-7B`, :code:`baichuan-inc/Baichuan-13B-Chat`, etc.
@@ -35,6 +38,9 @@ Alongside each architecture, we include some popular models that use it.
   * - :code:`GPTNeoXForCausalLM`
     - GPT-NeoX, Pythia, OpenAssistant, Dolly V2, StableLM
     - :code:`EleutherAI/gpt-neox-20b`, :code:`EleutherAI/pythia-12b`, :code:`OpenAssistant/oasst-sft-4-pythia-12b-epoch-3.5`, :code:`databricks/dolly-v2-12b`, :code:`stabilityai/stablelm-tuned-alpha-7b`, etc.
+  * - :code:`InternLMForCausalLM`
+    - InternLM
+    - :code:`internlm/internlm-7b`, :code:`internlm/internlm-chat-7b`, etc.
   * - :code:`LlamaForCausalLM`
     - LLaMA, LLaMA-2, Vicuna, Alpaca, Koala, Guanaco
     - :code:`meta-llama/Llama-2-13b-hf`, :code:`openlm-research/open_llama_13b`, :code:`lmsys/vicuna-13b-v1.3`, :code:`young-geng/koala`, :code:`JosephusCheung/Guanaco`, etc.
@@ -44,6 +50,9 @@ Alongside each architecture, we include some popular models that use it.
   * - :code:`OPTForCausalLM`
     - OPT, OPT-IML
     - :code:`facebook/opt-66b`, :code:`facebook/opt-iml-max-30b`, etc.
+  * - :code:`QWenLMHeadModel`
+    - Qwen
+    - :code:`Qwen/Qwen-7B`, :code:`Qwen/Qwen-7B-Chat`, etc.
 
 If your model uses one of the above model architectures, you can seamlessly run your model with vLLM.
 Otherwise, please refer to :ref:`Adding a New Model <adding_a_new_model>` for instructions on how to implement support for your model.
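
Note for reviewers: a quick way to sanity-check the new entries is vLLM's offline inference API. The sketch below is illustrative and not part of the patch; it assumes the Aquila, InternLM, and Qwen checkpoints ship custom modeling code on the Hugging Face Hub (hence `trust_remote_code=True`), and the prompt and sampling values are arbitrary.

```python
# Illustrative sketch (not part of this patch): run one of the newly
# supported models through vLLM's offline LLM API.
from vllm import LLM, SamplingParams

# Assumption: these checkpoints rely on remote code on the HF Hub, so
# trust_remote_code=True is needed. Any model from the lists above,
# e.g. "internlm/internlm-chat-7b" or "BAAI/AquilaChat-7B", works the same way.
llm = LLM(model="Qwen/Qwen-7B", trust_remote_code=True)

sampling_params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=128)
outputs = llm.generate(["The capital of France is"], sampling_params)

for output in outputs:
    # Each RequestOutput carries the original prompt and its completions.
    print(output.prompt, output.outputs[0].text)
```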