FEAT: support Yi-1.5 series #1489

Merged · 5 commits · May 15, 2024
2 changes: 1 addition & 1 deletion doc/source/getting_started/installation.rst
@@ -43,7 +43,7 @@ Currently, supported models include:
- ``baichuan``, ``baichuan-chat``, ``baichuan-2-chat``
- ``internlm-16k``, ``internlm-chat-7b``, ``internlm-chat-8k``, ``internlm-chat-20b``
- ``mistral-v0.1``, ``mistral-instruct-v0.1``, ``mistral-instruct-v0.2``
- ``Yi``, ``Yi-chat``
- ``Yi``, ``Yi-1.5``, ``Yi-chat``, ``Yi-1.5-chat``
- ``code-llama``, ``code-llama-python``, ``code-llama-instruct``
- ``c4ai-command-r-v01``, ``c4ai-command-r-v01-4bit``
- ``vicuna-v1.3``, ``vicuna-v1.5``
2 changes: 1 addition & 1 deletion doc/source/models/builtin/llm/codeqwen1.5-chat.rst
@@ -4,7 +4,7 @@
codeqwen1.5-chat
========================================

- **Context Length:** 32768
- **Context Length:** 65536
- **Model Name:** codeqwen1.5-chat
- **Languages:** en, zh
- **Abilities:** chat
29 changes: 25 additions & 4 deletions doc/source/models/builtin/llm/index.rst
@@ -108,7 +108,7 @@ The following is a list of built-in LLM in Xinference:

* - :ref:`codeqwen1.5-chat <models_llm_codeqwen1.5-chat>`
- chat
- 32768
- 65536
- CodeQwen1.5 is the code-specific version of Qwen1.5. It is a transformer-based decoder-only language model pretrained on a large amount of code data.

* - :ref:`codeshell <models_llm_codeshell>`
@@ -381,6 +381,11 @@ The following is a list of built-in LLM in Xinference:
- 8192
- Starcoderplus is an open-source LLM trained by fine-tuning Starcoder on RedefinedWeb and StarCoderData datasets.

* - :ref:`starling-lm <models_llm_starling-lm>`
- chat
- 4096
- Starling-7B is an open large language model (LLM) trained by Reinforcement Learning from AI Feedback (RLAIF), built on a new GPT-4-labeled ranking dataset.

* - :ref:`tiny-llama <models_llm_tiny-llama>`
- generate
- 2048
@@ -431,19 +436,29 @@ The following is a list of built-in LLM in Xinference:
- 4096
- The Yi series models are large language models trained from scratch by developers at 01.AI.

* - :ref:`yi-1.5 <models_llm_yi-1.5>`
- generate
- 4096
- Yi-1.5 is an upgraded version of Yi. It is continually pre-trained from Yi on a high-quality corpus of 500B tokens and fine-tuned on 3M diverse samples.

* - :ref:`yi-1.5-chat <models_llm_yi-1.5-chat>`
- chat
- 4096
- Yi-1.5 is an upgraded version of Yi. It is continually pre-trained from Yi on a high-quality corpus of 500B tokens and fine-tuned on 3M diverse samples.

* - :ref:`yi-200k <models_llm_yi-200k>`
- generate
- 204800
- 262144
- The Yi series models are large language models trained from scratch by developers at 01.AI.

* - :ref:`yi-chat <models_llm_yi-chat>`
- chat
- 204800
- 4096
- The Yi series models are large language models trained from scratch by developers at 01.AI.

* - :ref:`yi-vl-chat <models_llm_yi-vl-chat>`
- chat, vision
- 204800
- 4096
- Yi Vision Language (Yi-VL) model is the open-source, multimodal version of the Yi Large Language Model (LLM) series, enabling content comprehension, recognition, and multi-round conversations about images.

* - :ref:`zephyr-7b-alpha <models_llm_zephyr-7b-alpha>`
@@ -607,6 +622,8 @@ The following is a list of built-in LLM in Xinference:

starcoderplus

starling-lm

tiny-llama

vicuna-v1.3
@@ -627,6 +644,10 @@ The following is a list of built-in LLM in Xinference:

yi

yi-1.5

yi-1.5-chat

yi-200k

yi-chat
60 changes: 60 additions & 0 deletions doc/source/models/builtin/llm/yi-1.5-chat.rst
@@ -0,0 +1,60 @@
.. _models_llm_yi-1.5-chat:

========================================
Yi-1.5-chat
========================================

- **Context Length:** 4096
- **Model Name:** Yi-1.5-chat
- **Languages:** en, zh
- **Abilities:** chat
- **Description:** Yi-1.5 is an upgraded version of Yi. It is continually pre-trained from Yi on a high-quality corpus of 500B tokens and fine-tuned on 3M diverse samples.

Specifications
^^^^^^^^^^^^^^


Model Spec 1 (pytorch, 6 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 6
- **Quantizations:** 4-bit, 8-bit, none
- **Model ID:** 01-ai/Yi-1.5-6B-Chat
- **Model Hubs**: `Hugging Face <https://huggingface.co/01-ai/Yi-1.5-6B-Chat>`__, `ModelScope <https://modelscope.cn/models/01ai/Yi-1.5-6B-Chat>`__

Execute the following command to launch the model, remembering to replace ``${quantization}`` with your chosen quantization method from the options listed above::

xinference launch --model-name Yi-1.5-chat --size-in-billions 6 --model-format pytorch --quantization ${quantization}
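Concretely, picking the ``4-bit`` option from the quantization list above yields a complete command. The sketch below assembles it in Python purely for illustration; the flag names come straight from the command shown above:

```python
# Assemble the launch command from this spec with a concrete
# quantization choice ("4-bit" here; "8-bit" and "none" are the
# other options listed above).
quantization = "4-bit"
cmd = [
    "xinference", "launch",
    "--model-name", "Yi-1.5-chat",
    "--size-in-billions", "6",
    "--model-format", "pytorch",
    "--quantization", quantization,
]
print(" ".join(cmd))
```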


Model Spec 2 (pytorch, 9 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 9
- **Quantizations:** 4-bit, 8-bit, none
- **Model ID:** 01-ai/Yi-1.5-9B-Chat
- **Model Hubs**: `Hugging Face <https://huggingface.co/01-ai/Yi-1.5-9B-Chat>`__, `ModelScope <https://modelscope.cn/models/01ai/Yi-1.5-9B-Chat>`__

Execute the following command to launch the model, remembering to replace ``${quantization}`` with your chosen quantization method from the options listed above::

xinference launch --model-name Yi-1.5-chat --size-in-billions 9 --model-format pytorch --quantization ${quantization}


Model Spec 3 (pytorch, 34 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 34
- **Quantizations:** 4-bit, 8-bit, none
- **Model ID:** 01-ai/Yi-1.5-34B-Chat
- **Model Hubs**: `Hugging Face <https://huggingface.co/01-ai/Yi-1.5-34B-Chat>`__, `ModelScope <https://modelscope.cn/models/01ai/Yi-1.5-34B-Chat>`__

Execute the following command to launch the model, remembering to replace ``${quantization}`` with your chosen quantization method from the options listed above::

xinference launch --model-name Yi-1.5-chat --size-in-billions 34 --model-format pytorch --quantization ${quantization}
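Once launched, a chat model can be queried over Xinference's OpenAI-compatible HTTP API. The sketch below only builds the request body; the endpoint path, local port, and the use of the model name as the ``model`` field follow the usual OpenAI-compatible convention and are assumptions, not part of this PR:

```python
import json

# Sketch of an OpenAI-compatible chat request for a launched
# Yi-1.5-chat model. The URL is a placeholder for a local
# Xinference instance (port assumed).
endpoint = "http://127.0.0.1:9997/v1/chat/completions"
payload = {
    "model": "Yi-1.5-chat",
    "messages": [{"role": "user", "content": "Hello, Yi!"}],
    "max_tokens": 128,
}
body = json.dumps(payload)
print(body)
```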

60 changes: 60 additions & 0 deletions doc/source/models/builtin/llm/yi-1.5.rst
@@ -0,0 +1,60 @@
.. _models_llm_yi-1.5:

========================================
Yi-1.5
========================================

- **Context Length:** 4096
- **Model Name:** Yi-1.5
- **Languages:** en, zh
- **Abilities:** generate
- **Description:** Yi-1.5 is an upgraded version of Yi. It is continually pre-trained from Yi on a high-quality corpus of 500B tokens and fine-tuned on 3M diverse samples.

Specifications
^^^^^^^^^^^^^^


Model Spec 1 (pytorch, 6 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 6
- **Quantizations:** 4-bit, 8-bit, none
- **Model ID:** 01-ai/Yi-1.5-6B
- **Model Hubs**: `Hugging Face <https://huggingface.co/01-ai/Yi-1.5-6B>`__, `ModelScope <https://modelscope.cn/models/01ai/Yi-1.5-6B>`__

Execute the following command to launch the model, remembering to replace ``${quantization}`` with your chosen quantization method from the options listed above::

xinference launch --model-name Yi-1.5 --size-in-billions 6 --model-format pytorch --quantization ${quantization}


Model Spec 2 (pytorch, 9 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 9
- **Quantizations:** 4-bit, 8-bit, none
- **Model ID:** 01-ai/Yi-1.5-9B
- **Model Hubs**: `Hugging Face <https://huggingface.co/01-ai/Yi-1.5-9B>`__, `ModelScope <https://modelscope.cn/models/01ai/Yi-1.5-9B>`__

Execute the following command to launch the model, remembering to replace ``${quantization}`` with your chosen quantization method from the options listed above::

xinference launch --model-name Yi-1.5 --size-in-billions 9 --model-format pytorch --quantization ${quantization}


Model Spec 3 (pytorch, 34 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 34
- **Quantizations:** 4-bit, 8-bit, none
- **Model ID:** 01-ai/Yi-1.5-34B
- **Model Hubs**: `Hugging Face <https://huggingface.co/01-ai/Yi-1.5-34B>`__, `ModelScope <https://modelscope.cn/models/01ai/Yi-1.5-34B>`__

Execute the following command to launch the model, remembering to replace ``${quantization}`` with your chosen quantization method from the options listed above::

xinference launch --model-name Yi-1.5 --size-in-billions 34 --model-format pytorch --quantization ${quantization}
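Unlike Yi-1.5-chat, the base Yi-1.5 models expose the ``generate`` ability, so a request would target a plain-completions route rather than a chat route. As above, the endpoint path and field names are assumptions based on the OpenAI-compatible convention; the sketch only builds the request body:

```python
import json

# Sketch of an OpenAI-compatible text-completion request for the
# base (generate-only) Yi-1.5 model; note "prompt" instead of
# "messages". URL and port are placeholders.
endpoint = "http://127.0.0.1:9997/v1/completions"
payload = {
    "model": "Yi-1.5",
    "prompt": "Large language models are",
    "max_tokens": 64,
}
body = json.dumps(payload)
print(body)
```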

2 changes: 1 addition & 1 deletion doc/source/models/builtin/llm/yi-200k.rst
Expand Up @@ -4,7 +4,7 @@
Yi-200k
========================================

- **Context Length:** 204800
- **Context Length:** 262144
- **Model Name:** Yi-200k
- **Languages:** en, zh
- **Abilities:** generate
21 changes: 18 additions & 3 deletions doc/source/models/builtin/llm/yi-chat.rst
@@ -4,7 +4,7 @@
Yi-chat
========================================

- **Context Length:** 204800
- **Context Length:** 4096
- **Model Name:** Yi-chat
- **Languages:** en, zh
- **Abilities:** chat
@@ -29,7 +29,22 @@ chosen quantization method from the options listed above::
xinference launch --model-name Yi-chat --size-in-billions 34 --model-format gptq --quantization ${quantization}


Model Spec 2 (pytorch, 34 Billion)
Model Spec 2 (pytorch, 6 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
- **Model Size (in billions):** 6
- **Quantizations:** 4-bit, 8-bit, none
- **Model ID:** 01-ai/Yi-6B-Chat
- **Model Hubs**: `Hugging Face <https://huggingface.co/01-ai/Yi-6B-Chat>`__, `ModelScope <https://modelscope.cn/models/01ai/Yi-6B-Chat>`__

Execute the following command to launch the model, remembering to replace ``${quantization}`` with your chosen quantization method from the options listed above::

xinference launch --model-name Yi-chat --size-in-billions 6 --model-format pytorch --quantization ${quantization}


Model Spec 3 (pytorch, 34 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** pytorch
@@ -44,7 +59,7 @@ chosen quantization method from the options listed above::
xinference launch --model-name Yi-chat --size-in-billions 34 --model-format pytorch --quantization ${quantization}


Model Spec 3 (ggufv2, 34 Billion)
Model Spec 4 (ggufv2, 34 Billion)
++++++++++++++++++++++++++++++++++++++++

- **Model Format:** ggufv2
2 changes: 1 addition & 1 deletion doc/source/models/builtin/llm/yi-vl-chat.rst
@@ -4,7 +4,7 @@
yi-vl-chat
========================================

- **Context Length:** 204800
- **Context Length:** 4096
- **Model Name:** yi-vl-chat
- **Languages:** en, zh
- **Abilities:** chat, vision
2 changes: 1 addition & 1 deletion doc/source/user_guide/backends.rst
@@ -50,7 +50,7 @@ Currently, supported models include:
- ``baichuan``, ``baichuan-chat``, ``baichuan-2-chat``
- ``internlm-16k``, ``internlm-chat-7b``, ``internlm-chat-8k``, ``internlm-chat-20b``
- ``mistral-v0.1``, ``mistral-instruct-v0.1``, ``mistral-instruct-v0.2``
- ``Yi``, ``Yi-chat``
- ``Yi``, ``Yi-1.5``, ``Yi-chat``, ``Yi-1.5-chat``
- ``code-llama``, ``code-llama-python``, ``code-llama-instruct``
- ``c4ai-command-r-v01``, ``c4ai-command-r-v01-4bit``
- ``vicuna-v1.3``, ``vicuna-v1.5``