DOC: update readme and fix description about model engine (#1566)
qinxuye committed May 31, 2024
1 parent f8dd5ba commit cb9dbb2
Showing 8 changed files with 126 additions and 63 deletions.
README.md: 2 changes (1 addition, 1 deletion)
@@ -34,12 +34,12 @@ potential of cutting-edge AI models.
- Docker image: [#855](https://github.com/xorbitsai/inference/pull/855)
- Support multimodal: [#829](https://github.com/xorbitsai/inference/pull/829)
### New Models
- Built-in support for [CogVLM2](https://github.com/THUDM/CogVLM2): [#1551](https://github.com/xorbitsai/inference/pull/1551)
- Built-in support for [InternVL-Chat-V1-5](https://github.com/OpenGVLab/InternVL): [#1536](https://github.com/xorbitsai/inference/pull/1536)
- Built-in support for [Yi-1.5](https://github.com/01-ai/Yi-1.5): [#1489](https://github.com/xorbitsai/inference/pull/1489)
- Built-in support for [Llama 3](https://github.com/meta-llama/llama3): [#1332](https://github.com/xorbitsai/inference/pull/1332)
- Built-in support for [Qwen1.5 110B](https://huggingface.co/Qwen/Qwen1.5-110B-Chat): [#1388](https://github.com/xorbitsai/inference/pull/1388)
- Built-in support for [Mixtral-8x22B-instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1): [#1340](https://github.com/xorbitsai/inference/pull/1340)
- Built-in support for [Command-R](https://huggingface.co/CohereForAI/c4ai-command-r-v01): [#1310](https://github.com/xorbitsai/inference/pull/1310)
### Integrations
- [Dify](https://docs.dify.ai/advanced/model-configuration/xinference): an LLMOps platform that enables developers (and even non-developers) to quickly build useful applications based on large language models, ensuring they are visual, operable, and improvable.
- [FastGPT](https://github.com/labring/FastGPT): a knowledge-based platform built on LLMs that offers out-of-the-box data processing and model invocation capabilities, and supports workflow orchestration through Flow visualization.
README_zh_CN.md: 2 changes (1 addition, 1 deletion)
@@ -31,12 +31,12 @@ Xorbits Inference(Xinference)是一个性能强大且功能全面的分布
- Docker 镜像支持: [#855](https://github.com/xorbitsai/inference/pull/855)
- 支持多模态模型:[#829](https://github.com/xorbitsai/inference/pull/829)
### 新模型
- 内置 [CogVLM2](https://github.com/THUDM/CogVLM2): [#1551](https://github.com/xorbitsai/inference/pull/1551)
- 内置 [InternVL-Chat-V1-5](https://github.com/OpenGVLab/InternVL): [#1536](https://github.com/xorbitsai/inference/pull/1536)
- 内置 [Yi-1.5](https://github.com/01-ai/Yi-1.5): [#1489](https://github.com/xorbitsai/inference/pull/1489)
- 内置 [Llama 3](https://github.com/meta-llama/llama3): [#1332](https://github.com/xorbitsai/inference/pull/1332)
- 内置 [Qwen1.5 110B](https://huggingface.co/Qwen/Qwen1.5-110B-Chat): [#1388](https://github.com/xorbitsai/inference/pull/1388)
- 内置 [Mixtral-8x22B-instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1): [#1340](https://github.com/xorbitsai/inference/pull/1340)
- 内置 [Command-R](https://huggingface.co/CohereForAI/c4ai-command-r-v01): [#1310](https://github.com/xorbitsai/inference/pull/1310)
### 集成
- [FastGPT](https://doc.fastai.site/docs/development/custom-models/xinference/):一个基于 LLM 大模型的开源 AI 知识库构建平台。提供了开箱即用的数据处理、模型调用、RAG 检索、可视化 AI 工作流编排等能力,帮助您轻松实现复杂的问答场景。
- [Dify](https://docs.dify.ai/advanced/model-configuration/xinference): 一个涵盖了大型语言模型开发、部署、维护和优化的 LLMOps 平台。
doc/source/locale/zh_CN/LC_MESSAGES/getting_started/environments.po
@@ -8,7 +8,7 @@ msgid ""
msgstr ""
"Project-Id-Version: Xinference \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-03-11 13:33+0800\n"
"POT-Creation-Date: 2024-05-31 11:46+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
@@ -96,8 +96,8 @@ msgid ""
"Xinference will automatically report health check at Xinference startup. "
"Setting this environment to 1 can disable health check."
msgstr ""
"在满足条件时,Xinference 会自动汇报worker健康状况,设置改"
"环境变量为 1可以禁用健康检查。"
"在满足条件时,Xinference 会自动汇报worker健康状况,设置改环境变量为 1可以"
"禁用健康检查。"

#: ../../source/getting_started/environments.rst:40
msgid "XINFERENCE_DISABLE_VLLM"
@@ -111,3 +111,18 @@ msgstr ""
"在满足条件时,Xinference 会自动使用 vLLM 作为推理引擎提供推理效率,设置改"
"环境变量为 1可以禁用 vLLM。"

#: ../../source/getting_started/environments.rst:45
#, fuzzy
msgid "XINFERENCE_DISABLE_METRICS"
msgstr "XINFERENCE_DISABLE_VLLM"

#: ../../source/getting_started/environments.rst:46
msgid ""
"Xinference will by default enable the metrics exporter on the supervisor "
"and worker. Setting this environment to 1 will disable the /metrics "
"endpoint on the supervisor and the HTTP service (only provide the "
"/metrics endpoint) on the worker."
msgstr ""
"Xinference 会默认在 supervisor 和 worker 上启用 metrics exporter。"
"设置环境变量为 1可以在 supervisor 上禁用 /metrics 端点,"
"并在 worker 上禁用 HTTP 服务(仅提供 /metrics 端点)"
doc/source/locale/zh_CN/LC_MESSAGES/getting_started/installation.po: 54 changes (36 additions, 18 deletions)
@@ -7,7 +7,7 @@ msgid ""
msgstr ""
"Project-Id-Version: Xinference \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-05-11 10:26+0800\n"
"POT-Creation-Date: 2024-05-31 11:46+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
@@ -16,7 +16,7 @@ msgstr ""
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.11.0\n"
"Generated-By: Babel 2.14.0\n"

#: ../../source/getting_started/installation.rst:5
msgid "Installation"
@@ -95,7 +95,9 @@ msgstr "当模型格式为 ``awq`` 时,量化选项需为 ``Int4`` 。"
msgid ""
"When the model format is ``gptq``, the quantization is ``Int3``, ``Int4``"
" or ``Int8``."
msgstr "当模型格式为 ``gptq`` 时,量化选项需为 ``Int3`` 、 ``Int4`` 或者 ``Int8`` 。"
msgstr ""
"当模型格式为 ``gptq`` 时,量化选项需为 ``Int3`` 、 ``Int4`` 或者 ``Int8``"
" 。"

#: ../../source/getting_started/installation.rst:35
msgid "The system is Linux and has at least one CUDA device"
@@ -132,55 +134,57 @@ msgid "``mistral-v0.1``, ``mistral-instruct-v0.1``, ``mistral-instruct-v0.2``"
msgstr ""

#: ../../source/getting_started/installation.rst:46
msgid "``Yi``, ``Yi-chat``"
msgid "``Yi``, ``Yi-1.5``, ``Yi-chat``, ``Yi-1.5-chat``, ``Yi-1.5-chat-16k``"
msgstr ""

#: ../../source/getting_started/installation.rst:47
msgid "``code-llama``, ``code-llama-python``, ``code-llama-instruct``"
msgstr ""

#: ../../source/getting_started/installation.rst:48
msgid "``c4ai-command-r-v01``, ``c4ai-command-r-v01-4bit``"
msgid ""
"``deepseek``, ``deepseek-coder``, ``deepseek-chat``, ``deepseek-coder-"
"instruct``"
msgstr ""

#: ../../source/getting_started/installation.rst:49
msgid "``vicuna-v1.3``, ``vicuna-v1.5``"
msgid "``codeqwen1.5``, ``codeqwen1.5-chat``"
msgstr ""

#: ../../source/getting_started/installation.rst:50
msgid "``internlm2-chat``"
msgid "``vicuna-v1.3``, ``vicuna-v1.5``"
msgstr ""

#: ../../source/getting_started/installation.rst:51
msgid "``qwen-chat``"
msgid "``internlm2-chat``"
msgstr ""

#: ../../source/getting_started/installation.rst:52
msgid "``mixtral-instruct-v0.1``, ``mixtral-8x22B-instruct-v0.1``"
msgid "``qwen-chat``"
msgstr ""

#: ../../source/getting_started/installation.rst:53
msgid "``chatglm3``, ``chatglm3-32k``, ``chatglm3-128k``"
msgid "``mixtral-instruct-v0.1``, ``mixtral-8x22B-instruct-v0.1``"
msgstr ""

#: ../../source/getting_started/installation.rst:54
msgid "``deepseek-chat``, ``deepseek-coder-instruct``"
msgid "``chatglm3``, ``chatglm3-32k``, ``chatglm3-128k``"
msgstr ""

#: ../../source/getting_started/installation.rst:55
msgid "``qwen1.5-chat``, ``qwen1.5-moe-chat``"
msgstr ""

#: ../../source/getting_started/installation.rst:56
msgid "``codeqwen1.5-chat``"
msgid "``gemma-it``"
msgstr ""

#: ../../source/getting_started/installation.rst:57
msgid "``gemma-it``"
msgid "``orion-chat``, ``orion-chat-rag``"
msgstr ""

#: ../../source/getting_started/installation.rst:58
msgid "``orion-chat``, ``orion-chat-rag``"
msgid "``c4ai-command-r-v01``"
msgstr ""

#: ../../source/getting_started/installation.rst:61
@@ -197,8 +201,8 @@ msgid ""
"cpp-python``. It's advised to install the llama.cpp-related dependencies "
"manually based on your hardware specifications to enable acceleration."
msgstr ""
"Xinference 通过 ``llama-cpp-python`` 支持 ``gguf`` 和 ``ggml`` 格式的模型。建议根据当前使用的硬件手动安装依赖,从而获得最佳的"
"加速效果。"
"Xinference 通过 ``llama-cpp-python`` 支持 ``gguf`` 和 ``ggml`` 格式的模型"
"。建议根据当前使用的硬件手动安装依赖,从而获得最佳的加速效果。"

#: ../../source/getting_started/installation.rst:71
#: ../../source/getting_started/installation.rst:94
@@ -232,5 +236,19 @@ msgid ""
"automatic KV cache reuse across multiple calls. And it also supports "
"other common techniques like continuous batching and tensor parallelism."
msgstr ""
"SGLang 具有基于 RadixAttention 的高性能推理运行时。它通过在多个调用之间自动重用KV缓存,显著加速了复杂 LLM 程序的执行。"
"它还支持其他常见推理技术,如连续批处理和张量并行处理。"
"SGLang 具有基于 RadixAttention 的高性能推理运行时。它通过在多个调用之间"
"自动重用KV缓存,显著加速了复杂 LLM 程序的执行。它还支持其他常见推理技术,"
"如连续批处理和张量并行处理。"

#~ msgid "``Yi``, ``Yi-chat``"
#~ msgstr ""

#~ msgid "``c4ai-command-r-v01``, ``c4ai-command-r-v01-4bit``"
#~ msgstr ""

#~ msgid "``deepseek-chat``, ``deepseek-coder-instruct``"
#~ msgstr ""

#~ msgid "``codeqwen1.5-chat``"
#~ msgstr ""

doc/source/locale/zh_CN/LC_MESSAGES/models/index.po: 50 changes (25 additions, 25 deletions)
@@ -8,7 +8,7 @@ msgid ""
msgstr ""
"Project-Id-Version: Xinference \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-02-07 17:52+0800\n"
"POT-Creation-Date: 2024-05-31 11:46+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
@@ -17,7 +17,7 @@ msgstr ""
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.13.1\n"
"Generated-By: Babel 2.14.0\n"

#: ../../source/models/index.rst:5
msgid "Models"
@@ -109,85 +109,85 @@ msgid ""
"Xinference's Python client:"
msgstr "你可以通过命令行或者 Xinference 的 Python 客户端来启动一个模型。"

-#: ../../source/models/index.rst:105
+#: ../../source/models/index.rst:107
msgid ""
"For model type ``LLM``, launching the model requires not only specifying "
"the model name, but also the size of the parameters and the model format."
" Please refer to the list of LLM :ref:`model families "
"<models_llm_index>`."
"the model name, but also the size of the parameters , the model format "
"and the model engine. Please refer to the list of LLM :ref:`model "
"families <models_llm_index>`."
msgstr ""
"对于模型类型 ``LLM``,启动模型不仅需要指定模型名称,还需要参数的大小和"
"模型格式。请参考 :ref:`models_llm_index` 文档。"
"对于模型类型 ``LLM``,启动模型不仅需要指定模型名称,还需要参数的大小、"
"模型格式以及模型引擎。请参考 :ref:`models_llm_index` 文档。"

-#: ../../source/models/index.rst:108
+#: ../../source/models/index.rst:110
msgid ""
"The following command gives you the currently running models in "
"Xinference:"
msgstr "以下命令可以列出 Xinference 中正在运行的模型:"

-#: ../../source/models/index.rst:129
+#: ../../source/models/index.rst:131
msgid ""
"When you no longer need a model that is currently running, you can remove"
" it in the following way to free up the resources it occupies:"
msgstr "当你不再需要当前正在运行的模型时,以下列方式释放其占用的资源:"

-#: ../../source/models/index.rst:153
+#: ../../source/models/index.rst:155
msgid "Model Usage"
msgstr "模型使用"

-#: ../../source/models/index.rst:158
+#: ../../source/models/index.rst:160
msgid "Chat & Generate"
msgstr "聊天 & 生成"

-#: ../../source/models/index.rst:162
+#: ../../source/models/index.rst:164
msgid "Learn how to chat with LLMs in Xinference."
msgstr "学习如何在 Xinference 中与 LLM聊天。"

-#: ../../source/models/index.rst:164
+#: ../../source/models/index.rst:166
msgid "Tools"
msgstr "工具"

-#: ../../source/models/index.rst:168
+#: ../../source/models/index.rst:170
msgid "Learn how to connect LLM with external tools."
msgstr "学习如何将 LLM 与外部工具连接起来。"

-#: ../../source/models/index.rst:173
+#: ../../source/models/index.rst:175
msgid "Embeddings"
msgstr "嵌入"

-#: ../../source/models/index.rst:177
+#: ../../source/models/index.rst:179
msgid "Learn how to create text embeddings in Xinference."
msgstr "学习如何在 Xinference 中创建文本嵌入。"

-#: ../../source/models/index.rst:179
+#: ../../source/models/index.rst:181
msgid "Rerank"
msgstr "重排序"

-#: ../../source/models/index.rst:183
+#: ../../source/models/index.rst:185
msgid "Learn how to use rerank models in Xinference."
msgstr "学习如何在 Xinference 中使用重排序模型。"

-#: ../../source/models/index.rst:188
+#: ../../source/models/index.rst:190
msgid "Images"
msgstr "图像"

-#: ../../source/models/index.rst:192
+#: ../../source/models/index.rst:194
msgid "Learn how to generate images with Xinference."
msgstr "学习如何使用Xinference生成图像。"

-#: ../../source/models/index.rst:194
+#: ../../source/models/index.rst:196
msgid "Vision"
msgstr "视觉"

-#: ../../source/models/index.rst:198
+#: ../../source/models/index.rst:200
msgid "Learn how to process image with LLMs."
msgstr "学习如何使用 LLM 处理图像。"

-#: ../../source/models/index.rst:203
+#: ../../source/models/index.rst:205
msgid "Audio"
msgstr "音频"

-#: ../../source/models/index.rst:207
+#: ../../source/models/index.rst:209
msgid "Learn how to turn audio into text or text into audio with Xinference."
msgstr "学习如何使用 Xinference 将音频转换为文本或将文本转换为音频。"

(3 more changed files not shown)
