DOC: update readme and fix description about model engine (#1566)
qinxuye committed May 31, 2024
1 parent f8dd5ba commit cb9dbb2
Showing 8 changed files with 126 additions and 63 deletions.
README.md: 2 changes (1 addition, 1 deletion)
@@ -34,12 +34,12 @@ potential of cutting-edge AI models.
- Docker image: [#855](https://github.com/xorbitsai/inference/pull/855)
- Support multimodal: [#829](https://github.com/xorbitsai/inference/pull/829)
### New Models
- Built-in support for [CogVLM2](https://github.com/THUDM/CogVLM2): [#1551](https://github.com/xorbitsai/inference/pull/1551)
- Built-in support for [InternVL-Chat-V1-5](https://github.com/OpenGVLab/InternVL): [#1536](https://github.com/xorbitsai/inference/pull/1536)
- Built-in support for [Yi-1.5](https://github.com/01-ai/Yi-1.5): [#1489](https://github.com/xorbitsai/inference/pull/1489)
- Built-in support for [Llama 3](https://github.com/meta-llama/llama3): [#1332](https://github.com/xorbitsai/inference/pull/1332)
- Built-in support for [Qwen1.5 110B](https://huggingface.co/Qwen/Qwen1.5-110B-Chat): [#1388](https://github.com/xorbitsai/inference/pull/1388)
- Built-in support for [Mixtral-8x22B-instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1): [#1340](https://github.com/xorbitsai/inference/pull/1340)
- Built-in support for [Command-R](https://huggingface.co/CohereForAI/c4ai-command-r-v01): [#1310](https://github.com/xorbitsai/inference/pull/1310)
### Integrations
- [Dify](https://docs.dify.ai/advanced/model-configuration/xinference): an LLMOps platform that enables developers (and even non-developers) to quickly build useful applications based on large language models, ensuring they are visual, operable, and improvable.
- [FastGPT](https://github.com/labring/FastGPT): a knowledge-based platform built on LLMs that offers out-of-the-box data processing and model invocation capabilities, and supports workflow orchestration through Flow visualization.
README_zh_CN.md: 2 changes (1 addition, 1 deletion)
@@ -31,12 +31,12 @@ Xorbits Inference(Xinference)是一个性能强大且功能全面的分布
- Docker 镜像支持: [#855](https://github.com/xorbitsai/inference/pull/855)
- 支持多模态模型:[#829](https://github.com/xorbitsai/inference/pull/829)
### 新模型
- 内置 [CogVLM2](https://github.com/THUDM/CogVLM2): [#1551](https://github.com/xorbitsai/inference/pull/1551)
- 内置 [InternVL-Chat-V1-5](https://github.com/OpenGVLab/InternVL): [#1536](https://github.com/xorbitsai/inference/pull/1536)
- 内置 [Yi-1.5](https://github.com/01-ai/Yi-1.5): [#1489](https://github.com/xorbitsai/inference/pull/1489)
- 内置 [Llama 3](https://github.com/meta-llama/llama3): [#1332](https://github.com/xorbitsai/inference/pull/1332)
- 内置 [Qwen1.5 110B](https://huggingface.co/Qwen/Qwen1.5-110B-Chat): [#1388](https://github.com/xorbitsai/inference/pull/1388)
- 内置 [Mixtral-8x22B-instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x22B-Instruct-v0.1): [#1340](https://github.com/xorbitsai/inference/pull/1340)
- 内置 [Command-R](https://huggingface.co/CohereForAI/c4ai-command-r-v01): [#1310](https://github.com/xorbitsai/inference/pull/1310)
### 集成
- [FastGPT](https://doc.fastai.site/docs/development/custom-models/xinference/):一个基于 LLM 大模型的开源 AI 知识库构建平台。提供了开箱即用的数据处理、模型调用、RAG 检索、可视化 AI 工作流编排等能力,帮助您轻松实现复杂的问答场景。
- [Dify](https://docs.dify.ai/advanced/model-configuration/xinference): 一个涵盖了大型语言模型开发、部署、维护和优化的 LLMOps 平台。
doc/source/locale/zh_CN/LC_MESSAGES/getting_started/environments.po
@@ -8,7 +8,7 @@ msgid ""
msgstr ""
"Project-Id-Version: Xinference \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-03-11 13:33+0800\n"
"POT-Creation-Date: 2024-05-31 11:46+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
@@ -96,8 +96,8 @@ msgid ""
"Xinference will automatically report health check at Xinference startup. "
"Setting this environment to 1 can disable health check."
msgstr ""
"在满足条件时,Xinference 会自动汇报worker健康状况,设置改"
"环境变量为 1可以禁用健康检查。"
"在满足条件时,Xinference 会自动汇报worker健康状况,设置改环境变量为 1可以"
"禁用健康检查。"

#: ../../source/getting_started/environments.rst:40
msgid "XINFERENCE_DISABLE_VLLM"
@@ -111,3 +111,18 @@ msgstr ""
"在满足条件时,Xinference 会自动使用 vLLM 作为推理引擎提供推理效率,设置改"
"环境变量为 1可以禁用 vLLM。"

#: ../../source/getting_started/environments.rst:45
#, fuzzy
msgid "XINFERENCE_DISABLE_METRICS"
msgstr "XINFERENCE_DISABLE_VLLM"

#: ../../source/getting_started/environments.rst:46
msgid ""
"Xinference will by default enable the metrics exporter on the supervisor "
"and worker. Setting this environment to 1 will disable the /metrics "
"endpoint on the supervisor and the HTTP service (only provide the "
"/metrics endpoint) on the worker."
msgstr ""
"Xinference 会默认在 supervisor 和 worker 上启用 metrics exporter。"
"设置环境变量为 1可以在 supervisor 上禁用 /metrics 端点,"
"并在 worker 上禁用 HTTP 服务(仅提供 /metrics 端点)"
doc/source/locale/zh_CN/LC_MESSAGES/getting_started/installation.po: 54 changes (36 additions, 18 deletions)
@@ -7,7 +7,7 @@ msgid ""
msgstr ""
"Project-Id-Version: Xinference \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-05-11 10:26+0800\n"
"POT-Creation-Date: 2024-05-31 11:46+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
@@ -16,7 +16,7 @@ msgstr ""
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.11.0\n"
"Generated-By: Babel 2.14.0\n"

#: ../../source/getting_started/installation.rst:5
msgid "Installation"
@@ -95,7 +95,9 @@ msgstr "当模型格式为 ``awq`` 时,量化选项需为 ``Int4`` 。"
msgid ""
"When the model format is ``gptq``, the quantization is ``Int3``, ``Int4``"
" or ``Int8``."
msgstr "当模型格式为 ``gptq`` 时,量化选项需为 ``Int3`` 、 ``Int4`` 或者 ``Int8`` 。"
msgstr ""
"当模型格式为 ``gptq`` 时,量化选项需为 ``Int3`` 、 ``Int4`` 或者 ``Int8``"
" 。"

#: ../../source/getting_started/installation.rst:35
msgid "The system is Linux and has at least one CUDA device"
@@ -132,55 +134,57 @@ msgid "``mistral-v0.1``, ``mistral-instruct-v0.1``, ``mistral-instruct-v0.2``"
msgstr ""

#: ../../source/getting_started/installation.rst:46
msgid "``Yi``, ``Yi-chat``"
msgid "``Yi``, ``Yi-1.5``, ``Yi-chat``, ``Yi-1.5-chat``, ``Yi-1.5-chat-16k``"
msgstr ""

#: ../../source/getting_started/installation.rst:47
msgid "``code-llama``, ``code-llama-python``, ``code-llama-instruct``"
msgstr ""

#: ../../source/getting_started/installation.rst:48
msgid "``c4ai-command-r-v01``, ``c4ai-command-r-v01-4bit``"
msgid ""
"``deepseek``, ``deepseek-coder``, ``deepseek-chat``, ``deepseek-coder-"
"instruct``"
msgstr ""

#: ../../source/getting_started/installation.rst:49
msgid "``vicuna-v1.3``, ``vicuna-v1.5``"
msgid "``codeqwen1.5``, ``codeqwen1.5-chat``"
msgstr ""

#: ../../source/getting_started/installation.rst:50
msgid "``internlm2-chat``"
msgid "``vicuna-v1.3``, ``vicuna-v1.5``"
msgstr ""

#: ../../source/getting_started/installation.rst:51
msgid "``qwen-chat``"
msgid "``internlm2-chat``"
msgstr ""

#: ../../source/getting_started/installation.rst:52
msgid "``mixtral-instruct-v0.1``, ``mixtral-8x22B-instruct-v0.1``"
msgid "``qwen-chat``"
msgstr ""

#: ../../source/getting_started/installation.rst:53
msgid "``chatglm3``, ``chatglm3-32k``, ``chatglm3-128k``"
msgid "``mixtral-instruct-v0.1``, ``mixtral-8x22B-instruct-v0.1``"
msgstr ""

#: ../../source/getting_started/installation.rst:54
msgid "``deepseek-chat``, ``deepseek-coder-instruct``"
msgid "``chatglm3``, ``chatglm3-32k``, ``chatglm3-128k``"
msgstr ""

#: ../../source/getting_started/installation.rst:55
msgid "``qwen1.5-chat``, ``qwen1.5-moe-chat``"
msgstr ""

#: ../../source/getting_started/installation.rst:56
msgid "``codeqwen1.5-chat``"
msgid "``gemma-it``"
msgstr ""

#: ../../source/getting_started/installation.rst:57
msgid "``gemma-it``"
msgid "``orion-chat``, ``orion-chat-rag``"
msgstr ""

#: ../../source/getting_started/installation.rst:58
msgid "``orion-chat``, ``orion-chat-rag``"
msgid "``c4ai-command-r-v01``"
msgstr ""

#: ../../source/getting_started/installation.rst:61
@@ -197,8 +201,8 @@ msgid ""
"cpp-python``. It's advised to install the llama.cpp-related dependencies "
"manually based on your hardware specifications to enable acceleration."
msgstr ""
"Xinference 通过 ``llama-cpp-python`` 支持 ``gguf`` 和 ``ggml`` 格式的模型。建议根据当前使用的硬件手动安装依赖,从而获得最佳的"
"加速效果。"
"Xinference 通过 ``llama-cpp-python`` 支持 ``gguf`` 和 ``ggml`` 格式的模型"
"。建议根据当前使用的硬件手动安装依赖,从而获得最佳的加速效果。"

#: ../../source/getting_started/installation.rst:71
#: ../../source/getting_started/installation.rst:94
@@ -232,5 +236,19 @@ msgid ""
"automatic KV cache reuse across multiple calls. And it also supports "
"other common techniques like continuous batching and tensor parallelism."
msgstr ""
"SGLang 具有基于 RadixAttention 的高性能推理运行时。它通过在多个调用之间自动重用KV缓存,显著加速了复杂 LLM 程序的执行。"
"它还支持其他常见推理技术,如连续批处理和张量并行处理。"
"SGLang 具有基于 RadixAttention 的高性能推理运行时。它通过在多个调用之间"
"自动重用KV缓存,显著加速了复杂 LLM 程序的执行。它还支持其他常见推理技术,"
"如连续批处理和张量并行处理。"

#~ msgid "``Yi``, ``Yi-chat``"
#~ msgstr ""

#~ msgid "``c4ai-command-r-v01``, ``c4ai-command-r-v01-4bit``"
#~ msgstr ""

#~ msgid "``deepseek-chat``, ``deepseek-coder-instruct``"
#~ msgstr ""

#~ msgid "``codeqwen1.5-chat``"
#~ msgstr ""

doc/source/locale/zh_CN/LC_MESSAGES/models/index.po: 50 changes (25 additions, 25 deletions)
@@ -8,7 +8,7 @@ msgid ""
msgstr ""
"Project-Id-Version: Xinference \n"
"Report-Msgid-Bugs-To: \n"
"POT-Creation-Date: 2024-02-07 17:52+0800\n"
"POT-Creation-Date: 2024-05-31 11:46+0800\n"
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
"Language: zh_CN\n"
@@ -17,7 +17,7 @@ msgstr ""
"MIME-Version: 1.0\n"
"Content-Type: text/plain; charset=utf-8\n"
"Content-Transfer-Encoding: 8bit\n"
"Generated-By: Babel 2.13.1\n"
"Generated-By: Babel 2.14.0\n"

#: ../../source/models/index.rst:5
msgid "Models"
@@ -109,85 +109,85 @@ msgid ""
"Xinference's Python client:"
msgstr "你可以通过命令行或者 Xinference 的 Python 客户端来启动一个模型。"

-#: ../../source/models/index.rst:105
+#: ../../source/models/index.rst:107
msgid ""
"For model type ``LLM``, launching the model requires not only specifying "
"the model name, but also the size of the parameters and the model format."
" Please refer to the list of LLM :ref:`model families "
"<models_llm_index>`."
"the model name, but also the size of the parameters , the model format "
"and the model engine. Please refer to the list of LLM :ref:`model "
"families <models_llm_index>`."
msgstr ""
"对于模型类型 ``LLM``,启动模型不仅需要指定模型名称,还需要参数的大小和"
"模型格式。请参考 :ref:`models_llm_index` 文档。"
"对于模型类型 ``LLM``,启动模型不仅需要指定模型名称,还需要参数的大小、"
"模型格式以及模型引擎。请参考 :ref:`models_llm_index` 文档。"

-#: ../../source/models/index.rst:108
+#: ../../source/models/index.rst:110
msgid ""
"The following command gives you the currently running models in "
"Xinference:"
msgstr "以下命令可以列出 Xinference 中正在运行的模型:"

-#: ../../source/models/index.rst:129
+#: ../../source/models/index.rst:131
msgid ""
"When you no longer need a model that is currently running, you can remove"
" it in the following way to free up the resources it occupies:"
msgstr "当你不再需要当前正在运行的模型时,以下列方式释放其占用的资源:"

-#: ../../source/models/index.rst:153
+#: ../../source/models/index.rst:155
msgid "Model Usage"
msgstr "模型使用"

-#: ../../source/models/index.rst:158
+#: ../../source/models/index.rst:160
msgid "Chat & Generate"
msgstr "聊天 & 生成"

-#: ../../source/models/index.rst:162
+#: ../../source/models/index.rst:164
msgid "Learn how to chat with LLMs in Xinference."
msgstr "学习如何在 Xinference 中与 LLM聊天。"

-#: ../../source/models/index.rst:164
+#: ../../source/models/index.rst:166
msgid "Tools"
msgstr "工具"

-#: ../../source/models/index.rst:168
+#: ../../source/models/index.rst:170
msgid "Learn how to connect LLM with external tools."
msgstr "学习如何将 LLM 与外部工具连接起来。"

-#: ../../source/models/index.rst:173
+#: ../../source/models/index.rst:175
msgid "Embeddings"
msgstr "嵌入"

-#: ../../source/models/index.rst:177
+#: ../../source/models/index.rst:179
msgid "Learn how to create text embeddings in Xinference."
msgstr "学习如何在 Xinference 中创建文本嵌入。"

-#: ../../source/models/index.rst:179
+#: ../../source/models/index.rst:181
msgid "Rerank"
msgstr "重排序"

-#: ../../source/models/index.rst:183
+#: ../../source/models/index.rst:185
msgid "Learn how to use rerank models in Xinference."
msgstr "学习如何在 Xinference 中使用重排序模型。"

-#: ../../source/models/index.rst:188
+#: ../../source/models/index.rst:190
msgid "Images"
msgstr "图像"

-#: ../../source/models/index.rst:192
+#: ../../source/models/index.rst:194
msgid "Learn how to generate images with Xinference."
msgstr "学习如何使用Xinference生成图像。"

-#: ../../source/models/index.rst:194
+#: ../../source/models/index.rst:196
msgid "Vision"
msgstr "视觉"

-#: ../../source/models/index.rst:198
+#: ../../source/models/index.rst:200
msgid "Learn how to process image with LLMs."
msgstr "学习如何使用 LLM 处理图像。"

-#: ../../source/models/index.rst:203
+#: ../../source/models/index.rst:205
msgid "Audio"
msgstr "音频"

-#: ../../source/models/index.rst:207
+#: ../../source/models/index.rst:209
msgid "Learn how to turn audio into text or text into audio with Xinference."
msgstr "学习如何使用 Xinference 将音频转换为文本或将文本转换为音频。"

(3 more changed files not shown)
