diff --git a/README.md b/README.md
index 578c25459e..ae6a29377d 100644
--- a/README.md
+++ b/README.md
@@ -20,8 +20,6 @@ and serve your or state-of-the-art built-in models using just a single command.
 researcher, developer, or data scientist, Xorbits Inference empowers you to unleash the full 
 potential of cutting-edge AI models.
 
-![demo](assets/demo.gif)
-
 <div align="center">
 <i><a href="https://join.slack.com/t/xorbitsio/shared_invite/zt-1z3zsm9ep-87yI9YZ_B79HLB2ccTq4WA">👉 Join our Slack community!</a></i>
 </div>
@@ -52,14 +50,35 @@ with popular third-party libraries like LangChain and LlamaIndex. (Coming soon)
 ## Getting Started
 Xinference can be installed via pip from PyPI. It is highly recommended to create a new virtual
 environment to avoid conflicts.
+
+### Installation
 ```bash
-$ pip install "xinference[all]"
+$ pip install "xinference"
 ```
-`xinference[all]` installs all the necessary packages for serving models. If you want to achieve acceleration on 
+`xinference` installs the basic packages for serving models.
+
+#### Installation with GGML
+To serve ggml models, you need to install the following extra dependencies:
+```bash
+$ pip install "xinference[ggml]"
+```
+To enable acceleration on
 different hardware, refer to the installation documentation of the corresponding package.
 - [llama-cpp-python](https://github.com/abetlen/llama-cpp-python#installation-from-pypi-recommended) is required to run `baichuan`, `wizardlm-v1.0`, `vicuna-v1.3` and `orca`.
 - [chatglm-cpp-python](https://github.com/li-plus/chatglm.cpp#getting-started) is required to run `chatglm` and `chatglm2`.
 
+#### Installation with PyTorch
+To serve PyTorch models, you need to install the following extra dependencies:
+```bash
+$ pip install "xinference[pytorch]"
+```
+
+#### Installation with all dependencies
+If you want to serve all the supported models, install all the dependencies:
+```bash
+$ pip install "xinference[all]"
+```
+
 
 ### Deployment
 You can deploy Xinference locally with a single command or deploy it in a distributed cluster. 
@@ -97,7 +116,7 @@ You can also view a web UI using the Xinference endpoint to chat with all the
 builtin models. You can even **chat with two cutting-edge AI models side-by-side to compare
 their performance**!
 
-![web UI](assets/xinference-downloading.png)
+![web UI](assets/demo.gif)
 
 ### Xinference CLI
 Xinference provides a command line interface (CLI) for model management. Here are some useful 
diff --git a/README_ja_JP.md b/README_ja_JP.md
index 7cf51105f4..ebb17744ff 100644
--- a/README_ja_JP.md
+++ b/README_ja_JP.md
@@ -19,8 +19,6 @@ Xorbits Inference(Xinference) は、言語、音声認識、マルチモーダ
 あなたや最先端のビルトインモデルを簡単にデプロイし、提供することができます。 Xorbits Inference は、
 研究者、開発者、データサイエンティストを問わず、最先端の AI モデルの可能性を最大限に引き出すことができます。
 
-![demo](assets/demo.gif)
-
 <div align="center">
 <i><a href="https://join.slack.com/t/xorbitsio/shared_invite/zt-1z3zsm9ep-87yI9YZ_B79HLB2ccTq4WA">👉 Slack コミュニティにご参加ください！</a></i>
 </div>
@@ -89,7 +87,7 @@ Xinference が起動すると、CLI または Xinference クライアントか
 また、Xinference エンドポイントを使用してウェブ UI を表示し、すべての内蔵モデルとチャットすることもできます。
 **2 つの最先端 AI モデルを並べてチャットし、パフォーマンスを比較することもできます**！
 
-![web UI](assets/xinference-downloading.png)
+![web UI](assets/demo.gif)
 
 ### Xinference CLI
 Xinference には、モデル管理のためのコマンドラインインターフェース（CLI）が用意されています。便利なコマンドをいくつか紹介します:
diff --git a/README_zh_CN.md b/README_zh_CN.md
index 7b6a530bf2..e72f936ab6 100644
--- a/README_zh_CN.md
+++ b/README_zh_CN.md
@@ -19,8 +19,6 @@ Xorbits Inference（Xinference）是一个性能强大且功能全面的分布
 无论你是研究者，开发者，或是数据科学家，都可以通过 Xorbits Inference 与最前沿的 AI 模型，发掘更多可能。
 
 
-![demo](assets/demo.gif)
-
 <div align="center">
 <i><a href="https://join.slack.com/t/xorbitsio/shared_invite/zt-1z3zsm9ep-87yI9YZ_B79HLB2ccTq4WA">👉 立刻加入我们的 Slack 社区!</a></i>
 </div>
@@ -48,13 +46,34 @@ Xorbits Inference（Xinference）是一个性能强大且功能全面的分布
 
 ## 快速入门
 Xinference 可以通过 pip 从 PyPI 安装。我们非常推荐在安装前创建一个新的虚拟环境以避免依赖冲突。
+
+### 安装
 ```bash
-$ pip install "xinference[all]"
+$ pip install "xinference"
 ```
-`xinference[all]` 将会安装所有用于推理的必要依赖。如果你想要获得更高效的加速，请查看下列依赖的安装文档：
+`xinference` 将会安装所有用于推理的基础依赖。
+
+#### 支持 ggml 推理
+想要利用 ggml 推理，可以用以下命令：
+```bash
+$ pip install "xinference[ggml]"
+```
+如果你想要获得更高效的加速，请查看下列依赖的安装文档：
 - [llama-cpp-python](https://github.com/abetlen/llama-cpp-python#installation-from-pypi-recommended) 用于 `baichuan`, `wizardlm-v1.0`, `vicuna-v1.3` 及 `orca`.
 - [chatglm-cpp-python](https://github.com/li-plus/chatglm.cpp#getting-started) 用于 `chatglm` 及 `chatglm2`.
 
+#### 支持 PyTorch 推理
+想要利用 PyTorch 推理，可以使用以下命令：
+```bash
+$ pip install "xinference[pytorch]"
+```
+
+#### 支持所有类型
+如果想要支持推理所有支持的模型，可以安装所有的依赖：
+```bash
+$ pip install "xinference[all]"
+```
+
 
 ### 部署
 你可以一键进行本地部署，或按照下面的步骤将 Xinference 部署在计算集群。 
@@ -89,7 +108,7 @@ supervisor 所在服务器的主机名或 IP 地址。
 你还可以通过 web UI 与任意内置模型聊天。Xinference 甚至**支持同时与两个最前沿的 AI 模型聊天并比较它们的回复质
 量**！
 
-![web UI](assets/xinference-downloading.png)
+![web UI](assets/demo.gif)
 
 ### Xinference 命令行
 Xinference 提供了命令行工具用于模型管理。支持的命令包括：
diff --git a/setup.cfg b/setup.cfg
index 7ffe35818e..036a4b7948 100644
--- a/setup.cfg
+++ b/setup.cfg
@@ -69,6 +69,17 @@ all =
     transformers_stream_generator
     bitsandbytes
     protobuf
+ggml =
+    chatglm-cpp
+    llama-cpp-python>=0.1.77
+pytorch =
+    transformers>=4.31.0
+    torch
+    accelerate>=0.20.3
+    sentencepiece
+    transformers_stream_generator
+    bitsandbytes
+    protobuf
 doc =
     ipython>=6.5.0
     sphinx>=3.0.0,<5.0.0
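
The new `ggml` and `pytorch` extras can be sanity-checked after installation with a short script along these lines. This is only a sketch and not part of the patch: the helper names `missing_modules` and `available_extras` are hypothetical, and the import names are assumptions derived from the package names in `setup.cfg` (for example, `llama-cpp-python` is assumed to import as `llama_cpp` and `chatglm-cpp` as `chatglm_cpp`). Only a subset of the `pytorch` extra is checked.

```python
from importlib.util import find_spec

# Assumed mapping from each extra to the top-level modules it should provide.
# These import names are inferred from setup.cfg, not taken from the patch.
EXTRA_MODULES = {
    "ggml": ["chatglm_cpp", "llama_cpp"],
    "pytorch": ["transformers", "torch", "accelerate", "sentencepiece"],
}


def missing_modules(extra):
    """Return the modules of `extra` that cannot be found in this environment."""
    return [name for name in EXTRA_MODULES[extra] if find_spec(name) is None]


def available_extras():
    """Map each extra to True when all of its checked modules are importable."""
    return {extra: not missing_modules(extra) for extra in EXTRA_MODULES}


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        if ok:
            print(f"{extra}: ok")
        else:
            print(f"{extra}: missing {', '.join(missing_modules(extra))}")
```

`find_spec` only locates a module without importing it, so the check stays fast even when heavy packages such as `torch` are installed.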