Add conda environment recipe and update installation instructions #177

Open · wants to merge 3 commits into `main`
16 changes: 6 additions & 10 deletions README.md
@@ -112,6 +112,7 @@ MOSS是一个支持中英双语和多种插件的开源对话语言模型,`mos


## :robot: 本地部署

### 硬件要求

下表提供了一个batch size=1时本地部署MOSS进行推理所需的显存大小。**量化模型暂时不支持模型并行。**
@@ -123,29 +124,24 @@ MOSS是一个支持中英双语和多种插件的开源对话语言模型,`mos
| Int4 | 7.8GB | 12GB | 26GB |
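
A quick way to check whether a local GPU meets the memory figures in the table above, assuming an NVIDIA GPU with the driver and `nvidia-smi` installed:

```bash
# List each visible GPU with its total and currently free memory
nvidia-smi --query-gpu=name,memory.total,memory.free --format=csv
```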

### 下载安装

1. 下载本仓库内容至本地/远程服务器

```bash
git clone https://github.com/OpenLMLab/MOSS.git
cd MOSS
```

2. 创建conda环境
2. 创建 conda 环境

```bash
conda create --name moss python=3.8
conda env create --file conda-recipe.yaml # or `mamba env create --file conda-recipe.yaml`
conda activate moss
```

3. 安装依赖

```bash
pip install -r requirements.txt
```

其中`torch`和`transformers`版本不建议低于推荐版本。
其中 `torch` 和 `transformers` 版本不建议低于推荐版本。

目前triton仅支持Linux及WSL,暂不支持Windows及Mac OS,请等待后续更新。
目前 triton 仅支持 Linux 及 WSL,暂不支持 Windows 及 macOS,请等待后续更新。
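
One lightweight way to confirm that the installed `torch` and `transformers` meet the recommended versions, and that the CUDA build of PyTorch can see a GPU, is a quick check along these lines:

```bash
# Print installed versions and whether CUDA is available; compare against the recommended versions
python -c "import torch, transformers; print(torch.__version__, transformers.__version__, torch.cuda.is_available())"
```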

### 使用示例

14 changes: 5 additions & 9 deletions README_en.md
@@ -108,6 +108,7 @@ MOSS is an open-sourced plugin-augmented conversational language model. `moss-mo


## :robot: Chat with MOSS

### GPU Requirements

The table below shows the minimal GPU memory required to perform MOSS inference when the batch size is 1. Please note that **currently the quantized models do not support model parallelism**.
@@ -119,6 +120,7 @@ The table below shows the minimal GPU memory required by performing MOSS inferen
| Int4 | 7.8GB | 12GB | 26GB |
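
As a rough sanity check against the table above, free and total GPU memory can also be queried from Python; this sketch assumes a CUDA-enabled PyTorch build recent enough to provide `torch.cuda.mem_get_info` and a visible GPU:

```bash
# Report free/total memory of GPU 0 in GiB (mem_get_info returns bytes)
python -c "import torch; free, total = torch.cuda.mem_get_info(0); print(f'{free/2**30:.1f} GiB free of {total/2**30:.1f} GiB')"
```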

### Installation

1. Clone this repo to your local/remote machine.

```bash
@@ -129,25 +131,19 @@ cd MOSS
2. Create a new conda environment

```bash
conda create --name moss python=3.8
conda env create --file conda-recipe.yaml # or `mamba env create --file conda-recipe.yaml`
conda activate moss
```

3. Install requirements

```bash
pip install -r requirements.txt
```

4. (Optional) 4/8-bit quantization requirement
3. (Optional) 4/8-bit quantization requirement

```bash
pip install triton
```

Note that the version of `torch` and `transformers` should be equal or higher than recommended.

Currently triton only supports Linux and WSL. Please wait for later updates if you are using Windows/MacOS.
Currently triton only supports Linux and WSL. Please wait for later updates if you are using Windows/macOS.
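
After `pip install triton`, a simple import check can confirm whether the optional quantization dependency is usable on the current platform; it should succeed on Linux/WSL and fail on Windows/macOS:

```bash
# Succeeds only where triton is supported; prints the installed triton version
python -c "import triton; print(triton.__version__)"
```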

### Try MOSS

33 changes: 33 additions & 0 deletions conda-recipe.yaml
@@ -0,0 +1,33 @@
# Create virtual environment with command:
#
# $ conda env create --file conda-recipe.yaml
#

name: moss

channels:
  - pytorch
  - huggingface
  - nvidia/label/cuda-11.7.1
  - defaults
  - conda-forge

dependencies:
  - python = 3.10
  - pip

  - pytorch::pytorch >= 1.13
  - pytorch::pytorch-mutex = *=*cuda*
  - nvidia/label/cuda-11.7.1::cuda-toolkit = 11.7

  - huggingface::transformers >= 4.25
  - huggingface::datasets
  - accelerate
  - huggingface_hub
  - sentencepiece

  - matplotlib-base
  - gradio
  - streamlit
  - pip:
      - mdtex2html
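
For reference, a minimal sketch of how this recipe would typically be used end to end, assuming conda (or mamba) and an NVIDIA GPU are available:

```bash
conda env create --file conda-recipe.yaml   # or: mamba env create --file conda-recipe.yaml
conda activate moss

# If the recipe changes later, re-sync the existing environment instead of recreating it
conda env update --file conda-recipe.yaml --prune

# Sanity check: the CUDA build of PyTorch installed by the recipe should see the GPU
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```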