Merge pull request #289 from ymcui/plus

Release LLaMA-Plus and Alpaca-Plus 13B

ymcui committed May 10, 2023
2 parents 11e5d1f + b304ef7 commit dad59d7
Showing 37 changed files with 1,165 additions and 125 deletions.
90 changes: 48 additions & 42 deletions README.md

Large diffs are not rendered by default.

69 changes: 38 additions & 31 deletions README_EN.md
@@ -25,7 +25,7 @@ To promote open research of large models in the Chinese NLP community, this proj
- 🚀 Open-sourced the Chinese LLaMA (general purpose) and Alpaca (instruction-tuned) (7B, 13B)
- 🚀 Quickly deploy and experience the quantized version of the large model on CPU/GPU of your laptop (personal PC)
- 🚀 Support [🤗transformers](https://github.com/huggingface/transformers), [llama.cpp](https://github.com/ggerganov/llama.cpp), [text-generation-webui](https://github.com/oobabooga/text-generation-webui), [LlamaChat](https://github.com/alexrozanski/LlamaChat), etc.
- Released versions: 7B (basic, **Plus**), 13B (basic)
- Released versions: 7B (basic, **Plus**), 13B (basic, **Plus**)

💡 The following image shows the actual experience with the locally deployed 7B model (animation not accelerated; tested on Apple M1 Max).

@@ -37,7 +37,9 @@ To promote open research of large models in the Chinese NLP community, this proj

## News

**[2023/04/28] [Release v3.0](https://github.com/ymcui/Chinese-LLaMA-Alpaca/releases/tag/v3.0): LLaMA/Alpaca Plus versions are available, with more training data than the basic versions.**
**[2023/05/10] [Release v3.1](https://github.com/ymcui/Chinese-LLaMA-Alpaca/releases/tag/v3.1): LLaMA/Alpaca Plus 13B versions are available, with more training data than the basic versions.**

[2023/04/28] [Release v3.0](https://github.com/ymcui/Chinese-LLaMA-Alpaca/releases/tag/v3.0): LLaMA/Alpaca Plus versions are available, with more training data than the basic versions.

[2023/04/18] Release v2.2: Added LlamaChat support (macOS UI) and tokenizer merging scripts; documentation has been migrated to the GitHub Wiki. Refer to [Release Note](https://github.com/ymcui/Chinese-LLaMA-Alpaca/releases/tag/v2.2)

@@ -104,31 +106,35 @@ The Chinese LLaMA model has expanded the Chinese vocabulary on the basis of the
| Chinese-LLaMA-7B | general 20G | LLaMA-7B | 770M | [[BaiduDisk]](https://pan.baidu.com/s/1oORTdpr2TvlkxjpyWtb5Sw?pwd=33hb)<br/>[[Google Drive]](https://drive.google.com/file/d/1iQp9T-BHjBjIrFWXq_kIm_cyNmpvv5WN/view?usp=sharing) |
| Chinese-LLaMA-Plus-7B ⭐️ | general 120G | LLaMA-7B | 790M | [[BaiduDisk]](https://pan.baidu.com/s/1zvyX9FN-WSRDdrtMARxxfw?pwd=2gtr)<br/>[[Google Drive]](https://drive.google.com/file/d/1N97m3rBj-rp-J1X8rgRfluyomEscfAq0/view?usp=sharing) |
| Chinese-LLaMA-13B | general 20G | LLaMA-13B | 1G | [[BaiduDisk]](https://pan.baidu.com/s/1BxFhYhDMipW7LwI58cGmQQ?pwd=ef3t)<br/>[[Google Drive]](https://drive.google.com/file/d/12q9EH4mfKRnoKlbkkhzv1xDwWnroo9VS/view?usp=sharing) |
| Chinese-LLaMA-Plus-13B ⭐️ | general 120G | LLaMA-13B | 1G | [[BaiduDisk]](https://pan.baidu.com/s/1VGpNlrLx5zHuNzLOcTG-xw?pwd=8cvd)<br/>[[Google Drive]](https://drive.google.com/file/d/1q0L5Me_1j_9iiRRNfuEFUt3SOjQo3-g3/view?usp=share_link) |

### Chinese Alpaca

The Chinese Alpaca model is further fine-tuned with instruction data on top of the above Chinese LLaMA model. For details, see the [Training Details](#Training-Details) section.

**⚠️ Please use the Alpaca models if you want to try a ChatGPT-like model.**

| Model | Type | Required Original Model<sup>[1]</sup> | Size<sup>[2]</sup> | Download Links<sup>[3]</sup> |
| :----------------------- | :------------: | :------------------------------------: | :----------------: | :----------------------------------------------------------: |
| Chinese-Alpaca-7B | Instruction 2M | LLaMA-7B | 790M | [[BaiduDisk]](https://pan.baidu.com/s/1xV1UXjh1EPrPtXg6WyG7XQ?pwd=923e)<br/>[[Google Drive]](https://drive.google.com/file/d/1JvFhBpekYiueWiUL3AF1TtaWDb3clY5D/view?usp=sharing) |
| Chinese-Alpaca-Plus-7B ⭐️ | Instruction 4M | *LLaMA-7B &<br/>Chinese-LLaMA-Plus-7B* | 1.1G | [[BaiduDisk]](https://pan.baidu.com/s/12tjjxmDWwLBM8Tj_7FAjHg?pwd=32hc)<br/>[[Google Drive]](https://drive.google.com/file/d/1EDcTmq6tDmRxqarpapdyDGBE9opY0zrB/view?usp=share_link) |
| Chinese-Alpaca-13B | Instruction 3M | LLaMA-13B | 1.1G | [[BaiduDisk]](https://pan.baidu.com/s/1wYoSF58SnU9k0Lndd5VEYg?pwd=mm8i)<br/>[[Google Drive]](https://drive.google.com/file/d/1gzMc0xMCpXsXmU1uxFlgQ8VRnWNtDjD8/view?usp=share_link) |
| Model | Type | Required Original Model<sup>[1]</sup> | Size<sup>[2]</sup> | Download Links<sup>[3]</sup> |
| :------------------------ | :--------------: | :--------------------------------------------------: | :----------------: | :----------------------------------------------------------: |
| Chinese-Alpaca-7B | Instruction 2M | LLaMA-7B | 790M | [[BaiduDisk]](https://pan.baidu.com/s/1xV1UXjh1EPrPtXg6WyG7XQ?pwd=923e)<br/>[[Google Drive]](https://drive.google.com/file/d/1JvFhBpekYiueWiUL3AF1TtaWDb3clY5D/view?usp=sharing) |
| Chinese-Alpaca-Plus-7B ⭐️ | Instruction 4M | *LLaMA-7B &<br/>Chinese-LLaMA-Plus-7B*<sup>[4]</sup> | 1.1G | [[BaiduDisk]](https://pan.baidu.com/s/12tjjxmDWwLBM8Tj_7FAjHg?pwd=32hc)<br/>[[Google Drive]](https://drive.google.com/file/d/1EDcTmq6tDmRxqarpapdyDGBE9opY0zrB/view?usp=share_link) |
| Chinese-Alpaca-13B | Instruction 3M | LLaMA-13B | 1.1G | [[BaiduDisk]](https://pan.baidu.com/s/1wYoSF58SnU9k0Lndd5VEYg?pwd=mm8i)<br/>[[Google Drive]](https://drive.google.com/file/d/1gzMc0xMCpXsXmU1uxFlgQ8VRnWNtDjD8/view?usp=share_link) |
| Chinese-Alpaca-Plus-13B ⭐️ | Instruction 4.3M | *LLaMA-13B &<br/>Chinese-LLaMA-Plus-13B<sup>[4]</sup>* | 1.3G | [[BaiduDisk]](https://pan.baidu.com/s/1Mew4EjBlejWBBB6_WW6vig?pwd=mf5w)<br/> [[Google Drive]](https://drive.google.com/file/d/1CcLJvY7XsFAOjfSIqCpDI7jf3EEPDcEF/view?usp=share_link) |

### Model Hub

You can download all of the above models from the 🤗Model Hub and use [🤗transformers](https://github.com/huggingface/transformers) and [🤗PEFT](https://github.com/huggingface/peft) to load the Chinese LLaMA or Alpaca LoRA models; a minimal loading sketch follows the table below.

| Model | MODEL_NAME | Link |
| ------------------ | :--------------------------------: | :----------------------------------------------------------: |
| Chinese-LLaMA-7B | ziqingyang/chinese-llama-lora-7b | [Model Hub Link](https://huggingface.co/ziqingyang/chinese-llama-lora-7b) |
| Chinese-LLaMA-Plus-7B | ziqingyang/chinese-llama-plus-lora-7b | [Model Hub Link](https://huggingface.co/ziqingyang/chinese-llama-plus-lora-7b) |
| Chinese-LLaMA-13B | ziqingyang/chinese-llama-lora-13b | [Model Hub Link](https://huggingface.co/ziqingyang/chinese-llama-lora-13b) |
| Chinese-Alpaca-7B | ziqingyang/chinese-alpaca-lora-7b | [Model Hub Link](https://huggingface.co/ziqingyang/chinese-alpaca-lora-7b) |
| Chinese-Alpaca-Plus-7B | ziqingyang/chinese-alpaca-plus-lora-7b | [Model Hub Link](https://huggingface.co/ziqingyang/chinese-alpaca-plus-lora-7b) |
| Chinese-Alpaca-13B | ziqingyang/chinese-alpaca-lora-13b | [Model Hub Link](https://huggingface.co/ziqingyang/chinese-alpaca-lora-13b) |
| ------------------ | :--------------------------------- | :----------------------------------------------------------: |
| Chinese-LLaMA-7B | ziqingyang/chinese-llama-lora-7b | [Model Hub Link](https://huggingface.co/ziqingyang/chinese-llama-lora-7b) |
| Chinese-LLaMA-Plus-7B | ziqingyang/chinese-llama-plus-lora-7b | [Model Hub Link](https://huggingface.co/ziqingyang/chinese-llama-plus-lora-7b) |
| Chinese-LLaMA-13B | ziqingyang/chinese-llama-lora-13b | [Model Hub Link](https://huggingface.co/ziqingyang/chinese-llama-lora-13b) |
| Chinese-LLaMA-Plus-13B | ziqingyang/chinese-llama-plus-lora-13b | [Model Hub Link](https://huggingface.co/ziqingyang/chinese-llama-plus-lora-13b) |
| Chinese-Alpaca-7B | ziqingyang/chinese-alpaca-lora-7b | [Model Hub Link](https://huggingface.co/ziqingyang/chinese-alpaca-lora-7b) |
| Chinese-Alpaca-Plus-7B | ziqingyang/chinese-alpaca-plus-lora-7b | [Model Hub Link](https://huggingface.co/ziqingyang/chinese-alpaca-plus-lora-7b) |
| Chinese-Alpaca-13B | ziqingyang/chinese-alpaca-lora-13b | [Model Hub Link](https://huggingface.co/ziqingyang/chinese-alpaca-lora-13b) |
| Chinese-Alpaca-Plus-13B | ziqingyang/chinese-alpaca-plus-lora-13b | [Model Hub Link](https://huggingface.co/ziqingyang/chinese-alpaca-plus-lora-13b) |
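
As a quick illustration, here is a minimal loading sketch, assuming an original LLaMA-7B checkpoint already converted to Hugging Face format (the local path is a placeholder; the model name comes from the table above, and the snippet assumes the LoRA repo ships its tokenizer files):

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

base_model_path = "path/to/llama-7b-hf"          # placeholder: LLaMA-7B in HF format
lora_name = "ziqingyang/chinese-alpaca-lora-7b"  # from the table above

# The LoRA repo ships the tokenizer with the expanded Chinese vocabulary.
tokenizer = LlamaTokenizer.from_pretrained(lora_name)
base = LlamaForCausalLM.from_pretrained(base_model_path, torch_dtype=torch.float16)
base.resize_token_embeddings(len(tokenizer))     # match the expanded vocabulary

# Apply the LoRA weights on top of the resized base model.
model = PeftModel.from_pretrained(base, lora_name)
model.eval()

inputs = tokenizer("你好,请介绍一下你自己。", return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```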


### Footnote and Others
@@ -139,6 +145,8 @@ You can download all the above models in 🤗Model Hub, and use [🤗transformer

**[3]** After downloading, be sure to check that the SHA256 of the ZIP file matches the published value; the full values are listed in [SHA256.md](./SHA256.md).

**[4]** The merging steps for Alpaca-Plus differ from those of the other models; please refer to the [wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/Manual-Conversion#multiple-lora-weights-merging-applicable-to-chinese-alpaca-plus). A conceptual sketch of the two-step merge follows.
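
The sketch below is a hypothetical PEFT-based outline of that two-step merge, not the project's actual script (the wiki page linked above is the reference procedure); the base-model path and output directory are placeholders:

```python
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer
from peft import PeftModel

def apply_lora(model, lora_repo):
    # Each LoRA repo ships its own tokenizer; resize the embeddings to its
    # vocabulary before loading, then fold the LoRA into the base weights.
    tok = LlamaTokenizer.from_pretrained(lora_repo)
    model.resize_token_embeddings(len(tok))
    merged = PeftModel.from_pretrained(model, lora_repo).merge_and_unload()
    return merged, tok

# Placeholder: original LLaMA-7B already converted to HF format.
model = LlamaForCausalLM.from_pretrained("path/to/llama-7b-hf", torch_dtype=torch.float16)
model, _ = apply_lora(model, "ziqingyang/chinese-llama-plus-lora-7b")     # step 1: pre-training LoRA
model, tok = apply_lora(model, "ziqingyang/chinese-alpaca-plus-lora-7b")  # step 2: instruction LoRA

model.save_pretrained("chinese-alpaca-plus-7b-merged")
tok.save_pretrained("chinese-alpaca-plus-7b-merged")
```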

The file directory inside the ZIP file is as follows (using Chinese-LLaMA as an example):

```
...
```

@@ -185,23 +193,22 @@ Related documentation has been moved to the project's >>> [📚GitHub Wiki](http

## System Performance

To quickly evaluate the actual performance of the models, this project compared Chinese Alpaca-7B, Alpaca-13B, and Alpaca-Plus-7B on some common tasks given the same prompts. Reply generation is random and is affected by factors such as decoding hyperparameters and random seeds. The following evaluations are not absolutely rigorous and the results are for reference only; you are welcome to try the models yourself. For detailed evaluation results, please see [examples/README.md](./examples/README.md)

| Task | Samples | # | Alpaca-7B | Alpaca-13B | Alpaca-Plus-7B |
| ------------------------------ | :---------------------------------------------: | :--: | :-------: | :--------: | :------------: |
| **💯 Overall** | - | 200 | 65.3 | 70.9 | **👍🏻75.3** |
| Question Answering | [QA.md](./examples/QA.md) | 20 | 66 | 74 | **👍🏻80** |
| Open QA | [OQA.md](./examples/OQA.md) | 20 | **👍🏻79** | 74 | **👍🏻78** |
| Computation, Reasoning | [REASONING.md](./examples/REASONING.md) | 20 | 31 | **👍🏻50** | 45 |
| Poetry, Literature, Philosophy | [LITERATURE.md](./examples/LITERATURE.md) | 20 | 68 | 73 | **👍🏻76** |
| Music, Sports, Entertainment | [ENTERTAINMENT.md](./examples/ENTERTAINMENT.md) | 20 | 68 | 74 | **👍🏻79** |
| Letters and Articles | [GENERATION.md](./examples/GENERATION.md) | 20 | 76 | **👍🏻81** | **👍🏻81** |
| Translation | [TRANSLATION.md](./examples/TRANSLATION.md) | 20 | 76 | 78 | **👍🏻82** |
| Multi-turn Dialogue | [DIALOGUE.md](./examples/DIALOGUE.md) | 20 | **👍🏻83** | 73 | **👍🏻84** |
| Coding | [CODE.md](./examples/CODE.md) | 20 | 57 | **👍🏻64** | 59 |
| Ethics | [ETHICS.md](./examples/ETHICS.md) | 20 | 49 | 68 | **👍🏻89** |

*Note: for results on **4-bit quantized models**, please refer to [./examples-q4/README.md](./examples-q4/README.md).*
To quickly evaluate the actual performance of the models, this project compared Chinese Alpaca-7B, Alpaca-13B, Alpaca-Plus-7B, and Alpaca-Plus-13B on some common tasks given the same prompts. Reply generation is random and is affected by factors such as decoding hyperparameters and random seeds. The following evaluations are not absolutely rigorous and the results are for reference only; you are welcome to try the models yourself. For detailed evaluation results, please see [examples](./examples).

| Task | Samples | Alpaca-13B | Alpaca-Plus-7B | Alpaca-Plus-13B |
| ---------------- | :----: | :--------: | :------------: | :-------------: |
| **💯 Overall** | 200 | 74.3 | 78.2 | **👍🏻80.8** |
| Question Answering | 20 | 70 | 74 | **👍🏻79** |
| Open QA | 20 | 77 | 77 | 77 |
| Computation, Reasoning | 20 | 61 | 61 | 60 |
| Poetry, Literature, Philosophy | 20 | 65 | **👍🏻76** | **👍🏻76** |
| Music, Sports, Entertainment | 20 | 68 | 73 | **👍🏻80** |
| Letters and Articles | 20 | 83 | 82 | **👍🏻87** |
| Translation | 20 | 84 | 87 | **👍🏻90** |
| Multi-turn Dialogue | 20 | 88 | 89 | 89 |
| Coding | 20 | 65 | 64 | **👍🏻70** |
| Ethics | 20 | 82 | **👍🏻99** | **👍🏻100** |


## Training Details

2 changes: 2 additions & 0 deletions SHA256.md
@@ -40,9 +40,11 @@ The following are SHA256 values for `adapter_model.bin` files.
| Chinese-LLaMA-7B | 2a2c24d096f5d509f24946fdbd8c25e1ce4a0acb955902f7436d74c0c0379d86 |
| Chinese-LLaMA-Plus-7B | 8c928db86b2a0cf73f019832f921eb7e1e069ca21441b4bfa12c4381c6cc46be |
| Chinese-LLaMA-13B | 6a4ce789d219bde122f8d9a20371937f2aa2ee86a2311d9f5e303df2e774f9fc |
| Chinese-LLaMA-Plus-13B | 784fcff9c4bdf4e77d442a01158e121caf8fcce0f97ffb32396fe7a3617ee7e8 |
| Chinese-Alpaca-7B | 0d9b6ed8e4a7d1ae590a16c89a452a488d66ff07e45487972f61c2b6e46e36de |
| Chinese-Alpaca-Plus-7B | 4ee0bf805c312a9a771624d481fbdb4485e1b0a70cd2a8da9f96137f177b795d |
| Chinese-Alpaca-13B | cb8dda3c005f3343a0740dcd7237fbb600cb14b6bff9b6f3d488c086a2f08ada |
| Chinese-Alpaca-Plus-13B | |
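
To check a downloaded `adapter_model.bin` (or a ZIP, per footnote [3] of the README) against these values, a small stdlib helper along the following lines can be used; this is an illustration, not part of the repo:

```python
import hashlib
import sys

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks so large model files fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

if __name__ == "__main__":
    # Usage: python check_sha256.py adapter_model.bin
    print(sha256_of(sys.argv[1]))  # compare against the table above
```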


### Merged files (consolidated.*.pth)
66 changes: 14 additions & 52 deletions examples/README.md
@@ -1,59 +1,21 @@
# Chinese Alpaca Evaluation
# Performance Comparison

To quickly evaluate the actual performance of the models, this project compared Chinese Alpaca-7B, Chinese Alpaca-13B, and Chinese Alpaca-Plus-7B on some common tasks given the same prompts. Generated replies are random and are affected by decoding hyperparameters, random seeds, and other factors. The evaluations below are not absolutely rigorous and the results are for reference only; you are welcome to try the models yourself.
The scores below should be treated as paired scores: each score is a relative value obtained by comparing multiple systems against one another, not an absolute value. See the corresponding directories for detailed results.

⚠️ *All results below are based on the **8-bit quantized models**, whose quality is close to FP16; see the [quantization method](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/llama.cpp量化部署#关于量化参数上述命令中的最后一个参数) for details.*
### q4_7b-13b: comparison of the 4-bit quantized Alpaca-7B and 13B

| Task | Samples | # Samples | Chinese Alpaca-7B | Chinese Alpaca-13B | Chinese Alpaca-Plus-7B |
| ---------------- | :------------------------------------: | :----: | :-----------: | :------------: | :----------------: |
| **💯 Overall** | - | 200 | 65.3 | 70.9 | **👍🏻75.3** |
| Question Answering | [QA.md](./QA.md) | 20 | 66 | 74 | **👍🏻80** |
| Open QA | [OQA.md](./OQA.md) | 20 | **👍🏻79** | 74 | **👍🏻78** |
| Computation & Reasoning | [REASONING.md](./REASONING.md) | 20 | 31 | **👍🏻50** | 45 |
| Poetry, Literature, Philosophy | [LITERATURE.md](./LITERATURE.md) | 20 | 68 | 73 | **👍🏻76** |
| Music, Sports, Entertainment | [ENTERTAINMENT.md](./ENTERTAINMENT.md) | 20 | 68 | 74 | **👍🏻79** |
| Letters and Articles | [GENERATION.md](./GENERATION.md) | 20 | 76 | **👍🏻81** | **👍🏻81** |
| Translation | [TRANSLATION.md](./TRANSLATION.md) | 20 | 76 | 78 | **👍🏻82** |
| Multi-turn Dialogue | [DIALOGUE.md](./DIALOGUE.md) | 20 | **👍🏻83** | 73 | **👍🏻84** |
| Coding | [CODE.md](./CODE.md) | 20 | 57 | **👍🏻64** | 59 |
| Ethics | [ETHICS.md](./ETHICS.md) | 20 | 49 | 68 | **👍🏻89** |

| | # Samples | Chinese Alpaca-7B | Chinese Alpaca-13B |
| ------------- | :----: | :-----------: | :------------: |
| **💯 Overall** | 160 | **49** | **👍🏻71** |

Notes:
### q8_7b-13b-p7b: comparison of the 8-bit quantized Alpaca-7B, 13B, and Plus-7B

- The scores above should be treated as paired scores: each is a relative value obtained by comparing multiple systems against one another, not an absolute value.
- Accordingly, the relative ordering of the scores carries some meaning, while their absolute values do not.
- Except for the multi-turn task, all tasks are scored on single-turn replies (without any dialogue history).
- Each sample is run 2-3 times, and the best response is manually selected and handed to [machine scoring](#scoring-method) to reduce bias from randomness.

| | # Samples | Chinese Alpaca-7B | Chinese Alpaca-13B | Chinese Alpaca-Plus-7B |
| ------------- | :----: | :-----------: | :------------: | :----------------: |
| **💯 Overall** | 200 | 65.3 | 70.9 | **👍🏻75.3** |

#### Run Parameters
### q8_13b-p7b-p13b: comparison of the 8-bit quantized Alpaca-13B, Plus-7B, and Plus-13B

Unified decoding parameters were used during testing:
```bash
# llama.cpp flags: -b batch size, -c context length, -n max new tokens, -t CPU threads;
# sampling is controlled by --temp / --top_k / --top_p / --repeat_penalty.
./main -m zh-alpaca-models/{7B,13B,7B-Plus}/ggml-model-q8_0.bin --color -f ./prompts/alpaca.txt -ins \
-b 16 -c 2048 -n 512 -t 6 \
--temp 0.2 --top_k 40 --top_p 0.9 \
--repeat_penalty 1.1
```

*Note: these settings may not suit every task. In actual use, you can raise temp appropriately for free-form generation tasks such as dialogue and writing.*

#### Scoring Method

- There are 10 task groups in total, each worth up to 100 points; each group contains 20 samples, each worth up to 10 points.
- The sum of the sample scores is normalized to a 100-point scale as the model's score on that task (a small worked example follows at the end of this section).
- GPT-4 and ChatGPT (GPT-3.5) rate the outputs of the two systems on a 10-point scale, using the following template:

```
The following are ChatGPT-like systems' outputs based on a single prompt. Please rate an overall score on a ten-point scale for each system and give a short explanation to justify your scores. Please try not to give the same scores for different systems unless they are indistinguishable.
Prompt:
<prompt-input>
System1:
<system1-output>
System2:
<system2-output>
```

*Note: GPT-4 is used for scoring whenever possible. Due to GPT-4's interaction limits, part of the scoring is done with ChatGPT (gpt-3.5-turbo).*
| | # Samples | Alpaca-13B | Alpaca-Plus-7B | Alpaca-Plus-13B |
| ------------- | :----: | :--------: | :------------: | :-------------: |
| **💯 Overall** | 200 | 74.3 | 78.2 | **👍🏻80.8** |
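
As a worked example of the normalization described above (the ratings are made up, and `task_score` is a hypothetical helper, not the project's actual tooling):

```python
def task_score(sample_scores):
    # 20 samples per task, each rated 0-10 (e.g. by GPT-4); the sum is
    # normalized to a 100-point scale to give the task score.
    assert all(0 <= s <= 10 for s in sample_scores)
    return sum(sample_scores) / (10 * len(sample_scores)) * 100

ratings = [7, 8, 6, 9, 7, 8, 7, 6, 8, 7, 9, 8, 7, 6, 8, 7, 8, 9, 7, 8]  # hypothetical
print(task_score(ratings))  # 75.0
```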
11 files renamed without changes.
