Merge pull request #659 from ymcui/doc_fix
Miscellaneous doc fix
ymcui committed Jun 25, 2023
2 parents 7745723 + 3e2ff96 commit fb9e149
Showing 2 changed files with 9 additions and 4 deletions.
5 changes: 4 additions & 1 deletion README.md
@@ -10,7 +10,7 @@
<img alt="GitHub release (latest by date)" src="https://img.shields.io/github/v/release/ymcui/Chinese-LLaMA-Alpaca">
<img alt="GitHub top language" src="https://img.shields.io/github/languages/top/ymcui/Chinese-LLaMA-Alpaca">
<img alt="GitHub last commit" src="https://img.shields.io/github/last-commit/ymcui/Chinese-LLaMA-Alpaca">
<a href="https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki"><img alt="GitHub wiki" src="https://img.shields.io/badge/Github%20Wiki-v4.0-green"></a>
<a href="https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki"><img alt="GitHub wiki" src="https://img.shields.io/badge/Github%20Wiki-v4.1-green"></a>
</p>


@@ -245,6 +245,8 @@ chinese_llama_lora_7b/

It should be noted that comprehensively evaluating the capabilities of large models remains an important open problem. Viewing the various evaluation results of large models rationally and critically helps the healthy development of large-model technology. Users are advised to test the models on the tasks they care about and choose the ones best suited to those tasks.

For the C-Eval inference code, please refer to this project's >>> [📚 GitHub Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/C-Eval评测结果与脚本)

## Training Details

The entire training process consists of three parts: vocabulary expansion, pre-training, and instruction fine-tuning.
@@ -273,6 +275,7 @@ The FAQ answers common questions; please be sure to check it before opening an Issue.
Question 8: Chinese-Alpaca-Plus performs poorly
Question 9: The model does not perform well on NLU tasks (e.g., text classification)
Question 10: Why is it called 33B? Shouldn't it be 30B?
Question 11: SHA256 mismatch after model merging
```

For the specific questions and answers, please refer to this project's >>> [📚 GitHub Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/常见问题)
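
Question 11 above concerns SHA256 mismatches after model merging. As a generic illustration (not the project's own tooling; the checkpoint file name below is a placeholder), a checksum can be computed locally and compared by hand against the values published by the project:

```python
# Minimal sketch: compute the SHA256 of a merged checkpoint so it can be
# compared against the published value. The file name is a placeholder.
import hashlib

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

print(sha256_of("consolidated.00.pth"))
```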
8 changes: 5 additions & 3 deletions README_EN.md
@@ -10,12 +10,11 @@
<img alt="GitHub release (latest by date)" src="https://img.shields.io/github/v/release/ymcui/Chinese-LLaMA-Alpaca">
<img alt="GitHub top language" src="https://img.shields.io/github/languages/top/ymcui/Chinese-LLaMA-Alpaca">
<img alt="GitHub last commit" src="https://img.shields.io/github/last-commit/ymcui/Chinese-LLaMA-Alpaca">
<a href="https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki"><img alt="GitHub wiki" src="https://img.shields.io/badge/Github%20Wiki-v4.0-green"></a>
<a href="https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki"><img alt="GitHub wiki" src="https://img.shields.io/badge/Github%20Wiki-v4.1-green"></a>
</p>




Large Language Models (LLMs), represented by ChatGPT and GPT-4, have sparked a new wave of research in natural language processing, demonstrating capabilities approaching Artificial General Intelligence (AGI) and attracting widespread attention from industry. However, the expensive training and deployment of large language models pose obstacles to transparent and open academic research.

To promote open research of large models in the Chinese NLP community, this project has open-sourced the **Chinese LLaMA model and the instruction-fine-tuned Alpaca large model**. These models extend the original LLaMA with an expanded Chinese vocabulary and are further pre-trained on Chinese data, enhancing basic Chinese semantic understanding. In addition, the project fine-tunes the Chinese LLaMA with Chinese instruction data, significantly improving the model's ability to understand and follow instructions. Please refer to our technical report [(Cui, Yang, and Yao, 2023)](https://arxiv.org/abs/2304.08177) for further details.
@@ -25,7 +24,7 @@ To promote open research of large models in the Chinese NLP community, this proj
- 🚀 Extended Chinese vocabulary on top of the original LLaMA, significantly improving encoding/decoding efficiency
- 🚀 Open-sourced the Chinese LLaMA (general purpose) and Alpaca (instruction-tuned)
- 🚀 Open-sourced the pre-training and instruction fine-tuning (SFT) scripts for further tuning on users' own data
- 🚀 Quickly deploy and experience the quantized version of the large model on CPU/GPU of your laptop (personal PC)
- 🚀 Quickly deploy and experience the quantized version of the large model on CPU/GPU of your laptop (personal PC)
- 🚀 Support [🤗transformers](https://github.com/huggingface/transformers), [llama.cpp](https://github.com/ggerganov/llama.cpp), [text-generation-webui](https://github.com/oobabooga/text-generation-webui), [LlamaChat](https://github.com/alexrozanski/LlamaChat), [LangChain](https://github.com/hwchase17/langchain), [privateGPT](https://github.com/imartinez/privateGPT), etc. (see the minimal 🤗transformers loading sketch after this list)
- Released versions: 7B (basic, **Plus**), 13B (basic, **Plus**), 33B (basic)
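
As a quick illustration of the 🤗transformers route above, the following is a minimal loading sketch, not the project's official inference script. It assumes the LoRA weights have already been merged into full HF-format weights as described in the wiki; the local path is a placeholder, and `device_map="auto"` additionally requires the `accelerate` package. In practice the Alpaca prompt template from the project's scripts should wrap the instruction.

```python
# Minimal sketch; assumes merged HF-format weights at a placeholder path.
from transformers import LlamaForCausalLM, LlamaTokenizer

model_dir = "path/to/chinese-alpaca-7b-merged"  # placeholder
tokenizer = LlamaTokenizer.from_pretrained(model_dir)
model = LlamaForCausalLM.from_pretrained(model_dir, device_map="auto")

prompt = "请列举三个北京的著名景点。"  # example Chinese prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```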

@@ -250,6 +249,8 @@ This project also conducted tests on relevant models using the "NLU" objective e

It is important to note that comprehensively assessing the capabilities of large models remains an urgent and significant open problem. Approaching the various evaluation results of large models in a rational and balanced manner helps promote the healthy development of large-model technology. Users are advised to test the models on their own tasks and choose the ones best suited to those tasks.

For C-Eval inference code, please refer to >>> [📚GitHub Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/C-Eval-performance-and-script).

## Training Details

The entire training process includes three parts: vocabulary expansion, pre-training, and instruction fine-tuning. Please refer to [merge_tokenizers.py](scripts/merge_tokenizer/merge_tokenizers.py) for vocabulary expansion; refer to [run_clm.py](https://github.com/huggingface/transformers/blob/main/examples/pytorch/language-modeling/run_clm.py) in 🤗transformers and the relevant parts of dataset processing in the [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca) project for pre-training and self-instruct fine-tuning.
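
As a rough sketch of the vocabulary-expansion step described above (a simplified illustration with placeholder paths, not the project's actual merge_tokenizers.py), additional Chinese SentencePiece pieces are appended to the original LLaMA tokenizer model:

```python
# Rough sketch: merge an extra Chinese SentencePiece vocabulary into the
# original LLaMA tokenizer model. Paths are placeholders; see
# merge_tokenizers.py in this repository for the real implementation.
import sentencepiece.sentencepiece_model_pb2 as sp_pb2

llama_proto = sp_pb2.ModelProto()
with open("llama/tokenizer.model", "rb") as f:      # original LLaMA SP model
    llama_proto.ParseFromString(f.read())

chinese_proto = sp_pb2.ModelProto()
with open("chinese_sp.model", "rb") as f:            # extra Chinese SP model
    chinese_proto.ParseFromString(f.read())

existing = {p.piece for p in llama_proto.pieces}
for p in chinese_proto.pieces:
    if p.piece not in existing:                      # append only unseen pieces
        new_piece = sp_pb2.ModelProto.SentencePiece()
        new_piece.piece, new_piece.score = p.piece, 0.0
        llama_proto.pieces.append(new_piece)

with open("merged_tokenizer.model", "wb") as f:
    f.write(llama_proto.SerializeToString())
```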
@@ -278,6 +279,7 @@ Q7: Chinese-LLaMA 13B model cannot be launched with llama.cpp, reporting inconsi
Q8: Chinese-Alpaca-Plus does not show better performance than the others.
Q9: The model does not perform well on NLU tasks, such as text classification.
Q10: Why 33B not 30B?
Q11: Inconsistent SHA256
```

Please refer to our >>> [📚GitHub Wiki](https://github.com/ymcui/Chinese-LLaMA-Alpaca/wiki/FAQ).
