
MLE-LLaMA: Multi-Language Enhanced LLaMA

This project aims to make LLaMA understand Chinese and generate fluent Chinese. We are motivated by the observation that LLaMA has already learned strong English expression, and that a small amount of alignment data can make it capture Chinese.

  • Token vocabulary support for multiple languages. We found that the LLaMA tokenizer naturally supports Chinese.
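One reason the LLaMA tokenizer handles Chinese out of the box is SentencePiece's UTF-8 byte fallback: characters not covered by a vocabulary entry are split into individual byte tokens. The sketch below illustrates the byte-fallback representation in pure Python; the real tokenizer (transformers' LlamaTokenizer) mixes learned subword pieces with these byte tokens.

```python
# Illustration only: LLaMA's SentencePiece tokenizer represents characters
# outside its learned vocabulary as raw UTF-8 bytes, one token per byte,
# written like <0xE4>. This shows the byte fallback for a Chinese character.
def byte_fallback_tokens(text: str) -> list[str]:
    """Render each UTF-8 byte of `text` as a SentencePiece-style byte token."""
    return [f"<0x{b:02X}>" for b in text.encode("utf-8")]

print(byte_fallback_tokens("中"))  # one 3-byte UTF-8 character -> 3 byte tokens
```

Because every possible byte has a token, no Chinese input is ever out of vocabulary, although byte-level pieces are less efficient than dedicated Chinese subword entries.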

  • Fine-tuning scripts for LLaMA.

    (1) Download the original checkpoints from Hugging Face and put them into the file path ckpt.

    (2) train.py: the original full fine-tuning script; it must be run on an 80 GB A100, and additional memory-saving techniques should be employed.

    (3) train_lora.py: LoRA fine-tuning using peft.

    Argument        Value
    batch size      128 * 8
    epochs          3
    cut length      256
    learning rate   2e-5
    speed           1.02 s/it
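To make the LoRA option concrete, here is a minimal sketch of the low-rank-update idea behind train_lora.py, in pure Python so the arithmetic is easy to follow. The real script would use the peft library (LoraConfig / get_peft_model); the matrix shapes, rank, and alpha below are illustrative assumptions, not the repository's actual values.

```python
# LoRA in one function: instead of updating the full weight matrix W,
# train a low-rank update (x @ down) @ up, scaled by alpha / r, and add
# it to the frozen base output x @ W. `down` maps d_in -> r, `up` maps
# r -> d_out, so only r * (d_in + d_out) parameters are trainable.
def matmul(A, B):
    """Naive matrix multiply over lists of lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def lora_forward(x, W, down, up, alpha, r):
    """y = x @ W + (alpha / r) * (x @ down) @ up, without merging weights."""
    base = matmul(x, W)                     # frozen pretrained path
    low_rank = matmul(matmul(x, down), up)  # trainable low-rank path
    scale = alpha / r
    return [[b + scale * l for b, l in zip(brow, lrow)]
            for brow, lrow in zip(base, low_rank)]

# Tiny example: 2-d input, rank-1 update, alpha=2.
y = lora_forward(
    x=[[1.0, 2.0]],
    W=[[1.0, 0.0], [0.0, 1.0]],   # identity base weight
    down=[[1.0], [0.0]],          # d_in=2 -> r=1
    up=[[0.0, 1.0]],              # r=1 -> d_out=2
    alpha=2.0, r=1,
)
print(y)  # -> [[1.0, 4.0]]
```

At rank r much smaller than the hidden size, this is why LoRA fits on far less GPU memory than the full fine-tuning path in train.py.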
  • Fine-grained English-Chinese alignment dataset. We collected high-quality English-Chinese pairs, which can be downloaded from Google Drive.

    We also found that BELLE provides checkpoints and a Chinese dataset; we strongly recommend referring to it.

  • Instruction tuning. We use Chinese alpaca and the GuanacoDataset for instruction tuning.
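Instruction-tuning datasets in the alpaca family store records as instruction / input / output fields that get rendered into a fixed prompt template before training. The sketch below shows the stanford_alpaca-style template; the exact template and field names used by this repo's scripts are an assumption here.

```python
# Alpaca-style prompt template (assumed; see the stanford_alpaca repo).
# Each training string is the rendered prompt followed by the target output.
PROMPT_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input}\n\n"
    "### Response:\n"
)

def build_example(record: dict) -> str:
    """Format one instruction-tuning record into a full training string."""
    return PROMPT_WITH_INPUT.format(**record) + record["output"]

print(build_example({
    "instruction": "Translate to Chinese.",
    "input": "Hello",
    "output": "你好",
}))
```

At inference time the same template is rendered without the output, and the model's continuation after "### Response:" is taken as the answer.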

  • Open-source checkpoints, gradio scripts, and cases. We found that the LLaMA model tends to generate long sentences.

[Example generation cases: case11, case12, case13 (images in the repository)]

Reference

[1] https://github.com/facebookresearch/llama

[2] https://github.com/tatsu-lab/stanford_alpaca

[3] https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling

[4] https://github.com/tloen/alpaca-lora

[5] https://github.com/LianjiaTech/BELLE
