Hackable implementation of state-of-the-art open-source Chinese large language models, released under the Apache 2.0 license.
Supports the following popular model checkpoints (along with all the English models supported by the official Lit-GPT):
| Model and usage | Model size | Reference |
|---|---|---|
| Yi-01 | 6B-Chat, 34B-Chat | Yi |
| Baichuan 2 | 7B-Chat/Base, 13B-Chat/Base | Baichuan 2 |
| ChatGLM3 | 6B, 6B-Base, 6B-32k | ChatGLM3 |
| ChatGLM2 | 6B | ChatGLM2-6B |
This implementation builds on Lit-LLaMA and nanoGPT, and is powered by Lightning Fabric ⚡.
This repository follows the main principle of openness through clarity.
Lit-GPT is:
- Simple: Single-file implementation without boilerplate.
- Correct: Numerically equivalent to the original model.
- Optimized: Runs fast on consumer hardware or at scale.
- Open-source: No strings attached.
Avoiding code duplication is not a goal. Readability and hackability are.
Clone the repo:
git clone https://github.com/metame-none/lit-gpt-chinese
cd lit-gpt-chinese
Install the minimal dependencies:
pip install -r requirements.txt
Or install all dependencies (including quantization, sentencepiece, tokenizers for Llama models, etc.):
pip install -r requirements-all.txt
(Optional) Use Flash Attention 2
Flash Attention 2 will be used automatically if PyTorch 2.2 (or higher) is installed. Currently, that requires installing PyTorch nightly, which you can get by running:
pip uninstall -y torch torchvision torchaudio torchtext
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu121
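To confirm that Flash Attention 2 can be used, a quick sanity check (a minimal sketch; it only assumes PyTorch is installed) is:

```python
# Check that PyTorch is >= 2.2 and that the flash scaled_dot_product_attention
# backend is enabled on this machine.
import torch

print(torch.__version__)  # expect a 2.2+ (nightly) version string
if torch.cuda.is_available():
    print(torch.backends.cuda.flash_sdp_enabled())  # True -> flash kernels can be used
```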
You are all set! 🎉
Take ChatGLM3-6B as an example:
- Download the repo and checkpoints (manually or using `git lfs`), where `$path` below is a local directory of your choice:
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/THUDM/chatglm3-6b $path
- Convert the checkpoint to the Lit-GPT format:
ln -snf $path checkpoints/chatglm/chatglm3-6b-hf
python scripts/convert_hf_checkpoint.py --checkpoint_dir ./checkpoints/chatglm/chatglm3-6b-hf
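Conceptually, this step remaps the Hugging Face state dict onto Lit-GPT's parameter names and writes a single `lit_model.pth` into the checkpoint directory. A rough sketch of the idea (the real mapping lives in `scripts/convert_hf_checkpoint.py`; the `rename` helper below is a hypothetical stand-in):

```python
# Illustrative only: load the HF weights, rename the keys, save one Lit-GPT checkpoint.
# (The real ChatGLM3 checkpoint is sharded across several files; one file is shown for brevity.)
import torch

hf_state = torch.load("checkpoints/chatglm/chatglm3-6b-hf/pytorch_model.bin", map_location="cpu")

lit_state = {rename(name): tensor for name, tensor in hf_state.items()}  # rename: HF key -> Lit-GPT key

torch.save(lit_state, "checkpoints/chatglm/chatglm3-6b-hf/lit_model.pth")
```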
- Iteratively generate responses:
python chat/base.py --checkpoint_dir ./checkpoints/chatglm/chatglm3-6b-hf --precision "16-true"
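Under the hood this is a simple read, generate, print loop that keeps the conversation history. A minimal sketch of the idea (`load_model`, `encode_prompt`, and `generate` are hypothetical stand-ins for the utilities in `chat/base.py`):

```python
# Conceptual chat loop; the actual implementation lives in chat/base.py.
model, tokenizer = load_model("checkpoints/chatglm/chatglm3-6b-hf")  # hypothetical helper

history = []
while True:
    user_msg = input(">> ")
    history.append(("user", user_msg))
    prompt_ids = encode_prompt(tokenizer, history)  # hypothetical: builds the chat prompt from history
    reply = tokenizer.decode(generate(model, prompt_ids, max_new_tokens=512))  # hypothetical wrapper
    history.append(("assistant", reply))
    print(reply)
```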
Optional: check that the Lit-GPT model is numerically equivalent to the original model.
- Make the following change to the original model (modeling_chatglm.py), commenting out the `@torch.jit.script` decorator:
-@torch.jit.script
+# @torch.jit.script
def apply_rotary_pos_emb(x: torch.Tensor, rope_cache: torch.Tensor) -> torch.Tensor:
- Check the model difference:
CUDA_VISIBLE_DEVICES=0,1 python tests/test_chatglm3.py model_diff ./checkpoints/chatglm/chatglm3-6b-hf
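The test boils down to feeding the same token ids through both models and comparing logits. A minimal sketch of the check (assuming `hf_model` is the original Hugging Face model and `lit_model` is the converted Lit-GPT model, both already loaded in the same precision):

```python
# Compare the two models' logits on the same (arbitrary) input token ids.
import torch

input_ids = torch.randint(0, 64000, (1, 32))        # any ids within the vocabulary

with torch.no_grad():
    hf_logits = hf_model(input_ids).logits          # original Hugging Face model
    lit_logits = lit_model(input_ids)               # converted Lit-GPT model

print((hf_logits - lit_logits).abs().max())         # should be tiny (floating-point noise)
torch.testing.assert_close(hf_logits, lit_logits, rtol=1e-3, atol=1e-3)
```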
We provide simple training scripts (finetune/adapter.py, finetune/adapter_v2.py, and finetune/lora.py) that instruction-tune a pretrained model on a random 10k-sample subset of the multiturn_chat_0.8M dataset.
- Download the data and generate an instruction tuning dataset:
python scripts/prepare_belle_chatglm3.py
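The prepared dataset is a list of instruction-style records that the finetuning scripts then tokenize into input/label tensors. A rough illustration of one record (the exact field names are set by `scripts/prepare_belle_chatglm3.py`; the ones below are assumptions):

```python
# Illustrative sample only; field names are assumptions, not the script's actual schema.
sample = {
    "instruction": "User request taken from one turn of a multiturn_chat_0.8M conversation",
    "input": "",  # optional extra context
    "output": "Assistant reply used as the training target",
}
```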
- Run the finetuning script
For example, you can use Adapter (Zhang et al. 2023):
python finetune/adapter.py --data_dir ./data/belle_chat_ramdon_10k_chatglm3 --checkpoint_dir ./checkpoints/chatglm/chatglm3-6b-hf --out_dir out/adapter/belle_chatglm3_6b --precision "bf16-true"
# test the finetuned model
python chat/adapter.py --adapter_path ./out/adapter/belle_chatglm3_6b/lit_model_adapter_finetuned.pth --checkpoint_dir ./checkpoints/chatglm/chatglm3-6b-hf --precision "16-true"
or Adapter v2 (Gao et al. 2023):
python finetune/adapter_v2.py --data_dir ./data/belle_chat_ramdon_10k_chatglm3 --checkpoint_dir ./checkpoints/chatglm/chatglm3-6b-hf --out_dir out/adapter_v2/belle_chatglm3_6b --precision "bf16-true"
# test the finetuned model
python chat/adapter_v2.py --adapter_path ./out/adapter_v2/belle_chatglm3_6b/lit_model_adapter_finetuned.pth --checkpoint_dir ./checkpoints/chatglm/chatglm3-6b-hf --precision "16-true"
or LoRA (Hu et al. 2021):
python finetune/lora.py --data_dir ./data/belle_chat_ramdon_10k_chatglm3 --checkpoint_dir ./checkpoints/chatglm/chatglm3-6b-hf --out_dir out/lora/belle_chatglm3_6b --precision "16-true"
# test the finetuned model
python chat/lora.py --lora_path ./out/lora/belle_chatglm3_6b/lit_model_lora_finetuned.pth --checkpoint_dir ./checkpoints/chatglm/chatglm3-6b-hf --precision "16-true"
(Please see tutorials/finetune_adapter for details on the differences between the two adapter methods.)
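For reference, LoRA keeps the pretrained weight frozen and learns a low-rank update on top of it; a minimal sketch of the idea (not the repository's implementation) looks like this:

```python
# Minimal LoRA linear layer: y = x @ W^T + (alpha / r) * ((x @ A^T) @ B^T)
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.weight = nn.Parameter(torch.empty(out_features, in_features), requires_grad=False)  # frozen W
        self.lora_A = nn.Parameter(torch.zeros(r, in_features))   # trainable low-rank factor A
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))  # trainable low-rank factor B
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        base = x @ self.weight.T                      # frozen pretrained projection
        update = (x @ self.lora_A.T) @ self.lora_B.T  # low-rank update learned during finetuning
        return base + self.scaling * update
```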
For more details, please refer to the original Lit-GPT repository.
Lit-GPT-Chinese is released under the Apache 2.0 license.