Releases: hiyouga/LLaMA-Factory

v0.9.2: MiniCPM-o, SwanLab, APOLLO

11 Mar 13:47

This is the last version before LLaMA-Factory v1.0.0. We are working hard to improve its efficiency and usability.

We will attend the vLLM Beijing Meetup on Mar 16th! See you in Beijing πŸ‘‹

New features

New models

  • Base models
    • GPT2 (0.1B/0.4B/0.8B/1.5B) πŸ“„
    • Granite 3.0-3.1 (1B/2B/3B/8B) πŸ“„
    • PaliGemma2 (3B/10B/28B) πŸ“„πŸ–ΌοΈ
    • Moonlight (16B) πŸ“„
    • DeepSeek V2-V2.5 Base (236B) πŸ“„
    • DeepSeek V3 Base (671B) πŸ“„
  • Instruct/Chat models
    • Granite 3.0-3.1 (1B/2B/3B/8B) by @Tuyohai in #5922 πŸ“„πŸ€–
    • DeepSeek R1 (1.5B/7B/8B/14B/32B/70B/671B) by @Qwtdgh in #6767 πŸ“„πŸ€–
    • TeleChat2 (3B/7B/12B/35B/115B) by @ge-xing in #6313 πŸ“„πŸ€–
    • Qwen2.5-VL (3B/7B/72B) by @hiyouga in #6779 πŸ“„πŸ€–πŸ–ΌοΈ
    • PaliGemma2-mix (3B/10B/28B) by @Kuangdd01 in #7060 πŸ“„πŸ€–πŸ–ΌοΈ
    • Qwen2 Audio (7B) by @BUAADreamer in #6701 πŸ“„πŸ€–πŸ”ˆ
    • MiniCPM-V/MiniCPM-o (8B) by @BUAADreamer in #6598 and #6631 πŸ“„πŸ€–πŸ–ΌοΈπŸ”ˆ
    • InternLM3-Instruct (8B) by @hhaAndroid in #6640 πŸ“„πŸ€–
    • Marco-o1 (8B) πŸ“„πŸ€–
    • Skywork-o1 (8B) πŸ“„πŸ€–
    • Phi-4 (14B) πŸ“„πŸ€–
    • Moonlight Instruct (16B) πŸ“„
    • Mistral Small (24B) πŸ“„πŸ€–
    • QwQ (32B) πŸ“„πŸ€–
    • Llama-3.3-Instruct (70B) πŸ“„πŸ€–
    • QvQ (72B) πŸ“„πŸ€–πŸ–ΌοΈ
    • DeepSeek V2-V2.5 (236B) πŸ“„πŸ€–
    • DeepSeek V3 (671B) πŸ“„πŸ€–

New datasets

  • Supervised fine-tuning datasets
    • OpenO1 (en) πŸ“„
    • Open Thoughts (en) πŸ“„
    • Open-R1-Math (en) πŸ“„
    • Chinese-DeepSeek-R1-Distill (zh) πŸ“„

Changes

Bug fix

Full Changelog: v0.9.1...v0.9.2

v0.9.1: Many Vision Models, Qwen2.5 Coder, Gradient Fix

24 Nov 17:17

New features

Note: you can now install transformers>=4.46.0,<=4.46.1 to enable the gradient accumulation fix.
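
A minimal sketch of pinning transformers to the supported range; the quotes only keep the shell from interpreting the comparison operators:

  # install a transformers release that includes the gradient accumulation fix
  pip install "transformers>=4.46.0,<=4.46.1"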

New models

  • Base models
    • Qwen2.5 (0.5B/1.5B/3B/7B/14B/32B/72B) πŸ“„
    • Qwen2.5-Coder (0.5B/1.5B/3B/7B/14B/32B) πŸ“„πŸ–₯️
    • Llama-3.2 (1B/3B) πŸ“„
    • OpenCoder (1.5B/8B) πŸ“„πŸ–₯️
    • Index (1.9B) πŸ“„
  • Instruct/Chat models
    • Qwen2.5-Instruct (0.5B/1.5B/3B/7B/14B/32B/72B) πŸ“„πŸ€–
    • Qwen2.5-Coder-Instruct (0.5B/1.5B/3B/7B/14B/32B) πŸ“„πŸ€–πŸ–₯️
    • Llama-3.2-Instruct (1B/3B) πŸ“„πŸ€–
    • OpenCoder-Instruct (1.5B/8B) πŸ“„πŸ€–πŸ–₯️
    • Index-Chat (1.9B) πŸ“„πŸ€–
    • LLaVA-NeXT (7B/8B/13B/34B/72B/110B) πŸ“„πŸ€–πŸ–ΌοΈ
    • LLaVA-NeXT-Video (7B/34B) πŸ“„πŸ€–πŸ–ΌοΈ
    • Video-LLaVA (7B) πŸ“„πŸ€–πŸ–ΌοΈ
    • Pixtral (12B) πŸ“„πŸ€–πŸ–ΌοΈ
    • EXAONE-3.0-Instruct (8B) πŸ“„πŸ€–

Security fix

Bug fix

Full Changelog: v0.9.0...v0.9.1

v0.9.0: Qwen2-VL, Liger-Kernel, Adam-mini

08 Sep 17:14

Congratulations on 30,000 stars πŸŽ‰ Follow us on X (Twitter)

New features

New models

  • Base models
    • Qwen2-Math (1.5B/7B/72B) πŸ“„πŸ”’
    • Yi-Coder (1.5B/9B) πŸ“„πŸ–₯️
    • InternLM2.5 (1.8B/7B/20B) πŸ“„
    • Gemma-2-2B πŸ“„
    • Meta-Llama-3.1 (8B/70B) πŸ“„
  • Instruct/Chat models
    • MiniCPM/MiniCPM3 (1B/2B/4B) by @LDLINGLINGLING in #4996 #5372 πŸ“„πŸ€–
    • Qwen2-Math-Instruct (1.5B/7B/72B) πŸ“„πŸ€–πŸ”’
    • Yi-Coder-Chat (1.5B/9B) πŸ“„πŸ€–πŸ–₯️
    • InternLM2.5-Chat (1.8B/7B/20B) πŸ“„πŸ€–
    • Qwen2-VL-Instruct (2B/7B) πŸ“„πŸ€–πŸ–ΌοΈ
    • Gemma-2-2B-it by @codemayq in #5037 πŸ“„πŸ€–
    • Meta-Llama-3.1-Instruct (8B/70B) πŸ“„πŸ€–
    • Mistral-Nemo-Instruct (12B) πŸ“„πŸ€–

New datasets

  • Supervised fine-tuning datasets
    • Magpie-ultra-v0.1 (en) πŸ“„
    • Pokemon-gpt4o-captions (en&zh) πŸ“„πŸ–ΌοΈ
  • Preference datasets
    • RLHF-V (en) πŸ“„πŸ–ΌοΈ
    • VLFeedback (en) πŸ“„πŸ–ΌοΈ

Changes

  • For compatibility reasons, fine-tuning vision language models (VLMs) requires transformers>=4.45.0.dev0; try pip install git+https://github.com/huggingface/transformers.git to install it.
  • The visual_inputs argument has been deprecated; you no longer need to specify it.
  • LlamaFactory now adopts lazy loading for multimodal inputs, see #5346 for details. Please use preprocessing_batch_size to restrict the batch size in dataset pre-processing (supported by @naem1023 in #5323).
  • LlamaFactory now supports lmf (equivalent to llamafactory-cli) as a shortcut command (see the sketch after this list).
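
A minimal sketch of the new shortcut command and the pre-processing knob; the YAML path points at one of the example configs in the repository (adjust it to your own config) and the batch size value is illustrative:

  # "lmf" is now a shortcut for "llamafactory-cli"; the two commands are equivalent
  llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
  lmf train examples/train_lora/llama3_lora_sft.yaml

  # to bound memory during the lazy multimodal dataset pre-processing, set e.g.
  # preprocessing_batch_size: 128 in your YAML config (the value 128 is illustrative)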

Bug fix

v0.8.3: Neat Packing, Split Evaluation

18 Jul 18:00

New features

New models

  • Base models
    • InternLM2.5-7B πŸ“„
    • Gemma2 (9B/27B) πŸ“„
  • Instruct/Chat models
    • TeleChat-1B-Chat by @hzhaoy in #4651 πŸ“„πŸ€–
    • InternLM2.5-7B-Chat πŸ“„πŸ€–
    • CodeGeeX4-9B-Chat πŸ“„πŸ€–
    • Gemma2-it (9B/27B) πŸ“„πŸ€–

Changes

  • Fix DPO cutoff len and deprecate reserved_label_len argument
  • Improve loss function for reward modeling

Bug fix

v0.8.2: PiSSA, Parallel Functions

19 Jun 13:06

New features

New models

  • Base models
    • DeepSeek-Coder-V2 (16B MoE/236B MoE) πŸ“„
  • Instruct/Chat models
    • MiniCPM-2B πŸ“„πŸ€–
    • DeepSeek-Coder-V2-Instruct (16B MoE/236B MoE) πŸ“„πŸ€–

New datasets

Bug fix

v0.8.1: Patch release

10 Jun 16:50
  • Fix #2666: Unsloth+DoRA
  • Fix #4145: The PyTorch version of the docker image does not match the vLLM requirement
  • Fix #4160: The problem in LongLoRA implementation with the help of @f-q23
  • Fix #4167: The installation problem in the Windows system by @yzoaim

v0.8.0: GLM-4, Qwen2, PaliGemma, KTO, SimPO

07 Jun 22:26

Stronger LlamaBoard πŸ’ͺπŸ˜€

  • Support single-node distributed training in Web UI
  • Add dropdown menu for easily resuming from checkpoints and picking saved configurations by @hiyouga and @hzhaoy in #4053
  • Support selecting checkpoints of full/freeze tuning
  • Add throughput metrics to LlamaBoard by @injet-zhou in #4066
  • Faster UI loading

New features

  • Add KTO algorithm by @enji-zhou in #3785
  • Add SimPO algorithm by @hiyouga
  • Support passing max_lora_rank to the vLLM backend by @jue-jue-zi in #3794
  • Support preference datasets in sharegpt format and remove big files from git repo by @hiyouga in #3799
  • Support setting system messages in CLI inference by @ycjcl868 in #3812
  • Add num_samples option in dataset_info.json by @seanzhang-zhichen in #3829
  • Add NPU docker image by @dongdongqiang2018 in #3876
  • Improve NPU document by @MengqingCao in #3930
  • Support SFT packing with greedy knapsack algorithm by @AlongWY in #4009
  • Add llamafactory-cli env for bug reports (see the sketch after this list)
  • Support image input in the API mode
  • Support random initialization via the train_from_scratch argument
  • Initialize CI
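
A minimal sketch of the new CLI helper and the KTO stage, assuming llamafactory-cli train accepts command-line arguments; the model path, dataset name, and output directory are placeholders, and the dataset must be a KTO-format dataset registered in dataset_info.json:

  # print environment information to attach to bug reports
  llamafactory-cli env

  # illustrative KTO fine-tuning run (replace the placeholder values with your own)
  llamafactory-cli train \
      --stage kto \
      --do_train \
      --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct \
      --dataset your_kto_dataset \
      --template llama3 \
      --finetuning_type lora \
      --output_dir saves/llama3-8b/lora/kto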

New models

  • Base models
    • Qwen2 (0.5B/1.5B/7B/72B/MoE) πŸ“„
    • PaliGemma-3B (pt/mix) πŸ“„πŸ–ΌοΈ
    • GLM-4-9B πŸ“„
    • Falcon-11B πŸ“„
    • DeepSeek-V2-Lite (16B) πŸ“„
  • Instruct/Chat models
    • Qwen2-Instruct (0.5B/1.5B/7B/72B/MoE) πŸ“„πŸ€–
    • Mistral-7B-Instruct-v0.3 πŸ“„πŸ€–
    • Phi-3-small-8k-instruct (7B) πŸ“„πŸ€–
    • Aya-23 (8B/35B) πŸ“„πŸ€–
    • OpenChat-3.6-8B πŸ“„πŸ€–
    • GLM-4-9B-Chat πŸ“„πŸ€–
    • TeleChat-12B-Chat by @hzhaoy in #3958 πŸ“„πŸ€–
    • Phi-3-medium-8k-instruct (14B) πŸ“„πŸ€–
    • DeepSeek-V2-Lite-Chat (16B) πŸ“„πŸ€–
    • Codestral-22B-v0.1 πŸ“„πŸ€–

New datasets

  • Pre-training datasets
    • FineWeb (en)
    • FineWeb-Edu (en)
  • Supervised fine-tuning datasets
    • Ruozhiba-GPT4 (zh)
    • STEM-Instruction (zh)
  • Preference datasets
    • Argilla-KTO-mix-15K (en)
    • UltraFeedback (en)

Bug fix

v0.7.1: Ascend NPU Support, Yi-VL Models

15 May 18:16

🚨🚨 Core refactor 🚨🚨

  • Add CLI usage; we now recommend using llamafactory-cli to launch training and inference. The entry point is located at cli.py
  • Rename files: train_bash.py -> train.py, train_web.py -> webui.py, api_demo.py -> api.py
  • Remove files: cli_demo.py, evaluate.py, export_model.py, web_demo.py, use llamafactory-cli chat/eval/export/webchat instead
  • Use YAML configs in the examples instead of shell scripts for better readability
  • Remove the sha1 hash check when loading datasets
  • Rename arguments: num_layer_trainable -> freeze_trainable_layers, name_module_trainable -> freeze_trainable_modules

The above changes are made by @hiyouga in #3596

REMINDER: Installation is now mandatory to use LLaMA Factory (see the example below)
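
A minimal sketch of the refactored workflow, assuming an editable install from a local clone of the repository; the example config path is illustrative:

  # installation is now mandatory; an editable install from the repository root is enough
  pip install -e .

  # the renamed entry points, all routed through llamafactory-cli
  llamafactory-cli train examples/llama3_lora_sft.yaml   # recommended replacement for python src/train_bash.py
  llamafactory-cli webui                                 # launches the Web UI (formerly train_web.py)
  llamafactory-cli api                                   # launches the API server (formerly api_demo.py)
  llamafactory-cli chat --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct --template llama3   # replaces cli_demo.py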

New features

  • Support training and inference on Ascend NPU 910 devices by @zhou-wjjw and @statelesshz (Docker images are also provided)
  • Support the stop parameter in the vLLM engine by @zhaonx in #3527
  • Support fine-tuning token embeddings in freeze tuning via the freeze_extra_modules argument (see the sketch after this list)
  • Add Llama3 quickstart to readme
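
A minimal sketch of freeze tuning with trainable token embeddings via freeze_extra_modules; the module names assume a Llama-style architecture, and the dataset and output path are placeholders:

  # freeze tuning that also updates the embedding layers (module names assume a Llama-style model)
  llamafactory-cli train \
      --stage sft \
      --do_train \
      --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct \
      --dataset alpaca_en \
      --template llama3 \
      --finetuning_type freeze \
      --freeze_trainable_layers 2 \
      --freeze_extra_modules embed_tokens,lm_head \
      --output_dir saves/llama3-8b/freeze/sft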

New models

  • Base models
    • Yi-1.5 (6B/9B/34B) πŸ“„
    • DeepSeek-V2 (236B) πŸ“„
  • Instruct/Chat models
    • Yi-1.5-Chat (6B/9B/34B) πŸ“„πŸ€–
    • Yi-VL-Chat (6B/34B) by @BUAADreamer in #3748 πŸ“„πŸ–ΌοΈπŸ€–
    • Llama3-Chinese-Chat (8B/70B) πŸ“„πŸ€–
    • DeepSeek-V2-Chat (236B) πŸ“„πŸ€–

Bug fix

v0.7.0: LLaVA Multimodal LLM Support

27 Apr 20:24

Congratulations on 20k stars πŸŽ‰ We ranked 1st on GitHub Trending on Apr 23rd πŸ”₯ Follow us on X

New features

  • Support SFT/PPO/DPO/ORPO for the LLaVA-1.5 model by @BUAADreamer in #3450
  • Support inferring the LLaVA-1.5 model with both native Transformers and vLLM by @hiyouga in #3454
  • Support vLLM+LoRA inference for some models (see the support list)
  • Support 2x faster generation for QLoRA models based on UnslothAI's optimization
  • Support adding new special tokens to the tokenizer via the new_special_tokens argument (see the sketch after this list)
  • Support choosing the device to merge LoRA in LlamaBoard via the export_device argument
  • Add a Colab notebook for fine-tuning the Llama-3 model on a free T4 GPU
  • Automatically enable SDPA attention and fast tokenizer for higher performance
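
A minimal sketch of registering new special tokens via the new_special_tokens argument, written with the llamafactory-cli entry point introduced in a later release; the token strings, dataset, and output path are placeholders:

  # add custom special tokens to the tokenizer before fine-tuning (values are illustrative)
  llamafactory-cli train \
      --stage sft \
      --do_train \
      --model_name_or_path meta-llama/Meta-Llama-3-8B-Instruct \
      --dataset alpaca_en \
      --template llama3 \
      --finetuning_type lora \
      --new_special_tokens "<tool_call>,</tool_call>" \
      --output_dir saves/llama3-8b/lora/sft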

New models

  • Base models
    • OLMo-1.7-7B
    • Jamba-v0.1-51B
    • Qwen1.5-110B
    • DBRX-132B-Base
  • Instruct/Chat models
    • Phi-3-mini-3.8B-instruct (4k/128k)
    • LLaVA-1.5-7B
    • LLaVA-1.5-13B
    • Qwen1.5-110B-Chat
    • DBRX-132B-Instruct

New datasets

  • Supervised fine-tuning datasets
  • Preference datasets

Bug fix

v0.6.3: Llama-3 and 3x Longer QLoRA

21 Apr 15:43

New features

  • Support Meta Llama-3 (8B/70B) models
  • Support UnslothAI's long-context QLoRA optimization (56,000-token context length for Llama-2-7B within 24 GB of GPU memory); see the sketch after this list
  • Support previewing local datasets in directories in LlamaBoard by @codemayq in #3291
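
A minimal sketch of a long-context QLoRA run with the UnslothAI optimization enabled, written with the llamafactory-cli entry point introduced in a later release; the model, dataset, context length, and output path are placeholders:

  # long-context QLoRA with Unsloth; the context length and paths are illustrative
  llamafactory-cli train \
      --stage sft \
      --do_train \
      --model_name_or_path meta-llama/Llama-2-7b-hf \
      --dataset alpaca_en \
      --template llama2 \
      --finetuning_type lora \
      --quantization_bit 4 \
      --use_unsloth true \
      --cutoff_len 56000 \
      --rope_scaling linear \
      --output_dir saves/llama2-7b/qlora/sft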

New algorithms

New models

  • Base models
    • CodeGemma (2B/7B)
    • CodeQwen1.5-7B
    • Llama-3 (8B/70B)
    • Mixtral-8x22B-v0.1
  • Instruct/Chat models
    • CodeGemma-7B-it
    • CodeQwen1.5-7B-Chat
    • Llama-3-Instruct (8B/70B)
    • Command R (35B) by @marko1616 in #3254
    • Command R+ (104B) by @marko1616 in #3254
    • Mixtral-8x22B-Instruct-v0.1

Bug fix