Releases: hiyouga/LLaMA-Factory

v0.7.0: LLaVA Multimodal LLM Support

27 Apr 20:24

Congratulations on 20k stars πŸŽ‰ We ranked No. 1 on GitHub Trending on Apr. 23rd πŸ”₯ Follow us on X

New features

  • Support SFT/PPO/DPO/ORPO for the LLaVA-1.5 model by @BUAADreamer in #3450
  • Support inferring the LLaVA-1.5 model with both native Transformers and vLLM by @hiyouga in #3454
  • Support vLLM+LoRA inference for selected models (see the support list)
  • Support 2x faster generation for QLoRA models based on UnslothAI's optimization
  • Support adding new special tokens to the tokenizer via the new_special_tokens argument (see the sketch after this list)
  • Support choosing the device for merging LoRA weights in LlamaBoard via the export_device argument
  • Add a Colab notebook for fine-tuning the Llama-3 model on a free T4 GPU
  • Automatically enable SDPA attention and fast tokenizer for higher performance
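
A minimal sketch of the new_special_tokens argument, assuming the src/train_bash.py entry point and comma-separated token values; the model, dataset, and tokens below are illustrative:

```python
import subprocess

# Hedged sketch: SFT run that registers extra special tokens in the tokenizer.
# The entry point and the comma-separated value format are assumptions.
subprocess.run([
    "python", "src/train_bash.py",
    "--stage", "sft",
    "--model_name_or_path", "meta-llama/Meta-Llama-3-8B",
    "--dataset", "alpaca_gpt4_en",
    "--template", "default",
    "--finetuning_type", "lora",
    "--new_special_tokens", "<tool_call>,</tool_call>",  # illustrative tokens
    "--output_dir", "saves/llama3-8b-sft",
], check=True)
```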

New models

  • Base models
    • OLMo-1.7-7B
    • Jamba-v0.1-51B
    • Qwen1.5-110B
    • DBRX-132B-Base
  • Instruct/Chat models
    • Phi-3-mini-3.8B-instruct (4k/128k)
    • LLaVA-1.5-7B
    • LLaVA-1.5-13B
    • Qwen1.5-110B-Chat
    • DBRX-132B-Instruct

v0.6.3: Llama-3 and 3x Longer QLoRA

21 Apr 15:43

New features

  • Support Meta Llama-3 (8B/70B) models
  • Support UnslothAI's long-context QLoRA optimization (56K context length for Llama-2-7B on a 24GB GPU; see the sketch after this list)
  • Support previewing local datasets in directories in LlamaBoard by @codemayq in #3291
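
A minimal sketch of the long-context QLoRA setup, assuming the src/train_bash.py entry point; the rope_scaling choice and hyperparameters are illustrative rather than a verified recipe:

```python
import subprocess

# Hedged sketch: 4-bit QLoRA SFT of Llama-2-7B at a long context length with
# UnslothAI's optimization enabled, per this release's notes.
subprocess.run([
    "python", "src/train_bash.py",
    "--stage", "sft",
    "--model_name_or_path", "meta-llama/Llama-2-7b-hf",
    "--dataset", "alpaca_gpt4_en",
    "--template", "default",
    "--finetuning_type", "lora",
    "--quantization_bit", "4",       # QLoRA
    "--use_unsloth", "true",         # enable the UnslothAI kernels
    "--rope_scaling", "linear",      # assumed flag for context extension
    "--cutoff_len", "56000",
    "--output_dir", "saves/llama2-7b-qlora-long",
], check=True)
```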

New models

  • Base models
    • CodeGemma (2B/7B)
    • CodeQwen1.5-7B
    • Llama-3 (8B/70B)
    • Mixtral-8x22B-v0.1
  • Instruct/Chat models
    • CodeGemma-7B-it
    • CodeQwen1.5-7B-Chat
    • Llama-3-Instruct (8B/70B)
    • Command R (35B) by @marko1616 in #3254
    • Command R+ (104B) by @marko1616 in #3254
    • Mixtral-8x22B-Instruct-v0.1

v0.6.2: ORPO and Qwen1.5-32B

11 Apr 12:27

New features

  • Support the ORPO algorithm by @hiyouga in #3066 (see the sketch after this list)
  • Support inferring BNB 4-bit models on multiple GPUs via the quantization_device_map argument
  • Reorganize README files and move example scripts to the examples folder
  • Support quickly saving & loading arguments in LlamaBoard by @hiyouga and @marko1616 in #3046
  • Support loading alpaca-format datasets from the Hub without dataset_info.json by specifying --dataset_dir ONLINE
  • Add a moe_aux_loss_coef argument to control the coefficient of the auxiliary loss in MoE models
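
A minimal ORPO sketch, assuming the src/train_bash.py entry point and that ORPO is selected via --stage orpo; the model and preference dataset are illustrative:

```python
import subprocess

# Hedged sketch: ORPO fine-tuning with LoRA on a preference-format dataset.
subprocess.run([
    "python", "src/train_bash.py",
    "--stage", "orpo",                       # assumed stage name for ORPO
    "--model_name_or_path", "Qwen/Qwen1.5-7B",
    "--dataset", "comparison_gpt4_en",       # any preference-format dataset
    "--template", "qwen",
    "--finetuning_type", "lora",
    "--output_dir", "saves/qwen1.5-7b-orpo",
], check=True)
```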

New models

  • Base models
    • Breeze-7B-Base
    • Qwen1.5-MoE-A2.7B (14B)
    • Qwen1.5-32B
  • Instruct/Chat models
    • Breeze-7B-Instruct
    • Qwen1.5-MoE-A2.7B-Chat (14B)
    • Qwen1.5-32B-Chat

v0.6.1: Patch release

29 Mar 04:07

This patch mainly fixes #2983

In commit 9bec3c9, we built the optimizer and scheduler inside the trainers, which inadvertently introduced a bug: when DeepSpeed was enabled, the trainers in transformers would build an optimizer and scheduler before calling the create_optimizer_and_scheduler method [1]. The optimizer created by our method would then overwrite the original one, while the scheduler would not, so the scheduler no longer affected the learning rate of the optimizer actually in use, leading to a regression in training results. We have fixed this bug in 3bcd41b and 8c77b10. Thanks to @HideLord for helping us identify this critical bug.

[1] https://github.com/huggingface/transformers/blob/v4.39.1/src/transformers/trainer.py#L1877-L1881
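
The failure mode is easy to reproduce outside DeepSpeed: a PyTorch scheduler keeps a reference to the optimizer it was constructed with, so swapping in a new optimizer afterwards silently decouples the two. A minimal sketch:

```python
import torch

# The scheduler is bound to old_optimizer at construction time.
param = torch.nn.Parameter(torch.zeros(1))
old_optimizer = torch.optim.AdamW([param], lr=1e-3)
scheduler = torch.optim.lr_scheduler.LambdaLR(old_optimizer, lambda step: 0.5)

# Overwriting the optimizer afterwards (as in the bug) leaves the scheduler
# pointing at the old instance.
new_optimizer = torch.optim.AdamW([param], lr=1e-3)

scheduler.step()
print(old_optimizer.param_groups[0]["lr"])  # 0.0005 -- scheduler still drives this one
print(new_optimizer.param_groups[0]["lr"])  # 0.001  -- the optimizer actually used for updates
```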

We have also fixed #2961 #2981 #2982 #2983 #2991 #3010

v0.6.0: Paper Release, GaLore and FSDP+QLoRA

25 Mar 15:50

We released our paper on arXiv! Thanks to all co-authors and to AK for the recommendation

New features

  • Support the GaLore algorithm, allowing full-parameter learning of a 7B model in less than 24GB of VRAM (see the sketch after this list)
  • Support FSDP+QLoRA, which allows QLoRA fine-tuning of a 70B model on 2x 24GB GPUs
  • Support the LoRA+ algorithm for better LoRA fine-tuning by @qibaoyuan in #2830
  • LLaMA Factory 🀝 vLLM, enjoy 270% inference speed with --infer_backend vllm
  • Add a Colab notebook for getting started easily
  • Support pushing fine-tuned models to the Hugging Face Hub in the web UI
  • Support apply_chat_template by adding a chat template to the tokenizer after fine-tuning
  • Add Docker support by @S3Studio in #2743 #2849
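
A minimal GaLore sketch, assuming the src/train_bash.py entry point; the GaLore flag names (use_galore, galore_target, galore_rank) and values are assumptions based on this release rather than a verified recipe:

```python
import subprocess

# Hedged sketch: full-parameter SFT of a 7B model with GaLore's gradient
# low-rank projection to keep VRAM under 24GB.
subprocess.run([
    "python", "src/train_bash.py",
    "--stage", "sft",
    "--model_name_or_path", "meta-llama/Llama-2-7b-hf",
    "--dataset", "alpaca_gpt4_en",
    "--template", "default",
    "--finetuning_type", "full",
    "--use_galore", "true",
    "--galore_target", "mlp,self_attn",  # modules to project (assumed value format)
    "--galore_rank", "128",
    "--output_dir", "saves/llama2-7b-galore",
], check=True)
```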

New models

  • Base models
    • OLMo (1B/7B)
    • StarCoder2 (3B/7B/15B)
    • Yi-9B
  • Instruct/Chat models
    • OLMo-7B-Instruct

New datasets

  • Supervised fine-tuning datasets
    • Cosmopedia (en)
  • Preference datasets
    • Orca DPO (en)

v0.5.3: DoRA and AWQ/AQLM QLoRA

28 Feb 17:01

New features

  • Support DoRA (weight-decomposed LoRA) fine-tuning (see the sketch after this list)
  • Support QLoRA fine-tuning of AWQ- and AQLM-quantized models
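
A minimal DoRA sketch, assuming the src/train_bash.py entry point and a use_dora switch; all values are illustrative:

```python
import subprocess

# Hedged sketch: LoRA fine-tuning with weight decomposition (DoRA) enabled.
subprocess.run([
    "python", "src/train_bash.py",
    "--stage", "sft",
    "--model_name_or_path", "google/gemma-7b",
    "--dataset", "alpaca_gpt4_en",
    "--template", "gemma",
    "--finetuning_type", "lora",
    "--use_dora", "true",      # assumed flag name for DoRA
    "--output_dir", "saves/gemma-7b-dora",
], check=True)
```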

New models

  • Base models
    • Gemma (2B/7B)
  • Instruct/Chat models
    • Gemma-it (2B/7B)

v0.5.2: Block Expansion, Qwen1.5 Models

20 Feb 07:32

New features

  • Support block expansion in LLaMA Pro, see tests/llama_pro.py for usage
  • Add the use_rslora option for the LoRA method (see the sketch after this list)
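
A minimal rsLoRA sketch, assuming the src/train_bash.py entry point; rank-stabilized LoRA scales adapters by alpha/sqrt(r) instead of alpha/r, which matters mainly at higher ranks. Values are illustrative:

```python
import subprocess

# Hedged sketch: LoRA fine-tuning with rank-stabilized scaling enabled.
subprocess.run([
    "python", "src/train_bash.py",
    "--stage", "sft",
    "--model_name_or_path", "Qwen/Qwen1.5-7B",
    "--dataset", "alpaca_gpt4_en",
    "--template", "qwen",
    "--finetuning_type", "lora",
    "--lora_rank", "64",
    "--use_rslora", "true",
    "--output_dir", "saves/qwen1.5-7b-rslora",
], check=True)
```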

New models

  • Base models
    • Qwen1.5 (0.5B/1.8B/4B/7B/14B/72B)
    • DeepSeekMath-7B-Base
    • DeepSeekCoder-7B-Base-v1.5
    • Orion-14B-Base
  • Instruct/Chat models
    • Qwen1.5-Chat (0.5B/1.8B/4B/7B/14B/72B)
    • MiniCPM-2B-SFT/DPO
    • DeepSeekMath-7B-Instruct
    • DeepSeekCoder-7B-Instruct-v1.5
    • Orion-14B-Chat
    • Orion-14B-Long-Chat
    • Orion-14B-RAG-Chat
    • Orion-14B-Plugin-Chat

New datasets

  • Supervised fine-tuning datasets
    • SlimOrca (en)
    • Dolly (de)
    • Dolphin (de)
    • Airoboros (de)
  • Preference datasets
    • Orca DPO (de)

v0.5.0: Agent Tuning, Unsloth Integration

20 Jan 18:37

Congratulations on 10k stars πŸŽ‰ Make LLM fine-tuning easier and faster together with LLaMA-Factory ✨

New features

  • Support agent tuning for most models; fine-tune any LLM with --dataset glaive_toolcall to enable tool use #2226
  • Support function calling in both API and web mode with fine-tuned models, following the OpenAI format (see the sketch after this list)
  • LLaMA Factory 🀝 Unsloth, enjoy 170% LoRA training speed with --use_unsloth, see the benchmark here
  • Support fine-tuning models on MPS devices #2090
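
Since the function calling API follows the OpenAI format, requests can be issued with any OpenAI-style client. A minimal sketch, assuming a local server on port 8000; the tool schema is hypothetical:

```python
import requests

# Hedged sketch: OpenAI-format tool call against the local API server.
response = requests.post(
    "http://localhost:8000/v1/chat/completions",  # assumed host/port
    json={
        "model": "default",
        "messages": [{"role": "user", "content": "What is the weather in Berlin?"}],
        "tools": [{
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool for illustration
                "description": "Look up the current weather for a city",
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }],
    },
)
print(response.json())
```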

New models

  • Base models
    • Phi-2 (2.7B)
    • InternLM2 (7B/20B)
    • SOLAR-10.7B
    • DeepseekMoE-16B-Base
    • XVERSE-65B-2
  • Instruct/Chat models
    • InternLM2-Chat (7B/20B)
    • SOLAR-10.7B-Instruct
    • DeepseekMoE-16B-Chat
    • Yuan (2B/51B/102B)

New datasets

  • Supervised fine-tuning datasets
    • deepctrl dataset
    • Glaive function calling dataset v2

Core updates

  • Refactor data engine: clearer dataset alignment, easier templating and tool formatting
  • Refactor saving logic for models with value head #1789
  • Adopt the ruff code formatter for consistent code style

v0.4.0: Mixtral-8x7B, DPO-ftx, AutoGPTQ Integration

16 Dec 13:48

🚨🚨 Core refactor

  • Deprecate checkpoint_dir in favor of adapter_name_or_path (see the migration sketch below)
  • Replace resume_lora_training with create_new_adapter
  • Move the patches in model loading to llmtuner.model.patcher
  • Bump to Transformers 4.36.1 to adapt to the Mixtral models
  • Wide adaptation for FlashAttention2 (LLaMA, Falcon, Mistral)
  • Temporarily disable LongLoRA due to breaking changes; support will be restored later

The above changes were made by @hiyouga in #1864
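
A migration sketch for the renamed arguments, assuming the src/train_bash.py entry point; paths are illustrative. The old --resume_lora_training false corresponds to --create_new_adapter true here:

```python
import subprocess

# Hedged sketch: continue from an existing LoRA adapter while training a new one.
subprocess.run([
    "python", "src/train_bash.py",
    "--stage", "sft",
    "--model_name_or_path", "mistralai/Mixtral-8x7B-v0.1",
    "--adapter_name_or_path", "saves/mixtral-lora-stage1",  # was: --checkpoint_dir
    "--create_new_adapter", "true",                         # was: --resume_lora_training false
    "--dataset", "alpaca_gpt4_en",
    "--template", "mistral",
    "--finetuning_type", "lora",
    "--output_dir", "saves/mixtral-lora-stage2",
], check=True)
```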

New features

  • Add DPO-ftx: mixing supervised fine-tuning gradients into DPO via the dpo_ftx argument, suggested by @lylcst in #1347 (comment)
  • Integrate AutoGPTQ into model export via the export_quantization_bit and export_quantization_dataset arguments
  • Support loading datasets from the ModelScope Hub by @tastelikefeet and @wangxingjun778 in #1802
  • Support resizing token embeddings with noisy mean initialization by @hiyouga in a66186b
  • Support the system column in both alpaca and sharegpt dataset formats (see the sketch after this list)
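
A minimal sketch of an alpaca-format record using the new system column; the field values are illustrative:

```python
# Hedged sketch: one alpaca-format training record with a per-sample system prompt.
example = {
    "system": "You are a concise assistant.",
    "instruction": "Summarize the following text.",
    "input": "LLaMA Factory unifies fine-tuning methods for many LLMs.",
    "output": "It is a unified LLM fine-tuning toolkit.",
}
```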

New models

  • Base models
    • Mixtral-8x7B-v0.1
  • Instruct/Chat models
    • Mixtral-8x7B-Instruct-v0.1
    • Mistral-7B-Instruct-v0.2
    • XVERSE-65B-Chat
    • Yi-6B-Chat

v0.3.3: ModelScope Integration, Reward Server

03 Dec 14:17

New features

  • Support loading pre-trained models from ModelScope Hub by @tastelikefeet in #1700
  • Support launching a reward model server in the demo API by specifying --stage=rm in api_demo.py
  • Support using a reward model server in PPO training by specifying --reward_model_type api (see the sketch after this list)
  • Support adjusting the shard size of exported models via the export_size argument
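
A minimal sketch of the reward-server workflow; the --stage=rm flag follows the notes above, while the model path, port, and the way the server URL is passed to PPO are assumptions:

```python
import subprocess

# Hedged sketch: launch the demo API as a reward model server.
subprocess.Popen([
    "python", "src/api_demo.py",
    "--stage", "rm",
    "--model_name_or_path", "saves/reward-model",  # illustrative path
    "--template", "default",
])
# A PPO run would then select the API-backed reward model, e.g. (assumed
# argument name for the URL):
#   --reward_model_type api --reward_model http://localhost:8000
```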

New models

  • Base models
    • DeepseekLLM-Base (7B/67B)
    • Qwen (1.8B/72B)
  • Instruct/Chat models
    • DeepseekLLM-Chat (7B/67B)
    • Qwen-Chat (1.8B/72B)
    • Yi-34B-Chat
