UniDiffusion

Navigate the Universe of Diffusion models with Unified workflow.

Introduction

[Figure: workflow]

UniDiffusion is a toolbox built on diffusers that provides state-of-the-art training and inference algorithms. It is aimed at researchers and users who want to deeply customize the training of Stable Diffusion. We hope this repository can support future research and application extensions.

If you want to do any of the following, have fun with UniDiffusion:

  • Train only the cross-attention (or convolution / feedforward / ...) layers.
  • Set a different lr / weight decay / ... for different layers.
  • Use PEFT/PETL methods on different layers and merge them easily, e.g., finetune the convolution layers while updating the attention layers with LoRA.
  • Train all parameters in Stable Diffusion, including unet, vae, and text_encoder, with automatic saving and loading.

Note: UniDiffusion is still under development. Some modules are borrowed from other code repositories and have not been tested yet, especially the components that are not enabled by default in the configuration system. We are working hard to improve this project.

⭐ Features

  • Modular Design. UniDiffusion is designed with a modular architecture. The modular design enables easy implementation of new methods.
  • Config System. LazyConfig System for more flexible syntax and cleaner config files.
  • Easy to Use.
    • Distributed Training: Uses accelerate to support distributed training environments.
    • Experiment Tracker: Uses wandb to log training information.
    • Distributed Evaluation: Evaluate ✅ FID, ✅ IS, and CLIP Score during training.

Unified Training Workflow

In UniDiffusion, all training methods are decomposed into three dimensions:

  • Learnable parameters: which layers or modules will be updated.
  • PEFT/PETL method: how to update them, e.g., finetune, low-rank adaptation, adapter, etc.
  • Training process: defaults to diffusion-denoising, which can be extended (e.g., XTI).

This allows us to run a unified training pipeline driven by a strong config system.

Example of how the training workflow differs from other codebases.

Here is a simple example. In diffusers, text-to-image finetuning and DreamBooth are trained with separate scripts, like:

python train_dreambooth.py --arg ......
python train_finetune.py --arg ......

and combining or adjusting these methods is difficult (e.g., training only cross attention during DreamBooth).

In UniDiffusion, we can easily design our own training arguments in a config file:

# text-to-image finetune
unet.training_args = {'': {'mode': 'finetune'}}
# text-to-image finetune with lora
unet.training_args = {'': {'mode': 'lora'}}
# update cross attention with lora
unet.training_args = {'attn2': {'mode': 'lora'}}

# dreambooth
unet.training_args = {'': {'mode': 'finetune'}}
text_encoder.training_args = {'text_embedding': {'initial': True}}
# dreambooth with small lr for text-encoder
unet.training_args = {'': {'mode': 'finetune'}}
text_encoder.training_args = {'text_embedding': {'initial': True, 'optim_kwargs': {'lr': 1e-6}}}

and then run

accelerate launch scripts/train.py --config-file /path/to/your/config

This makes it easier to customize, combine, and extend methods, and the config files also make the similarities and differences between methods easy to compare.
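To make the config examples above more concrete, here is a minimal sketch of how a `training_args` dictionary could be resolved into per-module settings. It is not UniDiffusion's actual implementation: the `resolve_training_args` helper, the default learning rate, and the module names are hypothetical, and the keys are assumed to be patterns searched for inside module names (so the empty string matches every module).

```python
import re

# Hypothetical default; the real project reads optimizer settings from its config.
DEFAULT_OPTIM_KWARGS = {"lr": 1e-4}


def resolve_training_args(module_names, training_args):
    """Map each selected module name to its mode and optimizer kwargs.

    Assumption: each key of `training_args` is a pattern searched for in the
    module name, and the first matching key (in dict order) wins.
    """
    resolved = {}
    for name in module_names:
        for pattern, args in training_args.items():
            if re.search(pattern, name):
                resolved[name] = {
                    "mode": args.get("mode", "finetune"),
                    "optim_kwargs": {**DEFAULT_OPTIM_KWARGS, **args.get("optim_kwargs", {})},
                }
                break
    # Modules that match no pattern are absent from the result, i.e. frozen.
    return resolved


# Illustrative module names, loosely modeled on a UNet transformer block.
names = [
    "down_blocks.0.attentions.0.transformer_blocks.0.attn1.to_q",
    "down_blocks.0.attentions.0.transformer_blocks.0.attn2.to_k",
    "down_blocks.0.resnets.0.conv1",
]

cross_attn_lora = resolve_training_args(
    names, {"attn2": {"mode": "lora", "optim_kwargs": {"lr": 1e-5}}}
)
full_finetune = resolve_training_args(names, {"": {"mode": "finetune"}})
```

In this sketch, `cross_attn_lora` contains only the `attn2` module with its overridden learning rate, while `full_finetune` selects all three modules with the default settings.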

Regular Matching for Module Selection

UniDiffusion provides a regular-expression matching system for module selection, so modules can be selected by matching patterns against their names. See Regular Matching for Module Selection for more details.
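As a rough illustration of pattern-based module selection, the sketch below filters a list of module names with Python's `re` module. It assumes that "regular matching" means regular-expression matching against module names; the `select_modules` helper and the names in the list are hypothetical and only mimic the naming style of a diffusers UNet.

```python
import re


def select_modules(module_names, pattern):
    """Return the module names in which `pattern` is found (re.search semantics)."""
    regex = re.compile(pattern)
    return [name for name in module_names if regex.search(name)]


names = [
    "mid_block.attentions.0.transformer_blocks.0.attn1.to_q",
    "mid_block.attentions.0.transformer_blocks.0.attn2.to_q",
    "mid_block.attentions.0.transformer_blocks.0.attn2.to_k",
    "mid_block.attentions.0.transformer_blocks.0.attn2.to_v",
    "mid_block.resnets.0.conv1",
]

# All cross-attention projections.
cross_attention = select_modules(names, r"attn2")
# Only the key and value projections of cross attention.
cross_attention_kv = select_modules(names, r"attn2\.to_(k|v)$")
# Convolution layers whose name ends in conv<number>.
convolutions = select_modules(names, r"\.conv\d+$")
```

With a real model, the names would typically come from PyTorch's `named_modules()` rather than a hand-written list.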

Powerful Support for PEFT/PETL Methods

UniDiffusion provides strong support for PEFT/PETL methods. See PEFT/PETL Methods for more details.
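As one illustration of what a PEFT/PETL method does, the sketch below applies the standard LoRA update, W' = W + (alpha / r) * B A, and merges it into the base weight. It uses plain Python lists so that it runs without dependencies; the function names are hypothetical and this is not UniDiffusion's implementation.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_b, lora_a, alpha):
    """Return weight + (alpha / r) * (lora_b @ lora_a), where r is the LoRA rank.

    Shapes: weight is (out, in), lora_b is (out, r), lora_a is (r, in).
    """
    rank = len(lora_a)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# A 2x3 base weight with a rank-1 update.
weight = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
lora_b = [[1.0], [2.0]]        # shape (out=2, r=1)
lora_a = [[0.5, 0.0, -0.5]]    # shape (r=1, in=3)

merged = merge_lora(weight, lora_b, lora_a, alpha=2.0)
```

Because only the small matrices B and A are trained, the base weight stays untouched until the merge, which is what makes it possible to combine a LoRA update on some layers with full finetuning on others.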

🌏 Installation

  1. Install prerequisites
  • Python 3.10
  • PyTorch 2.0 + CUDA 11.8
  • cuDNN
  2. Install requirements
pip install -r requirements.txt
  3. Configure accelerate and wandb
accelerate config
wandb login

🎉 Getting Started

See Train textual inversion / Dreambooth / LoRA / text-to-image Finetune for details.

accelerate launch scripts/common.py --config-file configs/train/text_to_image_finetune.py

Detailed Demo

  1. Train textual inversion / Dreambooth / LoRA / text-to-image Finetune.
  2. Customize your training process.

[Doing] Tutorial

  1. [TODO] Supporting new dataset.
  2. [TODO] Supporting new PETL method.
  3. [TODO] Supporting new training pipeline.

👑 Model Zoo

Supported Personalization Methods

Note: In UniDiffusion, personalization methods are decomposed into trainable parameters, PEFT/PETL methods, and training process. See the config files for more details.

Supported PEFT/PETL Methods

📝 TODO

We plan to add the following features in the future. We also welcome contributions from the community. Feel free to open a pull request or an issue to discuss ideas for new features.

  • Methods:
    • preservation of class semantic priors (dreambooth).
    • XTI & Custom Diffusion.
    • RepAdapter and LyCORIS.
  • Features:
    • Merge PEFT to original model.
    • Convert model to diffusers and webui format.
    • Webui extension.

Contribution

We welcome contributions from the open-source community!

Acknowledgements

Citation

If you use this toolbox in your research or wish to refer to the baseline results published here, please use the following BibTeX entries:

  • Citing UniDiffusion:
@misc{pu2022diffusion,
  author =       {Pu Cao and Tianrui Huang and Lu Yang and Qing Song},
  title =        {UniDiffusion},
  howpublished = {\url{https://github.com/PRIV-Creation/UniDiffusion}},
  year =         {2023}
}
  • Citing supported algorithms: coming soon.

About

A diffusion training toolbox based on diffusers and existing SOTA methods, including Dreambooth, Textual Inversion, LoRA, Custom Diffusion, XTI, ....
