# Self-Corrected Flow Distillation for Consistent One-Step and Few-Step Text-to-Image Generation (AAAI 2025)

Official PyTorch implementation.
<sup>1</sup>VinAI Research&nbsp;&nbsp;<sup>2</sup>Rutgers University&nbsp;&nbsp;<sup>3</sup>Cornell University
[Paper] [HuggingFace]
*Equal contribution †Work done while at VinAI Research
TLDR: We introduce a self-corrected flow distillation method that integrates consistency models and adversarial training within the flow-matching framework, enabling consistent generation quality in both one-step and few-step sampling.
Abstract: Flow matching has emerged as a promising framework for training generative models, demonstrating impressive empirical performance while being comparatively easier to train than diffusion-based models. However, it still requires numerous function evaluations during sampling. To address this limitation, we introduce a self-corrected flow distillation method that effectively integrates consistency models and adversarial training within the flow-matching framework. This work is the first to achieve consistent generation quality in both few-step and one-step sampling. Our extensive experiments validate the effectiveness of our method, yielding superior results both quantitatively and qualitatively on CelebA-HQ and on zero-shot COCO benchmarks.
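To make the idea concrete, here is a toy, heavily simplified sketch — NOT this repository's actual training code — of a flow-matching distillation step: a student velocity network is regressed toward a frozen teacher along the linear (rectified-flow) interpolation between noise and data. All network and variable names here are hypothetical, and the paper's self-corrected consistency and adversarial terms are omitted.

```python
# Illustrative sketch only; see the paper for the full objective.
import torch
import torch.nn as nn

dim = 8
student = nn.Linear(dim + 1, dim)   # hypothetical student velocity net v(x, t)
teacher = nn.Linear(dim + 1, dim)   # hypothetical frozen teacher velocity net
teacher.requires_grad_(False)

def velocity(net, x, t):
    # Concatenate the scalar time t to the input as a stand-in for
    # proper time conditioning.
    return net(torch.cat([x, t.expand(x.size(0), 1)], dim=1))

x1 = torch.randn(4, dim)            # "data" sample (toy)
x0 = torch.randn(4, dim)            # noise sample
t = torch.rand(1)

# Linear (rectified-flow style) interpolation between noise and data.
xt = (1 - t) * x0 + t * x1

# Distillation signal: the student's velocity at (xt, t) should match the
# teacher's, so one-step and few-step trajectories stay consistent.
loss_distill = (velocity(student, xt, t) - velocity(teacher, xt, t)).pow(2).mean()
```

In the paper this distillation signal is combined with consistency and adversarial losses; the sketch shows only the velocity-matching term.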
Details of the model architecture and experimental results can be found in our paper.
```bibtex
@inproceedings{dao2025scflow,
  title     = {Self-Corrected Flow Distillation for Consistent One-Step and Few-Step Text-to-Image Generation},
  author    = {Quan Dao and Hao Phung and Trung Dao and Dimitris Metaxas and Anh Tran},
  booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
  year      = {2025}
}
```

Please CITE our paper and give us a ⭐ whenever this repository is used to help produce published results or incorporated into other software.
TODO:
- [ ] Add our pretrained checkpoints
## Installation

Tested on Linux with Python 3.10 and PyTorch 2.1.0.

```shell
conda create -n scflow python=3.10
conda activate scflow
pip install -r requirements.txt
```

## Datasets

CelebA-HQ 256: Download our preprocessed LMDB as follows:
```shell
mkdir data/
cd data/
gdown --fuzzy https://drive.google.com/file/d/12NJYQv9lOZoVFcQgUeBaA0noYdewQ3lE/view?usp=sharing -O celeba-lmdb.zip
unzip celeba-lmdb.zip
cd ../
```

LAION: We use a 2M-sample subset of LAION with an aesthetic score > 6.25. Alternatively, laion/aesthetics_v2_6_5plus is a publicly available subset of ~3M images with an aesthetic score >= 6.5.
To precompute latents for a dataset for faster training:
```shell
pip install dagshub
python sd_distill/precomputed_data.py \
    --datadir laion/aesthetics_v2_6_5plus \
    --cache_dir ./data/laion_aesthetics_v2_6_5plus \
    --save_path ./data/laion_aesthetics_v2_6_5plus/latent_laion_aes/ \
    --num_samples 2_000_000 \
    --batch_size 64
```

Download pre-computed dataset stats for the CelebA-HQ dataset here and place them in pytorch_fid/.
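During training, the precomputed latents can then be read directly instead of re-encoding images through the VAE each step. Below is a hypothetical sketch of such a loader; the actual file layout written by sd_distill/precomputed_data.py may differ.

```python
# Hypothetical latent loader; the real save format of
# sd_distill/precomputed_data.py may differ (e.g. lmdb or sharded files).
import os
import torch
from torch.utils.data import Dataset

class LatentDataset(Dataset):
    """Loads one precomputed latent tensor per .pt file in a directory."""

    def __init__(self, root):
        self.paths = sorted(
            os.path.join(root, f) for f in os.listdir(root) if f.endswith(".pt")
        )

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, i):
        # Each file is assumed to hold one latent tensor.
        return torch.load(self.paths[i])
```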
## Pretrained Checkpoints

To download pretrained teacher checkpoints:

```shell
bash tools/download_teacher.sh
```

To download our distilled checkpoints:

```shell
bash tools/download_distilled.sh
```

All checkpoints are saved under the pretrained/ folder.
## Testing

```shell
bash tools/test.sh
```

For InstaFlow inference:

```shell
bash tools/infer_instaflow.sh
```

To evaluate on CelebA-HQ, add the following flags to tools/test_adm.sh to enable FID computation:

```shell
--compute_fid --output_log ${EXP}_${EPOCH_ID}_${METHOD}${STEPS}.log
```

Multi-GPU sampling with 8 GPUs is supported by default for faster evaluation.
## Training

```shell
bash tools/train.sh
```

For InstaFlow training:

```shell
bash tools/train_instaflow.sh
```

## Acknowledgments

This codebase builds upon Flow Matching in Latent Space (LFM). We also thank the authors of LCM, Rectified Flow, and InstaFlow for their great work and publicly available codebases that facilitated this research.
## Contacts

If you have any problems, please open an issue in this repository or send an email to tienhaophung@gmail.com.
