Official PyTorch implementation of Generalized Consistency Trajectory Models for Image Manipulation by Beomsu Kim*, Jaemin Kim*, Jeongsol Kim, and Jong Chul Ye (*Equal contribution).
Diffusion models suffer from two limitations.
- They require a large number of function evaluations (NFEs) to generate high-fidelity images.
- They only enable noise-to-image generation.
We propose the Generalized Consistency Trajectory Model (GCTM), which learns the probability flow ODE (PFODE) between arbitrary distributions via Flow Matching theory. Thus, GCTMs are capable of
- Noise-to-image and image-to-image translation,
- Score or velocity evaluation with NFE = 1,
- Traversal between arbitrary points of the PFODE with NFE = 1.
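To make the NFE terminology concrete, the toy sketch below contrasts step-by-step ODE integration with a single-evaluation jump along the same trajectory. It is an illustration only and not code from this repository: the ODE dx/dt = -x, the function names, and the closed-form jump are assumptions, chosen because the solution is known analytically. In a GCTM the jump would be a trained network rather than a formula.

```python
import math


def velocity(x: float, t: float) -> float:
    """Toy velocity field dx/dt = -x (a stand-in for a learned PFODE velocity)."""
    return -x


def euler_traverse(x: float, t: float, s: float, steps: int):
    """Integrate from time t to time s with Euler steps; NFE equals `steps`."""
    dt = (s - t) / steps
    nfe = 0
    for i in range(steps):
        x = x + dt * velocity(x, t + i * dt)
        nfe += 1
    return x, nfe


def jump(x: float, t: float, s: float) -> float:
    """Exact solution map from time t to time s, evaluated once (NFE = 1)."""
    return x * math.exp(-(s - t))


if __name__ == "__main__":
    x_t = 2.0
    approx, nfe = euler_traverse(x_t, 0.0, 1.0, steps=100)
    exact = jump(x_t, 0.0, 1.0)
    print(f"Euler: {approx:.4f} with NFE = {nfe}")
    print(f"Jump:  {exact:.4f} with NFE = 1")
    # Trajectory consistency: jumping t -> u -> s agrees with jumping t -> s.
    two_hops = jump(jump(x_t, 0.0, 0.4), 0.4, 1.0)
    print("two hops match direct jump:", abs(two_hops - exact) < 1e-12)
```

The last check mirrors the property a trajectory model is trained to satisfy: composing two jumps along the ODE gives the same point as one direct jump.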
Consequently, GCTMs are applicable to a wide variety of tasks, including but not limited to
- Unconditional generation
- Image-to-image translation
- Zero-shot and supervised image restoration
- Image editing
- Latent manipulation
Environment:
- CUDA version 12.0
- NVCC version 11.5.119
- Python version 3.11.5
- PyTorch version 2.0.1+cu118
- Torchvision version 0.15.2+cu118
- Torchaudio version 2.0.2+cu118
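One quick way to compare a local installation against the package versions listed above is sketched below. The script is not part of this repository. It uses only the standard library, the helper names are hypothetical, and it assumes that local build suffixes such as +cu118 can be ignored when comparing.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional

# Versions copied from the list above, without the "+cu118" build suffix.
EXPECTED = {"torch": "2.0.1", "torchvision": "0.15.2", "torchaudio": "2.0.2"}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version without its local suffix, or None if absent."""
    try:
        return version(package).split("+")[0]
    except PackageNotFoundError:
        return None


def check_environment(expected: Dict[str, str] = EXPECTED) -> Dict[str, bool]:
    """Map each package name to whether the installed version matches."""
    return {name: installed_version(name) == want for name, want in expected.items()}


if __name__ == "__main__":
    for name, ok in check_environment().items():
        found = installed_version(name)
        print(f"{name}: expected {EXPECTED[name]}, found {found}, match={ok}")
```

A mismatch does not necessarily mean the code will fail; the check only reports differences from the listed versions.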
Datasets:
- CIFAR10: https://www.cs.toronto.edu/~kriz/cifar.html
- FFHQ: https://github.com/NVlabs/ffhq-dataset
- Image-to-Image: https://efrosgans.eecs.berkeley.edu/pix2pix/datasets/
Use train_gctm.py to train unconditional and image-to-image models, and use train_gctm_inverse.py to train supervised image restoration models. To train unconditional or image-to-image models, first create a FID_stats directory and save the Inception activation statistics there in the format (dataset name)_(resolution).npz. The statistics can be computed with the save_fid_stats function in ./pytorch_fid/fid_score.py. Alternatively, comment out the FID evaluation lines in the training code.
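The naming convention for the statistics file can be sketched as follows. This helper is hypothetical and not part of the repository. The dataset name "cifar10" and the resolution 32 are example values only, since the exact spelling the training code expects is not stated here. The sketch does not compute any statistics; it only builds the expected path.

```python
from pathlib import Path


def fid_stats_path(dataset: str, resolution: int, root: str = "FID_stats") -> Path:
    """Build the path (dataset name)_(resolution).npz inside the stats directory."""
    return Path(root) / f"{dataset}_{resolution}.npz"


def ensure_fid_stats_dir(root: str = "FID_stats") -> Path:
    """Create the statistics directory if it does not exist yet."""
    directory = Path(root)
    directory.mkdir(parents=True, exist_ok=True)
    return directory


if __name__ == "__main__":
    # Example values (assumptions): dataset name "cifar10", resolution 32.
    print(fid_stats_path("cifar10", 32))
```

Calling ensure_fid_stats_dir() from the repository root creates the directory, after which the statistics produced by save_fid_stats can be stored at the path returned by fid_stats_path.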
Example training scripts are provided in the ./configs directory. For instance, to train a CIFAR10 unconditional model with independent coupling, run the command
sh ./configs/unconditional/cifar10.sh
If you find this paper useful for your research, please consider citing
@article{kim2024gctm,
  title={Generalized Consistency Trajectory Models for Image Manipulation},
  author={Beomsu Kim and Jaemin Kim and Jeongsol Kim and Jong Chul Ye},
  journal={arXiv preprint arXiv:2403.12510},
  year={2024}
}