MIDMs: Matching Interleaved Diffusion Models for Exemplar-based Image Translation

Official PyTorch implementation of the AAAI 2023 paper

Junyoung Seo*, Gyuseong Lee*, Seokju Cho, Jiyoung Lee, Seungryong Kim

*equal contribution

For more information, check out the paper on arXiv or the project page.

Preparation

Environmental Settings

Clone the Synchronized-BatchNorm-PyTorch repository.

cd models/
git clone https://github.com/vacancy/Synchronized-BatchNorm-PyTorch
cp -rf Synchronized-BatchNorm-PyTorch/sync_batchnorm .
cd ..

Then, download the weights of the VQ autoencoder (f=4, VQ, Z=8192, d=3) here and move model.ckpt and config.yaml to models/vq-f4.
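For example, assuming the downloaded files are in the current working directory, the placement might look like this:

mkdir -p models/vq-f4
mv model.ckpt config.yaml models/vq-f4/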

After that, install the dependencies.

conda env create -f environment.yml
conda activate midms

Alternatively, if you already have an LDM or Stable Diffusion environment, you can use that instead.

Pretrained Models

We provide a model fine-tuned on CelebA-HQ (edge-to-face). Download the weights here.

Place the weights as follows:

└── weights
    └── celeba
        ├── midms_celebA_finetuned.pth
        └── pretrained
            ├── config.yaml
            └── model.ckpt
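
For example, assuming the downloaded files are in the current working directory, the layout above can be created with something like:

mkdir -p weights/celeba/pretrained
mv midms_celebA_finetuned.pth weights/celeba/
mv config.yaml model.ckpt weights/celeba/pretrained/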

Datasets

For the datasets, we used the train and validation set provided by CoCosNet, which can be downloaded from here.

Inference

Prepare the validation dataset as specified above, and run inference.py, e.g.,

python inference.py --benchmark celebahqedge --inference_mode target_fixed --pick 11

where pick is the index of the condition image (e.g., a sketch). To evaluate the model on the full validation set, change inference_mode from target_fixed to evaluation.
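
For example, to evaluate on the CelebA-HQ validation set:

python inference.py --benchmark celebahqedge --inference_mode evaluation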

Training

Before fine-tuning MIDMs, first pretrain an LDM on the desired dataset following here; alternatively, the pretrained weights can be obtained from the model zoo.

Additionally, a pretrained VGG model is required. Please download it from the Training section of the CoCosNet repository and move it to models/. We used 8 NVIDIA RTX 3090 GPUs for fine-tuning, which took 5 to 12 hours per dataset.
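
A minimal sketch of placing the VGG weights (the file name vgg19_conv.pth is an assumption; use whatever file the CoCosNet repository provides):

mv vgg19_conv.pth models/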

Run train.py as follows:

torchrun --standalone --nproc_per_node=<NUM_GPU> train.py \
    --benchmark celebahqedge \
    --diffusion_config_path "weights/celeba/pretrained/config.yaml" \
    --diffusion_model_path "weights/celeba/pretrained/model.ckpt" \
    --phase e2e_recurrent --dataroot "/downloaded/dataset/folder" \
    --batch-size <BATCH_SIZE> \
    --snapshots "/path/to/save/results" --warmup_iter 10000

TIP

We found that the number of warm-up iterations and the number of training epochs are important when fine-tuning; training for too long can cause collapse. In addition, adjusting the scaling factors of the perceptual loss and style loss can reduce this trade-off. Finally, the training code is not yet well organized; it is currently being cleaned up, so if you encounter any errors or difficulties in implementation, please feel free to contact us.

Acknowledgement

This implementation borrows heavily from the official implementations of LDM and CoCosNet. We are deeply grateful for these projects.

Bibtex

@article{seo2022midms,
  title={MIDMs: Matching Interleaved Diffusion Models for Exemplar-based Image Translation},
  author={Seo, Junyoung and Lee, Gyuseong and Cho, Seokju and Lee, Jiyoung and Kim, Seungryong},
  journal={arXiv preprint arXiv:2209.11047},
  year={2022}
}
