CAT-DM: Controllable Accelerated Virtual Try-on with Diffusion Model

Tianjin University · Tencent LightSpeed Studio

[Paper] [Project]

Abstract

Image-based virtual try-on enables users to virtually try on different garments by altering the original clothes in their photographs. Generative Adversarial Networks (GANs) dominate the research field of image-based virtual try-on, but they have not resolved problems such as unnatural deformation of garments and blurry generation quality. Recently, diffusion models have emerged with surprising performance across various image generation tasks. While the generative quality of diffusion models is impressive, achieving controllability poses a significant challenge when applying them to virtual try-on tasks, and multiple denoising iterations limit their potential for real-time applications. In this paper, we propose CAT-DM, a Controllable Accelerated virtual Try-on method with a Diffusion Model. To enhance controllability, a basic diffusion-based virtual try-on network is designed that utilizes ControlNet to introduce additional control conditions and improves the feature extraction of garment images. In terms of acceleration, CAT-DM initiates the reverse denoising process from an implicit distribution generated by a pre-trained GAN-based model. Compared with previous diffusion-based try-on methods, CAT-DM not only retains the pattern and texture details of the in-shop garment but also reduces the number of sampling steps without compromising generation quality. Extensive experiments demonstrate the superiority of CAT-DM over both GAN-based and diffusion-based methods in producing more realistic images and accurately reproducing garment patterns.
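
The acceleration described above replaces the usual start from pure Gaussian noise with a truncated reverse process seeded by a GAN's coarse output. Below is a minimal, hypothetical sketch of the idea; the names used (gan_model, diffusion.q_sample, diffusion.p_sample) are illustrative placeholders, not the repository's actual API.

import torch

def accelerated_try_on(gan_model, diffusion, person, garment, t_trunc=100):
    # 1. A pre-trained GAN-based try-on model produces a coarse result.
    coarse = gan_model(person, garment)
    # 2. Instead of sampling from pure noise at t = T, diffuse the coarse
    #    result forward only up to an intermediate step t_trunc.
    t = torch.full((coarse.size(0),), t_trunc, dtype=torch.long)
    x_t = diffusion.q_sample(coarse, t, noise=torch.randn_like(coarse))
    # 3. Run just t_trunc reverse denoising steps (instead of the full T),
    #    conditioned on the garment via ControlNet.
    for step in reversed(range(t_trunc)):
        x_t = diffusion.p_sample(x_t, torch.full_like(t, step), cond=garment)
    return x_t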

Hardware Requirement

Our experiments were conducted on two NVIDIA GeForce RTX 4090 graphics cards, each with 24 GB of video memory. Please note that our model cannot be trained on graphics cards with less than 24 GB of video memory.
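
A quick way to check your GPUs before training, using PyTorch's device query API:

import torch

for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    total_gb = props.total_memory / 1024**3
    print(f"GPU {i}: {props.name}, {total_gb:.1f} GB")
    assert total_gb >= 24, "CAT-DM training requires at least 24 GB of video memory"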

Environment Requirement

  1. Clone the repository:
git clone https://github.com/zengjianhao/CAT-DM
  2. A suitable conda environment named CAT-DM can be created and activated with:
cd CAT-DM
conda env create -f environment.yaml
conda activate CAT-DM
  • If you want to change the name of the environment, you need to modify the name in both environment.yaml and setup.py.
  • Make sure that conda is installed on your computer.
  • If there is a network error, try updating the environment with conda env update -f environment.yaml.
  3. Install xFormers:
git clone https://github.com/facebookresearch/xformers.git
cd xformers
git submodule update --init --recursive
pip install -r requirements.txt
pip install -U xformers
cd ..
rm -rf xformers
  4. Open src/taming-transformers/taming/data/utils.py, delete the line from torch._six import string_classes, and change elif isinstance(elem, string_classes): to elif isinstance(elem, str): (the torch._six module was removed in recent PyTorch releases).
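
After step 4, you can sanity-check the environment with a short Python session (assumes the conda environment is active):

import torch
import xformers  # raises ImportError if step 3 failed

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("xformers:", xformers.__version__)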

Dataset Preparation

VITON-HD

  1. Download the VITON-HD dataset
  2. Create a folder datasets
  3. Put the VITON-HD dataset into this folder and rename it to vitonhd
  4. Generate the mask images
# Generate the train dataset mask images
python tools/mask_vitonhd.py datasets/vitonhd/train datasets/vitonhd/train/mask
# Generate the test dataset mask images
python tools/mask_vitonhd.py datasets/vitonhd/test datasets/vitonhd/test/mask

DressCode

  1. Download the DressCode dataset
  2. Create a folder datasets
  3. Put the DressCode dataset into this folder and rename it to dresscode
  4. Generate the mask images and the agnostic images
# Generate the dresses dataset mask images and the agnostic images
python tools/mask_dresscode.py datasets/dresscode/dresses datasets/dresscode/dresses/mask
# Generate the lower_body dataset mask images and the agnostic images
python tools/mask_dresscode.py datasets/dresscode/lower_body datasets/dresscode/lower_body/mask
# Generate the upper_body dataset mask images and the agnostic images
python tools/mask_dresscode.py datasets/dresscode/upper_body datasets/dresscode/upper_body/mask
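
The three commands above can also be wrapped in a short Python loop so that all DressCode categories are processed in one go:

import subprocess

for category in ["dresses", "lower_body", "upper_body"]:
    root = f"datasets/dresscode/{category}"
    # Generates the mask images and the agnostic images for this category.
    subprocess.run(["python", "tools/mask_dresscode.py", root, f"{root}/mask"], check=True)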

Details

The datasets folder should be organized as follows:

datasets
├── vitonhd
│   ├── test
│   │   ├── agnostic-mask
│   │   ├── mask
│   │   ├── cloth
│   │   ├── image
│   │   ├── image-densepose
│   │   ├── ...
│   ├── test_pairs.txt
│   ├── train
│   │   ├── agnostic-mask
│   │   ├── mask
│   │   ├── cloth
│   │   ├── image
│   │   ├── image-densepose
│   │   ├── ...
│   └── train_pairs.txt
├── dresscode
│   ├── dresses
│   │   ├── dense
│   │   ├── images
│   │   ├── mask
│   │   ├── ...
│   ├── lower_body
│   │   ├── dense
│   │   ├── images
│   │   ├── mask
│   │   ├── ...
│   ├── upper_body
│   │   ├── dense
│   │   ├── images
│   │   ├── mask
│   │   ├── ...
│   ├── test_pairs_paired.txt
│   ├── test_pairs_unpaired.txt
│   ├── train_pairs.txt
│   └── ...

PS: When we conducted our experiments, VITON-HD had not yet released its agnostic-mask images. We used our own mask implementation, so if you use VITON-HD's agnostic-mask, the generated results may differ.
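
You can verify the layout with a short script before moving on (paths taken from the tree above; extend the list as needed):

import os

expected = [
    "datasets/vitonhd/train/mask",
    "datasets/vitonhd/test/mask",
    "datasets/vitonhd/train_pairs.txt",
    "datasets/vitonhd/test_pairs.txt",
    "datasets/dresscode/dresses/mask",
    "datasets/dresscode/lower_body/mask",
    "datasets/dresscode/upper_body/mask",
    "datasets/dresscode/train_pairs.txt",
]
for path in expected:
    print(("ok      " if os.path.exists(path) else "MISSING ") + path)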

Required Model

  1. Download the Paint-by-Example model
  2. Create a folder checkpoints
  3. Put the Paint-by-Example model into this folder and rename it to pbe.ckpt
  4. Make the ControlNet model:
  • VITON-HD:
python tools/add_control.py checkpoints/pbe.ckpt checkpoints/pbe_dim6.ckpt configs/train_vitonhd.yaml
  • DressCode:
python tools/add_control.py checkpoints/pbe.ckpt checkpoints/pbe_dim5.ckpt configs/train_dresscode.yaml
  5. The checkpoints folder should be as follows:
checkpoints
├── pbe.ckpt
├── pbe_dim5.ckpt
└── pbe_dim6.ckpt
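
The dim5/dim6 suffixes presumably reflect the number of conditioning input channels used for each dataset. As a rough, hypothetical illustration of what a tool like tools/add_control.py does, a pre-trained convolution can be widened to accept extra channels by zero-initializing the new weights, so the model initially behaves exactly like the original checkpoint (the state-dict key below is a placeholder, not the repo's actual key):

import torch

ckpt = torch.load("checkpoints/pbe.ckpt", map_location="cpu")
state = ckpt["state_dict"]
key = "control_model.input_blocks.0.0.weight"  # hypothetical key name
old = state[key]
if old.shape[1] < 6:
    new = torch.zeros(old.shape[0], 6, *old.shape[2:])
    new[:, : old.shape[1]] = old  # keep pre-trained weights, zero-init the rest
    state[key] = new
torch.save(ckpt, "checkpoints/pbe_dim6.ckpt")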

Training

VITON-HD

bash scripts/train_vitonhd.sh

DressCode

bash scripts/train_dresscode.sh

Testing

VITON-HD

  1. Download the checkpoint for the VITON-HD dataset and put it into the checkpoints folder.

  2. Directly generate the try-on results:

bash scripts/test_vitonhd.sh
  3. Apply Poisson blending:
python tools/poisson_vitonhd.py
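
Poisson blending pastes the generated try-on region back onto the original photograph so that the seams are invisible; the same step applies to DressCode below. The repository's tools/poisson_*.py scripts handle this for the pipeline; as a standalone sketch of the technique, OpenCV's seamlessClone does the equivalent (file names here are placeholders):

import cv2
import numpy as np

generated = cv2.imread("result.png")                 # diffusion output
original = cv2.imread("person.png")                  # original photograph
mask = cv2.imread("mask.png", cv2.IMREAD_GRAYSCALE)  # try-on region
# Solve the Poisson equation over the masked region to blend gradients.
ys, xs = np.where(mask > 0)
center = (int(xs.mean()), int(ys.mean()))
blended = cv2.seamlessClone(generated, original, mask, center, cv2.NORMAL_CLONE)
cv2.imwrite("blended.png", blended)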

DressCode

  1. Download the checkpoint for the DressCode dataset and put it into the checkpoints folder.

  2. Directly generate the try-on results:

bash scripts/test_dresscode.sh
  3. Apply Poisson blending:
python tools/poisson_dresscode.py

Evaluation

Citing

@article{zeng2023cat,
  title={CAT-DM: Controllable Accelerated Virtual Try-on with Diffusion Model},
  author={Zeng, Jianhao and Song, Dan and Nie, Weizhi and Tian, Hongshuo and Wang, Tongtong and Liu, Anan},
  journal={arXiv preprint arXiv:2311.18405},
  year={2023}
}
