LSD-VTON: Local-Flow Global-Parsing Warping Integrated with Stable Diffusion for Virtual Try-On
Mohamed Walid, Karim Metwally, Mohamed Mostafa, Mohamed Hesham, Omar Ehab, and Mohamed Ahmed
The virtual try-on system is based on GP-VTON and LaDI-VTON.
We combine GP-VTON's strength in garment warping (Local-Flow Global-Parsing, LFGP) with LaDI-VTON's strength in image generation (Stable Diffusion).
Dataset: VITON-HD
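The two-stage idea described above (warp the garment first, then generate the final image) can be sketched as a simple data flow. This is only an illustrative sketch: `TryOnPipeline`, `fake_warp`, and `fake_generate` are hypothetical names introduced here, and the stand-in stages just record the call order rather than running GP-VTON or Stable Diffusion.

```python
from dataclasses import dataclass
from typing import Any, Callable

# Placeholder type: the real models would use image tensors (assumption).
Image = Any


@dataclass
class TryOnPipeline:
    """Hypothetical wrapper that chains a warping stage and a generation stage."""
    warp: Callable[[Image, Image], Image]      # (person, cloth) -> warped cloth
    generate: Callable[[Image, Image], Image]  # (person, warped cloth) -> try-on result

    def __call__(self, person: Image, cloth: Image) -> Image:
        # Stage 1: fit the garment to the person (role played by LFGP warping).
        warped_cloth = self.warp(person, cloth)
        # Stage 2: synthesize the final image (role played by Stable Diffusion).
        return self.generate(person, warped_cloth)


# Stand-in stages for demonstration only; they do no real image processing.
def fake_warp(person: Image, cloth: Image) -> Image:
    return f"warped({cloth})"


def fake_generate(person: Image, warped_cloth: Image) -> Image:
    return f"tryon({person}, {warped_cloth})"


pipeline = TryOnPipeline(warp=fake_warp, generate=fake_generate)
result = pipeline("person.jpg", "cloth.jpg")
print(result)
```

Keeping the two stages behind plain callables makes it easy to swap in a different warping or generation model when comparing architectures.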
- Create an e-commerce application that offers a virtual try-on feature.
- Deal with challenging inputs:
  - Complex poses
  - Complex garments
  - Clothes style: tucked-in or tucked-out
- Train the developed warping and generative models.
- Evaluate the performance of the developed virtual try-on model against benchmarks.
LSD-VTON System Architecture
Used Technologies/Apps:
- Python
- FastAPI
- Kaggle API
- SQL Server
- HTML, CSS, JS
- Kaggle (P100 GPU 16 GB VRAM, 30 GB RAM)
- PyCharm IDE
In this section we introduce our work (no code here) on testing the GP-VTON, LaDI-VTON, and LSD-VTON architectures with new person and cloth images that are not included in the original dataset, i.e., we try on our own clothes using our own photos.
We call it Mody Test!
For Mody Test (custom input), the only inputs we provide are:
- Person image
- Cloth image
From these, Mody Test runs a preprocessing step that consists of several complicated tasks:
- Person parse (Human parse)
- Person keypoints (skeleton)
- Person densepose
- Cloth parse
- Cloth binary mask
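Of the preprocessing outputs listed above, the cloth binary mask is the simplest to illustrate. The sketch below is a minimal thresholding approach and not necessarily the method used in this project: it assumes the garment is photographed on a near-white background, and the function name `cloth_binary_mask` and the `white_threshold` parameter are hypothetical. It will fail on white or very light garments, where a segmentation model would be needed instead.

```python
import numpy as np


def cloth_binary_mask(cloth_rgb: np.ndarray, white_threshold: int = 240) -> np.ndarray:
    """Return a uint8 mask: 255 where the garment is, 0 where the background is.

    Assumption: the background is near-white, so a pixel counts as background
    only if all three RGB channels are at or above `white_threshold`.
    """
    if cloth_rgb.ndim != 3 or cloth_rgb.shape[2] != 3:
        raise ValueError("expected an H x W x 3 RGB array")
    background = np.all(cloth_rgb >= white_threshold, axis=2)
    return np.where(background, 0, 255).astype(np.uint8)


# Tiny synthetic example: a 4x4 white image with a 2x2 red "garment" in the middle.
image = np.full((4, 4, 3), 255, dtype=np.uint8)
image[1:3, 1:3] = (200, 30, 30)
mask = cloth_binary_mask(image)
print(mask)
```

The other preprocessing outputs (human parse, keypoints, DensePose, cloth parse) require pretrained models and are not sketched here.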
Here is a spoiler image for our work (Mody Test)
@inproceedings{xie2023gpvton,
title = {GP-VTON: Towards General Purpose Virtual Try-on via Collaborative Local-Flow Global-Parsing Learning},
author = {Xie, Zhenyu and Huang, Zaiyu and Dong, Xin and Zhao, Fuwei and Dong, Haoye and Zhang, Xijin and Zhu, Feida and Liang, Xiaodan},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
month = {June},
year = {2023},
}
@inproceedings{morelli2023ladi,
title={{LaDI-VTON: Latent Diffusion Textual-Inversion Enhanced Virtual Try-On}},
author={Morelli, Davide and Baldrati, Alberto and Cartella, Giuseppe and Cornia, Marcella and Bertini, Marco and Cucchiara, Rita},
booktitle={Proceedings of the ACM International Conference on Multimedia},
year={2023}
}
Thanks to all the authors of the works cited above.
The use of this code is RESTRICTED to non-commercial research and educational purposes.