Torch7 implementation for the paper
- "Style-Controlled Synthesis of Clothing Segments for Fashion Image Manipulation," IEEE Transactions on Multimedia (vol. 22, no. 2, Feb 2020)
Given a clothing segment of a person image and a desired style, our method synthesizes a new clothing item. This item is superimposed on the person image for virtual try-on.
Because of copyright issues, we cannot provide our experimental data (i.e., preprocessed versions of the datasets in [1–5]).
- Please contact me via e-mail (bokyeong1015 at gmail dot com) if you want to know data download & preprocessing steps.
The following environments were tested (newer GPUs and CUDA versions may work but are untested):
- Ubuntu 16.04 or 14.04
- GTX Titan X or GTX 1080
- CUDA 8.0 or 7.5
- cuDNN 5.1
Install Torch7 and the following torch packages:
bash ./download_packages.sh
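As a quick environment check before running the scripts, here is a minimal sanity-check sketch. The package list (nn, image, cutorch, cunn, cudnn) is an assumption about typical dependencies for this kind of Torch7 project, not an exhaustive list of what download_packages.sh installs.

```lua
-- Sanity-check sketch: confirm that the assumed Torch packages load after
-- running download_packages.sh (the actual dependency list may differ).
require 'torch'
require 'nn'
require 'image'
require 'cutorch'   -- GPU packages; comment these out for a CPU-only check
require 'cunn'
require 'cudnn'
print('Packages loaded; CUDA devices: ' .. cutorch.getDeviceCount())
```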
(1) Download the FsSyn+LkBk dataset (see the data notice above; download scripts are not provided due to copyright issues)
(2) Download the resol2_vggL1 model via its GoogleDrive link (see the pre-trained model list below). Set the path to be models_trained/resol2_vggL1/model_minValidLoss.t7
(3) Test the model:
bash ./test_LkBk.sh
We had planned to provide full scripts for downloading our experimental data, but recognized copyright issues in distributing the datasets in [1–5]. Please check the notice above and feel free to contact me.
- We selected person images from FsSyn [1] and top-clothing product images from LkBk [2].
- We extracted clothing segments from CCP [3], CFPD [4], and FS [5] datasets, and unified them.
- Download the pre-trained models using the following GoogleDrive links.
- resol2_vggL1 (~1.2GB): FINAL model, 384×128 input size, {L1-pixel + VGG-feature} training loss
- resol2_onlyL1 (~1.2GB): 384×128 input size, only L1-pixel loss
- resol1_vggL1 (~840MB): 192×64 input size, {L1-pixel + VGG-feature} loss
- resol1_onlyL1 (~840MB): 192×64 input size, only L1-pixel loss
- Put the downloaded models in models_trained/MODEL_NAME, where MODEL_NAME is one of the names above (e.g., resol2_vggL1). The resulting paths are models_trained/resol2_vggL1/model_minValidLoss.t7, models_trained/resol1_vggL1/model_minValidLoss.t7, etc.
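To confirm that a downloaded checkpoint is placed correctly and readable, a small sketch (the path below is for resol2_vggL1; adjust it to the model you downloaded):

```lua
-- Sketch: check that a downloaded checkpoint loads with torch.load.
require 'torch'
require 'nn'   -- needed so the serialized nn modules can be deserialized
-- if loading fails with CUDA-related errors, also require 'cunn' and 'cudnn'

local path = 'models_trained/resol2_vggL1/model_minValidLoss.t7'
local model = torch.load(path)
print(model)   -- prints the network architecture if the file is intact
```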
- (Optional) Modify lines 17 and 18 in test.lua depending on the model you want to test (default: resol2_vggL1); a hypothetical sketch of these lines follows the commands below.
- Test the model on clothing product images (LkBk) or clothing segments (UnifSegm) with one of the following scripts. Person images from FsSyn are used in both cases.
bash ./test_LkBk.sh
bash ./test_UnifSegm.sh
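The model-selection lines in test.lua (lines 17 and 18 mentioned above) are not reproduced here; the sketch below only illustrates the kind of change involved, with hypothetical variable names:

```lua
-- Hypothetical illustration of selecting a pre-trained model in test.lua
-- (variable names are assumptions, not the actual source).
local model_name = 'resol2_vggL1'  -- or 'resol2_onlyL1', 'resol1_vggL1', 'resol1_onlyL1'
local model_path = 'models_trained/' .. model_name .. '/model_minValidLoss.t7'
```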
- Download the VGG-19 network (used for the training loss) from this GoogleDrive link (~2.2GB). Then, put the downloaded VGG_ILSVRC_19_layers_nn.t7 in src_train/src_percepLoss.
- Train the model from scratch for 384×128 (resol2) or 192×64 (resol1) input with one of the following scripts. The learning rate schedule was set manually. A sketch of the combined loss follows the commands below.
bash ./train_resol2.sh
bash ./train_resol1.sh
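For readers curious about the {L1-pixel + VGG-feature} objective used by the *_vggL1 models, below is a minimal Torch7 sketch of such a combined loss. The VGG layer cutoff and the weight lambda are illustrative assumptions, not the paper's settings; the repository's actual loss code is adapted from fast-neural-style, as noted in the acknowledgments below.

```lua
-- Minimal sketch of an {L1-pixel + VGG-feature} loss in Torch7.
-- The layer cutoff (20) and lambda are illustrative assumptions.
require 'torch'
require 'nn'

-- Fixed, truncated VGG-19 used as a feature extractor
-- (assumes the loaded network is an nn.Sequential container)
local vgg = torch.load('src_train/src_percepLoss/VGG_ILSVRC_19_layers_nn.t7')
local featNet = nn.Sequential()
for i = 1, 20 do featNet:add(vgg:get(i)) end
featNet:evaluate()

local pixelCrit = nn.AbsCriterion()  -- L1 loss on pixels
local featCrit  = nn.AbsCriterion()  -- L1 loss on VGG features
local lambda    = 0.1                -- illustrative weight on the feature term

-- pred and target are images preprocessed for VGG input
local function combinedLoss(pred, target)
  local pixelLoss = pixelCrit:forward(pred, target)
  local predFeat  = featNet:forward(pred):clone()  -- clone before reusing featNet
  local targFeat  = featNet:forward(target)
  local featLoss  = featCrit:forward(predFeat, targFeat)
  return pixelLoss + lambda * featLoss
end
```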
- [1. Fashion Synthesis (FsSyn)] S. Zhu et al., “Be Your Own Prada: Fashion Synthesis with Structural Coherence,” in ICCV’17
- [2. LookBook (LkBk)] D. Yoo et al., “Pixel-level domain transfer,” in ECCV’16
- [3. Clothing Co-Parsing (CCP)] W. Yang et al., “Clothing co-parsing by joint image segmentation and labeling,” in CVPR’14
- [4. Colorful Fashion Parsing (CFPD)] S. Liu et al., “Fashion parsing with weak color-category labels,” IEEE Transactions on Multimedia, 2014
- [5. Fashionista (FS)] K. Yamaguchi et al., “Parsing clothing in fashion photographs,” in CVPR’12
- We thank the authors of [1–5] for providing their datasets.
- The code for the VGG-feature loss borrows heavily from fast-neural-style. The network architectures were modified from pix2pix. The VITON-stage1 masks were computed using VITON. We thank the authors for open-sourcing their projects.
If you plan to use our code and datasets, please consider citing our paper:
@ARTICLE{8770290,
author={B. {Kim} and G. {Kim} and S. {Lee}},
journal={IEEE Transactions on Multimedia},
title={Style-Controlled Synthesis of Clothing Segments for Fashion Image Manipulation},
year={2020},
volume={22},
number={2},
pages={298-310},
doi={10.1109/TMM.2019.2929000}}