
Diverse Image-to-Image Translation via Disentangled Representations (High resolution)

PyTorch implementation of multi-modal image-to-image (I2I) translation on tasks with high-resolution images. We adopt a multi-scale generator and discriminator architecture to stabilize training and enhance the quality of the generated images. This project is an extension of "Diverse Image-to-Image Translation via Disentangled Representations" (https://arxiv.org/abs/1808.00948), ECCV 2018.
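As a rough illustration of the multi-scale idea, the sketch below shows a discriminator that scores the same image at several resolutions. The layer widths, number of scales, and class names are assumptions for illustration only, not the exact modules used in this repository.

```python
# Minimal sketch of a multi-scale patch discriminator (illustrative, not the repo's code).
import torch
import torch.nn as nn

class PatchDiscriminator(nn.Module):
    """A small patch discriminator operating on a single image scale."""
    def __init__(self, in_ch=3, ndf=64):
        super().__init__()
        layers = [nn.Conv2d(in_ch, ndf, 4, stride=2, padding=1), nn.LeakyReLU(0.2, inplace=True)]
        ch = ndf
        for _ in range(3):
            layers += [nn.Conv2d(ch, ch * 2, 4, stride=2, padding=1),
                       nn.InstanceNorm2d(ch * 2), nn.LeakyReLU(0.2, inplace=True)]
            ch *= 2
        layers += [nn.Conv2d(ch, 1, 4, stride=1, padding=1)]  # patch-level real/fake map
        self.model = nn.Sequential(*layers)

    def forward(self, x):
        return self.model(x)

class MultiScaleDiscriminator(nn.Module):
    """Runs a separate discriminator on progressively downsampled copies of the image."""
    def __init__(self, num_scales=3):
        super().__init__()
        self.discs = nn.ModuleList(PatchDiscriminator() for _ in range(num_scales))
        self.downsample = nn.AvgPool2d(3, stride=2, padding=1, count_include_pad=False)

    def forward(self, x):
        outputs = []
        for disc in self.discs:
            outputs.append(disc(x))
            x = self.downsample(x)  # feed a coarser version to the next discriminator
        return outputs
```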

Contact: Hsin-Ying Lee (hlee246@ucmerced.edu) and Hung-Yu Tseng (htseng6@ucmerced.edu)

Paper

Diverse Image-to-Image Translation via Disentangled Representations
Hsin-Ying Lee*, Hung-Yu Tseng*, Jia-Bin Huang, Maneesh Kumar Singh, and Ming-Hsuan Yang
European Conference on Computer Vision (ECCV), 2018 (oral) (* equal contribution)

Please cite our paper if you find the code or dataset useful for your research.

@inproceedings{DRIT,
  author = {Lee, Hsin-Ying and Tseng, Hung-Yu and Huang, Jia-Bin and Singh, Maneesh Kumar and Yang, Ming-Hsuan},
  booktitle = {European Conference on Computer Vision},
  title = {Diverse Image-to-Image Translation via Disentangled Representations},
  year = {2018}
}

Example Results

Usage

Prerequisites

Install

  • Clone this repo:
git clone https://github.com/hytseng0509/DRIT_hr.git
cd DRIT_hr

Datasets

  • We validate our model on street scene datasets: GTA and Cityscapes
cd datasets/gta2cityscapes
mkdir trainA trainB
  • Download images from the two domains and place them in the trainA and trainB folders respectively (a minimal loading sketch follows this list)
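For reference, here is a minimal sketch of how an unpaired two-folder dataset over the trainA/trainB layout above might be loaded in PyTorch. The class name, crop size, and transforms are illustrative assumptions, not the repository's actual data loader.

```python
# Illustrative unpaired dataset over <root>/trainA and <root>/trainB (not the repo's loader).
import os
import random

from PIL import Image
from torch.utils.data import Dataset
import torchvision.transforms as T

class UnpairedFolderDataset(Dataset):
    """Loads random unaligned image pairs from the two domain folders."""

    def __init__(self, root, crop_size=512):
        self.paths_a = sorted(
            os.path.join(root, "trainA", f) for f in os.listdir(os.path.join(root, "trainA"))
        )
        self.paths_b = sorted(
            os.path.join(root, "trainB", f) for f in os.listdir(os.path.join(root, "trainB"))
        )
        self.transform = T.Compose([
            T.Resize(crop_size),
            T.RandomCrop(crop_size),
            T.RandomHorizontalFlip(),
            T.ToTensor(),
            T.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
        ])

    def __len__(self):
        # Iterate over the larger domain; the other domain is sampled at random.
        return max(len(self.paths_a), len(self.paths_b))

    def __getitem__(self, idx):
        img_a = Image.open(self.paths_a[idx % len(self.paths_a)]).convert("RGB")
        img_b = Image.open(random.choice(self.paths_b)).convert("RGB")
        return self.transform(img_a), self.transform(img_b)
```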

Training and Testing

  • Training
python3 train.py --dataroot ../datasets/gta2cityscapes --name NAME --display_dir DISPLAY_DIR --result_dir RESULT_DIR
tensorboard --logdir DISPLAY_DIR/NAME

Results and saved models can be found at RESULT_DIR/NAME.

  • Generate results with randomly sampled attributes
    • Require folder testA (for a2b) or testB (for b2a) under dataroot
python3 test.py --dataroot ../datasets/gta2cityscapes --name NAME --output_dir OUTPUT_DIR --resume MODEL_FILE --num NUM_PER_IMG
  • Generate results with attributes encoded from given images
    • Require both folders testA and testB under dataroot
python3 test_transfer.py --dataroot ../datasets/gta2cityscapes --name NAME --output_dir OUTPUT_DIR --resume MODEL_FILE
  • Results can be found at OUTPUT_DIR/NAME

Note

  • The feature-wise transformation (i.e., --concat 0) has not been fully tested yet
  • We also adopt the mode seeking loss; specify --ms to apply it during training (see the sketch after this list)
  • Due to the large number of training images in the GTA dataset, the default number of training epochs is set to 90. Please refer to the default setting in the original DRIT if the number of training images is around 1K.
  • Feel free to contact the authors with any suggestions for improving the code
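For reference, below is a hedged sketch of the mode seeking regularization (Mao et al., "Mode Seeking Generative Adversarial Networks", CVPR 2019) that --ms enables. The function name and epsilon value are illustrative assumptions, not the exact code in this repository.

```python
# Illustrative mode seeking regularization term (not the repo's exact implementation).
import torch

def mode_seeking_loss(img1, img2, z1, z2, eps=1e-5):
    """Encourages images generated from different latent codes to differ:
    the ratio |G(z1) - G(z2)| / |z1 - z2| is maximized by minimizing its reciprocal."""
    ratio = torch.mean(torch.abs(img1 - img2)) / torch.mean(torch.abs(z1 - z2))
    return 1.0 / (ratio + eps)
```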
