Skip to content
forked from junyanz/iGAN

iGAN: a deep learning software that easily generates images with a few brushstrokes (from UC Berkeley and Adobe CTL)

License

Notifications You must be signed in to change notification settings

PhoenixDai/iGAN

 
 

Repository files navigation

iGAN: Interactive Image Generation via Generative Adversarial Networks

[Project] [Youtube] [Paper]
A research prototype developed by UC Berkeley and Adobe CTL.

Overview

iGAN (aka. interactive GAN) is the author's implementation of interactive image generation interface described in:
"Generative Visual Manipulation on the Natural Image Manifold"
Jun-Yan Zhu, Philipp Krähenbühl, Eli Shechtman, Alexei A. Efros
In European Conference on Computer Vision (ECCV) 2016

Given a few user strokes, our system could produce photo-realistic samples that best satisfy the user edits at real-time. Our system is based on deep generative models such as Generative Adversarial Networks (GAN) and DCGAN. The system serves the following two purposes:

  • An intelligent drawing interface for automatically generating images inspired by the color and shape of the brush strokes.
  • An interactive visual debugging tool for understanding and visualizing deep generative models. By interacting with the generative model, a developer can understand what visual content the model can produce, as well as the limitation of the model.

We are working on supporting more generative models (e.g. variational autoencoder) and more deep learning frameworks (e.g. Tensorflow). You are welcome to propose new changes or contribute new features (e.g. Tensorflow branch) via pull requests. Please cite our paper if you find this code useful in your research.

Contact: Jun-Yan Zhu (junyanz@berkeley.edu)

Getting started

  • Install the python libraries. (See Requirements).
  • Download the code from GitHub:
git clone https://github.com/junyanz/iGAN
cd iGAN
  • Download the model. (See Model Zoo for details):
bash ./models/scripts/download_dcgan_model.sh outdoor_64.dcgan_theano
  • Run the python script:
THEANO_FLAGS='device=gpu0, floatX=float32, nvcc.fastmath=True' python iGAN_main.py --model_name outdoor_64

Requirements

The code is written in Python2 and requires the following 3rd party libraries:

sudo apt-get install python-opencv
  • As an alternative to opencv, you can install scikit-image
sudo pip install -U scikit-image
sudo pip install --upgrade --no-deps git+git://github.com/Theano/Theano.git
  • PyQt4: more details on Qt installation can be found here
sudo apt-get install python-qt4
sudo pip install qdarkstyle
sudo pip install dominate
  • GPU + CUDA + cuDNN: The code is tested on GTX Titan X + CUDA 7.5 + cuDNN 5. Here are the tutorials on how to install CUDA and cuDNN. A decent GPU is required to run the system at real-time. [Warning] If you run the program on a gpu server, you need to use a remote desktop software (e.g. VNC), which may introduce display artifacts and latency problem.

Interface:

See [Youtube] at 2:18s for the interactive image generation demos.

Layout

  • Drawing Pad: This is the main window of our interface. A user can apply different edits via our brush tools and the system will display the generated image. Check/Uncheck Edits button to display/hide user edits.
  • Candidate Results: a display showing thumbnails of all the candidate results (e.g. different modes) that fits the user edits. A user can click a mode (highlighted by a green rectangle), and the drawing pad will show this result.
  • Brush Tools: Coloring Brush for changing the color of a specific region; Sketching brush for outlining the shape. Warping brush for modifying the shape more explicitly.
  • Slider Bar: drag the slider bar to explore the interpolation sequence between the initial result (i.e. random generated image) and the current result (e.g. image that satisfies the user edits).
  • Control Panel: Play: play the interpolation sequence; Fix: use the current result as additional constraints for further editing Restart: restart the system; Save: save the result to a webpage. Edits: Check the box if you would like to show the edits on top of the generated image.

User interaction

  • Coloring Brush: right click to select a color; hold left click to paint; scroll the mouse wheel to adjust the width of the brush.
  • Sketching Brush: hold left click to sketch the shape.
  • Warping Brush: We recommend you first use coloring and sketching before the warping brush. Right click to select a square region; hold left click to drag the region; scroll the mouse wheel to adjust the size of the square region.
  • Shortcuts: P for Play, F for Fix, R for Restart; S for Save; E for Edits; Q for quitting the program.
  • Tooltips: when you move the cursor over a button , the system will display the tooltip of the button.

Model Zoo:

Download the theano DCGAN model (e.g. outdoor_64.dcgan_theano). Before using our system, please check out the random real images vs. DCGAN generated samples to see which kind of images that a model can produce.

bash ./models/scripts/download_dcgan_model.sh outdoor_64.dcgan_theano

We provide a simple script to generate samples from a pre-trained DCGAN model. You can run this script to test if Theano, CUDA, cuDNN are configured properly before running our interface.

THEANO_FLAGS='device=gpu0, floatX=float32, nvcc.fastmath=True' python generate_samples.py --model_name outdoor_64 --output_image outdoor_64_dcgan.png

Command line arguments:

Type python iGAN_main.py --help for a complete list of the arguments. Here we discuss some important arguments:

  • --model_name: the name of the model (e.g. outdoor_64, shoes_64, etc.)
  • --model_type: currently only supports dcgan_theano.
  • --model_file: the file that stores the generative model; If not specified, model_file='./models/%s.%s' % (model_name, model_type)
  • --top_k: the number of the candidate results being displayed

Try your own data and models

  • DCGAN_theano model on new datasets: we will provide a model training script soon (by Sep 25 2016). The script can train a model (e.g. cat_64.dcgan_theano) given a new photo collection. (e.g. cat_photos/)
  • New deep generative models based on Theano (e.g. VAE:variational autoencoder): The current design of our software follows: ui python class (e.g. gui_draw.py) => constrained optimization python class (constrained_opt_theano.py) => deep generative model python class (e.g. dcgan_theano.py). To incorporate your own generative model, you need to create a new python class (e.g. vae_theano.py) under model_def folder with the same interface of dcgan_theano.py, and specify --model_type vae_theano in the command line.
  • Generative models based on another deep learning framework (e.g. Tensorflow): we are working on a tensorflow based optimization class (i.e. constrained_opt_tensorflow.py) now. Once the code is released, you can create your own tensorflow model class (e.g. dcgan_tensorflow.py) under model_def folder.

TODO

  • Support Python3.
  • Add 128x128 models.
  • Add the DCGAN model training script.
  • Add the script for projecting an image to the latent vector z.
  • Support other deep learning frameworks (e.g. Tensorflow).
  • Support other deep generative models (e.g. variational autoencoder).
  • Support sketch models for sketching guidance, inspired by ShadowDraw.
  • Support average image mode for visual data exploration, inspired by AverageExplorer.
  • Support image morphing mode.
  • Support image editing mode.

Citation

@inproceedings{zhu2016generative,
  title={Generative Visual Manipulation on the Natural Image Manifold},
  author={Zhu, Jun-Yan and Kr{\"a}henb{\"u}hl, Philipp and Shechtman, Eli and Efros, Alexei A.},
  booktitle={Proceedings of European Conference on Computer Vision (ECCV)},
  year={2016}
}

Cat Paper Collection

If you love cats, and love reading cool graphics, vision, and learning papers, please check out our Cat Paper Collection:
[Github] [Webpage]

Acknowledgement

  • We modified the DCGAN code in our package. Thanks the authors for sharing the code. Please cite the original DCGAN paper if you use their models.
  • This work was supported, in part, by funding from Adobe, eBay and Intel, as well as a hardware grant from NVIDIA. J.-Y. Zhu is supported by Facebook Graduate Fellowship.

About

iGAN: a deep learning software that easily generates images with a few brushstrokes (from UC Berkeley and Adobe CTL)

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Python 99.8%
  • Shell 0.2%