initial codebase
junyanz committed Sep 21, 2016
0 parents commit e52af05
Showing 39 changed files with 3,432 additions and 0 deletions.
94 changes: 94 additions & 0 deletions .gitignore
@@ -0,0 +1,94 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# added
*.dcgan_theano
.idea/
web/

# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# IPython Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# dotenv
.env

# virtualenv
.venv/
venv/
ENV/

# Spyder project settings
.spyderproject

# Rope project settings
.ropeproject
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
The MIT License (MIT)

Copyright (c) 2016 Jun-Yan Zhu

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
133 changes: 133 additions & 0 deletions README.md
@@ -0,0 +1,133 @@
## iGAN: Interactive Image Generation powered by GAN
[[Project Webpage]](http://www.eecs.berkeley.edu/~junyanz/projects/gvm/) [[YouTube Video]](https://youtu.be/9c4z6YsBGQ0)
Contact: Jun-Yan Zhu (junyanz at eecs dot berkeley dot edu)

## Overview
This is the authors' implementation of the interactive image generation interface described in:
"Generative Visual Manipulation on the Natural Image Manifold"
[Jun-Yan Zhu](https://people.eecs.berkeley.edu/~junyanz/), [Philipp Krähenbühl](http://www.philkr.net/), [Eli Shechtman](https://research.adobe.com/person/eli-shechtman/), [Alexei A. Efros](https://people.eecs.berkeley.edu/~efros/)
In European Conference on Computer Vision (ECCV) 2016

<img src='pics/demo_teaser.jpg' width=800>


Given a few user brush strokes, our system can generate photo-realistic samples that best satisfy the user edits in real time. The interface is based on deep generative models such as Generative Adversarial Networks ([GAN](https://arxiv.org/abs/1406.2661)). The system serves two main purposes:
* An intelligent drawing interface for automatically generating images inspired by the color and shape of the digital brush strokes.
* An interactive visual debugging tool for understanding and visualizing deep generative models. By interacting with the deep generative model, a developer can understand what the model can and cannot produce.

We are working on supporting more generative models (e.g. VAE) under more deep learning frameworks (e.g. TensorFlow). Please cite our paper if you find this code useful in your research.


## Getting started
* Install the Python libraries (see `Requirements`).
* Download the code from GitHub.
* Download the pre-trained GAN models (see `Pre-trained models`).
* Run the code:
``` bash
THEANO_FLAGS='device=gpu0, floatX=float32, nvcc.fastmath=True' python iGAN_main.py --model_name outdoor_64 --model_type dcgan_theano
```

## Requirements
The code is written in Python 2 and requires the following third-party libraries:
* numpy
* [OpenCV](http://opencv.org/)
```bash
sudo apt-get install libopencv-dev
```
* As an alternative to OpenCV, you can install scikit-image
```bash
sudo pip install -U scikit-image
```
* [Theano](https://github.com/Theano/Theano)
```bash
sudo pip install --upgrade --no-deps git+git://github.com/Theano/Theano.git
```
* [PyQt4](https://wiki.python.org/moin/PyQt4): more details on Qt installation can be found [here](http://www.saltycrane.com/blog/2008/01/how-to-install-pyqt4-on-ubuntu-linux/)
```bash
sudo apt-cache search pyqt
sudo apt-get install python-qt4
```
* [Qdarkstyle](https://github.com/ColinDuquesnoy/QDarkStyleSheet)
```bash
sudo pip install qdarkstyle
```
* [dominate](https://github.com/Knio/dominate):
```bash
sudo pip install dominate
```
* GPU + CUDA + cuDNN:
The code is tested on a GTX Titan X with CUDA 7.5 and cuDNN 5.1. Here are tutorials on how to install [CUDA](http://www.r-tutor.com/gpu-computing/cuda-installation/cuda7.5-ubuntu) and [cuDNN](http://askubuntu.com/questions/767269/how-can-i-install-cudnn-on-ubuntu-16-04). A decent GPU is required to run the system in near real time. If you run the program on a GPU server, you need to use remote desktop software (e.g. VNC). A quick sanity check for the Theano + GPU setup is sketched below.
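
The snippet below is a minimal sanity check (it is not part of the iGAN package) that assumes Theano is already installed; it simply confirms that Theano picks up the GPU and float32 settings and can compile a small function:
```python
# quick_gpu_check.py -- minimal sanity check, not part of the iGAN code.
# Run with: THEANO_FLAGS='device=gpu0, floatX=float32' python quick_gpu_check.py
import numpy as np
import theano
import theano.tensor as T

print('Theano device: %s' % theano.config.device)   # expect 'gpu0'
print('floatX: %s' % theano.config.floatX)          # expect 'float32'

x = T.matrix('x')
f = theano.function([x], T.tanh(x).sum())            # tiny graph to force compilation
print('result: %s' % f(np.random.rand(64, 64).astype(theano.config.floatX)))
```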

## Pre-trained models:
Download a Theano [DCGAN](https://github.com/Newmu/dcgan_code) model (e.g. outdoor_64). Before using our system, please compare the random real images with the DCGAN-generated samples to see which kinds of images a model can produce.

``` bash
bash ./models/scripts/download_dcgan_model.sh outdoor_64.dcgan_theano
```
* [outdoor_64.dcgan_theano](https://people.eecs.berkeley.edu/~junyanz/projects/gvm/models/theano_dcgan/outdoor_64.dcgan_theano) (64x64): trained on 150K landscape images from the MIT [Places](http://places.csail.mit.edu/) dataset. [[Real](https://people.eecs.berkeley.edu/~junyanz/projects/gvm/samples/outdoor_64_real.png) vs. [DCGAN](https://people.eecs.berkeley.edu/~junyanz/projects/gvm/samples/outdoor_64_dcgan.png)]
* [church_64.dcgan_theano](https://people.eecs.berkeley.edu/~junyanz/projects/gvm/models/theano_dcgan/church_64.dcgan_theano) (64x64): trained on 126k church images from the [LSUN](http://lsun.cs.princeton.edu/2016/) challenge. [[Real](https://people.eecs.berkeley.edu/~junyanz/projects/gvm/samples/church_64_real.png) vs. [DCGAN](https://people.eecs.berkeley.edu/~junyanz/projects/gvm/samples/church_64_dcgan.png)]
* [shoes_64.dcgan_theano](https://people.eecs.berkeley.edu/~junyanz/projects/gvm/models/theano_dcgan/shoes_64.dcgan_theano) (64x64): trained on 50K shoe images collected by [Yu and Grauman](http://vision.cs.utexas.edu/projects/finegrained/utzap50k/). [[Real](https://people.eecs.berkeley.edu/~junyanz/projects/gvm/samples/shoes_64_real.png) vs. [DCGAN](https://people.eecs.berkeley.edu/~junyanz/projects/gvm/samples/shoes_64_dcgan.png)]
* [handbag_64.dcgan_theano](https://people.eecs.berkeley.edu/~junyanz/projects/gvm/models/theano_dcgan/handbag_64.dcgan_theano) (64x64): trained on 137K handbag images downloaded from Amazon. [[Real](https://people.eecs.berkeley.edu/~junyanz/projects/gvm/samples/handbag_64_real.png) vs. [DCGAN](https://people.eecs.berkeley.edu/~junyanz/projects/gvm/samples/handbag_64_dcgan.png)]

We also provide a script to generate sample images from the DCGAN model.
```bash
THEANO_FLAGS='device=gpu0, floatX=float32, nvcc.fastmath=True' python generate_samples.py --model_name outdoor_64 --output_image outdoor_64_dcgan.png
```


## Interface:
See the [[YouTube Video]](https://youtu.be/9c4z6YsBGQ0?t=2m18s) starting at 2:18 for interactive image generation demos.
<img src='pics/ui_intro.jpg' width=800>

#### Layout
* Drawing Pad: the main window of our interface. A user can apply different edits via our brush tools, and the system will display the generated image. Check/uncheck the `Edits` button to display/hide the user edits.
* Candidate Results: a display showing thumbnails of all the candidate results (e.g. different modes) that fit the user edits. A user can click a mode (highlighted by a green rectangle), and the drawing pad will show this result.
* Brush Tools: `Coloring Brush` for changing the color of a specific region; `Sketching Brush` for outlining the shape; `Warping Brush` for modifying the shape more explicitly.
* Slider Bar: drag the slider bar to explore the interpolation sequence between the initial result (i.e. a randomly generated image) and the current result (i.e. the image that satisfies the user edits); a minimal sketch of this interpolation appears after this list.
* Control Panel: `Play`: play the interpolation sequence; `Fix`: fix the current result; `Restart`: restart the system; `Save`: save the result to a folder; `Edits`: check the box to show the edits on top of the generated image.
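
The slider's interpolation can be pictured as a walk in the generator's latent space. The sketch below is purely illustrative (the function name and the linear blending scheme are assumptions, not code from this repository): each intermediate latent code would be fed through the generator to render one frame of the sequence the slider scrubs through.
```python
import numpy as np

def interpolate_latent(z0, z1, n_steps=16):
    """Blend linearly from the initial latent code z0 to the current code z1."""
    alphas = np.linspace(0.0, 1.0, n_steps)
    return [(1.0 - a) * z0 + a * z1 for a in alphas]

# Example: 16 intermediate codes between two random 100-d latent vectors.
frames = interpolate_latent(np.random.randn(100), np.random.randn(100))
```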


#### User interaction
* Coloring Brush: right click to select the color; hold left click to paint; scroll the mouse wheel to adjust the brush width.
* Sketching Brush: hold left click to sketch the shape.
* Warping Brush: we recommend using the coloring and sketching brushes before warping. Right click to select a square region; hold left click to drag the region; scroll the mouse wheel to adjust the size of the square region.
* Shortcuts: P for `Play`, F for `Fix`, R for `Restart`, S for `Save`, E for `Edits`, and Q for quitting the program.

## Command line arguments:
Type `python iGAN_main.py --help` for a complete list of the arguments. Here we discuss some critical arguments:
* `--model_name`: the name of the model (e.g. outdoor_64, shoes_64, etc.)
* `--model_type`: currently only supports dcgan_theano.
* `--model_file`: the file that stores the generative model; if not specified, it defaults to `model_file='./models/%s.%s' % (model_name, model_type)` (see the short example after this list).
* `--top_k`: the number of the candidate results being displayed
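
For example, with the default naming above, the model path resolves as follows (a plain illustration of the format string, not code taken from `iGAN_main.py`):
```python
model_name = 'outdoor_64'
model_type = 'dcgan_theano'
model_file = './models/%s.%s' % (model_name, model_type)
print(model_file)  # ./models/outdoor_64.dcgan_theano
```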


## Plugging in your own data and models
* DCGAN_theano model on new datasets: we will provide a model training script soon (by Sep 25, 2016). The script will train a model (e.g. cat.dcgan_theano) given a new photo collection (e.g. cat_photos/).
* New deep generative models based on Theano (e.g. VAE: variational autoencoder): the current design of our package follows: UI Python class (e.g. `ui_draw.py`) => constrained optimization Python class (`constrained_opt_theano.py`) => deep generative model Python class (e.g. `dcgan_theano.py`). To incorporate your own generative model, create a new Python class (e.g. `vae_theano.py`) under the `model_def` folder with the same interface as `dcgan_theano.py`, and specify `--model_type vae_theano` on the command line; a rough skeleton is sketched after this list.
* Generative models based on another deep learning framework (e.g. TensorFlow): we are working on a TensorFlow-based optimization class (e.g. `constrained_opt_tensorflow.py`). Once that code is released, you will be able to create your own TensorFlow model class (e.g. `dcgan_tensorflow.py`).
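
As a rough illustration only, a new Theano model class might be laid out as below. The class and method names are hypothetical assumptions; the real required interface is whatever `model_def/dcgan_theano.py` exposes to `constrained_opt_theano.py`, so consult that file for the exact signatures.
```python
# model_def/vae_theano.py -- hypothetical skeleton, not part of this repository.
import numpy as np


class VAE_theano(object):
    def __init__(self, model_name, model_file):
        self.model_name = model_name
        self.model_file = model_file
        # Load the pre-trained decoder weights here (e.g. unpickle model_file).

    def gen_samples(self, z):
        """Map a batch of latent codes z to generated images (assumed method)."""
        raise NotImplementedError('plug in your VAE decoder here')
```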

## TODO
* Support Python 3.
* Provide our DCGAN model training scripts.
* Support TensorFlow GAN models.
* Support other deep generative models (e.g. variational autoencoders).
* Support sketch models for sketching guidance, inspired by [ShadowDraw](http://vision.cs.utexas.edu/projects/shadowdraw/shadowdraw.html).
* Support the average image mode for visual data exploration, inspired by [AverageExplorer](https://people.eecs.berkeley.edu/~junyanz/projects/averageExplorer/).


## Citation
```
@inproceedings{zhu2016generative,
title={Generative Visual Manipulation on the Natural Image Manifold},
author={Zhu, Jun-Yan and Kr{\"a}henb{\"u}hl, Philipp and Shechtman, Eli and Efros, Alexei A.},
booktitle={Proceedings of European Conference on Computer Vision (ECCV)},
year={2016}
}
```

## Cat Paper Collection
If you love cats and love reading cool vision and graphics papers, please check out the Cat Paper Collection:
[[Github]](https://github.com/junyanz/CatPapers) [[Webpage]](http://people.eecs.berkeley.edu/~junyanz/cat/cat_papers.html)
## Acknowledgement
* We modified the DCGAN [code](https://github.com/Newmu/dcgan_code) in our package. We thank the authors for sharing the code. Please cite the original [DCGAN](https://arxiv.org/abs/1511.06434) paper if you use their models.
* This work was supported, in part, by funding from Adobe, eBay, and Intel, as well as a hardware grant from NVIDIA. J.-Y. Zhu is supported by a Facebook Graduate Fellowship.
