pokemon-gan

Generating Pokemon with a Generative Adversarial Network

Overview

This project is inspired by Siraj Raval and his awesome video.
The code builds on implementations by Newmu, kvpratama, and moxiegushi, along with my own implementation.

Dataset

Original images:

Preprocessed images:

or download the preprocessed images from Google Drive.

Training with Floydhub

The training process is very long, so please do not train on a tiny laptop! I used FloydHub to train my model; however, the free trial only offers 2 hours on the GPU option. If you find this project interesting, pick your favorite cloud computing platform to train it.

If you want, you can monitor the training runs I uploaded to FloydHub: here with CPU and here with GPU, which is much faster.

floyd run --cpu --env tensorflow-1.3 --message run --data zayne/datasets/preprocessedpokemon/1:/preprocessed_data 'python main.py'
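If your account includes GPU hours, the same job can presumably be launched on a GPU instance by swapping the CPU flag. The variant below is an assumption based on the FloydHub CLI, not a command taken from this repository:

floyd run --gpu --env tensorflow-1.3 --message run --data zayne/datasets/preprocessedpokemon/1:/preprocessed_data 'python main.py'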

Training metrics

This is the load of my training with GPU:

This is the load of my training with CPU:

Training with Blue Waters

I got access to a Blue Waters account this semester, so I can use its GPUs to train this project again!

See Results here!

Dependencies

git clone https://github.com/Zhenye-Na/pokemon-gan.git
cd pokemon-gan  
pip install -r requirements.txt

scikit-image
tensorflow
scipy
numpy
Pillow
PyTorch

Usage

git clone https://github.com/Zhenye-Na/pokemon-gan.git
cd pokemon-gan
python3 main.py

Generative Adversarial Network (GAN)

A GAN consists of two networks:

  • A discriminator D receives input from both the training data and the generated data. Its job is to learn how to distinguish between these two inputs.
  • A generator G generates samples from random noise Z. Its objective is to produce samples so realistic that the discriminator cannot tell them apart from real data. A minimal sketch of both objectives follows this list.
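As an illustration, here is a minimal TensorFlow 1.x sketch of the two objectives for a standard GAN, assuming the discriminator returns raw logits; the names d_logits_real and d_logits_fake are illustrative and not taken from main.py:

import tensorflow as tf

def gan_losses(d_logits_real, d_logits_fake):
    # Discriminator: push real logits toward 1 and fake logits toward 0
    d_loss_real = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
        labels=tf.ones_like(d_logits_real), logits=d_logits_real))
    d_loss_fake = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
        labels=tf.zeros_like(d_logits_fake), logits=d_logits_fake))
    d_loss = d_loss_real + d_loss_fake

    # Generator: fool the discriminator into labeling fake samples as real
    g_loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(
        labels=tf.ones_like(d_logits_fake), logits=d_logits_fake))

    # The Wasserstein variant instead uses
    #   d_loss = tf.reduce_mean(d_logits_fake) - tf.reduce_mean(d_logits_real)
    #   g_loss = -tf.reduce_mean(d_logits_fake)
    # together with weight clipping (or a gradient penalty) on the critic.
    return d_loss, g_loss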

Deep Convolution GAN (DCGAN)

In the DCGAN architecture, the discriminator D is a convolutional neural network (CNN) that applies many filters to extract various features from an image. The discriminator network is trained to discriminate between original and generated images. The process of convolution is shown in the illustration below:

The network structure for the discriminator is given by:

Layer        | Shape                    | Activation
input        | batch size, 3, 64, 64    |
convolution  | batch size, 64, 32, 32   | LReLU
convolution  | batch size, 128, 16, 16  | LReLU
convolution  | batch size, 256, 8, 8    | LReLU
convolution  | batch size, 512, 4, 4    | LReLU
dense        | batch size, 1            | Sigmoid
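For reference, a minimal TensorFlow 1.x sketch of a discriminator with this shape progression is shown below. The table lists shapes channels-first, while the sketch uses channels-last (NHWC); the kernel size is an assumption and batch normalization is omitted, so this is an illustration rather than the exact network in main.py:

import tensorflow as tf

def lrelu(x, slope=0.2):
    # Leaky ReLU with the slope from the hyperparameter table
    return tf.maximum(slope * x, x)

def discriminator(images, reuse=False):
    # images: (batch, 64, 64, 3)
    with tf.variable_scope('discriminator', reuse=reuse):
        init = tf.truncated_normal_initializer(stddev=0.02)
        h = images
        for filters in (64, 128, 256, 512):
            # Each stride-2 convolution halves height and width: 64 -> 32 -> 16 -> 8 -> 4
            h = lrelu(tf.layers.conv2d(h, filters, kernel_size=5, strides=2,
                                       padding='same', kernel_initializer=init))
        h = tf.reshape(h, [-1, 4 * 4 * 512])                 # flatten
        logits = tf.layers.dense(h, 1, kernel_initializer=init)
        return logits                                        # sigmoid is applied in the loss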

The generator G is trained to generate images from random input in order to fool the discriminator. In the DCGAN architecture, the generator is a convolutional network that upsamples its input: the goal is to take a small input and produce an output that is larger than the input. It works by expanding the input with zeros in between and then performing the convolution over this expanded area, which yields a larger input for the next layer. The process of upsampling is shown below:

There are many names for this upsampling operation: full convolution, in-network upsampling, fractionally-strided convolution, deconvolution, or transposed convolution.
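For example, a stride-2 transposed convolution with 'same' padding doubles the spatial size of its input. A small hedged illustration (the kernel size and padding are arbitrary choices, not values from this repository):

import tensorflow as tf

# A 4x4 feature map with 512 channels, channels-last layout
x = tf.placeholder(tf.float32, [None, 4, 4, 512])

# Stride 2 with 'same' padding doubles height and width: 4x4 -> 8x8
y = tf.layers.conv2d_transpose(x, filters=256, kernel_size=5,
                               strides=2, padding='same')
print(y.shape)  # (?, 8, 8, 256)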

The network structure for the generator is given by:

Layer          | Shape                                                | Activation
input          | batch size, 100 (noise from a uniform distribution)  |
reshape layer  | batch size, 100, 1, 1                                | ReLU
deconvolution  | batch size, 512, 4, 4                                | ReLU
deconvolution  | batch size, 256, 8, 8                                | ReLU
deconvolution  | batch size, 128, 16, 16                              | ReLU
deconvolution  | batch size, 64, 32, 32                               | ReLU
deconvolution  | batch size, 3, 64, 64                                | Tanh
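A matching TensorFlow 1.x sketch of the generator is shown below. As with the discriminator sketch, it uses channels-last layout, an assumed kernel size, and no batch normalization, so it illustrates the table rather than the exact code in main.py:

import tensorflow as tf

def generator(z):
    # z: (batch, 100) noise drawn from a uniform distribution
    with tf.variable_scope('generator'):
        init = tf.truncated_normal_initializer(stddev=0.02)
        h = tf.reshape(z, [-1, 1, 1, 100])                    # reshape layer
        # 1x1 -> 4x4: kernel 4 with 'valid' padding
        h = tf.nn.relu(tf.layers.conv2d_transpose(
            h, 512, kernel_size=4, strides=1, padding='valid',
            kernel_initializer=init))
        # Each stride-2 deconvolution doubles height and width: 4 -> 8 -> 16 -> 32
        for filters in (256, 128, 64):
            h = tf.nn.relu(tf.layers.conv2d_transpose(
                h, filters, kernel_size=5, strides=2, padding='same',
                kernel_initializer=init))
        # Final layer: 32 -> 64 with 3 output channels; tanh keeps pixels in [-1, 1]
        h = tf.layers.conv2d_transpose(h, 3, kernel_size=5, strides=2,
                                       padding='same', kernel_initializer=init)
        return tf.nn.tanh(h)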

Hyperparameter of DCGAN

The hyperparameters for the DCGAN architecture are given in the table below:

Hyperparameter
Mini-batch size of 64
Weights initialized from a normal distribution with std = 0.02
LReLU slope = 0.2
Adam optimizer with learning rate = 0.0002 and momentum = 0.5
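In TensorFlow 1.x these settings translate roughly to the snippet below; the momentum value corresponds to Adam's beta1 parameter, and the variable names are illustrative rather than taken from main.py:

import tensorflow as tf

batch_size = 64
weight_init = tf.truncated_normal_initializer(stddev=0.02)   # std = 0.02
lrelu_slope = 0.2

# One optimizer per network; momentum = 0.5 maps to Adam's beta1
d_optimizer = tf.train.AdamOptimizer(learning_rate=0.0002, beta1=0.5)
g_optimizer = tf.train.AdamOptimizer(learning_rate=0.0002, beta1=0.5)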

Pokemon Image Dataset

The dataset of Pokemon images is gathered from various sources:

Training output

Since my account only included the two-hour GPU option, I only got two images of new Pokemon. If you have access to a high-performance machine, you can train longer to get better results and more new Pokemon!

  • 800 Epochs


  • 1550 Epochs


References
