# <div style="text-align: right">Generative Adversarial Networks to create new Simpsons character face</div>

<div style="text-align: right">by Neel Indap(indap.n@husky.neu.edu)</div>

## Introduction
***

### What is a Generative Adversarial Network (GAN)?
A Generative Adversarial Network (referred to as GAN) is a network that generates new data with the same internal structure as the training data. GANs are generative models and are usually classed as unsupervised learning, because the training data needs no labels; the only labels involved are the real/fake tags the network assigns itself.
It consists of 2 neural network models: the generator (which takes random noise and generates samples), and the discriminator (which takes those samples along with real ones, and tries to determine whether each is fake or real). At each step, we try to minimize the loss for both models, until the generator produces samples that the discriminator can barely distinguish from the real images.
The network itself can be thought of as a game between the 2 models, both competing to win.
To gain more insight into this, refer to the following paper published by Ian Goodfellow and his colleagues, which explains the motivation behind the approach.<br>
[GAN paper](https://arxiv.org/abs/1406.2661)
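
For reference, the game described above is formalized in that paper as a minimax objective, where $D$ is the discriminator, $G$ the generator, $x$ a real sample and $z$ the input noise:

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]
$$

The discriminator pushes $V$ up by scoring real samples near 1 and generated ones near 0, while the generator pushes it down by making $D(G(z))$ large.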

If you are new to Neural Networks, check out this video. It is a good starting point for understanding what they are and how they work.<br>
[What is a Neural Network?](https://www.youtube.com/watch?v=aircAruvnKk)

## GAN Architecture
***
![GAN Architecture](./images/GAN_architecture.png)

### Why GAN?
GANs have been called the most interesting idea in the last ten years by Yann LeCun, the director of AI at Facebook. This made me want to understand how the algorithm works.<br>
Since its inception, various improvements have been published. Most of these concern image generation.

In this paper, I train the model on a custom image set of only 100 images. The original model used the CelebA dataset, which consists of roughly 200k images.<br>
The goal is to get a stable working model from a small dataset, and to note the impact of changing the hyperparameters and of modifying the neural network itself.


## Improvements on GAN - DCGAN
***

Shortly after the initial concept was proposed, a paper titled [Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks](https://arxiv.org/abs/1511.06434) was published.

Among other things, this paper discusses the use of batch normalization in the convolutional layers to improve the performance of the network.

## Code setup
***

The code is hosted on [GitHub](https://github.com/neelindap/DCGAN-tensorflow)

Clone the repository using ``` git clone https://github.com/neelindap/DCGAN-tensorflow```

After cloning the repository, please install the following dependency:
``` pip install Pillow ```

**_NOTE_**:<br>

It is assumed the system already has a TensorFlow environment set up. If not, refer to this [tutorial](https://www.tensorflow.org/install/).

## Dataset
***

I have used the following set of images I found as the training data: [Simpsons dataset](https://github.com/jbencina/simpsons-image-training-dataset)

The original paper refers to the CelebA dataset, which contains approximately 200k images, cropped and aligned.
The Simpsons dataset contains about 990 images in one set and 2500 images in the other, totaling approximately 3400 images.
The image dataset is only a fraction of that used for the original model. Also, the images themselves aren’t purely faces. The majority of the images contain a body along with the face, more than one face, or a noisy background, which makes isolating the faces difficult.

I tried using a ***Haar Cascade*** classifier to isolate the faces from the images. Haar Cascades have previously shown good results in detecting human faces.
The challenge here was that Simpsons characters' faces don’t share the characteristics of human faces, so the cascades trained to detect human faces didn’t work well on the Simpsons dataset.
On running the classifier on the dataset, only 6 of the 3400 images had a face detected and cropped.
Even with different Haar Cascade filters (frontalface_alt, frontalface_alt2) the results weren’t any better.
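
For reference, a Haar Cascade face crop with OpenCV typically looks like the sketch below. It is not the script used for this project: it assumes the `opencv-python` package is installed, the function names `largest_box` and `crop_face` are made up for illustration, and the `scaleFactor` and `minNeighbors` values are common starting points rather than tuned settings.

```python
def largest_box(boxes):
    """Return the (x, y, w, h) box with the largest area, or None if empty."""
    boxes = [tuple(int(v) for v in box) for box in boxes]
    if not boxes:
        return None
    return max(boxes, key=lambda b: b[2] * b[3])


def crop_face(image_path, cascade_name="haarcascade_frontalface_alt.xml"):
    """Detect faces in one image and return the largest crop, or None."""
    import cv2  # imported lazily so the helper above works without OpenCV

    # cv2.data.haarcascades is the folder of cascade files shipped with opencv-python
    cascade = cv2.CascadeClassifier(cv2.data.haarcascades + cascade_name)
    image = cv2.imread(image_path)
    if image is None:
        return None  # unreadable file
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Illustrative detection settings, not values taken from this project
    boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    box = largest_box(boxes)
    if box is None:
        return None  # no face detected
    x, y, w, h = box
    return image[y:y + h, x:x + w]
```

Passing `"haarcascade_frontalface_alt2.xml"` as `cascade_name` switches to the second filter mentioned above.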

Owing to hardware limitations, I picked the 100 images with the best quality and orientation and began training the model.


## Image Pre-processing
***

After selecting the images, I wrote a Python script to resize the images to 64x64.<br>
As these are colored images, the resulting vector was of the size 64x64x3.<br><br>
The original size of the images combined with 3 channels would have resulted in a vector of very large size and would’ve resulted in longer computation times.
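
As a quick check of the sizes involved, the sketch below counts the values per image. The 512x512 source size is a made-up example, since the original images vary in size.

```python
def values_per_image(width, height, channels=3):
    """Number of scalar values needed to represent one image."""
    return width * height * channels

resized = values_per_image(64, 64)      # 64 * 64 * 3
original = values_per_image(512, 512)   # hypothetical source size

print(resized)              # 12288
print(original // resized)  # 64
```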

#### Code for resizing images
***

```python
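import os
from PIL import Image

src_dir, new_path = "./Data/raw", "./Data/Simpsons_64"  # placeholder paths
basewidth, hsize = 64, 64  # target width and height
i = 0
for file in os.listdir(src_dir):
    filename = os.fsdecode(os.path.join(src_dir, file))

    img = Image.open(filename).convert("RGB")  # JPEG cannot store alpha/palette modes
    # LANCZOS is the current name of the filter formerly exposed as ANTIALIAS
    img = img.resize((basewidth, hsize), Image.LANCZOS)
    img.save(os.path.join(new_path, str(i) + ".jpg"))
    i += 1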
```

The above code scans through the source folder, opening one image at a time, resizing it, and saving the new image at the destination path.<br>
I used the ```Pillow``` library to do this.

## GAN model - Tensorboard Visualization
***

![Tensorflow Visualization](./images/GAN.png)

## Code Snippets

**NOTE :**<br>

The major functionality of the code is written in the file ```model.py``` in the source code you've downloaded. <br> 
You can find it on [GitHub](https://github.com/neelindap/DCGAN-tensorflow/blob/master/model.py).

I've explained the important snippets from the model below.

### Generator Model
***

![Generator](./images/Generator.png)

### Discriminator Model
***

![Discriminator](./images/Discriminator.png)

### Generator and Discriminator models
***

The Generator and Discriminator models are formed as follows when the code runs :

---------<br>
Variables: name (type shape) [size]<br>
---------<br>
generator/g_h0_lin/Matrix:0 (float32_ref 100x16384) [1638400, bytes: 6553600]<br>
generator/g_h0_lin/bias:0 (float32_ref 16384) [16384, bytes: 65536]<br>
generator/g_bn0/beta:0 (float32_ref 1024) [1024, bytes: 4096]<br>
generator/g_bn0/gamma:0 (float32_ref 1024) [1024, bytes: 4096]<br>
generator/g_h1/w:0 (float32_ref 5x5x512x1024) [13107200, bytes: 52428800]<br>
generator/g_h1/biases:0 (float32_ref 512) [512, bytes: 2048]<br>
generator/g_bn1/beta:0 (float32_ref 512) [512, bytes: 2048]<br>
generator/g_bn1/gamma:0 (float32_ref 512) [512, bytes: 2048]<br>
generator/g_h2/w:0 (float32_ref 5x5x256x512) [3276800, bytes: 13107200]<br>
generator/g_h2/biases:0 (float32_ref 256) [256, bytes: 1024]<br>
generator/g_bn2/beta:0 (float32_ref 256) [256, bytes: 1024]<br>
generator/g_bn2/gamma:0 (float32_ref 256) [256, bytes: 1024]<br>
generator/g_h3/w:0 (float32_ref 5x5x128x256) [819200, bytes: 3276800]<br>
generator/g_h3/biases:0 (float32_ref 128) [128, bytes: 512]<br>
generator/g_bn3/beta:0 (float32_ref 128) [128, bytes: 512]<br>
generator/g_bn3/gamma:0 (float32_ref 128) [128, bytes: 512]<br>
generator/g_h4/w:0 (float32_ref 5x5x64x128) [204800, bytes: 819200]<br>
generator/g_h4/biases:0 (float32_ref 64) [64, bytes: 256]<br>
generator/g_bn4/beta:0 (float32_ref 64) [64, bytes: 256]<br>
generator/g_bn4/gamma:0 (float32_ref 64) [64, bytes: 256]<br>
generator/g_h5/w:0 (float32_ref 5x5x3x64) [4800, bytes: 19200]<br>
generator/g_h5/biases:0 (float32_ref 3) [3, bytes: 12]<br>
discriminator/d_h0_conv/w:0 (float32_ref 5x5x3x64) [4800, bytes: 19200]<br>
discriminator/d_h0_conv/biases:0 (float32_ref 64) [64, bytes: 256]<br>
discriminator/d_h1_conv/w:0 (float32_ref 5x5x64x128) [204800, bytes: 819200]<br>
discriminator/d_h1_conv/biases:0 (float32_ref 128) [128, bytes: 512]<br>
discriminator/d_bn1/beta:0 (float32_ref 128) [128, bytes: 512]<br>
discriminator/d_bn1/gamma:0 (float32_ref 128) [128, bytes: 512]<br>
discriminator/d_h2_conv/w:0 (float32_ref 5x5x128x256) [819200, bytes: 3276800]<br>
discriminator/d_h2_conv/biases:0 (float32_ref 256) [256, bytes: 1024]<br>
discriminator/d_bn2/beta:0 (float32_ref 256) [256, bytes: 1024]<br>
discriminator/d_bn2/gamma:0 (float32_ref 256) [256, bytes: 1024]<br>
discriminator/d_h3_conv/w:0 (float32_ref 5x5x256x512) [3276800, bytes: 13107200]<br>
discriminator/d_h3_conv/biases:0 (float32_ref 512) [512, bytes: 2048]<br>
discriminator/d_bn3/beta:0 (float32_ref 512) [512, bytes: 2048]<br>
discriminator/d_bn3/gamma:0 (float32_ref 512) [512, bytes: 2048]<br>
discriminator/d_h4_conv/w:0 (float32_ref 5x5x512x1024) [13107200, bytes: 52428800]<br>
discriminator/d_h4_conv/biases:0 (float32_ref 1024) [1024, bytes: 4096]<br>
discriminator/d_bn4/beta:0 (float32_ref 1024) [1024, bytes: 4096]<br>
discriminator/d_bn4/gamma:0 (float32_ref 1024) [1024, bytes: 4096]<br>
discriminator/d_h5_lin/Matrix:0 (float32_ref 16384x1) [16384, bytes: 65536]<br>
discriminator/d_h5_lin/bias:0 (float32_ref 1) [1, bytes: 4]<br>
Total size of variables: 36507524<br>
Total bytes of variables: 146030096
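
The sizes in the listing above follow directly from the shapes: the size is the product of the dimensions, and the byte count is 4 times that because every variable is `float32`. The small sketch below reproduces a few entries; the helper names `num_values` and `num_bytes` are made up for illustration.

```python
from math import prod

def num_values(shape):
    """Number of scalars in a tensor of the given shape."""
    return prod(shape)

def num_bytes(shape, bytes_per_value=4):
    """Storage for a float32 tensor: 4 bytes per value."""
    return num_values(shape) * bytes_per_value

# A few shapes copied from the listing above
g_h0_lin = (100, 16384)      # noise vector of length 100 -> 16384 units
g_h1_w = (5, 5, 512, 1024)   # 5x5 kernel between the 1024- and 512-channel layers
d_h5_lin = (16384, 1)        # flattened features -> single real/fake logit

print(num_values(g_h0_lin), num_bytes(g_h0_lin))  # 1638400 6553600
print(num_values(g_h1_w), num_bytes(g_h1_w))      # 13107200 52428800
print(36507524 * 4)                               # 146030096, the total bytes
```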

### Loss Functions
***

Next, we define the loss functions for the 2 models as follows:

```python
d_loss_real = tf.reduce_mean(sigmoid_cross_entropy_with_logits(D_logits, tf.ones_like(D)))
d_loss_fake = tf.reduce_mean(sigmoid_cross_entropy_with_logits(D_logits_, tf.zeros_like(D_)))
g_loss = tf.reduce_mean(sigmoid_cross_entropy_with_logits(D_logits_, tf.ones_like(D_)))
```

Where, <br>
d_loss_real is the loss for the real images passing through the discriminator<br>
d_loss_fake is the loss for the fake images passing through the discriminator<br>
g_loss is the loss for the images generated by the generator<br>

The total loss of the discriminator (d_loss) is the sum of d_loss_real and d_loss_fake
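
To make the three losses concrete, here is a minimal pure-Python sketch of the same computation. The logits are made-up numbers, and the function below reimplements the numerically stable sigmoid cross-entropy formula for illustration rather than calling TensorFlow.

```python
import math

def sigmoid_cross_entropy_with_logits(logit, label):
    """Numerically stable cross-entropy between sigmoid(logit) and label."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

def mean(values):
    return sum(values) / len(values)

# Made-up discriminator logits for a batch of 3 real and 3 generated images
real_logits = [2.0, 1.5, 3.0]     # D is fairly sure these are real
fake_logits = [-2.5, -1.0, -3.0]  # D is fairly sure these are fake

# Real images are scored against label 1, fake images against label 0
d_loss_real = mean([sigmoid_cross_entropy_with_logits(x, 1.0) for x in real_logits])
d_loss_fake = mean([sigmoid_cross_entropy_with_logits(x, 0.0) for x in fake_logits])
# The generator wants the same fake logits to be scored as real (label 1)
g_loss = mean([sigmoid_cross_entropy_with_logits(x, 1.0) for x in fake_logits])
d_loss = d_loss_real + d_loss_fake

print(f"d_loss={d_loss:.4f} g_loss={g_loss:.4f}")
```

Note that `d_loss_fake` and `g_loss` use the same logits with opposite labels, which is what makes the two models adversaries.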

### Optimizer
***

We use Adam Optimizer to optimize the generator and discriminator models. They are defined as follows:<br>
```python
d_optim = tf.train.AdamOptimizer(config.learning_rate, beta1=config.beta1).minimize(self.d_loss, var_list=self.d_vars)
g_optim = tf.train.AdamOptimizer(config.learning_rate, beta1=config.beta1).minimize(self.g_loss, var_list=self.g_vars)
```
<br>
The learning rate is 0.0001 and beta1 (the exponential decay rate of the first-moment estimate) is 0.5.
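
To show what these two settings control, below is a minimal sketch of a single Adam update for one scalar parameter, following the formulation in the Adam paper. It is an illustration only: `adam_step` is a made-up helper, `beta2=0.999` and `eps=1e-8` are the usual defaults rather than values stated in this document, and TensorFlow's own implementation applies epsilon slightly differently.

```python
import math

def adam_step(param, grad, m, v, t, lr=0.0001, beta1=0.5, beta2=0.999, eps=1e-8):
    """One Adam update for a single scalar parameter.

    m and v are the running first and second moment estimates;
    t is the 1-based step count.
    """
    m = beta1 * m + (1 - beta1) * grad          # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad * grad   # second moment (mean of squares)
    m_hat = m / (1 - beta1 ** t)                # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v

param, m, v = 1.0, 0.0, 0.0
param, m, v = adam_step(param, grad=2.0, m=m, v=v, t=1)
print(param)  # about 0.9999: the first step has size close to lr
```

A lower `beta1` makes the first-moment average forget old gradients faster, so the update reacts more quickly to the latest gradient.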

### Hyper-parameters
***

1.	Activation Function: tanh for Generator and Sigmoid for Discriminator
2.	Cost Function: Sigmoid with Cross Entropy
3.	Gradient Descent: Adam Optimizer with learning rate: 0.0001 & beta1(decay rate of 1st moment estimation): 0.5
4.	Network Architecture: 5-layer Neural Network
5.	Network Initializer: random normal initializer
6.	Batch Size: 25
7.	Total Images: 100
8.	Epochs: 10000
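
The batch counter in the training log (`0/4` to `3/4`) follows from items 6 and 7 above. A small sketch of that arithmetic, using only the values from the list:

```python
total_images = 100
batch_size = 25
epochs = 10000

batches_per_epoch = total_images // batch_size
total_updates = epochs * batches_per_epoch

print(batches_per_epoch)  # 4
print(total_updates)      # 40000
```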

## Running the Code
***

To run the code, on your terminal navigate to the installed path and run
``` python main.py --train --crop ```

This will automatically pick up the training images present in the folder ```./Data/Simpsons_64```.
To use a different dataset, place the images in the ```./Data``` folder and change the value of the "dataset" flag in the ```main.py``` file.

### Output
***

Epoch: [ 0] [   0/   4] time: 9.3660, d_loss: 0.00804730, g_loss: 7.05201912<br>
Epoch: [ 0] [   1/   4] time: 10.9825, d_loss: 0.09658723, g_loss: 8.51840591<br>
Epoch: [ 0] [   2/   4] time: 12.5638, d_loss: 0.01051401, g_loss: 6.69371843<br>
Epoch: [ 0] [   3/   4] time: 14.2115, d_loss: 3.46202755, g_loss: 3.73002696<br>
Epoch: [ 1] [   0/   4] time: 15.7351, d_loss: 0.07472128, g_loss: 4.96042156<br>
Epoch: [ 1] [   1/   4] time: 17.2322, d_loss: 0.00944371, g_loss: 5.74052477<br>
Epoch: [ 1] [   2/   4] time: 18.7593, d_loss: 0.00912661, g_loss: 5.82239866<br>
Epoch: [ 1] [   3/   4] time: 20.2981, d_loss: 1.34175253, g_loss: 5.06495857<br>
Epoch: [ 2] [   0/   4] time: 21.8085, d_loss: 0.01723327, g_loss: 4.79014826<br>
Epoch: [ 2] [   1/   4] time: 23.3216, d_loss: 0.01660232, g_loss: 4.81151104<br>
Epoch: [ 2] [   2/   4] time: 24.8221, d_loss: 0.00530944, g_loss: 6.86225700<br>
Epoch: [ 2] [   3/   4] time: 26.3493, d_loss: 0.04552327, g_loss: 4.08214474<br>
Epoch: [ 3] [   0/   4] time: 27.8628, d_loss: 0.06854818, g_loss: 3.71761250<br>
Epoch: [ 3] [   1/   4] time: 29.3643, d_loss: 0.02385384, g_loss: 4.90426350<br>
Epoch: [ 3] [   2/   4] time: 30.8568, d_loss: 0.03424166, g_loss: 4.69303417<br>
Epoch: [ 3] [   3/   4] time: 32.3743, d_loss: 0.01002755, g_loss: 5.53210688<br>
Epoch: [ 4] [   0/   4] time: 33.8728, d_loss: 0.01985748, g_loss: 5.55863380<br>
Epoch: [ 4] [   1/   4] time: 35.3774, d_loss: 0.01248339, g_loss: 5.53703785<br>
Epoch: [ 4] [   2/   4] time: 36.9025, d_loss: 0.02969375, g_loss: 4.45696592<br>
Epoch: [ 4] [   3/   4] time: 38.4351, d_loss: 0.21307696, g_loss: 2.91995859<br>
Epoch: [ 5] [   0/   4] time: 39.9196, d_loss: 0.16518828, g_loss: 2.64963651<br>
Epoch: [ 5] [   1/   4] time: 41.4348, d_loss: 0.01804122, g_loss: 6.06599474<br>
Epoch: [ 5] [   2/   4] time: 42.9457, d_loss: 0.01435683, g_loss: 6.68838930<br>
Epoch: [ 5] [   3/   4] time: 44.4653, d_loss: 0.00883127, g_loss: 7.26980829<br>
Epoch: [ 6] [   0/   4] time: 45.9689, d_loss: 0.01340097, g_loss: 5.95355368<br>
Epoch: [ 6] [   1/   4] time: 47.4735, d_loss: 0.04442319, g_loss: 3.88450599<br>
Epoch: [ 6] [   2/   4] time: 48.9736, d_loss: 0.04200076, g_loss: 3.81001735<br>
Epoch: [ 6] [   3/   4] time: 50.4751, d_loss: 0.02555417, g_loss: 4.34248734<br>
Epoch: [ 7] [   0/   4] time: 51.9833, d_loss: 0.09478149, g_loss: 3.46468067<br>
Epoch: [ 7] [   1/   4] time: 53.4773, d_loss: 0.03278716, g_loss: 4.23916864<br>
Epoch: [ 7] [   2/   4] time: 54.9713, d_loss: 0.04215960, g_loss: 4.22080708<br>
Epoch: [ 7] [   3/   4] time: 56.4849, d_loss: 0.02984809, g_loss: 4.95883036<br>
Epoch: [ 8] [   0/   4] time: 57.9754, d_loss: 0.02219681, g_loss: 5.00462198<br>
Epoch: [ 8] [   1/   4] time: 59.4845, d_loss: 0.02506564, g_loss: 4.62666941<br>
Epoch: [ 8] [   2/   4] time: 60.9863, d_loss: 0.05980067, g_loss: 4.34014511<br>
Epoch: [ 8] [   3/   4] time: 62.5002, d_loss: 0.02720683, g_loss: 5.18341923<br>
Epoch: [ 9] [   0/   4] time: 64.0038, d_loss: 0.01533393, g_loss: 5.67472982<br>
Epoch: [ 9] [   1/   4] time: 65.5188, d_loss: 0.01708235, g_loss: 5.04262686<br>
Epoch: [ 9] [   2/   4] time: 67.0179, d_loss: 0.01435666, g_loss: 5.96386719<br>
Epoch: [ 9] [   3/   4] time: 68.5394, d_loss: 0.02204786, g_loss: 4.83817005<br>
Epoch: [10] [   0/   4] time: 70.0413, d_loss: 0.01232358, g_loss: 5.61192083<br>
Epoch: [10] [   1/   4] time: 71.5432, d_loss: 0.00669380, g_loss: 6.16942406<br>
Epoch: [10] [   2/   4] time: 73.0393, d_loss: 0.00966259, g_loss: 6.14368629<br>
Epoch: [10] [   3/   4] time: 74.5814, d_loss: 0.03318290, g_loss: 4.32485056<br>
Epoch: [11] [   0/   4] time: 76.0770, d_loss: 0.01441771, g_loss: 5.40539503<br>
Epoch: [11] [   1/   4] time: 77.5875, d_loss: 0.00369535, g_loss: 7.03609514<br>
Epoch: [11] [   2/   4] time: 79.1080, d_loss: 0.00759749, g_loss: 6.58801556<br>
Epoch: [11] [   3/   4] time: 80.6081, d_loss: 0.01224802, g_loss: 5.92280626<br>
Epoch: [12] [   0/   4] time: 82.1091, d_loss: 0.01170493, g_loss: 5.17165852<br>
Epoch: [12] [   1/   4] time: 83.6086, d_loss: 0.01223969, g_loss: 5.80764532<br>
Epoch: [12] [   2/   4] time: 85.1212, d_loss: 0.00818410, g_loss: 6.62704229<br>
Epoch: [12] [   3/   4] time: 86.6609, d_loss: 0.02226664, g_loss: 5.19087696<br>
Epoch: [13] [   0/   4] time: 88.1745, d_loss: 0.01443872, g_loss: 5.19868040<br>
Epoch: [13] [   1/   4] time: 89.6758, d_loss: 0.00672846, g_loss: 5.74825907<br>
Epoch: [13] [   2/   4] time: 91.1935, d_loss: 0.01081088, g_loss: 5.57111931<br>
Epoch: [13] [   3/   4] time: 92.7240, d_loss: 0.03427375, g_loss: 4.19021749<br>
Epoch: [14] [   0/   4] time: 94.2232, d_loss: 0.02590284, g_loss: 4.46787453<br>
Epoch: [14] [   1/   4] time: 95.7166, d_loss: 0.01754840, g_loss: 4.91180038<br>
Epoch: [14] [   2/   4] time: 97.2247, d_loss: 0.01751318, g_loss: 5.16031599<br>
Epoch: [14] [   3/   4] time: 98.7553, d_loss: 0.01427595, g_loss: 5.58691168<br>
Epoch: [15] [   0/   4] time: 100.2583, d_loss: 0.01843244, g_loss: 4.40491199<br>
Epoch: [15] [   1/   4] time: 101.7646, d_loss: 0.01243536, g_loss: 4.77438831<br>
Epoch: [15] [   2/   4] time: 103.2936, d_loss: 0.10667857, g_loss: 3.15785646<br>
Epoch: [15] [   3/   4] time: 104.8106, d_loss: 0.02200819, g_loss: 5.98732615<br>
Epoch: [16] [   0/   4] time: 106.3201, d_loss: 0.00618810, g_loss: 6.95014572<br>
Epoch: [16] [   1/   4] time: 107.8276, d_loss: 0.00112494, g_loss: 8.30367088<br>
Epoch: [16] [   2/   4] time: 109.3518, d_loss: 0.00449583, g_loss: 8.07813263<br>
Epoch: [16] [   3/   4] time: 110.8543, d_loss: 0.03454046, g_loss: 4.24285698<br>
Epoch: [17] [   0/   4] time: 112.3485, d_loss: 0.01515299, g_loss: 5.38927269<br>
Epoch: [17] [   1/   4] time: 113.8427, d_loss: 0.00462707, g_loss: 7.06467533<br>
Epoch: [17] [   2/   4] time: 115.3346, d_loss: 0.00884411, g_loss: 7.17192888<br>
Epoch: [17] [   3/   4] time: 116.8628, d_loss: 0.01973029, g_loss: 6.27611876<br>
Epoch: [18] [   0/   4] time: 118.3682, d_loss: 0.00860556, g_loss: 6.71839857<br>
Epoch: [18] [   1/   4] time: 119.9160, d_loss: 0.00869883, g_loss: 6.15110826<br>
Epoch: [18] [   2/   4] time: 121.4120, d_loss: 0.00928370, g_loss: 6.09283829<br>
Epoch: [18] [   3/   4] time: 122.9368, d_loss: 0.00822398, g_loss: 6.66194439<br>
Epoch: [19] [   0/   4] time: 124.4292, d_loss: 0.00291626, g_loss: 7.10071659<br>
Epoch: [19] [   1/   4] time: 125.9427, d_loss: 0.00230291, g_loss: 6.81967115<br>
Epoch: [19] [   2/   4] time: 127.4466, d_loss: 0.02431140, g_loss: 4.78234243<br>
Epoch: [19] [   3/   4] time: 128.9633, d_loss: 0.03422339, g_loss: 4.20858574<br>
Epoch: [20] [   0/   4] time: 130.4566, d_loss: 0.00712587, g_loss: 6.09566545<br>
Epoch: [20] [   1/   4] time: 131.9585, d_loss: 0.00334194, g_loss: 7.37051392<br>
Epoch: [20] [   2/   4] time: 133.4581, d_loss: 0.00596511, g_loss: 7.39865494<br>
Epoch: [20] [   3/   4] time: 134.9832, d_loss: 2.47153139, g_loss: 2.25520945<br>
Epoch: [21] [   0/   4] time: 136.5024, d_loss: 0.11012243, g_loss: 14.14280987<br>
Epoch: [21] [   1/   4] time: 137.9874, d_loss: 0.22035009, g_loss: 17.17830276<br>
Epoch: [21] [   2/   4] time: 139.4850, d_loss: 0.05087453, g_loss: 10.32483673<br>
Epoch: [21] [   3/   4] time: 141.0066, d_loss: 0.56518769, g_loss: 2.98864126<br>
Epoch: [22] [   0/   4] time: 142.5077, d_loss: 0.02509777, g_loss: 11.04115486<br>
Epoch: [22] [   1/   4] time: 144.0092, d_loss: 0.02014254, g_loss: 11.68441391<br>
Epoch: [22] [   2/   4] time: 145.5148, d_loss: 0.85586715, g_loss: 8.93093586<br>
Epoch: [22] [   3/   4] time: 147.0386, d_loss: 0.69659758, g_loss: 1.79383457<br>
Epoch: [23] [   0/   4] time: 148.5474, d_loss: 0.05355394, g_loss: 4.31195784<br>
Epoch: [23] [   1/   4] time: 150.0470, d_loss: 0.05076646, g_loss: 4.59662247<br>
Epoch: [23] [   2/   4] time: 151.5471, d_loss: 0.08791910, g_loss: 3.80403996<br>
Epoch: [23] [   3/   4] time: 153.0592, d_loss: 0.08182263, g_loss: 2.92726135<br>
Epoch: [24] [   0/   4] time: 154.5483, d_loss: 0.10160693, g_loss: 2.63836312<br>
Epoch: [24] [   1/   4] time: 156.1186, d_loss: 0.04913354, g_loss: 3.74601221<br>
Epoch: [24] [   2/   4] time: 157.6297, d_loss: 0.02884583, g_loss: 4.49800587<br>
[Sample] d_loss: 0.00623855, g_loss: 5.96150637
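
To plot or summarize a log like this outside TensorBoard, the lines can be parsed with a regular expression. The sketch below is an assumption-laden helper (`parse_log` and `LOG_LINE` are made-up names) that matches the line format shown above.

```python
import re

LOG_LINE = re.compile(
    r"Epoch: \[\s*(\d+)\] \[\s*(\d+)/\s*(\d+)\] "
    r"time: ([\d.]+), d_loss: ([\d.]+), g_loss: ([\d.]+)"
)

def parse_log(text):
    """Return one dict per training step found in the log text."""
    rows = []
    for match in LOG_LINE.finditer(text):
        epoch, batch, _, seconds, d_loss, g_loss = match.groups()
        rows.append({
            "epoch": int(epoch),
            "batch": int(batch),
            "time": float(seconds),
            "d_loss": float(d_loss),
            "g_loss": float(g_loss),
        })
    return rows

# Two lines copied from the output above
sample = """
Epoch: [ 0] [   0/   4] time: 9.3660, d_loss: 0.00804730, g_loss: 7.05201912
Epoch: [ 0] [   3/   4] time: 14.2115, d_loss: 3.46202755, g_loss: 3.73002696
"""
rows = parse_log(sample)
print(len(rows), max(r["d_loss"] for r in rows))  # 2 3.46202755
```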

## Model training
***

The gist of the model training code is as follows:

```python
# Update D network
_, summary_str = self.sess.run([d_optim, self.d_sum],
    feed_dict={ self.inputs: batch_images, self.z: batch_z })
# Update G network
_, summary_str = self.sess.run([g_optim, self.g_sum],
    feed_dict={ self.z: batch_z })

# Run g_optim twice to make sure that d_loss does not go to zero (different from paper)
_, summary_str = self.sess.run([g_optim, self.g_sum],
    feed_dict={ self.z: batch_z })

# Evaluate the current losses for logging
errD_fake = self.d_loss_fake.eval({ self.z: batch_z })
errD_real = self.d_loss_real.eval({ self.inputs: batch_images })
errG = self.g_loss.eval({ self.z: batch_z })
```

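At every step, each `sess.run` call executes an optimizer update together with a summary op. Despite the names, `d_sum` and `g_sum` are not arithmetic sums of losses: they are merged TensorBoard summaries that record the losses and related values at that step.<br>
The discriminator's summary includes its loss on the real images, and the generator's summary includes the loss on the fake images along with the generator loss.<br>

They are defined as follows: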

```python
self.g_sum = tf.summary.merge([self.z_sum, self.d__sum,
      self.G_sum, self.d_loss_fake_sum, self.g_loss_sum])
self.d_sum = tf.summary.merge([self.z_sum, self.d_sum, 
      self.d_loss_real_sum, self.d_loss_sum])
```

The losses are captured in Tensorboard:

![Discriminator loss](./images/d_loss.PNG)
**Fig. 1 : Discriminator Loss**

![Generator loss](./images/g_loss.PNG)
**Fig. 2 : Generator Loss**

As you can see, the two losses tend to move in opposite directions, as you would expect from adversaries. <br>
If the discriminator has a low loss, it can tell the fake images from the real ones, which in turn means the generator is not yet producing convincing output, and vice versa.
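
A short worked example helps when reading the loss values. The sketch below assumes the discriminator outputs a single probability for every real image and another for every fake image, which is a simplification, and the helper names are made up.

```python
import math

def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    return -(label * math.log(prob) + (1 - label) * math.log(1 - prob))

def losses(p_real, p_fake):
    """d_loss and g_loss when D outputs p_real on real and p_fake on fake images."""
    d_loss = bce(p_real, 1) + bce(p_fake, 0)
    g_loss = bce(p_fake, 1)
    return d_loss, g_loss

# Discriminator cannot tell the two apart: it outputs 0.5 for everything
print(losses(0.5, 0.5))    # d_loss = 2*ln(2) ~ 1.386, g_loss = ln(2) ~ 0.693
# Discriminator is winning: confident and correct on both kinds of image
print(losses(0.99, 0.01))  # d_loss ~ 0.020, g_loss ~ 4.605
```

The second case resembles most lines of the training output above, where `d_loss` is close to zero and `g_loss` is around 4 to 6.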


### Training Output
***

![GAN](./images/GAN.gif)

## Test Output
***

![Test Output](./images/test_20180425010142.png)

## Conclusion
***

With the output generated by the test, it is evident that the model had started to distinguish between various Simpsons characters and tried to generate a new face based on the existing ones.
The model did lose its way around epoch 600, where it started generating noise instead of faces. It stabilized over the next 400 or so epochs, and was producing better outputs again by around epoch 1000.

With enough images and more training, I think the model would be stable enough to generate better output.


## Future Scope
***

The GAN model, while having many applications, still isn’t stable enough to generate definitive results.<br>
The model is susceptible to mode collapse, wherein once the generator finds outputs that fool the discriminator, it keeps producing similar results again and again. <br><br>
GAN models also suffer from convergence problems: the loss does not reliably track output quality, so we don’t know when to stop training. To overcome this, the WGAN paper proposed using the Wasserstein distance instead of the Jensen-Shannon divergence, which gives a loss that correlates better with image quality.

![WGAN loss functions](./images/WGAN.png)
**Fig. 3: Loss functions in WGAN**
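
As a rough sketch of how the WGAN losses differ from the sigmoid cross-entropy losses used in this project, the snippet below computes them from made-up critic scores. The function names are illustrative, and `c=0.01` is the clipping value used in the WGAN paper's algorithm.

```python
def mean(values):
    return sum(values) / len(values)

def critic_loss(real_scores, fake_scores):
    """The critic minimizes this, which widens the real-vs-fake score gap."""
    return mean(fake_scores) - mean(real_scores)

def generator_loss(fake_scores):
    """The generator tries to raise the critic's score on its samples."""
    return -mean(fake_scores)

def clip_weights(weights, c=0.01):
    """Weight clipping from the WGAN paper: keep every weight in [-c, c]."""
    return [max(-c, min(c, w)) for w in weights]

# Made-up critic scores (unbounded real numbers, not probabilities)
real_scores = [1.2, 0.8, 1.0]
fake_scores = [-0.5, 0.1, -0.2]

print(critic_loss(real_scores, fake_scores))  # negative of the estimated distance
print(generator_loss(fake_scores))
print(clip_weights([0.5, -0.003, -0.2]))      # [0.01, -0.003, -0.01]
```

Because the critic outputs raw scores instead of probabilities, there is no sigmoid or logarithm in these losses.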

Another recent line of research in neural networks gave rise to Capsule Networks, which are reported to outperform CNNs on some tasks.<br>
These networks could be used in place of CNNs in the GAN architecture.

## References
***

1.	Generative Adversarial Networks (https://arxiv.org/abs/1406.2661)<br>
2.	GAN tutorial: https://medium.com/@awjuliani/generative-adversarial-networks-explained-with-a-classic-spongebob-squarepants-episode-54deab2fce39.<br>
3.	Generative models: https://en.wikipedia.org/wiki/Generative_model<br>
4.	Discriminative models : https://en.wikipedia.org/wiki/Discriminative_model<br>
5.	CNN: http://cs231n.github.io/convolutional-networks/<br>
6.	DCGAN: https://github.com/carpedm20/DCGAN-tensorflow<br>
7.	Hacks for GAN: https://github.com/soumith/ganhacks<br>
8.	https://arxiv.org/abs/1511.06434<br>
9.	https://towardsdatascience.com/activation-functions-neural-networks-1cbd9f8d91d6<br>
10. https://prateekvjoshi.com/2016/03/29/understanding-xavier-initialization-in-deep-neural-networks/<br>
11. http://gluon.mxnet.io/chapter14_generative-adversarial-networks/dcgan.html<br>
12. WGAN https://arxiv.org/abs/1701.07875<br>
13. https://hackernoon.com/what-is-a-capsnet-or-capsule-network-2bfbe48769cc<br>


## Licenses
***

The text in the document by Neel Indap is licensed under CC BY 3.0 https://creativecommons.org/licenses/by/3.0/us/

The code in the document by Neel Indap is licensed under the MIT License https://opensource.org/licenses/MIT

![License](https://licensebuttons.net/l/by/3.0/us/88x31.png)