## Generative Adversarial Networks

<img width="350px" src="files/Gan_Arch.png">


#### Loss function

<img width="500px" src="files/loss.png">
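
For readers who cannot see the figure, the standard minimax objective from the original GAN formulation is written out below. That `loss.png` shows exactly this form is an assumption; here $x$ is a real image, $z$ a noise vector, $G$ the generator and $D$ the discriminator.

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
$$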

*** 
#### Discriminator Objective 

Maximize the chance to recognize real images as real and generated images as fake.

#### Generator Objective 

Generate images G(z) that receive the highest possible score D(G(z)), so that the discriminator classifies them as real.
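
The two objectives can be sketched on plain lists of discriminator outputs. This is a minimal illustration, not the code in `mnist_gan.py`: the function names and the example probabilities are hypothetical, and a real implementation would work on tensors.

```python
import math

def discriminator_loss(d_real, d_fake):
    """Negative discriminator objective: -(E[log D(x)] + E[log(1 - D(G(z)))])."""
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)

def generator_loss(d_fake):
    """Minimax generator loss E[log(1 - D(G(z)))]; it falls as D(G(z)) rises."""
    return sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs (probability that the input is real).
confident = discriminator_loss(d_real=[0.9, 0.8], d_fake=[0.1, 0.2])
confused = discriminator_loss(d_real=[0.5, 0.5], d_fake=[0.5, 0.5])
print(confident < confused)  # a discriminator that separates the classes has lower loss
print(generator_loss([0.9]) < generator_loss([0.1]))  # fooling D lowers the generator loss
```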

***

Now that we have the loss function defined, let's get into training GANs.

Our main focus here is to generate data from scratch, i.e., from a noise vector.

The generator upsamples the data using a **transposed convolution** operation (a.k.a. deconvolution, a.k.a. fractionally-strided convolution).

<table style="width:100%; table-layout:fixed;">
  <tr>
    <td><img width="150px" src="files/no_padding_no_strides.gif"></td>
    <td><img width="150px" src="files/no_padding_no_strides_transposed.gif"></td>
  </tr>
  <tr>
    <td>Convolution (Downsampling)</td>
    <td>Transposed Convolution (Upsampling)</td>
  </tr>

</table>
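
The size arithmetic behind the two operations can be checked with a small sketch. The 1-D helper below is a hypothetical, pure-Python illustration (no padding, single channel) rather than a library call, and the sizes 4 and 2 with a kernel of 3 are chosen only as an example of the no-padding, unit-stride case.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial size after a convolution (downsampling)."""
    return (n + 2 * padding - kernel) // stride + 1

def transposed_conv_output_size(n, kernel, stride=1, padding=0):
    """Spatial size after a transposed convolution (upsampling)."""
    return (n - 1) * stride - 2 * padding + kernel

def transposed_conv1d(x, kernel, stride=1):
    """Each input value scatters a scaled copy of the kernel into the output."""
    out = [0.0] * transposed_conv_output_size(len(x), len(kernel), stride)
    for i, value in enumerate(x):
        for j, weight in enumerate(kernel):
            out[i * stride + j] += value * weight
    return out

print(conv_output_size(4, kernel=3))             # 2
print(transposed_conv_output_size(2, kernel=3))  # 4
print(transposed_conv1d([1, 2], [1, 1, 1]))      # [1.0, 3.0, 3.0, 2.0]
```

With `stride=2` the same two inputs spread over five outputs, which is why strided transposed convolutions are the usual way to grow a feature map.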

***
An example script for creating and running a GAN, `mnist_gan.py`, is in the repository.

- Inferno is used for training and logging
- The main Inferno training loop feeds batches of real images and trains the discriminator
- A callback periodically trains the generator
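
The resulting update pattern can be sketched without any framework. `run_schedule` and its `generator_frequency` argument are hypothetical names used for illustration; they assume the callback fires once every `generator_frequency` discriminator batches and are not Inferno's API.

```python
def run_schedule(num_batches, generator_frequency):
    """Count updates when the generator trains once per `generator_frequency` batches."""
    d_updates = 0
    g_updates = 0
    for iteration in range(1, num_batches + 1):
        d_updates += 1  # main loop: one discriminator step per batch
        if iteration % generator_frequency == 0:
            g_updates += 1  # callback: occasional generator step
    return d_updates, g_updates

print(run_schedule(num_batches=100, generator_frequency=5))  # (100, 20)
```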

If you want the full experience, run TensorBoard while the script runs.

- Live images will be drawn to the webpage as your network trains.
- A video will be rendered when training completes (if you install ffmpeg).


![tensorboard](https://github.com/cmudeeplearning11785/deep-learning-tutorials/raw/master/recitation-10/images/tensorboard1.png)


- Alternatively, the repository also contains a sample PyTorch implementation of DCGAN on MNIST: `dcgan_mnist_pytorch.ipynb`
***

In [None]:
# Import the actual code from the linked file
import mnist_gan
import mnist_wgangp
import mnist_cwgangp
import cifar10_wgangp
from IPython.core.display import HTML

### Successful GAN

First we train the GAN with settings that converge (found through trial and error). The generator and discriminator both use a learning rate of 3e-4, and the generator is trained once for every 5 discriminator updates.

In [None]:
mnist_gan.main([])


https://www.youtube.com/embed/IUi0REAWj2c?rel=0

### Failed GAN

Here we see what happens if the generator is trained too much or too little compared to the discriminator.

In [None]:
mnist_gan.main(['--generator-frequency=1', '--save-directory=output/mnist_gan/frequency-1', '--epochs=50'])

https://www.youtube.com/embed/J8m1NXLwSKw

***
## GANs can be HARD to train. 

- Non-convergence: the parameters oscillate, destabilize and never converge, often because the generator and discriminator are unbalanced.

- Mode collapse: the generator collapses onto a few modes and produces only a limited variety of samples.

<table style="width:100%; table-layout:fixed;">
  <tr>
    <td><img width="150px" src="images/mode.png"></td>
    <td><img width="150px" src="images/mode1.png"></td>
  </tr>
</table>

- Diminished gradient: the discriminator becomes so strong that the generator's gradients vanish and the generator stops learning.

- Highly sensitive to the hyperparameter selections.
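
The diminished-gradient problem can be made concrete with the derivative of the generator loss with respect to the discriminator's logit on a fake sample. This is a sketch under the assumption that D ends in a sigmoid; the non-saturating variant shown for comparison is the alternative generator loss suggested in the original GAN paper, not something the scripts here are confirmed to use.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def saturating_grad(logit):
    """d/d(logit) of log(1 - sigmoid(logit)), the minimax generator loss."""
    return -sigmoid(logit)

def non_saturating_grad(logit):
    """d/d(logit) of -log(sigmoid(logit)), the non-saturating alternative."""
    return -(1.0 - sigmoid(logit))

# A very negative logit means the discriminator is sure the sample is fake.
for logit in (0.0, -5.0, -10.0):
    print(logit, saturating_grad(logit), non_saturating_grad(logit))
```

As the discriminator grows more confident, the first gradient shrinks toward zero while the second stays close to -1.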



## Ways to improve GAN Performance

- **Feature matching** 

Minimize the statistical difference between the features of the real images and the generated images by computing the L2-distance between the means of their feature vectors. Add this L2 distance to your generator loss. Feature matching expands the goal from beating the opponent to matching features in real images. 
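
A minimal sketch of the feature-matching term, assuming the features have already been extracted from an intermediate discriminator layer and are given as lists of per-image vectors. The function names are hypothetical, and the squared L2 distance is used here, which is one common choice.

```python
def mean_features(batch):
    """Column-wise mean of a batch of feature vectors."""
    n = len(batch)
    return [sum(column) / n for column in zip(*batch)]

def feature_matching_loss(real_features, fake_features):
    """Squared L2 distance between the mean real and mean generated features."""
    mu_real = mean_features(real_features)
    mu_fake = mean_features(fake_features)
    return sum((r - f) ** 2 for r, f in zip(mu_real, mu_fake))

real = [[1.0, 2.0], [3.0, 4.0]]  # mean is [2.0, 3.0]
fake = [[0.0, 0.0], [2.0, 2.0]]  # mean is [1.0, 1.0]
print(feature_matching_loss(real, fake))  # 5.0
```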

- **Minibatch discrimination**

When mode collapse occurs, all generated images look similar. To mitigate the problem, we feed real images and generated images into the discriminator separately, in different batches, and compute the similarity of each image x to the other images in the same batch. The discriminator can use this score to detect generated images and penalize the generator when its modes are collapsing.
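
The similarity score can be sketched as follows. This is a simplified, hypothetical version that works directly on feature vectors; the published technique first projects the features through a learned tensor, which is omitted here.

```python
import math

def minibatch_similarity(features):
    """For each sample, sum exp(-L1 distance) to every other sample in the batch."""
    scores = []
    for i, f_i in enumerate(features):
        score = 0.0
        for j, f_j in enumerate(features):
            if i != j:
                l1 = sum(abs(a - b) for a, b in zip(f_i, f_j))
                score += math.exp(-l1)
        scores.append(score)
    return scores

collapsed = [[1.0, 1.0], [1.0, 1.1], [1.1, 1.0]]  # near-identical samples
diverse = [[0.0, 0.0], [3.0, 1.0], [1.0, 4.0]]    # spread-out samples
print(minibatch_similarity(collapsed))
print(minibatch_similarity(diverse))
```

A collapsed batch yields much larger scores than a diverse one, which gives the discriminator a signal it can use to flag generated batches.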

- **One sided label smoothing**

Deep networks may suffer from overconfidence: for example, a classifier may rely on very few features to classify an object. Deep learning commonly uses regularization and dropout to mitigate this. In GANs, we penalize the discriminator when its prediction for any real image goes beyond 0.9 (D(real image) > 0.9). This is done by setting the target label for real images to 0.9 instead of 1.0.
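
The effect of the smoothed target can be seen with a scalar binary cross-entropy. This is a small sketch with hypothetical helper names; only the real labels are smoothed, and fake labels stay at 0.

```python
import math

def bce(prediction, target):
    """Binary cross-entropy for one prediction strictly between 0 and 1."""
    return -(target * math.log(prediction) + (1.0 - target) * math.log(1.0 - prediction))

def discriminator_targets(num_real, num_fake, real_label=0.9):
    """One-sided smoothing: real targets become 0.9, fake targets remain 0.0."""
    return [real_label] * num_real + [0.0] * num_fake

print(discriminator_targets(2, 1))           # [0.9, 0.9, 0.0]
print(bce(0.9, 0.9) < bce(0.99, 0.9))        # True: predicting above 0.9 is penalized
print(bce(0.99, 1.0) < bce(0.9, 1.0))        # True: with hard labels it would be rewarded
```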

- **Changing the cost functions**

*WGAN-GP, LSGAN, BEGAN, DRAGAN*
***

## Wasserstein GAN with Gradient Penalty

Here we see an improvement over the traditional GAN.
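
For reference, the critic loss in the published WGAN-GP formulation is written out below; whether `mnist_wgangp.py` follows it term for term is an assumption. Here $\tilde{x} = G(z)$ is a generated sample, $\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}$ with $\epsilon \sim U[0, 1]$ is a random interpolate between a real and a generated sample, and $\lambda$ is the penalty weight.

$$
L_D = \mathbb{E}_{\tilde{x} \sim p_g}\big[D(\tilde{x})\big] - \mathbb{E}_{x \sim p_{\text{data}}}\big[D(x)\big] + \lambda\, \mathbb{E}_{\hat{x}}\Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\big)^2\Big]
$$

The last term pushes the critic's gradient norm toward 1 on the interpolates, which is how the Lipschitz constraint is enforced without weight clipping.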

In [None]:
mnist_wgangp.main([])

https://www.youtube.com/watch?v=unXILX2wp1A

## WGAN-GP on CIFAR10

For a slightly more complicated dataset, this example uses CIFAR10.

In [None]:
cifar10_wgangp.main([])

https://youtu.be/dAe-UcOfywE

## Conditional WGAN-GP on MNIST

Here we use a conditional GAN to learn each digit.
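
One common way to condition a generator is to append a one-hot label to the noise vector. The sketch below illustrates that idea only; the helper names and the noise size of 100 are hypothetical, and `mnist_cwgangp.py` may inject the label differently.

```python
import random

def one_hot(label, num_classes=10):
    """Encode a digit label as a one-hot list."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def conditional_generator_input(noise, label, num_classes=10):
    """Concatenate the noise vector with the one-hot label."""
    return noise + one_hot(label, num_classes)

rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(100)]
g_input = conditional_generator_input(noise, label=7)
print(len(g_input))   # 110
print(g_input[-10:])  # the one-hot code for digit 7
```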

In [None]:
mnist_cwgangp.main([])

https://youtu.be/_wuRRwujeHc

## How to evaluate different GAN models?

- **Inception score** 

Measures the performance of a GAN based on both the quality of the generated images and their diversity. A higher Inception Score is better.
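
The score is the exponential of the average KL divergence between each image's predicted class distribution and the marginal class distribution. The sketch below assumes the class probabilities are already available as plain lists; in practice they come from a pretrained Inception network, which is not included here.

```python
import math

def inception_score(class_probs):
    """exp of the mean KL(p(y|x) || p(y)) over the generated images."""
    n = len(class_probs)
    num_classes = len(class_probs[0])
    marginal = [sum(p[k] for p in class_probs) / n for k in range(num_classes)]
    kl_sum = 0.0
    for p in class_probs:
        # Terms with zero probability contribute nothing to the KL divergence.
        kl_sum += sum(p_k * math.log(p_k / m_k) for p_k, m_k in zip(p, marginal) if p_k > 0)
    return math.exp(kl_sum / n)

sharp_and_diverse = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
blurry = [[1 / 3, 1 / 3, 1 / 3]] * 3
print(inception_score(sharp_and_diverse))  # close to 3, the number of classes
print(inception_score(blurry))             # close to 1, the minimum
```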

- **Fréchet Inception Distance (FID)**

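In FID, we use the Inception network to extract features from an intermediate layer and model the distribution of these features as a multivariate Gaussian with mean $\mu$ and covariance $\Sigma$.
The FID between the real images $x$ and generated images $g$ is computed as

$$
\text{FID}(x, g) = \lVert \mu_x - \mu_g \rVert_2^2 + \operatorname{Tr}\left(\Sigma_x + \Sigma_g - 2\left(\Sigma_x \Sigma_g\right)^{1/2}\right)
$$

where $\operatorname{Tr}$ sums up all the diagonal elements of a matrix. A lower FID is better.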
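
For intuition, the distance can be computed by hand in the special case where both covariance matrices are diagonal: the matrix square root then reduces to an element-wise product of standard deviations. The sketch below makes that simplifying assumption loudly; a full FID uses the complete covariance matrices of Inception features, and `fid_diagonal` is a hypothetical helper.

```python
import math

def fid_diagonal(mu_x, var_x, mu_g, var_g):
    """Fréchet distance between two Gaussians with diagonal covariances."""
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_x, mu_g))
    # Tr(S_x + S_g - 2 (S_x S_g)^(1/2)) for diagonal S_x and S_g.
    cov_term = sum(vx + vg - 2.0 * math.sqrt(vx * vg) for vx, vg in zip(var_x, var_g))
    return mean_term + cov_term

print(fid_diagonal([0.0, 0.0], [1.0, 1.0], [0.0, 0.0], [1.0, 1.0]))  # 0.0
print(fid_diagonal([0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [1.0, 1.0]))  # 25.0
```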

- **Precision, Recall, F1 score**

If the generated images look similar to the real images on average, precision is high. High recall means the generator can produce any sample found in the training dataset. The F1 score is the harmonic mean of precision and recall.
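
The harmonic mean punishes an imbalance between the two quantities more than an arithmetic mean would. A minimal sketch, with hypothetical precision and recall values:

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# Realistic-looking samples but few modes covered: high precision, low recall.
print(f1_score(0.9, 0.3))  # about 0.45, well below the arithmetic mean of 0.6
```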

