# Generative Deep Learning #
## Part 1 : Review Session 2 ##
——————————————————————————————————————————

<h2> Chapter 4 : Generative Adversarial Network </h2>

 A GAN is a battle between two adversaries: the generator and the discriminator.
 
 - Generator
   
   The generator tries to convert random noise into observations that look as if they have been sampled from the original dataset.
   
 - Discriminator
 
   The discriminator tries to predict whether an observation comes from the original dataset or is one of the generator's forgeries.
   
 ![](./imgs/GenDeepLearn_Part1_Review_Pic13.jpg)
 
 A simple explanation of the training process for GANs:
 
 - At the beginning
   - The generator outputs noisy images and the discriminator predicts randomly.
 - Generator Learning
   - The generator learns to fool the discriminator, so that it can no longer tell whether an image was produced by the generator or comes from real life.
 - Discriminator Learning
   - The discriminator learns to tell real images and generated images apart.

<h3> Discriminator </h3>
 
A discriminator is essentially a supervised convolutional neural network whose objective is to classify an input image as "fake" (generated) or "real". The original GAN paper (Ian Goodfellow et al.) used a densely connected network as the discriminator; today, convolutional networks are used almost all the time. GANs built this way are called **DCGAN - Deep Convolutional Generative Adversarial Networks**.

There is nothing unique about the discriminator: it looks very much like a standard CNN classifier. Below is an example.

![](./imgs/GenDeepLearn_Part1_Review_Pic14.jpg)

<h3> Generator </h3>
Once again, the generator looks very similar to the **decoder of a VAE (Variational Auto-Encoder)**. It has

- INPUT : A vector drawn from a multivariate normal distribution
- OUTPUT : An image (1 channel), built from the input vector by **Conv2DTranspose** layers that upsample it to the size of a 1-channel image.

**Caveat - UpSampling2D layer**

In Keras, we can also use the UpSampling2D layer, which plays a similar role to the Conv2DTranspose layer. The only difference is how the image is enlarged: Conv2DTranspose fills the gaps between the original pixels with zeros (0) before convolving, whereas UpSampling2D simply repeats each row and column of the original matrix.

Both methods are acceptable in GAN generator networks. Experiments have shown that Conv2DTranspose can lead to checkerboard patterns and sometimes to less clear images. However, UpSampling2D has its own challenges, so the idea should be to try both and see which works best for a given problem.
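
As a rough illustration of the difference, the sketch below mimics both ways of enlarging a 2x2 matrix in plain NumPy. It is not Keras code: `repeat_upsample` imitates the row/column repetition of UpSampling2D, and `zero_insert` imitates only a simplified version of the zero-filling step of a stride-2 Conv2DTranspose (the learned convolution that follows is left out). Both function names are made up for this example.

```python
import numpy as np

def repeat_upsample(x, factor=2):
    """Repeat every row and column `factor` times (nearest-neighbour upsampling)."""
    return np.repeat(np.repeat(x, factor, axis=0), factor, axis=1)

def zero_insert(x, stride=2):
    """Place the original values `stride` apart and fill the gaps with zeros."""
    h, w = x.shape
    out = np.zeros((h * stride, w * stride), dtype=x.dtype)
    out[::stride, ::stride] = x
    return out

x = np.array([[1, 2],
              [3, 4]])
print(repeat_upsample(x))  # each value becomes a 2x2 block
print(zero_insert(x))      # original values separated by zeros
```

In the first result every pixel is surrounded by copies of itself; in the second, the following convolution has to "paint over" the zeros.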

An example of a generator network:

![](./imgs/GenDeepLearn_Part1_Review_Pic15.jpg)

Note the particular way in which UpSampling2D and standard Conv2D layers have been stacked to extract features for good generation.

<h3> Training GANs </h3>
Training the discriminator is straightforward, as it is a supervised learning problem: we provide "real" images with label "1" and fake images generated by the *Generator* with label "0".

Training the generator is more difficult, because we do not know where a *TRUE* image gets mapped in the latent space (the lower-dimensional space), so there is no ready-made target image for a given input vector.

<b style="color:blue">Remember, the objective of the generator is to output images that fool the discriminator; that is, the output of the discriminator for generated images should be "1".</b>

<h4> Training the Discriminator </h4>
While training the discriminator, the objective is to make the model differentiate between "real" images and "fake/generated" images. So we first train the discriminator on a batch of actual images sourced from the dataset, and then train it on a batch of "generated" images from the generator.

<img src="./imgs/GenDeepLearn_Part1_Review_Pic16.jpg" style="align:left" height="40%" width="40%"></img>
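
A minimal sketch of the labels and loss involved in these two steps, written in NumPy rather than Keras. The helper `binary_crossentropy` and the prediction arrays are illustrative assumptions, not outputs of a trained model; averaging the two losses is just one common way to report a single number.

```python
import numpy as np

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return float(-np.mean(y_true * np.log(y_pred)
                          + (1 - y_true) * np.log(1 - y_pred)))

batch_size = 4
real_labels = np.ones((batch_size, 1))   # real images are labelled 1
fake_labels = np.zeros((batch_size, 1))  # generated images are labelled 0

# Hypothetical discriminator outputs (probability that an image is real).
pred_on_real = np.array([[0.9], [0.8], [0.7], [0.95]])
pred_on_fake = np.array([[0.2], [0.1], [0.4], [0.05]])

d_loss_real = binary_crossentropy(real_labels, pred_on_real)  # step 1: real batch
d_loss_fake = binary_crossentropy(fake_labels, pred_on_fake)  # step 2: generated batch
d_loss = 0.5 * (d_loss_real + d_loss_fake)
print(d_loss_real, d_loss_fake, d_loss)
```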

<b style="color:red"> QUESTION </b>

**Why are we training the Discriminator in 2 steps, once with "good" images and then with "fake" images? Why can't we train on a collection of both "fake" + "good" images together?** - Courtesy of Elizabeth

<h4> Training the Generator </h4>
When training the generator, the following important points need to be considered:

- When training the generator, we need to freeze the weights of the discriminator. This is done because

  - The objective of the generator is to generate images that look as real as possible, which means their labels should be "1". The discriminator should predict very low probabilities for these images (as that is the objective of the Discriminator), and the generator should be penalized for not being able to bridge this difference. So please note, the labels for the images generated by the generator are "1", as the generator needs to learn to bridge the gap between "1" and the discriminator's predicted probabilities.
  
  - If we do not freeze the weights of the Discriminator, its weights would also adjust to bridge the gap between the predicted probabilities and "label = 1", which we do not want, because it would teach the Discriminator to call generated images real.

<img src="./imgs/GenDeepLearn_Part1_Review_Pic17.jpg" style="align:left" height="40%" width="40%"></img>
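
The idea can be sketched with a toy 1-D example in NumPy. Everything here is an assumption made for illustration: the generator is `G(z) = a * z + b`, the discriminator is `D(x) = sigmoid(w * x + c)`, and the gradients are written out by hand. The generator is pushed towards the label "1" while the discriminator's parameters are only read, never updated. In Keras, the same effect is typically achieved by setting `trainable = False` on the discriminator before compiling the combined model.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

# Toy 1-D models (hypothetical, for illustration only).
d_params = {"w": 2.0, "c": -1.0}   # discriminator: D(x) = sigmoid(w * x + c)
g_params = {"a": 0.1, "b": 0.0}    # generator:     G(z) = a * z + b

def generator_loss(g, d, z):
    """Binary cross-entropy of D(G(z)) against the target label 1."""
    p = sigmoid(d["w"] * (g["a"] * z + g["b"]) + d["c"])
    return float(-np.mean(np.log(p)))

def generator_step(g, d, z, lr=0.1):
    """One gradient step on the generator only; `d` is read but never modified."""
    x = g["a"] * z + g["b"]
    p = sigmoid(d["w"] * x + d["c"])
    dloss_dx = -(1.0 - p) * d["w"]   # gradient flows through the frozen discriminator
    return {"a": g["a"] - lr * float(np.mean(dloss_dx * z)),
            "b": g["b"] - lr * float(np.mean(dloss_dx))}

rng = np.random.default_rng(0)
z = rng.normal(size=64)
d_before = dict(d_params)
loss_before = generator_loss(g_params, d_params, z)
g_params = generator_step(g_params, d_params, z)
loss_after = generator_loss(g_params, d_params, z)
print(loss_before, loss_after, d_params == d_before)
```

After the step the generator's loss against label "1" is lower, and the discriminator's parameters are exactly what they were before.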

<h4> Combined learning process </h4>
We take alternate steps: first we let the Discriminator learn, then the Generator. Note that while the Generator is learning, the Discriminator's weights are frozen.

![](./imgs/GenDeepLearn_Part1_Review_Pic18.jpg)
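
The alternation can be sketched as a small loop. This is a toy 1-D GAN in NumPy under stated assumptions, not the Keras training code: the "real" data is drawn from a made-up normal distribution, both models are single linear units, and the learning rates are arbitrary demo values.

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

rng = np.random.default_rng(0)
d = np.array([0.0, 0.0])   # discriminator [w, c]: D(x) = sigmoid(w * x + c)
g = np.array([1.0, 0.0])   # generator [a, b]:     G(z) = a * z + b
d_lr, g_lr = 0.05, 0.01    # hypothetical values, chosen only for the demo

def train_discriminator(d, g, real, z, lr):
    """Update d on a real batch (label 1), then on a generated batch (label 0)."""
    fake = g[0] * z + g[1]
    for x, label in ((real, 1.0), (fake, 0.0)):
        err = sigmoid(d[0] * x + d[1]) - label        # d(BCE)/d(logit)
        d = d - lr * np.array([np.mean(err * x), np.mean(err)])
    return d

def train_generator(d, g, z, lr):
    """Update g towards label 1 while d stays fixed."""
    fake = g[0] * z + g[1]
    err = (sigmoid(d[0] * fake + d[1]) - 1.0) * d[0]  # d(BCE)/d(fake) through frozen d
    return g - lr * np.array([np.mean(err * z), np.mean(err)])

for step in range(200):
    real = rng.normal(loc=4.0, scale=1.0, size=32)    # made-up "real" data
    z = rng.normal(size=32)
    d = train_discriminator(d, g, real, z, d_lr)      # discriminator's turn
    d_frozen = d.copy()
    g = train_generator(d, g, z, g_lr)                # generator's turn
    assert np.array_equal(d, d_frozen)                # d untouched by the generator's turn
print(d, g)
```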

<b style="color:red"> IMPORTANT NOTE </b>

While training GANs, **the learning rate of the Discriminator > the learning rate of the Generator**. The main idea behind this is that your generator is only as good as your discriminator.

<h4 style="color:blue"> Discussion within MLT COMMUNITY for learning rate note above </h4>

`
Some points discussed
`


<h3> Analyzing the performance and training of GANs </h3>

![](./imgs/GenDeepLearn_Part1_Review_Pic19.jpg)

- Loss 
  
  - We can see that the loss of the Discriminator is continuously going down, signifying that as the generator keeps generating new images, the discriminator is getting stronger and stronger.
  - We can also see that the loss of the Generator is increasing. Remember, the generator's loss is calculated against the value 1, because the objective of the generator is to fool the discriminator. Since the discriminator is becoming stronger and stronger, this loss increases. However, the generator's objective was always to bridge this gap, so as the discriminator becomes stronger, the generator also becomes stronger. Also remember that the learning rate of the discriminator > the learning rate of the generator.
  
- Accuracy
  
  The accuracy graph is almost the inverse of the loss graph. The accuracy of the discriminator is continuously increasing, while that of the generator is decreasing.
  
- Generative ability of the Generator

  We can also check the generative ability of the generator by inspecting the generated images after specific epochs. We can see that as the epochs increase, the generated images get better. Lastly, the generator should not simply reproduce the real images. This is checked by computing the L1 distance between generated and real images, and then visually verifying how different the generated images are from their closest counterparts in the real dataset.

![](./imgs/GenDeepLearn_Part1_Review_Pic20.jpg)
  
![](./imgs/GenDeepLearn_Part1_Review_Pic21.jpg)
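
The closest-counterpart check can be sketched as follows. This assumes the L1 distance is taken as the mean absolute pixel difference; the function name `closest_real_image` and the tiny 2x2 "dataset" are made up for the example.

```python
import numpy as np

def closest_real_image(generated, real_images):
    """Return (index, distance) of the real image with the smallest mean L1 distance."""
    distances = np.mean(np.abs(real_images - generated), axis=(1, 2))
    idx = int(np.argmin(distances))
    return idx, float(distances[idx])

# Tiny made-up "dataset" of three 2x2 grayscale images.
real_images = np.array([
    [[0.0, 0.0], [0.0, 0.0]],
    [[1.0, 1.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
])
generated = np.array([[0.1, 0.9], [0.8, 0.0]])

idx, dist = closest_real_image(generated, real_images)
print(idx, dist)
```

A distance of exactly zero would mean the generator has copied a training image; a small but non-zero distance to the nearest neighbour is then inspected visually.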


<h3> GAN Challenges </h3>

Although GANs produce some miraculous results, they are very difficult to train. Some of the challenges while training GANs are:
 
 - Oscillating Loss
 
   The Loss/Accuracy graph shown above is an ideal one; in practice, the Generator/Discriminator loss values can spiral out of control and oscillate instead of settling. It is difficult to say when this will happen, but **VANILLA GANs** are quite susceptible to such issues. As an example:
   
   ![](./imgs/GenDeepLearn_Part1_Review_Pic22.jpg)
 
 - Mode Collapse
 
   Mode collapse happens when the Generator finds a point in latent space through which it is able to fool the discriminator, that is, the discriminator predicts a high probability that the resulting image is "real". This point is called the ***MODE***. If we keep training the generator without training the discriminator, the generator keeps sampling points from a similar area in subsequent iterations, thereby continuing to fool the discriminator and essentially reducing its loss to zero. Even if we train the discriminator after this, the generator would simply find another ***MODE*** to fool the discriminator.
 
 - Uninformative Loss
   
   From the basic concepts of ML/AI, the idea of learning for a model is to ***continuously reduce the LOSS***. However, as we see in the LOSS graphs above, the LOSS of the generator keeps increasing even while the quality of the images keeps improving. This is because the LOSS of the generator is judged against the current capability of the Discriminator, which itself keeps improving. Therefore, in essence, we should ***not compare the loss of a GENERATOR Vs DISCRIMINATOR***, nor the generator's loss at different points of training. This lack of correlation between the quality of the generated images and the generator's loss makes GAN training non-intuitive.
 
 - Hyperparameters
 
   The large number of hyperparameters and the possible combinations of Dropout, BatchNormalization etc. make it difficult to zero in on a good architecture. Also, GANs are very sensitive to slight changes in the architecture.
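
To make the "Uninformative Loss" point concrete, here is a small numeric sketch. The discriminator outputs are invented numbers, not measurements: they only show that the generator's loss (binary cross-entropy against the label 1) depends on how strict the current discriminator is, not only on image quality.

```python
import numpy as np

def generator_loss(d_pred_on_fake, eps=1e-7):
    """Binary cross-entropy of the discriminator's outputs against the label 1."""
    p = np.clip(d_pred_on_fake, eps, 1 - eps)
    return float(-np.mean(np.log(p)))

# Hypothetical discriminator outputs on generated batches.
early_weak_discriminator = np.array([0.5, 0.6, 0.4, 0.5])    # easily fooled
later_strong_discriminator = np.array([0.2, 0.1, 0.3, 0.2])  # harder to fool

loss_early = generator_loss(early_weak_discriminator)
loss_late = generator_loss(later_strong_discriminator)
print(loss_early, loss_late)
```

The later loss is higher even if the later images happen to look better, because the yardstick (the discriminator) has changed in the meantime.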
   
<h4> Tackling the problems with GAN </h4>
Certain advancements in algorithms have helped tackle the problems above. Two GAN modifications are particularly important these days:

 - WGAN
 
 - WGAN-GP

<h3> WGAN - Wasserstein GAN </h3>

 
 - The Lipschitz Constraint
 
 - Weight Clipping

<h3> Training WGAN </h3>

<h3> Analyzing WGAN </h3>

<h3> WGAN-GP </h3>

<h3> Analyzing WGAN-GP </h3>