### How GANs work

Essentially, there are two networks that are in competition with each other:

<img src="part-5_images/basic_gan.png" alt="GANs" style="width: 650px;"/>

1. Generator - a neural network that takes in random noise and tries to reshape it into output with a realistic structure.

The generator is trained in an unsupervised manner: we show the model a lot of images and ask it to output new images from the same probability distribution.

2. Discriminator - a regular neural network classifier whose role is to guide the generator towards producing realistic output. During training it is shown real data half of the time and fake (generated) data the other half, and it is trained to assign a probability near 1 to real images and near 0 to fake ones.

The generator is forced to produce better and better output in order to fool the discriminator.

The original paper for GANs: https://arxiv.org/pdf/1406.2661.pdf



### Games and Equilibria

In a very simple example:

- the cost for the discriminator is the negative of the cost for the generator.
- the generator wants to minimize a single value function, and the discriminator wants to maximize it.

If both networks are large enough, tools from game theory show that there is an equilibrium where the generator density is equal to the true data density and the discriminator outputs one half everywhere.
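
The equilibrium can be written out using the notation of the original GAN paper linked earlier. The symbols $V$, $p_{\text{data}}$, $p_z$ and $p_g$ (the generator density) come from that paper, not from these notes:

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]
$$

For a fixed generator, the discriminator that maximizes $V$ is

$$
D^*(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}
$$

so when $p_g = p_{\text{data}}$ the discriminator outputs $D^*(x) = \tfrac{1}{2}$ everywhere, and the value of the game is $V = -\log 4$.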

GANs are usually trained by running two optimization algorithms at the same time, each minimizing a player's cost with respect to the parameters. However, they do not necessarily find an equilibrium of the game. 
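
A minimal sketch of why simultaneous updates can miss the equilibrium. The toy game `V(x, y) = x * y` is an assumption chosen for illustration (it is not a GAN and does not come from these notes): player `x` minimizes `V`, player `y` maximizes it, and the only equilibrium is `(0, 0)`.

```python
import math

def simultaneous_gradient_steps(x, y, lr=0.1, steps=100):
    """Run simultaneous gradient updates on the toy game V(x, y) = x * y.

    Player x minimizes V and player y maximizes V (i.e. minimizes -V).
    Returns the final point and the distance from (0, 0) after every step.
    """
    radii = [math.hypot(x, y)]
    for _ in range(steps):
        grad_x = y  # dV/dx
        grad_y = x  # dV/dy
        # Both players move at the same time, starting from the same old point.
        x, y = x - lr * grad_x, y + lr * grad_y
        radii.append(math.hypot(x, y))
    return x, y, radii

x, y, radii = simultaneous_gradient_steps(1.0, 1.0)
print(f"distance from equilibrium: start={radii[0]:.3f}, end={radii[-1]:.3f}")
```

In this toy game each step multiplies the distance from the equilibrium by `sqrt(1 + lr**2)`, so the players spiral outwards instead of converging. Real GAN training is more complicated, but the same kind of oscillation can occur.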

### Tips for Training GANs

In general when we train a GAN:

Discriminator training

    Compute the discriminator loss on real, training images
    Generate fake images
    Compute the discriminator loss on fake, generated images
    Add up real and fake loss
    Perform backpropagation + an optimization step to update the discriminator's weights

Generator training

    Generate fake images
    Compute the discriminator loss on fake images, using flipped labels!
    Perform backpropagation + an optimization step to update the generator's weights
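
The two training phases above can be sketched on a deliberately tiny 1-D model, with the gradients worked out by hand in place of a framework's backpropagation. Everything here is a hypothetical toy: the generator `G(z) = a * z + b`, the discriminator `D(x) = sigmoid(w * x + c)`, the parameter dictionaries, and the learning rate are assumptions for illustration, not the architecture from these notes.

```python
import math
import random

def softplus(s):
    """Numerically stable log(1 + exp(s))."""
    return max(s, 0.0) + math.log1p(math.exp(-abs(s)))

def sigmoid(s):
    """Numerically stable logistic function."""
    if s >= 0:
        return 1.0 / (1.0 + math.exp(-s))
    e = math.exp(s)
    return e / (1.0 + e)

def generate(g, z):
    """Toy generator: an affine map of the noise."""
    return g["a"] * z + g["b"]

def logit(d, x):
    """Toy discriminator: returns the raw score before the sigmoid."""
    return d["w"] * x + d["c"]

def discriminator_step(d, g, x_real, z, lr=0.05):
    s_real = logit(d, x_real)             # score of the real sample
    x_fake = generate(g, z)               # generate a fake sample
    s_fake = logit(d, x_fake)             # score of the fake sample
    # Real loss (label 1) plus fake loss (label 0), both as BCE on logits.
    loss = softplus(-s_real) + softplus(s_fake)
    grad_real = sigmoid(s_real) - 1.0     # d(loss)/d(s_real)
    grad_fake = sigmoid(s_fake)           # d(loss)/d(s_fake)
    # Gradient step on the discriminator's parameters only.
    d["w"] -= lr * (grad_real * x_real + grad_fake * x_fake)
    d["c"] -= lr * (grad_real + grad_fake)
    return loss

def generator_step(d, g, z, lr=0.05):
    x_fake = generate(g, z)
    s_fake = logit(d, x_fake)
    loss = softplus(-s_fake)              # flipped label: fake scored against label 1
    dloss_dx = (sigmoid(s_fake) - 1.0) * d["w"]
    # Gradient step on the generator's parameters only.
    g["a"] -= lr * dloss_dx * z
    g["b"] -= lr * dloss_dx
    return loss

random.seed(0)
d = {"w": 0.1, "c": 0.0}
g = {"a": 1.0, "b": 0.0}
for step in range(200):
    x_real = random.gauss(4.0, 0.5)       # "real" data: samples around 4.0
    d_loss = discriminator_step(d, g, x_real, random.gauss(0.0, 1.0))
    g_loss = generator_step(d, g, random.gauss(0.0, 1.0))
print(f"d_loss={d_loss:.3f} g_loss={g_loss:.3f} generator offset b={g['b']:.3f}")
```

Note that each step updates only one player's parameters: the discriminator step leaves `g` untouched, and the generator step leaves `d` untouched.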



<img src="part-5_images/gan_architecture_example.png" alt="GAN architecture" style="width: 600px;"/>

**Activation functions:**
- Leaky ReLU - keeps a small, non-zero gradient for negative inputs, which makes sure the gradient can flow through the entire architecture
- tanh - a popular choice for the output of the generator; its output lies in (-1, 1), so the real training images need to be rescaled to that range as well
- sigmoid - used on the discriminator output to constrain it to a probability in (0, 1)
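
As a quick illustration, here are the three activations in plain Python, plus the pixel rescaling that a tanh output implies. The helper names and the negative slope of 0.2 are assumptions for this sketch; frameworks ship their own versions with their own defaults.

```python
import math

def leaky_relu(x, negative_slope=0.2):
    """Pass positive inputs through; scale negative inputs by a small slope."""
    return x if x > 0 else negative_slope * x

def sigmoid(x):
    """Squash a score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def scale_to_tanh_range(pixel):
    """Rescale a pixel value from [0, 1] to [-1, 1], the range of tanh."""
    return pixel * 2.0 - 1.0

for x in (-2.0, 0.0, 2.0):
    print(x, leaky_relu(x), math.tanh(x), sigmoid(x))
```

Unlike a plain ReLU, `leaky_relu` returns a non-zero value (and has a non-zero slope) for negative inputs, which is what keeps gradients flowing.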

<img src="part-5_images/stable_loss_gans.png" alt="BCE stable" style="width: 600px;"/>


**Two optimization algorithms**
- Adam is a good choice for both optimizers (also used in DCGANs)
- a Binary Cross Entropy loss (`BCELoss`) is used to calculate the loss
- In practice, prefer `BCEWithLogitsLoss`, which combines the sigmoid and the BCE in a single, numerically stable step; the discriminator then outputs raw scores (logits) instead of probabilities
- For the generator loss, set up another BCE but with the labels flipped (fake images are labeled as real)
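
To see why the logits version matters, here is a pure-Python sketch that compares a naive sigmoid-then-log BCE with a stable formulation working directly on the logit. This illustrates the idea only and is not PyTorch's implementation; the function names are hypothetical.

```python
import math

def bce_naive(logit, label):
    """Sigmoid followed by BCE; breaks when the sigmoid rounds to exactly 0 or 1."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))

def bce_with_logits(logit, label):
    """Same loss computed from the logit: max(x, 0) - x*y + log(1 + exp(-|x|))."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))

# For moderate logits, both versions agree.
print(bce_naive(0.5, 1.0), bce_with_logits(0.5, 1.0))

# A confident but wrong discriminator score: the naive version fails.
try:
    bce_naive(40.0, 0.0)
except ValueError as err:
    print("naive BCE failed:", err)
print("stable BCE:", bce_with_logits(40.0, 0.0))

# Flipped labels: the generator scores its fakes against the "real" label 1.
fake_logit = -1.5
discriminator_loss_on_fake = bce_with_logits(fake_logit, 0.0)
generator_loss = bce_with_logits(fake_logit, 1.0)
print(discriminator_loss_on_fake, generator_loss)
```

With a logit of 40 the sigmoid rounds to exactly 1.0 in floating point, so the naive version ends up taking `log(0)`. The stable version never forms that probability.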

To scale up classifiers to work with larger images, convolutional networks are used.

<img src="part-5_images/gan_conv.png" alt="GAN Convolution" style="width: 600px;"/>

Use Batch Normalization in most layers, except on the output layer of the generator and the input layer of the discriminator.
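
For reference, a minimal sketch of what batch normalization computes for a single feature across a batch, in training mode. The function name and the defaults for `gamma`, `beta` and `eps` are assumptions for illustration; real layers also learn `gamma` and `beta` and track running statistics for inference.

```python
import math

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a batch to zero mean and (almost) unit variance."""
    mean = sum(values) / len(values)
    # Biased variance (divide by the batch size), as used for the normalization itself.
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]

activations = [1.0, 2.0, 3.0, 4.0]
normalized = batch_norm(activations)
print(normalized)
```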

Improved training techniques for GANs: https://s3.amazonaws.com/video.udacity-data.com/topher/2018/November/5bea0c6a_improved-training-techniques/improved-training-techniques.pdf

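#### The universal approximation theorem

The universal approximation theorem states that a feed-forward network with a single hidden layer and a suitable non-linear activation can approximate any continuous function on a closed, bounded input domain to arbitrary accuracy, given enough hidden units. It guarantees that such a network exists, not that training will find it.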
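
A small sketch of the idea for a 1-D function, under stated assumptions: the weights are constructed by hand (not trained) so that a single hidden layer of ReLU units reproduces a piecewise-linear interpolation of the target. The helper names and the choice of `sin` on `[0, pi]` are hypothetical, picked only for illustration.

```python
import math

def relu(x):
    return max(x, 0.0)

def build_relu_network(f, lo, hi, hidden_units):
    """Return a one-hidden-layer ReLU network that linearly interpolates f on [lo, hi]."""
    knots = [lo + (hi - lo) * i / hidden_units for i in range(hidden_units + 1)]
    slopes = [(f(knots[i + 1]) - f(knots[i])) / (knots[i + 1] - knots[i])
              for i in range(hidden_units)]
    # The output weight of hidden unit i is the change in slope at knot i.
    weights = [slopes[0]] + [slopes[i] - slopes[i - 1] for i in range(1, hidden_units)]
    bias = f(lo)

    def network(x):
        # Hidden unit i computes relu(x - knots[i]); the output is a weighted sum.
        return bias + sum(w * relu(x - k) for w, k in zip(weights, knots))

    return network

def max_error(network, f, lo, hi, samples=1000):
    """Largest absolute error on an evenly spaced grid."""
    grid = (lo + (hi - lo) * i / samples for i in range(samples + 1))
    return max(abs(network(x) - f(x)) for x in grid)

for units in (5, 50):
    net = build_relu_network(math.sin, 0.0, math.pi, units)
    print(units, "hidden units, max error:", max_error(net, math.sin, 0.0, math.pi))
```

Adding hidden units shrinks the error, which is the behaviour the theorem promises. In practice the weights are learned, and the theorem says nothing about how easy that is.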

https://medium.com/@jonathan_hui/gan-whats-generative-adversarial-networks-and-its-application-f39ed278ef09
    
https://skymind.ai/wiki/generative-adversarial-network-gan