# GAN: Generative Adversarial Network
*Sources:*  
`"What is a GAN?", AWS. https://aws.amazon.com/what-is/gan/`  
`"What are GANs (Generative Adversarial Networks)?", IBM Technology, YouTube. https://www.youtube.com/watch?v=TpMIssRdhco`  
`"Generative Adversarial Network (GAN)", Geeksforgeeks.org. https://www.geeksforgeeks.org/generative-adversarial-network-gan/`  
`"DCGAN Tutorial", Nathan Inkawhich, PyTorch.org. https://pytorch.org/tutorials/beginner/dcgan_faces_tutorial.html`  

## Overview
A GAN trains two neural networks to compete against each other in order to generate new data that resembles a given training dataset. For example, a GAN can generate new images from an existing image database. The first network creates new data by transforming a random input vector into a sample, then the second network tries to determine whether a given sample is original or generated. The GAN keeps improving the generated data until the second network can no longer tell the difference between the original set and the generated samples.  

In machine learning, GAN models can be used for data augmentation by creating synthetic data. The synthetic data can be used to build a larger or more complete dataset, for example by filling in missing values.  

GANs are a type of neural network (NN) used for unsupervised learning.

## Structure
A GAN is made up of a __generator__ network and a __discriminator__ network.  
Basic steps:
1. The generator analyzes the training set and identifies attributes in the data.
2. The discriminator also analyzes the training set and independently learns to distinguish between attributes.
3. The generator modifies data by adding noise or random changes to certain attributes.
4. The generator passes the modified data to the discriminator.
5. The discriminator calculates the probability that the generated output is from the original dataset.
6. The discriminator gives the generator guidance to reduce randomization in the next cycle.  

The discriminator is trained first. It is given the initial dataset, and once it can recognize attributes of the training data, it is fed samples that don't belong to the original set to test whether it can actually discriminate between the two.

### How it works:
The generator and discriminator go back and forth. The generator sends a generated sample to the discriminator, and the discriminator predicts whether it is fake or real. There is always a winner: if the discriminator correctly identifies the sample as fake, the discriminator wins, and if it is wrong, the generator wins. In this simplified picture, the loser updates its model and the winner remains unchanged. This cycle continues until the generator gets good enough that the discriminator can no longer reliably tell generated samples from real ones.  

The discriminator's output is the probability that a sample x came from the training set. It should be HIGH when x comes from the training set and LOW when x comes from the generator. The discriminator can be thought of as a binary classifier.  
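
Since the discriminator behaves like a binary classifier, one common way to score its output is binary cross-entropy. The sketch below is only an illustration under that assumption; the function name `bce` and the example probabilities are made up for this note.

```python
import math

def bce(prob_real, is_real):
    """Binary cross-entropy for a single discriminator output.

    prob_real: the discriminator's estimate that the sample is from the training set.
    is_real:   True if the sample really came from the training set.
    """
    eps = 1e-12  # keep log() away from zero
    p = min(max(prob_real, eps), 1.0 - eps)
    return -math.log(p) if is_real else -math.log(1.0 - p)

# A discriminator that is confidently right gets a small loss...
loss_real_correct = bce(0.9, is_real=True)    # real sample scored HIGH
loss_fake_correct = bce(0.1, is_real=False)   # generated sample scored LOW
# ...and one that is confidently wrong gets a large loss.
loss_fake_fooled = bce(0.9, is_real=False)    # generated sample scored HIGH

print(loss_real_correct, loss_fake_correct, loss_fake_fooled)
```

The third case is the one the generator is hoping for: the discriminator's loss is large because it was fooled.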

The generator takes in a latent space vector, z. The generator will map z to the dataset space, and its goal is to estimate the distribution that the training data comes from, so it can generate samples that match that distribution.  

G and D play a "minimax" game, where D tries to maximize the probability that it correctly classifies samples as real or fake, and G tries to minimize the probability that D will predict its outputs are fake.
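
For reference, the minimax game is usually written as the value function below (the standard formulation from the original GAN paper, also shown in the PyTorch DCGAN tutorial). Here $p_{data}$ is the training data distribution and $p_z$ is the distribution the latent vectors are drawn from; these symbols do not appear elsewhere in these notes.

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}\big[\log D(x)\big] + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z))\big)\big]
$$

The first term rewards D for scoring real samples HIGH, and the second rewards D for scoring generated samples LOW, which is exactly the quantity G tries to push down.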

### DCGAN: Deep Convolutional GAN
In a DCGAN, both models are CNNs.   

The discriminator is made up of convolution layers, batch normalization layers, and LeakyReLU activations. (Recall that batch normalization layers normalize the output of the previous layer by subtracting the batch mean and dividing by the batch standard deviation to improve convergence.) The input is a 3D image (the third dimension is the number of channels), and the output is a scalar probability.  
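
As a reminder of what the normalization step does, here is a minimal sketch on a flat list of numbers. It is a simplification: real batch norm layers work per channel over a batch and also learn a scale and shift, which are left out here. The function name and sample values are made up for illustration.

```python
import statistics

def batch_norm(values, eps=1e-5):
    """Subtract the mean and divide by the standard deviation.

    eps avoids division by zero when all values are equal.
    The learnable scale/shift of a real batch norm layer is omitted.
    """
    mean = statistics.fmean(values)
    var = statistics.pvariance(values, mu=mean)
    return [(v - mean) / (var + eps) ** 0.5 for v in values]

activations = [2.0, 4.0, 6.0, 8.0]
normalized = batch_norm(activations)
print(normalized)  # values are now centered on 0 with roughly unit variance
```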

The generator is made of convolutional-transpose layers, batch norm layers, and ReLU activations. The input is a latent vector z, and the output is a 3D image. The conv-transpose layers transform the latent vector into a volume with the same shape as the output image.
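
To see how conv-transpose layers grow a latent vector into an image, the sketch below tracks only the spatial size (height/width) through a stack of layers. It uses the usual transposed-convolution size formula without dilation or output padding. The kernel/stride/padding values are an assumption taken from the cited PyTorch DCGAN tutorial, where a 1x1 latent input ends up as a 64x64 image.

```python
def conv_transpose_out(size, kernel, stride, padding):
    """Output height/width of a transposed convolution
    (assuming no dilation and no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel

# (kernel, stride, padding) for each conv-transpose layer (assumed values)
layers = [(4, 1, 0), (4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 2, 1)]

size = 1          # the latent vector z is treated as a 1x1 "image"
sizes = [size]
for kernel, stride, padding in layers:
    size = conv_transpose_out(size, kernel, stride, padding)
    sizes.append(size)

print(sizes)  # [1, 4, 8, 16, 32, 64]
```

Each stride-2 layer doubles the spatial size, which is the mirror image of how strided convolutions in the discriminator shrink it.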

# Autoencoders
*Sources:*   
`"What is an autoencoder?", IBM.com. https://www.ibm.com/think/topics/autoencoder`  
`"Introduction to Autoencoders: From the Basics to Advanced Applications in PyTorch", Pier Paolo Ippolito, datacamp.com. https://www.datacamp.com/tutorial/introduction-to-autoencoders`  
`"Introduction to autoencoders." Jeremy Jordan, jeremyjordan.me. https://www.jeremyjordan.me/autoencoders/`  


An autoencoder is a neural network designed to compress data down to its essential features, then reconstruct the original input from the compressed representation. Autoencoders use unsupervised learning to find the fundamental latent variables (underlying or hidden variables) in the input data. The autoencoder then uses this knowledge to construct a latent space representation containing only the most essential information, and reconstructs the input from that representation.  

Autoencoders are used for things like data compression, anomaly detection, noise reduction, and facial recognition.

Autoencoders are a specific kind of encoder-decoder model: they are trained without labels to reconstruct their own input data. They are often considered "self-supervised" because they can compare their output to the original, "ground-truth" input.  

In some applications, the decoder's only purpose is to train the encoder, and it can be discarded after training. 

### Structure
An autoencoder is made of an Encoder and a Decoder.   

Hyperparameters to tune:  
- number of layers in encoder and decoder networks 
- number of nodes in encoder and decoder layers
- loss function used for optimization
- size of the latent space
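
To make these hyperparameters concrete, the sketch below turns a hypothetical choice of layer sizes into encoder and decoder shapes and counts the weights of fully connected layers. The helper names and the numbers (784 inputs, hidden layers of 128 and 32, latent size 8) are made up for illustration, and a mirrored decoder is a common convention rather than a requirement.

```python
def mirrored_layer_sizes(input_dim, hidden_dims, latent_dim):
    """Return (encoder_sizes, decoder_sizes) where the decoder mirrors the encoder."""
    encoder = [input_dim, *hidden_dims, latent_dim]
    decoder = encoder[::-1]
    return encoder, decoder

def dense_param_count(sizes):
    """Weights plus biases for a stack of fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

encoder, decoder = mirrored_layer_sizes(input_dim=784, hidden_dims=[128, 32], latent_dim=8)
total_params = dense_param_count(encoder) + dense_param_count(decoder)

print(encoder)  # [784, 128, 32, 8]
print(decoder)  # [8, 32, 128, 784]
print(total_params)
```

Changing the number of hidden layers, their widths, or the latent size changes both the capacity of the model and how aggressively the input is compressed.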

#### Encoder
Autoencoders discover latent variables by creating a "bottleneck" during the encoding process. This bottleneck forces the encoder to extract only the most essential information.  

The encoder is made of layers of progressively decreasing size, which put the data through dimensionality reduction.

#### Bottleneck / "code"
The code contains the most compressed, lowest-dimensional version of the input. It is the output of the last encoder layer and the input of the first decoder layer. Ideally, the code dimension is chosen to be close to the minimum number of features needed for accurate reconstruction.  

#### Decoder
The decoder increases the dimensionality of the code, as it is made of layers of progressively increasing size. The decoder is meant to "reconstruct" the original data. The output of the decoder is compared to the original input to judge how well the autoencoder did; the difference between the two is called the reconstruction error.
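
One common way to measure the reconstruction error is mean squared error (MSE) between the input and the decoder's output. The sketch below assumes MSE; other losses are possible, and the sample values are made up.

```python
def mse(original, reconstructed):
    """Mean squared error between an input and its reconstruction."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)

x = [0.0, 0.5, 1.0, 0.5]        # original input
x_hat = [0.1, 0.4, 0.9, 0.5]    # hypothetical decoder output

error = mse(x, x_hat)
print(error)  # close to 0.0075; a perfect reconstruction would give 0.0
```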

### Types of Autoencoders
1. Undercomplete autoencoder
The size of the bottleneck is fixed and is always smaller than the original input size, which limits overfitting. The model is trained on reconstruction loss, and the restricted hidden layer size keeps the model from simply memorizing the input data.

2. Sparse autoencoders
For sparse autoencoders, a regularization technique is applied to encourage good generalization. Instead of setting a smaller size for the inner layers, the loss function includes a term that penalizes activations within a layer. The network is encouraged to learn encoding/decoding while activating only a few neurons at a time. This is different from the usual approach to regularization, which generally penalizes the weights, not the activations.  

3. Denoising autoencoders
The input and the target output are no longer the same for denoising autoencoders. This version is trained on slightly corrupted inputs, while the clean originals are kept as the reconstruction targets, in order to create a more generalized model.
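
The sparse and denoising variants mostly differ from the basic autoencoder in how the training loss and training pairs are set up. The sketch below illustrates both ideas without an actual network: an L1 penalty on the bottleneck activations (one common choice for sparsity, assumed here) and a noisy input paired with a clean target. All names, weights, and sample values are made up.

```python
import random

def mse(a, b):
    """Mean squared error between two equal-length lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def sparse_loss(original, reconstructed, code_activations, weight=0.01):
    """Reconstruction error plus an L1 penalty on the code activations."""
    l1 = sum(abs(a) for a in code_activations)
    return mse(original, reconstructed) + weight * l1

def corrupt(sample, noise_std=0.1, rng=None):
    """Add Gaussian noise to a sample (one simple form of corruption)."""
    rng = rng or random.Random(0)
    return [v + rng.gauss(0.0, noise_std) for v in sample]

clean = [0.0, 0.5, 1.0, 0.5]

# Sparse: with the same reconstruction, a code with few active units costs less.
dense_code = [0.6, -0.4, 0.5, 0.7]
sparse_code = [0.9, 0.0, 0.0, 0.0]
loss_dense = sparse_loss(clean, clean, dense_code)
loss_sparse = sparse_loss(clean, clean, sparse_code)

# Denoising: the encoder sees the noisy version, the loss compares against the clean one.
noisy = corrupt(clean)
training_pair = (noisy, clean)  # (network input, reconstruction target)

print(loss_dense, loss_sparse)
```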