### References - The text in this notebook is mostly copied and summarized from the references below

1. Generative Adversarial Nets - Ian Goodfellow et al.
2. Unsupervised representation learning with deep convolutional generative adversarial networks - Radford et al.
3. Least squares generative adversarial networks - Mao et al.
4. Wasserstein generative adversarial networks - Arjovsky et al.
5. Generative Adversarial Networks - Jason Brownlee 
6. Deep Learning with Python - Francois Chollet

### Generative Modelling

Generative modelling is an unsupervised machine learning task that aims to discover the patterns and regularities in the input data. The model is trained so that the samples it generates are sufficiently close to the samples in the input domain. In essence, generative modelling is a technique for generating 'fake' samples that are hard to identify as fake.

### Supervised vs Unsupervised

A typical machine learning problem involves using a model to make a prediction, i.e. predictive
modeling. This requires a training dataset comprised of multiple examples, called samples, each
with input variables (X) and an output class label (y). A model is trained by showing it example
inputs, having it predict outputs, and correcting it to make the predicted outputs more like the
expected outputs.

This correction of the model is generally referred to as a supervised form of learning, or
supervised learning. It is supervised because there is a real expected outcome to which a
prediction is compared.
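
The predict, compare, and correct cycle can be sketched with a tiny classifier. This is only an illustration under assumed values: the one-variable dataset, the learning rate, and the number of epochs are hypothetical, and logistic regression is used simply because it is one of the smallest supervised models.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical labeled dataset: one input variable x, class label y in {0, 1}.
X = [rng.gauss(-2.0, 1.0) for _ in range(100)] + [rng.gauss(2.0, 1.0) for _ in range(100)]
y = [0] * 100 + [1] * 100

w, b, lr = 0.0, 0.0, 0.1  # model parameters and an assumed learning rate
for epoch in range(200):
    grad_w = grad_b = 0.0
    for xi, yi in zip(X, y):
        pred = 1.0 / (1.0 + math.exp(-(w * xi + b)))  # predict an output
        error = pred - yi                              # compare with the expected output
        grad_w += error * xi
        grad_b += error
    # Correct the model so its outputs move toward the expected outputs.
    w -= lr * grad_w / len(X)
    b -= lr * grad_b / len(X)

accuracy = sum((w * xi + b > 0) == (yi == 1) for xi, yi in zip(X, y)) / len(X)
print(f"training accuracy: {accuracy:.2f}")
```

The `error` term exists only because every sample carries a label; without `y` there would be nothing to compare the prediction against.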

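In contrast, when the model is given only the input variables (X) and there are no output
labels (y), there is nothing to compare a prediction against and therefore no correction.
The model is instead built by summarizing the patterns in the input data. This is generally
referred to as an unsupervised form of learning, or unsupervised learning.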



### Discriminative vs Generative Modelling

In supervised learning, we may be interested in developing a model to predict a class label
given an example of input variables. This predictive modeling task is called classification.
Classification is also traditionally referred to as discriminative modeling. 

This is because a model must discriminate examples of input variables across classes; it must
choose or make a decision as to which class a given example belongs to.

Alternatively, unsupervised models that summarize the distribution of input variables can be
used to create or generate new examples in the input distribution. As such, these
types of models are referred to as generative models.

For example, a single variable may have a known data distribution, such as a Gaussian
distribution, or bell shape. A generative model may be able to sufficiently summarize this data
distribution, and then be used to generate new examples that plausibly fit into the distribution
of the input variable.

In fact, a really good generative model may be able to generate new examples that are not
just plausible, but __indistinguishable__ from real examples from the problem domain.
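
The Gaussian example can be made concrete with a minimal sketch. The "real" data here is hypothetical (a Gaussian with mean 5 and standard deviation 2), and the "model" is nothing more than the estimated mean and standard deviation.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical "real" data: 1,000 draws from a Gaussian with mean 5 and std 2.
real_samples = [rng.gauss(5.0, 2.0) for _ in range(1000)]

# "Training": summarize the data distribution with two parameters.
mu_hat = statistics.mean(real_samples)
sigma_hat = statistics.stdev(real_samples)

# "Generation": draw new examples from the fitted distribution.
generated = [rng.gauss(mu_hat, sigma_hat) for _ in range(1000)]

print(f"fitted mean={mu_hat:.2f}, std={sigma_hat:.2f}")
print(f"generated mean={statistics.mean(generated):.2f}")
```

The generated values are new numbers that never appeared in `real_samples`, yet they plausibly fit the same distribution.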

### Examples of Generative Models

1. Naive Bayes is an example of a generative model that is more often used as a discriminative model. For example, Naive Bayes works by summarizing the probability distribution of each input variable and the output class. When a prediction is made, the probability for each possible outcome is calculated for each variable, the independent probabilities are combined, and the most likely outcome is predicted. Used in reverse, the probability distributions for each variable can be sampled to generate new plausible (independent) feature values.
2. Latent Dirichlet Allocation, or LDA
3. Gaussian Mixture Model, or GMM. 
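
As a rough illustration of using Naive Bayes "in reverse", the sketch below fits a class prior and an independent Gaussian per feature and class, then samples from them. The Gaussian assumption, the toy dataset, and the helper names `fit_gaussian_nb` and `sample` are hypothetical choices made for this example.

```python
import random
import statistics
from collections import defaultdict

def fit_gaussian_nb(X, y):
    """Summarize the class prior and a per-class, per-feature Gaussian."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        prior = len(rows) / len(X)
        # One (mean, std) pair per feature, estimated independently.
        params = [(statistics.mean(col), statistics.stdev(col)) for col in zip(*rows)]
        model[label] = (prior, params)
    return model

def sample(model, rng):
    """Draw a class from the prior, then each feature independently."""
    labels = list(model)
    priors = [model[label][0] for label in labels]
    label = rng.choices(labels, weights=priors, k=1)[0]
    features = [rng.gauss(mu, sigma) for mu, sigma in model[label][1]]
    return features, label

rng = random.Random(0)
# Hypothetical two-feature dataset with two well-separated classes.
X = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(200)]
X += [[rng.gauss(4.0, 1.0), rng.gauss(4.0, 1.0)] for _ in range(200)]
y = [0] * 200 + [1] * 200

model = fit_gaussian_nb(X, y)
new_features, new_label = sample(model, rng)
print(new_label, [round(v, 2) for v in new_features])
```

Because each feature is sampled independently given the class, any correlation between features in the real data is lost, which is the "independent" caveat mentioned in the first list item.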

Deep learning methods can be used as generative models. Two popular examples include:
1. Restricted Boltzmann Machine, or RBM
2. Deep Belief Network, or DBN. 

Two modern examples of deep learning generative modeling algorithms include:
1. Variational Autoencoder, or VAE 
2. Generative Adversarial Network, or GAN.

### Generative Adversarial Network

GANs tackle the unsupervised task of generative modelling in a unique way. They frame the training of a generative model as a supervised learning task by introducing two sub-models:

1. Generator: This model generates new samples.
2. Discriminator: This model tries to classify each sample it receives as either real (coming from the input domain) or fake (coming from the generator).

The two models are trained together in an adversarial zero-sum game until the discriminator is fooled about half of the time, that is, until it can no longer reliably separate generated samples from real ones. At this point in training the generator model should be able to produce sufficiently good 'fake' samples.

Generative adversarial networks are based on a __game theoretic__ scenario in which
the generator network must compete against an __adversary__. The generator network
directly produces samples. Its adversary, the discriminator network, attempts to
distinguish between samples drawn from the training data and samples drawn from
the generator.

### The Generator Model
The generator model takes a fixed-length random vector as input and generates a sample in the
domain, such as an image. 

-> A vector is drawn randomly from a __Gaussian distribution__ and is
used to seed the generative process, acting as its source of noise. To be clear, the input is a vector of
random numbers. It is not an image or a flattened image and has no meaning other than the
meaning applied by the generator model. 

-> After training, points in this multidimensional vector
space will correspond to points in the problem domain, forming a compressed representation
of the data distribution. This vector space is referred to as a __latent space__, or a vector space
comprised of latent variables.

-> __Latent variables__, or hidden variables, are those variables that are
important for a domain but are __not directly observable__. A latent space provides a compressed, high-level representation of the
observed raw data, such as the input data distribution.
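
The input and output of a generator can be sketched as follows. This is an untrained stand-in, not a real GAN generator: the sizes `LATENT_DIM` and `OUTPUT_DIM`, the single linear layer, and the `tanh` activation are all assumptions chosen to keep the example small.

```python
import math
import random

LATENT_DIM = 8   # hypothetical size of the latent vector
OUTPUT_DIM = 16  # hypothetical size of one sample (e.g. a flattened 4x4 image)

rng = random.Random(0)

def sample_latent(n, rng):
    """Draw n latent vectors from a standard Gaussian."""
    return [[rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)] for _ in range(n)]

# Untrained stand-in for the generator: one linear layer followed by tanh.
weights = [[rng.gauss(0.0, 0.1) for _ in range(LATENT_DIM)] for _ in range(OUTPUT_DIM)]

def generator(z):
    """Map one latent vector to one sample with values between -1 and 1."""
    return [math.tanh(sum(w * zi for w, zi in zip(row, z))) for row in weights]

batch = [generator(z) for z in sample_latent(4, rng)]
print(len(batch), len(batch[0]))
```

The latent vectors carry no meaning on their own; whatever structure the latent space acquires comes from how training shapes `weights`.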

### The Discriminator Model
The discriminator model takes an example from the problem domain as input (real or generated)
and predicts a binary class label of real or fake (generated). The real example comes from the
training dataset. The generated examples are output by the generator model. The discriminator
is a normal classification model.
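
Since the discriminator is an ordinary binary classifier, its core can be sketched as a logistic model with a binary cross-entropy loss. The parameter values and the two example inputs below are hypothetical and chosen only to show the shape of the computation; in practice the discriminator would be a trained neural network.

```python
import math

def discriminator(x, w, b):
    """Return the predicted probability that sample x is real."""
    logit = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-logit))

def binary_cross_entropy(p, label):
    """Loss for one prediction p against label 1 (real) or 0 (fake)."""
    eps = 1e-12  # avoids log(0)
    return -(label * math.log(p + eps) + (1 - label) * math.log(1.0 - p + eps))

# Hypothetical parameters and inputs, for illustration only.
w, b = [1.5, -0.5], 0.0
real_example, fake_example = [2.0, 0.0], [-2.0, 0.0]

p_real = discriminator(real_example, w, b)  # logit = 3.0
p_fake = discriminator(fake_example, w, b)  # logit = -3.0
loss = binary_cross_entropy(p_real, 1) + binary_cross_entropy(p_fake, 0)
print(f"p_real={p_real:.3f}, p_fake={p_fake:.3f}, loss={loss:.3f}")
```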

### GAN as a Two Player Game
Generative modeling is an unsupervised learning problem. A clever property of the GAN architecture is that the training of the generative model
is framed as a supervised learning problem. The two models, the generator and discriminator,
are trained together. The generator generates a batch of samples, and these, along with real
examples from the domain, are provided to the discriminator and classified as real or fake.
The discriminator is then updated to get better at discriminating real and fake samples
in the next round, and importantly, the generator is updated based on how well, or not, the
generated samples fooled the discriminator.
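
The alternating updates described above can be sketched on a one-dimensional toy problem. Everything here is an assumption made for illustration: the real data is taken to be Gaussian with mean 2, the generator has a single trainable shift `b`, the discriminator is a logistic model, and the gradients are written out by hand instead of using a deep learning library. A linear discriminator can only compare means, so only the shift is learned.

```python
import math
import random

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

rng = random.Random(0)
REAL_MEAN = 2.0                   # hypothetical target: real data ~ N(2, 1)
BATCH, STEPS, LR = 64, 3000, 0.05 # assumed hyperparameters

b = 0.0          # generator parameter: G(z) = z + b
w, c = 0.0, 0.0  # discriminator parameters: D(x) = sigmoid(w * x + c)

for step in range(STEPS):
    real = [rng.gauss(REAL_MEAN, 1.0) for _ in range(BATCH)]
    fake = [rng.gauss(0.0, 1.0) + b for _ in range(BATCH)]

    # Discriminator update: descend binary cross-entropy with real=1, fake=0.
    grad_w = grad_c = 0.0
    for x, label in [(x, 1.0) for x in real] + [(x, 0.0) for x in fake]:
        err = sigmoid(w * x + c) - label  # d(loss)/d(logit)
        grad_w += err * x
        grad_c += err
    w -= LR * grad_w / BATCH
    c -= LR * grad_c / BATCH

    # Generator update: label fresh fakes as real and descend -log D(G(z)) in b.
    fake = [rng.gauss(0.0, 1.0) + b for _ in range(BATCH)]
    grad_b = sum((sigmoid(w * x + c) - 1.0) * w for x in fake) / BATCH
    b -= LR * grad_b

# Average discriminator output on fresh generated samples.
p_fake = sum(sigmoid(w * (rng.gauss(0.0, 1.0) + b) + c) for _ in range(1000)) / 1000
print(f"learned shift b={b:.2f} (target {REAL_MEAN}), mean D(fake)={p_fake:.2f}")
```

Note that the generator never sees the real data directly; its only training signal is the discriminator's response to its samples.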

We can think of the generator as being like a counterfeiter, trying to make fake
money, and the discriminator as being like police, trying to allow legitimate money
and catch counterfeit money. To succeed in this game, the counterfeiter must learn
to make money that is indistinguishable from genuine money, and the generator
network must learn to create samples that are drawn from the same distribution as
the training data.

In this way, the two models are competing against each other. They are adversarial in the
game theory sense and are playing a zero-sum game.

Because the GAN framework can naturally be analyzed with the tools of game
theory, we call GANs “adversarial”.
— __NIPS 2016 Tutorial: Generative Adversarial Networks, 2016.__

In this case, zero-sum means that when the discriminator successfully identifies real and
fake samples, it is rewarded and no change is needed to its model parameters, whereas the
generator is penalized with large updates to its model parameters. Alternatively, when the generator
fools the discriminator, it is rewarded and no change is needed to its model parameters, but
the discriminator is penalized and its model parameters are updated.

In the limit, the generator generates perfect replicas from the input domain every time, and
the discriminator cannot tell the difference and predicts 'unsure' (e.g. 50% for real and fake) in
every case. This is an idealized case; we do not need to get to this point to
arrive at a useful generator model.
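
For reference, the game can be written as the minimax objective from Goodfellow et al. (reference 1). The symbols $p_{\text{data}}$ (real data distribution), $p_z$ (latent noise distribution) and $p_g$ (distribution of generated samples) are notation from that paper and are not used elsewhere in this notebook.

$$
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
$$

For a fixed generator, the discriminator that maximizes $V$ is

$$
D^*(x) = \frac{p_{\text{data}}(x)}{p_{\text{data}}(x) + p_g(x)}
$$

so when the generator matches the data distribution exactly ($p_g = p_{\text{data}}$), $D^*(x) = \tfrac{1}{2}$ for every $x$. This is the 50% "unsure" prediction of the idealized case.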

### GANs and Convolutional Neural Networks

GANs typically work with image data and use Convolutional Neural Networks, or CNNs, as
the generator and discriminator models. There are likely two reasons for this: the technique
was first described in the field of computer vision and used image data, and CNNs have made
remarkable progress in recent years, achieving state-of-the-art results on a suite of
computer vision tasks such as object detection and face recognition.

Modeling image data means that the latent space, the input to the generator, provides a
compressed representation of the set of images or photographs used to train the model. It also
means that the generator generates new images, providing an output that can be easily viewed
and assessed by developers or users of the model. It may be this fact above others, the ability to
visually assess the quality of the generated output, that has led both to the focus on computer
vision applications with CNNs and to the massive leaps in the capability of GANs as compared
to other generative models, deep-learning-based or otherwise.

## Research Interest Point

__GANs provide a path to sophisticated domain-specific data augmentation and a solution
to problems that require a generative solution, such as image-to-image translation. This could imply further practical use cases beyond image-to-image translation, ranging from stronger pseudorandom number generators based on chaotic maps to solving PDEs whose solutions require both interpolation and extrapolation.__