# Generative Adversarial Network

## What is a Generative Adversarial Network?

A **GAN** is composed of a generative network and a discriminative network.  The generative network aims to produce data vectors that are indistinguishable from the ground-truth dataset.  The discriminative network is optimized to distinguish generated data from real data as well as possible.

- GANs work best when output entropy is low.

### Generator Network

1. Must be differentiable
2. REINFORCE can, in principle, be used for discrete variables
3. No invertibility requirement
4. Trainable for any size of $z$
5. $x$ can be made conditionally Gaussian given $z$



### Loss Function

$$\min_G\max_D V(D,G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_{z}(z)}[\log ( 1 - D(G(z)))]$$
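
As a rough illustration of the value function, the sketch below estimates $V(D, G)$ by Monte Carlo on 1-D data. The generator `G`, the two discriminators, and the Gaussian stand-in for $p_{data}$ are hypothetical choices made only for this example; they are not part of the original notes.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def value_function(D, G, real_samples, noise_samples):
    """Monte Carlo estimate of V(D, G) from finite samples."""
    real_term = sum(math.log(D(x)) for x in real_samples) / len(real_samples)
    fake_term = sum(math.log(1.0 - D(G(z))) for z in noise_samples) / len(noise_samples)
    return real_term + fake_term

rng = random.Random(0)
real = [rng.gauss(2.0, 1.0) for _ in range(1000)]   # assumed stand-in for p_data
noise = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # samples from p_z

def G(z):
    # Hypothetical generator: shifts the noise so fake samples centre on -2
    return z - 2.0

def informed_D(x):
    # Hypothetical discriminator: larger x is judged more likely to be real
    return sigmoid(2.0 * x)

def blind_D(x):
    # A discriminator that cannot tell the two apart
    return 0.5

v_informed = value_function(informed_D, G, real, noise)
v_blind = value_function(blind_D, G, real, noise)
print(v_informed, v_blind)
```

A discriminator that always outputs 0.5 gives exactly $-\log 4$, and any discriminator that separates the two sample sets better pushes the value higher, which is the quantity $D$ maximizes and $G$ minimizes.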

## Related Theorems

1. For $G$ fixed, the optimal discriminator $D$ is

$$D_G^*(x) = \frac{p_{data}(x)}{p_{data}(x) + p_g(x)}$$

2. The global minimum of the virtual training criterion $C(G)$ is achieved iff $p_g = p_{data}$.  At that point, $C(G)$ achieves the value $- \log 4$
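
Both statements can be checked numerically on a small discrete support. The sketch below is only an illustration: the two three-outcome distributions are made-up values, and the value function is written as a sum over outcomes instead of an expectation over samples.

```python
import math

def optimal_discriminator(p_data, p_g):
    # D*(x) = p_data(x) / (p_data(x) + p_g(x)) for each outcome x
    return [pd / (pd + pg) for pd, pg in zip(p_data, p_g)]

def value(D, p_data, p_g):
    # V(D, G) as a sum over a discrete support
    return sum(pd * math.log(d) + pg * math.log(1.0 - d)
               for d, pd, pg in zip(D, p_data, p_g))

p_data = [0.5, 0.3, 0.2]   # hypothetical data distribution
p_g = [0.2, 0.3, 0.5]      # hypothetical generator distribution

d_star = optimal_discriminator(p_data, p_g)
c_mismatch = value(d_star, p_data, p_g)   # C(G) when p_g differs from p_data

# Any other discriminator scores lower than D* for this fixed G
d_perturbed = [0.9 * d for d in d_star]
v_perturbed = value(d_perturbed, p_data, p_g)

# When the generator matches the data, D* is 0.5 everywhere and C(G) = -log 4
c_match = value(optimal_discriminator(p_data, p_data), p_data, p_data)
print(c_mismatch, v_perturbed, c_match, -math.log(4.0))
```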


### How is vector space arithmetic applied to GANs?

## Example Generative Network

### Fully Visible Belief Nets

### Wavenet
### Variational Autoencoder

## Reproduction using Convolutional Neural Network 

### Example Code

In [None]:
def generative_model(inputs, data_format):
    return inputs

def discriminative_model(inputs, data_format):
    return inputs

In [None]:
def model_fn(features, labels, mode):
    # In this case, features 
    raise NotImplementedError("model_fn body is not given in these notes")

## Variations of Generative Network

### Variational Autoencoder

$$\log p(x) \geq \log p(x) - D_{KL}(q(z) || p(z | x)) = \mathbb{E}_{z \sim q} \log p(x, z) + H(q)$$

Disadvantages:
1. Not asymptotically consistent unless $q$ is perfect
2. Samples tend to be of lower quality
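
The lower bound can be verified on a toy model with a binary latent variable. In the sketch below, the joint probabilities `joint` for one fixed observation $x$ are invented for the example; the point is only that the gap between $\log p(x)$ and the bound equals the KL term, and that it closes when $q$ equals the true posterior.

```python
import math

def elbo(q, joint):
    # E_{z~q}[log p(x, z)] + H(q) for a discrete latent z and one fixed x
    expected_log_joint = sum(qz * math.log(pxz) for qz, pxz in zip(q, joint) if qz > 0)
    entropy = -sum(qz * math.log(qz) for qz in q if qz > 0)
    return expected_log_joint + entropy

def kl(q, p):
    # D_KL(q || p) for discrete distributions
    return sum(qz * math.log(qz / pz) for qz, pz in zip(q, p) if qz > 0)

joint = [0.1, 0.3]                              # hypothetical p(x, z) for z = 0, 1
evidence = sum(joint)                           # p(x)
posterior = [pxz / evidence for pxz in joint]   # p(z | x)

q_rough = [0.5, 0.5]                            # an imperfect approximate posterior
gap = math.log(evidence) - elbo(q_rough, joint)
print(gap, kl(q_rough, posterior))              # the two numbers agree
print(elbo(posterior, joint), math.log(evidence))  # the bound is tight for a perfect q
```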

### Boltzmann Machines

$$\begin{align} p(x, z) &= \frac{1}{Z}\exp(-E(x, z)) \\
     Z &= \sum_{x}\sum_{z} \exp(-E(x, z))\end{align}$$
       
- Partition function is intractable
- May be estimated with Markov chain methods
- Generating samples requires Markov chains too
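
To see why the partition function becomes intractable, the sketch below computes it by brute force for a tiny binary model. The RBM-style energy and the weights `W`, `b`, `c` are assumptions made for this example, since the notes do not specify an energy function.

```python
import itertools
import math

def energy(x, z, W, b, c):
    # Hypothetical RBM-style energy: -(x^T W z + b^T x + c^T z)
    interaction = sum(W[i][j] * x[i] * z[j]
                      for i in range(len(x)) for j in range(len(z)))
    return -(interaction
             + sum(bi * xi for bi, xi in zip(b, x))
             + sum(cj * zj for cj, zj in zip(c, z)))

def all_states(n):
    # Every binary vector of length n
    return list(itertools.product((0, 1), repeat=n))

def partition_function(W, b, c):
    # Sum of exp(-E) over every joint configuration of x and z
    return sum(math.exp(-energy(x, z, W, b, c))
               for x in all_states(len(b)) for z in all_states(len(c)))

W = [[0.5, -1.0], [1.5, 0.2], [-0.3, 0.8]]  # 3 visible units, 2 hidden units
b = [0.1, -0.2, 0.0]
c = [0.3, -0.1]

Z = partition_function(W, b, c)
total = sum(math.exp(-energy(x, z, W, b, c)) / Z
            for x in all_states(3) for z in all_states(2))
n_terms = len(all_states(3)) * len(all_states(2))
print(Z, total, n_terms)
```

With 3 visible and 2 hidden units the sum has $2^5$ terms. The count doubles with every added unit, which is why exact enumeration stops being feasible at realistic sizes and Markov chain estimates are used instead.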

## Training GAN

- Use SGD-like algorithm of choice (Adam) on two minibatches simultaneously:
    - A minibatch of training examples
    - A minibatch of generated samples
- Optional: run k steps of one player for every step of the other player.

$$J^{(D)} = - \frac{1}{2} \mathbb{E}_{x \sim p_{data}} \log D(x) - \frac{1}{2}\mathbb{E}_z \log(1 - D(G(z)))$$

$$J^{(G)} = - J^{(D)}$$

- Equilibrium is a saddle point of the discriminator loss
- Resembles Jensen-Shannon divergence
- Generator minimizes the log-probability of the discriminator being correct
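
The procedure and losses above can be sketched on a 1-D toy problem with hand-written gradients. Everything concrete here is an assumption for illustration: a logistic discriminator $D(x) = \sigma(wx + c)$, a shift-only generator $G(z) = z + \theta$, plain SGD instead of Adam, and the function name `train_toy_gan`. The players are also updated one after the other within a step, which is a simplification.

```python
import math
import random

def sigmoid(a):
    # Numerically stable logistic function
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def train_toy_gan(steps=4000, batch=64, lr=0.05, k=1, data_mean=2.0, seed=0):
    rng = random.Random(seed)
    w, c = 0.0, 0.0   # discriminator D(x) = sigmoid(w * x + c)
    theta = 0.0       # generator G(z) = z + theta
    for _ in range(steps):
        for _ in range(k):  # k discriminator steps per generator step
            real = [rng.gauss(data_mean, 1.0) for _ in range(batch)]
            fake = [rng.gauss(0.0, 1.0) + theta for _ in range(batch)]
            # dJ_D/dlogit is (D - 1) / 2 on real samples and D / 2 on fake ones
            g_real = [0.5 * (sigmoid(w * x + c) - 1.0) for x in real]
            g_fake = [0.5 * sigmoid(w * x + c) for x in fake]
            grad_w = (sum(g * x for g, x in zip(g_real, real))
                      + sum(g * x for g, x in zip(g_fake, fake))) / batch
            grad_c = (sum(g_real) + sum(g_fake)) / batch
            w -= lr * grad_w
            c -= lr * grad_c
        # Generator step on J_G = -J_D; only the fake term depends on theta
        fake = [rng.gauss(0.0, 1.0) + theta for _ in range(batch)]
        grad_theta = -0.5 * w * sum(sigmoid(w * x + c) for x in fake) / batch
        theta -= lr * grad_theta
    return theta, w, c

theta, w, c = train_toy_gan()
print(theta, w, c)
```

In this setup the generator offset `theta` should drift from 0 toward the data mean as the two players alternate. Setting `k` above 1 gives the discriminator several updates per generator update.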

## Two GAN games:

### Non-saturating Game

$$J^{(D)} = -\frac{1}{2}\mathbb{E}_{x \sim p_{data}} \log D(x) - \frac{1}{2} \mathbb{E}_z \log (1 - D(G(z)))$$

$$J^{(G)} = - \frac{1}{2} \mathbb{E}_z [\log D(G(z))]$$

- Equilibrium no longer describable with a single loss
- Generator maximizes the log-probability of the discriminator being mistaken
- Heuristically motivated; the generator can still learn even when the discriminator rejects all generator samples
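
The last point can be made concrete by comparing the gradient each generator loss sends back through the discriminator logit $a$, where $D(G(z)) = \sigma(a)$. This is a small sketch with hand-derived derivatives; the function names are chosen for this example.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def minimax_generator_grad(a):
    # d/da of (1/2) * log(1 - sigmoid(a)), the generator term of J_G = -J_D
    return -0.5 * sigmoid(a)

def non_saturating_generator_grad(a):
    # d/da of -(1/2) * log(sigmoid(a)), the non-saturating generator loss
    return -0.5 * (1.0 - sigmoid(a))

for a in (-10.0, 0.0, 10.0):
    print(a, minimax_generator_grad(a), non_saturating_generator_grad(a))
```

At a very negative logit, where the discriminator confidently rejects the sample, the minimax gradient is close to zero while the non-saturating gradient stays near $-\frac{1}{2}$. That is exactly the regime in which the generator most needs a learning signal.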

### Maximum Likelihood Game
$$J^{(D)} = -\frac{1}{2}\mathbb{E}_{x \sim p_{data}} \log D(x) - \frac{1}{2} \mathbb{E}_z \log (1 - D(G(z)))$$

$$J^{(G)} = - \frac{1}{2} \mathbb{E}_z [\exp(\sigma^{-1}(D(G(z))))]$$

- When the discriminator is optimal, the generator gradient matches that of maximum likelihood.
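
One way to see the connection: for the optimal discriminator, $\exp(\sigma^{-1}(D^*(x)))$ reduces to the likelihood ratio $p_{data}(x) / p_g(x)$. The sketch below checks this identity on a made-up discrete example; the distributions are assumptions for illustration only.

```python
import math

def logit(d):
    # Inverse of the logistic sigmoid
    return math.log(d / (1.0 - d))

p_data = [0.5, 0.3, 0.2]   # hypothetical data distribution
p_g = [0.2, 0.3, 0.5]      # hypothetical generator distribution

# Optimal discriminator for this fixed generator
d_star = [pd / (pd + pg) for pd, pg in zip(p_data, p_g)]

# Per-sample weight used by the maximum likelihood generator cost
weights = [math.exp(logit(d)) for d in d_star]
ratios = [pd / pg for pd, pg in zip(p_data, p_g)]

# J_G under D*, with the expectation over z written as a sum over x ~ p_g
j_g = -0.5 * sum(pg * wgt for pg, wgt in zip(p_g, weights))
print(weights, ratios, j_g)
```

Each generated sample is therefore weighted by how much more likely it is under the data than under the generator, which is the same importance weighting that appears in a maximum likelihood gradient.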

## GAN examples

### Laplacian Pyramid

### LAPGAN

### DCGAN

### InfoGAN

## Generative Model Application

1. Given an image with multiple holes, a generative model can fill them in, for example to reveal a face
2. Semi-supervised learning: the discriminator predicts class labels in addition to the real/fake decision, rather than real/fake only
3. Next Video Frames Prediction
4. Unsupervised correspondence learning
    - CycleGAN
    - Allows changing features of an image: day to night, horse to zebra
    - Translation without parallel corpora?
5. Simulate environment and training data
6. Domain Adaptation: Domain Adversarial Networks

### Games $\supseteq$ Optimization

### Nash Equilibrium

## Game cases

1. Finite minimax
2. Finite mixed strategy games
3. Continuous, convex games
4. Differential games