# CS294 - Deep Unsupervised Learning

## Lecture 5. Implicit models and GANs

### Motivation

<img src="imgs/motivation.png">

Let's say we want to build a sampler. The simplest possible approach would be to draw samples at random from the training data:

### Original GAN

<img src="imgs/sampler1.png">

But we don’t just want to sample the exact data points we already have. We want to build a generative model that can:
* understand the underlying distribution of the data points and smoothly interpolate across the training samples
* output samples that are similar to, but not the same as, the training samples
* output samples representative of the underlying factors of variation in the training distribution.

Example: digits with unseen strokes, faces with unseen poses, etc. 

_def_. **Implicit models**

<img src="imgs/implicit_model_def.png">

_def_. **GANs**

<img src="imgs/gans_cost_func.png">
<img src="imgs/gans_tutorial.png">
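As a rough numerical companion to the cost function, here is a minimal sketch that assumes the standard minimax objective of the original GAN paper, $\min_G \max_D \; \mathbb{E}_x[\log D(x)] + \mathbb{E}_z[\log(1 - D(G(z)))]$. The function names and the `eps` stabilizer are illustrative choices, not part of the lecture.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Negative of the value function V(D, G); the discriminator minimizes this.

    d_real: D(x) on real samples, probabilities in (0, 1)
    d_fake: D(G(z)) on generated samples, probabilities in (0, 1)
    """
    return -(np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps)))

def generator_loss_minimax(d_fake, eps=1e-12):
    """Minimax generator loss: minimize E[log(1 - D(G(z)))]."""
    return np.mean(np.log(1.0 - d_fake + eps))

def generator_loss_non_saturating(d_fake, eps=1e-12):
    """Commonly used alternative: minimize -E[log D(G(z))]."""
    return -np.mean(np.log(d_fake + eps))

# A discriminator that cannot tell real from fake outputs 0.5 everywhere.
d_real = np.full(4, 0.5)
d_fake = np.full(4, 0.5)
print(discriminator_loss(d_real, d_fake))  # about 2 * log(2) = 1.386
```

The non-saturating variant is included because it gives the generator stronger gradients early in training, when the discriminator rejects generated samples with high confidence.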

### Evaluation metrics

1. Parzen-Window density estimator

2. Inception score
    
3. Frechet Inception Distance
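For the third metric, a minimal sketch may help. It assumes the usual definition of the Frechet distance between two Gaussians, $\|\mu_r - \mu_g\|^2 + \mathrm{Tr}(\Sigma_r + \Sigma_g - 2(\Sigma_r \Sigma_g)^{1/2})$, fitted to feature vectors of real and generated images. In practice the features come from an Inception network; here plain arrays stand in for them, and the function name `frechet_distance` is hypothetical.

```python
import numpy as np

def frechet_distance(feats_real, feats_gen):
    """Frechet distance between Gaussians fitted to two feature sets.

    feats_real, feats_gen: arrays of shape (n_samples, n_features).
    """
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    sigma_r = np.atleast_2d(np.cov(feats_real, rowvar=False))
    sigma_g = np.atleast_2d(np.cov(feats_gen, rowvar=False))
    # Tr((sigma_r sigma_g)^{1/2}) equals the sum of the square roots of the
    # eigenvalues of sigma_r @ sigma_g, which are real and non-negative for
    # positive semi-definite covariances (up to numerical noise).
    eigvals = np.linalg.eigvals(sigma_r @ sigma_g)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(sigma_r) + np.trace(sigma_g) - 2.0 * tr_sqrt)

rng = np.random.default_rng(0)
feats = rng.standard_normal((500, 3))      # stand-in for Inception features
print(frechet_distance(feats, feats))      # identical sets give a value near 0
```

Lower is better: the distance grows when the generated features differ from the real ones in mean or in covariance.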


### Theory behind GANs

### Progression of GANs

**DCGAN** (Deep Convolutional GAN) proposes GANs as a way to learn feature representations and shows that those representations actually capture important aspects of the images.

The DCGAN generator architecture takes 100-dimensional Gaussian random noise, forwards it through a dense layer, and reshapes the result to make the tensor spatial. The non-linearity at the end is $\tanh$, which forces the outputs into $[-1, 1]$ because we want to turn those values into pixels. We then create pixels simply by computing $rgbout = (out + 1) \cdot 127.5$.
<img src="imgs/dcgan1.png">
<img src="imgs/dcgan2.png">
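The pixel mapping described above can be sketched in a few lines. The helper names are hypothetical, and the inverse mapping for real images is an assumption about typical preprocessing rather than something stated in the lecture.

```python
import numpy as np

def tanh_output_to_pixels(out):
    """Map generator outputs in [-1, 1] to uint8 pixel values in [0, 255]."""
    out = np.clip(out, -1.0, 1.0)  # guard against values slightly out of range
    return np.round((out + 1.0) * 127.5).astype(np.uint8)

def pixels_to_tanh_range(img):
    """Inverse mapping: scale uint8 pixels into [-1, 1] (assumed preprocessing)."""
    return img.astype(np.float32) / 127.5 - 1.0

# Stand-in for a generator output: 100-dim Gaussian noise pushed through tanh.
z = np.random.default_rng(0).standard_normal(100)
fake = np.tanh(z)
pixels = tanh_output_to_pixels(fake)
print(pixels.dtype, pixels.min(), pixels.max())
```

Scaling real images into the same $[-1, 1]$ range keeps the discriminator's real and generated inputs on a comparable scale.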
One important point is that batch norm should be applied to real and generated tensors separately.
<img src="imgs/dcgan3.png">
<img src="imgs/dcgan4.png">

**Improved training of GANs**

**WGAN** moves away from using a binary classifier as $D$. In WGAN, $D = f_w$ outputs a single scalar value, and $f_w$ has to be approximately Lipschitz (that is, $K$-Lipschitz for some constant $K$). The weight clipping step is what enforces this Lipschitz constraint.
<img src="imgs/wgan1.png">
<img src="imgs/wgan2.png">
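To make the critic objective and the clipping step concrete, here is a toy sketch. It assumes a linear critic $f_w(x) = x \cdot w$ so that the gradient can be written by hand; the clipping threshold `c=0.01` follows the default reported in the WGAN paper, and the step size and data are purely illustrative.

```python
import numpy as np

def critic(x, w):
    """Toy linear critic f_w(x) = x @ w: one unbounded scalar per sample."""
    return x @ w

def critic_loss(real, fake, w):
    """The critic maximizes E[f_w(real)] - E[f_w(fake)]; this is its negative."""
    return critic(fake, w).mean() - critic(real, w).mean()

def clip_weights(w, c=0.01):
    """Weight clipping: keep every parameter inside [-c, c]."""
    return np.clip(w, -c, c)

rng = np.random.default_rng(0)
real = rng.normal(loc=2.0, size=(64, 5))   # stand-in for real data
fake = rng.normal(loc=0.0, size=(64, 5))   # stand-in for generator samples
w = np.zeros(5)
lr = 0.05                                  # illustrative step size
for _ in range(10):
    # d(critic_loss)/dw for the linear critic
    grad = fake.mean(axis=0) - real.mean(axis=0)
    w = clip_weights(w - lr * grad)

print(w, critic_loss(real, fake, w))
```

Clipping bounds the norm of `w`, which bounds how fast this linear critic can change with its input. That is the sense in which clipping keeps $f_w$ Lipschitz.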


**WGAN-GP** improves on weight clipping as the way to enforce the Lipschitz constraint: it replaces clipping with a regularization term (a gradient penalty).
<img src="imgs/wgan-gp1.png">
<img src="imgs/wgan-gp2.png">
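The regularization term can be sketched as follows, assuming the usual WGAN-GP form $\lambda \, \mathbb{E}_{\hat{x}}[(\|\nabla_{\hat{x}} f_w(\hat{x})\|_2 - 1)^2]$ with $\hat{x}$ sampled on straight lines between real and generated points. The toy critic $\tanh(x \cdot w)$ is an assumption chosen only so its input gradient is analytic; a real implementation would rely on automatic differentiation.

```python
import numpy as np

def critic(x, w):
    """Toy critic f_w(x) = tanh(x @ w), chosen so the input gradient is analytic."""
    return np.tanh(x @ w)

def critic_input_grad(x, w):
    """Gradient of the toy critic w.r.t. each input row: (1 - tanh^2(x @ w)) * w."""
    return (1.0 - np.tanh(x @ w) ** 2)[:, None] * w[None, :]

def gradient_penalty(real, fake, w, rng, lam=10.0):
    """lam * E[(||grad_x f_w(x_hat)||_2 - 1)^2] on random interpolates x_hat."""
    eps = rng.uniform(size=(real.shape[0], 1))
    x_hat = eps * real + (1.0 - eps) * fake
    grad_norm = np.linalg.norm(critic_input_grad(x_hat, w), axis=1)
    return lam * np.mean((grad_norm - 1.0) ** 2)

def critic_loss_gp(real, fake, w, rng, lam=10.0):
    """WGAN critic loss plus the gradient penalty."""
    wasserstein_term = critic(fake, w).mean() - critic(real, w).mean()
    return wasserstein_term + gradient_penalty(real, fake, w, rng, lam)

rng = np.random.default_rng(0)
real = rng.normal(loc=2.0, size=(32, 3))
fake = rng.normal(loc=0.0, size=(32, 3))
print(critic_loss_gp(real, fake, np.array([0.5, -0.2, 0.1]), rng))
```

The penalty pulls the gradient norm toward 1 on the interpolated points instead of hard-limiting the weights, so the critic keeps more capacity than it does under clipping.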

Positive and negative (underlined with red pen) sides of WGAN-GP:
<img src="imgs/wgan-gp3.png">


**Progressive GAN**
...

**StyleGAN** proposes a novel idea: move away from passing the random noise $z$ directly into the generator. Instead, the generator always starts from a constant, and various transformations of $z$ (called style vectors $w$) are injected at different layers of the generator. During training, Gaussian random noise is also added at various levels (a common trick when training GANs).

<img src="imgs/stylegan1.png">

The crucial trick in StyleGAN is how the style vectors are applied to the various layers, which is achieved with **Adaptive Instance Normalization (AdaIN)**.
<img src="imgs/stylegan2.png">
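A minimal NumPy sketch of AdaIN, assuming the usual formulation $\mathrm{AdaIN}(x_i, y) = y_{s,i} \frac{x_i - \mu(x_i)}{\sigma(x_i)} + y_{b,i}$, where each feature map $x_i$ is normalized on its own and then rescaled by a per-channel style. In StyleGAN the scale and bias come from a learned affine transform of $w$; the fixed values below are stand-ins for illustration.

```python
import numpy as np

def adain(x, style_scale, style_bias, eps=1e-5):
    """Adaptive Instance Normalization.

    x:           feature maps, shape (N, C, H, W)
    style_scale: per-channel scale y_s, shape (N, C)
    style_bias:  per-channel bias y_b, shape (N, C)
    """
    # Statistics are computed per sample and per channel over the spatial dims.
    mean = x.mean(axis=(2, 3), keepdims=True)
    std = x.std(axis=(2, 3), keepdims=True)
    normalized = (x - mean) / (std + eps)
    return style_scale[:, :, None, None] * normalized + style_bias[:, :, None, None]

rng = np.random.default_rng(0)
features = rng.standard_normal((2, 3, 8, 8))
y_s = np.full((2, 3), 2.0)   # stand-in for the style scale derived from w
y_b = np.full((2, 3), 5.0)   # stand-in for the style bias derived from w
styled = adain(features, y_s, y_b)
print(styled.mean(axis=(2, 3)))  # each channel mean is now close to 5
```

After AdaIN, the per-channel statistics of the feature maps are set by the style alone, which is how a style vector controls a layer's output.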


**StyleGAN v2** fixes the water-droplet-like artifacts that StyleGAN v1 produces.

<img src="imgs/styleganv21.png">

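Inception score, computed from the classifier's predicted class probabilities $p(y|x)$:

```python
import numpy as np

def calculate_inception_score(p_yx, eps=1e-16):
    """Inception score from class probabilities p(y|x), shape (n_images, n_classes)."""
    # marginal class distribution p(y), kept 2-D so it broadcasts against p_yx
    p_y = np.expand_dims(p_yx.mean(axis=0), 0)
    # per-class terms of KL(p(y|x) || p(y)) for each image
    kl_d = p_yx * (np.log(p_yx + eps) - np.log(p_y + eps))
    # sum over classes
    sum_kl_d = kl_d.sum(axis=1)
    # average over images
    avg_kl_d = np.mean(sum_kl_d)
    # undo the logs
    is_score = np.exp(avg_kl_d)
    return is_score
```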