## VAE
- A probabilistic take on the autoencoder (AE)
- An AE is a model that takes high-dimensional input data and compresses it into a smaller representation


- __A VAE maps the input data into the parameters of a probability distribution, such as the mean and variance of a Gaussian__
- This produces a continuous, structured latent space, which is useful for image generation

<img src='img/0505_1.png' width='400'>  

<center> 
    Hands-On Machine Learning, 2nd edition
</center>
    


### 0. Setup

In [None]:
# !pip install -q tensorflow-probability

In [1]:
# to generate gifs
!pip install -q imageio
!pip install -q git+https://github.com/tensorflow/docs



In [None]:
from IPython import display

import glob
import imageio
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import tensorflow_probability as tfp
import time

### 1. Load the MNIST dataset
- Each image is a vector of 784 integers (28 × 28 pixels)
- Each integer is in the range 0-255 and represents the intensity of one pixel


- Model each pixel with a Bernoulli distribution and statically binarize the data (each pixel is thresholded once, before training)

In [None]:
(train_images, _), (test_images, _) = tf.keras.datasets.mnist.load_data()

In [None]:
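def preprocess_image(images):
    # Add a channel axis and scale pixel values from [0, 255] to [0, 1]
    images = images.reshape((images.shape[0], 28, 28, 1)) / 255.
    # Static binarization: pixels above 0.5 become 1.0, all others 0.0
    return np.where(images > .5, 1.0, 0.0).astype('float32')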

In [None]:
train_images = preprocess_image(train_images)
test_images = preprocess_image(test_images)

In [None]:
train_size = 60000
batch_size = 32
test_size = 10000

### 2. Batch and shuffle the data


In [None]:
train_dataset = (tf.data.Dataset.from_tensor_slices(train_images)
                 .shuffle(train_size).batch(batch_size))
test_data = (tf.data.Dataset.from_tensor_slices(test_images)
             .shuffle(test_size).batch(batch_size))

- `from_tensor_slices`: converts a NumPy array or list into a tensor dataset, sliced along the first axis
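
As a small illustration of how `from_tensor_slices`, `shuffle`, and `batch` fit together, the sketch below builds a dataset from a toy NumPy array. The array `toy_images` and the sizes used here are made up for this example and are not part of the notebook.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy data: 10 "images" of shape 28x28x1 (not the real MNIST arrays).
toy_images = np.zeros((10, 28, 28, 1), dtype="float32")

# Slice along the first axis, shuffle with a buffer that covers all 10 elements,
# then group the elements into batches of 4.
toy_dataset = (tf.data.Dataset.from_tensor_slices(toy_images)
               .shuffle(10)
               .batch(4))

# Collect the shape of every batch; the last batch holds the 2 leftover elements.
batch_shapes = [batch.numpy().shape for batch in toy_dataset]
print(batch_shapes)
```

Using a shuffle buffer at least as large as the dataset (as the notebook does with `train_size`) gives a full shuffle; a smaller buffer would only shuffle locally.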

### 3. Define the encoder and decoder networks
- Use two small ConvNets for the Encoder and Decoder networks

> ### Encoder network
> - defines __the approximate posterior distribution $q(z|x)$__, where $z$ is the latent variable
> 
> 
> - In this example, simply model the distribution as a diagonal Gaussian
> - outputs the mean and log-variance parameters of a factorized Gaussian (the log-variance is used instead of the variance for numerical stability)
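
To make the mean / log-variance parameterization concrete, here is a minimal NumPy sketch. It assumes a hypothetical raw encoder output `encoder_output` of width `2 * latent_dim` that is split in half; the names and numbers are illustrative and not taken from the notebook's model.

```python
import numpy as np

latent_dim = 2  # assumed latent size for this illustration

# Hypothetical raw encoder output for a batch of 3 inputs (no activation, so
# values may be negative). The first half is read as the mean, the second half
# as the log-variance.
encoder_output = np.array([[0.5, -1.0, 0.0, -2.0],
                           [1.5, 0.3, 1.0, 0.5],
                           [-0.2, 0.0, -4.0, 2.0]])

mean, logvar = np.split(encoder_output, 2, axis=1)

# Exponentiating maps any real log-variance to a strictly positive value,
# so the network does not need to be constrained to output positive numbers.
variance = np.exp(logvar)
std = np.exp(0.5 * logvar)

print(mean.shape, logvar.shape)
print(np.all(variance > 0))
```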

> ### Decoder network
> - defines __the conditional distribution of the observation $p(x|z)$__
> - Model the latent prior distribution $p(z)$ as a unit Gaussian
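
The two distributions above can be sketched in NumPy as follows. The linear "decoder" with random weights `W` and `b` is a stand-in made up for this example (the notebook uses a ConvNet); the point is only to show sampling $z$ from the unit Gaussian prior and scoring a binarized $x$ under a factorized Bernoulli $p(x|z)$.

```python
import numpy as np

rng = np.random.default_rng(0)
latent_dim, num_pixels = 2, 784

# Sample z from the unit Gaussian prior p(z) = N(0, I).
z = rng.standard_normal((4, latent_dim))

# Hypothetical stand-in decoder: one linear layer producing one logit per pixel.
W = rng.standard_normal((latent_dim, num_pixels)) * 0.1
b = np.zeros(num_pixels)
logits = z @ W + b

# Bernoulli parameters p(x_i = 1 | z), obtained with the sigmoid of the logits.
probs = 1.0 / (1.0 + np.exp(-logits))

# Log-likelihood of a binarized observation x under the factorized Bernoulli:
# log p(x|z) = sum_i [x_i * log(p_i) + (1 - x_i) * log(1 - p_i)].
x = (rng.random((4, num_pixels)) > 0.5).astype("float32")
log_px_z = np.sum(x * np.log(probs) + (1 - x) * np.log(1 - probs), axis=1)

print(probs.shape, log_px_z.shape)
```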

> ### Reparameterization trick
> <img src='img/0505_2.png' width='600'>
> 
> <br>
> 
> 
> - To generate a sample $z$ for the decoder, you can sample from the latent distribution defined by the encoder's outputs
> - But sampling is a random operation, so backpropagation cannot flow through the random node  
> => __Reparameterization trick!__
> <img src='img/0505_4.png' width='300'>  
> 
> 
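> - $\epsilon$ is random noise drawn from a standard normal distribution, while $\mu$ and $\sigma$ are deterministic outputs of the encoder, so $z = \mu + \sigma \odot \epsilon$ stays differentiable with respect to $\mu$ and $\sigma$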
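
A minimal NumPy sketch of the trick, computing $z = \mu + \sigma \odot \epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$. The values of `mean` and `logvar` are assumed for illustration; in the notebook they would come from the encoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed encoder outputs for one input with a 2-dimensional latent space.
mean = np.array([1.0, -0.5])
logvar = np.array([0.0, -2.0])
std = np.exp(0.5 * logvar)

# All randomness lives in eps ~ N(0, I), which does not depend on the encoder.
eps = rng.standard_normal((100_000, 2))

# Given eps, z is a deterministic, differentiable function of mean and std:
# dz/dmean = 1 and dz/dstd = eps, so gradients can reach the encoder.
z = mean + std * eps

print(z.mean(axis=0))  # close to mean
print(z.std(axis=0))   # close to std
```

The samples still follow $\mathcal{N}(\mu, \sigma^2)$, but the random node is now an input rather than an operation on the path between the encoder and the loss.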