<center>
    <h1>Variational Autoencoders (VAEs)</h1>
</center>

# Brief Recap

VAEs are generative models that learn a latent representation of data. Unlike traditional autoencoders, which learn a deterministic mapping between input and output, VAEs learn a probabilistic mapping, which lets them generate new samples that are similar to the training data.

This probabilistic nature comes from modeling the latent space as a probability distribution, typically a Gaussian. The encoder maps each input to the parameters of this distribution, and the decoder turns a point sampled from it back into data. This formulation lets VAEs represent uncertainty about the latent code and generate varied samples rather than a single fixed reconstruction.


## VAE Architecture Description

![VAE Architecture](assets/vae.png)

1. **Input:** This is the input data to the VAE, which can be any type of data such as images, text, or numerical data.

2. **Encoder:** The encoder is a neural network that maps the input data to a latent space. It learns to extract the essential features and patterns from the input.

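3. **Mean μ and Std σ:** The encoder outputs the mean (μ) and standard deviation (σ) of a Gaussian distribution in the latent space. These are the parameters of the distribution over latent codes for a given input. In the implementation later in this notebook, the network predicts the log-variance (log σ²) rather than σ directly, and σ is recovered as exp(0.5 · log σ²).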

4. **Latent Space:** This is a lower-dimensional representation of the input data. It is a probabilistic space where each point represents a possible combination of the latent variables.

5. **Sampling:** In this step, a random sample is drawn from the Gaussian distribution defined by the mean and standard deviation. This sampling process introduces randomness into the VAE, allowing it to generate new data points that are similar to the training data but not identical.

6. **Decoder:** The decoder is a neural network that maps the sampled point from the latent space back to the original input space. It learns to reconstruct the input data based on the latent representation.

7. **Output:** This is the reconstructed output of the VAE, which is the decoder's attempt to recreate the original input data based on the latent representation.
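
The sampling step can be illustrated outside of any deep learning framework. The following is a minimal NumPy sketch, not the model code itself: `sample_latent` and the toy values of `z_mean` and `z_log_var` are assumptions made for illustration. It draws `z = μ + σ · ε` with `ε ~ N(0, I)`, which is the reparameterization used to keep sampling differentiable with respect to μ and σ.

```python
import numpy as np

def sample_latent(z_mean, z_log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, I)."""
    sigma = np.exp(0.5 * z_log_var)          # log-variance -> standard deviation
    eps = rng.standard_normal(z_mean.shape)  # noise that does not depend on the encoder
    return z_mean + sigma * eps

rng = np.random.default_rng(0)
z_mean = np.zeros((10000, 2))                    # hypothetical encoder means
z_log_var = np.log(np.full((10000, 2), 0.25))    # variance 0.25, i.e. sigma = 0.5
z = sample_latent(z_mean, z_log_var, rng)

# The empirical mean and std should be close to the requested 0 and 0.5
print(z.shape, z.mean(), z.std())
```

Because the randomness lives entirely in `eps`, the output is a deterministic function of `z_mean` and `z_log_var` once `eps` is fixed, which is what lets gradients flow back to the encoder.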

## Advantages of VAEs
* **Generative Capabilities:** VAEs can generate new data samples that are similar to the training data.
* **Probabilistic Interpretation:** The latent space is probabilistic, allowing for uncertainty modeling and more flexible representations.
* **Regularization:** The KL divergence term in the loss function acts as a regularizer, preventing the model from collapsing to a degenerate solution.
* **Improved Representation Learning:** VAEs can learn meaningful latent representations that capture the underlying structure of the data.
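
For a diagonal Gaussian encoder and a standard normal prior, the KL divergence term has a well-known closed form. The sketch below evaluates it with NumPy; the function name `kl_to_standard_normal` and the toy inputs are assumptions for illustration, and the encoder is assumed to output a mean and a log-variance.

```python
import numpy as np

def kl_to_standard_normal(z_mean, z_log_var):
    """KL( N(mu, sigma^2) || N(0, I) ) per sample, summed over latent dimensions."""
    per_dim = np.square(z_mean) + np.exp(z_log_var) - z_log_var - 1.0
    return 0.5 * np.sum(per_dim, axis=-1)

# The penalty vanishes when the encoder output already matches the prior
kl_at_prior = kl_to_standard_normal(np.zeros((1, 3)), np.zeros((1, 3)))

# It grows as the mean drifts away from 0 (here mu = 1 in each of 3 dimensions)
kl_shifted = kl_to_standard_normal(np.ones((1, 3)), np.zeros((1, 3)))

print(kl_at_prior, kl_shifted)
```

Adding this term to the reconstruction loss pulls each encoded distribution toward the prior, which is the regularizing effect described in the list above.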

# Building a VAE Model in TensorFlow



In [None]:
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, Lambda, Layer
from tensorflow.keras.models import Model
from tensorflow.keras import backend as K
from helpers import *

# Hyperparameters
latent_dim = 40  # Dimensionality of the latent space
batch_size = 128
epochs = 100

In [None]:
# Encoder network
inputs = Input(shape=(784,))  # Assuming 28x28 input images flattened
x = Dense(512, activation='relu')(inputs)
x = Dense(latent_dim * 2)(x)  # Outputs mean and log-variance, concatenated

# Split the encoder output into the mean and log-variance of the latent distribution
z_mean = Lambda(lambda x: x[:, :latent_dim], output_shape=(latent_dim,))(x)
z_log_var = Lambda(lambda x: x[:, latent_dim:], output_shape=(latent_dim,))(x)

# Define a function to sample from the latent space
def sampling(args):
    z_mean, z_log_var = args
    epsilon = K.random_normal(shape=(K.shape(z_mean)[0], latent_dim))
    return z_mean + K.exp(0.5 * z_log_var) * epsilon

# Use the Lambda layer to create the sampling layer
z = Lambda(sampling, output_shape=(latent_dim,))([z_mean, z_log_var])

# Decoder network
decoder_inputs = Input(shape=(latent_dim,))
decoder_hidden = Dense(512, activation='relu')
decoder_out = Dense(784, activation='sigmoid')

decoder_x = decoder_hidden(decoder_inputs)
outputs = decoder_out(decoder_x)

# Build the decoder model separately for generating new samples later
decoder = Model(decoder_inputs, outputs)

# Apply the decoder to the sampled latent vector
outputs = decoder(z)

# VAE model: mapping inputs to the reconstructed outputs
vae = Model(inputs, outputs)

In [None]:
vae.summary()

**Encoder network:**

* `inputs = Input(shape=(784,))`: Creates an input layer for 784-dimensional input data (e.g., flattened 28x28 images).
* `x = Dense(512, activation='relu')(inputs)`: Adds a hidden layer with 512 units and ReLU activation. This layer processes the input data and extracts features.
* `x = Dense(latent_dim * 2)(x)`: Adds an output layer with `latent_dim * 2` units. This layer outputs the mean and log-variance of the latent space distribution, concatenated into one vector.

**Sampling layer:**

* `z_mean = Lambda(lambda x: x[:, :latent_dim], output_shape=(latent_dim,))(x)`: Extracts the mean of the latent space distribution from the output of the previous layer.
* `z_log_var = Lambda(lambda x: x[:, latent_dim:], output_shape=(latent_dim,))(x)`: Extracts the log variance of the latent space distribution from the output of the previous layer.
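
The two slicing lambdas simply cut the encoder output in half along the feature axis. A small NumPy sketch of the same indexing follows; `encoder_out` is a stand-in array rather than a real network output.

```python
import numpy as np

latent_dim = 40

# Stand-in for the output of Dense(latent_dim * 2): a batch of 3 rows, 80 columns
encoder_out = np.arange(3 * latent_dim * 2, dtype="float32").reshape(3, latent_dim * 2)

z_mean = encoder_out[:, :latent_dim]      # first 40 columns of every row
z_log_var = encoder_out[:, latent_dim:]   # remaining 40 columns of every row

print(z_mean.shape, z_log_var.shape)
```

Nothing in the network forces the first half to behave like a mean and the second like a log-variance; they take on those roles only through how they are used in the sampling function and the loss.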

**Decoder network:**

* `decoder_hidden = Dense(512, activation='relu')`: Adds a hidden layer with 512 units and ReLU activation. This layer processes the latent representation.
* `decoder_out = Dense(784, activation='sigmoid')`: Adds an output layer with 784 units and sigmoid activation. This layer reconstructs the input data from the latent representation.
* `decoder_inputs = Input(shape=(latent_dim,))`: Creates an input layer for the decoder, taking the sampled latent vector as input.
* `decoder_x = decoder_hidden(decoder_inputs)`: Processes the latent vector using the decoder's hidden layer.
* `outputs = decoder_out(decoder_x)`: The final output of the decoder, representing the reconstructed input.

**VAE model:**

* `vae = Model(inputs, outputs)`: Combines the encoder and decoder to form the complete VAE model.

Overall, the code defines a VAE model with the following components:

* Encoder: Maps the input data to a latent space, outputting the mean and log-variance of a Gaussian distribution.
* Sampling layer: Samples a random point from the latent space distribution using the reparameterization trick.
* Decoder: Reconstructs the input data from the sampled latent point.

This VAE model can be used for various tasks such as generating new data, dimensionality reduction, and anomaly detection.

# Building a Denoising VAE on the MNIST Dataset

The MNIST (Modified National Institute of Standards and Technology) database is a widely used dataset in machine learning, particularly for image classification tasks. It consists of 60,000 training images and 10,000 test images, each a 28x28 grayscale image of a handwritten digit from 0 to 9. [(learn more)](https://www.tensorflow.org/datasets/catalog/mnist)

The goal is to train a **Variational Autoencoder (VAE)** to denoise noisy MNIST images. This involves:

* Adding noise to the clean MNIST images.
* Training the VAE to reconstruct the clean images from these noisy images.
* Using the trained VAE to denoise new noisy images.

<img src='https://miro.medium.com/v2/resize:fit:1400/1*ZaC4kTerUL7Q1EEPtO7mWg.png' width=500>

## Load and Preprocess the dataset

In [None]:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
from tensorflow.keras.datasets import mnist

# Load MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize pixel values to [0, 1]
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255

## Adding Noise to the MNIST dataset

In [None]:
# Add noise to the training data
noise_factor = 0.2  # Adjust noise level as needed
x_train_noisy = x_train + noise_factor * np.random.normal(0, 1, x_train.shape)
x_train_noisy = np.clip(x_train_noisy, 0., 1.)

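Here `np.random.normal(0, 1, x_train.shape)` draws one value per pixel from a standard **Gaussian distribution**. Scaling these values by `noise_factor` and adding them to the images introduces **Gaussian noise** into the dataset, and `np.clip` keeps the result inside the valid pixel range [0, 1].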
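
To get a feel for how `noise_factor` controls the corruption, the same recipe can be wrapped in a small helper and applied at several noise levels. This is an illustrative sketch: `add_gaussian_noise` is a hypothetical helper, and random arrays stand in for the normalized MNIST images.

```python
import numpy as np

def add_gaussian_noise(images, noise_factor, rng):
    """Add zero-mean Gaussian noise scaled by noise_factor, then clip to [0, 1]."""
    noisy = images + noise_factor * rng.standard_normal(images.shape)
    return np.clip(noisy, 0.0, 1.0).astype("float32")

rng = np.random.default_rng(42)
# Stand-in for a few normalized 28x28 images with pixel values in [0, 1]
images = rng.uniform(0.0, 1.0, size=(4, 28, 28)).astype("float32")

for factor in (0.0, 0.2, 0.5):
    noisy = add_gaussian_noise(images, factor, rng)
    # Mean absolute pixel change grows with the noise factor
    print(factor, float(np.abs(noisy - images).mean()))
```

Clipping means the noise actually applied is no longer exactly Gaussian near 0 and 1, which matters for MNIST because most background pixels sit exactly at 0.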

In [None]:
# Visualize original and noisy images
fig, axs = plt.subplots(2, 5, figsize=(15, 6))
for i in range(5):
    axs[0, i].imshow(x_train[i], cmap='gray')
    axs[0, i].set_title(f"Original Image {i}")
    axs[1, i].imshow(x_train_noisy[i], cmap='gray')
    axs[1, i].set_title(f"Noisy Image {i}")

plt.tight_layout()
plt.show()

In [None]:
# Reshape the input data before training
x_train_noisy = x_train_noisy.reshape(-1, 784)  # Reshape to (num_samples, 784)
x_train_noisy = x_train_noisy.astype('float32')
x_train = x_train.reshape(-1, 784)
x_test = x_test.reshape(-1, 784)

# Print shapes for verification
print("Original shape:", x_train.shape)
print("Noisy shape:", x_train_noisy.shape)

# Compiling and Training the VAE

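We fit the VAE by minimizing the mean squared error between its reconstructions (computed from the noisy inputs) and the original, clean images. Training runs for several epochs over mini-batches, and the validation data passed to `fit` is evaluated at the end of each epoch to monitor performance. Note that the model as compiled here is trained on the reconstruction loss only; the KL divergence term described earlier is not added to the loss.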
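
The quantity being minimized can be checked by hand on toy data. In the sketch below, `mse` is a hypothetical helper and `denoised` is an artificial "partially denoised" array (the midpoint between noisy and clean), used only to show that outputs closer to the clean target score a lower error.

```python
import numpy as np

def mse(a, b):
    """Mean squared error over all pixels and all images."""
    return float(np.mean(np.square(a - b)))

rng = np.random.default_rng(0)
clean = rng.uniform(0.0, 1.0, size=(8, 784))   # stand-in for flattened clean images
noisy = np.clip(clean + 0.2 * rng.standard_normal(clean.shape), 0.0, 1.0)

# Artificial output that removes half of the noise at every pixel
denoised = 0.5 * (noisy + clean)

print("noisy vs clean:   ", mse(noisy, clean))
print("denoised vs clean:", mse(denoised, clean))
```

Halving the per-pixel error divides the MSE by four, so the second number is a quarter of the first.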



In [None]:
vae.compile(optimizer='adam', loss=tf.keras.losses.MeanSquaredError())

In [None]:
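# Train the VAE on (noisy input, clean target) pairs
# Validate on noisy test images too, so validation matches the denoising task
x_test_noisy = np.clip(x_test + noise_factor * np.random.normal(0, 1, x_test.shape), 0., 1.)
x_test_noisy = x_test_noisy.reshape(-1, 784).astype('float32')
vae.fit(x_train_noisy, x_train,
        epochs=10,
        batch_size=128,
        validation_data=(x_test_noisy, x_test))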

# Visualizing the results

In [None]:
plot_denoised_images(vae, x_train, x_train_noisy, num_images=10)