## Week 3 : Autoencoders
```
- Generative Artificial Intelligence (Fall semester 2023)
- Professor: Muhammad Fahim
- Teaching Assistant: Gcinizwe Dlamini
```
<hr>

## Content
In this lab we will cover the following topics:
```
1. Types of autoencoders
2. Applications of autoencoders
3. Autoencoders training procedure
4. Reparametrisation trick

```

## Undercomplete & Overcomplete

PCA vs. Undercomplete autoencoders
* Autoencoders are much more flexible than PCA.
* Neural network activation functions introduce “non-linearities” into the encoding, whereas PCA applies only a linear transformation.
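
As a point of comparison, PCA can be written as a purely linear encoder and decoder. The sketch below is illustrative only: the toy data, `k`, and the variable names are assumptions, not part of the lab. It projects centered data onto the top `k` principal directions and measures the resulting reconstruction error.

```python
import torch

torch.manual_seed(0)

# Toy data (assumption): 200 samples with 8 correlated features
X = torch.randn(200, 8) @ torch.randn(8, 8)
Xc = X - X.mean(dim=0)             # PCA works on centered data

# Principal directions are the right singular vectors of the centered data
U, S, Vh = torch.linalg.svd(Xc, full_matrices=False)

k = 3                              # size of the linear "latent" code
W = Vh[:k].T                       # (8, k) projection matrix

Z = Xc @ W                         # linear encoder
X_hat = Z @ W.T                    # linear decoder

pca_mse = ((Xc - X_hat) ** 2).mean()
print(Z.shape, pca_mse.item())
```

A purely linear autoencoder trained with MSE can at best match this error; the non-linear activations are what allow an autoencoder to go beyond PCA.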

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchsummary import summary
from torch.utils.data import TensorDataset, DataLoader

import torchvision
import torchvision.transforms as transforms


device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

## Defining Undercomplete Autoencoder

In [None]:
## Undercomplete
class autoencoder(nn.Module):
    def __init__(self, input_size, latent_dim):
      super(autoencoder, self).__init__()
      # Step 1 : Define the encoder
      self.encoder = torch.nn.Sequential(
            torch.nn.Linear(input_size, input_size//2),
            torch.nn.ReLU(),
            torch.nn.Linear(input_size//2, input_size//4),
            torch.nn.ReLU(),
            torch.nn.Linear(input_size//4, latent_dim),
        )
      # Step 2 : Define the decoder
      self.decoder = torch.nn.Sequential(
            torch.nn.Linear(latent_dim, input_size//4),
            torch.nn.ReLU(),
            torch.nn.Linear(input_size//4, input_size//2),
            torch.nn.ReLU(),
            torch.nn.Linear(input_size//2, input_size),
        )
      # Step 3 : Initialize the weights (optional)
      self.encoder.apply(self.__init_weights)
      self.decoder.apply(self.__init_weights)

    def forward(self, x):
      # Step 1: Pass the input through encoder to get latent representation
      z = self.encoder(x)
      # Step 2: Take latent representation and pass through decoder
      x = self.decoder(z)
      # x = x + noise
      return x

    def encode(self,input):
      #Step 1: Pass the input through the encoder to get latent representation
      z = self.encoder(input)
      return z

    def __init_weights(self,m):
      #Init the weights (optional)
      if type(m) == nn.Linear:
          torch.nn.init.xavier_uniform_(m.weight)
          m.bias.data.fill_(0.01)
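
The class above is undercomplete as long as `latent_dim` is smaller than `input_size`; an overcomplete model uses a latent code larger than the input. A minimal sketch of the difference, assuming a hypothetical helper `make_encoder` that is not part of the lab code:

```python
import torch
import torch.nn as nn

def make_encoder(input_size, latent_dim):
    # Hypothetical helper: a small two-layer encoder for illustration only
    return nn.Sequential(
        nn.Linear(input_size, input_size),
        nn.ReLU(),
        nn.Linear(input_size, latent_dim),
    )

x = torch.randn(4, 64)                        # batch of 4 vectors, 64 features

under = make_encoder(64, latent_dim=5)        # latent code smaller than the input
over = make_encoder(64, latent_dim=128)       # latent code larger than the input

z_under = under(x)
z_over = over(x)
print(z_under.shape, z_over.shape)
```

An overcomplete code has enough capacity to copy its input without learning useful structure, which is why it is usually paired with a regularizer.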

## Define training parameters

```
Step 1: Set training parameters (batch size, learning rate, optimizer, number of epochs, loss function)
Step 2: Create dataset (Randomly generated)
Step 3: Create data loader
Step 4: Define the training loop
```

In [None]:
batchSize = 100
learning_rate = 0.01
num_epochs = 40
sample = torch.randn((batchSize,1,64))
AE = autoencoder(64,5).to(device)
print(AE)

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(AE.parameters(),lr=learning_rate)

# Create a random dataset (i.i.d. Gaussian noise, so there is little structure to compress)
data_loader = DataLoader(TensorDataset(torch.randn((1000,1,64))),batch_size=batchSize,shuffle=True)

autoencoder(
  (encoder): Sequential(
    (0): Linear(in_features=64, out_features=32, bias=True)
    (1): ReLU()
    (2): Linear(in_features=32, out_features=16, bias=True)
    (3): ReLU()
    (4): Linear(in_features=16, out_features=5, bias=True)
  )
  (decoder): Sequential(
    (0): Linear(in_features=5, out_features=16, bias=True)
    (1): ReLU()
    (2): Linear(in_features=16, out_features=32, bias=True)
    (3): ReLU()
    (4): Linear(in_features=32, out_features=64, bias=True)
  )
)


## Training Loop

In [None]:
for epoch in range(num_epochs):
    epoch_loss = 0.0
    for X in data_loader:
        X = X[0].to(device)

        optimizer.zero_grad()
        # forward
        output = AE(X)
        loss = criterion(output, X)

        # backward
        loss.backward()
        optimizer.step()

        epoch_loss += loss.item()

    # log the average loss over all batches in this epoch
    print('epoch [{}/{}], loss:{:.4f}'.format(epoch + 1, num_epochs, epoch_loss / len(data_loader)))

epoch [1/40], loss:0.9702
epoch [2/40], loss:0.9585
epoch [3/40], loss:0.8792
epoch [4/40], loss:0.9092
epoch [5/40], loss:0.8583
epoch [6/40], loss:0.9522
epoch [7/40], loss:0.9292
epoch [8/40], loss:0.9324
epoch [9/40], loss:0.9184
epoch [10/40], loss:0.9587
epoch [11/40], loss:0.7892
epoch [12/40], loss:0.9840
epoch [13/40], loss:0.8939
epoch [14/40], loss:0.8362
epoch [15/40], loss:0.7783
epoch [16/40], loss:0.8780
epoch [17/40], loss:0.9695
epoch [18/40], loss:0.8538
epoch [19/40], loss:0.9071
epoch [20/40], loss:0.7875
epoch [21/40], loss:1.0170
epoch [22/40], loss:0.7398
epoch [23/40], loss:0.9544
epoch [24/40], loss:0.7921
epoch [25/40], loss:0.8679
epoch [26/40], loss:0.7918
epoch [27/40], loss:0.8273
epoch [28/40], loss:0.8619
epoch [29/40], loss:0.8166
epoch [30/40], loss:0.7935
epoch [31/40], loss:0.8081
epoch [32/40], loss:0.8751
epoch [33/40], loss:0.9209
epoch [34/40], loss:0.8924
epoch [35/40], loss:0.8522
epoch [36/40], loss:0.8811
epoch [37/40], loss:0.9314
epoch [38/

## Regularized Autoencoder

Regularized autoencoders use a loss function that encourages the model to have other properties besides the ability to copy its input to its output.

* **Sparse Autoencoders** : They impose a sparsity constraint by adding a regularization term on the hidden activations $h_i$ to the loss function.
$$L(x,\hat{x}) + \lambda \sum_{i}|h_i|$$

  **Regularization Form** : The penalty can be L1 regularization, but other kinds of penalties are also possible.


* **Denoising Autoencoder** : A special autoencoder that is robust to noise. By adding stochastic noise to the input and training the model to reconstruct the clean input, we force the autoencoder to learn more robust features.
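
The sparse loss can be sketched directly in PyTorch. This is a minimal illustration, not the lab's reference solution: the layer sizes, the penalty weight `lam`, and the variable names are assumptions. It adds the L1 norm of the latent activations `h` to the reconstruction loss, following $L(x,\hat{x}) + \lambda \sum_{i}|h_i|$.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

encoder = nn.Sequential(nn.Linear(64, 16), nn.ReLU())   # h = encoder(x)
decoder = nn.Linear(16, 64)                              # x_hat = decoder(h)
criterion = nn.MSELoss()
lam = 1e-3                                               # assumed penalty weight

x = torch.randn(32, 64)                                  # toy batch
h = encoder(x)
x_hat = decoder(h)

reconstruction = criterion(x_hat, x)
# L1 penalty on the activations, summed per sample and averaged over the batch
sparsity = h.abs().sum(dim=1).mean()
loss = reconstruction + lam * sparsity

loss.backward()
print(reconstruction.item(), sparsity.item(), loss.item())
```

Note that this penalizes activations rather than weights; an L1 penalty on the parameters is a different regularizer.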

## Sparse Autoencoder

**Task**: implement a Sparse Autoencoder for 1D data of your choice

In [None]:
## Sparse autoencoder
class SparseAutoencoder(nn.Module):
    def __init__(self, input_size, latent_dim):
      super(SparseAutoencoder, self).__init__()
      # Step 1 : Define the encoder
      self.encoder = torch.nn.Sequential(
            torch.nn.Linear(input_size, input_size//2),
            torch.nn.ReLU(),
            torch.nn.Linear(input_size//2, input_size//4),
            torch.nn.ReLU(),
            torch.nn.Linear(input_size//4, latent_dim),
        )
      # Step 2 : Define the decoder
      self.decoder = torch.nn.Sequential(
            torch.nn.Linear(latent_dim, input_size//4),
            torch.nn.ReLU(),
            torch.nn.Linear(input_size//4, input_size//2),
            torch.nn.ReLU(),
            torch.nn.Linear(input_size//2, input_size),
        )
      # Step 3 : Initialize the weights (optional)
      self.encoder.apply(self.__init_weights)
      self.decoder.apply(self.__init_weights)

    def forward(self, x):
      # Step 1: Pass the input through encoder to get latent representation
      z = self.encoder(x)
      # Step 2: Take latent representation and pass through decoder
      x = self.decoder(z)
      # x = x + noise
      return x

    def encode(self,input):
      #Step 1: Pass the input through the encoder to get latent representation
      z = self.encoder(input)
      return z

    def __init_weights(self,m):
      #Init the weights (optional)
      if type(m) == nn.Linear:
          torch.nn.init.xavier_uniform_(m.weight)
          m.bias.data.fill_(0.01)

    def sum_up_weights(self):
      # L1 norm of all encoder and decoder parameters (weights and biases)
      l1_regularization = 0
      for param in self.encoder.parameters():
        l1_regularization += param.abs().sum()
      for param in self.decoder.parameters():
        l1_regularization += param.abs().sum()
      return l1_regularization



In [None]:
batchSize = 100
learning_rate = 0.01
num_epochs = 40
sample = torch.randn((batchSize,1,64))
SA = SparseAutoencoder(64,5).to(device)
print(SA)

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(SA.parameters(),lr=learning_rate)

# Create a random dataset (i.i.d. Gaussian noise, so there is little structure to compress)
data_loader = DataLoader(TensorDataset(torch.randn((1000,1,64))),batch_size=batchSize,shuffle=True)

SparseAutoencoder(
  (encoder): Sequential(
    (0): Linear(in_features=64, out_features=32, bias=True)
    (1): ReLU()
    (2): Linear(in_features=32, out_features=16, bias=True)
    (3): ReLU()
    (4): Linear(in_features=16, out_features=5, bias=True)
  )
  (decoder): Sequential(
    (0): Linear(in_features=5, out_features=16, bias=True)
    (1): ReLU()
    (2): Linear(in_features=16, out_features=32, bias=True)
    (3): ReLU()
    (4): Linear(in_features=32, out_features=64, bias=True)
  )
)


In [None]:
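l1_lambda = 1e-5  # assumed weight of the L1 penalty; tune for your data

for epoch in range(num_epochs):
    epoch_loss = 0.0
    for X in data_loader:
        X = X[0].to(device)

        optimizer.zero_grad()
        # forward: reconstruction loss plus the L1 penalty defined in the model
        output = SA(X)
        loss = criterion(output, X) + l1_lambda * SA.sum_up_weights()

        # backward
        loss.backward()
        optimizer.step()

        epoch_loss += loss.item()

    # log the average loss over all batches in this epoch
    print('epoch [{}/{}], loss:{:.4f}'.format(epoch + 1, num_epochs, epoch_loss / len(data_loader)))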

epoch [1/40], loss:0.9919
epoch [2/40], loss:1.0163
epoch [3/40], loss:1.0197
epoch [4/40], loss:0.9396
epoch [5/40], loss:0.8909
epoch [6/40], loss:0.9341
epoch [7/40], loss:0.9058
epoch [8/40], loss:0.9624
epoch [9/40], loss:0.8243
epoch [10/40], loss:0.8901
epoch [11/40], loss:0.8961
epoch [12/40], loss:0.8450
epoch [13/40], loss:0.8574
epoch [14/40], loss:0.8601
epoch [15/40], loss:0.8677
epoch [16/40], loss:0.8393
epoch [17/40], loss:0.9435
epoch [18/40], loss:0.8850
epoch [19/40], loss:0.9191
epoch [20/40], loss:0.9207
epoch [21/40], loss:0.8591
epoch [22/40], loss:0.8973
epoch [23/40], loss:0.8461
epoch [24/40], loss:0.8078
epoch [25/40], loss:0.8834
epoch [26/40], loss:0.8696
epoch [27/40], loss:0.8720
epoch [28/40], loss:0.9020
epoch [29/40], loss:0.9433
epoch [30/40], loss:0.8622
epoch [31/40], loss:0.7744
epoch [32/40], loss:0.8353
epoch [33/40], loss:0.7469
epoch [34/40], loss:0.8307
epoch [35/40], loss:0.8666
epoch [36/40], loss:0.7902
epoch [37/40], loss:0.8911
epoch [38/

## Denoising Autoencoder

**Task** : implement a Denoising Autoencoder for CIFAR 10 dataset. Choose one class from the 10

In [None]:
# TODO : implement a Denoising Autoencoder for CIFAR 10 dataset. Choose one class from the 10
import cv2

class DenoisingAutoencoder(nn.Module):
  def __init__(self, input_size, latent_dim):
    super(DenoisingAutoencoder, self).__init__()
    # Step 1 : Define the encoder
    self.encoder = torch.nn.Sequential(
          torch.nn.Linear(input_size, input_size//2),
          torch.nn.ReLU(),
          torch.nn.Linear(input_size//2, input_size//4),
          torch.nn.ReLU(),
          torch.nn.Linear(input_size//4, latent_dim),
      )
    # Step 2 : Define the decoder
    self.decoder = torch.nn.Sequential(
          torch.nn.Linear(latent_dim, input_size//4),
          torch.nn.ReLU(),
          torch.nn.Linear(input_size//4, input_size//2),
          torch.nn.ReLU(),
          torch.nn.Linear(input_size//2, input_size),
      )
    # Step 3 : Initialize the weights (optional)
    self.encoder.apply(self.__init_weights)
    self.decoder.apply(self.__init_weights)

  def forward(self, x):
    # Step 1: Pass the (noisy) input through encoder to get latent representation
    z = self.encoder(x)
    # Step 2: Take latent representation and pass through decoder
    x = self.decoder(z)
    # For denoising, noise is added to the input before this call, never to the output
    return x

  def encode(self,input):
    #Step 1: Pass the input through the encoder to get latent representation
    z = self.encoder(input)
    return z

  def __init_weights(self,m):
    #Init the weights (optional)
    if type(m) == nn.Linear:
        torch.nn.init.xavier_uniform_(m.weight)
        m.bias.data.fill_(0.01)

  def sum_up_weights(self):
    l1_regularization = 0
    for param in self.encoder.parameters():
       l1_regularization += param.abs().sum()
    for param in self.decoder.parameters():
       l1_regularization += param.abs().sum()
    return l1_regularization

In [None]:
batchSize = 100
learning_rate = 0.01
num_epochs = 40
input_size = 32*32  # a single 32x32 channel, flattened (CIFAR-10 images have 3 such channels)
sample = torch.randn((batchSize,1,input_size))
DA = DenoisingAutoencoder(input_size,10).to(device)
print(DA)

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(DA.parameters(),lr=learning_rate)

# Placeholder random dataset with the same feature size as the model input
data_loader = DataLoader(TensorDataset(torch.randn((1000,1,input_size))),batch_size=batchSize,shuffle=True)

### Get Image data (CIFAR 10 dataset)

In [None]:
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)

Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz


100%|██████████| 170498071/170498071 [00:01<00:00, 98961295.12it/s] 


Extracting ./data/cifar-10-python.tar.gz to ./data
Files already downloaded and verified
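
One way to approach the denoising task is sketched below. It is a minimal, hedged example rather than the expected solution: random tensors stand in for CIFAR-10 images so that it runs without a download, and `noise_std`, `target_class`, and the tiny model are assumptions. It keeps the samples of one class, corrupts the inputs with Gaussian noise, and trains the model to reconstruct the clean images.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for CIFAR-10 (assumption): 256 images of shape 3x32x32, labels 0..9
images = torch.rand(256, 3, 32, 32)
labels = torch.randint(0, 10, (256,))

target_class = 3                                   # assumed class index
clean = images[labels == target_class].flatten(1)  # (n, 3*32*32)

model = nn.Sequential(                             # tiny autoencoder for illustration
    nn.Linear(3 * 32 * 32, 128), nn.ReLU(),
    nn.Linear(128, 3 * 32 * 32),
)
criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
noise_std = 0.1                                    # assumed noise level

losses = []
for step in range(5):
    noisy = clean + noise_std * torch.randn_like(clean)  # corrupt the input only
    optimizer.zero_grad()
    output = model(noisy)
    loss = criterion(output, clean)                # target is the clean image
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(clean.shape, losses)
```

With the real dataset, the class labels should be available as `trainset.targets`, which can be used to build the same kind of mask.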


## Variational Autoencoders

![caption](https://learnopencv.com/wp-content/uploads/2020/11/vae-diagram-1-1024x563.jpg)

In [None]:
import os
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import transforms
from torchvision.utils import save_image

## Get data (MNIST)

In [None]:
# Hyper-parameters
image_size = 784
h_dim = 400
z_dim = 20
num_epochs = 15
batch_size = 128
learning_rate = 1e-3

# MNIST dataset
dataset = torchvision.datasets.MNIST(root='../../data',
                                     train=True,
                                     transform=transforms.ToTensor(),
                                     download=True)

# Data loader
data_loader = torch.utils.data.DataLoader(dataset=dataset,
                                          batch_size=batch_size,
                                          shuffle=True)

Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ../../data/MNIST/raw/train-images-idx3-ubyte.gz


100%|██████████| 9912422/9912422 [00:00<00:00, 113785103.08it/s]


Extracting ../../data/MNIST/raw/train-images-idx3-ubyte.gz to ../../data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ../../data/MNIST/raw/train-labels-idx1-ubyte.gz


100%|██████████| 28881/28881 [00:00<00:00, 38751021.70it/s]


Extracting ../../data/MNIST/raw/train-labels-idx1-ubyte.gz to ../../data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ../../data/MNIST/raw/t10k-images-idx3-ubyte.gz


100%|██████████| 1648877/1648877 [00:00<00:00, 33698576.20it/s]


Extracting ../../data/MNIST/raw/t10k-images-idx3-ubyte.gz to ../../data/MNIST/raw

Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ../../data/MNIST/raw/t10k-labels-idx1-ubyte.gz


100%|██████████| 4542/4542 [00:00<00:00, 20888737.68it/s]


Extracting ../../data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ../../data/MNIST/raw



## Define VAE

In [None]:
# VAE model
class VAE(nn.Module):
  def __init__(self, image_size=784, h_dim=400, z_dim=20):
    super(VAE, self).__init__()
    # Encoder part: fc2 outputs the mean and fc3 the log-variance of q(z|x)
    self.fc1 = nn.Linear(image_size, h_dim)
    self.fc2 = nn.Linear(h_dim, z_dim)
    self.fc3 = nn.Linear(h_dim, z_dim)
    # Decoder part
    self.fc4 = nn.Linear(z_dim, h_dim)
    self.fc5 = nn.Linear(h_dim, image_size)

  def encode(self, x):
    h = F.relu(self.fc1(x))
    return self.fc2(h), self.fc3(h)

  def reparameterize(self, mu, log_var):
    std = torch.exp(log_var/2)
    eps = torch.randn_like(std)
    return mu + eps * std

  def decode(self, z):
    h = F.relu(self.fc4(z))
    # torch.sigmoid replaces the deprecated F.sigmoid; outputs lie in (0, 1) like the pixels
    return torch.sigmoid(self.fc5(h))

  def forward(self, x):
    mu, log_var = self.encode(x)
    z = self.reparameterize(mu, log_var)
    x_reconst = self.decode(z)
    return x_reconst, mu, log_var
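
The `reparameterize` method is the reparametrisation trick listed in the contents: instead of sampling $z \sim \mathcal{N}(\mu, \sigma^2)$ directly, it samples $\epsilon \sim \mathcal{N}(0, I)$ and computes $z = \mu + \sigma \epsilon$, so gradients can flow to `mu` and `log_var`. A small stand-alone check, with toy values chosen here as assumptions:

```python
import math
import torch

torch.manual_seed(0)

# Toy distribution parameters (assumption): mean 2.0, variance 0.25
mu = torch.full((10000,), 2.0, requires_grad=True)
log_var = torch.full((10000,), math.log(0.25), requires_grad=True)

std = torch.exp(log_var / 2)      # sigma = exp(log_var / 2) = 0.5
eps = torch.randn_like(std)       # noise does not depend on the parameters
z = mu + eps * std                # differentiable w.r.t. mu and log_var

z.sum().backward()

print(z.mean().item(), z.std().item())
print(mu.grad[:3], log_var.grad[:3])
```

The sample mean and standard deviation should land close to 2.0 and 0.5, and both parameters receive gradients even though `z` is random.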

### Train Autoencoder

In [None]:
model = VAE(image_size, h_dim, z_dim).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)

In [None]:
# Start training
# Sum the reconstruction error so it is on the same scale as the summed KL term
mse_loss = nn.MSELoss(reduction='sum')
for epoch in range(num_epochs):
    for i, (x, _) in enumerate(data_loader):
        # Forward pass
        x = x.to(device).view(-1, image_size)
        x_reconst, mu, log_var = model(x)

        # Compute reconstruction loss and kl divergence
        reconst_loss = mse_loss(x_reconst, x)
        kl_div = - 0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())

        # Backprop and optimize
        loss = reconst_loss + kl_div
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        if (i+1) % 10 == 0:
            # Report the per-element reconstruction error
            print ("Epoch[{}/{}], Step [{}/{}], Reconst Loss: {:.4f}"
                   .format(epoch+1, num_epochs, i+1, len(data_loader), reconst_loss.item() / x.numel()))

    with torch.no_grad():
        # Save the sampled images
        z = torch.randn(batch_size, z_dim).to(device)
        out = model.decode(z).view(-1, 1, 28, 28)
        save_image(out,'./sampled-{}.png'.format(epoch+1))

        # Save the reconstructed images
        out, _, _ = model(x)
        x_concat = torch.cat([x.view(-1, 1, 28, 28), out.view(-1, 1, 28, 28)], dim=3)
        save_image(x_concat, './reconst-{}.png'.format(epoch+1))

Epoch[1/15], Step [10/469], Reconst Loss: 0.1252
Epoch[1/15], Step [20/469], Reconst Loss: 0.0805
Epoch[1/15], Step [30/469], Reconst Loss: 0.0726
Epoch[1/15], Step [40/469], Reconst Loss: 0.0718
Epoch[1/15], Step [50/469], Reconst Loss: 0.0693
Epoch[1/15], Step [60/469], Reconst Loss: 0.0687
Epoch[1/15], Step [70/469], Reconst Loss: 0.0671
Epoch[1/15], Step [80/469], Reconst Loss: 0.0680
Epoch[1/15], Step [90/469], Reconst Loss: 0.0671
Epoch[1/15], Step [100/469], Reconst Loss: 0.0685
Epoch[1/15], Step [110/469], Reconst Loss: 0.0706
Epoch[1/15], Step [120/469], Reconst Loss: 0.0678
Epoch[1/15], Step [130/469], Reconst Loss: 0.0699
Epoch[1/15], Step [140/469], Reconst Loss: 0.0648
Epoch[1/15], Step [150/469], Reconst Loss: 0.0682
Epoch[1/15], Step [160/469], Reconst Loss: 0.0695
Epoch[1/15], Step [170/469], Reconst Loss: 0.0670
Epoch[1/15], Step [180/469], Reconst Loss: 0.0668
Epoch[1/15], Step [190/469], Reconst Loss: 0.0686
Epoch[1/15], Step [200/469], Reconst Loss: 0.0700
Epoch[1/1
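
The `kl_div` line in the training loop is the closed-form KL divergence between the diagonal Gaussian $\mathcal{N}(\mu, \sigma^2)$ produced by the encoder and the standard normal prior, $-\tfrac{1}{2}\sum(1 + \log\sigma^2 - \mu^2 - \sigma^2)$. The sketch below cross-checks that expression against `torch.distributions`; the tensor shapes are arbitrary assumptions.

```python
import torch
from torch.distributions import Normal, kl_divergence

torch.manual_seed(0)

mu = torch.randn(4, 20)           # toy batch: 4 samples, 20 latent dimensions
log_var = torch.randn(4, 20)

# Closed form used in the training loop
kl_closed = -0.5 * torch.sum(1 + log_var - mu.pow(2) - log_var.exp())

# Same quantity via torch.distributions
q = Normal(mu, torch.exp(log_var / 2))
p = Normal(torch.zeros_like(mu), torch.ones_like(mu))
kl_reference = kl_divergence(q, p).sum()

print(kl_closed.item(), kl_reference.item())
```

The two printed values should agree up to floating-point error.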

**Task :** Add tensorboard to log the encoder loss and weights

In [None]:
# TODO: Add tensorboard to log the encoder loss and weights

## Resources

* [Auto-Encoding Variational Bayes](https://arxiv.org/pdf/1312.6114.pdf)
* [Variational inference: A review for statisticians](https://arxiv.org/pdf/1601.00670.pdf)
* [Tutorial on variational autoencoders](https://arxiv.org/pdf/1606.05908.pdf)
* [Stochastic Backpropagation and Approximate Inference in Deep Generative Models](https://arxiv.org/pdf/1401.4082.pdf)

**Key theories behind VAE:** <br>
1. Change of variable
2. Location-Scale Transformation
3. [Law of The Unconscious Statistician](https://en.wikipedia.org/wiki/Law_of_the_unconscious_statistician)
4. [Evidence lower bound (ELBO)](https://en.wikipedia.org/wiki/Evidence_lower_bound)
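
These ideas meet in the VAE objective. As a reference sketch in the standard notation of the first paper listed ($q_\phi$ for the encoder, $p_\theta$ for the decoder; both symbols are assumptions, since the lab does not fix a notation), the evidence lower bound is

$$\log p_\theta(x) \ge \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right] - D_{KL}\left(q_\phi(z \mid x) \,\|\, p(z)\right)$$

and the location-scale transformation that makes sampling differentiable is

$$z = \mu + \sigma \odot \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I)$$

Minimizing `reconst_loss + kl_div` in the training loop corresponds to maximizing this bound, with the reconstruction loss standing in for the negative expected log-likelihood.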