# Autoencoder Introduction 
<hr>

![Poster-YouTubeModel-based Deep Convolutional Face Autoencoder for Unsupervised Monocular Reconstruction](https://i.ytimg.com/vi/uIMpHZYB8fI/maxresdefault.jpg)
> *Image [source](http://gvv.mpi-inf.mpg.de/projects/MZ/Papers/arXiv2017_FA/page.html)*

> An autoencoder is a type of artificial neural network used to learn efficient data codings in an unsupervised manner. The aim of an autoencoder is to learn a representation (encoding) for a set of data, typically for dimensionality reduction, by training the network to ignore signal “noise”. Source: [wiki](https://en.wikipedia.org/wiki/Autoencoder#:~:text=An%20autoencoder%20is%20a%20type,to%20ignore%20signal%20%E2%80%9Cnoise%E2%80%9D.)

<hr>

### **NOTE:**
**From now on, I will be updating my notebooks on the @tarunk04 account. The old account is now closed due to unavoidable reasons. I have lost most of my work, but I still have copies of most of my popular notebooks and I am going to re-upload them on this account. I hope I will get the same response on this account. Thanks for your support 🙏.** 

**<span style = "color:#cc1616">Update log (on old account):</span>**
* <font style="color: rgba(107, 61, 35, 0.92) "><b>02 July 2020 03:15 AM IST (Version 1) :</b> Initial Version </font>
* <font style="color: rgba(107, 61, 35, 0.92) "><b>02 July 2020 04:17 PM IST (Version 4) :</b> Few fixes. </font>
* <font style="color: rgba(107, 61, 35, 0.92) "><b>02 July 2020 09:15 PM IST (Version 5) :</b> New section added <a href="#Denoising-Cifar10-Data">Denoising Cifar10 Data.</a> </font>
* <font style="color: rgba(107, 61, 35, 0.92) "><b>20 September 2020 10:50 PM IST (Version 16) :</b> Latest Update. </font>

**<span style = "color:#cc1616">Update log:</span>**
* <font style="color: #cc1616 "><b>06 Feb 2021 07:10 PM IST (Version 1) :</b> Latest Update. </font>

<hr>

<font style="color:Red;font-size:18px">If you find this notebook helpful, Please UPVOTE.</font>

### Follow me:

* <a href="https://bit.ly/tarungithub"> GitHub</a>
* <a href="https://bit.ly/tarnkr-youtube">Youtube</a>
* <a href="https://medium.com/@codeeasy"> Medium</a>
* <a href="https://www.linkedin.com/in/tarun-kumar-iit-ism/">Linkedin</a>



<hr>

# Content:

<hr>

* [Autoencoder Introduction](#Autoencoder-Introduction)
* [What is Autoencoder?](#What-is-Auotencoder?)
* [Applications](#Applications)
* [Image Denoising](#Image-Denoising)
* [Required Imports](#Required-Imports)
* [Load/Prepare MNIST data](#Load/Prepare-MNIST-data)
* [Autoencoder Model](#Autoencoder-Model)
* [Performance/ Visualise Results](#Performance/-Visualise-Results)
* **[Denoising Cifar10 Data](#Denoising-Cifar10-Data)** ***<span style="color:red">New</span>***
    * [Deconvolution (Conv2DTranspose)](#Deconvolution-(Conv2DTranspose))
    * [Skip Connection](#Skip-Connection)
* [Conclusion](#Conclusion)
* [Feedback](#Feedback)



# What is Auotencoder?
An autoencoder is an unsupervised learning technique that learns to compress data and then reconstruct it from the compressed version. Training minimizes the reconstruction loss, so the reconstructed data stays as close as possible to the original. In other words, an autoencoder learns to encode the essential features of the input that are needed to reconstruct it.

An autoencoder can be considered a data compression algorithm in which compression and decompression are learned automatically from the data rather than designed by hand. Autoencoders are data-specific: they only compress well the kind of data they were trained on. A general-purpose algorithm such as JPEG can compress any image, but an autoencoder trained on the MNIST dataset will only work well on MNIST-like digits. The compression is also lossy, so the output is an approximation of the input. 

**Components of Autoencoder:**
* Encoder: Learns to reduce the input to a lower-dimensional representation
* Decoder: Takes the encoded version of the input and regenerates the data
* Bottleneck: The compressed version of the data

![Autoencode](https://lilianweng.github.io/lil-log/assets/images/denoising-autoencoder-architecture.png)

<hr>

# Applications
* **Dimensionality Reduction:** One of the earliest applications of autoencoders was dimensionality reduction. PCA is a well-known technique for reducing dimensions and can give good results, but it is limited to linear transformations. In contrast, a neural network can learn non-linear transformations through its non-linear activation functions, and stacking several hidden layers lets it learn more expressive mappings than the single linear projection of PCA. Autoencoders can perform significantly better when the data is complex.
![PCA vs Autoencoder](https://www.jeremyjordan.me/content/images/2018/03/Screen-Shot-2018-03-07-at-8.52.21-AM.png)

* **Anomaly Detection:** A well-trained autoencoder can reconstruct the input data with minimal reconstruction error. If an outlier or anomaly is passed through the trained autoencoder, the output differs noticeably from the input, and the large reconstruction error flags the anomaly.

* **Data Denoising:** Autoencoders have proved to be excellent at denoising tasks. The layers of an autoencoder can learn to drop the noise from the encoded representation (bottleneck) and hence regenerate a denoised image. In this tutorial, I am going to implement this idea on the MNIST dataset. ![Noise Detection](https://miro.medium.com/max/2000/1*sHOPK4Mm5kl5-fju9kLByg.png)

* **Image Super-Resolution:** There are several algorithms for increasing the resolution of images, such as bicubic and bilinear interpolation, but all of them are interpolation algorithms and have limitations. Autoencoders are quite impressive in this task.
![super-resolution image](https://miro.medium.com/max/700/1*5wzZbWyKt9v_vWVmdHmBxA.png)
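
The anomaly-detection idea above can be sketched without any deep learning library. The example below is only an illustration under stated assumptions: PCA (computed with NumPy's SVD) stands in for a trained autoencoder as a linear encoder/decoder, and the synthetic data, the helper names `encode`, `decode`, `reconstruction_error`, and the 3-sigma threshold are all made up for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "normal" data: 500 points that lie close to a 2-D plane inside a 10-D space.
basis = rng.normal(size=(2, 10))
normal_data = rng.normal(size=(500, 2)) @ basis + 0.01 * rng.normal(size=(500, 10))

# Fit a linear "autoencoder" with PCA: keep the top-2 principal directions.
mean = normal_data.mean(axis=0)
_, _, vt = np.linalg.svd(normal_data - mean, full_matrices=False)
components = vt[:2]  # shape (2, 10)

def encode(x):
    """Project 10-D inputs onto the 2-D bottleneck."""
    return (x - mean) @ components.T

def decode(z):
    """Map 2-D codes back to the 10-D input space."""
    return z @ components + mean

def reconstruction_error(x):
    """Euclidean distance between each input and its reconstruction."""
    return np.linalg.norm(x - decode(encode(x)), axis=-1)

normal_errors = reconstruction_error(normal_data)
threshold = normal_errors.mean() + 3 * normal_errors.std()  # assumed rule of thumb

# A point that does not follow the structure of the training data.
anomaly = 3 * rng.normal(size=(1, 10))
anomaly_error = reconstruction_error(anomaly)[0]

print("bottleneck shape:", encode(normal_data).shape)
print("typical error:", normal_errors.mean(), "anomaly error:", anomaly_error)
print("flagged as anomaly:", anomaly_error > threshold)
```

A trained neural autoencoder would replace `encode` and `decode` with its non-linear encoder and decoder, while the thresholding on reconstruction error stays the same.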

**Autoencoders can also be used with other techniques to get even better results**
> For 2D visualization specifically, t-SNE (pronounced "tee-snee") is probably the best algorithm around, but it typically requires relatively low-dimensional data. So a good strategy for visualizing similarity relationships in high-dimensional data is to start by using an autoencoder to compress your data into a low-dimensional space (e.g. 32 dimensional), then use t-SNE for mapping the compressed data to a 2D plane. Note that a nice parametric implementation of t-SNE in Keras was developed by Kyle McDonald and is available on Github. Otherwise scikit-learn also has a simple and practical implementation.




# Image Denoising
![Denoisng](https://miro.medium.com/max/5160/1*SxwRp9i23OM0Up4sEze1QQ@2x.png)
<br>
Noise reduction, or denoising, is the process of removing noise from a signal or data input. We can train an autoencoder to remove the noise from data such as images or audio. Let's see how to build one. I am going to use the MNIST dataset to keep things simple and easy to understand, but you can apply the same idea to build your own custom autoencoder.   
<br>

Steps involved:

* Prepare input data by adding noise to MNIST dataset 
* Build a CNN Autoencoder Network
* Train the network
* Test the performance of Autoencoder 

**Note: If you want to learn more about training a classifier for the MNIST dataset, you can check my [Notebook](https://www.kaggle.com/tarunkr/digit-recognition-tutorial-cnn-99-67-accuracy)**

# Required Imports

**Imports:**
* numpy
* Matplotlib - for visualization 
* Keras - Building and Training CNN autoencoder

In [None]:
import numpy as np

import matplotlib.pyplot as plt 

from keras.layers import Conv2D, Input, Dense, Dropout, MaxPool2D, UpSampling2D
from keras.models import Model
from keras.datasets import mnist, cifar10

%matplotlib inline

# Load/Prepare MNIST data 

Download the MNIST data from Keras datasets. Several other datasets are available there as well, such as CIFAR10, Fashion MNIST, IMDB movie reviews, etc. [READ MORE](https://keras.io/api/datasets/)

After downloading the dataset, reshape the train and test images to the required model input format `[samples, 28, 28, 1]`, where `1` represents the number of channels. And scale the images to `[0,1]` by dividing with `255`.

In [None]:
(train, _), (test, _) = mnist.load_data()

# scaling input data
train = train.reshape([-1,28,28,1]) / 255
test = test.reshape([-1,28,28,1]) / 255
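
To make the reshape and scaling step concrete, here is a small sketch on a made-up batch of five random `uint8` images (a stand-in for the real MNIST arrays, so nothing is downloaded). It only shows how the shape, dtype, and value range change.

```python
import numpy as np

rng = np.random.default_rng(0)

# Fake batch shaped like raw MNIST: 5 grayscale images, 28x28, integer pixels in 0..255.
fake_images = rng.integers(0, 256, size=(5, 28, 28), dtype=np.uint8)

# Same transformation as in the cell above: add a channel axis, then scale to [0, 1].
scaled = fake_images.reshape([-1, 28, 28, 1]) / 255

print("before:", fake_images.shape, fake_images.dtype)
print("after: ", scaled.shape, scaled.dtype)
print("value range:", scaled.min(), "to", scaled.max())
```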

### Adding Noise
We need to add noise to generate the noisy images. To do so, we generate an array with the same shape as our images, filled with random values drawn from a normal distribution with `mean = 0` and `standard deviation = 1`. These values are not limited to `[0,1]` and can be negative.
<br><br>
To sample from a normal distribution, we can use [np.random.normal(loc,scale,size)](https://numpy.org/doc/stable/reference/random/generated/numpy.random.normal.html). Then scale the noise by some factor; here I am using `0.3`. After adding noise, pixel values can fall outside the range `[0,1]`, so we need to clip the values using [np.clip(arr, arr_min, arr_max )](https://numpy.org/doc/1.18/reference/generated/numpy.clip.html).

In [None]:
# Adding noise to data
noise = 0.3
train_noise = train + noise * np.random.normal(0, 1, size=train.shape)
test_noise = test + noise * np.random.normal(0, 1, size=test.shape)

train_noise = np.clip(train_noise, 0, 1)
test_noise = np.clip(test_noise, 0, 1)
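
The noise step can also be wrapped in a small helper. This is a sketch rather than the notebook's own code: `add_gaussian_noise` and its `seed` argument are hypothetical names, and a small random array stands in for the MNIST images. It also reports how many pixels ended up exactly at 0 or 1 after clipping, which shows that the final noise is no longer exactly Gaussian.

```python
import numpy as np

def add_gaussian_noise(images, noise_factor=0.3, seed=None):
    """Add zero-mean Gaussian noise with std `noise_factor`, then clip to [0, 1]."""
    rng = np.random.default_rng(seed)
    noisy = images + noise_factor * rng.normal(0.0, 1.0, size=images.shape)
    return np.clip(noisy, 0.0, 1.0)

# Stand-in for scaled images: values already in [0, 1].
images = np.random.default_rng(1).random((8, 28, 28, 1))
noisy = add_gaussian_noise(images, noise_factor=0.3, seed=42)

clipped_fraction = np.mean((noisy == 0.0) | (noisy == 1.0))
print(noisy.shape, noisy.min(), noisy.max())
print("fraction of clipped pixels:", clipped_fraction)
```

Passing a fixed `seed` makes the noisy dataset reproducible between runs, which the plain `np.random.normal` call does not guarantee.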

### Visualise Training data

Let's see what our training data looks like
#### Input (Noisy Images)

In [None]:
# sample noisy image

rows = 5 # defining no. of rows in figure
cols = 6 # defining no. of columns in figure
subplot_size = 2

f = plt.figure(figsize=(subplot_size*cols,subplot_size*rows)) # defining a figure 

for i in range(rows*cols): 
    f.add_subplot(rows,cols,i+1) # adding sub plot to figure on each iteration
    plt.imshow(train_noise[i].reshape([28,28]),cmap="Reds") 
    plt.axis("off")
plt.savefig("digits_noise.png")

#### Original Images

In [None]:
# sample original image

rows = 5 # defining no. of rows in figure
cols = 6 # defining no. of columns in figure
subplot_size = 2
f = plt.figure(figsize=(subplot_size*cols, subplot_size*rows)) # defining a figure 

for i in range(rows*cols): 
    f.add_subplot(rows,cols,i+1) # adding sub plot to figure on each iteration
    plt.imshow(train[i].reshape([28,28]),cmap="Reds") 
    plt.axis("off")
plt.savefig("digits_original.png")

# Autoencoder Model
I am using a basic CNN architecture to build the model, since convolutional layers work well with images and generally give better reconstruction quality than fully connected layers.
<br>

Autoencoder consists of two parts:
1. **Encoder:** In the encoder, I am using 2 Conv2D layers and 2 MaxPool2D layers. The output of the 2nd MaxPool2D layer is the encoded representation, which is the input to the decoder.
2. **Decoder:** The decoder takes the encoder output as input. Here I use Conv2D and UpSampling2D layers. The UpSampling2D layer increases the spatial dimensions, the opposite of MaxPool2D, which reduces them. The output of the decoder has the same dimensions as the input of the encoder.

**How UpSampling2D works:**
With the default size of `(2, 2)`, each value is repeated along both axes, so an input of shape 2x2 becomes 4x4, as in the example below.
```
Input = [
         [1, 2],
         [3, 4]
        ]

Output =  [
           [1, 1, 2, 2],
           [1, 1, 2, 2],
           [3, 3, 4, 4],
           [3, 3, 4, 4]
          ]
```
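
The same nearest-neighbour behaviour can be reproduced in plain NumPy, which may help build intuition. This sketch assumes the default `size=(2, 2)` and nearest interpolation of UpSampling2D; `upsample_2x` and `max_pool_2x` are illustrative helper names, not Keras functions.

```python
import numpy as np

def upsample_2x(x):
    """Repeat every row and every column twice (nearest-neighbour upsampling)."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def max_pool_2x(x):
    """Take the maximum over non-overlapping 2x2 blocks (height and width must be even)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

inp = np.array([[1, 2],
                [3, 4]])
out = upsample_2x(inp)
print(out)
# [[1 1 2 2]
#  [1 1 2 2]
#  [3 3 4 4]
#  [3 3 4 4]]

# Pooling the upsampled array gives back the original 2x2 input.
print(max_pool_2x(out))
# [[1 2]
#  [3 4]]
```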

I am using the functional API of Keras. If you are not familiar with the functional API, you may find the syntax weird. It relies on a feature of Python: a class can define a `__call__()` method to make its objects callable, so an instance of the class behaves like a function.
See the example below:
```
class A:
    def __call__(self):
        print("This is a call function!!")

obj = A()
obj()

Output:
This is a call function!!
```

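If you look carefully, you are doing the same thing in the functional API: `Conv2D(...)` creates a layer object, and calling that object on a tensor returns the output tensor.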

Read More:
* [Functional API](https://keras.io/guides/functional_api/)
* `__call__()` in Python: [Read More 1](https://www.geeksforgeeks.org/callable-in-python/) [Read More 2](https://www.geeksforgeeks.org/__call__-in-python/)
* [UpSampling2D](https://www.tensorflow.org/api_docs/python/tf/keras/layers/UpSampling2D)
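
To see how `__call__()` leads to the chained `x = Layer(...)(x)` style, here is a toy sketch in plain Python. The `Scale` and `Shift` classes are invented for illustration and are not Keras layers; they only imitate the pattern of configuring an object first and then calling it on an input.

```python
class Scale:
    """Toy 'layer' that multiplies its input by a fixed factor."""
    def __init__(self, factor):
        self.factor = factor

    def __call__(self, x):
        return x * self.factor


class Shift:
    """Toy 'layer' that adds a fixed offset to its input."""
    def __init__(self, offset):
        self.offset = offset

    def __call__(self, x):
        return x + self.offset


inputs = 5
x = Scale(2)(inputs)   # create the object, then call it: 5 * 2 = 10
x = Shift(3)(x)        # 10 + 3 = 13
print(x)               # 13
```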

### Encoder

In [None]:
# Encoder 
inputs = Input(shape=(28,28,1))

x = Conv2D(32, 3, activation='relu', padding='same')(inputs)
x = MaxPool2D()(x)
x = Dropout(0.3)(x)
x = Conv2D(32, 3, activation='relu', padding='same')(x)
encoded = MaxPool2D()(x)


### Decoder
Activation of our output layer is sigmoid to make every value between `[0,1]`.

In [None]:
# Decoder

x = Conv2D(32, 3, activation='relu', padding='same')(encoded)
x = UpSampling2D()(x)
x = Dropout(0.3)(x)
x = Conv2D(32, 3, activation='relu', padding='same')(x)
x = UpSampling2D()(x)
decoded = Conv2D(1, 3, activation='sigmoid', padding='same')(x)
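
It can help to trace the tensor shapes through the encoder and decoder by hand. The sketch below is plain Python with hypothetical helper names (`conv_same`, `max_pool`, `up_sample`); it assumes `padding='same'` convolutions with stride 1 and the default 2x2 pooling and upsampling used above.

```python
def conv_same(shape, filters):
    """'same' convolution with stride 1: height and width are kept, channels become `filters`."""
    h, w, _ = shape
    return (h, w, filters)

def max_pool(shape):
    """2x2 max pooling halves height and width."""
    h, w, c = shape
    return (h // 2, w // 2, c)

def up_sample(shape):
    """2x2 upsampling doubles height and width."""
    h, w, c = shape
    return (h * 2, w * 2, c)

shape = (28, 28, 1)

# Encoder: Conv2D -> MaxPool2D -> Conv2D -> MaxPool2D (Dropout does not change shapes)
shape = max_pool(conv_same(shape, 32))    # (14, 14, 32)
shape = max_pool(conv_same(shape, 32))    # (7, 7, 32)
encoded_shape = shape

# Decoder: Conv2D -> UpSampling2D -> Conv2D -> UpSampling2D -> Conv2D(1)
shape = up_sample(conv_same(shape, 32))   # (14, 14, 32)
shape = up_sample(conv_same(shape, 32))   # (28, 28, 32)
decoded_shape = conv_same(shape, 1)       # (28, 28, 1)

print("encoded:", encoded_shape)
print("decoded:", decoded_shape)
```

Note that the encoded tensor holds 7 * 7 * 32 = 1568 values, which is more than the 28 * 28 = 784 input pixels, so the compression in this model is in spatial resolution rather than in the raw number of values.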

### Create and compile model

In [None]:
autoencoder = Model(inputs, decoded)
autoencoder.compile(optimizer='rmsprop', loss='binary_crossentropy')

autoencoder.summary()
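
The parameter counts printed by `summary()` can be cross-checked by hand. A Conv2D layer has `(kernel_h * kernel_w * in_channels + 1) * filters` trainable parameters, where the `+ 1` is the bias; MaxPool2D, UpSampling2D, and Dropout have none. The helper name `conv2d_params` below is just for illustration.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return (kernel * kernel * in_channels + 1) * filters

# (in_channels, filters) for the five Conv2D layers of the MNIST autoencoder above
layers = [(1, 32), (32, 32), (32, 32), (32, 32), (32, 1)]
counts = [conv2d_params(3, c_in, f) for c_in, f in layers]

print(counts)        # [320, 9248, 9248, 9248, 289]
print(sum(counts))   # 28353
```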

### Training
I am training for `50` epochs with a batch size of `256`. The batch size that fits may vary with your system; on Kaggle or Colab, `256` works. Also, remember that if you use another dataset, you may need to change the number of epochs and the batch size.

In [None]:
epochs = 50
batch_size = 256

history = autoencoder.fit(train_noise,
                train,
                epochs=epochs,
                batch_size=batch_size,
                shuffle=True,
                validation_data=(test_noise, test)
               )
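
The model is trained with `binary_crossentropy`, computed per pixel between the clean image and the sigmoid output. The NumPy sketch below is a simplified version of that formula (the function name and the `eps` clipping value are assumptions for illustration, not the exact Keras implementation). It also shows that for grey pixels the smallest achievable loss is not zero, so the absolute loss value should not be read as "distance from perfect".

```python
import numpy as np

def binary_crossentropy(target, pred, eps=1e-7):
    """Mean per-pixel binary cross-entropy; `pred` is clipped to avoid log(0)."""
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(np.mean(-(target * np.log(pred) + (1.0 - target) * np.log(1.0 - pred))))

black_white = np.array([0.0, 1.0, 1.0, 0.0])
perfect_bw = binary_crossentropy(black_white, black_white)

grey = np.array([0.5, 0.5])
perfect_grey = binary_crossentropy(grey, grey)
wrong_grey = binary_crossentropy(grey, np.array([0.9, 0.9]))

print(perfect_bw)     # close to 0 for a perfect match on pure black/white pixels
print(perfect_grey)   # log(2), about 0.693, even for a perfect match
print(wrong_grey)     # larger than log(2)
```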

# Performance/ Visualise Results
Training went well: both the training loss and the validation loss have decreased as expected.

In [None]:
# Defining Figure
f = plt.figure(figsize=(10,7))
f.add_subplot()

#Adding Subplot
plt.plot(history.epoch, history.history['loss'], label = "loss") # Loss curve for training set
plt.plot(history.epoch, history.history['val_loss'], label = "val_loss") # Loss curve for validation set

plt.title("Loss Curve",fontsize=18)
plt.xlabel("Epochs",fontsize=15)
plt.ylabel("Loss",fontsize=15)
plt.grid(alpha=0.3)
plt.legend()
plt.savefig("Loss_curve.png")
plt.show()

### Sample few test images

In [None]:
# Select few random test images
num_imgs = 16
rand = np.random.randint(1, 100)

test_images = test_noise[rand:rand+num_imgs] # slicing
test_desoided = autoencoder.predict(test_images) # predict

In [None]:
# Visualize test images with their denoised images

rows = 2 # defining no. of rows in figure
cols = 8 # defining no. of columns in figure

f = plt.figure(figsize=(2*cols,2*rows*2)) # defining a figure 

for i in range(rows):
    for j in range(cols): 
        f.add_subplot(rows*2,cols, (2*i*cols)+(j+1)) # adding sub plot to figure on each iteration
        plt.imshow(test_images[i*cols + j].reshape([28,28]),cmap="Reds") 
        plt.axis("off")
        
    for j in range(cols): 
        f.add_subplot(rows*2,cols,((2*i+1)*cols)+(j+1)) # adding sub plot to figure on each iteration
        plt.imshow(test_desoided[i*cols + j].reshape([28,28]),cmap="Reds") 
        plt.axis("off")
        
f.suptitle("Autoencoder Results",fontsize=18)
plt.savefig("test_results.png")

plt.show()
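
Besides looking at the images, the result can be summarised with a number. One common choice is the peak signal-to-noise ratio (PSNR), where higher is better. The sketch below defines a hypothetical `psnr` helper for images scaled to `[0, 1]` and tries it on synthetic arrays only; with the notebook's variables, one could compare the PSNR of the noisy inputs against the PSNR of the denoised outputs, both measured against the matching clean test images.

```python
import numpy as np

def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two arrays of the same shape."""
    mse = np.mean((np.asarray(reference) - np.asarray(estimate)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

rng = np.random.default_rng(0)
clean = rng.random((4, 28, 28, 1))
lightly_noisy = np.clip(clean + 0.05 * rng.normal(size=clean.shape), 0, 1)
heavily_noisy = np.clip(clean + 0.30 * rng.normal(size=clean.shape), 0, 1)

print("light noise PSNR:", psnr(clean, lightly_noisy))
print("heavy noise PSNR:", psnr(clean, heavily_noisy))
```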

# Denoising Cifar10 Data

After the MNIST dataset, let's try the idea on the Cifar10 dataset. If you want to know more about the CIFAR dataset, [Read Here](https://www.cs.toronto.edu/~kriz/cifar.html). You can easily get the Cifar10 dataset from Keras datasets.

### Loading/ Preparing data

In [None]:
(cifar_train, _), (cifar_test, _) = cifar10.load_data()

size = 32
channel = 3
# scaling input data
cifar_train = cifar_train / 255
cifar_test = cifar_test / 255

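# Adding Gaussian noise with mean = 0. The std of 0.3 passed to np.random.normal
# is multiplied again by noise = 0.3, so the effective std is 0.09.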
noise = 0.3
cifar_train_noise = cifar_train + noise * np.random.normal(0, 0.3, size=cifar_train.shape) 
cifar_test_noise = cifar_test + noise * np.random.normal(0, 0.3, size=cifar_test.shape)

cifar_train_noise = np.clip(cifar_train_noise, 0, 1)
cifar_test_noise = np.clip(cifar_test_noise, 0, 1)
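
Note that the cell above scales the noise twice: `np.random.normal(0, 0.3, ...)` already has a standard deviation of 0.3 and is then multiplied by `noise = 0.3`, so the noise that is actually added has a standard deviation of about 0.3 * 0.3 = 0.09 before clipping. A quick NumPy check on random samples (not the CIFAR data) illustrates this; to get a standard deviation of 0.3, one would sample with `np.random.normal(0, 1, ...)` as in the MNIST section.

```python
import numpy as np

rng = np.random.default_rng(0)
noise = 0.3

double_scaled = noise * rng.normal(0, 0.3, size=200_000)   # same form as the CIFAR cell
single_scaled = noise * rng.normal(0, 1.0, size=200_000)   # same form as the MNIST cell

print("std with normal(0, 0.3):", double_scaled.std())   # roughly 0.09
print("std with normal(0, 1.0):", single_scaled.std())   # roughly 0.30
```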

### Sample few noisy and original images

In [None]:
# Visualize few training images with their noisy images

rows = 2 # defining no. of rows in figure
cols = 8 # defining no. of columns in figure

f = plt.figure(figsize=(2*cols,2*rows*2)) # defining a figure 

for i in range(rows):
    for j in range(cols): 
        f.add_subplot(rows*2,cols, (2*i*cols)+(j+1)) # adding sub plot to figure on each iteration
        plt.imshow(cifar_train_noise[i*cols + j]) 
        plt.axis("off")
        
    for j in range(cols): 
        f.add_subplot(rows*2,cols,((2*i+1)*cols)+(j+1)) # adding sub plot to figure on each iteration
        plt.imshow(cifar_train[i*cols + j]) 
        plt.axis("off")
        
f.suptitle("Sample Training Data",fontsize=18)
plt.savefig("Cifar-trian.png")

plt.show()

## Model
Here I am using a more complicated architecture. What's different from the previous model architecture:
* Conv2DTranspose layer
* No UpSampling2D layer
* Skip connection from the encoder to the decoder
* 3 Conv2D layers followed by BatchNormalization and MaxPool2D

## Deconvolution (Conv2DTranspose)
**Conv2DTranspose** layer performs a transposed convolution, which maps in the opposite direction of **Conv2D**: from the shape of a convolution's output back to the shape of its input. It is often called deconvolution, although it does not compute the exact mathematical inverse of a convolution. An **UpSampling** layer simply copies values into the upscaled dimension and has no trainable weights, whereas a transposed convolution combines upsampling and convolution in one layer and learns how to fill in the new values. There is one disadvantage: transposed convolution can lead to checkerboard artifacts, which you can see in the image below. [Read More about Checkerboard Artifacts](https://distill.pub/2016/deconv-checkerboard/)
> Checkerboard Artifacts
![Checkerboard Artifacts](https://distill.pub/2016/deconv-checkerboard/assets/deepdream_full_gitter_8x8.png) 

> Deconvolution operation
![deconv](https://miro.medium.com/max/1972/1*kOThnLR8Fge_AJcHrkR3dg.gif)



>The need for transposed convolutions generally arises from the desire to use a transformation going in the opposite direction of a normal convolution, i.e., from something that has the shape of the output of some convolution to something that has the shape of its input while maintaining a connectivity pattern that is compatible with said convolution.
Source: [A Guide To Convolution Arithmetic For Deep Learning, 2016.](https://arxiv.org/abs/1603.07285)

Read More:
* [Conv2DTranspose layer](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv2DTranspose)
* [UpSampling2D vs Conv2DTranspose](https://machinelearningmastery.com/upsampling-and-transpose-convolution-layers-for-generative-adversarial-networks/)
* [Deconvolution and Checkerboard Artifacts](https://distill.pub/2016/deconv-checkerboard/)
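
The link between transposed convolution and checkerboard artifacts can be illustrated in one dimension. In a transposed convolution, each input value "paints" a scaled copy of the kernel into the output, and with stride 2 neighbouring copies overlap. The sketch below is a simplified NumPy version without padding (`transposed_conv1d` is an illustrative helper, not the Keras implementation). Feeding it all-ones inputs and kernels counts how many contributions each output position receives.

```python
import numpy as np

def transposed_conv1d(x, kernel, stride):
    """1-D transposed convolution without padding.

    The output length is (len(x) - 1) * stride + len(kernel).
    """
    out = np.zeros((len(x) - 1) * stride + len(kernel))
    for i, value in enumerate(x):
        start = i * stride
        out[start:start + len(kernel)] += value * kernel  # paint a scaled kernel copy
    return out

ones = np.ones(4)

# Kernel size 3 is not divisible by stride 2: the overlap alternates between 1 and 2.
uneven = transposed_conv1d(ones, np.ones(3), stride=2)
print(uneven)
# [1. 1. 2. 1. 2. 1. 2. 1. 1.]

# Kernel size 4 is divisible by stride 2: the interior overlap is even.
even = transposed_conv1d(ones, np.ones(4), stride=2)
print(even)
# [1. 1. 2. 2. 2. 2. 2. 2. 1. 1.]
```

The uneven overlap in the first case is what the Distill article linked above identifies as a source of checkerboard patterns. The model in this notebook uses kernel size 3 with stride 2, so trying a kernel size of 4 or 2 could be a worthwhile experiment.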

## Skip Connection
Skip connections are very useful in networks that combine convolution and deconvolution operations. They help restore information that can be lost during downsampling in the encoder by passing encoder features directly to the decoder.

![Skip connection](https://www.researchgate.net/profile/Muzammal_Naseer/publication/323694671/figure/fig5/AS:603205645910017@1520826841252/A-basic-autoencoder-architecture.png)

<br>

You can go through this research paper [Image Restoration Using Convolutional Auto-encoders with Symmetric Skip Connections](https://arxiv.org/pdf/1606.08921.pdf).



In [None]:
from keras.layers import Conv2DTranspose, BatchNormalization, add, LeakyReLU
from keras.optimizers import Adam

In [None]:
# Encoder 
inputs = Input(shape=(size,size,channel))

x = Conv2D(32, 3, activation='relu', padding='same')(inputs)
x = BatchNormalization()(x)
x = MaxPool2D()(x)
x = Dropout(0.5)(x)
skip = Conv2D(32, 3, padding='same')(x) # skip connection for decoder
x = LeakyReLU()(skip)
x = BatchNormalization()(x)
x = MaxPool2D()(x)
x = Dropout(0.5)(x)
x = Conv2D(64, 3, activation='relu', padding='same')(x)
x = BatchNormalization()(x)
encoded = MaxPool2D()(x)

# Decoder
x = Conv2DTranspose(64, 3,activation='relu',strides=(2,2), padding='same')(encoded)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
x = Conv2DTranspose(32, 3, activation='relu',strides=(2,2), padding='same')(x)
x = BatchNormalization()(x)
x = Dropout(0.5)(x)
x = Conv2DTranspose(32, 3, padding='same')(x)
x = add([x,skip]) # adding skip connection
x = LeakyReLU()(x)
x = BatchNormalization()(x)
decoded = Conv2DTranspose(3, 3, activation='sigmoid',strides=(2,2), padding='same')(x)

autoencoder = Model(inputs, decoded)
autoencoder.compile(optimizer=Adam(learning_rate=0.001), loss='binary_crossentropy')
autoencoder.summary()
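
For `add([x, skip])` to work, both tensors must have exactly the same shape. The plain-Python sketch below traces the shapes of the model above under the assumption that `padding='same'` convolutions keep height and width, 2x2 pooling halves them, and a `padding='same'` transposed convolution multiplies them by its stride. The helper names are hypothetical.

```python
def conv(shape, filters):
    """'same' Conv2D with stride 1: spatial size unchanged."""
    return (shape[0], shape[1], filters)

def pool(shape):
    """2x2 MaxPool2D: spatial size halved."""
    return (shape[0] // 2, shape[1] // 2, shape[2])

def conv_transpose(shape, filters, stride=1):
    """'same' Conv2DTranspose: spatial size multiplied by the stride."""
    return (shape[0] * stride, shape[1] * stride, filters)

# Encoder (BatchNormalization, Dropout and activations keep the shape)
x = pool(conv((32, 32, 3), 32))             # (16, 16, 32)
skip = conv(x, 32)                          # (16, 16, 32), saved for the decoder
x = pool(skip)                              # (8, 8, 32)
encoded = pool(conv(x, 64))                 # (4, 4, 64)

# Decoder
x = conv_transpose(encoded, 64, stride=2)   # (8, 8, 64)
x = conv_transpose(x, 32, stride=2)         # (16, 16, 32)
x = conv_transpose(x, 32)                   # (16, 16, 32)
print("decoder:", x, "skip:", skip, "match:", x == skip)
decoded = conv_transpose(x, 3, stride=2)    # (32, 32, 3)
print("output:", decoded)
```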

In [None]:
# Training
epochs = 25
batch_size = 256

history = autoencoder.fit(cifar_train_noise,
                cifar_train,
                epochs=epochs,
                batch_size=batch_size,
                shuffle=True,
                validation_data=(cifar_test_noise, cifar_test)
               )

In [None]:
# Defining Figure
f = plt.figure(figsize=(10,7))
f.add_subplot()

#Adding Subplot
plt.plot(history.epoch, history.history['loss'], label = "loss") # Loss curve for training set
plt.plot(history.epoch, history.history['val_loss'], label = "val_loss") # Loss curve for validation set

plt.title("Loss Curve",fontsize=18)
plt.xlabel("Epochs",fontsize=15)
plt.ylabel("Loss",fontsize=15)
plt.grid(alpha=0.3)
plt.legend()
plt.savefig("Loss_curve_cifar10.png")
plt.show()

In [None]:
# Select few random test images
num_imgs = 48
rand = np.random.randint(0, cifar_test_noise.shape[0] - num_imgs) # random start index that leaves room for num_imgs images

cifar_test_images = cifar_test_noise[rand:rand+num_imgs] # slicing
cifar_test_desoided = autoencoder.predict(cifar_test_images) # predict

In [None]:
# Visualize test images with their denoised images

rows = 4 # defining no. of rows in figure
cols = 12 # defining no. of columns in figure
cell_size = 1.5
f = plt.figure(figsize=(cell_size*cols,cell_size*rows*2)) # defining a figure 
f.tight_layout()
for i in range(rows):
    for j in range(cols): 
        f.add_subplot(rows*2,cols, (2*i*cols)+(j+1)) # adding sub plot to figure on each iteration
        plt.imshow(cifar_test_images[i*cols + j]) 
        plt.axis("off")
        
    for j in range(cols): 
        f.add_subplot(rows*2,cols,((2*i+1)*cols)+(j+1)) # adding sub plot to figure on each iteration
        plt.imshow(cifar_test_desoided[i*cols + j]) 
        plt.axis("off")

f.suptitle("Autoencoder Results - Cifar10",fontsize=18)
plt.savefig("test_results_cifar10.png")

plt.show()

### That's all for now! You can build your own autoencoders 😎😎. Explore more datasets and have fun training them.

# Conclusion
Autoencoders are powerful and can do a lot more. Here I introduced you to 2 simple examples, and you can see how well our model performed on the denoising task. There are other uses as well, such as using autoencoders for sequential data. The variational autoencoder (VAE) is a slightly more advanced and modern approach that can be used to generate images. In the future, I will try to cover more uses of autoencoders with code implementations. <br>

![faces](https://miro.medium.com/max/1400/1*BaZPg3SRgZGVigguQCmirA.png)
> Face images generated with a Variational Autoencoder (source: [Wojciech Mormul on Github](https://github.com/WojciechMormul/vae)).

# About Me
<hr>
I am Tarun Kumar from India.<br>
Software Developer at Toppr.<br>
Hobbyist Artist. <br>
PE Undergrad from IIT ISM Dhanbad. <br>
Deep Learning, Reinforcement Learning, and Data Science. 
<br>
<br>

### Follow me:

* <a href="https://bit.ly/tarungithub"> GitHub</a>
* <a href="https://bit.ly/tarnkr-youtube">Youtube</a>
* <a href="https://medium.com/@codeeasy"> Medium</a>
* <a href="https://www.linkedin.com/in/tarun-kumar-iit-ism/">Linkedin</a>


<hr>

# Feedback
* **Your feedback is much appreciated**
* **Please UPVOTE if you LIKE this notebook**
* **Comment if you have any doubts or you found any errors in the notebook**