# Denoising autoencoder

We are going to use ```keras``` (```tensorflow``` as backend) to build a simple denoising autoencoder.

## Libraries
We'll be using Pillow (a fork of PIL, the Python Imaging Library), TensorFlow, NumPy and Matplotlib.
You can install Pillow with:
```
!pip install Pillow
```

## Imports

In [2]:
from tensorflow import keras
from tensorflow.keras.layers import Activation, Dense, Input
from tensorflow.keras.layers import Conv2D, Flatten
from tensorflow.keras.layers import Reshape, Conv2DTranspose
from tensorflow.keras.models import Model
from tensorflow.keras import backend as K

from tensorflow.keras.datasets import mnist
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

## Data

### Load and check the data

In [3]:
# load the data
(x_train, _), (x_test, _) = mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 1s 0us/step


In [None]:
# load_data() returns (x_train, y_train), (x_test, y_test).
# We discard the y-labels by assigning them to "_":
# training an autoencoder is unsupervised learning, so we don't need labels.

In [4]:
x_train.shape

# 60000 images, each 28 by 28 pixels

(60000, 28, 28)

In [6]:
x_train[0].shape

# Each image is a 28 by 28 array of unsigned 8-bit integers (uint8),
# so it can be plotted directly as a grayscale image

(28, 28)

In [8]:
# `img` was never defined; use Matplotlib (imported as plt) to show the first image
plt.imshow(x_train[0], cmap='gray')
plt.show()

#### Take a look at the data

Just look at one or two examples to get an idea of the data.
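
One way to look at several examples at once is to tile them into a single array and display that. The helper below is a minimal sketch; `tile_images` is a hypothetical name, and the random array is only a stand-in with the same layout as `x_train`.

```python
import numpy as np

def tile_images(images, n_cols=5):
    """Arrange images from a (N, H, W) array into one 2-D grid with n_cols columns."""
    images = np.asarray(images)
    n, h, w = images.shape
    n_rows = n // n_cols
    # Keep only as many images as fill complete rows
    grid = images[: n_rows * n_cols].reshape(n_rows, n_cols, h, w)
    # (rows, cols, h, w) -> (rows, h, cols, w) -> (rows * h, cols * w)
    return grid.transpose(0, 2, 1, 3).reshape(n_rows * h, n_cols * w)

# Stand-in data shaped like ten MNIST digits (assumption: replace with x_train[:10])
rng = np.random.default_rng(0)
demo = rng.integers(0, 256, size=(10, 28, 28), dtype=np.uint8)
grid = tile_images(demo, n_cols=5)
print(grid.shape)  # (56, 140): 2 rows and 5 columns of 28 x 28 images
```

In the notebook, something like `plt.imshow(tile_images(x_train[:10]), cmap='gray')` should then show ten digits in one figure.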

### Reshape the images

When loaded, each image is 2-dimensional, 28 x 28 pixels. However, because of how Keras and TensorFlow handle images, we want the images to have the shape 28 x 28 x num_channels, where num_channels is the number of color channels in the images. These are grayscale images, so num_channels is 1; for colour (RGB) images num_channels is 3.
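
A minimal sketch of the reshape, assuming the images arrive as a NumPy array of shape (N, 28, 28). `add_channel_axis` is a hypothetical helper name, and the zero array stands in for the real data.

```python
import numpy as np

def add_channel_axis(images):
    """Turn grayscale images of shape (N, H, W) into (N, H, W, 1)."""
    return np.expand_dims(images, axis=-1)

# Stand-in for a small batch of MNIST images
demo = np.zeros((4, 28, 28), dtype=np.uint8)
print(add_channel_axis(demo).shape)  # (4, 28, 28, 1)
```

Applied to both `x_train` and `x_test` before the model is built, this makes `x_train.shape[1:]` equal to `(28, 28, 1)`, which is the input shape the `Conv2D` layers expect.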

### Normalize the input data
Check the range of values and data type (```dtype```) of the input data.

Is this ok, or should we normalize the data?

What about the data type?
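
One possible answer, as a sketch: MNIST pixels are stored as `uint8` values from 0 to 255, while the decoder in this notebook ends in a sigmoid that outputs values between 0 and 1. Scaling the inputs to floats in [0, 1] puts inputs and targets on the same scale. `normalize_images` is a hypothetical helper name.

```python
import numpy as np

def normalize_images(images):
    """Convert uint8 pixels in [0, 255] to float32 values in [0, 1]."""
    return images.astype("float32") / 255.0

# Stand-in pixels covering the full uint8 range
demo = np.array([[0, 128, 255]], dtype=np.uint8)
scaled = normalize_images(demo)
print(scaled.dtype, scaled.min(), scaled.max())
```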

### Generate corrupted MNIST images

Add noise drawn from a normal distribution centered at 0.5 with std=0.5.
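
A minimal sketch of the corruption step, assuming the images have already been scaled to [0, 1]. Clipping the result back into [0, 1] is an assumption that the text does not state; it keeps the noisy images in the same range as the clean ones. `add_gaussian_noise` and its `seed` parameter are hypothetical names.

```python
import numpy as np

def add_gaussian_noise(images, mean=0.5, std=0.5, seed=None):
    """Add normally distributed noise to images and clip the result to [0, 1]."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(loc=mean, scale=std, size=images.shape)
    # Clipping is an assumption: it keeps pixel values in the valid range
    return np.clip(images + noise, 0.0, 1.0).astype("float32")

# Stand-in batch of normalized images
clean = np.full((8, 28, 28, 1), 0.5, dtype="float32")
noisy = add_gaussian_noise(clean, seed=0)
print(noisy.shape, noisy.dtype)  # (8, 28, 28, 1) float32
```

In the notebook this would give `x_train_noisy = add_gaussian_noise(x_train)` and `x_test_noisy = add_gaussian_noise(x_test)`, the names used later in the training cell.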

### Check the corrupted images

## Model

### Build the model
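Use ```Sequential()``` and add two encoder layers, one hidden/middle layer, and two decoder layers.
For the layers, use ```Dense()```, i.e. densely connected layers (not convolutional or dropout).
Note that the worked cells in this section instead use the functional API (```Model```) with ```Conv2D``` and ```Conv2DTranspose``` layers.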
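
For reference, a sketch of what the ```Dense()``` variant described above could look like. The layer widths (128 and 64) are illustrative assumptions, the latent size of 16 mirrors `latent_dim` used in this notebook, and `build_dense_autoencoder` is a hypothetical helper name. It assumes each image is flattened to a vector of 784 values.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_dense_autoencoder(input_dim=784, latent_dim=16):
    """Dense autoencoder: two encoder layers, one middle layer, two decoder layers."""
    return keras.Sequential(
        [
            keras.Input(shape=(input_dim,)),
            layers.Dense(128, activation="relu"),           # encoder layer 1
            layers.Dense(64, activation="relu"),            # encoder layer 2
            layers.Dense(latent_dim, activation="relu"),    # hidden/middle layer
            layers.Dense(64, activation="relu"),            # decoder layer 1
            layers.Dense(input_dim, activation="sigmoid"),  # decoder layer 2, outputs in [0, 1]
        ],
        name="dense_autoencoder",
    )

dense_autoencoder = build_dense_autoencoder()
dense_autoencoder.compile(loss="mse", optimizer="adam")
```

With this variant the images would be flattened first, for example with `x_train.reshape(len(x_train), -1)`, and the reconstructions reshaped back to 28 x 28 for plotting.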

In [None]:
# Network parameters
input_shape = x_train.shape[1:]
batch_size = 128
kernel_size = 3
latent_dim = 16
# Encoder/Decoder number of CNN layers and filters per layer (depth)
layer_filters = [32, 64]

# Build the Autoencoder Model
# First build the Encoder Model
inputs = Input(shape=input_shape, name='encoder_input')
x = inputs
# Stack of Conv2D blocks
# Notes:
# 1) Use Batch Normalization before ReLU on deep networks
# 2) Use MaxPooling2D as alternative to strides>1
# - faster but not as good as strides>1
for filters in layer_filters:
    x = Conv2D(filters=filters,
               kernel_size=kernel_size,
               strides=2,
               activation='relu',
               padding='same')(x)

# Shape info needed to build Decoder Model
shape = K.int_shape(x)

# Generate the latent vector
x = Flatten()(x)
latent = Dense(latent_dim, name='latent_vector')(x)

# Instantiate Encoder Model
encoder = Model(inputs, latent, name='encoder')
encoder.summary()

# Build the Decoder Model
latent_inputs = Input(shape=(latent_dim,), name='decoder_input')
x = Dense(shape[1] * shape[2] * shape[3])(latent_inputs)
x = Reshape((shape[1], shape[2], shape[3]))(x)

# Stack of Transposed Conv2D blocks
# Notes:
# 1) Use Batch Normalization before ReLU on deep networks
# 2) Use UpSampling2D as alternative to strides>1
# - faster but not as good as strides>1
for filters in layer_filters[::-1]:
    x = Conv2DTranspose(filters=filters,
                        kernel_size=kernel_size,
                        strides=2,
                        activation='relu',
                        padding='same')(x)

x = Conv2DTranspose(filters=1,
                    kernel_size=kernel_size,
                    padding='same')(x)

outputs = Activation('sigmoid', name='decoder_output')(x)
   

In [None]:
# Instantiate Decoder Model
decoder = Model(latent_inputs, outputs, name='decoder')
decoder.summary()

# Autoencoder = Encoder + Decoder
# Instantiate Autoencoder Model
autoencoder = Model(inputs, decoder(encoder(inputs)), name='autoencoder')
autoencoder.summary()
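
The shapes reported by the summaries can be checked by hand. The sketch below assumes the standard Keras behaviour for ```padding='same'```: a strided convolution yields `ceil(size / stride)` and a strided transposed convolution yields `size * stride`. The helper names are hypothetical.

```python
import math

def conv_same_out(size, stride):
    """Spatial output size of a strided convolution with padding='same'."""
    return math.ceil(size / stride)

def conv_transpose_same_out(size, stride):
    """Spatial output size of a strided transposed convolution with padding='same'."""
    return size * stride

layer_filters = [32, 64]

# Encoder: two stride-2 convolutions shrink 28 -> 14 -> 7
size = 28
for _ in layer_filters:
    size = conv_same_out(size, 2)
encoded_size = size
flat_units = encoded_size * encoded_size * layer_filters[-1]

# Decoder: two stride-2 transposed convolutions grow 7 -> 14 -> 28
for _ in layer_filters[::-1]:
    size = conv_transpose_same_out(size, 2)
decoded_size = size

print(encoded_size, flat_units, decoded_size)  # 7 3136 28
```

This is why the first decoder `Dense` layer has `shape[1] * shape[2] * shape[3]` = 7 * 7 * 64 units, and why the final output is 28 x 28 again.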

### Compile and train

In [42]:
autoencoder.compile(loss='mse', optimizer='adam')

In [None]:
# Train the autoencoder
autoencoder.fit(x_train_noisy,
                x_train,
                validation_data=(x_test_noisy, x_test),
                epochs=30,
                batch_size=batch_size)

## Test

### Test our brand new denoiser

Pro-tip: use ```model.predict()``` and take a look at the result (try plotting the original and reconstructed images next to each other).

**Are we doing denoising?**

Add some noise to a test image and try to reconstruct it.

Pro-tip: use ```np.random.something```

Plot the original, noisy and reconstructed images side by side.
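
A sketch of one way to build the side-by-side comparison as a single array: each row holds one example as original, noisy and reconstructed. `side_by_side` is a hypothetical helper name, and the random arrays are stand-ins for the real batches.

```python
import numpy as np

def side_by_side(original, noisy, reconstructed, n=3):
    """Build a 2-D panel: one row per example, columns original | noisy | reconstructed."""
    rows = []
    for i in range(n):
        # np.squeeze drops the trailing channel axis: (H, W, 1) -> (H, W)
        triple = [np.squeeze(batch[i]) for batch in (original, noisy, reconstructed)]
        rows.append(np.hstack(triple))
    return np.vstack(rows)

# Stand-in batches shaped like reshaped, normalized MNIST images
rng = np.random.default_rng(0)
original = rng.random((3, 28, 28, 1))
noisy = rng.random((3, 28, 28, 1))
reconstructed = rng.random((3, 28, 28, 1))
panel = side_by_side(original, noisy, reconstructed)
print(panel.shape)  # (84, 84): 3 examples by 3 versions of 28 x 28 images
```

In the notebook, the reconstructions would come from something like `x_decoded = autoencoder.predict(x_test_noisy)`, and the panel can be shown with `plt.imshow(side_by_side(x_test, x_test_noisy, x_decoded), cmap='gray')`.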