# TP Convolutional Neural Networks in tensorflow and keras - part 3

Author : Alasdair Newson

alasdair.newson@telecom-paristech.fr

In this session, we shall be looking at autoencoders. In particular, we shall apply these autoencoders to image denoising, on simple images from the MNIST dataset.

First, let's load the necessary packages.

In [1]:
import pdb
import matplotlib.pyplot as plt
import numpy as np
import os

from keras.datasets import mnist
from keras.layers import Input, Dense, Reshape, Flatten, Dropout, BatchNormalization, Activation, ZeroPadding2D, MaxPooling2D
from keras.layers import LeakyReLU, UpSampling2D, Conv2D, Conv2DTranspose
from keras.models import Sequential, Model, load_model
from keras.optimizers import Adam

Using TensorFlow backend.


Now, we are going to create a simple autoencoder based on an MLP architecture, in Keras. The architecture is as follows:

- Encoder :
    - Flatten input
    - Dense layer, of output size $d$
    - Leaky ReLU ($\alpha$=0.2)
- Decoder :
    - Dense Layer, output size 784 (28$\times$28)
    - Sigmoid activation
    - Reshape, to size $28\times28$
    
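To make the shapes in this architecture concrete, here is a minimal NumPy sketch of a single forward pass. It is only an illustration, not the Keras implementation requested in this exercise: the weights are random, the batch is fake data, and the names `W_enc`, `W_dec`, `leaky_relu` and `sigmoid` are hypothetical helpers introduced for this example. The latent size $d = 32$ is chosen to match the `z_dim` value used in the class.

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    # identity for positive inputs, slope alpha for negative inputs
    return np.where(x > 0, x, alpha * x)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
d = 32                                   # latent size (illustrative value)
batch = rng.random((5, 28, 28, 1))       # 5 fake images with values in [0, 1)

# hypothetical, randomly initialised weights (a trained model would learn these)
W_enc, b_enc = rng.normal(0, 0.05, (784, d)), np.zeros(d)
W_dec, b_dec = rng.normal(0, 0.05, (d, 784)), np.zeros(784)

flat = batch.reshape(batch.shape[0], -1)   # Flatten        -> (5, 784)
z = leaky_relu(flat @ W_enc + b_enc)       # Dense + LReLU  -> (5, 32)
out = sigmoid(z @ W_dec + b_dec)           # Dense + sigmoid -> (5, 784)
recon = out.reshape(batch.shape)           # Reshape        -> (5, 28, 28, 1)

print(flat.shape, z.shape, recon.shape)
```

The sigmoid keeps every output pixel strictly between 0 and 1, which matches the range of the normalised input images.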
The following code defines an ```autoencoder``` class, which builds and compiles the model. Modify this class to implement the MLP autoencoder described above.

In [27]:
class autoencoder():
    def __init__(self,dataset_name='mnist',architecture='mlp'):

        X_train = self.load_data(dataset_name)
        optimizer = 'adadelta'

        # image parameters
        self.epochs = 51
        self.error_list = np.zeros((self.epochs,1))
        self.img_rows = X_train.shape[1]
        self.img_cols = X_train.shape[2]
        self.img_channels = X_train.shape[3]
        self.img_size = X_train.shape[1] * X_train.shape[2] * X_train.shape[3]
        self.img_shape = (self.img_rows, self.img_cols, self.img_channels)
        self.z_dim = 32
        self.sample_interval = 50
        self.dataset_name = dataset_name

        # Build and compile the autoencoder
        self.ae = self.build_ae()
        self.ae.summary()
        self.ae.compile(optimizer=optimizer, loss='binary_crossentropy') # binary cross-entropy loss: pixel values are normalised to [0, 1]

    def build_ae(self):

        n_pixels = self.img_rows*self.img_cols*self.img_channels

        # FULLY CONNECTED (MLP)

        #BEGIN FILL IN CODE
        input_img = Input(shape=self.img_shape)
        
        encoder = Flatten()(input_img)
        encoder = Dense(self.z_dim, activation=LeakyReLU(alpha=0.2))(encoder)
        
        decoder = Dense(n_pixels, activation='sigmoid')(encoder)
        decoder = Reshape(self.img_shape)(decoder)
        
        ae_model = Model(input_img, decoder)
        # END FILL IN CODE

        #output the model
        return ae_model
    
    
    def load_data(self,dataset_name):
        # Load the dataset
        if(dataset_name == 'mnist'):
            (X_train, _), (_, _) = mnist.load_data()
        else:
            # fail early: X_train would otherwise be undefined below
            raise ValueError('Unknown dataset: {}'.format(dataset_name))

        # normalise images between 0 and 1
        X_train = X_train/255.0
        #add a channel dimension, if need be (for mnist data)
        if(X_train.ndim ==3):
            X_train = np.expand_dims(X_train, axis=3)
        return X_train

    def test_images(self, test_imgs, image_filename):
        # this function shows some input/output images for the autoencoder
        n_images = test_imgs.shape[0]
        #get output images
        output_imgs = self.ae.predict( test_imgs )

        r = 2
        c = n_images
        fig, axs = plt.subplots(r, c)
        for j in range(c):
            #black and white images
            axs[0,j].imshow(test_imgs[j, :, :, 0], cmap='gray')
            axs[0,j].axis('off')
            axs[1,j].imshow(output_imgs[j, :, :, 0], cmap='gray')
            axs[1,j].axis('off')
        # save once, after all columns have been drawn
        fig.savefig(image_filename)
        plt.close(fig)

Now, modify the code in the next cell to train the autoencoder. 

In order to monitor the autoencoder's progress, use the ```test_images()``` function defined above to save the reconstructions of some random images. You can do this every ```sample_interval``` steps (a parameter of the autoencoder class).

Note that, in Keras, you can use the following function:

- ```model.train_on_batch```($x,\hat{y}$), where $\hat{y}$ is the target data (in the case of the autoencoder, this is the input data itself)

to carry out a training step on a single batch, rather than calling ```model.fit()```. This makes it easy to inspect the autoencoder's output as training progresses.
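The batching logic around such a call can be sketched without Keras. The example below is a hedged illustration only: `iterate_minibatches` is a hypothetical helper, the images are random data, and the small `sample_interval` value is chosen just to keep the output short. The Keras call itself appears only as a comment.

```python
import numpy as np

def iterate_minibatches(data, batch_size, rng):
    """Yield (step, batch) pairs in shuffled order; an incomplete final batch is dropped."""
    order = rng.permutation(data.shape[0])
    n_steps = data.shape[0] // batch_size
    for step in range(n_steps):
        idx = order[step * batch_size:(step + 1) * batch_size]
        yield step, data[idx]

rng = np.random.default_rng(0)
fake_images = rng.random((1000, 28, 28, 1))   # stand-in for X_train
sample_interval = 3                           # hypothetical small value

logged_steps = []
for step, batch in iterate_minibatches(fake_images, 128, rng):
    # with a compiled Keras model, the training step would be:
    #     model.train_on_batch(batch, batch)
    if step % sample_interval == 0:
        logged_steps.append(step)             # this is where images would be saved

print(logged_steps)   # [0, 3, 6]
```

Indexing with a permutation leaves the original array untouched, which can be convenient when the same data are reused later in its original order.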

In [26]:
#create the output image directory (and its parent) if it does not exist
os.makedirs('images/part3_without_noise', exist_ok=True)
    
#choose dataset
dataset_name = 'mnist'
batch_size=128

#create AE model
ae = autoencoder(dataset_name)
#load dataset
X_train = ae.load_data(ae.dataset_name)

# Now, train model
#BEGIN INSERT CODE
number_steps = X_train.shape[0]//batch_size

for epoch in range(ae.epochs):
    np.random.shuffle(X_train)
    
    for i in range(number_steps):
        batch_x = X_train[i*batch_size:(i+1)*batch_size]
    
        ae.ae.train_on_batch(batch_x, batch_x)
    
        if i%(2*ae.sample_interval) == 0 and epoch%(ae.sample_interval//5) == 0:
            ae.test_images(X_train[i*batch_size: i*batch_size + 5], './images/part3_without_noise/epoch_{}_step_{}.jpg'.format(epoch, i*batch_size))
# END INSERT CODE

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_13 (InputLayer)        (None, 28, 28, 1)         0         
_________________________________________________________________
flatten_13 (Flatten)         (None, 784)               0         
_________________________________________________________________
dense_25 (Dense)             (None, 32)                25120     
_________________________________________________________________
dense_26 (Dense)             (None, 784)               25872     
_________________________________________________________________
reshape_13 (Reshape)         (None, 28, 28, 1)         0         
Total params: 50,992
Trainable params: 50,992
Non-trainable params: 0
_________________________________________________________________
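
The parameter counts in the summary above can be checked by hand: a dense layer with $n$ inputs and $m$ outputs has $n \times m$ weights plus $m$ biases. A minimal check, using the layer sizes shown in the summary:

```python
n_pixels = 28 * 28 * 1      # flattened image size
z_dim = 32                  # latent size

encoder_params = n_pixels * z_dim + z_dim       # weights + biases of the first Dense layer
decoder_params = z_dim * n_pixels + n_pixels    # weights + biases of the second Dense layer

print(encoder_params, decoder_params, encoder_params + decoder_params)
# 25120 25872 50992
```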


Now, we can train a denoising autoencoder. The denoising autoencoder minimises the following loss function:

- $\mathcal{L}(x) = || x - D \circ E (x+\eta)||^2_2$,

where $\eta$ is some noise with a fixed standard deviation.
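
Note that the loss above is a squared $\ell_2$ norm, whereas the class defined earlier compiles the model with `'binary_crossentropy'`. The NumPy sketch below compares the two on a toy example. It is an illustration under assumptions, not the Keras implementation: `squared_l2` and `binary_crossentropy` are hypothetical helpers, and the cross-entropy is averaged over pixels here.

```python
import numpy as np

def squared_l2(x, x_hat):
    # sum of squared pixel differences, one value per image
    return np.sum((x - x_hat) ** 2, axis=(1, 2, 3))

def binary_crossentropy(x, x_hat, eps=1e-7):
    # pixel-wise cross-entropy, averaged over pixels, one value per image
    x_hat = np.clip(x_hat, eps, 1 - eps)
    return -np.mean(x * np.log(x_hat) + (1 - x) * np.log(1 - x_hat), axis=(1, 2, 3))

x = np.full((1, 28, 28, 1), 0.5)       # a uniformly grey toy image
perfect = x.copy()                     # perfect reconstruction
poor = np.full_like(x, 0.9)            # too-bright reconstruction

l2_perfect, l2_poor = squared_l2(x, perfect)[0], squared_l2(x, poor)[0]
bce_perfect, bce_poor = binary_crossentropy(x, perfect)[0], binary_crossentropy(x, poor)[0]
print(l2_perfect, l2_poor, bce_perfect, bce_poor)
```

Both losses are smallest for the perfect reconstruction, but the cross-entropy does not reach zero when the target pixels are not exactly 0 or 1 (here its minimum is $\ln 2$).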

This is quite simple to implement using the above code. Instead of feeding the clean images to the network, feed it the same images with noise added, and keep the clean images as the target. Use additive Gaussian noise with a relatively high standard deviation ($\sigma=20.0/255.0$, for example).
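
As a hedged sketch of the noise step alone, the NumPy example below adds Gaussian noise to a batch of fake images. The helper `add_gaussian_noise` is hypothetical, and clipping the result back to $[0, 1]$ is an assumption made for this example (it keeps the noisy input in the same range as the clean images); it is not required by the exercise.

```python
import numpy as np

def add_gaussian_noise(images, sigma, rng):
    """Return images plus zero-mean Gaussian noise of standard deviation sigma."""
    noisy = images + rng.normal(0.0, sigma, size=images.shape)
    # assumption: clip so that the noisy images stay in the valid range [0, 1]
    return np.clip(noisy, 0.0, 1.0)

rng = np.random.default_rng(0)
clean = rng.random((128, 28, 28, 1))    # stand-in for one batch of X_train
sigma = 20.0 / 255.0

noisy = add_gaussian_noise(clean, sigma, rng)

# the network would receive `noisy` as input and `clean` as target
print(noisy.shape, float(np.abs(noisy - clean).mean()))
```

Calling the helper once per batch draws fresh noise at every training step, so the network never sees the same noisy version of an image twice.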

In the following cell, implement the training of the denoising autoencoder.

In [29]:
#create the output image directory (and its parent) if it does not exist
os.makedirs('images/part3_with_noise', exist_ok=True)
    
#choose dataset
dataset_name = 'mnist'
batch_size=128

#create AE model
ae = autoencoder(dataset_name)
#load dataset
X_train = ae.load_data(ae.dataset_name)
noise = np.random.normal(0, 20./255., X_train.shape)

# Now, train model
#BEGIN INSERT CODE
number_steps = X_train.shape[0]//batch_size

for epoch in range(ae.epochs):
    np.random.shuffle(X_train)
    np.random.shuffle(noise)
    for i in range(number_steps):
        batch_x = X_train[i*batch_size:(i+1)*batch_size]
        batch_noise = noise[i*batch_size:(i+1)*batch_size]
    
        ae.ae.train_on_batch(batch_x + batch_noise, batch_x)
    
        if i%(2*ae.sample_interval) == 0 and epoch%(ae.sample_interval//5) == 0:
            ae.test_images(X_train[i*batch_size:i*batch_size+5] + noise[i*batch_size:i*batch_size+5], './images/part3_with_noise/epoch_{}_step_{}.jpg'.format(epoch, i*batch_size))
# END INSERT CODE

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_15 (InputLayer)        (None, 28, 28, 1)         0         
_________________________________________________________________
flatten_15 (Flatten)         (None, 784)               0         
_________________________________________________________________
dense_29 (Dense)             (None, 32)                25120     
_________________________________________________________________
dense_30 (Dense)             (None, 784)               25872     
_________________________________________________________________
reshape_15 (Reshape)         (None, 28, 28, 1)         0         
Total params: 50,992
Trainable params: 50,992
Non-trainable params: 0
_________________________________________________________________
