
Trains a denoising autoencoder on the MNIST dataset.<br>
Denoising is one of the classic applications of autoencoders.<br>
The denoising process removes unwanted noise that corrupted the<br>
true data.<br>
Noise + Data ---> Denoising Autoencoder ---> Data<br>
Given a training dataset of corrupted data as input and<br>
true data as output, a denoising autoencoder can recover the<br>
hidden structure to generate clean data.<br>
This example has a modular design. The encoder, decoder and autoencoder<br>
are 3 models that share weights. For example, after training the<br>
autoencoder, the encoder can be used to generate latent vectors<br>
of input data for low-dimensional visualization with methods such as PCA or t-SNE.<br>
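
As a rough illustration of the visualization idea above, the sketch below projects latent vectors onto two principal components using only NumPy. The helper `pca_project` and the random `latent_vectors` array are hypothetical stand-ins; in practice the array would come from something like `encoder.predict(x_test)` after training.

```python
import numpy as np

def pca_project(latent_vectors, n_components=2):
    """Project row vectors onto their top principal components."""
    centered = latent_vectors - latent_vectors.mean(axis=0)
    # Rows of vt are the principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

# Stand-in for encoder output (assumption): 100 random 16-dim latent vectors.
rng = np.random.default_rng(0)
latent_vectors = rng.normal(size=(100, 16))
points_2d = pca_project(latent_vectors)
print(points_2d.shape)  # (100, 2)
```

The resulting 2-D points could then be passed to a scatter plot, colored by digit label.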


In [None]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

In [None]:
from keras.layers import Dense, Input
from keras.layers import Conv2D, Flatten
from keras.layers import Reshape, Conv2DTranspose
from keras.models import Model
from keras import backend as K
from keras.datasets import mnist
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

In [None]:
np.random.seed(1337)

load MNIST dataset

In [None]:
(x_train, _), (x_test, _) = mnist.load_data()

reshape to (28, 28, 1) and normalize input images

In [None]:
image_size = x_train.shape[1]
x_train = np.reshape(x_train, [-1, image_size, image_size, 1])
x_test = np.reshape(x_test, [-1, image_size, image_size, 1])
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255

generate corrupted MNIST images by adding noise drawn from a normal distribution<br>
with mean 0.5 and standard deviation 0.5

In [None]:
noise = np.random.normal(loc=0.5, scale=0.5, size=x_train.shape)
x_train_noisy = x_train + noise
noise = np.random.normal(loc=0.5, scale=0.5, size=x_test.shape)
x_test_noisy = x_test + noise

adding noise can push pixel values outside the normalized range [0.0, 1.0]<br>
clip pixel values >1.0 to 1.0 and <0.0 to 0.0

In [None]:
x_train_noisy = np.clip(x_train_noisy, 0., 1.)
x_test_noisy = np.clip(x_test_noisy, 0., 1.)
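
The effect of the corruption and clipping steps can be seen on a toy array. This is a minimal sketch with a made-up four-pixel "image" (`clean` is a hypothetical stand-in for a row of MNIST pixels); the noise parameters match the ones used above.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy "image" with normalized pixel values in [0, 1].
clean = np.array([0.0, 0.25, 0.5, 1.0], dtype='float32')
# Same noise parameters as above: mean 0.5, standard deviation 0.5.
noisy = clean + rng.normal(loc=0.5, scale=0.5, size=clean.shape)
clipped = np.clip(noisy, 0., 1.)
print(noisy)    # may contain values below 0.0 or above 1.0
print(clipped)  # every value now lies in [0.0, 1.0]
```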

network parameters

In [None]:
input_shape = (image_size, image_size, 1)
batch_size = 32
kernel_size = 3
latent_dim = 16
# encoder/decoder number of CNN layers and filters per layer
layer_filters = [32, 64]

build the autoencoder model<br>
first build the encoder model

In [None]:
inputs = Input(shape=input_shape, name='encoder_input')
x = inputs

stack of Conv2D(32)-Conv2D(64)

In [None]:
for filters in layer_filters:
    x = Conv2D(filters=filters,
               kernel_size=kernel_size,
               strides=2,
               activation='relu',
               padding='same')(x)

shape info needed to build the decoder model, so we don't have to compute it by hand<br>
the input to the decoder's first Conv2DTranspose will have this shape<br>
K.int_shape returns (None, 7, 7, 64), where None is the batch dimension; the decoder processes (7, 7, 64) back to (28, 28, 1)

In [None]:
shape = K.int_shape(x)
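
The (7, 7, 64) shape can also be checked by hand. With `padding='same'`, a strided convolution produces an output of size ceil(input / stride), so two stride-2 layers take 28 to 14 to 7. The helper name `conv_same_output_size` below is hypothetical and only serves this arithmetic check.

```python
import math

def conv_same_output_size(size, stride):
    """Spatial output size of a convolution with padding='same'."""
    return math.ceil(size / stride)

size = 28
for filters in [32, 64]:
    size = conv_same_output_size(size, stride=2)
    print(filters, size)  # 32 14, then 64 7

# Number of values per image after Flatten().
flattened = size * size * 64
print(flattened)  # 3136
```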

generate the latent vector

In [None]:
x = Flatten()(x)
latent = Dense(latent_dim, name='latent_vector')(x)

instantiate encoder model

In [None]:
encoder = Model(inputs, latent, name='encoder')
encoder.summary()
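
The parameter counts reported by `encoder.summary()` can be cross-checked with a short calculation. This sketch assumes each layer uses a bias term (the Keras default) and the layer sizes defined above; the helper functions are hypothetical names for illustration.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights plus biases of a square-kernel Conv2D layer."""
    return kernel_size * kernel_size * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus biases of a Dense layer."""
    return inputs * units + units

conv1 = conv2d_params(3, 1, 32)        # 320
conv2 = conv2d_params(3, 32, 64)       # 18496
dense = dense_params(7 * 7 * 64, 16)   # 50192
total = conv1 + conv2 + dense
print(total)  # 69008
```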

build the decoder model

In [None]:
latent_inputs = Input(shape=(latent_dim,), name='decoder_input')
# use the shape (7, 7, 64) that was saved earlier
x = Dense(shape[1] * shape[2] * shape[3])(latent_inputs)
# from vector to suitable shape for transposed conv
x = Reshape((shape[1], shape[2], shape[3]))(x)

stack of Conv2DTranspose(64)-Conv2DTranspose(32)

In [None]:
for filters in layer_filters[::-1]:
    x = Conv2DTranspose(filters=filters,
                        kernel_size=kernel_size,
                        strides=2,
                        activation='relu',
                        padding='same')(x)

reconstruct the denoised input

In [None]:
outputs = Conv2DTranspose(filters=1,
                          kernel_size=kernel_size,
                          padding='same',
                          activation='sigmoid',
                          name='decoder_output')(x)
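
The decoder mirrors the encoder's size arithmetic. With `padding='same'` and no explicit output padding, a transposed convolution multiplies the spatial size by its stride, so 7 becomes 14 and then 28, and the final stride-1 layer keeps 28. A small check, using a hypothetical helper name:

```python
def conv_transpose_same_output_size(size, stride):
    """Spatial output size of a transposed convolution with padding='same'."""
    return size * stride

size = 7
for filters in [64, 32]:
    size = conv_transpose_same_output_size(size, stride=2)
    print(filters, size)  # 64 14, then 32 28

# The output layer uses the default stride of 1, so the size is unchanged.
final_size = conv_transpose_same_output_size(size, stride=1)
print(final_size)  # 28
```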

instantiate decoder model

In [None]:
decoder = Model(latent_inputs, outputs, name='decoder')
decoder.summary()

autoencoder = encoder + decoder<br>
instantiate autoencoder model

In [None]:
autoencoder = Model(inputs, decoder(encoder(inputs)), name='autoencoder')
autoencoder.summary()

Mean Squared Error (MSE) loss function, Adam optimizer

In [None]:
autoencoder.compile(loss='mse', optimizer='adam')
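
For intuition, MSE is the average of the squared pixel-wise differences between the clean target and the reconstruction. The sketch below computes it with NumPy on two made-up four-pixel arrays; it illustrates the formula and is not the Keras implementation itself.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean of squared element-wise differences."""
    return np.mean(np.square(y_true - y_pred))

# Hypothetical clean target and reconstruction.
clean = np.array([0.0, 0.5, 1.0, 1.0])
denoised = np.array([0.1, 0.5, 0.8, 1.0])
loss = mse(clean, denoised)
print(loss)  # approximately 0.0125
```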

train the autoencoder

In [None]:
autoencoder.fit(x_train_noisy,
                x_train,
                validation_data=(x_test_noisy, x_test),
                epochs=10,
                batch_size=batch_size)

predict the autoencoder output from corrupted test images

In [None]:
x_decoded = autoencoder.predict(x_test_noisy)

3 sets of images, each showing 9 MNIST digits<br>
1st row of each set - original images<br>
2nd row of each set - images corrupted by noise<br>
3rd row of each set - denoised images

In [None]:
rows, cols = 3, 9
num = rows * cols
imgs = np.concatenate([x_test[:num], x_test_noisy[:num], x_decoded[:num]])
imgs = imgs.reshape((rows * 3, cols, image_size, image_size))
imgs = np.vstack(np.split(imgs, rows, axis=1))
imgs = imgs.reshape((rows * 3, -1, image_size, image_size))
imgs = np.vstack([np.hstack(i) for i in imgs])
imgs = (imgs * 255).astype(np.uint8)
plt.figure()
plt.axis('off')
plt.title('Original images: top rows, '
          'Corrupted Input: middle rows, '
          'Denoised Input: bottom rows')
plt.imshow(imgs, interpolation='none', cmap='gray')
Image.fromarray(imgs).save('corrupted_and_denoised.png')
plt.show()
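
The reshape/split/stack sequence above is easier to follow on labelled dummy data. In this sketch the three image groups are replaced by constant arrays (0.0, 0.5 and 1.0, all hypothetical stand-ins), so the row order of the final grid can be read off directly.

```python
import numpy as np

rows, cols, image_size = 3, 9, 28
num = rows * cols
# Stand-ins for x_test, x_test_noisy and x_decoded: constant images
# filled with 0.0, 0.5 and 1.0 so each group is easy to identify.
original = np.full((num, image_size, image_size, 1), 0.0)
noisy = np.full((num, image_size, image_size, 1), 0.5)
denoised = np.full((num, image_size, image_size, 1), 1.0)

imgs = np.concatenate([original, noisy, denoised])
imgs = imgs.reshape((rows * 3, cols, image_size, image_size))
imgs = np.vstack(np.split(imgs, rows, axis=1))
imgs = imgs.reshape((rows * 3, -1, image_size, image_size))
imgs = np.vstack([np.hstack(i) for i in imgs])
print(imgs.shape)  # (252, 252)

# First pixel of each of the 9 tile rows: the pattern
# 0.0, 0.5, 1.0 repeats three times (original, noisy, denoised per set).
row_labels = imgs[::image_size, 0]
print(row_labels)
```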