<b>Neural Nets: Autoencoder and Domain Adaptation</b>

In this lab you will learn how to reuse a pretrained neural net (a model trained on a large data set) as the initialization for a slightly different task. In this tutorial the pretrained net recognizes the digits 0-7, and we build a second net on top of it that recognizes the digits 8 and 9. Fine-tuning the pretrained weights can give better results than training a neural net from scratch, especially when little training data is available for the new task. This technique is known as domain adaptation. Furthermore, we will look at autoencoders and how they can be used for denoising data.

In [None]:
# imports
from keras.datasets import mnist
import matplotlib.pyplot as plt
import numpy as np

# nn
from keras import backend as K
from keras.models import Sequential, Model, Input
from keras.layers.core import Dense, Dropout, Flatten, Reshape, Activation
from keras.layers.convolutional import Conv2D, MaxPooling2D, UpSampling2D
from keras.utils import np_utils
from sklearn.preprocessing import LabelEncoder

# image manipulation
import cv2

%matplotlib inline

<b>1. Part: Domain Adaptation</b>

First of all, we load the data set (handwritten digit data set) and rescale all data points as in the tutorial before.

In [None]:
# load digit dataset with training and test images
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# rescale the data
x_train = x_train / 255.
x_test = x_test / 255.
# dimension
img_rows, img_cols = x_train[0].shape

In [None]:
# transform data set
if K.image_data_format() == 'channels_first':
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)
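
The reshape above only adds a channel axis of length 1; where that axis goes depends on the image data format of the backend. A minimal NumPy sketch on a dummy batch may make this clearer. The array `dummy_batch` and the helper `add_channel_axis` are made up for illustration and are not part of the lab code.

```python
import numpy as np

def add_channel_axis(images, data_format):
    """Add a single-channel axis to a batch of grayscale images.

    images: array of shape (n, rows, cols)
    data_format: 'channels_first' or 'channels_last' (the two values Keras uses)
    """
    n, rows, cols = images.shape
    if data_format == 'channels_first':
        return images.reshape(n, 1, rows, cols)
    return images.reshape(n, rows, cols, 1)

# five dummy 28x28 "images"
dummy_batch = np.zeros((5, 28, 28))
shape_first = add_channel_axis(dummy_batch, 'channels_first').shape
shape_last = add_channel_axis(dummy_batch, 'channels_last').shape
print(shape_first)  # (5, 1, 28, 28)
print(shape_last)   # (5, 28, 28, 1)
```

The pixel values are untouched in both cases; only the shape that the convolution layer expects differs.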

Now we create two slightly different tasks by dividing the data into two data sets. The first data set contains all images of the digits 0 to 7. The second data set contains the images of handwritten 8's and 9's. For the second data set we only use the first 10 training instances of each of the two classes, i.e. 20 training images in total.

In [None]:
# get all images from class 0-7 and 8/9
x_train_0to7 = x_train[np.logical_and(y_train != 8,y_train != 9)]
x_train_8and9 = x_train[np.logical_or(y_train == 8,y_train == 9)]
y_train_0to7 = y_train[np.logical_and(y_train != 8,y_train != 9)]
y_train_8and9 = y_train[np.logical_or(y_train == 8,y_train == 9)]

# only use small subset of training set (#num_instances training images per class)
num_instances = 10
idx_8 = np.where(y_train_8and9 == 8)[0]
idx_9 = np.where(y_train_8and9 == 9)[0]
x_train_8and9 = x_train_8and9[np.concatenate([idx_8[0:num_instances],idx_9[0:num_instances]]),:]
y_train_8and9 = y_train_8and9[np.concatenate([idx_8[0:num_instances],idx_9[0:num_instances]])]

# get all images for the test data
x_test_0to7 = x_test[np.logical_and(y_test != 8,y_test != 9)]
x_test_8and9 = x_test[np.logical_or(y_test == 8,y_test == 9)]
y_test_0to7 = y_test[np.logical_and(y_test != 8,y_test != 9)]
y_test_8and9 = y_test[np.logical_or(y_test == 8,y_test == 9)]
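
The split above is based on boolean masks over the label vector. The following sketch repeats the same steps on a tiny toy example so that the result can be checked by hand. The arrays `toy_labels` and `toy_images` are invented stand-ins for `y_train` and `x_train`, and `np.isin` is used here as a shorter alternative to the `np.logical_or` / `np.logical_and` calls.

```python
import numpy as np

# toy labels and "images" (one number per image) standing in for y_train / x_train
toy_labels = np.array([8, 1, 9, 8, 3, 9, 9, 8, 0, 7])
toy_images = np.arange(len(toy_labels))

# boolean mask: True where the label is 8 or 9
mask_8and9 = np.isin(toy_labels, [8, 9])
images_0to7 = toy_images[~mask_8and9]
images_8and9 = toy_images[mask_8and9]
labels_8and9 = toy_labels[mask_8and9]

# keep only the first `num_instances` examples of each class
num_instances = 2
idx_8 = np.where(labels_8and9 == 8)[0][:num_instances]
idx_9 = np.where(labels_8and9 == 9)[0][:num_instances]
keep = np.concatenate([idx_8, idx_9])

print(images_0to7)         # [1 4 8 9]
print(images_8and9[keep])  # [0 3 2 5]
print(labels_8and9[keep])  # [8 8 9 9]
```

Note that the selected subset is ordered by class (first all 8's, then all 9's), just like the small training set in the lab.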

<b>Exercise 1</b>

Write the python code to show the first 10 instances of the training data set for the problem of recognizing the digits from 0 to 7 and the first 10 instances for the second problem of recognizing the digits 8 and 9.

In [None]:
# YOUR CODE GOES HERE


The model to detect handwritten digits is defined as shown in last week's tutorial. Because we want to train a model that distinguishes between the digits 0 to 7, we only have 8 different classes. This model serves as the basis and will later be fine-tuned on the data set that contains only the two digits 8 and 9.

In [None]:
def getSimpleCNN(nb_classes=8):
    # domain adaptation neural net
    nb_filters_one = 32
    nb_filters_two = 64
    nb_conv = 3
    nb_pool = 2
    dense_size = 128
    cnnModel = Sequential()
    cnnModel.add(Conv2D(nb_filters_one, kernel_size=(nb_conv, nb_conv),
                     activation='relu',
                     input_shape=input_shape,name='conv'))
    #cnnModel.add(Conv2D(nb_filters_two, (nb_conv, nb_conv), activation='relu'))
    cnnModel.add(MaxPooling2D(pool_size=(nb_pool, nb_pool),name='max'))
    cnnModel.add(Dropout(0.25))
    cnnModel.add(Flatten())
    cnnModel.add(Dense(dense_size, activation='relu',name='dense'))
    cnnModel.add(Dropout(0.5))
    cnnModel.add(Dense(nb_classes, activation='softmax'))

    cnnModel.compile(loss='categorical_crossentropy', optimizer='SGD', metrics=['accuracy'])
    return cnnModel
cnnModel = getSimpleCNN()
cnnModel.summary()
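
The numbers printed by `summary()` can be reproduced by hand. The sketch below does the arithmetic for the layer sizes used in `getSimpleCNN`, assuming the Conv2D defaults of 'valid' padding and stride 1 and 28x28 single-channel input images; it is plain Python and does not call Keras.

```python
# layer sizes taken from getSimpleCNN; 'valid' padding and stride 1 are assumed
img_rows, img_cols, channels = 28, 28, 1
nb_filters, nb_conv, nb_pool = 32, 3, 2
dense_size, nb_classes = 128, 8

# convolution: one (nb_conv x nb_conv x channels) kernel plus one bias per filter
conv_params = (nb_conv * nb_conv * channels + 1) * nb_filters
conv_rows, conv_cols = img_rows - nb_conv + 1, img_cols - nb_conv + 1

# max pooling and dropout have no weights; pooling only shrinks the feature maps
pool_rows, pool_cols = conv_rows // nb_pool, conv_cols // nb_pool
flat_size = pool_rows * pool_cols * nb_filters

# dense layers: one weight per input/output pair plus one bias per output
dense_params = (flat_size + 1) * dense_size
output_params = (dense_size + 1) * nb_classes
total_params = conv_params + dense_params + output_params

print(conv_params, flat_size, dense_params, output_params, total_params)
# 320 5408 692352 1032 693704
```

Almost all parameters sit in the first dense layer, which is worth keeping in mind when that layer is later initialized with pretrained weights.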

Now we train our model for 10 epochs.

In [None]:
batch_size = 128
encoder = LabelEncoder().fit(y_train_0to7)
oneHotLabelTrain = np_utils.to_categorical(encoder.transform(y_train_0to7), len(np.unique(y_train_0to7)))
oneHotLabelTest  = np_utils.to_categorical(encoder.transform(y_test_0to7), len(np.unique(y_test_0to7)))
learnHistSimple = cnnModel.fit(x_train_0to7,oneHotLabelTrain,validation_data=(x_test_0to7,oneHotLabelTest),
                                  batch_size=batch_size,
                                  epochs=10)
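
The label preprocessing in this cell has two steps: the labels are first mapped to consecutive class indices and then turned into one-hot rows. This matters most for the second task, where the labels 8 and 9 have to become the class indices 0 and 1. A NumPy-only sketch of the same idea follows; the helper `one_hot` is made up here and is neither a Keras nor a scikit-learn function.

```python
import numpy as np

def one_hot(labels):
    """Map arbitrary integer labels to one-hot rows, classes in sorted order."""
    classes, indices = np.unique(labels, return_inverse=True)
    return classes, np.eye(len(classes))[indices]

classes, encoded_labels = one_hot(np.array([9, 8, 8, 9]))
print(classes)         # [8 9]
print(encoded_labels)
# [[0. 1.]
#  [1. 0.]
#  [1. 0.]
#  [0. 1.]]
```

Each row has exactly one entry equal to 1, which is the format the 'categorical_crossentropy' loss expects.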

<b>Exercise 2:</b>  
Write code to plot the learning curve. The learning curve should show the training and testing loss for the different epochs. Would you train the model for more epochs? Has the model converged? What conclusions can you draw from your learning curve?

<b>Answer:</b>

In [None]:
# YOUR CODE GOES HERE


Now we train a simple CNN on the small data set that only contains the digits 8 and 9. To do so, we first create the CNN:

In [None]:
cnnModelSmallDataset = getSimpleCNN(2)
cnnModelSmallDataset.summary()

Then we can train the model:

In [None]:
batch_size = 128
encoder = LabelEncoder().fit(y_train_8and9)
oneHotLabelTrain = np_utils.to_categorical(encoder.transform(y_train_8and9), len(np.unique(y_train_8and9)))
oneHotLabelTest  = np_utils.to_categorical(encoder.transform(y_test_8and9), len(np.unique(y_test_8and9)))
learnHistSimple = cnnModelSmallDataset.fit(x_train_8and9,oneHotLabelTrain,validation_data=(x_test_8and9,oneHotLabelTest),
                                  batch_size=batch_size,
                                  epochs=10)

<b>Using pretrained weights</b>

If we define a neural net using Keras, we can initialize the weights of its layers with the weights of a different neural network (using the argument 'weights' when a layer is created). To illustrate this, look at the example below, where 3 different models are created. The first 2 models are initialized at random, whereas the weights of the third model are the same as the weights of the first neural net.

In [None]:
net1 = Sequential()
net1.add(Dense(100,input_shape=[100,]))
net1.add(Dense(1))
net1.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
net1.summary()

In [None]:
net2 = Sequential()
net2.add(Dense(100,input_shape=[100,]))
net2.add(Dense(1))
net2.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
net2.summary()

In [None]:
net3 = Sequential()
net3.add(Dense(100,input_shape=[100,],weights=net1.layers[0].get_weights()))
net3.add(Dense(1,weights=net1.layers[1].get_weights()))
net3.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
net3.summary()
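
A dense layer without activation computes `x @ kernel + bias`, and for such a layer `get_weights()` returns this pair as `[kernel, bias]`. The NumPy sketch below mimics the idea of reusing weights on a toy layer. The helpers `init_dense` and `dense_forward` are illustrative only and not Keras functions, and the uniform random initialization is just a stand-in for whatever initializer Keras uses.

```python
import numpy as np

def init_dense(n_in, n_out, rng):
    """Random kernel and zero bias, standing in for a freshly created layer."""
    kernel = rng.uniform(-0.1, 0.1, size=(n_in, n_out))
    bias = np.zeros(n_out)
    return [kernel, bias]

def dense_forward(x, weights):
    """Linear dense layer: x @ kernel + bias."""
    kernel, bias = weights
    return x @ kernel + bias

rng = np.random.default_rng(0)
layer_a = init_dense(100, 1, rng)        # randomly initialized
layer_b = init_dense(100, 1, rng)        # a second, independent random init
layer_c = [w.copy() for w in layer_a]    # initialized with the weights of layer_a

x = rng.normal(size=(10, 100))
same = np.allclose(dense_forward(x, layer_a), dense_forward(x, layer_c))
different = not np.allclose(dense_forward(x, layer_a), dense_forward(x, layer_b))
print(same, different)  # True True
```

Copying the weights copies the function the layer computes; training afterwards changes the copy independently of the original.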

<b>Exercise 3:</b>  
Create 10 random test instances for the neural nets net1, net2 and net3 (a matrix with dimension 10 x 100). What is the outcome for the 3 different nets?

In [None]:
# YOUR CODE GOES HERE
testInstances = 
predict1 = 
predict2 = 
predict3 =

<b>Exercise 4:</b> 

Create a CNN like the function 'getSimpleCNN' does, but initialize the convolution layer and the first dense layer with the weights of the pretrained model 'cnnModel'. The max pooling layer has no weights of its own, so it is simply added as before.

In [None]:
nb_filters_one = 32
nb_filters_two = 64
nb_conv = 3
nb_pool = 2
dense_size = 128
nb_classes = 2

cnnModelDomainAdaptation = Sequential()
# YOUR CODE GOES HERE


cnnModelDomainAdaptation.compile(loss='categorical_crossentropy', optimizer='SGD', metrics=['accuracy'])
cnnModelDomainAdaptation.summary()

Now, we train the new model using the pretrained weights for initialization.

In [None]:
batch_size = 128
encoder = LabelEncoder().fit(y_train_8and9)
oneHotLabelTrain = np_utils.to_categorical(encoder.transform(y_train_8and9), len(np.unique(y_train_8and9)))
oneHotLabelTest  = np_utils.to_categorical(encoder.transform(y_test_8and9), len(np.unique(y_test_8and9)))
learnHistSimple = cnnModelDomainAdaptation.fit(x_train_8and9,oneHotLabelTrain,validation_data=(x_test_8and9,oneHotLabelTest),
                                  batch_size=batch_size,
                                  epochs=10)

<b>Exercise 5:</b>
Compare the two models trained above for recognizing the digits 8 and 9 (cnnModelSmallDataset vs. cnnModelDomainAdaptation) with each other. Which model would you prefer?

<b>Answer:</b>

<b>2. Part: Autoencoder</b>

In the second part of this tutorial we build an autoencoder to denoise images. To do so, we now use the functional API of Keras.

First of all we create a noisy data set.

<b>Exercise 6:</b>
Create a noisy dataset by adding noise to all the images from the training and testing data 'x_train' and 'x_test', respectively. Your dataset should then look like the one shown in the figure below.

<img src="files/noisy.png" width=600 height=600>

In [None]:
# YOUR CODE GOES HERE
#x_train_noisy = 
#x_test_noisy = 

Now, test your noising code by looking at the first 10 images.

In [None]:
# show images
n = 10
plt.figure(figsize=(20, 2))
for i in range(n):
    ax = plt.subplot(1, n, i+1)
    plt.imshow(x_test_noisy[i].reshape(28, 28))
    plt.gray()
    ax.get_xaxis().set_visible(False)
    ax.get_yaxis().set_visible(False)
plt.show()

Now, let's define the simplest autoencoder we can imagine. Here we try to encode each image with 32 floats. The input is the raw pixel values of an image as a flattened vector.

In [None]:
# this is the size of our encoded representations
encoding_dim = 32  # 32 floats -> compression of factor 24.5, assuming the input is 784 floats

# this is our input placeholder
input_img = Input(shape=(784,))
# "encoded" is the encoded representation of the input
encoded = Dense(encoding_dim, activation='relu')(input_img)
# "decoded" is the lossy reconstruction of the input
decoded = Dense(784, activation='sigmoid')(encoded)

# this model maps an input to its reconstruction
autoencoder = Model(input_img, decoded)
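
To see what the two dense layers compute, the forward pass can be written out in NumPy. This is only a sketch with untrained, randomly drawn stand-in weights (`w_enc`, `b_enc`, `w_dec`, `b_dec` are invented names); in the lab the weights are learned by Keras.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

input_dim, encoding_dim = 784, 32
print(input_dim / encoding_dim)  # 24.5

rng = np.random.default_rng(0)
# untrained stand-in weights for the encoder and decoder layers
w_enc, b_enc = rng.normal(scale=0.05, size=(input_dim, encoding_dim)), np.zeros(encoding_dim)
w_dec, b_dec = rng.normal(scale=0.05, size=(encoding_dim, input_dim)), np.zeros(input_dim)

batch = rng.uniform(size=(4, input_dim))        # four fake flattened images in [0, 1]
code = relu(batch @ w_enc + b_enc)              # encoder: 784 -> 32
reconstruction = sigmoid(code @ w_dec + b_dec)  # decoder: 32 -> 784, values in (0, 1)

print(code.shape, reconstruction.shape)  # (4, 32) (4, 784)
```

The sigmoid on the output keeps every reconstructed pixel between 0 and 1, which matches the rescaled input images.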

Extract the encoder:

In [None]:
# this model maps an input to its encoded representation
encoder = Model(input_img, encoded)

Extract the decoder:

In [None]:
# create a placeholder for an encoded (32-dimensional) input
encoded_input = Input(shape=(encoding_dim,))
# retrieve the last layer of the autoencoder model
decoder_layer = autoencoder.layers[-1]
# create the decoder model
decoder = Model(encoded_input, decoder_layer(encoded_input))

Now we compile and train the autoencoder on the flattened images:

In [None]:
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.summary()
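
The 'binary_crossentropy' loss compares target and reconstruction pixel by pixel and treats each value as a probability, which is possible because the images are rescaled to [0, 1] and the output layer uses a sigmoid. A small NumPy sketch of the formula is given below; the function and the clipping constant `eps` are written for illustration and are not the Keras implementation.

```python
import numpy as np

def binary_crossentropy(target, prediction, eps=1e-7):
    """Mean per-pixel binary cross-entropy; eps avoids log(0)."""
    prediction = np.clip(prediction, eps, 1.0 - eps)
    return float(np.mean(-(target * np.log(prediction)
                           + (1.0 - target) * np.log(1.0 - prediction))))

target = np.array([[0.0, 1.0, 1.0, 0.0]])
good = np.array([[0.1, 0.9, 0.8, 0.2]])   # close to the target
bad = np.array([[0.9, 0.1, 0.2, 0.8]])    # far from the target

loss_good = binary_crossentropy(target, good)
loss_bad = binary_crossentropy(target, bad)
print(loss_good < loss_bad)  # True
```

A reconstruction that is close to the target pixels gets a small loss, one that inverts them gets a large loss.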

In [None]:
autoEncoderHist = autoencoder.fit(x_train_noisy.reshape(x_train_noisy.shape[0],28*28),
                x_train_noisy.reshape(x_train_noisy.shape[0],28*28),
                epochs=10,
                batch_size=128,
                shuffle=True,
                validation_data=(x_test_noisy.reshape(x_test_noisy.shape[0],28*28),
                                 x_test_noisy.reshape(x_test_noisy.shape[0],28*28)))

In [None]:
# encode and decode some digits
encoded_imgs = encoder.predict(x_test_noisy.reshape(x_test_noisy.shape[0],28*28))
decoded_imgs = decoder.predict(encoded_imgs)

<b>Exercise 7:</b>
Plot the first 10 images from the noisy test set and their denoised prediction.

In [None]:
# YOUR CODE GOES HERE


Can we do better if we state the exact objective, i.e. train a model that maps a noisy input to its denoised version? To do so, the noisy images are used as input and the original images as target.

In [None]:
# this is the size of our encoded representations
encoding_dim = 32  # 32 floats -> compression of factor 24.5, assuming the input is 784 floats

# this is our input placeholder
input_img = Input(shape=(784,))
# "encoded" is the encoded representation of the input
encoded = Dense(encoding_dim, activation='relu')(input_img)
# "decoded" is the lossy reconstruction of the input
decoded = Dense(784, activation='sigmoid')(encoded)

# this model maps an input to its reconstruction
# (defined first so that the decoder below reuses the last layer of THIS
# autoencoder and not the one of the autoencoder trained before)
autoencoder = Model(input_img, decoded)

# this model maps an input to its encoded representation
encoder = Model(input_img, encoded)

# create a placeholder for an encoded (32-dimensional) input
encoded_input = Input(shape=(encoding_dim,))
# retrieve the last layer of the autoencoder model
decoder_layer = autoencoder.layers[-1]
# create the decoder model
decoder = Model(encoded_input, decoder_layer(encoded_input))

autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
autoencoder.summary()

autoEncoderHist = autoencoder.fit(x_train_noisy.reshape(x_train_noisy.shape[0],28*28),
                x_train.reshape(x_train.shape[0],28*28),
                epochs=10,
                batch_size=128,
                shuffle=True,
                validation_data=(x_test_noisy.reshape(x_test_noisy.shape[0],28*28),
                                 x_test.reshape(x_test.shape[0],28*28)))



# encode and decode some digits
encoded_imgs = encoder.predict(x_test_noisy.reshape(x_test_noisy.shape[0],28*28))
decoded_imgs = decoder.predict(encoded_imgs)

<b>Exercise 8:</b>

Add your code from Exercise 7 and look at the new results. Are they better now? If yes, why?

In [None]:
# YOUR CODE GOES HERE

<b>Additional Exercise:</b>

Try to build an autoencoder using CNNs.