## SDAE - Stacked Denoising Auto-Encoder

Weights in a neural network (for classification) are generally initialized randomly. This works fine when we have a lot of labelled data. But what if we have a lot of data and only a fraction of it is labelled? Google Images is a good example: there is a huge amount of data, but only a tiny fraction of it is actually labelled. Can we classify all the data by training a network only on that labelled fraction?

In this experiment we demonstrate that initializing the network with pretrained SDAE weights boosts classification performance. In a stacked denoising auto-encoder, every layer is trained individually to reconstruct its input from a corrupted copy of that input.
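
The corruption step can be illustrated in isolation. The sketch below is a minimal NumPy example, not the notebook's code: `mask_corrupt` is a hypothetical helper that mimics what a Keras `Dropout` layer does at training time (zero each entry with probability `rate` and rescale the survivors by `1 / (1 - rate)`). The array sizes are chosen only to match the 781 PCA features used here.

```python
import numpy as np


def mask_corrupt(x, rate, rng):
    """Zero each entry with probability `rate`; rescale the kept entries."""
    keep = rng.random(x.shape) >= rate   # True where the entry survives
    return x * keep / (1.0 - rate)       # same rescaling Dropout applies in training


rng = np.random.default_rng(0)
clean = rng.normal(size=(8, 781))        # stand-in for a batch of PCA features
noisy = mask_corrupt(clean, rate=0.5, rng=rng)

# A denoising auto-encoder is fit on (noisy -> clean) pairs:
# the network sees `noisy` and is penalized against `clean`.
print(noisy.shape, float((noisy == 0).mean()))
```

The target is always the clean input, so the layer cannot simply copy its input and has to learn features that survive the masking.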

In [1]:
# Import the necessary libraries/modules
import numpy as np # for array operations
from keras.models import Model, Sequential # for defining the architectures
from keras.layers import Dense, Dropout, Input # layers for building the network
from keras.utils import to_categorical # to_categorical does one-hot encoding

tmp = np.load('cifar_pca_train.npz')
data = tmp['data']
labels = tmp['labels']

train_data = data[:10000]      # We'll use only 10000 out of 50000 samples for classification
train_labels = labels[:10000]

tmp = np.load('cifar_pca_test.npz')
test_data = tmp['data']
test_labels = tmp['labels']

# Convert labels into one-hot vectors for training (one-hot encoding is the same as creating dummy variables)
train_labels = to_categorical(train_labels, 10) 
test_labels = to_categorical(test_labels, 10)

print(train_data.shape)
print(train_labels.shape)
print(test_data.shape)
print(test_labels.shape)

Using TensorFlow backend.


(10000, 781)
(10000, 10)
(10000, 781)
(10000, 10)


### Training an MLP for classification through randomly initialized weights
#### Note:  We'll use only 10000 out of 50000 samples for classification

In [3]:
# Train a simple MLP with two hidden layers for the classification task

mlp = Sequential()
mlp.add(Dropout(0.2, input_shape=(781,)))
mlp.add(Dense(1000, activation='sigmoid'))
mlp.add(Dropout(0.5))
mlp.add(Dense(800, activation='sigmoid'))
mlp.add(Dropout(0.5))
mlp.add(Dense(10, activation='softmax'))

nb_epoch = 20
batch_size = 32

mlp.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])


history = mlp.fit(train_data, train_labels,
                    batch_size=batch_size,
                    epochs=nb_epoch,
                    validation_data=(test_data, test_labels))

Train on 10000 samples, validate on 10000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


### Train a two layer stacked autoencoder
#### Note: We'll use all 50000 samples for training, since an autoencoder requires no labels.

The way we do this is:
1. Train an AE (autoencoder) with 1000 hidden nodes on the original 781 features. Its encoder yields 1000 encoded features.
2. Train another AE (autoencoder1) with 800 hidden nodes on those 1000 encoded features.

The weights of the encoding layers of these two networks can then be used to initialize the MLP for classification.
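
Why the transfer works comes down to shapes: a Keras `Dense` layer stores a kernel of shape `(inputs, units)` and a bias of shape `(units,)`, so the two encoders line up with the two hidden layers of the MLP. The sketch below traces that with plain NumPy. The weights are random stand-ins for the trained encoder parameters, and the names `W1`, `b1`, `W2`, `b2` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


rng = np.random.default_rng(0)

# Random stand-ins for the trained encoder parameters (kernel, bias).
W1, b1 = rng.normal(scale=0.01, size=(781, 1000)), np.zeros(1000)   # first AE encoder
W2, b2 = rng.normal(scale=0.01, size=(1000, 800)), np.zeros(800)    # second AE encoder

x = rng.normal(size=(4, 781))      # a small batch of 781-dimensional inputs
h1 = sigmoid(x @ W1 + b1)          # 781 -> 1000: what the second AE is trained on
h2 = sigmoid(h1 @ W2 + b2)         # 1000 -> 800: input to the softmax layer of the MLP

print(h1.shape, h2.shape)
```

Because the second auto-encoder only ever sees the sigmoid outputs of the first, its inputs lie strictly between 0 and 1.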

In [4]:
# Training an autoencoder model on cifar-10 PCA reduced data

input_img = Input(shape=(781,))
crrpt_img = Dropout(0.5)(input_img)  # corrupt the input; Dropout serves as the masking noise
encoded = Dense(1000, activation='sigmoid')(crrpt_img)
decoded = Dense(781, activation='linear')(encoded)

autoencoder = Model(input_img,decoded)

nb_epoch = 20
batch_size = 32

autoencoder.compile(optimizer='adam',
                    loss='mean_squared_error')

history = autoencoder.fit(data, data,
                    epochs=nb_epoch,
                    batch_size=batch_size,
                    shuffle=True,
                    validation_data=(test_data, test_data))


autoencoder.save('SDAE_l1_model.h5')

Train on 50000 samples, validate on 10000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [5]:
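# Reuse the trained first layer as an encoder and compute the
# 1000-dimensional features that the second autoencoder is trained on
encoder = Model(input_img,encoded)
htrain_data = encoder.predict(data)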

In [6]:
input_img1 = Input(shape=(1000,))
crrpt_img1 = Dropout(0.5)(input_img1)
encoded1 = Dense(800, activation='sigmoid')(crrpt_img1)
decoded1 = Dense(1000, activation='sigmoid')(encoded1)

autoencoder1 = Model(input_img1,decoded1)

autoencoder1.compile(optimizer='adam',
                    loss='binary_crossentropy')

history = autoencoder1.fit(htrain_data, htrain_data,
                    epochs=nb_epoch,
                    batch_size=batch_size,
                    shuffle=True)

autoencoder1.save('SDAE_l2_model.h5')

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


In [8]:
history.history.keys()
history.history['loss']

[0.50201290034294133,
 0.48248770041465761,
 0.48005497880935671,
 0.47914626605987548,
 0.47867883778572085,
 0.47839034770011901,
 0.47817189907073976,
 0.47795296325683595,
 0.47778135187149046,
 0.47768380585670472,
 0.47758046679496763,
 0.47743361221313474,
 0.47738703725814818,
 0.47719481215476989,
 0.47712607245445249,
 0.47705453159332273,
 0.47699410514831542,
 0.47690007112503052,
 0.47684359521865843,
 0.47676485745429992]

### Training an MLP for classification through SDAE-pretrained weights

In [9]:
# Training with unsupervised initialization for both hidden layers
# of an MLP using the autoencoder weights

mlp = Sequential()
mlp.add(Dropout(0.2, input_shape=(781,)))
mlp.add(Dense(1000, activation='sigmoid'))
mlp.add(Dropout(0.5))
mlp.add(Dense(800, activation='sigmoid'))
mlp.add(Dropout(0.5))
mlp.add(Dense(10, activation='softmax'))

mlp.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

In [11]:
mlp.layers

[<keras.layers.core.Dropout at 0x1433f5d0>,
 <keras.layers.core.Dense at 0x12207b90>,
 <keras.layers.core.Dropout at 0x12207910>,
 <keras.layers.core.Dense at 0x1433ff50>,
 <keras.layers.core.Dropout at 0x1433f590>,
 <keras.layers.core.Dense at 0x14060bd0>]

In [12]:
mlp.layers[1].set_weights(autoencoder.layers[2].get_weights())   # Import the encoder weights of autoencoder  
mlp.layers[3].set_weights(autoencoder1.layers[2].get_weights())  # Import the encoder weights of autoencoder1

history = mlp.fit(train_data, train_labels,
                    batch_size=batch_size,
                    epochs=nb_epoch,
                    validation_data=(test_data, test_labels))

Train on 10000 samples, validate on 10000 samples
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
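
To compare the randomly initialized run with the SDAE-initialized run, the per-epoch validation accuracy of each can be summarized. The sketch below is one possible way to do that, under stated assumptions: `best_val_accuracy` is a hypothetical helper (not part of Keras), and each run's `history.history` dict is assumed to have been saved under its own name, since the notebook reuses the variable `history`. The metric key is `val_acc` in older Keras releases and `val_accuracy` in newer ones, so both are checked. The numbers are placeholders for illustration only, not results of this experiment.

```python
def best_val_accuracy(history_dict):
    """Return (epoch, accuracy) for the best validation accuracy in a Keras-style history dict."""
    for key in ("val_accuracy", "val_acc"):   # newer / older Keras metric names
        if key in history_dict:
            values = history_dict[key]
            best = max(range(len(values)), key=lambda i: values[i])
            return best + 1, values[best]     # epochs are reported 1-based
    raise KeyError("no validation accuracy found in history")


# Placeholder values only; substitute the dicts saved after each fit() call.
random_init_history = {"val_acc": [0.30, 0.35, 0.33]}
pretrained_history = {"val_accuracy": [0.32, 0.36, 0.40]}

for name, hist in [("random init", random_init_history),
                   ("SDAE init", pretrained_history)]:
    epoch, acc = best_val_accuracy(hist)
    print(f"{name}: best validation accuracy {acc:.2f} at epoch {epoch}")
```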
