In [0]:
import numpy as np
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dropout, Flatten, Dense, Activation
from keras import applications
from keras import optimizers
from keras.layers.normalization import BatchNormalization

train_data_dir = 'dataset/train1'
validation_data_dir = 'dataset/validation1'

Using TensorFlow backend.


In [0]:
nb_train_samples = 1000
nb_validation_samples = 200
epochs = 50
batch_size = 20

def features():
    img_width, img_height = 224, 224
    
    datagen = ImageDataGenerator(rescale=1./255)

    model = applications.VGG16(weights='imagenet', include_top=False, input_shape=(img_width, img_height, 3))
    print('Model loaded.')

    train_generator = datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)
    
    features_train = model.predict_generator(
        train_generator, nb_train_samples//batch_size)
    np.save(open('features_train.npy', 'wb'),
            features_train)

    val_generator = datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)
    
    features_validation = model.predict_generator(
        val_generator, nb_validation_samples//batch_size)
    np.save(open('features_validation.npy', 'wb'),
            features_validation)

Part of the rationale behind this experiment was to see how well the pre-trained CNN performs on a raw, unprocessed dataset from a completely different domain than the one it was trained on, so no data augmentation was applied beyond minimal preprocessing. The images were only rescaled (pixel values multiplied by 1/255) and resized to 224 x 224, which is the default input size for the VGG16 model.

The strategy followed for this problem was to instantiate only the convolutional part of the VGG16 model, i.e. everything up to, but not including, the fully connected layers. The training and validation data were each run through this network once, and the output activations of the last layer before the fully connected layers (the "bottleneck features") were saved in two NumPy arrays.
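As a rough sanity check on what gets stored, the shape of the saved feature arrays can be worked out by hand. The sketch below assumes the standard VGG16 layout (size-preserving convolutions, five 2x2 max-pooling stages, and 512 channels in the last convolutional block); the helper name `vgg16_feature_shape` is made up for illustration and is not part of Keras.

```python
def vgg16_feature_shape(img_width, img_height, n_pool=5, channels=512):
    """Output shape of the VGG16 convolutional base for one image.

    Assumes size-preserving convolutions and five 2x2 max-pooling
    stages, each halving the width and the height.
    """
    w, h = img_width, img_height
    for _ in range(n_pool):
        w, h = w // 2, h // 2
    return (w, h, channels)


nb_train_samples, batch_size = 1000, 20   # same values as in the notebook
steps = nb_train_samples // batch_size    # batches drawn by predict_generator
shape = vgg16_feature_shape(224, 224)

print(steps)                              # 50
print((steps * batch_size,) + shape)      # (1000, 7, 7, 512)
```

Under these assumptions, each image is reduced to a 7 x 7 x 512 block of activations, so the saved training array holds 1000 such blocks.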

In [0]:
sgd = optimizers.SGD(lr=0.001, momentum=0.7)

def train_model():
    train_data = np.load(open('features_train.npy', 'rb'))
    # The features were extracted with shuffle=False, so they are stored in
    # class order; this assumes both classes have the same number of images.
    train_labels = np.array(
        [0] * (nb_train_samples // 2) + [1] * (nb_train_samples // 2))

    validation_data = np.load(open('features_validation.npy', 'rb'))
    validation_labels = np.array(
        [0] * (nb_validation_samples // 2) + [1] * (nb_validation_samples // 2))

    model = Sequential()
    model.add(Flatten(input_shape=train_data.shape[1:]))
    model.add(Dense(256))
    model.add(Activation('relu'))
    model.add(Dropout(0.25))
    model.add(Dense(1))
    model.add(Activation('sigmoid'))

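    # Pass the SGD instance configured above; the string 'sgd' would use
    # Keras' default SGD settings and ignore lr and momentum.
    model.compile(optimizer=sgd,
                  loss='binary_crossentropy', metrics=['accuracy'])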
    
    model.fit(train_data, train_labels,
              epochs=epochs,
              batch_size=batch_size,
              validation_data=(validation_data, validation_labels))
    model.save_weights('bottleneck_fc_model.h5')

A small fully connected classifier was then trained on top of the pre-trained model. The features saved after running the VGG16 convolutional base were used as its inputs. The weights were then saved in 'bottleneck_fc_model.h5' so that they could be used later for fine-tuning.
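To give a sense of the size of this classifier, the sketch below counts its trainable parameters by hand and builds the labels the same way `train_model` does. It assumes the bottleneck features have shape (7, 7, 512) per image, which is what the VGG16 convolutional base yields for 224 x 224 inputs; `dense_params` is a hypothetical helper, not a Keras function.

```python
import numpy as np


def dense_params(n_in, n_out):
    """Number of weights plus biases in a fully connected layer."""
    return n_in * n_out + n_out


flat = 7 * 7 * 512                     # size of the Flatten() output: 25088
total = dense_params(flat, 256) + dense_params(256, 1)
print(total)                           # 6423041

# The labels rely on the features being stored in class order
# (shuffle=False) with the same number of images in each class.
nb_train_samples = 1000
labels = np.array([0] * (nb_train_samples // 2) + [1] * (nb_train_samples // 2))
print(labels.shape, labels[:3], labels[-3:])
```

Almost all of the parameters sit in the first dense layer, which is one reason a small dataset like this one can overfit quickly.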

Dropout is implemented by keeping a neuron active with some probability p (a hyperparameter) and setting it to zero otherwise. It can be interpreted as sampling a smaller network within the full neural network and updating only the parameters of the sampled network for the given input data. Note that the argument of the Keras Dropout layer is the fraction of units to drop, so Dropout(0.25) zeroes each unit with probability 0.25, i.e. keeps it with p = 0.75. A drop rate of 0.25 is a reasonable value, so it was used in the model. The loss was binary cross-entropy, since the classification is between two classes, and stochastic gradient descent with a tunable learning rate and momentum was chosen as the optimizer.
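Keras handles dropout, the loss, and the optimizer update internally. The NumPy sketch below is only meant to illustrate the three ideas under simple assumptions (inverted dropout, mean binary cross-entropy, and the classic momentum update); the function names are made up for this example.

```python
import numpy as np


def dropout(x, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    mask = rng.random(x.shape) < keep
    return x * mask / keep


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy for predicted probabilities in (0, 1)."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    losses = y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred)
    return float(-np.mean(losses))


def sgd_momentum_step(w, grad, velocity, lr=0.001, momentum=0.7):
    """One update: v = momentum * v - lr * grad, then w = w + v."""
    velocity = momentum * velocity - lr * grad
    return w + velocity, velocity


rng = np.random.default_rng(0)
dropped = dropout(np.ones(10000), rate=0.25, rng=rng)
print((dropped == 0).mean())   # close to 0.25
print(dropped.mean())          # close to 1.0, thanks to the rescaling

# A classifier that always answers 0.5 has a loss of ln(2), about 0.693.
print(binary_crossentropy(np.array([0.0, 1.0]), np.array([0.5, 0.5])))
```

The rescaling by 1 / (1 - rate) keeps the expected activation unchanged, so nothing needs to be adjusted at prediction time.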

In [0]:
features()
train_model()

Model loaded.
Found 1000 images belonging to 2 classes.
Found 200 images belonging to 2 classes.
Train on 1000 samples, validate on 200 samples
Epoch 1/50
...
Epoch 50/50


Not bad! Using just the bottleneck features, the model reached a validation accuracy of 79% within 50 epochs. (In this case, training for more epochs made the model prone to overfitting, so early stopping was used.) Of course, this performance is likely to improve if the model is fine-tuned after this point.
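The training cell runs for a fixed 50 epochs and does not show how early stopping was applied. One common rule is to stop once the validation loss has not improved for a few consecutive epochs (Keras provides an `EarlyStopping` callback for this purpose). The plain-Python sketch below only illustrates that rule; the function name, the `patience` value, and the loss values are all made up for illustration and are not results from this experiment.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop.

    Training stops once the validation loss has failed to improve on its
    best value for `patience` consecutive epochs. If that never happens,
    the last epoch is returned.
    """
    best = float('inf')
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses)


# Hypothetical validation losses: improving at first, then overfitting.
losses = [0.69, 0.60, 0.55, 0.52, 0.53, 0.54, 0.56, 0.58]
print(early_stopping_epoch(losses, patience=3))  # 7
```

A larger `patience` tolerates noisier validation curves at the cost of a few extra epochs of training.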
