<h1>Binary VGG16 Convnet 3-6-2016</h1>

<strong>Abstract</strong> 
Trained a VGG16-style CNN on an equal number of images of two people (George_W_Bush and Colin_Powell) for 100 epochs, reaching 84% accuracy and a loss ("test score") of 0.63 on the held-out test set. This demonstrates that a CNN can learn to differentiate between two people from equally sized image sets fairly easily, without any tuning.

<strong>Details</strong>
By the final epoch of training, the CNN reaches 99.7% accuracy on the training set and around 72% on the validation set. Training took about 30 minutes on a GeForce GT 650M (1 GB). Images were resized to 100x100 pixels. The model and weights have been persisted in the models folder.

<strong>Takeaways</strong>
<ul>
    <li>Originally used a binary CNN with a single output of 0 or 1, because this is essentially a binary classification problem. It did not learn, so we shifted to a categorical CNN with 2 classes: the person we're identifying, or not that person.</li>
    <li>The original CNN wasn't deep enough; implementing the well-known VGG16 architecture performed much better.</li>
    <li>Initially configured too few epochs, running around 10 for a while; only at 100 epochs did the CNN actually learn.</li>
    <li>Precision and recall become less important as metrics when doing multi-class classification.</li>
    <li>Running Theano on the GPU with cuDNN and CNMeM improves the run time by a factor of ~200x compared with running on the CPU.</li>
    <li>Early stopping is important for identifying whether the CNN will be able to learn the problem at all.</li>
</ul>
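
The early-stopping idea from the last takeaway can be sketched in plain Python. This is a minimal illustration of patience-based stopping on a monitored loss, not the Keras implementation; the function name `stopping_epoch` and the sample loss values are hypothetical.

```python
def stopping_epoch(losses, patience):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the monitored loss has failed to improve on its
    best value for `patience` consecutive epochs.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None


# Hypothetical loss curves: one that plateaus and one that keeps improving.
stuck = [0.70, 0.69, 0.69, 0.69, 0.69, 0.69]
learning = [0.70, 0.65, 0.60, 0.52, 0.45, 0.40]

print(stopping_epoch(stuck, patience=3))     # 5
print(stopping_epoch(learning, patience=3))  # None
```

A run that stops after only a few epochs with a flat loss is a cheap signal that the network is not learning the problem as configured.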

<strong>Recommendations</strong>
Try to identify one face among many other faces, and treat that as a binary classification problem as well.
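
One way to frame that recommendation is to label the target person as 1 and everyone else as 0, then balance the two groups. The sketch below assumes a hypothetical mapping from person name to image paths; `build_one_vs_rest` and the sample file names are illustrative and not part of this notebook.

```python
import random


def build_one_vs_rest(files_by_person, target, seed=0):
    """Return a shuffled list of (path, label) pairs with balanced classes.

    label is 1 for the target person and 0 for anyone else. Negatives are
    sampled at random so both classes end up the same size.
    """
    rng = random.Random(seed)
    positives = list(files_by_person[target])
    negatives = [path
                 for person, paths in sorted(files_by_person.items())
                 if person != target
                 for path in paths]
    size = min(len(positives), len(negatives))
    pairs = ([(p, 1) for p in rng.sample(positives, size)] +
             [(p, 0) for p in rng.sample(negatives, size)])
    rng.shuffle(pairs)
    return pairs


# Hypothetical file listing, for illustration only.
files_by_person = {
    "George_W_Bush": ["bush_1.jpg", "bush_2.jpg", "bush_3.jpg"],
    "Colin_Powell": ["powell_1.jpg", "powell_2.jpg"],
    "Tony_Blair": ["blair_1.jpg", "blair_2.jpg"],
}
pairs = build_one_vs_rest(files_by_person, "George_W_Bush")
print(len(pairs))  # 6: three positives and three negatives
```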

In [1]:
%load_ext autoreload

In [2]:
%autoreload 2
%matplotlib inline

import os
import fnmatch
import numpy as np
from matplotlib.pyplot import imshow 
from PIL import Image

from skimage import io
from skimage.transform import resize
from sklearn.metrics import confusion_matrix

from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.utils import np_utils
from keras.callbacks import EarlyStopping
from keras.models import model_from_json

np.random.seed(123456)

Using Theano backend.
Using gpu device 0: GeForce GT 650M (CNMeM is enabled with initial size: 40.0% of memory, CuDNN 4007)


In [3]:
data_path = '../data/'
data_lfw_path = data_path + 'lfw_cropped/'

class1 = 'George_W_Bush'
class2 = 'Colin_Powell'

batch_size = 32
nb_epoch = 100
img_rows, img_cols = 100, 100
train_size_percent = .85
validation_split = .15
random_discard_percent = 0

In [4]:
def get_filenames_separated_from_target(class1, class2):
    """Collect the .jpg paths for the two people, one list per class."""
    class1_files = []
    class2_files = []

    for root, dirnames, filenames in os.walk(data_lfw_path):
        for dirname in dirnames:
            # Join against the directory currently being walked rather than
            # the top-level path, so nested folders resolve correctly.
            person_dir = os.path.join(root, dirname)
            for filename in os.listdir(person_dir):
                if filename.endswith(".jpg"):
                    f = os.path.join(person_dir, filename)
                    if dirname == class1:
                        class1_files.append(f)
                    elif dirname == class2:
                        class2_files.append(f)
    return class1_files, class2_files

In [5]:
def get_train_and_test_sets(class1_data, class2_data):
    
    size = min(len(class1_data), len(class2_data))
    
    all_data = [(t, 1) for t in class1_data[:size]] + [(t, 0) for t in class2_data[:size]]

    np.random.shuffle(all_data)
    
    train_size = int(train_size_percent * len(all_data))
    X_train = np.array([x[0] for x in all_data[:train_size]])
    y_train = np.array([x[1] for x in all_data[:train_size]])
    X_test = np.array([x[0] for x in all_data[train_size:]])  
    y_test = np.array([x[1] for x in all_data[train_size:]])
      
    return (X_train, y_train), (X_test, y_test)
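
The split sizes can be checked with a little arithmetic. The sketch below assumes 472 images in total (the 401 training and 71 test samples implied by the outputs in this notebook) and assumes Keras holds out the last `validation_split` fraction of the training data, truncating to an integer. `split_sizes` is a hypothetical helper.

```python
def split_sizes(n_images, train_size_percent, validation_split):
    """Return (fit, validation, test) sample counts for the given fractions."""
    n_train = int(train_size_percent * n_images)
    n_test = n_images - n_train
    # Assumed Keras behaviour: the first (1 - validation_split) share of the
    # training data is fitted on, and the remainder is used for validation.
    n_fit = int(n_train * (1.0 - validation_split))
    n_val = n_train - n_fit
    return n_fit, n_val, n_test


print(split_sizes(472, 0.85, 0.15))  # (340, 61, 71)
```

Under these assumptions the result agrees with the "Train on 340 samples, validate on 61 samples" log line and with the 71 samples counted in the confusion matrix.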

In [6]:
def image_read(f):
    return resize(io.imread(f), (img_rows, img_cols))

In [7]:
def display_image(m):
    imshow(Image.fromarray(np.uint8(m * 255)))

In [8]:
class1_files, class2_files = get_filenames_separated_from_target(class1, class2)

In [9]:
class1_images = [image_read(f) for f in class1_files]
class2_images = [image_read(f) for f in class2_files]

In [10]:
(X_train, y_train), (X_test, y_test) = get_train_and_test_sets(class1_images, class2_images)

In [11]:
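# Images load as (N, rows, cols, 3). transpose moves the channel axis to the
# front; reshape would only reinterpret the buffer and scramble the pixels.
X_train = X_train.transpose(0, 3, 1, 2)
X_test = X_test.transpose(0, 3, 1, 2)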
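
When converting a channels-last image batch to the channels-first layout, `reshape` and `transpose` produce arrays of the same shape but with different contents. A small NumPy sketch with a made-up 2x2 image shows the difference:

```python
import numpy as np

# One tiny 2x2 "image" in channels-last layout: shape (N, rows, cols, channels).
# Every pixel is (R, G, B) = (1, 2, 3), so each channel plane should be constant.
images = np.tile(np.array([1, 2, 3]), (1, 2, 2, 1))

reshaped = images.reshape(1, 3, 2, 2)      # reinterprets the flat buffer
transposed = images.transpose(0, 3, 1, 2)  # moves the channel axis to position 1

print(transposed[0, 0])  # red plane: all ones
print(reshaped[0, 0])    # mixes values from all three channels
```

Only the transposed array keeps each colour channel in its own plane, which is what the convolution layers assume.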

In [12]:
Y_train = np_utils.to_categorical(y_train,2)
Y_test = np_utils.to_categorical(y_test,2)
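
For reference, the one-hot encoding used for the two classes can be reproduced with plain NumPy. This is a sketch of the idea rather than the Keras implementation; `one_hot` is a hypothetical helper.

```python
import numpy as np


def one_hot(labels, nb_classes):
    """Turn integer class labels into one-hot rows."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((len(labels), nb_classes))
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded


# 1 = George_W_Bush, 0 = Colin_Powell, as assigned in get_train_and_test_sets.
print(one_hot([1, 0, 1], 2))
```

Each row has a single 1 in the column of its class, which is the target format that a 2-unit softmax layer trained with categorical cross-entropy expects.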

In [14]:
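def VGG_16(optimizer, batch_size=16):
    # VGG16-style layout (13 conv + 3 dense layers) with narrower layers than
    # the original VGG16. batch_size is accepted but not used here.
    model = Sequential()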
    model.add(ZeroPadding2D((1,1),input_shape=(3,img_rows,img_cols)))
    model.add(Convolution2D(32, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(32, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(64, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(64, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(128, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(128, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(128, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(256, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(256, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(256, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(256, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(256, 3, 3, activation='relu'))
    model.add(ZeroPadding2D((1,1)))
    model.add(Convolution2D(256, 3, 3, activation='relu'))
    model.add(MaxPooling2D((2,2), strides=(2,2)))

    model.add(Flatten())
    model.add(Dense(2048, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(2048, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(2, activation='softmax'))
    
    model.compile(loss='categorical_crossentropy',
                  optimizer=optimizer)
    
    return model
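
To see what the Flatten layer receives, the feature-map shape can be traced through the five blocks by hand. The sketch below assumes each 2x2 max pool with stride 2 rounds odd sizes down (the "valid" pooling mode); `vgg_feature_shape` is a hypothetical helper, not a Keras function.

```python
def vgg_feature_shape(img_rows, img_cols, block_filters):
    """Trace (channels, rows, cols) through the conv/pool blocks.

    Each 3x3 convolution is preceded by one pixel of zero padding, so it keeps
    the spatial size; each 2x2 max pool with stride 2 halves it, rounding down.
    """
    channels, rows, cols = 3, img_rows, img_cols
    for filters in block_filters:
        channels = filters
        rows, cols = rows // 2, cols // 2
    return channels, rows, cols


channels, rows, cols = vgg_feature_shape(100, 100, [32, 64, 128, 256, 256])
print((channels, rows, cols))  # (256, 3, 3)
print(channels * rows * cols)  # 2304 inputs to the first Dense layer
```

The spatial size goes 100, 50, 25, 12, 6, 3, so a 100x100 input leaves a 3x3 map per filter after the fifth pooling layer.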

In [16]:
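model = VGG_16('sgd', batch_size)
# Note: early_stopping is created but not passed to fit(), so it has no effect
# on this run; pass callbacks=[early_stopping] to fit() to enable it.
early_stopping = EarlyStopping(monitor='loss', patience=10, mode='min')
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
          show_accuracy=True, verbose=1, shuffle=True, validation_split=validation_split)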

Train on 340 samples, validate on 61 samples
Epoch 1/100
...
Epoch 74/100
[remaining output truncated]

<keras.callbacks.History at 0x13c18d710>

In [17]:
json_string = model.to_json()
# Use a context manager so the file is flushed and closed.
with open('models/BinaryVGG16Convnet.json', 'w') as f:
    f.write(json_string)
model.save_weights('models/BinaryVGG16Convnet.h5')

In [13]:
with open('models/BinaryVGG16Convnet.json') as f:
    model = model_from_json(f.read())
model.load_weights('models/BinaryVGG16Convnet.h5')

In [14]:
score = model.evaluate(X_test, Y_test, show_accuracy=True, verbose=1)
print('Test score:', score[0])
print('Test accuracy:', score[1])

('Test score:', 0.63323318958282471)
('Test accuracy:', 0.84507042253521125)
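
The two numbers above are the loss and the accuracy on the test set. As a minimal NumPy sketch, this is how categorical cross-entropy and accuracy can be computed from predicted class probabilities. The function names and the sample predictions are hypothetical, and clipping by `eps` is an assumption made here to avoid taking log(0).

```python
import numpy as np


def categorical_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean negative log-probability assigned to the true class."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    return float(-np.mean(np.sum(y_true * np.log(y_prob), axis=1)))


def accuracy(y_true, y_prob):
    """Fraction of rows whose most probable class is the true class."""
    return float(np.mean(np.argmax(y_true, axis=1) == np.argmax(y_prob, axis=1)))


# Hypothetical one-hot targets and softmax outputs for four test images.
y_true = np.array([[0, 1], [1, 0], [0, 1], [1, 0]], dtype=float)
y_prob = np.array([[0.1, 0.9], [0.8, 0.2], [0.6, 0.4], [0.7, 0.3]])

print(categorical_crossentropy(y_true, y_prob))
print(accuracy(y_true, y_prob))  # 0.75
```

Because the loss penalises confident mistakes heavily, a handful of confidently wrong predictions can produce a fairly high loss even when most predictions are correct.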


In [15]:
y_pred = model.predict_classes(X_test)



In [16]:
confusion_matrix(y_test, y_pred)

array([[33,  2],
       [ 9, 27]])

In [3]:
# Rows are true labels and columns are predictions; class 1 is George_W_Bush.
m = [[33., 2.], [9., 27.]]
print('precision: ' + str(m[1][1] / (m[0][1] + m[1][1])))
print('recall: ' + str(m[1][1] / (m[1][0] + m[1][1])))

precision: 0.931034482759
recall: 0.75
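
The same figures can be derived directly from any 2x2 confusion matrix instead of hard-coding the cells. The sketch below assumes the scikit-learn layout (rows are true labels, columns are predictions) and treats class 1 as the positive class. `binary_metrics` is a hypothetical helper, and the F1 score is an extra metric added here for illustration.

```python
def binary_metrics(cm):
    """Compute metrics for class 1 from a 2x2 confusion matrix.

    cm[i][j] is the number of samples with true label i predicted as j.
    """
    tn, fp = cm[0]
    fn, tp = cm[1]
    total = float(tn + fp + fn + tp)
    precision = tp / float(tp + fp)
    recall = tp / float(tp + fn)
    return {
        "accuracy": (tp + tn) / total,
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
    }


metrics = binary_metrics([[33, 2], [9, 27]])
for name in ("accuracy", "precision", "recall", "f1"):
    print("{}: {:.3f}".format(name, metrics[name]))
```

For the matrix above this gives an accuracy of 60/71, which matches the test accuracy reported by `model.evaluate`.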
