In [1]:
import numpy as np
from PIL import Image
import itertools
from scipy import ndimage
from keras.utils import to_categorical

import scipy.spatial as sp
import cv2
import os

import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Activation
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import SpatialDropout2D, GlobalAveragePooling2D

import re
import glob
import pandas as pd

from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.metrics import accuracy_score, confusion_matrix, roc_curve, roc_auc_score

import tensorflow

We have just under 1000 images, each labelled by the film it was captured from. Every image contains the main (or a main) character from its film:

Cars            :- Lightning McQueen
Finding Nemo    :- Nemo
Monsters Inc    :- Mike (The Green One)
The Incredibles :- Mr. Incredible
Up              :- Carl (The Grandpa)
Wall-E          :- EVE (labelled "Eva" in the code)

We chose these characters largely due to their high frequency in the image dataset, created here: 
            https://github.com/LaurenceDyer/pixaR
            
We aim to train a CNN to classify images of the main characters from these films. First, we load the image files using OpenCV, then collect them into an appropriately shaped tensor. This requires resizing the images to a common size (and thus losing some resolution), which is also done with OpenCV.

In [2]:
jpg_files = glob.glob('*.jpg')

In [3]:
char_dict = {0: "Lightning McQueen", 1: "Nemo", 2: "Mike", 3: "Mr. Incredible", 4: "Carl", 5: "Eva"}

images = [cv2.imread(file) for file in jpg_files]

In [4]:
# cv2.resize takes dsize as (width, height); each resized image has shape (528, 1280, 3)
imageTensor = np.empty((len(images), 528, 1280, 3), dtype=np.float32)
for k, image in enumerate(images):
    imageTensor[k] = cv2.resize(image, dsize=(1280, 528))
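
To see why the resize target matters, it can help to estimate the memory the tensor needs before allocating it. The sketch below is plain arithmetic on the shape used above; `tensor_bytes` is a hypothetical helper name, and the half-resolution shape is only an illustrative alternative, not something this notebook uses.

```python
import math

def tensor_bytes(shape, bytes_per_element=4):
    """Bytes needed for a dense array of the given shape (float32 = 4 bytes per element)."""
    return math.prod(shape) * bytes_per_element

full = tensor_bytes((973, 528, 1280, 3))   # shape allocated in the notebook
half = tensor_bytes((973, 264, 640, 3))    # hypothetical: half the height and width

print(f"full resolution: {full / 2**30:.2f} GiB")
print(f"half resolution: {half / 2**30:.2f} GiB")
```

At the shape used here the float32 tensor occupies about 7.35 GiB; halving both spatial dimensions would cut that to a quarter.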
    

We can use string modification to create a label set from the image names of each file.

In [5]:
# Strip the trailing index and extension, leaving only the film prefix
labels = [re.sub(r"[0-9]+\.jpg$","",x) for x in jpg_files]
# Anchor each pattern so a short prefix such as "ti" or "up" cannot match inside another label
labels = [re.sub(r"^cars$","Lightning McQueen",x) for x in labels]
labels = [re.sub(r"^fnemo$","Nemo",x) for x in labels]
labels = [re.sub(r"^monsters$","Mike",x) for x in labels]
labels = [re.sub(r"^ti$","Mr. Incredible",x) for x in labels]
labels = [re.sub(r"^up$","Carl",x) for x in labels]
labels = [re.sub(r"^walle$","Eva",x) for x in labels]

labels = np.array(labels).reshape((-1,1))
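
As an alternative to chained substitutions, the same mapping could be written as a single dictionary lookup that fails loudly on an unexpected file name. This is only a sketch: the prefixes are taken from the substitutions above, while `label_from_filename`, `PREFIX_TO_CHARACTER` and the example file names are hypothetical.

```python
import re

# Film prefixes as used in the substitutions above.
PREFIX_TO_CHARACTER = {
    "cars": "Lightning McQueen",
    "fnemo": "Nemo",
    "monsters": "Mike",
    "ti": "Mr. Incredible",
    "up": "Carl",
    "walle": "Eva",
}

def label_from_filename(filename):
    """Return the character name for a file named <prefix><digits>.jpg."""
    match = re.fullmatch(r"([a-z]+)[0-9]+\.jpg", filename)
    if match is None or match.group(1) not in PREFIX_TO_CHARACTER:
        raise ValueError(f"Unrecognised file name: {filename!r}")
    return PREFIX_TO_CHARACTER[match.group(1)]

example_files = ["cars12.jpg", "ti3.jpg", "walle104.jpg"]  # made-up names
example_labels = [label_from_filename(f) for f in example_files]
print(example_labels)
```

Matching the whole file name means a stray file is reported immediately rather than surfacing later as a missing class index.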

In [6]:
print(imageTensor.shape)
print(labels.shape)

(973, 528, 1280, 3)
(973, 1)


Great! All our tensors now have the correct shape for training. We can use sklearn to split our train/test data.

We will also divide the pixel values by 255 to normalize them from the range 0-255 to the range 0-1.

In [7]:
X_train, X_test, y_train, y_test = train_test_split(imageTensor/255.,labels,test_size=0.1,random_state=1337)
print("Split")

Split
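
One thing worth noting is that `imageTensor/255.` builds a second full-size array before the split. Where memory is tight, an in-place division gives the same values without the extra copy. The snippet below illustrates this on a small random stand-in array (`demo` is not part of the notebook's data).

```python
import numpy as np

# Small stand-in for imageTensor: 4 random "images" with values in 0-255.
rng = np.random.default_rng(0)
demo = rng.integers(0, 256, size=(4, 8, 8, 3)).astype(np.float32)

scaled_copy = demo / 255.0   # allocates a second array of the same size
demo /= 255.0                # rescales in place, no extra allocation

print(demo.dtype, float(demo.min()), float(demo.max()))
```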


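We convert our string labels into integer class indices (0-5). No one-hot encoding is needed, because the model is compiled with a sparse categorical cross-entropy loss, which expects integer labels.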

In [8]:
char_dict_inv = {v: k for k, v in char_dict.items()}

y_train = np.vectorize(char_dict_inv.get)(y_train)
y_test = np.vectorize(char_dict_inv.get)(y_test)

print(y_train.shape)
print(y_test.shape)

(875, 1)
(98, 1)
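
For comparison, the sketch below shows the integer-index form produced above next to the one-hot form that a non-sparse `categorical_crossentropy` loss would require. The three label names are made up for illustration.

```python
import numpy as np

char_dict = {0: "Lightning McQueen", 1: "Nemo", 2: "Mike",
             3: "Mr. Incredible", 4: "Carl", 5: "Eva"}
char_dict_inv = {name: index for index, name in char_dict.items()}

names = np.array([["Nemo"], ["Eva"], ["Mike"]])    # made-up labels, shape (3, 1)
indices = np.vectorize(char_dict_inv.get)(names)    # integer class indices, shape (3, 1)

# One-hot form: row i has a single 1 in the column given by indices[i].
one_hot = np.eye(len(char_dict), dtype=np.float32)[indices.ravel()]

print(indices.ravel())
print(one_hot.shape)
```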


In [9]:
#X_test = X_test.reshape(X_test.shape[0],3,528,1280).astype("float32")
#X_train = X_train.reshape(X_train.shape[0],3,528,1280).astype("float32")

In [10]:
print(X_train.shape)

(875, 528, 1280, 3)


Now we create our model. We use a CNN because of its proven track record in computer vision. Because our dataset is quite small, overfitting is a real risk, so we add a spatial dropout layer after each convolution.

In [11]:
def cnn_model():
    model = Sequential() 
    model.add(Conv2D(32, (3, 3), input_shape=(528, 1280, 3), activation = 'relu'))  
    model.add(SpatialDropout2D(0.2))
    model.add(Conv2D(64, (3, 3), activation = 'relu'))
    model.add(SpatialDropout2D(0.2))

    model.add(GlobalAveragePooling2D())
    
    model.add(Dense(6, activation= 'softmax'))
    
    optimizer = keras.optimizers.Adam()
    
    model.compile(loss='sparse_categorical_crossentropy',
              optimizer=optimizer,
              metrics=['accuracy'])
    return model
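
A rough size estimate for this architecture can be worked out by hand. The sketch below assumes the Keras defaults for `Conv2D` (stride 1, 'valid' padding, one bias per filter) and float32 activations; the helper names are hypothetical.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return kernel * kernel * in_channels * filters + filters

def valid_conv_output(height, width, kernel):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return height - kernel + 1, width - kernel + 1

h1, w1 = valid_conv_output(528, 1280, 3)   # after the first Conv2D
h2, w2 = valid_conv_output(h1, w1, 3)      # after the second Conv2D

# Two convolutions plus the final Dense(6) on 64 pooled features.
total_params = conv2d_params(3, 3, 32) + conv2d_params(3, 32, 64) + (64 * 6 + 6)

# float32 activations produced per image by each convolution
act1_bytes = h1 * w1 * 32 * 4
act2_bytes = h2 * w2 * 64 * 4

print("trainable parameters:", total_params)
print(f"conv activations per image: {(act1_bytes + act2_bytes) / 2**20:.0f} MiB")
```

Under these assumptions the model has fewer than 20,000 trainable parameters, yet each full-resolution image produces roughly 245 MiB of convolution activations, which is one plausible reason training is slow on a laptop.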

In [12]:
model = cnn_model()

history = model.fit(X_train,y_train, validation_split=0.1, epochs=10, batch_size=16)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [13]:
show_history(history.history)

NameError: name 'show_history' is not defined
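
The `NameError` arises because `show_history` is never defined in this notebook. A minimal sketch of such a helper follows, assuming matplotlib is available and that the dict passed in has the usual Keras keys (`loss`, `val_loss`, and either `accuracy` or the older `acc`). It is a hypothetical stand-in, not the original function.

```python
import matplotlib.pyplot as plt

def show_history(history):
    """Plot training and validation curves from a Keras History.history dict.

    Hypothetical helper: the original definition is not part of this notebook.
    """
    # Newer Keras versions use 'accuracy'; older ones use 'acc'.
    acc_key = "accuracy" if "accuracy" in history else "acc"
    fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(12, 4))
    for ax, key, title in ((ax_acc, acc_key, "Accuracy"), (ax_loss, "loss", "Loss")):
        ax.plot(history[key], label="train")
        if "val_" + key in history:
            ax.plot(history["val_" + key], label="validation")
        ax.set_xlabel("epoch")
        ax.set_title(title)
        ax.legend()
    return fig
```

With this definition placed in a cell before the call, `show_history(history.history)` would draw the accuracy and loss curves side by side.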

After 10 epochs, our validation accuracy has risen to roughly 42% (compared with about 17% for random guessing across six classes) and is still increasing, with our loss at about 1.5 and still dropping. This is reassuring, as overfitting was likely to be the biggest challenge in this analysis. Training took almost a full day on my laptop, so I cannot continue training the model. I am nonetheless satisfied with the result and expect its accuracy would keep rising above 50% if training were extended to one or two hundred epochs, although that remains untested.