# Demonstration of Convolutional Neural Networks

In this demonstration I show how convolutional neural networks (ConvNets) can be used to categorise images. Look at the notebook entitled "image-scrapper.ipynb". This uses the Bing API to search the web for images. It finds around 1000 images for a search term, such as 'cat', and saves them in a directory of the same name. It turns them into black and white images for simplicity and numbers them cat_1.jpg, cat_2.jpg and so on. The numbering may therefore carry some meaning, as the images are in the same order that Bing returned them.

This notebook tries to build a single categorical classifier that assigns each image to one of the given categories.

We can get moderate accuracy of around 60% with 5 categories. There is definite scope for improvement, but the data set is quite small and a bit messy, which may put an upper limit on how well any model can do.

## Data Prep

* Take each image and convert it to a numpy array.
* Then stack the images into one array of shape (datasize, imagesize, imagesize, 1).
* We also need to make a similar array of labels; these are simply integers to begin with.
* One hot encode the integers (one for each class).
* There will be a small number of classes (typically around 5, or even only 2)
* Split the data into training and testing.
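
The one-hot step in the list above can be illustrated with plain numpy. This is only a sketch for intuition: the notebook itself uses `keras.utils.to_categorical` further down, and the helper name `one_hot` and the toy labels are assumptions made for this example.

```python
import numpy as np

def one_hot(int_labels, num_classes):
    """Turn integer class labels into one-hot rows (hypothetical helper)."""
    int_labels = np.asarray(int_labels)
    encoded = np.zeros((int_labels.size, num_classes))
    # Put a single 1 in each row, at the column given by that row's label.
    encoded[np.arange(int_labels.size), int_labels] = 1.0
    return encoded

# Toy labels for three made-up classes, not the real data.
toy = one_hot([0, 2, 1], num_classes=3)
print(toy)
```

Taking `argmax` along each row recovers the original integer labels, which is how the predictions are decoded later in the notebook.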

In [None]:
from PIL import Image
import numpy as np
import glob, os

from matplotlib.pyplot import imshow
import seaborn

# Only use this for splitting the arrays
from sklearn.model_selection import train_test_split

import keras

%matplotlib inline

In [None]:
def file_to_numpy(file_name):
    """
    Return a numpy array from a filename.
    """
    
    im = Image.open(file_name)
    arr = np.array(im.getdata())
    # Assuming the image is square
    size = int(np.sqrt(arr.shape[0]))
    arr = arr.reshape((size,size,1))
    return arr

In [None]:
def plot_file(file_name):
    """
    Sanity check function, simply looks at the data and makes
    sure it still makes sense by plotting it.
    """
    
    data = file_to_numpy(file_name)
    imshow(data[:,:,0], interpolation='nearest')


In [None]:
test_file = '../data/convnet/cat/cat_4.jpg'

In [None]:
plot_file(test_file)

Still looks like a cat, that's a good start!

### Setting up the labels

Define the categories here. Create lookups from integers to the categories, and vice versa.

In [None]:
categories = ['cat', 'donkey', 'monkey', 'donald_trump', 'dog']
total_categories = len(categories)
num2cat = {i:c for i,c in enumerate(categories)}
cat2num = {c:i for i,c in enumerate(categories)}

For larger data sets it would be sensible to load the data in batches, but we don't have much data here, so we simply load the whole thing into memory. We also exclude any images that are literally identical, of which there are a few. This does nothing about images that are almost the same, though, which might be worth thinking about.

In [None]:
data = []
labels = []
base_dir = "../data/convnet"
for c in categories:
    c_dir = os.path.join(base_dir, c)
    glob_str = os.path.join(c_dir, "{}*.jpg".format(c))
    print("{}:{}".format(c, len(glob.glob(glob_str))))
    # Keep a set of pictures that we have seen, so that
    # we can reject duplicates. This assumes that there
    # are only duplicates within the same directory
    all_pics = set()
    total_duplicates = 0
    for f in glob.glob(glob_str):
        array = file_to_numpy(f)
        # tobytes() replaces tostring(), which newer NumPy versions no longer provide
        pic_hash = hash(array.tobytes())
        if pic_hash in all_pics:
            total_duplicates += 1
            continue
        all_pics.add(pic_hash)
        data.append(array)
        labels.append(cat2num[c])
    print("Total duplicates: {}".format(total_duplicates))
data = np.array(data)
labels = np.array(labels)
labels = keras.utils.to_categorical(labels)
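
The loading loop above only rejects byte-for-byte duplicates. One possible way to catch images that are almost the same is a coarse "average hash": shrink the image to a small grid, threshold each cell against the grid mean, and compare the resulting bit patterns. The sketch below is not part of the original notebook; `average_hash`, `hamming_distance` and the toy images are hypothetical, and it assumes square single-channel arrays like those returned by `file_to_numpy`.

```python
import numpy as np

def average_hash(arr, hash_size=8):
    """Hypothetical helper: coarse perceptual hash of a square image array."""
    img = np.asarray(arr, dtype=float)
    if img.ndim == 3:
        img = img[:, :, 0]
    # Crop so that each side divides evenly into hash_size blocks.
    block = img.shape[0] // hash_size
    img = img[:block * hash_size, :block * hash_size]
    # Average each block, giving a hash_size x hash_size thumbnail.
    small = img.reshape(hash_size, block, hash_size, block).mean(axis=(1, 3))
    # One bit per cell: is it brighter than the thumbnail's mean?
    return (small > small.mean()).flatten()

def hamming_distance(hash_a, hash_b):
    """Number of bits in which two hashes differ."""
    return int(np.count_nonzero(hash_a != hash_b))

# Toy images, not the real data.
rng = np.random.default_rng(0)
toy_image = rng.integers(0, 256, size=(64, 64, 1))
brighter = toy_image + 10      # same picture, uniformly brighter
inverted = 255 - toy_image     # a very different picture

same = hamming_distance(average_hash(toy_image), average_hash(brighter))
different = hamming_distance(average_hash(toy_image), average_hash(inverted))
print(same, different)
```

A small Hamming distance suggests a near-duplicate. Where to put the cut-off is a judgment call that would need checking against the actual images.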

In [None]:
data.shape

In [None]:
labels.shape

### Rescale the data

The learning will work better if the data is centred around zero, and has a standard deviation of around 1. Let's perform this rescale here.

In [None]:
data = data - data.mean()

In [None]:
data = data/data.std()
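
The two cells above rescale the whole data set before it is split. A commonly used variant, sketched here as an assumption rather than what this notebook does, computes the mean and standard deviation from the training portion only and applies them to both portions, so that no information from the test images enters the preprocessing. The helper `standardise` and the toy arrays are hypothetical.

```python
import numpy as np

def standardise(train, test):
    """Hypothetical helper: rescale both arrays using training statistics only."""
    mean = train.mean()
    std = train.std()
    return (train - mean) / std, (test - mean) / std

# Toy arrays shaped like small image batches, not the real data.
rng = np.random.default_rng(0)
toy_train = rng.normal(loc=100.0, scale=20.0, size=(50, 8, 8, 1))
toy_test = rng.normal(loc=100.0, scale=20.0, size=(10, 8, 8, 1))

scaled_train, scaled_test = standardise(toy_train, toy_test)
print(scaled_train.mean(), scaled_train.std())
```

With a data set this small the difference between the two approaches is likely to be minor.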

### Split the data

Split the data into training and testing data. Note that this also does the shuffling for us, which matters because our data is in a non-random order.

In [None]:
train_data, test_data, train_labels, test_labels = train_test_split(data, labels, train_size=0.9)
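
Because the split is random, it can be worth checking that every class is still reasonably represented in both parts. A minimal sketch, assuming one-hot labels as produced above; the helper `class_counts` and the toy labels are hypothetical:

```python
import numpy as np

def class_counts(one_hot_labels, names):
    """Hypothetical helper: count examples per class in a one-hot label array."""
    counts = one_hot_labels.sum(axis=0).astype(int)
    return dict(zip(names, counts.tolist()))

# Toy one-hot labels for three made-up classes, not the real data.
toy_names = ['a', 'b', 'c']
toy_labels = np.eye(3)[[0, 0, 1, 2, 2, 2]]
toy_counts = class_counts(toy_labels, toy_names)
print(toy_counts)
```

In the notebook this could be called as `class_counts(test_labels, categories)`. If the counts turn out very uneven, `train_test_split` also accepts a `stratify` argument that keeps the class proportions the same in both parts.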

## The model

### Build the model

Pass the data through one convolution step, a pooling step, one hidden layer and then the output layer. I've played around with a fair few model settings and this one seems to work best. Much bigger models don't have enough data to train on, and I find that any sensible choice of model performs roughly the same, which suggests that the size and quality of the data, not the model choices, limit the accuracy.

In [None]:
# Model parameters
depth = 16
hidden_nodes = 64
kernel_size = 5
pool_size = 2
dropout_rate = 0.5
image_size = data.shape[1]

In [None]:
model = keras.models.Sequential()
model.add(keras.layers.Conv2D(input_shape=(image_size, image_size, 1), filters=depth, kernel_size=kernel_size))
model.add(keras.layers.MaxPool2D(pool_size=(pool_size, pool_size)))
# model.add(keras.layers.Conv2D(filters=depth*2, kernel_size=kernel_size))
# model.add(keras.layers.MaxPool2D(pool_size=(pool_size, pool_size)))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(hidden_nodes, activation='relu'))
model.add(keras.layers.Dropout(dropout_rate))
# model.add(keras.layers.Dense(int(hidden_nodes/2), activation='relu'))
model.add(keras.layers.Dense(total_categories, activation='softmax'))

In [None]:
model.summary()
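
The numbers reported by `model.summary()` can be reproduced by hand. The sketch below works through the layer sizes and parameter counts for the architecture above, relying on the Keras defaults (no padding and stride 1 for the convolution, pooling stride equal to the pool size). The image size of 64 is an assumption for illustration only; the real value comes from `data.shape[1]`.

```python
def convnet_shapes(image_size, kernel_size, pool_size, depth,
                   hidden_nodes, num_classes, channels=1):
    """Hypothetical helper: sizes and parameter counts for conv -> pool -> dense -> softmax."""
    conv_out = image_size - kernel_size + 1   # no padding, stride 1
    pool_out = conv_out // pool_size          # non-overlapping pooling windows
    flat = pool_out * pool_out * depth        # length of the flattened vector
    params = {
        # Each filter has kernel_size^2 * channels weights plus one bias.
        'conv': kernel_size * kernel_size * channels * depth + depth,
        # Dense layers have one weight per input-output pair plus one bias per output.
        'hidden': flat * hidden_nodes + hidden_nodes,
        'output': hidden_nodes * num_classes + num_classes,
    }
    return conv_out, pool_out, flat, params

# Model parameters from the notebook, with an assumed 64x64 input.
conv_out, pool_out, flat, params = convnet_shapes(
    image_size=64, kernel_size=5, pool_size=2, depth=16,
    hidden_nodes=64, num_classes=5)
total_params = sum(params.values())
print(conv_out, pool_out, flat, params, total_params)
```

Almost all of the parameters sit in the first dense layer, which is one reason why much bigger models are hard to train on a data set of this size.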

In [None]:
model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])


### Train the model

All that remains is to train the model, which is a single call to the built-in `fit` method.

In [None]:
model.fit(train_data, train_labels, batch_size=128, epochs=12)

### Evaluate the model

Do a few examples in an explicit way, and compute the accuracy and the confusion matrix.

In [None]:
model.evaluate(test_data, test_labels)

In [None]:
def pretty_predict(index):
    """
    Predicts a single held-out test image, showing the picture
    and printing a verbose output.
    """
    
    print("Index: {}".format(index))
    # Use the test split, so that the example reflects unseen data
    label = num2cat[test_labels[index].argmax()]
    print("It's a {}".format(label))
    image = test_data[index]
    imshow(image[:,:,0], interpolation='nearest')
    prediction = model.predict(np.array([image]))
    predicted_label = num2cat[prediction.argmax()]
    print("Model says that this is a {}".format(predicted_label))

In [None]:
pretty_predict(np.random.randint(len(test_data)))

As a final illustration I plot the confusion matrix. This shows how often class i is predicted as class j. Large diagonal elements are the ones the model gets right. This can often give some clues as to where the model is going wrong.

In [None]:
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt

In [None]:
predictions = model.predict(test_data)
predictions = [num2cat[p.argmax()] for p in predictions]
labels = [num2cat[p.argmax()] for p in test_labels]

In [None]:
_, ax = plt.subplots(figsize=(9,9))
# confusion_matrix expects (y_true, y_pred): rows are true classes, columns are predictions
ax = seaborn.heatmap(confusion_matrix(labels, predictions, labels=categories),
                     xticklabels=categories, yticklabels=categories,
                     annot=True, cbar=False, square=True)
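
To make the counting behind the heatmap concrete, here is a minimal numpy sketch of how a confusion matrix can be built by hand. In this sketch rows are true classes and columns are predicted classes; the helper `count_confusions` and the toy labels are hypothetical and stand in for the real predictions.

```python
import numpy as np

def count_confusions(true_idx, pred_idx, num_classes):
    """Hypothetical helper: matrix[i, j] counts class-i examples predicted as class j."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(true_idx, pred_idx):
        matrix[t, p] += 1
    return matrix

# Toy labels for three made-up classes, not the real predictions.
toy_true = [0, 0, 1, 1, 2]
toy_pred = [0, 1, 1, 1, 0]
toy_matrix = count_confusions(toy_true, toy_pred, num_classes=3)

# Accuracy is the fraction of counts that fall on the diagonal.
toy_accuracy = np.trace(toy_matrix) / toy_matrix.sum()
print(toy_matrix)
print(toy_accuracy)
```

Off-diagonal entries point to the specific pairs of classes that get mixed up, which the overall accuracy alone cannot show.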