# Bodypart classification using a Siamese network

This notebook walks through the work done on bodypart classification. Because the available dataset is small, a simple convolutional neural network trained to classify bodyparts directly achieves very poor accuracy. A Siamese architecture makes better use of the small dataset: it learns to compare pairs of images, so an input image can be classified by finding the training images it is most similar to. This comparison step may become inefficient with a very large dataset.

More about the theory and practicalities behind this idea, and the code presented here, can be found in the Documentation in the 'Bodypart Classification' section.

## Import modules

In [None]:
from __future__ import absolute_import
from __future__ import print_function
import numpy as np

import random
import h5py

Import Keras functionalities

In [None]:
from keras.models import Model, Sequential
from keras.layers import Input, concatenate, Conv2D, MaxPooling2D, UpSampling2D, Dropout, Cropping2D,Convolution2D
from keras.layers import BatchNormalization, Reshape, Layer, Dense, Lambda
from keras.layers import Activation, Flatten, Dense, Dropout
from keras.optimizers import *
from keras.callbacks import ModelCheckpoint, LearningRateScheduler
from keras.metrics import categorical_accuracy
from keras import backend as K
from keras import losses
from keras.models import load_model

## Define model

Define the model architecture: a convolutional base network for feature extraction and a Euclidean distance metric for 'likeness' comparison in feature space.

In [None]:
def create_base_network(input_dim):
    '''Base network to be shared for feature extraction'''
    seq = Sequential()
    seq.add(Convolution2D(16,(3,3), input_shape=input_dim, activation='relu'))
    seq.add(MaxPooling2D(2,2))
    seq.add(Convolution2D(32,(3,3), padding = 'same',activation='relu'))
    seq.add(MaxPooling2D(2,2))
    seq.add(Dropout(0.1))
    seq.add(Convolution2D(64,(3,3), padding = 'same',activation='relu'))
    seq.add(MaxPooling2D(2,2))
    seq.add(Dropout(0.1))
    seq.add(Flatten())
    seq.add(Dense(32, activation='relu'))
    return seq

def euclidean_distance(vects):
    '''Returns the euclidean distance in feature space'''
    x, y = vects
    return K.sqrt(K.maximum(K.sum(K.square(x - y), axis=1, keepdims=True), K.epsilon()))

def eucl_dist_output_shape(shapes):
    '''Returns the correct output shape for the network to compare to labels'''
    shape1, shape2 = shapes
    return (shape1[0], 1)
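
To make the distance computation concrete, here is a minimal NumPy sketch of the same row-wise Euclidean distance. The function name `euclidean_distance_np`, the `eps` value standing in for `K.epsilon()`, and the feature values are all assumptions chosen for illustration.

```python
import numpy as np

def euclidean_distance_np(x, y, eps=1e-7):
    """Row-wise Euclidean distance between two batches of feature vectors.

    eps stands in for K.epsilon() (assumed to be 1e-7 here) and keeps the
    argument of the square root strictly positive.
    """
    squared = np.sum(np.square(x - y), axis=1, keepdims=True)
    return np.sqrt(np.maximum(squared, eps))

# Two batches of three 2-D feature vectors (made-up values)
a = np.array([[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]])
b = np.array([[3.0, 4.0], [1.0, 1.0], [0.0, 0.0]])

dist = euclidean_distance_np(a, b)
print(dist.shape)    # (3, 1): one distance per pair
print(dist.ravel())  # identical vectors give a distance close to, but not exactly, zero
```

The `keepdims=True` argument is what produces the `(batch, 1)` shape that `eucl_dist_output_shape` reports.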

## Define metrics

In [None]:
def contrastive_loss(y_true, y_pred):
    '''Contrastive loss from Hadsell-et-al.'06
    http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf
    '''
    margin = 1
    return K.mean(y_true * K.square(y_pred) +
                  (1 - y_true) * K.square(K.maximum(margin - y_pred, 0)))

def compute_accuracy(y_true, y_pred):
    '''Compute classification accuracy with a fixed threshold on distances.
    '''
    pred = y_pred.ravel() < 0.5
    return np.mean(pred == y_true)

def accuracy(y_true, y_pred):
    '''Compute classification accuracy with a fixed threshold on distances.
    '''
    return K.mean(K.equal(y_true, K.cast(y_pred < 0.5, y_true.dtype)))
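
The behaviour of the loss and of the thresholded accuracy can be checked by hand on a few numbers. The sketch below is a NumPy re-statement of the two formulas above, not the Keras code itself; the function names and the four example distances are made up for illustration.

```python
import numpy as np

def contrastive_loss_np(y_true, y_pred, margin=1.0):
    """NumPy version of the contrastive loss formula used in this notebook.

    y_true == 1 pulls a pair together (large distances are penalised);
    y_true == 0 pushes a pair apart until it is at least `margin` away.
    """
    pull = y_true * np.square(y_pred)
    push = (1 - y_true) * np.square(np.maximum(margin - y_pred, 0))
    return np.mean(pull + push)

def threshold_accuracy_np(y_true, y_pred, threshold=0.5):
    """Fraction of pairs whose 'distance < threshold' decision matches y_true."""
    return np.mean((y_pred.ravel() < threshold) == y_true)

# Made-up distances for four pairs
y_true = np.array([1, 1, 0, 0])
y_pred = np.array([0.1, 0.9, 0.2, 1.5])

loss = contrastive_loss_np(y_true, y_pred)
acc = threshold_accuracy_np(y_true, y_pred)
print(loss)  # (0.01 + 0.81 + 0.64 + 0.0) / 4
print(acc)   # pairs 1 and 4 are on the correct side of the threshold
```

Note that with this formula a label of 1 is the one that draws a pair together, so the pair labels fed to the model need to follow the same convention.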

## Define hyperparameters

In [None]:
num_classes = 3
epochs = 20

## Load in data

The standard hdf5 data file stores the bodypart of each image as a class name rather than as a numerical label. Therefore, we write a function that assigns a number to each bodypart.

In [None]:
def Label(array):
    '''Assigns a numeric label to each image depending on bodypart'''
    #Note: this mapping must be updated depending on the bodyparts available in the dataset
    #(label 9 is currently used for b'cropped'; b'shoulder' is disabled)
    label_map = {
        b'ankle': 0,
        b'arm': 1,
        b'femur': 2,
        b'foils': 3,
        b'hand': 4,
        b'knee': 5,
        b'leg': 6,
        b'lumbarspin': 7,
        b'neckoffemu': 8,
        b'cropped': 9,
        #b'shoulder': 9,
        b'thigh': 10,
        b'tibia': 11,
        b'wrist': 12,
    }

    #Names that are not in the mapping are skipped
    label = np.array([label_map[i] for i in array if i in label_map], dtype=float)

    return label

Additionally, we must pair up the data and assign each pair a label depending on whether the two images in the pair belong to the same bodypart or to different bodyparts.

In [None]:
def format_data(images,classes):
    '''Pairs up images and assigns each pair a label'''
    
    no_images = len(images)
    if no_images%2 != 0:
        no_images -= 1
    i = 0
    pairs = []
    labels = []
    while i < no_images:
        pair_append = [[images[i],images[i+1]]]
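        #Label 1 marks a 'same bodypart' pair and 0 a 'different bodypart' pair,
        #which is the convention contrastive_loss and the accuracy metrics assume
        labels_append = 1
        if classes[i] != classes[i+1]:
            labels_append = 0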
        pairs.append(pair_append)
        labels.append(labels_append)
        i = i + 2
    pairs = np.asarray(pairs)
    labels = np.asarray(labels)
    return pairs, labels
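
The pairing scheme (image 0 with image 1, image 2 with image 3, and so on) can be tried on toy data. The helper `pair_consecutive` below is a hypothetical, vectorised illustration of that scheme, not the function used in this notebook; it returns a boolean 'same class' flag rather than a 0/1 label, since the encoding of that flag has to agree with the loss function.

```python
import numpy as np

def pair_consecutive(images, classes):
    """Pair images (0,1), (2,3), ... and flag the pairs that share a class.

    Hypothetical helper for illustration; a trailing unpaired image is dropped.
    """
    n = len(images) - (len(images) % 2)
    pairs = np.stack([images[0:n:2], images[1:n:2]], axis=1)
    same = classes[0:n:2] == classes[1:n:2]
    return pairs, same

# Five made-up 4x4 single-channel 'images' with made-up class labels
images = np.arange(5 * 4 * 4, dtype=float).reshape(5, 4, 4, 1)
classes = np.array([0, 0, 1, 2, 2])

pairs, same = pair_consecutive(images, classes)
print(pairs.shape)  # (2, 2, 4, 4, 1): (pair, member, height, width, channel)
print(same)         # [ True False]
```

Because pairs are formed from neighbouring images only, the mix of same and different pairs depends entirely on the order of the images in the file.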

### Read in the data

In [None]:
#Define the path to the hdf5 data file
data_path = ''

In [None]:
#Read the data
hf = h5py.File(data_path, 'r')

#### Train pairs

In [None]:
train = hf['train_img'][:]
no_images, height,width, channels = train.shape

In [None]:
train_classes = hf['train_bodypart'][:]

In [None]:
train_classes = Label(train_classes)
no_classes = len(np.unique(train_classes))

In [None]:
train_pairs, train_labels = format_data(train,train_classes)

In [None]:
train_pairs = train_pairs.reshape(-1,2,height,width,1)

#### Test pairs

In [None]:
test = hf['test_img'][:]
no_images, height,width, channels = test.shape

In [None]:
test_classes = hf['test_bodypart'][:]

In [None]:
test_classes = Label(test_classes)
no_classes = len(np.unique(test_classes))

In [None]:
test_pairs, test_labels = format_data(test,test_classes)

In [None]:
test_pairs = test_pairs.reshape(-1,2,height,width,1)

#### Validation pairs

In [None]:
val = hf['val_img'][:]
no_images, height,width, channels = val.shape

In [None]:
val_classes = hf['val_bodypart'][:]

In [None]:
val_classes = Label(val_classes)
no_classes = len(np.unique(val_classes))

In [None]:
val_pairs, val_labels = format_data(val,val_classes)

In [None]:
val_pairs = val_pairs.reshape(-1,2,height,width,1)

## Model

Build the model with the Keras `Functional API`.

In [None]:
#Define the input shape
input_sh = (height,width,1)

In [None]:
#Network definition
base_network = create_base_network(input_sh)

In [None]:
#Define input shapes to each of the two convolutional networks
input_a = Input(shape=input_sh)
input_b = Input(shape=input_sh)

#Because we re-use the same instance `base_network`,
#the weights of the network will be shared across the two branches
processed_a = base_network(input_a)
processed_b = base_network(input_b)

#Find the distance in feature space 
distance = Lambda(euclidean_distance,
                  output_shape=eucl_dist_output_shape)([processed_a, processed_b])

#Define the model
model = Model([input_a, input_b], distance)

In [None]:
#Compile the model
model.compile(loss=contrastive_loss, optimizer=RMSprop(), metrics=[accuracy])

In [None]:
#Train the model
model.fit([train_pairs[:, 0], train_pairs[:, 1]], train_labels,
          batch_size=32,
          epochs=epochs)

## Testing

In [None]:
#Compute final accuracy on training, test and validation sets

y_pred = model.predict([train_pairs[:, 0], train_pairs[:, 1]])
tr_acc = compute_accuracy(train_labels, y_pred)
y_pred = model.predict([test_pairs[:, 0], test_pairs[:, 1]])
te_acc = compute_accuracy(test_labels, y_pred)
y_pred = model.predict([val_pairs[:, 0], val_pairs[:, 1]])
val_acc = compute_accuracy(val_labels, y_pred)

print('* Accuracy on training set: %0.2f%%' % (100 * tr_acc))
print('* Accuracy on test set: %0.2f%%' % (100 * te_acc))
print('* Accuracy on validation set: %0.2f%%' % (100 * val_acc))

## TODO

Ideally, training should use an equal number of same and different pairings, and the training iterations should alternate between the two. Both measures should help prevent overfitting to any one particular category. Similarly, bodypart pairings should be better randomised and interspersed during training.
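
One possible way to build such a balanced, alternating set of pairs is sketched below. The helper `make_balanced_pairs` is hypothetical and has not been used for the results in this notebook; it assumes there are at least two classes and that every class contains at least two images.

```python
import numpy as np

def make_balanced_pairs(classes, n_pairs, seed=0):
    """Return index pairs that alternate 'same' and 'different' class pairings.

    Hypothetical helper: assumes at least two classes, each with at least
    two images. Returns (index_pairs, same), where same[k] is True when
    pair k joins two images of the same class.
    """
    rng = np.random.default_rng(seed)
    by_class = {c: np.flatnonzero(classes == c) for c in np.unique(classes)}
    labels = list(by_class)
    index_pairs, same = [], []
    for k in range(n_pairs):
        if k % 2 == 0:
            # Same pairing: two distinct images of one randomly chosen class
            c = rng.choice(labels)
            i, j = rng.choice(by_class[c], size=2, replace=False)
        else:
            # Different pairing: one image from each of two distinct classes
            c1, c2 = rng.choice(labels, size=2, replace=False)
            i = rng.choice(by_class[c1])
            j = rng.choice(by_class[c2])
        index_pairs.append((int(i), int(j)))
        same.append(k % 2 == 0)
    return np.array(index_pairs), np.array(same)

# Made-up class labels for eight images
classes = np.array([0, 0, 0, 1, 1, 2, 2, 2])
idx, same = make_balanced_pairs(classes, 6)
print(idx)
print(same)
```

The index pairs can then be used to gather the images for each branch, for example `train[idx[:, 0]]` and `train[idx[:, 1]]`.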

A prediction algorithm has not yet been built. It would compare the 'to predict' image with every training image, or with at least one image from each bodypart class. The pairing with the most likely 'same' prediction (the smallest predicted distance) would then be found, and the bodypart class of the paired training image would be taken as the class of the 'to predict' image.
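
A minimal sketch of such a prediction step is given below, under stated assumptions. The function `predict_bodypart` and its `distance_fn` argument are hypothetical; `toy_distance` is a stand-in for the trained network so that the example runs without Keras, and the reference images are made-up constant arrays.

```python
import numpy as np

def predict_bodypart(query, ref_images, ref_classes, distance_fn):
    """Assign `query` the class of the closest reference image.

    distance_fn(a, b) stands in for the trained Siamese model: it takes two
    equal-length batches of images and returns one distance per pair.
    """
    # Repeat the query so that it is paired with every reference image
    queries = np.repeat(query[np.newaxis], len(ref_images), axis=0)
    distances = np.asarray(distance_fn(queries, ref_images)).ravel()
    best = int(np.argmin(distances))
    return ref_classes[best], distances[best]

def toy_distance(a, b):
    """Stand-in distance: mean absolute pixel difference (NOT the trained model)."""
    return np.mean(np.abs(a - b), axis=(1, 2, 3))

# One made-up reference image per class, plus a query image
ref_images = np.stack([np.full((4, 4, 1), v) for v in (0.0, 0.5, 1.0)])
ref_classes = np.array([0, 1, 2])
query = np.full((4, 4, 1), 0.9)

predicted_class, distance = predict_bodypart(query, ref_images, ref_classes, toy_distance)
print(predicted_class)  # 2: the query is closest to the third reference image
```

With the trained model, `distance_fn` could be something like `lambda a, b: model.predict([a, b])`. Taking the single closest image is only one option; averaging the distances per bodypart class before choosing would be a reasonable alternative.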