# Mohcine Madkour


# Data Science Problem

A collaborator from the Department of Plastic Surgery wants to develop an ML-based computational tool that can tell whether an input ear image is normal or abnormal and, if abnormal, what ear shape would be the ideal result of a surgical intervention. The collaborator currently has 300 abnormal and 200 normal ear images. Assuming you are in charge of this project, what would be the best strategy for implementing the tool? Elaborate your strategy, if possible, along with pseudocode.

# Approach

Convolutional Neural Networks (CNNs or ConvNets) are by design among the best models available for most perceptual problems such as image classification, even with relatively little data to learn from. A CNN convolves learned filters with the input data using 2D convolutional layers, which makes the architecture well suited to 2D data such as images. Training a ConvNet from scratch on a small image dataset can still yield reasonable results, without any custom feature engineering.

CNNs eliminate the need for manual feature extraction, so we do not need to identify features used to classify images. The CNN works by extracting features directly from images. The relevant features are not pretrained; they are learned while the network trains on a collection of images. This automated feature extraction makes deep learning models highly accurate for computer vision tasks such as object classification.

In a CNN, the image passes through a series of convolutional, nonlinear, pooling, and fully connected layers, which then generate the output:

![title](./img/cnn1.png "ShowMyImage")

CNNs learn to detect different features of an image using tens or hundreds of hidden layers. Every hidden layer increases the complexity of the learned image features. For example, the first hidden layer could learn how to detect edges, and the last learns how to detect more complex shapes specifically catered to the shape of the object we are trying to recognize.
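To make the convolution, nonlinearity and pooling steps concrete, here is a minimal NumPy sketch on a toy image. The vertical-edge kernel is hand-written for illustration only; in a real CNN the kernel weights are learned during training. The function names are my own and not part of any library.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over a 2D `image` without padding (cross-correlation, as in CNN layers)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Nonlinear layer: keep positive responses, zero out the rest."""
    return np.maximum(x, 0.0)

def max_pool_2x2(x):
    """Pooling layer: keep the strongest response in each 2x2 block."""
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Toy 6x6 image: dark left half, bright right half (one vertical edge)
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# Hand-written vertical-edge kernel (illustrative, not learned)
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

feature_map = max_pool_2x2(relu(conv2d_valid(image, kernel)))
print(feature_map)
```

The feature map responds strongly where the edge is, which is the kind of low-level feature a first hidden layer could learn on its own.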

# Method

## Splitting the Dataset into Training and Test Sets

We first need to split our images into training and test sets (90% and 10%):

In [None]:
import numpy as np

# Fraction of the images used for training; the remainder is held out for testing
TRAIN_TEST_SPLIT = 0.9

# `images` and `labels` are assumed to be NumPy arrays loaded beforehand,
# with one entry per ear image
n_images = len(images)

# Shuffle the indices, then split at the given index
split_index = int(TRAIN_TEST_SPLIT * n_images)
shuffled_indices = np.random.permutation(n_images)
train_indices = shuffled_indices[:split_index]
test_indices = shuffled_indices[split_index:]

# Split the images and the labels
x_train = images[train_indices]
y_train = labels[train_indices]
x_test = images[test_indices]
y_test = labels[test_indices]

![title](./img/classes.png "ShowMyImage")
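Because the two classes are not the same size (300 abnormal and 200 normal images), a purely random split can leave the small test set with a skewed class ratio. A possible refinement is a stratified split, sketched below in NumPy. The helper `stratified_split` and the label coding (1 = abnormal, 0 = normal) are assumptions for illustration.

```python
import numpy as np

def stratified_split(labels, train_fraction=0.9, seed=0):
    """Return (train_idx, test_idx), keeping each class's share the same in both sets."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train_parts, test_parts = [], []
    for cls in np.unique(labels):
        # Shuffle the indices of this class only, then cut them at the same fraction
        cls_idx = rng.permutation(np.flatnonzero(labels == cls))
        n_train = int(round(train_fraction * len(cls_idx)))
        train_parts.append(cls_idx[:n_train])
        test_parts.append(cls_idx[n_train:])
    return np.concatenate(train_parts), np.concatenate(test_parts)

# Hypothetical label vector matching the stated dataset: 1 = abnormal, 0 = normal
demo_labels = np.array([1] * 300 + [0] * 200)
train_idx, test_idx = stratified_split(demo_labels)
print(len(train_idx), len(test_idx))
print(np.bincount(demo_labels[test_idx]))  # per-class counts in the test set
```

The returned index arrays can be used exactly like `train_indices` and `test_indices` in the cell above.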

## Data pre-processing and data augmentation

Because I have a relatively small number of pictures, I augment them via a number of random transformations. This helps prevent overfitting and helps the model generalize better.
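To illustrate what such random transformations do to a single image, here is a small NumPy sketch of two of them: a random horizontal flip and a random shift with edge padding. It is not the Keras implementation, only an illustration; the function name and the random stand-in image are assumptions.

```python
import numpy as np

def random_flip_and_shift(image, rng, max_shift=0.2):
    """Randomly mirror an (H, W, C) image and shift it, padding with edge pixels."""
    h, w = image.shape[:2]
    if rng.random() < 0.5:
        image = image[:, ::-1]  # horizontal flip
    # Draw a vertical and a horizontal shift of at most max_shift * size pixels
    dy = int(rng.integers(-int(max_shift * h), int(max_shift * h) + 1))
    dx = int(rng.integers(-int(max_shift * w), int(max_shift * w) + 1))
    pad_y, pad_x = abs(dy), abs(dx)
    padded = np.pad(image, ((pad_y, pad_y), (pad_x, pad_x), (0, 0)), mode="edge")
    # Crop a window of the original size, offset by the drawn shift
    top, left = pad_y - dy, pad_x - dx
    return padded[top:top + h, left:left + w]

rng = np.random.default_rng(0)
ear = rng.random((150, 150, 3))  # stand-in for a real ear image scaled to [0, 1]
augmented = [random_flip_and_shift(ear, rng) for _ in range(4)]
print([a.shape for a in augmented])
```

One point to check with the collaborator: a horizontal flip turns a left ear into a right ear, which may or may not be acceptable for this task.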

In Keras this can be done via the *keras.preprocessing.image.ImageDataGenerator* class. This class lets us configure random transformations and normalization operations to be applied to the image data during training, and it instantiates generators of augmented image batches (and their labels) via *.flow(data, labels)* or *.flow_from_directory(directory)*. These generators can then be used with the Keras model methods that accept data generators as inputs: *fit_generator*, *evaluate_generator* and *predict_generator*. (In recent Keras releases, *fit*, *evaluate* and *predict* accept generators directly.)

From each single ear image, I can generate several augmented pictures and save them to disk. I leave rescaling disabled in this case to keep the saved images displayable:

In [None]:
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img

datagen = ImageDataGenerator(
        rotation_range=40,
        width_shift_range=0.2,
        height_shift_range=0.2,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True,
        fill_mode='nearest')

img = load_img('data/train/EARs/ear.0.jpg')  # this is a PIL image
x = img_to_array(img)  # NumPy array of shape (height, width, 3) with the default channels_last format
x = x.reshape((1,) + x.shape)  # add a batch axis: shape (1, height, width, 3)

The .flow() call below generates batches of randomly transformed images and saves the results to the `preview/` directory:

In [None]:
# the 'preview' directory must already exist
i = 0
for batch in datagen.flow(x, batch_size=1, save_to_dir='preview', save_prefix='ear', save_format='jpeg'):
    i += 1
    if i > 20:
        break  # otherwise the generator would loop indefinitely

## Model Construction and Training using Convnet from scratch

The three most common ways people use deep learning to perform object classification are training from scratch, transfer learning, and feature extraction. Because this is a new application, I train a ConvNet from scratch as an initial baseline.

In my case I use only a small ConvNet with few layers and few filters per layer, alongside data augmentation and dropout. Dropout helps reduce overfitting by preventing a layer from seeing the exact same pattern twice, thus acting in a way analogous to data augmentation.
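As an illustration of what dropout does during training, here is a minimal NumPy sketch of the common "inverted dropout" formulation: a random subset of activations is zeroed and the survivors are rescaled so the expected value is unchanged. The function name and the toy activations are assumptions; this is not the Keras source code.

```python
import numpy as np

def dropout(activations, rate, rng, training=True):
    """Zero a random fraction `rate` of activations and rescale the rest (inverted dropout)."""
    if not training or rate == 0.0:
        return activations  # dropout is a no-op at inference time
    keep_prob = 1.0 - rate
    mask = rng.random(activations.shape) < keep_prob
    return activations * mask / keep_prob

rng = np.random.default_rng(0)
acts = np.ones((1000, 64))  # toy activations of a 64-unit layer
dropped = dropout(acts, rate=0.5, rng=rng)
print(dropped.mean())  # stays close to the original mean of 1.0
```

Because a different mask is drawn at every step, the layer never sees exactly the same activation pattern twice.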

The code snippet below constructs the model using Keras and TensorFlow: a simple stack of 3 convolution layers, each with a ReLU activation and followed by a max-pooling layer:

In [None]:
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation, Dropout, Flatten, Dense

model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(150, 150, 3)))  # channels-last input, as used by the TensorFlow backend
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

# the model so far outputs 3D feature maps (height, width, features)
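
To see how large those feature maps are, the spatial size can be traced through the stack by hand. The sketch below assumes a 150x150 input, 3x3 convolutions without padding and with stride 1, and 2x2 max-pooling, which match the Keras defaults used above. The helper names are my own.

```python
def conv_out(size, kernel=3):
    """Output size of an unpadded, stride-1 convolution."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of a non-overlapping max-pooling layer."""
    return size // pool

size = 150
for _ in range(3):  # three conv + pool blocks
    size = pool_out(conv_out(size))

flattened = size * size * 64        # the last block has 64 filters
dense_params = flattened * 64 + 64  # weights + biases of a Dense(64) layer on top
print(size, flattened, dense_params)
```

Most of the parameters of such a small network sit in the first fully-connected layer, which is one reason dropout is applied there.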


On top of it I stack two fully-connected layers. I end the model with a single unit and a sigmoid activation, which suits binary classification. To go with it, I use the binary_crossentropy loss to train the model.

In [None]:
model.add(Flatten())  # this converts our 3D feature maps to 1D feature vectors
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])
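
For intuition, the sigmoid output and the binary cross-entropy loss can be written out in a few lines of NumPy. This is a sketch of the standard formulas, not the Keras implementation; the toy logits and labels are assumptions.

```python
import numpy as np

def sigmoid(z):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean negative log-likelihood of binary labels under the predicted probabilities."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)  # avoid log(0)
    losses = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(np.mean(losses))

logits = np.array([2.0, -1.0, 0.0])  # toy outputs of the final Dense(1) layer
y_true = np.array([1.0, 0.0, 1.0])   # toy ground-truth labels
probs = sigmoid(logits)
loss = binary_crossentropy(y_true, probs)
print(probs, loss)
```

The loss is small when the predicted probability is close to the true label and grows quickly for confident wrong predictions.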

I use .flow_from_directory() to generate batches of image data (and their labels) directly from the JPEG files in their respective class folders.

In [None]:
batch_size = 16

# this is the augmentation configuration we will use for training
train_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)

# this is the augmentation configuration we will use for testing:
# only rescaling
test_datagen = ImageDataGenerator(rescale=1./255)

# this is a generator that will read pictures found in
# subfolders of 'data/train', and indefinitely generate
# batches of augmented image data
train_generator = train_datagen.flow_from_directory(
        'data/train',  # this is the target directory
        target_size=(150, 150),  # all images will be resized to 150x150
        batch_size=batch_size,
        class_mode='binary')  # since we use binary_crossentropy loss, we need binary labels

# this is a similar generator, for validation data
validation_generator = test_datagen.flow_from_directory(
        'data/validation',
        target_size=(150, 150),
        batch_size=batch_size,
        class_mode='binary')

I can now use these generators to train the model:

In [None]:
model.fit_generator(
        train_generator,
        # one pass over the training images per epoch
        steps_per_epoch=train_generator.samples // batch_size,
        epochs=50,
        validation_data=validation_generator,
        validation_steps=validation_generator.samples // batch_size)
model.save_weights('first_try.h5')  # always save your weights after training or during training
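
Since the dataset has 300 abnormal and 200 normal images, one optional refinement is to weight the classes inversely to their frequency, so the smaller class counts proportionally more in the loss. The sketch below computes such "balanced" weights; the helper name and the class coding (0 = normal, 1 = abnormal) are assumptions, and the actual mapping depends on the folder names read by the generator.

```python
def balanced_class_weights(counts):
    """Weight each class by total / (n_classes * class_count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {cls: total / (n_classes * n) for cls, n in counts.items()}

# Assumed coding: 0 = normal (200 images), 1 = abnormal (300 images)
class_weight = balanced_class_weights({0: 200, 1: 300})
print(class_weight)
```

A dictionary of this form can be passed through the `class_weight` argument of the Keras training call.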

## Feeding images into the ConvNet and evaluating the model

After the model has finished training, I can evaluate the model on the test set using the following code:

In [None]:
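# Make a prediction on the test set
# (x_test is assumed to be resized to 150x150 and rescaled by 1/255, like the training batches)
test_predictions = model.predict(x_test)
# Round the sigmoid scores to 0 or 1 and flatten to a 1D label vector
test_predictions = np.round(test_predictions).astype(int).ravel()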

I have to round the scores so that I get a binary output, which amounts to thresholding at 0.5.

Then I compare the predictions to the ground truth:

In [None]:
from sklearn.metrics import accuracy_score

# Report the accuracy
accuracy = accuracy_score(y_test, test_predictions)
print("Accuracy: " + str(accuracy))
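
Accuracy alone can hide how the errors are distributed between the two classes, which matters in a clinical setting. As a complement, the following NumPy sketch derives sensitivity and specificity from the confusion-matrix counts. The helper name, the label coding (1 = abnormal) and the toy scores are assumptions for illustration, not results.

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Confusion-matrix counts plus accuracy, sensitivity and specificity."""
    y_true = np.asarray(y_true).astype(int).ravel()
    y_pred = np.asarray(y_pred).astype(int).ravel()
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }

# Toy values for illustration only: 1 = abnormal, 0 = normal
demo_true = [1, 1, 1, 0, 0, 1, 0, 1]
demo_scores = np.array([0.9, 0.4, 0.8, 0.2, 0.6, 0.7, 0.1, 0.55])
demo_pred = (demo_scores >= 0.5).astype(int)  # same effect as rounding the scores
metrics = binary_metrics(demo_true, demo_pred)
print(metrics)
```

The same function can be applied to `y_test` and `test_predictions` from the cells above.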

# Predicting ideal shapes of abnormal ears after a surgical intervention

The abnormal images represent deformations relative to normal images, so measuring similarity or dissimilarity is quite difficult. We need either a suitable AI technique that uses prior knowledge to produce a meaningful matching score, or a pre-processing step that reduces such deformations, for example with a warping technique.
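
One simple form of such a warping step is an affine alignment of landmark points to a reference shape. The sketch below fits the transform by least squares with NumPy. It assumes that corresponding 2D landmarks have been annotated on each ear, which is not stated in the problem; the landmark coordinates and function names are hypothetical.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares 2D affine transform mapping src landmarks (N, 2) onto dst (N, 2)."""
    src_h = np.hstack([src, np.ones((len(src), 1))])      # homogeneous coordinates, (N, 3)
    params, *_ = np.linalg.lstsq(src_h, dst, rcond=None)  # (3, 2) transform matrix
    return params

def apply_affine(points, params):
    """Apply a fitted affine transform to (N, 2) points."""
    return np.hstack([points, np.ones((len(points), 1))]) @ params

# Hypothetical reference landmarks and a rotated, scaled, shifted copy of them
reference = np.array([[0.0, 0.0], [10.0, 0.0], [10.0, 20.0], [0.0, 20.0], [5.0, 10.0]])
theta = np.deg2rad(15.0)
rotation = np.array([[np.cos(theta), -np.sin(theta)],
                     [np.sin(theta),  np.cos(theta)]])
observed = 1.2 * reference @ rotation.T + np.array([3.0, -2.0])

params = fit_affine(observed, reference)
aligned = apply_affine(observed, params)
print(np.abs(aligned - reference).max())  # residual after alignment
```

An affine fit removes differences in pose and scale only; whatever residual remains after alignment is the shape difference of interest.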

To choose a similarity/dissimilarity measure, we first need to know which ear biometric features we are trying to compare. Second, we need to know what we wish to obtain from the comparison: Do we want to group the images? Do we simply want to tell how different they are? Do we want to tell how the images change with respect to a particular surgical procedure?

Clearly, each clinical question calls for different technical metrics to quantify image deformation. Correlation analysis can show the similarity between different image features; however, correlation studies the relationship between one feature and another, not the differences. Bland-Altman analysis provides a method of measuring variability relative to an accepted reference standard. It quantifies the agreement between two quantitative measurements by studying the mean difference and constructing limits of agreement.
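
The Bland-Altman quantities are straightforward to compute. The following NumPy sketch returns the mean difference (bias) and the conventional 95% limits of agreement, bias ± 1.96 standard deviations of the differences. The measurement values are hypothetical, standing in for one ear feature measured by two methods.

```python
import numpy as np

def bland_altman(a, b):
    """Return (bias, lower limit, upper limit) of agreement between two measurement series."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    diff = a - b
    bias = diff.mean()
    sd = diff.std(ddof=1)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd

# Hypothetical paired measurements (e.g. in mm) from a new method and a reference method
method_a = [60.1, 58.4, 62.0, 59.5, 61.2]
method_b = [59.8, 58.9, 61.5, 59.9, 60.7]
bias, lower, upper = bland_altman(method_a, method_b)
print(bias, lower, upper)
```

A bias near zero with narrow limits indicates that the two measurements agree well; a plot of the differences against the pairwise means is the usual companion to these numbers.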

A supervised learning method for predicting the ideal shape of an ear after surgery could work as follows. With the knowledge of a human expert (as we have a limited number of images), we group normal and abnormal shapes into granular groups, i.e. label them with defined classes. We can then train a CNN that predicts the class of an input image, the same way as we did for classifying normal and abnormal images. As a result, the CNN-based model extracts the features inherent to each class and adjusts the network weights so that it learns this classification.
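
Moving from two classes to several expert-defined groups mainly changes the output layer: one unit per group with a softmax activation, trained on one-hot labels with a categorical cross-entropy loss. The NumPy sketch below shows the softmax and the label encoding. The group names and the logits are placeholders; the real groups would come from the clinical expert.

```python
import numpy as np

# Placeholder group names; the actual grouping would be defined by the clinical expert
SHAPE_CLASSES = ["normal_type_1", "normal_type_2", "abnormal_type_1", "abnormal_type_2"]

def softmax(logits):
    """Turn raw scores into per-class probabilities that sum to 1 for each image."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def one_hot(indices, n_classes):
    """Encode integer class indices as one-hot target vectors."""
    return np.eye(n_classes)[np.asarray(indices)]

# Toy final-layer outputs for two images (illustrative values)
logits = np.array([[2.0, 0.5, -1.0, 0.1],
                   [-0.3, 0.2, 3.0, 0.0]])
probs = softmax(logits)
predicted = [SHAPE_CLASSES[i] for i in probs.argmax(axis=1)]
targets = one_hot([0, 2], len(SHAPE_CLASSES))
print(predicted)
```

With only 500 images, the number of groups has to stay small enough that each one still has a usable number of examples.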