# Abstract

**This notebook introduces our approach to the "RSNA Intracranial Hemorrhage Detection" competition. Specifically, we focus on how we tuned a pretrained neural net from the Keras library (ResNet50, with InceptionV3 as the engine in the code below) to achieve worthwhile predictions of hemorrhages and their subtypes.**


### Acknowledgements
We benefited heavily from the public notebooks available for this competition, including amongst others:
* [Akensert](https://www.kaggle.com/akensert/inceptionv3-prev-resnet50-keras-baseline-model)'s notebook on the InceptionV3 implementation
* [Ryan Epp](https://www.kaggle.com/reppic/gradient-sigmoid-windowing)'s notebook on windowing
* [Marco E](https://www.kaggle.com/marcovasquez/basic-eda-data-visualization)'s EDA notebook
* ...



### Further Information

* [Residual Networks in Keras 1](https://towardsdatascience.com/hitchhikers-guide-to-residual-networks-resnet-in-keras-385ec01ec8ff)/[Residual Networks in Keras 2](https://towardsdatascience.com/understanding-and-coding-a-resnet-in-keras-446d7ff84d33), and [ResNet50](https://www.kaggle.com/keras/resnet50) specifically
* [Hyperparameter tuning with Keras Tuner](https://keras-team.github.io/keras-tuner/) (and a [simple introduction](https://www.mikulskibartosz.name/how-to-automaticallyselect-the-hyperparameters-of-a-resnet-neural-network/))

# 1. Preparation

In [None]:
import numpy as np
import pandas as pd
import pydicom
import cv2
from math import ceil, floor, log
from os import listdir
from os.path import isfile, join
import matplotlib.pyplot as plt

import tensorflow as tf
import keras
#from keras_applications.resnet import ResNet50
from keras_applications.inception_v3 import InceptionV3
from sklearn.model_selection import ShuffleSplit

We point the directory variables at the Kaggle input directory. This makes handling the data throughout this notebook much easier:

In [None]:
# Set directories
DATA_DIR = '../input/rsna-intracranial-hemorrhage-detection/rsna-intracranial-hemorrhage-detection'

TRAIN_IMAGES_DIR = DATA_DIR + '/stage_2_train/'
TRAIN_CSV_DIR = DATA_DIR + '/stage_2_train.csv'

TEST_IMAGES_DIR = DATA_DIR + '/stage_2_test/'
TEST_CSV_DIR = DATA_DIR + '/stage_2_sample_submission.csv'

# 2. Data Cleaning and Exploration

The training CSV has 4,516,842 rows but only 4,516,818 unique label IDs, so there are 24 duplicate rows we need to remove. We therefore define one function that loads the train labels without the duplicates and another that prepares the test labels.

In [None]:
# Check for duplicates in trainset csv
train = pd.read_csv(TRAIN_CSV_DIR)
print(train.head(10))
print(len(train)) # 4,516,842 training labels
print(len(train.ID.unique())) # 4,516,818 unique training labels

# Identify duplicate rows in trainset
duplicates = train[train.duplicated(subset=None, keep='first')]
print(duplicates)

# Define function to remove duplicates when loading trainset
def read_trainset(filename=TRAIN_CSV_DIR):
    df = pd.read_csv(filename)
    df["Image"] = df["ID"].str.slice(stop=12)
    df["Diagnosis"] = df["ID"].str.slice(start=13)
    
    duplicates_to_remove = [
        56346, 56347, 56348, 56349, 56350, 56351,
        1171830, 1171831, 1171832, 1171833, 1171834,
        1171835, 3705312, 3705313, 3705314, 3705315, 
        3705316, 3705317, 3842478, 3842479, 3842480,
        3842481, 3842482, 3842483
    ]
    
    df = df.drop(index=duplicates_to_remove)
    df = df.reset_index(drop=True)
    
    df = df.loc[:, ["Label", "Diagnosis", "Image"]]
    df = df.set_index(['Image', 'Diagnosis']).unstack(level=-1)
    
    return df

# Define function to load testset
def read_testset(filename=TEST_CSV_DIR):
    df = pd.read_csv(filename)
    df["Image"] = df["ID"].str.slice(stop=12)
    df["Diagnosis"] = df["ID"].str.slice(start=13)
    
    df = df.loc[:, ["Label", "Diagnosis", "Image"]]
    df = df.set_index(['Image', 'Diagnosis']).unstack(level=-1)
    
    return df
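To make the reshaping in `read_trainset` concrete, here is a minimal sketch on a made-up five-row table. The IDs and labels are invented for illustration, and `drop_duplicates()` is an assumption standing in for the hard-coded row indices above; it only gives the same result when the duplicated rows are exact copies.

```python
import pandas as pd

# Toy stand-in for stage_2_train.csv (IDs and labels are made up).
toy = pd.DataFrame({
    "ID": [
        "ID_000000001_any", "ID_000000001_epidural",
        "ID_000000002_any", "ID_000000002_epidural",
        "ID_000000002_epidural",  # exact duplicate of the previous row
    ],
    "Label": [1, 0, 0, 0, 0],
})

# Assumption: duplicates are exact row copies, so drop_duplicates() can
# replace the hard-coded index list used in read_trainset().
toy = toy.drop_duplicates().reset_index(drop=True)

# "ID_" plus a 9-character hash is 12 characters; the diagnosis follows the underscore.
toy["Image"] = toy["ID"].str.slice(stop=12)
toy["Diagnosis"] = toy["ID"].str.slice(start=13)

# One row per image, one column per diagnosis.
wide = toy.set_index(["Image", "Diagnosis"])[["Label"]].unstack(level=-1)
print(wide)
```

The result has one row per image and a two-level column index (`Label`, diagnosis), which is the layout the rest of the notebook relies on when it looks up the labels of an image with `.loc[ID]`.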

Then we load the .CSV data correctly for train as well as test.

In [None]:
# Load the data as two dataframes
df = read_trainset()
# Keep only a tiny random sample for a quick dry run; remove the .sample() lines to use the full data
df = df.sample(frac=.00001)
test_df = read_testset()
test_df = test_df.sample(frac=.00001)

In [None]:
print(df.tail(7))
print(df.shape)  
# Without the subsampling above: 752,803 train images with 6 labels each = 4,516,818 labels
# This matches the number of unique training labels found earlier

Out of 752,803 train images, 107,933 are labeled as hemorrhage. The subtypes are heavily imbalanced: epidural is very rare, while subdural is the most frequently identified subtype:

In [None]:
label_sum = df.sum()
print(label_sum)
label_sum.plot.bar()

# 3. Image Preparation

Before moving on to loading the images, let's first check what the images look like in the original data:

In [None]:
dcm = pydicom.dcmread(TRAIN_IMAGES_DIR + 'ID_ffffb670a' + '.dcm')
plt.imshow(dcm.pixel_array, cmap=plt.cm.bone)

Throughout the Kaggle competition, many challenges around DICOM images have been identified and discussed. In the following, we use some of these insights as an important pre-processing step:

In [None]:
# Shift everything up by 1000, then wrap the values at or above 4096 (px_mode) back to where they should have been. (JER)
def correct_dcm(dcm):
    x = dcm.pixel_array + 1000
    px_mode = 4096
    x[x>=px_mode] = x[x>=px_mode] - px_mode
    dcm.PixelData = x.tobytes()
    dcm.RescaleIntercept = -1000

# Window the DICOM image
def window_image(dcm, window_center, window_width):
    if (dcm.BitsStored == 12) and (dcm.PixelRepresentation == 0) and (int(dcm.RescaleIntercept) > -100):
        correct_dcm(dcm)
    
    img = dcm.pixel_array * dcm.RescaleSlope + dcm.RescaleIntercept
    img_min = window_center - window_width // 2
    img_max = window_center + window_width // 2
    img = np.clip(img, img_min, img_max)

    return img

# Save different images as R, G, B: in 3 dimensions
def bsb_window(dcm):
    brain_img = window_image(dcm, 40, 80)
    subdural_img = window_image(dcm, 80, 200)
    soft_img = window_image(dcm, 40, 380)
    
    brain_img = (brain_img - 0) / 80
    subdural_img = (subdural_img - (-20)) / 200
    soft_img = (soft_img - (-150)) / 380
    bsb_img = np.array([brain_img, subdural_img, soft_img]).transpose(1,2,0)

    return bsb_img
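The pixel correction and the windowing can be tried on plain arrays without any DICOM file. The sketch below is only an illustration: the helper names `fix_wrapped_pixels` and `window_and_scale` are hypothetical, the input values are made up, and a `RescaleSlope` of 1 with a `RescaleIntercept` of -1000 is assumed.

```python
import numpy as np

def fix_wrapped_pixels(raw, px_mode=4096):
    """Shift by 1000 and wrap values >= px_mode, mirroring correct_dcm()."""
    x = raw.astype(np.int32) + 1000
    x[x >= px_mode] -= px_mode
    return x

def window_and_scale(hu, center, width):
    """Clip Hounsfield units to the window and rescale the result to [0, 1]."""
    lo = center - width // 2
    hi = center + width // 2
    return (np.clip(hu, lo, hi) - lo) / (hi - lo)

# Made-up raw 12-bit values; the last two are the "wrapped" ones.
raw = np.array([0, 100, 3096, 4000])
# Assumed RescaleSlope = 1 and RescaleIntercept = -1000 after the correction.
hu_fixed = fix_wrapped_pixels(raw) * 1 + (-1000)

# Made-up Hounsfield values covering air, water, soft tissue and bone.
hu = np.array([-1000.0, 0.0, 40.0, 80.0, 1000.0])
brain = window_and_scale(hu, 40, 80)       # window [0, 80]
subdural = window_and_scale(hu, 80, 200)   # window [-20, 180]
soft = window_and_scale(hu, 40, 380)       # window [-150, 230]

# Stack the three windows as channels, as bsb_window() does for a 2-D image.
bsb = np.stack([brain, subdural, soft], axis=-1)
print(hu_fixed)
print(bsb)
```

Each window maps its own Hounsfield range to [0, 1], so the three channels emphasize different tissue contrasts of the same slice.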

After the corrections and the windowing, the picture looks like this:

In [None]:
plt.imshow(bsb_window(dcm), cmap=plt.cm.bone)

We then create a helper function to load and resize the images correctly. We also want 3-channel inputs, e.g. for the ResNet. To get them, we use the bsb_window function that was defined earlier:

In [None]:
def data_read(path, desired_size):
    dcm = pydicom.dcmread(path)
    
    try:
        img = bsb_window(dcm)
    except Exception:
        # Fall back to a blank image if the pixel data cannot be decoded
        img = np.zeros(desired_size)
    
    img = cv2.resize(img, desired_size[:2], interpolation=cv2.INTER_LINEAR) # Make image smaller
    
    return img

In [None]:
plt.imshow(data_read(TRAIN_IMAGES_DIR +'ID_ffffb670a'+'.dcm', (128, 128)), cmap=plt.cm.bone)

# 4. Data Generator

We then introduce a so-called data generator that lets us load the data in batches and feed them into the neural net:

In [None]:
# Inherits from keras.utils.Sequence object and thus should be safe for multiprocessing
class DataGenerator(keras.utils.Sequence):

    def __init__(self, list_IDs, labels=None, batch_size=1, img_size=(512, 512, 1), 
                 img_dir=TRAIN_IMAGES_DIR, *args, **kwargs):

        self.list_IDs = list_IDs
        self.labels = labels
        self.batch_size = batch_size
        self.img_size = img_size
        self.img_dir = img_dir
        self.on_epoch_end()

    def __len__(self):
        return int(np.ceil(len(self.indices) / self.batch_size))

    def __getitem__(self, index):
        indices = self.indices[index*self.batch_size:(index+1)*self.batch_size]
        list_IDs_temp = [self.list_IDs[k] for k in indices]
        
        if self.labels is not None:
            X, Y = self.__data_generation(list_IDs_temp)
            return X, Y
        else:
            X = self.__data_generation(list_IDs_temp)
            return X
        
    def on_epoch_end(self):
        
        
        if self.labels is not None: # for training phase we undersample and shuffle
            # keep probability of any=0 and any=1
            keep_prob = self.labels.iloc[:, 0].map({0: 0.35, 1: 0.5})
            keep = (keep_prob > np.random.rand(len(keep_prob)))
            self.indices = np.arange(len(self.list_IDs))[keep]
            np.random.shuffle(self.indices)
        else:
            self.indices = np.arange(len(self.list_IDs))

    def __data_generation(self, list_IDs_temp):
        # Size the arrays by the actual batch so a smaller final batch has no uninitialized rows
        X = np.empty((len(list_IDs_temp), *self.img_size))
        
        if self.labels is not None: # training phase
            Y = np.empty((len(list_IDs_temp), 6), dtype=np.float32)
        
            for i, ID in enumerate(list_IDs_temp):
                X[i,] = data_read(self.img_dir+ID+".dcm", self.img_size)
                Y[i,] = self.labels.loc[ID].values
        
            return X, Y
        
        else: # test phase
            for i, ID in enumerate(list_IDs_temp):
                X[i,] = data_read(self.img_dir+ID+".dcm", self.img_size)
            
            return X
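The undersampling in `on_epoch_end` and the batch slicing in `__getitem__` can be sketched with NumPy alone. The label vector below is synthetic (1,000 negatives and 200 positives), and the seeded `default_rng` is an assumption made only to keep the example reproducible; the generator itself uses `np.random`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "any" labels: 1,000 images without and 200 with hemorrhage.
any_label = np.array([0] * 1000 + [1] * 200)

# Same keep probabilities as in the generator: 0.35 for any=0, 0.5 for any=1.
keep_prob = np.where(any_label == 1, 0.5, 0.35)
keep = keep_prob > rng.random(len(any_label))

# Surviving sample indices, shuffled once per epoch.
indices = np.arange(len(any_label))[keep]
rng.shuffle(indices)

# Slice the indices into batches; the last batch may be smaller.
batch_size = 32
n_batches = int(np.ceil(len(indices) / batch_size))
batches = [indices[i * batch_size:(i + 1) * batch_size] for i in range(n_batches)]

frac_before = any_label.mean()
frac_after = any_label[indices].mean()
print(n_batches, len(batches[-1]), frac_before, frac_after)
```

Because positives are kept with a higher probability than negatives, the share of hemorrhage images per epoch rises, and a fresh random subset is drawn every epoch.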

But before using the data generator, we quickly check the number of images: does it match the number of labels?

In [None]:
#train_check = [f for f in listdir(TRAIN_IMAGES_DIR) if isfile(join(TRAIN_IMAGES_DIR, f))]
#test_check = [f for f in listdir(TEST_IMAGES_DIR) if isfile(join(TEST_IMAGES_DIR, f))]

#print('Amount train images:', len(train_check))
print('Amount train labels:', df.shape)

#print('Amount test images:', len(test_check))
print('Amount test labels:', test_df.shape) 

# Looks good!

# 5. Loss function

We are ready to move on. Before building the model, we define the loss and the metrics used to evaluate its predictions:

In [None]:
from keras import backend as K

def weighted_log_loss(y_true, y_pred):
    """
    Can be used as the loss function in model.compile()
    ---------------------------------------------------
    """
    
    class_weights = np.array([2., 1., 1., 1., 1., 1.])
    
    eps = K.epsilon()
    
    y_pred = K.clip(y_pred, eps, 1.0-eps)

    out = -(         y_true  * K.log(      y_pred) * class_weights
            + (1.0 - y_true) * K.log(1.0 - y_pred) * class_weights)
    
    return K.mean(out, axis=-1)


def _normalized_weighted_average(arr, weights=None):
    """
    A simple Keras implementation that mimics that of 
    numpy.average(), specifically for this competition
    """
    
    if weights is not None:
        scl = K.sum(weights)
        weights = K.expand_dims(weights, axis=1)
        return K.sum(K.dot(arr, weights), axis=1) / scl
    return K.mean(arr, axis=1)


def weighted_loss(y_true, y_pred):
    """
    Will be used as the metric in model.compile()
    ---------------------------------------------
    
    Similar to the custom loss function 'weighted_log_loss()' above
    but with normalized weights, which should be very similar 
    to the official competition metric:
        https://www.kaggle.com/kambarakun/lb-probe-weights-n-of-positives-scoring
    and hence:
        sklearn.metrics.log_loss with sample weights
    """
    
    class_weights = K.variable([2., 1., 1., 1., 1., 1.])
    
    eps = K.epsilon()
    
    y_pred = K.clip(y_pred, eps, 1.0-eps)

    loss = -(        y_true  * K.log(      y_pred)
            + (1.0 - y_true) * K.log(1.0 - y_pred))
    
    loss_samples = _normalized_weighted_average(loss, class_weights)
    
    return K.mean(loss_samples)


def weighted_log_loss_metric(trues, preds):
    """
    Will be used to calculate the log loss 
    of the validation set in PredictionCheckpoint()
    ------------------------------------------
    """
    class_weights = [2., 1., 1., 1., 1., 1.]
    
    epsilon = 1e-7
    
    preds = np.clip(preds, epsilon, 1-epsilon)
    loss = trues * np.log(preds) + (1 - trues) * np.log(1 - preds)
    loss_samples = np.average(loss, axis=1, weights=class_weights)

    return - loss_samples.mean()
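A small worked example shows what the class weights do. The function below is a NumPy re-statement of `weighted_log_loss_metric` with the weights exposed as a parameter; the name `weighted_log_loss_np` and the single made-up prediction are illustrative only.

```python
import numpy as np

def weighted_log_loss_np(trues, preds, class_weights=(2., 1., 1., 1., 1., 1.), eps=1e-7):
    """Per-label binary log loss, averaged with class weights, then over samples."""
    preds = np.clip(preds, eps, 1 - eps)
    loss = -(trues * np.log(preds) + (1 - trues) * np.log(1 - preds))
    return np.average(loss, axis=1, weights=class_weights).mean()

# One image: "any" and the last subtype are positive.
trues = np.array([[1, 0, 0, 0, 0, 1]], dtype=float)
# The model is unsure about "any" (0.5) but confident about the subtypes.
preds = np.array([[0.5, 0.1, 0.1, 0.1, 0.1, 0.9]])

weighted = weighted_log_loss_np(trues, preds)
unweighted = weighted_log_loss_np(trues, preds, class_weights=(1.,) * 6)
print(weighted, unweighted)
```

With the "any" label counted twice, the uncertain "any" prediction dominates the result, so the weighted loss (about 0.273) is higher than the unweighted one (about 0.203).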

# 6. Model

The model consists of two parts:

The input image is passed through InceptionV3 (the "engine"). InceptionV3 could be replaced by any of the architectures available in keras_applications.

The output from InceptionV3 then goes through global average pooling followed by a dense sigmoid output layer (a hidden dense layer with dropout is present in the code but commented out).

In [None]:
class PredictionCheckpoint(keras.callbacks.Callback):
    
    def __init__(self, test_df, valid_df, 
                 test_images_dir=TEST_IMAGES_DIR, 
                 valid_images_dir=TRAIN_IMAGES_DIR, 
                 batch_size=32, input_size=(224, 224, 3)):
        
        self.test_df = test_df
        self.valid_df = valid_df
        self.test_images_dir = test_images_dir
        self.valid_images_dir = valid_images_dir
        self.batch_size = batch_size
        self.input_size = input_size
        
    def on_train_begin(self, logs=None):
        self.test_predictions = []
        self.valid_predictions = []
        
    def on_epoch_end(self, epoch, logs=None):
        self.test_predictions.append(
            self.model.predict_generator(
                DataGenerator(self.test_df.index, None, self.batch_size, self.input_size, self.test_images_dir), verbose=2)[:len(self.test_df)])
        
        # Commented out to save time
#         self.valid_predictions.append(
#             self.model.predict_generator(
#                 DataGenerator(self.valid_df.index, None, self.batch_size, self.input_size, self.valid_images_dir), verbose=2)[:len(self.valid_df)])
        
#         print("validation loss: %.4f" %
#               weighted_log_loss_metric(self.valid_df.values, 
#                                    np.average(self.valid_predictions, axis=0, 
#                                               weights=[2**i for i in range(len(self.valid_predictions))])))
        
        # here you could also save the predictions with np.save()


class MyDeepModel:
    
    def __init__(self, engine, input_dims, batch_size=5, num_epochs=4, learning_rate=1e-3, 
                 decay_rate=1.0, decay_steps=1, weights="imagenet", verbose=1):
        
        self.engine = engine
        self.input_dims = input_dims
        self.batch_size = batch_size
        self.num_epochs = num_epochs
        self.learning_rate = learning_rate
        self.decay_rate = decay_rate
        self.decay_steps = decay_steps
        self.weights = weights
        self.verbose = verbose
        self._build()

    def _build(self):
        
        
        engine = self.engine(include_top=False, weights=self.weights, input_shape=self.input_dims,
                             backend = keras.backend, layers = keras.layers,
                             models = keras.models, utils = keras.utils)
        
        x = keras.layers.GlobalAveragePooling2D(name='avg_pool')(engine.output)
#         x = keras.layers.Dropout(0.2)(x)
#         x = keras.layers.Dense(keras.backend.int_shape(x)[1], activation="relu", name="dense_hidden_1")(x)
#         x = keras.layers.Dropout(0.1)(x)
        # One sigmoid unit per label ('any' plus the five subtypes), matching the 6-column targets
        out = keras.layers.Dense(6, activation="sigmoid", name='dense_output')(x)

        self.model = keras.models.Model(inputs=engine.input, outputs=out)

        self.model.compile(loss="binary_crossentropy", optimizer=keras.optimizers.Adam(), metrics=[weighted_loss])
    

    def fit_and_predict(self, train_df, valid_df, test_df):
        
        # callbacks
        pred_history = PredictionCheckpoint(test_df, valid_df, input_size=self.input_dims)
        #checkpointer = keras.callbacks.ModelCheckpoint(filepath='%s-{epoch:02d}.hdf5' % self.engine.__name__, verbose=1, save_weights_only=True, save_best_only=False)
        scheduler = keras.callbacks.LearningRateScheduler(lambda epoch: self.learning_rate * pow(self.decay_rate, floor(epoch / self.decay_steps)))
        
        self.model.fit_generator(
            DataGenerator(
                train_df.index, 
                train_df, 
                self.batch_size, 
                self.input_dims, 
                TRAIN_IMAGES_DIR
            ),
            epochs=self.num_epochs,
            verbose=self.verbose,
            use_multiprocessing=True,
            workers=4,
            callbacks=[pred_history, scheduler]
        )
        
        return pred_history
    
    def save(self, path):
        self.model.save_weights(path)
    
    def load(self, path):
        self.model.load_weights(path)
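The learning rate schedule passed to `LearningRateScheduler` is a plain step decay. As a sketch, the same formula can be evaluated outside Keras; the function name `step_decay` is hypothetical, and the default values are the ones used in the training cell of this notebook.

```python
from math import floor

def step_decay(epoch, learning_rate=5e-4, decay_rate=0.8, decay_steps=1):
    """Learning rate for a given epoch: lr * decay_rate ** floor(epoch / decay_steps)."""
    return learning_rate * pow(decay_rate, floor(epoch / decay_steps))

# Learning rate for the first five epochs (epochs are 0-indexed in Keras).
schedule = [step_decay(epoch) for epoch in range(5)]
for epoch, lr in enumerate(schedule):
    print(epoch, lr)
```

With `decay_steps=1` the rate shrinks by 20% every epoch; a larger `decay_steps` keeps it constant for that many epochs before each drop.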

# 7. Training and Prediction

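We use the train, validation and test sets.

The intended setup trains for 5 epochs with the Adam optimizer, a learning rate of 0.0005 and a decay rate of 0.8; the cell below sets `num_epochs=2` for a quicker run. The validation predictions are meant to be [exponentially weighted] averaged over all epochs (not in this commit, where the validation pass is commented out). `fit_and_predict` returns the `PredictionCheckpoint` callback, which holds the test predictions for every epoch (and the validation predictions once that pass is re-enabled).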

In [None]:
# train set (80%) and validation set (20%)
ss = ShuffleSplit(n_splits=10, test_size=0.2, random_state=42).split(df.index)

# let's go for the first fold only
train_idx, valid_idx = next(ss)

# obtain model
model = MyDeepModel(engine=InceptionV3, input_dims=(256, 256, 3), batch_size=32, learning_rate=5e-4,               
                    num_epochs=2, decay_rate=0.8, decay_steps=1, weights="imagenet", verbose=1)

# obtain test + validation predictions (history.test_predictions, history.valid_predictions)
history = model.fit_and_predict(df.iloc[train_idx], df.iloc[valid_idx], test_df)

# 8. Hyperparameter Tuning

# 9. Evaluation

# 10. Submission

In [None]:
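# Weighted average over epochs (needs more than 1 epoch). The weights [0, 1, 2, 4, 6] assume 5 epochs,
# so they are truncated to the number of epochs that were actually run.
epoch_weights = [0, 1, 2, 4, 6][:len(history.test_predictions)]
test_df.iloc[:, :] = np.average(history.test_predictions, axis=0, weights=epoch_weights)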

test_df = test_df.stack().reset_index()

test_df.insert(loc=0, column='ID', value=test_df['Image'].astype(str) + "_" + test_df['Diagnosis'])

test_df = test_df.drop(["Image", "Diagnosis"], axis=1)

test_df.to_csv('submission.csv', index=False)
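The epoch averaging used for the submission can be checked on made-up numbers. In this sketch, five fake "epochs" each predict a constant value for 2 images and 6 labels, which makes the effect of the weights easy to verify by hand; the arrays are illustrative and not real model output.

```python
import numpy as np

# Five fake epochs of predictions (2 images x 6 labels); epoch k predicts 0.1 * (k + 1).
epoch_preds = [np.full((2, 6), 0.1 * (k + 1)) for k in range(5)]

# Later epochs count more; the first epoch is ignored entirely.
weights = [0, 1, 2, 4, 6]
blended = np.average(epoch_preds, axis=0, weights=weights)

# The same result computed by hand: sum(w * p) / sum(w).
by_hand = sum(w * p for w, p in zip(weights, epoch_preds)) / sum(weights)
print(blended[0, 0], by_hand[0, 0])
```

`np.average` requires one weight per epoch, so the length of the weight list has to match the number of epochs that were trained.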