This competition offers an exciting and challenging task: multi-label classification on a dataset of well over half a million images. There are several very nice notebooks that train for only 2 or 3 epochs on all of the training data. In this notebook I want to see what happens when we use more epochs but fewer steps per epoch. By averaging the predictions made during the last few epochs we should be able to achieve a nice LB score. This should also give Kagglers who don't have adequate computing resources, and who depend on Kaggle Kernels, some alternative ways to experiment.

As the model I will use EfficientNet B2. It should be able to provide highly accurate predictions while still running within the kernel limits. With a maximum of 9 hours for a GPU kernel you have to make some trade-offs ;-)

I hope this kernel will be useful and maybe give you some new and alternative ideas to try out. If you like it, then please upvote it ;-)
Any feedback or remarks are appreciated.

Let's start by importing all the necessary modules.

Note: this kernel is now updated for the Stage 2 training and test data, although with fewer epochs because of the increase in train and test data.

In [None]:
!pip -q install mlflow

In [None]:
import mlflow.tensorflow
mlflow.tensorflow.autolog()

In [None]:
import numpy as np
import pandas as pd
import pydicom
import os
import collections
import sys
import glob
import random
import cv2
import tensorflow as tf
import multiprocessing

from math import ceil, floor
from copy import deepcopy
from tqdm import tqdm_notebook as tqdm
from imgaug import augmenters as iaa

import tensorflow.keras
import tensorflow.keras.backend as K
from tensorflow.keras.callbacks import Callback, ModelCheckpoint
from tensorflow.keras.layers import Dense, Flatten, Dropout
from tensorflow.keras.models import Model, load_model
from tensorflow.keras.utils import Sequence
from tensorflow.keras.losses import binary_crossentropy
from tensorflow.keras.optimizers import Adam

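def calculating_class_weights(y_true):
    # Per label column, compute 'balanced' weights for the classes 0 and 1
    from sklearn.utils.class_weight import compute_class_weight
    number_dim = np.shape(y_true)[1]
    weights = np.empty([number_dim, 2])
    for i in range(number_dim):
        # Keyword arguments are required by recent scikit-learn versions
        weights[i] = compute_class_weight(class_weight = 'balanced', classes = np.array([0., 1.]), y = y_true[:, i])
    return weights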
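
To make the helper above a bit more concrete: as far as I know, the 'balanced' mode weights each class by n_samples / (n_classes * count). Below is a minimal NumPy-only sketch of that formula for a single binary label column. The function name `balanced_binary_weights` and the toy labels are made up for illustration, and the sketch assumes both classes occur at least once.

```python
import numpy as np

def balanced_binary_weights(y):
    """Hypothetical helper: 'balanced' weights for classes 0 and 1 of one label column."""
    y = np.asarray(y).astype(int)
    counts = np.bincount(y, minlength=2)   # [n_negatives, n_positives]
    return len(y) / (2.0 * counts)         # n_samples / (n_classes * count)

toy_labels = [0, 0, 0, 1]                  # made-up data: 1 positive in 4 samples
toy_weights = balanced_binary_weights(toy_labels)
print(toy_weights)
```

The rare positive class ends up with the larger weight, which is the whole point of using these weights in the loss later on.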

Install and import the efficientnet and iterative-stratification packages from the internet. The iterative-stratification package provides a very nice implementation of multi-label stratification. I've used it in a few competitions now with good results. There are of course other packages that provide implementations of it.

In [None]:
# Install Modules from internet
!pip install efficientnet
!pip install iterative-stratification

In [None]:
# Import Custom Modules
import efficientnet.tfkeras as efn 
from iterstrat.ml_stratifiers import MultilabelStratifiedShuffleSplit

Next we set the random seed, some constants and the folders that will be used later on. I've specified a rather small test size because I want to maximize the time available for training and minimize the time spent on validation. I'm not using methods like early stopping: when the kernel time limit is approaching, we could still improve the LB score if we were allowed to continue.

In [None]:
# Seed
SEED = 12345
np.random.seed(SEED)
tf.random.set_seed(SEED)  # TF 2.x replacement for tf.set_random_seed

# Constants
TEST_SIZE = 0.1
HEIGHT = 256
WIDTH = 256
CHANNELS = 3
TRAIN_BATCH_SIZE = 32
VALID_BATCH_SIZE = 64
SHAPE = (HEIGHT, WIDTH, CHANNELS)

# Folders
DATA_DIR = '/kaggle/input/rsna-intracranial-hemorrhage-detection/rsna-intracranial-hemorrhage-detection/'
TEST_IMAGES_DIR = DATA_DIR + 'stage_2_test/'
TRAIN_IMAGES_DIR = DATA_DIR + 'stage_2_train/'

Next comes the code for the DICOM windowing and image reading. After seeing the effect of different versions of windowing as presented in this very nice [kernel](https://www.kaggle.com/akensert/inceptionv3-prev-resnet50-keras-baseline-model) I decided to update my kernel with it as well. Let's see what the effect will be.

In [None]:
def correct_dcm(dcm):
    x = dcm.pixel_array + 1000
    px_mode = 4096
    x[x>=px_mode] = x[x>=px_mode] - px_mode
    dcm.PixelData = x.tobytes()
    dcm.RescaleIntercept = -1000

def window_image(dcm, window_center, window_width):    
    if (dcm.BitsStored == 12) and (dcm.PixelRepresentation == 0) and (int(dcm.RescaleIntercept) > -100):
        correct_dcm(dcm)
    img = dcm.pixel_array * dcm.RescaleSlope + dcm.RescaleIntercept
    
    # Resize (cv2.resize expects the target size as (width, height))
    img = cv2.resize(img, (SHAPE[1], SHAPE[0]), interpolation = cv2.INTER_LINEAR)
   
    img_min = window_center - window_width // 2
    img_max = window_center + window_width // 2
    img = np.clip(img, img_min, img_max)
    return img

def bsb_window(dcm):
    brain_img = window_image(dcm, 40, 80)
    subdural_img = window_image(dcm, 80, 200)
    soft_img = window_image(dcm, 40, 380)
    
    brain_img = (brain_img - 0) / 80
    subdural_img = (subdural_img - (-20)) / 200
    soft_img = (soft_img - (-150)) / 380
    bsb_img = np.array([brain_img, subdural_img, soft_img]).transpose(1,2,0)
    return bsb_img

def _read(path, SHAPE):
    dcm = pydicom.dcmread(path)
    try:
        img = bsb_window(dcm)
    except Exception:
        # Fall back to an all-zero image when the DICOM cannot be windowed
        img = np.zeros(SHAPE)
    return img
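
To illustrate what the windowing does per channel: the HU values are clipped to [center - width/2, center + width/2] and then shifted and scaled so the window maps to [0, 1]. The sketch below reproduces that on a handful of made-up HU values with NumPy only. The helper name `window_and_scale` is hypothetical and merges the clip and the normalisation into one step.

```python
import numpy as np

def window_and_scale(hu, center, width):
    """Hypothetical helper: clip HU values to a window and rescale them to [0, 1]."""
    low = center - width // 2
    high = center + width // 2
    return (np.clip(hu, low, high) - low) / (high - low)

toy_hu = np.array([-1000.0, 0.0, 40.0, 80.0, 500.0])   # made-up HU values
brain = window_and_scale(toy_hu, 40, 80)               # brain window: 0 .. 80
subdural = window_and_scale(toy_hu, 80, 200)           # subdural window: -20 .. 180
print(brain)
print(subdural)
```

Everything below the window collapses to 0 and everything above it to 1, so each of the three channels highlights a different tissue range.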

I'll specify some light image augmentation: some horizontal and vertical flipping and some cropping. I haven't tried out more augmentation yet, but will do so in future versions of the kernel. This cell also contains the data generators for the train and test data.

In [None]:
# Image Augmentation
sometimes = lambda aug: iaa.Sometimes(0.25, aug)
augmentation = iaa.Sequential([ iaa.Fliplr(0.25),
                                iaa.Flipud(0.10),
                                sometimes(iaa.Crop(px=(0, 25), keep_size = True, sample_independently = False))   
                            ], random_order = True)       
        
# Generators
class TrainDataGenerator(tensorflow.keras.utils.Sequence):
    def __init__(self, dataset, labels, batch_size = 16, img_size = SHAPE, img_dir = TRAIN_IMAGES_DIR, augment = False, *args, **kwargs):
        self.dataset = dataset
        self.ids = dataset.index
        self.labels = labels
        self.batch_size = batch_size
        self.img_size = img_size
        self.img_dir = img_dir
        self.augment = augment
        self.on_epoch_end()

    def __len__(self):
        return int(ceil(len(self.ids) / self.batch_size))

    def __getitem__(self, index):
        indices = self.indices[index*self.batch_size:(index+1)*self.batch_size]
        X, Y = self.__data_generation(indices)
        return X, Y

    def augmentor(self, image):
        augment_img = augmentation        
        image_aug = augment_img.augment_image(image)
        return image_aug

    def on_epoch_end(self):
        self.indices = np.arange(len(self.ids))
        np.random.shuffle(self.indices)

    def __data_generation(self, indices):
        # Size the arrays by the number of indices so a short last batch has no uninitialised rows
        X = np.empty((len(indices), *self.img_size))
        Y = np.empty((len(indices), 1), dtype=np.float32)
        
        for i, index in enumerate(indices):
            ID = self.ids[index]
            image = _read(self.img_dir+ID+".dcm", self.img_size)
            if self.augment:
                X[i,] = self.augmentor(image)
            else:
                X[i,] = image
            Y[i,] = self.labels.iloc[index].values        
        return X, Y
    
class TestDataGenerator(tensorflow.keras.utils.Sequence):
    def __init__(self, dataset, labels, batch_size = 16, img_size = SHAPE, img_dir = TEST_IMAGES_DIR, *args, **kwargs):
        self.dataset = dataset
        self.ids = dataset.index
        self.labels = labels
        self.batch_size = batch_size
        self.img_size = img_size
        self.img_dir = img_dir
        self.on_epoch_end()

    def __len__(self):
        return int(ceil(len(self.ids) / self.batch_size))

    def __getitem__(self, index):
        indices = self.indices[index*self.batch_size:(index+1)*self.batch_size]
        X = self.__data_generation(indices)
        return X

    def on_epoch_end(self):
        self.indices = np.arange(len(self.ids))
    
    def __data_generation(self, indices):
        # Size the array by the number of indices so a short last batch has no uninitialised rows
        X = np.empty((len(indices), *self.img_size))
        
        for i, index in enumerate(indices):
            ID = self.ids[index]
            image = _read(self.img_dir+ID+".dcm", self.img_size)
            X[i,] = image              
        return X
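
A small aside on the batching logic of the generators: because the number of batches is computed with ceil, the last batch can contain fewer samples than the batch size. The following plain-Python sketch shows the index ranges involved. The helper `batch_slices` and the sizes are made up for illustration.

```python
from math import ceil

def batch_slices(n_samples, batch_size):
    """Hypothetical helper: (start, stop) index ranges served per batch."""
    n_batches = int(ceil(n_samples / batch_size))
    return [(i * batch_size, min((i + 1) * batch_size, n_samples))
            for i in range(n_batches)]

toy_slices = batch_slices(10, 4)                          # made-up sizes
toy_sizes = [stop - start for start, stop in toy_slices]
print(toy_slices, toy_sizes)
```

Since the last batch can be short, the batch arrays should either be allocated from the number of indices in the batch, or the surplus rows have to be trimmed from the predictions afterwards.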

Import the training and test datasets.

In [None]:
def read_testset(filename = DATA_DIR + "stage_2_sample_submission.csv"):
    df = pd.read_csv(filename)
    df["Image"] = df["ID"].str.slice(stop=12)
    df["Diagnosis"] = df["ID"].str.slice(start=13)
    df = df.loc[:, ["Label", "Diagnosis", "Image"]]
    df = df.set_index(['Image', 'Diagnosis']).unstack(level=-1)
    return df

def read_trainset(filename = DATA_DIR + "stage_2_train.csv"):
    df = pd.read_csv(filename)
    df["Image"] = df["ID"].str.slice(stop=12)
    df["Diagnosis"] = df["ID"].str.slice(start=13)
    duplicates_to_remove = [56346, 56347, 56348, 56349,
                            56350, 56351, 1171830, 1171831,
                            1171832, 1171833, 1171834, 1171835,
                            3705312, 3705313, 3705314, 3705315,
                            3705316, 3705317, 3842478, 3842479,
                            3842480, 3842481, 3842482, 3842483 ]
    df = df.drop(index = duplicates_to_remove)
    df = df.reset_index(drop = True)    
    df = df.loc[:, ["Label", "Diagnosis", "Image"]]
    df = df.set_index(['Image', 'Diagnosis']).unstack(level=-1)
    return df

# Read Train and Test Datasets
test_df = read_testset()
train_df = read_trainset()
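
To show what the two read functions do with the ID column: every ID is split into a 12-character image part and a diagnosis part, and the long table is then pivoted to one row per image. Here is a plain-Python sketch of the same idea. The IDs and labels are made up, and a dict stands in for the unstacked DataFrame.

```python
def split_id(row_id):
    """Split an ID string the same way as str.slice(stop=12) and str.slice(start=13)."""
    return row_id[:12], row_id[13:]

# Made-up rows in the long (ID, Label) format of the CSV files
toy_rows = [
    ("ID_000000001_any", 1),
    ("ID_000000001_epidural", 0),
    ("ID_000000002_any", 0),
    ("ID_000000002_epidural", 0),
]

# One entry per image, one key per diagnosis (the "wide" format)
wide = {}
for row_id, label in toy_rows:
    image, diagnosis = split_id(row_id)
    wide.setdefault(image, {})[diagnosis] = label
print(wide)
```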

In [None]:
train_df = train_df.iloc[:]
train_df

The training data contains some class imbalance. Multiple kernels explored the use of undersampling, so let's try the opposite and oversample the minority class 'epidural' one additional time.

In [None]:
# Oversampling
epidural_df = train_df[train_df.Label['epidural'] == 1]
train_oversample_df = pd.concat([train_df, epidural_df])
train_df = train_oversample_df

# Summary
print('Train Shape: {}'.format(train_df.shape))
print('Test Shape: {}'.format(test_df.shape))
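
For intuition on how much one extra copy changes the class balance, here is a tiny NumPy sketch of the same oversampling step. The label matrix and the column choice are made up.

```python
import numpy as np

# Made-up label matrix: rows are images, columns are two hypothetical labels
toy_labels = np.array([[1, 0],
                       [0, 0],
                       [1, 1],
                       [0, 0]])
minority_col = 1                                    # column to oversample

# Append every row that is positive for the minority label one more time
minority_rows = toy_labels[toy_labels[:, minority_col] == 1]
oversampled = np.concatenate([toy_labels, minority_rows])

rate_before = toy_labels[:, minority_col].mean()    # 1 of 4 rows
rate_after = oversampled[:, minority_col].mean()    # 2 of 5 rows
print(rate_before, rate_after)
```

One thing to keep in mind: because the oversampling happens before the train/validation split, copies of the same image can end up on both sides of the split, which may make the validation score look a bit better than it really is.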

In [None]:
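# The model has a single output trained on label column 0, so only that column's weights are needed
weights = calculating_class_weights(train_df.iloc[:, [0]].values)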
weights

Next come a method for predictions on the test data, a checkpoint callback and a method to create the EfficientNet B2 model. For EfficientNet we use the pretrained ImageNet weights. A Dropout layer with a small rate is also added to reduce overfitting.

In [None]:
def predictions(test_df, model):
    # Keras models accept a Sequence directly in predict (predict_generator is deprecated)
    test_preds = model.predict(TestDataGenerator(test_df, None, 8, SHAPE, TEST_IMAGES_DIR), verbose = 1)
    # Trim any surplus rows produced by a short last batch
    return test_preds[:test_df.shape[0]]

def ModelCheckpointFull(model_name):
    return ModelCheckpoint(model_name, 
                            monitor = 'val_AUC_full', 
                            verbose = 1, 
                            save_best_only = True, 
                            save_weights_only = True, 
                            mode = 'max')  # saving after every epoch is the default

# Create Model
def create_model():
    K.clear_session()
    
    base_model =  efn.EfficientNetB2(weights = 'imagenet', include_top = False, pooling = 'avg', input_shape = SHAPE)
    x = base_model.output
    x = Dropout(0.15)(x)
    y_pred = Dense(1, activation = 'sigmoid')(x)

    return Model(inputs = base_model.input, outputs = y_pred)

Next we set up the multi-label stratification. I've specified multiple splits but only use the first one for the train and validation data. Optionally you can also loop through the different splits and use a different train and validation set for each epoch.

In [None]:
# Submission Placeholder
submission_predictions = []

# Multi Label Stratified Split stuff...
msss = MultilabelStratifiedShuffleSplit(n_splits = 10, test_size = TEST_SIZE, random_state = SEED)
X = train_df.index
Y = train_df.Label.values

In [None]:
# Get train and test index
msss_splits = next(msss.split(X, Y))
train_idx = msss_splits[0]
valid_idx = msss_splits[1]

Now we can train the model for a number of epochs. In every epoch we train the full model, but each time on only 1/10 of the train data (see TRAIN_STEPS below). Using only a subset of the train data per epoch lets us run more epochs and allows averaging over more than just 1 or 2 epochs (compared to using all data every epoch).

Note that I recreate the data generators and model on each epoch. This is only necessary when using the different Multi-label stratified splits since the data generators will get a totally different set of data on each epoch then. I left it in so that you can try it out.

Starting with the 6th epoch a prediction for the test set is made on each epoch. In total predictions from the last 6 epochs will be averaged this way for the final submission.
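
The averaging step described above could look roughly like the sketch below. It uses the same 2**i weighting as the commented-out submission cell at the end of this notebook, so later epochs count more. The prediction values are made up.

```python
import numpy as np

# Made-up predictions from three consecutive epochs (2 images x 1 label)
epoch_preds = [np.array([[0.2], [0.8]]),
               np.array([[0.4], [0.6]]),
               np.array([[0.6], [0.9]])]

# Later epochs get exponentially more weight: 1, 2, 4, ...
epoch_weights = [2 ** i for i in range(len(epoch_preds))]
averaged = np.average(epoch_preds, axis=0, weights=epoch_weights)
print(averaged)
```

With equal weights this reduces to a plain mean. Whether the exponential weighting actually helps is something to check against the validation score.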

In [None]:
np.random.shuffle(train_idx)
print(train_idx[:5])    
print(valid_idx[:5])

data_generator_train = TrainDataGenerator(train_df.iloc[train_idx,[0]], 
                                            train_df.iloc[train_idx,[0]], 
                                            TRAIN_BATCH_SIZE, 
                                            SHAPE,
                                            augment = True)
data_generator_val = TrainDataGenerator(train_df.iloc[valid_idx,[0]], 
                                        train_df.iloc[valid_idx,[0]], 
                                        VALID_BATCH_SIZE, 
                                        SHAPE,
                                        augment = False)

TRAIN_STEPS = int(len(data_generator_train) / 10)
LR = 0.000125
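
A quick back-of-the-envelope check of what TRAIN_STEPS means in practice. The dataset size below is made up (the real number depends on the split), and the helper name `steps_per_epoch` is hypothetical.

```python
from math import ceil

def steps_per_epoch(n_train, batch_size, divisor):
    """Hypothetical helper mirroring TRAIN_STEPS = int(len(generator) / 10)."""
    batches_per_full_pass = int(ceil(n_train / batch_size))
    return int(batches_per_full_pass / divisor)

# Made-up dataset size, batch size as in TRAIN_BATCH_SIZE
toy_steps = steps_per_epoch(n_train=100000, batch_size=32, divisor=10)
toy_total_batches = toy_steps * 10   # batches processed over 10 epochs
print(toy_steps, toy_total_batches)
```

With these made-up numbers, 10 epochs of 312 steps cover 3120 batches, just under the 3125 batches of one full pass. So the total training work is about one pass over the data, only split into more checkpoints.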

In [None]:
(train_df.iloc[valid_idx].values[:,:]==1).sum(axis=0)/10

In [None]:
AUC = tf.keras.metrics.AUC
RECALL = tf.keras.metrics.Recall
PRECISION = tf.keras.metrics.Precision

In [None]:
# Create Model
Metrics = [AUC(name = 'AUC_full'),
           
           RECALL(thresholds=0.7,name='REC_full'),
          
           PRECISION(thresholds=0.7, name='PRE_full')]

def get_weighted_loss(weights):
    def weighted_loss(y_true, y_pred):
        return K.mean((weights[:,0]**(1-y_true))*(weights[:,1]**(y_true))*K.binary_crossentropy(y_true, y_pred), axis=-1)
    return weighted_loss

model = create_model()   
model.compile(optimizer = Adam(learning_rate = LR), 
                  loss = get_weighted_loss(weights),
                  metrics = Metrics)
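
The exponent trick in the weighted loss can be a bit cryptic: w0**(1 - y) * w1**y simply equals w0 when y is 0 and w1 when y is 1. Below is a NumPy sketch of the same formula so it can be checked by hand. The weights and predictions are made up, and the clipping constant `eps` is my own assumption for numerical safety.

```python
import numpy as np

def weighted_bce(y_true, y_pred, weights, eps=1e-7):
    """NumPy sketch of the weighted loss: weights[:, 0] applies to y=0, weights[:, 1] to y=1."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    bce = -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
    w = weights[:, 0] ** (1 - y_true) * weights[:, 1] ** y_true
    return np.mean(w * bce, axis=-1)

toy_weights = np.array([[0.5, 4.0]])    # made-up weights for a single output
y_true = np.array([[0.0], [1.0]])
y_pred = np.array([[0.5], [0.5]])       # equally unsure about both samples
toy_loss = weighted_bce(y_true, y_pred, toy_weights)
print(toy_loss)
```

Both samples have the same unweighted cross-entropy of log(2), but the positive sample contributes 8 times more to the loss because of its larger weight.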

In [None]:
model.load_weights('../input/usefhemorrhageeffnetb2abnorv0/model.h5')

In [None]:
# def main():
with mlflow.start_run():
    # model.fit accepts Sequence generators directly (fit_generator is deprecated in TF 2.x)
    model.fit(data_generator_train,
              validation_data = data_generator_val,
              steps_per_epoch = TRAIN_STEPS,
              epochs = 10,
              callbacks = [ModelCheckpointFull('model.h5')],
              verbose = 1, workers = 4)

In [None]:
# if __name__ == "__main__":
#     main()

In [None]:
# test_df0 = test_df.iloc[:]

# val_df0 = train_df.iloc[valid_idx]
# val_df0 = val_df0.iloc[:]

# train_df0 = train_df.iloc[train_idx]
# train_df0 = train_df0.iloc[:]

In [None]:
# i,j = next(iter(data_generator_train))
# i.shape,j.shape

In [None]:
# arr = [[1.0, 2.0], [3.0, 4.0]]
# def numpy_to_tensor(arr):
#     arg = tf.constant(arr)
#     arg = tf.convert_to_tensor(arg, dtype=tf.float32)
#     return arg
# numpy_to_tensor(arr)

In [None]:
# sample_p =np.array([[0.5,0.5,0.5],
#                     [0.5,0.5,0.5],
#                     [0.5,0.5,0.5],
#                     [0.5,0.5,0.5],
#                     [0.5,0.5,0.5],
#                     [0.5,0.5,0.5]],'float32')
# y_true =  np.array([[0,0,0],
#                     [1,0,0],
#                     [1,0,0],
#                     [1,0,0],
#                     [1,1,0],
#                     [0,1,1]],'float32')
# weights = calculating_class_weights(y_true)
# weights

In [None]:
# loss = get_weighted_loss(weights)
# # loss = weighted_loss(numpy_to_tensor(y_true), numpy_to_tensor(sample_p))
# loss

In [None]:
# w_weight = 'model'

# model.load_weights(f'../input/keras-efficientnet-b2-from-start-to-output/{w_weight}.h5')
# test_preds = model.predict_generator(TestDataGenerator(test_df0, None, 8, SHAPE, TEST_IMAGES_DIR), verbose = 1)
# preds = test_preds[:test_df0.iloc[range(test_df0.shape[0])].shape[0]]
# np.savez_compressed(f'pred_test_{w_weight}.npz',data = preds)
# print(preds.shape)

# test_preds = model.predict_generator(TestDataGenerator(train_df0, None, 8, SHAPE, TRAIN_IMAGES_DIR), verbose = 1)
# preds = test_preds[:train_df0.iloc[range(train_df0.shape[0])].shape[0]]
# np.savez_compressed(f'pred_train_{w_weight}.npz',data = preds)
# print(preds.shape)

# test_preds = model.predict_generator(TestDataGenerator(val_df0, None, 8, SHAPE, TRAIN_IMAGES_DIR), verbose = 1)
# preds = test_preds[:val_df0.iloc[range(val_df0.shape[0])].shape[0]]
# np.savez_compressed(f'pred_val_{w_weight}.npz',data = preds)
# print(preds.shape)
##################################################################################################################
# w_weight = 'Final'

# model.load_weights(f'../input/keras-efficientnet-b2-from-start-to-output/{w_weight}.h5')
# test_preds = model.predict_generator(TestDataGenerator(test_df0, None, 8, SHAPE, TEST_IMAGES_DIR), verbose = 1)
# preds = test_preds[:test_df0.iloc[range(test_df0.shape[0])].shape[0]]
# np.savez_compressed(f'pred_test_{w_weight}.npz',data = preds)
# print(preds.shape)

# test_preds = model.predict_generator(TestDataGenerator(train_df0, None, 8, SHAPE, TRAIN_IMAGES_DIR), verbose = 1)
# preds = test_preds[:train_df0.iloc[range(train_df0.shape[0])].shape[0]]
# np.savez_compressed(f'pred_train_{w_weight}.npz',data = preds)
# print(preds.shape)

# test_preds = model.predict_generator(TestDataGenerator(val_df0, None, 8, SHAPE, TRAIN_IMAGES_DIR), verbose = 1)
# preds = test_preds[:val_df0.iloc[range(val_df0.shape[0])].shape[0]]
# np.savez_compressed(f'pred_val_{w_weight}.npz',data = preds)
# print(preds.shape)

In [None]:
# sub 1
# submission_predictions = [np.load('./pred_test_model.npz')['data']]
# test_df0.iloc[:, :] = np.average(submission_predictions, axis = 0, weights = [1])
# test_df0 = test_df0.stack().reset_index()
# test_df0.insert(loc = 0, column = 'ID', value = test_df0['Image'].astype(str) + "_" + test_df0['Diagnosis'])
# test_df0 = test_df0.drop(["Image", "Diagnosis"], axis=1)
# test_df0.to_csv('submission.csv', index = False)
# print(test_df0.head(12))

# # sub 2
# test_df0 = test_df.iloc[:1600]
# submission_predictions = [np.load('./pred_test_Final.npz')['data']]
# test_df0.iloc[:, :] = np.average(submission_predictions, axis = 0, weights = [1])
# test_df0 = test_df0.stack().reset_index()
# test_df0.insert(loc = 0, column = 'ID', value = test_df0['Image'].astype(str) + "_" + test_df0['Diagnosis'])
# test_df0 = test_df0.drop(["Image", "Diagnosis"], axis=1)
# test_df0.to_csv('submission_2.csv', index = False)
# print(test_df0.head(12))

# # sub 3
# test_df0 = test_df.iloc[:1600]
# submission_predictions = [np.load('./pred_test_model.npz')['data'],np.load('./pred_test_Final.npz')['data']]
# test_df0.iloc[:, :] = np.average(submission_predictions, axis = 0, weights = [1,1])
# test_df0 = test_df0.stack().reset_index()
# test_df0.insert(loc = 0, column = 'ID', value = test_df0['Image'].astype(str) + "_" + test_df0['Diagnosis'])
# test_df0 = test_df0.drop(["Image", "Diagnosis"], axis=1)
# test_df0.to_csv('submission_3.csv', index = False)
# print(test_df0.head(12))

In [None]:
# test_df.iloc[:, :] = np.average(submission_predictions, axis = 0, weights = [2**i for i in range(len(submission_predictions))])
# test_df = test_df.stack().reset_index()
# test_df.insert(loc = 0, column = 'ID', value = test_df['Image'].astype(str) + "_" + test_df['Diagnosis'])
# test_df = test_df.drop(["Image", "Diagnosis"], axis=1)
# test_df.to_csv('submission.csv', index = False)
# print(test_df.head(12))