## 2D EfficientNet + InceptionResNet (soon) Benchmark 

#### In this notebook, we're going to go through:
- A brief exploration of the data,
- What MRI scans are, what [the different scan modes (FLAIR, T1w, T1wCE, T2w) mean](https://case.edu/med/neurology/NR/MRI%20Basics.htm), and how they provide information,
- A quick 2-hour benchmark model.

##### I hope this exploration helps whoever's reading this - I'm still a beginner myself, so any comments/feedback would be appreciated.

#### If you found this notebook helpful, please don't forget to upvote! 😊

#### Credits to Johnathan Basomi for the Brain Tumor dataset in [PNG form.](https://www.kaggle.com/c/rsna-miccai-brain-tumor-radiogenomic-classification/discussion/253000)

### Load Dataset

In [None]:
# Import libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

# Define data paths
TRAIN_DATA_PATH = '../input/rsna-miccai-png/train/'
TEST_DATA_PATH = '../input/rsna-miccai-png/test/'

train_labels = pd.read_csv('../input/rsna-miccai-brain-tumor-radiogenomic-classification/train_labels.csv')

# Mark on the dataframe whether there are missing values for one or more MRI modes
import os

img_modes = set(['T1w','T1wCE','T2w','FLAIR'])

mode_valid = {'T1w':[],
             'T1wCE':[],
             'T2w':[],
             'FLAIR':[]}

for patient_id in train_labels.iloc:
    subdirs = set(os.listdir(TRAIN_DATA_PATH + str(patient_id['BraTS21ID']).zfill(5)))
    for img_mode in img_modes: mode_valid[img_mode].append(int(img_mode in subdirs))
train_labels = train_labels.merge(pd.DataFrame(mode_valid),left_index=True,right_index=True)

train_labels
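The 0/1 columns added above make it easy to count how many patients lack each scan mode. A minimal sketch on a made-up three-patient frame (`toy_labels` and its values are illustrative, not taken from the competition data):

```python
import pandas as pd

# Toy stand-in for train_labels after the merge above; all values are made up.
toy_labels = pd.DataFrame({
    'BraTS21ID': [0, 2, 3],
    'MGMT_value': [1, 1, 0],
    'T1w': [1, 1, 0],
    'T1wCE': [1, 1, 1],
    'T2w': [1, 0, 1],
    'FLAIR': [1, 1, 1],
})

mode_cols = ['T1w', 'T1wCE', 'T2w', 'FLAIR']

# Number of patients missing each mode (a 0 marks a missing folder)
missing_per_mode = (toy_labels[mode_cols] == 0).sum()
print(missing_per_mode)

# Patients for which all four modes are available
complete = toy_labels[toy_labels[mode_cols].all(axis=1)]
print(complete['BraTS21ID'].tolist())
```

On the real dataframe the same two expressions should work with `train_labels` in place of `toy_labels`.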

#### Visualize Class Label Distribution:

In [None]:
sns.histplot(train_labels['MGMT_value'], bins=2, shrink=.8)
plt.title("Class Distribution: Positive vs Negative")
print("# of total data points: ",len(train_labels))
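Alongside the histogram, it can help to print the class balance as proportions. A small sketch, assuming labels shaped like `MGMT_value` (the five labels below are invented for illustration):

```python
import pandas as pd

# Invented labels standing in for train_labels['MGMT_value']
toy_mgmt = pd.Series([1, 1, 0, 1, 0], name='MGMT_value')

# Fraction of samples in each class
balance = toy_mgmt.value_counts(normalize=True)
print(balance)
```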

### Our dataset contains four modes of MRI scans, each stored as a 3D volume (a stack of 2D slice images).
#### According to https://case.edu/med/neurology/NR/MRI%20Basics.htm, the scans mean:

``` 
T1w: T1-weighted MRI scan. The cerebrospinal fluid (the fluid surrounding the brain) will be dark, and brain tissue will be white. Tumors will typically show up as dark patches.

T1wCE: Contrast-enhanced T1-weighted scan, also known as an MRI scan with [Gadolinium infusion](https://www.insideradiology.com.au/gadolinium-contrast-medium/). Blood vessels will show up as white lines.

T2w: T2-weighted MRI scan. Cerebrospinal fluid will appear white, and the brain tissue will appear dark.

FLAIR: Both brain matter and cerebrospinal fluid will appear dark, but abnormalities tend to appear light.
```
-----------
#### Visualizing Depth Distribution:

In [None]:
import os
from tqdm import tqdm, trange

# Explore data structure

img_modes = ['T1w','T1wCE','T2w','FLAIR']
scans_per_mode = {'T1w':[],
                     'T1wCE':[],
                     'T2w':[],
                     'FLAIR':[]}

for PATIENT in tqdm(os.listdir(TRAIN_DATA_PATH)):
    for img_mode in img_modes:
        try: scans_per_mode[img_mode].append(len(os.listdir(TRAIN_DATA_PATH + PATIENT + '/' + img_mode)))
        except OSError: scans_per_mode[img_mode].append(0)  # no readable folder for this mode -> count as 0 slices
            
fig, axes = plt.subplots(4, figsize=(10,10), sharex=True)
fig.suptitle("# of vertical slices per MRI category")
for index, img_mode in enumerate(img_modes):
    # histplot with kde=True replaces the deprecated sns.distplot
    sns.histplot(scans_per_mode[img_mode], kde=True, stat='density', ax=axes[index], color=['cyan','blue','green','violet'][index])
    axes[index].set_title(img_mode)
plt.show()

#### Load image slices in np array format:
Image augmentation from [🧠 MRI data augmentation](https://www.kaggle.com/furcifer/mri-data-augmentation-pipeline)

In [None]:
import imgaug as ia
import imgaug.augmenters as iaa

def image_augment(data):
    sometimes = lambda aug: iaa.Sometimes(0.2, aug)

    seq = iaa.Sequential(
        [
            # apply the following augmenters to most images
            iaa.Fliplr(0.5), # horizontally flip 50% of all images
            iaa.Flipud(0.2), # vertically flip 20% of all images
            # crop or pad images by up to 5% of their height/width
            sometimes(iaa.CropAndPad(
                percent=(-0.05, 0.05),
                pad_mode=ia.ALL,
                pad_cval=(0, 255)
            )),
            sometimes(iaa.Affine(
                scale={"x": (0.7, 1.3), "y": (0.7, 1.3)}, # scale images to 70-130% of their size, individually per axis
                translate_percent={"x": (-0.2, 0.2), "y": (-0.2, 0.2)}, # translate by -20 to +20 percent (per axis)
                rotate=(-45, 45), # rotate by -45 to +45 degrees
                shear=(-16, 16), # shear by -16 to +16 degrees
                order=[0, 1], # use nearest neighbour or bilinear interpolation (fast)
                cval=(0, 255), # if mode is constant, use a cval between 0 and 255
                mode=ia.ALL # use any of scikit-image's warping modes (see 2nd image from the top for examples)
            )),
            # execute 0 to 5 of the following (less important) augmenters per image
            # don't execute all of them, as that would often be way too strong
            iaa.SomeOf((0, 5),
                [
                    sometimes(iaa.Superpixels(p_replace=(0, 1.0), n_segments=(20, 200))), # convert images into their superpixel representation
                    iaa.OneOf([
                        iaa.GaussianBlur((0, 2.0)), # blur images with a sigma between 0 and 2.0
                        iaa.AverageBlur(k=(2, 5)), # blur image using local means with kernel sizes between 2 and 5
                        iaa.MedianBlur(k=(3, 7)), # blur image using local medians with kernel sizes between 3 and 7
                    ]),
                    iaa.Sharpen(alpha=(0, 1.0), lightness=(0.75, 1.5)), # sharpen images
                    iaa.Emboss(alpha=(0, 1.0), strength=(0, 2.0)), # emboss images
                    # search either for all edges or for directed edges,
                    # blend the result with the original image using a blobby mask
                    iaa.SimplexNoiseAlpha(iaa.OneOf([
                        iaa.EdgeDetect(alpha=(0.5, 1.0)),
                        iaa.DirectedEdgeDetect(alpha=(0.5, 1.0), direction=(0.0, 1.0)),
                    ])),
                    iaa.AdditiveGaussianNoise(loc=0, scale=(0.0, 0.05*255), per_channel=0.5), # add gaussian noise to images
                    iaa.OneOf([
                        iaa.Dropout((0.01, 0.1), per_channel=0.5), # randomly remove up to 10% of the pixels
                        iaa.CoarseDropout((0.03, 0.15), size_percent=(0.02, 0.05), per_channel=0.2),
                    ]),
                    iaa.Invert(0.05, per_channel=True), # invert color channels
                    iaa.Add((-10, 10), per_channel=0.5), # change brightness of images (by -10 to 10 of original value)

                    # either change the brightness of the whole image (sometimes
                    # per channel) or change the brightness of subareas
                    iaa.OneOf([
                        iaa.Multiply((0.5, 1.5), per_channel=0.5),
                        iaa.FrequencyNoiseAlpha(
                            exponent=(-4, 0),
                            first=iaa.Multiply((0.5, 1.5), per_channel=True),
                            second=iaa.LinearContrast((0.5, 2.0))
                        )
                    ]),
                    iaa.LinearContrast((0.5, 2.0), per_channel=0.5), # improve or worsen the contrast
                    sometimes(iaa.ElasticTransformation(alpha=(0.5, 3.5), sigma=0.25)), # move pixels locally around (with random strengths)
                    sometimes(iaa.PiecewiseAffine(scale=(0.01, 0.05))), # sometimes move parts of the image around
                    sometimes(iaa.PerspectiveTransform(scale=(0.01, 0.1)))
                ],
                random_order=True
            )
        ],
        random_order=True
    )
    return seq(images=data)

In [None]:
IMG_DIMS = (456,456)
IMG_CHANNELS = 3

In [None]:
import os, PIL
from PIL import Image
from scipy.ndimage import zoom
    
def load_array(path: str, slice_name:str):
    return np.array(Image.open(path+'/'+slice_name).resize(IMG_DIMS, resample=PIL.Image.BILINEAR))

def load_array_3d(filename: str, img_mode: str, augment=False):
    """
    Inputs
    ----------
    filename: ID of the scan being loaded (without any further path components)
        e.g. "00000" -> patient with ID 00000
    img_mode: MRI scan mode
        one of ['T1w','T1wCE','T2w','FLAIR']
    augment: if True, pass the slices through image_augment before stacking

    Returns
    -----------
    3D array of shape (x, y, num_slices)
    """
    path = TRAIN_DATA_PATH+filename+'/'+img_mode

    data = []
    img_paths_in_order = sorted(os.listdir(path), key=lambda x: int(x.split('-')[1].split('.png')[0]))
    
    for slice_name in img_paths_in_order:
        data.append(load_array(path, slice_name))
            
    if augment: data = image_augment(data)
    return np.swapaxes(np.array(data), 0, 2)

import time
starttime = time.time()
sample = load_array_3d('00000','FLAIR')
print("Time to load array of %i images:"%(len(os.listdir('../input/rsna-miccai-png/train/00000/FLAIR'))), time.time()-starttime)
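The `sorted(..., key=...)` call in `load_array_3d` matters because a plain alphabetical sort would put slice 10 before slice 2. A short sketch of the difference, assuming file names of the form `Image-<n>.png` (the names below are hypothetical examples that fit the sort key used above):

```python
# Hypothetical slice names in the '<prefix>-<n>.png' pattern the sort key expects
names = ['Image-10.png', 'Image-2.png', 'Image-1.png']

def slice_number(name):
    # 'Image-10.png' -> 10
    return int(name.split('-')[1].split('.png')[0])

alphabetical = sorted(names)             # string order: '10' sorts before '2'
numeric = sorted(names, key=slice_number)  # slice order: 1, 2, 10

print(alphabetical)
print(numeric)
```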

In [None]:
# Visualize

def plot_slices(sample, num_rows, num_columns, from_train_gen = False, augment=False):
    f, axarr = plt.subplots(
        num_rows,
        num_columns,
        figsize=(num_columns*2, num_rows*2)
    )
    f.suptitle("MRI Slices Visualized\nAugmentation: %s"%str(augment))
    
    if from_train_gen: num_slices = sample.shape[0]
    else: num_slices = sample.shape[2]
    slices_to_increment = num_slices / (num_rows * num_columns)
    for r in range(num_rows):
        for c in range(num_columns):
            slice_index = np.floor((r*num_columns + c) * slices_to_increment)
            if from_train_gen: axarr[r, c].imshow(sample[slice_index.astype(int)],cmap='gray')
            else: axarr[r, c].imshow(sample[:,:,slice_index.astype(int)],cmap='gray')
                
            axarr[r, c].axis('off')
            if not from_train_gen: axarr[r, c].set_title("Slice %i"%(slice_index+1))
    plt.subplots_adjust(wspace=0, hspace=0, left=0, right=1, bottom=0, top=1)
    plt.tight_layout()
    plt.show()
    
sample = load_array_3d('00000','FLAIR',augment=False)
plot_slices(sample, 4, 4); del sample
sample = load_array_3d('00000','FLAIR',augment=True)
plot_slices(sample, 4, 4, augment=True); del sample
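`plot_slices` picks evenly spaced slices so that a 4x4 grid covers the whole volume. The index arithmetic can be sketched on its own (the helper name `grid_slice_indices` is made up for this illustration):

```python
import numpy as np

def grid_slice_indices(num_slices, num_rows, num_columns):
    # Same arithmetic as plot_slices: one slice per grid cell, evenly spaced
    step = num_slices / (num_rows * num_columns)
    return [int(np.floor(i * step)) for i in range(num_rows * num_columns)]

# A 400-slice volume on a 4x4 grid shows every 25th slice
print(grid_slice_indices(400, 4, 4))

# With fewer slices than grid cells, some slices are shown more than once
few = grid_slice_indices(10, 4, 4)
print(few)
```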

In [None]:
# Data flow
from sklearn.model_selection import train_test_split
from time import sleep
import random
import gc

train_df, val_df = train_test_split(train_labels, test_size=0.2, random_state= 420)

# Multiprocessing
import multiprocessing as mp
MULTIPROCESS = False
NCORE = 4

def load_array_and_labels(patient_id, img_mode, augment, testing = False):
    # Pick the split directory, then load every slice for this patient and mode
    base_path = TEST_DATA_PATH if testing else TRAIN_DATA_PATH
    path = base_path+str(int(patient_id['BraTS21ID'])).zfill(5)+'/'+img_mode
    img_paths = os.listdir(path)
    random.shuffle(img_paths)

    batch_data = []

    for img_path in img_paths:
        batch_data.append(load_array(path, img_path))

    if augment: batch_data = image_augment(batch_data)

    # Test data has no labels; training data gets one (patient-level) label per slice
    if testing: return batch_data
    return batch_data, [patient_id['MGMT_value']] * len(batch_data)

def data_generator(df, img_mode: str, batch_size = 8, db_mult = 20, shuffle_per = 10, augment=True):
    img_batch, target_batch = [], []
    while True:
        # Shuffle dataframe
        df = df.sample(frac=1).reset_index(drop=True)
        shuffle_index = 0
        
        # Multiprocess
        if MULTIPROCESS: 
            pool = mp.Pool(NCORE)
            patient_id_queue = []
        for patient_id in df.iloc:      
            if patient_id[img_mode] == 0: continue
              
            if MULTIPROCESS: 
                patient_id_queue.append(patient_id)    
                if len(patient_id_queue) >= NCORE:
                    jobs = []
                    for task in patient_id_queue:
                        jobs.append(pool.apply_async(load_array_and_labels, (task,img_mode,augment)))
                    for job in jobs:
                        preprocessed_data = job.get(timeout=30)
                        img_batch.extend(preprocessed_data[0])
                        target_batch.extend(preprocessed_data[1])

                    patient_id_queue = []    
            else:
                preprocessed_data = load_array_and_labels(patient_id,img_mode, augment)
                img_batch.extend(preprocessed_data[0])
                target_batch.extend(preprocessed_data[1])
                
            while len(target_batch) > (batch_size * db_mult): 
                if shuffle_index % shuffle_per == 0: 
                    c = list(zip(img_batch, target_batch))
                    random.shuffle(c)
                    img_batch, target_batch = zip(*c)
                    img_batch, target_batch = list(img_batch), list(target_batch)
                    del c
                    
                if IMG_CHANNELS == 1:
                    yield np.array(img_batch[:batch_size])[:,:,:,np.newaxis], np.array(target_batch[:batch_size])
                else:
                    yield np.repeat(np.array(img_batch[:batch_size])[:,:,:,np.newaxis], IMG_CHANNELS, -1), np.array(target_batch[:batch_size])

                img_batch = img_batch[batch_size:]
                target_batch = target_batch[batch_size:]
                shuffle_index += 1
                gc.collect()
                                
        # The pool only exists when multiprocessing is enabled
        if MULTIPROCESS: pool.close()

In [None]:
def timeit():
    starttime = time.time()
    datagen = data_generator(train_df,batch_size=64,img_mode='FLAIR')
    
    times = []
    for i in range(6):
        t = next(iter(datagen)); del t
        if i % 5 == 0: times.append("Batch %i :"%i +str(round(time.time()-starttime,2)))
    for line in times: print(line)
    return times[-1].split(':')[1]

MULTIPROCESS = False
print("Without multiprocessing:",timeit())
MULTIPROCESS = True
print("With multiprocessing:",timeit())
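The generator yields grayscale slices copied across `IMG_CHANNELS` channels so that they fit a backbone expecting 3-channel input. A small sketch of that reshaping step on a random toy batch (the 64x64 size is only to keep the example light, not the notebook's `IMG_DIMS`):

```python
import numpy as np

# Toy batch of 8 grayscale slices; random values stand in for real images
rng = np.random.default_rng(0)
batch = rng.integers(0, 256, size=(8, 64, 64), dtype=np.uint8)

single = batch[:, :, :, np.newaxis]  # add a channel axis: (8, 64, 64, 1)
rgb = np.repeat(single, 3, -1)       # copy that channel 3 times: (8, 64, 64, 3)

print(single.shape, rgb.shape)
```

All three channels hold identical data, so no information is added; the copy only satisfies the input shape.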

### Efficientnet Benchmark

In [None]:
!pip install -q efficientnet

In [None]:
import efficientnet.tfkeras as efn
import tensorflow as tf
tf.compat.v1.disable_eager_execution()
print("Eager execution on:",tf.executing_eagerly())
import tensorflow.keras as keras
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.utils import plot_model

l_in = layers.Input((IMG_DIMS[0],IMG_DIMS[1],IMG_CHANNELS,))
# Build the backbone as its own model object so it can be frozen before it is called on the input
effnet_base = efn.EfficientNetB5(weights='noisy-student', input_shape = (IMG_DIMS[0], IMG_DIMS[1], IMG_CHANNELS), include_top=False, drop_connect_rate=0.2)  # or weights='imagenet'
effnet_base.trainable = False  # freeze pretrained weights for the warmup phase
l_pool_1 = layers.GlobalAveragePooling2D()(effnet_base(l_in))
l_dense_1 = layers.Dense(128, activation='relu')(l_pool_1)
l_out = layers.Dense(1, activation='sigmoid')(l_dense_1)
model = models.Model(l_in, l_out)

model.summary()
plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)

In [None]:
BATCH_SIZE = 32
EPOCHS = 20

In [None]:
import tensorflow.keras as keras

lr_schedule = keras.optimizers.schedules.ExponentialDecay(
    1.25e-5*BATCH_SIZE, decay_steps=(25*len(train_df)//BATCH_SIZE), decay_rate=0.85, staircase=True
)
plt.title("Learning rate over %i epochs:"%EPOCHS)
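decay_steps = 25*len(train_df)//BATCH_SIZE  # same value as decay_steps passed to ExponentialDecay above
x = range(0, decay_steps * EPOCHS, 10)
def visualize_lr_schedule(step):
    # Staircase decay: the LR is multiplied by 0.85 once every decay_steps steps
    return 1.25e-5*BATCH_SIZE * 0.85 ** (step//decay_steps)
y = [visualize_lr_schedule(i) for i in x]
sns.lineplot(x=list(x), y=y)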
print("Starting LR:",max(y))
print("Ending LR:",min(y))
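With `staircase=True`, the exponential decay schedule holds the learning rate constant within each block of `decay_steps` steps and multiplies it by `decay_rate` at each block boundary. A worked sketch in plain Python, where `decay_steps = 100` is a made-up value chosen only to keep the numbers readable:

```python
BATCH_SIZE = 32
initial_lr = 1.25e-5 * BATCH_SIZE  # = 4e-4
decay_rate = 0.85
decay_steps = 100  # hypothetical; the notebook derives this from len(train_df)

def staircase_lr(step):
    # lr = initial_lr * decay_rate ** floor(step / decay_steps)
    return initial_lr * decay_rate ** (step // decay_steps)

for step in (0, 99, 100, 250):
    print(step, staircase_lr(step))
```

Steps 0 and 99 share the starting rate, step 100 is one decay below it, and step 250 is two decays below it.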

In [None]:
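# Visualize processed
img_mode = 'FLAIR'  # set explicitly; this was previously the value left over from the loop in an earlier cell
train_datagen = data_generator(train_df, img_mode, batch_size=BATCH_SIZE, augment=True)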
sample_batch = next(iter(train_datagen))
plot_slices(sample_batch[0], 4, 4, from_train_gen=True, augment=True)
print(sample_batch[0].shape, sample_batch[1].shape)

In [None]:
# Check for weirdness in class distribution in train sample
sns.histplot(sample_batch[1])

### Training:
#### Note: run the block of code below four times to train all the models, setting img_mode to one of the following each time:
- 'T1w'
- 'T2w'
- 'T1wCE'
- 'FLAIR'
#### Download the models after training is complete, and restart the kernel.
A memory leak causes RAM to fill up during training and will crash your kernel if you try to train two models in a row. I still haven't found a fix.

In [None]:
import gc
import pickle as pkl

# Select one
img_modes = ['T1w','T1wCE','T2w','FLAIR']
img_mode = 'T1wCE'

print("Num train samples:",len(train_df))
print("Num validation samples:",len(val_df))

# Clear garbage upon the end of every epoch
class gc_callback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        gc.collect()
garbage_callback = gc_callback()

# Load datasets & train model
print("Now training model on:", img_mode)
train_datagen = data_generator(train_df, img_mode, batch_size=BATCH_SIZE, augment=True)
val_datagen = data_generator(val_df, img_mode, batch_size=BATCH_SIZE, augment=False)

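# Name the AUC metric explicitly so the history keys are 'AUC' / 'val_AUC', as used in the plots below
model.compile(loss='binary_crossentropy', optimizer = optimizers.Adam(learning_rate=lr_schedule), metrics=[tf.keras.metrics.AUC(name='AUC'),'acc','mae'])
es_callback = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    min_delta=0,
    patience=5,
    restore_best_weights=False
)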

warmup_history = model.fit(train_datagen, epochs=int(EPOCHS//5), validation_data=val_datagen, use_multiprocessing=False, workers=1, steps_per_epoch = int(25*len(train_df)//BATCH_SIZE), validation_steps = int(25*len(val_df)//BATCH_SIZE), callbacks = [es_callback, garbage_callback])
model.save(img_mode+'_warmup.h5')

sns.lineplot(data=pd.DataFrame(warmup_history.history)[['loss','val_loss','AUC','val_AUC']])
plt.title("Warmup History: %s"%img_mode)
plt.show()

del warmup_history; gc.collect() # Save memory

# Unfreeze pretrained weights and continue training
for layer in model.layers: layer.trainable = True
model.trainable = True

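# Recompile so the trainable change takes effect; AUC is named explicitly to keep the 'AUC' / 'val_AUC' history keys
model.compile(loss='binary_crossentropy', optimizer = optimizers.Adam(learning_rate=lr_schedule), metrics=[tf.keras.metrics.AUC(name='AUC'),'acc','mae'])
es_callback = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss",
    min_delta=0,
    patience=5,
    restore_best_weights=True
)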

train_history = model.fit(train_datagen, epochs=EPOCHS, validation_data=val_datagen, use_multiprocessing=False, workers=1, steps_per_epoch = int(25*len(train_df)//BATCH_SIZE), validation_steps = int(25*len(val_df)//BATCH_SIZE), callbacks = [es_callback, garbage_callback])
sns.lineplot(data=pd.DataFrame(train_history.history)[['loss','val_loss','AUC','val_AUC']])
plt.title("Training History: %s"%img_mode)

del train_datagen, val_datagen, train_history # Save memory
gc.collect()

plt.show()

### Evaluation

In [None]:
import efficientnet.tfkeras as efn
import tensorflow as tf
tf.compat.v1.disable_eager_execution()
print("Eager execution on:",tf.executing_eagerly())
import tensorflow.keras as keras
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.utils import plot_model

# Load model if you have a pretrained one
#model = models.load_model('../input/btr-classification-models/FLAIR.h5')

In [None]:
from tqdm import tqdm, trange

def evaluate(model, val_df, img_mode, augment=False):
    true, preds = [],[]
    for patient_id in tqdm(val_df.iloc, total=len(val_df)):
        if patient_id[img_mode] == 0: preds.append(0.5)
        else:
            img_batch = load_array_and_labels(patient_id, img_mode, augment=augment)[0]
            
            if IMG_CHANNELS == 1: img_batch = np.array(img_batch)[:,:,:,np.newaxis]
            else: img_batch = np.repeat(np.array(img_batch)[:,:,:,np.newaxis], IMG_CHANNELS, -1)

            preds.append(np.mean(model.predict(np.array(img_batch))))
        true.append(patient_id['MGMT_value'])
    return true, preds

true_noaug, preds_noaug = evaluate(model, val_df, img_mode, augment=False)
true_aug, preds_aug = evaluate(model, val_df, img_mode, augment=True)
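`evaluate` turns many slice-level predictions into a single patient-level score by averaging, and falls back to 0.5 when the chosen scan mode is missing. That aggregation rule can be sketched in isolation (the helper name `patient_score` and the example probabilities are made up):

```python
import numpy as np

def patient_score(slice_probs):
    # slice_probs: per-slice predicted probabilities, or None if the mode is missing
    if slice_probs is None or len(slice_probs) == 0:
        return 0.5  # uninformative prediction for a missing scan mode
    return float(np.mean(slice_probs))

print(patient_score([0.2, 0.4, 0.6]))  # mean of the slice predictions
print(patient_score(None))             # fallback for a missing mode
```

Averaging treats every slice equally, including slices that show no tumor, which is one reason slice-level models can struggle on this task.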

In [None]:
from sklearn.metrics import roc_curve, auc

def display_roc_curve(true_list, preds_list, name_list):
    plt.figure(1)
    plt.plot([0, 1], [0, 1], 'k--')
    for true, preds, name in zip(true_list, preds_list, name_list):
        fpr, tpr, thresholds_rf = roc_curve(true, preds, pos_label = 1)
        auc_rf = auc(fpr, tpr)
        # plt.plot draws the ROC points in order; sns.lineplot would average tpr over repeated fpr values
        plt.plot(fpr, tpr, label=name+' (area = {:.3f})'.format(auc_rf))
    plt.xlabel('False positive rate')
    plt.ylabel('True positive rate')
    plt.title('ROC curve')
    plt.legend(loc='best')
    plt.show()
    # Zoom in view of the upper left corner.
    plt.figure(2)
    plt.xlim(0, 0.5)
    plt.ylim(0.3, 1)
    plt.plot([0, 1], [0, 1], 'k--')
    for true, preds, name in zip(true_list, preds_list, name_list):
        fpr, tpr, thresholds_rf = roc_curve(true, preds, pos_label = 1)
        auc_rf = auc(fpr, tpr)
        # plt.plot draws the ROC points in order; sns.lineplot would average tpr over repeated fpr values
        plt.plot(fpr, tpr, label=name+' (area = {:.3f})'.format(auc_rf))
    plt.xlabel('False positive rate')
    plt.ylabel('True positive rate')
    plt.title('ROC curve (zoomed in at top left)')
    plt.legend(loc='best')
    plt.show()

In [None]:
img_mode = 'T1wCE'
true_list = [true_noaug,
             true_aug]
preds_list = [preds_noaug,
             preds_aug]
name_list = ["%s - No Augmentation"%img_mode,
            "%s - With Augmentation"%img_mode]
display_roc_curve(true_list, preds_list, name_list)
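The area under the ROC curve equals the probability that a randomly chosen positive patient scores higher than a randomly chosen negative one, with ties counted as half. The sketch below computes it that way in NumPy rather than through `roc_curve`/`auc`; it is an illustration of the metric, not a replacement for the sklearn calls above, and `pairwise_auc` is a made-up helper name.

```python
import numpy as np

def pairwise_auc(true, preds):
    true = np.asarray(true)
    preds = np.asarray(preds, dtype=float)
    pos = preds[true == 1]
    neg = preds[true == 0]
    # Compare every positive score against every negative score
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum()
    ties = (diff == 0).sum()
    return float((wins + 0.5 * ties) / diff.size)

# Three of the four positive/negative pairs are ranked correctly
print(pairwise_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75

# Constant predictions (e.g. all 0.5) carry no ranking information
print(pairwise_auc([0, 1, 0, 1], [0.5, 0.5, 0.5, 0.5]))   # 0.5
```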

#### Prediction

In [None]:
# Import libraries
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

# Define data paths
TRAIN_DATA_PATH = '../input/rsna-miccai-png/train/'
TEST_DATA_PATH = '../input/rsna-miccai-png/test/'

submission_df = pd.read_csv('../input/rsna-miccai-brain-tumor-radiogenomic-classification/sample_submission.csv')
# Mark on the dataframe whether there are missing values for one or more MRI modes
import os

img_modes = set(['T1w','T1wCE','T2w','FLAIR'])

mode_valid = {'T1w':[],
             'T1wCE':[],
             'T2w':[],
             'FLAIR':[]}

for patient_id in submission_df.iloc:
    subdirs = set(os.listdir(TEST_DATA_PATH + str(int(patient_id['BraTS21ID'])).zfill(5)))
    for img_mode in img_modes: mode_valid[img_mode].append(int(img_mode in subdirs))
submission_df = submission_df.merge(pd.DataFrame(mode_valid),left_index=True,right_index=True)
print("Sample submission:")
submission_df

In [None]:
from tqdm import tqdm, trange

IMG_CHANNELS = 3
img_mode = 'T1wCE'
def make_preds(model, submission_df, img_mode):
    preds = []
    for patient_id in tqdm(submission_df.iloc, total=len(submission_df)):
        if patient_id[img_mode] == 0: preds.append(0.5)
        else:
            img_batch = load_array_and_labels(patient_id, img_mode, augment=False, testing = True)
            
            if IMG_CHANNELS == 1: img_batch = np.array(img_batch)[:,:,:,np.newaxis]
            else: img_batch = np.repeat(np.array(img_batch)[:,:,:,np.newaxis], IMG_CHANNELS, -1)

            preds.append(np.mean(model.predict(np.array(img_batch))))
    return preds

preds = make_preds(model, submission_df, img_mode)

In [None]:
submission_df['MGMT_value'] = preds
submission_df

In [None]:
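# Write only the ID and prediction columns, without the dataframe index or the helper mode columns
submission_df[['BraTS21ID','MGMT_value']].to_csv('submission.csv', index=False)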