# DenseNet Trained with Old and New Data

---

Due to the size of the previous competition's data, I faced a few memory-related problems while training my model. In this kernel I would like to show the approach I used.

I basically split the training set into buckets and trained the model on each bucket in turn.

I'd be truly interested in discussing how else this could have been solved. So, if you faced the same issues, please comment with your ideas :)

#### Here you can find the [Inference Kernel](https://www.kaggle.com/raimonds1993/aptos19-densenet-inference-old-new-data/data?scriptVersionId=17252732)!

---

### Credits
I started this kernel by forking [APTOS 2019: DenseNet Keras Starter](https://www.kaggle.com/xhlulu/aptos-2019-densenet-keras-starter), by [Xhlulu](https://www.kaggle.com/xhlulu).

I also used [previous competition's data](https://www.kaggle.com/tanlikesmath/diabetic-retinopathy-resized) uploaded by [ilovescience](https://www.kaggle.com/tanlikesmath).

Thank you both!

### Changes

*Version 3:*
- This is the first completed version to consider. (Still without seed)
- Inference -> LB: 0.719

*Version 4:*
- Updated image size (320). In order to process the whole dataset, I load just one bucket at a time and train the model on that.
- Added seed to better evaluate the results.

*Version 5:*
- Changed train/val split: the previous competition's data is now used for training and the new competition's data for validation.

*Version 9:*
- Changed preprocessing filter in old (train) data
- Changed `ImageDataGenerator`: more augmentation

*Version 10:*
- No Cropping in old comp data

In [None]:
# To have reproducible results and compare them
nr_seed = 2019
import numpy as np 
np.random.seed(nr_seed)
import tensorflow as tf
tf.set_random_seed(nr_seed)

In [None]:
# import libraries
import json
import math
from tqdm import tqdm, tqdm_notebook
import gc
import warnings
import os

import cv2
from PIL import Image

import pandas as pd
import scipy.optimize  # binds `scipy` and makes sure the optimize submodule is loaded
import matplotlib.pyplot as plt

from keras import backend as K
from keras import layers
from keras.applications.densenet import DenseNet121
from keras.callbacks import Callback, ModelCheckpoint
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.optimizers import Adam

from sklearn.model_selection import train_test_split
from sklearn.metrics import cohen_kappa_score, accuracy_score

warnings.filterwarnings("ignore")

%matplotlib inline

In [None]:
# Image size
im_size = 320
# Batch size
BATCH_SIZE = 32

# Loading & Merging

In [None]:
# Read the label CSV files
aptos = pd.read_csv('../input/aptos2019-blindness-detection/train.csv')  # APTOS 2019 train labels
eyepacs = pd.read_csv('../input/diabetic-retinopathy-resized/trainLabels.csv')  # previous Kaggle competition labels
apt_csv = pd.read_csv('../input/aptos2019-blindness-detection/test.csv')  # APTOS 2019 test ids
messidor2 = pd.read_csv('../input/messidor2preprocess/messidor_data.csv')  # Messidor-2 labels

print(aptos.shape)
print(eyepacs.shape)
print(messidor2.shape)

In [None]:
eyepacs= eyepacs[['image','level']]
eyepacs.columns = aptos.columns
eyepacs.diagnosis.value_counts()
messidor2=messidor2[['id_code','diagnosis']]


# path columns
#aptos['id_code'] = '../input/aptos2019-blindness-detection/train_images/' + aptos['id_code'].astype(str) + '.png'
eyepacs['id_code'] = '../input/diabetic-retinopathy-resized/resized_train/resized_train/' + eyepacs['id_code'].astype(str) + '.jpeg'
messidor2['id_code']='../input/messidor2preprocess/messidor-2/messidor-2/preprocess/' + messidor2['id_code'].astype(str)

pacs_df = eyepacs.copy()#train
test_df=messidor2.copy()#test

print(pacs_df .head())
print(pacs_df .shape)
print(test_df.shape)

## Train - Valid split

In [None]:
# Let's shuffle the datasets
pacs_df = pacs_df.sample(frac=1).reset_index(drop=True)
print(pacs_df.shape)

### Process Images

Crop function: https://www.kaggle.com/ratthachat/aptos-updated-preprocessing-ben-s-cropping 

In [None]:
"""
def crop_image1(img,tol=7):
    # img is image data
    # tol  is tolerance
        
    mask = img>tol
    return img[np.ix_(mask.any(1),mask.any(0))] #np.ix_ selects certain data from image(cropping)


def crop_image_from_gray(img,tol=7):
    if img.ndim ==2: # 2d image (greyScale)
        mask = img>tol
        return img[np.ix_(mask.any(1),mask.any(0))]
    elif img.ndim==3: # RGB colored image
        gray_img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY) # turn to grey scale
        mask = gray_img>tol
        
        check_shape = img[:,:,0][np.ix_(mask.any(1),mask.any(0))].shape[0]
        if (check_shape == 0): # image is too dark so that we crop out everything,
            return img # return original image
        else:
            img1=img[:,:,0][np.ix_(mask.any(1),mask.any(0))]#red channel
            img2=img[:,:,1][np.ix_(mask.any(1),mask.any(0))]#green channel
            img3=img[:,:,2][np.ix_(mask.any(1),mask.any(0))]#blue channel
            img = np.stack([img1,img2,img3],axis=-1) #stack three channels
        return img
        """
import os
import glob
import cv2
import numpy as np
def crop_image_from_gray(img,tol=7):
    """
    Crop out black borders
    https://www.kaggle.com/ratthachat/aptos-updated-preprocessing-ben-s-cropping
    """  
    
    if img.ndim ==2:
        mask = img>tol
        return img[np.ix_(mask.any(1),mask.any(0))]
    elif img.ndim==3:
        gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        mask = gray_img>tol        
        check_shape = img[:,:,0][np.ix_(mask.any(1),mask.any(0))].shape[0]
        if (check_shape == 0):
            return img
        else:
            img1=img[:,:,0][np.ix_(mask.any(1),mask.any(0))]
            img2=img[:,:,1][np.ix_(mask.any(1),mask.any(0))]
            img3=img[:,:,2][np.ix_(mask.any(1),mask.any(0))]
            img = np.stack([img1,img2,img3],axis=-1)
        return img


def circle_crop(img):   
    """
    Create circular crop around image centre    
    """    
    
    img = cv2.imread(img)
    img = crop_image_from_gray(img)    
    
    height, width, depth = img.shape    
    
    x = int(width/2)
    y = int(height/2)
    r = np.amin((x,y))
    
    circle_img = np.zeros((height, width), np.uint8)
    cv2.circle(circle_img, (x,y), int(r), 1, thickness=-1)
    img = cv2.bitwise_and(img, img, mask=circle_img)
    img = crop_image_from_gray(img)
    
    return img 

def circle_crop_v2(img):
    """
    Create circular crop around image centre
    """
    img = cv2.imread(img)
    img = crop_image_from_gray(img)

    height, width, depth = img.shape
    largest_side = np.max((height, width))
    img = cv2.resize(img, (largest_side, largest_side))

    height, width, depth = img.shape

    x = int(width / 2)
    y = int(height / 2)
    r = np.amin((x, y))

    circle_img = np.zeros((height, width), np.uint8)
    cv2.circle(circle_img, (x, y), int(r), 1, thickness=-1)
    img = cv2.bitwise_and(img, img, mask=circle_img)
    img = crop_image_from_gray(img)

    return img    

def preprocess_image(image_path, desired_size=224):
    img = cv2.imread(image_path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # OpenCV loads BGR; convert to RGB
    img = crop_image_from_gray(img)
    img = cv2.resize(img, (desired_size,desired_size))
    img = cv2.addWeighted(img,4,cv2.GaussianBlur(img, (0,0), desired_size/30) ,-4 ,128)
    
    return img

def preprocess_image_old(image_path, desired_size=im_size):
    #img = cv2.imread(image_path)
    #img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = circle_crop_v2(image_path) #already cropped
    img = cv2.resize(img, (desired_size,desired_size))
    img = cv2.addWeighted(img,4,cv2.GaussianBlur(img, (0,0), desired_size/40) ,-4 ,128)#blend two images
    
    return img




In [None]:
def display_samples(df, columns=4, rows=3):
    fig = plt.figure(figsize=(5 * columns, 4 * rows))

    for i in range(columns * rows):
        image_path = df.loc[i, 'id_code']
        image_id = df.loc[i, 'diagnosis']
        #img = cv2.imread(f'{image_path}')
        #img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        img = preprocess_image_old(image_path)
        #img = cv2.resize(img, (im_size, im_size))
        #img = cv2.addWeighted(img, 4, cv2.GaussianBlur(img, (0, 0), im_size / 40), -4, 128)

        fig.add_subplot(rows, columns, i + 1)
        plt.title(image_id)
        plt.imshow(img)

    plt.tight_layout()

display_samples(pacs_df)

# Processing Images

__UPDATE:__ Here we load only the validation set. To use 320x320 images, each training bucket is loaded only when it is needed, which lets the code run without memory-related errors.
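The bucket idea can be sketched without any images: split the row indices into `bucket_num` contiguous ranges and let the last range absorb whatever rounding leaves over. This is only a minimal illustration; `bucket_slices` is a hypothetical helper that does not exist in this kernel, and the row count of 10 is made up.

```python
def bucket_slices(n_rows, bucket_num):
    """Return (start, stop) index pairs that together cover range(n_rows).

    Every bucket has round(n_rows / bucket_num) rows except the last,
    which takes whatever is left.
    """
    div = round(n_rows / bucket_num)
    slices = []
    for i in range(bucket_num):
        start = i * div
        stop = (i + 1) * div if i != bucket_num - 1 else n_rows
        slices.append((start, stop))
    return slices


for start, stop in bucket_slices(n_rows=10, bucket_num=3):
    # In a real training loop, this is where one bucket of images would be
    # loaded, trained on, and deleted before moving to the next bucket.
    print(start, stop)
```

Only one bucket's image array is alive at any time, so peak memory depends on the bucket size rather than on the size of the whole training set.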

In [None]:
 """
# validation set
N = val_df.shape[0]
x_val = np.empty((N, im_size, im_size, 3), dtype=np.uint8)

for i, image_id in enumerate(tqdm_notebook(val_df['id_code'])):
    x_val[i, :, :, :] = preprocess_image(  #returns preprocessed image
        f'{image_id}',
        desired_size = im_size
    )
   """

In [None]:
x_npz = np.load("../input/fork-of-preprocess-aptos/images_array.npz")
x_val = x_npz['arr_0']
# Load the diagnosis labels that belong to the preprocessed validation images
y_val = pd.read_csv('../input/fork-of-preprocess-aptos/mycsvfile.csv')
print(x_val.shape)
print(y_val.shape)

In [None]:
print(y_val.head())
print(pacs_df.head())

In [None]:
# Binary classification: collapse grades 1-4 into a single "DR" class,
# so 0 = no DR and 1 = DR.
temp = pacs_df.copy()
temp["diagnosis"] = temp["diagnosis"].replace({2: 1, 3: 1, 4: 1})
# The one-hot columns are [no DR, DR]; forcing column 0 to 1 turns them
# into ordinal targets: [1, 0] = no DR, [1, 1] = DR.
y_train = pd.get_dummies(temp['diagnosis']).values
y_train[:, 0] = 1

# Same encoding for the test labels
temp = test_df.copy()
temp["diagnosis"] = temp["diagnosis"].replace({2: 1, 3: 1, 4: 1})
y_tests = pd.get_dummies(temp['diagnosis']).values
y_tests[:, 0] = 1

# Same encoding for the validation labels
y_val["diagnosis"] = y_val["diagnosis"].replace({2: 1, 3: 1, 4: 1})
y_val = pd.get_dummies(y_val['diagnosis']).values
y_val[:, 0] = 1

print(y_train.shape)
print(y_tests.shape)
print(y_val.shape)

In [None]:
# Targets are [1, dr]: [1 0] means no DR, [1 1] means DR is present
print(y_val[0])    # [1 0]: no DR
print(y_val[1])    # [1 1]: DR
print(y_train[0])  # [1 1]: DR
print(y_train[3])  # [1 0]: no DR


# Creating multilabels

Instead of predicting a single label, we change the target into a multilabel problem: if the target is a certain class, it encompasses all the classes before it. E.g. with the five grades 0 to 4, a class 4 retinopathy would usually be one-hot encoded as `[0, 0, 0, 0, 1]`, but in our case we will predict `[1, 1, 1, 1, 1]`. The binary targets built above follow the same idea with two columns: `[1, 0]` for no DR and `[1, 1]` for DR. For more details, please check out [Lex's kernel](https://www.kaggle.com/lextoumbourou/blindness-detection-resnet34-ordinal-targets).
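As a rough illustration of this encoding, the sketch below builds the cumulative target for a grade and decodes it back, the same way the kernel later recovers a class with `sum(axis=1) - 1`. The helper names (`ordinal_encode`, `ordinal_decode`, `decode_probs`) are hypothetical, the probabilities are made up, and plain Python lists are used instead of NumPy only to keep the example dependency-free.

```python
def ordinal_encode(label, n_classes=5):
    """Cumulative target: positions 0..label are 1, the rest are 0."""
    return [1 if k <= label else 0 for k in range(n_classes)]


def ordinal_decode(bits):
    """Recover the class index from a cumulative target."""
    return sum(bits) - 1


def decode_probs(probs, threshold=0.5):
    """Turn per-position sigmoid outputs into a class index."""
    return sum(1 for p in probs if p > threshold) - 1


print(ordinal_encode(4))                  # grade 4 switches on every position
print(ordinal_encode(0))                  # grade 0 switches on only the first
print(ordinal_decode(ordinal_encode(2)))  # round trip gives back the grade
print(decode_probs([0.99, 0.80, 0.30, 0.10, 0.05]))  # two outputs above 0.5
```

Because neighbouring grades share most of their target bits, a prediction that is off by one grade is penalised less by the loss than one that is off by several.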

In [None]:
"""
y_train_multi = np.empty(y_train.shape, dtype=y_train.dtype)
y_train_multi[:, 4] = y_train[:, 4]
y_tests_multi=np.empty(y_tests.shape, dtype=y_tests.dtype)
y_tests_multi[:, 4] = y_tests[:,4]
for i in range(3, -1, -1):
    y_train_multi[:, i] = np.logical_or(y_train[:, i], y_train_multi[:, i+1])
    y_tests_multi[:, i] = np.logical_or(y_tests[:, i], y_tests_multi[:, i+1])

y_val_multi = np.empty(y_val.shape, dtype=y_val.dtype)
y_val_multi[:, 4] = y_val[:, 4]

for i in range(3, -1, -1):
    y_val_multi[:, i] = np.logical_or(y_val[:, i], y_val_multi[:, i+1])

print("Y_train multi: {}".format(y_train_multi.shape))
print("Y_tests multi: {}".format(y_tests_multi.shape))
print("Y_val multi: {}".format(y_val_multi.shape))
"""

In [None]:
"""y_train = y_train_multi
y_tests=y_tests_multi
y_val = y_val_multi
"""

In [None]:
# Delete the dataframes that are no longer needed
del temp
del aptos
del eyepacs
del messidor2
gc.collect()

# Creating keras callback for QWK

---

I had to change this function, in order to consider the best kappa score among all the buckets.
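To make the idea concrete, here is a small dependency-free sketch of quadratic weighted kappa together with the "save only on the best score so far" rule. It is not the callback itself: `quadratic_weighted_kappa` is a hypothetical hand-written helper (the kernel uses `cohen_kappa_score` from scikit-learn, which should give the same value when every class is present), and the labels and per-epoch predictions are made up.

```python
def quadratic_weighted_kappa(y_true, y_pred, n_classes):
    """Kappa with weights (i - j)^2 / (n_classes - 1)^2.

    Assumes n_classes >= 2 and that the labels are not all identical
    (otherwise the expected disagreement is zero).
    """
    n = len(y_true)
    observed = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        observed[t][p] += 1
    hist_true = [sum(row) for row in observed]
    hist_pred = [sum(observed[i][j] for i in range(n_classes))
                 for j in range(n_classes)]

    disagreement = 0.0
    expected_disagreement = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            disagreement += weight * observed[i][j]
            expected_disagreement += weight * hist_true[i] * hist_pred[j] / n
    return 1.0 - disagreement / expected_disagreement


y_true = [0, 0, 1, 1]
epoch_predictions = [[0, 1, 1, 1], [0, 0, 1, 1], [0, 1, 1, 1]]

val_kappas = []  # kept outside the loop, so it survives across buckets
saved_at = []    # epochs at which the model would have been saved
for epoch, y_pred in enumerate(epoch_predictions):
    kappa = quadratic_weighted_kappa(y_true, y_pred, n_classes=2)
    val_kappas.append(kappa)
    if kappa == max(val_kappas):
        saved_at.append(epoch)

print(val_kappas, saved_at)
```

Keeping the score history outside the per-bucket training call is what lets a later bucket overwrite the saved model only when it beats every earlier bucket.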

In [None]:
class Metrics(Callback):

    def on_epoch_end(self, epoch, logs=None):
        # self.validation_data[0] holds the validation images, [1] the targets
        X_val, y_val = self.validation_data[:2]
        # Decode ordinal targets back to class indices, e.g. [1, 1] -> 1
        y_val = y_val.sum(axis=1) - 1

        y_pred = self.model.predict(X_val) > 0.5
        y_pred = y_pred.astype(int).sum(axis=1) - 1
        _val_kappa = cohen_kappa_score(
            y_val,
            y_pred,
            weights='quadratic'
        )

        # val_kappas is shared across buckets, so the best score is global
        self.val_kappas.append(_val_kappa)

        print(f"val_kappa: {_val_kappa:.4f}")

        if _val_kappa == max(self.val_kappas):
            print("Validation Kappa has improved. Saving model.")
            self.model.save('model.h5')

        return


# Data Generator

In [None]:
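def create_datagen():
    return ImageDataGenerator(
        # Note: Keras applies featurewise normalization only after the generator's
        # fit() has been called on sample data; fit() is never called in this kernel.
        featurewise_std_normalization = True,
        horizontal_flip = True,
        vertical_flip = True,
        rotation_range = 360
    )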

# Model: DenseNet-121

In [None]:
densenet = DenseNet121(
    weights='../input/densenet-keras/DenseNet-BC-121-32-no-top.h5',
    include_top=False,
    input_shape=(im_size,im_size,3)
)

In [None]:
def build_model():
    model = Sequential()
    model.add(densenet)
    model.add(layers.GlobalAveragePooling2D())
    model.add(layers.Dropout(0.5))
    model.add(layers.Dense(2, activation='sigmoid'))
    
    model.compile(
        loss='binary_crossentropy',
        optimizer=Adam(lr=0.0001,decay=1e-6),
        metrics=['accuracy']
    )
    
    return model

In [None]:
model = build_model()
model.summary()

# Training & Evaluation

In [None]:
#train_df = train_df.reset_index(drop=True)
bucket_num = 8
div = round(pacs_df.shape[0]/bucket_num)

In [None]:
df_init = {
    'val_loss': [0.0],
    'val_acc': [0.0],
    'loss': [0.0], 
    'acc': [0.0],
    'bucket': [0.0]
}
results = pd.DataFrame(df_init)

In [None]:
# I found that using a different number of epochs for each bucket improved performance
epochs = [5,5,10,15,15,20,20,30]
kappa_metrics = Metrics()
kappa_metrics.val_kappas = []

In [None]:
for i in range(bucket_num):
    print("Bucket Nr: {}".format(i))

    # The last bucket takes every remaining row, so rounding drops no images
    start = i * div
    stop = (i + 1) * div if i != bucket_num - 1 else pacs_df.shape[0]
    bucket_paths = pacs_df.iloc[start:stop, 0]

    # Preprocess only this bucket's images to keep memory usage bounded
    x_train = np.empty((len(bucket_paths), im_size, im_size, 3), dtype=np.uint8)
    for j, image_id in enumerate(tqdm_notebook(bucket_paths)):
        x_train[j, :, :, :] = preprocess_image_old(f'{image_id}', desired_size=im_size)

    data_generator = create_datagen().flow(x_train, y_train[start:stop, :], batch_size=BATCH_SIZE)
    history = model.fit_generator(
        data_generator,
        steps_per_epoch=math.ceil(x_train.shape[0] / BATCH_SIZE),
        epochs=epochs[i],
        validation_data=(x_val, y_val),
        callbacks=[kappa_metrics]
    )

    df_model = pd.DataFrame(history.history)
    df_model['bucket'] = i
    results = results.append(df_model)

    # Free this bucket before loading the next one
    del data_generator
    del x_train
    gc.collect()

    print('-' * 40)

In [None]:
results = results.iloc[1:]
results['kappa'] = kappa_metrics.val_kappas
results = results.reset_index()
results = results.rename(index=str, columns={"index": "epoch"})
results

In [None]:
results[['loss', 'val_loss']].plot()
results[['acc', 'val_acc']].plot()
results[['kappa']].plot()
results.to_csv('model_results.csv',index=False)

# Find best threshold

In [None]:
model.load_weights('model.h5')
y_val_pred = model.predict(x_val)

def compute_score_inv(threshold):
    y1 = y_val_pred > threshold
    y1 = y1.astype(int).sum(axis=1) - 1
    y2 = y_val.sum(axis=1) - 1
    score = cohen_kappa_score(y1, y2, weights='quadratic')
    
    return 1 - score

simplex = scipy.optimize.minimize(
    compute_score_inv, 0.5, method='nelder-mead'
)

best_threshold = simplex['x'][0]

y1 = y_val_pred > best_threshold
y1 = y1.astype(int).sum(axis=1) - 1
y2 = y_val.sum(axis=1) - 1
score = cohen_kappa_score(y1, y2, weights='quadratic')
print('Threshold: {}'.format(best_threshold))
print('Validation QWK score with best_threshold: {}'.format(score))

y1 = y_val_pred > .5
y1 = y1.astype(int).sum(axis=1) - 1
score = cohen_kappa_score(y1, y2, weights='quadratic')
print('Validation QWK score with .5 threshold: {}'.format(score))

# Testing phase

In [None]:
"""
N = test_df.shape[0]
x_tests = np.empty((N, im_size, im_size, 3), dtype=np.uint8)

for i, image_id in enumerate(tqdm_notebook(test_df['id_code'])):
    print(image_id)
    x_tests[i, :, :, :] = preprocess_image(  # returns preprocessed image
        f'{image_id}',
        desired_size = im_size
    )
    """

## [Inference Kernel](https://www.kaggle.com/raimonds1993/aptos19-densenet-inference-old-new-data/data?scriptVersionId=17252732)

**Thanks for reading it all! Please let me know if you have any ideas to improve this process.**

**Hope you liked it.**