<h1 style="border:2px solid Purple;text-align:center">TensorFlow Implementation</h1>

**This is a simple baseline implemented in TensorFlow, meant for anyone who wants to get started with this competition.**

**Do remember to upvote if you liked the content :)**

I took a lot of help from Harveen's Notebook [here](https://www.kaggle.com/harveenchadha/efficientnetb3-tf2-keras-baseline/notebook), please upvote his work as well.

<h1 style="border:2px solid Purple;text-align:center">Basic Imports</h1>

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

import tensorflow as tf
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.layers import Dense, Dropout, Flatten,GlobalAveragePooling2D,BatchNormalization, Activation
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from sklearn.model_selection import train_test_split
from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession
from tensorflow import keras
from tensorflow.keras.applications import EfficientNetB3


import os
import glob

config = ConfigProto()
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)

<h1 style="border:2px solid Purple;text-align:center">Configurations</h1>

In [None]:
class Config:
    
    seed = 44
    num_classes = 11
    class_list = [0,1,2,3,4,5,6,7,8,9,10]
    batch_size = 16
    n_epochs = 14
    drop_rate = 0.4
   
    #scheduler = 'CosineAnnealingLR'
    
    scheduler_params = {           
                    'ReduceLROnPlateau': 
                        {
                            'monitor':'val_loss',
                            'mode':'min',
                            'factor':0.5,
                            'patience':2,
                            'threshold':0.0001,
                            'threshold_mode':'rel',
                            'cooldown':0,
                            'min_lr':1e-5,
                            'eps':1e-08,
                            'verbose':True
                        },
                
                }
    
    
    # optimizer
    optimizer = tf.keras.optimizers.Adam(learning_rate = 1e-4)

    # criterion
    loss = 'binary_crossentropy'
    
    target_size_dim = 500
    
    metrics = tf.keras.metrics.AUC(multi_label=True)
    
    reduceLROnPlat = ReduceLROnPlateau(monitor='val_loss', factor=0.8, patience=2, verbose=1, mode='auto', min_delta=0.0001, 
                                       cooldown=5, min_lr=0.00001)
    
    checkpoint = ModelCheckpoint('best_model.hdf5', 
                             monitor= 'val_loss', 
                             verbose=1, 
                             save_best_only=True, 
                             mode= 'min', 
                             save_weights_only = False)

    checkpoint_last = ModelCheckpoint('last_model.hdf5', 
                             monitor= 'val_loss', 
                             verbose=1, 
                             save_best_only=False, 
                             mode= 'min', 
                             save_weights_only = False)


    early = EarlyStopping(monitor= 'val_loss', 
                      mode= 'min', 
                      patience=5)
    
    labels = [
                'ETT - Abnormal',
                'ETT - Borderline',
                'ETT - Normal',
                'NGT - Abnormal',
                'NGT - Borderline',
                'NGT - Incompletely Imaged',
                'NGT - Normal', 
                'CVC - Abnormal',
                'CVC - Borderline',
                'CVC - Normal',
                'Swan Ganz Catheter Present'
            ]
    
    paths = {
                'train_path':'../input/ranzcr-clip-trainset-256x256',
                'test_path': '../input/ranzcr-clip-catheter-line-classification/test',
                'csv_path': '../input/ranzcr-clip-catheter-line-classification/train.csv',
                'model_weight_path_folder': '../input/tfkerasefficientnetimagenetnotop'
            }

In [None]:
config = Config

In [None]:
type(config.reduceLROnPlat)

In [None]:
np.random.seed(config.seed)
tf.random.set_seed(config.seed)
os.environ['PYTHONHASHSEED'] = str(config.seed)

<h1 style="border:2px solid Purple;text-align:center">Prepare Data</h1>

In [None]:
df = pd.read_csv(config.paths['csv_path'])
df.head()

    Columns of the Dataframe    
    
    StudyInstanceUID --> unique ID for each image
    ETT - Abnormal --> endotracheal tube placement abnormal
    ETT - Borderline --> endotracheal tube placement borderline abnormal
    ETT - Normal --> endotracheal tube placement normal
    NGT - Abnormal --> nasogastric tube placement abnormal
    NGT - Borderline --> nasogastric tube placement borderline abnormal
    NGT - Incompletely Imaged --> nasogastric tube placement inconclusive due to imaging
    NGT - Normal --> nasogastric tube placement normal
    CVC - Abnormal --> central venous catheter placement abnormal
    CVC - Borderline --> central venous catheter placement borderline abnormal
    CVC - Normal --> central venous catheter placement normal
    Swan Ganz Catheter Present
    PatientID --> unique ID for each patient in the dataset

In [None]:
print('Number of Records: {} and Number of Patients: {}'.format(len(df), df['PatientID'].nunique()))

Number of records per patient

In [None]:
df['PatientID'].value_counts()

In [None]:
df['path'] = '../input/ranzcr-clip-catheter-line-classification/train/' + df['StudyInstanceUID']+'.jpg'

<h1 style="border:2px solid Purple;text-align:center">Data Loader</h1>

Split the DataFrame into training and validation sets

In [None]:
X_train, X_valid = train_test_split(df, test_size = 0.1, random_state=config.seed, shuffle=True)
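
The PatientID counts above show that one patient can contribute several images, so a purely random split may put images of the same patient in both sets. The sketch below shows one possible patient-grouped alternative. It assumes scikit-learn's `GroupShuffleSplit` and a made-up toy frame (`toy_df` and its values are hypothetical); it is not what this notebook uses for training.

```python
import pandas as pd
from sklearn.model_selection import GroupShuffleSplit

# Hypothetical toy frame: 10 images belonging to 5 patients.
toy_df = pd.DataFrame({
    'StudyInstanceUID': [f'img_{i}' for i in range(10)],
    'PatientID': ['p0', 'p0', 'p1', 'p1', 'p1', 'p2', 'p3', 'p3', 'p4', 'p4'],
})

# Split on patients rather than on rows, so each patient lands in one set only.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.2, random_state=44)
train_idx, valid_idx = next(splitter.split(toy_df, groups=toy_df['PatientID']))
toy_train, toy_valid = toy_df.iloc[train_idx], toy_df.iloc[valid_idx]

# Patients that appear in both splits (expected to be none).
shared = set(toy_train['PatientID']) & set(toy_valid['PatientID'])
print('Patients in both splits:', shared)
```

With the real data, `toy_df` would be replaced by `df` and the resulting frames used in place of `X_train` and `X_valid`.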

tf.data.Dataset.from_tensor_slices --> Creates a Dataset whose elements are slices of the given tensors.

The given tensors are sliced along their first dimension. This operation preserves the structure of the input tensors, removing the first dimension of each tensor and using it as the dataset dimension. All input tensors must have the same size in their first dimensions.
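
As a small illustration of the slicing described above, the sketch below builds a dataset from two toy arrays. The file names and labels are made up and only stand in for the real path and label columns.

```python
import numpy as np
import tensorflow as tf

# Hypothetical toy inputs: 3 file names and a 3x2 multi-hot label matrix.
toy_paths = np.array(['a.jpg', 'b.jpg', 'c.jpg'])
toy_labels = np.array([[1, 0], [0, 1], [1, 1]])

toy_ds = tf.data.Dataset.from_tensor_slices((toy_paths, toy_labels))

# Both arrays are sliced along axis 0, so the dataset has 3 elements,
# each a (scalar string, vector of shape (2,)) pair.
elements = [(p.numpy().decode(), l.numpy().tolist()) for p, l in toy_ds]
print(elements)
```

The first dimension (3) becomes the number of dataset elements, and the tuple structure of the input is preserved in every element.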

In [None]:
Train_df = tf.data.Dataset.from_tensor_slices((X_train.path.values, X_train[config.labels].values))

Valid_df = tf.data.Dataset.from_tensor_slices((X_valid.path.values, X_valid[config.labels].values))

In [None]:
for path, label in Train_df.take(5):
    print ('Path: {}, Label: {}'.format(path, label))

<h1 style="border:2px solid Purple;text-align:center">Data Generator</h1>

In [None]:
def process_data_train(image_path, label):
    
    img = tf.io.read_file(image_path)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.random_brightness(img, 0.3)
    img = tf.image.random_flip_left_right(img)
    img = tf.image.resize(img, [config.target_size_dim, config.target_size_dim])
    #img = tf.image.per_image_standardization(img)
    
    return img, label

def process_data_valid(image_path, label):
    
    img = tf.io.read_file(image_path)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, [config.target_size_dim, config.target_size_dim])
    
    return img, label

In [None]:
Train_df = Train_df.map(process_data_train, num_parallel_calls=tf.data.experimental.AUTOTUNE)
Valid_df = Valid_df.map(process_data_valid, num_parallel_calls=tf.data.experimental.AUTOTUNE)

In [None]:
for image, label in Train_df.take(1):
    
    plt.imshow(image.numpy().astype('uint8'))
    plt.show()
    print("Image shape: ", image.numpy().shape)
    print("Label: ", config.labels[np.argmax(label.numpy())])
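
Each image can carry several labels at once, so `np.argmax` reports only the first positive one. A minimal sketch that lists every active label for a multi-hot vector is shown below; the label vector is made up for illustration.

```python
import numpy as np

labels = [
    'ETT - Abnormal', 'ETT - Borderline', 'ETT - Normal',
    'NGT - Abnormal', 'NGT - Borderline', 'NGT - Incompletely Imaged',
    'NGT - Normal', 'CVC - Abnormal', 'CVC - Borderline', 'CVC - Normal',
    'Swan Ganz Catheter Present',
]

# Hypothetical multi-hot target with three positive labels.
toy_label = np.array([0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0])

# argmax returns only the index of the first maximum.
first_only = labels[np.argmax(toy_label)]

# Collect every label whose flag is set.
active = [name for name, flag in zip(labels, toy_label) if flag == 1]
print(first_only, active)
```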

In [None]:
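def configure_for_performance(ds, batch_size = config.batch_size):
    
    # Note: the cache sits after the augmenting map, so the cached images are
    # the already-augmented ones and are reused in later epochs.
    ds = ds.cache('/kaggle/dump.tfcache') 
    # Shuffle before repeat so each pass over the data finishes before the next starts.
    ds = ds.shuffle(buffer_size=1024)
    ds = ds.repeat()
    ds = ds.batch(batch_size)
    ds = ds.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)
    
    return ds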

train_ds_batch = configure_for_performance(Train_df)
valid_ds_batch = Valid_df.batch(config.batch_size*2)

In [None]:
image_batch, label_batch = next(iter(train_ds_batch))

In [None]:
plt.figure(figsize=(10, 10))
for i in range(16):
    
    ax = plt.subplot(4, 4, i + 1)
    plt.imshow(image_batch[i].numpy().astype("uint8"))
    label = config.labels[np.argmax(label_batch[i].numpy())]
    plt.title(label)
    plt.axis("off")

In [None]:
data_augmentation = keras.Sequential(
    [
        tf.keras.layers.experimental.preprocessing.RandomRotation(0.05, interpolation='nearest'),
    ]
)

In [None]:
plt.figure(figsize=(10, 10))

# Augment the batch once; training=True ensures the random rotation is applied.
augmented_images = data_augmentation(image_batch, training=True)

for i in range(16):
    
    ax = plt.subplot(4, 4, i + 1)
    plt.imshow(augmented_images[i].numpy().astype("uint8"))
    label = config.labels[np.argmax(label_batch[i].numpy())]
    plt.title(label)
    plt.axis("off")

<h1 style="border:2px solid Purple;text-align:center">Model Creation</h1>

In [None]:
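def load_pretrained_model(weights_path, drop_connect, target_size_dim, layers_to_unfreeze=5):
    
    # target_size_dim and layers_to_unfreeze are currently unused: the base
    # model is built without a fixed input shape and all layers stay trainable.
    model = EfficientNetB3(
            weights=None, 
            include_top=False, 
            drop_connect_rate=drop_connect
        )
    
    # Load the ImageNet weights (no top) from the local file.
    model.load_weights(weights_path)
    
    model.trainable = True

    return model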

def build_my_model(base_model, optimizer, metrics, loss):
    
    inputs = tf.keras.layers.Input(shape=(config.target_size_dim, config.target_size_dim, 3))
    x = data_augmentation(inputs)
    outputs_eff = base_model(x)
    global_avg_pooling = GlobalAveragePooling2D()(outputs_eff)
    dense_1= Dense(256)(global_avg_pooling)
    bn_1 = BatchNormalization()(dense_1)
    activation = Activation('relu')(bn_1)
    dropout = Dropout(0.3)(activation)
    dense_2 = Dense(len(config.labels), activation='sigmoid')(dropout)

    my_model = tf.keras.Model(inputs, dense_2)
    
    my_model.compile(
        optimizer=optimizer,
        loss=loss,
        metrics=metrics
    )
    
    return my_model

In [None]:
model_weights_path = '../input/tfkerasefficientnetimagenetnotop/efficientnetb3_notop.h5'
model_weights_path


In [None]:
base_model = load_pretrained_model(model_weights_path, config.drop_rate, config.target_size_dim)

my_model = build_my_model(base_model, config.optimizer, metrics = [config.metrics], loss=config.loss)

my_model.summary()

In [None]:
callbacks_list = [config.checkpoint, config.checkpoint_last, config.early, config.reduceLROnPlat]

<h1 style="border:2px solid Purple;text-align:center">Model Training</h1>

In [None]:
steps_per_epoch = len(X_train) // config.batch_size

history = my_model.fit(
                          train_ds_batch, 
                          validation_data = valid_ds_batch, 
                          epochs = config.n_epochs, 
                          callbacks = callbacks_list,
                          steps_per_epoch = steps_per_epoch
                      )

<h1 style="border:2px solid Purple;text-align:center">Model Evaluation</h1>

In [None]:
my_model.load_weights('best_model.hdf5') # load the best checkpoint; otherwise the metrics reflect the last epoch rather than the best one

In [None]:
pred_valid_y = my_model.predict(valid_ds_batch, verbose = True, workers=4)
pred_valid_y_labels = np.argmax(pred_valid_y, axis=-1)

In [None]:
valid_labels = np.concatenate([y.numpy() for x, y in valid_ds_batch], axis=0)

In [None]:
valid_labels
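
The validation labels and predictions collected above can be scored per label. The sketch below is one way to do this, assuming scikit-learn's `roc_auc_score`; the small arrays are made up and stand in for `valid_labels` and `pred_valid_y`.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Made-up stand-ins for `valid_labels` and `pred_valid_y` (4 samples, 2 labels).
toy_true = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
toy_pred = np.array([[0.9, 0.2], [0.1, 0.8], [0.7, 0.3], [0.3, 0.4]])

# One AUC per label column, then the unweighted mean across labels.
per_label_auc = [
    roc_auc_score(toy_true[:, i], toy_pred[:, i])
    for i in range(toy_true.shape[1])
]
mean_auc = float(np.mean(per_label_auc))
print(per_label_auc, mean_auc)
```

Note that `roc_auc_score` raises an error for a column that contains only one class, so rare labels may need a larger validation set.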

<h1 style="border:2px solid Purple;text-align:center">Test Predictions</h1>

In [None]:
test_images = glob.glob('../input/ranzcr-clip-catheter-line-classification/test/*.jpg')

In [None]:
df_test = pd.DataFrame(np.array(test_images), columns=['Path'])
df_test.head()

In [None]:
test_ds = tf.data.Dataset.from_tensor_slices((df_test.Path.values))


def process_test(image_path):

    img = tf.io.read_file(image_path)
    img = tf.image.decode_jpeg(img, channels=3)
    img = tf.image.resize(img, [config.target_size_dim, config.target_size_dim])
    
    return img
    
test_ds = test_ds.map(process_test, num_parallel_calls=tf.data.experimental.AUTOTUNE).batch(config.batch_size*2)

In [None]:
pred_y = my_model.predict(test_ds, workers=4, verbose=1)

In [None]:
df_ss = pd.DataFrame(pred_y, columns = config.labels)

In [None]:
df_test['image_id'] = df_test.Path.str.split('/').str[-1].str[:-4]
df_ss['StudyInstanceUID'] = df_test['image_id']
df_ss.head()

In [None]:
cols_reordered = ['StudyInstanceUID', 'ETT - Abnormal', 'ETT - Borderline', 'ETT - Normal', 'NGT - Abnormal',
       'NGT - Borderline', 'NGT - Incompletely Imaged', 'NGT - Normal',
       'CVC - Abnormal', 'CVC - Borderline', 'CVC - Normal',
       'Swan Ganz Catheter Present']

df_order = df_ss[cols_reordered]

<h1 style="border:2px solid Purple;text-align:center">Final Submission</h1>

In [None]:
df_order.to_csv('submission.csv', index=False)

**Please upvote the Notebook if you liked the content**