**This is a starter notebook for [HuBMAP + HPA - Hacking the Human Body](https://www.kaggle.com/competitions/hubmap-organ-segmentation) using FPN with InceptionV3 as the backbone.**

![](https://i.postimg.cc/s24PHJDV/Screenshot-2022-06-26-at-03-05-59-Hu-BMAP-HPA-Hacking-the-Human-Body-Kaggle.png)


This notebook uses the Kaggle Docker image from 2020-11-18, which I got by forking another notebook built on that image. To run this code, you may need to fork this notebook (or another one pinned to the same image) so that you get the same environment.

<h3><font color='red'>If you like this notebook, please upvote.</font></h3>

**INSTALLING REQUIREMENTS**

In [None]:
!pip install -U ../input/kerasapplications/Keras_Applications-1.0.8-py3-none-any.whl
!pip install ../input/qubvel/efficientnet-1.0.0-py3-none-any.whl
!pip install ../input/qubvel/image_classifiers-1.0.0-py3-none-any.whl
!pip install ../input/qubvel-segmentation-model-keras-v101/segmentation_models-master

%env SM_FRAMEWORK=tf.keras

In [None]:
import os
import gc
import cv2
import glob
from tqdm import notebook
import tifffile as tiff 
import numpy as np 
import pandas as pd 
import tensorflow.keras.backend as K
import tensorflow as tf
import matplotlib.pyplot as plt

In [None]:
train_csv = pd.read_csv("../input/hubmap-organ-segmentation/train.csv")
test_csv = pd.read_csv("../input/hubmap-organ-segmentation/test.csv")
sample_submission = pd.read_csv("../input/hubmap-organ-segmentation/sample_submission.csv")

In [None]:
train_csv.head(10)

Setting up Seeds for reproducibility

In [None]:
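SEED = 42
os.environ['PYTHONHASHSEED'] = str(SEED)
# Call the seeding functions; assigning to them (e.g. `np.random.seed = SEED`)
# only overwrites the function and seeds nothing.
np.random.seed(SEED)
# Seeds TensorFlow's (and therefore tf.keras') random ops.
tf.random.set_seed(SEED)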
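Seeding only takes effect when the seed function is actually called; assigning a value to `np.random.seed` replaces the function instead of seeding anything. A minimal check, assuming only NumPy (the names `first`, `second` and `same_draws` are illustrative):

```python
import numpy as np

SEED = 42

np.random.seed(SEED)       # call the function to seed NumPy's global generator
first = np.random.rand(3)

np.random.seed(SEED)       # re-seeding restarts the same random sequence
second = np.random.rand(3)

# If seeding worked, both draws are identical.
same_draws = np.array_equal(first, second)
print(same_draws)
```

The same pattern (seed, draw, re-seed, draw again) can be used to check any other seeded library.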

In [None]:
IMG_SIZE = 480

**IMAGE AUGMENTATIONS**

I am using `HorizontalFlip`, `VerticalFlip`, `RandomRotate90`, `RandomBrightnessContrast` and `ShiftScaleRotate` here, but feel free to try other augmentation techniques such as `GridDistortion` and `OpticalDistortion`.

In [None]:
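from albumentations import (Compose, HorizontalFlip, VerticalFlip, RandomRotate90,
                            RandomBrightnessContrast, ShiftScaleRotate)

# Spatial transforms are applied to the image and its mask together.
transforms = Compose([
    HorizontalFlip(),
    VerticalFlip(),
    RandomRotate90(),
    RandomBrightnessContrast(),
    ShiftScaleRotate(shift_limit=0.0625, scale_limit=0.2, rotate_limit=15, p=0.9,
                     border_mode=cv2.BORDER_REFLECT),
], p=1.0)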
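For segmentation, the key property of these augmentations is that every spatial transform must hit the image and its mask identically. The sketch below illustrates that idea with plain NumPy flips and 90-degree rotations; it is a stand-in for illustration, not how albumentations is implemented, and `paired_flip_rotate` is a hypothetical helper name.

```python
import numpy as np

def paired_flip_rotate(image, mask, rng):
    """Apply the same random flips and 90-degree rotation to image and mask."""
    if rng.random() < 0.5:                      # horizontal flip
        image, mask = image[:, ::-1], mask[:, ::-1]
    if rng.random() < 0.5:                      # vertical flip
        image, mask = image[::-1], mask[::-1]
    k = int(rng.integers(0, 4))                 # number of 90-degree rotations
    image = np.rot90(image, k, axes=(0, 1))
    mask = np.rot90(mask, k, axes=(0, 1))
    return np.ascontiguousarray(image), np.ascontiguousarray(mask)

rng = np.random.default_rng(42)

# Toy example: one bright pixel in the image, labelled by one mask pixel.
image = np.zeros((4, 4, 3), dtype=np.uint8)
mask = np.zeros((4, 4), dtype=np.uint8)
image[0, 1] = 255
mask[0, 1] = 1

aug_image, aug_mask = paired_flip_rotate(image, mask, rng)

# The bright pixel and its label must still coincide after augmentation.
aligned = np.array_equal(aug_image[..., 0] > 0, aug_mask == 1)
print(aligned)
```

Passing `image=` and `mask=` to the `Compose` pipeline, as done later in this notebook, is what keeps the two aligned in the real augmentations.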

**LOAD DATA**

In [None]:
#https://www.kaggle.com/code/paulorzp/rle-functions-run-lenght-encode-decode

def mask2rle(img):
    '''
    img: numpy array, 1 - mask, 0 - background
    Returns the run-length encoding as a formatted string
    '''
    pixels = img.T.flatten()
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)
 
def rle2mask(mask_rle, shape):
    '''
    mask_rle: run-length encoding as a formatted string (start length)
    shape: (width, height) of array to return
    Returns numpy array of shape (height, width), 1 - mask, 0 - background
    '''
    s = mask_rle.split()
    starts, lengths = [np.asarray(x, dtype=int) for x in (s[0:][::2], s[1:][::2])]
    starts -= 1
    ends = starts + lengths
    img = np.zeros(shape[0]*shape[1], dtype=np.uint8)
    for lo, hi in zip(starts, ends):
        img[lo:hi] = 1
    return img.reshape(shape).T
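To make the encoding concrete, here is a small round trip on a non-square toy mask. The two functions are repeated so the snippet runs on its own; the toy `mask` and the variable names are illustrative. Note that the RLE is column-major and 1-indexed, and that `rle2mask` takes `shape` as `(width, height)`.

```python
import numpy as np

def mask2rle(img):
    # Column-major flatten, then find where runs of 1s start and end.
    pixels = img.T.flatten()
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)

def rle2mask(mask_rle, shape):
    # shape is (width, height); the result has shape (height, width).
    s = mask_rle.split()
    starts = np.array(s[0::2], dtype=int) - 1
    lengths = np.array(s[1::2], dtype=int)
    img = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    for lo, hi in zip(starts, starts + lengths):
        img[lo:hi] = 1
    return img.reshape(shape).T

# 3 rows (height) x 4 columns (width)
mask = np.array([[0, 1, 1, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1]], dtype=np.uint8)

rle = mask2rle(mask)
print(rle)  # 4 2 7 1 12 1

height, width = mask.shape
decoded = rle2mask(rle, shape=(width, height))
print(np.array_equal(decoded, mask))  # True
```

Reading the output: pixels are numbered down each column, so the run `4 2` covers the first two rows of the second column.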

In this demonstration I load all the data into NumPy arrays because it fits in memory, but `tf.data` is generally the better choice for data loading: it is fast and composes well with other preprocessing functions. <br>I will add this in later commits.

In [None]:
imgs = [] 
masks = []
viz = False

for each_id in notebook.tqdm(train_csv.id.values, total=train_csv.shape[0]):
    info = train_csv[train_csv.id==each_id]
    img_path = "../input/hubmap-organ-segmentation/train_images/{}.tiff".format(each_id)
    
    img = tiff.imread(img_path)
    # rle2mask expects shape as (width, height) and returns a (height, width) mask
    msk = rle2mask(mask_rle=info['rle'].values[0], 
                   shape=(info['img_width'].values[0], info['img_height'].values[0])
                  )
    sample = transforms(image = img, mask = msk)
    aug_img = sample['image']
    aug_msk = sample['mask']
    aug_img = cv2.resize(aug_img, (IMG_SIZE, IMG_SIZE))
    # Nearest-neighbour interpolation keeps the mask strictly binary
    aug_msk = cv2.resize(aug_msk, (IMG_SIZE, IMG_SIZE), interpolation=cv2.INTER_NEAREST)
    aug_img = aug_img/255.0
    
    imgs.append( aug_img )
    masks.append( aug_msk )
    
    #img = cv2.resize(img, (IMG_SIZE, IMG_SIZE)) 
    #msk = cv2.resize(msk, (IMG_SIZE, IMG_SIZE))
    #img = img/255.0
    
    #imgs.append( img )
    #masks.append( msk )
    
    if viz:
        fig, ax = plt.subplots(1, 2, figsize=(12, 8))
        ax[0].imshow(aug_img)
        ax[1].imshow(aug_msk)
        plt.show()

    
imgs = np.array(imgs).reshape(-1, IMG_SIZE, IMG_SIZE, 3).astype(np.float32)
masks = np.array(masks).reshape(-1, IMG_SIZE, IMG_SIZE)

print(imgs.shape, masks.shape)
print(np.min(imgs), np.max(imgs), np.min(masks), np.max(masks))

**Let's see some images**

In [None]:
no_imgs = 5

for _ in range(no_imgs):
    idx = np.random.randint(low=0, high=imgs.shape[0])  # `high` is exclusive
    fig, ax = plt.subplots(1, 2, figsize=(10, 5))
    
    ax[0].imshow(imgs[idx])
    ax[0].set_title('Image')
    ax[1].imshow(masks[idx])
    ax[1].set_title('GT Mask')
    plt.show()
    

In [None]:
from sklearn.model_selection import train_test_split

train_imgs, val_imgs, train_masks, val_masks = train_test_split(imgs, masks, 
                                                                shuffle=True, test_size=0.20, 
                                                                random_state=SEED)
print(train_imgs.shape, train_masks.shape, val_imgs.shape, val_masks.shape)

del imgs, masks
_ = gc.collect()

**DEFINE THE MODEL**

In [None]:
#https://www.kaggle.com/code/queyrusi/vanilla-submission-seresnext50

def dice_coeff(y_true, y_pred, epsilon=1.):
    
    """
    Calculates dice coefficient

    Arguments: 
            y_true : tensor of ground truth values.
            y_pred : tensor of predicted values.
            epsilon : constant to avoid divide by 0 errors.
    
    Returns:
            dice_coefficient
    """
    
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    score = (2. * intersection + epsilon) / (K.sum(y_true_f) + K.sum(y_pred_f) + epsilon)
    return score
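As a quick sanity check of the formula, the same computation can be written in NumPy and evaluated on tiny inputs. This is a sketch for illustration; `dice_coeff_np` is a hypothetical name and is not used by the model.

```python
import numpy as np

def dice_coeff_np(y_true, y_pred, epsilon=1.0):
    """NumPy version of the smoothed Dice coefficient defined above."""
    y_true_f = np.asarray(y_true, dtype=np.float64).ravel()
    y_pred_f = np.asarray(y_pred, dtype=np.float64).ravel()
    intersection = np.sum(y_true_f * y_pred_f)
    return (2.0 * intersection + epsilon) / (y_true_f.sum() + y_pred_f.sum() + epsilon)

# One of two foreground pixels overlaps: (2*1 + 1) / (2 + 2 + 1) = 0.6
partial = dice_coeff_np([1, 1, 0, 0], [1, 0, 1, 0])

# Perfect overlap: (2*2 + 1) / (2 + 2 + 1) = 1.0
perfect = dice_coeff_np([1, 1, 0, 0], [1, 1, 0, 0])

# Both masks empty: epsilon keeps the ratio defined, (0 + 1) / (0 + 0 + 1) = 1.0
empty = dice_coeff_np([0, 0, 0, 0], [0, 0, 0, 0])

print(partial, perfect, empty)
```

The empty-mask case shows why `epsilon` is there: without it the score would be 0/0.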

In [None]:
%env SM_FRAMEWORK=tf.keras
from segmentation_models import FPN

model = FPN('inceptionv3', input_shape=(IMG_SIZE, IMG_SIZE, 3), classes=1, activation='sigmoid',
            encoder_weights=
            '../input/keras-pretrained-imagenet-weights/inceptionv3_imagenet_1000_no_top.h5'
           )

model.compile(loss='binary_crossentropy', optimizer='adam', metrics = [dice_coeff]) 

**Start Training**

In [None]:
%%time

cp_callback = tf.keras.callbacks.ModelCheckpoint('/kaggle/working/best_model/',
                                                 monitor='val_dice_coeff',
                                                 verbose=1,
                                                 save_best_only=True,
                                                 save_weights_only=False,
                                                 mode='max',
                                                 save_freq='epoch',
                                                )                                 
history = model.fit(train_imgs, train_masks, 
                    epochs=15, batch_size=8,
                    validation_data=(val_imgs, val_masks),
                    callbacks=[cp_callback]
                    )

del train_imgs, train_masks
_ = gc.collect()

In [None]:
pd.DataFrame(history.history).plot(figsize=(8, 6))
plt.show()

**Let's look at some Predictions by our model**

The predictions shown here come from the same model trained for 30 epochs, rather than the 15 set in the training cell above.

In [None]:
model = tf.keras.models.load_model('/kaggle/working/best_model', custom_objects={'dice_coeff':dice_coeff})
print("Best Model Loaded!")

In [None]:
no_imgs = 5

for _ in range(no_imgs):
    idx = np.random.randint(low=0, high=val_imgs.shape[0])  # `high` is exclusive
    fig, ax = plt.subplots(1, 3, figsize=(10, 5))

    ax[0].imshow(val_imgs[idx])
    ax[0].set_title('Image')
    
    pred = model.predict(np.expand_dims(val_imgs[idx], 0))[0, :, :, 0]
    pred[pred>=0.5] = 1
    pred[pred<0.5] = 0
    ax[1].imshow(pred)
    ax[1].set_title('Predicted Mask')

    ax[2].imshow(val_masks[idx])
    ax[2].set_title('GT Mask')
    plt.show()


**SUBMISSION**

In [None]:
### Delete the validation images and masks, since they are not needed for the submission.

del val_imgs, val_masks
_ = gc.collect()

In [None]:
ids = []
preds = []

for each_id in sample_submission.id.values:
    print(each_id)
    img_path = "../input/hubmap-organ-segmentation/test_images/{}.tiff".format(each_id)
    img = tiff.imread(img_path)
    if img is None or img.size == 0:  # comparing an array to [] is not a valid emptiness check
        ids.append(each_id)
        preds.append('')
    else:
        img_real_shape  = img.shape
        img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))
        
        ###Prediction
        img = np.expand_dims(img, 0)
        img = img/255.0
        pred_mask = model.predict(img)[0, :, :, 0]
        pred_mask = cv2.resize(pred_mask, (img_real_shape[1], img_real_shape[0]))
        pred_mask[pred_mask>=0.5] = 1
        pred_mask[pred_mask<0.5] = 0
        
        pred_rle  = mask2rle(pred_mask)
        ids.append(each_id)
        preds.append(pred_rle)
        

In [None]:
sub = pd.DataFrame({'id':ids,'rle':preds})
sub.to_csv('/kaggle/working/submission.csv', index=False)
sub.head(5)
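Before submitting, it can help to sanity-check the RLE strings. The sketch below assumes the format produced by `mask2rle` (space-separated, 1-indexed `start length` pairs over a column-major flattening) and treats an empty string as an empty prediction, as the loop above does. `validate_rle` is a hypothetical helper, not part of the competition tooling.

```python
def validate_rle(rle, height, width):
    """Return True if `rle` is a well-formed 'start length' string for a height x width image."""
    if rle == "":
        return True  # empty prediction
    tokens = rle.split()
    # Tokens must come in pairs and be non-negative integers.
    if len(tokens) % 2 != 0 or not all(t.isdigit() for t in tokens):
        return False
    values = [int(t) for t in tokens]
    prev_end = 0  # last pixel (1-indexed) covered by the previous run
    for start, length in zip(values[0::2], values[1::2]):
        # Runs must be non-empty, ordered, and non-overlapping.
        if length < 1 or start <= prev_end:
            return False
        prev_end = start + length - 1
        # Runs must stay inside the image.
        if prev_end > height * width:
            return False
    return True

ok = validate_rle("4 2 7 1 12 1", height=3, width=4)
odd_tokens = validate_rle("4 2 7", height=3, width=4)
out_of_bounds = validate_rle("12 2", height=3, width=4)
overlapping = validate_rle("5 2 6 1", height=3, width=4)
print(ok, odd_tokens, out_of_bounds, overlapping)
```

In this notebook the check would need each test image's height and width, which are available from `img_real_shape` in the prediction loop.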

**Thank You for reading.**
