## UNet starter for Steel defect detection challenge


This kernel uses a UNet model with pretrained resnet18 encoder for this challenge, with simple augmentations using albumentations library, uses BCE loss, metrics like Dice and IoU. I've used [segmentation_models.pytorch](https://github.com/qubvel/segmentation_models.pytorch) which comes with a lot pre-implemented segmentation architectures. This is a modified version of my previous [kernel](https://www.kaggle.com/rishabhiitbhu/unet-with-resnet34-encoder-pytorch) for [siim-acr-pneumothorax-segmentation](https://www.kaggle.com/c/siim-acr-pneumothorax-segmentation/) competition.

**As internet is not allowed for this competition, I tried installing `segmentation_models.pytorch` by source using pip but due to some reasons it didn't work. So, as a [Jugaad](https://en.wikipedia.org/wiki/Jugaad) I took all of `segmentation_models.pytorch`'s UNet code and wrote it in a single file and added it as a dataset so as to use it for this kernel, its dependency [pretrained-models.pytorch](https://github.com/Cadene/pretrained-models.pytorch) is also added as a dataset.

In [1]:
# ! pip install git+https://github.com/qubvel/segmentation_models.pytorch --upgrade

In [2]:
encoder_name = 'efficientnet-b0'

In [3]:
# !pip install ../input/pretrainedmodels/pretrainedmodels-0.7.4/pretrainedmodels-0.7.4/ > /dev/null # no output
# package_path = './unetmodelscript/' # add unet script dataset
# import sys
# sys.path.append(package_path)
# from model import Unet # import Unet model from the script
import segmentation_models_pytorch as smp #import Unet, FPN, PSPNet

## Imports

In [4]:
import os
os.environ['CUDA_VISIBLE_DEVICES']='1'
import cv2
import pdb
import time
import warnings
import random
import numpy as np
import pandas as pd
from tqdm import tqdm_notebook as tqdm
from torch.optim.lr_scheduler import ReduceLROnPlateau
from sklearn.model_selection import train_test_split
import torch
import torch.nn as nn
from torch.nn import functional as F
import torch.optim as optim
import torch.backends.cudnn as cudnn
from torch.utils.data import DataLoader, Dataset, sampler
from matplotlib import pyplot as plt
from albumentations import (HorizontalFlip, ShiftScaleRotate, Normalize, Resize, Compose, GaussNoise)
from albumentations.torch import ToTensor
warnings.filterwarnings("ignore")
seed = 69
random.seed(seed)
os.environ["PYTHONHASHSEED"] = str(seed)
np.random.seed(seed)
torch.cuda.manual_seed(seed)
torch.backends.cudnn.deterministic = True

from radam import RAdam

from torchcontrib.optim import SWA
from torch.utils.data import Sampler

## RLE-Mask utility functions

In [5]:
#https://www.kaggle.com/paulorzp/rle-functions-run-lenght-encode-decode
def mask2rle(img):
    '''
    img: numpy array, 1 -> mask, 0 -> background
    Returns run length as string formated
    '''
    pixels= img.T.flatten()
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)

def make_mask(row_id, df):
    '''Given a row index, return image_id and mask (256, 1600, 4) from the dataframe `df`'''
    fname = df.iloc[row_id].name
    labels = df.iloc[row_id][:4]
    masks = np.zeros((256, 1600, 4), dtype=np.float32) # float32 is V.Imp
    # 4:class 1～4 (ch:0～3)

    for idx, label in enumerate(labels.values):
        if label is not np.nan:
            label = label.split(" ")
            positions = map(int, label[0::2])
            length = map(int, label[1::2])
            mask = np.zeros(256 * 1600, dtype=np.uint8)
            for pos, le in zip(positions, length):
                mask[pos:(pos + le)] = 1
            masks[:, :, idx] = mask.reshape(256, 1600, order='F')
    return fname, masks

## Dataloader

In [6]:
class SteelDataset(Dataset):
    def __init__(self, df, data_folder, mean, std, phase):
        self.df = df
        self.root = data_folder
        self.mean = mean
        self.std = std
        self.phase = phase
        self.transforms = get_transforms(phase, mean, std)
        self.fnames = self.df.index.tolist()

    def __getitem__(self, idx):
        image_id, mask = make_mask(idx, self.df)
        image_path = os.path.join(self.root, "train_images",  image_id)
        img = cv2.imread(image_path)
        augmented = self.transforms(image=img, mask=mask)
        img = augmented['image']
        mask = augmented['mask'] # 1x256x1600x4
        mask = mask[0].permute(2, 0, 1) # 1x4x256x1600
        return img, mask

    def __len__(self):
        return len(self.fnames)


def get_transforms(phase, mean, std):
    list_transforms = []
    if phase == "train":
        list_transforms.extend(
            [
                HorizontalFlip(p=0.5), # only horizontal flip as of now
            ]
        )
    list_transforms.extend(
        [
            Normalize(mean=mean, std=std, p=1),
            ToTensor(),
        ]
    )
    list_trfms = Compose(list_transforms)
    return list_trfms

class FiveBalanceClassSampler(Sampler):

    def __init__(self, dataset):
        self.dataset = dataset

        # label = (self.dataset.df['Label'].values)
        # label = label.reshape(-1,4)
        # label = np.hstack([label.sum(1,keepdims=True)==0,label]).T

        # self.neg_index  = np.where(label[0])[0]
        # self.pos1_index = np.where(label[1])[0]
        # self.pos2_index = np.where(label[2])[0]
        # self.pos3_index = np.where(label[3])[0]
        # self.pos4_index = np.where(label[4])[0]
        
        self.neg_index = np.where(self.dataset.df['defects'] == 0)[0]
        self.pos1_index = np.where(~ pd.isna(self.dataset.df['class_1']))[0]
        self.pos2_index = np.where(~ pd.isna(self.dataset.df['class_2']))[0]
        self.pos3_index = np.where(~ pd.isna(self.dataset.df['class_3']))[0]
        self.pos4_index = np.where(~ pd.isna(self.dataset.df['class_4']))[0]

        #5x
        self.num_image = len(self.dataset.df)//4
        self.length = self.num_image*5


    def __iter__(self):
        # neg = self.neg_index.copy()
        # random.shuffle(neg)

        neg  = np.random.choice(self.neg_index,  self.num_image, replace=True)
        pos1 = np.random.choice(self.pos1_index, self.num_image, replace=True)
        pos2 = np.random.choice(self.pos2_index, self.num_image, replace=True)
        pos3 = np.random.choice(self.pos3_index, self.num_image, replace=True)
        pos4 = np.random.choice(self.pos4_index, self.num_image, replace=True)
    
        l = np.stack([neg,pos1,pos2,pos3,pos4]).T
        l = l.reshape(-1)
        return iter(l)

    def __len__(self):
        return self.length

def provider(
    data_folder,
    df_path,
    phase,
    mean=None,
    std=None,
    batch_size=8,
    num_workers=4,
):
    '''Returns dataloader for the model training'''
    df = pd.read_csv(df_path)
    uid = list(df['ImageId_ClassId'].str.split('_').str[0].unique())
    df['Class'] = df['ImageId_ClassId'].str[-1].astype(np.int32)
    df['Label'] = (df['EncodedPixels']!='').astype(np.int32)
    df = df_loc_by_list(df, 'ImageId_ClassId', 
                        [ u.split('/')[-1] + '_%d'%c  for u in uid for c in [1,2,3,4] ])
    
    
    # https://www.kaggle.com/amanooo/defect-detection-starter-u-net
    df['ImageId'], df['ClassId'] = zip(*df['ImageId_ClassId'].str.split('_'))
    df['ClassId'] = df['ClassId'].astype(int)
    df = df.pivot(index='ImageId',columns='ClassId',values='EncodedPixels')
    
    # added by Frank
    # df.reset_index(inplace=True)
    df = df.rename(columns={df.columns[0]: "class_1",
                            df.columns[1]: "class_2",
                            df.columns[2]: "class_3",
                            df.columns[3]: "class_4"})

    df['defects'] = df.count(axis=1)
    
    train_df, val_df = train_test_split(df, test_size=0.2, stratify=df["defects"], random_state=69)
    df = train_df if phase == "train" else val_df
    image_dataset = SteelDataset(df, data_folder, mean, std, phase)
    dataloader = DataLoader(
        image_dataset,
        batch_size=batch_size,
        num_workers=num_workers,
        pin_memory=True,
        # shuffle=True,   
        sampler=FiveBalanceClassSampler(image_dataset)
    )

    return dataloader

In [7]:
# def df_loc_by_list(df, key, values):
#     df = df.loc[df[key].isin(values)]
#     df = df.assign(sort = pd.Categorical(df[key], categories=values, ordered=True))
#     df = df.sort_values('sort')
#     #df = df.reset_index()
#     df = df.drop('sort', axis=1)
#     return  df

# df = pd.read_csv('/home/frank/Dropbox/Project/kaggle_severstal/input/train.csv')
# uid = list(df['ImageId_ClassId'].str.split('_').str[0].unique())
# df['Class'] = df['ImageId_ClassId'].str[-1].astype(np.int32)
# df['Label'] = (df['EncodedPixels']!='').astype(np.int32)
# df = df_loc_by_list(df, 'ImageId_ClassId', 
#                     [ u.split('/')[-1] + '_%d'%c  for u in uid for c in [1,2,3,4] ])


# # https://www.kaggle.com/amanooo/defect-detection-starter-u-net
# df['ImageId'], df['ClassId'] = zip(*df['ImageId_ClassId'].str.split('_'))
# df['ClassId'] = df['ClassId'].astype(int)
# df = df.pivot(index='ImageId',columns='ClassId',values='EncodedPixels')

# # added by Frank
# # df.reset_index(inplace=True)
# df = df.rename(columns={df.columns[0]: "class_1",
#                         df.columns[1]: "class_2",
#                         df.columns[2]: "class_3",
#                         df.columns[3]: "class_4"})

# df['defects'] = df.count(axis=1)

# train_df, val_df = train_test_split(df, test_size=0.2, stratify=df["defects"], random_state=69)

# phase = 'train'
# mean=(0.485, 0.456, 0.406),
# std=(0.229, 0.224, 0.225)
# data_folder = '/home/frank/Dropbox/Project/kaggle_severstal/input/'
# df = train_df if phase == "train" else val_df
# image_dataset = SteelDataset(df, data_folder, mean, std, phase)
# len(image_dataset)

In [8]:
# df = pd.read_csv('/home/frank/Dropbox/Project/kaggle_severstal/input/train.csv')
# # uid = list(df['ImageId_ClassId'].str.split('_').str[0].unique())
# # df['Class'] = df['ImageId_ClassId'].str[-1].astype(np.int32)
# # df['Label'] = (df['EncodedPixels']!='').astype(np.int32)
# # df = df_loc_by_list(df, 'ImageId_ClassId', 
# #                     [ u.split('/')[-1] + '_%d'%c  for u in uid for c in [1,2,3,4] ])


# # https://www.kaggle.com/amanooo/defect-detection-starter-u-net
# df['ImageId'], df['ClassId'] = zip(*df['ImageId_ClassId'].str.split('_'))
# df['ClassId'] = df['ClassId'].astype(int)
# df = df.pivot(index='ImageId',columns='ClassId',values='EncodedPixels')
# # df.reset_index(inplace=True)
# # df = df.rename(columns={df.columns[1]: "class_1",
# #                         df.columns[2]: "class_2",
# #                         df.columns[3]: "class_3",
# #                         df.columns[4]: "class_4"})
# df = df.rename(columns={df.columns[1]: "class_1",
#                         df.columns[2]: "class_2",
#                         df.columns[3]: "class_3"})
# df['defects'] = df.count(axis=1)

In [9]:
# df2 = pd.read_csv('/home/frank/Dropbox/Project/kaggle_severstal/input/train.csv')
# uid = list(df2['ImageId_ClassId'].str.split('_').str[0].unique())
# df2['Class'] = df2['ImageId_ClassId'].str[-1].astype(np.int32)
# df2['Label'] = (df2['EncodedPixels']!='').astype(np.int32)
# df2 = df_loc_by_list(df2, 'ImageId_ClassId', 
#                     [ u.split('/')[-1] + '_%d'%c  for u in uid for c in [1,2,3,4] ])


## Some more utility functions

Dice and IoU metric implementations, metric logger for training and validation.

In [10]:
def df_loc_by_list(df, key, values):
    df = df.loc[df[key].isin(values)]
    df = df.assign(sort = pd.Categorical(df[key], categories=values, ordered=True))
    df = df.sort_values('sort')
    #df = df.reset_index()
    df = df.drop('sort', axis=1)
    return  df

def predict(X, threshold):
    '''X is sigmoid output of the model'''
    X_p = np.copy(X)
    preds = (X_p > threshold).astype('uint8')
    return preds

def metric(probability, truth, threshold=0.5, reduction='none'):
    '''Calculates dice of positive and negative images seperately'''
    '''probability and truth must be torch tensors'''
    batch_size = len(truth)
    with torch.no_grad():
        probability = probability.view(batch_size, -1)
        truth = truth.view(batch_size, -1)
        assert(probability.shape == truth.shape)

        p = (probability > threshold).float()
        t = (truth > 0.5).float()

        t_sum = t.sum(-1)
        p_sum = p.sum(-1)
        neg_index = torch.nonzero(t_sum == 0)
        pos_index = torch.nonzero(t_sum >= 1)

        dice_neg = (p_sum == 0).float()
        dice_pos = 2 * (p*t).sum(-1)/((p+t).sum(-1))

        dice_neg = dice_neg[neg_index]
        dice_pos = dice_pos[pos_index]
        dice = torch.cat([dice_pos, dice_neg])

        dice_neg = np.nan_to_num(dice_neg.mean().item(), 0)
        dice_pos = np.nan_to_num(dice_pos.mean().item(), 0)
        dice = dice.mean().item()

        num_neg = len(neg_index)
        num_pos = len(pos_index)

    return dice, dice_neg, dice_pos, num_neg, num_pos

class Meter:
    '''A meter to keep track of iou and dice scores throughout an epoch'''
    def __init__(self, phase, epoch):
        self.base_threshold = 0.5 # <<<<<<<<<<< here's the threshold
        self.base_dice_scores = []
        self.dice_neg_scores = []
        self.dice_pos_scores = []
        self.iou_scores = []

    def update(self, targets, outputs):
        probs = torch.sigmoid(outputs)
        dice, dice_neg, dice_pos, _, _ = metric(probs, targets, self.base_threshold)
        self.base_dice_scores.append(dice)
        self.dice_pos_scores.append(dice_pos)
        self.dice_neg_scores.append(dice_neg)
        preds = predict(probs, self.base_threshold)
        iou = compute_iou_batch(preds, targets, classes=[1])
        self.iou_scores.append(iou)

    def get_metrics(self):
        dice = np.mean(self.base_dice_scores)
        dice_neg = np.mean(self.dice_neg_scores)
        dice_pos = np.mean(self.dice_pos_scores)
        dices = [dice, dice_neg, dice_pos]
        iou = np.nanmean(self.iou_scores)
        return dices, iou

def epoch_log(phase, epoch, epoch_loss, meter, start):
    '''logging the metrics at the end of an epoch'''
    dices, iou = meter.get_metrics()
    dice, dice_neg, dice_pos = dices
    print("Loss: %0.4f | IoU: %0.4f | dice: %0.4f | dice_neg: %0.4f | dice_pos: %0.4f" % (epoch_loss, iou, dice, dice_neg, dice_pos))
    return dice, iou

def compute_ious(pred, label, classes, ignore_index=255, only_present=True):
    '''computes iou for one ground truth mask and predicted mask'''
    pred[label == ignore_index] = 0
    ious = []
    for c in classes:
        label_c = label == c
        if only_present and np.sum(label_c) == 0:
            ious.append(np.nan)
            continue
        pred_c = pred == c
        intersection = np.logical_and(pred_c, label_c).sum()
        union = np.logical_or(pred_c, label_c).sum()
        if union != 0:
            ious.append(intersection / union)
    return ious if ious else [1]

def compute_iou_batch(outputs, labels, classes=None):
    '''computes mean iou for a batch of ground truth masks and predicted masks'''
    ious = []
    preds = np.copy(outputs) # copy is imp
    labels = np.array(labels) # tensor to np
    for pred, label in zip(preds, labels):
        ious.append(np.nanmean(compute_ious(pred, label, classes)))
    iou = np.nanmean(ious)
    return iou


## Model Initialization

In [11]:
# !mkdir -p /tmp/.cache/torch/checkpoints/
# !cp ../input/resnet18/resnet18.pth /tmp/.cache/torch/checkpoints/resnet18-5c106cde.pth

In [12]:
# model = Unet("resnet18", encoder_weights="imagenet", classes=4, activation=None)
model = smp.FPN(encoder_name, encoder_weights="imagenet", classes=4, activation=None)

In [13]:
# callbacks = [EarlyStopping(monitor='val_loss', patience=2)]
# model.set_callbacks(callbacks)

In [14]:
model # a *deeper* look

FPN(
  (encoder): EfficientNetEncoder(
    (_conv_stem): Conv2dStaticSamePadding(
      3, 32, kernel_size=(3, 3), stride=(2, 2), bias=False
      (static_padding): ZeroPad2d(padding=(0, 1, 0, 1), value=0.0)
    )
    (_bn0): BatchNorm2d(32, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True)
    (_blocks): ModuleList(
      (0): MBConvBlock(
        (_depthwise_conv): Conv2dStaticSamePadding(
          32, 32, kernel_size=(3, 3), stride=[1, 1], groups=32, bias=False
          (static_padding): ZeroPad2d(padding=(1, 1, 1, 1), value=0.0)
        )
        (_bn1): BatchNorm2d(32, eps=0.001, momentum=0.010000000000000009, affine=True, track_running_stats=True)
        (_se_reduce): Conv2dStaticSamePadding(
          32, 8, kernel_size=(1, 1), stride=(1, 1)
          (static_padding): Identity()
        )
        (_se_expand): Conv2dStaticSamePadding(
          8, 32, kernel_size=(1, 1), stride=(1, 1)
          (static_padding): Identity()
        )
        (_p

### Training and Validation

In [15]:
class Trainer(object):
    '''This class takes care of training and validation of our model'''
    def __init__(self, model):
        self.num_workers = 3
        self.batch_size = {"train": 8, "val": 8}
        self.accumulation_steps = 32 // self.batch_size['train']
        self.lr = 5e-4
        self.num_epochs = 200
        self.best_loss = float("inf")
        self.phases = ["train", "val"]
        self.device = torch.device("cuda")
        torch.set_default_tensor_type("torch.cuda.FloatTensor")
        self.net = model
        self.criterion = torch.nn.BCEWithLogitsLoss()
        # self.optimizer = optim.Adam(self.net.parameters(), lr=self.lr)
        self.optimizer = RAdam(self.net.parameters(), lr=self.lr)
#         base_opt = optim.SGD(self.net.parameters(), 
#                              lr=self.lr,
#                              momentum=0.9, 
#                              weight_decay=0.0001)
#         self.optimizer = SWA(base_opt, swa_start=15, 
#                              swa_freq=5, swa_lr=2.5e-4)
        self.scheduler = ReduceLROnPlateau(self.optimizer, mode="min", patience=3, verbose=True)
        self.net = self.net.to(self.device)
        cudnn.benchmark = True
        self.dataloaders = {
            phase: provider(
                data_folder=data_folder,
                df_path=train_df_path,
                phase=phase,
                mean=(0.485, 0.456, 0.406),
                std=(0.229, 0.224, 0.225),
                batch_size=self.batch_size[phase],
                num_workers=self.num_workers,
            )
            for phase in self.phases
        }
        self.losses = {phase: [] for phase in self.phases}
        self.iou_scores = {phase: [] for phase in self.phases}
        self.dice_scores = {phase: [] for phase in self.phases}
        
    def forward(self, images, targets):
        images = images.to(self.device)
        masks = targets.to(self.device)
        outputs = self.net(images)
        loss = self.criterion(outputs, masks)
        return loss, outputs

    def iterate(self, epoch, phase):
        meter = Meter(phase, epoch)
        start = time.strftime("%H:%M:%S")
        print(f"Starting epoch: {epoch} | phase: {phase} | ⏰: {start}")
        batch_size = self.batch_size[phase]
        self.net.train(phase == "train")
        dataloader = self.dataloaders[phase]
        running_loss = 0.0
        total_batches = len(dataloader)
#         tk0 = tqdm(dataloader, total=total_batches)
        self.optimizer.zero_grad()
        for itr, batch in enumerate(dataloader): # replace `dataloader` with `tk0` for tqdm
            images, targets = batch
            loss, outputs = self.forward(images, targets)
            loss = loss / self.accumulation_steps
            if phase == "train":
                loss.backward()
                if (itr + 1 ) % self.accumulation_steps == 0:
                    self.optimizer.step()
                    self.optimizer.zero_grad()
            running_loss += loss.item()
            outputs = outputs.detach().cpu()
            meter.update(targets, outputs)
#             tk0.set_postfix(loss=(running_loss / ((itr + 1))))
        epoch_loss = (running_loss * self.accumulation_steps) / total_batches
        dice, iou = epoch_log(phase, epoch, epoch_loss, meter, start)
        self.losses[phase].append(epoch_loss)
        self.dice_scores[phase].append(dice)
        self.iou_scores[phase].append(iou)
        torch.cuda.empty_cache()
        return epoch_loss

    def start(self):
        for epoch in range(self.num_epochs):
            self.iterate(epoch, "train")
#             if epoch+1 == self.num_epochs:
#                 self.optimizer.swap_swa_sgd()
            state = {
                "epoch": epoch,
                "best_loss": self.best_loss,
                "state_dict": self.net.state_dict(),
                "optimizer": self.optimizer.state_dict(),
            }
            with torch.no_grad():
                val_loss = self.iterate(epoch, "val")
                self.scheduler.step(val_loss)
            if val_loss < self.best_loss:
                print("******** New optimal found, saving state ********")
                state["best_loss"] = self.best_loss = val_loss
                torch.save(state, "./"+encoder_name+"_RAdam_classBal_bestModel.pth")
            torch.save(state, "./"+encoder_name+"_RAdam_classBal_model_epoch_"+str(epoch)+".pth")
            print()

In [16]:
sample_submission_path = '../input/sample_submission.csv'
train_df_path = '../input/train.csv'
data_folder = "../input/"
test_data_folder = "../input/test_images"

In [17]:
model_trainer = Trainer(model)
model_trainer.start()

Starting epoch: 0 | phase: train | ⏰: 23:46:59
Loss: 0.0991 | IoU: 0.1535 | dice: 0.2064 | dice_neg: 0.1677 | dice_pos: 0.2171
Starting epoch: 0 | phase: val | ⏰: 23:58:34
Loss: 0.0180 | IoU: 0.3033 | dice: 0.4598 | dice_neg: 0.6349 | dice_pos: 0.4154
******** New optimal found, saving state ********

Starting epoch: 1 | phase: train | ⏰: 00:00:31
Loss: 0.0174 | IoU: 0.3438 | dice: 0.4888 | dice_neg: 0.6025 | dice_pos: 0.4629
Starting epoch: 1 | phase: val | ⏰: 00:11:56
Loss: 0.0151 | IoU: 0.4455 | dice: 0.6118 | dice_neg: 0.7366 | dice_pos: 0.5812
******** New optimal found, saving state ********

Starting epoch: 2 | phase: train | ⏰: 00:13:49
Loss: 0.0142 | IoU: 0.4251 | dice: 0.5941 | dice_neg: 0.7651 | dice_pos: 0.5545
Starting epoch: 2 | phase: val | ⏰: 00:24:40
Loss: 0.0141 | IoU: 0.4724 | dice: 0.6554 | dice_neg: 0.8321 | dice_pos: 0.6101
******** New optimal found, saving state ********

Starting epoch: 3 | phase: train | ⏰: 00:26:18
Loss: 0.0120 | IoU: 0.4751 | dice: 0.6476 | 

Loss: 0.0063 | IoU: 0.6593 | dice: 0.8060 | dice_neg: 0.9274 | dice_pos: 0.7777
Starting epoch: 30 | phase: val | ⏰: 06:12:32
Loss: 0.0140 | IoU: 0.5267 | dice: 0.7107 | dice_neg: 0.9288 | dice_pos: 0.6567

Starting epoch: 31 | phase: train | ⏰: 06:14:09
Loss: 0.0062 | IoU: 0.6634 | dice: 0.8102 | dice_neg: 0.9316 | dice_pos: 0.7809
Starting epoch: 31 | phase: val | ⏰: 06:24:48
Loss: 0.0147 | IoU: 0.5291 | dice: 0.7137 | dice_neg: 0.9262 | dice_pos: 0.6603

Starting epoch: 32 | phase: train | ⏰: 06:26:25
Loss: 0.0063 | IoU: 0.6624 | dice: 0.8096 | dice_neg: 0.9319 | dice_pos: 0.7806
Starting epoch: 32 | phase: val | ⏰: 06:37:08
Loss: 0.0152 | IoU: 0.5413 | dice: 0.7219 | dice_neg: 0.9262 | dice_pos: 0.6726

Starting epoch: 33 | phase: train | ⏰: 06:38:49
Loss: 0.0062 | IoU: 0.6621 | dice: 0.8098 | dice_neg: 0.9306 | dice_pos: 0.7803
Starting epoch: 33 | phase: val | ⏰: 06:49:41
Loss: 0.0152 | IoU: 0.5269 | dice: 0.7161 | dice_neg: 0.9377 | dice_pos: 0.6587

Starting epoch: 34 | phase: 

KeyboardInterrupt: 

In [None]:
# PLOT TRAINING
losses = model_trainer.losses
dice_scores = model_trainer.dice_scores # overall dice
iou_scores = model_trainer.iou_scores

def plot(scores, name):
    plt.figure(figsize=(15,5))
    plt.plot(range(len(scores["train"])), scores["train"], label=f'train {name}')
    plt.plot(range(len(scores["train"])), scores["val"], label=f'val {name}')
    plt.title(f'{name} plot'); plt.xlabel('Epoch'); plt.ylabel(f'{name}');
    plt.legend(); 
    plt.show()

plot(losses, "BCE loss")
plot(dice_scores, "Dice score")
plot(iou_scores, "IoU score")

## Test prediction and submission

This training and validation takes about ~400 minutes which exceeds Kaggle's GPU usage limit of 60 minutes, we won't be able to submit the `submission.csv` file generated from this kernel. So, for test prediction and submission I've written a separate [UNet inference kernel](https://www.kaggle.com/rishabhiitbhu/unet-pytorch-inference-kernel), make sure you add the `model.pth` file generated from this kernel as dataset to test inference kernel.

I've used resnet-18 architecture in this kernel. It scores ~0.89 on LB. Try to play around with other architectures of `segmenation_models.pytorch` and see what works best for you, let me know in the comments :) and do upvote if you liked this kernel, I need some medals too. 😬

## Refrences:

Few kernels from which I've borrowed some cod[](http://)e:

* https://www.kaggle.com/amanooo/defect-detection-starter-u-net
* https://www.kaggle.com/go1dfish/clear-mask-visualization-and-simple-eda

A big thank you to all those who share their code on Kaggle, I'm nobody without you guys. I've learnt a lot from fellow kagglers, special shout-out to [@Abhishek](https://www.kaggle.com/abhishek), [@Yury](https://www.kaggle.com/deyury), [@Heng](https://www.kaggle.com/hengck23), [@Ekhtiar](https://www.kaggle.com/ekhtiar), [@lafoss](https://www.kaggle.com/iafoss), [@Siddhartha](https://www.kaggle.com/meaninglesslives), [@xhulu](https://www.kaggle.com/xhlulu), and the list goes on..