## UNet starter for Steel defect detection challenge


This kernel started as a UNet model with a pretrained resnet18 encoder for this challenge. It uses simple augmentations from the albumentations library, BCE loss, and the Dice and IoU metrics. The model comes from [segmentation_models.pytorch](https://github.com/qubvel/segmentation_models.pytorch), which ships with many pre-implemented segmentation architectures; note that the code below currently builds `smp.FPN` with a `resnet34` encoder. This is a modified version of my previous [kernel](https://www.kaggle.com/rishabhiitbhu/unet-with-resnet34-encoder-pytorch) for the [siim-acr-pneumothorax-segmentation](https://www.kaggle.com/c/siim-acr-pneumothorax-segmentation/) competition.

As internet access is not allowed for this competition, I tried installing `segmentation_models.pytorch` from source with pip, but for some reason it didn't work. So, as a [Jugaad](https://en.wikipedia.org/wiki/Jugaad), I copied all of `segmentation_models.pytorch`'s UNet code into a single file and added it as a dataset for this kernel. Its dependency, [pretrained-models.pytorch](https://github.com/Cadene/pretrained-models.pytorch), is also added as a dataset.

In [1]:
# ! pip install git+https://github.com/qubvel/segmentation_models.pytorch --upgrade

In [2]:
encoder_name = 'resnet34'
mask_size = 1600

In [3]:
# !pip install ../input/pretrainedmodels/pretrainedmodels-0.7.4/pretrainedmodels-0.7.4/ > /dev/null # no output
# package_path = './unetmodelscript/' # add unet script dataset
# import sys
# sys.path.append(package_path)
# from model import Unet # import Unet model from the script
import segmentation_models_pytorch as smp #import Unet, FPN, PSPNet

## Imports

In [4]:
import os
os.environ['CUDA_VISIBLE_DEVICES']='1'
import cv2
import pdb
import time
import warnings
import random
import numpy as np
import pandas as pd
from tqdm import tqdm_notebook as tqdm
from torch.optim.lr_scheduler import ReduceLROnPlateau
from sklearn.model_selection import train_test_split
import torch
import torch.nn as nn
from torch.nn import functional as F
import torch.optim as optim
import torch.backends.cudnn as cudnn
from torch.utils.data import DataLoader, Dataset, sampler
from matplotlib import pyplot as plt
from albumentations import (HorizontalFlip, ShiftScaleRotate, Normalize, Resize, Compose, GaussNoise)
from albumentations.torch import ToTensor
warnings.filterwarnings("ignore")
seed = 69
random.seed(seed)
os.environ["PYTHONHASHSEED"] = str(seed)
np.random.seed(seed)
torch.manual_seed(seed)  # seed the CPU generator as well
torch.cuda.manual_seed(seed)
torch.backends.cudnn.deterministic = True

from radam import RAdam

from torchsample.callbacks import EarlyStopping
from lovasz_loss import LovaszSoftmax

## RLE-Mask utility functions

In [5]:
#https://www.kaggle.com/paulorzp/rle-functions-run-lenght-encode-decode
def mask2rle(img):
    '''
    img: numpy array, 1 -> mask, 0 -> background
    Returns run length as a formatted string
    '''
    pixels= img.T.flatten()
    pixels = np.concatenate([[0], pixels, [0]])
    runs = np.where(pixels[1:] != pixels[:-1])[0] + 1
    runs[1::2] -= runs[::2]
    return ' '.join(str(x) for x in runs)

def make_mask(row_id, df):
    '''Given a row index, return image_id and mask (256, 1600, 4) from the dataframe `df`'''
    fname = df.iloc[row_id].name
    labels = df.iloc[row_id][:4]
    masks = np.zeros((256, mask_size, 4), dtype=np.float32) # float32 is V.Imp
    # 4 channels: classes 1-4 map to channels 0-3

    for idx, label in enumerate(labels.values):
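        if pd.notnull(label):  # skip classes with no defect (NaN after the pivot)
            label = label.split(" ")
            positions = map(int, label[0::2])
            length = map(int, label[1::2])
            mask = np.zeros(256 * mask_size, dtype=np.uint8)
            for pos, le in zip(positions, length):
                # RLE starts are 1-indexed (see mask2rle), so shift to 0-based
                mask[pos - 1:pos - 1 + le] = 1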
            masks[:, :, idx] = mask.reshape(256, mask_size, order='F')
    return fname, masks
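
To make the encoding concrete, here is a minimal pure-Python sketch of the same run-length scheme on a tiny 3x3 grid. It assumes the convention that `mask2rle` produces: pixels are numbered from 1, running top to bottom and then left to right (column-major). The helper names `rle_decode` and `rle_encode` are hypothetical and are not part of this kernel.

```python
def rle_decode(rle, height, width):
    """Decode 'start length start length ...' into a height x width grid of 0/1.

    Starts are 1-based and count down each column before moving right.
    """
    flat = [0] * (height * width)
    tokens = [int(t) for t in rle.split()]
    for start, length in zip(tokens[0::2], tokens[1::2]):
        for i in range(start - 1, start - 1 + length):
            flat[i] = 1
    # flat[c * height + r] holds the pixel at row r, column c
    return [[flat[c * height + r] for c in range(width)] for r in range(height)]


def rle_encode(grid):
    """Inverse of rle_decode: grid of 0/1 -> 'start length ...' string."""
    height, width = len(grid), len(grid[0])
    flat = [grid[r][c] for c in range(width) for r in range(height)]
    runs = []
    prev = 0
    for i, value in enumerate(flat + [0]):  # trailing 0 closes a final run
        if value == 1 and prev == 0:
            runs.append(i + 1)              # 1-based start of a new run
        elif value == 0 and prev == 1:
            runs.append(i + 1 - runs[-1])   # length of the run that just ended
        prev = value
    return " ".join(str(x) for x in runs)


grid = rle_decode("2 3 7 1", height=3, width=3)
for row in grid:
    print(row)
print(rle_encode(grid))
```

Decoding and then re-encoding should give back the original string, which is a handy sanity check before training on decoded masks.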

## Dataloader

In [6]:
class SteelDataset(Dataset):
    def __init__(self, df, data_folder, mean, std, phase):
        self.df = df
        self.root = data_folder
        self.mean = mean
        self.std = std
        self.phase = phase
        self.transforms = get_transforms(phase, mean, std)
        self.fnames = self.df.index.tolist()

    def __getitem__(self, idx):
        image_id, mask = make_mask(idx, self.df)
        image_path = os.path.join(self.root, "train_images",  image_id)
        img = cv2.imread(image_path)
        augmented = self.transforms(image=img, mask=mask)
        img = augmented['image']
        mask = augmented['mask'] # 1x256x1600x4
        mask = mask[0].permute(2, 0, 1) # 4x256x1600
        return img, mask

    def __len__(self):
        return len(self.fnames)


def get_transforms(phase, mean, std):
    list_transforms = []
    if phase == "train":
        list_transforms.extend(
            [
                HorizontalFlip(p=0.5), # only horizontal flip as of now
            ]
        )
    list_transforms.extend(
        [
            Resize(256, mask_size),
            Normalize(mean=mean, std=std, p=1),
            ToTensor(),
        ]
    )
    list_trfms = Compose(list_transforms)
    return list_trfms

def provider(
    data_folder,
    df_path,
    phase,
    mean=None,
    std=None,
    batch_size=8,
    num_workers=4,
):
    '''Returns dataloader for the model training'''
    df = pd.read_csv(df_path)
    # https://www.kaggle.com/amanooo/defect-detection-starter-u-net
    df['ImageId'], df['ClassId'] = zip(*df['ImageId_ClassId'].str.split('_'))
    df['ClassId'] = df['ClassId'].astype(int)
    df = df.pivot(index='ImageId',columns='ClassId',values='EncodedPixels')
    df['defects'] = df.count(axis=1)
    
    train_df, val_df = train_test_split(df, test_size=0.2, stratify=df["defects"], random_state=69)
    df = train_df if phase == "train" else val_df
    image_dataset = SteelDataset(df, data_folder, mean, std, phase)
    dataloader = DataLoader(
        image_dataset,
        batch_size=batch_size,
        num_workers=num_workers,
        pin_memory=True,
        shuffle=True,   
    )

    return dataloader
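
The reshaping that `provider` does with pandas (split `ImageId_ClassId`, pivot to one row per image, count non-empty classes) can be pictured with plain dictionaries. This is only an illustrative sketch: the file names and RLE strings below are made up, and `table` and `defects` are hypothetical names.

```python
# Hypothetical rows in the train.csv layout: (ImageId_ClassId, EncodedPixels).
rows = [
    ("a.jpg_1", "1 3 10 2"),
    ("a.jpg_2", None),
    ("a.jpg_3", None),
    ("a.jpg_4", "5 1"),
    ("b.jpg_1", None),
    ("b.jpg_2", None),
    ("b.jpg_3", None),
    ("b.jpg_4", None),
]

# "Pivot": one entry per image, holding a {class_id: rle} mapping.
table = {}
for key, rle in rows:
    image_id, class_id = key.rsplit("_", 1)
    table.setdefault(image_id, {})[int(class_id)] = rle

# Number of defect classes present per image, like df.count(axis=1).
defects = {
    image_id: sum(rle is not None for rle in per_class.values())
    for image_id, per_class in table.items()
}
print(defects)  # {'a.jpg': 2, 'b.jpg': 0}
```

The `defects` count is what the train/validation split is stratified on, so both splits see a similar mix of clean and defective images.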


## Some more utility functions

Dice and IoU metric implementations, metric logger for training and validation.

In [7]:
def predict(X, threshold):
    '''X is sigmoid output of the model'''
    X_p = np.copy(X)
    preds = (X_p > threshold).astype('uint8')
    return preds

def metric(probability, truth, threshold=0.5, reduction='none'):
    '''Calculates dice of positive and negative images separately.
    probability and truth must be torch tensors.'''
    batch_size = len(truth)
    with torch.no_grad():
        probability = probability.view(batch_size, -1)
        truth = truth.view(batch_size, -1)
        assert(probability.shape == truth.shape)

        p = (probability > threshold).float()
        t = (truth > 0.5).float()

        t_sum = t.sum(-1)
        p_sum = p.sum(-1)
        neg_index = torch.nonzero(t_sum == 0)
        pos_index = torch.nonzero(t_sum >= 1)

        dice_neg = (p_sum == 0).float()
        dice_pos = 2 * (p*t).sum(-1)/((p+t).sum(-1))

        dice_neg = dice_neg[neg_index]
        dice_pos = dice_pos[pos_index]
        dice = torch.cat([dice_pos, dice_neg])

        dice_neg = np.nan_to_num(dice_neg.mean().item(), 0)
        dice_pos = np.nan_to_num(dice_pos.mean().item(), 0)
        dice = dice.mean().item()

        num_neg = len(neg_index)
        num_pos = len(pos_index)

    return dice, dice_neg, dice_pos, num_neg, num_pos

class Meter:
    '''A meter to keep track of iou and dice scores throughout an epoch'''
    def __init__(self, phase, epoch):
        self.base_threshold = 0.5 # <<<<<<<<<<< here's the threshold
        self.base_dice_scores = []
        self.dice_neg_scores = []
        self.dice_pos_scores = []
        self.iou_scores = []

    def update(self, targets, outputs):
        probs = torch.sigmoid(outputs)
        dice, dice_neg, dice_pos, _, _ = metric(probs, targets, self.base_threshold)
        self.base_dice_scores.append(dice)
        self.dice_pos_scores.append(dice_pos)
        self.dice_neg_scores.append(dice_neg)
        preds = predict(probs, self.base_threshold)
        iou = compute_iou_batch(preds, targets, classes=[1])
        self.iou_scores.append(iou)

    def get_metrics(self):
        dice = np.mean(self.base_dice_scores)
        dice_neg = np.mean(self.dice_neg_scores)
        dice_pos = np.mean(self.dice_pos_scores)
        dices = [dice, dice_neg, dice_pos]
        iou = np.nanmean(self.iou_scores)
        return dices, iou

def epoch_log(phase, epoch, epoch_loss, meter, start):
    '''logging the metrics at the end of an epoch'''
    dices, iou = meter.get_metrics()
    dice, dice_neg, dice_pos = dices
    print("Loss: %0.4f | IoU: %0.4f | dice: %0.4f | dice_neg: %0.4f | dice_pos: %0.4f" % (epoch_loss, iou, dice, dice_neg, dice_pos))
    return dice, iou

def compute_ious(pred, label, classes, ignore_index=255, only_present=True):
    '''computes iou for one ground truth mask and predicted mask'''
    pred[label == ignore_index] = 0
    ious = []
    for c in classes:
        label_c = label == c
        if only_present and np.sum(label_c) == 0:
            ious.append(np.nan)
            continue
        pred_c = pred == c
        intersection = np.logical_and(pred_c, label_c).sum()
        union = np.logical_or(pred_c, label_c).sum()
        if union != 0:
            ious.append(intersection / union)
    return ious if ious else [1]

def compute_iou_batch(outputs, labels, classes=None):
    '''computes mean iou for a batch of ground truth masks and predicted masks'''
    ious = []
    preds = np.copy(outputs) # copy is imp
    labels = np.array(labels) # tensor to np
    for pred, label in zip(preds, labels):
        ious.append(np.nanmean(compute_ious(pred, label, classes)))
    iou = np.nanmean(ious)
    return iou
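
As a small worked example of the two metrics, the sketch below thresholds a handful of made-up probabilities and computes Dice and IoU on flat lists. `dice_and_iou` is a hypothetical helper. It mirrors the conventions used above as I read them (an empty ground truth scores Dice 1 only when the prediction is also empty, and its IoU is skipped as NaN), but it is not a drop-in replacement for `metric` or `compute_ious`.

```python
def dice_and_iou(pred, truth):
    """pred and truth are flat lists of 0/1 with the same length."""
    intersection = sum(p * t for p, t in zip(pred, truth))
    p_sum, t_sum = sum(pred), sum(truth)
    if t_sum == 0:
        # Negative sample: perfect only if nothing was predicted.
        dice = 1.0 if p_sum == 0 else 0.0
        iou = float("nan")  # class absent from the label, so IoU is skipped
    else:
        dice = 2 * intersection / (p_sum + t_sum)
        iou = intersection / (p_sum + t_sum - intersection)
    return dice, iou


threshold = 0.5
probs = [0.9, 0.8, 0.2, 0.6, 0.1, 0.3]   # made-up sigmoid outputs
truth = [1, 1, 1, 0, 0, 0]
pred = [1 if p > threshold else 0 for p in probs]

dice, iou = dice_and_iou(pred, truth)
print(f"dice={dice:.4f} iou={iou:.4f}")  # dice=0.6667 iou=0.5000
```

For a single non-empty mask the two are linked by dice = 2 * iou / (1 + iou), which is why Dice is always at least as large as IoU in the logs.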


## Loss

In [8]:
class BCESoftDiceLoss(nn.Module):
    def __init__(self, weight=None, size_average=True, w1=1, w2=1):
        super(BCESoftDiceLoss, self).__init__()
        self.bce_loss = nn.BCELoss(weight, size_average)
        self.w1 = w1
        self.w2 = w2
        
    def forward(self, logits, targets):
        # BCELoss2d
        probs        = torch.sigmoid(logits)
        probs_flat   = probs.view (-1)
        targets_flat = targets.view(-1)
        bce_loss = self.bce_loss(probs_flat, targets_flat)
        
        # SoftDiceLoss
        num = targets.size(0)
        probs = torch.sigmoid(logits)
        m1  = probs.view(num,-1)
        m2  = targets.view(num,-1)
        intersection = (m1 * m2)

        score = 2. * (intersection.sum(1)+1) / (m1.sum(1) + m2.sum(1)+1)
        soft_dice_loss = 1- score.sum()/num
        
        return self.w1 * bce_loss + self.w2 * soft_dice_loss
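
The class above applies a sigmoid and then `nn.BCELoss`, while the trainer below uses `BCEWithLogitsLoss` directly on the logits. The two compute the same quantity; the logits form is usually written in a numerically stable way. Here is a minimal pure-Python sketch of that equivalence for a single pixel, using the standard identity max(x, 0) - x*z + log(1 + exp(-|x|)). The function names are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def bce_from_probability(p, z):
    """Binary cross-entropy for probability p and target z in {0, 1}."""
    return -(z * math.log(p) + (1 - z) * math.log(1 - p))


def bce_with_logits(x, z):
    """Same loss computed from the logit x without forming sigmoid(x)."""
    return max(x, 0.0) - x * z + math.log1p(math.exp(-abs(x)))


for x in (-3.0, -0.5, 0.0, 2.0):
    for z in (0.0, 1.0):
        a = bce_from_probability(sigmoid(x), z)
        b = bce_with_logits(x, z)
        print(f"x={x:+.1f} z={z:.0f} sigmoid+BCE={a:.6f} logits={b:.6f}")

# The logits form stays finite for extreme inputs, where sigmoid(x) would
# round to exactly 0 or 1 and log() would fail.
print(bce_with_logits(-800.0, 1.0))  # 800.0
```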

## Model Initialization

In [9]:
# !mkdir -p /tmp/.cache/torch/checkpoints/
# !cp ../input/resnet18/resnet18.pth /tmp/.cache/torch/checkpoints/resnet18-5c106cde.pth

In [10]:
# model = Unet("resnet18", encoder_weights="imagenet", classes=4, activation=None)
model = smp.FPN(encoder_name, encoder_weights="imagenet", classes=4, activation=None)

In [11]:
# callbacks = [EarlyStopping(monitor='val_loss', patience=2)]
# model.set_callbacks(callbacks)

In [12]:
model # a *deeper* look

FPN(
  (encoder): ResNetEncoder(
    (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
    (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (relu): ReLU(inplace=True)
    (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
    (layer1): Sequential(
      (0): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
        (relu): ReLU(inplace=True)
        (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (1): BasicBlock(
        (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
        (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_

## Training and Validation

In [13]:
class Trainer(object):
    '''This class takes care of training and validation of our model'''
    def __init__(self, model):
        self.num_workers = 3
        self.batch_size = {"train": 16, "val": 16}
        self.accumulation_steps = 32 // self.batch_size['train']
        self.lr = 5e-4
        self.num_epochs = 1000
        self.best_loss = float("inf")
        self.phases = ["train", "val"]
        self.device = torch.device("cuda:0")
        torch.set_default_tensor_type("torch.cuda.FloatTensor")
        self.net = model
        self.criterion = torch.nn.BCEWithLogitsLoss()
        # self.criterion = LovaszSoftmax()
        # self.criterion = BCESoftDiceLoss(w1=1, w2=1)
        # self.optimizer = optim.Adam(self.net.parameters(), lr=self.lr)
        self.optimizer = RAdam(self.net.parameters(), lr=self.lr)
        self.scheduler = ReduceLROnPlateau(self.optimizer, mode="min", patience=3, verbose=True)
        self.net = self.net.to(self.device)
        cudnn.benchmark = True
        self.dataloaders = {
            phase: provider(
                data_folder=data_folder,
                df_path=train_df_path,
                phase=phase,
                mean=(0.485, 0.456, 0.406),
                std=(0.229, 0.224, 0.225),
                batch_size=self.batch_size[phase],
                num_workers=self.num_workers,
            )
            for phase in self.phases
        }
        self.losses = {phase: [] for phase in self.phases}
        self.iou_scores = {phase: [] for phase in self.phases}
        self.dice_scores = {phase: [] for phase in self.phases}
        
    def forward(self, images, targets):
        images = images.to(self.device)
        masks = targets.to(self.device)
        outputs = self.net(images)
        loss = self.criterion(outputs, masks)
        return loss, outputs

    def iterate(self, epoch, phase):
        meter = Meter(phase, epoch)
        start = time.strftime("%H:%M:%S")
        print(f"Starting epoch: {epoch} | phase: {phase} | ⏰: {start}")
        batch_size = self.batch_size[phase]
        self.net.train(phase == "train")
        dataloader = self.dataloaders[phase]
        running_loss = 0.0
        total_batches = len(dataloader)
#         tk0 = tqdm(dataloader, total=total_batches)
        self.optimizer.zero_grad()
        for itr, batch in enumerate(dataloader): # replace `dataloader` with `tk0` for tqdm
            images, targets = batch
            loss, outputs = self.forward(images, targets)
            loss = loss / self.accumulation_steps
            if phase == "train":
                loss.backward()
                if (itr + 1 ) % self.accumulation_steps == 0:
                    self.optimizer.step()
                    self.optimizer.zero_grad()
            running_loss += loss.item()
            outputs = outputs.detach().cpu()
            meter.update(targets, outputs)
#             tk0.set_postfix(loss=(running_loss / ((itr + 1))))
        epoch_loss = (running_loss * self.accumulation_steps) / total_batches
        dice, iou = epoch_log(phase, epoch, epoch_loss, meter, start)
        self.losses[phase].append(epoch_loss)
        self.dice_scores[phase].append(dice)
        self.iou_scores[phase].append(iou)
        torch.cuda.empty_cache()
        return epoch_loss

    def start(self):
        for epoch in range(self.num_epochs):
            self.iterate(epoch, "train")
            state = {
                "epoch": epoch,
                "best_loss": self.best_loss,
                "state_dict": self.net.state_dict(),
                "optimizer": self.optimizer.state_dict(),
            }
            with torch.no_grad():
                val_loss = self.iterate(epoch, "val")
                self.scheduler.step(val_loss)
            if val_loss < self.best_loss:
                print("******** New optimal found, saving state ********")
                state["best_loss"] = self.best_loss = val_loss
                torch.save(state, "./"+encoder_name+"_model.pth")
            if epoch % 10 == 0:
                torch.save(state, "./"+encoder_name+"_model_epoch_"+str(epoch)+".pth")
            print()


In [14]:
sample_submission_path = '../input/sample_submission.csv'
train_df_path = '../input/train.csv'
data_folder = "../input/"
test_data_folder = "../input/test_images"

In [15]:
model_trainer = Trainer(model)
model_trainer.start()

Starting epoch: 0 | phase: train | ⏰: 20:58:32
Loss: 0.0778 | IoU: 0.1881 | dice: 0.4258 | dice_neg: 0.6255 | dice_pos: 0.2590
Starting epoch: 0 | phase: val | ⏰: 21:05:55
Loss: 0.0141 | IoU: 0.3197 | dice: 0.6524 | dice_neg: 0.9192 | dice_pos: 0.4181
******** New optimal found, saving state ********

Starting epoch: 1 | phase: train | ⏰: 21:07:10
Loss: 0.0150 | IoU: 0.3453 | dice: 0.6258 | dice_neg: 0.8385 | dice_pos: 0.4488
Starting epoch: 1 | phase: val | ⏰: 21:14:24
Loss: 0.0136 | IoU: 0.4181 | dice: 0.6784 | dice_neg: 0.8440 | dice_pos: 0.5297
******** New optimal found, saving state ********

Starting epoch: 2 | phase: train | ⏰: 21:15:39
Loss: 0.0131 | IoU: 0.4022 | dice: 0.6702 | dice_neg: 0.8556 | dice_pos: 0.5150
Starting epoch: 2 | phase: val | ⏰: 21:22:53
Loss: 0.0123 | IoU: 0.4205 | dice: 0.7224 | dice_neg: 0.9205 | dice_pos: 0.5376
******** New optimal found, saving state ********

Starting epoch: 3 | phase: train | ⏰: 21:24:08
Loss: 0.0124 | IoU: 0.4333 | dice: 0.6930 | 

Loss: 0.0101 | IoU: 0.5746 | dice: 0.8119 | dice_neg: 0.9361 | dice_pos: 0.6930

Starting epoch: 30 | phase: train | ⏰: 01:13:14
Loss: 0.0053 | IoU: 0.6581 | dice: 0.8615 | dice_neg: 0.9633 | dice_pos: 0.7741
Starting epoch: 30 | phase: val | ⏰: 01:20:36
Loss: 0.0103 | IoU: 0.5735 | dice: 0.8096 | dice_neg: 0.9391 | dice_pos: 0.6967
Epoch    30: reducing learning rate of group 0 to 5.0000e-09.

Starting epoch: 31 | phase: train | ⏰: 01:21:53
Loss: 0.0053 | IoU: 0.6565 | dice: 0.8598 | dice_neg: 0.9623 | dice_pos: 0.7722
Starting epoch: 31 | phase: val | ⏰: 01:29:12
Loss: 0.0101 | IoU: 0.5761 | dice: 0.8101 | dice_neg: 0.9333 | dice_pos: 0.6988

Starting epoch: 32 | phase: train | ⏰: 01:30:29
Loss: 0.0054 | IoU: 0.6563 | dice: 0.8576 | dice_neg: 0.9566 | dice_pos: 0.7723
Starting epoch: 32 | phase: val | ⏰: 01:37:50
Loss: 0.0103 | IoU: 0.5702 | dice: 0.8104 | dice_neg: 0.9349 | dice_pos: 0.6925

Starting epoch: 33 | phase: train | ⏰: 01:39:07
Loss: 0.0054 | IoU: 0.6546 | dice: 0.8575 | 

Loss: 0.0102 | IoU: 0.5761 | dice: 0.8124 | dice_neg: 0.9419 | dice_pos: 0.6992

Starting epoch: 62 | phase: train | ⏰: 05:52:29
Loss: 0.0054 | IoU: 0.6566 | dice: 0.8586 | dice_neg: 0.9612 | dice_pos: 0.7725
Starting epoch: 62 | phase: val | ⏰: 05:59:56
Loss: 0.0103 | IoU: 0.5679 | dice: 0.8131 | dice_neg: 0.9493 | dice_pos: 0.6850

Starting epoch: 63 | phase: train | ⏰: 06:01:16
Loss: 0.0053 | IoU: 0.6588 | dice: 0.8610 | dice_neg: 0.9617 | dice_pos: 0.7738
Starting epoch: 63 | phase: val | ⏰: 06:08:46
Loss: 0.0102 | IoU: 0.5738 | dice: 0.8112 | dice_neg: 0.9376 | dice_pos: 0.6967

Starting epoch: 64 | phase: train | ⏰: 06:10:05
Loss: 0.0053 | IoU: 0.6578 | dice: 0.8580 | dice_neg: 0.9573 | dice_pos: 0.7736
Starting epoch: 64 | phase: val | ⏰: 06:17:26
Loss: 0.0102 | IoU: 0.5757 | dice: 0.8118 | dice_neg: 0.9393 | dice_pos: 0.6984

Starting epoch: 65 | phase: train | ⏰: 06:18:44
Loss: 0.0053 | IoU: 0.6589 | dice: 0.8596 | dice_neg: 0.9585 | dice_pos: 0.7746
Starting epoch: 65 | phase

Loss: 0.0054 | IoU: 0.6536 | dice: 0.8577 | dice_neg: 0.9600 | dice_pos: 0.7700
Starting epoch: 94 | phase: val | ⏰: 10:41:07
Loss: 0.0103 | IoU: 0.5737 | dice: 0.8077 | dice_neg: 0.9290 | dice_pos: 0.6970

Starting epoch: 95 | phase: train | ⏰: 10:42:23
Loss: 0.0053 | IoU: 0.6591 | dice: 0.8614 | dice_neg: 0.9645 | dice_pos: 0.7751
Starting epoch: 95 | phase: val | ⏰: 10:49:44
Loss: 0.0102 | IoU: 0.5679 | dice: 0.8108 | dice_neg: 0.9417 | dice_pos: 0.6913

Starting epoch: 96 | phase: train | ⏰: 10:51:00
Loss: 0.0054 | IoU: 0.6571 | dice: 0.8589 | dice_neg: 0.9585 | dice_pos: 0.7734
Starting epoch: 96 | phase: val | ⏰: 10:58:21
Loss: 0.0102 | IoU: 0.5695 | dice: 0.8100 | dice_neg: 0.9401 | dice_pos: 0.6918

Starting epoch: 97 | phase: train | ⏰: 10:59:38
Loss: 0.0053 | IoU: 0.6589 | dice: 0.8588 | dice_neg: 0.9576 | dice_pos: 0.7742
Starting epoch: 97 | phase: val | ⏰: 11:06:58
Loss: 0.0102 | IoU: 0.5749 | dice: 0.8111 | dice_neg: 0.9380 | dice_pos: 0.6972

Starting epoch: 98 | phase: 

Loss: 0.0053 | IoU: 0.6564 | dice: 0.8596 | dice_neg: 0.9619 | dice_pos: 0.7722
Starting epoch: 126 | phase: val | ⏰: 15:20:12
Loss: 0.0102 | IoU: 0.5683 | dice: 0.8110 | dice_neg: 0.9436 | dice_pos: 0.6915

Starting epoch: 127 | phase: train | ⏰: 15:21:33
Loss: 0.0053 | IoU: 0.6574 | dice: 0.8593 | dice_neg: 0.9605 | dice_pos: 0.7734
Starting epoch: 127 | phase: val | ⏰: 15:29:08
Loss: 0.0104 | IoU: 0.5671 | dice: 0.8111 | dice_neg: 0.9425 | dice_pos: 0.6894

Starting epoch: 128 | phase: train | ⏰: 15:30:30
Loss: 0.0053 | IoU: 0.6575 | dice: 0.8577 | dice_neg: 0.9567 | dice_pos: 0.7731
Starting epoch: 128 | phase: val | ⏰: 15:38:05
Loss: 0.0104 | IoU: 0.5672 | dice: 0.8092 | dice_neg: 0.9335 | dice_pos: 0.6900

Starting epoch: 129 | phase: train | ⏰: 15:39:26
Loss: 0.0053 | IoU: 0.6568 | dice: 0.8585 | dice_neg: 0.9600 | dice_pos: 0.7730
Starting epoch: 129 | phase: val | ⏰: 15:47:05
Loss: 0.0103 | IoU: 0.5725 | dice: 0.8136 | dice_neg: 0.9484 | dice_pos: 0.6950

Starting epoch: 130 |

Loss: 0.0053 | IoU: 0.6562 | dice: 0.8596 | dice_neg: 0.9611 | dice_pos: 0.7723
Starting epoch: 158 | phase: val | ⏰: 20:04:22
Loss: 0.0102 | IoU: 0.5743 | dice: 0.8137 | dice_neg: 0.9463 | dice_pos: 0.6966

Starting epoch: 159 | phase: train | ⏰: 20:05:40
Loss: 0.0053 | IoU: 0.6563 | dice: 0.8591 | dice_neg: 0.9611 | dice_pos: 0.7721
Starting epoch: 159 | phase: val | ⏰: 20:13:05
Loss: 0.0103 | IoU: 0.5712 | dice: 0.8141 | dice_neg: 0.9471 | dice_pos: 0.6887

Starting epoch: 160 | phase: train | ⏰: 20:14:26
Loss: 0.0054 | IoU: 0.6558 | dice: 0.8585 | dice_neg: 0.9572 | dice_pos: 0.7715
Starting epoch: 160 | phase: val | ⏰: 20:22:04
Loss: 0.0102 | IoU: 0.5747 | dice: 0.8115 | dice_neg: 0.9451 | dice_pos: 0.6928

Starting epoch: 161 | phase: train | ⏰: 20:23:35
Loss: 0.0054 | IoU: 0.6564 | dice: 0.8571 | dice_neg: 0.9553 | dice_pos: 0.7730
Starting epoch: 161 | phase: val | ⏰: 20:31:03
Loss: 0.0104 | IoU: 0.5700 | dice: 0.8102 | dice_neg: 0.9403 | dice_pos: 0.6929

Starting epoch: 162 |

KeyboardInterrupt: 
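
The log above shows the learning rate being reduced to 5e-09, after which the scores barely move. The sketch below is a simplified re-implementation of the `ReduceLROnPlateau` decision rule as I understand it from the PyTorch documentation, assuming the defaults `factor=0.1`, a relative `threshold=1e-4`, and `eps=1e-8`, together with the `patience=3` set in `Trainer`. It is not the library code, and the flat validation loss is made up. Under these assumptions a reduction is ignored once the change in learning rate would be smaller than `eps`, which would be one possible explanation for the rate settling at 5e-09.

```python
def simulate_plateau(val_losses, lr, patience=3, factor=0.1,
                     threshold=1e-4, eps=1e-8):
    """Return the learning rate in effect after each epoch's validation loss."""
    best = float("inf")
    bad_epochs = 0
    history = []
    for loss in val_losses:
        if loss < best * (1 - threshold):   # counts as an improvement
            best = loss
            bad_epochs = 0
        else:
            bad_epochs += 1
        if bad_epochs > patience:
            new_lr = lr * factor
            if lr - new_lr > eps:           # tiny updates are skipped
                lr = new_lr
            bad_epochs = 0
        history.append(lr)
    return history


# Hypothetical validation loss that never improves after the first epoch.
history = simulate_plateau([0.0102] * 40, lr=5e-4)
for epoch, lr in enumerate(history):
    if epoch == 0 or lr != history[epoch - 1]:
        print(f"epoch {epoch:2d}: lr = {lr:.1e}")
```

If this reading is right, training past that point changes the weights very little, so an early-stopping check on the learning rate or on validation loss would save most of the remaining GPU time.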

In [None]:
# PLOT TRAINING
losses = model_trainer.losses
dice_scores = model_trainer.dice_scores # overall dice
iou_scores = model_trainer.iou_scores

def plot(scores, name):
    plt.figure(figsize=(15,5))
    plt.plot(range(len(scores["train"])), scores["train"], label=f'train {name}')
    plt.plot(range(len(scores["val"])), scores["val"], label=f'val {name}')  # val may be one epoch shorter if training was interrupted
    plt.title(f'{name} plot'); plt.xlabel('Epoch'); plt.ylabel(f'{name}');
    plt.legend(); 
    plt.show()

plot(losses, "BCE loss")
plot(dice_scores, "Dice score")
plot(iou_scores, "IoU score")

## Test prediction and submission

Training and validation take roughly 400 minutes, which exceeds Kaggle's GPU usage limit of 60 minutes, so the `submission.csv` file cannot be generated and submitted from this kernel. For test prediction and submission I've written a separate [UNet inference kernel](https://www.kaggle.com/rishabhiitbhu/unet-pytorch-inference-kernel); make sure you add the `model.pth` file generated by this kernel as a dataset to that inference kernel.

I've used the resnet-18 architecture in this kernel. It scores ~0.89 on the LB. Try playing around with other architectures from `segmentation_models.pytorch` and see what works best for you. Let me know in the comments :) and do upvote if you liked this kernel, I need some medals too. 😬

## References:

A few kernels from which I've borrowed some code:

* https://www.kaggle.com/amanooo/defect-detection-starter-u-net
* https://www.kaggle.com/go1dfish/clear-mask-visualization-and-simple-eda

A big thank you to all those who share their code on Kaggle, I'm nobody without you guys. I've learnt a lot from fellow Kagglers; special shout-out to [@Abhishek](https://www.kaggle.com/abhishek), [@Yury](https://www.kaggle.com/deyury), [@Heng](https://www.kaggle.com/hengck23), [@Ekhtiar](https://www.kaggle.com/ekhtiar), [@iafoss](https://www.kaggle.com/iafoss), [@Siddhartha](https://www.kaggle.com/meaninglesslives), [@xhlulu](https://www.kaggle.com/xhlulu), and the list goes on.