<a href="https://colab.research.google.com/github/sadeelmu/deeplearning/blob/main/Semantic_segmentation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

<style>
r { color: Red }
o { color: Orange }
g { color: Green }
b { color: Blue }
l { color: lightblue }
</style>


<html>
<body>
<table style="border: 0; rules=none; font-size:28px">
<tr>
<th rowspan=5><img width="200px", height="70px" src="https://raw.githubusercontent.com/camma-public/multibypass140/master/static/camma_logo_tr.png"/></th>
<td colspan=2 style="font-size:16px; color:blue; font-weight:bold"><h1><b>Deep Learning for Computer Vision</b></h1></td>
<th rowspan=5><img width="200px", height="130px" src="https://community.sap.com/legacyfs/online/storage/blog_attachments/2019/10/283545_NeuralNetwork_R_blue.png"/></th>
</tr>
<tr><td>Instructor:</td><td>Dr. Chinedu Nwoye</td></tr>
<tr><td colspan=2>(c) Research Group CAMMA</td></tr>
<tr><td colspan=2>University of Strasbourg</td></tr>
<tr><td>Website:</td><td><g>http://camma.u-strasbg.fr</g></td></tr>
<tr><td colspan=4 style="text-align:center; background-color:black; font-weight:bold"><center><h3><o>Semantic Segmentation</o></h3></center></td></tr>
</table>
</body>
</html>


--------

### Instructions

- In this lab session, we will train a deep learning model for semantic segmentation.
- You will learn how to organize a custom dataset and build a data loading pipeline for your experiment.
- You will learn to design metrics and loss functions for your task.
- You will learn to train your model on your dataset and monitor its optimization.
- You will be required to complete all the TODO tasks in this exercise. Ask the instructor if you get stuck.
- You will be required to come up with your own tricks, using everything you have learnt in previous classes, to improve your model's performance.

### GPU activation

- Make sure CUDA (GPU support) is enabled in your runtime before you start.

### Imports

- Every experiment starts with importing the required libraries.
- Check which of these libraries you are not familiar with and look up what they are used for.



In [None]:
!pip install torchmetrics
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader, Subset
import torchvision
from torchvision.transforms import ToTensor, ToPILImage, Resize, CenterCrop, ConvertImageDtype, Normalize
from torchmetrics import JaccardIndex
import matplotlib.pyplot as plt
import importlib as ipl
import numpy as np
import random
import pickle
import os
import glob
import time
import urllib
from timeit import default_timer as timer
import gc
from zipfile import ZipFile
from PIL import Image, ImageColor

# check the PyTorch version;
print("PyTorch version: ", torch.__version__)
print("torchvision version: ", torchvision.__version__)

# check the GPU support; should print True
print("Is GPU available?: ", torch.cuda.is_available())


In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


# Section 1: Dataset
In this lab session, we will use two datasets:

<br><hr><hr><br>

1.   Semantic Segmentation of Underwater Imagery (SUIM): A Large Scale Dataset for underwater Semantic Segmentation

<img src="https://storage.googleapis.com/kaggle-datasets-images/1557385/2565717/11a9865b8b2d8f692e45f51c3468a339/dataset-cover.jpg?t=2021-08-28-11-25-08" alt="Drawing" style="width: 200px;"/>

<br><hr><hr><br>

2.   m2caiSeg: A Surgical Dataset for Semantic Segmentation of Laparoscopic Images

<img src="https://storage.googleapis.com/kaggle-datasets-images/1025913/1728780/461f50a0dde9c8d5fdc4b6f8459165e4/dataset-cover.png?t=2020-12-09-22-58-52" alt="Drawing" style="width: 14%"/>

<br><hr><hr><br>

**[1.1] Download data**
- Choose which dataset to use and download it.

In [None]:
# run to download dataset and unzip the folder

options = ["suim", "m2caiseg"]

dataset_choice = ... # TODO

if dataset_choice == 'suim':
  !wget --backups 0 https://seafile.unistra.fr/f/17341900616245fb8c0c/?dl=1 --content-disposition &&  unzip -qq suim.zip -d /suim

elif dataset_choice ==  'm2caiseg':
  !wget --backups 0 https://seafile.unistra.fr/f/a09df056d97f4de7a507/?dl=1 --content-disposition &&  unzip -qq m2caiSeg.zip -d /

else:
    raise ValueError("No dataset selected from {}".format(options))

In [None]:
os.listdir("/suim")

**[1.2] Dataset Meta**
- Information about the dataset.
- This is usually provided in the dataset documentation, on its website, or in the publication releasing the dataset.

In [None]:
# SUIM META
suim_class = ["background_waterbody", "human_divers", "aquatic_plants_and_sea_grass", "wrecks_and_ruins", "robots", "reefs_and_invertebrates", "fish_and_vertebrates", "sea_floor_and_rocks"]
suim_color = [(0,0,0), (0,0,255), (0,255,0), (0,255,255), (255,0,0), (255,0,255), (255,255,0), (255,255,255)]

# M2CAI META
m2caiseg_class = ["Unknown","Grasper", "Bipolar", "Hook", "Scissors", "Clipper", "Irrigator", "Specimen bag", "Trocars", "Clip", "Liver", "Gallbladder", "Fat","Upper Wall", "Intestine","Artery","Bile", "Blood", "Black"]
m2caiseg_color = [[170,0,85],[0,85,170],[0,85,255],[0,170,85],[0,255,85],[0,255,170],[85,0,170],[85,0,255],[170,85,85],[170,170,170],[85,170,0],[85,170,255],[85,255,0],[85,255,170],[170,0,255],[255,0,255],[255,255,0],[255,0,0],[0,0,0]]

# DATA ROOT
filepaths = {
    'suim': '/suim',
    'm2caiseg': '/m2caiSeg',
}

# COMBINE
color_maps = {
    "suim": suim_color,
     'm2caiseg': m2caiseg_color,

}

class_maps = {
    "suim": suim_class,
     'm2caiseg': m2caiseg_class,
}

image_dirs = {
    "suim": "images",
    'm2caiseg': 'images'
}

mask_dirs = {
    "suim": "masks",
    'm2caiseg': 'groundtruth',
}

image_exts = {
    "suim": ".jpg",
     'm2caiseg': '.jpg'
}


mask_exts = {
    "suim": ".bmp",
     'm2caiseg': '.png'
}

filepath = filepaths[dataset_choice]
color_mapping = color_maps[dataset_choice]
classes = class_maps[dataset_choice]
image_dir = image_dirs[dataset_choice]
mask_dir = mask_dirs[dataset_choice]
image_ext = image_exts[dataset_choice]
mask_ext = mask_exts[dataset_choice]


**[1.3] Helper functions**
- Some functions you will be needing

In [None]:
# ******** IMPORTANT HELPER FUNCTIONS FOR THIS LAB ****************

# Show the contents of your dataset folder
# It is important to see how your dataset is organized.
# You can also do this using file explorer
def show_folder_structure(startpath):
    assert os.path.exists(startpath), "File path does not exist!"
    for root, dirs, files in os.walk(startpath):
        level = root.replace(startpath, '').count(os.sep)
        indent = ' ' * 4 * (level)
        print('{}{}/'.format(indent, os.path.basename(root)))
        subindent = ' ' * 4 * (level + 1)
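        files = sorted(files)
        for f in files[:3]:  # show at most the first three files
            print('{}{}'.format(subindent, f))
        if len(files) > 3:  # summarize the remaining files with the last file name
            print('{}...'.format(subindent))
            print('{}{}'.format(subindent, files[-1]))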


# Display images and labels
def display_image(images, titles):
    f, axes = plt.subplots(1, len(images), sharey=True)
    for i in range(len(images)):
        axes[i].imshow(images[i])
        axes[i].set_title(titles[i], fontsize=8, color= 'blue')
    plt.show()


# Convert segmentation mask from RGB to Semantic channel.
# RGB channel = 3 (red, green, blue)
# Semantic channel = N, where N = number of classes, one channel per class
def rgb_to_semantic(image, color_mapping):
    image_array = np.array(image)
    repeated_image = np.repeat(image_array[:, :, np.newaxis, :], len(color_mapping), axis=2) # [rgb channels] x number of classes
    repeated_mapping = np.repeat(np.array(list(color_mapping))[np.newaxis, np.newaxis, :, :], image_array.shape[0], axis=0) # [semantic channels] x number of classes
    maskND = np.all(repeated_image == repeated_mapping, axis=-1).astype(np.uint8) # Equality broadcast
    return maskND


# Convert segmentation mask with semantic channels to a single channel
# Take the argmax over the class axis to find the active channel of each pixel
# Each pixel takes the categorical value (index) of its class
def nD_to_1D(maskND):
    mask1D = np.argmax(maskND, axis=-1)
    return mask1D


# Convert semantic channel mask to rgb channel image
# Create an array of RGB values corresponding to keys in the mapping
# And Map the keys in the image to their corresponding RGB values
def semantic_to_rgb(mapped_image, color_mapping):
    color_array = np.array(color_mapping, dtype=np.uint8)
    rgb_image = color_array[mapped_image]
    return rgb_image



**[1.4] Data Inspection**

- The first step in training a model is understanding your data.
- We will analyze the data to see what it looks like and how we can use it.

In [None]:
# folder structure exploration

show_folder_structure(filepath)


In [None]:
## Uncomment this if you want to use explorer to explore your dataset
# from google.colab import drive
# drive.mount('/content/drive')

In [None]:
# check data size

train_size = len(os.listdir(os.path.join(filepath, "train", image_dir)))
val_size = len(os.listdir(os.path.join(filepath, "val", image_dir)))
test_size = len(os.listdir(os.path.join(filepath, "test", image_dir)))

print("Size | train: {}, val: {}, test:{}".format(train_size, val_size, test_size))

In [None]:
# show one image and its label to see what they look like; in practice, you should check many examples.

selected_img_file = ... # TODO
selected_msk_file = ... # TODO

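# sort both listings so that index 2 refers to the same sample in both folders
# (this assumes images and masks share the same base file names)
selected_img_file = sorted(os.listdir(os.path.join(filepath, "train", image_dir)))[2]
selected_msk_file = sorted(os.listdir(os.path.join(filepath, "train", mask_dir)))[2]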

img1_url = os.path.join(filepath, "train", image_dir, selected_img_file)
msk1_url = os.path.join(filepath, "train", mask_dir, selected_msk_file)
rgb_img1 = Image.open(img1_url).convert("RGB")
rgb_msk1 = Image.open(msk1_url).convert("RGB")
rgb_img1 = np.array(rgb_img1)
rgb_msk1 = np.array(rgb_msk1)


display_image(images=[rgb_img1, rgb_msk1], titles=['image', 'mask'])


In [None]:
# Check your data shape and distribution
# This is very important to understand your data

print("Image shape = ", rgb_img1.shape)
print("Mask shape = ", rgb_msk1.shape)

print("Image distribution: [Min = {}, Mean = {}, Max = {}] ".format(rgb_img1.min(), rgb_img1.mean(), rgb_img1.max()))
print("Mask distribution: [Min = {}, Mean = {}, Max = {}] ".format(rgb_msk1.min(), rgb_msk1.mean(), rgb_msk1.max()))


In [None]:
# Data Label Processing
# You need to process your label into a format that your model can use.
# You have seen that the mask has RGB channels, but your model needs semantic channels.
# A semantic label has N channels, where N = number of classes.

# 1. Convert RGB channel to semantic channel mask.
# To do this, you must know the color code mapping of each semantic class.
semantic_mask_ND = rgb_to_semantic(rgb_msk1, color_mapping)

# 2. Convert semantic mask to single channel mask
# To visualize the converted semantic channel mask, we convert it to a single channel
# in which each pixel holds the index of the channel with the maximum value.
semantic_mask_1D = nD_to_1D(semantic_mask_ND)

# 3. Recover rgb mask
# We can easily convert the single channel back to RGB channels to visualize the mask.
# If you don't get back your original RGB mask, your label processing code is not correct.
recovered_rgb_msk1 = semantic_to_rgb(semantic_mask_1D, color_mapping)

# 4. Visualize
display_image(images=[rgb_img1, rgb_msk1, semantic_mask_1D, recovered_rgb_msk1],
              titles=['Image', 'RGB mask', "Semantic mask", "Reversed RGB mask"])


# NB: We will only need the semantic channel mask for model training, the rest is for visualization purpose.
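
The three conversions above can be checked by hand on a tiny example. The following is a minimal sketch, assuming a made-up 3-class palette (`toy_colors`) and a 2x2 mask (`toy_rgb`) that are not part of either dataset; it mirrors the idea of the helper functions rather than reusing them.

```python
import numpy as np

# Hypothetical 3-class palette, used only for this illustration.
toy_colors = [(0, 0, 0), (255, 0, 0), (0, 255, 0)]

# A 2x2 RGB mask: one pixel per class plus a repeated background pixel.
toy_rgb = np.array([[[0, 0, 0], [255, 0, 0]],
                    [[0, 255, 0], [0, 0, 0]]], dtype=np.uint8)

palette = np.array(toy_colors, dtype=np.uint8)   # [N, 3]

# Broadcasting compares every pixel [H, W, 1, 3] with every colour [N, 3].
one_hot = np.all(toy_rgb[:, :, None, :] == palette, axis=-1).astype(np.uint8)  # [H, W, N]

# Index of the active channel per pixel.
class_ids = one_hot.argmax(axis=-1)              # [H, W]

# Look the colours up again from the class indices.
recovered = palette[class_ids]                   # [H, W, 3]

print(class_ids)
print("round trip ok:", np.array_equal(recovered, toy_rgb))
```

If the round trip fails on real masks, a likely cause is a pixel colour that is missing from the colour mapping; such pixels produce an all-zero one-hot vector and fall back to class 0 under `argmax`.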

**[1.5] Data loader**
- Now that you can comfortably manipulate your dataset, you can build a data loader to handle it.
- Previously, we relied on PyTorch's `torchvision.datasets.ImageFolder` class to load our data.
- This dataset is not structured like the previous ones, where every image sat in the folder of its label.
- So, we have to write a custom dataset class that can load our data.
- The goal is to return an (image, mask) pair from your dataset at every call.

**[1.6] Data preparation**
- We will package the data into PyTorch datasets.
- We will add basic preprocessing such as image resizing.
- We will use the dataset's train/val/test splits.
- We will build a data loader for each split with a batch size of 32 for speed and convenience.
- We will also create a small-size training set.
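
One possible way to build a small-size training set and its loader is `torch.utils.data.Subset`. The sketch below is only an illustration under stated assumptions: `toy_dataset` is a random stand-in for the real dataset, and the subset size (20), seed (13), and batch size (8) are arbitrary choices, not values prescribed by this lab.

```python
import torch
from torch.utils.data import DataLoader, Subset, TensorDataset

# Hypothetical stand-in for the real dataset: 100 random "images" and 8-channel masks.
toy_images = torch.rand(100, 3, 32, 32)
toy_masks = torch.zeros(100, 8, 32, 32)
toy_dataset = TensorDataset(toy_images, toy_masks)

# Pick a fixed random 20% of the sample indices (seeded so the subset is reproducible).
generator = torch.Generator().manual_seed(13)
small_indices = torch.randperm(len(toy_dataset), generator=generator)[:20].tolist()
small_train_dataset = Subset(toy_dataset, small_indices)

# shuffle=True for training; evaluation loaders would normally use shuffle=False.
small_train_loader = DataLoader(small_train_dataset, batch_size=8, shuffle=True)

images, masks = next(iter(small_train_loader))
print(len(small_train_dataset), tuple(images.shape), tuple(masks.shape))
```

The same pattern should apply to any `Dataset` object, since `Subset` only needs `__len__` and `__getitem__`.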

In [None]:
# We define a dataset class that delivers images and corresponding ground truth segmentation masks

class MyDataset(torch.utils.data.Dataset):
    # Your Dataset class will inherit from torch.utils.data.Dataset
    # There are 3 important functions to override here
    # 1. `__init__`: prepares your dataset as an indexable collection of samples
    # 2. `__len__`: returns the total number of samples you have
    # 3. `__getitem__`: returns an individual (image, target) pair on each call
    # You can write other functions that help these 3 fulfill their duties
    def __init__(self, root_dir="/suim", data_split="train", image_dir="images", mask_dir="masks",
                 image_ext=".jpg", mask_ext=".bmp", image_transforms=ToTensor(), mask_transforms=ToTensor(),
                 color_mapping=None):
        np.random.seed(13)
        image_paths = os.path.join(root_dir, data_split, image_dir, "*{}".format(image_ext))
        self.images = sorted(glob.glob(image_paths))
        self.masks  = [img.replace(image_dir, mask_dir).replace(image_ext, mask_ext) for img in self.images]
        self.image_transforms = image_transforms
        self.mask_transforms = mask_transforms
        self.color_mapping = color_mapping


    def __len__(self):
        return len(self.images)


    def __getitem__(self, index):
        img = Image.open(self.images[index]).convert("RGB")
        msk = Image.open(self.masks[index]).convert("RGB")
        img = self.image_transforms(img)
        msk = self.mask_transforms(msk)
        msk = self.rgb_to_semantic_mask(msk)
        return img, msk


    def rgb_to_semantic_mask(self, mask):
        mask  = (mask * 255.0).round().long()  # round before casting so float error cannot truncate e.g. 254.999 to 254
        mask_flat = mask.view(3, -1).t()
        mapper = torch.tensor(list(self.color_mapping))
        indices = torch.argmax((mask_flat.unsqueeze(1) == mapper.unsqueeze(0)).all(dim=-1).int(), dim=-1)
        mask1D = indices.view(mask.shape[1], mask.shape[2])
        maskND = torch.eye(len(self.color_mapping), dtype=torch.float32)[mask1D].permute(2,0,1)
        return maskND



In [None]:
# Data Transformation
# This is where you can write all your data preprocessing and data augmentation functions
# It is always preferable to have different transformations for the training and evaluation sets

mean_imagenet = [0.485, 0.456, 0.406]
std_imagenet  = [0.229, 0.224, 0.225]
base_size = 200
img_size = [224, 224]


train_image_transforms = torchvision.transforms.Compose([
    ToTensor(),
    # CenterCrop(base_size),
    Resize(size=(224,224)),
    Normalize(mean=mean_imagenet, std=std_imagenet),
])

train_mask_transforms = torchvision.transforms.Compose([
    ToTensor(),
    # CenterCrop(base_size),
    Resize(size=(224,224), interpolation=torchvision.transforms.InterpolationMode.NEAREST_EXACT),
])

eval_image_transforms = torchvision.transforms.Compose([
    ToTensor(),
    Resize(size=(224,224)),
    Normalize(mean=mean_imagenet, std=std_imagenet),
])

eval_mask_transforms = torchvision.transforms.Compose([
    ToTensor(),
    Resize(size=(224,224), interpolation=torchvision.transforms.InterpolationMode.NEAREST_EXACT),
])
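
The mask transforms above use nearest-neighbour interpolation while the image transforms use the default mode. A small sketch of why that matters follows; it uses `torch.nn.functional.interpolate` as a stand-in for `Resize` and a made-up 4x4 mask (`toy_mask`), so treat it as an illustration rather than a reproduction of the exact transform.

```python
import torch
import torch.nn.functional as F

# Toy single-channel "mask" with two label values (0 and 1), shape [batch, channel, H, W].
toy_mask = torch.tensor([[0., 0., 1., 1.]] * 4).reshape(1, 1, 4, 4)

# Nearest-neighbour copies existing values, so only the original labels remain.
nearest = F.interpolate(toy_mask, size=(8, 8), mode="nearest")

# Bilinear blends neighbouring pixels and creates values that are not valid labels.
bilinear = F.interpolate(toy_mask, size=(8, 8), mode="bilinear", align_corners=False)

print("nearest values :", nearest.unique().tolist())
print("bilinear values:", bilinear.unique().tolist())
```

For RGB masks, blended values would correspond to colours that are absent from the colour mapping, and those pixels could no longer be matched to a class.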


In [None]:
# Test your dataloader # Be sure your data loader works as desired before using it to train your model

BATCH_SIZE = ... # TODO. Give a batch size to your data loader


# build dataset for different data split
train_dataset = MyDataset(root_dir=filepath, data_split="train", image_dir=image_dir, mask_dir=mask_dir,
                          image_ext=image_ext, mask_ext=mask_ext, image_transforms=train_image_transforms,
                          mask_transforms=train_mask_transforms, color_mapping=color_mapping)

val_dataset = MyDataset(root_dir=filepath, data_split="val", image_dir=image_dir, mask_dir=mask_dir,
                          image_ext=image_ext, mask_ext=mask_ext, image_transforms=eval_image_transforms,
                          mask_transforms=eval_mask_transforms, color_mapping=color_mapping)

test_dataset = MyDataset(root_dir=filepath, data_split="test", image_dir=image_dir, mask_dir=mask_dir,
                          image_ext=image_ext, mask_ext=mask_ext, image_transforms=eval_image_transforms,
                          mask_transforms=eval_mask_transforms, color_mapping=color_mapping)




# Build their loader, include a batch size, data shuffling and any other feature.
train_dataloader = ... # TODO
val_dataloader = ... # TODO
test_dataloader = ... # TODO


# Check one sample
image_i, mask_i = next(iter(train_dataloader))


print("Image shape = {} | Mask shape = {}".format(image_i.shape, mask_i.shape))

# Torch uses the channel-first tensor, we can transpose to channel-last to visualize
image_1 = image_i[0].permute(1,2,0)
mask_1 = mask_i[0].permute(1,2,0)


# Convert semantic mask to single channel and finally to RGB channel to visualize
semantic_mask_1D = nD_to_1D(mask_1)
recovered_rgb_msk = semantic_to_rgb(semantic_mask_1D, color_mapping)

# Plot
display_image(images=[image_1, recovered_rgb_msk],
               titles=['Image', 'Target mask'])


# Section 2: Model
- We will use a state-of-the-art segmentation model known as DeepLabv3.
- We will leverage transfer learning from weights pretrained on the MS COCO dataset.
- We will do surgery on the model to adapt it to the number of classes in our dataset.
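
The idea of model surgery with partial fine-tuning can be sketched on a toy network before touching the real one. Everything in this example is an assumption made for illustration: `toy_model` is a hypothetical two-part module, not DeepLabv3, and `num_new_classes = 8` is an arbitrary value.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a pretrained segmentation network (not DeepLabv3 itself).
toy_model = nn.ModuleDict({
    "backbone": nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU()),
    "classifier": nn.Sequential(nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(), nn.Conv2d(16, 21, 1)),
})

num_new_classes = 8  # assumed number of classes in the new dataset

# 1. Freeze every existing ("pretrained") parameter.
for param in toy_model.parameters():
    param.requires_grad = False

# 2. Swap the last layer for one with the new number of output channels,
#    reusing the input channel count of the layer it replaces.
old_head = toy_model["classifier"][2]
toy_model["classifier"][2] = nn.Conv2d(old_head.in_channels, num_new_classes, kernel_size=1)

# 3. Newly created layers are trainable by default; everything else stays frozen.
trainable = [name for name, p in toy_model.named_parameters() if p.requires_grad]
print(trainable)

x = torch.rand(2, 3, 32, 32)
out = toy_model["classifier"](toy_model["backbone"](x))
print(tuple(out.shape))
```

Which layers to unfreeze on the real model is a design choice: unfreezing more layers gives more capacity to adapt but needs more data and compute.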

In [None]:
# Get a state-of-the-art model from the torchvision repository

seg_model = torchvision.models.segmentation.deeplabv3_resnet50(pretrained=True, progress=True, num_classes=21, aux_loss=None)


# Let's see the architecture of the model
# we look at only the classifier module
print(seg_model.classifier)

In [None]:
# Modify the last layer to have output channels matching your dataset classes
num_classes = len(classes)  # taken from the dataset meta in [1.2] (8 for SUIM, 19 for m2caiSeg)
myClassifier = ... # TODO: Build a convolution layer to replace the last convolution layer in your model. Keep in mind the number of input and output filters

seg_model.classifier[4] = myClassifier

# Inspect the architecture to see your new model
print(seg_model.classifier)

In [None]:
# Let's check whether all the model parameters are trainable

def is_trainable(module):
    status = np.all([param.requires_grad for param in module.parameters()])
    return status


print("Model Trainable = ", is_trainable(seg_model))

In [None]:
# Freeze the model so we don't train all of it, because we are doing transfer learning with partial fine-tuning
for param in seg_model.parameters():
    param.requires_grad = False


# We want to finetune all the classifier layers only, so unfreeze that part
for param in ... : # TODO: Choose wisely layers to train
    param.requires_grad = True


print("Model's Backbone Trainable = ", is_trainable(seg_model.backbone))
print("Model's Classifier Trainable = ", is_trainable(seg_model.classifier))

In [None]:
# Run an inference to see how much it can do without training

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Check one sample
image_i, gt_mask_i = next(iter(train_dataloader))
input_image = ... # TODO: put your data to a device

# inference
seg_model = seg_model.to(device)
seg_model.... # TODO: Switch your model into an evaluation mode
pd_mask = ... # TODO: Make an inference on the input image
pd_mask_i = pd_mask['out'].cpu()
print("Image shape = {} | GT Mask shape = {} | Pred Mask shape = {}".format(image_i.shape, gt_mask_i.shape, pd_mask_i.shape))

# Torch uses the channel-first tensor, we can transpose to channel-last to visualize
image_1   = image_i[0].... # TODO

# Convert semantic mask to single channel and finally to RGB channel to visualize
gt_mask_1  = gt_mask_i.softmax(1).argmax(1)[0]
pd_mask_1  = pd_mask_i.softmax(1).argmax(1)[0]
gt_rgb_msk = semantic_to_rgb(gt_mask_1, color_mapping)
pd_rgb_msk = semantic_to_rgb(pd_mask_1, color_mapping)

# Plot
display_image(images=[image_1, gt_rgb_msk, pd_rgb_msk],
               titles=['Image', 'Target mask', "Predicted mask"])

# Section 3: Training and Evaluation

- You have seen that the pretrained model understands some low-level details such as shapes and edges, but it lacks the class semantics of our dataset.
- We will fine-tune this model on the new dataset for a few epochs to adapt it to the dataset domain.
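
A fine-tuning loop repeats the same four operations on every batch: clear old gradients, compute the loss, backpropagate, and update the weights. The sketch below shows that pattern on a toy problem. The 1x1 convolution `toy_model`, the random data, the SGD optimizer, and the use of `nn.CrossEntropyLoss` with class-index targets are all assumptions for illustration, not the choices required by this lab.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical tiny "segmentation" model: 3 input channels -> 4 class logits per pixel.
toy_model = nn.Conv2d(3, 4, kernel_size=1)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(toy_model.parameters(), lr=0.1)

inputs = torch.rand(2, 3, 8, 8)
# One-hot masks shaped [B, N, H, W], like the masks produced by a one-hot dataset.
one_hot_masks = torch.eye(4)[torch.randint(0, 4, (2, 8, 8))].permute(0, 3, 1, 2)
targets = one_hot_masks.argmax(dim=1)  # class indices shaped [B, H, W]

losses = []
for _ in range(5):
    optimizer.zero_grad()               # 1. clear gradients from the previous step
    outputs = toy_model(inputs)         # forward pass, logits shaped [B, N, H, W]
    loss = criterion(outputs, targets)  # 2. compute the loss
    loss.backward()                     # 3. backpropagation
    optimizer.step()                    # 4. parameter update
    losses.append(loss.item())

print(losses)
```

On this toy problem the loss should shrink from step to step; on real data the curve is noisier, which is why the training code also tracks validation metrics.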

In [None]:
# For ease of mastery, we will reuse our previous training and evaluation code


# Training step
def train_step(inputs, labels, model, criterion, optimizer, device):
    model.train() # set to training mode
    inputs = inputs.to(device) # device allocation
    labels = labels.to(device) # device allocation
    outputs = model(inputs)['out'] # inference
    # TODO: In 4 lines of code, do the following:
    # 1. zero the parameter gradients in your optimizer
    # 2. compute loss
    # 3. backpropagation
    # 4. optimization step
    batch_loss = loss.item() * inputs.size(0) # loss performance
    return batch_loss


# Evaluation step
def eval_step(inputs, labels, model, metrics, device):
    model.eval() # set to evaluation mode
    with torch.no_grad(): # stop gradient computation
      inputs = inputs.to(device) # device allocation
      labels = labels.to(device) # device allocation
      outputs = model(inputs)['out'] # inference
      preds = outputs.softmax(1).argmax(1)
      targets = labels.softmax(1).argmax(1)
      batch_iou = metrics(preds, targets) # iou performance
      batch_accs = torch.sum(preds == targets).float()/targets.numel() # accuracy performance
    return batch_accs.item() * inputs.size(0), batch_iou.item() * inputs.size(0)


# train and validate cycle
def train_model(model, criterion, optimizer, scheduler, metrics, device, num_epochs=25):
    start = time.time()
    epoch_loss = []
    epoch_accs = []
    epoch_ious = []

    for epoch in range(num_epochs):
        # train loop
        running_loss = 0.0
        running_accs = 0
        for inputs, labels in train_dataloader: # Iterate over data.
            loss = train_step(inputs, labels, model, criterion, optimizer, device)
            running_loss += loss
        scheduler.step() # decay learning rate
        train_epoch_loss = running_loss / len(train_dataset)
        epoch_loss.append(train_epoch_loss)

        # validation loop
        running_accs = 0
        running_ious = 0
        for inputs, labels in val_dataloader: # Iterate over data.
            acc, iou = eval_step(inputs, labels, model, metrics, device)
            running_accs += acc
            running_ious += iou
        val_epoch_acc = running_accs / len(val_dataset)
        val_epoch_iou = running_ious / len(val_dataset)
        epoch_accs.append(val_epoch_acc)
        epoch_ious.append(val_epoch_iou)

        print('Epoch {}/{} >> TRAIN Loss: {:.4f} | VAL IoU: {:.4f} Acc: {:.4f}'.format(
                epoch, num_epochs-1, train_epoch_loss, val_epoch_iou, val_epoch_acc))
        print('-' * 10)

    # Reports
    time_elapsed = time.time() - start
    print('Training complete in {:.0f}m {:.0f}s'.format(time_elapsed // 60, time_elapsed % 60))

    fig = plt.figure()

    ax = fig.add_subplot(131)
    intervals = np.arange(len(epoch_loss))
    ax.plot(intervals, epoch_loss)
    ax.set_xlabel("epochs")
    ax.set_ylabel("segmentation loss")

    ax = fig.add_subplot(132)
    intervals = np.arange(len(epoch_accs))
    ax.plot(intervals, epoch_accs)
    ax.set_xlabel("epochs")
    ax.set_ylabel("validation accuracy")

    ax = fig.add_subplot(133)
    intervals = np.arange(len(epoch_ious))
    ax.plot(intervals, epoch_ious)
    ax.set_xlabel("epochs")
    ax.set_ylabel("validation IoU")

    print("Evaluation Accuracy = {} and IoU = {}".format(epoch_accs[-1], epoch_ious[-1]))

    # clear GPU cache memory
    torch.cuda.empty_cache()
    return model



def visualize_preds(model, dataloader, choice, device):
    iterloader = iter(dataloader)
    if choice >= len(iterloader):
        choice = 0
    for _ in range(choice):  # skip the batches before the chosen one
        next(iterloader)
    model.eval()
    with torch.no_grad():
        inputs, labels = next(iterloader)
        inputs = inputs.to(device)
        labels = labels.to(device)
        outputs = model(inputs)['out'] # inference
        preds   = outputs.softmax(1).argmax(1)
        targets = labels.softmax(1).argmax(1)
    for img, pred, target in zip(inputs, preds, targets):
        img = img.permute(1,2,0).cpu()
        target = target.cpu()
        pred = pred.cpu()
        print(np.unique(pred))
        gt_rgb_msk = semantic_to_rgb(target, color_mapping)
        pd_rgb_msk = semantic_to_rgb(pred, color_mapping)
        display_image(images=[img.cpu(), gt_rgb_msk, pd_rgb_msk],
                  titles=['Image', 'Target mask', "Predicted mask"])
    return None




**[3.2] Train and validate**
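
The validation metric used in this section is the Jaccard index, also called intersection over union (IoU). To make the number easier to interpret, here is a plain NumPy sketch of a mean IoU. The function `mean_iou` is a hypothetical helper written for this illustration; its averaging and its handling of absent classes may differ from what `torchmetrics.JaccardIndex` does.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean of per-class IoU over the classes that appear in pred or target."""
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class absent from both maps: skipped here (one possible convention)
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious))

toy_target = np.array([[0, 0], [1, 1]])
toy_pred = np.array([[0, 1], [1, 1]])

# Class 0: intersection 1, union 2. Class 1: intersection 2, union 3. Class 2 is absent.
print(mean_iou(toy_pred, toy_target, num_classes=3))
```

Unlike pixel accuracy, IoU penalizes a class that is over-predicted, which makes it more informative when a few classes dominate the image.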

In [None]:
# learning rate
LEARNING_RATE = 0.07

# resource allocation
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
seg_model  = seg_model.to(device)

# loss function
criterion = ... # TODO: Define your loss function
criterion = criterion.to(device)

# metrics function
metrics = JaccardIndex(task='multiclass', num_classes=len(classes))
metrics = metrics.to(device)

# optimizer
optimizer =  ... # TODO: Define an optimizer

# learning rate decay scheduler
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.1)

# training
trained_model = train_model(seg_model, criterion, optimizer, scheduler, metrics, device, num_epochs=25)

# Evaluate on test set

In [None]:
# Testing on the test data
running_ious = 0
running_accs = 0
for inputs, labels in ... : # TODO: Iterate over test data.
    acc, iou = eval_step(inputs, labels, trained_model, metrics, device)
    running_ious += iou
    running_accs += acc
test_acc = running_accs / len(test_dataset)
test_iou = running_ious / len(test_dataset)

print('TEST Acc: {:.4f} | IoU {:.4f}'.format(test_acc, test_iou))


In [None]:
# Visualize the images and prediction, manually find the images that your model failed on.
iterloader = iter(test_dataloader)
N = len(test_dataloader)
choice = random.choice(list(range(N)))
print("Choosing batch {} out of {} batches".format(choice, N))

visualize_preds(trained_model, test_dataloader, choice, device)

# Section 4: Performance improvement

**[4.1] Model babysitting**

**Question**
- How can we improve the performance of the model?
- *This case can be similar to your internship projects and real-world problems.*



**Solutions**
- There are many things to try. I will list them, and you will try them unassisted to see which works best for you.
- If you can train your model on the small dataset and get a test performance equal to or higher than the one trained on the large training data, you will get a beautiful gift from me.


**List of strategies to try:**
1. Data augmentation
2. Weight regularization
3. Hyperparameter tuning (e.g. learning rate, batch size, weight decay, etc. *find more*)
4. Early stopping
5. Dropout
6. Batch normalization
7. Intuitive objective loss function
8. Some network layer modification
9. Change of model or parts of the model
10. Full retrain
11. *Find more*
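
For the data augmentation strategy, one point specific to segmentation is that any geometric transform must be applied identically to the image and its mask. The following is a minimal sketch of a paired random horizontal flip; `paired_random_hflip` is a hypothetical helper written for this illustration and is not part of the notebook's pipeline.

```python
import torch

def paired_random_hflip(image, mask, p=0.5, generator=None):
    """Flip image and mask together along the width axis with probability p."""
    if torch.rand(1, generator=generator).item() < p:
        image = torch.flip(image, dims=[-1])
        mask = torch.flip(mask, dims=[-1])
    return image, mask

# Toy image [C, H, W] and a mask derived from it, so alignment is easy to verify.
toy_image = torch.arange(12, dtype=torch.float32).reshape(1, 3, 4)
toy_mask = (toy_image > 5).float()

# p=1.0 forces the flip so the effect is visible.
flipped_image, flipped_mask = paired_random_hflip(toy_image, toy_mask, p=1.0)

# The flipped mask still matches the flipped image pixel for pixel.
print(torch.equal(flipped_mask, (flipped_image > 5).float()))
```

Applying independent random transforms to the image and the mask would break this alignment, so a shared random decision, as in this sketch, is one way to keep them consistent.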