# Homework 3.1: Dense Prediction
---
In this part, you will study semantic segmentation, a dense prediction problem. The goal of this assignment is to study, implement, and compare different components of dense prediction models, including **data augmentation**, **backbones**, **classifiers**, and **losses**.

This assignment requires training multiple neural networks, so using a **GPU** accelerator is advised.

In [None]:
# Uncomment and run if in Colab
# !mkdir datasets
# !gdown --id 139GsP9CqFCW1LA1Mf3e1gZpWz2uXmfHf -O datasets/tiny-floodnet-challenge.tar.gz
# !tar -xzf datasets/tiny-floodnet-challenge.tar.gz -C datasets
# !rm datasets/tiny-floodnet-challenge.tar.gz
# !gdown --id 1Td3RKkTsBEn1lBULddEmXKHxKhXqz_LC
# !tar -xzf part1_semantic_segmentation.tar.gz
# !rm part1_semantic_segmentation.tar.gz

!pip install pytorch_lightning

In [1]:
import torch

In [2]:
from torch.nn import Conv2d, ReLU, BatchNorm2d, Dropout, ConvTranspose2d, MaxPool2d

class DownBlock(torch.nn.Module):
    def __init__(self, in_ch, mid_ch = None , out_ch = None):
        super(DownBlock, self).__init__()

        if out_ch is None: out_ch = 2*in_ch
        if mid_ch is None: mid_ch = out_ch

        self.conv1 = Conv2d(in_channels = in_ch , out_channels = mid_ch, kernel_size = 3, stride = 1, padding = 1)
        self.batn1 = BatchNorm2d(mid_ch) 
        self.act1  = ReLU()
        
        self.conv2 = Conv2d(in_channels = mid_ch , out_channels = out_ch , kernel_size = 3, stride = 1, padding = 1)
        self.batn2 = BatchNorm2d(out_ch)
        self.act2  = ReLU()
        
        self.pool  = MaxPool2d(kernel_size = 2, stride = 2, padding = 0)
    
    def forward(self, x):
        x = self.conv1(x)
        x = self.batn1(x) 
        x = self.act1(x)

        x = self.conv2(x)
        x = self.batn2(x)
        x = self.act2(x)

        skip = x

        x = self.pool(x)

        return x, skip

class BaseBlock(torch.nn.Module):
    def __init__(self, in_ch):
        super(BaseBlock, self).__init__()
        
        self.block = torch.nn.Sequential(
            Conv2d(in_channels=in_ch, out_channels = in_ch, kernel_size=3, stride=1, padding=1),
            Conv2d(in_channels=in_ch, out_channels = in_ch, kernel_size=3, stride=1, padding=1),
            ConvTranspose2d(in_channels = in_ch, out_channels= in_ch, kernel_size = 3, padding = 1, output_padding=1 , stride = 2)
        )

    def forward(self, x):
        return self.block(x)


class UpBlock(torch.nn.Module):
    def __init__(self, in_ch , mid_ch = None, out_ch = None, isUpconv = True):
        super(UpBlock, self).__init__()
        
        if out_ch is None: out_ch = in_ch*2 // 4
        if mid_ch is None: mid_ch = out_ch
        #self.skip = skip
        self.isUpconv = isUpconv

        self.drop1 = Dropout(p = 0.5)
        
        self.conv1 = Conv2d(in_channels = in_ch*2, out_channels = mid_ch, kernel_size = 3, stride= 1, padding=1)
        self.batn1 = BatchNorm2d(mid_ch)
        self.act1  = ReLU()
        
        self.conv2 = Conv2d(in_channels = mid_ch, out_channels = out_ch, kernel_size = 3, stride= 1, padding=1)
        self.batn2 = BatchNorm2d(out_ch)
        self.act2 = ReLU()
        
        if isUpconv:
            self.ups1  = ConvTranspose2d(in_channels = out_ch, out_channels= out_ch, kernel_size =3, padding =1, output_padding = 1, stride = 2)
        

    def forward(self, x, skip):

        x = torch.cat((skip, x), dim = 1)
        
        x = self.drop1(x)
        
        x = self.conv1(x)
        x = self.batn1(x)
        x = self.act1(x)

        x = self.conv2(x)
        x = self.batn2(x)
        x = self.act2(x)

        if self.isUpconv:
            x = self.ups1(x)
        
        return x 

class Unet(torch.nn.Module):
    def __init__(self):
        super(Unet, self).__init__()

        self.down1 = DownBlock(1,56,112)
        self.down2 = DownBlock(112,224)
        self.down3 = DownBlock(224,448)
        self.base  = BaseBlock(448)
        self.up1   = UpBlock(448)
        self.up2   = UpBlock(224)
        self.up3   = UpBlock(112, 112, 56, False)

    def forward(self, x):
        
        x, skip1 = self.down1(x)
        #print('down1: ', x.shape) 
        x, skip2 = self.down2(x)
        #print('down2: ', x.shape)
        x,skip3 = self.down3(x)
        #print('down3: ', x.shape)
        x = self.base(x)
        #print('base: ', x.shape)
        x = self.up1(x,skip3)
        #print('up1: ', x.shape)
        x = self.up2(x,skip2)
        #print('up2: ', x.shape)
        x = self.up3(x,skip1)

        
        return x



In [9]:
from torch.nn import Conv2d, ReLU, BatchNorm2d, Dropout, ConvTranspose2d, MaxPool2d

class DownBlock(torch.nn.Module):
    def __init__(self, in_ch, mid_ch = None , out_ch = None):
        super(DownBlock, self).__init__()

        if out_ch is None: out_ch = 2*in_ch
        if mid_ch is None: mid_ch = out_ch

        self.block = nn.Sequential(
            Conv2d(in_channels = in_ch , out_channels = mid_ch, kernel_size = 3, stride = 1, padding = 1),
            #BatchNorm2d(mid_ch),
            ReLU(), 

            Conv2d(in_channels = mid_ch , out_channels = out_ch , kernel_size = 3, stride = 1, padding = 1),
            #BatchNorm2d(out_ch),
            ReLU()
        )

        self.pool  = MaxPool2d(kernel_size = 2, stride = 2, padding = 0)
    
    def forward(self, x):

        x = self.block(x)
        
        skip = x

        x = self.pool(x)

        return x, skip

class BaseBlock(torch.nn.Module):
    def __init__(self, in_ch):
        super(BaseBlock, self).__init__()
        
        self.block = torch.nn.Sequential(
            Conv2d(in_channels=in_ch, out_channels = in_ch, kernel_size=3, stride=1, padding=1),
            #BatchNorm2d(in_ch),
            ReLU(),
            Conv2d(in_channels=in_ch, out_channels = in_ch, kernel_size=3, stride=1, padding=1),
            #BatchNorm2d(in_ch),
            ReLU(),
            ConvTranspose2d(in_channels = in_ch, out_channels= in_ch, kernel_size = 3, padding = 1, output_padding=1 , stride = 2)
        )

    def forward(self, x):
        return self.block(x)


class UpBlock(torch.nn.Module):
    def __init__(self, in_ch , mid_ch = None, out_ch = None, isUpconv = True):
        super(UpBlock, self).__init__()
        
        if out_ch is None: out_ch = in_ch*2 // 4
        if mid_ch is None: mid_ch = out_ch
        #self.skip = skip
        self.isUpconv = isUpconv

        self.block = nn.Sequential(
            Dropout(p = 0.5),
            Conv2d(in_channels = in_ch*2, out_channels = mid_ch, kernel_size = 3, stride= 1, padding=1),
            #BatchNorm2d(mid_ch),
            ReLU(),

            Conv2d(in_channels = mid_ch, out_channels = out_ch, kernel_size = 3, stride= 1, padding=1),
            #BatchNorm2d(out_ch),
            ReLU()
        )

        if isUpconv:
            self.ups1  = ConvTranspose2d(in_channels = out_ch, out_channels= out_ch, kernel_size =3, padding =1, output_padding = 1, stride = 2)
        

    def forward(self, x, skip):

        x = torch.cat((skip, x), dim = 1)
        
        x = self.block(x)

        if self.isUpconv:
            x = self.ups1(x)
        
        return x 

class Unet(torch.nn.Module):
    def __init__(self):
        super(Unet, self).__init__()

        self.down1 = DownBlock(1,56,112)
        self.down2 = DownBlock(112,224)
        self.down3 = DownBlock(224,448)
        self.base  = BaseBlock(448)
        self.up1   = UpBlock(448)
        self.up2   = UpBlock(224)
        self.up3   = UpBlock(112, 112, 56, False)

    def forward(self, x):
        
        x, skip1 = self.down1(x)
        #print('down1: ', x.shape) 
        x, skip2 = self.down2(x)
        #print('down2: ', x.shape)
        x,skip3 = self.down3(x)
        #print('down3: ', x.shape)
        x = self.base(x)
        #print('base: ', x.shape)
        x = self.up1(x,skip3)
        #print('up1: ', x.shape)
        x = self.up2(x,skip2)
        #print('up2: ', x.shape)
        x = self.up3(x,skip1)

        
        return x



import torch 
from torch import nn
class UnetDownBlock(nn.Module):
    def __init__(self, in_channels, out_channels, pooling=True):
        super().__init__()
        self.convs = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(out_channels),
        )
        self.maxpool = nn.MaxPool2d(kernel_size=2)
        
    def forward(self, x):
        out_before_pooling = self.convs(x)
        out = self.maxpool(out_before_pooling)

        return out, out_before_pooling
    
class UnetUpBlock(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.upsample = nn.Upsample(scale_factor=2)
        self.convs = nn.Sequential(
            nn.Conv2d(in_channels * 2, out_channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(out_channels, out_channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.BatchNorm2d(out_channels),
        )
        
    def forward(self, x, x_bridge):
        x_up = self.upsample(x)
        x_concat = torch.cat([x_up, x_bridge], dim=1)
        out = self.convs(x_concat)
        
        return out

class Unet(nn.Module):
    def __init__(self, n_base_channels=64):
        super().__init__()
        self.down_blocks = nn.ModuleList([
            UnetDownBlock(3, n_base_channels),
            UnetDownBlock(n_base_channels, n_base_channels * 2),
            UnetDownBlock(n_base_channels * 2, n_base_channels * 4),
            UnetDownBlock(n_base_channels * 4, n_base_channels * 4),
            UnetDownBlock(n_base_channels * 4, n_base_channels * 4)
        ])
        self.up_blocks = nn.ModuleList([
            UnetUpBlock(n_base_channels * 4, n_base_channels * 4),
            UnetUpBlock(n_base_channels * 4, n_base_channels * 2),
            UnetUpBlock(n_base_channels * 2, n_base_channels),
            UnetUpBlock(n_base_channels, n_base_channels),
        ])
        self.final_block = nn.Sequential(
            nn.Conv2d(n_base_channels, 1, kernel_size=3, padding=1),
            nn.Sigmoid()
        )
            
        
    def forward(self, x):
        out = x
        outputs_before_pooling = []
        for i, block in enumerate(self.down_blocks):
            out, before_pooling = block(out)
            outputs_before_pooling.append(before_pooling)
        out = before_pooling
        
        for i, block in enumerate(self.up_blocks):
            out = block(out, outputs_before_pooling[-i - 2])
        out = self.final_block(out)
        
        return out

In [10]:
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

In [8]:
print(unet)

Unet(
  (down_blocks): ModuleList(
    (0): UnetDownBlock(
      (convs): Sequential(
        (0): Conv2d(3, 56, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
        (2): Conv2d(56, 56, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (3): ReLU()
        (4): BatchNorm2d(56, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (maxpool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    )
    (1): UnetDownBlock(
      (convs): Sequential(
        (0): Conv2d(56, 112, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
        (2): Conv2d(112, 112, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (3): ReLU()
        (4): BatchNorm2d(112, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
      (maxpool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
    )
    (2): UnetDownBlock(
      (convs): Sequential(
        (0): Co

In [18]:
unet = Unet()
tens = torch.rand(1,1,1000,1000)

In [20]:
unet = unet.to(device)
tens = tens.to(device)

In [21]:
res = unet(tens)
res.shape

RuntimeError: CUDA out of memory. Tried to allocate 108.00 MiB (GPU 0; 5.81 GiB total capacity; 4.24 GiB already allocated; 64.44 MiB free; 4.30 GiB reserved in total by PyTorch)

In [None]:
res.shape

In [36]:
test = torch.nn.Sequential(
    DownBlock(1,56,112),
    DownBlock(112,224),
    DownBlock(224,448),
    BaseBlock(448),
    
)

In [19]:
skip = torch.rand(10,448,64,64)
test = UpBlock(in_ch = 448)

In [20]:
tens = torch.rand(10, 448, 64,64)
tens.shape

torch.Size([10, 448, 64, 64])

In [22]:
res = test.forward(tens, tens)
res.shape

torch.Size([10, 224, 128, 128])

## Dataset

We will use a simplified version of the [FloodNet Challenge](http://www.classic.grss-ieee.org/earthvision2021/challenge.html).

Compared to the original challenge, our version doesn't have the difficult (and rare) "flooded" labels, and the images are downsampled.

<img src="https://i.imgur.com/RZuVuVp.png" />

## Assignments and grading


- **Part 1. Code**: fill in the gaps (marked with `#TODO`) in the code of the assignment (34 points):
    - `dataset.py` -- 4 points
    - `model.py` -- 22 points
    - `loss.py` -- 6 points
    - `train.py` -- 2 points
- **Part 2. Train and benchmark** the performance of the required models (6 points):
    - All 6 checkpoints are provided -- 3 points
    - Checkpoints have > 0.5 accuracy -- 3 points
- **Part 3. Report** your findings (10 points)
    - Each task -- 2.5 points

- **Total score**: 50 points.

For detailed grading of each coding assignment, please refer to the comments inside the files. Please use the materials provided during the seminar and the lecture for the coding part, as this will help you to further familiarize yourself with PyTorch. Copy-pasting code from Google Search will be penalized.

In part 2, you should upload all your pre-trained checkpoints to your personal Google Drive, grant public access, and provide a file ID, following the instructions in the notebook.

Note that for each task in part 3 to count towards your final grade, you should complete the corresponding tasks in part 2.

For example, if you are asked to compare Model X and Model Y, you should provide the checkpoints for these models in your submission, and their accuracies should be above the minimal threshold.

## Part 1. Code


### `dataset.py`
**TODO: implement and apply data augmentations**

You'll need to study a popular augmentation library, [Albumentations](https://albumentations.ai/), and implement the requested augmentations. Remember that geometric augmentations need to be applied to both images and masks at the same time, and Albumentations has [native support](https://albumentations.ai/docs/getting_started/mask_augmentation/) for that.
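
To see why the image and the mask must share the same geometric transform, here is a minimal NumPy sketch. It is not the Albumentations API: `joint_random_flip_crop` and its parameters are hypothetical names used only for illustration. The random flip and crop parameters are sampled once and applied to both arrays.

```python
import numpy as np


def joint_random_flip_crop(image, mask, crop_size, rng):
    """Apply the same random horizontal flip and crop to an image and its mask."""
    # Sample the random parameters once, so both arrays get the same transform.
    if rng.random() < 0.5:
        image = image[:, ::-1]
        mask = mask[:, ::-1]
    h, w = mask.shape
    top = rng.integers(0, h - crop_size + 1)
    left = rng.integers(0, w - crop_size + 1)
    image = image[top:top + crop_size, left:left + crop_size]
    mask = mask[top:top + crop_size, left:left + crop_size]
    return image, mask


rng = np.random.default_rng(0)
image = rng.random((8, 8, 3))                  # H x W x C image
mask = (image[..., 0] > 0.5).astype(np.int64)  # toy mask derived from the image
aug_image, aug_mask = joint_random_flip_crop(image, mask, crop_size=4, rng=rng)

# The mask still matches the image pixel by pixel after augmentation.
print(np.array_equal(aug_mask, (aug_image[..., 0] > 0.5).astype(np.int64)))
```

With Albumentations, the same effect is obtained by passing both arrays to one composed transform, e.g. `transform(image=image, mask=mask)`, and reading the `'image'` and `'mask'` entries of the returned dictionary.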

### `model.py`
**TODO: Implement the required models.**

Typically, segmentation networks consist of an encoder and a decoder. Below is a diagram of the popular DeepLab v3 architecture:

<img src="https://i.imgur.com/cdlkxvp.png" />

The encoder consists of a convolutional backbone, typically with extensive use of dilated (atrous) convolutions, and a head that further enlarges the receptive field. As you can see, the general idea for encoders is to have as large a receptive field as possible.

The decoder either upsamples with convolutions (as in the scheme above, or in UNets) or simply interpolates the outputs of the encoder.
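
The interpolation-only decoder can be sketched in a few lines. The shapes below (2 images, 8 classes, encoder output at 1/16 of a 256 x 256 input) are assumptions chosen for illustration, not values prescribed by the assignment.

```python
import torch
import torch.nn.functional as F

# Hypothetical shapes: 2 images, 8 classes, encoder output at 1/16 resolution.
coarse_logits = torch.randn(2, 8, 16, 16)

# Bilinear interpolation brings the logits back to the 256 x 256 input size.
full_logits = F.interpolate(coarse_logits, size=(256, 256),
                            mode='bilinear', align_corners=False)

# The per-pixel class prediction is the argmax over the class dimension.
pred_mask = full_logits.argmax(dim=1)
print(full_logits.shape, pred_mask.shape)
```

A learned decoder (transposed or regular convolutions after upsampling) can recover finer boundaries, at the cost of extra parameters and compute.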

In this assignment, you will need to implement **UNet** and **DeepLab** models. An example UNet looks like this:

<img src="https://i.imgur.com/RJyO1rV.png" />

For the **DeepLab** model, we will use three backbone variants: **ResNet18**, **VGG11 (with BatchNorm)**, and **MobileNet v3 (small)**. Use `torchvision.models` to obtain pre-trained versions of these backbones and simply extract their convolutional parts. To familiarize yourself with the **MobileNet v3** model, follow this [link](https://paperswithcode.com/paper/searching-for-mobilenetv3).

We will also use the **Atrous Spatial Pyramid Pooling (ASPP)** head. Its scheme can be seen in the DeepLab v3 architecture above. ASPP is a block that greatly increases the receptive field of the model, and hence boosts the model's performance. For more details, you can refer to this [link](https://paperswithcode.com/method/aspp).
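
As a rough illustration of the idea, the sketch below runs several atrous convolutions in parallel, adds an image-level pooling branch, and fuses everything with a 1x1 convolution. The class name `ASPPSketch`, the dilation rates `(6, 12, 18)`, and the channel sizes are assumptions made for this example; the version required in `model.py` may differ.

```python
import torch
from torch import nn
import torch.nn.functional as F


class ASPPSketch(nn.Module):
    """Illustrative ASPP head: parallel atrous convs plus image-level pooling."""

    def __init__(self, in_channels, out_channels, rates=(6, 12, 18)):
        super().__init__()
        # 1x1 branch keeps the local context.
        branches = [nn.Sequential(
            nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(),
        )]
        # Atrous 3x3 branches: padding equal to dilation keeps the spatial size.
        for rate in rates:
            branches.append(nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=3,
                          padding=rate, dilation=rate, bias=False),
                nn.BatchNorm2d(out_channels),
                nn.ReLU(),
            ))
        self.branches = nn.ModuleList(branches)
        # Image-level branch: global average pooling followed by a 1x1 conv.
        self.pool_branch = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_channels, out_channels, kernel_size=1),
            nn.ReLU(),
        )
        # Fuse all branches back to out_channels.
        self.project = nn.Conv2d(out_channels * (len(rates) + 2),
                                 out_channels, kernel_size=1)

    def forward(self, x):
        outs = [branch(x) for branch in self.branches]
        pooled = self.pool_branch(x)
        # Broadcast the pooled vector back to the feature map size.
        outs.append(F.interpolate(pooled, size=x.shape[-2:],
                                  mode='bilinear', align_corners=False))
        return self.project(torch.cat(outs, dim=1))


head = ASPPSketch(in_channels=64, out_channels=32)
features = torch.randn(2, 64, 16, 16)
print(head(features).shape)
```

Each branch sees the same feature map at a different dilation, so the fused output mixes context from several scales without reducing the spatial resolution.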

### `loss.py`
**TODO: implement test losses.**

For validation, we will use three metrics. 
- Mean intersection over union: **mIoU**,
- Mean class recall: **mRecall**,
- **Accuracy**.

To calculate **IoU**, use this formula for binary segmentation masks for each class, and then average w.r.t. all classes:

$$ \text{IoU} = \frac{ \text{area of intersection} }{ \text{area of union} } = \frac{ \| \hat{m} \cap m  \| }{ \| \hat{m} \cup m \| }, \quad \text{$\hat{m}$ — predicted binary mask},\ \text{$m$ — target binary mask}.$$

For **mRecall**, compute the recall of each class with the following formula, and then average over all classes:

$$
    \text{Recall} = \frac{ \| \hat{m} \cap m \| }{ \| m \| }
$$

**Accuracy** is the fraction of correctly classified pixels in the image.
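
The three metrics can be sketched with NumPy as follows. `segmentation_metrics` is a hypothetical helper, and skipping classes that do not occur in the target is an assumption of this sketch; check the comments in `loss.py` for the convention expected in the assignment.

```python
import numpy as np


def segmentation_metrics(pred, target, num_classes):
    """Return (mIoU, mRecall, accuracy) for integer label maps of equal shape."""
    ious, recalls = [], []
    for c in range(num_classes):
        pred_c = pred == c        # predicted binary mask for class c
        target_c = target == c    # target binary mask for class c
        intersection = np.logical_and(pred_c, target_c).sum()
        union = np.logical_or(pred_c, target_c).sum()
        # Assumption: classes absent from the target are skipped.
        if target_c.sum() == 0:
            continue
        ious.append(intersection / union)
        recalls.append(intersection / target_c.sum())
    accuracy = (pred == target).mean()
    return float(np.mean(ious)), float(np.mean(recalls)), float(accuracy)


pred = np.array([[0, 0], [1, 1]])
target = np.array([[0, 1], [1, 1]])
miou, mrecall, acc = segmentation_metrics(pred, target, num_classes=2)
print(f'mIoU={miou:.3f}, mRecall={mrecall:.3f}, accuracy={acc:.3f}')
# mIoU=0.583, mRecall=0.833, accuracy=0.750
```

In this toy case, class 0 has IoU 1/2 and recall 1, while class 1 has IoU 2/3 and recall 2/3, which gives the averages printed above.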

Generally, we want our models to optimize accuracy, since high accuracy implies that the model makes few mistakes. However, most segmentation problems have imbalanced classes, so models tend to underfit the rare classes while accuracy stays high. Therefore, we also need to measure the mean performance of the model across all classes (mean IoU or mean class recall). In practice, these metrics (not accuracy) are the go-to benchmarks for segmentation models.
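
A small worked example, with numbers assumed purely for illustration, shows the gap. Take a two-class image of 100 pixels, 95 of them background and 5 of a rare class, and a model that predicts background everywhere:

$$
\text{Accuracy} = \frac{95}{100} = 0.95, \qquad
\text{IoU}_{\text{bg}} = \frac{95}{100} = 0.95, \qquad
\text{IoU}_{\text{rare}} = \frac{0}{5} = 0,
$$

$$
\text{mIoU} = \frac{0.95 + 0}{2} = 0.475, \qquad
\text{mRecall} = \frac{1 + 0}{2} = 0.5 .
$$

Accuracy looks excellent even though the rare class is never detected, while both class-averaged metrics expose the failure.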

### `train.py`
**TODO: define optimizer and learning rate scheduler.**

You need to experiment with different optimizers and schedulers and pick the combination that works best. Since the grading is partially based on the validation performance of your models, we strongly advise running some preliminary experiments and picking the configuration with the best results.
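
One way to organize such experiments is to select the optimizer and scheduler by name. The helper below is a hypothetical sketch: the option names (`'adam'`, `'sgd'`, `'cosine'`, `'step'`) and the hyperparameters (momentum 0.9, step size 30, decay factor 0.1) are assumptions for illustration, not recommended settings.

```python
import torch
from torch import nn


def build_optimizer_and_scheduler(params, optimizer='adam', scheduler='cosine',
                                  lr=1e-3, max_epochs=100):
    """Create an optimizer and an LR scheduler from string names."""
    if optimizer == 'adam':
        opt = torch.optim.Adam(params, lr=lr)
    elif optimizer == 'sgd':
        opt = torch.optim.SGD(params, lr=lr, momentum=0.9)
    else:
        raise ValueError(f'Unknown optimizer: {optimizer}')

    if scheduler == 'cosine':
        # Smoothly decays the LR towards zero over max_epochs.
        sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=max_epochs)
    elif scheduler == 'step':
        # Multiplies the LR by 0.1 every 30 epochs.
        sched = torch.optim.lr_scheduler.StepLR(opt, step_size=30, gamma=0.1)
    else:
        raise ValueError(f'Unknown scheduler: {scheduler}')
    return opt, sched


model = nn.Linear(4, 2)
optimizer, scheduler = build_optimizer_and_scheduler(
    model.parameters(), optimizer='sgd', scheduler='step', lr=0.1)
for epoch in range(30):
    optimizer.step()   # stands in for one epoch of training
    scheduler.step()
print(optimizer.param_groups[0]['lr'])
```

In PyTorch Lightning, the pair is typically returned from `configure_optimizers`, for example as `[optimizer], [scheduler]`.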

## Part 2. Train and benchmark

In this part of the assignment, you need to train the following models and measure their training time:
- **UNet** (with and without data augmentation),
- **DeepLab** with **ResNet18** backbone (with **ASPP** = True and False),
- **DeepLab** with the remaining backbones you implemented (with **ASPP** = True).

To get the full mark for this assignment, all the required models should be trained (with their checkpoints provided) and should reach an accuracy above 0.5.

After the models are trained, evaluate their inference time on both GPU and CPU.

Example training and evaluation code is below.

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
import pytorch_lightning as pl
from part1_semantic_segmentation.train import SegModel
import time
import torch



def define_model(model_name: str, 
                 backbone: str, 
                 aspp: bool, 
                 augment_data: bool, 
                 optimizer: str, 
                 scheduler: str, 
                 lr: float, 
                 checkpoint_name: str = '', 
                 batch_size: int = 16):
    assignment_dir = 'part1_semantic_segmentation'
    experiment_name = f'{model_name}_{backbone}_augment={augment_data}_aspp={aspp}'
    model_name = model_name.lower()
    backbone = backbone.lower() if backbone is not None else backbone
    
    model = SegModel(
        model_name, 
        backbone, 
        aspp, 
        augment_data,
        optimizer,
        scheduler,
        lr,
        batch_size, 
        data_path='datasets/tiny-floodnet-challenge', 
        image_size=256)

    if checkpoint_name:
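        # map_location='cpu' lets checkpoints trained on a GPU load on CPU-only machines
        model.load_state_dict(torch.load(f'{assignment_dir}/logs/{experiment_name}/{checkpoint_name}', map_location='cpu')['state_dict'])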
    
    return model, experiment_name

def train(model, experiment_name, use_gpu):
    assignment_dir = 'part1_semantic_segmentation'

    logger = pl.loggers.TensorBoardLogger(save_dir=f'{assignment_dir}/logs', name=experiment_name)

    checkpoint_callback = pl.callbacks.ModelCheckpoint(
        monitor='mean_iou',
        dirpath=f'{assignment_dir}/logs/{experiment_name}',
        filename='{epoch:02d}-{mean_iou:.3f}',
        mode='max')
    
    trainer = pl.Trainer(
        max_epochs=100, 
        gpus=1 if use_gpu else None, 
        benchmark=True, 
        check_val_every_n_epoch=5, 
        logger=logger, 
        callbacks=[checkpoint_callback])

    time_start = time.time()
    
    trainer.fit(model)
    
    if use_gpu:
        torch.cuda.synchronize()  # only valid when CUDA is in use
    time_end = time.time()
    
    training_time = (time_end - time_start) / 60
    
    return training_time

In [None]:
model, experiment_name = define_model(
    model_name='UNet',
    backbone=None,
    aspp=None,
    augment_data=False,
    optimizer='', # use these options to experiment
    scheduler='', # with optimizers and schedulers
    lr=1.) # experiment to find the best LR
training_time = train(model, experiment_name, use_gpu=False)

print(f'Training time: {training_time:.3f} minutes')

After training, the loss curves and validation images with their segmentation masks can be viewed using the TensorBoard extension:

In [None]:
%load_ext tensorboard
%tensorboard --logdir part1_semantic_segmentation/logs

Inference time can be measured via the following function:

In [None]:
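def calc_inference_time(model, device, input_shape=(1000, 750), num_iters=100):
    timings = []
    use_cuda = torch.device(device).type == 'cuda'

    with torch.no_grad():  # gradients are not needed when timing inference
        for i in range(num_iters):
            x = torch.randn(1, 3, *input_shape).to(device)
            if use_cuda:
                torch.cuda.synchronize()  # make sure the input transfer has finished
            time_start = time.time()

            model(x)

            if use_cuda:
                torch.cuda.synchronize()  # wait for the GPU before stopping the timer
            time_end = time.time()

            timings.append(time_end - time_start)

    return sum(timings) / len(timings) * 1e3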


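# define_model returns (model, experiment_name) and builds the checkpoint path
# from model_name, so use the same spelling as in the training call.
model, experiment_name = define_model(
    model_name='UNet',
    backbone=None,
    aspp=None,
    augment_data=False,
    optimizer='', # use the same optimizer,
    scheduler='', # scheduler and LR
    lr=1.,        # as during training
    checkpoint_name=<TODO>)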

inference_time = calc_inference_time(model.eval().cpu(), 'cpu')
# inference_time = calc_inference_time(model.eval().cuda(), 'cuda')

print(f'Inference time (per frame): {inference_time:.3f} ms')

Your trained weights are available in the `part1_semantic_segmentation/logs` folder. Inside, each experiment directory contains a checkpoint file named according to the pattern `{epoch:02d}-{mean_iou:.3f}.ckpt`. Make sure that your models satisfy the accuracy requirements, then upload them to your personal Google Drive. Provide the file IDs and checksums below. Use `!md5sum <PATH>` to compute the checksums.

To make sure that the provided IDs are correct, try running the `!gdown --id <ID>` command from this notebook.

In [None]:
checkpoint_ids = {
    'UNet_None_augment=False_aspp=None': (<ID>, <CHECKSUM>), # TODO
    'UNet_None_augment=True_aspp=None': None, # TODO
    'DeepLab_ResNet18_augment=True_aspp=False': None, # TODO
    'DeepLab_ResNet18_augment=True_aspp=True': None, # TODO
    'DeepLab_VGG11_bn_augment=True_aspp=True': None, # TODO
    'DeepLab_MobileNet_v3_small_augment=True_aspp=True': None, # TODO
}

## Part 3. Report

You should have obtained 6 different models, which we will use for the comparison and evaluation. When asked to visualize specific loss curves, simply configure these plots in TensorBoard, screenshot, store them in the `report` folder, and load into Jupyter markdown:

`<img src="./part1_semantic_segmentation/report/<screenshot_filename>"/>`

If you have problems loading these images, try uploading them [here](https://imgur.com) and using the link as `src`. Do not forget to include the raw files in the `report` folder anyway.

You should make sure that your plots satisfy the following requirements:
- Each plot has a title,
- If there are multiple curves on one plot (or dots on the scatter plot), the plot legend should also be present,
- If the plot is not obtained using TensorBoard (Task 3), the axes should have names and ticks.

#### Task 1.
Visualize training loss and validation loss curves for UNet trained with and without data augmentation. What are the differences in the behavior of these curves between these experiments, and what are the reasons?

TODO

#### Task 2.
Visualize training and validation loss curves for DeepLab with the ResNet18 backbone trained with and without ASPP. Which model performs better?

TODO

#### Task 3.
Compare **UNet** with augmentations and **DeepLab** with all backbones (only experiments with **ASPP**). To do that, put these models on three scatter plots. For the first plot, the x-axis is **training time** (in minutes), for the second plot, the x-axis is **inference time** (in milliseconds), and for the third plot, the x-axis is **model size** (in megabytes). The size of each model is printed by PyTorch Lightning. For all plots, the y-axis is the best **mIoU**. To clarify, each of the **4** requested models should be a single dot on each of these plots.

Which models are the most efficient with respect to each metric on the x-axes? For each of the evaluated models, rate its performance using its validation metrics, training and inference time, and model size. Also, for each model, explain what its advantages are and how its performance could be improved.
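
The scatter plots can be produced with a small matplotlib helper such as the sketch below. `plot_miou_scatter` and the dictionary layout it expects are hypothetical; fill the dictionary with your own measurements.

```python
import matplotlib.pyplot as plt


def plot_miou_scatter(results, x_key, x_label, title):
    """Scatter plot of best mIoU against one cost metric, one dot per model.

    `results` maps a model name to a dict with the keys `x_key` and 'miou'.
    """
    fig, ax = plt.subplots(figsize=(5, 4))
    for name, metrics in results.items():
        # One labeled dot per model, so the legend identifies each point.
        ax.scatter(metrics[x_key], metrics['miou'], label=name)
    ax.set_xlabel(x_label)
    ax.set_ylabel('Best mIoU')
    ax.set_title(title)
    ax.legend()
    return fig
```

Calling it once per x-axis metric, e.g. `plot_miou_scatter(results, 'train_min', 'Training time (minutes)', 'mIoU vs. training time')` with a hypothetical `'train_min'` key, yields plots with a title, named axes, and a legend, as required above.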

TODO

#### Task 4.

Pick the best model according to **mIoU** and look at the visualized predictions on the validation set in TensorBoard. For each segmentation class, find good examples (if they are available) and failure cases. Provide the zoomed-in examples and their analysis below. Please do not attach full validation images, only the areas of interest, which you should crop manually.

TODO