# <font style="color:blue">Project 4: Kaggle Competition - Semantic Segmentation</font>

#### Maximum Points: 100

<div>
    <table>
        <tr><td><h3>Sr. no.</h3></td> <td><h3>Section</h3></td> <td><h3>Points</h3></td> </tr>
        <tr><td><h3>1</h3></td> <td><h3>1.1. Dataset Class</h3></td> <td><h3>7</h3></td> </tr>
        <tr><td><h3>2</h3></td> <td><h3>1.2. Visualize dataset</h3></td> <td><h3>3</h3></td> </tr>
        <tr><td><h3>3</h3></td> <td><h3>2. Evaluation Metrics</h3></td> <td><h3>10</h3></td> </tr>
        <tr><td><h3>4</h3></td> <td><h3>3. Model</h3></td> <td><h3>10</h3></td> </tr>
        <tr><td><h3>5</h3></td> <td><h3>4.1. Train</h3></td> <td><h3>7</h3></td> </tr>
        <tr><td><h3>6</h3></td> <td><h3>4.2. Inference</h3></td> <td><h3>3</h3></td> </tr>
        <tr><td><h3>7</h3></td> <td><h3>5. Prepare Submission CSV</h3></td><td><h3>10</h3></td> </tr>
        <tr><td><h3>8</h3></td> <td><h3>6. Kaggle Profile Link</h3></td> <td><h3>50</h3></td> </tr>
    </table>
</div>

---

**In this project, you will participate in the Kaggle competition and also submit the notebook and other code in the course lab.**

**This Kaggle competition is a semantic segmentation challenge.**

<h2>Dataset Description </h2>
<p>The dataset consists of 3,269 images in 12 classes (including background). All images were taken from drones in a variety of scales. Samples are shown below:
<img src="https://github.com/ishann/aeroscapes/blob/master/assets/data_montage.png?raw=true" width="800" height="800">
<p>The data was split into a public train set and a private test set, which is used to evaluate submissions. You can split the public subset into train and validation sets yourself.
Images are named with a unique <code>ImageId</code>. </p>
<p> You should segment and classify the images in the test set.</p>
<p>The dataset consists of landscape images taken from drones in a variety of scales.</p>

**The notebook is divided into sections. You have to write code as mentioned in each section. For other helper functions, you can write `.py` files and import them in the notebook. You have to submit the notebook along with the `.py` files. Your submitted code must run without errors.**

# <font style="color:orange">Project Approach</font>

Use the **[Segmentation Models Pytorch](https://github.com/qubvel/segmentation_models.pytorch)** library with Albumentations.

**To Do List**

- Implement CopyPaste augmentation.
- Explore whether splitting images into overlapping regions is better than training full size images.
- Explore different models, mish activation function, and/or loss functions.
- Explore multistage models and/or ensembles.
- Discuss multi-threading with Ryan to see if the `Submission` class can be sped up.

**Unresolved Issues**

- Training is not deterministic despite attempts to make it so. Is cuDNN benchmarking enabled?

**Articles and Code**

- [Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation](https://arxiv.org/abs/2012.07177v1) [PDF](https://arxiv.org/pdf/2012.07177v1.pdf) [GitHub](https://github.com/conradry/copy-paste-aug)

- [New Deep Learning Optimizer, Ranger: Synergistic combination of RAdam + LookAhead for the best of both.](https://lessw.medium.com/new-deep-learning-optimizer-ranger-synergistic-combination-of-radam-lookahead-for-the-best-of-2dc83f79a48d) [GitHub](https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer)

- [A survey of loss functions for semantic segmentation](https://arxiv.org/abs/2006.14822) [PDF](https://arxiv.org/pdf/2006.14822.pdf) [GitHub](https://github.com/shruti-jadon/Semantic-Segmentation-Loss-Functions) (Keras)

- [Loss Functions for Medical Image Segmentation: A Taxonomy](https://medium.com/@junma11/loss-functions-for-medical-image-segmentation-a-taxonomy-cefa5292eec0) [GitHub](https://github.com/JunMa11/SegLoss) (PyTorch) [Slides](https://docs.google.com/presentation/d/1GEi72Jb7ZpENtTCn0vCmU1FmIZjLeRc9C2W9KYR1hY0)

- [Albumentations](https://albumentations.ai/)

- [Pytorch-toolbelt](https://github.com/BloodAxe/pytorch-toolbelt)

- [Mixed precision training](https://docs.fast.ai/callback.fp16)

- [isort: a tool to format Python imports](https://pycqa.github.io/isort/docs/quick_start/0.-try/)

## <font style="color:orange">Download and Unzip the Dataset</font>

The `download_and_unzip_dataset()` function downloads and unzips the project's dataset. Since this dataset is stored on Kaggle, the function requires the user to manually download their credentials from Kaggle and place the credential file, `kaggle.json`, in the same directory as this notebook. The output of the function will be similar to the following.

```
Requirement already satisfied: kaggle in /root/miniconda/lib/python3.7/site-packages (1.5.10)
Requirement already satisfied: tqdm in /root/miniconda/lib/python3.7/site-packages (from kaggle) (4.47.0)
Requirement already satisfied: python-dateutil in /root/miniconda/lib/python3.7/site-packages (from kaggle) (2.8.1)
...

total 4.0K
-rw-r--r-- 1 root root 68 Feb 20 19:33 kaggle.json

ref                             title               author        lastRunTime          totalVotes  
------------------------------  ------------------  ------------  -------------------  ----------  
kevinkramer/notebookeafa5161c4  notebookeafa5161c4  Kevin Kramer  2021-01-09 21:40:41           0  

Downloading and unzipping dataset ...
Downloading opencv-pytorch-course-segmentation.zip to /home/kevinkramer/class/week12/project4
100%|████████████████████████████████████████| 738M/738M [00:18<00:00, 41.5MB/s]
```

In [None]:
def download_and_unzip_dataset():
    import os
    
    # install kaggle api
    !pip install kaggle
    
    # copy credentials to the proper location
    # note: manually uploaded credential file
    if not os.path.exists("/root/.kaggle/kaggle.json"):
        !mkdir -p /root/.kaggle
        !cp kaggle.json /root/.kaggle/
        !ls -lh /root/.kaggle
        !chmod 600 /root/.kaggle/kaggle.json
        
    # verify kaggle is properly setup
    !kaggle kernels list --user kevinkramer --sort-by dateRun

    # download and unzip data file
    print("Downloading and unzipping dataset ...")
    !mkdir -p ./data
    !kaggle competitions download -c opencv-pytorch-course-segmentation
    !unzip -q opencv-pytorch-course-segmentation.zip -d ./data
    !rm opencv-pytorch-course-segmentation.zip
    

# uncomment the following line to download and unzip the dataset    
# download_and_unzip_dataset()
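
The credential step above relies on notebook shell magics. As a rough sketch, the same copy-and-restrict logic can be written in plain Python so that it also runs outside Jupyter. The helper name `install_kaggle_credentials` and its parameters are hypothetical and not part of this project. The default destination assumes the `~/.kaggle` directory that the Kaggle CLI reads.

```python
import os
import shutil
import stat
from pathlib import Path


def install_kaggle_credentials(src="kaggle.json", dest_dir=None):
    """Copy the Kaggle credential file into dest_dir and restrict its permissions."""
    # Hypothetical helper: defaults to ~/.kaggle when no destination is given.
    dest_dir = Path(dest_dir) if dest_dir is not None else Path.home() / ".kaggle"
    dest = dest_dir / "kaggle.json"
    if not dest.exists():
        dest_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy(src, dest)
    # Equivalent of `chmod 600`: readable and writable by the owner only.
    os.chmod(dest, stat.S_IRUSR | stat.S_IWUSR)
    return dest
```

Calling `install_kaggle_credentials()` with no arguments would mirror the `mkdir`, `cp`, and `chmod` commands in the cell above for the current user.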

## <font style="color:orange">Install Project Dependencies</font>

The `install_project_dependencies()` function installs the project's dependencies. Its output will be similar to the following.

```
Collecting segmentation-models-pytorch
  Downloading segmentation_models_pytorch-0.1.3-py3-none-any.whl (66 kB)
     |████████████████████████████████| 66 kB 2.8 MB/s  eta 0:00:01
Requirement already satisfied: albumentations in /root/miniconda/lib/python3.7/site-packages (0.5.2)
...
```

In [None]:
def install_project_dependencies():
    !pip install -U albumentations
    !pip install -U segmentation-models-pytorch
    
# uncomment the following line to install the project dependencies
# install_project_dependencies()

## <font style="color:orange">Python Imports</font>

**Note:** For convenience, I added albumentations and segmentation-models-pytorch to a Docker image along with Python, PyTorch GPU, Jupyter Notebook, etc.

In [None]:
import os
import random

import cv2
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tools
import torch
from IPython.utils import io
from PIL import Image
from tqdm import tqdm


class classproperty(property):
    def __get__(self, cls, owner):
        return classmethod(self.fget).__get__(None, owner)()
    
tools.utils.seed_system(42)

# <font style="color:green">1. Data Exploration</font>

In this section, you have to write your custom dataset class and visualize a few images (five at most) and their masks.

## <font style="color:green">1.1. Dataset Class [7 Points]</font>

**In this sub-section, write your custom dataset class.**


**Note that there is no separate validation data, so you will have to create your validation set by dividing the train data into train and validation sets. In practice, an `80:20` train-to-validation ratio is common.**

**for example:**

```
class SemSegDataset(Dataset):
    """ Generic Dataset class for semantic segmentation datasets.

        Arguments:
            data_path (string): Path to the dataset folder.
            images_folder (string): Name of the folder containing the images (relative to the data_path).
            masks_folder (string): Name of the folder containing the masks (relative to the data_path).
            csv_path (string): train or test csv file name
            image_ids (list): List of images.
            train_val_test (string): 'train', 'val' or 'test'
            transforms (callable, optional): A function/transform that inputs a sample
                and returns its transformed version.
            class_names (list, optional): Names of the classes.
            

        Dataset folder structure:
            Folder containing the dataset should look like:
            - data_path
            -- images_folder
            -- masks_folder

            Names of images in the images_folder and masks_folder should be the same for same samples.
    """
```

## <font style="color:orange">Aeroscapes Data</font>

The `Data` class follows the [singleton design pattern](https://www.tutorialspoint.com/python_design_patterns/python_design_patterns_singleton.htm). It converts this project's public and private datasets into lists of "image dictionaries". As it processes each entry in the public dataset, it analyzes that entry's mask to compute pixel counts for each class. Pixel counts are cached to a file to speed up subsequent data initialization. This class also provides methods to split the public data into training and validation datasets, as well as to select subsets of each for testing the training pipeline.

In [None]:
class Data(object):
    __instance = None
    
    @classproperty
    def instance(cls):
        if cls.__instance is None:
            Data()
        return cls.__instance
    
    __data_dir = "./data"
    __img_dir = os.path.join(__data_dir, "imgs/imgs")
    __msk_dir = os.path.join(__data_dir, "masks/masks")
    __prv_img_csv_path = os.path.join(__data_dir, "test.csv")
    __pub_img_csv_path = os.path.join(__data_dir, "train.csv")

    @classproperty
    def data_dir(cls):
        return cls.__data_dir

    
    __classes = (        #  Value
    # ----------------   #  -----    
        "Background",    #    0
        "Person",        #    1
        "Bike",          #    2
        "Car",           #    3
        "Drone",         #    4
        "Boat",          #    5
        "Animal",        #    6
        "Obstacle",      #    7
        "Construction",  #    8
        "Vegetation",    #    9
        "Road",          #   10
        "Sky"            #   11
    )
    
    @classproperty
    def classes(cls):
        return cls.__classes
    
    @classproperty
    def num_classes(cls):
        return len(cls.__classes)
    
    @classmethod
    def class_index(cls, class_name):
        return cls.__classes.index(class_name)

    # ToDo: Recalculate split offsets so each set of 5 better spans public data.
    __split_params = [
        (  2, 0.0000), (  3, 0.2000), (  5, 0.4000), (  7, 0.6000), ( 11, 0.8000),
        ( 13, 0.1000), ( 17, 0.3000), ( 19, 0.5000), ( 23, 0.7000), ( 29, 0.9000),
        ( 31, 0.0500), ( 37, 0.1500), ( 41, 0.2500), ( 43, 0.3500), ( 47, 0.4500),
        ( 53, 0.5500), ( 59, 0.6500), ( 61, 0.7500), ( 67, 0.8500), ( 71, 0.9500)
 ]
    
    @classmethod
    def split_params(cls, idx):
        return cls.__split_params[max(int(idx), 0)]
    
    ####################################################################################################
    
    def __init__(self):
        if Data.__instance is not None:
            raise Exception("Data is a singleton. Use the Data.instance class property.")
        else:
            Data.__instance = self
            self.__prv_image_dict_list = self.__parse_image_data_file(Data.__prv_img_csv_path)
            self.__pub_image_dict_list = self.__parse_image_data_file(Data.__pub_img_csv_path)
            random.Random(42).shuffle(self.__pub_image_dict_list)

    def __parse_image_data_file(self, path):
        df = pd.read_csv(path, dtype={'ImageID': 'str'}, engine='python')
        image_dict_list = [self.__create_image_dict(idx, image_id) for idx, image_id in enumerate(df.values[:,0])]
        return image_dict_list

    def __create_image_dict(self, idx, image_id):
        img_dict = {}
        img_path = os.path.join(Data.__img_dir, image_id + ".jpg")
        if not os.path.isfile(img_path):
            raise FileNotFoundError(f"Image '{img_path}' is missing.")
        
        img_dict["id"] = idx
        img_dict["name"] = image_id
        img_dict["image_path"] = img_path
        img_dict["width"] = 1280
        img_dict["height"] = 720

        msk_path = os.path.join(Data.__msk_dir, image_id + ".png")
        if os.path.isfile(msk_path):
            img_dict["mask_path"] = msk_path
            
        return img_dict
    
    @property
    def prv_image_dict_len(self):
        return len(self.__prv_image_dict_list)
            
    @property
    def pub_image_dict_len(self):
        return len(self.__pub_image_dict_list)

    def get_prv_image_dict_list(self, indices=None):
        if indices is None:
            return self.__prv_image_dict_list
        return [self.__prv_image_dict_list[idx] for idx in indices]
    
    def get_pub_image_dict_list(self, indices=None):
        if indices is None:
            return self.__pub_image_dict_list
        return [self.__pub_image_dict_list[idx] for idx in indices]
    
    def split_pub_dataset(self, split_idx=0):
        split_len = len(self.__split_params)
        if split_idx >= split_len:
            raise ValueError(f"split_idx must be an integer whose value is less than {split_len}")

        total_len = len(self.__pub_image_dict_list)
        valid_len = int(0.2 * total_len + 0.5)
        
        _, rel_off = Data.split_params(split_idx)
        abs_off = int(rel_off * total_len + 0.5)
        
        valid_idx = [(abs_off + idx) % total_len for idx in range(valid_len)]
        train_idx = [idx for idx in range(total_len) if idx not in valid_idx]
        
        if split_idx < 0:
            train_idx = self.__create_subset(train_idx, min(-0.01 * split_idx, 1.))
            valid_idx = self.__create_subset(valid_idx, min(-0.01 * split_idx, 1.))
        
        return train_idx, valid_idx
    
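    @classmethod
    def __create_subset(cls, indices, subset_size=0.05):
        # np.unique returns positions into `indices`; map them back to the actual index values
        _, positions = np.unique([int(idx * subset_size) for idx in indices], return_index=True)
        return [indices[pos] for pos in positions]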

## <font style="color:orange">Training/Validation Splitting Analysis</font>

To maximize the benefit of a model ensemble, the following conditions should be true.

* Each model should be trained on different slices of the public data.
* The public data should be equally represented in the aggregated datasets.

The `analyze_data_splitting(splits)` function produced the following output, which validates that the aforementioned criteria are satisfied.

```
Concatentation of 5 Splits
train - len:  10485, min:  4, max:  5, ave:  4.0004
valid - len:   2620, min:  1, max:  1, ave:  1.0000

Concatentation of 10 Splits
train - len:  20970, min:  8, max:  9, ave:  8.0008
valid - len:   5240, min:  1, max:  2, ave:  1.9992

Concatentation of 20 Splits
train - len:  41940, min: 16, max: 17, ave: 16.0015
valid - len:  10480, min:  3, max:  4, ave:  3.9985

Concatentation of 40 Splits
train - len:  83880, min: 32, max: 33, ave: 32.0031
valid - len:  20960, min:  7, max:  8, ave:  7.9969

Concatentation of 80 Splits
train - len: 167760, min: 64, max: 65, ave: 64.0061
valid - len:  41920, min: 15, max: 16, ave: 15.9939
```

In [None]:
def analyze_data_splitting(splits):
    data = Data.instance
    train_indices = []
    valid_indices = []
    for split_idx in range(splits):
        train_idx, valid_idx = data.split_pub_dataset(split_idx)
        train_indices.extend(train_idx)
        valid_indices.extend(valid_idx)

    def print_stats(label, image_indices):
        unique, count = np.unique(image_indices, return_counts=True)
        print(label, end=" - ")
        print(f"len: {len(image_indices):>6}", end=", ")
        print(f"min: {np.min(count):>2}", end=", ")
        print(f"max: {np.max(count):>2}", end=", ")
        print(f"ave: {np.mean(count):7.04f}")

    print(f"Concatentation of {splits} Splits")
    print_stats("train", train_indices)
    print_stats("valid", valid_indices)
    print()
    

# uncomment the following lines to analyze the splitting of the public data    
    
#analyze_data_splitting(5)
#analyze_data_splitting(10)
#analyze_data_splitting(20)
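
The coverage of the validation windows can also be checked without loading any data. The sketch below repeats the rounding used in `split_pub_dataset` for the first five relative offsets. The helper name `validation_window` is hypothetical, and the total of 2621 public images is taken from the progress bar output shown later in this notebook.

```python
from collections import Counter


def validation_window(total_len, rel_off, valid_frac=0.2):
    """Return the wrapped, contiguous block of indices used for validation."""
    valid_len = int(valid_frac * total_len + 0.5)
    abs_off = int(rel_off * total_len + 0.5)
    return [(abs_off + i) % total_len for i in range(valid_len)]


total_len = 2621  # number of public images, per the progress bar output in this notebook
offsets = [0.0, 0.2, 0.4, 0.6, 0.8]  # relative offsets of the first five splits

counts = Counter()
for rel_off in offsets:
    counts.update(validation_window(total_len, rel_off))

# number of distinct validation indices, then min and max occurrences
print(len(counts), min(counts.values()), max(counts.values()))  # prints: 2620 1 1
```

Because 2621 is not divisible by five, rounding leaves one index (1572) out of every validation window. This agrees with the `valid - len: 2620` line and the `max: 5` train count reported above for five splits.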

## <font style="color:orange">Aeroscapes Datasets</font>

...

In [None]:
class ImageCache(object):
    __instance = None
    __cache_path = os.path.join(Data.data_dir, "cache.npy")
    
    @classproperty
    def instance(cls):
        if cls.__instance is None:
            ImageCache()
        return cls.__instance

    def __init__(self):
        if ImageCache.__instance is not None:
            raise Exception("ImageCache is a singleton. Use the ImageCache.instance class property.")
        else:
            # register the singleton so later accesses reuse this instance
            ImageCache.__instance = self
            if os.path.isfile(ImageCache.__cache_path):
                with open(ImageCache.__cache_path, "rb") as f:
                    self._img_data = np.load(f)
                    self._msk_data = np.load(f)               
            else:
                self.__create_cache()
                with open(ImageCache.__cache_path, "wb") as f:
                    np.save(f, self._img_data)
                    np.save(f, self._msk_data)
    
    def image(self, idx):
        return self._img_data[idx]
    
    def mask(self, idx):
        return self._msk_data[idx]
    
    def __create_cache(self):
        image_dict_list = Data.instance.get_pub_image_dict_list()
        images = len(image_dict_list)
        self._msk_data = np.empty((images, 720, 1280), dtype=np.uint8)
        self._img_data = np.empty((images, 720, 1280, 3), dtype=np.uint8)
        pbar = tqdm(image_dict_list, desc="Loading", unit="image")
        for image_dict in pbar:
            idx = image_dict["id"]
            img = cv2.imread(image_dict["image_path"])
            self._img_data[idx, ...] = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
            self._msk_data[idx, ...] = cv2.imread(image_dict["mask_path"], 0)
        pbar.close()


ImageCache.instance

In [None]:
from torch.utils.data import DataLoader
from torch.utils.data import Dataset as TorchDataset

class Dataset(TorchDataset):
    def __init__(
        self,
        image_dict_list = None,
        classes = None,
        pixel_transforms = None,
        geometric_transforms = None,
        tile_transforms_list = None,
        binary_masks = False,
        use_image_cache = False
    ):
        self._image_dict_list = image_dict_list
        self._image_cache = ImageCache.instance if use_image_cache else None
        self._pixel_transforms = pixel_transforms
        self._geometric_transforms = geometric_transforms
        self._tile_transforms_list = tile_transforms_list
        self._binary_masks = binary_masks

        self._classes = classes
        if classes is None:
            self._classes = Data.classes

        self._class_values = [Data.class_index(cls) for cls in self._classes]
        
        self._tiles_per_image = 1
        if tile_transforms_list is not None:
            self._tiles_per_image = len(tile_transforms_list)
    
    @property
    def classes(self):
        return self._classes
    
    @property
    def class_values(self):
        return self._class_values
    
    def __len__(self):
        return len(self._image_dict_list) * self._tiles_per_image
    
    def __getitem__(self, idx):
        idx_dict = idx // self._tiles_per_image
        idx_tile = idx  % self._tiles_per_image
        image_dict = self._image_dict_list[idx_dict]
        
        if self._image_cache:
            idx = image_dict["id"]
            image = self._image_cache.image(idx)
            mask = self._image_cache.mask(idx)
        else:
            image_path = image_dict["image_path"]
            image = cv2.imread(image_path)
            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            mask = None
            if "mask_path" in image_dict:
                mask = cv2.imread(image_dict["mask_path"], 0)

        # ToDo: Test whether Albumentations gracefully handles the case where mask = None.
        if self._geometric_transforms:
            result = self._geometric_transforms(image=image, mask=mask)
            image, mask = result["image"], result["mask"]

        if self._tile_transforms_list:
            result = self._tile_transforms_list[idx_tile](image=image, mask=mask)
            image, mask = result["image"], result["mask"]

        if self._binary_masks:
            mask = [(mask == v) for v in self._class_values]
            mask = np.stack(mask, axis=-1).astype("float32")

        if self._pixel_transforms:
            result = self._pixel_transforms(image=image, mask=mask)
            image, mask = result["image"], result["mask"]

        return image, mask

## <font style="color:green">1.2. Visualize dataset [3 Points]</font>

**In this sub-section, you have to plot a few images and their masks.**

**for example:**

---

<img src="https://www.learnopencv.com/wp-content/uploads/2020/04/c3-w12-data-sample.png">

---

## <font style="color:orange">Visualizer</font>

The `Visualizer` class uses the same color map as the Aeroscapes visualizations.

In [None]:
class Visualizer(object):
    __class_colors = (    #  Class 
    #------------------   #  -------------
        (  0,   0,   0),  #  Background
        (192, 128, 128),  #  Person
        (  0, 128,   0),  #  Bike
        (128, 128, 128),  #  Car
        (128,   0,   0),  #  Drone
        (  0,   0, 128),  #  Boat
        (192,   0, 128),  #  Animal
        (192,   0,   0),  #  Obstacle
        (192, 128,   0),  #  Construction
        (  0,  64,   0),  #  Vegetation
        (128, 128,   0),  #  Road
        (  0, 128, 128)   #  Sky
    )
    
    @classproperty
    def class_colors(cls):
        return cls.__class_colors
    
    @classmethod
    def create_color_mask(cls, mask):       
        return np.array(Visualizer.__class_colors)[mask.ravel()].reshape((*mask.shape, 3))
    
    @classmethod
    def _process_image(cls, name, image):
        is_path = isinstance(image, str)
        is_data = isinstance(image, np.ndarray)
        is_tensor = isinstance(image, torch.Tensor)
        if not is_path and not is_data and not is_tensor:
            raise ValueError(f"{name} must be a string path, numpy.ndarray, or torch.Tensor.")
            
        if "mask" in name:
            if is_tensor:
                image = image.cpu().numpy()
            elif is_path:
                image = cv2.imread(image, 0)
            image = cls.create_color_mask(image)
        elif is_tensor:
            image = image.cpu().numpy()
        elif is_path:
            image = cv2.imread(image)
            image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            
        return image
    
    @classmethod
    def visualize_images(cls, **images):
        """Plot images in one row."""
        n = len(images)
        plt.figure(figsize=(14, 5))
        for i, (name, image) in enumerate(images.items()):
            plt.subplot(1, n, i + 1)
            plt.xticks([])
            plt.yticks([])
            plt.title(' '.join(name.split('_')).title())
            plt.imshow(cls._process_image(name, image))                   
        plt.show()
        

for image_dict in Data.instance.get_pub_image_dict_list(range(3)):
    Visualizer.visualize_images(image=image_dict["image_path"], mask=image_dict["mask_path"])

## <font style="color:orange">Dataset Analysis</font>

The images in the Aeroscapes dataset are 1280 x 720 pixels (720p). Since semantic segmentation classifies each pixel in an image, it is useful to see whether the classes are balanced. The `ClassRepresentation.compute_class_counts()` method processes all the masks in the public dataset and tallies, for each class, the number of images that contain it and the number of pixels that belong to it. Since this dataset is very imbalanced, `plot_pixel_histogram(counts)` plots the pixel counts on a log scale, while `plot_image_histogram(counts)` plots the image counts on a linear scale. Each bar is colored according to its class color.

```
Analyzing: 100%|██████████| 2621/2621 [00:27<00:00, 94.36mask/s]
```

In [None]:
class ClassRepresentation(object):
    __pub_msk_csv_path = os.path.join(Data.data_dir, "train_mask.csv")

    @classmethod
    def get_mask_pxl_counts(cls):
        header = ("ImageID",) + Data.classes
        image_dict_list = Data.instance.get_pub_image_dict_list()
        image_dict_list = sorted(image_dict_list, key = lambda d: d["id"])
        
        def process(image_dict, data):
            counts = data[1:]
            assert image_dict["name"] == data[0]
            image_dict["mask_pixel_counts"] = counts
            return counts
        
        def analyze(image_dict):
            mask = cv2.imread(image_dict["mask_path"], 0)
            classes, counts = np.unique(mask, return_counts=True)
            d = {cls: count for cls, count in zip(classes, counts)}
            counts = [d.get(k, 0) for k in range(Data.num_classes)]
            image_dict["mask_pixel_counts"] = counts
            return [image_dict["name"]] + counts

        # if pixel counts are saved, then load them from the csv file
        if os.path.isfile(ClassRepresentation.__pub_msk_csv_path):
            df = pd.read_csv(ClassRepresentation.__pub_msk_csv_path, usecols=header, dtype={'ImageID': str})
            return [process(d, list(r)) for d, r in zip(image_dict_list, df.values)]
        
        # otherwise, compute pixel counts
        pbar = tqdm(image_dict_list, desc="Processing", unit="mask")
        analysis = [analyze(d) for d in pbar]
        pbar.close()
        
        # and save them to a csv file
        df = pd.DataFrame(analysis)
        df.to_csv(ClassRepresentation.__pub_msk_csv_path, index=False, header=("ImageID",)+Data.classes)
        return [counts[1:] for counts in analysis]

    @classmethod
    def compute_class_counts(cls):
        pub_mask_pxl_counts = cls.get_mask_pxl_counts()
        num_classes = Data.num_classes
        image_totals = np.array([0] * num_classes)
        pixel_totals = np.array([0] * num_classes)
        for counts in pub_mask_pxl_counts:
            image_totals += np.array([int(np.sign(count)) for count in counts])
            pixel_totals += np.array(counts)
        return image_totals, pixel_totals

    @classmethod
    def _plot_class_counts(cls, counts, divisor, title, yscale, ratio_approx):
        classes = Data.classes
        centers = range(Data.num_classes)
        counts = (np.array(counts) * 100 / divisor).tolist()
        colors = (np.array(Visualizer.class_colors) / 255).tolist()

        data = list(zip(classes, counts, colors))
        data.sort(key=lambda tup: tup[1], reverse=True)
        classes, counts, colors = zip(*data)

        ratio = int(np.max(counts) / np.min(counts) + 0.5)
        ratio = int(ratio / ratio_approx + 0.5) * ratio_approx
        
        fig = plt.figure(figsize=(14, 6))
        plt.title(f"{title} (max:min ~ {ratio}:1)")
        plt.bar(centers, counts, align="center", tick_label=classes, color=colors)
        plt.xticks(rotation=30, ha="right")
        plt.xlabel("Class")
        plt.yscale(yscale)
        plt.ylabel("%")
        plt.grid(True, which="both", axis="y", linestyle=':')
        plt.gca().set_axisbelow(True)
        plt.tight_layout()
        plt.show()
        
    @classmethod
    def plot_image_histogram(cls, counts):
        divisor = Data.instance.pub_image_dict_len
        cls._plot_class_counts(counts, divisor, "Image Occurrence Histogram", "linear", 10)

    @classmethod
    def plot_pixel_histogram(cls, counts):
        divisor = np.sum(counts)
        cls._plot_class_counts(counts, divisor, "Pixel Count Histogram", "log", 100)

        
image_counts, pixel_counts = ClassRepresentation.compute_class_counts()
#image_counts = [2621, 2223, 1091, 762, 264, 49, 73, 1732, 1455, 2414, 2294, 388]
#pixel_counts = [555900697, 10402499,  1531710,   8283476,    650353,    548987,
#                  1663688, 14487197, 91507497, 854898289, 763651425, 111987782]

ClassRepresentation.plot_image_histogram(image_counts)
ClassRepresentation.plot_pixel_histogram(pixel_counts)

## <font style="color:orange">Class Exploration</font>

Given this dataset's significant class imbalance, mitigating techniques will likely need to be employed. The `get_images_containing(cls_name)` function returns a list of image dictionaries for the image/mask pairs that contain the specified class, sorted by that class's pixel count in descending order.

In [None]:
def get_images_containing(cls_name):
    cls_idx = Data.class_index(cls_name)
    img_dict_list = Data.instance.get_pub_image_dict_list()
    img_dict_list = [img_dict for img_dict in img_dict_list if img_dict["mask_pixel_counts"][cls_idx] > 0]
    img_dict_list.sort(key=lambda img_dict: img_dict["mask_pixel_counts"][cls_idx], reverse=True)
    return img_dict_list


boat_dict_list = get_images_containing("Boat")
random.shuffle(boat_dict_list)
for image_dict in boat_dict_list[:3]:
    Visualizer.visualize_images(boat=image_dict["image_path"], mask=image_dict["mask_path"])
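
One common mitigation for class imbalance is to weight each class inversely to its pixel frequency. The sketch below is an illustration only, not necessarily the technique used in this project. The helper name `inverse_frequency_weights` is hypothetical, and the pixel totals are copied from the commented-out `pixel_counts` values in the Dataset Analysis cell.

```python
# Class names and pixel totals as listed earlier in this notebook.
CLASSES = (
    "Background", "Person", "Bike", "Car", "Drone", "Boat",
    "Animal", "Obstacle", "Construction", "Vegetation", "Road", "Sky",
)
PIXEL_COUNTS = (
    555900697, 10402499, 1531710, 8283476, 650353, 548987,
    1663688, 14487197, 91507497, 854898289, 763651425, 111987782,
)


def inverse_frequency_weights(pixel_counts):
    """Weight each class by total / (num_classes * count), so rare classes weigh more."""
    total = sum(pixel_counts)
    num_classes = len(pixel_counts)
    return [total / (num_classes * count) for count in pixel_counts]


weights = inverse_frequency_weights(PIXEL_COUNTS)

# Show the three classes that receive the largest weights.
ranked = sorted(zip(CLASSES, weights), key=lambda pair: pair[1], reverse=True)
for name, weight in ranked[:3]:
    print(f"{name:<12} {weight:10.2f}")
```

Weights of this kind could be passed to a weighted loss function. Whether that helps here would need to be tested, since very large weights on rare classes can destabilize training.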

# <font style="color:green">2. Evaluation Metrics [10 Points]</font>

<p>This competition is evaluated on the mean <a href='https://en.wikipedia.org/wiki/Sørensen–Dice_coefficient'>Dice coefficient</a>. The Dice coefficient can be used to compare the pixel-wise agreement between a predicted segmentation and its corresponding ground truth. The formula is given by: </p>

<p>$$DSC =  \frac{2 |X \cap Y|}{|X|+ |Y|}$$
$$ \small \mathrm{where}\ X = Predicted\ Set\ of\ Pixels,\ \ Y = Ground\ Truth $$ </p>
<p>The Dice coefficient is defined to be 1 when both X and Y are empty.</p>
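
As a small worked example of the definition, the sketch below treats $X$ and $Y$ as Python sets of pixel indices. The function name `dice_coefficient` is an illustrative assumption and is not the metric implementation used later in this notebook.

```python
def dice_coefficient(pred, truth):
    """Dice coefficient between two collections of pixel indices."""
    pred, truth = set(pred), set(truth)
    # By definition, the coefficient is 1 when both sets are empty.
    if not pred and not truth:
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))


# Two of the four predicted pixels overlap the four ground-truth pixels.
print(dice_coefficient({1, 2, 3, 4}, {3, 4, 5, 6}))  # prints: 0.5
print(dice_coefficient(set(), set()))                # prints: 1.0
```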

**In this section, you have to implement the dice coefficient evaluation metric.**

## <font style="color:orange">Evaluation Metrics</font>

The `IouAndDiceMetrics` class computes the _mean_ and _per class_ IoU and Dice coefficients using tensor operations. Since a batch of predicted and ground truth masks is already loaded on the GPU for the loss calculation, computing the IoU and Dice coefficients on the GPU is very efficient. Although a whole batch is processed at once, the _per class_ coefficients are computed separately for each predicted/ground truth pair, which honors the special case where the coefficient is unity when $X$ and $Y$ are both empty.

Since both IoU and Dice coefficients can be computed from the intersection $(|X \cap Y|)$ between, and cardinality $(|X| + |Y|)$ of, the predictions and ground truths, it is more efficient to compute them simultaneously rather than separately. 

$$|X \cup Y| = |X| + |Y| - |X \cap Y|$$

$$Dice = \frac{2 |X \cap Y| + \epsilon}{|X| + |Y| + \epsilon}$$

$$IoU = \frac{|X \cap Y| + \epsilon}{|X \cup Y| + \epsilon} = \frac{|X \cap Y| + \epsilon}{|X| + |Y| - |X \cap Y| + \epsilon}$$
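
Ignoring $\epsilon$ and assuming $|X| + |Y| > 0$, the two coefficients are also functions of one another. Writing $I = |X \cap Y|$ and $C = |X| + |Y|$ (shorthand introduced only for this derivation):

$$Dice = \frac{2I}{C} \;\Rightarrow\; I = \frac{C \cdot Dice}{2}$$

$$IoU = \frac{I}{C - I} = \frac{C \cdot Dice / 2}{C - C \cdot Dice / 2} = \frac{Dice}{2 - Dice}$$

For example, a Dice coefficient of $0.8$ corresponds to an IoU of $0.8 / 1.2 = 2/3$.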

In [None]:
class IouAndDiceMetrics(object):
    # ToDo: If needed, add ignore classes to constructor.
    def __init__(self, num_classes, image_width=1280, image_height=720, eps=1e-7):
        self._eps = eps
        self._num_classes = num_classes
        self._image_width = image_width
        self._image_height = image_height
        self.reset()
        
    def reset(self):
        self._n = 0
        self._iou_totals = np.array([0.] * self._num_classes)
        self._dice_totals = np.array([0.] * self._num_classes)
        
    def add(self, y_pred, y_true):
        from torch.nn import functional as f
        batch_size, num_classes, image_height, image_width = tuple(y_true.shape)
        
        # remove padding so it does not impact metrics
        if image_width > self._image_width:
            y_pred = y_pred.narrow(3, 0, self._image_width)
            y_true = y_true.narrow(3, 0, self._image_width)
        if image_height > self._image_height:
            y_pred = y_pred.narrow(2, 0, self._image_height)
            y_true = y_true.narrow(2, 0, self._image_height)
                
        # flatten height and width and one-hot encode prediction
        # (reshape rather than view: narrowing the width leaves the tensors non-contiguous)
        y_pred = y_pred.reshape(batch_size, num_classes, -1)
        y_true = y_true.reshape(batch_size, num_classes, -1)
        y_pred = f.one_hot(y_pred.argmax(dim=1), num_classes).transpose(1, 2).type_as(y_true)
        
        # compute the intersection (|y_pred * y_true|) and cardinality (|y_pred| + |y_true|)
        cardinality = torch.sum(y_pred + y_true, dim=2)
        intersection = torch.sum(y_pred * y_true, dim=2)
        
        # compute "per class" IoU and Dice coefficients for each prediction and sum over batch
        iou = torch.sum((intersection + self._eps) / (cardinality - intersection + self._eps), dim=0)
        dice = torch.sum((2. * intersection + self._eps) / (cardinality + self._eps), dim=0)
        
        # update the count and totals
        self._n += batch_size
        self._iou_totals += iou.data.cpu().numpy()
        self._dice_totals += dice.data.cpu().numpy()
        
    @property
    def iou(self):
        with np.errstate(divide='ignore', invalid='ignore'):
            iou = self._iou_totals / self._n
            return np.nanmean(iou), iou
        
    @property
    def dice(self):
        with np.errstate(divide='ignore', invalid='ignore'):
            dice = self._dice_totals / self._n
            return np.nanmean(dice), dice

# <font style="color:orange">Framework</font>

Several classes were developed to experiment with different models. 

These classes are good candidates to move into the custom tools module.

In [None]:
import sys
import yaml
from segmentation_models_pytorch import DeepLabV3Plus, FPN
from segmentation_models_pytorch.encoders import get_preprocessing_fn

## <font style="color:orange">Utility Classes</font>

Instances of the `FixedLengthIterator` class wrap another iterator, e.g., a data loader, to provide an iterator of a specified length. If necessary, the wrapped iterator will cycle and repeat itself.
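
The cycling behavior can be pictured with the standard library alone. This is a minimal sketch of the same idea, not the class defined below; the helper name `fixed_length` is hypothetical.

```python
import itertools


def fixed_length(iterable, length):
    """Yield exactly `length` items, restarting `iterable` when it runs out."""
    return itertools.islice(itertools.cycle(iterable), length)


# A three-item "loader" stretched to seven steps repeats from the start.
batches = list(fixed_length(["a", "b", "c"], 7))

# A length shorter than the wrapped iterable simply truncates it.
short = list(fixed_length(["a", "b", "c"], 2))
```

Note that `itertools.cycle` replays a saved copy of the items from the first pass, so when it wraps a shuffling data loader the repeated passes reuse the first-pass order rather than reshuffling.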

Instances of the `SimpleStatistics` class allow values to be added and perform simple statistics, e.g., minimum, maximum, mean, exponential moving average, etc.
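
For reference, the exponential moving average follows the usual recurrence $ema_t = \alpha x_t + (1 - \alpha)\, ema_{t-1}$, seeded with the first value. The sketch below works through it for three values with $\alpha = 0.3$; the function name is hypothetical and the values are made up for illustration.

```python
def exponential_moving_average(values, alpha=0.3):
    """Return the EMA of `values`, seeded with the first value."""
    ema = float("nan")
    for i, x in enumerate(values):
        ema = x if i == 0 else alpha * x + (1.0 - alpha) * ema
    return ema


losses = [1.0, 2.0, 3.0]
ema = exponential_moving_average(losses)   # 1.0 -> 1.3 -> 1.81
mean = sum(losses) / len(losses)           # 2.0
```

A larger `alpha` weights recent values more heavily, so the EMA reacts faster than the plain mean to a change in the loss.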

In [None]:
class FixedLengthIterator(object):
    def __init__(self, it, length):
        import itertools
        self._it = itertools.cycle(it)
        self._length = length

    def __len__(self):
        return self._length

    def __iter__(self):
        self._index = 0
        return self

    def __next__(self):
        self._index += 1
        if self._index > self._length:
            raise StopIteration()
        return next(self._it)


class SimpleStatistics(object):
    def __init__(self, alpha=0.3):
        self._alpha = alpha
        self.reset()

    def reset(self):
        self._n = 0
        self._ema = 0.
        self._total = 0.
        self._values = []

    @property
    def samples(self):
        return self._n

    @property
    def first(self):
        return self._values[0] if self._n > 0 else float("nan")

    @property
    def last(self):
        return self._values[-1] if self._n > 0 else float("nan")

    @property
    def min(self):
        return min(self._values) if self._n > 0 else float("nan")

    @property
    def max(self):
        return max(self._values) if self._n > 0 else float("nan")

    @property
    def mean(self):
        return self._total / self._n if self._n > 0 else float("nan")

    @property
    def ema(self):
        return self._ema if self._n > 0 else float("nan")

    @property
    def values(self):
        return self._values

    def add(self, x):
        beta = 1. - self._alpha
        self._total += x
        self._values.append(x)
        # exponential moving average, seeded with the first value
        self._ema = self._alpha * x + beta * self._ema if self._n > 0 else x
        self._n += 1
        return self.mean


## <font style="color:orange">Configuration</font>

Instances of the `Config` class store experiment specific details, e.g., which model and encoder to use, which dataset split to use, which transformations to use during training and validation, learning rates, number of epochs, whether to anneal the best model when the trainer is finished or loses patience, etc.
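
The nested merge is what allows an experiment to override a few settings while inheriting the rest from the defaults. The sketch below shows that behavior on plain dictionaries; `deep_merge` is a hypothetical stand-in for `Config.merge`, and the override values are made up for illustration.

```python
def deep_merge(base, override):
    """Recursively copy `override` into `base`, descending into nested dicts."""
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(base.get(key), dict):
            deep_merge(base[key], value)
        else:
            base[key] = value
    return base


defaults = {
    "MODEL": {"NAME": "FPN", "ENCODER": {"NAME": "efficientnet-b3"}},
    "SOLVER": {"MAX_LR": 1.0e-3, "PATIENCE": 5},
}
overrides = {"MODEL": {"NAME": "DeepLabV3Plus"}, "SOLVER": {"PATIENCE": 10}}
cfg = deep_merge(defaults, overrides)
```

After the merge, `cfg["MODEL"]["NAME"]` and `cfg["SOLVER"]["PATIENCE"]` take the override values while the encoder name and learning rate keep their defaults. Unlike this sketch, `Config.merge` raises a `ValueError` when a dictionary would replace a non-dictionary value.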

In [None]:
class Config(dict):
    def __init__(self, d=None):
        if isinstance(d, dict):
            self.merge(d)
        elif d is not None:
            raise ValueError("Parameter d must be None or a dictionary.")
    
    def merge(self, d):
        for k, v in d.items():
            if not isinstance(v, dict):
                self[k] = v
            else:
                if k not in self:
                    self[k] = Config(v)
                else:
                    if not isinstance(self[k], dict):
                        raise ValueError(f"Cannot merge dictionary {k} with a non-dictionary value.")
                    self[k].merge(v)
    
    def __getattr__(self, name):
        if name in self:
            return self[name]
        raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value

    def __str__(self):
        def _enumerate(d, prefix=""):
            output = []
            for k, v in d.items():
                if isinstance(v, dict):
                    output.extend(_enumerate(v, prefix + k + '.'))
                else:
                    output.append(f"{prefix + k}: {v}")
            return output
        return '\n'.join(_enumerate(self))

    def __repr__(self):
        return "{}({})".format(self.__class__.__name__, super().__repr__())

    @classmethod
    def from_dict(cls, d):
        return Config(d)
    
    @classmethod
    def from_yaml(cls, s):
        return Config(cls.__cleanup_yaml(yaml.safe_load(s)))
        
    @classmethod
    def __cleanup_yaml(cls, d):
        for k, v in d.items():
            if isinstance(v, dict):
                cls.__cleanup_yaml(v)
            elif isinstance(v, str):
                if v.lower() == "none":
                    d[k] = None
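                elif v.startswith('('):
                    # parse tuple literals such as "(10.0, 50.0)" without executing arbitrary code
                    import ast
                    try:
                        d[k] = ast.literal_eval(v)
                    except Exception:
                        pass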
        return d
        
    @classmethod
    def default(cls):
        return cls.from_yaml(cls.__default)
        
    __default = """
        EXPERIMENT:
          PREFIX: "PREFIX"
          SUFFIX: "SUFFIX"
        MODEL:
          NAME: "FPN"
          ABBR: "FPN"
          ENCODER: 
            NAME: "efficientnet-b3"
            ABBR: "EFF3"
            WEIGHTS: "imagenet"
          ACTIVATION: None
        DATASETS:
          SPLIT_INDEX: 0
          CLASSES: "all"
          AUGMENTATION:
            HORIZONTAL_FLIP:
              ENABLE: False
              PROBABILITY: 0.5
            SHIFT_SCALE_ROTATE:
              ENABLE: False
              PROBABILITY: 0.5
              SHIFT_LIMIT: 0.0625
              SCALE_LIMIT: 0.1
              ROTATE_LIMIT: 10
            TILE:
              ENABLE: False
              WIDTH: 416
              HEIGHT: 224
              SETS: [[1280, 720, 4, 4], [640, 360, 2, 2]]
            COPY_PASTE:
                ENABLE: False
            COLOR:
              ENABLE: False
              PROBABILITY: 0.9
              CLAHE:
                ENABLE: True
                PROBABILITY: 1.0
                CLIP_LIMIT: 4.0
              BRIGHTNESS_CONTRAST:
                ENABLE: True
                PROBABILITY: 1.0
                BRIGHTNESS_LIMIT: 0.2
                CONTRAST_LIMIT: 0.2
                BRIGHTNESS_BY_MAX: True
              HUE_SATURATION_VALUE:
                ENABLE: True
                PROBABILITY: 1.0
                HUE_SHIFT_LIMIT: 20
                SAT_SHIFT_LIMIT: 30
                VAL_SHIFT_LIMIT: 20
            GAUSSIAN_NOISE:
                ENABLE: False
                PROBABILITY: 0.2
                VAR_LIMIT: (10.0, 50.0)
                MEAN: 0.0
                PER_CHANNEL: True
        DATALOADERS:
          TRAIN:
            BATCH_SIZE: 2
            NUM_WORKERS: 8
            SHUFFLE: True
            PIN_MEMORY: True
            DROP_LAST: True
          VALID:
            BATCH_SIZE: 4
            NUM_WORKERS: 8
            SHUFFLE: False
            PIN_MEMORY: True
            DROP_LAST: False
          VISUALIZE:
            BATCH_SIZE: 8
            NUM_WORKERS: 8
            SHUFFLE: True
            PIN_MEMORY: False
            DROP_LAST: True
        SOLVER:
          MAX_LR: 1.0e-3
          MIN_LR: 1.0e-8
          PATIENCE: 5
          NUM_EPOCHS: 100
          ANN_EPOCHS: 4
        """

## <font style="color:orange">Experiment Data</font>

Instances of the `ExpData` class ...

In [None]:
class ExpData(object):
    def __init__(self, cfg):
        self._seed, _ = Data.split_params(cfg.DATASETS.SPLIT_INDEX)
        tools.utils.seed_system(self._seed)
        self._cfg = cfg
        self._test_net_loader = None
        self._test_viz_loader = None
        self._test_net_dataset = None
        self._test_viz_dataset = None
        self._valid_net_loader = None
        self._valid_viz_loader = None
        self._valid_net_dataset = None
        self._valid_viz_dataset = None
        self._train_net_loader = None
        self._train_viz_loader = None
        self._train_net_dataset = None
        self._train_viz_dataset = None
        self._train_idx, self._valid_idx = Data.instance.split_pub_dataset(cfg.DATASETS.SPLIT_INDEX)
        self._test_image_dict_list = Data.instance.get_prv_image_dict_list()
        self._valid_image_dict_list = Data.instance.get_pub_image_dict_list(self._valid_idx)
        self._train_image_dict_list = Data.instance.get_pub_image_dict_list(self._train_idx)
        self._encoder_preprocessing_fn = get_preprocessing_fn(cfg.MODEL.ENCODER.NAME, cfg.MODEL.ENCODER.WEIGHTS)
        self._classes = cfg.DATASETS.CLASSES
        if self._classes == "all":
            self._classes = Data.classes

    @property
    def test_image_dict_list(self):
        return self._test_image_dict_list
    
    @property
    def valid_image_dict_list(self):
        return self._valid_image_dict_list
    
    @property
    def train_image_dict_list(self):
        return self._train_image_dict_list
    
    @property
    def test_net_dataset(self):
        if self._test_net_dataset is None:
            self._test_net_dataset = ExpData._create_dataset(
                image_dict_list = self._test_image_dict_list, 
                classes = self._classes, 
                augmentation = None, 
                encoder_preprocessing_fn = self._encoder_preprocessing_fn,
                use_image_cache = False
            )
        return self._test_net_dataset
    
    @property
    def test_viz_dataset(self):
        if self._test_viz_dataset is None:
            self._test_viz_dataset = ExpData._create_dataset(
                image_dict_list = self._test_image_dict_list, 
                classes = self._classes, 
                augmentation = None, 
                encoder_preprocessing_fn = None,
                use_image_cache = False
            )
        return self._test_viz_dataset

    @property
    def valid_net_dataset(self):
        if self._valid_net_dataset is None:
            self._valid_net_dataset = ExpData._create_dataset(
                image_dict_list = self._valid_image_dict_list, 
                classes = self._classes, 
                augmentation = None, 
                encoder_preprocessing_fn = self._encoder_preprocessing_fn,
                use_image_cache = True
            )
        return self._valid_net_dataset
    
    @property
    def valid_viz_dataset(self):
        if self._valid_viz_dataset is None:
            self._valid_viz_dataset = ExpData._create_dataset(
                image_dict_list = self._valid_image_dict_list, 
                classes = self._classes, 
                augmentation = None, 
                encoder_preprocessing_fn = None,
                use_image_cache = True
            )
        return self._valid_viz_dataset

    @property
    def train_net_dataset(self):
        if self._train_net_dataset is None:
            self._train_net_dataset = ExpData._create_dataset(
                image_dict_list = self._train_image_dict_list, 
                classes = self._classes, 
                augmentation = self._cfg.DATASETS.AUGMENTATION, 
                encoder_preprocessing_fn = self._encoder_preprocessing_fn,
                use_image_cache = True
            )
        return self._train_net_dataset
    
    @property
    def train_viz_dataset(self):
        if self._train_viz_dataset is None:
            self._train_viz_dataset = ExpData._create_dataset(
                image_dict_list = self._train_image_dict_list, 
                classes = self._classes, 
                augmentation = self._cfg.DATASETS.AUGMENTATION, 
                encoder_preprocessing_fn = None,
                use_image_cache = True
            )
        return self._train_viz_dataset

    @property
    def test_net_loader(self):
        if self._test_net_loader is None:
            self._test_net_loader = ExpData._create_loader(
                dataset = self.test_net_dataset, 
                params = self._cfg.DATALOADERS.VALID
            )
        return self._test_net_loader

    @property
    def test_viz_loader(self):
        if self._test_viz_loader is None:
            self._test_viz_loader = ExpData._create_loader(
                dataset = self.test_viz_dataset, 
                params = self._cfg.DATALOADERS.VISUALIZE
            )
        return self._test_viz_loader
    
    @property
    def valid_net_loader(self):
        if self._valid_net_loader is None:
            self._valid_net_loader = ExpData._create_loader(
                dataset = self.valid_net_dataset, 
                params = self._cfg.DATALOADERS.VALID
            )
        return self._valid_net_loader

    @property
    def valid_viz_loader(self):
        if self._valid_viz_loader is None:
            self._valid_viz_loader = ExpData._create_loader(
                dataset = self.valid_viz_dataset, 
                params = self._cfg.DATALOADERS.VISUALIZE
            )
        return self._valid_viz_loader

    @property
    def train_net_loader(self):
        if self._train_net_loader is None:
            self._train_net_loader = ExpData._create_loader(
                dataset = self.train_net_dataset, 
                params = self._cfg.DATALOADERS.TRAIN
            )
        return self._train_net_loader

    @property
    def train_viz_loader(self):
        if self._train_viz_loader is None:
            self._train_viz_loader = ExpData._create_loader(
                dataset = self.train_viz_dataset, 
                params = self._cfg.DATALOADERS.VISUALIZE
            )
        return self._train_viz_loader

    @classmethod
    def _create_dataset(cls, image_dict_list, classes, augmentation, encoder_preprocessing_fn, use_image_cache):
        pixel_transforms, geometric_transforms, tile_transforms_list = cls._create_transforms(augmentation, encoder_preprocessing_fn)
        return Dataset(
            image_dict_list,
            classes = classes,
            pixel_transforms = pixel_transforms,
            geometric_transforms = geometric_transforms,
            tile_transforms_list = tile_transforms_list,
            binary_masks = encoder_preprocessing_fn is not None,
            use_image_cache = use_image_cache
        )

    @classmethod
    def _create_loader(cls, dataset, params):
        return DataLoader(
            dataset = dataset, 
            batch_size = params.BATCH_SIZE,
            shuffle = params.SHUFFLE,
            num_workers = params.NUM_WORKERS,
            pin_memory = params.PIN_MEMORY,
            drop_last = params.DROP_LAST
        )

    @classmethod
    def _create_transforms(cls, augmentation=None, encoder_preprocessing_fn=None):
        import albumentations as albu
        
        def transpose(x, **kwargs):
            return x.transpose(2, 0, 1).astype('float32')    
        
        pixel_transforms = []
        geometric_transforms = None
        tile_transforms_list = None
        
        # ToDo: Add CopyPaste transform
        if augmentation:
            
            ####################################################################################################
            
            geometric_transforms = []
            
            if augmentation.HORIZONTAL_FLIP.ENABLE:
                geometric_transforms.append(albu.HorizontalFlip(p=augmentation.HORIZONTAL_FLIP.PROBABILITY))
            
            if augmentation.SHIFT_SCALE_ROTATE.ENABLE:
                geometric_transforms.append(albu.ShiftScaleRotate(
                    shift_limit = augmentation.SHIFT_SCALE_ROTATE.SHIFT_LIMIT,
                    scale_limit = augmentation.SHIFT_SCALE_ROTATE.SCALE_LIMIT,
                    rotate_limit = augmentation.SHIFT_SCALE_ROTATE.ROTATE_LIMIT,
                    p = augmentation.SHIFT_SCALE_ROTATE.PROBABILITY
                ))
                
            geometric_transforms = albu.Compose(geometric_transforms) if len(geometric_transforms) > 0 else None
            
            ####################################################################################################
            
            if augmentation.TILE.ENABLE:
                
                tile_width = augmentation.TILE.WIDTH
                tile_height = augmentation.TILE.HEIGHT
                tile_transforms_list = []
                for image_width, image_height, cols, rows in augmentation.TILE.SETS:
                    for row in range(rows):
                        # max(..., 1) avoids division by zero when a set has a single row or column
                        y_min = row * (image_height - tile_height) // max(rows - 1, 1)
                        y_max = y_min + tile_height # - 1
                        for col in range(cols):
                            x_min = col * (image_width - tile_width) // max(cols - 1, 1)
                            x_max = x_min + tile_width # - 1
                            tile_transforms = []
                            if image_width != 1280 or image_height != 720:
                                tile_transforms.append(albu.Resize(
                                    height = image_height,
                                    width = image_width,
                                    always_apply = True
                                ))
                            tile_transforms.append(albu.Crop(
                                x_min = x_min,
                                y_min = y_min,
                                x_max = x_max,
                                y_max = y_max,
                                always_apply = True
                            ))
                            tile_transforms_list.append(albu.Compose(tile_transforms))
            
            ####################################################################################################

            if augmentation.COLOR.ENABLE:
                
                color_transforms = []
                
                if augmentation.COLOR.CLAHE.ENABLE:
                    color_transforms.append(albu.CLAHE(
                        clip_limit = augmentation.COLOR.CLAHE.CLIP_LIMIT,
                        p = augmentation.COLOR.CLAHE.PROBABILITY
                    ))
                    
                if augmentation.COLOR.BRIGHTNESS_CONTRAST.ENABLE:
                    color_transforms.append(albu.RandomBrightnessContrast(
                        brightness_limit = augmentation.COLOR.BRIGHTNESS_CONTRAST.BRIGHTNESS_LIMIT,
                        contrast_limit = augmentation.COLOR.BRIGHTNESS_CONTRAST.CONTRAST_LIMIT,
                        brightness_by_max = augmentation.COLOR.BRIGHTNESS_CONTRAST.BRIGHTNESS_BY_MAX,
                        p = augmentation.COLOR.BRIGHTNESS_CONTRAST.PROBABILITY
                    ))
                    
                if augmentation.COLOR.HUE_SATURATION_VALUE.ENABLE:
                    color_transforms.append(albu.HueSaturationValue(
                        hue_shift_limit = augmentation.COLOR.HUE_SATURATION_VALUE.HUE_SHIFT_LIMIT,
                        sat_shift_limit = augmentation.COLOR.HUE_SATURATION_VALUE.SAT_SHIFT_LIMIT,
                        val_shift_limit = augmentation.COLOR.HUE_SATURATION_VALUE.VAL_SHIFT_LIMIT,
                        p = augmentation.COLOR.HUE_SATURATION_VALUE.PROBABILITY
                    ))
            
                if len(color_transforms) > 0:
                    pixel_transforms.append(albu.OneOf(color_transforms, p=augmentation.COLOR.PROBABILITY))
            
            # !!! update augmentation module !!!
            if augmentation.GAUSSIAN_NOISE.ENABLE:
                pixel_transforms.append(albu.GaussNoise(
                    var_limit = augmentation.GAUSSIAN_NOISE.VAR_LIMIT,
                    mean = augmentation.GAUSSIAN_NOISE.MEAN,
                    #per_channel = augmentation.GAUSSIAN_NOISE.PER_CHANNEL,
                    p = augmentation.GAUSSIAN_NOISE.PROBABILITY
                ))
        
        # !!! image and mask dimensions must be divisible by 32 !!!
        if augmentation is None or not augmentation.TILE.ENABLE:
            pixel_transforms.append(albu.PadIfNeeded(min_height=736, min_width=1280, always_apply=True))
            
        if encoder_preprocessing_fn:
            pixel_transforms.append(albu.Lambda(image=encoder_preprocessing_fn, name="encoder_preprocessing_fn"))
            pixel_transforms.append(albu.Lambda(image=transpose, mask=transpose, name="transpose"))
            
        pixel_transforms = albu.Compose(pixel_transforms) if len(pixel_transforms) > 0 else None

        return pixel_transforms, geometric_transforms, tile_transforms_list

## <font style="color:orange">Inferencer</font>

Instances of the `Inferencer` class load a trained model and predict the mask of a specified image.
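
`predict` can return the raw model outputs or post-process them. To make the `argmax` and `one_hot` options concrete, here is a small pure-Python sketch for three classes and four pixels; the score values and helper names are invented for illustration, and the class below performs the same steps with tensor operations.

```python
def argmax_per_pixel(scores):
    """scores[c][p] is the score of class c at pixel p; return the best class per pixel."""
    num_classes, num_pixels = len(scores), len(scores[0])
    return [max(range(num_classes), key=lambda c: scores[c][p]) for p in range(num_pixels)]


def one_hot(labels, num_classes):
    """Return masks[c][p] = 1 where pixel p is labeled c, else 0."""
    return [[int(label == c) for label in labels] for c in range(num_classes)]


scores = [
    [2.0, 0.1, 0.3, 0.2],   # class 0
    [0.5, 1.5, 0.2, 0.1],   # class 1
    [0.1, 0.2, 0.9, 0.7],   # class 2
]
labels = argmax_per_pixel(scores)      # one class index per pixel
masks = one_hot(labels, len(scores))   # one binary mask per class
```

The `argmax` form is compact and suits visualization, while the `one_hot` form has one channel per class, the same layout the metrics class expects for its ground truth masks.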

In [None]:
class Inferencer(ExpData):
    _output_dir = "./output"
    _best_model = "best_model.pth"
    _last_model = "last_model.pth"

    def __init__(self, cfg):
        super().__init__(cfg)
        self._model = None
        self._name = Inferencer.name(cfg)
        self._output_dir = Inferencer.output_dir(cfg)
        os.makedirs(self._output_dir, exist_ok=True)
        self._device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        self._init_continued(cfg)
    
    def _init_continued(self, cfg):
        path = os.path.join(self._output_dir, Inferencer._best_model)
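        # map_location lets a checkpoint saved on the GPU load on a CPU-only machine
        checkpoint = torch.load(path, map_location=self._device)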
        self.model.load_state_dict(checkpoint["model_state_dict"])

    @classmethod
    def name(cls, cfg):
        return f"{cfg.EXPERIMENT.PREFIX}-{cfg.MODEL.ABBR}-{cfg.MODEL.ENCODER.ABBR}-{cfg.EXPERIMENT.SUFFIX}"
        
    @classmethod
    def output_dir(cls, cfg):
        return os.path.join(cls._output_dir, cls.name(cfg))
        
    @property
    def device(self):
        return self._device
    
    @property
    def model(self):
        if self._model is None:
            self._model = self._create_model()
            self._model.to(self._device)
        return self._model
    
    def __get_image(self, image_dict, image_dict_list, dataset):
        try:
            idx = image_dict_list.index(image_dict)
            image, _ = dataset[idx]
            return image
        except ValueError:
            return None

    def predict(self, images, activation=None):       
        from torch.nn import functional as f

        # prepare input
        if isinstance(images, torch.Tensor):
            images = images.to(self.device)
        elif isinstance(images, np.ndarray):
            images = torch.from_numpy(images).to(self.device)
        elif isinstance(images, dict):
            image = self.__get_image(images, self.valid_image_dict_list, self.valid_net_dataset)
            if image is None:
                image = self.__get_image(images, self.test_image_dict_list, self.test_net_dataset)
                if image is None:
                    raise ValueError("unknown image dictionary")
            images = torch.from_numpy(image).to(self.device)
        else:
            raise ValueError("images must be a torch.Tensor, np.ndarray, or image dictionary")

        # add batch size of 1 if individual image
        squeeze = False
        dims = len(images.shape)
        if dims == 3:
            squeeze = True
            images = images.unsqueeze(0)
        elif dims < 3 or dims > 4:
            raise ValueError("images must have shape (C, H, W) or (N, C, H, W)")
            
        # get prediction
        self.model.eval()
        with torch.no_grad():        
            probs = self.model(images)
        # crop the rows added by padding (inputs are padded to 736 rows, images have 720)
        probs = probs[:,:,:720,:]
        
        # apply activation
        if activation is None:
            output = probs
        elif activation == "argmax":
            output = probs.argmax(dim=1).byte()
        elif activation == "softmax":
            output = f.softmax(probs, dim=1)
        elif activation == "one_hot":
            batch_size, num_classes, image_height, image_width = tuple(probs.shape)
            probs = probs.view(batch_size, num_classes, -1)
            output = f.one_hot(probs.argmax(dim=1), num_classes).transpose(1, 2).byte()
            output = output.view(batch_size, num_classes, image_height, image_width)
        else:
            raise ValueError("activation should be 'argmax', 'softmax', 'one_hot', or None")
        
        # prepare output
        if squeeze:
            output = output.squeeze(0)

        return output
        
    def _create_model(self):
        model_class = getattr(sys.modules[__name__], self._cfg.MODEL.NAME)
        return model_class(
            encoder_name = self._cfg.MODEL.ENCODER.NAME,
            encoder_weights = self._cfg.MODEL.ENCODER.WEIGHTS,
            classes = len(self._classes),
            activation = self._cfg.MODEL.ACTIVATION
        )
    
    @classmethod
    def _create_progress_loader(cls, description, data_loader):
        return tqdm(
            iterable = data_loader, 
            bar_format = "{l_bar}{bar}| {n_fmt:>4}/{total_fmt:4} [{elapsed}<{remaining}, {rate_fmt}{postfix}]", 
            desc = description, 
            mininterval = 1., 
            unit = "batch"
        )

## <font style="color:orange">Trainer</font>

Instances of the `Trainer` class load a pretrained encoder and fine-tune the model on the Aeroscapes dataset.
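
Part of the fine-tuning loop is deciding when the model has stopped improving. The sketch below mirrors the bookkeeping the trainer uses for this: track the lowest validation loss and count the epochs since it was set. The function name, the loss values, and the stopping rule `epochs_since_best >= patience` are assumptions for illustration.

```python
def epochs_until_patience_lost(valid_losses, patience):
    """Return (best_epoch, stop_epoch); stop_epoch is None if patience never runs out."""
    min_loss = float("inf")
    best_epoch = None
    epochs_since_best = 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < min_loss:
            # new best: this is where a "best model" checkpoint would be saved
            min_loss, best_epoch, epochs_since_best = loss, epoch, 0
        else:
            epochs_since_best += 1
            if epochs_since_best >= patience:
                return best_epoch, epoch
    return best_epoch, None


losses = [0.90, 0.70, 0.65, 0.66, 0.68, 0.67]
best_epoch, stop_epoch = epochs_until_patience_lost(losses, patience=3)
```

With these values the best loss is reached in epoch 3, and three epochs without improvement exhaust the patience in epoch 6.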

In [None]:
class Trainer(Inferencer):
    _runs_dir = "./runs"

    def __init__(self, cfg):
        super().__init__(cfg)
        self._loss_fn = None
        self._optimizer = None
        self._scheduler = None
        self._runs_dir = os.path.join(Trainer._runs_dir, self._name)
        # Instantiating the model, optimizer, and LR scheduler may produce output.
        with io.capture_output():
            model = self.model
            optimizer = self.optimizer
            scheduler = self.scheduler

    def _init_continued(self, cfg):
        # ToDo: load model if continuing training
        pass

    @property
    def loss_fn(self):
        if self._loss_fn is None:
            self._loss_fn = self._create_loss_fn()
        return self._loss_fn
    
    @property
    def optimizer(self):
        if self._optimizer is None:
            self._optimizer = self._create_optimizer()
        return self._optimizer
    
    @property
    def scheduler(self):
        if self._scheduler is None:
            self._scheduler = self._create_scheduler()
        return self._scheduler

    @classmethod
    def _do_train_cycle(
        cls, 
        epoch_num, 
        model,
        device,
        loss_fn,
        optimizer,
        scheduler,
        data_loader,
        metrics = None, 
        loss_stats = None,
        show_progress = False,
        annealing = False
    ):
        cycle_pbar = data_loader
        if show_progress:
            action = "  Training" if not annealing else " Annealing"
            cycle_pbar = cls._create_progress_loader(f"{action} {epoch_num:02d}", data_loader)

        model.train()
        for images, gt_masks in cycle_pbar:
            optimizer.zero_grad()
            images = images.to(device)
            gt_masks = gt_masks.to(device)
            pr_masks = model(images)
            loss = loss_fn(pr_masks, gt_masks)
            loss.backward()
            optimizer.step()
            if scheduler:
                scheduler.step()

            mean_loss = float("nan")
            if loss_stats:
                loss_stats.add(float(loss))
                mean_loss = loss_stats.mean

            mean_dice = float("nan")
            if metrics:
                metrics.add(pr_masks, gt_masks)
                mean_dice, _ = metrics.dice
                
            if show_progress:
                cycle_pbar.set_postfix_str(f"loss={mean_loss:.5f}, dice={mean_dice:.5f}")

        if show_progress:
            cycle_pbar.close()

    @classmethod
    def _do_valid_cycle(
        cls, 
        epoch_num, 
        model,
        device,
        loss_fn,
        data_loader,
        metrics = None, 
        loss_stats = None,
        show_progress = False
    ):
        cycle_pbar = data_loader
        if show_progress:
            cycle_pbar = cls._create_progress_loader(f"Validating {epoch_num:02d}", data_loader)
            
        model.eval()
        for images, gt_masks in cycle_pbar:
            images = images.to(device)
            gt_masks = gt_masks.to(device)
            with torch.no_grad():        
                pr_masks = model(images)
                loss = loss_fn(pr_masks, gt_masks)

            mean_loss = float("nan")
            if loss_stats:
                loss_stats.add(float(loss))
                mean_loss = loss_stats.mean
            
            mean_dice = float("nan")
            if metrics:
                metrics.add(pr_masks, gt_masks)
                mean_dice, _ = metrics.dice
            
            if show_progress:
                cycle_pbar.set_postfix_str(f"loss={mean_loss:.5f}, dice={mean_dice:.5f}")
                
        if show_progress:
            cycle_pbar.close()
    
    def train(self):
        from torch.utils.tensorboard import SummaryWriter 
        tools.utils.seed_system(self._seed)

        def log_results(writer, epoch_num, metrics, loss_stats):
            mean_loss = loss_stats.mean
            mean_iou, per_class_iou = metrics.iou
            mean_dice, per_class_dice = metrics.dice
            writer.add_scalar("loss", mean_loss, epoch_num)
            writer.add_scalar("metrics/mean_iou", mean_iou, epoch_num)
            writer.add_scalar("metrics/mean_dice", mean_dice, epoch_num)
            for name, iou, dice in zip(self._classes, per_class_iou, per_class_dice):
                writer.add_scalar(f"iou/{name}", iou, epoch_num)
                writer.add_scalar(f"dice/{name}", dice, epoch_num)
            
        def do_train_cycle(writer, epoch_num, metrics, loss_stats, annealing):
            metrics.reset()
            loss_stats.reset()
            scheduler = self.scheduler if annealing else None
            Trainer._do_train_cycle(
                epoch_num = epoch_num,
                model = self.model, 
                device = self.device, 
                loss_fn = self.loss_fn, 
                optimizer = self.optimizer, 
                scheduler = scheduler, 
                data_loader = self.train_net_loader,
                metrics = metrics,
                loss_stats = loss_stats,
                show_progress = True,
                annealing = annealing
            )
            log_results(writer, epoch_num, metrics, loss_stats)
            writer.add_scalar("param/lr", self.optimizer.param_groups[0]['lr'], epoch_num)
            return True
        
        def do_valid_cycle(writer, epoch_num, metrics, loss_stats):
            metrics.reset()
            loss_stats.reset()
            Trainer._do_valid_cycle(
                epoch_num = epoch_num,
                model = self.model, 
                device = self.device, 
                loss_fn = self.loss_fn, 
                data_loader = self.valid_net_loader,
                metrics = metrics,
                loss_stats = loss_stats,
                show_progress = True
            )
            log_results(writer, epoch_num, metrics, loss_stats)
        
        def do_checkpoint(writer, loss, min_loss, epochs_since_best):
            if min_loss <= loss:
                epochs_since_best += 1
                self._save_checkpoint(epoch_num, False)
            else:
                min_loss = loss
                epochs_since_best = 0
                self._save_checkpoint(epoch_num, True)
            writer.add_scalar("param/epochs_since_best", epochs_since_best, epoch_num)
            return min_loss, epochs_since_best
        
        min_valid_loss = 1e10
        epochs_since_best = 0
        patience = self._cfg.SOLVER.PATIENCE
        
        train_writer = SummaryWriter(log_dir=os.path.join(self._runs_dir, "train"))
        train_metrics = IouAndDiceMetrics(len(self._classes))
        train_loss_stats = SimpleStatistics()
        
        valid_writer = SummaryWriter(log_dir=os.path.join(self._runs_dir, "valid"))
        valid_metrics = IouAndDiceMetrics(len(self._classes))
        valid_loss_stats = SimpleStatistics()
        
        valid_writer.add_scalar("param/epochs_since_best", 0, 0)
        train_writer.add_scalar("param/lr", self.optimizer.param_groups[0]['lr'], 0)
        
        do_valid_cycle(valid_writer, 0, valid_metrics, valid_loss_stats)
        for epoch_num in range(1, self._cfg.SOLVER.NUM_EPOCHS + 1):
            do_train_cycle(train_writer, epoch_num, train_metrics, train_loss_stats, annealing=False)
            do_valid_cycle(valid_writer, epoch_num, valid_metrics, valid_loss_stats)
            min_valid_loss, epochs_since_best = do_checkpoint(
                valid_writer, 
                valid_loss_stats.mean, 
                min_valid_loss, 
                epochs_since_best
            )
            if patience is not None and epochs_since_best >= patience:
                break
        
        epochs_since_best = 0
        with io.capture_output():
            best_epoch = self._load_checkpoint()
        
        epoch_ann = epoch_num + 1
        epoch_end = epoch_ann + self._cfg.SOLVER.ANN_EPOCHS
        for epoch_num in range(epoch_ann, epoch_end):
            do_train_cycle(train_writer, epoch_num, train_metrics, train_loss_stats, annealing=True)
            do_valid_cycle(valid_writer, epoch_num, valid_metrics, valid_loss_stats)
            min_valid_loss, epochs_since_best = do_checkpoint(
                valid_writer, 
                valid_loss_stats.mean, 
                min_valid_loss, 
                epochs_since_best
            )
    
        train_writer.close()
        valid_writer.close()
        self.model.eval()
    
    def _create_loss_fn(self):
        from tools.losses import DiceLoss
        return DiceLoss(mode="multiclass")
    
    def _create_optimizer(self):
        from tools.optimizers import Ranger
        # ToDo: Investigate whether specifying different LRs for model parameters is helpful.
        # learning_rate = 1e-3
        # [
        #     {'params': model.decoder.parameters(), 'lr': learning_rate}, 
        #     {'params': model.encoder.parameters(), 'lr': 1e-4},
        #     {'params': model.segmentation_head.parameters(), 'lr': learning_rate}
        # ]
        model = self.model
        return Ranger(model.parameters(), self._cfg.SOLVER.MAX_LR)
    
    def _create_scheduler(self):
        from torch.optim.lr_scheduler import CosineAnnealingLR
        return CosineAnnealingLR(
            optimizer = self.optimizer,
            T_max = self._cfg.SOLVER.ANN_EPOCHS * len(self.train_net_loader),
            eta_min = self._cfg.SOLVER.MIN_LR
        )
    
    def _load_checkpoint(self, best=True):
        path = os.path.join(self._output_dir, Inferencer._best_model)
        if not best:
            path = os.path.join(self._output_dir, Inferencer._last_model)
        checkpoint = torch.load(path)
        self.model.load_state_dict(checkpoint["model_state_dict"])
        self.optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
        return checkpoint["epoch"]
    
    def _save_checkpoint(self, epoch, best=False):
        state = {
            "epoch": epoch,
            "model_state_dict": self.model.state_dict(),
            "optimizer_state_dict": self.optimizer.state_dict()
        }
        torch.save(state, os.path.join(self._output_dir, Inferencer._last_model))
        if best:
            torch.save(state, os.path.join(self._output_dir, Inferencer._best_model))

## <font style="color:orange">LR Sweep Test</font>

Instances of the `LrSweepTest` class perform a visual learning rate sweep as proposed by ... and popularized by ...
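
As a minimal sketch of the idea, the sweep raises the learning rate geometrically from `start_lr` to `end_lr` over `num_it` iterations, so that the rates are evenly spaced on a log axis. The function name `sweep_learning_rates` is hypothetical and is not part of the notebook's framework; the parameter names and defaults follow those of `LrSweepTest.run()`.

```python
def sweep_learning_rates(start_lr=1e-7, end_lr=10.0, num_it=100):
    """Return the learning rate used at each sweep iteration (hypothetical helper)."""
    ratio = end_lr / start_lr
    # iteration 0 yields start_lr, iteration num_it - 1 yields end_lr
    return [start_lr * ratio ** (it / (num_it - 1)) for it in range(num_it)]


lrs = sweep_learning_rates(start_lr=1e-5, end_lr=10.0, num_it=200)
print(f"first: {lrs[0]:.3e}, last: {lrs[-1]:.3e}")
```

Because consecutive rates differ by a constant factor, each decade of learning rates receives the same number of iterations.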

In [None]:
class LrSweepTest(object):
    def __init__(self, cfg, lrs=None, losses=None):
        self._cfg = cfg
        self._lrs = lrs
        self._losses = losses
        
    def run(self, start_lr=1e-7, end_lr=10, num_it=100):
        import math
        from torch.optim.lr_scheduler import LambdaLR

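        trainer = Trainer(self._cfg)
        optimizer = trainer.optimizer
        # LambdaLR multiplies the optimizer's initial LR by the value returned by the lambda,
        # so normalize by that LR to make the sweep start at start_lr
        group = optimizer.param_groups[0]
        lr_ratio = start_lr / group.get('initial_lr', group['lr'])
        lr_exp_inc = lambda it: lr_ratio * (end_lr / start_lr) ** (it / (num_it - 1))
        scheduler = LambdaLR(optimizer, lr_exp_inc)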

        model = trainer.model
        device = trainer.device
        loss_fn = trainer.loss_fn
        train_net_loader = trainer.train_net_loader
        valid_net_loader = trainer.valid_net_loader

        self._lrs = []
        self._losses = []
        valid_loss_stats = SimpleStatistics()
        cycle_pbar = Trainer._create_progress_loader(f"LR Sweep", FixedLengthIterator(train_net_loader, num_it))
        for images, gt_masks in cycle_pbar:
            # perform a forward-backward pass
            model.train()
            lr = optimizer.param_groups[0]['lr']
            optimizer.zero_grad()
            images = images.to(device)
            gt_masks = gt_masks.to(device)
            pr_masks = model(images)
            loss = loss_fn(pr_masks, gt_masks)
            loss.backward()
            optimizer.step()
            scheduler.step()

            # compute mean validation loss
            valid_loss_stats.reset()
            Trainer._do_valid_cycle(0, model, device, loss_fn, valid_net_loader, None, valid_loss_stats, False)
        
            # can we stop prematurely?
            if math.isnan(valid_loss_stats.mean):
                break

            self._lrs.append(float(lr))
            self._losses.append(valid_loss_stats.mean)           
            cycle_pbar.set_postfix_str(f"lr={lr:.3e}, loss={valid_loss_stats.mean:.5f}")
                
        cycle_pbar.close()
    
    def _smooth_loss(self, alpha):       
        smooth = []
        loss_ema = self._losses[0]
        for loss in self._losses:
            loss_ema = alpha * loss + (1. - alpha) * loss_ema
            smooth.append(loss_ema)
        return smooth
    
    def load(self):
        import pandas as pd
        path = os.path.join(Inferencer.output_dir(self._cfg), "lr_sweep_test.csv")
        try:
            df = pd.read_csv(path, index_col=False)
            self._lrs = df["lrs"].tolist()
            self._losses = df["losses"].tolist()
            return True
        except Exception:
            # no usable results on disk (e.g. the file is missing or malformed)
            return False
        
    def save(self):
        import pandas as pd
        output_dir = Inferencer.output_dir(self._cfg)
        path = os.path.join(output_dir, "lr_sweep_test.csv")
        os.makedirs(output_dir, exist_ok=True)
        df = pd.DataFrame({'lrs': self._lrs, 'losses': self._losses}) 
        df.to_csv(path, index=False) 
    
    def plot(self, alpha=0.3):
        if self._losses is None or len(self._losses) < 2:
            return
        fig = plt.figure(figsize=(8, 4))
        plt.title(f"LR Sweep Test")
        plt.xlabel("LR")
        plt.ylabel("Loss")
        plt.xscale("log")
        plt.plot(self._lrs, self._losses, color="#a0a0ff", label="raw")
        plt.plot(self._lrs, self._smooth_loss(alpha), color="#0000ff", label=f"ema (α = {alpha:.2f})")
        plt.legend()
        plt.grid(True, which="both", axis="x", linestyle=':')
        plt.grid(True, which="both", axis="y", linestyle=':')
        plt.gca().set_axisbelow(True)
        plt.tight_layout()
        plt.show()
        
    def suggestion(self, alpha=0.3):
        # at least 12 points are needed: 5 are trimmed at each end and the slope needs 2
        if self._losses is None or len(self._losses) < 12:
            return None, None
        lrs = torch.tensor(self._lrs[5:-5])
        losses = torch.tensor(self._smooth_loss(alpha)[5:-5])
        lr_min = lrs[losses.argmin()].item()
        grads = (losses[1:]-losses[:-1]) / (lrs[1:].log()-lrs[:-1].log())
        lr_steep = lrs[grads.argmin()].item()
        return lr_min, lr_steep
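
The behaviour of `suggestion()` can be illustrated on a synthetic loss curve. The sketch below re-implements the smoothing and selection steps with NumPy instead of `torch`; the helper names `smooth` and `suggest` and the shape of the synthetic curve are assumptions made for illustration only, not measured results.

```python
import numpy as np


def smooth(losses, alpha=0.3):
    """Exponential moving average, seeded with the first loss value."""
    ema = losses[0]
    out = []
    for loss in losses:
        ema = alpha * loss + (1.0 - alpha) * ema
        out.append(ema)
    return np.array(out)


def suggest(lrs, losses, alpha=0.3, trim=5):
    """Return (lr_min, lr_steep) after trimming `trim` points at each end."""
    lrs = np.asarray(lrs)[trim:-trim]
    smoothed = smooth(losses, alpha)[trim:-trim]
    lr_min = lrs[smoothed.argmin()]
    # slope of the smoothed loss with respect to log(lr)
    slopes = np.diff(smoothed) / np.diff(np.log(lrs))
    lr_steep = lrs[slopes.argmin()]
    return float(lr_min), float(lr_steep)


# synthetic curve (an assumption): flat, then falling, then diverging
x = np.linspace(-5.0, 1.0, 100)  # log10 of the learning rate
lrs = 10.0 ** x
losses = 1.0 - 0.8 / (1.0 + np.exp(-3.0 * (x + 3.0))) + 0.5 * np.maximum(0.0, x + 1.0) ** 2

lr_min, lr_steep = suggest(lrs, losses)
print(f"lr_min: {lr_min:.3e}, lr_steep: {lr_steep:.3e}")
```

On a curve of this shape, `lr_steep` falls where the loss drops fastest and lies below `lr_min`, the rate at which the smoothed loss is lowest.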

# <font style="color:green">3. Model [10 Points]</font>

**In this section, you have to define your model.**

## <font style="color:orange">Model</font>

The model is specified in the configuration file. The default configuration file uses `FPN` with an `EfficientNet-B3` encoder. Initial experiments used this model. Subsequent experiments used ... Learning rate (LR) sweep tests were performed to select a learning rate at which the model learns quickly.

In [None]:
# cannot recreate w/o code change
def ExpAAA():
    cfg = Config.default()
    cfg.SOLVER.NUM_EPOCHS = 30
    cfg.EXPERIMENT.PREFIX = "AAA"
    cfg.EXPERIMENT.SUFFIX = "1C-FS-NA"
    cfg.DATASETS.SPLIT_INDEX = 0
    cfg.DATALOADERS.TRAIN.BATCH_SIZE = 2
    return cfg

# changed scheduler in code
def ExpAAB():
    cfg = ExpAAA()
    cfg.SOLVER.NUM_EPOCHS = 100
    cfg.EXPERIMENT.PREFIX = "AAB"
    cfg.EXPERIMENT.SUFFIX = "CA-FS-NA"
    return cfg

def ExpAAC():
    cfg = ExpAAB()
    cfg.EXPERIMENT.PREFIX = "AAC"
    cfg.EXPERIMENT.SUFFIX = "CA-FS-DA"
    cfg.DATASETS.AUGMENTATION.HORIZONTAL_FLIP.ENABLE = True
    cfg.DATASETS.AUGMENTATION.SHIFT_SCALE_ROTATE.ENABLE = True
    cfg.DATASETS.AUGMENTATION.COLOR.ENABLE = True
    cfg.DATASETS.AUGMENTATION.GAUSSIAN_NOISE.ENABLE = True
    return cfg

def ExpAAD():
    cfg = ExpAAC()
    cfg.EXPERIMENT.PREFIX = "AAD"
    cfg.EXPERIMENT.SUFFIX = "CA-T1-DA"
    cfg.DATASETS.AUGMENTATION.TILE.ENABLE = True
    cfg.DATASETS.AUGMENTATION.TILE.WIDTH = 640
    cfg.DATASETS.AUGMENTATION.TILE.HEIGHT = 352
    cfg.DATASETS.AUGMENTATION.TILE.SETS = [[1280, 720, 3, 3], [854, 480, 2, 2]]
    cfg.DATALOADERS.TRAIN.BATCH_SIZE = 10
    cfg.DATALOADERS.VALID.BATCH_SIZE = 2
    return cfg

def ExpAAE():
    cfg = ExpAAD()
    cfg.EXPERIMENT.PREFIX = "AAE"
    cfg.MODEL.ENCODER.NAME = "timm-resnest50d"
    cfg.MODEL.ENCODER.ABBR = "RNS50"
    cfg.MODEL.ENCODER.WEIGHTS = "imagenet"
    cfg.DATALOADERS.TRAIN.BATCH_SIZE = 13
    cfg.DATALOADERS.VALID.BATCH_SIZE = 2
    return cfg

def ExpBAA():
    cfg = ExpAAC()
    cfg.EXPERIMENT.PREFIX = "BAA"
    cfg.MODEL.NAME = "DeepLabV3Plus"
    cfg.MODEL.ABBR = "DL3P"
    cfg.DATALOADERS.TRAIN.BATCH_SIZE = 6
    cfg.DATALOADERS.VALID.BATCH_SIZE = 2
    return cfg

def ExpBAB():
    cfg = ExpAAD()
    cfg.EXPERIMENT.PREFIX = "BAB"
    cfg.MODEL.NAME = "DeepLabV3Plus"
    cfg.MODEL.ABBR = "DL3P"
    cfg.DATALOADERS.TRAIN.BATCH_SIZE = 24
    cfg.DATALOADERS.VALID.BATCH_SIZE = 2
    return cfg

In [None]:
def lr_sweep_test(cfg):
    sweeper = LrSweepTest(cfg)
    if not sweeper.load():
        sweeper.run(start_lr=1e-5, end_lr=10, num_it=200)
        sweeper.save()
    sweeper.plot()
    lr_min, lr_steep = sweeper.suggestion()
    print(f"lr_min: {lr_min:.3e}, lr_steep: {lr_steep:.3e}")

# uncomment the following line to run/load a LR sweep test    
# lr_sweep_test(ExpAAB())

# <font style="color:green">4. Train & Inference</font>

- **In this section, you have to train the model and infer on sample data.**


- **You can write your trainer class in this section.**


- **If you use a loss function other than a standard PyTorch loss function, you have to define it in this section.**


- **This section should also have optimizer and LR-scheduler (if using) details.**



## <font style="color:green">4.1. Train [7 Points]</font>

**Write your training code in this sub-section.**


**This section must contain training plots (use matplotlib or share tensorboard.dev scalars logs).**

**You must plot the following:**
- **train loss**


- **validation loss**


- **IoU for all twelve classes (0-11) and the mean IoU of all classes on validation data.** 

**An example of a matplotlib plot:**

---

<img src='https://www.learnopencv.com/wp-content/uploads/2020/04/c3-w12-train-loss.png'>

---

<img src='https://www.learnopencv.com/wp-content/uploads/2020/04/c3-w12-val-loss.png'>

---

<img src='https://www.learnopencv.com/wp-content/uploads/2020/04/c3-w12-mean_iou.png'>

---

<img src='https://www.learnopencv.com/wp-content/uploads/2020/04/c3-w12-iou-0.png'>

---

<center>*</center>
<center>*</center>
<center>*</center>

---

<img src='https://www.learnopencv.com/wp-content/uploads/2020/04/c3-w12-iou-11.png'>

---


## <font style="color:orange">Train</font>

The `Trainer` class, which is defined in the **<font style="color:orange">Framework</font>** section, has a `train()` method that performs the following steps.

- Validate (before training)
- For `cfg.SOLVER.NUM_EPOCHS` epochs
  - Train with a constant `cfg.SOLVER.MAX_LR` learning rate using `Ranger` optimizer
  - Validate
  - Terminate early if the validation loss has not improved for `cfg.SOLVER.PATIENCE` consecutive epochs
- Load best model for annealing
- For `cfg.SOLVER.ANN_EPOCHS` epochs
  - Train with `CosineAnnealingLR` scheduler from `cfg.SOLVER.MAX_LR` to `cfg.SOLVER.MIN_LR` 
  - Validate
  
During training and validation, the mean loss is computed as well as _mean_ and _per class_ IoU and Dice metrics.
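
The early-stopping rule and the annealing schedule described above can be sketched in plain Python. The function names `epochs_until_stop` and `cosine_annealed_lr` are hypothetical and are not part of the framework; the cosine expression is the closed form given in the PyTorch documentation for `CosineAnnealingLR`, and the loss values and learning rates are made-up illustration data.

```python
import math


def epochs_until_stop(valid_losses, patience):
    """Return the epoch at which constant-LR training stops (hypothetical helper)."""
    best = float("inf")
    since_best = 0
    for epoch, loss in enumerate(valid_losses, start=1):
        if loss < best:
            best, since_best = loss, 0
        else:
            since_best += 1
        if patience is not None and since_best >= patience:
            return epoch
    return len(valid_losses)


def cosine_annealed_lr(step, total_steps, max_lr, min_lr):
    """Closed-form cosine annealing from max_lr (step 0) to min_lr (step total_steps)."""
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * step / total_steps))


# made-up validation losses: no improvement after epoch 2
print(epochs_until_stop([1.0, 0.9, 0.95, 0.93, 0.92, 0.8], patience=3))
# learning rate halfway through annealing (made-up max and min rates)
print(cosine_annealed_lr(step=50, total_steps=100, max_lr=1e-3, min_lr=1e-5))
```

In `_create_scheduler()` the total number of steps is `cfg.SOLVER.ANN_EPOCHS * len(self.train_net_loader)`, which indicates that the schedule is sized for one scheduler step per batch rather than per epoch.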

In [None]:
cfg = ExpAAE()
print(cfg)

In [None]:
trainer = Trainer(cfg)
trainer.train()

## <font style="color:green">4.2. Inference [3 Points]</font>

**Plot some sample inference in this sub-section.**

**For example:**

---

<img src='https://www.learnopencv.com/wp-content/uploads/2020/04/c3-w12-sample-predtiction.png'>

---



## <font style="color:orange">Inferencer</font>

...

In [None]:
inferencer = Inferencer(cfg)

def predict(image_dict, visualize=True):
    #pr_mask = inferencer.predict_mask(image_dict)
    pr_mask = inferencer.predict(image_dict, activation="argmax").cpu().numpy()
    if visualize:
        Visualizer.visualize_images(
            image = image_dict["image_path"],
            predicted_mask = pr_mask,
            ground_truth_mask = image_dict["mask_path"]
        )
    return pr_mask

for image_dict in random.sample(inferencer.valid_image_dict_list, 10):
    predict(image_dict, visualize=True)

In [None]:
def evaluate_model(data_loader):
    inferencer.model.eval()
    metrics = IouAndDiceMetrics(len(Data.classes))
    tqdm_iter = tqdm(data_loader)
    for images, gt_masks in tqdm_iter:
        images = images.to(inferencer.device)
        gt_masks = gt_masks.to(inferencer.device)
        with torch.no_grad():        
            pr_masks = inferencer.model(images)
            metrics.add(pr_masks, gt_masks)
        iou, _ = metrics.iou
        dice, _ = metrics.dice
        tqdm_iter.set_postfix(iou=f"{iou:.03f}", dice=f"{dice:.03f}")
    tqdm_iter.close()
    return metrics

def print_metrics(metrics):
    mean_iou, per_class_iou = metrics.iou
    print("  IoU Coefficients ")
    print("-------------------")
    print(f"        Mean: {mean_iou:.03f}")
    for class_name, class_iou in zip(Data.classes, per_class_iou):
        print(f"{class_name:>12}: {class_iou:.03f}")
    print()

    mean_dice, per_class_dice = metrics.dice
    print(" Dice Coefficients ")
    print("-------------------")
    print(f"        Mean: {mean_dice:.03f}")
    for class_name, class_dice in zip(Data.classes, per_class_dice):
        print(f"{class_name:>12}: {class_dice:.03f}")


valid_metrics = evaluate_model(inferencer.valid_net_loader)
print_metrics(valid_metrics)
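
The cell above reports per-class IoU and Dice coefficients. `IouAndDiceMetrics` is defined elsewhere in the notebook, so the following is only a sketch of how both coefficients are commonly derived from a confusion matrix and is not necessarily how that class works. The function name `iou_and_dice` and the example matrix are assumptions.

```python
import numpy as np


def iou_and_dice(confusion):
    """Per-class IoU and Dice from a confusion matrix (rows: ground truth, columns: prediction)."""
    confusion = np.asarray(confusion, dtype=np.float64)
    tp = np.diag(confusion)
    fp = confusion.sum(axis=0) - tp  # predicted as the class, but belongs to another
    fn = confusion.sum(axis=1) - tp  # belongs to the class, but predicted as another
    eps = 1e-12  # avoids division by zero for classes that never occur
    iou = tp / (tp + fp + fn + eps)
    dice = 2.0 * tp / (2.0 * tp + fp + fn + eps)
    return iou, dice


# made-up two-class confusion matrix
iou, dice = iou_and_dice([[3, 1], [2, 4]])
print("IoU: ", iou)
print("Dice:", dice)
```

For a single class the two coefficients are related by `dice = 2 * iou / (1 + iou)`, so Dice is never smaller than IoU.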

# <font style="color:green">5. Prepare Submission CSV [10 Points]</font>

**Write your code to prepare the submission CSV file.**


**Note that in the submission file, you have to write Encoded Pixels.**

[Here is a blog post that explains what Encoded Pixels are.](https://medium.com/analytics-vidhya/generating-masks-from-encoded-pixels-semantic-segmentation-18635e834ad0)

In [None]:
# ToDo: This code runs painfully slow. Fix!
class Submission(object):
    root_dir = "./submissions"
    csv_fields = ("ImageID", "EncodedPixels")
    
    @classmethod
    def __encode(cls, binary_array):
        from itertools import accumulate, groupby

        # compute the length of "0" and "1" pixel runs
        lengths = []
        for i, (value, elements) in enumerate(groupby(binary_array.ravel())):
            if i == 0 and value == 1:
                lengths.append(0)
            lengths.append(len(list(elements)))

        # compute the offsets of the pixel runs
        offsets = list(accumulate([0] + lengths))[:-1]

        # discard the offsets and lengths that correspond to the "0" pixels
        lengths = lengths[1::2]
        offsets = offsets[1::2]

        # interleave the offsets and lengths and convert to a string
        encoding = [val for pair in zip(offsets, lengths) for val in pair]
        return ' '.join(map(str, encoding))
    
    @classmethod
    def create(cls, inferencer):
        import csv
        os.makedirs(Submission.root_dir, exist_ok=True)
        path = os.path.join(Submission.root_dir, inferencer._name + ".csv")
        with open(path, 'w', newline='') as file:
            writer = csv.writer(file)
            writer.writerow(Submission.csv_fields)
            pbar = tqdm(inferencer.test_image_dict_list, desc="Encoding", unit="image")
            for image_dict in pbar:
                name = image_dict["name"]
                pred = inferencer.predict(image_dict, activation="one_hot").cpu().numpy()
                rows = [(f"{name}_{idx}", cls.__encode(ba)) for idx, ba in enumerate(pred)]
                writer.writerows(rows)
            pbar.close()


#Submission.create(inferencer)
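
One possible way to address the speed issue noted in the `ToDo` is to vectorize the run-length encoding with NumPy instead of iterating over pixels with `groupby`. The sketch below is meant to produce the same string as `__encode` above (0-based offsets, row-major pixel order); the function name `rle_encode` is hypothetical and the approach has not been benchmarked here.

```python
import numpy as np


def rle_encode(binary_array):
    """Encode the runs of "1" pixels as "offset length offset length ..."."""
    flat = np.asarray(binary_array).ravel().astype(np.int8)
    # pad with zeros so that runs touching either end still produce a start and an end
    padded = np.concatenate([[0], flat, [0]])
    # indices where the value changes: even entries start a run, odd entries end it
    changes = np.flatnonzero(padded[1:] != padded[:-1])
    starts = changes[0::2]
    lengths = changes[1::2] - starts
    return ' '.join(f"{start} {length}" for start, length in zip(starts, lengths))


mask = np.array([[0, 1, 1],
                 [1, 0, 1]])
print(rle_encode(mask))
```

The per-pixel work happens inside NumPy, and the Python loop only runs once per run of "1" pixels.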

```
import csv, threading
import concurrent.futures

class Submission(object):
    root_dir = "./submissions"
    csv_fields = ("ImageID", "EncodedPixels")
    
    def __init__(self, inferencer):
        os.makedirs(Submission.root_dir, exist_ok=True)
        self.__path = os.path.join(Submission.root_dir, inferencer._name + ".csv")
        self.__lock1 = threading.Lock()
        self.__lock2 = threading.Lock()
        self.__inferencer = inferencer
    
    def __encode(self, binary_array):
        from itertools import accumulate, groupby

        # compute the length of "0" and "1" pixel runs
        lengths = []
        for i, (value, elements) in enumerate(groupby(binary_array.ravel())):
            if i == 0 and value == 1:
                lengths.append(0)
            lengths.append(len(list(elements)))

        # compute the offsets of the pixel runs
        offsets = list(accumulate([0] + lengths))[:-1]

        # discard the offsets and lengths that correspond to the "0" pixels
        lengths = lengths[1::2]
        offsets = offsets[1::2]

        # interleave the offsets and lengths and convert to a string
        encoding = [val for pair in zip(offsets, lengths) for val in pair]
        return ' '.join(map(str, encoding))
    
    def __predict_and_encode(self, image_dict):
        name = image_dict["name"]
        # serialize inference, since the model is shared between the threads
        with self.__lock1:
            pred = self.__inferencer.predict(image_dict, activation="one_hot").cpu().numpy()
        rows = [(f"{name}_{idx}", self.__encode(ba)) for idx, ba in enumerate(pred)]
        # serialize access to the CSV writer and the progress bar
        with self.__lock2:
            self.__writer.writerows(rows)
            self.__pbar.update(1)
    
    def create(self):
        with open(self.__path, 'w', newline='') as file:
            self.__writer = csv.writer(file)
            self.__writer.writerow(Submission.csv_fields)
            image_dict_list = self.__inferencer.test_image_dict_list
            self.__pbar = tqdm(total=len(image_dict_list), desc="Predicting and Encoding", unit="image")
            with concurrent.futures.ThreadPoolExecutor(max_workers=12) as executor:
                futures = [executor.submit(self.__predict_and_encode, image_dict) for image_dict in image_dict_list]
                # wait for all tasks and re-raise any exception raised in a worker thread
                for future in futures:
                    future.result()
            self.__pbar.close()


submission = Submission(inferencer)
submission.create()
```

# <font style="color:green">6. Kaggle Profile Link [50 Points]</font>

Share your Kaggle profile link here with us so that we can give points for the competition score. 

You should have a minimum IoU of `0.60` on the test data to get all points. If the IoU is less than `0.55`, you will not get any points for the section. 

**You must submit `submission.csv` (prediction for images in `test.csv`) in `Submit Predictions` tab in Kaggle to get any evaluation in this section.**