## <font style="color:blue">Project 4: Kaggle Competition - Semantic Segmentation</font>

#### Maximum Points: 100

<div>
    <table>
        <tr><td><h6>Sr. no.</h6></td> <td><h6>Section</h6></td> <td><h6>Points</h6></td> </tr>
        <tr><td><h6>1</h6></td> <td><h6>1.1. Dataset Class</h6></td> <td><h6>7</h6></td> </tr>
        <tr><td><h6>2</h6></td> <td><h6>1.2. Visualize dataset</h6></td> <td><h6>3</h6></td> </tr>
        <tr><td><h6>3</h6></td> <td><h6>2. Evaluation Metrics</h6></td> <td><h6>10</h6></td> </tr>
        <tr><td><h6>4</h6></td> <td><h6>3. Model</h6></td> <td><h6>10</h6></td> </tr>
        <tr><td><h6>5</h6></td> <td><h6>4.1. Train</h6></td> <td><h6>7</h6></td> </tr>
        <tr><td><h6>6</h6></td> <td><h6>4.2. Inference</h6></td> <td><h6>3</h6></td> </tr>
        <tr><td><h6>7</h6></td> <td><h6>5. Prepare Submission CSV</h6></td><td><h6>10</h6></td> </tr>
        <tr><td><h6>8</h6></td> <td><h6>6. Kaggle Profile Link</h6></td> <td><h6>50</h6></td> </tr>
    </table>
</div>

---

<h2>Dataset Description </h2>
<p>The dataset consists of 3,269 images annotated with 12 classes (including background). All images were taken from drones at a variety of scales. Samples are shown below:
<img src="https://github.com/ishann/aeroscapes/blob/master/assets/data_montage.png?raw=true" width="800" height="800">
<p>The data was split into a public train set and a private test set, which is used to evaluate submissions.

In [None]:
!wget -nc <<FILL ME>>

In [None]:
!apt update && apt install -y unzip

In [None]:
!unzip -nq opencv-pytorch-segmentation-project.zip

In [1]:
!pip install -r requirements.txt
!pip uninstall -y opencv-python
!pip install --force-reinstall opencv-python-headless numpy==1.26.*

In [None]:
# Standard Library imports
from pathlib import Path
import os

# External imports
import torch
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from torch.utils.data import DataLoader
from tqdm.autonotebook import tqdm
import segmentation_models_pytorch as smp
from pytorch_toolbelt.utils.rle import rle_encode, rle_to_string
from iterstrat.ml_stratifiers import MultilabelStratifiedShuffleSplit

# Local imports
from semantic_segmentation.train import main
from semantic_segmentation.model import make_deeplabv3_resnet101, unfreeze_deeplabv3_resnet101
from semantic_segmentation.loss import CombinedLoss, SoftDiceLoss
from semantic_segmentation.metrics import DiceScore
from semantic_segmentation.scheduler import get_scheduler
from semantic_segmentation.dataset import SemanticSegmentationDataset
from semantic_segmentation.visualization import bar_plot, draw_batch
from semantic_segmentation.visualization import plot_loss_and_score, plot_score_per_class
from semantic_segmentation.visualization import visualize_classes, draw_predictions
from semantic_segmentation.visualization import visualize_rle_encoding_decoding
from semantic_segmentation.transforms import get_transforms
from semantic_segmentation.utils import LABELS_NAMES_MAP
from semantic_segmentation.utils import extract_and_onehot_encode_classes_from_multilabel_masks
from semantic_segmentation.utils import calculate_class_weights, count_pixels_per_class
from semantic_segmentation.utils import prepare_for_prediction
from semantic_segmentation.utils import ClearCache, count_images_per_class
from semantic_segmentation.optimizer import smart_optimizer

plt.style.use('bmh')

<h2>Configuration</h2>

In [2]:
class Config:
    LOAD_MODEL_NAME = "deeplabv3_best_model.pt"
    
    # Cloud
    DATA_PATH = "/workspace/DLPT-semantic-segmentation/"
    OUTPUT_PATH = "/workspace/DLPT-semantic-segmentation/output"
    INPUT_PATH = "/workspace/DLPT-semantic-segmentation/input"

    # Kaggle
    # DATA_PATH = "/kaggle/input/opencv-pytorch-segmentation-project/"
    # OUTPUT_PATH = "/kaggle/working"
    # INPUT_PATH = "/kaggle/input/model-epoch-13/deeplabv3_model_epoch_13.pkl"
    
    # Training
    STARTING_EPOCH = 0
    EPOCHS = 25
    SCHEDULER = "constant"
    LR = 1e-3  # Used when using a constant lr scheduler
    MAX_LR = 1e-3  # Used when the scheduler is OneCycleLR
    MIN_LR = 1e-5  # Used with CosineAnnealing
    MOMENTUM = 0.937  # SGD momentum / Adam beta1, value taken from YOLOv5
    WEIGHT_DECAY = 0.0001 # optimizer weight decay
    LABEL_SMOOTHING = None
    BATCH_SIZE = 2  # Same value with or without CUDA
    GRADIENT_ACCUMULATION_STEPS = 1
    # Whether to use the auxiliary loss of DeepLabV3.
    # Requires at least layer3 of the backbone to be unfrozen
    USE_AUX = False
    BACKBONE_LAYERS = []
    
    # Data
    TRAIN_SPLIT = 0.8
    MASK_FILL_VALUE = 0  # This is the "background" class
    NUM_CLASSES = 12

    # Infrastructure
    NUM_WORKERS = 4  # There are 4 CPUs in Kaggle
    DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

config = Config()

In [3]:
train_csv_path = Path(config.DATA_PATH) / "train.csv"
test_csv_path = Path(config.DATA_PATH) / "test.csv"

train_original_ids = pd.read_csv(train_csv_path).ImageID
test_ids = pd.read_csv(test_csv_path).ImageID

# <font style="color:green">1. Data Exploration</font>

## <font style="color:green">1.1. Dataset Class [7 Points]</font>

In [None]:
train_original_dataset = SemanticSegmentationDataset(
    config.DATA_PATH,
    "imgs/imgs",
    "masks/masks",
    train_original_ids,
)

X = train_original_ids
y = extract_and_onehot_encode_classes_from_multilabel_masks(train_original_dataset, config.NUM_CLASSES)

In [5]:
msss = MultilabelStratifiedShuffleSplit(n_splits=1, test_size=(1-config.TRAIN_SPLIT), random_state=0)

for train_index, val_index in msss.split(X, y):
    train_ids, valid_ids = X[train_index], X[val_index]

In [6]:
train_transforms, valid_transforms, test_transforms = get_transforms(config.MASK_FILL_VALUE)

train_dataset = SemanticSegmentationDataset(
    config.DATA_PATH,
    "imgs/imgs",
    "masks/masks",
    train_ids.tolist(),
    transforms=train_transforms,
)

valid_dataset = SemanticSegmentationDataset(
    config.DATA_PATH,
    "imgs/imgs",
    "masks/masks",
    valid_ids.tolist(),
    transforms=valid_transforms
)

test_dataset = SemanticSegmentationDataset(
    config.DATA_PATH,
    "imgs/imgs",
    "masks/masks",
    test_ids.tolist(),
    transforms=test_transforms
)

# Reason for drop_last: https://discuss.pytorch.org/t/error-expected-more-than-1-value-per-channel-when-training/26274/5
train_dataloader = DataLoader(train_dataset, batch_size=config.BATCH_SIZE, shuffle=True, drop_last=True, num_workers=config.NUM_WORKERS)
valid_dataloader = DataLoader(valid_dataset, batch_size=config.BATCH_SIZE, shuffle=False, drop_last=True, num_workers=config.NUM_WORKERS)

## <font style="color:green">1.2. Visualize dataset [3 Points]</font>

In [None]:
draw_batch(train_dataset, n_samples=10)

In [None]:
draw_batch(valid_dataset, n_samples=10)

### Visualize each class

In [None]:
visualize_classes(train_dataset, config.NUM_CLASSES)

In [None]:
dataset_count = count_images_per_class(train_original_dataset)

In [11]:
# NOTE: if the train dataset pipeline does random cropping during augmentation,
# there may be fewer classes in an augmented train image than in the original image
# train_count = count_images_per_class(train_dataset)

In [12]:
# valid_count = count_images_per_class(valid_dataset)

In [13]:
# for i in range(config.NUM_CLASSES):
#     print(f"Class {i:02}: total: {dataset_count[i]:04} | train: {train_count[i]:04} | valid: {valid_count[i]:04}")

### Count of pixels per class

In [14]:
# pixel_count_per_class = count_pixels_per_class(train_original_ids, config.DATA_PATH, config.NUM_CLASSES)

In [15]:
# x = list(LABELS_NAMES_MAP.values())
# bar_plot(x, pixel_count_per_class, "Class", "Pixel Count", "Pixels Count per Class")

Prepare weights per class for the loss function
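
One common way to derive such weights is inverse class frequency, so that rare classes contribute more to the loss. The sketch below illustrates that idea only; `inverse_frequency_weights` is a hypothetical helper, the pixel counts are made up, and the project's own `calculate_class_weights` may use a different scheme.

```python
import numpy as np


def inverse_frequency_weights(pixel_counts) -> np.ndarray:
    """Return per-class weights proportional to 1 / class frequency.

    ASSUMPTION: every class has at least one pixel; this is an
    illustration, not the implementation of calculate_class_weights.
    """
    counts = np.asarray(pixel_counts, dtype=np.float64)
    if np.any(counts <= 0):
        raise ValueError("Every class needs at least one pixel")

    frequencies = counts / counts.sum()
    weights = 1.0 / frequencies

    # Normalize so that the weights average to 1, which keeps the
    # overall loss scale roughly unchanged.
    return weights / weights.mean()


# Made-up pixel counts for three classes, from frequent to rare
example_counts = np.array([900, 90, 10])
example_weights = inverse_frequency_weights(example_counts)
print(example_weights)
```

Weights like these can grow very large for tiny classes, so in practice they are often clipped or smoothed (for example with a square root) before being passed to the loss.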

In [16]:
# x = list(LABELS_NAMES_MAP.values())
# class_weights = calculate_class_weights(pixel_count_per_class)
# bar_plot(x, class_weights, "Class", "Class Weight", "Class Weights")

# <font style="color:green">2. Evaluation Metrics [10 Points]</font>

<p>This competition is evaluated on the mean <a href='https://en.wikipedia.org/wiki/Sørensen–Dice_coefficient'>Dice coefficient</a>. The Dice coefficient can be used to compare the pixel-wise agreement between a predicted segmentation and its corresponding ground truth. The formula is given by: </p>

<p>$$DSC =  \frac{2 |X \cap Y|}{|X|+ |Y|}$$
$$ \small \mathrm{where}\ X = Predicted\ Set\ of\ Pixels,\ \ Y = Ground\ Truth $$ </p>
<p>The Dice coefficient is defined to be 1 when both X and Y are empty.</p>
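
As a minimal sketch of the formula above, the following NumPy code computes the Dice coefficient for a pair of binary masks and averages it over classes for a label map. The helper names `dice_coefficient` and `mean_dice` are hypothetical, the example arrays are made up, and the exact averaging used by the competition (per image, per class, or both) is an assumption here. It is not the implementation of the `DiceScore` class used in this notebook.

```python
import numpy as np


def dice_coefficient(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice coefficient between two binary masks of the same shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)

    denominator = pred.sum() + target.sum()
    if denominator == 0:
        # Both X and Y are empty: the score is defined to be 1
        return 1.0

    intersection = np.logical_and(pred, target).sum()
    return float(2.0 * intersection / denominator)


def mean_dice(pred_labels: np.ndarray, target_labels: np.ndarray, num_classes: int) -> float:
    """Average the per-class Dice scores of two label maps.

    ASSUMPTION: a plain mean over classes for a single image.
    """
    scores = [
        dice_coefficient(pred_labels == class_id, target_labels == class_id)
        for class_id in range(num_classes)
    ]
    return float(np.mean(scores))


# Tiny made-up label maps with three classes
pred = np.array([[0, 1, 1], [0, 2, 2]])
target = np.array([[0, 1, 1], [0, 0, 2]])
print(mean_dice(pred, target, num_classes=3))
```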

In [17]:
scorer = DiceScore(num_classes=config.NUM_CLASSES, ignore_index=None).to(config.DEVICE)

# <font style="color:green">3. Model [10 Points]</font>

#### Prepare model

In [18]:
model = make_deeplabv3_resnet101(config.NUM_CLASSES, config.DEVICE)
unfreeze_deeplabv3_resnet101(model, config.BACKBONE_LAYERS)

#### Prepare optimizer

In [19]:
optimizer = torch.optim.SGD(model.parameters(), lr=config.LR, momentum=config.MOMENTUM, nesterov=False)
# optimizer = smart_optimizer(model, "RMSProp", lr=config.LR, momentum=config.MOMENTUM, decay=config.WEIGHT_DECAY)
# optimizer = smart_optimizer(model, "SGD", lr=config.MAX_LR, momentum=config.MOMENTUM, decay=config.WEIGHT_DECAY)

#### Load model and optimizer weights if available

In [None]:
if config.LOAD_MODEL_NAME is not None:
    
    model_path = Path(config.INPUT_PATH, config.LOAD_MODEL_NAME)

    if model_path.exists():
        
        # Load saved model and optimizer parameters
        saved_model = torch.load(model_path, map_location=config.DEVICE, weights_only=True)

        # Load model weights        
        model.load_state_dict(saved_model["model_state_dict"])

        # Load optimizer weights
        optimizer.load_state_dict(saved_model["optimizer_state_dict"])

        print(f"Model loaded: '{model_path}'")
    else:
        raise FileNotFoundError(f"Model checkpoint not found: '{model_path}'")

### Loss function

In [21]:
# Produces a loss around 10
loss_fun1 = SoftDiceLoss(num_classes=config.NUM_CLASSES).to(config.DEVICE)

# Produces a loss of around 2
# Receives logits
loss_fun2 = smp.losses.FocalLoss("multiclass", normalized=False, reduction='mean').to(config.DEVICE)

# CrossEntropyLoss: The `input` is expected to contain the unnormalized logits for each class (which do `not` need
# to be positive or sum to 1, in general)
# loss_fun2 = torch.nn.CrossEntropyLoss(weight=torch.from_numpy(class_weights).to(torch.float32), label_smoothing=config.LABEL_SMOOTHING)

loss_fun = CombinedLoss(loss_fun1, loss_fun2, weight1=1, weight2=40).to(config.DEVICE)

### Scheduler

In [22]:
steps_per_epoch = len(train_dataloader) // config.GRADIENT_ACCUMULATION_STEPS
total_steps = steps_per_epoch * config.EPOCHS
scheduler = get_scheduler(config.SCHEDULER, optimizer, total_steps, max_lr=config.MAX_LR)

# <font style="color:green">4. Train & Inference</font>

## <font style="color:green">4.1. Train [7 Points]</font>

In [None]:
H = main(
    model,
    optimizer,
    scheduler,
    loss_fun,
    scorer,
    train_dataloader,
    valid_dataloader,
    config.STARTING_EPOCH,
    config.EPOCHS,
    config.OUTPUT_PATH,
    config.GRADIENT_ACCUMULATION_STEPS,
    config.BATCH_SIZE,
    config.DEVICE,
    use_aux=config.USE_AUX
)

In [None]:
plot_loss_and_score(H, config.EPOCHS)

In [None]:
plot_score_per_class(H, config.NUM_CLASSES)

## <font style="color:green">4.2. Inference [3 Points]</font>

### Predictions on the validation set

In [None]:
draw_predictions(model, valid_dataset, num_predictions=20, include_mask=True, device=config.DEVICE)

### Predictions on the test set

In [None]:
draw_predictions(model, test_dataset, num_predictions=20, include_mask=False, device=config.DEVICE)

# <font style="color:green">5. Prepare Submission CSV [10 Points]</font>

Format:
```
ImageID,EncodedPixels
01_0,1 1 5 1
01_1,2 3 8 1
02_0,1 1
02_1,3 1
03_0,1 1
03_1,4 5
etc.
```
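
Each row pairs an `ImageID_classId` key with space-separated `start length` runs. The sketch below shows one way such a string could be produced from a binary mask. It assumes 1-indexed start positions and row-major flattening, which is consistent with the sample rows above but is not confirmed by them; `rle_encode_sketch` is a hypothetical helper and not the `pytorch_toolbelt` function used later in this notebook.

```python
import numpy as np


def rle_encode_sketch(mask: np.ndarray) -> str:
    """Encode a binary mask as 'start length' pairs.

    ASSUMPTIONS: starts are 1-indexed and pixels are read in
    row-major order; check the competition rules before relying on this.
    """
    pixels = mask.astype(np.uint8).flatten()

    # Pad with zeros so runs touching either end are detected too
    padded = np.concatenate([[0], pixels, [0]])

    # Indices where the value changes alternate between run starts and run ends
    changes = np.where(padded[1:] != padded[:-1])[0] + 1
    starts = changes[0::2]
    lengths = changes[1::2] - starts

    return " ".join(f"{start} {length}" for start, length in zip(starts, lengths))


# Made-up masks for class 0 and class 1 of a hypothetical image "01"
row_class_0 = f"01_0,{rle_encode_sketch(np.array([1, 0, 0, 0, 1]))}"
row_class_1 = f"01_1,{rle_encode_sketch(np.array([0, 1, 1, 1, 0, 0, 0, 1, 0, 0]))}"
print(row_class_0)
print(row_class_1)
```

A class that is absent from an image yields an empty string, so its row carries only the key and a trailing comma.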

#### Verify RLE encoding
Visually verify that the RLE encoder works correctly by encoding and then decoding a mask
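
For reference, decoding is the inverse operation: each `start length` pair marks a run of foreground pixels in the flattened mask. The following sketch assumes 1-indexed starts and row-major order; `rle_decode_sketch` is a hypothetical helper, and the internals of `visualize_rle_encoding_decoding` may differ.

```python
import numpy as np


def rle_decode_sketch(rle: str, shape: tuple) -> np.ndarray:
    """Rebuild a binary mask of the given shape from 'start length' pairs.

    ASSUMPTIONS: starts are 1-indexed and pixels are laid out in
    row-major order.
    """
    flat = np.zeros(int(np.prod(shape)), dtype=np.uint8)
    values = [int(value) for value in rle.split()]

    # Even positions hold starts, odd positions hold run lengths
    for start, length in zip(values[0::2], values[1::2]):
        flat[start - 1 : start - 1 + length] = 1

    return flat.reshape(shape)


# Decode a made-up encoding into a 2 x 5 mask
decoded = rle_decode_sketch("2 3 8 1", (2, 5))
print(decoded)
```

A numeric check that complements the visual one is to confirm that decoding an encoded mask reproduces the original array exactly.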

In [None]:
image, mask = next(iter(valid_dataset))
visualize_rle_encoding_decoding(image, mask, config.NUM_CLASSES, titles=("Image", "Ground truth mask", "Encoded/decoded mask"))

In [None]:
output_lines = ["ImageID,EncodedPixels"]

model.eval()
with torch.no_grad():
    for image_id in tqdm(test_ids.tolist()):
        image_path = os.path.join(config.DATA_PATH, "imgs/imgs", f"{image_id}.jpg")
        image = prepare_for_prediction(image_path, test_transforms, config.DEVICE)
        pred = model(image)['out']
        pred = pred.argmax(dim=1)
        pred = pred.detach().cpu().numpy()

        for class_id in range(config.NUM_CLASSES):
            class_image = np.zeros_like(pred)
            class_image[pred == class_id] = 1

            pred_rle = rle_to_string(rle_encode(class_image))

            output_line = f"{image_id}_{class_id},{pred_rle}"
            output_lines.append(output_line)

with open(os.path.join(config.OUTPUT_PATH, "submission.csv"), "w") as f:
    out = "\n".join(line.strip() for line in output_lines)
    f.write(out)

In [None]:
pd.read_csv(os.path.join(config.OUTPUT_PATH, "submission.csv"))

# <font style="color:green">6. Kaggle Profile Link [50 Points]</font>

Share your Kaggle profile link here with us so that we can give points for the competition score.

You need a minimum IoU of `0.60` on the test data to get all points. If the IoU is below `0.55`, you will not get any points for this section.

**You must submit `submission.csv` (predictions for the images in `test.csv`) in the `Submit Predictions` tab on Kaggle to receive any evaluation in this section.**
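
The thresholds above are stated as IoU, while section 2 describes the Dice coefficient. For a single pair of masks the two are related by IoU = Dice / (2 - Dice). The sketch below applies that identity; the helper names are hypothetical, and the conversion is only approximate once scores are averaged over classes or images.

```python
def dice_to_iou(dice: float) -> float:
    """Convert a Dice score to IoU for a single pair of masks."""
    return dice / (2.0 - dice)


def iou_to_dice(iou: float) -> float:
    """Convert an IoU score to Dice for a single pair of masks."""
    return 2.0 * iou / (1.0 + iou)


# Dice values corresponding to the two IoU thresholds mentioned above
for iou_threshold in (0.60, 0.55):
    print(f"IoU {iou_threshold:.2f} ~ Dice {iou_to_dice(iou_threshold):.4f}")
```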