<a href="https://colab.research.google.com/github/AgneseRe/Real-Time-Anomaly-Segmentation-for-Road-Scenes/blob/main/AML_AnomalySegmentation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Real-time Anomaly Segmentation for Road Scenes**

Existing deep neural networks, when deployed in open-world settings, perform poorly on unknown, anomalous, out-of-distribution (OoD) objects that were not present during training. The goal of this project is to build tiny anomaly segmentation models that segment such anomalous patterns. The models must fit on small devices, which represents a realistic memory constraint for an edge application.

## Preparation

In [1]:
!rm -r sample_data/

Install required packages and import useful modules.

In [2]:
%%capture
!pip3 install --quiet numpy
!pip3 install --quiet Pillow

!pip3 install --quiet gdown
!pip3 install --quiet torchvision
!pip3 install --quiet ood_metrics
!pip3 install --quiet cityscapesscripts

!pip3 install --quiet matplotlib
!pip3 install --quiet visdom

import os, sys, subprocess, torch

The following function downloads the *Cityscapes* dataset in one of two ways: via Google Drive (using `gdown`) or directly from the official Cityscapes website (using `csDownload`). The first option is preferable because it is faster; direct download from the website is provided as a fallback. `gdown` may raise the error *Failed to retrieve the file url* if the file is very large (*e.g.* 11 GB), if many users are trying to download it programmatically at the same time, or if it is downloaded many times within a short period. Regardless of the method used, run the converter (available [here](https://github.com/mcordts/cityscapesScripts/blob/master/cityscapesscripts/preparation/createTrainIdLabelImgs.py)) to generate labelTrainIds from labelIds.

In [3]:
def download_cityscapes():

    if not os.path.isdir('/content/Real-Time-Anomaly-Segmentation-for-Road-Scenes/cityscapes'):
        print("Attempting to download cityscapes dataset using gdown...")

        try:
            # If check is true, and the process exits with a non-zero exit code, a CalledProcessError exception will be raised.
            subprocess.run(["gdown", "https://drive.google.com/uc?id=11gSQ9UcLCnIqmY7srG2S6EVwV3paOMEq"], check=True)
            print("Dataset downloaded successfully using gdown. Unzipping...")
            subprocess.run(["unzip", "-q", "cityscapes.zip"], check=True)
            # Use the converter to generate labelTrainIds from labelIds
            print("Generating trainIds from labelIds...")
            !CITYSCAPES_DATASET='cityscapes/' csCreateTrainIdLabelImgs

        except subprocess.CalledProcessError as e:
            print("gdown failed. Attempting to download cityscapes dataset from the official website...")
            try:
              !csDownload leftImg8bit_trainvaltest.zip
              !csDownload gtFine_trainvaltest.zip

              print("Dataset downloaded successfully from the official website. Unzipping...")
              !unzip -q 'leftImg8bit_trainvaltest.zip' -d 'cityscapes'
              !unzip -o -q 'gtFine_trainvaltest.zip' -d 'cityscapes'

              print("Generating trainIds from labelIds...")
              !CITYSCAPES_DATASET='cityscapes/' csCreateTrainIdLabelImgs

              print("Cityscapes dataset ready")

            except Exception as e2:
                print("Failed to download the dataset using both methods.")

Download and unzip the validation dataset (*FS_LostFound_full*, *RoadAnomaly*, *RoadAnomaly21*, *RoadObsticle21*, *fs_static*), clone or update the GitHub repository (*Real-Time-Anomaly-Segmentation-for-Road-Scenes*) and download the *Cityscapes* dataset.

In [4]:
# download and unzip validation dataset
if not os.path.isdir('/content/validation_dataset'):
  !gdown 'https://drive.google.com/uc?id=12YJq48XkCxQHjN3CmLc-zM5dThSak4Ta'
  !unzip -q 'Validation_Dataset.zip'
  !mkdir validation_dataset && cp -pR Validation_Dataset/* validation_dataset/ && rm -R Validation_Dataset/
  !rm 'Validation_Dataset.zip'

# clone the GitHub repo, or pull the latest changes if it is already present
if not os.path.isdir('/content/Real-Time-Anomaly-Segmentation-for-Road-Scenes'):
  !git clone https://github.com/AgneseRe/Real-Time-Anomaly-Segmentation-for-Road-Scenes.git
else: # if folder already present
  !git -C /content/Real-Time-Anomaly-Segmentation-for-Road-Scenes pull

%cd Real-Time-Anomaly-Segmentation-for-Road-Scenes

Downloading...
From (original): https://drive.google.com/uc?id=12YJq48XkCxQHjN3CmLc-zM5dThSak4Ta
From (redirected): https://drive.google.com/uc?id=12YJq48XkCxQHjN3CmLc-zM5dThSak4Ta&confirm=t&uuid=5e6d9077-e749-40c9-981b-0c763276a5e5
To: /content/Validation_Dataset.zip
100% 329M/329M [00:06<00:00, 50.2MB/s]
Cloning into 'Real-Time-Anomaly-Segmentation-for-Road-Scenes'...
remote: Enumerating objects: 1793, done.
remote: Counting objects: 100% (125/125), done.
remote: Compressing objects: 100% (85/85), done.
remote: Total 1793 (delta 55), reused 99 (delta 40), pack-reused 1668 (from 5)
Receiving objects: 100% (1793/1793), 1.95 GiB | 17.40 MiB/s, done.
Resolving deltas: 100% (950/950), done.
Updating files: 100% (444/444), done.
/content/Real-Time-Anomaly-Segmentation-for-Road-Scenes


In [5]:
# download cityscapes dataset - csDownload asks for your own Cityscapes account credentials if downloading from the official site
download_cityscapes()

Attempting to download cityscapes dataset using gdown...
Dataset downloaded successfully using gdown. Unzipping...
Generating trainIds from labelIds...
Processing 5000 annotation files
Progress: 100.0 % 

## Evaluation

### Step 2A

#### Compute AuPRC & FPR95TPR

In [6]:
%cd eval

/content/Real-Time-Anomaly-Segmentation-for-Road-Scenes/eval


Define datasets used for evaluation.

In [7]:
# datasets = os.listdir("../../validation_dataset")
datasets = {
    "SMIYC RA-21": "RoadAnomaly21",
    "SMIYC RO-21": "RoadObsticle21",
    "FS L&F": "FS_LostFound_full",
    "FS Static": "fs_static",
    "Road Anomaly": "RoadAnomaly"
    }

List anomaly detection methods used for evaluation.

In [8]:
methods = ["MSP", "MaxLogit", "MaxEntropy"]
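
As a rough illustration of what the three method names usually refer to, the sketch below computes a per-pixel anomaly score from a vector of class logits. It assumes the common textbook definitions (one minus the maximum softmax probability, the negated maximum logit, and the softmax entropy); the helper names `softmax` and `anomaly_score` are hypothetical, and the repository's `evalAnomaly.py` may differ in sign conventions or normalization.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def anomaly_score(logits, method="MSP"):
    """Return an anomaly score for one pixel; higher means more anomalous."""
    probs = softmax(logits)
    if method == "MSP":
        # Low maximum softmax probability -> high anomaly score.
        return 1.0 - max(probs)
    if method == "MaxLogit":
        # Low maximum (unnormalized) logit -> high anomaly score.
        return -max(logits)
    if method == "MaxEntropy":
        # High entropy of the softmax distribution -> high anomaly score.
        return -sum(p * math.log(p) for p in probs if p > 0.0)
    raise ValueError(f"unknown method: {method}")

confident = [9.0, 1.0, 0.5]   # one class clearly dominates
uncertain = [1.1, 1.0, 0.9]   # no class stands out
for name in ["MSP", "MaxLogit", "MaxEntropy"]:
    print(name, anomaly_score(confident, name), anomaly_score(uncertain, name))
```

With all three scores, the pixel with flat logits is ranked as more anomalous than the pixel with a dominant class.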

Automate running the anomaly detection experiments on multiple datasets with different methods. The evaluation script is invoked with the appropriate parameters: the model name, the training folder that holds the weights, the base folder in which to save the generated plots, and the directory from which to load the model. When `model` is omitted, the pre-trained ERFNet baseline is evaluated.

In [9]:
def run_eval_anomaly(datasets, methods, model = None, training_folder = None, plot_folder = None, load_dir = None) -> None:

  for dataset, folder in datasets.items():
    print(f"Dataset {dataset}")

    for method in methods:
      print(f" - {method:<10} ", end = "")
      input_path = f"../../validation_dataset/{folder}/images/*.*"
      plot_dir_path = f"../plots/losses/{plot_folder}/{folder}_{method}" if model else f"../plots/baselines/{folder}_{method}"

      add_cmd = "--cpu" if not torch.cuda.is_available() else ""

      if model:
        !python evalAnomaly.py --input={input_path} --method={method} --loadModel={model} --loadDir={load_dir} --loadWeights={training_folder}/model_best.pth --plotdir={plot_dir_path} {add_cmd}
      else: # ERFNet pre-trained
        !python evalAnomaly.py --input={input_path} --method={method} --plotdir={plot_dir_path} {add_cmd}

    print("=" * 55, end = "\n")

Evaluate a segmentation model on Cityscapes using specified weights.

In [10]:
def run_eval_iou(model = "erfnet", load_dir = "../trained_models/", training_folder = "erfnet_pretrained.pth", void = False) -> None:

  load_model = f"{model}.py"
  add_cmd = "--cpu" if not torch.cuda.is_available() else ""
  method_flag = "--method void" if void else "" # for void classifier
  !python eval_iou.py --model={model} --loadDir={load_dir} --loadModel={load_model} --loadWeights={training_folder} --datadir /content/Real-Time-Anomaly-Segmentation-for-Road-Scenes/cityscapes {method_flag} {add_cmd}

Perform inference using the pre-trained **ERFNet** model on the provided anomaly segmentation test datasets. Evaluate the results with different techniques: MSP, MaxLogit and MaxEntropy.

In [None]:
run_eval_anomaly(datasets, methods)

Dataset SMIYC RA-21
 - MSP        | AUPRC score: 29.100 | FPR@TPR95: 62.511
 - MaxLogit   | AUPRC score: 38.320 | FPR@TPR95: 59.337
 - MaxEntropy | AUPRC score: 31.005 | FPR@TPR95: 62.593
Dataset SMIYC RO-21
 - MSP        | AUPRC score: 2.712 | FPR@TPR95: 64.974
 - MaxLogit   | AUPRC score: 4.627 | FPR@TPR95: 48.443
 - MaxEntropy | AUPRC score: 3.052 | FPR@TPR95: 65.600
Dataset FS L&F
 - MSP        | AUPRC score: 1.748 | FPR@TPR95: 50.763
 - MaxLogit   | AUPRC score: 3.301 | FPR@TPR95: 45.495
 - MaxEntropy | AUPRC score: 2.582 | FPR@TPR95: 50.368
Dataset  FS Static
 - MSP        | AUPRC score: 7.470 | FPR@TPR95: 41.823
 - MaxLogit   | AUPRC score: 9.499 | FPR@TPR95: 40.300
 - MaxEntropy | AUPRC score: 8.826 | FPR@TPR95: 41.523
Dataset Road Anomaly
 - MSP        | AUPRC score: 12.426 | FPR@TPR95: 82.492
 - MaxLogit   | AUPRC score: 15.582 | FPR@TPR95: 73.248
 - MaxEntropy | AUPRC score: 12.678 | FPR@TPR95: 82.632
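
To make the two reported metrics concrete, here is a minimal pure-Python sketch of FPR at 95% TPR and of the area under the precision-recall curve (computed as average precision). It assumes labels of 1 for anomalous pixels and 0 for normal ones, with higher scores meaning more anomalous. The function names are hypothetical; the notebook itself relies on `evalAnomaly.py`, whose exact implementation is not shown here.

```python
def fpr_at_tpr(scores, labels, tpr_target=0.95):
    """False positive rate at the first threshold whose TPR reaches tpr_target."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    tp = fp = 0
    i = 0
    while i < len(ranked):
        # Lower the threshold one distinct score at a time (ties move together).
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            tp += ranked[j][1]
            fp += 1 - ranked[j][1]
            j += 1
        if tp / n_pos >= tpr_target:
            return fp / n_neg
        i = j
    return 1.0

def average_precision(scores, labels):
    """Mean of the precision values at the rank of each positive (ties ignored)."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    tp = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            tp += 1
            ap += tp / rank
    return ap / n_pos

scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 0, 0, 0]
print("FPR@TPR95:", fpr_at_tpr(scores, labels))
print("AuPRC:", average_precision(scores, labels))
```

A lower FPR@TPR95 and a higher AuPRC both indicate better separation between anomalous and normal pixels.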


If you want to save the baselines folder on your local machine, create a ZIP file with the following command and then download it.

In [None]:
 # !zip -r baselines.zip baselines/

#### Compute mIoU

In [None]:
run_eval_iou()

Loading model: ../trained_models/erfnet
Loading weights: ../trained_models/erfnet_pretrained.pth
Model and weights LOADED successfully
---------------------------------------
Took  80.3807921409607 seconds
Per-Class IoU:
97.62 Road
81.37 sidewalk
90.77 building
49.43 wall
54.93 fence
60.81 pole
62.60 traffic light
72.32 traffic sign
91.35 vegetation
60.97 terrain
93.38 sky
76.11 person
53.45 rider
92.91 car
72.78 truck
78.87 bus
63.86 train
46.41 motorcycle
71.89 bicycle
MEAN IoU:  72.20 %
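
For reference, the per-class IoU and the mean IoU can be derived from a confusion matrix as sketched below. This is a generic illustration with hypothetical function names and a toy 2-class matrix, not the code of `eval_iou.py`; it assumes `confusion[i][j]` counts pixels of ground-truth class `i` predicted as class `j`.

```python
def per_class_iou(confusion):
    """IoU_c = TP_c / (TP_c + FP_c + FN_c) for each class c."""
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # class c pixels predicted as something else
        fp = sum(confusion[r][c] for r in range(n)) - tp   # other pixels predicted as class c
        denom = tp + fp + fn
        ious.append(tp / denom if denom else float("nan"))
    return ious

def mean_iou(confusion):
    """Average the per-class IoU values, skipping classes that never occur."""
    valid = [v for v in per_class_iou(confusion) if v == v]  # NaN != NaN
    return sum(valid) / len(valid)

toy = [[8, 2],
       [1, 9]]
print(per_class_iou(toy))
print(mean_iou(toy))
```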


### Step 2B

#### Compute AuPRC & FPR95TPR with temperature scaling

In [None]:
temperatures = [0.5, 0.75, 1.0, 1.1, 1.2, 1.5, 2.0, 5.0, 10.0]

for dataset, folder in datasets.items():
  print(f"Dataset {dataset}")

  for temperature in temperatures:
    print(f" - {temperature:<10} ", end = "")
    input_path = f"../../validation_dataset/{folder}/images/*.*"
    if torch.cuda.is_available():
      !python evalAnomaly.py --input={input_path} --method="MSP" --temperature={temperature}
    else:
      !python evalAnomaly.py --input={input_path} --method="MSP" --temperature={temperature} --cpu

  print("=" * 55, end = "\n")

Dataset SMIYC RA-21
 - 0.5        | AUPRC score: 27.061 | FPR@TPR95: 62.731
 - 0.75       | AUPRC score: 28.156 | FPR@TPR95: 62.479
 - 1.0        | AUPRC score: 29.100 | FPR@TPR95: 62.511
 - 1.1        | AUPRC score: 29.410 | FPR@TPR95: 62.590
 - 1.2        | AUPRC score: 29.678 | FPR@TPR95: 62.724
 - 1.5        | AUPRC score: 30.258 | FPR@TPR95: 63.318
 - 2.0        | AUPRC score: 30.679 | FPR@TPR95: 64.721
 - 5.0        | AUPRC score: 30.196 | FPR@TPR95: 71.594
 - 10.0       | AUPRC score: 29.526 | FPR@TPR95: 75.757
Dataset SMIYC RO-21
 - 0.5        | AUPRC score: 2.420 | FPR@TPR95: 63.225
 - 0.75       | AUPRC score: 2.567 | FPR@TPR95: 64.053
 - 1.0        | AUPRC score: 2.712 | FPR@TPR95: 64.974
 - 1.1        | AUPRC score: 2.766 | FPR@TPR95: 65.524
 - 1.2        | AUPRC score: 2.816 | FPR@TPR95: 66.033
 - 1.5        | AUPRC score: 2.937 | FPR@TPR95: 67.928
 - 2.0        | AUPRC score: 3.026 | FPR@TPR95: 71.459
 - 5.0        | AUPRC score: 2.841 | FPR@TPR95: 83.111
 - 10.0       | 
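
The effect of the `--temperature` parameter can be illustrated with a small sketch: the logits are divided by the temperature before the softmax, so values above 1 flatten the distribution and values below 1 sharpen it. The code assumes the MSP score is one minus the maximum softmax probability; the helper names are hypothetical and `evalAnomaly.py` may apply the temperature differently.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Softmax of logits / temperature (numerically stable)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def msp_score(logits, temperature=1.0):
    """Anomaly score: one minus the maximum softmax probability."""
    return 1.0 - max(softmax_with_temperature(logits, temperature))

logits = [4.0, 2.0, 1.0]
for t in [0.5, 1.0, 2.0, 10.0]:
    print(f"T={t:<5} MSP anomaly score={msp_score(logits, t):.4f}")
```

Because the same scaling is applied to every pixel, the temperature changes how scores are spread out rather than which class is predicted.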

## Training models

### Utils

In [11]:
base_dir = "../train"
data_dir = "../cityscapes"

In [12]:
def train_model(model: str, num_epochs: int, batch_size: int, stop_epoch: int = 20, pretrained: bool = False, resume: bool = False, fineTune: bool = False) -> None:

  state_flag = f"--state ../trained_models/{model}_pretrained.pth" if pretrained else ""
  resume_flag = "--resume" if resume else ""
  finetune_flag = f"--FineTune --loadWeights ../trained_models/{model}_pretrained.pth" if fineTune else ""

  # if model == "bisenet":
  #     !gdown "https://drive.usercontent.google.com/download?id=1Gj4eZrmdygA5c_y7N0KrmSRThoYjfjk-" -O "checkpoint.pth.tar"
  # dict_keys(['epoch', 'arch', 'state_dict', 'best_acc', 'optimizer'])

  if fineTune:
    savedir_name = f"{model}_training_void_ft"
  else:
    savedir_name = f"{model}_training_void"

  !cd {base_dir} && python -W ignore main_v2.py \
    --savedir {savedir_name}\
    --datadir {data_dir} \
    --model {model} \
    --cuda \
    --num-epochs={num_epochs} \
    --epochs-save=1 \
    --batch-size={batch_size} \
    --stop-epoch={stop_epoch} \
    --decoder \
    {finetune_flag} \
    {state_flag} \
    {resume_flag}

### ERFNet Fine-Tuning

In [None]:
train_model("erfnet", num_epochs=20, batch_size=6, fineTune=True)
# %cd ../save
# !zip -r erfnet_training_void_ft.zip erfnet_training_void_ft/

Import Model erfnet with weights ../trained_models/erfnet_pretrained.pth to FineTune
../cityscapes/leftImg8bit/train
../cityscapes/leftImg8bit/val
Criterion: <class 'utils.losses.ce_loss.CrossEntropyLoss2d'>
----- TRAINING - EPOCH 1 -----
LEARNING RATE:  5e-05
loss: 0.5082 (epoch: 1, step: 0) // Avg time/img: 0.3150 s
loss: 0.3931 (epoch: 1, step: 50) // Avg time/img: 0.0396 s
loss: 0.3815 (epoch: 1, step: 100) // Avg time/img: 0.0371 s
loss: 0.3817 (epoch: 1, step: 150) // Avg time/img: 0.0363 s
loss: 0.3752 (epoch: 1, step: 200) // Avg time/img: 0.0360 s
loss: 0.3759 (epoch: 1, step: 250) // Avg time/img: 0.0357 s
loss: 0.3744 (epoch: 1, step: 300) // Avg time/img: 0.0355 s
loss: 0.3768 (epoch: 1, step: 350) // Avg time/img: 0.0353 s
loss: 0.3742 (epoch: 1, step: 400) // Avg time/img: 0.0353 s
loss: 0.3715 (epoch: 1, step: 450) // Avg time/img: 0.0352 s
----- VALIDATING - EPOCH 1 -----
VAL loss: 0.4713 (epoch: 1, step: 0) // Avg time/img: 0.0335 s
VAL loss: 0.5808 (epoch: 1, step: 50

### BiSeNet Training

Start training BiSeNet for a total of 40 epochs. In the first run, due to GPU time limitations on Google Colab, the training is intentionally interrupted after 20 epochs by setting the parameter `stop_epoch` equal to 20. Remember to set `num_epochs` to 40 from the beginning to ensure that the learning rate scheduler behaves correctly across the full training process. The process is then resumed in the following run from epoch 21 using the `--resume` flag.
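
The importance of fixing `num_epochs` from the start can be checked with a small calculation. The learning rates logged for BiSeNet (0.025 at epoch 1, about 0.0134 at epoch 21) are consistent with a polynomial decay of power 0.9 over 40 epochs. This schedule is inferred from the logged values only; the function `poly_lr` below is a hypothetical sketch, not the scheduler defined in `main_v2.py`.

```python
def poly_lr(base_lr, epoch_index, num_epochs, power=0.9):
    """Polynomial decay; epoch_index is 0-based (0 for the first epoch)."""
    return base_lr * (1.0 - epoch_index / num_epochs) ** power

# Epoch 21 (index 20) when the schedule spans all 40 epochs.
print(poly_lr(0.025, 20, 40))

# Same epoch if the schedule had been built for only 20 epochs:
# the learning rate would already have decayed to zero.
print(poly_lr(0.025, 20, 20))
```

Under this assumption, setting `num_epochs` to 20 for the first run would make the learning rate decay twice as fast and leave nothing for the resumed run.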

In [None]:
train_model("bisenet", num_epochs=40, batch_size=6, stop_epoch=20)
# %cd ../save
# !zip -r bisenet_training_void.zip bisenet_training_void/

Downloading: "https://download.pytorch.org/models/resnet18-5c106cde.pth" to /root/.cache/torch/hub/checkpoints/resnet18-5c106cde.pth
100% 44.7M/44.7M [00:00<00:00, 301MB/s]
../cityscapes/leftImg8bit/train
../cityscapes/leftImg8bit/val
----- TRAINING - EPOCH 1 -----
LEARNING RATE:  0.025
loss: 5.835 (epoch: 1, step: 0) // Avg time/img: 0.4772 s
loss: 3.602 (epoch: 1, step: 50) // Avg time/img: 0.0502 s
loss: 3.355 (epoch: 1, step: 100) // Avg time/img: 0.0460 s
loss: 3.206 (epoch: 1, step: 150) // Avg time/img: 0.0446 s
loss: 3.098 (epoch: 1, step: 200) // Avg time/img: 0.0440 s
loss: 3.043 (epoch: 1, step: 250) // Avg time/img: 0.0438 s
loss: 2.982 (epoch: 1, step: 300) // Avg time/img: 0.0437 s
loss: 2.926 (epoch: 1, step: 350) // Avg time/img: 0.0436 s
loss: 2.897 (epoch: 1, step: 400) // Avg time/img: 0.0435 s
loss: 2.871 (epoch: 1, step: 450) // Avg time/img: 0.0434 s
----- VALIDATING - EPOCH 1 -----
VAL loss: 2.136 (epoch: 1, step: 0) // Avg time/img: 0.0375 s
VAL loss: 2.807 (epo

In [None]:
train_model("bisenet", num_epochs=40, batch_size=6, stop_epoch=40, resume=True)
# %cd ../save
# !zip -r bisenet_training_void.zip bisenet_training_void/

Downloading: "https://download.pytorch.org/models/resnet18-5c106cde.pth" to /root/.cache/torch/hub/checkpoints/resnet18-5c106cde.pth
100% 44.7M/44.7M [00:00<00:00, 384MB/s]
../cityscapes/leftImg8bit/train
../cityscapes/leftImg8bit/val
=> Loaded checkpoint at epoch 21)
----- TRAINING - EPOCH 21 -----
LEARNING RATE:  0.013397168281703665
loss: 2.068 (epoch: 21, step: 0) // Avg time/img: 0.4451 s
loss: 2.012 (epoch: 21, step: 50) // Avg time/img: 0.0482 s
loss: 2.009 (epoch: 21, step: 100) // Avg time/img: 0.0447 s
loss: 2.011 (epoch: 21, step: 150) // Avg time/img: 0.0438 s
loss: 2.021 (epoch: 21, step: 200) // Avg time/img: 0.0436 s
loss: 2.012 (epoch: 21, step: 250) // Avg time/img: 0.0436 s
loss: 2.017 (epoch: 21, step: 300) // Avg time/img: 0.0436 s
loss: 2.026 (epoch: 21, step: 350) // Avg time/img: 0.0435 s
loss: 2.02 (epoch: 21, step: 400) // Avg time/img: 0.0436 s
loss: 2.019 (epoch: 21, step: 450) // Avg time/img: 0.0435 s
----- VALIDATING - EPOCH 21 -----
VAL loss: 1.843 (epoch

### BiSeNet Fine-Tuning

In [None]:
train_model("bisenet", num_epochs=20, batch_size=6, fineTune=True)
# %cd ../save
# !zip -r bisenet_training_void_ft.zip bisenet_training_void_ft/

Import Model bisenet with weights ../trained_models/bisenet_pretrained.pth to FineTune
../cityscapes/leftImg8bit/train
../cityscapes/leftImg8bit/val
Criterion: <class 'utils.losses.ohem_ce_loss.OhemCELoss'>
----- TRAINING - EPOCH 1 -----
LEARNING RATE:  0.0025
loss: 6.09 (epoch: 1, step: 0) // Avg time/img: 0.4535 s
loss: 5.907 (epoch: 1, step: 50) // Avg time/img: 0.0267 s
loss: 5.636 (epoch: 1, step: 100) // Avg time/img: 0.0225 s
loss: 5.524 (epoch: 1, step: 150) // Avg time/img: 0.0203 s
loss: 5.437 (epoch: 1, step: 200) // Avg time/img: 0.0195 s
loss: 5.399 (epoch: 1, step: 250) // Avg time/img: 0.0191 s
loss: 5.411 (epoch: 1, step: 300) // Avg time/img: 0.0187 s
loss: 5.409 (epoch: 1, step: 350) // Avg time/img: 0.0183 s
loss: 5.387 (epoch: 1, step: 400) // Avg time/img: 0.0182 s
loss: 5.391 (epoch: 1, step: 450) // Avg time/img: 0.0180 s
----- VALIDATING - EPOCH 1 -----
VAL loss: 7.204 (epoch: 1, step: 0) // Avg time/img: 0.0365 s
VAL loss: 6.475 (epoch: 1, step: 50) // Avg time

### ENet Training

The same procedure used for BiSeNet is applied to ENet.

In [None]:
train_model("enet", num_epochs=40, batch_size=6, stop_epoch=20)
# %cd ../save
# !zip -r enet_training_void.zip enet_training_void/

../cityscapes/leftImg8bit/train
../cityscapes/leftImg8bit/val
Criterion: CrossEntropyLoss2d
----- TRAINING - EPOCH 1 -----
LEARNING RATE:  0.0005
loss: 3.071 (epoch: 1, step: 0) // Avg time/img: 0.4825 s
loss: 2.726 (epoch: 1, step: 50) // Avg time/img: 0.0765 s
loss: 2.334 (epoch: 1, step: 100) // Avg time/img: 0.0721 s
loss: 2.05 (epoch: 1, step: 150) // Avg time/img: 0.0711 s
loss: 1.858 (epoch: 1, step: 200) // Avg time/img: 0.0707 s
loss: 1.716 (epoch: 1, step: 250) // Avg time/img: 0.0707 s
loss: 1.613 (epoch: 1, step: 300) // Avg time/img: 0.0705 s
loss: 1.531 (epoch: 1, step: 350) // Avg time/img: 0.0707 s
loss: 1.462 (epoch: 1, step: 400) // Avg time/img: 0.0707 s
loss: 1.406 (epoch: 1, step: 450) // Avg time/img: 0.0707 s
----- VALIDATING - EPOCH 1 -----
VAL loss: 0.7688 (epoch: 1, step: 0) // Avg time/img: 0.0339 s
VAL loss: 0.9467 (epoch: 1, step: 50) // Avg time/img: 0.0316 s
EPOCH IoU on VAL set:  17.06 %
Saving model as best
save: ../save/enet_training_void/model

In [None]:
train_model("enet", num_epochs=40, batch_size=6, stop_epoch=40, resume=True)
# %cd ../save
# !zip -r enet_training_void.zip enet_training_void/

../cityscapes/leftImg8bit/train
../cityscapes/leftImg8bit/val
Criterion: CrossEntropyLoss2d
=> Loaded checkpoint at epoch 21)
----- TRAINING - EPOCH 21 -----
LEARNING RATE:  0.0005
loss: 0.3133 (epoch: 21, step: 0) // Avg time/img: 0.4467 s
loss: 0.423 (epoch: 21, step: 50) // Avg time/img: 0.0767 s
loss: 0.4169 (epoch: 21, step: 100) // Avg time/img: 0.0734 s
loss: 0.4144 (epoch: 21, step: 150) // Avg time/img: 0.0724 s
loss: 0.42 (epoch: 21, step: 200) // Avg time/img: 0.0723 s
loss: 0.4207 (epoch: 21, step: 250) // Avg time/img: 0.0722 s
loss: 0.4208 (epoch: 21, step: 300) // Avg time/img: 0.0719 s
loss: 0.4202 (epoch: 21, step: 350) // Avg time/img: 0.0720 s
loss: 0.4216 (epoch: 21, step: 400) // Avg time/img: 0.0720 s
loss: 0.4243 (epoch: 21, step: 450) // Avg time/img: 0.0721 s
----- VALIDATING - EPOCH 21 -----
VAL loss: 0.3167 (epoch: 21, step: 0) // Avg time/img: 0.0396 s
VAL loss: 0.4666 (epoch: 21, step: 50) // Avg time/img: 0.0312 s
EPOCH IoU on VAL set:  35.06 %
Sav

### ENet Fine-Tuning

In [None]:
train_model("enet", num_epochs=20, batch_size=6, fineTune=True)
# %cd ../save
# !zip -r enet_training_void_ft.zip enet_training_void_ft/

Import Model enet with weights ../trained_models/enet_pretrained.pth to FineTune
../cityscapes/leftImg8bit/train
../cityscapes/leftImg8bit/val
Criterion: CrossEntropyLoss2d
----- TRAINING - EPOCH 1 -----
LEARNING RATE:  5e-05
loss: 10.75 (epoch: 1, step: 0) // Avg time/img: 0.3722 s
loss: 10.32 (epoch: 1, step: 50) // Avg time/img: 0.0422 s
loss: 10.23 (epoch: 1, step: 100) // Avg time/img: 0.0380 s
loss: 10.13 (epoch: 1, step: 150) // Avg time/img: 0.0369 s
loss: 10.05 (epoch: 1, step: 200) // Avg time/img: 0.0360 s
loss: 9.928 (epoch: 1, step: 250) // Avg time/img: 0.0355 s
loss: 9.824 (epoch: 1, step: 300) // Avg time/img: 0.0357 s
loss: 9.703 (epoch: 1, step: 350) // Avg time/img: 0.0355 s
loss: 9.593 (epoch: 1, step: 400) // Avg time/img: 0.0355 s
loss: 9.481 (epoch: 1, step: 450) // Avg time/img: 0.0353 s
----- VALIDATING - EPOCH 1 -----
VAL loss: 8.499 (epoch: 1, step: 0) // Avg time/img: 0.0389 s
VAL loss: 8.385 (epoch: 1, step: 50) // Avg time/img: 0.0313 s
EPOCH IoU on VAL se

## Void Classifier

In [13]:
%cd ../eval

/content/Real-Time-Anomaly-Segmentation-for-Road-Scenes/eval


In [14]:
models = ["erfnet", "bisenet", "enet"]

Run the `evalAnomaly.py` script for each fine-tuned model with `--method="void"`.

In [None]:
for model in models:
  print(f"{'=' * 64}")
  print(f"MODEL: {model.upper()}")
  print(f"{'=' * 64}")

  for dataset, folder in datasets.items():
    print(f"Dataset {dataset:<15}", end = "")

    input_path = f"../../validation_dataset/{folder}/images/*.*"
    plot_dir_path = f"../plots/void/{model}/{folder}"
    load_dir = f"../save/{model}_training_void_ft/"
    load_weights = f"model_best.pth"

    add_cmd = "--cpu" if not torch.cuda.is_available() else ""
    !python evalAnomaly.py --input={input_path} --loadModel={model} --loadDir={load_dir} --loadWeights={load_weights} --method="void" --plotdir={plot_dir_path} {add_cmd}

  print()

MODEL: ERFNET
Dataset SMIYC RA-21    | AUPRC score: 24.239 | FPR@TPR95: 68.735
Dataset SMIYC RO-21    | AUPRC score:  1.360 | FPR@TPR95: 99.809
Dataset FS L&F         | AUPRC score: 11.535 | FPR@TPR95: 15.749
Dataset  FS Static     | AUPRC score: 15.785 | FPR@TPR95: 49.445
Dataset Road Anomaly   | AUPRC score: 10.978 | FPR@TPR95: 86.823

MODEL: BISENET
Dataset SMIYC RA-21    | AUPRC score: 33.084 | FPR@TPR95: 86.463
Dataset SMIYC RO-21    | AUPRC score: 13.214 | FPR@TPR95: 99.474
Dataset FS L&F         | AUPRC score: 15.462 | FPR@TPR95: 52.662
Dataset  FS Static     | AUPRC score: 30.864 | FPR@TPR95: 60.263
Dataset Road Anomaly   | AUPRC score: 13.042 | FPR@TPR95: 93.016

MODEL: ENET
Dataset SMIYC RA-21    | AUPRC score: 21.616 | FPR@TPR95: 88.547
Dataset SMIYC RO-21    | AUPRC score:  1.763 | FPR@TPR95: 95.070
Dataset FS L&F         | AUPRC score:  0.624 | FPR@TPR95: 68.780
Dataset  FS Static     | AUPRC score:  6.430 | FPR@TPR95: 68.681
Dataset Road Anomaly   | AUPRC score: 18.122 | 
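
One plausible reading of the void method is that the softmax probability of the extra void class is used directly as the anomaly score. The sketch below illustrates that idea under two assumptions that are not confirmed by the notebook: the void class is the last of 20 outputs (index 19), and no further post-processing is applied. The function names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def void_anomaly_score(logits, void_index=19):
    """Anomaly score = probability assigned to the void class (assumed index 19)."""
    return softmax(logits)[void_index]

# A pixel whose void logit dominates versus a pixel with a low void logit.
likely_void = [0.0] * 19 + [5.0]
likely_known = [3.0] + [0.0] * 18 + [-5.0]
print(void_anomaly_score(likely_void))
print(void_anomaly_score(likely_known))
```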

If you want to save the void folder on your local machine, create a ZIP file with the following command and then download it.

In [None]:
# %cd ../plots
# !zip -r void.zip void/

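Run the `eval_iou` script with the parameter `void` left at its default, `False`. The cell below evaluates only ENet and loads its weights from `enet_training_void` instead of the fine-tuned folder.
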
In [20]:
for model in ["enet"]:
  print(f"{'=' * 64}")
  print(f"MODEL: {model.upper()}")
  print(f"{'=' * 64}")
  load_dir = f"../save/{model}_training_void_ft/" if model != "enet" else f"../save/{model}_training_void/"
  training_folder = f"model_best.pth"

  run_eval_iou(model, load_dir, training_folder)
  print()

MODEL: ENET
Loading model: ../save/enet_training_void/enet.py
Loading weights: ../save/enet_training_void/model_best.pth
Model and weights LOADED successfully
---------------------------------------
Took  90.40699338912964 seconds
Per-Class IoU:
94.21 Road
67.34 sidewalk
83.65 building
20.29 wall
13.36 fence
33.64 pole
0.00 traffic light
37.51 traffic sign
86.87 vegetation
44.23 terrain
89.22 sky
49.99 person
0.00 rider
84.36 car
21.96 truck
9.56 bus
5.97 train
0.00 motorcycle
43.35 bicycle
MEAN IoU:  41.34 %



Run the `eval_iou` script for each model. Parameter `void` is set to `True`.

In [16]:
for model in models:
  print(f"{'=' * 64}")
  print(f"MODEL: {model.upper()}")
  print(f"{'=' * 64}")
  load_dir = f"../save/{model}_training_void_ft/"
  training_folder = f"model_best.pth"

  run_eval_iou(model, load_dir, training_folder, void=True)
  print()

MODEL: ERFNET
Loading model: ../save/erfnet_training_void_ft/erfnet.py
Loading weights: ../save/erfnet_training_void_ft/model_best.pth
Model and weights LOADED successfully
---------------------------------------
Took  76.10783123970032 seconds
Per-Class IoU:
83.17 Road
66.71 sidewalk
83.86 building
35.33 wall
43.69 fence
55.00 pole
56.69 traffic light
62.44 traffic sign
88.13 vegetation
45.99 terrain
86.45 sky
68.59 person
52.79 rider
87.17 car
66.92 truck
74.83 bus
50.83 train
37.98 motorcycle
63.46 bicycle
14.08 void
MEAN IoU:  61.21 %

MODEL: BISENET
Loading model: ../save/bisenet_training_void_ft/bisenet.py
Loading weights: ../save/bisenet_training_void_ft/model_best.pth
Downloading: "https://download.pytorch.org/models/resnet18-5c106cde.pth" to /root/.cache/torch/hub/checkpoints/resnet18-5c106cde.pth
100% 44.7M/44.

## Effect of Training Loss function

Analyze the effect of training the model with losses that are specifically designed for anomaly detection.

### Utils

In [None]:
def train_erfnet_with_loss(base_dir: str, data_dir: str, loss: str, stop_epoch: int = 20, num_epochs: int = 20,
                           batch_size: int = 6, resume: bool = False, logit_norm: bool = False,
                           iso_max: bool = False, class_weights: str = 'hard') -> None:

    resume_flag = "--resume" if resume else ""
    model = "erfnet_isomaxplus" if iso_max else "erfnet"
    logit_suffix = "_logit_norm" if logit_norm else ""
    logit_norm_flag = "--logit_norm" if logit_norm else ""
    pretrained_encoder = "../trained_models/erfnet_encoder_pretrained.pth.tar"

    !cd {base_dir} && python -W ignore main_v2.py \
      --savedir {model}_training_{loss}{logit_suffix} \
      --loss {loss} \
      --datadir {data_dir} \
      --model {model} \
      --cuda \
      --num-epochs={num_epochs} \
      --epochs-save=1 \
      --stop-epoch={stop_epoch} \
      --batch-size={batch_size} \
      {resume_flag} \
      {logit_norm_flag} \
      --decoder \
      --pretrainedEncoder={pretrained_encoder}

Different combinations of loss functions are tested below.

### Cross-Entropy

In [None]:
train_erfnet_with_loss(base_dir, data_dir, loss="ce")
# !zip -r erfnet_training_ce.zip erfnet_training_ce/

### Focal Loss

In [None]:
train_erfnet_with_loss(base_dir, data_dir, stop_epoch=10, loss="f")
# !zip -r erfnet_training_f.zip erfnet_training_f/

Loading encoder pretrained in imagenet
../cityscapes/leftImg8bit/train
../cityscapes/leftImg8bit/val
Criterion: FocalLoss(gamma=2.0, alpha=tensor([ 2.8149,  6.9850,  3.7890,  9.9428,  9.7702,  9.5111, 10.3114, 10.0265,
         4.6323,  9.5608,  7.8698,  9.5169, 10.3737,  6.6616, 10.2605, 10.2879,
        10.2898, 10.4054, 10.1381,  1.0000], device='cuda:0'))
----- TRAINING - EPOCH 1 -----
LEARNING RATE:  0.0005
loss: 11.75 (epoch: 1, step: 0) // Avg time/img: 0.4367 s
loss: 9.154 (epoch: 1, step: 50) // Avg time/img: 0.1190 s
loss: 7.499 (epoch: 1, step: 100) // Avg time/img: 0.1184 s
loss: 6.402 (epoch: 1, step: 150) // Avg time/img: 0.1197 s
loss: 5.657 (epoch: 1, step: 200) // Avg time/img: 0.1200 s
loss: 5.174 (epoch: 1, step: 250) // Avg time/img: 0.1203 s
loss: 4.764 (epoch: 1, step: 300) // Avg time/img: 0.1204 s
loss: 4.433 (epoch: 1, step: 350) // Avg time/img: 0.1205 s
loss: 4.175 (epoch: 1, step: 400) // Avg time/img: 0.1205 s
loss: 3.961 (epoch: 1, step: 450) // Avg time/i

In [None]:
train_erfnet_with_loss(base_dir, data_dir, stop_epoch=20, loss="f", resume=True)

Loading encoder pretrained in imagenet
../cityscapes/leftImg8bit/train
../cityscapes/leftImg8bit/val
Criterion: FocalLoss(gamma=2.0, alpha=tensor([ 2.8149,  6.9850,  3.7890,  9.9428,  9.7702,  9.5111, 10.3114, 10.0265,
         4.6323,  9.5608,  7.8698,  9.5169, 10.3737,  6.6616, 10.2605, 10.2879,
        10.2898, 10.4054, 10.1381,  1.0000], device='cuda:0'))
=> Loaded checkpoint at epoch 11)
----- TRAINING - EPOCH 11 -----
LEARNING RATE:  0.0002679433656340733
loss: 0.7438 (epoch: 11, step: 0) // Avg time/img: 0.3692 s
loss: 0.6395 (epoch: 11, step: 50) // Avg time/img: 0.1278 s
loss: 0.6236 (epoch: 11, step: 100) // Avg time/img: 0.1243 s
loss: 0.6207 (epoch: 11, step: 150) // Avg time/img: 0.1236 s
loss: 0.622 (epoch: 11, step: 200) // Avg time/img: 0.1229 s
loss: 0.622 (epoch: 11, step: 250) // Avg time/img: 0.1226 s
loss: 0.6282 (epoch: 11, step: 300) // Avg time/img: 0.1224 s
loss: 0.6361 (epoch: 11, step: 350) // Avg time/img: 0.1223 s
loss: 0.6308 (epoch: 11, step: 400) // Avg 
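
The criterion printed in the log, `FocalLoss(gamma=2.0, alpha=...)`, suggests the standard focal loss with per-class weights. The sketch below shows that standard form for a single pixel, -alpha_t (1 - p_t)^gamma log(p_t); it is an illustration with a hypothetical function name, not the repository's `FocalLoss` class, which may reduce or normalize the loss differently.

```python
import math

def focal_loss(probs, target, gamma=2.0, alpha=None):
    """Focal loss for one pixel.

    probs  -- softmax probabilities over the classes
    target -- index of the ground-truth class
    gamma  -- focusing parameter; gamma = 0 gives (weighted) cross-entropy
    alpha  -- optional per-class weights
    """
    p_t = probs[target]
    weight = alpha[target] if alpha is not None else 1.0
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)

easy = [0.90, 0.05, 0.05]   # well-classified pixel
hard = [0.10, 0.60, 0.30]   # misclassified pixel (true class is 0)
for name, probs in [("easy", easy), ("hard", hard)]:
    ce = focal_loss(probs, 0, gamma=0.0)
    fl = focal_loss(probs, 0, gamma=2.0)
    print(f"{name}: cross-entropy={ce:.4f} focal={fl:.4f}")
```

The modulating factor shrinks the loss of the well-classified pixel much more than that of the misclassified one, which shifts the training signal toward hard pixels.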

### Cross-Entropy + Focal Loss

In [None]:
train_erfnet_with_loss(base_dir, data_dir, loss="cef")
# !zip -r erfnet_training_cef.zip erfnet_training_cef/

### Cross-Entropy + Logit Norm

In [None]:
train_erfnet_with_loss(base_dir, data_dir, loss="ce", logit_norm=True)
# !zip -r erfnet_training_ce_logit_norm.zip erfnet_training_ce_logit_norm/
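
Logit normalization is commonly described as dividing the logit vector by its L2 norm (times a temperature `t`) before applying cross-entropy, so that the loss no longer rewards simply growing the logit magnitude. The sketch below follows that description for a single pixel; the function names are hypothetical and the `--logit_norm` option in `main_v2.py` may be implemented differently.

```python
import math

def logit_norm(logits, t=1.0, eps=1e-7):
    """Scale the logits to unit L2 norm, then divide by the temperature t."""
    norm = math.sqrt(sum(z * z for z in logits)) + eps
    return [z / (norm * t) for z in logits]

def cross_entropy(logits, target):
    """Cross-entropy of softmax(logits) against the target class."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def logit_norm_loss(logits, target, t=1.0):
    """Cross-entropy computed on the normalized logits."""
    return cross_entropy(logit_norm(logits, t), target)

small = [2.0, 1.0, 0.5]
large = [20.0, 10.0, 5.0]   # same direction, ten times the magnitude
print(cross_entropy(small, 0), cross_entropy(large, 0))
print(logit_norm_loss(small, 0), logit_norm_loss(large, 0))
```

Plain cross-entropy drops when the logits are scaled up, while the normalized loss stays essentially unchanged, which is the property the technique relies on.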

### Cross-Entropy + Focal + Logit Norm

In [None]:
train_erfnet_with_loss(base_dir, data_dir, loss="cef", logit_norm=True)
# !zip -r erfnet_training_cef_logit_norm.zip erfnet_training_cef_logit_norm/

### Cross-Entropy + EIM

In [None]:
train_erfnet_with_loss(base_dir, data_dir, stop_epoch = 10, loss="ceim", iso_max=True)
# !zip -r erfnet_isomaxplus_training_ceim.zip erfnet_isomaxplus_training_ceim/

In [None]:
train_erfnet_with_loss(base_dir, data_dir, stop_epoch = 20, loss="ceim", iso_max=True, resume=True)

### Cross-Entropy + Focal + EIM

In [None]:
train_erfnet_with_loss(base_dir, data_dir, stop_epoch = 10, loss="cefeim", iso_max=True)
# !zip -r erfnet_isomaxplus_training_cefeim.zip erfnet_isomaxplus_training_cefeim/

In [None]:
train_erfnet_with_loss(base_dir, data_dir, stop_epoch = 20, loss="cefeim", iso_max=True, resume=True)

### Cross-Entropy + Focal + EIM + Logit Norm

In [None]:
train_erfnet_with_loss(base_dir, data_dir, stop_epoch=10, loss="cefeim", logit_norm=True, iso_max=True)
# !zip -r erfnet_isomaxplus_training_cefeim_logit_norm.zip erfnet_isomaxplus_training_cefeim_logit_norm/

In [None]:
train_erfnet_with_loss(base_dir, data_dir, stop_epoch=20, loss="cefeim", logit_norm=True, iso_max=True, resume=True)

Loading encoder pretrained in imagenet
../cityscapes/leftImg8bit/train
../cityscapes/leftImg8bit/val
Criterion: LogitNormLoss(loss=CombinedLoss(CE(alpha=0.3333333333333333), Focal(beta=0.3333333333333333), EIM(gamma=0.3333333333333333)), t=1.0)
=> Loaded checkpoint at epoch 11)
----- TRAINING - EPOCH 11 -----
LEARNING RATE:  0.0002679433656340733
loss: 5.638 (epoch: 11, step: 0) // Avg time/img: 0.8220 s
loss: 5.354 (epoch: 11, step: 50) // Avg time/img: 0.3397 s
loss: 5.366 (epoch: 11, step: 100) // Avg time/img: 0.3376 s
loss: 5.365 (epoch: 11, step: 150) // Avg time/img: 0.3360 s
loss: 5.376 (epoch: 11, step: 200) // Avg time/img: 0.3354 s
loss: 5.375 (epoch: 11, step: 250) // Avg time/img: 0.3352 s
loss: 5.381 (epoch: 11, step: 300) // Avg time/img: 0.3349 s
loss: 5.382 (epoch: 11, step: 350) // Avg time/img: 0.3346 s
loss: 5.383 (epoch: 11, step: 400) // Avg time/img: 0.3343 s
loss: 5.383 (epoch: 11, step: 450) // Avg time/img: 0.3344 s
----- VALIDATING - EPOCH 11 -----
VAL loss: 

## Inference with different training losses

In [None]:
load_dir = "../save/"

losses = {"Cross-Entropy": ["erfnet", "erfnet_training_ce"],
          "Focal": ["erfnet", "erfnet_training_f"],
          "Cross-Entropy + Focal": ["erfnet", "erfnet_training_cef"],
          "Cross-Entropy + LogitNorm": ["erfnet", "erfnet_training_ce_logit_norm"],
          "Cross-Entropy + Focal + LogitNorm": ["erfnet", "erfnet_training_cef_logit_norm"],
          # folders written by train_erfnet_with_loss are named {model}_training_{loss}
          "Cross-Entropy + EIM": ["erfnet_isomaxplus", "erfnet_isomaxplus_training_ceim"],
          "Cross-Entropy + Focal + EIM": ["erfnet_isomaxplus", "erfnet_isomaxplus_training_cefeim"],
          "Cross-Entropy + Focal + EIM + LogitNorm": ["erfnet_isomaxplus", "erfnet_isomaxplus_training_cefeim_logit_norm"]}

for loss, (model, training_folder) in losses.items():
  print(f"ERFNet {loss}")
  plot_folder = training_folder.split("training_")[1]
  run_eval_anomaly(datasets, methods, model, training_folder, plot_folder, load_dir)

  load_dir_for_iou = f"../save/{training_folder}/"
  model_best = f"model_best.pth"
  run_eval_iou(model, load_dir_for_iou, model_best)
  print()

## Visualization

In [None]:
# TODO: some images

## Ensemble

In [None]:
!cd {base_dir} && python -W ignore main_v2.py --datadir {data_dir} --savedir dummy --ensemble

Loading ensemble models...
../cityscapes/leftImg8bit/val
Running ensemble inference...

==> Ensemble mIoU: 0.6988
Per-class IoU:
Class 0: 0.9709
Class 1: 0.7991
Class 2: 0.8994
Class 3: 0.4724
Class 4: 0.4908
Class 5: 0.5313
Class 6: 0.5298
Class 7: 0.6834
Class 8: 0.9085
Class 9: 0.6181
Class 10: 0.9338
Class 11: 0.7303
Class 12: 0.4925
Class 13: 0.9202
Class 14: 0.7179
Class 15: 0.7981
Class 16: 0.6741
Class 17: 0.4148
Class 18: 0.6910
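
How the `--ensemble` option combines the models is not shown in this notebook. One common approach, sketched here purely as an assumption, is to average the softmax probabilities of the individual models for each pixel and take the class with the highest mean probability. The function names and the toy logits are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def ensemble_predict(per_model_logits):
    """Average per-model softmax probabilities and return (class, mean_probs)."""
    n_models = len(per_model_logits)
    n_classes = len(per_model_logits[0])
    mean_probs = [0.0] * n_classes
    for logits in per_model_logits:
        for c, p in enumerate(softmax(logits)):
            mean_probs[c] += p / n_models
    predicted = max(range(n_classes), key=lambda c: mean_probs[c])
    return predicted, mean_probs

# Toy logits for one pixel from three models over three classes.
pixel_logits = [
    [2.0, 1.0, 0.0],
    [0.5, 2.5, 0.0],
    [0.2, 2.0, 0.1],
]
predicted, mean_probs = ensemble_predict(pixel_logits)
print(predicted, [round(p, 3) for p in mean_probs])
```

Averaging probabilities rather than logits keeps models with larger logit scales from dominating the vote.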
