# Model evaluation

Now that the model is trained, we need to evaluate its performance. To do this, we use the test set that we set aside at the beginning, together with some images taken from the internet.

> **Note**: This notebook is meant to be run on Kaggle (or Colab); it is available [here](https://www.kaggle.com/code/leonardodanelutti/evaluatemodel). From Kaggle, the notebook can be opened in Colab.

### Install detectron2 and import the required packages

In [None]:
# Install detectron2
!python -m pip install pyyaml==5.1
import sys, os, shutil, distutils.core
if os.path.isdir('./detectron2'):
    shutil.rmtree("./detectron2")
# Note: This is a faster way to install detectron2 in Colab, but it does not include all functionalities (e.g. compiled operators).
# See https://detectron2.readthedocs.io/tutorials/install.html for full installation instructions
!git clone 'https://github.com/facebookresearch/detectron2'
dist = distutils.core.run_setup("./detectron2/setup.py")
!python -m pip install {' '.join([f"'{x}'" for x in dist.install_requires])}
sys.path.insert(0, os.path.abspath('./detectron2'))

# Get pytorch and CUDA versions
import detectron2, torch
TORCH_VERSION = ".".join(torch.__version__.split(".")[:2])
CUDA_VERSION = torch.__version__.split("+")[-1]

# Setup the logger
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common libraries (os is already imported above)
import numpy as np
import random
import cv2  # used below to load images
import matplotlib.pyplot as plt

# import some common detectron2 utilities
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader
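
The install cell derives `TORCH_VERSION` and `CUDA_VERSION` by splitting `torch.__version__`. The sketch below shows what those two splits produce. It uses a hypothetical helper, `split_torch_version`, and made-up version strings so that it runs without PyTorch installed.

```python
def split_torch_version(version):
    """Mirror the two splits used in the install cell.

    Hypothetical helper for illustration only; it takes the version as a
    plain string instead of reading torch.__version__.
    """
    torch_version = ".".join(version.split(".")[:2])  # keep major.minor
    cuda_version = version.split("+")[-1]             # build tag after "+"
    return torch_version, cuda_version


# Example version strings (assumed, not taken from the notebook's output)
print(split_torch_version("2.1.0+cu118"))  # ('2.1', 'cu118')
print(split_torch_version("2.1.0+cpu"))    # ('2.1', 'cpu')
print(split_torch_version("2.1.0"))        # ('2.1', '2.1.0')
```

Note the last case: when the version string has no `+` suffix, the second split returns the whole version string rather than a CUDA tag.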

### Register the datasets

The validation dataset, with the images created with UnrealCV, is registered.

In [2]:
from detectron2.data.datasets import register_coco_instances
register_coco_instances("person_park_val", {}, "/kaggle/input/personparkval/dataset_validation.json", "/kaggle/input/personparkval/lit/lit")
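
`register_coco_instances` expects the annotation file to be in COCO format, so a quick sanity check of the JSON before registering can save a failed run. The following is a minimal sketch: `summarize_coco` is a hypothetical helper, and the sample dictionary (including the `person` category name) is made up for illustration rather than taken from `dataset_validation.json`.

```python
from collections import Counter


def summarize_coco(coco):
    """Check the top-level COCO sections and count annotations per category."""
    for key in ("images", "annotations", "categories"):
        if key not in coco:
            raise KeyError(f"missing COCO section: {key}")
    names = {c["id"]: c["name"] for c in coco["categories"]}
    per_category = Counter(names[a["category_id"]] for a in coco["annotations"])
    return {"num_images": len(coco["images"]), "per_category": dict(per_category)}


# Tiny made-up annotation file; in the notebook this dictionary would come
# from json.load() on the real annotation file. Boxes are [x, y, width, height].
sample = {
    "images": [{"id": 1, "file_name": "0.png", "width": 640, "height": 480}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1,
         "bbox": [10, 20, 50, 120], "area": 6000, "iscrowd": 0},
        {"id": 2, "image_id": 1, "category_id": 1,
         "bbox": [200, 40, 40, 100], "area": 4000, "iscrowd": 0},
    ],
    "categories": [{"id": 1, "name": "person"}],
}
print(summarize_coco(sample))  # {'num_images': 1, 'per_category': {'person': 2}}
```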

### Get the parameters and weights from the trained model

In [None]:
cfg = get_cfg()
cfg.merge_from_file("/kaggle/input/personparkmodel/pytorch/peoplepark/2/cfg.yaml")
cfg.MODEL.WEIGHTS = "/kaggle/input/personparkmodel/pytorch/peoplepark/2/model_final.pth"

## Qualitative evaluation

Let's visualize the predictions made by the model. First we need to set a threshold for the confidence of the predictions.

In [None]:
# set a custom testing threshold
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.8  
# Create the predictor
predictor = DefaultPredictor(cfg)
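
Conceptually, the score threshold discards every predicted box whose confidence does not exceed it. The sketch below illustrates the idea on made-up detections; `filter_by_score` is a hypothetical helper, not the detectron2 implementation.

```python
def filter_by_score(detections, threshold):
    """Keep only detections whose confidence score exceeds the threshold."""
    return [d for d in detections if d["score"] > threshold]


# Made-up detections for illustration; boxes are (x1, y1, x2, y2)
detections = [
    {"box": (10, 20, 60, 140), "score": 0.97},
    {"box": (200, 40, 240, 140), "score": 0.81},
    {"box": (300, 50, 330, 120), "score": 0.42},
]
kept = filter_by_score(detections, threshold=0.8)
print([d["score"] for d in kept])  # [0.97, 0.81]
```

A higher threshold gives cleaner visualizations with fewer false positives, at the cost of dropping uncertain but possibly correct detections.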

Get a random image from the validation dataset and visualize the predictions made by the model.

In [None]:
# take a random image from the validation set
val_dir = "/kaggle/input/personparkval/lit/lit"
random_img = random.choice(os.listdir(val_dir))
# cv2.imread returns a BGR uint8 array, the input format DefaultPredictor expects
im = cv2.imread(os.path.join(val_dir, random_img))

# make a prediction
metadata = MetadataCatalog.get("person_park_val")
outputs = predictor(im)
# Visualizer expects RGB, so reverse the channel order
v = Visualizer(im[:, :, ::-1],
               metadata=metadata,
               scale=0.5,
)

# draw the prediction (get_image() returns RGB, which plt.imshow displays directly)
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
plt.imshow(out.get_image())

This is the result:

<img src="./docs/imgs/out/p0.png" alt="drawing" width="60%"/>
<img src="./docs/imgs/out/p1.png" alt="drawing" width="60%"/>
<img src="./docs/imgs/out/p2.png" alt="drawing" width="60%"/>
<img src="./docs/imgs/out/p3.png" alt="drawing" width="60%"/>
<img src="./docs/imgs/out/p4.png" alt="drawing" width="60%"/>
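
The `[:, :, ::-1]` slice used in the visualization cells reverses the last axis of a NumPy image array, which swaps BGR and RGB channel order. This matters because OpenCV loads images as BGR while Matplotlib displays RGB. As a dependency-free illustration, the sketch below does the same thing on nested Python lists; `flip_channels` is a hypothetical stand-in for the NumPy slice.

```python
def flip_channels(image):
    """Reverse the channel order of every pixel, e.g. BGR to RGB."""
    return [[pixel[::-1] for pixel in row] for row in image]


# A 1x2 "image" in BGR order: one pure-blue pixel and one pure-red pixel
bgr = [[(255, 0, 0), (0, 0, 255)]]
rgb = flip_channels(bgr)
print(rgb)  # [[(0, 0, 255), (255, 0, 0)]]
```

Applying the flip twice returns the original image, so an extra or missing flip shows up as swapped red and blue tones in the plot.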

We can do the same thing with random images taken from the internet:

In [None]:
# take a random image from the internet test set
test_dir = "/kaggle/input/personparktest/imgs"
random_img = random.choice(os.listdir(test_dir))
# cv2.imread returns a BGR uint8 array, the input format DefaultPredictor expects
im = cv2.imread(os.path.join(test_dir, random_img))

# make a prediction
metadata = MetadataCatalog.get("person_park_test")
outputs = predictor(im)
# Visualizer expects RGB, so reverse the channel order
v = Visualizer(im[:, :, ::-1],
               metadata=metadata,
               scale=0.5,
)

# draw the prediction (get_image() returns RGB, which plt.imshow displays directly)
out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
plt.imshow(out.get_image())

Some examples:

<img src="./docs/imgs/out/i0.png" alt="drawing" width="60%"/>
<img src="./docs/imgs/out/i1.png" alt="drawing" width="60%"/>
<img src="./docs/imgs/out/i2.png" alt="drawing" width="60%"/>
<img src="./docs/imgs/out/i3.png" alt="drawing" width="60%"/>
<img src="./docs/imgs/out/i4.png" alt="drawing" width="60%"/>

## Quantitative evaluation

We can use detectron2 to evaluate the model with the COCO evaluation metrics.

First, evaluate the model on the validation dataset:

In [None]:
evaluator = COCOEvaluator("person_park_val")
val_loader = build_detection_test_loader(cfg, "person_park_val")
print(inference_on_dataset(predictor.model, val_loader, evaluator))
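
The headline COCO metric, AP at `IoU=0.50:0.95`, averages the average precision over ten IoU thresholds from 0.50 to 0.95 in steps of 0.05. A prediction can match a ground-truth box at a given threshold only if their intersection over union (IoU) is at least that threshold. The sketch below computes the IoU of two made-up boxes and lists the thresholds at which they would match; `box_iou` is a hypothetical helper, not part of detectron2.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds averaged by the headline AP: 0.50, 0.55, ..., 0.95
thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]

gt = (0, 0, 100, 100)     # made-up ground-truth box
pred = (10, 0, 110, 100)  # made-up prediction, shifted 10 px to the right
iou = box_iou(gt, pred)
print(round(iou, 3))                         # 0.818
print([t for t in thresholds if iou >= t])   # [0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8]
```

This is why AP75 is typically lower than AP50: a box that is only slightly misplaced still counts at the loose thresholds but fails the strict ones.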

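Then evaluate the custom model, and the model from the detectron2 model zoo, on the test dataset. The `person_park_test` dataset must first be registered in the same way as the validation dataset.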

In [None]:
setup_logger()
evaluator = COCOEvaluator("person_park_test")
val_loader = build_detection_test_loader(cfg, "person_park_test")
print(inference_on_dataset(predictor.model, val_loader, evaluator))

This is the result of the evaluation:

```
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.784
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.979
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.923
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.702
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.858
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.941
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.051
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.421
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.815
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.747
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.889
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.958
[06/13 14:28:53 d2.evaluation.coco_evaluation]: Evaluation results for bbox: 
|   AP   |  AP50  |  AP75  |  APs   |  APm   |  APl   |
|:------:|:------:|:------:|:------:|:------:|:------:|
| 78.377 | 97.907 | 92.341 | 70.234 | 85.769 | 94.096 |
OrderedDict([('bbox', {'AP': 78.37658739206807, 'AP50': 97.90662613898476, 'AP75': 92.34089935771088, 'APs': 70.23352244408629, 'APm': 85.76934424291247, 'APl': 94.09629592084283})])
```
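
As the last line of the log shows, `inference_on_dataset` returns a dictionary keyed by task (here `bbox`), so the metrics can be reused programmatically rather than read from the log. The sketch below rebuilds the summary table row from the values printed above; `results_row` is a hypothetical helper.

```python
def results_row(metrics, keys=("AP", "AP50", "AP75", "APs", "APm", "APl")):
    """Format the selected metrics as one markdown table row with 3 decimals."""
    return "| " + " | ".join(f"{metrics[k]:.3f}" for k in keys) + " |"


# Values copied from the evaluation output above
results = {
    "bbox": {
        "AP": 78.37658739206807,
        "AP50": 97.90662613898476,
        "AP75": 92.34089935771088,
        "APs": 70.23352244408629,
        "APm": 85.76934424291247,
        "APl": 94.09629592084283,
    }
}
print(results_row(results["bbox"]))
# | 78.377 | 97.907 | 92.341 | 70.234 | 85.769 | 94.096 |
```

Keeping the dictionary from each run makes it straightforward to compare several models on the same dataset in a single table.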