# Evaluate a Detection Model with FiftyOne

This walkthrough demonstrates how to use FiftyOne to perform a hands-on evaluation of your detection model.

It covers the following concepts:
* Loading a dataset with detections
* Adding detection predictions
* Sample-wise MSCOCO evaluation
* Sorting and searching samples by model performance
* Visualizing true-positives and false-positives
* Querying your dataset for a custom insight

# Setup

Install `torch` and `torchvision`, if necessary:

In [None]:
# Modify as necessary (e.g., GPU install). See https://pytorch.org for options
!pip install torch
!pip install torchvision

Import FiftyOne, the FiftyOne zoo, and PyTorch. The zoo is used in the next step to download the MSCOCO validation split to `~/fiftyone/coco-2017/validation`.

In [3]:
import fiftyone as fo
import fiftyone.zoo as foz
import torch, torchvision

Media data is not copied when it is read into FiftyOne, and by default the dataset is deleted when the Python process ends. Set `dataset.persistent = True` so that this process can be killed and the dataset can be loaded again instantly in a new process.

In [2]:
dataset = foz.load_zoo_dataset("coco-2017", "validation")
dataset.persistent = True

Split 'validation' already downloaded
Loading 'coco-2017' split 'validation'
 100%  5000/5000 [18.6s elapsed, 0s remai
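
Once the dataset is persistent, a later Python process can reload it by name instead of importing it again. The following is a minimal sketch, assuming the dataset keeps the name `coco-2017-validation` that appears in the dataset summary later in this walkthrough. The helper name `reload_persistent_dataset` is hypothetical; `fo.list_datasets()` and `fo.load_dataset()` are the FiftyOne calls it relies on.

```python
def reload_persistent_dataset(name="coco-2017-validation"):
    """Load a previously persisted FiftyOne dataset by name.

    The default name is an assumption taken from the dataset summary
    printed later in this walkthrough.
    """
    # Imported lazily so that defining this helper does not start FiftyOne
    import fiftyone as fo

    # Fail with a clear message if the dataset was never made persistent
    if name not in fo.list_datasets():
        raise ValueError(f"No persistent dataset named {name!r}")

    return fo.load_dataset(name)
```

In a fresh process, `dataset = reload_persistent_dataset()` would then take the place of the `load_zoo_dataset` call.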


Initialize Faster-RCNN and download pretrained weights:

In [None]:
# Run the model on gpu if it is available
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.to(device)
model.eval()

# Generate Predictions
Run Faster-RCNN on every sample in the validation dataset and add detections to our FiftyOne dataset.
Predictions are added to each sample in a new field that we will call `faster_rcnn`.

In [4]:
# ETA is installed with FiftyOne
# etai provides functionality to read images into memory
import fiftyone.core.utils as fou
import eta.core.image as etai
import json
import os
from torchvision.transforms import functional as func

# The zoo downloads the split to ~/fiftyone/coco-2017/validation
labels_path = os.path.expanduser("~/fiftyone/coco-2017/validation/labels.json")
with open(labels_path, "r") as labels_file:
    classes = json.load(labels_file)["classes"]

# Add predictions
with fou.ProgressBar() as pb:
    for sample in pb(dataset):
        image = etai.read(sample.filepath)
        image = func.to_tensor(image).to(device)
        c,h,w = image.shape

        # Inference only, so skip gradient tracking
        with torch.no_grad():
            preds = model([image])[0]

        labels = preds["labels"].cpu().detach().numpy()
        scores = preds["scores"].cpu().detach().numpy()
        boxes = preds["boxes"].cpu().detach().numpy()

        detections = []
        for label, score, box in zip(labels, scores, boxes):
            # Compute relative bounding box coordinates
            x1, y1, x2, y2 = box
            rel_box = [x1/w, y1/h, (x2-x1)/w, (y2-y1)/h]

            detections.append(fo.Detection(
                label=classes[label],
                bounding_box=rel_box,
                confidence=score
            ))

        sample["faster_rcnn"] = fo.Detections(
            detections=detections
        )
        sample.save()

print("Finished adding predictions")

 100%  5000/5000 [5.0m elapsed, 0s remain
Finished adding predictions
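
The loop above converts each box from absolute `[x1, y1, x2, y2]` pixel corners, as returned by the torchvision model, to relative `[x, y, width, height]` values. As a standalone sketch of that arithmetic (the helper name `to_relative_xywh` is hypothetical and not part of FiftyOne):

```python
def to_relative_xywh(box, image_width, image_height):
    """Convert absolute [x1, y1, x2, y2] corners to relative [x, y, w, h]."""
    x1, y1, x2, y2 = box
    return [
        x1 / image_width,           # left edge as a fraction of image width
        y1 / image_height,          # top edge as a fraction of image height
        (x2 - x1) / image_width,    # box width as a fraction of image width
        (y2 - y1) / image_height,   # box height as a fraction of image height
    ]


# A 320x120-pixel box with its top-left corner at (160, 120) in a 640x480 image
rel_box = to_relative_xywh([160, 120, 480, 240], image_width=640, image_height=480)
print(rel_box)  # [0.25, 0.25, 0.5, 0.25]
```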


# Evaluate Detections
Use the MSCOCO detection evaluation provided within FiftyOne to threshold detections and compute true and false positives for each sample.

In [2]:
from fiftyone import ViewField as F

FiftyOne allows you to write expressions to filter detections or classifications based on attributes of your field. For example, we can filter all of our Faster-RCNN detections to keep only boxes with a `confidence` higher than `0.75` and store them in a new `DatasetView`.

In [6]:
faster_rcnn_75 = dataset.filter_detections("faster_rcnn", F("confidence") > 0.75)
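
To make the effect of this filter concrete, here is a plain-Python sketch of the same idea on ordinary dictionaries. It illustrates the semantics only and is not how FiftyOne implements views. The function name and sample data are hypothetical, and keeping samples whose detection list becomes empty is an assumption of this sketch.

```python
def filter_by_confidence(samples, threshold):
    """Return copies of the samples that keep only detections above threshold."""
    return [
        {
            **sample,
            "detections": [
                det for det in sample["detections"] if det["confidence"] > threshold
            ],
        }
        for sample in samples
    ]


samples = [
    {
        "filepath": "a.jpg",
        "detections": [
            {"label": "cat", "confidence": 0.9},
            {"label": "dog", "confidence": 0.4},
        ],
    },
    {"filepath": "b.jpg", "detections": [{"label": "car", "confidence": 0.6}]},
]

filtered = filter_by_confidence(samples, 0.75)
print([len(s["detections"]) for s in filtered])  # [1, 0]
```

As with a view, the original `samples` list is left unchanged; only the filtered copy drops the low-confidence boxes.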

To make this filter more permanent we can clone the "faster_rcnn" field into "faster_rcnn_75" using the filtered view we just computed. This new field will now only contain bounding boxes with a `confidence > 0.75`.

In [7]:
dataset.clone_field("faster_rcnn", "faster_rcnn_75", samples=faster_rcnn_75)

 100%  5000/5000 [59.0s elapsed, 0s remai


(5000, 0)

Match detections to ground truth and compute true and false positives according to the MSCOCO evaluation protocol.

In [3]:
import fiftyone.utils.cocoeval as fouc

fouc.evaluate_detections(dataset, "faster_rcnn_75", "ground_truth")

Evaluating detections for each sample
 100%  5000/5000 [1.6m elapsed, 0s remain


In [4]:
dataset

Name:           coco-2017-validation
Persistent:     True
Num samples:    5000
Tags:           ['validation']
Sample fields:
    filepath:       fiftyone.core.fields.StringField
    tags:           fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:       fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.Metadata)
    ground_truth:   fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    faster_rcnn:    fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    faster_rcnn_75: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    tp_iou_0_75:    fiftyone.core.fields.IntField
    fp_iou_0_75:    fiftyone.core.fields.IntField
    fn_iou_0_75:    fiftyone.core.fields.IntField

Every `Sample` now contains the new fields `tp_iou_0_75`, `fp_iou_0_75`, and `fn_iou_0_75`, which hold the total true positive, false positive, and false negative counts for that sample's detections at an IoU of 0.75. The IoU used for these fields can be changed with the `save_iou` kwarg, e.g. `evaluate_detections(dataset, "faster_rcnn_75", "ground_truth", save_iou=0.95)`.
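
To illustrate what these counts mean, the sketch below computes IoU for two boxes in relative `[x, y, width, height]` format and counts true positives, false positives, and false negatives for one sample at a single IoU threshold. It uses a simplified greedy matching and ignores `iscrowd` handling, so it is not the pycocotools algorithm that FiftyOne applies. All function names and data here are hypothetical.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x, y, width, height] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def count_tp_fp_fn(predictions, ground_truth, iou_threshold=0.75):
    """Greedily match predictions to same-label ground truth boxes."""
    unmatched = list(range(len(ground_truth)))
    tp = fp = 0
    # Higher-confidence predictions get the first pick of ground truth boxes
    for pred in sorted(predictions, key=lambda p: p["confidence"], reverse=True):
        best_idx, best_iou = None, iou_threshold
        for idx in unmatched:
            gt = ground_truth[idx]
            if gt["label"] != pred["label"]:
                continue
            overlap = iou(pred["box"], gt["box"])
            if overlap >= best_iou:
                best_idx, best_iou = idx, overlap
        if best_idx is None:
            fp += 1  # no unmatched ground truth box overlaps enough
        else:
            tp += 1
            unmatched.remove(best_idx)
    # Ground truth boxes that were never matched are false negatives
    return tp, fp, len(unmatched)


ground_truth = [
    {"label": "cat", "box": [0.25, 0.25, 0.5, 0.5]},
    {"label": "dog", "box": [0.75, 0.75, 0.125, 0.125]},
]
predictions = [
    {"label": "cat", "box": [0.25, 0.25, 0.5, 0.5], "confidence": 0.9},
    {"label": "cat", "box": [0.0, 0.0, 0.125, 0.125], "confidence": 0.8},
]

print(count_tp_fp_fn(predictions, ground_truth))  # (1, 1, 1)
```

The first prediction matches the cat exactly, the second has no remaining cat to match and counts as a false positive, and the undetected dog counts as a false negative.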

The `faster_rcnn_75` field of every `Sample` now contains a new `ground_truth_eval` field that holds the `true_positives`, `false_positives`, and `false_negatives` counts for IoUs ranging from `0_5`, `0_55`, ..., to `0_95`.

Every `Detection` in the `faster_rcnn_75` field now also has a `ground_truth_eval` field that contains:
* The unique `eval_id` of that detection
* The `ious` between that detection and all ground truth detections of the same class
* The `matches` for 10 IoU values ranging from `0.5` to `0.95` that each contain the `gt_id` and `iou` of the ground truth detection that this predicted detection was matched with according to the pycocotools matching algorithm
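
The ten IoU values and their underscore-style keys can be enumerated with a few lines of Python. This is a sketch under the assumption that every key is formed by replacing the decimal point with an underscore, as in `0_5`, `0_55`, and `tp_iou_0_75` above; the exact spelling of intermediate keys such as `0_6` is inferred, not confirmed by the output shown here.

```python
def iou_key(threshold):
    """Format an IoU threshold like 0.75 as a key like '0_75' (assumed format)."""
    return str(threshold).replace(".", "_")


# 0.50, 0.55, ..., 0.95: the ten thresholds used by MSCOCO-style evaluation
thresholds = [t / 100 for t in range(50, 100, 5)]
keys = [iou_key(t) for t in thresholds]

print(len(keys))              # 10
print(keys[0], keys[-1])      # 0_5 0_95
print("tp_iou_" + iou_key(0.75))  # tp_iou_0_75
```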


# Visualize Detections
Launch the FiftyOne app and easily view ground truth and predicted bounding boxes.

In [None]:
session = fo.launch_app(dataset=dataset)

In [6]:
session.dataset = dataset

![launch](images/eval_dets/launch_app.png)

All fields are shown as togglable bubbles on the left sidebar which can be used to switch between ground truth detections, predictions, and thresholded predictions.

![bubbles](images/eval_dets/coco_gt.png)

## Dataset Views
A `DatasetView` can also be used to search, sort, or slice your dataset so that you can look at different views of the samples.

Individual samples can be selected and a `DatasetView` can be created to look at just those samples.

In [None]:
selected_samples = session.selected
session.view = dataset.select(selected_samples)

![selected](images/eval_dets/selected.png)

Reset the session dataset to show the entire dataset again.

In [20]:
session.dataset = dataset

`tp_iou_0_75` was calculated for each sample during evaluation. We can make a `DatasetView` that sorts by `tp_iou_0_75` to look at the best predictions that the model had based on the number of true positives.

In [18]:
session.view = dataset.sort_by("tp_iou_0_75", reverse=True)

![tp_rev](images/eval_dets/tp_rev.png)

Similarly, we can sort by `fp_iou_0_75` to see the samples that our model performed the worst on, based on the number of false positives.

In [25]:
session.view = dataset.sort_by("fp_iou_0_75", reverse=True)

![fp_rev](images/eval_dets/fp_rev.png)

`DatasetView` queries are extremely powerful. For example, if we just want to look at how our model performed on small objects, we can write an expression that filters out every detection except those whose box width multiplied by box height is less than `0.005`. Bounding box coordinates are relative, so this keeps boxes that cover less than 0.5% of the image area.

In [22]:
small_boxes_view = dataset.filter_detections(
    "faster_rcnn_75",
    F("bounding_box")[2] * F("bounding_box")[3] < 0.005
)

session.view = small_boxes_view

![small](images/eval_dets/small_view.png)
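
To get a feel for the `0.005` threshold, the sketch below applies the same test to plain relative `[x, y, width, height]` boxes and converts the threshold to pixels for an example image. The helper names and the 640x480 image size are hypothetical and chosen only for illustration.

```python
def relative_area(bounding_box):
    """Area of a relative [x, y, width, height] box as a fraction of the image."""
    _, _, width, height = bounding_box
    return width * height


def is_small(bounding_box, max_area=0.005):
    """Same test as the view expression: width * height < max_area."""
    return relative_area(bounding_box) < max_area


# On a 640x480 image, 0.5% of the area is about 1536 pixels,
# roughly a 39x39-pixel square
image_width, image_height = 640, 480
max_pixels = 0.005 * image_width * image_height

print(is_small([0.1, 0.1, 0.05, 0.05]))  # True: covers 0.25% of the image
print(is_small([0.1, 0.1, 0.2, 0.2]))    # False: covers 4% of the image
```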

In MSCOCO, bounding boxes can have an `iscrowd` attribute indicating that the box contains multiple instances of the same object. We can make a view of only samples with the `iscrowd` attribute on a detection.

In [23]:
crowded_images_view = dataset.match(
    F("ground_truth.detections").filter(F("attributes.iscrowd.value") == 1).length() > 0
)

session.view = crowded_images_view

![crowd](images/eval_dets/crowded_view.png)

We can sort that view of crowded images by false positive count in decreasing order to see samples that have a lot of false predictions and also include an `iscrowd` ground truth object.

In [24]:
sorted_crowded_images_view = crowded_images_view.sort_by(
    "fp_iou_0_75", reverse=True
)

session.view = sorted_crowded_images_view

![crowd_sort](images/eval_dets/crowded_sorted.png)
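
The two steps above, matching samples that contain an `iscrowd` ground truth object and then sorting them by false positive count, can be sketched in plain Python as follows. The dictionaries are a simplified, hypothetical stand-in for FiftyOne samples and do not mirror the real label schema.

```python
def has_crowd(sample):
    """True if any ground truth detection is flagged with iscrowd == 1."""
    return any(det.get("iscrowd") == 1 for det in sample["ground_truth"])


samples = [
    {"id": "a", "fp_iou_0_75": 2, "ground_truth": [{"label": "person", "iscrowd": 1}]},
    {"id": "b", "fp_iou_0_75": 9, "ground_truth": [{"label": "person", "iscrowd": 0}]},
    {
        "id": "c",
        "fp_iou_0_75": 5,
        "ground_truth": [
            {"label": "car", "iscrowd": 0},
            {"label": "person", "iscrowd": 1},
        ],
    },
]

# Step 1: keep only samples with at least one crowd annotation
crowded = [s for s in samples if has_crowd(s)]

# Step 2: sort those samples by false positive count, highest first
sorted_crowded = sorted(crowded, key=lambda s: s["fp_iou_0_75"], reverse=True)

print([s["id"] for s in sorted_crowded])  # ['c', 'a']
```

Sample `b` has the most false positives overall but no crowd annotation, so it is excluded from this view.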

Now if we compare this view to the one where we just sorted by false positives we can see something interesting.

In [25]:
session.view = dataset.sort_by("fp_iou_0_75", reverse=True)

![fp_rev](images/eval_dets/fp_rev.png)

Looking into specific examples, we can see that the samples with a lot of false positives are ones where the underlying ground truth bounding box is missing the `iscrowd` attribute. As a result, crowds of correct predictions were labeled as false positives even though they are true positives. Knowing this, the MSCOCO labels could be refined to fix the missing `iscrowd` attributes.

This finding would have been nearly impossible to make without looking at individual samples and searching the dataset by a variety of different criteria.