<a href="https://colab.research.google.com/github/andandandand/practical-computer-vision/blob/main/notebooks/Exploring_Object_Detection_with_FiftyOne_on_COCO_2017.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Exploring Object Detection Performance with FiftyOne

In this notebook we examine the performance of a pretrained [RetinaNet](https://arxiv.org/abs/1708.02002) on the [COCO 2017 validation set](https://cocodataset.org/#home), through the open source [FiftyOne](https://github.com/voxel51/fiftyone) SDK and visualization app.

It covers the following concepts:

- Loading a dataset with ground truth labels [into FiftyOne](https://voxel51.com/docs/fiftyone/user_guide/dataset_creation/index.html)
- [Adding model predictions](https://voxel51.com/docs/fiftyone/recipes/adding_detections.html) to your dataset
- [Evaluating your model](https://voxel51.com/docs/fiftyone/user_guide/evaluation.html#detections) using FiftyOne's evaluation API
- Viewing the best and worst performing samples in your dataset

**So, what's the takeaway?**

Aggregate measures of performance like [mean Average Precision](https://kili-technology.com/data-labeling/machine-learning/mean-average-precision-map-a-complete-guide) don't give us the full picture of your detection model. In practice, the limiting factor on your model's performance is often data quality issues that we need to **see** to address. FiftyOne is designed to make it easy to do just that.

## Inspecting your datasets

Running the workflow presented here on your ML projects will help you to understand the current failure modes (edge cases) of your model and how to fix them, including:

- Identifying scenarios that require additional training samples in order to boost your model's performance
- Deciding whether your ground truth annotations have errors/weaknesses that need to be corrected before any subsequent model training will be profitable


This walkthrough demonstrates how to use FiftyOne to perform hands-on evaluation of your detection model.

## Install FiftyOne

In [1]:
# Install the library, you will need to uncomment this on Colab
#!pip install fiftyone==1.5.2 > /dev/null


In [2]:
#Check installed versions
import fiftyone as fo
print(f"FiftyOne version: {fo.__version__}")

import fiftyone.zoo as foz
import fiftyone.brain as fob 
from fiftyone import ViewField as F

FiftyOne version: 1.5.2


## Dataset loading

In [3]:
dataset = foz.load_zoo_dataset(
    "coco-2017",
    split="validation",
    dataset_name="evaluate-detections-tutorial",
)
dataset.persistent = True
dataset_clone = dataset.clone()
dataset_clone.persistent = True

print("Dataset loaded.")

# Let's inspect the first sample of the dataset.
# This will show its filepath, tags, metadata, and any existing fields like 'ground_truth'.
print(dataset_clone.first())

Downloading split 'validation' to '/fiftyone/zoo/datasets/coco-2017/validation' if necessary
Downloading annotations to '/fiftyone/zoo/datasets/coco-2017/tmp-download/annotations_trainval2017.zip'
 100% |██████|    1.9Gb/1.9Gb [3.6m elapsed, 0s remaining, 9.4Mb/s]       
Extracting annotations to '/fiftyone/zoo/datasets/coco-2017/raw/instances_val2017.json'
Downloading images to '/fiftyone/zoo/datasets/coco-2017/tmp-download/val2017.zip'
 100% |██████|    6.1Gb/6.1Gb [30.2s elapsed, 0s remaining, 215.3Mb/s]      
Extracting images to '/fiftyone/zoo/datasets/coco-2017/validation/data'
Writing annotations to '/fiftyone/zoo/datasets/coco-2017/validation/labels.json'
Dataset info written to '/fiftyone/zoo/datasets/coco-2017/info.json'
Loading 'coco-2017' split 'validation'
 100% |███████████████| 5000/5000 [9.9s elapsed, 0s remaining, 508.2 samples/s]      
Dataset 'evaluate-detections-tutorial' created
Dataset loaded.
<Sample: {
    'id': '6838b09acf5d4afce388ec8f',
    'media_type': 'ima

# Choose a random subset of samples to add predictions to

In [4]:
predictions_view = dataset_clone.take(100, seed=51)

In [5]:
model = foz.load_zoo_model("retinanet-resnet50-fpn-coco-torch")


Downloading model from 'https://download.pytorch.org/models/retinanet_resnet50_fpn_coco-eeacb38b.pth'...
 100% |██████|    1.0Gb/1.0Gb [9.7s elapsed, 0s remaining, 89.7Mb/s]       


Downloading: "https://download.pytorch.org/models/retinanet_resnet50_fpn_coco-eeacb38b.pth" to /root/.cache/torch/hub/checkpoints/retinanet_resnet50_fpn_coco-eeacb38b.pth
100%|██████████| 130M/130M [00:01<00:00, 69.2MB/s] 


In [6]:
predictions_view.apply_model(model, label_field="predictions")

 100% |█████████████████| 100/100 [2.8m elapsed, 0s remaining, 0.6 samples/s]    


In [7]:
session = fo.launch_app(predictions_view, auto=False)

Session launched. Run `session.show()` to open the App in a cell output.

Welcome to

███████╗██╗███████╗████████╗██╗   ██╗ ██████╗ ███╗   ██╗███████╗
██╔════╝██║██╔════╝╚══██╔══╝╚██╗ ██╔╝██╔═══██╗████╗  ██║██╔════╝
█████╗  ██║█████╗     ██║    ╚████╔╝ ██║   ██║██╔██╗ ██║█████╗
██╔══╝  ██║██╔══╝     ██║     ╚██╔╝  ██║   ██║██║╚██╗██║██╔══╝
██║     ██║██║        ██║      ██║   ╚██████╔╝██║ ╚████║███████╗
╚═╝     ╚═╝╚═╝        ╚═╝      ╚═╝    ╚═════╝ ╚═╝  ╚═══╝╚══════╝ v1.5.2

If you're finding FiftyOne helpful, here's how you can get involved:

|
|  ⭐⭐⭐ Give the project a star on GitHub ⭐⭐⭐
|  https://github.com/voxel51/fiftyone
|
|  🚀🚀🚀 Join the FiftyOne Discord community 🚀🚀🚀
|  https://community.voxel51.com/
|



In [8]:
# Shows the predictions on the view only
session.view = predictions_view
print(session.url)

http://0.0.0.0:5151/


In [9]:
# Resets the session; the entire dataset will now be shown
session.view = None
print(session.url)

http://0.0.0.0:5151/


In [10]:
# Going back to showing the view
session.view = predictions_view
print(session.url)

http://0.0.0.0:5151/


Now try inspecting the predictions with confidence > 0.75

![](https://github.com/andandandand/practical-computer-vision/blob/main/images/predictions.png?raw=true)

## Confidence threshold through the SDK

In [11]:
# Only contains detections with confidence >= 0.75
high_conf_view = predictions_view.filter_labels("predictions",
                                                F("confidence") > 0.75,
                                                only_matches=False)

Note the `only_matches=False` argument. When filtering labels, any samples that no longer contain labels would normally be removed from the view. However, this is not desired when performing evaluations since it can skew your results between views. We set `only_matches=False` so that all samples will be retained, even if some no longer contain labels.

In [12]:
# Print some information about the view
print(high_conf_view)

Dataset:     2025.05.29.19.08.20.907977
Media type:  image
Num samples: 100
Sample fields:
    id:               fiftyone.core.fields.ObjectIdField
    filepath:         fiftyone.core.fields.StringField
    tags:             fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:         fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)
    created_at:       fiftyone.core.fields.DateTimeField
    last_modified_at: fiftyone.core.fields.DateTimeField
    ground_truth:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    predictions:      fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
View stages:
    1. Take(size=100, seed=51)
    2. FilterLabels(field='predictions', filter={'$gt': ['$$this.confidence', 0.75]}, only_matches=False, trajectories=False)


In [13]:
# Print a prediction from the view to verify that its confidence is > 0.75
sample = high_conf_view.first()
sample

<SampleView: {
    'id': '6838b09ecf5d4afce38923e1',
    'media_type': 'image',
    'filepath': '/fiftyone/zoo/datasets/coco-2017/validation/data/000000189475.jpg',
    'tags': ['validation'],
    'metadata': <ImageMetadata: {
        'size_bytes': None,
        'mime_type': None,
        'width': 500,
        'height': 375,
        'num_channels': None,
    }>,
    'created_at': datetime.datetime(2025, 5, 29, 19, 8, 20, 912000),
    'last_modified_at': datetime.datetime(2025, 5, 29, 19, 8, 36, 511000),
    'ground_truth': <Detections: {
        'detections': [
            <Detection: {
                'id': '6838b09ecf5d4afce3892284',
                'attributes': {},
                'tags': [],
                'label': 'bottle',
                'bounding_box': [0.2356, 0.64016, 0.1394, 0.3390933333333333],
                'mask': None,
                'mask_path': None,
                'confidence': None,
                'index': None,
                'supercategory': 'kitchen',
    

In [14]:
# Load high confidence view in the App
session.view = high_conf_view
print(session.url)

http://0.0.0.0:5151/


Try inspecting patches on the view

![](https://github.com/andandandand/practical-computer-vision/blob/main/images/inspecting_patches.png?raw=true)

## Evaluate predictions


In [15]:
# Evaluate the predictions in the `faster_rcnn` field of our `high_conf_view`
# with respect to the objects in the `ground_truth` field
results = high_conf_view.evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval",
    compute_mAP=True,
)

Evaluating detections...
 100% |█████████████████| 100/100 [520.4ms elapsed, 0s remaining, 192.2 samples/s]      
Performing IoU sweep...
 100% |█████████████████| 100/100 [394.3ms elapsed, 0s remaining, 253.6 samples/s]      


In [16]:
# Get the 10 most common classes in the dataset
counts = dataset_clone.count_values("ground_truth.detections.label")
classes_top10 = sorted(counts, key=counts.get, reverse=True)[:10]

# Print a classification report for the top-10 classes
results.print_report(classes=classes_top10)

               precision    recall  f1-score   support

       person       0.99      0.40      0.57       229
          car       0.92      0.29      0.44        38
        chair       1.00      0.11      0.20        44
         book       0.00      0.00      0.00        37
       bottle       1.00      0.10      0.18        20
          cup       1.00      0.26      0.42        19
 dining table       1.00      0.11      0.19        19
traffic light       0.00      0.00      0.00        25
         bowl       0.00      0.00      0.00         9
      handbag       0.00      0.00      0.00        14

    micro avg       0.97      0.26      0.40       454
    macro avg       0.59      0.13      0.20       454
 weighted avg       0.80      0.26      0.38       454



In [17]:
print(results.mAP())

0.3034670744932542


In [18]:
# install ipywidgets if needed for the interactive plot below
!pip install 'ipywidgets>=8,<9'


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.3.1[0m[39;49m -> [0m[32;49m25.1.1[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


In [19]:
plot = results.plot_pr_curves(classes=["person", "car"])
plot.show()



FigureWidget({
    'data': [{'customdata': array([0.99218518, 0.98963709, 0.98864924, 0.98567861, 0.98375748, 0.97796589,
                                   0.97225765, 0.97107594, 0.96842679, 0.95127236, 0.869515  , 0.867999  ,
                                   0.8626564 , 0.85811699, 0.84844059, 0.84236296, 0.83271847, 0.82672042,
                                   0.82146986, 0.81599904, 0.81177319, 0.8047875 , 0.79303241, 0.71254991,
                                   0.70864769, 0.70561459, 0.69964364, 0.6886519 , 0.68341516, 0.67957386,
                                   0.67195178, 0.66524132, 0.65525721, 0.57322136, 0.55930566, 0.55193307,
                                   0.54647251, 0.53574771, 0.45811466, 0.37977147, 0.30017221, 0.        ,
                                   0.        , 0.        , 0.        , 0.        , 0.        , 0.        ,
                                   0.        , 0.        , 0.        , 0.        , 0.        , 0.        ,
                      

## Sample level analysis

In [20]:
# Our dataset's schema now contains `eval_*` fields from a confusion matrix
print(dataset_clone)

Name:        2025.05.29.19.08.20.907977
Media type:  image
Num samples: 5000
Persistent:  True
Tags:        []
Sample fields:
    id:               fiftyone.core.fields.ObjectIdField
    filepath:         fiftyone.core.fields.StringField
    tags:             fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:         fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)
    created_at:       fiftyone.core.fields.DateTimeField
    last_modified_at: fiftyone.core.fields.DateTimeField
    ground_truth:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    predictions:      fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    eval_tp:          fiftyone.core.fields.IntField
    eval_fp:          fiftyone.core.fields.IntField
    eval_fn:          fiftyone.core.fields.IntField


In [21]:
# The dataset keeps track of the evaluations that we have run
print(dataset_clone.list_evaluations())

['eval']


In [22]:
print(dataset_clone.get_evaluation_info("eval"))

{
    "key": "eval",
    "version": "1.5.2",
    "timestamp": "2025-05-29T19:11:28.592000",
    "config": {
        "cls": "fiftyone.utils.eval.coco.COCOEvaluationConfig",
        "type": "detection",
        "method": "coco",
        "pred_field": "predictions",
        "gt_field": "ground_truth",
        "iou": 0.5,
        "classwise": true,
        "custom_metrics": null,
        "iscrowd": "iscrowd",
        "use_masks": false,
        "use_boxes": false,
        "tolerance": null,
        "compute_mAP": true,
        "iou_threshs": [
            0.5,
            0.55,
            0.6,
            0.65,
            0.7,
            0.75,
            0.8,
            0.85,
            0.9,
            0.95
        ],
        "max_preds": 100,
        "error_level": 1
    }
}


In [23]:
# Load the view on which we ran the `eval` evaluation
eval_view = dataset_clone.load_evaluation_view("eval")
print(eval_view)

Dataset:     2025.05.29.19.08.20.907977
Media type:  image
Num samples: 100
Sample fields:
    id:               fiftyone.core.fields.ObjectIdField
    filepath:         fiftyone.core.fields.StringField
    tags:             fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:         fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)
    created_at:       fiftyone.core.fields.DateTimeField
    last_modified_at: fiftyone.core.fields.DateTimeField
    ground_truth:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    predictions:      fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    eval_tp:          fiftyone.core.fields.IntField
    eval_fp:          fiftyone.core.fields.IntField
    eval_fn:          fiftyone.core.fields.IntField
View stages:
    1. Take(size=100, seed=51)
    2. FilterLabels(field='predictions', filter={'$gt': ['$$this.confidence', 0.75]}, only_matche

In [24]:
# Our detections have helpful evaluation data on them
sample = high_conf_view.first()
print(sample.predictions.detections[0])

<Detection: {
    'id': '6838b0b4cf5d4afce3898fb1',
    'attributes': {},
    'tags': [],
    'label': 'person',
    'bounding_box': [
        0.006261056289076805,
        0.07051300257444382,
        0.43123242259025574,
        0.8487465977668762,
    ],
    'mask': None,
    'mask_path': None,
    'confidence': 0.9155575037002563,
    'index': None,
    'eval_iou': 0.9590503341005493,
    'eval_id': '6838b09ecf5d4afce389228e',
    'eval': 'tp',
}>


In [25]:
# View the `iscrowd` attribute on a ground truth object
sample = dataset_clone.first()
print(sample.ground_truth.detections[0])

<Detection: {
    'id': '6838b09acf5d4afce388ec7b',
    'attributes': {},
    'tags': [],
    'label': 'potted plant',
    'bounding_box': [
        0.37028125,
        0.3345305164319249,
        0.038593749999999996,
        0.16314553990610328,
    ],
    'mask': None,
    'mask_path': None,
    'confidence': None,
    'index': None,
    'supercategory': 'furniture',
    'iscrowd': 0,
}>


## Evaluation patches

So, now that we have a sense for the aggregate performance of our model, let's dive into sample-level analysis by creating an [evaluation view](https://voxel51.com/docs/fiftyone/user_guide/app.html#viewing-evaluation-patches).

Any evaluation that you stored on your dataset can be used to generate an [evaluation view](https://voxel51.com/docs/fiftyone/user_guide/app.html#viewing-evaluation-patches) that is a patches view creating a sample for every true positive, false positive, and false negative in your dataset.
Through this view, you can quickly filter and sort evaluated detections by their type (TP/FP/FN), evaluated IoU, and if they are matched to a crowd object.

These evaluation views can be created through Python or directly in the App as shown below.

In [26]:
eval_patches = dataset_clone.to_evaluation_patches("eval")
print(eval_patches)

Dataset:     2025.05.29.19.08.20.907977
Media type:  image
Num patches: 37606
Patch fields:
    id:               fiftyone.core.fields.ObjectIdField
    sample_id:        fiftyone.core.fields.ObjectIdField
    filepath:         fiftyone.core.fields.StringField
    tags:             fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:         fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)
    created_at:       fiftyone.core.fields.DateTimeField
    last_modified_at: fiftyone.core.fields.DateTimeField
    ground_truth:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    predictions:      fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    crowd:            fiftyone.core.fields.BooleanField
    type:             fiftyone.core.fields.StringField
    iou:              fiftyone.core.fields.FloatField
View stages:
    1. ToEvaluationPatches(eval_key='eval', config=None)


In [27]:
# let's use this evaluation to find false positives with confidence above > .85
session_view = high_conf_view
print(session.url )

http://0.0.0.0:5151/


![](https://github.com/andandandand/practical-computer-vision/blob/main/images/false_positive.png?raw=true)

## View with best performing cases

In [28]:
# Show samples with most true positives
session.view = high_conf_view.sort_by("eval_tp", reverse=True)
print(session.url)

http://0.0.0.0:5151/


## View with the worst performing cases by false positives

In [29]:
# Show samples with most false positives
session.view = high_conf_view.sort_by("eval_fp", reverse=True)
print(session.url)

http://0.0.0.0:5151/


## View with the best performing cases by false negatives

In [30]:
# Show samples with most false negatives
session.view = high_conf_view.sort_by("eval_fn", reverse=True)
print(session.url)

http://0.0.0.0:5151/


## Filtering by bounding box area

In [31]:
# Compute metadata so we can reference image height/width in our view
dataset.compute_metadata()

In [32]:
#
# Create an expression that will match objects whose bounding boxes have
# area less than 32^2 pixels
#
# Bounding box format is [top-left-x, top-left-y, width, height]
# with relative coordinates in [0, 1], so we multiply by image
# dimensions to get pixel area
#
bbox_area = (
    F("$metadata.width") * F("bounding_box")[2] *
    F("$metadata.height") * F("bounding_box")[3]
)
small_boxes = bbox_area < 32 ** 2

# Create a view that contains only small (and high confidence) predictions
small_boxes_view = high_conf_view.filter_labels("predictions", small_boxes)

session.view = small_boxes_view
print(session.url)

http://0.0.0.0:5151/


In [33]:
# Create a view that contains only small GT and predicted boxes
small_boxes_eval_view = (
    high_conf_view
    .filter_labels("ground_truth", small_boxes, only_matches=False)
    .filter_labels("predictions", small_boxes, only_matches=False)
)

# Run evaluation
small_boxes_results = small_boxes_eval_view.evaluate_detections(
    "predictions",
    gt_field="ground_truth",
)

Evaluating detections...
 100% |█████████████████| 100/100 [98.9ms elapsed, 0s remaining, 1.0K samples/s] 


In [34]:
# Get the 10 most common small object classes
small_counts = small_boxes_eval_view.count_values("ground_truth.detections.label")
classes_top10_small = sorted(small_counts, key=counts.get, reverse=True)[:10]

# Print a classification report for the top-10 small object classes
small_boxes_results.print_report(classes=classes_top10_small)

               precision    recall  f1-score   support

       person       0.80      0.06      0.10        72
          car       0.80      0.19      0.31        21
        chair       0.00      0.00      0.00         8
         book       0.00      0.00      0.00        22
       bottle       0.00      0.00      0.00        13
          cup       0.00      0.00      0.00        10
 dining table       0.00      0.00      0.00         3
traffic light       0.00      0.00      0.00        21
         bowl       0.00      0.00      0.00         1
      handbag       0.00      0.00      0.00         4

    micro avg       0.73      0.05      0.09       175
    macro avg       0.16      0.02      0.04       175
 weighted avg       0.43      0.05      0.08       175



## Inspecting the crowd views

In [35]:
# View the `iscrowd` attribute on a ground truth object
sample = dataset_clone.first()
print(sample.ground_truth.detections[0])

<Detection: {
    'id': '6838b09acf5d4afce388ec7b',
    'attributes': {},
    'tags': [],
    'label': 'potted plant',
    'bounding_box': [
        0.37028125,
        0.3345305164319249,
        0.038593749999999996,
        0.16314553990610328,
    ],
    'mask': None,
    'mask_path': None,
    'confidence': None,
    'index': None,
    'supercategory': 'furniture',
    'iscrowd': 0,
}>


In [36]:
# Create a view that contains only samples for which at least one detection has
# its iscrowd attribute set to 1
crowded_images_view = high_conf_view.match(
    F("ground_truth.detections").filter(F("iscrowd") == 1).length() > 0
)

session.view = crowded_images_view
print(session.url)

http://0.0.0.0:5151/


In [37]:
# Evaluating the crowd by the number of false positives
session.view = crowded_images_view.sort_by("eval_fp", reverse=True)
print(session.url)

http://0.0.0.0:5151/


## Using the model to improve the dataset (active learning)

In [38]:
# Tag all highly confident false positives as "possibly-missing"
(
    high_conf_view
        .filter_labels("predictions", F("eval") == "fp")
        .select_fields("predictions")
        .tag_labels("possibly-missing")
)

These tagged labels could then be sent off to our annotation provider of choice for review and addition to the ground truth labels. FiftyOne currently offers integrations for [Scale AI](https://voxel51.com/docs/fiftyone/api/fiftyone.utils.scale.html), [Labelbox](https://voxel51.com/docs/fiftyone/api/fiftyone.utils.labelbox.html), and [CVAT](https://voxel51.com/docs/fiftyone/api/fiftyone.types.dataset_types.html?highlight=cvat#fiftyone.types.dataset_types.CVATImageDataset).


In [39]:
# Export all labels with the `possibly-missing` tag in CVAT format
(
    dataset
        .select_labels(tags=["possibly-missing"])
        .export("./possibly_missing_labels", fo.types.CVATImageDataset)
)

 100% |█████████████████████| 0/0 [11.5ms elapsed, ? remaining, ? samples/s] 


## Summary

In this tutorial, we covered loading a dataset into FiftyOne and analyzing the performance of an out-of-the-box object detection model on the dataset.

**So, what's the takeaway?**

Aggregate evaluation results for an object detector are important, but they alone don't tell the whole story of a model's performance. It's critical to study the failure modes of your model so you can take the right actions to improve them.

In this tutorial, we covered two types of analysis:

- Analyzing the performance of your detector across different strata, like high confidence, small objects in crowded scenes
- Inspecting the hardest samples in your dataset to diagnose the underlying issue, whether it be your detector or the ground truth annotations

## About this tutorial

This tutorial is based on [FiftyOne's documentation](https://docs.voxel51.com/tutorials/evaluate_detections.html). You will notice a couple of minor changes.

* Views for the app launch on their own window, this makes it easier for us to inspect the output of our views on the app.
* We create a clone of the COCO dataset at the start of the notebook so that we can go back to its original state if we want
* I replaced Faster R-CNN for RetinaNet. As an exercise I encourage to try it with [Faster R-CNN from our model zoo](https://docs.voxel51.com/model_zoo/models.html#faster-rcnn-resnet50-fpn-coco-torch) or through our [integration with Ultralytics's YOLO](https://docs.voxel51.com/integrations/ultralytics.html).

