# Detection Mistakenness with FiftyOne

Finding mistakes in your annotations can be extremely tedious. The mistakenness feature of FiftyOne can be used to help you find annotation mistakes. Check out [our classification tutorial](https://voxel51.com/docs/fiftyone/tutorials/label_mistakes.html) to see how FiftyOne can help you find and correct label mistakes in your classification datasets.

This recipe is designed to show you how you can use FiftyOne to compute mistakenness on your detection dataset, enabling you to curate higher quality datasets and, ultimately, train better models!

## Overview

In this recipe, we explore how FiftyOne can be used to help you find mistakes in your detection annotations.

Requirements:

- A detection model trained on the same label schema as the annotations you want to analyze
- A FiftyOne Dataset with your annotations and predictions from the model with logits for each detection


We'll cover the following concepts:

-   Computing insights into your detection dataset relating to possible mistakes
-   Visualizing the mistakes in the FiftyOne App



### Outcome


Mistakenness for detection datasets will add the following fields:

Ground truth detection fields:

- `mistakenness`: A measure of the correctness of the detection and classification of the object therein
- `mistakenness_loc`: A measure of the mistakenness of the bounding box localization specifically
- `possible_mistake`: If the ground truth object was not matched with a prediction, it is flagged as a possible mistake

Prediction field:

- `possible_mistake`: If the prediction was confident but not matched with a ground truth object, it is flagged as a possible mistake


Sample level fields:

- `mistakenness`: An average of the `mistakenness` of all detections (larger values are more likely to be mistakes)
- `possible_mistakes`: The number of possible mistakes across all detections in the sample

## Setup

Your Dataset should have two `Detections` fields, one with your ground truth annotations and one with your model predictions.

In this example, we used the `coco-2017-validation` dataset from the [FiftyOne zoo](https://voxel51.com/docs/fiftyone/user_guide/dataset_creation/zoo.html) and added predictions from the [PyTorch implementation of Faster-RCNN](https://github.com/pytorch/vision/blob/master/torchvision/models/detection/faster_rcnn.py).

In [4]:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("coco-2017", split="validation")

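# `add_custom_predictions` is a user-defined helper (not shown here) that adds a
# `predictions` field with model detections, including logits, to each sample
add_custom_predictions(dataset)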

print(dataset)

Name:           coco-2017-validation
Media type:     image
Num samples:    1002
Persistent:     True
Info:           {'classes': ['0', 'person', 'bicycle', ...]}
Tags:           ['validation']
Sample fields:
    filepath:     fiftyone.core.fields.StringField
    tags:         fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.Metadata)
    ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    predictions:  fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
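
The `add_custom_predictions` helper is not defined in this recipe. Whatever it does, it has to store boxes the way FiftyOne expects them: `[top-left-x, top-left-y, width, height]`, relative to the image dimensions. torchvision detection models return absolute `[x1, y1, x2, y2]` pixel coordinates, so a conversion step is needed. Here is a minimal sketch of that step; the function name `to_fiftyone_box` and the image size are hypothetical.

```python
def to_fiftyone_box(xyxy, img_width, img_height):
    """Convert absolute [x1, y1, x2, y2] pixels to relative [x, y, w, h]."""
    x1, y1, x2, y2 = xyxy
    return [
        x1 / img_width,          # top-left x, as a fraction of image width
        y1 / img_height,         # top-left y, as a fraction of image height
        (x2 - x1) / img_width,   # box width, as a fraction of image width
        (y2 - y1) / img_height,  # box height, as a fraction of image height
    ]


# Hypothetical 640x480 image with a box spanning (64, 48) to (320, 240)
box = to_fiftyone_box([64, 48, 320, 240], img_width=640, img_height=480)
print(box)
```

The resulting list could then be passed as the `bounding_box` of a `Detection`.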


In [5]:
print(dataset.first().predictions.detections[0])

<Detection: {
    'id': '5fbd91988d2308dc08d1e091',
    '_id': '5fbd91988d2308dc08d1e091',
    'attributes': BaseDict({}),
    'label': 'chair',
    'bounding_box': BaseList([
        0.4544248104095459,
        0.510538163879108,
        0.10179729461669922,
        0.2363765519549589,
    ]),
    'mask': None,
    'confidence': 0.9961196184158325,
    'index': None,
    '_cls': 'Detection',
    'logits': BaseList([
        9.405427932739258,
        6.29296875,
        2.7285072803497314,
        0.41016122698783875,
        1.419637680053711,
        -1.2565374374389648,
        -0.6636660695075989,
        -2.6886038780212402,
        -0.48208746314048767,
        1.0514572858810425,
        -1.1872789859771729,
        -2.45004940032959,
        -2.424051284790039,
        -4.479976177215576,
        -1.658575177192688,
        6.200488567352295,
        -1.1229450702667236,
        -1.4458647966384888,
        1.2875698804855347,
        0.9066644906997681,
        -0.24453511834
        ...
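
The `logits` list in the output above holds raw, unnormalized per-class scores. As a rough illustration of how such scores relate to probabilities, the sketch below applies a softmax to a small set of logits. The three values are hypothetical and are not taken from the dataset, and this is not necessarily how the model computes its `confidence`.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    # Subtract the largest logit first for numerical stability
    max_logit = max(logits)
    exps = [math.exp(x - max_logit) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical 3-class logits (not from the dataset above)
example_logits = [2.0, 0.5, -1.0]
probs = softmax(example_logits)

# Index of the most probable class and its probability
top_class = max(range(len(probs)), key=probs.__getitem__)
print(top_class, round(probs[top_class], 3))
```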

## Find the mistakes

Now we can run a method from the FiftyOne Brain that estimates the mistakenness of the
ground truth detections for which we generated predictions:

In [6]:
import fiftyone.brain as fob

# Compute mistakenness; replace "predictions" and "ground_truth" with the field names in your dataset
fob.compute_mistakenness(dataset, "predictions", label_field="ground_truth")

Evaluating detections...
 100% |██████████████████████████████████████████████████████████████| 1002/1002 [48.0s elapsed, 0s remaining, 23.2 samples/s]      
Computing mistakenness...
  100% |██████████████████████████████████████████████████████████████| 1002/1002 [50.1s elapsed, 0s remaining, 24.6 samples/s]    
  100% |██████████████████████████████████████████████████████████████| 1002/1002 [51.6s elapsed, 0s remaining, 26.0 samples/s]      
Mistakenness computation complete


The above method added fields at both the sample and detection level to every sample for which we had predictions. Specifically, it added the following.

Ground truth detection fields:

- `mistakenness`: A measure of the correctness of the detection and classification of the object therein
- `mistakenness_loc`: A measure of the mistakenness of the bounding box localization specifically
- `possible_mistake`: If the ground truth object was not matched with a prediction, it is flagged as a possible mistake

Prediction field:

- `possible_mistake`: If the prediction was confident but not matched with a ground truth object, it is flagged as a possible mistake

Sample level fields:

- `mistakenness`: An average of the `mistakenness` of all detections (larger values are more likely to be mistakes)
- `possible_mistakes`: The number of possible mistakes across all detections in the sample
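
To make the relationship between the detection-level and sample-level fields concrete, here is a minimal sketch of the aggregation described in the lists above. It uses plain dicts as stand-ins for FiftyOne detections and is not the FiftyOne Brain's implementation. The function name `summarize_sample` is hypothetical, and the sketch assumes that unmatched ground truth objects carry a `possible_mistake` flag but no `mistakenness` value.

```python
def summarize_sample(gt_detections, pred_detections):
    """Aggregate detection-level values into sample-level values."""
    # Average the mistakenness of the ground truth detections that have one
    scores = [
        d["mistakenness"]
        for d in gt_detections
        if d.get("mistakenness") is not None
    ]
    mean_mistakenness = sum(scores) / len(scores) if scores else None

    # Count flagged detections across ground truth and predictions
    num_flagged = sum(
        1 for d in gt_detections + pred_detections if d.get("possible_mistake")
    )

    return {
        "mistakenness": mean_mistakenness,
        "possible_mistakes": num_flagged,
    }


# Hypothetical sample: two matched ground truth objects, one unmatched
# ground truth object, and one confident but unmatched prediction
ground_truth = [
    {"mistakenness": 0.9},
    {"mistakenness": 0.3},
    {"possible_mistake": True},
]
predictions = [{"possible_mistake": True}, {}]

summary = summarize_sample(ground_truth, predictions)
print(summary)
```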

We can easily sort by likelihood of mistakenness from code:

In [7]:
# Sort by likelihood of mistake (most likely first)
mistake_view = (dataset
    .sort_by("mistakenness", reverse=True)
)

# Print some information about the view
print(mistake_view)

Dataset:        coco-2017-validation
Media type:     image
Num samples:    1002
Tags:           ['validation']
Sample fields:
    filepath:          fiftyone.core.fields.StringField
    tags:              fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:          fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.Metadata)
    ground_truth:      fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    predictions:       fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    mistakenness:      fiftyone.core.fields.FloatField
    possible_mistakes: fiftyone.core.fields.IntField
Pipeline stages:
    1. SortBy(field_or_expr='mistakenness', reverse=True)


In [12]:
# Inspect some samples and detections
# This is the first detection of the first sample
print(mistake_view.first().ground_truth.detections[0])

<Detection: {
    'id': '5fbd91818d2308dc08d0badc',
    '_id': '5fbd91818d2308dc08d0badc',
    'attributes': BaseDict({
        'area': <NumericAttribute: {'value': 2784.7888000000003, '_cls': 'NumericAttribute'}>,
        'iscrowd': <NumericAttribute: {'value': 0.0, '_cls': 'NumericAttribute'}>,
    }),
    'label': 'kite',
    'bounding_box': BaseList([
        0.43630232558139537,
        0.174640625,
        0.23800000000000002,
        0.114140625,
    ]),
    'mask': None,
    'confidence': None,
    'index': None,
    '_cls': 'Detection',
    'predictions_eval': BaseDict({
        'matches': BaseDict({
            '0_5': BaseDict({
                'pred_id': '5fbd921c8d2308dc08d205d4',
                'iou': 0.5872641643442088,
            }),
        }),
    }),
    'mistakenness': 0.983423712863277,
    'mistakenness_loc': 0.8146814475545225,
}>
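
The `iou` value under `predictions_eval` above is the intersection over union between this ground truth box and its matched prediction. The `'0_5'` key presumably refers to an IoU threshold of 0.5. Here is a minimal sketch of IoU for two boxes in the `[x, y, width, height]` format; the function name `box_iou` and the example boxes are hypothetical.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two [x, y, width, height] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b

    # Width and height of the overlapping region (zero if the boxes are disjoint)
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h

    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical boxes: the prediction is shifted right by half a box width
gt_box = [0.0, 0.0, 0.5, 0.5]
pred_box = [0.25, 0.0, 0.5, 0.5]

iou = box_iou(gt_box, pred_box)
print(iou, iou >= 0.5)
```

A pair that overlaps less than the threshold would not count as a match, which is one way a ground truth object can end up flagged as a possible mistake.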


Let's use the App to visually inspect the results:

In [13]:
# Launch the FiftyOne App
session = fo.launch_app()

# Open your dataset in the App
session.dataset = dataset

App launched


![dataset](images/det_mistakenness_1.png)

In [22]:
# Show the 50 samples with the highest mistakenness, in ranked order
session.view = mistake_view[:50]

![view](images/det_mistakenness_2.png)

Another useful query is to find all objects with a high mistakenness, let's say greater than 0.95:

In [None]:
from fiftyone import ViewField as F

high_mistake_view = dataset.filter_detections("ground_truth", F("mistakenness") > 0.95, only_matches=True)

session.view = high_mistake_view
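
The filter in the cell above works at the detection level: within each sample it keeps only the objects that satisfy the condition, and `only_matches=True` then drops samples that have no objects left. Here is a minimal sketch of that behavior on plain dicts. The function `filter_detections_sketch` and the sample data are hypothetical stand-ins, not FiftyOne code.

```python
def filter_detections_sketch(samples, field, keep, only_matches=False):
    """Mimic detection-level filtering on a list of plain-dict samples."""
    filtered = []
    for sample in samples:
        # Keep only the detections that satisfy the condition
        kept = [d for d in sample[field] if keep(d)]
        if only_matches and not kept:
            continue  # skip samples with no remaining detections
        filtered.append({**sample, field: kept})
    return filtered


# Hypothetical samples with detection-level mistakenness values
samples = [
    {
        "filepath": "a.jpg",
        "ground_truth": [{"mistakenness": 0.98}, {"mistakenness": 0.10}],
    },
    {"filepath": "b.jpg", "ground_truth": [{"mistakenness": 0.40}]},
]

view = filter_detections_sketch(
    samples,
    "ground_truth",
    lambda d: d["mistakenness"] > 0.95,
    only_matches=True,
)
print([(s["filepath"], len(s["ground_truth"])) for s in view])
```

With `only_matches=False`, the second sample would stay in the result with an empty list of detections.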

Looking through the results, we see some annotations that may be incorrect. For example, in the image below the `goat` is labeled as a `sheep`.

![sheep](images/det_mistakenness_3.png)

We can use a similar workflow to look at objects that may be localized poorly.

In [None]:
loc_view = dataset.filter_detections("ground_truth", F("mistakenness_loc") > 0.95, only_matches=True)

session.view = loc_view

One of the examples that popped up from this query is shown below. It shows a wine glass that was boxed incorrectly during annotation: the box misses the bottom of the glass.

![wine](images/det_mistakenness_4.png)

Sorting by the `possible_mistakes` field is another useful way to find incorrect annotations or missed objects.

In [None]:
session.view = dataset.sort_by("possible_mistakes", reverse=True)

One example surfaced by this search is shown below. It contains a `car` annotation that the model did not detect and that, on closer inspection, does not appear to be a car at all.

![car](images/det_mistakenness_5.png)

Remember: since model predictions guide the mistakenness computation, the better your model, the more accurate the mistakenness suggestions. This example uses Faster-RCNN, which is several years old; a newer detector may give better results.