# Find Detection Mistakes

Annotation mistakes create an artificial ceiling on the performance of your models. However, finding these mistakes by hand is at least as arduous as the original annotation work! Enter FiftyOne.

In this tutorial, we explore how FiftyOne can be used to help you find mistakes in your object detection annotations. To detect mistakes in classification datasets, check out the recipe in the Classification task.

We’ll cover the following concepts:

- Computing insights into your dataset relating to possible label mistakes
- Visualizing mistakes in the FiftyOne App

## Setup

If you haven't already, install FiftyOne:

In [None]:
!pip install fiftyone

In order to compute mistakenness, your dataset needs to have two [detections fields](https://docs.voxel51.com/user_guide/using_datasets.html#object-detection), one with your ground truth annotations and one with your model predictions.

In this example, we’ll load the [quickstart](https://docs.voxel51.com/user_guide/dataset_zoo/datasets.html#dataset-zoo-quickstart) dataset from the FiftyOne Dataset Zoo, which has ground truth annotations and predictions from a PyTorch Faster-RCNN model for a few samples from the COCO dataset.

In [None]:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")

Next, we evaluate the predictions against the ground truth annotations, storing the results under the `eval` key:

In [3]:
results = dataset.evaluate_detections(
    "predictions",
    gt_field="ground_truth",
    eval_key="eval",
)

Evaluating detections...
 100% |█████████████████| 200/200 [8.6s elapsed, 0s remaining, 17.2 samples/s]      


We can also start by visualizing our dataset. See any mistakes yet?

In [None]:
session = fo.launch_app(dataset)

## Compute Mistakenness

Now we’re ready to assess the mistakenness of the ground truth detections.

We can do so by running the [compute_mistakenness()](https://docs.voxel51.com/api/fiftyone.brain.html#fiftyone.brain.compute_mistakenness) method from the FiftyOne Brain:

In [8]:
import fiftyone.brain as fob

# Compute mistakenness of annotations in `ground_truth` field using
# predictions from `predictions` field as point of reference
fob.compute_mistakenness(dataset, "predictions", label_field="ground_truth")

Evaluating detections...
 100% |█████████████████| 200/200 [8.5s elapsed, 0s remaining, 16.8 samples/s]      
Computing mistakenness...
 100% |█████████████████| 200/200 [1.6s elapsed, 0s remaining, 113.5 samples/s]         
Mistakenness computation complete


The above method populates a number of fields on the samples of our dataset as well as the ground truth and predicted objects:

New ground truth object attributes (in `ground_truth` field):

- `mistakenness` (float): A measure of the likelihood that a ground truth object’s label is incorrect
- `mistakenness_loc`: A measure of the likelihood that a ground truth object’s localization (bounding box) is inaccurate
- `possible_spurious`: Ground truth objects that were not matched with a predicted object and are deemed to be likely spurious annotations will have this attribute set to True

New predicted object attributes (in `predictions` field):

- `possible_missing`: If a highly confident prediction with no matching ground truth object is encountered, this attribute is set to True to indicate that it is a likely missing ground truth annotation

Sample-level fields:

- `mistakenness`: The maximum mistakenness of the ground truth objects in each sample
- `possible_spurious`: The number of possible spurious ground truth objects in each sample
- `possible_missing`: The number of possible missing ground truth objects in each sample
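
To make the relationship between the object-level attributes and the sample-level fields concrete, here is a minimal pure-Python sketch of the aggregation described above. It is not FiftyOne's implementation: the plain dicts and the `summarize_sample` helper are hypothetical stand-ins for detections and samples, and the example values are made up.

```python
# Hypothetical stand-ins for the objects in one sample. These are plain
# dicts with made-up values, not FiftyOne `Detection` objects.
ground_truth = [
    {"label": "cat", "mistakenness": 0.12},
    {"label": "dog", "mistakenness": 0.97},
    # Assumed here: an unmatched, possibly spurious object carries no score
    {"label": "car", "possible_spurious": True},
]
predictions = [
    {"label": "dog", "confidence": 0.99, "possible_missing": True},
    {"label": "cat", "confidence": 0.80},
]


def summarize_sample(ground_truth, predictions):
    """Aggregate object-level attributes into sample-level values."""
    scores = [obj["mistakenness"] for obj in ground_truth if "mistakenness" in obj]
    return {
        # Maximum mistakenness over the ground truth objects
        "mistakenness": max(scores) if scores else None,
        # Number of ground truth objects flagged as possibly spurious
        "possible_spurious": sum(
            1 for obj in ground_truth if obj.get("possible_spurious")
        ),
        # Number of predictions flagged as possibly missing annotations
        "possible_missing": sum(
            1 for obj in predictions if obj.get("possible_missing")
        ),
    }


summary = summarize_sample(ground_truth, predictions)
print(summary)
```

Because the sample-level `mistakenness` is a maximum, a single suspicious object is enough to push a sample to the top of a sorted view.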

## Analyzing Results

Let’s use FiftyOne to investigate the results.

First, let’s show the samples with the most likely annotation mistakes:

In [10]:
from fiftyone import ViewField as F

# Sort by likelihood of mistake (most likely first)
mistake_view = dataset.sort_by("mistakenness", reverse=True)

# Print some information about the view
print(mistake_view)

Dataset:     quickstart
Media type:  image
Num samples: 200
Sample fields:
    id:                fiftyone.core.fields.ObjectIdField
    filepath:          fiftyone.core.fields.StringField
    tags:              fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:          fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)
    ground_truth:      fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    uniqueness:        fiftyone.core.fields.FloatField
    predictions:       fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    mistakenness:      fiftyone.core.fields.FloatField
    possible_missing:  fiftyone.core.fields.IntField
    possible_spurious: fiftyone.core.fields.IntField
View stages:
    1. SortBy(field_or_expr='mistakenness', reverse=True, create_index=True)


In [11]:
# Inspect some samples and detections
# This is the first detection of the first sample
print(mistake_view.first().ground_truth.detections[0])

<Detection: {
    'id': '5f452487ef00e6374aad2744',
    'attributes': {},
    'tags': [],
    'label': 'tv',
    'bounding_box': [
        0.002746666666666667,
        0.36082,
        0.24466666666666667,
        0.3732,
    ],
    'mask': None,
    'confidence': None,
    'index': None,
    'area': 16273.3536,
    'iscrowd': 0.0,
    'mistakenness': 0.005771428346633911,
    'mistakenness_loc': 0.16955941131917984,
}>
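
The `bounding_box` values above are relative coordinates in `[0, 1]`, in the order `[top-left-x, top-left-y, width, height]`. As a minimal sketch, they can be converted to pixel units as follows. The 640x480 image size and the `to_pixel_box` helper are assumptions made for illustration and are not taken from the dataset.

```python
def to_pixel_box(bounding_box, image_width, image_height):
    """Convert a relative [x, y, width, height] box to rounded pixel units."""
    x, y, w, h = bounding_box
    return (
        round(x * image_width),
        round(y * image_height),
        round(w * image_width),
        round(h * image_height),
    )


# Box copied from the detection printed above
bounding_box = [0.002746666666666667, 0.36082, 0.24466666666666667, 0.3732]

# Hypothetical image size; in practice, read it from the sample's metadata
pixel_box = to_pixel_box(bounding_box, image_width=640, image_height=480)
print(pixel_box)
```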


Let’s use the App to visually inspect the results:

In [None]:
# Show the samples we processed in rank order by the mistakenness
session.view = mistake_view

![mistakeness](../assets/mistakeness.png)

Another useful query is to find all objects that have a high mistakenness, let's say greater than 0.95:

In [None]:
from fiftyone import ViewField as F

session.view = dataset.filter_labels("ground_truth", F("mistakenness") > 0.95)

![mistake_view](../assets/mistake_view.gif)

Looking through the results, we can see that many of these images have a bunch of predictions which actually look like they are correct, but no ground truth annotations. This is a common mistake in object detection datasets, where the annotator may have missed some objects in the image. On the other hand, there are some detections which are mislabeled, like the `cow` in the last image above which is predicted to be a `horse`.

We can use a similar workflow to look at objects that may be localized poorly:

In [None]:
session.view = dataset.filter_labels("ground_truth", F("mistakenness_loc") > 0.85)

![mistake_loc](../assets/mistake_loc.gif)

In some of these examples, the localization is not necessarily badly mistaken; there are just a bunch of small, relatively overlapping objects. In other examples, such as the one above, the localization is clearly off.
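
One plausible reason small objects appear in this view is that overlap-based measures are very sensitive to small offsets on small boxes. The sketch below uses intersection over union (IoU) to illustrate this. It is an assumption that IoU is a reasonable proxy here; it is not a description of how `mistakenness_loc` is computed, and the boxes and the `iou` helper are made up for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two [x, y, width, height] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlapping region (zero if the boxes are disjoint)
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


# Shift a large box and a small box by the same relative offset
shift = 0.02
large = [0.10, 0.10, 0.40, 0.40]
small = [0.10, 0.10, 0.04, 0.04]

large_iou = iou(large, [large[0] + shift, large[1] + shift, large[2], large[3]])
small_iou = iou(small, [small[0] + shift, small[1] + shift, small[2], small[3]])

print(f"large box IoU: {large_iou:.2f}")
print(f"small box IoU: {small_iou:.2f}")
```

The same offset leaves the large box mostly overlapping its shifted copy, while the small box loses most of its overlap, so a small object can look poorly localized even when its annotation is only slightly off.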

The `possible_spurious` field can also be useful to sort by to find ground truth annotations that may be incorrect.

Similarly, `possible_missing` can be used to find objects that the model detected that may have been missed by annotators.

In [None]:
session.view = dataset.match(F("possible_missing") > 0)

![mistake_missing](../assets/mistake_missing.gif)
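
Note that the queries in this section work at two different levels: `filter_labels()` filters the objects inside each sample, while `match()` keeps or drops whole samples. The pure-Python sketch below mimics that difference on plain dicts. It is not FiftyOne code: the helpers and data are hypothetical, and it assumes the default behavior of `filter_labels()`, where samples left with no matching objects are omitted from the view.

```python
# Hypothetical stand-in for a dataset: a list of samples as plain dicts
samples = [
    {
        "id": "a",
        "possible_missing": 2,
        "ground_truth": [
            {"label": "cow", "mistakenness": 0.98},
            {"label": "person", "mistakenness": 0.10},
        ],
    },
    {
        "id": "b",
        "possible_missing": 0,
        "ground_truth": [{"label": "tv", "mistakenness": 0.01}],
    },
]


def filter_labels_sketch(samples, field, predicate):
    """Object-level filter: keep matching objects, drop samples left empty."""
    view = []
    for sample in samples:
        kept = [obj for obj in sample[field] if predicate(obj)]
        if kept:
            # Copy the sample so the underlying data is not modified
            view.append({**sample, field: kept})
    return view


def match_sketch(samples, predicate):
    """Sample-level filter: keep whole samples with their objects untouched."""
    return [sample for sample in samples if predicate(sample)]


high = filter_labels_sketch(
    samples, "ground_truth", lambda obj: obj["mistakenness"] > 0.95
)
missing = match_sketch(samples, lambda sample: sample["possible_missing"] > 0)

print([(s["id"], len(s["ground_truth"])) for s in high])
print([(s["id"], len(s["ground_truth"])) for s in missing])
```

In both cases the underlying data is left unchanged; only the view differs in which objects it shows.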

Once again, we can find more of those pesky mistakes! In FiftyOne, we can tag these samples and export them for an annotation job with one of the labeling integrations: [CVAT](https://docs.voxel51.com/integrations/cvat.html), [Label Studio](https://docs.voxel51.com/integrations/labelstudio.html), [V7](https://docs.voxel51.com/integrations/v7.html), or [Labelbox](https://docs.voxel51.com/integrations/labelbox.html). This can get our dataset back into tip-top shape to train again!