# Find Classification Mistakes

Annotation mistakes create an artificial ceiling on the performance of your models. However, finding these mistakes by hand is almost as time-consuming as the original annotation work! Luckily, FiftyOne comes to the rescue!

In this tutorial, we explore how FiftyOne can be used to help you find mistakes in your classification annotations.

We’ll cover the following concepts:

- Loading a zoo dataset with FiftyOne

- Adding model predictions to your dataset

- Computing insights into your dataset relating to possible label mistakes

- Visualizing mistakes in the FiftyOne App

## Setup

If you haven't already, install FiftyOne:

In [None]:
!pip install fiftyone

Let's kick things off by loading the zoo dataset [imagenet-sample](https://docs.voxel51.com/user_guide/dataset_zoo/datasets.html#dataset-zoo-imagenet-sample). The dataset contains 1,000 images, one randomly chosen from each class of the validation split of the ImageNet 2012 dataset. We will use it as our example dataset for the tutorial.

In [None]:
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("imagenet-sample")

For this walkthrough, we will artificially introduce mistakes into the labels of an existing dataset. Of course, in your normal workflow, you would not add labeling mistakes; this is only for the sake of the walkthrough.

To do this, we randomly corrupt 10% (100 samples) of all labels.

In [18]:
import random

# Get the ImageNet classes list
classes = dataset.default_classes


# Artificially corrupt 10% of the labels
num_mistakes = int(0.1 * len(dataset))
samples_to_corrupt = dataset.take(num_mistakes)

for sample in samples_to_corrupt:
    
    mistake_class = random.randrange(len(classes))

    # Re-draw until the new label differs from the original
    while classes[mistake_class] == sample.ground_truth.label:
        mistake_class = random.randrange(len(classes))

    # Tag and corrupt the sample
    sample.tags.append("mistake")
    sample.ground_truth.label = classes[mistake_class]
    sample.save()
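
The retry loop above re-draws until the new label differs from the original. As a minimal sketch of an alternative, the helper below (a hypothetical `pick_wrong_label`, not part of FiftyOne) draws directly from the classes that exclude the true label, and uses a seeded generator so the corruption is reproducible. The toy class list stands in for `dataset.default_classes`.

```python
import random


def pick_wrong_label(classes, true_label, rng=random):
    """Return a label drawn uniformly from `classes`, excluding `true_label`."""
    candidates = [c for c in classes if c != true_label]
    if not candidates:
        raise ValueError("need at least one class other than the true label")
    return rng.choice(candidates)


# Toy class list standing in for `dataset.default_classes` (assumption)
toy_classes = ["cat", "dog", "fox", "owl"]

# Seeded generator so repeated runs corrupt the same way
rng = random.Random(51)
wrong = pick_wrong_label(toy_classes, "dog", rng)
print(wrong)
```

Building the candidate list costs a pass over the classes per sample, which is negligible for 1,000 classes and removes the need for a loop.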

Let’s print some information about the dataset to verify the operation that we performed:

In [4]:
# Count the number of samples with the `mistake` tag
num_mistakes = len(dataset.match_tags("mistake"))
print("%d ground truth labels are now mistakes" % num_mistakes)

100 ground truth labels are now mistakes


## Add predictions to the dataset

Next, we need to add some predictions to our dataset. We will use [mobilenet-v2](https://docs.voxel51.com/user_guide/model_zoo/models.html#mobilenet-v2-imagenet-torch) from the model zoo! We can add predictions easily with the following:

In [19]:
model = foz.load_zoo_model("mobilenet-v2-imagenet-torch")

dataset.apply_model(model, label_field="predictions")

 100% |███████████████| 1000/1000 [9.5s elapsed, 0s remaining, 92.8 samples/s]       


We can print our dataset to verify that the predictions have been added:

In [None]:
print(dataset)

Let's check out our new predictions in the App as well! See if you can spot any mistakes already!

In [None]:
session = fo.launch_app(dataset)

![imagenet-sample](../assets/imagenet-sample.png)

## Find the mistakes with FiftyOne

Now we can run a method from FiftyOne that estimates the mistakenness of the ground truth labels of the samples for which we generated predictions:

In [None]:
import fiftyone.brain as fob

# Compute mistakenness
fob.compute_mistakenness(dataset, "predictions", label_field="ground_truth")
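
To build intuition for what a mistakenness score expresses, here is a toy heuristic. It is an assumption made for illustration and should not be read as FiftyOne Brain's actual formula: a confident prediction that disagrees with the ground truth label scores near 1, and a confident prediction that agrees scores near 0. The function name and example labels are hypothetical.

```python
def toy_mistakenness(gt_label, pred_label, confidence):
    """Toy score in [0, 1]; higher means the ground truth label looks more suspect."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")

    # Disagreement pushes the score above 0.5, agreement pushes it below
    direction = 1.0 if pred_label != gt_label else -1.0
    return (1.0 + direction * confidence) / 2.0


examples = [
    ("tabby", "tabby", 0.9),  # confident agreement
    ("tabby", "tiger", 0.9),  # confident disagreement
    ("tabby", "tiger", 0.5),  # unsure disagreement
]
scores = [toy_mistakenness(*example) for example in examples]
print(scores)
```

The ordering is the point of the sketch: confident disagreement ranks above unsure disagreement, which ranks above confident agreement.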

`compute_mistakenness()` added a `mistakenness` field to all samples for which we added predictions. We can easily sort by likelihood of mistake from code:

In [22]:
# Sort by likelihood of mistake (most likely first)
mistake_view = dataset.sort_by("mistakenness", reverse=True)

# Print some information about the view
print(mistake_view)

Dataset:     imagenet-sample
Media type:  image
Num samples: 1000
Sample fields:
    id:           fiftyone.core.fields.ObjectIdField
    filepath:     fiftyone.core.fields.StringField
    tags:         fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)
    ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
    predictions:  fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
    mistakenness: fiftyone.core.fields.FloatField
View stages:
    1. SortBy(field_or_expr='mistakenness', reverse=True, create_index=True)


In [None]:
session.view = mistake_view

![class-mistakes](../assets/class-mistakes.png)
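
Because we tagged the samples we corrupted, we can also check how well the ranking surfaces them. The sketch below uses plain dictionaries with made-up scores rather than FiftyOne samples; `recall_at_k` and the `is_mistake` key are hypothetical names, with `is_mistake` standing in for the `mistake` tag.

```python
def recall_at_k(records, k):
    """Fraction of known mistakes that appear among the k highest-scoring records."""
    ranked = sorted(records, key=lambda r: r["mistakenness"], reverse=True)
    total = sum(r["is_mistake"] for r in records)
    if total == 0:
        return 0.0
    found = sum(r["is_mistake"] for r in ranked[:k])
    return found / total


# Made-up scores for illustration only
toy_records = [
    {"mistakenness": 0.97, "is_mistake": True},
    {"mistakenness": 0.91, "is_mistake": True},
    {"mistakenness": 0.64, "is_mistake": False},
    {"mistakenness": 0.40, "is_mistake": True},
    {"mistakenness": 0.08, "is_mistake": False},
    {"mistakenness": 0.03, "is_mistake": False},
]

print(recall_at_k(toy_records, k=3))
```

A high recall at a small `k` means reviewers can find most of the injected mistakes by inspecting only the top of the sorted view.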

In a real-world scenario, we would then take the ground truth classifications that are likely mistakes and send them to our annotation provider of choice for review. In FiftyOne, we can tag our samples and export them for an annotation job with one of the labeling integrations: [CVAT](https://docs.voxel51.com/integrations/cvat.html), [Label Studio](https://docs.voxel51.com/integrations/labelstudio.html), [V7](https://docs.voxel51.com/integrations/v7.html), or [Labelbox](https://docs.voxel51.com/integrations/labelbox.html)!
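
Before sending anything for review, you typically need to decide which samples make the cut. As a rough sketch, the helper below selects IDs whose score exceeds a threshold, optionally capped by a review budget. The function name, the threshold of 0.9, and the toy scores are all assumptions for illustration; a sensible threshold depends on your model and dataset.

```python
def select_for_review(scores, threshold=0.9, budget=None):
    """Return sample IDs whose score exceeds `threshold`, most suspicious first."""
    flagged = [(sid, s) for sid, s in scores.items() if s > threshold]
    flagged.sort(key=lambda item: item[1], reverse=True)
    ids = [sid for sid, _ in flagged]
    return ids if budget is None else ids[:budget]


# Made-up sample IDs and scores for illustration only
toy_scores = {"a": 0.99, "b": 0.12, "c": 0.93, "d": 0.95, "e": 0.50}

review_ids = select_for_review(toy_scores, threshold=0.9, budget=2)
print(review_ids)
```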