In [None]:
import fiftyone as fo
import fiftyone.utils.huggingface as fouh

# Load the dataset from Hugging Face if it's your first time using it

# dataset = fouh.load_from_hub(
# "Voxel51/Coursera_lecture_dataset_train", 
# dataset_name="lecture_dataset_train", 
# persistent=True)

Let's start this section by examining the labels we have in our dataset:

In [None]:
dataset = fo.load_dataset("lecture_dataset_train")

train_dataset = dataset.clone()

In [None]:
fo.launch_app(train_dataset)

Notice that it seems our annotators may have missed some annotations...

In any event, let's proceed with our investigation.

In [None]:
dataset.count_values("ground_truth.detections.label")

Something you may notice is the presence of classes that could potentially confuse labelers (and also, any model you may train on them).

For example, human annotators (and even a model) might have trouble with:

- sunglasses and goggles

- coat and jacket

- doughnut and pastry

- baseball cap and hat

Let's focus on these to see what we can learn. First, create a patches view of the dataset:

In [None]:
patches_view = train_dataset.to_patches("ground_truth")

In [None]:
fo.launch_app(patches_view)

Next, create filtered views containing only the labels of interest:

In [None]:
from fiftyone import ViewField as F

sunglasses_goggles_view = patches_view.filter_labels(field="ground_truth", filter=F("label").is_in(["sunglasses", "goggles"]))

coat_jacket_view = patches_view.filter_labels(field="ground_truth", filter=F("label").is_in(["coat", "jacket"]))

doughnut_pastry_view = patches_view.filter_labels(field="ground_truth", filter=F("label").is_in(["doughnut", "pastry"]))

baseball_cap_hat_view = patches_view.filter_labels(field="ground_truth", filter=F("label").is_in(["baseball_cap", "hat"]))

From here, we can compute embeddings for each view to see if we can glean anything about the labels. For this, let's use the CLIP model, as its inference is quite fast even on CPU.

In [None]:
import os 
from fiftyone import brain as fob

sunglasses_goggles_view_results = fob.compute_visualization(
    samples=sunglasses_goggles_view,
    patches_field="ground_truth",
    model="clip-vit-base32-torch",
    brain_key="sunglasses_goggles_embeddings",
    method="umap",
    num_dims=2,
    num_workers=os.cpu_count(),
    progress=True,
)

In [None]:
fo.launch_app(sunglasses_goggles_view)

In [None]:
coat_jacket_view_results = fob.compute_visualization(
    samples=coat_jacket_view,
    patches_field="ground_truth",
    model="clip-vit-base32-torch",
    brain_key="coat_jacket_embeddings",
    method="umap",
    num_dims=2,
    num_workers=os.cpu_count(),
    progress=True,
)

In [None]:
fo.launch_app(coat_jacket_view)

In [None]:
doughnut_pastry_results = fob.compute_visualization(
    samples=doughnut_pastry_view,
    patches_field="ground_truth",
    model="clip-vit-base32-torch",
    brain_key="doughnut_pastry_embeddings",
    method="umap",
    num_dims=2,
    num_workers=os.cpu_count(),
    progress=True,
)

In [None]:
fo.launch_app(doughnut_pastry_view)

In [None]:
baseball_cap_hat_view_results = fob.compute_visualization(
    samples=baseball_cap_hat_view,
    patches_field="ground_truth",
    model="clip-vit-base32-torch",
    brain_key="baseball_cap_hat_embeddings",
    method="umap",
    num_dims=2,
    num_workers=os.cpu_count(),
    progress=True,
)

In [None]:
# zero shot detection for classifying mistakes


In [None]:
from fiftyone import plugins

plugins.download_plugin(
    url_or_gh_repo="https://github.com/jacobmarks/zero-shot-prediction-plugin"
)

plugins.install_plugin_requirements(
    plugin_name="@jacobmarks/zero_shot_prediction"
)

In [None]:
import fiftyone.operators as foo

## Access the operator via its URI (plugin name + operator name)
zero_shot_detection_operator = foo.get_operator("@jacobmarks/zero_shot_prediction/zero_shot_detect")

zero_shot_detection_operator.list_models()

In [None]:
## Run zero-shot detection on all images in the dataset, specifying the labels, the model to use, and the field to add the results to
zero_shot_detection_operator(
    train_dataset,
    labels = ['jacket', 'coat', 'jean', 'trousers', 'short_pants', 'trash_can', 'bucket', 'flowerpot', 'helmet', 'baseball_cap', 'hat', 'sunglasses', 'goggles', 'doughnut', 'pastry', 'onion', 'tomato'],
    model_name = "YOLO-World",
    label_field = "zero_shot_predictions",
)

In [None]:
fo.launch_app(train_dataset)

## Compute mistakenness

Now we're ready to assess the mistakenness of the ground truth detections.

We can do so by running the [compute_mistakenness()](https://voxel51.com/docs/fiftyone/api/fiftyone.brain.html#fiftyone.brain.compute_mistakenness) method from the FiftyOne Brain.

**REMEMBER**: Since you are using model predictions to guide the mistakenness process, the better your model, the more accurate the mistakenness suggestions. Additionally, using logits rather than confidence scores, when your model provides them, typically gives better results.

Note, you can pass `copy_missing=True` which will copy predicted objects that were deemed to be missing into the `label_field`.

In [None]:
import fiftyone.brain as fob

# Compute mistakenness of annotations in `ground_truth` field using 
# predictions from `zero_shot_predictions` field as point of reference
fob.compute_mistakenness(
    train_dataset, 
    pred_field="zero_shot_predictions", 
    label_field="ground_truth",
    copy_missing=True # you can pass this as True if you trust your model is powerful enough
    )

The above method populates a number of fields on the samples of our dataset as well as the ground truth and predicted objects:

#### New ground truth object attributes (in `ground_truth` field):

- `mistakenness` (float): A measure of the likelihood that a ground truth object's label is incorrect

- `mistakenness_loc`: A measure of the likelihood that a ground truth object's localization (bounding box) is inaccurate

- `possible_spurious`: Ground truth objects that were not matched with a predicted object and are deemed to be likely spurious annotations will have this attribute set to True

#### New predicted object attributes (in `zero_shot_predictions` field):

- `possible_missing`: If a highly confident prediction with no matching ground truth object is encountered, this attribute is set to True to indicate that it is a likely missing ground truth annotation

#### Sample-level fields:

- `mistakenness`: The maximum mistakenness of the ground truth objects in each sample

- `possible_spurious`: The number of possible spurious ground truth objects in each sample

- `possible_missing`: The number of possible missing ground truth objects in each sample
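
To make the relationship between the object-level attributes and the sample-level fields concrete, here is a minimal plain-Python sketch. It does not call FiftyOne; the `summarize_sample` helper and the dictionaries standing in for detections are hypothetical, and the aggregation simply follows the descriptions above (maximum mistakenness, plus counts of flagged objects).

```python
def summarize_sample(ground_truth, predictions):
    """Aggregate object-level attributes into sample-level values.

    Hypothetical helper for illustration only: each detection is a plain
    dict, not a FiftyOne Detection object.
    """
    scores = [
        det["mistakenness"]
        for det in ground_truth
        if det.get("mistakenness") is not None
    ]
    return {
        # Maximum mistakenness over the ground truth objects in the sample
        "mistakenness": max(scores) if scores else None,
        # Number of ground truth objects flagged as possibly spurious
        "possible_spurious": sum(
            1 for det in ground_truth if det.get("possible_spurious")
        ),
        # Number of predictions flagged as possibly missing annotations
        "possible_missing": sum(
            1 for det in predictions if det.get("possible_missing")
        ),
    }


# Made-up values for a single sample
ground_truth = [
    {"label": "coat", "mistakenness": 0.12},
    {"label": "hat", "mistakenness": 0.97},
    {"label": "goggles", "possible_spurious": True},
]
predictions = [
    {"label": "jacket", "confidence": 0.91, "possible_missing": True},
    {"label": "hat", "confidence": 0.88},
    {"label": "tomato", "confidence": 0.95, "possible_missing": True},
]

print(summarize_sample(ground_truth, predictions))
```

A single badly labeled object is enough to push a sample to the top when sorting by the sample-level `mistakenness`, which is why sorting in descending order is a reasonable first pass.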

In [None]:
from fiftyone import ViewField as F

# Sort by likelihood of mistake (most likely first)
mistake_view = train_dataset.sort_by("mistakenness", reverse=True)

# Print some information about the view
print(mistake_view)

In [None]:
# Inspect some samples and detections
# This is the first detection of the first sample
print(mistake_view.first().ground_truth.detections[0])

Another useful query is to find all objects that have a high mistakenness, let's say > 0.95.

Recall that `mistakenness` measures the likelihood that a ground truth object's label is incorrect.

In [None]:
from fiftyone import ViewField as F

highly_mistaken_view = train_dataset.filter_labels("ground_truth", F("mistakenness") > 0.95)

In [None]:
fo.launch_app(highly_mistaken_view)

Looking through the results, we can see that many of these images have predictions that look correct but have no matching ground truth annotations. This is a common problem in object detection datasets, where annotators miss some objects in the image. On the other hand, some detections are simply mislabeled.

Recall that `mistakenness_loc` is a measure of the likelihood that a ground truth object's localization (bounding box) is inaccurate.

We can use a similar workflow to look at objects that may be localized poorly:

In [None]:
high_mistaken_loc_view = train_dataset.filter_labels("ground_truth", F("mistakenness_loc") > 0.85)

In [None]:
fo.launch_app(high_mistaken_loc_view)

The `possible_spurious` field can also be useful to sort by to find ground truth annotations that were not matched with any prediction and may be incorrect. Similarly, `possible_missing` can be used to find objects that the model detected that may have been missed by annotators.

In [None]:
possible_spurious_view = train_dataset.match(F("possible_spurious") > 0)

In [None]:
fo.launch_app(possible_spurious_view)

# Finding duplicate detections

When dealing with duplicate labels, there is inherent ambiguity: which one is "correct" and which one(s) are "duplicate"?

By default, [find_duplicates()](https://voxel51.com/docs/fiftyone/api/fiftyone.utils.iou.html#fiftyone.utils.iou.find_duplicates) will simply iterate through the labels in each sample and flag any label whose IoU with a previous label exceeds the chosen threshold as a duplicate.

Alternatively, you can pass the `method="greedy"` option to instead use a greedy approach to mark the fewest number of labels as duplicate such that no non-duplicate labels have IoU greater than the specified threshold with each other.
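
The default strategy described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not FiftyOne's implementation: `box_iou` and `flag_duplicates` are hypothetical helpers, boxes are assumed to be `[x, y, width, height]` lists, and each label is compared against every earlier label in the same sample.

```python
def box_iou(a, b):
    """Intersection over union of two boxes given as [x, y, width, height]."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def flag_duplicates(detections, iou_thresh=0.85, classwise=True):
    """Return the indices of detections that overlap an earlier detection.

    A detection is flagged when its IoU with any previous detection exceeds
    iou_thresh. With classwise=True, only same-label pairs are compared.
    """
    dup_indices = []
    for i, det in enumerate(detections):
        for prev in detections[:i]:
            if classwise and prev["label"] != det["label"]:
                continue
            if box_iou(prev["box"], det["box"]) > iou_thresh:
                dup_indices.append(i)
                break
    return dup_indices


# Made-up detections for a single sample
detections = [
    {"label": "hat", "box": [0.10, 0.10, 0.20, 0.20]},
    {"label": "hat", "box": [0.11, 0.10, 0.20, 0.20]},     # near copy of the first box
    {"label": "helmet", "box": [0.10, 0.10, 0.20, 0.20]},  # same box, different class
    {"label": "hat", "box": [0.60, 0.60, 0.20, 0.20]},     # no overlap
]

print(flag_duplicates(detections))                   # same-class comparisons only
print(flag_duplicates(detections, classwise=False))  # compare across classes too
```

Note how the order of the labels decides which one of an overlapping pair is flagged in this sketch, which is the ambiguity mentioned at the start of this section.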



In [None]:
import fiftyone.utils.iou as foui

dup_ids = foui.find_duplicates(
    train_dataset, 
    "ground_truth", 
    iou_thresh=0.85, 
    classwise=True,
    method="greedy"
    )

In [None]:
dup_ids


### Tagging and resolution

In either case, it is recommended to visualize the duplicates in the App before taking any action. One convenient way to do this is to first tag the duplicates.


Any label or collection of labels can be tagged at any time in the sample grid or expanded sample view. In the expanded sample view, individual labels can be selected by clicking on them in the media player.

Labels with specific tags can then be selected with the [select_labels()](https://voxel51.com/docs/fiftyone/api/fiftyone.core.collections.html?highlight=select_labels#fiftyone.core.collections.SampleCollection.select_labels) stage and sent off to your annotation provider of choice to improve the annotations. FiftyOne currently offers integrations for [Labelbox](https://voxel51.com/docs/fiftyone/api/fiftyone.utils.labelbox.html), [Scale](https://voxel51.com/docs/fiftyone/api/fiftyone.utils.scale.html), and [CVAT](https://docs.voxel51.com/tutorials/cvat_annotation.html).


In [None]:
# Tag the automatically selected duplicates
train_dataset.select_labels(ids=dup_ids).tag_labels("duplicate")

# Count tags on `train_dataset`, since that is the collection we tagged
print(train_dataset.count_label_tags())

You can use [match_labels()](https://voxel51.com/docs/fiftyone/api/fiftyone.core.collections.html#fiftyone.core.collections.SampleCollection.match_labels) to load the samples containing at least one duplicate label in the App and use the `duplicate` tag you added to conveniently isolate and evaluate the duplicates.

If you see any erroneous duplicates, simply remove the `duplicate` tag in the App:

In [None]:
dup_view = train_dataset.match_labels(ids=dup_ids)

In [None]:
fo.launch_app(dup_view)

When you’re ready to act, you can then easily delete the duplicate labels as follows:

In [None]:
train_dataset.delete_labels(tags="duplicate")

# If you want to delete every label flagged by `find_duplicates()`
# train_dataset.delete_labels(ids=dup_ids)

# Annotating Datasets with CVAT

[FiftyOne](https://fiftyone.ai) and [CVAT](https://github.com/opencv/cvat) are two leading open-source tools, each tackling different parts of the dataset curation and improvement workflows.

[The tight integration](https://voxel51.com/docs/fiftyone/integrations/cvat.html) between FiftyOne and CVAT allows you to curate and explore datasets in FiftyOne and then send off samples or existing labels for annotation in CVAT with just one line of code.


In order to use CVAT, you must create an account on a CVAT server.

By default, FiftyOne uses [app.cvat.ai](https://app.cvat.ai). So if you haven't already, go to [app.cvat.ai](https://app.cvat.ai) and create an account now.

Another option is to [set up CVAT locally](https://opencv.github.io/cvat/docs/administration/basics/installation) and then [configure FiftyOne](https://voxel51.com/docs/fiftyone/integrations/cvat.html#self-hosted-servers) to use your self-hosted server. A primary benefit of setting up CVAT locally is that you avoid the app.cvat.ai limits of 10 tasks and 500MB of data.

In any case, FiftyOne will need to connect to your CVAT account. The easiest way to configure your CVAT login credentials is to store them in environment variables.

Whether you are annotating the data yourself or have a team of annotators, the workflow of uploading data from FiftyOne to CVAT is the same. The [annotate()](https://voxel51.com/docs/fiftyone/api/fiftyone.core.collections.html#fiftyone.core.collections.SampleCollection.annotate) method on a collection of samples lets you specify the name, type, and classes for the labels you are annotating.

For example, let's annotate bounding boxes for classes such as "goggles" and "sunglasses".

We'll only include a few samples to be annotated in our view for brevity. To create annotation jobs in CVAT for these samples, we simply call [annotate()](https://voxel51.com/docs/fiftyone/api/fiftyone.core.collections.html#fiftyone.core.collections.SampleCollection.annotate) passing in a unique name for this annotation run and the relevant label schema information for the annotation task.
Since we'll be annotating these samples ourselves, we pass `launch_editor=True` to automatically launch a browser window with the CVAT editor open once the data has been loaded.

In [None]:
from getpass import getpass

In [None]:
os.environ["FIFTYONE_CVAT_USERNAME"] = getpass("Enter CVAT username: ")

In [None]:
os.environ["FIFTYONE_CVAT_PASSWORD"] = getpass("Enter CVAT password: ")

In [None]:
# A unique identifier for this run
anno_key = "reannotate_example"

small_sample = train_dataset.take(100, seed=51)

# Upload the samples and launch CVAT
anno_results = small_sample.annotate(
    anno_key,
    label_field="ground_truth",
    classes=['jacket', 'coat', 'jean', 'trousers', 'short_pants', 'trash_can', 'bucket', 'flowerpot', 'helmet', 'baseball_cap', 'hat', 'sunglasses', 'goggles', 'doughnut', 'pastry', 'onion', 'tomato'],
    launch_editor=True,
)

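In [None]:
# Check the status of the annotation tasks while they still exist in CVAT
anno_results.print_status()

In [None]:
# Load the edited labels back into FiftyOne, then clean up the run in CVAT
small_sample.load_annotations(anno_key, cleanup=True)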

For a more in-depth guide to the CVAT integration, check out [this notebook](https://colab.research.google.com/github/voxel51/fiftyone/blob/v0.25.0/docs/source/tutorials/cvat_annotation.ipynb).

# Merging labels

Depending on your use case, it might make sense to merge labels.

As we discussed before, human annotators (and even a model) might have trouble with some labels, so it might make sense to merge them like so:

- sunglasses and goggles --> eyewear

- coat and jacket --> outerwear

- doughnut and pastry --> pastry

- baseball cap and hat --> hat

For that, you can use the [`map_labels`](https://docs.voxel51.com/api/fiftyone.core.view.html#fiftyone.core.view.DatasetView.map_labels) method:

In [None]:
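# Map each confusable label to its merged class
label_map = {
    "sunglasses": "eyewear",
    "goggles": "eyewear",
    "coat": "outerwear",
    "jacket": "outerwear",
    "doughnut": "pastry",
    "baseball_cap": "hat",
}

# map_labels() returns a view; the underlying dataset is not modified
train_dataset = train_dataset.map_labels("ground_truth", label_map)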
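
To preview what a label map like this does to the class distribution before applying it, a small plain-Python sketch may help. It does not call FiftyOne, the list of labels is made up, and it assumes that labels absent from the map are left unchanged.

```python
from collections import Counter

label_map = {
    "sunglasses": "eyewear",
    "goggles": "eyewear",
    "coat": "outerwear",
    "jacket": "outerwear",
    "doughnut": "pastry",
    "baseball_cap": "hat",
}

# Made-up labels standing in for the values of `ground_truth.detections.label`
labels = ["sunglasses", "goggles", "helmet", "doughnut", "baseball_cap"]

# Labels without an entry in the map keep their original value
merged = [label_map.get(label, label) for label in labels]

print(merged)
print(Counter(merged))
```

Targets such as "pastry" and "hat" already exist as classes, so they need no entry of their own in the map.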

In [None]:
fo.launch_app(train_dataset)

## Additional resources:

- YouTube video: [Finding and correcting mistakes](https://www.youtube.com/watch?v=WDl80g7_SBw)

- Colab notebook: [Detection mistakes](https://colab.research.google.com/github/voxel51/fiftyone/blob/v0.24.1/docs/source/tutorials/detection_mistakes.ipynb)

- Colab notebook: [Removing duplicate objects](https://colab.research.google.com/github/voxel51/fiftyone/blob/v0.25.0/docs/source/recipes/remove_duplicate_annos.ipynb)

- Colab notebook: [Annotating datasets with CVAT](https://colab.research.google.com/github/voxel51/fiftyone/blob/v0.25.0/docs/source/tutorials/cvat_annotation.ipynb)