# 📊 Model Evaluation and Sample Selection with FiftyOne

Welcome to Notebook 3 of our workshop! In this section, we focus on evaluating object detection models, identifying diverse samples, and preparing curated subsets of data using the power of **FiftyOne**.

## 🚦 What You'll Learn

In this notebook, you will:

- **Apply object detection models** (e.g., YOLOv8 and YOLO11) to a dataset filtered to include only annotated BDD100K samples.
- **Compute uniqueness scores** using FiftyOne Brain to identify the most distinct images in your dataset.
- **Evaluate model predictions** against BDD100K ground truth labels using built-in metrics (e.g., precision, recall).
- **Filter and clone a view** with the top 100 most unique samples based on their visual or semantic content.
- **Export this subset** to disk for further analysis, training, or sharing.

## 🧠 Why This Matters

Evaluating models in a structured way is essential for improving performance and understanding failure modes. Additionally, selecting a **diverse set of unique images** can help reduce dataset bias and improve generalization.

This notebook gives you the tools to:
- Visualize and compare multiple model outputs
- Select and analyze samples beyond random splits
- Prepare high-quality, targeted subsets of data

Let’s dive in and explore how FiftyOne empowers data-centric workflows!


## 📁 Load the BDD100K Dataset and Launch FiftyOne
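We will use the `BDD100K` dataset from the Hugging Face Hub. If a local copy named `bdd10k_imported` already exists, it is loaded instead of being downloaded again.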

In [1]:
import fiftyone as fo
import fiftyone.zoo as foz
import fiftyone.brain as fob

import fiftyone.utils.huggingface as fouh # Hugging Face integration

import os

# Increase both connection and read timeout values (in seconds)
# os.environ["HF_HUB_DOWNLOAD_TIMEOUT"] = "60"  # default is 10
# os.environ["HF_HUB_ETAG_TIMEOUT"] = "30"      # metadata fetch timeout
# dataset = fouh.load_from_hub("dgural/bdd100k", persistent=True, name= "bdd10k") #, overwrite=True)
#fo.delete_dataset("dgural/bdd100k")

# # Define the new dataset name
# dataset_name = "bdd10k"
dataset_name = "bdd10k_imported"

# Check if the dataset exists
if dataset_name in fo.list_datasets():
    print(f"Dataset '{dataset_name}' exists. Loading...")
    dataset = fo.load_dataset(dataset_name)
else:
    print(f"Dataset '{dataset_name}' does not exist. Creating a new one...")
    # Download the dataset from the Hugging Face Hub under the new name and make it persistent
    dataset = fouh.load_from_hub("dgural/bdd100k", name=dataset_name, persistent=True)



Dataset 'bdd10k_imported' exists. Loading...


### 📋 List Available Datasets
List the FiftyOne datasets that exist in your environment, then print a summary of the dataset you just loaded.

In [2]:
print(fo.list_datasets())
print(dataset)

['bdd100k_test', 'bdd10k_imported']
Name:        bdd10k_imported
Media type:  image
Num samples: 10000
Persistent:  False
Tags:        []
Sample fields:
    id:                 fiftyone.core.fields.ObjectIdField
    filepath:           fiftyone.core.fields.StringField
    tags:               fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:           fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)
    created_at:         fiftyone.core.fields.DateTimeField
    last_modified_at:   fiftyone.core.fields.DateTimeField
    detections:         fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    polylines:          fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Polylines)
    weather:            fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
    timeofday:          fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
    scene:  

### 🚀 Launch FiftyOne App
Start the FiftyOne App to interactively explore your dataset and model predictions.

In [4]:
proxy_host = "https://"+os.getenv("VIRTUAL_HOST")+"/fiftyone/"
fo.app_config.proxy_url = proxy_host
session = fo.launch_app(dataset, auto=False )

Connected to FiftyOne on port 5151 at 0.0.0.0.
If you are not connecting to a remote session, you may need to start a new session and specify a port
Session launched. Run `session.show()` to open the App in a cell output.


In [5]:
print(session.url)

https://ml-az-05.oit.duke.edu:40003/fiftyone/?proxy=/fiftyone/&polling=true
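The launch cell above concatenates `os.getenv("VIRTUAL_HOST")` directly into a string, which raises a `TypeError` when the variable is not set. A minimal sketch of a more defensive variant follows; the helper name `build_proxy_url` is hypothetical and not part of FiftyOne, and the `/fiftyone/` path is simply the one used in this notebook.

```python
import os


def build_proxy_url(host, path="/fiftyone/"):
    """Return an HTTPS proxy URL for `host`, or None when no host is given."""
    if not host:
        return None
    return "https://" + host.strip("/") + path


# VIRTUAL_HOST is the environment variable read in the launch cell above
proxy_url = build_proxy_url(os.getenv("VIRTUAL_HOST"))

if proxy_url is None:
    print("VIRTUAL_HOST is not set; skipping proxy configuration")
else:
    print(f"Proxy URL: {proxy_url}")
```

When `proxy_url` is not `None`, it can be assigned to `fo.app_config.proxy_url` exactly as in the launch cell.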


### 🔧 Setup for Running YOLO Models with FiftyOne

This code snippet does the following:
- Suppresses Ultralytics logging to keep output clean.
- Imports the necessary libraries: `fiftyone`, `ultralytics`, and related modules.
- Loads two YOLO models:
  - `yolov8s.pt` (YOLOv8 Small)
  - `yolo11s.pt` (YOLO11 Small, a later model generation in the Ultralytics YOLO family)

In [3]:
# # Suppress Ultralytics logging
# import os; os.environ["YOLO_VERBOSE"] = "False"

# import fiftyone as fo
# import fiftyone.zoo as foz
# import fiftyone.utils.ultralytics as fou

# from ultralytics import YOLO

# # YOLOv8
# model1 = YOLO("yolov8s.pt")

# # YOLO11
# model2 = YOLO("yolo11s.pt")

100%|███████████████████████████████████████████████████████████████████| 21.5M/21.5M [00:00<00:00, 387MB/s]
100%|███████████████████████████████████████████████████████████████████| 18.4M/18.4M [00:00<00:00, 400MB/s]


### 🤖 Apply YOLO Model
Apply the YOLOv8 and YOLO11 models to generate predictions on the dataset. Each model's results are stored in its own label field (`yolo8_predictions` and `yolo11_predictions`).

In [26]:
# dataset.apply_model(model1, label_field="yolo8_predictions")
# dataset.apply_model(model2, label_field="yolo11_predictions")

 100% |█████████████| 10000/10000 [11.1m elapsed, 0s remaining, 15.1 samples/s]      
 100% |█████████████| 10000/10000 [11.5m elapsed, 0s remaining, 14.4 samples/s]      
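FiftyOne stores each detection's bounding box as `[top-left-x, top-left-y, width, height]` in coordinates relative to the image size (values in `[0, 1]`), while detectors commonly report pixel corner coordinates. The sketch below illustrates that conversion so the stored predictions are easier to read. The helper name `xyxy_to_relative_xywh` is hypothetical, the 1280x720 image size is an assumption for the example, and this is not FiftyOne's own conversion code.

```python
def xyxy_to_relative_xywh(box, img_width, img_height):
    """Convert a pixel (x1, y1, x2, y2) box to relative [x, y, w, h]."""
    x1, y1, x2, y2 = box
    return [
        x1 / img_width,          # top-left x, relative
        y1 / img_height,         # top-left y, relative
        (x2 - x1) / img_width,   # width, relative
        (y2 - y1) / img_height,  # height, relative
    ]


# Example: a box on an assumed 1280x720 image
relative_box = xyxy_to_relative_xywh((320, 180, 640, 540), img_width=1280, img_height=720)
print(relative_box)  # [0.25, 0.25, 0.25, 0.5]
```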


### 🧠 Computing Uniqueness with FiftyOne Brain

This snippet uses the FiftyOne Brain module to compute the **uniqueness** score for each sample in the dataset.  
The `compute_uniqueness()` method helps identify how distinct each sample is compared to others, which is useful for:
- Detecting near-duplicates
- Selecting diverse subsets
- Understanding dataset variability
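To build intuition for what a uniqueness score captures, here is a toy sketch that scores each embedding by its mean distance to its nearest neighbors and scales the result to `[0, 1]`. This is one simple way to define such a score and is an assumption for illustration, not FiftyOne Brain's exact algorithm; the function name `knn_uniqueness` and the 2-D toy points are hypothetical.

```python
import math


def knn_uniqueness(embeddings, k=2):
    """Score each vector by its mean distance to its k nearest neighbors,
    scaled so that the most isolated vector gets a score of 1.0."""
    raw = []
    for i, a in enumerate(embeddings):
        dists = sorted(
            math.dist(a, b) for j, b in enumerate(embeddings) if j != i
        )
        raw.append(sum(dists[:k]) / k)
    top = max(raw)
    return [r / top if top > 0 else 0.0 for r in raw]


# Three near-duplicate points and one outlier
toy_embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
scores = knn_uniqueness(toy_embeddings)

for point, score in zip(toy_embeddings, scores):
    print(point, round(score, 3))
```

The three clustered points receive scores close to 0, while the outlier receives the highest score. The same reading applies to the `"uniqueness"` field: low values suggest near-duplicates, high values suggest distinct samples.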

In [27]:
# import fiftyone.brain as fob

# fob.compute_uniqueness(dataset)

Computing embeddings...
 100% |█████████████| 10000/10000 [1.5m elapsed, 0s remaining, 113.2 samples/s]      
Computing uniqueness...
Uniqueness computation complete


### 🔍 Inspecting the Dataset with Uniqueness Scores

After computing uniqueness, each sample now includes a `"uniqueness"` field.  
This snippet prints:
- The entire dataset summary (e.g., number of samples, fields)
- The first sample, including its new uniqueness score

This is helpful to verify that the uniqueness computation was applied correctly.


In [5]:
# # Now the samples have a "uniqueness" field on them
# print(dataset)
# print(dataset.first())

Name:        bdd10k
Media type:  image
Num samples: 10000
Persistent:  True
Tags:        []
Sample fields:
    id:                 fiftyone.core.fields.ObjectIdField
    filepath:           fiftyone.core.fields.StringField
    tags:               fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:           fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.ImageMetadata)
    created_at:         fiftyone.core.fields.DateTimeField
    last_modified_at:   fiftyone.core.fields.DateTimeField
    detections:         fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    polylines:          fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Polylines)
    weather:            fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
    timeofday:          fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Classification)
    scene:              fiftyone.core.fields.EmbeddedDocum

### 📊 Viewing Least Unique Samples in FiftyOne App

This code sorts the dataset by the `"uniqueness"` field in **ascending order**, so the **least unique** (most similar or duplicated) samples appear first.

- `sort_by("uniqueness")`: sorts samples from least to most unique. Ascending is the default; passing `reverse=True` would put the most unique samples first instead.
- `session.view = dups_view`: opens this sorted view in the FiftyOne App for interactive exploration.

Great for spotting near-duplicates or redundant data!


In [6]:
# Sort in increasing order of uniqueness (least unique first)
dups_view = dataset.sort_by("uniqueness")

# Open view in the App
session.view = dups_view

### 🌟 Creating a View of the Top 100 Most Unique Samples

This code generates a new view containing the **100 most unique samples** in the dataset:

- `sort_by("uniqueness", reverse=True)`: sorts the dataset from most to least unique.
- `.limit(100)`: selects only the top 100 samples.

Useful for exploring diverse samples or building a representative subset for training or analysis.


In [7]:
# Create a view with the 100 most unique samples
unique_view = dataset.sort_by("uniqueness", reverse=True).limit(100)
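The direction of the sort decides whether you get near-duplicates or distinct samples, so it is worth checking on a small example. The sketch below mimics `sort_by(...).limit(k)` with plain Python records; the file names, scores, and the helper `rank_by_uniqueness` are made up for illustration and are not FiftyOne API.

```python
records = [
    {"filepath": "a.jpg", "uniqueness": 0.12},
    {"filepath": "b.jpg", "uniqueness": 0.87},
    {"filepath": "c.jpg", "uniqueness": 0.45},
    {"filepath": "d.jpg", "uniqueness": 0.93},
]


def rank_by_uniqueness(records, k, reverse=False):
    """Sort records by their uniqueness score and keep the first k."""
    ordered = sorted(records, key=lambda r: r["uniqueness"], reverse=reverse)
    return ordered[:k]


# reverse=True: descending order, most unique first
most_unique = rank_by_uniqueness(records, k=2, reverse=True)
# default: ascending order, least unique first
least_unique = rank_by_uniqueness(records, k=2)

print([r["filepath"] for r in most_unique])   # ['d.jpg', 'b.jpg']
print([r["filepath"] for r in least_unique])  # ['a.jpg', 'c.jpg']
```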

### 📦 Cloning and Exporting the Unique Samples Dataset

**Step 3:**  
Clone the `unique_view` as a standalone dataset named `"bdd100k_100_unique"`.  
This creates a new `fiftyone.core.dataset.Dataset` object that can be manipulated or exported independently.

**Step 4:**  
Export the cloned dataset to disk using FiftyOne’s export functionality:
- `export_dir`: defines the output directory (`bdd100k_unique_100_FO`)
- `dataset_type`: specifies the export format (in this case, `FiftyOneDataset`, but this can be changed to COCO, YOLO, etc.)

Ideal for saving a diverse subset for further use or sharing.


In [8]:
# # Clone the view as a standalone dataset
# unique_dataset = unique_view.clone(name="bdd100k_100_unique")
# unique_dataset.persistent=True

# # Export the dataset (can be modified to your desired format or directory)
# export_dir = "bdd100k_unique_100_FO"
# unique_dataset.export(
#     export_dir=export_dir,
#     dataset_type=fo.types.FiftyOneDataset,
# )

In [10]:
# if "bdd100k_100_unique" in fo.list_datasets():
#     fo.delete_dataset("bdd100k_100_unique")

In [11]:
# # Define the new dataset name
dataset_name = "bdd100k_100_unique"

# Check if the dataset exists
if dataset_name in fo.list_datasets():
    print(f"Dataset '{dataset_name}' exists. Loading...")
    unique_dataset = fo.load_dataset(dataset_name)
else:
    print(f"Dataset '{dataset_name}' does not exist. Creating a new one...")
    # Path to the exported folder
    export_dir = "bdd100k_unique_100_FO"
    
    # Load the dataset from the folder
    unique_dataset = fo.Dataset.from_dir(
        dataset_dir=export_dir,
        dataset_type=fo.types.FiftyOneDataset,
        name=dataset_name  # You can choose any name here
    )
    
unique_dataset.persistent = True

Dataset 'bdd100k_100_unique' does not exist. Creating a new one...
Importing samples...
 100% |█████████████████| 100/100 [17.7ms elapsed, 0s remaining, 5.7K samples/s]      


In [12]:
session = fo.launch_app(unique_dataset, auto=False )

Session launched. Run `session.show()` to open the App in a cell output.


### 🧠 Comparing Similarity Across Images and YOLO Detections

We now compute embedding visualizations at three levels:
1. **Image-level similarity** from embeddings of the whole image. No `model` or `embeddings` argument is passed, so the Brain computes them with its default model in this call, as the cell output shows.
2. **Detection-level similarity with YOLOv8 results.**
3. **Detection-level similarity with YOLO11 results.**

By using `patches_field`, `compute_visualization()` embeds each detected object patch instead of the whole image, so we can compare how similar individual detections are across images. This enables us to:
- Group images with visually or semantically similar detections.
- Evaluate model differences between YOLOv8 and YOLO11.
- Identify edge cases or inconsistencies in object detection.

Each result is stored under its own `brain_key` and can be explored interactively in the FiftyOne App's Embeddings panel.

In [9]:
# # Step 1: Define the field where YOLO8 and YOLO11 detections are stored
# # Adjust these names to your actual label field names
# yolo8_field = "yolo8_predictions"
# yolo11_field = "yolo11_predictions"

# # Example 1: Similarity based on images (already computed with embeddings)
# fob.compute_visualization(
#     unique_dataset,
#     brain_key="image_similarity"
# )

# # Example 2: Similarity based on YOLOv8 detections
# fob.compute_visualization(
#     unique_dataset,
#     brain_key="yolo8_similarity",
#     patches_field=yolo8_field
# )

# # Example 3: Similarity based on YOLOv11 detections
# fob.compute_visualization(
#     unique_dataset,
#     brain_key="yolo11_similarity",
#     patches_field=yolo11_field
# )

Computing embeddings...
 100% |█████████████████| 100/100 [15.0s elapsed, 0s remaining, 6.6 samples/s]      
Generating visualization...




UMAP( verbose=True)
Wed May 28 07:02:57 2025 Construct fuzzy simplicial set
Wed May 28 07:02:57 2025 Finding Nearest Neighbors
Wed May 28 07:02:58 2025 Finished Nearest Neighbor Search
Wed May 28 07:02:59 2025 Construct embedding


Epochs completed: 100%| ██████████ 500/500 [00:00]

	completed  0  /  500 epochs
	completed  50  /  500 epochs
	completed  100  /  500 epochs
	completed  150  /  500 epochs
	completed  200  /  500 epochs
	completed  250  /  500 epochs
	completed  300  /  500 epochs
	completed  350  /  500 epochs
	completed  400  /  500 epochs
	completed  450  /  500 epochs
Wed May 28 07:03:00 2025 Finished embedding
Computing patch embeddings...
 100% |█████████████████| 100/100 [57.7s elapsed, 0s remaining, 1.8 samples/s]      
Generating visualization...
UMAP( verbose=True)
Wed May 28 07:03:58 2025 Construct fuzzy simplicial set




Wed May 28 07:03:58 2025 Finding Nearest Neighbors
Wed May 28 07:03:58 2025 Finished Nearest Neighbor Search
Wed May 28 07:03:58 2025 Construct embedding


Epochs completed: 100%| ██████████ 500/500 [00:00]

	completed  0  /  500 epochs
	completed  50  /  500 epochs
	completed  100  /  500 epochs
	completed  150  /  500 epochs
	completed  200  /  500 epochs
	completed  250  /  500 epochs
	completed  300  /  500 epochs
	completed  350  /  500 epochs
	completed  400  /  500 epochs
	completed  450  /  500 epochs
Wed May 28 07:03:58 2025 Finished embedding





Computing patch embeddings...
 100% |█████████████████| 100/100 [53.9s elapsed, 0s remaining, 2.0 samples/s]      
Generating visualization...
UMAP( verbose=True)
Wed May 28 07:04:52 2025 Construct fuzzy simplicial set




Wed May 28 07:04:52 2025 Finding Nearest Neighbors
Wed May 28 07:04:52 2025 Finished Nearest Neighbor Search
Wed May 28 07:04:52 2025 Construct embedding


Epochs completed: 100%| ██████████ 500/500 [00:00]

	completed  0  /  500 epochs
	completed  50  /  500 epochs
	completed  100  /  500 epochs
	completed  150  /  500 epochs
	completed  200  /  500 epochs
	completed  250  /  500 epochs
	completed  300  /  500 epochs
	completed  350  /  500 epochs
	completed  400  /  500 epochs
	completed  450  /  500 epochs
Wed May 28 07:04:52 2025 Finished embedding





<fiftyone.brain.visualization.VisualizationResults at 0x36a838d60>

### 🚀 Launch FiftyOne App
Start the FiftyOne App to interactively explore your dataset and model predictions.

In [None]:
# session = fo.launch_app(unique_dataset, auto=False)

Session launched. Run `session.show()` to open the App in a cell output.


### 🧩 Introduction to FiftyOne Plugins

**FiftyOne Plugins** extend the capabilities of the FiftyOne App by adding custom panels, tools, and workflows tailored to your dataset and model needs.

Plugins allow you to:
- Visualize custom metrics and evaluation results
- Add new buttons, filters, or visual tools to the App interface
- Seamlessly integrate external libraries or logic into the FiftyOne experience

They are built using JavaScript and Python and follow a modular architecture that makes them easy to download, install, and use.

---

### 🔌 Downloading the `@voxel51/evaluation` Plugin

The following command downloads a specific plugin from the official [Voxel51 plugin repository](https://github.com/voxel51/fiftyone-plugins):

```python
!fiftyone plugins download https://github.com/voxel51/fiftyone-plugins --plugin-names @voxel51/evaluation
```


In [None]:
# !fiftyone plugins download https://github.com/voxel51/fiftyone-plugins --plugin-names @voxel51/evaluation

### 📊 Evaluate Detections
Run evaluation between model predictions and ground truth labels using FiftyOne's built-in metrics.
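Detection metrics such as precision and recall rest on matching predicted boxes to ground truth boxes by intersection over union (IoU). The sketch below is a simplified illustration of that idea using FiftyOne's relative `[x, y, w, h]` box format: greedy matching in order of confidence at an assumed IoU threshold of 0.5. The function names and toy boxes are hypothetical, and this is not FiftyOne's evaluation code.

```python
def iou_xywh(a, b):
    """IoU of two boxes given as [top-left-x, top-left-y, width, height]."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    iw = min(ax + aw, bx + bw) - max(ax, bx)
    ih = min(ay + ah, by + bh) - max(ay, by)
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    return inter / (aw * ah + bw * bh - inter)


def match_detections(preds, gts, iou_thresh=0.5):
    """Greedily match predictions (label, box, confidence) to ground truth
    (label, box), highest confidence first. Returns (tp, fp, fn)."""
    unmatched = list(range(len(gts)))
    tp = fp = 0
    for label, box, _conf in sorted(preds, key=lambda p: p[2], reverse=True):
        best, best_iou = None, iou_thresh
        for g in unmatched:
            g_label, g_box = gts[g]
            if g_label != label:
                continue
            iou = iou_xywh(box, g_box)
            if iou >= best_iou:
                best, best_iou = g, iou
        if best is None:
            fp += 1  # no same-class ground truth overlaps enough
        else:
            tp += 1
            unmatched.remove(best)
    return tp, fp, len(unmatched)  # leftover ground truth boxes are misses


gts = [("car", [0.1, 0.1, 0.2, 0.2]), ("person", [0.6, 0.6, 0.1, 0.3])]
preds = [
    ("car", [0.1, 0.1, 0.2, 0.2], 0.9),  # overlaps the car exactly
    ("car", [0.7, 0.1, 0.1, 0.1], 0.6),  # overlaps nothing
]

tp, fp, fn = match_detections(preds, gts)
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
print(f"TP={tp} FP={fp} FN={fn} precision={precision} recall={recall}")
```

In this toy case one prediction matches, one is a false positive, and the person is missed, so precision and recall are both 0.5.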

In [10]:
# # eval_key
# evalkey_yolo8 = "evalkey_yolo8_"
# evalkey_yolo11 = "evalkey_yolo11_"

# # Detection evaluation
# unique_dataset.evaluate_detections(
#     "yolo8_predictions",
#     gt_field="detections",
#     eval_key=evalkey_yolo8,
# )

# # Detection evaluation
# unique_dataset.evaluate_detections(
#     "yolo11_predictions",
#     gt_field="detections",
#     eval_key=evalkey_yolo11,
# )

Evaluating detections...
 100% |█████████████████| 100/100 [2.2s elapsed, 0s remaining, 39.6 samples/s]         
Evaluating detections...
 100% |█████████████████| 100/100 [2.3s elapsed, 0s remaining, 39.2 samples/s]      


<fiftyone.utils.eval.detection.DetectionResults at 0x36a867390>

In [4]:
# export_dir = "bdd100k_FO"
# dataset.export(
#     export_dir=export_dir,
#     dataset_type=fo.types.FiftyOneDataset,
# )

Exporting samples...
 100% |████████████████| 10000/10000 [11.0s elapsed, 0s remaining, 942.4 docs/s]      


In [11]:
# # Step 4: Export the dataset (can be modified to your desired format or directory)
# export_dir = "bdd100k_unique_100_FO"
# unique_dataset.export(
#     export_dir=export_dir,
#     dataset_type=fo.types.FiftyOneDataset,
# )

Exporting samples...
 100% |████████████████████| 100/100 [119.4ms elapsed, 0s remaining, 837.4 docs/s]     
