In [None]:
!pip install fiftyone umap-learn
!pip install git+https://github.com/huggingface/transformers.git#egg=transformers

In this tutorial we'll make use of the [RIS-LAD](https://huggingface.co/datasets/Voxel51/RIS-LAD) dataset. [RIS-LAD is the first fine-grained benchmark](https://arxiv.org/abs/2507.20920) designed specifically for low-altitude drone image segmentation.

The dataset features 13,871 annotations with image-text-mask triplets captured from real drone footage at 30-100 meter altitudes with oblique viewing angles. Unlike existing remote sensing datasets that rely on high-altitude satellite imagery, RIS-LAD focuses on the visual complexities of low-altitude drone perception. These challenges include perspective changes, densely packed tiny objects, variable lighting conditions, and the notorious problems of **category drift** (tiny targets causing confusion with larger, semantically similar objects) and **object drift** (difficulty distinguishing among crowded same-class instances) that plague crowded aerial scenes.

This benchmark addresses the gap in understanding how Visual AI systems see the world from a drone's perspective.

You can download the dataset from the Hugging Face Hub as follows:

In [None]:
import fiftyone as fo
from fiftyone.utils.huggingface import load_from_hub

dataset = load_from_hub(
    "Voxel51/RIS-LAD",
    overwrite=True,
    persistent=True
)

This dataset is in [FiftyOne format](https://docs.voxel51.com/user_guide/using_datasets.html). 

FiftyOne provides powerful functionality to inspect, search, and modify your data at every level, from the entire [Dataset](https://docs.voxel51.com/api/fiftyone.core.dataset.html#fiftyone.core.dataset.Dataset) down to an individual [Sample](https://docs.voxel51.com/api/fiftyone.utils.data.html#fiftyone.utils.data.Sample).

To see the schema of this dataset, you can simply display the Dataset object as follows:

In [None]:
dataset

A FiftyOne dataset is comprised of [Samples](https://docs.voxel51.com/api/fiftyone.utils.data.html#fiftyone.utils.data.Sample).  

Samples store all information associated with a particular piece of data in a dataset, including basic metadata about the data, one or more sets of labels, and additional features associated with subsets of the data and/or label sets.

The attributes of a Sample are called [Fields](https://docs.voxel51.com/api/fiftyone.core.fields.html#fiftyone.core.fields.Field), which store information about the Sample. When a new Field is assigned to a Sample in a Dataset, it is automatically added to the dataset’s schema and is thus accessible on all other samples in the dataset.

To see the schema of a single Sample and the contents of its Fields, you can call the [`first()` method](https://docs.voxel51.com/api/fiftyone.core.dataset.html#fiftyone.core.dataset.Dataset.first):

In [None]:
dataset.first()

You can use the FiftyOne SDK to quickly compute some high-level statistics about your dataset with its [built-in aggregation methods](https://docs.voxel51.com/user_guide/using_aggregations.html).

For example, you can use the [`count()` aggregation](https://docs.voxel51.com/api/fiftyone.core.collections.html#fiftyone.core.collections.SampleCollection.count) to compute the number of non-None field values in a collection:

In [None]:
dataset.count("ground_truth.detections.label")

In [None]:
dataset.count("ground_truth.detections.referring_expression")

You can use the [`count_values()` aggregation](https://docs.voxel51.com/api/fiftyone.core.collections.html#fiftyone.core.collections.SampleCollection.count_values) to compute the occurrences of field values in a collection:

In [None]:
dataset.count_values("ground_truth.detections.label")

You can use the [`distinct()` aggregation](https://docs.voxel51.com/api/fiftyone.core.collections.html#fiftyone.core.collections.SampleCollection.distinct) to compute the distinct values of a field in a collection:

In [None]:
len(dataset.distinct("ground_truth.detections.referring_expression"))
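
To make the difference between these three aggregations concrete, here is a plain-Python sketch on a toy list of per-sample labels. The label names are made up for illustration and are not taken from RIS-LAD, and FiftyOne computes the equivalent results in its database backend rather than with Python lists:

```python
from collections import Counter

# Hypothetical per-sample label lists; None stands for a sample with no detections
labels_per_sample = [["car", "person", "car"], None, ["boat"], ["person"]]

# Flatten the nested lists, skipping samples that have no detections
flat_labels = [label for labels in labels_per_sample if labels for label in labels]

count = len(flat_labels)                   # analogous to count(): number of non-None values
count_values = dict(Counter(flat_labels))  # analogous to count_values(): occurrences per value
distinct = sorted(set(flat_labels))        # analogous to distinct(): the unique values

print(count)
print(count_values)
print(distinct)
```

Here `count` is a single total, `count_values` breaks that total down by label, and `distinct` only lists which labels occur.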

### Adding a new Field to the Dataset

A useful piece of information to have about a sample is the number of detection labels in that sample.  You can easily add this to each sample in your Dataset using a `ViewField` expression.  

The [`ViewField`](https://docs.voxel51.com/api/fiftyone.core.expressions.html#fiftyone.core.expressions.ViewField) and [`ViewExpression`](https://docs.voxel51.com/api/fiftyone.core.expressions.html#fiftyone.core.expressions.ViewExpression) classes allow you to use native Python operators to define expressions. Simply wrap the target field of your sample in a `ViewField` and then apply comparison, logic, arithmetic, or array operations to it to create a `ViewExpression`.

The idiomatic FiftyOne way to count the number of instance labels in a sample is to use a `ViewField` expression to access the list of labels and then use `.length()` to count them.

To add the number of instances per image as a field on each sample in your dataset, you can use FiftyOne's [`set_values()`](https://docs.voxel51.com/api/fiftyone.core.dataset.html#fiftyone.core.dataset.Dataset.set_values) method. This will efficiently compute and store the count for each sample.

You can learn more about creating Dataset Views [in these docs](https://docs.voxel51.com/user_guide/using_views.html).

In [None]:
import fiftyone as fo
from fiftyone import ViewField as F

num_instances = dataset.values(F("ground_truth.detections").length())

dataset.set_values("num_instances", num_instances)

dataset.save()

In a similar manner, you can count the number of unique instance types for each sample in your Dataset:

In [None]:
labels_per_sample = dataset.values("ground_truth.detections.label")

num_distinct_labels_per_sample = [len(set(labels)) if labels else 0 for labels in labels_per_sample]

dataset.set_values("num_unique_instances", num_distinct_labels_per_sample)

dataset.save()

You can then combine these values to create a complexity score for each Sample in your Dataset. As a simple example, you can define the complexity score as the number of instances plus the number of unique instance types. Note that the [`.values()` method](https://docs.voxel51.com/api/fiftyone.core.dataset.html#fiftyone.core.dataset.Dataset.values) is used to efficiently extract the values of a field across all Samples in a Dataset.

In [None]:
unique_instance_counts = dataset.values("num_unique_instances")

num_instances_values = dataset.values("num_instances")

# Compute complexity scores for all samples
complexity_scores = [nd + nul for nd, nul in zip(num_instances_values, unique_instance_counts)]

# Set the values
dataset.set_values("complexity_score", complexity_scores)

dataset.save()
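
As a quick sanity check of the scoring logic, the following plain-Python sketch applies the same definition to a toy list of labels and ranks the samples by score. The labels and variable names here are hypothetical and independent of the dataset above:

```python
# Hypothetical per-sample label lists; None stands for a sample with no detections
toy_labels = [["car", "car", "person"], None, ["boat"]]

# Number of instances and number of unique instance types per sample
toy_num_instances = [len(labels) if labels else 0 for labels in toy_labels]
toy_num_unique = [len(set(labels)) if labels else 0 for labels in toy_labels]

# Complexity score = number of instances + number of unique instance types
toy_scores = [n + u for n, u in zip(toy_num_instances, toy_num_unique)]

# Sample indices ordered from most to least complex
toy_ranking = sorted(range(len(toy_scores)), key=lambda i: toy_scores[i], reverse=True)

print(toy_scores)
print(toy_ranking)
```

On the real dataset, a comparable ranking can be obtained with `dataset.sort_by("complexity_score", reverse=True)`.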

There are a lot of interesting and non-trivial things, like those shown above, that you can do with FiftyOne. Here are some additional resources for you to check out later:

- For those familiar with `pandas`, you may want to check out this [pandas vs. FiftyOne cheat sheet](https://docs.voxel51.com/cheat_sheets/pandas_vs_fiftyone.html) to learn how you can translate common pandas operations into FiftyOne syntax.

- How to [create Views of your Dataset](https://docs.voxel51.com/cheat_sheets/views_cheat_sheet.html) 

- [Filtering cheat sheet docs](https://docs.voxel51.com/cheat_sheets/filtering_cheat_sheet.html)

Of course, the most interesting part of FiftyOne is [the FiftyOne App](https://docs.voxel51.com/user_guide/app.html#using-the-fiftyone-app), which runs locally on your machine.

In [None]:
!fiftyone plugins download https://github.com/voxel51/fiftyone-plugins --plugin-names @voxel51/dashboard

In [None]:
import fiftyone.zoo as foz

# Register this custom model source
foz.register_zoo_model_source("https://github.com/harpreetsahota204/siglip2", overwrite=True)

In [None]:
import fiftyone.zoo as foz

siglip_model = foz.load_zoo_model(
    "google/siglip2-giant-opt-patch16-256"
)

In [None]:
dataset.compute_embeddings(
    model=siglip_model,
    embeddings_field="siglip2_embeddings",
)

In [None]:
import fiftyone.brain as fob

results = fob.compute_visualization(
    dataset,
    embeddings="siglip2_embeddings",
    method="umap",
    brain_key="siglip2_viz",
    num_dims=2,
)


In [None]:
# Build a similarity index
text_img_index = fob.compute_similarity(
    dataset,
    model="google/siglip2-giant-opt-patch16-256",
    embeddings="siglip2_embeddings",
    brain_key="siglip2_similarity",
)

In [None]:
session = fo.launch_app(dataset, auto=False)
session.url

In [None]:
siglip_model.text_prompt = "Low altitude drone footage taken at "
siglip_model.classes = ["day", "night", "dusk"]

dataset.apply_model(
    siglip_model,
    label_field="time_of_day"
)

In [None]:
siglip_model.text_prompt = "The scene in this low altitude drone footage is in a "
siglip_model.classes = ["urban area", "near water", "highway", "pedestrian area"]

dataset.apply_model(
    siglip_model,
    label_field="location"
)

In [5]:
import fiftyone.zoo as foz

# Register the remote model source
foz.register_zoo_model_source(
    "https://github.com/harpreetsahota204/sam3_images",
    overwrite=True
)

# Load the model
sam3_model = foz.load_zoo_model("facebook/sam3")

In [None]:
import fiftyone as fo
import fiftyone.zoo as foz
import fiftyone.brain as fob

sam3_model.pooling_strategy = "max"  # or "mean", "cls"

dataset.compute_embeddings(
    sam3_model,
    embeddings_field="sam_embeddings",
    batch_size=32
)

# Visualize with UMAP
fob.compute_visualization(
    dataset,
    method="umap",
    brain_key="sam_viz",
    embeddings="sam_embeddings",
    num_dims=2
)

In [None]:
sam3_model.operation = "concept_segmentation"
sam3_model.threshold = 0.5
sam3_model.mask_threshold = 0.5

sam3_model.prompt = dataset.distinct("ground_truth.detections.label")

dataset.apply_model(
    sam3_model,
    label_field="sam3_not_finetuned",
    batch_size=32,
    num_workers=8,
    skip_failures=False
)

In [None]:
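results = dataset.evaluate_detections(
    "sam3_not_finetuned",          # predicted Detections with masks
    gt_field="ground_truth",       # ground-truth Detections with masks
    eval_key="initial_sam3_eval",  # key under which per-sample results are stored
    use_masks=True,                # use instance masks (not boxes) for IoU
    compute_mAP=True,              # also compute mean average precision
)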

results.print_report()
print(results.mAP())
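
With `use_masks=True`, predictions are matched to ground truth by the overlap of their instance masks rather than their bounding boxes. The sketch below illustrates the underlying intersection-over-union idea on two tiny binary masks. It is a simplified illustration that assumes both masks are already aligned on the same pixel grid; the `mask_iou` helper is hypothetical and is not FiftyOne's implementation:

```python
def mask_iou(mask_a, mask_b):
    """Return the IoU of two equally sized binary masks given as nested lists of 0/1."""
    intersection = 0
    union = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            intersection += 1 if (a and b) else 0  # pixel set in both masks
            union += 1 if (a or b) else 0          # pixel set in at least one mask
    # Two empty masks have no union; treat their IoU as 0.0 by convention
    return intersection / union if union else 0.0


# Toy 3x3 masks: the prediction is shifted one column left of the ground truth
pred_mask = [[1, 1, 0],
             [1, 1, 0],
             [0, 0, 0]]
gt_mask = [[0, 1, 1],
           [0, 1, 1],
           [0, 0, 0]]

iou = mask_iou(pred_mask, gt_mask)
print(iou)
```

A prediction typically counts as a match only when its IoU with a ground-truth object exceeds a chosen threshold, so small shifts on tiny objects can turn a near miss into a false positive.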