<a href="https://colab.research.google.com/github/mbsantiago/AI-Intervene-Training-Material/blob/main/CameraTrapsAI/AI_Intervene_Camera_Traps.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AI for Wildlife Images

In this notebook, you'll explore how AI can be used for large-scale ecological research. We'll use a real-world case study: a camera trap project conducted by a UCL team in Kenya's Maasai Mara ecosystem.

Here’s a brief overview of what you will cover:

* Camera Traps: What they are and how they're used in ecology.
* Image Annotation: The process of labelling camera trap images.
* MegaDetector: An AI tool for automatically detecting animals in photos.
* SpeciesNet: An AI tool for automatically identifying the species of those animals.
* Model Evaluation: How to measure the performance of these AI tools.

**Note:** Throughout this notebook, you'll find several "Exercises".
These are prompts for reflection, designed to get you thinking critically about the concepts.
You don't need to write down any formal answers, but we strongly encourage you to discuss your thoughts with your peers.
The notebook is intended to raise more questions than it answers.

**Note:** You will see cells with Python code throughout this notebook.
Your task is simply to run them to produce the results we'll be discussing.
There is no need to read or understand the code, but if you are interested and want to learn more, the tutors are happy to explain it.

# Setup

If this is your first time in Google Colab and you feel disoriented, take a few minutes to go through this [intro guide](https://colab.research.google.com/github/MScEcologyAndDataScienceUCL/BIOS0032_AI4Environment/blob/main/01_Work_environment/01a_Intro_to_colab.ipynb).

Before starting, you need to connect the lab's dataset to this notebook.
Follow these three steps to get everything linked up.

## 1. Add a Shortcut to the Data

The data needed for this notebook is in a public [Google Drive folder](https://drive.google.com/drive/folders/1W_k4zbADhx9rwxGQLm97KZUfXra9g3C9?usp=drive_link).
You need to add a shortcut to this folder in your own Drive.

**Note:** If you've already added this shortcut, you can skip this step and move on to mounting your Drive.

First, open the link. Then, click the folder's name at the top of the page and select the "Add shortcut" option.

**Important:** Be sure to click on the folder name at the top of the page and not on an individual file in the table below.

<img src="https://github.com/mbsantiago/AI-Intervene-Training-Material/blob/main/CameraTrapsAI/assets/google_drive_shortcut.png?raw=1" alt="drawing" width="400"/>

A new window will pop up. Go to the "All locations" tab and choose "My Drive".

<img src="https://github.com/mbsantiago/AI-Intervene-Training-Material/blob/main/CameraTrapsAI/assets/google_drive_all_locations.png?raw=1" alt="drawing" width="400"/>

The data folder should now appear in your 'drive' folder (check the 'Files' tab on the left).

## 2. Mount Google Drive

Run the code cell below. This gives the notebook permission to access your Google Drive.

In [1]:
# @title
from google.colab import drive

drive.mount("/content/drive")

Mounted at /content/drive


## 3. Run the Setup Script

Run the cell below to complete the setup.

In [2]:
# @title
!wget -q -O - https://github.com/mbsantiago/AI-Intervene-Training-Material/raw/refs/heads/main/CameraTrapsAI/ct_notebook_setup.sh | bash

Downloading colab_utils.py
Downloading ct_notebook_utils.py
Using CPython 3.12.11 interpreter at: /usr/bin/python3
Creating virtual environment at: .mdvenv/
Activate with: source .mdvenv/bin/activate
Using Python 3.12.11 environment at: .mdvenv
Using Python 3.12.11 environment at: .mdvenv
Resolved 111 packages in 2.91s
Prepared 110 packages in 1m 03s
Installed 111 packages in 969ms
 + absl-py==2.3.1
 + albucore==0.0.24
 + albumentations==2.0.8
 + annotated-types==0.7.0
 + asttokens==3.0.0
 + certifi==2025.10.5
 + charset-normalizer==3.4.3
 + clipboard==0.0.4
 + contourpy==1.3.3
 + cycler==0.12.1
 +

# Introduction

**Camera traps** are static cameras set up in the wild to monitor animal populations.
Typically, they are triggered by motion, capturing an image whenever an animal moves within the camera's field of view.

<img src="https://github.com/MScEcologyAndDataScienceUCL/BIOS0032_AI4Environment/blob/bios0032_23-24/3_AI_for_Wildlife_Images/images/ct.png?raw=true" alt="drawing" width="200"/>

The images from camera traps show which species are in an area and what they're doing.
This data reveals how often animals appear, their activity times, and their general behaviour.
Camera traps are also relatively easy to set up and are great for capturing a wide range of medium to large animals.

By combining this animal data with environmental information, ecologists can answer important questions.
For example, a key goal is to understand how wildlife reacts to human pressure.
By comparing animal communities in areas with low versus high human impact, it's possible to find thresholds, or tipping points, where the ecosystem changes significantly.
Finding these thresholds helps inform conservation decisions, like where to create a protected area or what land-use rules to set.
Placing cameras strategically along these areas of varying human impact provides the data needed to answer these kinds of questions.

After the cameras are collected, the biggest challenge is going through all the photos.
A huge number of these images are "false triggers" with no animals, set off by things like waving grass.
Even when an animal is in the photo, it can be hard to identify if the view is bad, it's too far away, or it's an unfamiliar species.
On top of that, a single project can produce hundreds of thousands or even millions of images.
Manually checking every photo is extremely slow and often impossible for large studies.

This is one of the key ways AI is changing ecological work.
It helps by automating the slow task of sorting photos, which in turn makes large-scale studies using camera traps possible.
In this notebook, you will work with an example dataset from the Biome Health Project, learning how to use AI to process and analyse the collected imagery.

# Biome Health Project

The dataset you'll use comes from the Biome Health Project.
A key goal of this project is to study how wildlife responds to different levels of human pressure.
By understanding this response in detail, the project aims to identify specific pressure thresholds that can be used to make conservation actions, like setting land-use limits, more effective.

![Responses](https://static.wixstatic.com/media/d56724_6d6b60fecd174d24a714672dafcf00cf~mv2.png/v1/crop/x_8,y_0,w_829,h_588/fill/w_829,h_588,al_c,q_90,enc_avif,quality_auto/Gradient_3.png)

This notebook focuses on data from the Greater Maasai Mara Ecosystem in Kenya, a savanna famous for its abundant wildlife.
Below are consecutive frames captured by a camera trap in Kenya.
They show a hyena entering the scene and checking out a buffalo!

<img src="https://github.com/MScEcologyAndDataScienceUCL/BIOS0032_AI4Environment/blob/bios0032_23-24/3_AI_for_Wildlife_Images/images/hyena.gif?raw=true" alt="drawing" width="500"/>

The study area contains a mix of different zones with different management rules.
It includes a highly protected part of the Maasai Mara National Reserve alongside community-run conservancies.
In the National Reserve, protection is very strict and the area is actively patrolled.
In the community conservancies, Maasai landowners work in partnership with tourism companies and follow specific rules for grazing their cattle.
This setup creates a clear gradient of human and livestock pressure, from highly protected land to areas with more grazing and human presence.

To monitor wildlife, a team from UCL and local conservancy staff placed camera traps across the landscape.
A systematic approach was used to ensure the entire area was sampled evenly.
Over 180 motion-activated cameras were set up across a huge 1,200 km² area in a 2x2 km grid pattern.
In the centre of each grid square, one camera was mounted on a tree or post about 50 cm off the ground.

## **Exercise 1**: Explore the landscape

Let's start by getting a feel for the landscape where the data was collected.

Run the cell below to generate an interactive map of the study area. Each point on the map marks the location of a camera trap.

Take a moment to explore the map.
Zoom in, zoom out, and pan around the region.
Use the layer selection tool on the top right to switch between different views, like satellite imagery, topography, and street maps.

As you look, think about these questions:

1. What major geographical features can you see? Look for things like rivers, hills, and potential changes in vegetation.
2. How might these features affect which animals live there? For example, would a leopard prefer a rocky outcrop or an open plain?
3. To build a better mental picture, search online for images of the "Maasai Mara National Reserve". How does the environment look on the ground?

In [None]:
# @title "Camera Trap Location Map"

from functools import partial

import folium
import geopandas as gpd
import pandas as pd

cameras = pd.read_csv("data/metadata/cameras.csv")

areas = {
    "Mara Triangle": {
        "path": "data/gis/Triangle/",
        "color": "green",
    },
    "Mara North Conservancy": {
        "path": "data/gis/MaraNorthConservancy/",
        "color": "blue",
    },
    "Motorogi Conservancy": {
        "path": "data/gis/MotorogiConservancy/",
        "color": "orange",
    },
    "Olare Orok Conservancy": {
        "path": "data/gis/OlareOrokConservancy/",
        "color": "orange",
    },
    "Naboisho Conservancy": {
        "path": "data/gis/NaboishoConservancy/",
        "color": "purple",
    },
}


def style_fn(feature, color):
    return {"fillColor": color, "color": color}


def get_regime(location_id):
    conservancy = location_id[:-2]
    return {
        "MT": "Mara Triangle",
        "MN": "Mara North Conservancy",
        "OMC": "Motorogi Conservancy",
        "NB": "Naboisho Conservancy",
    }[conservancy]


m = folium.Map(tiles=None)

folium.TileLayer("OpenTopoMap", overlay=False).add_to(m)
folium.TileLayer("Esri.WorldImagery", overlay=False).add_to(m)
folium.TileLayer("OpenStreetMap", overlay=False).add_to(m)

crs = "EPSG:4326"

m.fit_bounds(
    [
        [cameras.Latitude.min(), cameras.Longitude.min()],
        [cameras.Latitude.max(), cameras.Longitude.max()],
    ]
)

for _, row in cameras.iterrows():
    regime = get_regime(row["Location ID"])
    color = areas[regime]["color"]
    folium.Marker(
        location=[row.Latitude, row.Longitude],
        icon=folium.Icon(prefix="fa", icon="camera", color=color),
        popup=f"Location ID = {row['Location ID']}",
    ).add_to(m)

for name, data in areas.items():
    area = gpd.read_file(data["path"]).to_crs(crs)
    layer = folium.GeoJson(
        area.to_json(),
        style_function=partial(style_fn, color=data["color"]),
        name=name,
    )
    folium.Popup(name).add_to(layer)
    layer.add_to(m)

folium.LayerControl().add_to(m)

m

# Data Collection

The dataset for this lab comes from a single year, 2018.
For this year alone we collected a total of 2.4 million images.

At this stage, we have no idea what's in these photos.
They could be rare animals or just empty shots triggered by moving branches.
A good first step in any camera trap project is to analyse the metadata (the information about the images), like when and where they were taken.
This helps us understand the data collection process itself.

## **Exercise 2**: Visualise the Collection Effort

Let's visualise the entire 2018 data collection effort.
Run the cell below to generate a plot that gives an overview of when and where photos were taken.

Each column represents a single camera trap site.
Each row is a day of the year (from 1 to 365, but we focus on 260-340).
The color of each pixel shows the number of images taken at that site on that day. White means no images were captured.

Notice that the site names tell you which area they belong to (e.g., the Mara Triangle or a specific conservancy).

Now, take a close look at the plot and think about these questions:

1. Why are there gaps and "noise" in the plot?
2. Do you see any broad patterns between the different areas? For example, did data collection start and stop at the same time everywhere?
3. Look at the color bar to see the range of values. Some sites have days with tens of thousands of images. Does a high image count for a site directly translate to high animal abundance?

In [None]:
# @title "Number of Images per Day"
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

images_metadata = pd.read_parquet("data/metadata/images.parquet")

images_per_day = (
    images_metadata.groupby(
        [
            images_metadata.site_id,
            images_metadata.datetime.dt.day_of_year.rename("day_of_year"),
        ],
        observed=True,
    )
    .size()
    .rename("num_images")
    .reset_index()
    .pivot(index="site_id", columns="day_of_year", values="num_images")
    .fillna(0)
    .T
)

images_per_day = images_per_day[images_per_day.index >= 260]

_, ax = plt.subplots(figsize=(20, 8))

ax.xaxis.tick_top()
sns.heatmap(
    images_per_day,
    ax=ax,
    mask=images_per_day == 0,
    cmap="flare",
)
ax.set(xlabel="Site ID", ylabel="Day of Year")

ax

# Manual Annotation

So, how do you find the animals in 2.4 million photos? Before jumping into AI solutions, it's helpful to understand the traditional, manual approach.

The process of reviewing images to record information about them, like which species are present, is called **annotation** or **labelling**.

Before AI became common, all camera trap research relied on people (researchers, experts, or citizen scientists) manually annotating every single photo.
While this is incredibly slow, it has a major benefit: it gives the annotator a much closer familiarity with the data.
By looking through thousands of images, you gain an intuitive sense of what the data looks like, what to expect, and where potential issues might arise.
This hands-on experience is invaluable for correctly interpreting the final results of any analysis.

Manual annotation is also essential for building AI models.
The human-labelled images serve as the "ground truth" used to both train an AI model and test how well it performs.
We'll cover that in more detail later.
For now, it's time to get a feel for the annotation process yourself.

## **Exercise 3**: Manual Annotation

Your task is to review a small set of 10 images and detect any animals you see.

Run the cell below to launch the annotation tool.

Here are your instructions:

* Draw a box around every animal you can find. If there are multiple animals in one photo, make sure to box each one.
* Just detect, don't identify. For this exercise, your only task is to find the animals, not to name their species.
* Try to make your boxes as tight as possible around the animal, without including too much background.
* Remember, some images will be empty. If you don't see any animals, just move on to the next one.
* Be thorough! Do your best to find every animal, even if it's small, far away, or partially hidden.

When you've finished all 10 images, click the Submit button to save your work.

In [None]:
# @title "Annotator"
from pathlib import Path

from ct_notebook_utils import Annotator

# path for detection data
data_dir = Path.cwd() / "data"

image_dir = data_dir / "images"

selected_images = [
    Path("data/images/2018_NB01_001794.JPG"),
    Path("data/images/2018_NB40_002921.JPG"),
    Path("data/images/2018_MT22_020230.JPG"),
    Path("data/images/2018_OMC11_009862.JPG"),
    Path("data/images/2018_NB26_025049.JPG"),
    Path("data/images/2018_MN33_009632.JPG"),
    Path("data/images/2018_NB26_000679.JPG"),
    Path("data/images/2018_MT27_005639.JPG"),
    Path("data/images/2018_NB05_002216.JPG"),
    Path("data/images/2018_NB47_006890.JPG"),
]

# create the annotation tool for the selected image files
annotator = Annotator(selected_images)

annotator.start()

## **Exercise 4**: Annotation Costs

That short annotation task gives you a feel for the process. But how does that effort scale up from 10 images to 2.4 million?

### **Step 1**: Get Your Annotation Stats

Run the cell below to see a report on your work from the last exercise. It will show you how long it took to annotate the 10 images and how many animals you found.

In [None]:
# @title "Annotation Report"
# run this cell straight after your annotation
annotation_duration = annotator.duration
tagged_image_boxes = annotator.annotations

total_images = len(selected_images)

# report how long the annotation took
print(f"The annotation of the {total_images} images took {annotation_duration}")

# given produced annotation list, we calculate total animals and the number
# of images with animals
num_animals = sum(x.shape[0] for x in tagged_image_boxes if x is not None)
num_non_empty_images = sum(x is not None for x in tagged_image_boxes)
annotation_speed = annotation_duration / float(len(selected_images))
print(
    f"In total, you found {num_animals} animals across {num_non_empty_images} images while "
    f"{total_images - num_non_empty_images} out of the {total_images} images were tagged as empty.\n"
    f"You tagged 1 image every {annotation_speed}"
)

### **Step 2:** Extrapolate to the Full Dataset

Now, let's see what it would take to annotate the entire 2018 dataset. Run the next cell to launch an interactive tool that estimates the total time and cost.

Play around with the sliders in the tool to see how the numbers change. You can control:

* Number of annotators: How many people are working on the project?
* Dataset percentage: Do you need to annotate all the images (100%), or just a smaller fraction?
* Hourly pay rate (£): How much would you pay an annotator per hour?

Think about the results. Is manually annotating the full 2.4 million images a feasible task for a typical research project?

In [None]:
# @title "Extrapolating to the whole dataset"
import ipywidgets as widgets
import pandas as pd

images_metadata = pd.read_parquet("data/metadata/images.parquet")


@widgets.interact(
    num_annotators=(1, 20),
    dataset_percentage=(0.0, 1.0, 0.05),
    cost_per_hour=(6, 50),
)
def report_annotation_costs(num_annotators=1, dataset_percentage=1.0, cost_per_hour=17):
    total_images = len(images_metadata)

    num_images_to_annotate = int(total_images * dataset_percentage)

    total_duration = annotation_speed * (num_images_to_annotate / num_annotators)
    total_cost = total_duration.total_seconds() * cost_per_hour * num_annotators / 3600
    print(
        f"At the same speed it would take {num_annotators} person(s) a total "
        f"of {total_duration} to annotate {dataset_percentage:.1%} of the data ({num_images_to_annotate:,d} images).\n"
        f"This would cost {total_cost:,.2f}£ at an hourly rate of {cost_per_hour}£."
    )

# MegaDetector

Manually annotating millions of images is clearly not practical. This is where a pre-trained AI model can save a huge amount of time.

For this notebook, we will use [**MegaDetector**](https://github.com/microsoft/CameraTraps/blob/main/megadetector.md), a model initially developed by Microsoft's AI for Earth program.
It was trained on millions of camera trap images from many different ecosystems, so it's generally robust and reliable.

**Note:** Like any AI, MegaDetector learned to detect animals by looking at a huge library of training examples.
It works best when your data is similar to what it was trained on.
It might struggle with images from unique ecosystems or with different types of imagery (like from drones).
No AI is perfect, and it will still make mistakes.
You can see some examples of known failure cases [here](https://github.com/agentmorris/MegaDetector/blob/main/megadetector-challenges.md).

Here are a few key things about it:

* MegaDetector's purpose is to find and draw boxes around three types of objects: animals, people, and vehicles. It doesn't identify the species of the animal; it just finds it.

* MegaDetector is "pretrained". This means it has already been trained on a large, general dataset. We can take this general, pre-trained model and apply it directly to our own images without any additional training.

* For every box it draws, MegaDetector provides a confidence score between 0 and 1. A score close to 1.0 means the model is very certain about its detection (e.g., "I'm 98% sure this is an animal").

* We can use these scores to automatically filter out empty images. By setting a confidence threshold, we can decide that any detection below a certain score isn't a "real" detection. Choosing this threshold is a trade-off: a lower threshold means you might find more animals but will also have to check more false positives.
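
The thresholding idea in the last point can be sketched in a few lines of Python. The detections and scores below are made-up values for illustration, and `filter_detections` and `count_non_empty` are hypothetical helpers, not part of MegaDetector.

```python
# Made-up detections for three images; each detection has a confidence score.
detections_per_image = {
    "image_a.jpg": [{"conf": 0.95}, {"conf": 0.30}],
    "image_b.jpg": [{"conf": 0.15}],
    "image_c.jpg": [],
}


def filter_detections(detections, threshold):
    """Keep only the detections whose confidence is at or above the threshold."""
    return [d for d in detections if d["conf"] >= threshold]


def count_non_empty(detections_per_image, threshold):
    """Count images that still have at least one detection after filtering."""
    return sum(
        1
        for detections in detections_per_image.values()
        if filter_detections(detections, threshold)
    )


# Raising the threshold leaves fewer images to review by hand.
for threshold in (0.1, 0.2, 0.5):
    kept = count_non_empty(detections_per_image, threshold)
    print(f"threshold={threshold}: {kept} image(s) kept for review")
```

With a low threshold, the weak detection in `image_b.jpg` keeps that image in the review pile; with a higher one it is treated as empty.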

## **Exercise 5**: Run MegaDetector

It's time to run MegaDetector on the same 10 images you annotated manually.
This will give you a direct comparison between your work and the AI's predictions.

Using an AI model isn't always simple.
In general, you have to find the model's code repository (like on GitHub) and read the developers' notes to get it working.
Most models are built with deep learning frameworks (like [PyTorch](https://pytorch.org/) or [TensorFlow](https://www.tensorflow.org/)), and using them often requires some Python scripting.

Fortunately, because MegaDetector is so popular, there are many ways to use it:

* Desktop Apps: User-friendly tools like [Addax](https://addaxdatascience.com/addaxai/) provide a graphical interface.
* Cloud Services: Platforms like [Wildlife Insights](https://wildlifeinsights.org/) have integrated MegaDetector into their workflow.
* Code: You can always run it directly from a Python script; see its [official docs](https://megadetector.readthedocs.io/en/latest/).

### Step 1: Run MegaDetector command

For this exercise, we'll use another common method: the command line.
This approach allows us to run the model by typing a single command with a few parameters, telling it where the images are and how to process them.
It's a great way to run tools without needing a graphical interface or writing a full script.

**Note:** If you're new to the command line, think of it as a text-based way to give instructions to your computer.
You type commands into an application called a "terminal" to run programs or manage files.

Run the cell below. It contains the command-line instruction to run MegaDetector on the example images.

In [None]:
# @title
%%bash
source .mdvenv/bin/activate
python -m megadetector.detection.run_detector_batch MDV5A \
  "data/images/" \
  "data/results/md_detections.json" \
  --output_relative_filenames \
  --threshold 0.2 \
  --quiet

### Step 2: Measure Speed

Once the process is finished, the cell output will show some statistics about how fast the model ran.

Find the line that looks like this:

    Finished inference for N images in M minutes and x seconds (y images per second)

Take note of the processing speed (the "images per second" value), as we'll use it in the cell below.


**Note:** The first time you run the model, it takes a bit longer because the model files have to be downloaded and prepared for processing.
This is a one-time setup cost.
To get a more accurate measure of the model's true processing speed, it's a good idea to run the cell a second time and use that value.

Now that you have the processing speed, let's extrapolate. How long would it take MegaDetector to process the entire 2.4 million image dataset?

Run the cell below to launch another interactive tool. Input your "images per second" value from the previous step. You can also use the slider to see how the total time changes if you only need to process a fraction of the dataset.

In [None]:
# @title "MegaDetector Speed"
import datetime

import ipywidgets as widgets
import pandas as pd

images_metadata = pd.read_parquet("data/metadata/images.parquet")


@widgets.interact(
    images_per_sec=widgets.BoundedFloatText(
        value=0.28,
        min=0.01,  # must stay above zero to avoid dividing by zero below
        max=10.0,
        step=0.01,
        description="Images per Sec:",
        disabled=False,
    ),
    dataset_percentage=(0.0, 1.0, 0.05),
)
def compute_megadetector_costs(
    images_per_sec=0.28,
    dataset_percentage=1,
):
    duration = datetime.timedelta(seconds=1 / images_per_sec)
    images_to_process = int(dataset_percentage * len(images_metadata))
    total_duration = duration * images_to_process
    print(
        f"MegaDetector would take {total_duration} to process {dataset_percentage:.0%} of the dataset (i.e. {images_to_process} images)"
    )

### Step 3: Think about Hardware

It's important to know that the model's processing speed heavily depends on the computer's hardware.

Most AI models run massively faster on a GPU (Graphics Processing Unit) compared to a standard CPU.
A GPU is a specialised chip, most often used for rendering graphics, that is extremely good at the types of calculations AI requires.

Not every computer has a compatible GPU, and they can be expensive.
However, using one can accelerate the process by orders of magnitude, turning a task that takes weeks into one that takes only a day.
If you're curious, you can try re-running the MegaDetector step with different hardware settings in this notebook.

**Note:** To change the hardware for this Colab notebook, go to the "Runtime" menu at the top, select "Change runtime type", and choose a different "Hardware accelerator" (like GPU or CPU).

**Warning:** Changing the runtime will disconnect you and start a new session, so you'll have to run all the setup steps from the beginning of the notebook.

## **Exercise 6**: MegaDetector Outputs

When MegaDetector finished, it saved all its findings into a single output file located at: `data/results/md_detections.json`.

This is a JSON file, a common format for storing structured data.
While scientists often work with tables (like CSV files or Excel spreadsheets), JSON is widely used in software and on the web.
It's essentially a text file, but the information is organised in a specific, nested way that can look a bit confusing at first.
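
To make the nesting concrete, here is a small, hand-written example in the general shape of a detection results file, parsed with Python's built-in `json` module. The file names, scores, and box values are invented for illustration, and the `category`, `conf`, `bbox`, and `detection_categories` field names are assumed from the MegaDetector output format, so compare them with your own file.

```python
import json

# A hand-written snippet that mimics the layout of a detection results file.
# All values are invented for illustration.
raw_text = """
{
  "detection_categories": {"1": "animal", "2": "person", "3": "vehicle"},
  "images": [
    {
      "file": "example_001.JPG",
      "detections": [
        {"category": "1", "conf": 0.92, "bbox": [0.10, 0.40, 0.25, 0.30]}
      ]
    },
    {"file": "example_002.JPG", "detections": []}
  ]
}
"""

# json.loads turns the text into nested Python dictionaries and lists.
results = json.loads(raw_text)
categories = results["detection_categories"]

for image in results["images"]:
    print(image["file"])
    if not image["detections"]:
        print("  (no detections)")
    for detection in image["detections"]:
        label = categories[detection["category"]]
        print(f"  {label} with confidence {detection['conf']}")
```

The key idea is that each image entry holds its own list of detections, which is why the data does not fit neatly into a single flat table without some reshaping.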

### Step 1: Take a Peek at the Raw JSON

Let's get a feel for what this raw data format looks like.

Use the file browser on the left to navigate to the `md_detections.json` file and double-click to open it.
It will appear in a new tab.

Don't worry about understanding every detail.
The goal is just to see how the information is structured.
When you're done, close the tab to come back to this notebook.

Since JSON is such a common format, it's worth getting familiar with it.
If you're curious to learn more, you can read about it [here](https://developer.mozilla.org/en-US/docs/Learn/JavaScript/Objects/JSON).

### Step 2: Read the File with Code

Now, let's use code to read that same JSON file and pull out a summary of what MegaDetector found. Did you get the same number of animals?

Run the cell below.

In [None]:
# @title "MegaDetector Report"
import json
from pathlib import Path

data_dir = Path("data")
output_file_path = data_dir / "results" / "md_detections.json"

selected_images = [
    "2018_NB01_001794.JPG",
    "2018_NB40_002921.JPG",
    "2018_MT22_020230.JPG",
    "2018_OMC11_009862.JPG",
    "2018_NB26_025049.JPG",
    "2018_MN33_009632.JPG",
    "2018_NB26_000679.JPG",
    "2018_MT27_005639.JPG",
    "2018_NB05_002216.JPG",
    "2018_NB47_006890.JPG",
]

md_output = json.loads(Path(output_file_path).read_text())["images"]

total_animals = 0
empty_images = 0
total_images = len(selected_images)

for file_info in md_output:
    if Path(file_info["file"]).name not in selected_images:
        continue

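    # Note: this counts every detection for the image, whatever its
    # category (animal, person, or vehicle).
    detections = len(file_info["detections"])
    total_animals += detections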

    if detections == 0:
        empty_images += 1

non_empty_images = total_images - empty_images

print(
    f"In total, MegaDetector found {total_animals} animals across "
    f"{non_empty_images} images while {empty_images} out of the "
    f"{total_images} images were tagged as empty."
)

### Step 3: Visualise the outputs

While the JSON file contains all the information, it's not easy to interpret on its own. Visualising the model's detections makes the results much easier to inspect.

First, run the cell below. It will create a new folder with copies of the original images, but with MegaDetector's detections (the bounding boxes, labels, and confidence scores) drawn on top.

In [None]:
# @title
%%bash
source .mdvenv/bin/activate
python -m megadetector.visualization.visualize_detector_output \
  --images_dir "data/images/" \
  "data/results/md_detections.json" \
  "data/results/md_viz/" \
  --output_image_width 1000

Now, run the next cell to display those annotated images in an interactive viewer right here in the notebook.

In [None]:
# @title Visualise MegaDetector Outputs
from pathlib import Path

from ct_notebook_utils import image_tabs

data_dir = Path("data")

md_viz_dir = data_dir / "results" / "md_viz"

selected_images = [
    "2018_NB01_001794.JPG",
    "2018_NB40_002921.JPG",
    "2018_MT22_020230.JPG",
    "2018_OMC11_009862.JPG",
    "2018_NB26_025049.JPG",
    "2018_MN33_009632.JPG",
    "2018_NB26_000679.JPG",
    "2018_MT27_005639.JPG",
    "2018_NB05_002216.JPG",
    "2018_NB47_006890.JPG",
]

image_tabs(
    [
        path
        for path in md_viz_dir.glob("*.JPG")
        if path.name.replace("anno_", "") in selected_images
    ],
    width=1000,
)

Use the viewer to look through the 10 images. Compare the AI's detections with the annotations you made earlier and think about the following questions:

1. How did the AI do? Did MegaDetector find the same animals that you did?
2. Did the AI miss anything? If you found an animal that MegaDetector missed, why do you think it failed?
3. Did the AI find anything you missed? Sometimes the model can spot things that are easy for a human eye to overlook.
4. What's a good confidence threshold? Look at the confidence scores on the boxes. Based on your own judgment, could you pick a threshold (e.g., 0.5) that successfully separates the real animals from false detections?
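
One common way to put a number on how well two boxes agree is the intersection over union (IoU): the area where the boxes overlap divided by the total area they cover together. The sketch below is a minimal illustration with made-up boxes given as `(x_min, y_min, x_max, y_max)`; it is not how the tools in this notebook compare boxes, and `box_area` and `iou` are hypothetical helper names.

```python
def box_area(box):
    """Area of a box given as (x_min, y_min, x_max, y_max); zero if it is empty."""
    x_min, y_min, x_max, y_max = box
    return max(0.0, x_max - x_min) * max(0.0, y_max - y_min)


def iou(box_a, box_b):
    """Intersection over union of two boxes (1.0 = identical, 0.0 = no overlap)."""
    # Corners of the rectangle where the two boxes overlap (it may be empty).
    overlap = (
        max(box_a[0], box_b[0]),
        max(box_a[1], box_b[1]),
        min(box_a[2], box_b[2]),
        min(box_a[3], box_b[3]),
    )
    intersection = box_area(overlap)
    union = box_area(box_a) + box_area(box_b) - intersection
    return intersection / union if union > 0 else 0.0


# Made-up example: a box drawn by a person and a box predicted by a model.
human_box = (10, 10, 50, 50)
model_box = (30, 30, 70, 70)
print(f"IoU = {iou(human_box, model_box):.3f}")
```

A value near 1 means the two boxes almost coincide, while a value near 0 means they barely overlap or do not overlap at all.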

# Species Classification

MegaDetector is a great first step for processing camera trap data, as it helps filter out all the empty images.
But if you want to identify which species are present, MegaDetector can't help.
The next step is typically to manually review all the non-empty images, or perhaps to build a custom species classifier.
However, a new tool called [**SpeciesNet**](https://github.com/google/cameratrapai) offers another powerful, pre-trained solution, similar to MegaDetector but for species identification.

Both models were trained on huge amounts of data.
SpeciesNet, for example, was trained on around 65 million images from all over the globe (see [this paper](https://doi.org/10.1049/cvi2.12318) for more details).
These images were collected, manually annotated by many different research teams, and then shared on platforms like Wildlife Insights.
The result is a model that synthesises a vast amount of expert knowledge to identify animals in photos.
Both models are also free and open-source, meaning you can inspect their code and use them however you like, as long as you attribute them correctly.

SpeciesNet is different from MegaDetector because it doesn't locate where an animal is in a photo.
Instead, it looks at the *whole image* and tries to identify the species shown.
Its current version can recognise 2,000 different classes of animals.
If it can't identify the exact species, it will attempt to name a higher taxonomic category (like genus or family).
Like MegaDetector, SpeciesNet also outputs a confidence score for each classification.

In most scenarios, the two models are best used together.
The ideal workflow is to run MegaDetector first to find and locate all the animals, then crop the image around each detection, and finally, run SpeciesNet on those smaller, cropped images to identify the species.
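
The cropping step in that workflow can be sketched as follows. This assumes each bounding box is stored as `[minx, miny, width, height]` in coordinates relative to the image size (values between 0 and 1). `bbox_to_pixel_box` is a hypothetical helper written for this illustration, not a function from either tool.

```python
def bbox_to_pixel_box(bbox, image_width, image_height):
    """Convert a relative [minx, miny, width, height] box to pixel corners."""
    minx, miny, width, height = bbox
    left = round(minx * image_width)
    top = round(miny * image_height)
    right = round((minx + width) * image_width)
    bottom = round((miny + height) * image_height)

    # Clamp to the image bounds in case the box spills over an edge.
    left, right = max(0, left), min(image_width, right)
    top, bottom = max(0, top), min(image_height, bottom)
    return left, top, right, bottom


# A box covering the middle half of the width and a quarter of the height
# of a 2000 x 1000 pixel image.
print(bbox_to_pixel_box([0.25, 0.5, 0.5, 0.25], 2000, 1000))
# → (500, 500, 1500, 750)
```

The resulting `(left, top, right, bottom)` tuple is in the order that Pillow's `Image.crop` expects, so it could be used to cut each animal out before classification.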

## **Exercise**: Using SpeciesNet

### **Step 1**: Run the SpeciesNet command

In [None]:
# @title
%%bash
source .mdvenv/bin/activate
python -m megadetector.detection.run_md_and_speciesnet \
    "data/images/" \
    "data/results/speciesnet_predictions.json" \
    --country KEN

### **Step 2**: Visualise the outputs

In [None]:
# @title
%%bash
source .mdvenv/bin/activate
python -m megadetector.visualization.visualize_detector_output \
    --images_dir "data/images/" \
    "data/results/speciesnet_predictions.json" \
    "data/results/speciesnet_viz" \
    --output_image_width 1000

In [None]:
# @title Visualise SpeciesNet Outputs
from pathlib import Path

from ct_notebook_utils import image_tabs

data_dir = Path("data")

speciesnet_viz_dir = data_dir / "results" / "speciesnet_viz"

image_tabs([path for path in speciesnet_viz_dir.glob("*.JPG")], show_max=50)

## **Exercise**: Working with detection data

The output from SpeciesNet is saved in a JSON file, very similar to the one MegaDetector produced.

This format is useful because it's compatible with other tools.
For example, you could load these results into a graphical interface like [TimeLapse](https://timelapse.ucalgary.ca/) to review them visually.
For analysis, however, it's often easier to work with a simple table (like a CSV file).

Let's convert the JSON output into a more familiar tabular format.
The cell below extracts all the detection and classification information and saves it as a CSV file named `ai_outputs.csv`.
Run the cell, then use the file browser on the left to find and open the new CSV file to see what the final data looks like.

In [None]:
# @title Turn Detections into Table
import json
from pathlib import Path

from google.colab import data_table

data_table.enable_dataframe_formatter()

import pandas as pd

data_dir = Path("data")

outputs_path = data_dir / "results" / "speciesnet_predictions.json"

speciesnet_results = json.loads(outputs_path.read_text())

classes = speciesnet_results["classification_categories"]


species_df = []
for file_predictions in speciesnet_results["images"]:
    for detection in file_predictions["detections"]:
        if detection["category"] != "1":
            # Ignore detection if not an animal
            continue

        if "classifications" not in detection:
            # Ignore detection if it was not classified
            continue


        minx, miny, width, height = detection["bbox"]

        class_num, class_score = detection["classifications"][0]
        class_name = classes[class_num]

        if class_name == "blank":
            # Ignore detection if it was classified as blank
            continue

        species_df.append(
            {
                "filepath": file_predictions["file"],
                "detection_score": detection["conf"],
                "class": class_name,
                "class_score": class_score,
                "minx": minx,
                "miny": miny,
                "width": width,
                "height": height,
            }
        )

species_df = pd.DataFrame(species_df)
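# Save next to the other results so the file appears in data/results
output_csv = data_dir / "results" / "ai_outputs.csv"
species_df.to_csv(output_csv, index=False)
print(f"CSV saved to {output_csv}")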

Download the `ai_outputs.csv` file from the `data/results` folder to your computer.
You can do this by finding it in the file browser on the left, clicking the three dots, and selecting "Download".

Once you have the file, use your preferred tool (like Excel, R, or Python) to answer the following questions:

1. Which species (or class) was detected most frequently by the AI?
2. Is there a noticeable difference in the average confidence scores between species? Why do you think that might be?

**Note**: If you're comfortable with Python, feel free to add a new code cell below and do your analysis right here in this notebook.
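
If you go the Python route, one possible starting point is sketched below. It uses a tiny invented table with the same `class` and `class_score` columns as the CSV, so the numbers are not real results. The non-impala labels are made up for illustration.

```python
import pandas as pd

# Toy stand-in for ai_outputs.csv (invented values, same column names).
toy_outputs = pd.DataFrame(
    {
        "class": ["impala", "impala", "zebra", "impala", "zebra", "hyena"],
        "class_score": [0.9, 0.8, 0.6, 0.7, 0.4, 0.5],
    }
)

# Number of detections per class, most frequent first.
counts = toy_outputs["class"].value_counts()

# Mean classification confidence per class.
mean_scores = toy_outputs.groupby("class")["class_score"].mean()

print(counts)
print(mean_scores.round(2))
```

To run it on the real data, replace `toy_outputs` with the table loaded from the CSV via `pd.read_csv`.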

# Evaluating AI Outputs

After running the AI models, we have a neat table of species detections.
But are they correct?
As you've probably noticed from the examples, the model's outputs aren't always accurate.
Even models trained on millions of images will make mistakes, especially if your data looks very different from their training data.

It is **critical** to get a quantitative measure of how well a model performs on your specific data.
Without this step, you risk basing your research on "model hallucinations" rather than real ecological patterns.
A good starting point is to read the performance reports from the model's creators, like the [accompanying paper](https://ietresearch.onlinelibrary.wiley.com/doi/10.1049/cvi2.12318) for SpeciesNet.
However, it is **strongly recommended** that you always perform your own independent evaluation on your own dataset.

Evaluating a model means comparing its predictions to a set of correct or *"ground truth"* answers.
To create a dataset for testing the model, you need to manually annotate a subset of your own images.
These annotations represent what you want the model to output.
By comparing the model's predictions to this ground truth, you can calculate a suite of performance metrics that measure how well the model is doing its job.
This is why manual annotation remains a vital step in any AI-assisted study.

## **Exercise:** Designing an Evaluation Dataset

How should you choose which images to manually annotate?
The way you select your sample can significantly impact your understanding of the model's performance.

### **Question 1**: Comparing Two Common Sampling Strategies

Consider the pros and cons of these two common methods for selecting images to annotate for your ground truth:

* Method A: Random Image Sample

    Select a completely random sample of 1,000-2,000 images from the entire dataset and manually annotate everything in them.
    
* Method B: Species-Stratified Sample

    For each species you care about, select 20-30 random images that the AI has already labelled as that species, and then manually check if the AI was correct.

Think about what each method would allow you to measure. For example:

- Which method is better for evaluating how well the model filters out empty images?
- What are the potential biases of Method B?
- What are the risks of Method A?

### **Question 2**: What Else Needs Consideration?

A random sample, while seemingly unbiased, can easily miss or underrepresent important conditions in your data.
For example, even if 30% of your images were taken at night, a small random sample might happen to include very few of them.
If the model performs poorly in the dark, your evaluation would fail to represent this weakness, giving you a misleadingly optimistic view of its overall performance.

Beyond just day vs. night, what other factors should you consider to ensure your evaluation set is representative and tests the model under different conditions?
List a few examples of how you might stratify your sample to get a more complete picture of the model's performance.
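
Once you have decided on the strata, drawing the sample is mechanically simple. The sketch below assumes a hypothetical metadata table with a `period` column (day or night). Neither the table nor the column comes from the project data.

```python
import pandas as pd

# Hypothetical image metadata: 70 daytime and 30 night-time images.
images = pd.DataFrame(
    {
        "filename": [f"img_{i:03d}.JPG" for i in range(100)],
        "period": ["day"] * 70 + ["night"] * 30,
    }
)

# Draw the same number of images from each stratum so that night-time
# images are guaranteed to appear in the evaluation set.
evaluation_sample = images.groupby("period").sample(n=10, random_state=0)

print(evaluation_sample["period"].value_counts())
```

Sampling equal numbers per stratum over-represents rare conditions, so the natural thing to report is performance per stratum. An overall figure would need re-weighting by each stratum's share of the full dataset.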

## **Exercise**: Measuring Model Performance

Now, let's compare the AI's predictions to a "ground truth" dataset of annotations already prepared by the UCL team.

There are many ways to measure performance, and the "best" method depends on your research question.
The AI pipeline above gives us a lot of detail: it detects multiple animals, provides their exact locations with bounding boxes, and identifies their species.
While this level of detail is useful for some studies, many ecological questions only require knowing that a certain species was present at a site.
For analyses like occupancy modelling, it doesn't matter how many individuals were in the photo or exactly where they were.

For this exercise, let's assume that's our goal: we just want to know if the AI can correctly tell us which species were present in each image, matching the answers provided by human experts.
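
Reducing per-animal detections to per-image presence can be sketched like this. The table is a toy example with invented rows. It only reuses the `filepath` and `class` column names from the detection table built earlier.

```python
import pandas as pd

# Toy detection table (invented values) with one row per detected animal.
detections = pd.DataFrame(
    {
        "filepath": ["a.JPG", "a.JPG", "a.JPG", "b.JPG"],
        "class": ["impala", "impala", "zebra", "impala"],
    }
)

# One row per (image, species) pair: counts and box locations are discarded.
presence = detections[["filepath", "class"]].drop_duplicates()

# Species present in each image, as a sorted list.
species_per_image = presence.groupby("filepath")["class"].apply(sorted)

print(species_per_image)
```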

### Step 1: Mapping species labels

An important and surprisingly tricky step is to make sure we are comparing apples to apples.
This means ensuring that the species labels from our ground truth match the labels used by the AI model.

Because the manual annotations and the SpeciesNet model were created independently, their labels don't perfectly align.
Run the cell below to see a comparison of the two sets of labels.

In [None]:
# @title Label Comparison
from itertools import zip_longest

import pandas as pd

annotations = pd.read_csv("data/metadata/labels.csv")

ann_labels = annotations.label.sort_values().unique()
pred_labels = species_df["class"].sort_values().unique()

print(f"{'Annotation Labels':^20} | {'SpeciesNet Labels':^20} ")
print(f"{'-'*20} | {'-'*20} ")
for ann_lab, pred_lab in zip_longest(ann_labels, pred_labels, fillvalue=""):
    print(f"{ann_lab:^20} | {pred_lab:^20}")

Take a look at the two lists and think about the following questions:

* How would you match these two different sets of labels to allow for a fair comparison?
* If you were starting a new camera trap project today, how could you plan your annotation process to avoid this problem from the start? Is it always possible?
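
One possible approach to the first question is an explicit lookup table from model labels to your own vocabulary. The sketch below is illustrative only: the mapping is invented, and `harmonise` is a hypothetical helper. A real mapping has to be built by inspecting both label lists.

```python
# Hypothetical mapping from model labels to the annotation vocabulary.
# The entries are invented for illustration.
label_map = {
    "impala": "impala",
    "plains zebra": "zebra",
    "bovidae family": None,  # too coarse to match any single species
}


def harmonise(model_label):
    """Return the annotation label for a model label, or None if unmapped."""
    return label_map.get(model_label)


for label in ["impala", "plains zebra", "bovidae family", "domestic dog"]:
    print(f"{label!r} -> {harmonise(label)!r}")
```

Returning `None` for unmapped or overly coarse labels keeps those cases visible, so you can decide deliberately whether to drop them or count them as errors.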

### Step 2: Understanding Different Types of Errors

Let's focus our analysis on just one species: the impala. For this exercise, we'll keep it simple and only count a prediction as correct if SpeciesNet gives the specific "impala" label, ignoring higher-level classifications like "bovidae family".

Our AI pipeline gives us two confidence scores for each detection, but for now, we'll just focus on the final species classification score from SpeciesNet.

Run the cell below. It will go through each image in our ground truth set and calculate two things:
1. Does the image actually contain an impala (based on the manual labels)?
2. What was the AI's highest confidence score for "impala" in that image? (This will be 0 if the AI found no impalas).

In [None]:
# @title Get Per Image Info
gt_classes = ["impala"]
pred_classes = ["impala"]


def get_pred_score(group):
    preds = group[group["class"].isin(pred_classes)]

    if len(preds) == 0:
        return 0

    return preds["class_score"].max()


comparison = (
    annotations.groupby("filename")
    .label.apply(lambda group: (group.isin(gt_classes)).any())
    .rename("ground_truth")
    .to_frame()
    .join(
        species_df.groupby("filepath")
        .apply(get_pred_score, include_groups=False)
        .rename("score")
    )
    .fillna(0)
    .reset_index()
)
comparison

Now we can use a confidence threshold to make a decision.
If an image's score is at or above the threshold, we'll say the AI predicted "impala". If the score is below it, we'll say it did not.

By comparing the AI's prediction to the ground truth for each image, we get one of four possible outcomes:

* True Positive (TP): The image has an impala, and the AI correctly predicted it.
* False Negative (FN): The image has an impala, but the AI missed it.
* False Positive (FP): The image does not have an impala, but the AI said it did.
* True Negative (TN): The image does not have an impala, and the AI correctly said it didn't.

Using these four outcomes, we can calculate two standard performance metrics:

* **Precision**: Of all the times the AI predicted "impala", what fraction was it correct?

  $$ Precision = \frac{TP}{TP + FP} $$

* **Recall**: Of all the images that truly contained an impala, what fraction did the AI find?

  $$ Recall = \frac{TP}{TP + FN} $$
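
As a small worked example, the sketch below counts the four outcomes for six invented images at a threshold of 0.8 and derives both metrics from them. The data are made up purely to show the arithmetic.

```python
# Toy data (invented): ground truth and AI "impala" score for six images.
ground_truth = [True, True, True, True, False, False]
scores = [0.95, 0.85, 0.30, 0.10, 0.90, 0.20]
threshold = 0.8

# The AI "predicts impala" when the score reaches the threshold.
predicted = [score >= threshold for score in scores]

tp = sum(p and g for p, g in zip(predicted, ground_truth))
fp = sum(p and not g for p, g in zip(predicted, ground_truth))
fn = sum(not p and g for p, g in zip(predicted, ground_truth))
tn = sum(not p and not g for p, g in zip(predicted, ground_truth))

precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
# → TP=2 FP=1 FN=2 TN=1
print(f"precision={precision:.2f} recall={recall:.2f}")
# → precision=0.67 recall=0.50
```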

The values of Precision and Recall depend entirely on the confidence threshold you choose. Play with the slider in the next cell to see how they change.

In [None]:
# @title Compute Precision and Recall

import ipywidgets as widgets


@widgets.interact(threshold=(0, 0.99, 0.01))
def compute_precision_recall(threshold=0.8):
    gt = comparison["ground_truth"]
    preds = comparison["score"] >= threshold
    tp = (preds & gt).sum()
    fp = (preds & ~gt).sum()
    fn = (~preds & gt).sum()

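    # With no positive predictions, precision is undefined; report 1 by convention.
    if tp + fp == 0:
        precision = 1
    else:
        precision = tp / (tp + fp)

    # With no impala images in the ground truth, recall is undefined; report 1 by convention.
    if tp + fn == 0:
        recall = 1
    else:
        recall = tp / (tp + fn)

    print(f"{precision=:.1%} {recall=:.1%}")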
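
If you prefer a table to a slider, the same calculation can be repeated over several thresholds. The sketch below runs on an invented table with the same `ground_truth` and `score` columns as `comparison`, so the values are illustrative rather than real results.

```python
import pandas as pd

# Toy comparison table (invented values), same columns as `comparison`.
toy = pd.DataFrame(
    {
        "ground_truth": [True, True, True, True, False, False],
        "score": [0.95, 0.85, 0.30, 0.10, 0.60, 0.20],
    }
)

rows = []
for threshold in (0.2, 0.5, 0.9):
    predicted = toy["score"] >= threshold
    tp = int((predicted & toy["ground_truth"]).sum())
    fp = int((predicted & ~toy["ground_truth"]).sum())
    fn = int((~predicted & toy["ground_truth"]).sum())
    rows.append(
        {
            "threshold": threshold,
            # Undefined ratios are reported as NaN in this sketch.
            "precision": tp / (tp + fp) if tp + fp else float("nan"),
            "recall": tp / (tp + fn) if tp + fn else float("nan"),
        }
    )

sweep = pd.DataFrame(rows)
print(sweep.round(2))
```

In this toy example, raising the threshold increases precision and lowers recall, which is the typical trade-off you will see when moving the slider.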

### Step 3: Final Reflections

Reflect on these final questions:

* Think back to your own manual annotation exercise. How do you think your performance at identifying impalas would compare to the AI's?
* We've been assuming that the human annotations are 100% correct. Is that a safe assumption in a real-world project? What factors could affect the reliability of human annotators?
* We focused on evaluating species classification. How would you design an evaluation for the first step—the detection of animals (did MegaDetector find the animal and draw an accurate box)?
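
For the last question, one widely used measure of box accuracy is Intersection over Union (IoU): the area shared by a predicted box and a ground-truth box, divided by the area they cover together. The sketch below assumes boxes in the `[minx, miny, width, height]` layout used in the detection table. `iou` is a hypothetical helper written for this illustration.

```python
def iou(box_a, box_b):
    """Intersection over Union for two [minx, miny, width, height] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b

    # Width and height of the overlapping rectangle (zero if disjoint).
    overlap_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    overlap_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = overlap_w * overlap_h

    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


# Two 2 x 2 boxes offset by one unit share an area of 1 out of 7.
print(round(iou([0, 0, 2, 2], [1, 1, 2, 2]), 3))
# → 0.143
```

A detection is often counted as correct when its IoU with a ground-truth box exceeds a chosen cut-off such as 0.5, though the right cut-off depends on how precisely you need animals to be located.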

# Conclusion

Hopefully, this notebook has given you a good introduction to how AI can be used for large-scale ecological research.
Tools like MegaDetector and SpeciesNet can be incredibly helpful, and new models are constantly being developed.
However, the most important takeaway is that AI is just one part of the process.
Understanding your data, the ecological context, and critically validating the model's outputs are fundamental steps.
AI models are powerful tools, but only when used carefully and responsibly.

## **Exercise**: Final Thoughts & Discussion

Consider the following questions:

1. What are the limits of this workflow? When would the MegaDetector + SpeciesNet pipeline be a great solution, and for what kinds of research questions would it be unsuitable?
2. How could this apply to your research? Could a similar AI workflow be useful in your specific field of study? If you don't use camera traps, are you aware of similar AI tools for your type of data (e.g., for audio, satellite imagery, or genetic data)?
3. What are the potential downsides? Beyond just getting incorrect predictions, what are some of the other risks or drawbacks of relying on AI for ecological research?