<a href="https://colab.research.google.com/github/NatBrian/wildlife-camera-trap-ai-system/blob/main/notebook/megadetector_hybrid_detection_pipeline.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Hybrid Detection Model: MegaDetector + Species Classification

This notebook fine-tunes the **MDV6-yolov10-c** checkpoint to create a hybrid detection model with **7 classes**:

| Class ID | Label | Description |
|----------|-------|-------------|
| 0 | animal | Generic animal (from MDV6 base) |
| 1 | person | Human detection (from MDV6 base) |
| 2 | vehicle | Vehicle detection (from MDV6 base) |
| 3 | bird | Species-specific (Open Images) |
| 4 | squirrel | Species-specific (Open Images) |
| 5 | dog | Species-specific (Open Images) |
| 6 | cat | Species-specific (Open Images) |

**Strategy**: The pretrained MDV6 checkpoint already detects `animal`, `person`, and `vehicle`. We fine-tune with frozen backbone layers to:
- **Preserve** generic detection for unseen animals, people, and vehicles
- **Add** species-specific detection for bird, squirrel, dog, cat

**What this notebook does**:
- Download Open Images data for the 4 species (bird, squirrel, dog, cat)
- Convert to YOLO format with correct class IDs (3, 4, 5, 6)
- Fine-tune MDV6 with frozen backbone to prevent catastrophic forgetting
- Evaluate hybrid detection on test images

## Attribution

The pretrained **MegaDetector (MDV6‚Äëyolov10‚Äëc)** model used in this notebook is provided by Microsoft‚ÄØAI‚ÄØfor‚ÄØCamera‚ÄØTraps.

- **Model source:** https://microsoft.github.io/CameraTraps/model_zoo/megadetector/
- **Citation:** Microsoft AI for Camera Traps. *MegaDetector* https://github.com/microsoft/CameraTraps.
---


## 1) Setup: Install dependencies

In [1]:
# Install required packages
# If running in Colab, keep `--upgrade` flags; in local environment, you may want to control versions.
!pip install -U ultralytics fiftyone fiftyone-brain fiftyone-db tqdm

# Optional: roboflow or fiftyone may help for visualisation and dataset downloads
# !pip install roboflow

Collecting ultralytics
  Downloading ultralytics-8.3.235-py3-none-any.whl.metadata (37 kB)
Collecting fiftyone
  Downloading fiftyone-1.10.0-py3-none-any.whl.metadata (22 kB)
Collecting fiftyone-brain
  Downloading fiftyone_brain-0.21.4-py3-none-any.whl.metadata (4.8 kB)
Collecting fiftyone-db
  Downloading fiftyone_db-1.4.0.tar.gz (8.6 kB)
  Installing build dependencies ... [?25l[?25hdone
  Getting requirements to build wheel ... [?25l[?25hdone
  Preparing metadata (pyproject.toml) ... [?25l[?25hdone
Collecting ultralytics-thop>=2.0.18 (from ultralytics)
  Downloading ultralytics_thop-2.0.18-py3-none-any.whl.metadata (14 kB)
Collecting argcomplete (from fiftyone)
  Downloading argcomplete-3.6.3-py3-none-any.whl.metadata (16 kB)
Collecting async_lru>=2 (from fiftyone)
  Downloading async_lru-2.0.5-py3-none-any.whl.metadata (4.5 kB)
Collecting boto3 (from fiftyone)
  Downloading boto3-1.42.2-py3-none-any.whl.metadata (6.8 kB)
Collecting dacite<2,>=1.6.0 (from fiftyone)
  Download

In [2]:
# Uninstall current installations to avoid conflicts
!pip uninstall -y ultralytics torch torchvision torchaudio

# Install PyTorch with CUDA 12.6 (ensure this matches your Colab GPU's CUDA version)
# The previous output showed torch-2.9.0+cu126, so targeting CUDA 12.6
!pip install torch==2.9.0+cu126 torchvision==0.24.0+cu126 torchaudio==2.9.0 --index-url https://download.pytorch.org/whl/cu126

# Reinstall ultralytics
!pip install -U ultralytics

# Verify GPU availability after reinstallation
import torch
print(f"torch.cuda.is_available(): {torch.cuda.is_available()}")
print(f"torch.cuda.device_count(): {torch.cuda.device_count()}")

Found existing installation: ultralytics 8.3.235
Uninstalling ultralytics-8.3.235:
  Successfully uninstalled ultralytics-8.3.235
Found existing installation: torch 2.9.0+cu126
Uninstalling torch-2.9.0+cu126:
  Successfully uninstalled torch-2.9.0+cu126
Found existing installation: torchvision 0.24.0+cu126
Uninstalling torchvision-0.24.0+cu126:
  Successfully uninstalled torchvision-0.24.0+cu126
Found existing installation: torchaudio 2.9.0+cu126
Uninstalling torchaudio-2.9.0+cu126:
  Successfully uninstalled torchaudio-2.9.0+cu126
Looking in indexes: https://download.pytorch.org/whl/cu126
Collecting torch==2.9.0+cu126
  Downloading https://download.pytorch.org/whl/cu126/torch-2.9.0%2Bcu126-cp312-cp312-manylinux_2_28_x86_64.whl.metadata (30 kB)
Collecting torchvision==0.24.0+cu126
  Downloading https://download.pytorch.org/whl/cu126/torchvision-0.24.0%2Bcu126-cp312-cp312-manylinux_2_28_x86_64.whl.metadata (5.9 kB)
Collecting torchaudio==2.9.0
  Downloading https://download.pytorch.org/

## 2) Define 7-class structure and create `data.yaml`

In [3]:
# Define the 7-class structure for hybrid detection
# Classes 0-2 are preserved from MDV6 base model
# Classes 3-6 are species-specific from Open Images fine-tuning
import yaml, os

# IMPORTANT: Class order must match MDV6 base (0=animal, 1=person, 2=vehicle)
classes = ["animal", "person", "vehicle", "bird", "squirrel", "dog", "cat"]

data_yaml = {
    'train': 'datasets/hybrid/images/train',
    'val': 'datasets/hybrid/images/val',
    'test': 'datasets/hybrid/images/test',
    'nc': len(classes),
    'names': classes
}

os.makedirs('datasets/hybrid', exist_ok=True)
with open('datasets/hybrid/data.yaml','w') as f:
    yaml.dump(data_yaml, f)

print('Created datasets/hybrid/data.yaml with 7 classes:')
for i, cls in enumerate(classes):
    source = 'MDV6 base' if i < 3 else 'Open Images'
    print(f'  {i}: {cls} ({source})')


Created datasets/hybrid/data.yaml with 7 classes:
  0: animal (MDV6 base)
  1: person (MDV6 base)
  2: vehicle (MDV6 base)
  3: bird (Open Images)
  4: squirrel (Open Images)
  5: dog (Open Images)
  6: cat (Open Images)


## 3) Download Open Images for the 4 Species

We **only download data for the 4 species** (bird, squirrel, dog, cat).

**Why not download animal/person/vehicle?**
- The MDV6 checkpoint already has strong features for these generic classes
- We use `freeze=10` during training to preserve the backbone
- This prevents catastrophic forgetting of the base detection capabilities

The fine-tuning teaches the model to:
1. Recognize the 4 specific species (bird, squirrel, dog, cat)
2. Keep detecting generic animals, people, and vehicles from the frozen backbone


In [4]:
# Download Open Images for the 4 species only
# We DO NOT need to retrain on animal/person/vehicle - MDV6 base handles those
from fiftyone import zoo as foz
import fiftyone as fo
import os

# Only the 4 species we want to add to MDV6's detection
species_classes = ["bird", "squirrel", "dog", "cat"]

# Map to Open Images label strings (case-sensitive)
oi_class_map = {
    "bird": "Bird",
    "squirrel": "Squirrel",
    "dog": "Dog",
    "cat": "Cat"
}

# YOLO class IDs for the species (classes 3-6 in our 7-class system)
species_class_ids = {
    "bird": 3,
    "squirrel": 4,
    "dog": 5,
    "cat": 6
}

max_samples_per_class = 1000

os.makedirs('datasets/hybrid/images', exist_ok=True)
os.makedirs('datasets/hybrid/labels', exist_ok=True)

for cls in species_classes:
    oi_label = oi_class_map[cls]
    yolo_id = species_class_ids[cls]
    print(f"Downloading {max_samples_per_class} samples for '{oi_label}' ‚Üí class {yolo_id} ({cls})...")
    try:
        dataset = foz.load_zoo_dataset(
            "open-images-v7",
            split="train",
            label_types=["detections"],
            classes=[oi_label],
            max_samples=max_samples_per_class,
            dataset_name=f"oi_species_{cls}",
            seed=51,
            download=True
        )
        print(f"  ‚úì Downloaded {len(dataset)} samples for {cls}")
    except Exception as e:
        print(f"  ‚úó Error downloading {cls}: {e}")


  return '(?ms)' + res + '\Z'


Downloading 1000 samples for 'Bird' ‚Üí class 3 (bird)...
Downloading split 'train' to '/root/fiftyone/open-images-v7/train' if necessary


INFO:fiftyone.zoo.datasets:Downloading split 'train' to '/root/fiftyone/open-images-v7/train' if necessary


Downloading 'https://storage.googleapis.com/openimages/2018_04/train/train-images-boxable-with-rotation.csv' to '/root/fiftyone/open-images-v7/train/metadata/image_ids.csv'


INFO:fiftyone.utils.openimages:Downloading 'https://storage.googleapis.com/openimages/2018_04/train/train-images-boxable-with-rotation.csv' to '/root/fiftyone/open-images-v7/train/metadata/image_ids.csv'


 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà|    4.8Gb/4.8Gb [7.5s elapsed, 0s remaining, 827.8Mb/s]      


INFO:eta.core.utils: 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà|    4.8Gb/4.8Gb [7.5s elapsed, 0s remaining, 827.8Mb/s]      


Downloading 'https://storage.googleapis.com/openimages/v5/class-descriptions-boxable.csv' to '/root/fiftyone/open-images-v7/train/metadata/classes.csv'


INFO:fiftyone.utils.openimages:Downloading 'https://storage.googleapis.com/openimages/v5/class-descriptions-boxable.csv' to '/root/fiftyone/open-images-v7/train/metadata/classes.csv'


Downloading 'https://storage.googleapis.com/openimages/2018_04/bbox_labels_600_hierarchy.json' to '/tmp/tmpbw47epcc/metadata/hierarchy.json'


INFO:fiftyone.utils.openimages:Downloading 'https://storage.googleapis.com/openimages/2018_04/bbox_labels_600_hierarchy.json' to '/tmp/tmpbw47epcc/metadata/hierarchy.json'


Downloading 'https://storage.googleapis.com/openimages/v6/oidv6-train-annotations-bbox.csv' to '/root/fiftyone/open-images-v7/train/labels/detections.csv'


INFO:fiftyone.utils.openimages:Downloading 'https://storage.googleapis.com/openimages/v6/oidv6-train-annotations-bbox.csv' to '/root/fiftyone/open-images-v7/train/labels/detections.csv'


Downloading 1000 images


INFO:fiftyone.utils.openimages:Downloading 1000 images


 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [2.2m elapsed, 0s remaining, 6.7 files/s]       


INFO:eta.core.utils: 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [2.2m elapsed, 0s remaining, 6.7 files/s]       


Dataset info written to '/root/fiftyone/open-images-v7/info.json'


INFO:fiftyone.zoo.datasets:Dataset info written to '/root/fiftyone/open-images-v7/info.json'


Ignoring unsupported parameter 'download' for importer type <class 'fiftyone.utils.openimages.OpenImagesV7DatasetImporter'>




Loading 'open-images-v7' split 'train'


INFO:fiftyone.zoo.datasets:Loading 'open-images-v7' split 'train'


 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [3.6s elapsed, 0s remaining, 300.1 samples/s]      


INFO:eta.core.utils: 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [3.6s elapsed, 0s remaining, 300.1 samples/s]      


Dataset 'oi_species_bird' created


INFO:fiftyone.zoo.datasets:Dataset 'oi_species_bird' created


  ‚úì Downloaded 1000 samples for bird
Downloading 1000 samples for 'Squirrel' ‚Üí class 4 (squirrel)...
Downloading split 'train' to '/root/fiftyone/open-images-v7/train' if necessary


INFO:fiftyone.zoo.datasets:Downloading split 'train' to '/root/fiftyone/open-images-v7/train' if necessary


Downloading 1000 images


INFO:fiftyone.utils.openimages:Downloading 1000 images


 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [2.4m elapsed, 0s remaining, 7.9 files/s]       


INFO:eta.core.utils: 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [2.4m elapsed, 0s remaining, 7.9 files/s]       


Dataset info written to '/root/fiftyone/open-images-v7/info.json'


INFO:fiftyone.zoo.datasets:Dataset info written to '/root/fiftyone/open-images-v7/info.json'


Ignoring unsupported parameter 'download' for importer type <class 'fiftyone.utils.openimages.OpenImagesV7DatasetImporter'>




Loading 'open-images-v7' split 'train'


INFO:fiftyone.zoo.datasets:Loading 'open-images-v7' split 'train'


 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [2.4s elapsed, 0s remaining, 410.4 samples/s]      


INFO:eta.core.utils: 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [2.4s elapsed, 0s remaining, 410.4 samples/s]      


Dataset 'oi_species_squirrel' created


INFO:fiftyone.zoo.datasets:Dataset 'oi_species_squirrel' created


  ‚úì Downloaded 1000 samples for squirrel
Downloading 1000 samples for 'Dog' ‚Üí class 5 (dog)...
Downloading split 'train' to '/root/fiftyone/open-images-v7/train' if necessary


INFO:fiftyone.zoo.datasets:Downloading split 'train' to '/root/fiftyone/open-images-v7/train' if necessary


Found 1 images, downloading the remaining 999


INFO:fiftyone.utils.openimages:Found 1 images, downloading the remaining 999


 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 999/999 [2.2m elapsed, 0s remaining, 8.1 files/s]       


INFO:eta.core.utils: 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 999/999 [2.2m elapsed, 0s remaining, 8.1 files/s]       


Dataset info written to '/root/fiftyone/open-images-v7/info.json'


INFO:fiftyone.zoo.datasets:Dataset info written to '/root/fiftyone/open-images-v7/info.json'


Ignoring unsupported parameter 'download' for importer type <class 'fiftyone.utils.openimages.OpenImagesV7DatasetImporter'>




Loading 'open-images-v7' split 'train'


INFO:fiftyone.zoo.datasets:Loading 'open-images-v7' split 'train'


 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [3.6s elapsed, 0s remaining, 241.4 samples/s]      


INFO:eta.core.utils: 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [3.6s elapsed, 0s remaining, 241.4 samples/s]      


Dataset 'oi_species_dog' created


INFO:fiftyone.zoo.datasets:Dataset 'oi_species_dog' created


  ‚úì Downloaded 1000 samples for dog
Downloading 1000 samples for 'Cat' ‚Üí class 6 (cat)...
Downloading split 'train' to '/root/fiftyone/open-images-v7/train' if necessary


INFO:fiftyone.zoo.datasets:Downloading split 'train' to '/root/fiftyone/open-images-v7/train' if necessary


Found 9 images, downloading the remaining 991


INFO:fiftyone.utils.openimages:Found 9 images, downloading the remaining 991


 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 991/991 [2.2m elapsed, 0s remaining, 8.6 files/s]       


INFO:eta.core.utils: 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 991/991 [2.2m elapsed, 0s remaining, 8.6 files/s]       


Dataset info written to '/root/fiftyone/open-images-v7/info.json'


INFO:fiftyone.zoo.datasets:Dataset info written to '/root/fiftyone/open-images-v7/info.json'


Ignoring unsupported parameter 'download' for importer type <class 'fiftyone.utils.openimages.OpenImagesV7DatasetImporter'>




Loading 'open-images-v7' split 'train'


INFO:fiftyone.zoo.datasets:Loading 'open-images-v7' split 'train'


 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [2.8s elapsed, 0s remaining, 342.7 samples/s]      


INFO:eta.core.utils: 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [2.8s elapsed, 0s remaining, 342.7 samples/s]      


Dataset 'oi_species_cat' created


INFO:fiftyone.zoo.datasets:Dataset 'oi_species_cat' created


  ‚úì Downloaded 1000 samples for cat


### Convert to YOLO format with correct class IDs

In [5]:
# Convert Open Images to YOLO format with CORRECT class IDs (3-6)
import glob, shutil, os
import fiftyone as fo
from fiftyone.utils.yolo import YOLOv5DatasetExporter

# Load the data.yaml we created earlier
with open('datasets/hybrid/data.yaml', 'r') as f:
    import yaml
    data_config = yaml.safe_load(f)

# Mapping from Open Images labels ‚Üí our YOLO class IDs
# CRITICAL: Species get IDs 3-6, NOT 0-3
oi_to_yolo_id = {
    "Bird": 3,      # bird
    "Squirrel": 4,  # squirrel
    "Dog": 5,       # dog
    "Cat": 6        # cat
}

# Process each downloaded dataset
for ds_name in fo.list_datasets():
    if not ds_name.startswith('oi_species_'):
        continue

    print(f"Processing {ds_name}...")
    dataset = fo.load_dataset(ds_name)

    # Remap labels to our class names
    for sample in dataset.iter_samples(autosave=True):
        if sample.ground_truth is None:
            continue

        filtered_detections = []
        for det in sample.ground_truth.detections:
            if det.label in oi_to_yolo_id:
                # Map to our YOLO class name
                mapped_label = data_config['names'][oi_to_yolo_id[det.label]]
                det.label = mapped_label
                filtered_detections.append(det)
        sample.ground_truth.detections = filtered_detections

    # Set default classes to our 7-class list
    dataset.default_classes = data_config['names']

    # Export to YOLO format
    export_dir = f"export_{ds_name}"
    if os.path.exists(export_dir):
        shutil.rmtree(export_dir)

    exporter = YOLOv5DatasetExporter(
        export_dir=export_dir,
        classes=data_config['names'],  # Use our 7-class list
        export_media=True
    )
    dataset.export(dataset_exporter=exporter, label_field="ground_truth")

    # Move files to main dataset folder
    for split in ['train', 'val', 'test']:
        imgs_src = os.path.join(export_dir, 'images', split)
        lbls_src = os.path.join(export_dir, 'labels', split)

        if os.path.exists(imgs_src):
            for f in glob.glob(imgs_src + '/*'):
                shutil.copy(f, 'datasets/hybrid/images/')
        if os.path.exists(lbls_src):
            for f in glob.glob(lbls_src + '/*'):
                shutil.copy(f, 'datasets/hybrid/labels/')

    shutil.rmtree(export_dir)
    print(f"  ‚úì Exported {ds_name}")

print("\n‚úì Export complete. Verifying class IDs in labels...")

# Verify class IDs are correct (should be 3-6 only for species data)
from collections import Counter
class_counts = Counter()
for label_file in glob.glob('datasets/hybrid/labels/*.txt'):
    with open(label_file, 'r') as f:
        for line in f:
            parts = line.strip().split()
            if parts:
                class_id = int(parts[0])
                class_counts[class_id] += 1

print("\nClass distribution in labels:")
for cls_id in sorted(class_counts.keys()):
    cls_name = data_config['names'][cls_id] if cls_id < len(data_config['names']) else 'UNKNOWN'
    print(f"  {cls_id}: {cls_name} = {class_counts[cls_id]} boxes")


Processing oi_species_bird...
 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [8.2s elapsed, 0s remaining, 162.3 samples/s]      


INFO:eta.core.utils: 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [8.2s elapsed, 0s remaining, 162.3 samples/s]      


  ‚úì Exported oi_species_bird
Processing oi_species_cat...
 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [2.8s elapsed, 0s remaining, 320.6 samples/s]      


INFO:eta.core.utils: 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [2.8s elapsed, 0s remaining, 320.6 samples/s]      


  ‚úì Exported oi_species_cat
Processing oi_species_dog...
 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [3.9s elapsed, 0s remaining, 107.4 samples/s]      


INFO:eta.core.utils: 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [3.9s elapsed, 0s remaining, 107.4 samples/s]      


  ‚úì Exported oi_species_dog
Processing oi_species_squirrel...
 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [7.1s elapsed, 0s remaining, 259.6 samples/s]      


INFO:eta.core.utils: 100% |‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà‚ñà| 1000/1000 [7.1s elapsed, 0s remaining, 259.6 samples/s]      


  ‚úì Exported oi_species_squirrel

‚úì Export complete. Verifying class IDs in labels...

Class distribution in labels:
  3: bird = 2585 boxes
  4: squirrel = 1069 boxes
  5: dog = 1486 boxes
  6: cat = 1187 boxes


## 4) Optional: Pseudo-label Backyard Images with MDV6

If you have unlabeled backyard images, use MDV6 to generate pseudo-labels.

**Class ID mapping from MDV6**:
- MDV6 class 0 (animal) ‚Üí Our class 0 (animal) - generic detection
- MDV6 class 1 (person) ‚Üí Our class 1 (person)
- MDV6 class 2 (vehicle) ‚Üí Our class 2 (vehicle)

This adds training data for the generic classes to reinforce MDV6's existing capabilities.


In [6]:
# Pseudo-labeling with MDV6 for generic classes (animal, person, vehicle)
from ultralytics import YOLO
import shutil, cv2, os
from pathlib import Path
from tqdm import tqdm

mdv6_checkpoint = "MDV6-yolov10-c.pt"
images_to_label_dir = Path("datasets/hybrid/images_to_label")
output_images_dir = Path("datasets/hybrid/images")
output_labels_dir = Path("datasets/hybrid/labels")

# Create directories
images_to_label_dir.mkdir(parents=True, exist_ok=True)
output_images_dir.mkdir(parents=True, exist_ok=True)
output_labels_dir.mkdir(parents=True, exist_ok=True)

images_to_label = list(images_to_label_dir.glob("*.*"))
print(f"Found {len(images_to_label)} images to pseudo-label")
print("(Place unlabeled images in datasets/hybrid/images_to_label/)")

if len(images_to_label) > 0:
    # Download checkpoint if needed
    if not os.path.exists(mdv6_checkpoint):
        print(f"Downloading {mdv6_checkpoint}...")
        !wget -q https://zenodo.org/records/15398270/files/MDV6-yolov10-c.pt?download=1 -O {mdv6_checkpoint}

    model = YOLO(mdv6_checkpoint)

    # MDV6 classes: 0=animal, 1=person, 2=vehicle
    # These map DIRECTLY to our classes 0, 1, 2
    for img_path in tqdm(images_to_label, desc="Pseudo-labeling"):
        result = model.predict(str(img_path), conf=0.3, imgsz=1280, verbose=False)[0]

        # Get image dimensions
        img = cv2.imread(str(img_path))
        if img is None:
            continue
        h, w = img.shape[:2]

        # Copy image
        shutil.copy(img_path, output_images_dir / img_path.name)

        # Generate label file
        label_path = output_labels_dir / (img_path.stem + ".txt")
        with open(label_path, 'w') as f:
            if result.boxes is not None and len(result.boxes) > 0:
                for box in result.boxes:
                    # Get box coordinates (xyxy format)
                    x1, y1, x2, y2 = box.xyxy[0].cpu().numpy()
                    cls_id = int(box.cls[0].cpu().numpy())

                    # Only keep MDV6 classes 0, 1, 2 (animal, person, vehicle)
                    if cls_id > 2:
                        continue

                    # Convert to YOLO format (normalized cx, cy, w, h)
                    cx = ((x1 + x2) / 2) / w
                    cy = ((y1 + y2) / 2) / h
                    bw = (x2 - x1) / w
                    bh = (y2 - y1) / h

                    f.write(f"{cls_id} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}\n")

    print("‚úì Pseudo-labeling complete")
else:
    print("No images to pseudo-label. Skipping.")


Creating new Ultralytics Settings v0.0.6 file ‚úÖ 
View Ultralytics Settings with 'yolo settings' or at '/root/.config/Ultralytics/settings.json'
Update Settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'. For help see https://docs.ultralytics.com/quickstart/#ultralytics-settings.
Found 0 images to pseudo-label
(Place unlabeled images in datasets/hybrid/images_to_label/)
No images to pseudo-label. Skipping.


## 5) Create train/val/test splits

In [7]:
# Create train/val/test splits
import random, shutil, glob
from pathlib import Path

random.seed(42)

images_dir = Path("datasets/hybrid/images")
labels_dir = Path("datasets/hybrid/labels")

# Find all image files with matching labels
images = [p for p in images_dir.glob("*.*")
          if p.suffix.lower() in ['.jpg', '.jpeg', '.png'] and p.is_file()]

pairs = []
for img in images:
    label = labels_dir / (img.stem + ".txt")
    if label.exists():
        pairs.append((img, label))

print(f"Found {len(pairs)} image/label pairs")

# Shuffle and split (80/10/10)
random.shuffle(pairs)
n = len(pairs)
train_n = int(n * 0.8)
val_n = int(n * 0.1)

train_pairs = pairs[:train_n]
val_pairs = pairs[train_n:train_n + val_n]
test_pairs = pairs[train_n + val_n:]

print(f"Split: train={len(train_pairs)}, val={len(val_pairs)}, test={len(test_pairs)}")

def move_pairs(pairs, split_name):
    img_dir = Path(f"datasets/hybrid/images/{split_name}")
    lbl_dir = Path(f"datasets/hybrid/labels/{split_name}")
    img_dir.mkdir(parents=True, exist_ok=True)
    lbl_dir.mkdir(parents=True, exist_ok=True)

    for img, lbl in pairs:
        shutil.move(str(img), img_dir / img.name)
        shutil.move(str(lbl), lbl_dir / lbl.name)

move_pairs(train_pairs, "train")
move_pairs(val_pairs, "val")
move_pairs(test_pairs, "test")

print("‚úì Created train/val/test splits")


Found 3990 image/label pairs
Split: train=3192, val=399, test=399
‚úì Created train/val/test splits


## 6) Training: Fine-tune MDV6 for Hybrid Detection

**Key training parameters for preserving generic detection:**

- `freeze=10`: Freeze backbone layers to preserve MDV6's learned features
- `lr0=0.001`: Low learning rate to prevent overwriting base knowledge
- `lrf=0.01`: Learning rate decay

The frozen backbone retains the ability to detect generic animals, people, and vehicles,
while the head learns to classify the 4 specific species.


In [8]:
# Fine-tune MDV6 for hybrid detection
from ultralytics import YOLO
import yaml
import os

# Paths
mdv6_checkpoint = "MDV6-yolov10-c.pt"
data_yaml_path = "datasets/hybrid/data.yaml"

# Download checkpoint if needed
if not os.path.exists(mdv6_checkpoint):
    print(f"Downloading {mdv6_checkpoint}...")
    !wget -q https://zenodo.org/records/15398270/files/MDV6-yolov10-c.pt?download=1 -O {mdv6_checkpoint}

# Update data.yaml with absolute paths
base_dir = os.path.abspath('datasets/hybrid')
with open(data_yaml_path, 'r') as f:
    data_config = yaml.safe_load(f)

data_config['train'] = os.path.join(base_dir, 'images', 'train')
data_config['val'] = os.path.join(base_dir, 'images', 'val')
data_config['test'] = os.path.join(base_dir, 'images', 'test')

with open(data_yaml_path, 'w') as f:
    yaml.dump(data_config, f)

print("Data config:")
print(yaml.dump(data_config))

# Load MDV6 and fine-tune
model = YOLO(mdv6_checkpoint)

# Train with frozen backbone to preserve generic detection
model.train(
    data=data_yaml_path,
    epochs=100,
    imgsz=640,
    batch=8,
    device=0,
    name="yolov10_hybrid_7class",
    freeze=10,      # CRITICAL: Freeze backbone to preserve MDV6 features
    lr0=0.001,      # Low learning rate for fine-tuning
    lrf=0.01,       # Final LR = lr0 * lrf
    patience=20,    # Early stopping patience
    save=True,
    plots=True
)


Downloading MDV6-yolov10-c.pt...
Data config:
names:
- animal
- person
- vehicle
- bird
- squirrel
- dog
- cat
nc: 7
test: /content/datasets/hybrid/images/test
train: /content/datasets/hybrid/images/train
val: /content/datasets/hybrid/images/val

Ultralytics 8.3.235 üöÄ Python-3.12.12 torch-2.9.0+cu126 CUDA:0 (Tesla T4, 15095MiB)
[34m[1mengine/trainer: [0magnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=8, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, compile=False, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=datasets/hybrid/data.yaml, degrees=0.0, deterministic=True, device=0, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=100, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=10, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.001, lrf=0.01, ma

ultralytics.utils.metrics.DetMetrics object with attributes:

ap_class_index: array([3, 4, 5, 6])
box: ultralytics.utils.metrics.Metric object
confusion_matrix: <ultralytics.utils.metrics.ConfusionMatrix object at 0x7e314c4da720>
curves: ['Precision-Recall(B)', 'F1-Confidence(B)', 'Precision-Confidence(B)', 'Recall-Confidence(B)']
curves_results: [[array([          0,    0.001001,    0.002002,    0.003003,    0.004004,    0.005005,    0.006006,    0.007007,    0.008008,    0.009009,     0.01001,    0.011011,    0.012012,    0.013013,    0.014014,    0.015015,    0.016016,    0.017017,    0.018018,    0.019019,     0.02002,    0.021021,    0.022022,    0.023023,
          0.024024,    0.025025,    0.026026,    0.027027,    0.028028,    0.029029,     0.03003,    0.031031,    0.032032,    0.033033,    0.034034,    0.035035,    0.036036,    0.037037,    0.038038,    0.039039,     0.04004,    0.041041,    0.042042,    0.043043,    0.044044,    0.045045,    0.046046,    0.047047,
          0

## 7) Evaluate the trained model

In [9]:
# Evaluate the trained hybrid model
from ultralytics import YOLO
import os

# Path to trained model
trained_model_path = "runs/detect/yolov10_hybrid_7class/weights/best.pt"

if not os.path.exists(trained_model_path):
    print(f"Model not found at {trained_model_path}")
    print("Available runs:")
    for d in os.listdir("runs/detect"):
        print(f"  - runs/detect/{d}")
else:
    model = YOLO(trained_model_path)

    # Evaluate on test set
    print("\nEvaluating on test set...")
    metrics = model.val(data="datasets/hybrid/data.yaml", imgsz=640, batch=8)

    print("\n" + "="*60)
    print("HYBRID DETECTION MODEL EVALUATION")
    print("="*60)
    print(f"mAP50: {metrics.box.map50:.4f}")
    print(f"mAP50-95: {metrics.box.map:.4f}")
    print("\nPer-class AP50:")
    class_names = ["animal", "person", "vehicle", "bird", "squirrel", "dog", "cat"]
    for i, ap in enumerate(metrics.box.ap50):
        source = "(MDV6 base)" if i < 3 else "(Open Images)"
        print(f"  {i}: {class_names[i]:10s} = {ap:.4f} {source}")



Evaluating on test set...
Ultralytics 8.3.235 üöÄ Python-3.12.12 torch-2.9.0+cu126 CUDA:0 (Tesla T4, 15095MiB)
YOLOv10n summary (fused): 102 layers, 2,266,533 parameters, 0 gradients, 6.5 GFLOPs
[34m[1mval: [0mFast image access ‚úÖ (ping: 0.0¬±0.0 ms, read: 3229.8¬±1016.1 MB/s, size: 335.1 KB)
[K[34m[1mval: [0mScanning /content/datasets/hybrid/labels/val.cache... 399 images, 0 backgrounds, 0 corrupt: 100% ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ 399/399 730.8Kit/s 0.0s
[K                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ‚îÅ 50/50 8.6it/s 5.8s
                   all        399        571      0.738      0.731      0.779      0.563
                  bird        105        217      0.691      0.544      0.606      0.349
              squirrel         90         92      0.795      0.902      0.922      0.705
                   dog        121        169      0.754      0.757      0.798      0.587
           

## 8) Run inference on test images

In [10]:
# Run inference on test images
from ultralytics import YOLO
import glob
import os
import matplotlib.pyplot as plt
from PIL import Image

trained_model_path = "runs/detect/yolov10_hybrid_7class/weights/best.pt"
model = YOLO(trained_model_path)

# Get test images
test_images = glob.glob("datasets/hybrid/images/test/*.*")[:8]
print(f"Running inference on {len(test_images)} test images...")

# Create output directory
os.makedirs("datasets/hybrid/images/inference", exist_ok=True)

# Run inference
for img_path in test_images:
    results = model.predict(img_path, imgsz=640, conf=0.25, verbose=False)

    # Save annotated image
    out_path = f"datasets/hybrid/images/inference/{os.path.basename(img_path)}"
    results[0].save(out_path)

    # Print detections
    print(f"\n{os.path.basename(img_path)}:")
    if results[0].boxes is not None:
        for box in results[0].boxes:
            cls_id = int(box.cls[0])
            conf = float(box.conf[0])
            cls_name = model.names[cls_id]
            print(f"  - {cls_name}: {conf:.2f}")

print("\n‚úì Inference complete. Results saved to datasets/hybrid/images/inference/")


Running inference on 8 test images...

05604708a9206b6a.jpg:
  - cat: 0.81
  - dog: 0.75

02cf6b111ab7eda4.jpg:
  - dog: 0.96
  - dog: 0.36

01778c085222c3a8.jpg:
  - bird: 0.88

03371fc8fa17150d.jpg:

09971e3906d3a830.jpg:
  - cat: 0.91

08a161bb6fea0f3d.jpg:
  - cat: 0.50
  - bird: 0.46
  - cat: 0.44

0018a7c115fa7a5b.jpg:
  - dog: 0.93
  - dog: 0.49

06a3a3fcb462d601.jpg:
  - bird: 0.91
  - bird: 0.71
  - bird: 0.50

‚úì Inference complete. Results saved to datasets/hybrid/images/inference/


## 9) Export to ONNX

In [11]:
# Export to ONNX format
from ultralytics import YOLO

trained_model_path = "runs/detect/yolov10_hybrid_7class/weights/best.pt"
model = YOLO(trained_model_path)

# Export to ONNX
model.export(format='onnx', imgsz=640, simplify=True)

print("\n‚úì Exported to ONNX format")
print("Model file: runs/detect/yolov10_hybrid_7class/weights/best.onnx")


Ultralytics 8.3.235 üöÄ Python-3.12.12 torch-2.9.0+cu126 CPU (Intel Xeon CPU @ 2.00GHz)
üí° ProTip: Export to OpenVINO format for best performance on Intel hardware. Learn more at https://docs.ultralytics.com/integrations/openvino/
YOLOv10n summary (fused): 102 layers, 2,266,533 parameters, 0 gradients, 6.5 GFLOPs

[34m[1mPyTorch:[0m starting from 'runs/detect/yolov10_hybrid_7class/weights/best.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 300, 6) (5.5 MB)
[31m[1mrequirements:[0m Ultralytics requirements ['onnx>=1.12.0,<=1.19.1', 'onnxslim>=0.1.71', 'onnxruntime-gpu'] not found, attempting AutoUpdate...
Using Python 3.12.12 environment at: /usr
Resolved 14 packages in 211ms
Prepared 6 packages in 7.48s
Installed 6 packages in 264ms
 + colorama==0.4.6
 + coloredlogs==15.0.1
 + humanfriendly==10.0
 + onnx==1.19.1
 + onnxruntime-gpu==1.23.2
 + onnxslim==0.1.78

[31m[1mrequirements:[0m AutoUpdate success ‚úÖ 8.7s


[34m[1mONNX:[0m starting export with onn

## Summary: Hybrid Detection Model

The trained model can now:

1. **Detect generic classes** (preserved from MDV6 base):
   - `animal` (class 0): Any animal not specifically trained
   - `person` (class 1): Human detection
   - `vehicle` (class 2): Cars, trucks, etc.

2. **Detect specific species** (from Open Images fine-tuning):
   - `bird` (class 3)
   - `squirrel` (class 4)
   - `dog` (class 5)
   - `cat` (class 6)

**During inference:**
- A deer ‚Üí detected as `animal` (generic)
- A squirrel ‚Üí detected as `squirrel` (specific)
- A human ‚Üí detected as `person`
- A car ‚Üí detected as `vehicle`

---

**Files:**
- Trained model: `runs/detect/yolov10_hybrid_7class/weights/best.pt`
- ONNX export: `runs/detect/yolov10_hybrid_7class/weights/best.onnx`
- Labels JSON: Create with `{"0": "animal", "1": "person", "2": "vehicle", "3": "bird", "4": "squirrel", "5": "dog", "6": "cat"}`
