# Hybrid Detection Model: MegaDetector + Species Classification

This notebook fine-tunes the **MDV6-yolov10-c** checkpoint to create a hybrid detection model with **7 classes**:

| Class ID | Label | Description |
|----------|-------|-------------|
| 0 | animal | Generic animal (from MDV6 base) |
| 1 | person | Human detection (from MDV6 base) |
| 2 | vehicle | Vehicle detection (from MDV6 base) |
| 3 | bird | Species-specific (Open Images) |
| 4 | squirrel | Species-specific (Open Images) |
| 5 | dog | Species-specific (Open Images) |
| 6 | cat | Species-specific (Open Images) |

**Strategy**: The pretrained MDV6 checkpoint already detects `animal`, `person`, and `vehicle`. We fine-tune with frozen backbone layers to:
- **Preserve** generic detection for unseen animals, people, and vehicles
- **Add** species-specific detection for bird, squirrel, dog, cat

**What this notebook does**:
- Download Open Images data for the 4 species (bird, squirrel, dog, cat)
- Convert to YOLO format with correct class IDs (3, 4, 5, 6)
- Fine-tune MDV6 with frozen backbone to prevent catastrophic forgetting
- Evaluate hybrid detection on test images

---


## 1) Setup: Install dependencies

In [None]:
# Install required packages
# If running in Colab, keep the `-U` (upgrade) flag; in a local environment, you may want to pin versions.
!pip install -U ultralytics fiftyone fiftyone-brain fiftyone-db tqdm

# Optional: roboflow may help with visualisation and dataset downloads
# !pip install roboflow

In [None]:
# Uninstall current installations to avoid conflicts
!pip uninstall -y ultralytics torch torchvision torchaudio

# Install PyTorch with CUDA 12.6 (ensure this matches your Colab GPU's CUDA version)
# The previous output showed torch-2.9.0+cu126, so targeting CUDA 12.6
!pip install torch==2.9.0+cu126 torchvision==0.24.0+cu126 torchaudio==2.9.0 --index-url https://download.pytorch.org/whl/cu126

# Reinstall ultralytics
!pip install -U ultralytics

# Verify GPU availability after reinstallation
import torch
print(f"torch.cuda.is_available(): {torch.cuda.is_available()}")
print(f"torch.cuda.device_count(): {torch.cuda.device_count()}")

## 2) Define 7-class structure and create `data.yaml`

In [None]:
# Define the 7-class structure for hybrid detection
# Classes 0-2 are preserved from MDV6 base model
# Classes 3-6 are species-specific from Open Images fine-tuning
import yaml, os

# IMPORTANT: Class order must match MDV6 base (0=animal, 1=person, 2=vehicle)
classes = ["animal", "person", "vehicle", "bird", "squirrel", "dog", "cat"]

data_yaml = {
    'train': 'datasets/hybrid/images/train',
    'val': 'datasets/hybrid/images/val',
    'test': 'datasets/hybrid/images/test',
    'nc': len(classes),
    'names': classes
}

os.makedirs('datasets/hybrid', exist_ok=True)
with open('datasets/hybrid/data.yaml','w') as f:
    yaml.dump(data_yaml, f)

print('Created datasets/hybrid/data.yaml with 7 classes:')
for i, cls in enumerate(classes):
    source = 'MDV6 base' if i < 3 else 'Open Images'
    print(f'  {i}: {cls} ({source})')
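
Because the class order is what ties IDs 0-2 to the MDV6 base model, a quick consistency check on the config can catch mistakes early. The sketch below is a minimal illustration; `check_data_config` and `EXPECTED_BASE` are hypothetical names, and it works on a plain dict like the `data_yaml` built above.

```python
# Hypothetical helper: checks that a YOLO data config keeps the MDV6 class order.
EXPECTED_BASE = ["animal", "person", "vehicle"]  # IDs 0-2 in the MDV6 base model

def check_data_config(config):
    """Return a list of problems found in a data config dict (empty if none)."""
    problems = []
    names = list(config.get("names", []))
    if config.get("nc") != len(names):
        problems.append(f"nc={config.get('nc')} but {len(names)} names listed")
    if names[:3] != EXPECTED_BASE:
        problems.append(f"first three classes are {names[:3]}, expected {EXPECTED_BASE}")
    if len(set(names)) != len(names):
        problems.append("duplicate class names")
    return problems

good = {"nc": 7, "names": ["animal", "person", "vehicle", "bird", "squirrel", "dog", "cat"]}
bad = {"nc": 4, "names": ["bird", "squirrel", "dog", "cat"]}

print(check_data_config(good))       # no problems expected
print(len(check_data_config(bad)))   # class order problem expected
```

In the notebook, the same function could be called on `data_yaml` right before it is written to disk.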


## 3) Download Open Images for the 4 Species

We **only download data for the 4 species** (bird, squirrel, dog, cat).

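**Why not download animal/person/vehicle?**
- The MDV6 checkpoint already has strong features for these generic classes
- We use `freeze=10` during training, which freezes the first 10 layers of the network (the bulk of the backbone) so those features are not overwritten
- This reduces the risk of catastrophic forgetting of the base detection capabilities

The fine-tuning is intended to teach the model to:
1. Recognize the 4 specific species (bird, squirrel, dog, cat)
2. Keep detecting generic animals, people, and vehicles using the frozen backbone features

Note that the detection head is not frozen, and the Open Images labels contain no boxes for classes 0-2. The pseudo-labeled images from section 4 are what give the head training examples for the generic classes.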
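
To make the effect of `freeze=10` concrete, the sketch below shows which parameters a layer-count freeze would lock. It assumes parameters are named `model.<layer_index>.<...>` and that freezing N layers matches the prefixes `model.0.` through `model.N-1.`; the helper `frozen_param_names` and the parameter names are hypothetical, chosen only for illustration.

```python
def frozen_param_names(param_names, freeze):
    """Return the names that a freeze count of `freeze` would lock.

    ASSUMPTION: parameters are named "model.<layer_index>.<...>" and the
    first `freeze` layer indices (0 .. freeze-1) are the ones frozen.
    """
    prefixes = tuple(f"model.{i}." for i in range(freeze))
    return [name for name in param_names if name.startswith(prefixes)]

# Hypothetical parameter names, not read from a real checkpoint.
names = [
    "model.0.conv.weight",
    "model.9.cv1.conv.weight",
    "model.10.attn.qkv.conv.weight",
    "model.23.cv3.0.2.weight",
]

frozen = frozen_param_names(names, freeze=10)
trainable = [n for n in names if n not in frozen]
print("frozen:   ", frozen)
print("trainable:", trainable)
```

Layers with index 10 and above, including the detection head, stay trainable under this assumption, which is why the head still needs labeled examples for every class it should predict.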


In [None]:
# Download Open Images for the 4 species only
# We DO NOT need to retrain on animal/person/vehicle - MDV6 base handles those
from fiftyone import zoo as foz
import fiftyone as fo
import os

# Only the 4 species we want to add to MDV6's detection
species_classes = ["bird", "squirrel", "dog", "cat"]

# Map to Open Images label strings (case-sensitive)
oi_class_map = {
    "bird": "Bird",
    "squirrel": "Squirrel",
    "dog": "Dog",
    "cat": "Cat"
}

# YOLO class IDs for the species (classes 3-6 in our 7-class system)
species_class_ids = {
    "bird": 3,
    "squirrel": 4,
    "dog": 5,
    "cat": 6
}

max_samples_per_class = 1000

os.makedirs('datasets/hybrid/images', exist_ok=True)
os.makedirs('datasets/hybrid/labels', exist_ok=True)

for cls in species_classes:
    oi_label = oi_class_map[cls]
    yolo_id = species_class_ids[cls]
    print(f"Downloading {max_samples_per_class} samples for '{oi_label}' → class {yolo_id} ({cls})...")
    try:
        dataset = foz.load_zoo_dataset(
            "open-images-v7",
            split="train",
            label_types=["detections"],
            classes=[oi_label],
            max_samples=max_samples_per_class,
            dataset_name=f"oi_species_{cls}",
            seed=51,
            download=True
        )
        print(f"  ✓ Downloaded {len(dataset)} samples for {cls}")
    except Exception as e:
        print(f"  ✗ Error downloading {cls}: {e}")


### Convert to YOLO format with correct class IDs

In [None]:
# Convert Open Images to YOLO format with CORRECT class IDs (3-6)
import glob, shutil, os
import yaml
import fiftyone as fo
from fiftyone.utils.yolo import YOLOv5DatasetExporter

# Load the data.yaml we created earlier
with open('datasets/hybrid/data.yaml', 'r') as f:
    data_config = yaml.safe_load(f)

# Mapping from Open Images labels → our YOLO class IDs
# CRITICAL: Species get IDs 3-6, NOT 0-3
oi_to_yolo_id = {
    "Bird": 3,      # bird
    "Squirrel": 4,  # squirrel
    "Dog": 5,       # dog
    "Cat": 6        # cat
}

# Process each downloaded dataset
for ds_name in fo.list_datasets():
    if not ds_name.startswith('oi_species_'):
        continue
    
    print(f"Processing {ds_name}...")
    dataset = fo.load_dataset(ds_name)
    
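    # Remap labels to our class names, dropping any other Open Images classes
    species_names = {data_config['names'][i] for i in oi_to_yolo_id.values()}
    for sample in dataset.iter_samples(autosave=True):
        if sample.ground_truth is None:
            continue
        
        filtered_detections = []
        for det in sample.ground_truth.detections:
            if det.label in oi_to_yolo_id:
                # Map to our YOLO class name
                det.label = data_config['names'][oi_to_yolo_id[det.label]]
                filtered_detections.append(det)
            elif det.label in species_names:
                # Already remapped by an earlier run of this cell; keep as-is
                filtered_detections.append(det)
        sample.ground_truth.detections = filtered_detections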
    
    # Set default classes to our 7-class list
    dataset.default_classes = data_config['names']
    
    # Export to YOLO format
    export_dir = f"export_{ds_name}"
    if os.path.exists(export_dir):
        shutil.rmtree(export_dir)
    
    exporter = YOLOv5DatasetExporter(
        export_dir=export_dir,
        classes=data_config['names'],  # Use our 7-class list
        export_media=True
    )
    dataset.export(dataset_exporter=exporter, label_field="ground_truth")
    
    # Move files to main dataset folder
    for split in ['train', 'val', 'test']:
        imgs_src = os.path.join(export_dir, 'images', split)
        lbls_src = os.path.join(export_dir, 'labels', split)
        
        if os.path.exists(imgs_src):
            for f in glob.glob(imgs_src + '/*'):
                shutil.copy(f, 'datasets/hybrid/images/')
        if os.path.exists(lbls_src):
            for f in glob.glob(lbls_src + '/*'):
                shutil.copy(f, 'datasets/hybrid/labels/')
    
    shutil.rmtree(export_dir)
    print(f"  ✓ Exported {ds_name}")

print("\n✓ Export complete. Verifying class IDs in labels...")

# Verify class IDs are correct (should be 3-6 only for species data)
from collections import Counter
class_counts = Counter()
for label_file in glob.glob('datasets/hybrid/labels/*.txt'):
    with open(label_file, 'r') as f:
        for line in f:
            parts = line.strip().split()
            if parts:
                class_id = int(parts[0])
                class_counts[class_id] += 1

print("\nClass distribution in labels:")
for cls_id in sorted(class_counts.keys()):
    cls_name = data_config['names'][cls_id] if cls_id < len(data_config['names']) else 'UNKNOWN'
    print(f"  {cls_id}: {cls_name} = {class_counts[cls_id]} boxes")
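
For reference, each line in the exported label files has the form `class_id cx cy w h`, with all coordinates normalized to the image size. FiftyOne stores boxes as `[top-left-x, top-left-y, width, height]` in relative coordinates, so the conversion only shifts the corner to the center. The function below is a hand-written sketch of that arithmetic, not the exporter's own code; `fiftyone_box_to_yolo_line` is a hypothetical name.

```python
def fiftyone_box_to_yolo_line(class_id, bounding_box):
    """Convert a relative [x, y, w, h] box (top-left origin) to a YOLO label line."""
    x, y, w, h = bounding_box
    cx = x + w / 2  # box center, still in relative coordinates
    cy = y + h / 2
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"

# A squirrel (class 4) whose box starts at (0.25, 0.5) and spans 0.5 x 0.25.
line = fiftyone_box_to_yolo_line(4, [0.25, 0.5, 0.5, 0.25])
print(line)  # → 4 0.500000 0.625000 0.500000 0.250000
```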


## 4) Optional: Pseudo-label Backyard Images with MDV6

If you have unlabeled backyard images, use MDV6 to generate pseudo-labels.

**Class ID mapping from MDV6**:
- MDV6 class 0 (animal) → Our class 0 (animal) - generic detection
- MDV6 class 1 (person) → Our class 1 (person)
- MDV6 class 2 (vehicle) → Our class 2 (vehicle)

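This adds training data for the generic classes to reinforce MDV6's existing capabilities. Note that MDV6 labels every animal as class 0, so a bird, squirrel, dog, or cat in these images is pseudo-labeled `animal` rather than its species.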


In [None]:
# Pseudo-labeling with MDV6 for generic classes (animal, person, vehicle)
from ultralytics import YOLO
import shutil, cv2, os
from pathlib import Path
from tqdm import tqdm

mdv6_checkpoint = "MDV6-yolov10-c.pt"
images_to_label_dir = Path("datasets/hybrid/images_to_label")
output_images_dir = Path("datasets/hybrid/images")
output_labels_dir = Path("datasets/hybrid/labels")

# Create directories
images_to_label_dir.mkdir(parents=True, exist_ok=True)
output_images_dir.mkdir(parents=True, exist_ok=True)
output_labels_dir.mkdir(parents=True, exist_ok=True)

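# Only pick up image files (same extensions as the split step below)
image_suffixes = {".jpg", ".jpeg", ".png"}
images_to_label = [p for p in images_to_label_dir.glob("*.*") if p.suffix.lower() in image_suffixes]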
print(f"Found {len(images_to_label)} images to pseudo-label")
print("(Place unlabeled images in datasets/hybrid/images_to_label/)")

if len(images_to_label) > 0:
    # Download checkpoint if needed
    if not os.path.exists(mdv6_checkpoint):
        print(f"Downloading {mdv6_checkpoint}...")
        # Quote the URL so the shell does not interpret the `?` in the query string
        !wget -q "https://zenodo.org/records/15398270/files/MDV6-yolov10-c.pt?download=1" -O {mdv6_checkpoint}
    
    model = YOLO(mdv6_checkpoint)
    
    # MDV6 classes: 0=animal, 1=person, 2=vehicle
    # These map DIRECTLY to our classes 0, 1, 2
    for img_path in tqdm(images_to_label, desc="Pseudo-labeling"):
        # Read the image first so unreadable files are skipped before inference
        img = cv2.imread(str(img_path))
        if img is None:
            continue
        h, w = img.shape[:2]
        
        # Run MDV6 on the loaded image (Ultralytics accepts BGR numpy arrays)
        result = model.predict(img, conf=0.3, imgsz=1280, verbose=False)[0]
        
        # Copy image
        shutil.copy(img_path, output_images_dir / img_path.name)
        
        # Generate label file
        label_path = output_labels_dir / (img_path.stem + ".txt")
        with open(label_path, 'w') as f:
            if result.boxes is not None and len(result.boxes) > 0:
                for box in result.boxes:
                    # Get box coordinates (xyxy format)
                    x1, y1, x2, y2 = box.xyxy[0].cpu().numpy()
                    cls_id = int(box.cls[0].cpu().numpy())
                    
                    # Only keep MDV6 classes 0, 1, 2 (animal, person, vehicle)
                    if cls_id > 2:
                        continue
                    
                    # Convert to YOLO format (normalized cx, cy, w, h)
                    cx = ((x1 + x2) / 2) / w
                    cy = ((y1 + y2) / 2) / h
                    bw = (x2 - x1) / w
                    bh = (y2 - y1) / h
                    
                    f.write(f"{cls_id} {cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}\n")
    
    print("✓ Pseudo-labeling complete")
else:
    print("No images to pseudo-label. Skipping.")
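
The box conversion used in the pseudo-labeling loop can be checked in isolation. The sketch below repeats the same arithmetic as a standalone function and adds a defensive clamp to the image bounds; the clamp and the name `xyxy_to_yolo` are additions for illustration and are not part of the cell above.

```python
def xyxy_to_yolo(x1, y1, x2, y2, img_w, img_h):
    """Convert pixel corner coordinates to normalized (cx, cy, w, h)."""
    # Defensive clamp so a box that overshoots the frame stays within [0, 1].
    x1, x2 = max(0.0, min(x1, img_w)), max(0.0, min(x2, img_w))
    y1, y2 = max(0.0, min(y1, img_h)), max(0.0, min(y2, img_h))
    cx = ((x1 + x2) / 2) / img_w
    cy = ((y1 + y2) / 2) / img_h
    bw = (x2 - x1) / img_w
    bh = (y2 - y1) / img_h
    return cx, cy, bw, bh

# A 200x200 pixel box inside a 400x500 (width x height) image.
cx, cy, bw, bh = xyxy_to_yolo(100, 50, 300, 250, img_w=400, img_h=500)
print(f"{cx:.6f} {cy:.6f} {bw:.6f} {bh:.6f}")  # → 0.500000 0.300000 0.500000 0.400000
```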


## 5) Create train/val/test splits

In [None]:
# Create train/val/test splits
import random, shutil, glob
from pathlib import Path

random.seed(42)

images_dir = Path("datasets/hybrid/images")
labels_dir = Path("datasets/hybrid/labels")

# Find all image files with matching labels
images = [p for p in images_dir.glob("*.*") 
          if p.suffix.lower() in ['.jpg', '.jpeg', '.png'] and p.is_file()]

pairs = []
for img in images:
    label = labels_dir / (img.stem + ".txt")
    if label.exists():
        pairs.append((img, label))

print(f"Found {len(pairs)} image/label pairs")

# Shuffle and split (80/10/10)
random.shuffle(pairs)
n = len(pairs)
train_n = int(n * 0.8)
val_n = int(n * 0.1)

train_pairs = pairs[:train_n]
val_pairs = pairs[train_n:train_n + val_n]
test_pairs = pairs[train_n + val_n:]

print(f"Split: train={len(train_pairs)}, val={len(val_pairs)}, test={len(test_pairs)}")

def move_pairs(pairs, split_name):
    img_dir = Path(f"datasets/hybrid/images/{split_name}")
    lbl_dir = Path(f"datasets/hybrid/labels/{split_name}")
    img_dir.mkdir(parents=True, exist_ok=True)
    lbl_dir.mkdir(parents=True, exist_ok=True)
    
    for img, lbl in pairs:
        shutil.move(str(img), img_dir / img.name)
        shutil.move(str(lbl), lbl_dir / lbl.name)

move_pairs(train_pairs, "train")
move_pairs(val_pairs, "val")
move_pairs(test_pairs, "test")

print("✓ Created train/val/test splits")
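
The split sizes follow from integer truncation: train and val are rounded down, and test receives whatever remains. A small sketch of that arithmetic, using the hypothetical helper `split_counts`:

```python
def split_counts(n, train_frac=0.8, val_frac=0.1):
    """Return (train, val, test) sizes using the same truncation as the cell above."""
    train_n = int(n * train_frac)
    val_n = int(n * val_frac)
    test_n = n - train_n - val_n  # the remainder goes to test
    return train_n, val_n, test_n

print(split_counts(4000))  # → (3200, 400, 400)
print(split_counts(1005))  # → (804, 100, 101)
```

With totals that are not a multiple of 10, the test split ends up slightly larger than the val split.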


## 6) Training: Fine-tune MDV6 for Hybrid Detection

**Key training parameters for preserving generic detection:**

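- `freeze=10`: Freeze the first 10 layers (the bulk of the backbone) to preserve MDV6's learned features
- `lr0=0.001`: Low initial learning rate to limit overwriting of base knowledge. Ultralytics only honors `lr0` when `optimizer` is set explicitly; with the default `optimizer='auto'` it chooses its own value
- `lrf=0.01`: Final learning rate as a fraction of `lr0` (final LR = `lr0 * lrf` = 0.00001)

The frozen backbone retains the features used to detect generic animals, people, and vehicles,
while the unfrozen head learns to classify the 4 specific species.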
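
To see how `lr0` and `lrf` interact, the sketch below computes the learning rate per epoch. It assumes a linear decay from `lr0` to `lr0 * lrf` over the run and ignores warmup; `linear_lr` is a hypothetical helper, not an Ultralytics function.

```python
def linear_lr(epoch, epochs=100, lr0=0.001, lrf=0.01):
    """Learning rate at `epoch`, assuming linear decay from lr0 to lr0 * lrf."""
    factor = max(1 - epoch / epochs, 0) * (1.0 - lrf) + lrf
    return lr0 * factor

for epoch in (0, 50, 100):
    print(epoch, f"{linear_lr(epoch):.6f}")
```

Under these assumptions the rate starts at 0.001, passes 0.000505 at the midpoint, and ends at 0.00001.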


In [None]:
# Fine-tune MDV6 for hybrid detection
from ultralytics import YOLO
import yaml
import os

# Paths
mdv6_checkpoint = "MDV6-yolov10-c.pt"
data_yaml_path = "datasets/hybrid/data.yaml"

# Download checkpoint if needed
if not os.path.exists(mdv6_checkpoint):
    print(f"Downloading {mdv6_checkpoint}...")
    # Quote the URL so the shell does not interpret the `?` in the query string
    !wget -q "https://zenodo.org/records/15398270/files/MDV6-yolov10-c.pt?download=1" -O {mdv6_checkpoint}

# Update data.yaml with absolute paths
base_dir = os.path.abspath('datasets/hybrid')
with open(data_yaml_path, 'r') as f:
    data_config = yaml.safe_load(f)

data_config['train'] = os.path.join(base_dir, 'images', 'train')
data_config['val'] = os.path.join(base_dir, 'images', 'val')
data_config['test'] = os.path.join(base_dir, 'images', 'test')

with open(data_yaml_path, 'w') as f:
    yaml.dump(data_config, f)

print("Data config:")
print(yaml.dump(data_config))

# Load MDV6 and fine-tune
model = YOLO(mdv6_checkpoint)

# Train with frozen backbone to preserve generic detection
model.train(
    data=data_yaml_path,
    epochs=100,
    imgsz=640,
    batch=8,
    device=0,
    name="yolov10_hybrid_7class",
    freeze=10,      # CRITICAL: Freeze backbone to preserve MDV6 features
    optimizer="AdamW",  # Set explicitly; the default optimizer='auto' ignores lr0
    lr0=0.001,      # Low learning rate for fine-tuning
    lrf=0.01,       # Final LR = lr0 * lrf
    patience=20,    # Early stopping patience
    save=True,
    plots=True
)


## 7) Evaluate the trained model

In [None]:
# Evaluate the trained hybrid model
from ultralytics import YOLO
import os

# Path to trained model
trained_model_path = "runs/detect/yolov10_hybrid_7class/weights/best.pt"

if not os.path.exists(trained_model_path):
    print(f"Model not found at {trained_model_path}")
    if os.path.isdir("runs/detect"):
        print("Available runs:")
        for d in os.listdir("runs/detect"):
            print(f"  - runs/detect/{d}")
else:
    model = YOLO(trained_model_path)
    
    # Evaluate on the test split (model.val() uses the val split unless told otherwise)
    print("\nEvaluating on test set...")
    metrics = model.val(data="datasets/hybrid/data.yaml", split="test", imgsz=640, batch=8)
    
    print("\n" + "="*60)
    print("HYBRID DETECTION MODEL EVALUATION")
    print("="*60)
    print(f"mAP50: {metrics.box.map50:.4f}")
    print(f"mAP50-95: {metrics.box.map:.4f}")
    print("\nPer-class AP50:")
    # ap50 only covers classes present in the evaluated split, so pair each
    # value with its class ID from ap_class_index instead of enumerating
    for cls_id, ap in zip(metrics.box.ap_class_index, metrics.box.ap50):
        cls_id = int(cls_id)
        source = "(MDV6 base)" if cls_id < 3 else "(Open Images)"
        print(f"  {cls_id}: {model.names[cls_id]:10s} = {ap:.4f} {source}")


## 8) Run inference on test images

In [None]:
# Run inference on test images
from ultralytics import YOLO
import glob
import os
import matplotlib.pyplot as plt
from PIL import Image

trained_model_path = "runs/detect/yolov10_hybrid_7class/weights/best.pt"
model = YOLO(trained_model_path)

# Get test images
test_images = glob.glob("datasets/hybrid/images/test/*.*")[:8]
print(f"Running inference on {len(test_images)} test images...")

# Create output directory
os.makedirs("datasets/hybrid/images/inference", exist_ok=True)

# Run inference
for img_path in test_images:
    results = model.predict(img_path, imgsz=640, conf=0.25, verbose=False)
    
    # Save annotated image
    out_path = f"datasets/hybrid/images/inference/{os.path.basename(img_path)}"
    results[0].save(out_path)
    
    # Print detections
    print(f"\n{os.path.basename(img_path)}:")
    if results[0].boxes is not None:
        for box in results[0].boxes:
            cls_id = int(box.cls[0])
            conf = float(box.conf[0])
            cls_name = model.names[cls_id]
            print(f"  - {cls_name}: {conf:.2f}")

print("\n✓ Inference complete. Results saved to datasets/hybrid/images/inference/")


## 9) Export to ONNX

In [None]:
# Export to ONNX format
from ultralytics import YOLO

trained_model_path = "runs/detect/yolov10_hybrid_7class/weights/best.pt"
model = YOLO(trained_model_path)

# Export to ONNX
model.export(format='onnx', imgsz=640, simplify=True)

print("\n✓ Exported to ONNX format")
print("Model file: runs/detect/yolov10_hybrid_7class/weights/best.onnx")
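
Once the ONNX model is running in another runtime, its raw output has to be mapped back to class names. The sketch below assumes each output row has the layout `[x1, y1, x2, y2, score, class_id]`; that layout is an assumption and should be verified against the actual output shape of the export. `decode_rows` and the sample rows are hypothetical.

```python
CLASS_NAMES = {0: "animal", 1: "person", 2: "vehicle",
               3: "bird", 4: "squirrel", 5: "dog", 6: "cat"}

def decode_rows(rows, conf_threshold=0.25):
    """Turn raw output rows into (class_name, score, box) tuples.

    ASSUMPTION: each row is [x1, y1, x2, y2, score, class_id].
    """
    detections = []
    for x1, y1, x2, y2, score, class_id in rows:
        if score < conf_threshold:
            continue  # drop low-confidence rows
        name = CLASS_NAMES.get(int(class_id), "unknown")
        detections.append((name, float(score), (x1, y1, x2, y2)))
    return detections

# Made-up rows for illustration; real rows would come from the ONNX session output.
rows = [
    [10.0, 20.0, 110.0, 220.0, 0.91, 4.0],
    [300.0, 40.0, 420.0, 160.0, 0.12, 0.0],
]
detections = decode_rows(rows)
print(detections)
```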


## Summary: Hybrid Detection Model

The trained model is intended to:

1. **Detect generic classes** (preserved from MDV6 base):
   - `animal` (class 0): Any animal not specifically trained
   - `person` (class 1): Human detection
   - `vehicle` (class 2): Cars, trucks, etc.

2. **Detect specific species** (from Open Images fine-tuning):
   - `bird` (class 3)
   - `squirrel` (class 4)
   - `dog` (class 5)
   - `cat` (class 6)

**Expected behavior during inference:**
- A deer → detected as `animal` (generic)
- A squirrel → detected as `squirrel` (specific)
- A human → detected as `person`
- A car → detected as `vehicle`

---

**Files:**
- Trained model: `runs/detect/yolov10_hybrid_7class/weights/best.pt`
- ONNX export: `runs/detect/yolov10_hybrid_7class/weights/best.onnx`
- Labels JSON: Create with `{"0": "animal", "1": "person", "2": "vehicle", "3": "bird", "4": "squirrel", "5": "dog", "6": "cat"}`
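
The labels JSON can be generated from the class list rather than typed by hand, which keeps it in sync with `data.yaml`. A minimal sketch; `build_labels_json` and the output file name are hypothetical.

```python
import json

classes = ["animal", "person", "vehicle", "bird", "squirrel", "dog", "cat"]

def build_labels_json(class_names):
    """Return a JSON string mapping string class IDs to class names."""
    return json.dumps({str(i): name for i, name in enumerate(class_names)}, indent=2)

labels_json = build_labels_json(classes)
print(labels_json)

# To save it next to the model (the file name is an assumption):
# with open("labels.json", "w") as f:
#     f.write(labels_json)
```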
