# Detector Evaluation Notebook

This notebook evaluates the **MegaDetector v6** object detection model for animal/vehicle detection across multiple dataset splits, optimizes confidence thresholds, and analyzes performance under domain shift conditions.

## Main Steps
1. **Confidence Threshold Optimization**
   - Sweeps detection confidence values (0.05–0.5) on the validation set.
   - Plots **Precision–Recall curve** to help select a balanced high-recall threshold.

2. **Evaluation Across Domains & Splits**
   - Tests the detector on:
     - `cis_val` and `cis_test` (in-domain)
     - `trans_val` and `trans_test` (out-of-domain)
   - Computes:
     - Precision, Recall, F1-score
     - mAP@50 and mAP@50-95
     - Confusion matrices (saved as PNGs with class names from the dataset YAML)

3. **Domain Shift Analysis**
   - Compares performance drop from **cis → trans** test sets.
   - Highlights best-performing models per split.
   - Generates visual summaries:
     - F1 by model/domain
     - mAP@50 by model/domain
     - Precision vs Recall scatter plots
     - Domain shift impact bar chart

4. **Cropping Animal Detection Outputs**
   - Uses the trained detector to generate **per-species cropped images**.
   - Skips vehicles and saves crops to:
     ```
     ../data/megadetector_crops/
       ├── train/
       │   ├── species_1/
       │   ├── species_2/
       │   └── background/
       └── val/...
     ```
   - Assigns labels by checking if the **GT box center** falls inside the detector box, otherwise assigns "background".

5. **Model Export & Threshold Storage**
   - Exports MegaDetector v6 to **ONNX** format for deployment.
   - Appends chosen optimal threshold to `models/thresholds.txt`.

6. **ONNX Inference Benchmark**
   - Measures inference speed (ms/img, FPS) on the validation set.
   - Logs results to `models/inference_times.csv`.
   - Prints sample predictions for manual inspection.

## Purpose
This notebook validates and analyzes the detector's accuracy, robustness to domain shift, and computational performance, while also preparing **cropped datasets** and **deployment-ready ONNX models** for the next classification stage in the pipeline.


In [1]:
from ultralytics import YOLO
import numpy as np
import matplotlib.pyplot as plt
import os

det2   = YOLO("../scripts/train/megadetector_v6/megadetector_augmented/weights/best.pt")   # your 2-class ckpt
# your data.yaml that points at split/images and split/labels
DATA_YAML = "../configs\model\megadetector.yaml"      # 1 = vehicle, 0 = animal
assert os.path.exists(DATA_YAML)

ths     = np.linspace(0.05, 0.5, 10)
prec, rec = [], []

for t in ths:
    m = det2.val(data=DATA_YAML, split="val", conf=float(t),
                 iou=0.50, verbose=False)
    prec.append(float(m.box.mp))    # animal precision
    rec .append(float(m.box.mr))    # animal recall

plt.plot(rec, prec, marker='o'); plt.xlabel("Recall"); plt.ylabel("Precision")
best_t = ths[np.argmax(rec)]        # pick left-most ≥ 0.95 yourself
print(f"Chosen detector conf = {best_t:.2f}")


Ultralytics 8.3.163  Python-3.11.13 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Laptop GPU, 8188MiB)
YOLOv9c summary (fused): 156 layers, 25,320,790 parameters, 0 gradients, 102.3 GFLOPs
[34m[1mval: [0mFast image access  (ping: 0.10.0 ms, read: 606.0452.8 MB/s, size: 122.1 KB)


[34m[1mval: [0mScanning C:\caltech_camera_traps_project\data\megadetector_images\val\labels.cache... 3736 images, 136 backgrounds, 0 corrupt: 100%|██████████| 3736/3736 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 234/234 [01:00<00:00,  3.89it/s]


                   all       3736       3830      0.975       0.95      0.972      0.819
Speed: 0.2ms preprocess, 13.3ms inference, 0.0ms loss, 0.7ms postprocess per image
Results saved to [1mruns\detect\val59[0m
Ultralytics 8.3.163  Python-3.11.13 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Laptop GPU, 8188MiB)
[34m[1mval: [0mFast image access  (ping: 0.10.0 ms, read: 863.8581.5 MB/s, size: 119.0 KB)


[34m[1mval: [0mScanning C:\caltech_camera_traps_project\data\megadetector_images\val\labels.cache... 3736 images, 136 backgrounds, 0 corrupt: 100%|██████████| 3736/3736 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 234/234 [01:01<00:00,  3.81it/s]


                   all       3736       3830      0.975       0.95      0.972       0.82
Speed: 0.2ms preprocess, 13.9ms inference, 0.0ms loss, 0.6ms postprocess per image
Results saved to [1mruns\detect\val60[0m
Ultralytics 8.3.163  Python-3.11.13 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Laptop GPU, 8188MiB)
[34m[1mval: [0mFast image access  (ping: 0.10.0 ms, read: 848.3314.7 MB/s, size: 106.5 KB)


[34m[1mval: [0mScanning C:\caltech_camera_traps_project\data\megadetector_images\val\labels.cache... 3736 images, 136 backgrounds, 0 corrupt: 100%|██████████| 3736/3736 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 234/234 [01:02<00:00,  3.75it/s]


                   all       3736       3830      0.975       0.95      0.972      0.821
Speed: 0.2ms preprocess, 14.2ms inference, 0.0ms loss, 0.6ms postprocess per image
Results saved to [1mruns\detect\val61[0m
Ultralytics 8.3.163  Python-3.11.13 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Laptop GPU, 8188MiB)
[34m[1mval: [0mFast image access  (ping: 0.10.0 ms, read: 971.7549.3 MB/s, size: 110.9 KB)


[34m[1mval: [0mScanning C:\caltech_camera_traps_project\data\megadetector_images\val\labels.cache... 3736 images, 136 backgrounds, 0 corrupt: 100%|██████████| 3736/3736 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 234/234 [01:05<00:00,  3.59it/s]


                   all       3736       3830      0.975       0.95      0.971      0.821
Speed: 0.2ms preprocess, 14.3ms inference, 0.0ms loss, 0.7ms postprocess per image
Results saved to [1mruns\detect\val62[0m
Ultralytics 8.3.163  Python-3.11.13 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Laptop GPU, 8188MiB)
[34m[1mval: [0mFast image access  (ping: 0.10.0 ms, read: 706.1387.8 MB/s, size: 105.7 KB)


[34m[1mval: [0mScanning C:\caltech_camera_traps_project\data\megadetector_images\val\labels.cache... 3736 images, 136 backgrounds, 0 corrupt: 100%|██████████| 3736/3736 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 234/234 [01:04<00:00,  3.65it/s]


                   all       3736       3830      0.975       0.95      0.971      0.822
Speed: 0.2ms preprocess, 14.3ms inference, 0.0ms loss, 0.7ms postprocess per image
Results saved to [1mruns\detect\val63[0m
Ultralytics 8.3.163  Python-3.11.13 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Laptop GPU, 8188MiB)
[34m[1mval: [0mFast image access  (ping: 0.10.1 ms, read: 416.2168.3 MB/s, size: 128.5 KB)


[34m[1mval: [0mScanning C:\caltech_camera_traps_project\data\megadetector_images\val\labels.cache... 3736 images, 136 backgrounds, 0 corrupt: 100%|██████████| 3736/3736 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 234/234 [01:02<00:00,  3.74it/s]


                   all       3736       3830      0.975       0.95       0.97      0.822
Speed: 0.2ms preprocess, 14.2ms inference, 0.0ms loss, 0.6ms postprocess per image
Results saved to [1mruns\detect\val64[0m
Ultralytics 8.3.163  Python-3.11.13 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Laptop GPU, 8188MiB)
[34m[1mval: [0mFast image access  (ping: 0.10.0 ms, read: 833.9232.9 MB/s, size: 109.0 KB)


[34m[1mval: [0mScanning C:\caltech_camera_traps_project\data\megadetector_images\val\labels.cache... 3736 images, 136 backgrounds, 0 corrupt: 100%|██████████| 3736/3736 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 234/234 [01:01<00:00,  3.79it/s]


                   all       3736       3830      0.976       0.95       0.97      0.822
Speed: 0.2ms preprocess, 14.2ms inference, 0.0ms loss, 0.6ms postprocess per image
Results saved to [1mruns\detect\val65[0m
Ultralytics 8.3.163  Python-3.11.13 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Laptop GPU, 8188MiB)
[34m[1mval: [0mFast image access  (ping: 0.10.0 ms, read: 824.3195.6 MB/s, size: 93.7 KB)


[34m[1mval: [0mScanning C:\caltech_camera_traps_project\data\megadetector_images\val\labels.cache... 3736 images, 136 backgrounds, 0 corrupt: 100%|██████████| 3736/3736 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 234/234 [01:02<00:00,  3.74it/s]


                   all       3736       3830      0.976       0.95      0.969      0.822
Speed: 0.2ms preprocess, 14.2ms inference, 0.0ms loss, 0.6ms postprocess per image
Results saved to [1mruns\detect\val66[0m
Ultralytics 8.3.163  Python-3.11.13 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Laptop GPU, 8188MiB)
[34m[1mval: [0mFast image access  (ping: 0.10.0 ms, read: 707.4257.0 MB/s, size: 100.0 KB)


[34m[1mval: [0mScanning C:\caltech_camera_traps_project\data\megadetector_images\val\labels.cache... 3736 images, 136 backgrounds, 0 corrupt: 100%|██████████| 3736/3736 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 234/234 [01:02<00:00,  3.76it/s]


                   all       3736       3830      0.976       0.95      0.968      0.822
Speed: 0.2ms preprocess, 14.2ms inference, 0.0ms loss, 0.6ms postprocess per image
Results saved to [1mruns\detect\val67[0m
Ultralytics 8.3.163  Python-3.11.13 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Laptop GPU, 8188MiB)
[34m[1mval: [0mFast image access  (ping: 0.10.0 ms, read: 1068.9475.7 MB/s, size: 125.3 KB)


[34m[1mval: [0mScanning C:\caltech_camera_traps_project\data\megadetector_images\val\labels.cache... 3736 images, 136 backgrounds, 0 corrupt: 100%|██████████| 3736/3736 [00:00<?, ?it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 234/234 [01:01<00:00,  3.83it/s]


                   all       3736       3830      0.976      0.949      0.967      0.821
Speed: 0.2ms preprocess, 14.0ms inference, 0.0ms loss, 0.5ms postprocess per image
Results saved to [1mruns\detect\val68[0m
Chosen detector conf = 0.35


In [3]:
from ultralytics import YOLO
from pathlib import Path
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import pickle
from datetime import datetime
import yaml


WEIGHTS = {
    "megadetectorv6": "../scripts/train/megadetector_v6/megadetector_augmented/weights/best.pt"
}

PROJECT_ROOT = Path("C:\caltech_camera_traps_project")

YAMLS = {
    PROJECT_ROOT / "configs" / "megadetector_test" / "cis.yaml"   : "cis",
    PROJECT_ROOT / "configs" / "megadetector_test" / "trans.yaml": "trans",
}

TRANS_YAML = [y for y, tag in YAMLS.items() if tag == "trans"][0]

OUT_DIR   = Path("detector_stage");  OUT_DIR.mkdir(exist_ok=True)
TRANS_YAML = [y for y, tag in YAMLS.items() if tag == "trans"][0]
IOU_FIXED = 0.50

In [7]:
# ───────────────── helpers ─────────────────────
def get_class_names_from_yaml(yaml_path):
    with open(yaml_path, 'r') as f:
        data = yaml.safe_load(f)
    
    if 'names' in data:
        names = data['names']
        if isinstance(names, dict):
            # Convert {0: 'opossum', 1: 'raccoon', ...} to ['opossum', 'raccoon', ...]
            class_names = [names[i] for i in sorted(names.keys())]
            return class_names
        elif isinstance(names, list):
            return names
    
    return None


def save_cm(cm, names, tag):
    # Normalize the confusion matrix
    cm_normalized = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
    # Handle division by zero (empty rows)
    cm_normalized = np.nan_to_num(cm_normalized)
    
    fig, ax = plt.subplots(figsize=(8, 6))  # Larger size for better readability
    
    # Create heatmap with normalized values
    sns.heatmap(cm_normalized, 
                xticklabels=names,  # Use class names instead of indices
                yticklabels=names,  # Use class names instead of indices
                cmap="Blues", 
                cbar=True,  # Show colorbar for normalized values
                ax=ax,
                annot=True,  # Show values in cells
                fmt='.2f',   # Format as 2 decimal places
                cbar_kws={'label': 'Normalized Count'})
    
    ax.set_title(f'Confusion Matrix - {tag}')
    ax.set_xlabel('Predicted')
    ax.set_ylabel('Actual')
    
    # Rotate x-axis labels for better readability
    plt.xticks(rotation=45, ha='right')
    plt.yticks(rotation=0)
    
    fig.tight_layout()
    fn = OUT_DIR / f"{tag}_cm.png"
    fig.savefig(fn, dpi=300, bbox_inches='tight')
    plt.close(fig)
    return fn

# ------------------------------------------------------------
# 2. Evaluate all models with their best thresholds
# ------------------------------------------------------------
print(f"\n🚀 Evaluating model with optimized threshold...")

# Define evaluation splits
SPLITS = [
    (TRANS_YAML,                    "trans", "val"),
    (TRANS_YAML,                    "trans", "test"),  # trans_test
    *[(y, t, "val")  for y, t in YAMLS.items() if t == "cis"],   # cis_val
    *[(y, t, "test") for y, t in YAMLS.items() if t == "cis"],   # cis_test
]

model_objects = {}  # Cache loaded models
rows = []

for model_name, weight_path in WEIGHTS.items():
    print(f"\n📈 Evaluating {model_name}...")
    
    # Load model once and cache it
    if model_name not in model_objects:
        model_objects[model_name] = YOLO(weight_path)
    model = model_objects[model_name]
    
    # Get the best threshold for this model
    best_conf = 0.35
    
    # Evaluate on all splits
    for yaml_path, dom, split in SPLITS:
        tag = f"{model_name}_{dom}_{split}"
        
        # Run evaluation
        met = model.val(data=yaml_path, split=split,
                       conf=float(best_conf), iou=IOU_FIXED,
                       verbose=False, plots=True)
        
        # Save confusion matrix
        try:
            # Newer Ultralytics: metric.confusion_matrix.matrix
            cm = met.confusion_matrix.matrix
        except AttributeError:
            # Older builds: metric.box.confusion_matrix.matrix
            print("FAILED")
            cm = met.box.confusion_matrix.matrix
        
        try:
            # Get class names from the dataset YAML
            class_names = get_class_names_from_yaml(yaml_path)
            print(f"Class names from YAML: {class_names}")
            
            if cm is not None and cm.sum() > 0:
                cm_png = save_cm(cm, class_names, tag)  # Use YAML class names
            else:
                print(f"Empty confusion matrix for {tag}")
                cm_png = None
        except Exception as e:
            print(f"Error with confusion matrix: {e}")
            cm_png = None
        
        # Calculate metrics
        precision = float(met.box.mp)
        recall = float(met.box.mr)
        f1 = 2 * precision * recall / (precision + recall + 1e-9)
        
        # Store results
        row = {
            "model"      : model_name,
            "domain"     : dom,
            "split"      : split,
            "conf"       : round(best_conf, 2),
            "precision"  : precision,
            "recall"     : recall,
            "F1"         : f1,
            "mAP50"      : float(met.box.map50),
            "mAP50-95"   : float(met.box.map),
            "cm_png"     : cm_png.name,
            "eval_time"  : datetime.now().strftime("%Y-%m-%d %H:%M:%S")
        }
        rows.append(row)
        
        print(f"  {tag:35s} P={precision:.3f}  R={recall:.3f}  F1={f1:.3f}")




🚀 Evaluating model with optimized threshold...

📈 Evaluating megadetectorv6...
Ultralytics 8.3.163  Python-3.11.13 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Laptop GPU, 8188MiB)
YOLOv9c summary (fused): 156 layers, 25,320,790 parameters, 0 gradients, 102.3 GFLOPs
[34m[1mval: [0mFast image access  (ping: 0.10.0 ms, read: 330.0121.2 MB/s, size: 93.3 KB)


[34m[1mval: [0mScanning C:\caltech_camera_traps_project\data\megadetector_images\trans_val\labels... 1972 images, 57 backgrounds, 0 corrupt: 100%|██████████| 1972/1972 [00:02<00:00, 980.52it/s] 


[34m[1mval: [0mNew cache created: C:\caltech_camera_traps_project\data\megadetector_images\trans_val\labels.cache


                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 124/124 [00:30<00:00,  4.09it/s]


                   all       1972       2047      0.973      0.927      0.957      0.806
Speed: 0.2ms preprocess, 12.8ms inference, 0.0ms loss, 0.6ms postprocess per image
Results saved to [1mruns\detect\val54[0m
Class names from YAML: ['animal', 'vehicle']
  megadetectorv6_trans_val            P=0.973  R=0.927  F1=0.949
Ultralytics 8.3.163  Python-3.11.13 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Laptop GPU, 8188MiB)
[34m[1mval: [0mFast image access  (ping: 0.10.0 ms, read: 230.688.8 MB/s, size: 112.2 KB)


[34m[1mval: [0mScanning C:\caltech_camera_traps_project\data\megadetector_images\trans_test\labels... 18553 images, 1782 backgrounds, 0 corrupt: 100%|██████████| 18553/18553 [00:20<00:00, 918.83it/s] 


[34m[1mval: [0mNew cache created: C:\caltech_camera_traps_project\data\megadetector_images\trans_test\labels.cache


                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 1160/1160 [05:01<00:00,  3.85it/s]


                   all      18553      17490      0.964      0.956      0.975      0.816
Speed: 0.2ms preprocess, 14.0ms inference, 0.0ms loss, 0.5ms postprocess per image
Results saved to [1mruns\detect\val55[0m
Class names from YAML: ['animal', 'vehicle']
  megadetectorv6_trans_test           P=0.964  R=0.956  F1=0.960
Ultralytics 8.3.163  Python-3.11.13 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Laptop GPU, 8188MiB)
[34m[1mval: [0mFast image access  (ping: 0.10.0 ms, read: 227.565.5 MB/s, size: 107.3 KB)


[34m[1mval: [0mScanning C:\caltech_camera_traps_project\data\megadetector_images\cis_val\labels... 1764 images, 79 backgrounds, 0 corrupt: 100%|██████████| 1764/1764 [00:01<00:00, 887.19it/s]

[34m[1mval: [0mNew cache created: C:\caltech_camera_traps_project\data\megadetector_images\cis_val\labels.cache



                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 111/111 [00:29<00:00,  3.70it/s]


                   all       1764       1783       0.97      0.976      0.982      0.837
Speed: 0.2ms preprocess, 14.2ms inference, 0.0ms loss, 0.6ms postprocess per image
Results saved to [1mruns\detect\val56[0m
Class names from YAML: ['animal', 'vehicle']
  megadetectorv6_cis_val              P=0.970  R=0.976  F1=0.973
Ultralytics 8.3.163  Python-3.11.13 torch-2.5.1+cu121 CUDA:0 (NVIDIA GeForce RTX 4070 Laptop GPU, 8188MiB)
[34m[1mval: [0mFast image access  (ping: 0.10.1 ms, read: 199.245.4 MB/s, size: 115.6 KB)


[34m[1mval: [0mScanning C:\caltech_camera_traps_project\data\megadetector_images\cis_test\labels... 12141 images, 96 backgrounds, 0 corrupt: 100%|██████████| 12141/12141 [00:14<00:00, 859.46it/s]


[34m[1mval: [0mNew cache created: C:\caltech_camera_traps_project\data\megadetector_images\cis_test\labels.cache


                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 759/759 [03:28<00:00,  3.64it/s]


                   all      12141      12569      0.979      0.973       0.98      0.822
Speed: 0.2ms preprocess, 14.4ms inference, 0.0ms loss, 0.7ms postprocess per image
Results saved to [1mruns\detect\val57[0m
Class names from YAML: ['animal', 'vehicle']
  megadetectorv6_cis_test             P=0.979  R=0.973  F1=0.976


In [9]:
# ------------------------------------------------------------
# 3. Save and analyze results
# ------------------------------------------------------------

# Save main summary
csv_path = OUT_DIR / "summary_metrics.csv"

df = pd.DataFrame(rows)

df.to_csv(csv_path, index=False)


# Save additional detailed analysis
detailed_path = OUT_DIR / "detailed_analysis.csv"
df_detailed = df.copy()

# Add performance comparisons
df_detailed['domain_shift'] = df_detailed.apply(
    lambda row: 'in_domain' if row['domain'] == 'cis' else 'out_domain', axis=1
)

# Calculate relative performance (compared to best model per split)
for split_combo in df_detailed[['domain', 'split']].drop_duplicates().values:
    dom, spl = split_combo
    mask = (df_detailed['domain'] == dom) & (df_detailed['split'] == spl)
    best_f1 = df_detailed[mask]['F1'].max()
    df_detailed.loc[mask, 'relative_f1'] = df_detailed.loc[mask, 'F1'] / best_f1

df_detailed.to_csv(detailed_path, index=False)

print(f"\n✔︎ Main metrics written to {csv_path.resolve()}")
print(f"✔︎ Detailed analysis written to {detailed_path.resolve()}")

# Display summary statistics
print(f"\n📊 EVALUATION SUMMARY:")
print(f"{'='*60}")

# Summary by domain
print("\n🎯 Performance by Domain:")
domain_summary = df.groupby(['domain', 'split']).agg({
    'F1': ['mean', 'std', 'max', 'min'],
    'mAP50': ['mean', 'std', 'max', 'min']
}).round(3)
print(domain_summary)

# Best performing models per split
print(f"\n🏆 Best Models per Split:")
for split_combo in df[['domain', 'split']].drop_duplicates().values:
    dom, spl = split_combo
    mask = (df['domain'] == dom) & (df['split'] == spl)
    best_row = df[mask].loc[df[mask]['F1'].idxmax()]
    print(f"  {dom}_{spl:4s}: {best_row['model']:30s} (F1={best_row['F1']:.3f})")

# Domain shift analysis
print(f"\n🔄 Domain Shift Analysis:")
cis_test = df[(df['domain'] == 'cis') & (df['split'] == 'test')]
trans_test = df[(df['domain'] == 'trans') & (df['split'] == 'test')]

for model in df['model'].unique():
    cis_f1 = cis_test[cis_test['model'] == model]['F1'].iloc[0]
    trans_f1 = trans_test[trans_test['model'] == model]['F1'].iloc[0]
    drop = ((cis_f1 - trans_f1) / cis_f1) * 100
    print(f"  {model:30s}: {drop:+5.1f}% drop (cis: {cis_f1:.3f} → trans: {trans_f1:.3f})")

# Create visualization
fig, axes = plt.subplots(2, 2, figsize=(15, 10))

# Plot 1: F1 scores by model and domain
ax1 = axes[0, 0]
test_data = df[df['split'] == 'test']
pivot_f1 = test_data.pivot(index='model', columns='domain', values='F1')
pivot_f1.plot(kind='bar', ax=ax1, rot=45)
ax1.set_title('F1 Score by Model and Domain (Test Set)')
ax1.set_ylabel('F1 Score')
ax1.legend(title='Domain')
ax1.grid(True, alpha=0.3)

# Plot 2: mAP50 scores
ax2 = axes[0, 1]
pivot_map = test_data.pivot(index='model', columns='domain', values='mAP50')
pivot_map.plot(kind='bar', ax=ax2, rot=45)
ax2.set_title('mAP50 by Model and Domain (Test Set)')
ax2.set_ylabel('mAP50')
ax2.legend(title='Domain')
ax2.grid(True, alpha=0.3)

# Plot 3: Precision vs Recall
ax3 = axes[1, 0]
for domain in df['domain'].unique():
    domain_data = test_data[test_data['domain'] == domain]
    ax3.scatter(domain_data['recall'], domain_data['precision'], 
               label=domain, alpha=0.7, s=100)
ax3.set_xlabel('Recall')
ax3.set_ylabel('Precision')
ax3.set_title('Precision vs Recall (Test Set)')
ax3.legend()
ax3.grid(True, alpha=0.3)

# Plot 4: Domain shift impact
ax4 = axes[1, 1]
models = []
domain_drops = []
for model in df['model'].unique():
    try:
        cis_f1 = cis_test[cis_test['model'] == model]['F1'].iloc[0]
        trans_f1 = trans_test[trans_test['model'] == model]['F1'].iloc[0]
        drop = ((cis_f1 - trans_f1) / cis_f1) * 100
        models.append(model)
        domain_drops.append(drop)
    except:
        pass

ax4.bar(range(len(models)), domain_drops, color='lightcoral', alpha=0.7)
ax4.set_xticks(range(len(models)))
ax4.set_xticklabels(models, rotation=45, ha='right')
ax4.set_ylabel('Performance Drop (%)')
ax4.set_title('Domain Shift Impact (CIS → Trans)')
ax4.grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig(OUT_DIR / "evaluation_summary.png", dpi=300, bbox_inches='tight')
plt.show()

print(f"✔︎ Evaluation summary plot saved to {OUT_DIR / 'evaluation_summary.png'}")
print(f"\n🎉 Evaluation complete! Check {OUT_DIR} for all results.")

# Display final dataframe
print(f"\n📋 Final Results DataFrame:")
df


✔︎ Main metrics written to C:\caltech_camera_traps_project\eval\detector_stage\summary_metrics.csv
✔︎ Detailed analysis written to C:\caltech_camera_traps_project\eval\detector_stage\detailed_analysis.csv

📊 EVALUATION SUMMARY:

🎯 Performance by Domain:
                 F1                    mAP50                  
               mean std    max    min   mean std    max    min
domain split                                                  
cis    test   0.976 NaN  0.976  0.976  0.980 NaN  0.980  0.980
       val    0.973 NaN  0.973  0.973  0.982 NaN  0.982  0.982
trans  test   0.960 NaN  0.960  0.960  0.975 NaN  0.975  0.975
       val    0.949 NaN  0.949  0.949  0.957 NaN  0.957  0.957

🏆 Best Models per Split:
  trans_val : megadetectorv6                 (F1=0.949)
  trans_test: megadetectorv6                 (F1=0.960)
  cis_val : megadetectorv6                 (F1=0.973)
  cis_test: megadetectorv6                 (F1=0.976)

🔄 Domain Shift Analysis:
  megadetectorv6                

<Figure size 1500x1000 with 4 Axes>

✔︎ Evaluation summary plot saved to detector_stage\evaluation_summary.png

🎉 Evaluation complete! Check detector_stage for all results.

📋 Final Results DataFrame:


Unnamed: 0,model,domain,split,conf,precision,recall,F1,mAP50,mAP50-95,cm_png,eval_time
0,megadetectorv6,trans,val,0.35,0.972706,0.926648,0.949118,0.957226,0.805949,megadetectorv6_trans_val_cm.png,2025-07-23 15:20:39
1,megadetectorv6,trans,test,0.35,0.963596,0.956059,0.959813,0.974539,0.816402,megadetectorv6_trans_test_cm.png,2025-07-23 15:26:24
2,megadetectorv6,cis,val,0.35,0.969513,0.975908,0.9727,0.982211,0.837487,megadetectorv6_cis_val_cm.png,2025-07-23 15:27:13
3,megadetectorv6,cis,test,0.35,0.97885,0.973242,0.976038,0.979707,0.822318,megadetectorv6_cis_test_cm.png,2025-07-23 15:31:21


In [4]:
from ultralytics import YOLO
from pathlib import Path
import cv2, os, random

# ── CONFIG ────────────────────────────────────────────────
YOLO_SPLITS = {
    "train": ("../data/yolo_images/train_balanced/images", "../data/yolo_images/train_balanced/labels"),
    "val"  : ("../data/yolo_images/val/images",   "../data/yolo_images/val/labels"),
}
DET_WGT   = "../scripts/train/megadetector_v6/megadetector_augmented/weights/best.pt"
CONF_DET  = 0.35                         # high-recall value you chose
DATA_YAML   = "../configs/model/yolo_balanced.yaml"
BG_NAME   = "background"
OUT_ROOT  = Path("../data/megadetector_crops/")     # will hold train/ val/ subdirs
# ──────────────────────────────────────────────────────────

with open(DATA_YAML) as f:
    names_yaml = yaml.safe_load(f)["names"]        # list or {id: name}


if isinstance(names_yaml, dict):                   # dict?  sort by key
    names = [names_yaml[i] for i in sorted(names_yaml)]
else:                                              # already list
    names = names_yaml

vehicle_ids = [i for i,n in enumerate(names) if n.lower() in ("car")]
species_ids = [i for i in range(len(names)) if i not in vehicle_ids]

print("species ids:", species_ids)
print("vehicle ids:", vehicle_ids, "\n")

# 2) prepare directories -------------------------------------
for split in YOLO_SPLITS:
    for cls_name in [names[i] for i in species_ids] + [BG_NAME]:
        (OUT_ROOT/split/cls_name).mkdir(parents=True, exist_ok=True)

# 3) helper ---------------------------------------------------
def iou_xyxy(a,b):
    xA=max(a[0],b[0]); yA=max(a[1],b[1])
    xB=min(a[2],b[2]); yB=min(a[3],b[3])
    inter=max(0,xB-xA)*max(0,yB-yA)
    if inter==0: return 0
    union=(a[2]-a[0])*(a[3]-a[1])+(b[2]-b[0])*(b[3]-b[1])-inter
    return inter/union

det = YOLO(DET_WGT, task="detect"); det.conf = CONF_DET

# 4) generate crops ------------------------------------------
for split, (img_dir, lbl_dir) in YOLO_SPLITS.items():
    img_dir, lbl_dir = Path(img_dir), Path(lbl_dir)

    for res in det.predict(img_dir, imgsz=640, verbose=False):
        img = cv2.imread(res.path); h,w = img.shape[:2]
        lbl_path = lbl_dir / Path(res.path).with_suffix(".txt").name
        if not lbl_path.exists():
            print("missing label file:", lbl_path); continue

        # -- load GT species boxes ----------------------------
        gt_boxes, gt_ids = [], []
        with open(lbl_path) as f:
            for ln in f:
                cid, xc, yc, bw, bh = map(float, ln.split())
                cid=int(cid)
                if cid in vehicle_ids:              # skip vehicle GT
                    continue
                gx1=(xc-bw/2)*w; gy1=(yc-bh/2)*h
                gx2=(xc+bw/2)*w; gy2=(yc+bh/2)*h
                gt_boxes.append((gx1,gy1,gx2,gy2)); gt_ids.append(cid)

        # -- walk detector boxes ------------------------------
        for box, cls_det in zip(res.boxes.xyxy.cpu().numpy(),
                        res.boxes.cls.cpu().numpy()):
            if int(cls_det) == 1:              # detector says "vehicle"  → skip
                continue

            x1, y1, x2, y2 = box.astype(int)

            # ----------- NEW centre-inside-GT logic ----------------
            label_id = None
            for (gx1, gy1, gx2, gy2), gid in zip(gt_boxes, gt_ids):
                cx, cy = (gx1 + gx2) / 2, (gy1 + gy2) / 2   # GT box centre point
                if x1 <= cx <= x2 and y1 <= cy <= y2:       # centre inside detector box?
                    label_id = gid                          # use that species
                    break                                   # stop after first match

            label_name = names[label_id] if label_id is not None else BG_NAME
            # -------------------------------------------------------

            out = (
                OUT_ROOT / split / label_name /
                f"{Path(res.path).stem}_{x1}_{y1}.jpg"
            )
            cv2.imwrite(str(out), img[y1:y2, x1:x2])

print("✅ All crops finished →", OUT_ROOT.resolve())

species ids: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13]
vehicle ids: [11] 

inference results will accumulate in RAM unless `stream=True` is passed, causing potential out-of-memory
errors for large sources or long-running streams and videos. See https://docs.ultralytics.com/modes/predict/ for help.

Example:
    results = model(source=..., stream=True)  # generator of Results objects
    for r in results:
        boxes = r.boxes  # Boxes object for bbox outputs
        masks = r.masks  # Masks object for segment masks outputs
        probs = r.probs  # Class probabilities for classification outputs

inference results will accumulate in RAM unless `stream=True` is passed, causing potential out-of-memory
errors for large sources or long-running streams and videos. See https://docs.ultralytics.com/modes/predict/ for help.

Example:
    results = model(source=..., stream=True)  # generator of Results objects
    for r in results:
        boxes = r.boxes  # Boxes object for bbox outputs
   

In [None]:
from pathlib import Path
from ultralytics import YOLO   

ckpt_path = Path("../scripts/train/megadetector_v6/megadetector_augmented/weights/best.pt")   # adjust if needed
out_dir   = Path("../models")
out_dir.mkdir(parents=True, exist_ok=True)

model = YOLO(str(ckpt_path))
onnx_name = out_dir / "megadetectorv6.onnx"
model.export(format="onnx", imgsz=640, simplify=True, dynamic=True, nms=True)
(ckpt_path.parent / "best.onnx").rename(onnx_name)

print(f" exported → {onnx_name.resolve()}")


Ultralytics 8.3.163  Python-3.11.13 torch-2.5.1+cu121 CPU (AMD Ryzen 9 7940HS w/ Radeon 780M Graphics)
YOLOv9c summary (fused): 156 layers, 25,320,790 parameters, 0 gradients, 102.3 GFLOPs

[34m[1mPyTorch:[0m starting from '..\scripts\train\megadetector_v6\megadetector_augmented\weights\best.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 300, 6) (49.2 MB)

[34m[1mONNX:[0m starting export with onnx 1.17.0 opset 19...
[34m[1mONNX:[0m slimming with onnxslim 0.1.61...
[34m[1mONNX:[0m export success  9.5s, saved as '..\scripts\train\megadetector_v6\megadetector_augmented\weights\best.onnx' (96.8 MB)

Export complete (10.7s)
Results saved to [1mC:\caltech_camera_traps_project\scripts\train\megadetector_v6\megadetector_augmented\weights[0m
Predict:         yolo predict task=detect model=..\scripts\train\megadetector_v6\megadetector_augmented\weights\best.onnx imgsz=640  
Validate:        yolo val task=detect model=..\scripts\train\megadetector_v6\megadetector

In [10]:
from pathlib import Path

# Define file path
txt_path =  Path("../models/thresholds.txt")  # replace with your .txt file name
new_line = "megadetector_v6: 0.35"

# Append the line
with open(txt_path, "a") as f:
    f.write(f"{new_line}\n")

print(f"✅ Line added to {txt_path.resolve()}")


✅ Line added to C:\caltech_camera_traps_project\models\thresholds.txt


In [None]:
# ──────────────────────────────────────────────────────────────
# Benchmark one exported ONNX model on the test set and log time
# ──────────────────────────────────────────────────────────────
from ultralytics import YOLO
from pathlib import Path
import pandas as pd
import time
out_dir = Path("../models")
# ── 1. CONFIG ────────────────────────────────────────────────
MODEL_NAME  = "megadetectorv6"           # ← change to "megadetector_v6"
CONF_TH     = 0.35                       # ← change to 0.15 for MD-v6
ONNX_PATH   = out_dir / f"{MODEL_NAME}.onnx"
TEST_DIR    = Path("../data/megadetector_images/val/images")   # folder of *.jpg / *.png

LOG_CSV     = Path("../models/inference_times.csv")
LOG_CSV.parent.mkdir(parents=True, exist_ok=True)

# ── 2. RUN & TIME  ───────────────────────────────────────────
model = YOLO(ONNX_PATH, task="detect")

t0 = time.perf_counter()
results = model.predict(source=str(TEST_DIR),
                        imgsz=640,
                        conf=CONF_TH,
                        save=False,
                        verbose=False)
t1 = time.perf_counter()

num_imgs     = len(results)
total_secs   = t1 - t0
avg_secs     = total_secs / num_imgs
fps          = 1 / avg_secs

print(f"\n{MODEL_NAME}: {num_imgs} images  |  {total_secs:6.2f} s total  "
      f"→  {avg_secs*1000:5.1f} ms/img  ({fps:5.1f} FPS)")

# ── 3. LOG TO CSV  ───────────────────────────────────────────
row = {
    "model"        : MODEL_NAME,
    "images"       : num_imgs,
    "conf_thresh"  : CONF_TH,
    "total_sec"    : round(total_secs, 3),
    "sec_per_img"  : round(avg_secs, 4),
    "fps"          : round(fps, 2),
}

if LOG_CSV.exists():
    df_log = pd.read_csv(LOG_CSV)
    # drop any previous entry for this model so the latest timing wins
    df_log = df_log[df_log["model"] != MODEL_NAME]
    df_log = pd.concat([df_log, pd.DataFrame([row])], ignore_index=True)
else:
    df_log = pd.DataFrame([row])

df_log.to_csv(LOG_CSV, index=False)
print(f"✅ timing appended → {LOG_CSV.resolve()}")


# 3. Peek at the first few predictions -------------------------------------
print(f"\nFound {len(results)} images in {TEST_DIR}")
for i, r in enumerate(results[:3]):        # just display first 3
    print(f"\n🖼️  Image {i+1}: {Path(r.path).name}")
    for cls, conf, xyxy in zip(r.boxes.cls, r.boxes.conf, r.boxes.xyxy):
        label = r.names[int(cls)]
        x1, y1, x2, y2 = map(int, xyxy)
        print(f"   • {label:12s} {conf:5.2f}   box=({x1},{y1})–({x2},{y2})")