# YOLO Object Detection Evaluation

This notebook evaluates the trained YOLO object detection model on the dedicated test set. It assesses:

1. Detection Performance:
   - Mean Average Precision (mAP)
   - Two IoU settings: 0.50 (mAP50) and the average over 0.50 to 0.95 (mAP50-95)
   - Per-class metrics

2. Testing Protocol:
   - Dedicated test set evaluation
   - Confidence analysis
   - Automated metrics logging

3. Results Analysis:
   - Precision, recall, and mAP, overall and per class
   - Performance visualization
   - Detailed reporting

## 1. Imports

In [None]:
# Object detection framework
from ultralytics import YOLO

# File and path handling
from pathlib import Path

## 2. Evaluation Configuration

Set up the evaluation pipeline:

1. Model Selection:
   - Trained model path
   - Best weights selection
   - Path resolution

2. Dataset Configuration:
   - Test set definition
   - Data YAML path
   - Class mapping

3. Output Organization:
   - Results directory
   - Metrics storage
   - Plot generation

In [None]:
# Model identification
RUN_NAME = "peatland_detector_v12"     # Name of the training run
PROJECT_DIR = "metrics/detection"       # Base metrics directory

# Path configuration
MODEL_PATH = Path(PROJECT_DIR) / RUN_NAME / "weights/best.pt"     # Best model weights
DATA_YAML_PATH = Path("../data/processed/detection/data.yaml")    # Dataset config

# Display configuration
print(f"Loading model from: {MODEL_PATH}")
print(f"Using dataset configuration: {DATA_YAML_PATH}")

Loading model from: metrics/detection/peatland_detector_v12/weights/best.pt
Using dataset configuration: ../data/processed/detection/data.yaml
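
Since both paths are relative, they resolve against the notebook's working directory, so a quick existence check before loading can save a confusing error later. The sketch below is one possible way to do this; the helper name `find_missing` is a hypothetical addition, not part of the original notebook or of Ultralytics.

```python
from pathlib import Path


def find_missing(paths):
    """Return the subset of `paths` that do not exist on disk."""
    return [Path(p) for p in paths if not Path(p).exists()]


# Same values as in the configuration cell above
RUN_NAME = "peatland_detector_v12"
PROJECT_DIR = "metrics/detection"
MODEL_PATH = Path(PROJECT_DIR) / RUN_NAME / "weights/best.pt"
DATA_YAML_PATH = Path("../data/processed/detection/data.yaml")

missing = find_missing([MODEL_PATH, DATA_YAML_PATH])
if missing:
    # Show absolute paths so it is obvious which working directory was used
    for path in missing:
        print(f"Missing input: {path.resolve()}")
else:
    print("All evaluation inputs found.")
```

Returning the missing paths instead of raising keeps the check non-fatal; the caller decides whether to stop.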


## 3. Model Loading

Initialize YOLO model for evaluation:

1. Model Loading:
   - Trained weights
   - Model architecture
   - Configuration parameters

2. Memory Management:
   - Efficient loading
   - Resource allocation
   - Cache handling

3. Inference Setup:
   - Evaluation mode
   - Batch processing
   - Device optimization

In [9]:
model = YOLO(MODEL_PATH)

print("Trained model loaded successfully.")

Trained model loaded successfully.
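
The "Device optimization" item above is not configured explicitly in this notebook; the evaluation log further down shows the run executing on CPU. As a hedged sketch, a device string could be chosen up front and passed to `model.val(..., device=...)`. The helper `pick_device` and its preference order are assumptions for illustration, and whether a GPU backend is actually faster for this model is not verified here.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose a device string, preferring CUDA, then Apple MPS, then CPU.

    The availability flags are passed in so the function stays testable
    without importing torch.
    """
    if cuda_available:
        return "cuda:0"
    if mps_available:
        return "mps"
    return "cpu"


# Example with explicit flags (no GPU backend available)
print(pick_device(cuda_available=False, mps_available=False))
```

In the notebook, the flags would typically come from `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.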


## 4. Run Evaluation
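
The validation call in this section reports mAP50 and mAP50-95. Both build on Intersection over Union (IoU): a prediction only counts as a match if its IoU with a ground-truth box reaches the threshold, and mAP50-95 averages over ten thresholds from 0.50 to 0.95. The following is a minimal sketch of that idea, not Ultralytics' internal implementation; the function names and the example boxes are illustrative assumptions.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the overlapping rectangle (may be empty)
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten IoU thresholds averaged by mAP50-95: 0.50, 0.55, ..., 0.95
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def thresholds_passed(iou):
    """List the thresholds at which a prediction with this IoU is a match."""
    return [t for t in IOU_THRESHOLDS if iou >= t]


ground_truth = (0, 0, 10, 10)   # hypothetical box
prediction = (2, 0, 12, 10)     # same size, shifted 2 units to the right
iou = box_iou(ground_truth, prediction)
print(f"IoU = {iou:.3f}")
print("Counts as a match at thresholds:", thresholds_passed(iou))
```

A slightly shifted box like this one passes the loose thresholds but fails the strict ones, which is why mAP50-95 is typically lower than mAP50.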

In [10]:
metrics = model.val(
    data=str(DATA_YAML_PATH),
    split='test', # Explicitly specify the test set
    name=f'{RUN_NAME}_test_evaluation' # Custom name for the output folder
)

print("\n--- Evaluation Complete ---")
# The returned metrics object holds all results (see metrics.results_dict).
# The headline scores, for example:
print(f"mAP50-95: {metrics.box.map}")
print(f"mAP50: {metrics.box.map50}")
# val() was called without `project`, so the results land in the default
# output directory reported in the "Results saved to" log line.
print(f"\nDetailed metrics and plots saved in 'runs/detect/{RUN_NAME}_test_evaluation'")



Ultralytics 8.3.173 🚀 Python-3.11.8 torch-2.7.1 CPU (Apple M2 Pro)
YOLO11m summary (fused): 125 layers, 20,032,345 parameters, 0 gradients, 67.7 GFLOPs
val: Fast image access ✅ (ping: 0.1±0.1 ms, read: 104.0±36.7 MB/s, size: 61.1 KB)


val: Scanning /Users/stahlma/Desktop/01_Studium/09_Vision_Project/peatland_navigation/data/processed/detection/labels/test... 611 images, 0 backgrounds, 0 corrupt: 100%|██████████| 611/611 [00:00<00:00, 989.67it/s]

val: /Users/stahlma/Desktop/01_Studium/09_Vision_Project/peatland_navigation/data/processed/detection/images/test/cone_4_z8126b7d060d92d74599e0618_f103ee1115bfb3eed_d20170226_m022615_c001_v0001038_t0047_png.rf.13dfcca46af542f8aad3f98df957d441.jpg: 3 duplicate labels removed
val: /Users/stahlma/Desktop/01_Studium/09_Vision_Project/peatland_navigation/data/processed/detection/images/test/cone_4_z8126b7d060d92d74599e0618_f103ee1115bfb41ca_d20170226_m022825_c001_v0001038_t0047_png.rf.4ef757c8882ed5d85dee5a648f17ba1b.jpg: 2 duplicate labels removed
val: /Users/stahlma/Desktop/01_Studium/09_Vision_Project/peatland_navigation/data/processed/detection/images/test/cone_4_z8126b7d060d92d74599e0618_f103ee1115bfb64e9_d20170226_m025747_c001_v0001038_t0047_png.rf.e405f032c51fdd296c2ccb4a2e24c398.jpg: 2 duplicate labels removed
val: /Users/stahlma/Desktop/01_Studium/09_Vision_Project/peatland_navigation/data/processed/detection/images/test/cone_4_z8


                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 39/39 [09:31<00:00, 14.66s/it]


                   all        611        707      0.935      0.916      0.964        0.7
                 bench        307        311      0.865      0.939      0.965      0.711
                  cone        166        256      0.956      0.929      0.979      0.704
                  sign        138        140      0.984       0.88      0.947      0.686
Speed: 1.3ms preprocess, 929.5ms inference, 0.0ms loss, 0.5ms postprocess per image
Results saved to runs/detect/peatland_detector_v12_test_evaluation

--- Evaluation Complete ---
mAP50-95: 0.7004630530806742
mAP50: 0.9636802041346447

Detailed metrics and plots saved in 'runs/detect/peatland_detector_v12_test_evaluation'
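
The per-class rows in the table above can be cross-checked and logged with a few lines of standard-library Python. The sketch below copies the three class rows from the output, computes their unweighted (macro) average, and serializes them as CSV. The helper names `macro_average` and `to_csv` are illustrative assumptions; in a live session the values would more naturally come from the `metrics` object than from hand-copied numbers.

```python
import csv
import io

FIELDS = ["P", "R", "mAP50", "mAP50-95"]

# Per-class rows copied from the validation output above
per_class = {
    "bench": {"P": 0.865, "R": 0.939, "mAP50": 0.965, "mAP50-95": 0.711},
    "cone": {"P": 0.956, "R": 0.929, "mAP50": 0.979, "mAP50-95": 0.704},
    "sign": {"P": 0.984, "R": 0.880, "mAP50": 0.947, "mAP50-95": 0.686},
}


def macro_average(rows):
    """Unweighted mean of each metric over all classes."""
    return {f: sum(r[f] for r in rows.values()) / len(rows) for f in FIELDS}


def to_csv(rows):
    """Serialize the per-class metrics as CSV text, one row per class."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["class", *FIELDS])
    for name, row in rows.items():
        writer.writerow([name, *(row[f] for f in FIELDS)])
    return buffer.getvalue()


overall = macro_average(per_class)
print({f: round(v, 3) for f, v in overall.items()})
print(to_csv(per_class))
```

The macro averages agree with the `all` row to the printed precision, so each class contributes equally to the headline scores regardless of its instance count (311 bench vs. 140 sign instances).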
