# Comparing Results between all Detectors


## 1. Introduction

This notebook covers several methods for comparing and assessing the performance of each object detection model that was trained and used. The evaluation covers accuracy metrics, inference speed, per-class performance, and a qualitative comparison on sample images. For fairness, each model was trained on the same version of the dataset.

### Models Compared:

- `YOLOv8`: One-stage detector, mostly used for real-time applications.
- `YOLO (second variant)`: TODO: confirm which YOLO version this is and who is responsible for it.
- `Faster R-CNN`: Two-stage detector that focuses on optimising detection accuracy.
- `SSD`: One-stage, lightweight detector optimised for speed.


#### Import necessary libraries and Quantitative Summary

The script below summarises quantitative results for each model. Mean Average Precision (mAP) is reported at an IoU threshold of 0.5 (mAP@0.5) and averaged over IoU thresholds from 0.5 to 0.95 (mAP@0.5:0.95).
Inference speed is measured in frames per second (FPS) on GPU hardware.
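
As an aside before the summary cell, the FPS figures could be obtained with a timing loop along the following lines. This is only a minimal sketch: `measure_fps` and `dummy_predict` are hypothetical names, and the dummy function stands in for whichever model's inference call is being timed.

```python
import time


def measure_fps(predict, images, warmup=3):
    """Return frames per second for `predict` over `images` (hypothetical helper)."""
    # Warm-up calls are excluded so one-off setup cost does not skew the result.
    for image in images[:warmup]:
        predict(image)

    start = time.perf_counter()
    for image in images:
        predict(image)
    # On a GPU, asynchronous work must finish before the clock is read
    # (with PyTorch, for example, by calling torch.cuda.synchronize() here).
    elapsed = time.perf_counter() - start
    return len(images) / elapsed


def dummy_predict(image):
    """Stand-in for a real detector: sleeps for about 2 ms per image."""
    time.sleep(0.002)
    return []


fps = measure_fps(dummy_predict, images=list(range(20)))
print(f"{fps:.1f} FPS")
```

For a fair comparison, every model would need to be timed on the same hardware, with the same batch size and input resolution.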


In [None]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_theme(style="whitegrid")

results_data = {
    'Model': [                              #REPLACE
        'YOLOv8 (This Work)',               #REPLACE
        'Faster R-CNN (Member 2)',          #REPLACE
        'SSD (Member 3)'
    ],
    'mAP@0.5': [0.92, 0.89, 0.85],          # Placeholder
    'mAP@0.5:0.95': [0.75, 0.78, 0.65],     # Placeholder
    'Inference_Speed_FPS': [45, 12, 55]     # Placeholder
}

df = pd.DataFrame(results_data)
df



## 2. Comparing with mAP



In [None]:
df_melted = df.melt(
    id_vars="Model",
    value_vars=["mAP@0.5", "mAP@0.5:0.95"],
    var_name="Metric",
    value_name="Score"
)

plt.figure(figsize=(10, 6))
sns.barplot(data=df_melted, x="Model", y="Score", hue="Metric")
plt.title("Model Accuracy Comparison (mAP)")
plt.ylim(0, 1.0)
plt.ylabel("mAP Score")
plt.xticks(rotation=15)
plt.show()


## 3. Accuracy and Speed

These two metrics usually have to be traded off against each other, and the right balance depends on the requirements of the project. This section visualises inference speed against detection accuracy.
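
One way to make the trade-off concrete is to check which models are Pareto-optimal, meaning no other model is at least as fast and at least as accurate while being strictly better on one of the two. The sketch below assumes the placeholder numbers from the summary table; `pareto_optimal` and `df_tradeoff` are hypothetical names introduced for illustration.

```python
import pandas as pd

# Placeholder values taken from the summary table; replace with measured results.
df_tradeoff = pd.DataFrame({
    "Model": ["YOLOv8", "Faster R-CNN", "SSD"],
    "mAP@0.5:0.95": [0.75, 0.78, 0.65],
    "Inference_Speed_FPS": [45, 12, 55],
})


def pareto_optimal(frame, speed_col, acc_col):
    """Keep rows that no other row beats on both speed and accuracy."""
    keep = []
    for _, row in frame.iterrows():
        # A row is dominated if another row is no worse on both axes
        # and strictly better on at least one of them.
        dominated = (
            (frame[speed_col] >= row[speed_col])
            & (frame[acc_col] >= row[acc_col])
            & ((frame[speed_col] > row[speed_col]) | (frame[acc_col] > row[acc_col]))
        ).any()
        keep.append(not dominated)
    return frame[pd.Series(keep, index=frame.index)]


front = pareto_optimal(df_tradeoff, "Inference_Speed_FPS", "mAP@0.5:0.95")
print(front["Model"].tolist())
```

With the placeholder numbers all three models sit on the front, since each one is either faster or more accurate than the others; real measurements may single out a dominated model.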

In [None]:
plt.figure(figsize=(8, 6))
sns.scatterplot(
    data=df,
    x="Inference_Speed_FPS",
    y="mAP@0.5:0.95",
    s=200
)

for i in range(df.shape[0]):
    plt.text(
        df.Inference_Speed_FPS[i] + 1,
        df['mAP@0.5:0.95'][i],
        df.Model[i]
    )

plt.title("Inference Speed vs Accuracy")
plt.xlabel("Inference Speed (FPS) → Faster")
plt.ylabel("Accuracy (mAP@0.5:0.95) → Better")
plt.grid(True)
plt.show()


## 4. Comparing models per Class

Comparing per-class average precision helps us identify models that perform noticeably better or worse on a particular class.
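
Alongside the bar chart, a small table can point directly at the strongest and weakest model for each class. The sketch below assumes the same placeholder AP values used in this section's plotting cell; the `per_class` and `summary` names are introduced here for illustration only.

```python
import pandas as pd

# Placeholder AP values (same as the plotting cell); replace with real results.
per_class = pd.DataFrame({
    "Class": ["Stop", "No Entry", "Pedestrian Crossing"],
    "YOLOv8": [0.91, 0.88, 0.85],
    "Faster R-CNN": [0.93, 0.90, 0.80],
    "SSD": [0.85, 0.83, 0.75],
}).set_index("Class")

# For each class: which model scores highest, which lowest, and how far apart they are.
summary = pd.DataFrame({
    "Best": per_class.idxmax(axis=1),
    "Worst": per_class.idxmin(axis=1),
    "Gap": (per_class.max(axis=1) - per_class.min(axis=1)).round(2),
})
print(summary)
```

A large gap on one class suggests that the class is worth inspecting more closely in the qualitative comparison.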

In [None]:
# TODO: Replace with real per-class AP values
per_class_data = pd.DataFrame({
    'Class': ['Stop', 'No Entry', 'Pedestrian Crossing'],
    'YOLOv8': [0.91, 0.88, 0.85],
    'Faster R-CNN': [0.93, 0.90, 0.80],
    'SSD': [0.85, 0.83, 0.75]
})

per_class_melted = per_class_data.melt(
    id_vars="Class",
    var_name="Model",
    value_name="AP"
)

plt.figure(figsize=(10, 5))
sns.barplot(
    data=per_class_melted,
    x="Class",
    y="AP",
    hue="Model"
)
plt.title("Per-Class Average Precision Comparison")
plt.ylim(0, 1.0)
plt.show()


### Training Behaviour with TensorBoard

In [None]:
%load_ext tensorboard
%tensorboard --logdir runs

### Training Metrics using Ultralytics

In [None]:
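# Import here as well so this cell also runs on its own.
import pandas as pd
import matplotlib.pyplot as plt

results_path = "runs/detect/sign_detector/results.csv"  # adjust if needed
yolo_results = pd.read_csv(results_path)
# Some Ultralytics versions pad the column names with spaces; strip them so the lookups below work.
yolo_results.columns = yolo_results.columns.str.strip()
yolo_results.tail()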

plt.figure(figsize=(10,4))
plt.plot(yolo_results['epoch'], yolo_results['metrics/mAP50(B)'], label='mAP@0.5')
plt.plot(yolo_results['epoch'], yolo_results['metrics/mAP50-95(B)'], label='mAP@0.5:0.95')
plt.xlabel("Epoch")
plt.ylabel("Score")
plt.title("YOLOv8 Validation mAP Over Epochs")
plt.legend()
plt.show()


## 5. Qualitative Comparison


Apart from the quantitative analysis, we can also compare the models qualitatively on test images, for example images of traffic signs with partial occlusion or poor lighting.

Visualisations of model predictions identify:

- `Missed detections`
- `False positives`
- `Localisation errors on bounding boxes`

These cases show where a model fails outright and can also expose weaknesses in the dataset that was built.
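
The three error types can also be separated automatically before picking images to visualise. The following is a rough sketch rather than a standard evaluation protocol: it ignores class labels and confidence scores, assumes boxes in `(x1, y1, x2, y2)` format, and uses hypothetical names (`iou`, `categorise`) and illustrative thresholds (`good_iou`, `loose_iou`).

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def categorise(ground_truth, predictions, good_iou=0.5, loose_iou=0.1):
    """Label each box as matched, localisation error, false positive or missed."""
    report = {"matched": [], "localisation_error": [], "false_positive": [], "missed": []}
    unmatched = list(range(len(ground_truth)))
    for pred in predictions:
        # Greedily pair the prediction with the best remaining ground-truth box.
        best = max(unmatched, key=lambda i: iou(pred, ground_truth[i]), default=None)
        overlap = iou(pred, ground_truth[best]) if best is not None else 0.0
        if overlap >= good_iou:
            report["matched"].append(pred)
            unmatched.remove(best)
        elif overlap >= loose_iou:
            # Overlaps a sign, but the box is too far off to count as correct.
            report["localisation_error"].append(pred)
            unmatched.remove(best)
        else:
            report["false_positive"].append(pred)
    # Ground-truth boxes that no prediction claimed are missed detections.
    report["missed"] = [ground_truth[i] for i in unmatched]
    return report


# Made-up boxes for illustration only.
gt_boxes = [(10, 10, 50, 50), (100, 100, 140, 140), (200, 200, 240, 240)]
pred_boxes = [(12, 12, 50, 50), (110, 110, 160, 160), (300, 300, 320, 320)]
report = categorise(gt_boxes, pred_boxes)
for kind, boxes in report.items():
    print(kind, boxes)
```

In practice the predictions would be processed in descending confidence order and matched only against ground-truth boxes of the same class.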

## 6. Discussion and Conclusion


TODO: write the discussion and conclusion.

