# Project Notes (Sanitized for Git)

This repository contains a **sanitized** version of the Gracity Insects YOLOv8 Classification notebooks.
All tenant-specific identifiers (bucket names, namespaces, OCIDs, local absolute paths) have been replaced by placeholders.

**Author:** Cristina Varas Menadas  
**Last updated:** 2026-02-19

> To run these notebooks, set the configuration values in the first "Configuration" section of each notebook.


# Gracity Insects - 04. Evaluation & Metrics

This notebook checks:
- Accuracy on the validation set (target: >= 0.85)
- Per-class classification report and confusion matrix
- Recall for the mosquito class, if present (target: >= 0.80)
- Inference latency (p50/p95) on the local runtime


## Configuration

Update these variables for your tenancy/project.

- **Bucket**: `<BUCKET_NAME>`
- **Dataset prefix** (images): `<PROJECT_PREFIX>/v1/raw/datasets/insects_kaggle_v1/`
- **Labels prefix** (metadata/manifests): `<PROJECT_PREFIX>/v1/labels/insects_kaggle_v1/`
- **Runs prefix** (artifacts): `<PROJECT_PREFIX>/yolo/runs/insects_kaggle_v1/`

This starter project intentionally uses the **`test/`** split as the validation set, to match the current bucket structure.
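
Because the notebooks ship with placeholders such as `<BUCKET_NAME>`, a quick check before running can catch values that were never filled in. The following is a minimal sketch and not part of the original notebooks; the helper name `find_unset_placeholders` and the keys of the `config` dictionary are hypothetical.

```python
import re
from typing import Dict, List

# Matches sanitized placeholders such as <BUCKET_NAME> or <PROJECT_PREFIX>.
PLACEHOLDER_RE = re.compile(r"<[A-Z][A-Z0-9_]*>")


def find_unset_placeholders(config: Dict[str, str]) -> List[str]:
    """Return the keys whose values still contain a <PLACEHOLDER> token."""
    return [key for key, value in config.items() if PLACEHOLDER_RE.search(value)]


# Hypothetical configuration mirroring the values listed above.
config = {
    "bucket": "<BUCKET_NAME>",
    "dataset_prefix": "<PROJECT_PREFIX>/v1/raw/datasets/insects_kaggle_v1/",
    "labels_prefix": "<PROJECT_PREFIX>/v1/labels/insects_kaggle_v1/",
    "runs_prefix": "<PROJECT_PREFIX>/yolo/runs/insects_kaggle_v1/",
}

unset = find_unset_placeholders(config)
if unset:
    print("Still using placeholders:", ", ".join(unset))
```

With the sanitized defaults, all four keys are reported; once real values are set, the list should come back empty.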

## 4.1 Imports

In [None]:
from __future__ import annotations

import time
from pathlib import Path
from typing import List, Tuple

import numpy as np
import pandas as pd
from ultralytics import YOLO
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
import matplotlib.pyplot as plt

## 4.2 Paths

In [None]:
DATASET_ROOT: str = "<LOCAL_PATH> Gracity/gracity-insects-yolo-cls/outputs/dataset"  # <-- update
VAL_DIR = Path(DATASET_ROOT) / "test"

RUN_DIR: str = "./runs"
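RUN_NAME: str = ""  # <-- set your run name
assert RUN_NAME, "Set RUN_NAME to the training run folder under RUN_DIR"
WEIGHTS_PATH: Path = Path(RUN_DIR) / RUN_NAME / "weights" / "best.pt"

assert VAL_DIR.exists(), f"Validation directory not found: {VAL_DIR}"
assert WEIGHTS_PATH.exists(), f"Weights file not found: {WEIGHTS_PATH}"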

## 4.3 Load validation set

In [None]:
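IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}  # assumed image types; adjust as needed

def iter_images(val_dir: Path) -> List[Tuple[Path, str]]:
    """Collect (image_path, class_name) pairs from one sub-folder per class."""
    items: List[Tuple[Path, str]] = []
    for class_dir in sorted(p for p in val_dir.iterdir() if p.is_dir()):
        for img_path in sorted(class_dir.iterdir()):
            # Skip non-image files (e.g. .DS_Store) that would break prediction.
            if img_path.is_file() and img_path.suffix.lower() in IMAGE_EXTS:
                items.append((img_path, class_dir.name))
    return items

val_items = iter_images(VAL_DIR)
assert val_items, f"No images found under {VAL_DIR}"
len(val_items), val_items[0]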
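
Since accuracy and per-class recall are both sensitive to class imbalance, it can help to look at how many validation images each class has before predicting. The sketch below assumes the same `(path, label)` tuple shape that `iter_images` returns; the helper name `count_per_class` and the example paths and class names are hypothetical.

```python
from collections import Counter
from pathlib import Path
from typing import Dict, List, Tuple


def count_per_class(items: List[Tuple[Path, str]]) -> Dict[str, int]:
    """Count how many (path, label) pairs belong to each class label."""
    return dict(Counter(label for _, label in items))


# Hypothetical items in the same (path, label) shape that iter_images returns.
example_items = [
    (Path("test/Mosquito/a.jpg"), "Mosquito"),
    (Path("test/Mosquito/b.jpg"), "Mosquito"),
    (Path("test/Bee/c.jpg"), "Bee"),
]

for label, n in sorted(count_per_class(example_items).items()):
    print(f"{label}: {n}")
```

In the notebook, the same call could be made as `count_per_class(val_items)`.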

## 4.4 Predict + accuracy

In [None]:
model = YOLO(str(WEIGHTS_PATH))

y_true: List[str] = []
y_pred: List[str] = []

for img_path, label in val_items:
    out = model.predict(str(img_path), verbose=False)
    probs = out[0].probs
    pred_name = model.names[int(probs.top1)]
    y_true.append(label)
    y_pred.append(pred_name)

acc = accuracy_score(y_true, y_pred)
print("Validation accuracy:", acc)
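
The comparison above only works if the folder names used as true labels match the class names stored in the model exactly. A mismatch such as `MOS` versus `Mosquito` would count every image of that class as an error. The following is a small sketch for spotting that case; the helper name `label_mismatches` is hypothetical, and it assumes `model.names` is a dict from index to name, so that `model.names.values()` could be passed as `model_names`.

```python
from typing import Iterable, Set, Tuple


def label_mismatches(
    folder_labels: Iterable[str], model_names: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (labels only in folders, labels only in the model)."""
    folders = set(folder_labels)
    names = set(model_names)
    return folders - names, names - folders


# Hypothetical example: the folder is called "MOS" but the model says "Mosquito".
only_folders, only_model = label_mismatches(["Bee", "MOS"], ["Bee", "Mosquito"])
print("Only in folders:", sorted(only_folders))
print("Only in model:", sorted(only_model))
```

Two empty sets mean the label spaces agree and the accuracy figure can be taken at face value.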

## 4.5 Report + confusion matrix

In [None]:
labels = sorted(set(y_true) | set(y_pred))
print(classification_report(y_true, y_pred, labels=labels))

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred, labels=labels)
plt.figure(figsize=(10, 8))
plt.imshow(cm)
plt.title("Confusion Matrix (Validation)")
plt.xlabel("Predicted label")
plt.ylabel("True label")
plt.xticks(range(len(labels)), labels, rotation=90)
plt.yticks(range(len(labels)), labels)
plt.colorbar()
plt.tight_layout()
plt.show()
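
Raw counts can be hard to compare when classes have different numbers of images. One option is to normalize each row so that every cell shows the fraction of that true class. This is a minimal sketch on plain lists; the helper name `row_normalize` and the example matrix are hypothetical, and for the notebook's `cm` array something like `row_normalize(cm.tolist())` should work.

```python
from typing import List, Sequence


def row_normalize(cm: Sequence[Sequence[int]]) -> List[List[float]]:
    """Divide each row by its sum so every entry is a fraction of that true class."""
    normalized: List[List[float]] = []
    for row in cm:
        total = sum(row)
        # A class with no true samples gets a row of zeros instead of a division error.
        normalized.append([value / total if total else 0.0 for value in row])
    return normalized


# Hypothetical 2x2 matrix: rows are true classes, columns are predicted classes.
example_cm = [[8, 2], [1, 9]]
for row in row_normalize(example_cm):
    print([round(v, 2) for v in row])
```

After normalization, the diagonal of each row is that class's recall.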

## 4.6 Requirement checks

In [None]:
print("Meets accuracy >= 0.85 ?", acc >= 0.85)

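# Checked in order; a tuple keeps the choice deterministic if both names are present.
mos_label_candidates = ("Mosquito", "MOS")
true_labels = set(y_true)
mos_label = next((c for c in mos_label_candidates if c in true_labels), None)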

if mos_label is None:
    print("Mosquito class not found; skipping MOS recall.")
else:
    cm_df = pd.DataFrame(cm, index=labels, columns=labels)
    tp = cm_df.loc[mos_label, mos_label]
    fn = cm_df.loc[mos_label, :].sum() - tp
    recall_mos = float(tp / (tp + fn)) if (tp + fn) > 0 else 0.0
    print(f"Recall({mos_label}) =", recall_mos)
    print("Meets MOS recall >= 0.80 ?", recall_mos >= 0.80)
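
As a cross-check on the confusion-matrix arithmetic, recall for one class can also be computed directly from the label lists as TP / (TP + FN). The sketch below is an illustration only; the helper name `recall_for_class` and the example labels are hypothetical.

```python
from typing import Sequence


def recall_for_class(y_true: Sequence[str], y_pred: Sequence[str], target: str) -> float:
    """Recall = TP / (TP + FN) for one class; 0.0 if the class never occurs in y_true."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == target and p == target)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == target and p != target)
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical labels: 4 true mosquitoes, 3 of them predicted correctly.
y_true_example = ["Mosquito", "Mosquito", "Mosquito", "Mosquito", "Bee"]
y_pred_example = ["Mosquito", "Mosquito", "Mosquito", "Bee", "Mosquito"]
print(recall_for_class(y_true_example, y_pred_example, "Mosquito"))  # 0.75
```

Note that the false positive in the last pair (a bee predicted as a mosquito) does not affect recall; it would lower precision instead.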

## 4.7 Latency p50/p95 (local)

In [None]:
import statistics

# End-to-end timing: each call includes reading the image from disk plus pre/post-processing.
sample_paths = [p for p, _ in val_items[:50]]
lat_ms: List[float] = []

for p in sample_paths:
    t0 = time.perf_counter()
    _ = model.predict(str(p), verbose=False)
    lat_ms.append((time.perf_counter() - t0) * 1000.0)

p50 = statistics.median(lat_ms)
p95 = float(np.percentile(lat_ms, 95))
print(f"Latency ms p50: {p50:.1f}")
print(f"Latency ms p95: {p95:.1f}")
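
If this cell is run on a freshly loaded model, the first few calls may be slower than the rest and can inflate p95. A possible variation, sketched below with only the standard library, discards a few warm-up calls before timing. The helper names `percentile` and `time_calls_ms`, the `warmup` parameter, and the stand-in workload are all hypothetical; the percentile uses linear interpolation between the two nearest ranks, which is also NumPy's default method.

```python
import time
from typing import Callable, List, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Percentile (q in 0..100) with linear interpolation between nearest ranks."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def time_calls_ms(fn: Callable[[], object], n: int, warmup: int = 3) -> List[float]:
    """Time n calls of fn in milliseconds after discarding warm-up calls."""
    for _ in range(warmup):
        fn()
    timings: List[float] = []
    for _ in range(n):
        t0 = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - t0) * 1000.0)
    return timings


# Stand-in workload; in the notebook this would wrap a model.predict call.
timings = time_calls_ms(lambda: sum(range(10_000)), n=20)
print(f"p50: {percentile(timings, 50):.3f} ms, p95: {percentile(timings, 95):.3f} ms")
```

With only 50 samples, p95 is determined by the two or three slowest calls, so it may vary noticeably between runs.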