## Dataset Verification

Before evaluating or refining the YOLOv8 model, we first verify that the dataset configuration file (`data.yaml`) exists and is accessible.

The `data.yaml` file is a critical component in the YOLO workflow. It defines:
- The dataset directory paths (train, validation, and test sets)
- The number of classes
- The class names used during training and evaluation
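
As a rough illustration of these three fields, the sketch below checks a parsed configuration for internal consistency. The dictionary stands in for what `yaml.safe_load` would return; its paths and class names (`plastic`, `paper`, `glass`) are hypothetical, and `check_config` is a helper introduced here, not part of Ultralytics. The key layout (`train`, `val`, `test`, `names`, optional `nc`) follows the common Ultralytics convention, and exports from other tools may differ.

```python
# Hypothetical parsed data.yaml; the paths and class names are made up for illustration.
config = {
    "path": "datasets/ecoclassifier",
    "train": "images/train",
    "val": "images/val",
    "test": "images/test",
    "nc": 3,
    "names": {0: "plastic", 1: "paper", 2: "glass"},
}


def check_config(cfg):
    """Return a list of human-readable problems found in a parsed data.yaml."""
    problems = []
    # Keys assumed to be required for training and validation.
    for key in ("train", "val", "names"):
        if key not in cfg:
            problems.append(f"missing key: {key}")
    names = cfg.get("names")
    # `names` may be a list or an index -> name mapping; len() covers both.
    if names is not None and "nc" in cfg and cfg["nc"] != len(names):
        problems.append(f"nc={cfg['nc']} but {len(names)} class names listed")
    return problems


problems = check_config(config)
print("Problems:", problems or "none")
```

A mismatch between `nc` and the number of names is worth catching early, since it usually means the config was edited by hand after the dataset was exported.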

Verifying its existence is a first sanity check. Together with the folder checks in the cell below, it helps confirm that:
- The dataset follows the expected directory layout
- Evaluation runs against the intended data
- Metrics computed later can be reproduced from a known configuration
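
One further structural check that fits here is whether every image has a matching label file. The sketch below is a minimal version of that idea, assuming the standard `images/<split>` and `labels/<split>` layout with one `.txt` label per image. The helper `find_unpaired` and the set of image suffixes are assumptions introduced for illustration, and the demo runs on a temporary directory rather than the real dataset.

```python
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}  # assumed set of image types


def find_unpaired(dataset_dir, split):
    """Return (images without a label file, label files without an image) for one split."""
    image_dir = Path(dataset_dir) / "images" / split
    label_dir = Path(dataset_dir) / "labels" / split
    image_stems = {p.stem for p in image_dir.glob("*") if p.suffix.lower() in IMAGE_SUFFIXES}
    label_stems = {p.stem for p in label_dir.glob("*.txt")}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)


# Demo on a throwaway directory so the sketch runs without the real dataset.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "images" / "train").mkdir(parents=True)
    (root / "labels" / "train").mkdir(parents=True)
    for name in ("a.jpg", "b.jpg"):
        (root / "images" / "train" / name).touch()
    (root / "labels" / "train" / "a.txt").touch()

    missing_labels, orphan_labels = find_unpaired(root, "train")
    print("Images without labels:", missing_labels)
    print("Labels without images:", orphan_labels)
```

An image without a label file is not necessarily an error, since it may be an intentional background image with no objects. A label file without an image usually does point to a problem.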

In [None]:
from pathlib import Path
import yaml

# Path to dataset config
DATASET_DIR = Path("datasets/ecoclassifier")
DATA_YAML = DATASET_DIR / "data.yaml"

print("Checking dataset structure...\n")

# 1. Check data.yaml exists
print("data.yaml exists:", DATA_YAML.exists())

# 2. Load and display data.yaml contents
if DATA_YAML.exists():
    with open(DATA_YAML, "r", encoding="utf-8") as f:
        data = yaml.safe_load(f) or {}  # an empty file parses to None

    print("\nLoaded data.yaml contents:")
    for key, value in data.items():
        print(f"{key}: {value}")

# 3. Check image and label folders
folders_to_check = [
    "images/train",
    "images/val",
    "images/test",
    "labels/train",
    "labels/val",
    "labels/test",
]

print("\nChecking required folders:")
for folder in folders_to_check:
    path = DATASET_DIR / folder
    print(f"{folder}: {'FOUND' if path.is_dir() else 'MISSING'}")

### Interpretation

Finding `data.yaml` and the expected image and label folders indicates that the dataset is organized in the layout YOLOv8 expects. It does not validate the annotations themselves, so label errors can still surface during evaluation.

With the dataset configuration checked, we can proceed to:
- Load the trained model
- Evaluate its performance using validation and test data
- Compute and visualize evaluation metrics such as precision, recall, and mAP
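
For reference, precision and recall reduce to simple ratios of detection counts. The sketch below shows that arithmetic with hypothetical counts; `precision_recall` is an illustrative helper, not the Ultralytics implementation. mAP builds on the same quantities by averaging the area under each class's precision-recall curve, which is not computed here.

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from detection counts, returning 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts for one class at a fixed IoU and confidence threshold.
p, r = precision_recall(tp=80, fp=20, fn=10)
print(f"precision={p:.3f}, recall={r:.3f}")  # precision=0.800, recall=0.889
```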

In [None]:
from pathlib import Path
from ultralytics import YOLO

# 1) Check the common location(s) where best.pt is usually saved
candidate_paths = [
    Path("runs/detect/train/weights/best.pt"),
]
best_pt = next((p for p in candidate_paths if p.is_file()), None)

# 2) If not found there, search the project for best.pt
if best_pt is None:
    matches = list(Path(".").rglob("best.pt"))
    if not matches:
        raise FileNotFoundError("Could not find best.pt anywhere in this project folder.")
    # Several runs may each have a best.pt; take the most recently modified one
    best_pt = max(matches, key=lambda p: p.stat().st_mtime)

print("✅ Found model weights at:", best_pt)

# 3) Load the model
model = YOLO(str(best_pt))
print("✅ Model loaded successfully.")
print("Model info:", getattr(model.model, "yaml", "No YAML info available"))