# YOLOv8x: Fine-tuning and Evaluation

This notebook demonstrates how to:
1. Install required dependencies
2. Prepare the dataset for training
3. Fine-tune YOLOv8x on a custom dataset
4. Run inference on test images
5. Calculate and visualize evaluation metrics
6. Export the model for deployment

## 1. Install Required Dependencies

In [None]:
# Install ultralytics package for YOLOv8
!pip install ultralytics
!pip install opencv-python matplotlib seaborn

# Check CUDA availability 
import torch
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"CUDA device: {torch.cuda.get_device_name(0)}")

PyTorch version: 2.7.0+cu126
CUDA available: False


## 2. Import Required Libraries

In [18]:
import os
import yaml
import random
import numpy as np
import cv2
import matplotlib.pyplot as plt
import seaborn as sns
import torch  # Needed for the seeding calls below
from ultralytics import YOLO
from IPython.display import display, Image
import pandas as pd
from tqdm.notebook import tqdm
from pathlib import Path

# Set random seed for reproducibility
random.seed(42)
np.random.seed(42)
torch.manual_seed(42)
if torch.cuda.is_available():
    torch.cuda.manual_seed_all(42)

## 3. Dataset Preparation

This notebook assumes a dataset in YOLO format, split into train/, val/, and test/ directories:
- an images/ folder in each split containing the images
- a labels/ folder in each split containing the corresponding labels in YOLO format (one .txt file per image)
- A YAML configuration file describing the classes and dataset paths

If your dataset is structured differently, you'll need to adjust this section accordingly.
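
As a quick illustration of the label format, here is a minimal sketch that converts one detection label line into a pixel-space box. It assumes the usual YOLO detection layout of `class cx cy w h`, with coordinates normalized to the image size. The function name `yolo_to_pixel_box` is hypothetical and not part of any library.

```python
def yolo_to_pixel_box(label_line, img_width, img_height):
    """Convert 'class cx cy w h' (normalized) to (class_id, x1, y1, x2, y2) in pixels."""
    parts = label_line.split()
    if len(parts) != 5:
        raise ValueError(f"Expected 5 fields, got {len(parts)}: {label_line!r}")
    class_id = int(parts[0])
    cx, cy, w, h = (float(v) for v in parts[1:])
    # The label stores the box center and size; corners are center -/+ half the size
    x1 = (cx - w / 2) * img_width
    y1 = (cy - h / 2) * img_height
    x2 = (cx + w / 2) * img_width
    y2 = (cy + h / 2) * img_height
    return class_id, x1, y1, x2, y2


print(yolo_to_pixel_box("0 0.5 0.5 0.25 0.5", 640, 480))
# (0, 240.0, 120.0, 400.0, 360.0)
```

Drawing a few converted boxes on their images is a cheap way to catch swapped axes or unnormalized coordinates before training.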

In [19]:
# Define dataset paths - customize these for your specific project
DATASET_DIR = "../dataset_split"  # Change this!
TRAIN_DIR = os.path.join(DATASET_DIR, "train")
VAL_DIR = os.path.join(DATASET_DIR, "val")
TEST_DIR = os.path.join(DATASET_DIR, "test")

# # Create dataset configuration YAML
# dataset_config = {
#     'path': DATASET_DIR,
#     'train': os.path.relpath(TRAIN_DIR, DATASET_DIR),
#     'val': os.path.relpath(VAL_DIR, DATASET_DIR),
#     'test': os.path.relpath(TEST_DIR, DATASET_DIR),
#     'names': {
#         # Add your class names and indices here
#         # For example:
#         # 0: 'car',
#         # 1: 'truck',
#         # 2: 'bus',
#         # ...
#     }
# }

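# Path to the dataset configuration YAML. This notebook expects the file to exist already;
# uncomment the block above and the lines below to generate it instead.
yaml_path = os.path.join(DATASET_DIR, "data.yaml")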
# with open(yaml_path, 'w') as file:
#     yaml.dump(dataset_config, file)

# print(f"Dataset configuration saved to: {yaml_path}")
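
Before training, it can help to check that images and label files pair up. The sketch below is one way to do that, assuming each split directory holds `images/` and `labels/` subfolders as described above. The helper name `find_unpaired` and the set of image suffixes are assumptions made for illustration.

```python
from pathlib import Path

# Assumed image extensions; extend this set if your dataset uses other formats
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_unpaired(split_dir):
    """Return (images without a label file, label files without an image) as sorted stem lists."""
    split_dir = Path(split_dir)
    image_stems = {
        p.stem for p in (split_dir / "images").glob("*")
        if p.suffix.lower() in IMAGE_SUFFIXES
    }
    label_stems = {p.stem for p in (split_dir / "labels").glob("*.txt")}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)
```

Calling `find_unpaired(TRAIN_DIR)` should return two empty lists for a clean split. An image without a label file is not necessarily an error, since it may be an intentional background image, but a label file without an image usually is.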

## 4. Fine-tuning YOLOv8x Model

Now we'll load a pre-trained YOLOv8x model and fine-tune it on our custom dataset.

In [None]:
# Load pre-trained YOLOv8x model
model = YOLO('yolov8x.pt')

# Define training hyperparameters optimized for small dataset (~400 images)
hyperparameters = {
    'epochs': 100,          # More epochs for small dataset
    'batch': 8,             # Smaller batch size
    'imgsz': 640,           # Image size
    'patience': 20,         # Increased patience for early stopping
    'device': 0 if torch.cuda.is_available() else 'cpu',  # First GPU if available, otherwise CPU
    'workers': 4,           # Reduced worker threads
    'optimizer': 'AdamW',   # Optimizer
    'lr0': 0.0005,          # Lower initial learning rate
    'lrf': 0.01,            # Final learning rate factor (final LR = lr0 * lrf)
    'momentum': 0.937,      # Momentum (used as beta1 by AdamW)
    'weight_decay': 0.001,  # Increased weight decay to prevent overfitting
    'warmup_epochs': 5.0,   # Longer warmup
    'warmup_momentum': 0.8, # Warmup momentum
    'warmup_bias_lr': 0.1,  # Warmup bias lr
    'box': 7.5,             # Box loss gain
    'cls': 0.5,             # Class loss gain
    'hsv_h': 0.015,         # Image HSV-Hue augmentation
    'hsv_s': 0.7,           # Image HSV-Saturation augmentation
    'hsv_v': 0.4,           # Image HSV-Value augmentation
    'translate': 0.2,       # Increased translation for better augmentation
    'scale': 0.6,           # Increased scale variation
    'fliplr': 0.5,          # Image flip left-right probability
    'flipud': 0.2,          # Add up-down flipping
    'mosaic': 1.0,          # Maximize mosaic augmentation
    'mixup': 0.15,          # Add mixup augmentation
    'copy_paste': 0.1,      # Add copy-paste augmentation
}

# Create model results directory
results_dir = os.path.join(os.getcwd(), "yolov8x_results")
os.makedirs(results_dir, exist_ok=True)

# Train the model
results = model.train(
    data=yaml_path,
    project=results_dir,
    name='fine_tuned_model',
    exist_ok=True,
    **hyperparameters
)

print(f"Training completed. Model saved to: {os.path.join(results_dir, 'fine_tuned_model')}")

New https://pypi.org/project/ultralytics/8.3.155 available 😃 Update with 'pip install -U ultralytics'
Ultralytics 8.3.153 🚀 Python-3.12.3 torch-2.7.0+cu126 CPU (11th Gen Intel Core(TM) i7-1165G7 2.80GHz)
engine/trainer: agnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=../dataset_split/data.yaml, degrees=0.0, deterministic=True, device=cpu, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=30, erasing=0.4, exist_ok=True, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.001, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8x.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=fine_tuned_model, nbs=

100%|██████████| 755k/755k [00:00<00:00, 2.21MB/s]

Overriding model.yaml nc=80 with nc=1

                   from  n    params  module                                       arguments                     
  0                  -1  1      2320  ultralytics.nn.modules.conv.Conv             [3, 80, 3, 2]                 
  1                  -1  1    115520  ultralytics.nn.modules.conv.Conv             [80, 160, 3, 2]               
  2                  -1  3    436800  ultralytics.nn.modules.block.C2f             [160, 160, 3, True]           
  3                  -1  1    461440  ultralytics.nn.modules.conv.Conv             [160, 320, 3, 2]              
  4                  -1  6   3281920  ultralytics.nn.modules.block.C2f             [320, 320, 6, True]           
  5                  -1  1   1844480  ultralytics.nn.modules.conv.Conv             [320, 640, 3, 2]              
  6                  -1  6  13117440  ultralytics.nn.modules.block.C2f             [640, 640, 6, True]           
  7                  -1  1   3687680  ultralytics.nn.modules.conv.Conv             [640, 640, 3, 2]              
  8                  -1  3   6969600  ultralytics.nn.modules.block.C2f             [640,

train: Scanning /home/mayank/Vault/work_space/AIMS/Summer Project/Traffic_Flow_Analysis/dataset_split/train/labels... 438 images, 0 backgrounds, 0 corrupt: 100%|██████████| 438/438 [00:00<00:00, 2525.91it/s]

train: New cache created: /home/mayank/Vault/work_space/AIMS/Summer Project/Traffic_Flow_Analysis/dataset_split/train/labels.cache
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 167.8±47.9 MB/s, size: 60.8 KB)

val: Scanning /home/mayank/Vault/work_space/AIMS/Summer Project/Traffic_Flow_Analysis/dataset_split/val/labels... 93 images, 0 backgrounds, 0 corrupt: 100%|██████████| 93/93 [00:00<00:00, 2255.68it/s]

val: New cache created: /home/mayank/Vault/work_space/AIMS/Summer Project/Traffic_Flow_Analysis/dataset_split/val/labels.cache

Plotting labels to /home/mayank/Vault/work_space/AIMS/Summer Project/Traffic_Flow_Analysis/comparative_study/yolov8x_results/fine_tuned_model/labels.jpg... 
optimizer: AdamW(lr=0.001, momentum=0.937) with parameter groups 97 weight(decay=0.0), 104 weight(decay=0.0005), 103 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 0 dataloader workers
Logging results to /home/mayank/Vault/work_space/AIMS/Summer Project/Traffic_Flow_Analysis/comparative_study/yolov8x_results/fine_tuned_model
Starting training for 30 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size

  0%|          | 0/28 [00:00<?, ?it/s]
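
To see how `lr0` and `lrf` interact, the sketch below computes the learning rate at a few epochs. It assumes a linear decay from `lr0` down to `lr0 * lrf` (the behaviour we would expect with `cos_lr=False`) and ignores the warmup phase. The function `linear_lr` is a hypothetical helper, not an Ultralytics API.

```python
def linear_lr(epoch, epochs, lr0, lrf):
    """Learning rate at a given epoch, decaying linearly from lr0 to lr0 * lrf."""
    # Scale factor goes from 1.0 at epoch 0 down to lrf at the final epoch
    factor = max(1 - epoch / epochs, 0) * (1.0 - lrf) + lrf
    return lr0 * factor


# Values taken from the hyperparameters dictionary in this section
for epoch in (0, 50, 100):
    print(epoch, linear_lr(epoch, epochs=100, lr0=0.0005, lrf=0.01))
```

With `lrf=0.01`, the final learning rate is one hundredth of the initial one, so most of the fine adjustment happens in the last epochs.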

## 5. Model Inference and Evaluation

Now, let's evaluate the fine-tuned model on the test set and calculate performance metrics.

In [None]:
# Load the fine-tuned model
fine_tuned_model_path = os.path.join(results_dir, 'fine_tuned_model', 'weights', 'best.pt')
model = YOLO(fine_tuned_model_path)

# Run validation on the test set
test_results = model.val(
    data=yaml_path,
    split='test',  # Use the test split
    imgsz=640,
    batch=16,
    verbose=True,
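    # Confidence threshold. Ultralytics validates at 0.001 by default; a higher
    # value drops low-confidence detections and therefore changes the reported mAP.
    conf=0.25,
    iou=0.5,      # IoU threshold used for NMS (not the IoU at which mAP is computed)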
    project=results_dir,
    name='evaluation',
    exist_ok=True
)

print("Test results summary:")
print(f"mAP50: {test_results.box.map50:.5f}")
print(f"mAP50-95: {test_results.box.map:.5f}")
print(f"Precision: {test_results.box.mp:.5f}")
print(f"Recall: {test_results.box.mr:.5f}")
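
The metrics above all rest on intersection over union (IoU): for mAP50, a detection counts as correct when its IoU with a ground-truth box of the same class is at least 0.5. Here is a minimal sketch of IoU for two boxes in `(x1, y1, x2, y2)` format. The function `box_iou` is written for illustration and is not the routine Ultralytics uses internally.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region; zero if the boxes do not overlap
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes offset by half their width share one third of their union
print(box_iou((0, 0, 10, 10), (5, 0, 15, 10)))
```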

## 6. Detailed Analysis per Class

Let's analyze the model performance for each class separately.

In [None]:
# Get class-wise metrics from the validation results
class_map = test_results.names  # Class index to name mapping

# Per-class metrics live on test_results.box. The arrays only cover classes
# that appear in the evaluated split; ap_class_index lists their class indices.
class_precisions = test_results.box.p  # Class precisions
class_recalls = test_results.box.r     # Class recalls
ap50_per_class = test_results.box.ap50  # AP50 per class
ap_per_class = test_results.box.ap      # AP50-95 per class
evaluated_classes = [int(i) for i in test_results.box.ap_class_index]

# Create a DataFrame for better visualization
metrics_df = pd.DataFrame({
    'Class': [class_map[i] for i in evaluated_classes],
    'AP50': ap50_per_class,
    'AP50-95': ap_per_class,
    'Precision': class_precisions,
    'Recall': class_recalls
})

display(metrics_df)

# Plot AP50 for each class
plt.figure(figsize=(12, 6))
sns.barplot(x='Class', y='AP50', data=metrics_df)
plt.title('AP50 for Each Class')
plt.xticks(rotation=45, ha='right')
plt.tight_layout()
plt.show()
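
Precision and recall are often combined into a single F1 score per class, which makes it easier to rank classes by overall detection quality. A minimal sketch follows, assuming plain Python lists of per-class precision and recall. The helper `f1_scores` and the example values are hypothetical.

```python
def f1_scores(precisions, recalls, eps=1e-16):
    """Harmonic mean of precision and recall for each class."""
    # eps avoids division by zero when both precision and recall are 0
    return [2 * p * r / (p + r + eps) for p, r in zip(precisions, recalls)]


# Hypothetical values for two classes
print(f1_scores([0.9, 0.5], [0.6, 0.5]))
```

In this notebook the inputs could be `class_precisions.tolist()` and `class_recalls.tolist()`.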

## 7. Confusion Matrix

The confusion matrix helps us see how well the model differentiates between different classes.

In [None]:
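# Plot confusion matrix
# In Ultralytics, matrix rows are predicted classes and columns are true classes,
# and detection tasks add a final 'background' row/column.
conf_matrix = test_results.confusion_matrix.matrix
cm_labels = list(class_map.values())
if conf_matrix.shape[0] == len(cm_labels) + 1:
    cm_labels = cm_labels + ['background']
col_sums = conf_matrix.sum(axis=0, keepdims=True)
plt.figure(figsize=(12, 10))
sns.heatmap(
    conf_matrix / np.maximum(col_sums, 1e-9),  # Normalize each true-class column, avoiding division by zero
    annot=True,
    fmt='.2f',
    cmap='Blues',
    xticklabels=cm_labels,
    yticklabels=cm_labels
)
plt.xlabel('True')
plt.ylabel('Predicted')
plt.title('Normalized Confusion Matrix')
plt.tight_layout()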
plt.show()
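
A confusion matrix also yields per-class precision and recall directly. The sketch below assumes the matrix is indexed as `matrix[predicted][true]` with a final "background" row and column, which is our understanding of the Ultralytics layout; verify this against your installed version. The helper name and the example counts are hypothetical.

```python
def per_class_from_confusion(matrix, num_classes):
    """Return [(precision, recall), ...] per class from matrix[predicted][true] counts."""
    stats = []
    for c in range(num_classes):
        tp = matrix[c][c]
        predicted = sum(matrix[c])              # Everything predicted as class c
        actual = sum(row[c] for row in matrix)  # Everything that truly is class c
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        stats.append((precision, recall))
    return stats


# Hypothetical single-class example: 80 correct detections,
# 10 false alarms (true background), 20 missed objects (predicted background)
example = [[80, 10],
           [20, 0]]
print(per_class_from_confusion(example, num_classes=1))
```

These values will not necessarily match `test_results.box.p` and `test_results.box.r`, which may be computed at a different confidence threshold than the one used to build the matrix.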

## 8. Visualizing Detection Results on Test Images

Let's visualize some predictions on test images:

In [None]:
# Get list of test images
test_images_dir = os.path.join(TEST_DIR, 'images')
test_images = list(Path(test_images_dir).glob('*.jpg')) + list(Path(test_images_dir).glob('*.png'))
test_images = [str(img) for img in test_images]

# Select random images for visualization
if len(test_images) > 0:
    sample_images = random.sample(test_images, min(5, len(test_images)))
    
    for img_path in sample_images:
        # Run inference
        results = model(img_path, conf=0.25)
        
        # Display results
        for result in results:
            fig, ax = plt.subplots(1, 1, figsize=(12, 9))
            img = result.orig_img.copy()  # Copy so drawing does not alter the stored original image
            
            # Plot detections
            for box, conf, cls in zip(result.boxes.xyxy, result.boxes.conf, result.boxes.cls):
                x1, y1, x2, y2 = box.cpu().numpy().astype(int)
                class_id = int(cls.item())
                class_name = class_map[class_id]
                confidence = conf.item()
                
                # Draw bounding box
                cv2.rectangle(img, (x1, y1), (x2, y2), (0, 255, 0), 2)
                
                # Add label
                label = f"{class_name}: {confidence:.2f}"
                cv2.putText(img, label, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 2)
            
            ax.imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
            ax.set_title(f"Predictions on {os.path.basename(img_path)}")
            ax.axis("off")
            plt.tight_layout()
            plt.show()
else:
    print("No test images found")

## 9. Precision-Recall Curves

Let's plot precision-recall curves for each class:

In [None]:
# Create P-R curve plots for each class
plt.figure(figsize=(12, 8))

# Ultralytics 8.x exposes curve data through box.curves_results; its first entry is
# [recall grid, per-class precision values, x-label, y-label]. The attribute is
# version-dependent, so fall back to single operating points if it is missing.
evaluated_classes = [int(i) for i in test_results.box.ap_class_index]
try:
    recall_grid, precision_curves = test_results.box.curves_results[0][:2]
except (AttributeError, IndexError, TypeError):
    print("Could not access PR curves directly, plotting one precision/recall point per class instead")
    recall_grid, precision_curves = None, None

# Plot one curve (or one point) per evaluated class
for row, class_id in enumerate(evaluated_classes):
    if precision_curves is not None:
        plt.plot(recall_grid, precision_curves[row], label=f'{class_map[class_id]}')
    else:
        plt.scatter(class_recalls[row], class_precisions[row], label=f'{class_map[class_id]}')

plt.xlabel('Recall')
plt.ylabel('Precision')
plt.title('Precision-Recall Curves')
plt.xlim([0, 1])
plt.ylim([0, 1])
plt.legend(loc='lower left')
plt.grid(True)
plt.show()
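
Average precision (AP) is the area under a precision-recall curve like the ones plotted above. The sketch below computes it with all-point interpolation over a monotone precision envelope. Ultralytics uses its own interpolation scheme, so treat this as an illustration of the idea rather than a reproduction of its numbers. The function `average_precision` is a hypothetical helper.

```python
def average_precision(recalls, precisions):
    """Area under the PR curve; recalls must be sorted in ascending order."""
    # Pad the curve so it spans recall 0 to 1
    r = [0.0] + list(recalls) + [1.0]
    p = [1.0] + list(precisions) + [0.0]
    # Precision envelope: make precision non-increasing as recall grows
    for i in range(len(p) - 2, -1, -1):
        p[i] = max(p[i], p[i + 1])
    # Sum the rectangles between consecutive recall values
    return sum((r[i + 1] - r[i]) * p[i + 1] for i in range(len(r) - 1))


# Precision 1.0 up to recall 0.5, then 0.5 up to recall 1.0
print(average_precision([0.5, 1.0], [1.0, 0.5]))
# 0.75
```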

## 10. Export Model for Deployment

Let's save our fine-tuned model in different formats for deployment.

In [None]:
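# Export model to different formats
import shutil

export_path = os.path.join(results_dir, "exported_models")
os.makedirs(export_path, exist_ok=True)

# model.export() writes its output next to the weights file and returns the
# output path, so copy each exported file into export_path afterwards.

# Export to ONNX format
onnx_file = model.export(format="onnx", imgsz=640)
shutil.copy(onnx_file, export_path)

# Export to TorchScript format
torchscript_file = model.export(format="torchscript", imgsz=640)
shutil.copy(torchscript_file, export_path)

print(f"Models exported to {export_path}")
print("Available formats:")
for file in os.listdir(export_path):
    print(f"- {file}")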

## 11. Summary and Conclusion

We have successfully:
1. Fine-tuned YOLOv8x on a custom dataset
2. Evaluated its performance on the test set
3. Analyzed per-class metrics and visualized results
4. Exported the model for deployment

Key metrics:
- mAP50: Mean average precision (AP averaged over classes) with detections matched at an IoU threshold of 0.5
- mAP50-95: Mean average precision averaged over IoU thresholds from 0.5 to 0.95 in steps of 0.05
- Precision: The fraction of predicted detections that are correct
- Recall: How many of the ground truth objects are detected
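
As a worked example of the last two definitions, the sketch below computes precision and recall from counts of true positives, false positives, and false negatives. The function name and the counts are hypothetical.

```python
def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# 8 correct detections, 2 false alarms, 4 missed objects
precision, recall = precision_recall(tp=8, fp=2, fn=4)
print(f"Precision: {precision:.2f}, Recall: {recall:.2f}")
# Precision: 0.80, Recall: 0.67
```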

To improve results further, consider:
- Increasing the number of training epochs
- Adding more training data or using data augmentation
- Adjusting hyperparameters like learning rate and batch size
- Using different model variants (YOLOv8s, YOLOv8m, etc.)