YOLO Model Configuration and Setup

This notebook defines the YOLO model architecture and training hyperparameters for the Pascal VOC 2012 dataset. It is configuration-only: it prepares all parameters for training in notebook 03 and runs no training itself.

Configuration Coverage:
1. YOLO model selection (v8n - nano variant)
2. Dataset and class definition (3 classes: person, car, dog from Pascal VOC)
3. data.yaml configuration with normalized bounding boxes
4. Model architecture parameters (input size 416x416, batch size 16)
5. Training hyperparameters optimized for large dataset (50 epochs)
6. Reproducibility settings (seed 42)

Dataset: Pascal VOC 2012 (~3000-5000 images after filtering)
Training Time: 30-60 minutes on GPU, 3-4 hours on CPU
Output: Model ready for inference in notebook 04

This notebook is a prerequisite for notebook 03_training.ipynb.


In [None]:
import yaml
from pathlib import Path

# Reproducibility
SEED = 42

# Project structure
PROJECT_ROOT = Path('../')
DATA_DIR = PROJECT_ROOT / 'data'
MODELS_DIR = PROJECT_ROOT / 'models'
MODELS_DIR.mkdir(parents=True, exist_ok=True)

print("YOLO MODEL CONFIGURATION")
print("=" * 60)
print(f"Project Root: {PROJECT_ROOT}")
print(f"Data Dir: {DATA_DIR}")
print(f"Models Dir: {MODELS_DIR}")
print("=" * 60)

YOLO TRAINING PIPELINE
Device: cpu
Classes: 3 (person, car, dog)
Project Root: C:\Users\mlata\Documents\iajordy2
Data Dir: C:\Users\mlata\Documents\iajordy2\data
Models Dir: C:\Users\mlata\Documents\iajordy2\models


Stage 1: Environment Setup and Paths

This stage initializes the Python environment by setting up file paths and directories required for the training configuration.

Configuration components:
- SEED: Fixed random seed (42) for reproducibility across runs
- PROJECT_ROOT: Base directory containing all project files
- DATA_DIR: Location of prepared YOLO dataset from notebook 01
- MODELS_DIR: Directory where trained models will be saved

All paths use pathlib.Path for cross-platform compatibility (Windows, Linux, macOS).
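
The path setup above can be sketched as a small helper. This is only an illustration under stated assumptions: `build_paths` and `missing_split_dirs` are hypothetical names that do not appear in the project, and the `images/<split>` layout is taken from the data.yaml written later in this notebook. The demo runs in a temporary directory so it does not touch the real project.

```python
from pathlib import Path
import tempfile


def build_paths(project_root: Path) -> dict:
    """Return resolved project paths, creating the models directory if needed."""
    root = project_root.resolve()  # resolve() collapses '..' segments
    paths = {
        "root": root,
        "data": root / "data",
        "models": root / "models",
    }
    paths["models"].mkdir(parents=True, exist_ok=True)
    return paths


def missing_split_dirs(data_dir: Path, splits=("train", "val", "test")) -> list:
    """List the images/<split> folders that do not exist yet."""
    return [s for s in splits if not (data_dir / "images" / s).is_dir()]


# Demonstration in a throwaway directory (hypothetical layout).
with tempfile.TemporaryDirectory() as tmp:
    demo_paths = build_paths(Path(tmp) / "project")
    (demo_paths["data"] / "images" / "train").mkdir(parents=True)
    demo_missing = missing_split_dirs(demo_paths["data"])
    demo_models_exists = demo_paths["models"].is_dir()

print(demo_missing)
```

Checking for missing split folders before writing any configuration makes it easier to notice that notebook 01 has not been run yet.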

In [None]:
# Model selection
MODEL_NAME = 'yolov8n'
PRETRAINED_WEIGHTS = 'yolov8n.pt'

print("\n[1] Model Selection")
print("-" * 60)
print(f"Model: {MODEL_NAME}")
print("Pretrained: COCO")
print(f"Weights: {PRETRAINED_WEIGHTS}")
print("Architecture: Nano (efficient, fast)")
print("Use case: Object detection with 3 classes")

Ultralytics 8.4.9  Python-3.10.0 torch-2.10.0+cpu CPU (AMD Ryzen 9 5900XT 16-Core Processor)
[34m[1mengine\trainer: [0magnostic_nms=False, amp=True, angle=1.0, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, compile=False, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=C:\Users\mlata\Documents\iajordy2\data\data.yaml, degrees=0.0, deterministic=True, device=cpu, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, end2end=None, epochs=50, erasing=0.4, exist_ok=True, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=416, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8n.pt, momentum=0.937, mosaic=1.0, multi_scale=0.0, name=yolo_run, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, o

2026/02/01 12:55:42 INFO mlflow.tracking.fluent: Experiment with name 'C:\Users\mlata\Documents\iajordy2\models' does not exist. Creating a new experiment.


[34m[1mMLflow: [0mlogging run_id(a4667a31980e41ea9f39506ccc9d600f) to file:///mlruns
[34m[1mMLflow: [0mdisable with 'yolo settings mlflow=False'
Image sizes 416 train, 416 val
Using 0 dataloader workers
Logging results to [1mC:\Users\mlata\Documents\iajordy2\models\yolo_run[0m
Starting training for 50 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       1/50         0G      2.999      3.976      2.654         54        416: 100% ━━━━━━━━━━━━ 25/25 1.1it/s 23.8s0.9ss
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 2/2 1.5it/s 1.4s2.9s
                   all         50         99    0.00133      0.289    0.00155   0.000365

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size
       2/50         0G      2.863      3.883      2.518         50        416: 100% ━━━━━━━━━━━━ 25/25 1.1it/s 23.2s0.9ss
                 Class     Images  Instances   

Stage 2: Model Selection

This stage specifies the YOLO model architecture and pretrained weights to use.

YOLOv8 variants by size:
- yolov8n (nano): 3.2M parameters - fast and efficient
- yolov8s (small): 11.2M parameters - balanced speed/accuracy
- yolov8m (medium): 25.9M parameters - good accuracy
- yolov8l (large): 43.7M parameters - high accuracy
- yolov8x (xlarge): 68.2M parameters - best accuracy but slower

This notebook uses yolov8n because:
1. It is lightweight enough for remote environments with limited resources
2. Pretraining on COCO (80 classes) provides a good transfer-learning base
3. It will be fine-tuned on our 3-class detection task
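
To make the size trade-off concrete, the sketch below picks the largest variant that fits a parameter budget, using the parameter counts listed above. The helper `largest_variant_within` and the budget values are hypothetical and only illustrate the reasoning. They are not part of the project or of the Ultralytics API.

```python
# Parameter counts in millions, as listed in the variant table above.
YOLOV8_PARAMS_M = {
    "yolov8n": 3.2,
    "yolov8s": 11.2,
    "yolov8m": 25.9,
    "yolov8l": 43.7,
    "yolov8x": 68.2,
}


def largest_variant_within(budget_m: float) -> str:
    """Return the largest variant whose parameter count fits the budget (in millions)."""
    candidates = [(params, name) for name, params in YOLOV8_PARAMS_M.items()
                  if params <= budget_m]
    if not candidates:
        raise ValueError(f"No YOLOv8 variant has <= {budget_m}M parameters")
    # max() compares the parameter count first, so the largest fitting model wins.
    return max(candidates)[1]


small_budget_choice = largest_variant_within(5)
medium_budget_choice = largest_variant_within(30)
print(small_budget_choice, medium_budget_choice)
```

With a budget of about 5M parameters only the nano variant qualifies, which matches the choice made in this notebook.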

In [None]:
# Dataset configuration
NUM_CLASSES = 3
CLASS_NAMES = ['person', 'car', 'dog']
CLASS_MAPPING = {i: name for i, name in enumerate(CLASS_NAMES)}

print("\n[2] Dataset Configuration")
print("-" * 60)
print(f"Number of classes: {NUM_CLASSES}")
print(f"Classes: {', '.join(CLASS_NAMES)}")
print("Format: YOLO (normalized bounding boxes)")
print("Splits: train, val, test")
print("\nClass IDs:")
for class_id, class_name in CLASS_MAPPING.items():
    print(f"  {class_id}: {class_name}")

Ultralytics 8.4.9  Python-3.10.0 torch-2.10.0+cpu CPU (AMD Ryzen 9 5900XT 16-Core Processor)
Model summary (fused): 73 layers, 3,006,233 parameters, 0 gradients, 8.1 GFLOPs
[34m[1mval: [0mFast image access  (ping: 0.10.0 ms, read: 899.362.2 MB/s, size: 102.2 KB)
[K[34m[1mval: [0mScanning C:\Users\mlata\Documents\iajordy2\data\labels\val.cache... 50 images, 0 backgrounds, 0 corrupt: 100% ━━━━━━━━━━━━ 50/50  0.0s
[K                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 4/4 3.2it/s 1.3s0.6s
                   all         50         99    0.00226      0.344     0.0146    0.00425
Speed: 0.4ms preprocess, 18.0ms inference, 0.0ms loss, 2.9ms postprocess per image
Results saved to [1mC:\Users\mlata\Documents\iajordy2\runs\detect\val[0m
Validation Metrics:
  mAP50: 0.0146
  mAP50_95: 0.0043
  precision: 0.0023
  recall: 0.3438
Validation completed


Stage 3: Dataset Configuration

This stage defines the dataset composition and class mappings for the training task.

Dataset specification:
- NUM_CLASSES: 3 target object classes
- CLASS_NAMES: Names of classes (person, car, dog)
- Format: YOLO normalized bounding boxes (as prepared in notebook 01)

Class ID mapping:
- The class ID is the index of the class in the CLASS_NAMES list (0 = person, 1 = car, 2 = dog)
- It is stored in the first column of each line in the label files
- The YOLO model uses it to identify object categories during training and inference
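
As an illustration of the label format described above, the sketch below parses one YOLO label line (`class_id x_center y_center width height`, all box values normalized to 0-1) and converts it to pixel corner coordinates. The function name `parse_label_line` and the sample line are hypothetical, and the 416x416 image size is used only to match the training input size.

```python
CLASS_NAMES = ["person", "car", "dog"]


def parse_label_line(line: str, img_w: int, img_h: int) -> dict:
    """Convert one YOLO label line to a class name and a pixel-space corner box."""
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"Expected 5 fields, got {len(fields)}: {line!r}")
    class_id = int(fields[0])
    x_center, y_center, width, height = (float(v) for v in fields[1:])
    return {
        "class": CLASS_NAMES[class_id],  # class ID indexes into CLASS_NAMES
        "x_min": (x_center - width / 2) * img_w,
        "y_min": (y_center - height / 2) * img_h,
        "x_max": (x_center + width / 2) * img_w,
        "y_max": (y_center + height / 2) * img_h,
    }


# Hypothetical label: a dog centered in the image, a quarter as wide and half as tall.
box = parse_label_line("2 0.5 0.5 0.25 0.5", img_w=416, img_h=416)
print(box)
```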

In [None]:
# Create/Update data.yaml for YOLO
data_yaml_path = DATA_DIR / 'data.yaml'

data_yaml_content = {
    'path': str(DATA_DIR.resolve()),  # absolute path with '..' segments collapsed
    'train': 'images/train',
    'val': 'images/val',
    'test': 'images/test',
    'nc': NUM_CLASSES,
    'names': CLASS_NAMES
}

with open(data_yaml_path, 'w') as f:
    yaml.dump(data_yaml_content, f, default_flow_style=False, sort_keys=False)

print("\n[3] Data Configuration (data.yaml)")
print("-" * 60)
print(f"Path: {data_yaml_path}")
print("\nContent:")
with open(data_yaml_path, 'r') as f:
    print(f.read())



Model registered at: file:///c:/mlruns/980429098773043767/5a6580d8183242a39dfda7d1ce3c132f/artifacts/yolo_model
Model transitioned to Production - Version: 1
Model Registration completed


Successfully registered model 'yolo_3class_detector'.
Created version '1' of model 'yolo_3class_detector'.


Stage 4: Data Configuration File Generation

This stage creates the data.yaml file, which is the interface between the dataset and YOLO training.

data.yaml is required by YOLO and specifies:
- path: Absolute path to dataset base directory
- train: Relative path to training images
- val: Relative path to validation images
- test: Relative path to test images
- nc: Number of classes
- names: List of class names

This file is first created in notebook 01. This notebook rewrites it so that the paths and class list match the configuration defined above.
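
A quick consistency check on the configuration can catch mismatches before training starts. The sketch below works on a plain dict with the same keys as data.yaml. The helper `validate_data_config` is hypothetical, it assumes `names` is a list, and it treats `test` as optional, which is an assumption about how the file is consumed.

```python
# Keys treated as mandatory in this sketch; 'test' is assumed to be optional.
REQUIRED_KEYS = ("path", "train", "val", "nc", "names")


def validate_data_config(cfg: dict) -> list:
    """Return a list of problems found in a data.yaml-style dict (empty if none)."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in cfg]
    if "nc" in cfg and "names" in cfg:
        if cfg["nc"] != len(cfg["names"]):
            problems.append(f"nc={cfg['nc']} but {len(cfg['names'])} names given")
        if len(set(cfg["names"])) != len(cfg["names"]):
            problems.append("duplicate class names")
    return problems


good_cfg = {
    "path": "/path/to/data",  # placeholder path for illustration
    "train": "images/train",
    "val": "images/val",
    "test": "images/test",
    "nc": 3,
    "names": ["person", "car", "dog"],
}
bad_cfg = dict(good_cfg, nc=2)

good_problems = validate_data_config(good_cfg)
bad_problems = validate_data_config(bad_cfg)
print(good_problems, bad_problems)
```

In the notebook, the dict could come from `yaml.safe_load` applied to the generated file, so the check runs against what was actually written to disk.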

In [None]:
# Training hyperparameters (to be used in notebook 03)
TRAINING_CONFIG = {
    'epochs': 50,
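    'batch_size': 16,  # note: the Ultralytics trainer names this argument 'batch'
    'imgsz': 416,
    'patience': 10,
    'device': 'cuda',  # or 'cpu' if GPU not available
    'seed': SEED,
    'lr0': 0.01,  # initial learning rate
    'lrf': 0.01,  # final learning rate as a fraction of lr0 (final LR = lr0 * lrf)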
    'momentum': 0.937,
    'weight_decay': 0.0005,
    'warmup_epochs': 3.0,
    'warmup_momentum': 0.8,
    'verbose': True,
    'save': True,
    'exist_ok': True
}

print("\n[4] Training Hyperparameters")
print("-" * 60)
print("Configuration (to be used in notebook 03):")
for key, value in TRAINING_CONFIG.items():
    print(f"  {key}: {value}")

print("\n" + "=" * 60)
print("Configuration complete")
print("Ready for training in notebook 03_training.ipynb")

Stage 5: Training Hyperparameters Definition

This stage defines all hyperparameters that control the training process executed in notebook 03. The values below match TRAINING_CONFIG in the cell above.

Key hyperparameters:
- epochs: 50 complete passes through the training data
- batch_size: 16 images per gradient update
- imgsz: Input image size 416x416 pixels
- patience: Early stopping after 10 epochs without improvement
- device: GPU (cuda) or CPU for training
- seed: Fixed random seed 42 for reproducibility
- learning rate (lr0, lrf): Initial rate 0.01; lrf 0.01 is the final rate as a fraction of lr0, so the schedule decays to 0.01 x 0.01 = 0.0001
- momentum: SGD momentum 0.937 for optimization
- weight_decay: L2 regularization 0.0005 to prevent overfitting
- warmup: Gradual learning rate increase over the first 3 epochs, with warmup momentum starting at 0.8

For a lightweight academic dataset, these values can be reduced:
- Fewer epochs (for example 20) limit overfitting
- A smaller batch size (for example 8) suits a small dataset
- Shorter early stopping patience (for example 5 epochs)
- Total training time with the reduced settings: 5-15 minutes
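
The sketch below shows one way notebook 03 might consume this configuration. It rests on two assumptions. First, the trainer log earlier in this notebook reports `batch=16`, so `batch_size` is renamed to `batch` before the config is passed on. Second, the learning rate is assumed to decay linearly from `lr0` to `lr0 * lrf` (the log reports `cos_lr=False`), with warmup ignored for simplicity. The helpers `to_train_kwargs` and `linear_lr` are hypothetical names, not part of the project.

```python
# Subset of TRAINING_CONFIG relevant to this sketch.
TRAINING_CONFIG = {
    "epochs": 50,
    "batch_size": 16,
    "lr0": 0.01,
    "lrf": 0.01,
}


def to_train_kwargs(config: dict) -> dict:
    """Rename keys to the argument names shown in the trainer log (batch, not batch_size)."""
    renames = {"batch_size": "batch"}
    return {renames.get(key, key): value for key, value in config.items()}


def linear_lr(epoch: int, epochs: int, lr0: float, lrf: float) -> float:
    """Learning rate under a linear decay from lr0 down to lr0 * lrf (warmup ignored)."""
    factor = max(1 - epoch / epochs, 0) * (1.0 - lrf) + lrf
    return lr0 * factor


train_kwargs = to_train_kwargs(TRAINING_CONFIG)
lr_start = linear_lr(0, **{k: TRAINING_CONFIG[k] for k in ("epochs", "lr0", "lrf")})
lr_end = linear_lr(50, **{k: TRAINING_CONFIG[k] for k in ("epochs", "lr0", "lrf")})
print(train_kwargs)
print(lr_start, lr_end)
```

Under these assumptions the rate starts at 0.01 and ends near 0.0001, which is why `lrf` is better read as a fraction of `lr0` than as an absolute learning rate.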