# Real-Time Face Detection: YOLOv5 vs YOLOv8 Fine-Tuning

This notebook fine-tunes and compares two popular object detection models, **YOLOv5** and **YOLOv8**, on a face detection dataset.


In [1]:
!git clone https://github.com/ultralytics/yolov5
%cd yolov5
%pip install -r requirements.txt

Cloning into 'yolov5'...
remote: Enumerating objects: 17521, done.
remote: Counting objects: 100% (24/24), done.
remote: Compressing objects: 100% (24/24), done.
remote: Total 17521 (delta 9), reused 0 (delta 0), pack-reused 17497 (from 4)
Receiving objects: 100% (17521/17521), 16.62 MiB | 29.25 MiB/s, done.
Resolving deltas: 100% (12005/12005), done.
/kaggle/working/yolov5
Collecting thop>=0.1.1 (from -r requirements.txt (line 14))
  Downloading thop-0.1.1.post2209072238-py3-none-any.whl.metadata (2.7 kB)
Collecting ultralytics>=8.2.64 (from -r requirements.txt (line 18))
  Downloading ultralytics-8.3.175-py3-none-any.whl.metadata (37 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch>=1.8.0->-r requirements.txt (line 15))
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch>=1.8.0->-r requirements.txt (line 15))
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py

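## Dataset (face detection dataset)
- Dataset contains 16,733 annotated face images (13,386 for training and 3,347 for validation, as reported by the training log)
- YOLOv5/YOLOv8 format used: one `.txt` file per image, with one line per face in the form `class x_center y_center width height` (coordinates normalized to the 0-1 range)
- Training/validation split handled via folder structure (`images/train`, `images/val`)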
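
To make the label format concrete, here is a minimal sketch that converts the text of one YOLO label file into pixel-space corner boxes. The function name `parse_yolo_label` and the sample line are illustrative assumptions, not part of the dataset or of either YOLO codebase.

```python
from typing import List, Tuple

Box = Tuple[int, float, float, float, float]


def parse_yolo_label(text: str, img_w: int, img_h: int) -> List[Box]:
    """Convert YOLO label lines to (class_id, x1, y1, x2, y2) in pixels."""
    boxes: List[Box] = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        cls = int(parts[0])
        xc, yc, w, h = (float(v) for v in parts[1:])
        # Normalized center/size -> absolute corner coordinates
        x1 = (xc - w / 2) * img_w
        y1 = (yc - h / 2) * img_h
        x2 = (xc + w / 2) * img_w
        y2 = (yc + h / 2) * img_h
        boxes.append((cls, x1, y1, x2, y2))
    return boxes


# Hypothetical label line: one face centered in a 640x480 image
sample = "0 0.5 0.5 0.25 0.5\n"
boxes = parse_yolo_label(sample, img_w=640, img_h=480)
print(boxes)
```

Running a check like this over a few files is a quick way to spot labels that fall outside the image before training starts.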


In [2]:
%%writefile /kaggle/working/data.yaml
train: /kaggle/input/face-detection-dataset/images/train
val: /kaggle/input/face-detection-dataset/images/val

nc: 1
names: ['face']

Writing /kaggle/working/data.yaml
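
The YAML file lists only image folders, yet the training log later shows labels being scanned under `labels/train` and `labels/val`. A plausible reading is that each label path is derived from its image path by swapping the `images` directory for `labels` and the extension for `.txt`. The sketch below illustrates that mapping; `label_path_for` and the file name `0001.jpg` are hypothetical, and this is not the loader's actual code.

```python
from pathlib import PurePosixPath


def label_path_for(image_path: str) -> str:
    """Map .../images/<split>/<name>.<ext> to .../labels/<split>/<name>.txt."""
    parts = list(PurePosixPath(image_path).parts)
    # Replace the last "images" directory component (raises ValueError if absent)
    idx = len(parts) - 1 - parts[::-1].index("images")
    parts[idx] = "labels"
    return str(PurePosixPath(*parts).with_suffix(".txt"))


# Hypothetical image file name inside the training folder from data.yaml
example = label_path_for("/kaggle/input/face-detection-dataset/images/train/0001.jpg")
print(example)
```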


##  Training Setup

###  YOLOv5 Training
- Model: `yolov5s.pt` (pretrained)
- Epochs: `10`
- Image size: `640`
- Training command:
  ```bash
  python train.py --img 640 --batch 16 --epochs 10 --data data.yaml --weights yolov5s.pt
  ```

In [3]:
!python train.py \
  --img 640 \
  --batch 16 \
  --epochs 10 \
  --data /kaggle/working/data.yaml \
  --weights yolov5s.pt \
  --project face_detection_project \
  --name yolov5s_faces


Creating new Ultralytics Settings v0.0.6 file ✅ 
View Ultralytics Settings with 'yolo settings' or at '/root/.config/Ultralytics/settings.json'
Update Settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'. For help see https://docs.ultralytics.com/quickstart/#ultralytics-settings.
2025-08-07 07:49:58.042618: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1754552998.300932     103 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1754552998.373806     103 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
wandb: (1) Create a W&B account
wandb: (2) Use an existing W&B account
wandb: (3) Don't

**Saving the metrics of the fine-tuned YOLOv5 model**

In [15]:
import pandas as pd

# results.csv has one row per epoch; keep only the final epoch's metrics
df_v5 = pd.read_csv('face_detection_project/yolov5s_faces/results.csv')
df_v5.tail(1).to_csv('/kaggle/working/yolov5_metrics.csv', index=False)
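
One thing to check when reading this file back: YOLOv5's `results.csv` is assumed here to pad its column names with spaces, which makes lookups such as `row["metrics/precision"]` fail. The dependency-free sketch below strips that padding from a made-up two-epoch excerpt; the column names and values are placeholders, not results from this run.

```python
import csv
import io


def last_row(csv_text: str) -> dict:
    """Return the final CSV row with whitespace stripped from keys and values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [{k.strip(): v.strip() for k, v in row.items()} for row in reader]
    return rows[-1]


# Placeholder excerpt with space-padded headers (values are NOT from this run)
sample = (
    "               epoch,   metrics/precision,      metrics/recall\n"
    "                   0,                0.80,                0.60\n"
    "                   1,                0.85,                0.70\n"
)
row = last_row(sample)
print(row)
```

In pandas, `df.columns = df.columns.str.strip()` achieves the same cleanup on a loaded DataFrame.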

In [6]:
!pip install ultralytics



###  YOLOv8 Training
- Model: `yolov8s.pt` (pretrained)
- Epochs: `10`
- Image size: `640`

In [7]:
from ultralytics import YOLO


model = YOLO('yolov8s.pt')


model.train(
    data='/kaggle/working/data.yaml',
    epochs=10,
    imgsz=640,
    batch=16,
    name='yolov8_face',
    project='face_detection_v8'
)


Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolov8s.pt to 'yolov8s.pt': 100%|██████████| 21.5M/21.5M [00:00<00:00, 113MB/s] 


Ultralytics 8.3.175 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
engine/trainer: agnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=/kaggle/working/data.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=10, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8s.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=yolov8_face, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0, plots=True, pose=

Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11n.pt to 'yolo11n.pt': 100%|██████████| 5.35M/5.35M [00:00<00:00, 66.5MB/s]

AMP: checks passed ✅
train: Fast image access ✅ (ping: 0.0±0.0 ms, read: 386.4±242.3 MB/s, size: 229.5 KB)

train: Scanning /kaggle/input/face-detection-dataset/labels/train... 13386 images, 0 backgrounds, 0 corrupt: 100%|██████████| 13386/13386 [00:20<00:00, 646.76it/s]

albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01, method='weighted_average', num_output_channels=3), CLAHE(p=0.01, clip_limit=(1.0, 4.0), tile_grid_size=(8, 8))
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 435.2±260.6 MB/s, size: 308.9 KB)

val: Scanning /kaggle/input/face-detection-dataset/labels/val... 3347 images, 0 backgrounds, 0 corrupt: 100%|██████████| 3347/3347 [00:06<00:00, 539.78it/s]


Plotting labels to face_detection_v8/yolov8_face/labels.jpg... 
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
optimizer: AdamW(lr=0.002, momentum=0.9) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.0005), 63 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 2 dataloader workers
Logging results to face_detection_v8/yolov8_face
Starting training for 10 epochs...
Closing dataloader mosaic
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01, method='weighted_average', num_output_channels=3), CLAHE(p=0.01, clip_limit=(1.0, 4.0), tile_grid_size=(8, 8))

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       1/10      4.38G      1.383      1.187      1.227         28        640: 100%|██████████| 837/837 [03:52<00:00,  3.59it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 105/105 [00:25<00:00,  4.17it/s]


                   all       3347      10299      0.863      0.696      0.782      0.474

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       2/10      4.99G      1.344     0.8948      1.199         56        640: 100%|██████████| 837/837 [03:56<00:00,  3.54it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 105/105 [00:24<00:00,  4.22it/s]


                   all       3347      10299      0.874      0.696      0.787      0.496

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       3/10      5.03G       1.31     0.8452      1.175         26        640: 100%|██████████| 837/837 [03:54<00:00,  3.57it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 105/105 [00:24<00:00,  4.28it/s]


                   all       3347      10299      0.876      0.708      0.799      0.505

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       4/10      5.07G      1.273     0.7866      1.154         28        640: 100%|██████████| 837/837 [03:53<00:00,  3.59it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 105/105 [00:24<00:00,  4.25it/s]


                   all       3347      10299      0.883      0.747      0.831       0.53

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       5/10       5.1G      1.236     0.7327      1.132         29        640: 100%|██████████| 837/837 [03:53<00:00,  3.58it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 105/105 [00:24<00:00,  4.28it/s]


                   all       3347      10299      0.886      0.762      0.842      0.544

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       6/10      5.14G       1.21      0.701      1.116         19        640: 100%|██████████| 837/837 [03:52<00:00,  3.60it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 105/105 [00:24<00:00,  4.30it/s]


                   all       3347      10299      0.887      0.772      0.848      0.543

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       7/10      5.18G      1.185     0.6618        1.1         33        640: 100%|██████████| 837/837 [03:52<00:00,  3.59it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 105/105 [00:24<00:00,  4.32it/s]


                   all       3347      10299      0.889      0.789      0.862      0.563

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       8/10      5.21G      1.162     0.6243      1.089         29        640: 100%|██████████| 837/837 [03:53<00:00,  3.59it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 105/105 [00:24<00:00,  4.27it/s]


                   all       3347      10299      0.897      0.788      0.872      0.576

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       9/10      5.24G      1.146     0.5999      1.078         41        640: 100%|██████████| 837/837 [03:53<00:00,  3.59it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 105/105 [00:24<00:00,  4.23it/s]


                   all       3347      10299      0.895      0.795      0.875      0.582

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      10/10      5.29G      1.122     0.5664      1.061         17        640: 100%|██████████| 837/837 [03:53<00:00,  3.58it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 105/105 [00:24<00:00,  4.29it/s]


                   all       3347      10299      0.897      0.807      0.882      0.588

10 epochs completed in 0.719 hours.
Optimizer stripped from face_detection_v8/yolov8_face/weights/last.pt, 22.5MB
Optimizer stripped from face_detection_v8/yolov8_face/weights/best.pt, 22.5MB
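
The step counts in the progress bars above follow from the dataset sizes reported by the scanner. The short calculation below reproduces them. The training batch size of 16 comes from the `model.train` call; the validation batch size of 32 is an inference from the 105 validation steps in the log, not a value the log states.

```python
import math


def num_batches(num_images: int, batch_size: int) -> int:
    """Number of batches needed to cover every image once (last batch may be partial)."""
    return math.ceil(num_images / batch_size)


train_steps = num_batches(13386, 16)  # training images, batch=16
val_steps = num_batches(3347, 32)     # validation images, assumed batch of 32

print(train_steps, val_steps)
```

Both numbers match the `837/837` and `105/105` counters in the epoch output.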

Validating face_detection_v8/yolov8_face/weights/best.pt...
Ultralytics 8.3.175 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
Model summary (fused): 72 layers, 11,125,971 parameters, 0 gradients, 28.4 GFLOPs


                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 105/105 [00:25<00:00,  4.06it/s]


                   all       3347      10299      0.897      0.807      0.882      0.588
Speed: 0.1ms preprocess, 3.5ms inference, 0.0ms loss, 1.2ms postprocess per image
Results saved to face_detection_v8/yolov8_face


ultralytics.utils.metrics.DetMetrics object with attributes:

ap_class_index: array([0])
box: ultralytics.utils.metrics.Metric object
confusion_matrix: <ultralytics.utils.metrics.ConfusionMatrix object at 0x7f0a009c2b10>
curves: ['Precision-Recall(B)', 'F1-Confidence(B)', 'Precision-Confidence(B)', 'Recall-Confidence(B)']
curves_results: [[array([          0,    0.001001,    0.002002,    0.003003,    0.004004,    0.005005,    0.006006,    0.007007,    0.008008,    0.009009,     0.01001,    0.011011,    0.012012,    0.013013,    0.014014,    0.015015,    0.016016,    0.017017,    0.018018,    0.019019,     0.02002,    0.021021,    0.022022,    0.023023,
          0.024024,    0.025025,    0.026026,    0.027027,    0.028028,    0.029029,     0.03003,    0.031031,    0.032032,    0.033033,    0.034034,    0.035035,    0.036036,    0.037037,    0.038038,    0.039039,     0.04004,    0.041041,    0.042042,    0.043043,    0.044044,    0.045045,    0.046046,    0.047047,
          0.048048, 

In [12]:
metrics = model.val(data='/kaggle/working/data.yaml', imgsz=640, split='val')

Ultralytics 8.3.175 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
Model summary (fused): 72 layers, 11,125,971 parameters, 0 gradients, 28.4 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 188.0±183.0 MB/s, size: 134.1 KB)

val: Scanning /kaggle/input/face-detection-dataset/labels/val... 3347 images, 0 backgrounds, 0 corrupt: 100%|██████████| 3347/3347 [00:04<00:00, 799.47it/s]




                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 210/210 [00:36<00:00,  5.72it/s]


                   all       3347      10299      0.892      0.811      0.883      0.589
Speed: 0.2ms preprocess, 7.3ms inference, 0.0ms loss, 0.9ms postprocess per image
Results saved to face_detection_v8/yolov8_face2
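
Since the goal is real-time detection, the per-image timings in the two `Speed:` lines can be turned into rough throughput figures. The sketch below sums the preprocess, inference, and postprocess times and converts them to images per second. These are batched GPU numbers from validation, so they are an upper-bound estimate rather than a measurement of single-stream webcam latency.

```python
def images_per_second(preprocess_ms: float, inference_ms: float, postprocess_ms: float) -> float:
    """Throughput implied by the per-image stage timings, in images per second."""
    total_ms = preprocess_ms + inference_ms + postprocess_ms
    return 1000.0 / total_ms


# Timings copied from the two "Speed:" lines in the outputs above
fps_post_training = images_per_second(0.1, 3.5, 1.2)  # validation right after training
fps_separate_val = images_per_second(0.2, 7.3, 0.9)   # the separate model.val() call

print(round(fps_post_training, 1), round(fps_separate_val, 1))
```

The two runs differ by almost a factor of two in inference time, so any real-time claim should be checked on the target hardware and batch size.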


## Validation of YOLOv8

In [17]:

# Use the mean precision/recall (mp, mr); box.p and box.r are per-class arrays
v8_metrics = {
    "precision": metrics.box.mp,
    "recall": metrics.box.mr,
    "mAP@0.5": metrics.box.map50,
    "mAP@0.5:0.95": metrics.box.map
}

pd.DataFrame([v8_metrics]).to_csv('/kaggle/working/yolov8_metrics.csv', index=False)
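
The `@0.5` and `@0.5:0.95` suffixes in these metric names refer to intersection-over-union (IoU) thresholds: a predicted box counts as a match only if its IoU with a ground-truth box reaches the threshold. As a reminder of what that quantity is, here is a minimal IoU sketch for corner-format boxes. The helper `iou` is illustrative and is not the function either library uses internally.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes that overlap by half their width
overlap = iou((0, 0, 10, 10), (5, 0, 15, 10))
print(overlap)
```

An IoU of one third, as in this example, would not count as a correct detection at the 0.5 threshold.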


In [11]:
from shutil import copyfile

copyfile(
    '/kaggle/working/yolov5/face_detection_v8/yolov8_face/weights/best.pt',
    '/kaggle/working/best_yolov8.pt'
)

'/kaggle/working/best_yolov8.pt'

In [18]:
from shutil import copyfile

copyfile(
    '/kaggle/working/yolov5/face_detection_project/yolov5s_faces/weights/best.pt',
    '/kaggle/working/best_yolov5.pt'
)

'/kaggle/working/best_yolov5.pt'

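### Why I Prefer YOLOv8 Over YOLOv5

- **Actively Maintained**: YOLOv8 is part of the actively developed `ultralytics` package, which Ultralytics officially supports.
- **Higher Accuracy**: In this run, YOLOv8 achieved better mAP, precision, and recall than YOLOv5 on the same dataset. On the validation split, YOLOv8s reached precision 0.892, recall 0.811, mAP@0.5 0.883, and mAP@0.5:0.95 0.589.
- **Modern Architecture**: Anchor-free design, decoupled head, and improved backbone.
- **Cleaner API**: Training, validation, inference, and export all go through a single `YOLO` object instead of separate scripts.
- **Better Deployment**: Exports to ONNX, CoreML, and TensorRT through the same API.
- **Faster Inference**: About 3.5 ms of inference time per image on a Tesla T4 during the post-training validation, which is fast enough for real-time use on that GPU.

> Overall, YOLOv8 gives better performance with less effort and more flexibility than YOLOv5.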
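
An accuracy comparison like the one above is easiest to audit as a per-metric difference between two result rows. The sketch below shows such a helper; `metric_deltas` is a hypothetical name. Because the YOLOv5 numbers are not printed in this notebook, the example compares the two YOLOv8 evaluations that are (the post-training validation and the separate `model.val` call). The same function could be applied to the rows saved in `yolov5_metrics.csv` and `yolov8_metrics.csv` once their column names are aligned.

```python
def metric_deltas(baseline: dict, candidate: dict) -> dict:
    """Return candidate minus baseline for every metric the two rows share."""
    return {k: round(candidate[k] - baseline[k], 3) for k in baseline if k in candidate}


# Values copied from the two YOLOv8 validation summaries in this notebook
post_training = {"precision": 0.897, "recall": 0.807, "mAP@0.5": 0.882, "mAP@0.5:0.95": 0.588}
separate_val = {"precision": 0.892, "recall": 0.811, "mAP@0.5": 0.883, "mAP@0.5:0.95": 0.589}

deltas = metric_deltas(post_training, separate_val)
print(deltas)
```

Differences of a few thousandths between two evaluations of the same weights give a sense of the noise floor, which helps when judging how large a YOLOv5 versus YOLOv8 gap needs to be before it is meaningful.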
