# Custom Training with YOLOv5

In this tutorial, we assemble a dataset and train a custom YOLOv5 model to recognize the objects in our dataset. To do so we will take the following steps:

* Gather a dataset of images and label our dataset
* Export our dataset to YOLOv5 format
* Train YOLOv5 to recognize the objects in our dataset
* Evaluate our YOLOv5 model's performance
* Run test inference to view our model at work



![](https://uploads-ssl.webflow.com/5f6bc60e665f54545a1e52a5/615627e5824c9c6195abfda9_computer-vision-cycle.png)

# Step 1: Install Requirements

In [1]:
# clone YOLOv5 and install dependencies
!git clone https://github.com/ultralytics/yolov5  # clone repo
%cd yolov5
%pip install -qr requirements.txt # install dependencies
%pip install -q roboflow

import torch
import os
from IPython.display import Image, clear_output  # to display images

print(f"Setup complete. Using torch {torch.__version__} ({torch.cuda.get_device_properties(0).name if torch.cuda.is_available() else 'CPU'})")

Cloning into 'yolov5'...
remote: Enumerating objects: 17483, done.
remote: Total 17483 (delta 0), reused 0 (delta 0), pack-reused 17483 (from 1)
Receiving objects: 100% (17483/17483), 16.53 MiB | 26.62 MiB/s, done.
Resolving deltas: 100% (11988/11988), done.
/kaggle/working/yolov5

In [2]:
!nvidia-smi

Mon Jun  2 07:06:39 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   39C    P8             10W /   70W |       3MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  Tesla T4                      

# Step 2: Assemble Our Dataset

To train our custom model, we need a dataset of representative images with bounding box annotations around the objects we want to detect, and the dataset must be in YOLOv5 format.
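To make "YOLOv5 format" concrete: each image has a matching `.txt` label file with one line per object, written as `class x_center y_center width height`, with coordinates normalized to the image size. The sketch below converts one pixel-space box into such a line. The function name `to_yolo_line` and the example box are hypothetical, chosen only for illustration.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel-space box to a YOLO label line (normalized center format)."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical box covering the middle of a 640x480 image.
print(to_yolo_line(0, 160, 120, 480, 360, img_w=640, img_h=480))
# → 0 0.500000 0.500000 0.500000 0.500000
```

Roboflow performs this conversion on export, so the helper is only useful for checking or hand-writing labels.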

In Roboflow, you can choose between two paths:

* Convert an existing dataset to YOLOv5 format. Roboflow supports [over 30 object detection formats](https://roboflow.com/formats) for conversion.
* Upload raw images and annotate them in Roboflow with [Roboflow Annotate](https://docs.roboflow.com/annotate).

**Annotate**

![](https://roboflow-darknet.s3.us-east-2.amazonaws.com/roboflow-annotate.gif)

**Version**

![](https://roboflow-darknet.s3.us-east-2.amazonaws.com/robolfow-preprocessing.png)


In [3]:
from roboflow import Roboflow

# set up environment
os.environ["DATASET_DIRECTORY"] = "/kaggle/working"

In [4]:
rf = Roboflow(api_key="YOUR_API_KEY")  # replace with your own Roboflow API key; do not publish real keys
project = rf.workspace("skripsi-dataset").project("yolobytetrack-model")
version = project.version(3)
dataset = version.download("yolov5")

loading Roboflow workspace...
loading Roboflow project...


Downloading Dataset Version Zip in /kaggle/working/YoloByteTrack-Model-3 to yolov5pytorch:: 100%|██████████| 652779/652779 [00:08<00:00, 73979.51it/s]

Extracting Dataset Version Zip to /kaggle/working/YoloByteTrack-Model-3 in yolov5pytorch:: 100%|██████████| 12724/12724 [00:02<00:00, 4763.52it/s]
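
Before training, it can help to confirm that the download contains what we expect. The sketch below counts images and label files per split. It assumes the export uses `train`, `valid`, and `test` folders, each with `images/` and `labels/` subfolders; only `test/images` is confirmed by the output in this notebook, and the helper name `summarize_splits` is hypothetical.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def summarize_splits(root, splits=("train", "valid", "test")):
    """Return {split: (n_images, n_label_files)} for each split found under root."""
    summary = {}
    for split in splits:
        images_dir = Path(root) / split / "images"
        labels_dir = Path(root) / split / "labels"
        if not images_dir.is_dir():
            continue  # this split is not present in the export
        n_images = sum(
            1 for p in images_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
        )
        n_labels = len(list(labels_dir.glob("*.txt"))) if labels_dir.is_dir() else 0
        summary[split] = (n_images, n_labels)
    return summary
```

In this notebook it would be called as `summarize_splits(dataset.location)`. A large gap between the image and label counts in a split is worth investigating before starting a long training run.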


# Step 3: Train Our Custom YOLOv5 model

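Here, we can pass a number of arguments to `train.py`:
- **img:** define input image size
- **batch:** determine batch size
- **epochs:** define the number of training epochs. (Note: much larger values, often 3000+, are common here!)
- **data:** path to the dataset's `data.yaml`. Our dataset location is saved in `dataset.location`
- **weights:** specify a path to weights to start transfer learning from. Here we choose the COCO-pretrained `yolov5m.pt` checkpoint.
- **hyp:** path to a hyperparameter YAML file. Here we use the custom file written below.
- **name:** name of the run; results are saved under `runs/train/<name>`
- **patience:** number of epochs without improvement after which training stops early
- **cache:** cache images for faster training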
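
The training command in this step also passes `--patience 20`, which enables early stopping. The sketch below illustrates the idea: track the best fitness seen so far and stop once that many epochs have passed without matching or beating it. This is a simplified illustration, not YOLOv5's own code, and the fitness values are hypothetical.

```python
def early_stop_epoch(fitness_per_epoch, patience):
    """Return the epoch index at which training would stop, or None if it runs to the end."""
    best_fitness = float("-inf")
    best_epoch = 0
    for epoch, fitness in enumerate(fitness_per_epoch):
        if fitness >= best_fitness:
            best_fitness, best_epoch = fitness, epoch
        if epoch - best_epoch >= patience:
            return epoch  # no improvement for `patience` epochs
    return None


# Hypothetical fitness curve: improves for three epochs, then stays below its best.
fitness = [0.10, 0.20, 0.30, 0.29, 0.28, 0.29, 0.28]
print(early_stop_epoch(fitness, patience=3))  # → 5
```

With `--epochs 100` and `--patience 20`, training can therefore end well before epoch 100 if validation performance plateaus.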

**Tuning Hyperparameters**

In [5]:
%%writefile data/hyp.custom.yaml
lr0: 0.001         # Initial learning rate: not too small, not aggressive
lrf: 0.1
momentum: 0.85     # Somewhat more stable in scenes with densely packed objects
weight_decay: 0.001
warmup_epochs: 2.0
warmup_momentum: 0.8
warmup_bias_lr: 0.1

box: 0.05          # Weight of the bounding box loss
obj: 1.0           # Weight of the objectness loss (important for detecting small objects)
obj_pw: 1.0
cls: 0.8           # Raised slightly because we want clear classification between car/person/motorcycle
cls_pw: 1.0

iou_t: 0.2         # Threshold for anchor matching; left at the default
anchor_t: 4.0      # Anchor selection threshold
fl_gamma: 0.0

hsv_h: 0.015       # Color augmentation: kept light so CCTV footage is not over-distorted
hsv_s: 0.5
hsv_v: 0.4
degrees: 0.0

translate: 0.1     # Light translation, because the camera is static
scale: 0.3         # Scale kept moderate; CCTV footage is already fairly stable
shear: 0.0
perspective: 0.0
fliplr: 0.3        # Horizontal flipping is allowed, but not to an extreme
flipud: 0.0        # No vertical flip; it is unrealistic for people/cars
mosaic: 0.5        # Reduced slightly so objects are not mixed too heavily
mixup: 0.0
copy_paste: 0.0

Writing data/hyp.custom.yaml
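
To see what `lr0` and `lrf` mean together: `lrf` is a fraction of `lr0`, so the final learning rate is `lr0 * lrf`. The sketch below works through the arithmetic for this configuration, assuming a linear decay over the run (the behavior when `--cos-lr` is not passed) and ignoring the warmup epochs. The helper name `linear_lr` is hypothetical.

```python
def linear_lr(epoch, epochs, lr0, lrf):
    """Learning rate at `epoch` under linear decay from lr0 down to lr0 * lrf."""
    factor = (1 - epoch / epochs) * (1.0 - lrf) + lrf
    return lr0 * factor


# Values taken from the training command (--epochs 100) and hyp.custom.yaml.
EPOCHS, LR0, LRF = 100, 0.001, 0.1
for epoch in (0, 50, 100):
    print(f"epoch {epoch:3d}: lr = {linear_lr(epoch, EPOCHS, LR0, LRF):.6f}")
```

Under these assumptions the learning rate falls from 0.001 at the start to 0.0001 at the end of training.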


In [6]:
!python train.py --img 640 --batch 16 --epochs 100 \
--data {dataset.location}/data.yaml \
--weights yolov5m.pt \
--hyp data/hyp.custom.yaml \
--name yolov5m-roboflow-v1 \
--patience 20 \
--cache

Creating new Ultralytics Settings v0.0.6 file ✅ 
View Ultralytics Settings with 'yolo settings' or at '/root/.config/Ultralytics/settings.json'
Update Settings with 'yolo settings key=value', i.e. 'yolo settings runs_dir=path/to/dir'. For help see https://docs.ultralytics.com/quickstart/#ultralytics-settings.
2025-06-02 07:07:12.294321: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1748848032.548814      72 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1748848032.607214      72 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
wandb: Using wandb-core as the SDK backend.  Please refer to https://wandb.me/wandb-core for more informati

# Step 4: Evaluate Custom YOLOv5 Detector Performance
Training losses and performance metrics are saved to TensorBoard and also to a log file.

If you are new to these metrics, the one to focus on is `mAP_0.5`. Learn more about mean average precision [here](https://blog.roboflow.com/mean-average-precision/). The cell below then runs the best checkpoint on the test images as a qualitative check.
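
The logged metrics can also be read programmatically. The sketch below finds the epoch with the highest `mAP_0.5`, assuming the run wrote a `results.csv` file with an `epoch` column and a `metrics/mAP_0.5` column whose headers are padded with spaces. The file location and column names are assumptions about the YOLOv5 version used here, and the helper name `best_epoch_by_map50` is hypothetical.

```python
import csv


def best_epoch_by_map50(csv_path, column="metrics/mAP_0.5"):
    """Return (epoch, value) for the row with the highest value in `column`."""
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f, skipinitialspace=True)
        # Header names and values may carry padding, so strip both.
        rows = [
            {k.strip(): v.strip() for k, v in row.items() if k and v is not None}
            for row in reader
        ]
    best = max(rows, key=lambda r: float(r[column]))
    return int(best["epoch"]), float(best[column])
```

For this run the call would look like `best_epoch_by_map50("runs/train/yolov5m-roboflow-v1/results.csv")`, if the file exists at that path.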

In [7]:
!python detect.py --weights runs/train/yolov5m-roboflow-v1/weights/best.pt --img 640 --conf 0.1 --source {dataset.location}/test/images

detect: weights=['runs/train/yolov5m-roboflow-v1/weights/best.pt'], source=/kaggle/working/YoloByteTrack-Model-3/test/images, data=data/coco128.yaml, imgsz=[640, 640], conf_thres=0.1, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_format=0, save_csv=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False, vid_stride=1
YOLOv5 🚀 v7.0-419-gcd44191c Python-3.11.11 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)

Fusing layers... 
Model summary: 212 layers, 20861016 parameters, 0 gradients, 47.9 GFLOPs
image 1/121 /kaggle/working/YoloByteTrack-Model-3/test/images/merge_video_1715688542414_mp4-0022_jpg.rf.b3fd4a5fb56b1f319867e8cd7d8db6f7.jpg: 640x640 9 Orangs, 27.0ms
image 2/121 /kaggle/working/YoloByteTrack-Model-3/test/images/merge_video_17156885
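
The log above shows `project=runs/detect` and `name=exp`, so the annotated images should be saved under `runs/detect/exp`. The sketch below collects the first few of them for viewing. It assumes the outputs are `.jpg` files, as the test images are, and the helper name `list_result_images` is hypothetical.

```python
from pathlib import Path


def list_result_images(run_dir, limit=5):
    """Return up to `limit` annotated image paths from a detect.py output folder, sorted by name."""
    return sorted(Path(run_dir).glob("*.jpg"))[:limit]
```

In the notebook, each path can then be shown with `display(Image(filename=str(path)))`, using the `Image` class imported in Step 1.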

# Conclusion and Next Steps

Congratulations! You've trained a custom YOLOv5 model to recognize your custom objects.

To improve your model's performance, we recommend first iterating on your dataset's coverage and quality. See this guide for [model performance improvement](https://github.com/ultralytics/yolov5/wiki/Tips-for-Best-Training-Results).

To deploy your model to an application, see this guide on [exporting your model to deployment destinations](https://github.com/ultralytics/yolov5/issues/251).

Once your model is in production, you will want to continually iterate and improve on your dataset and model via [active learning](https://blog.roboflow.com/what-is-active-learning/).

In [8]:
!zip -r /kaggle/working/all_output.zip /kaggle/working/*

  adding: kaggle/working/__notebook__.ipynb (deflated 95%)
  adding: kaggle/working/YoloByteTrack-Model-3/ (stored 0%)
  adding: kaggle/working/YoloByteTrack-Model-3/README.roboflow.txt (deflated 49%)
  adding: kaggle/working/YoloByteTrack-Model-3/test/ (stored 0%)
  adding: kaggle/working/YoloByteTrack-Model-3/test/images/ (stored 0%)
  adding: kaggle/working/YoloByteTrack-Model-3/test/images/screenshot_20250508_141757_png.rf.98d0adc5f11e6ecbaf9f3875c2b4d682.jpg (deflated 0%)
  adding: kaggle/working/YoloByteTrack-Model-3/test/images/screenshot_20250424_144948_png.rf.8ea251fa90f05a5a5fb0d7f809cd8be2.jpg (deflated 0%)
  adding: kaggle/working/YoloByteTrack-Model-3/test/images/screenshot_20250424_134536_png.rf.a2e8333ffebdf92d4a3764797bf99b44.jpg (deflated 0%)
  adding: kaggle/working/YoloByteTrack-Model-3/test/images/screenshot_20250505_161547_png.rf.773f52525c6d488c9207cdccfa2e8a87.jpg (deflated 1%)
  adding: kaggle/working/YoloByteTrack-Model-3/test/images/screenshot_2025051