[![Roboflow Notebooks](https://media.roboflow.com/notebooks/template/bannertest2-2.png?ik-sdk-version=javascript-1.4.3&updatedAt=1672932710194)](https://github.com/roboflow/notebooks)

# How to Train YOLOv8 Object Detection on a Custom Dataset

---

[![Roboflow](https://raw.githubusercontent.com/roboflow-ai/notebooks/main/assets/badges/roboflow-blogpost.svg)](https://blog.roboflow.com/how-to-train-yolov8-on-a-custom-dataset)
[![YouTube](https://badges.aleen42.com/src/youtube.svg)](https://youtu.be/wuZtUMEiKWY)
[![GitHub](https://badges.aleen42.com/src/github.svg)](https://github.com/ultralytics/ultralytics)

YOLOv8 is a version of the YOLO (You Only Look Once) family of object detection and image segmentation models, developed by Ultralytics. It is designed to be fast, accurate, and easy to use, which makes it a practical choice for a wide range of detection and segmentation tasks. It can be trained on large datasets and runs on a variety of hardware, from CPUs to GPUs.

## Disclaimer

If you notice that our notebook behaves incorrectly - especially if you experience errors that prevent you from going through the tutorial - don't hesitate! Let us know and open an [issue](https://github.com/roboflow/notebooks/issues) on the Roboflow Notebooks repository.

## Accompanying Blog Post

We recommend that you follow along in this notebook while reading the accompanying [Blog Post](https://blog.roboflow.com/how-to-train-yolov8-on-a-custom-dataset/).

## Pro Tip: Use GPU Acceleration

If you are running this notebook in Google Colab, navigate to `Edit` -> `Notebook settings` -> `Hardware accelerator`, set it to `GPU`, and then click `Save`. This will ensure your notebook uses a GPU, which will significantly speed up model training times.
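
To confirm that the runtime really has a GPU attached before training, one option is to look for the NVIDIA driver tool. The sketch below uses only the standard library. The helper name `gpu_visible` is illustrative, and it assumes an NVIDIA GPU that is exposed through `nvidia-smi`, as is typically the case on Colab GPU runtimes.

```python
import shutil
import subprocess


def gpu_visible() -> bool:
    """Return True if `nvidia-smi` is on PATH and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False  # no NVIDIA driver tools found on this machine
    try:
        # `nvidia-smi -L` prints one line per attached GPU.
        proc = subprocess.run([exe, "-L"], capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return False
    return proc.returncode == 0 and bool(proc.stdout.strip())


if __name__ == "__main__":
    print("GPU visible:", gpu_visible())
```

If this prints `False` in Colab, the hardware accelerator setting described above has probably not been applied yet.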

## Steps in this Tutorial

In this tutorial, we are going to cover:

- Before you start
- Install YOLOv8
- CLI Basics
- Inference with Pre-trained COCO Model
- Roboflow Universe
- Preparing a custom dataset
- Custom Training
- Validate Custom Model
- Inference with Custom Model

**Let's begin!**

In [8]:
!ls /content

runs  unripe  yolov8s.pt


In [15]:
!ls /content/unripe/ripe-tomatoes-1/*

/content/unripe/ripe-tomatoes-1/data.yaml
/content/unripe/ripe-tomatoes-1/README.dataset.txt
/content/unripe/ripe-tomatoes-1/README.roboflow.txt

/content/unripe/ripe-tomatoes-1/test:
images	labels

/content/unripe/ripe-tomatoes-1/train:
images	labels
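
The listing above shows only `test` and `train` folders, with no `valid` split. Before training, it can help to count what each split actually contains. The following is a minimal sketch; the function name `summarize_splits` and the set of image suffixes are assumptions made for illustration, and it expects the `<split>/images` and `<split>/labels` layout shown in the listing.

```python
from pathlib import Path

# Assumed set of image file types; adjust to match the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def summarize_splits(dataset_dir):
    """Count images and label files in each split folder that exists."""
    summary = {}
    for split in ("train", "valid", "test"):
        images_dir = Path(dataset_dir) / split / "images"
        labels_dir = Path(dataset_dir) / split / "labels"
        if not images_dir.is_dir():
            continue  # this split is not present in the dataset folder
        n_images = sum(1 for p in images_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)
        n_labels = sum(1 for _ in labels_dir.glob("*.txt")) if labels_dir.is_dir() else 0
        summary[split] = {"images": n_images, "labels": n_labels}
    return summary
```

Calling `summarize_splits("/content/unripe/ripe-tomatoes-1")` would return one entry per split that exists, so a missing `valid` key is an early hint that the `val:` entry in `data.yaml` needs attention.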


In [17]:
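yaml_path = "/content/unripe/ripe-tomatoes-1/data.yaml"

# Read the dataset config as plain text.
with open(yaml_path, "r") as f:
    data = f.read()

# Point the validation split at the test images (this dataset has no valid/ folder).
# Note: str.replace only matches this exact absolute path. If data.yaml stores a
# relative path such as ../valid/images, nothing is changed.
data = data.replace("/content/unripe/ripe-tomatoes-1/valid/images",
                    "/content/unripe/ripe-tomatoes-1/test/images")

with open(yaml_path, "w") as f:
    f.write(data)

# This message is printed whether or not the replacement matched anything.
print("✅ data.yaml updated successfully!")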

✅ data.yaml updated successfully!


In [18]:
!cat /content/unripe/ripe-tomatoes-1/data.yaml

names:
- b_fully_ripened
- b_green
- b_half_ripened
- l_fully_ripened
- l_green
- l_half_ripened
nc: 6
roboflow:
  license: CC BY 4.0
  project: unripe-ripe-tomatoes-bwwtz
  url: https://universe.roboflow.com/ella-4wefc/unripe-ripe-tomatoes-bwwtz/dataset/1
  version: 1
  workspace: ella-4wefc
test: ../test/images
train: ../train/images
val: ../valid/images
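
The `cat` output above still shows `val: ../valid/images`: the earlier `str.replace` call searched for an absolute path, while `data.yaml` stores a relative one, so nothing matched even though the success message was printed. A minimal sketch of a replacement helper that reports whether it changed anything is shown here; the name `replace_in_file` is illustrative, not part of any library.

```python
from pathlib import Path


def replace_in_file(path, old, new):
    """Replace `old` with `new` in a text file and return the number of matches."""
    file_path = Path(path)
    text = file_path.read_text(encoding="utf-8")
    count = text.count(old)
    if count:
        # Only rewrite the file when there is something to change.
        file_path.write_text(text.replace(old, new), encoding="utf-8")
    return count
```

With this helper, `replace_in_file(yaml_path, "../valid/images", "../test/images")` returns `0` when the pattern is absent, so the calling cell can print a warning instead of a success message.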


In [19]:
yaml_path = "/content/unripe/ripe-tomatoes-1/data.yaml"

with open(yaml_path, "r") as f:
    lines = f.readlines()

# Rewrite only the `val:` entry so validation uses the test images;
# every other line is copied through unchanged.
new_lines = []
for line in lines:
    if line.strip().startswith("val:"):
        new_lines.append("val: ../test/images\n")
    else:
        new_lines.append(line)

with open(yaml_path, "w") as f:
    f.writelines(new_lines)

print("✅ val path fixed!")

# Print the file to confirm the change.
!cat /content/unripe/ripe-tomatoes-1/data.yaml

✅ val path fixed!
names:
- b_fully_ripened
- b_green
- b_half_ripened
- l_fully_ripened
- l_green
- l_half_ripened
nc: 6
roboflow:
  license: CC BY 4.0
  project: unripe-ripe-tomatoes-bwwtz
  url: https://universe.roboflow.com/ella-4wefc/unripe-ripe-tomatoes-bwwtz/dataset/1
  version: 1
  workspace: ella-4wefc
test: ../test/images
train: ../train/images
val: ../test/images


In [20]:
!pip install ultralytics
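
After installing, it can be useful to record which version of the package is actually present, since training logs and defaults differ between releases. The sketch below uses the standard library's `importlib.metadata`; the helper name `installed_version` is an illustrative choice.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("ultralytics:", installed_version("ultralytics"))
```

A result of `None` means the install did not succeed in the current environment and should be rerun before importing `ultralytics`.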



In [22]:
from ultralytics import YOLO

# Start from the pretrained YOLOv8 nano checkpoint (downloaded on first use).
model = YOLO("yolov8n.pt")

# Fine-tune on the tomato dataset described by data.yaml.
model.train(
    data="/content/unripe/ripe-tomatoes-1/data.yaml",
    epochs=50,   # number of passes over the training set
    imgsz=640,   # training image size
    batch=16     # images per batch
)

Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolov8n.pt to 'yolov8n.pt': 100% ━━━━━━━━━━━━ 6.2MB 98.8MB/s 0.1s
Ultralytics 8.3.202 🚀 Python-3.12.11 torch-2.8.0+cu126 CUDA:0 (Tesla T4, 15095MiB)
engine/trainer: agnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, compile=False, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=/content/unripe/ripe-tomatoes-1/data.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=50, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8n.pt, momentum=0.937, mosa

ultralytics.utils.metrics.DetMetrics object with attributes:

ap_class_index: array([0, 1, 2, 3, 4, 5])
box: ultralytics.utils.metrics.Metric object
confusion_matrix: <ultralytics.utils.metrics.ConfusionMatrix object at 0x79416248e240>
curves: ['Precision-Recall(B)', 'F1-Confidence(B)', 'Precision-Confidence(B)', 'Recall-Confidence(B)']
curves_results: [[array([          0,    0.001001,    0.002002,    0.003003,    0.004004,    0.005005,    0.006006,    0.007007,    0.008008,    0.009009,     0.01001,    0.011011,    0.012012,    0.013013,    0.014014,    0.015015,    0.016016,    0.017017,    0.018018,    0.019019,     0.02002,    0.021021,    0.022022,    0.023023,
          0.024024,    0.025025,    0.026026,    0.027027,    0.028028,    0.029029,     0.03003,    0.031031,    0.032032,    0.033033,    0.034034,    0.035035,    0.036036,    0.037037,    0.038038,    0.039039,     0.04004,    0.041041,    0.042042,    0.043043,    0.044044,    0.045045,    0.046046,    0.047047,
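
The metrics object printed above is hard to read in raw form. Ultralytics training runs also write a `results.csv` file into the run folder with one row per epoch, which is easier to inspect. The sketch below finds the epoch with the highest value of one metric. The column names `epoch` and `metrics/mAP50-95(B)` are assumptions that should be checked against the header of the actual file, and `best_epoch` is an illustrative helper, not an Ultralytics function.

```python
import csv


def best_epoch(csv_path, metric="metrics/mAP50-95(B)"):
    """Return (epoch, value) for the row with the highest `metric`, or None if empty."""
    best = None
    with open(csv_path, newline="") as f:
        for raw in csv.DictReader(f):
            # Strip whitespace, since header names and values may be padded.
            row = {k.strip(): (v or "").strip() for k, v in raw.items() if k is not None}
            value = float(row[metric])
            if best is None or value > best[1]:
                best = (int(float(row["epoch"])), value)
    return best
```

For the run in this notebook the file would be expected next to the weights folder, for example under `/content/runs/detect/train6/`, although that location is also an assumption.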
     

In [24]:
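# Run the trained model on the test images. conf=0.5 drops low-confidence
# detections, and save=True writes annotated copies of the images to disk.
results = model.predict(
    source="/content/unripe/ripe-tomatoes-1/test/images",
    save=True,
    conf=0.5
)

# Ultralytics numbers its output folders (predict, predict2, ...), so the exact
# folder depends on how many prediction runs came before this one.
print("✅ Inference done! Annotated images saved under /content/runs/detect/ (latest predict* folder)")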


image 1/161 /content/unripe/ripe-tomatoes-1/test/images/IMG_0983_jpg.rf.cbde5d890f4a0d1cc229e34483708149.jpg: 640x640 1 b_green, 8.6ms
image 2/161 /content/unripe/ripe-tomatoes-1/test/images/IMG_0986_jpg.rf.26c3eed9f013cbefca6ef71e1eafadb3.jpg: 640x640 (no detections), 7.3ms
image 3/161 /content/unripe/ripe-tomatoes-1/test/images/IMG_0987_jpg.rf.236db0b43093948cfaa7464b161a78c3.jpg: 640x640 (no detections), 7.3ms
image 4/161 /content/unripe/ripe-tomatoes-1/test/images/IMG_0988_jpg.rf.e0cbbbae7e2f07d9871ded557e6b74fe.jpg: 640x640 (no detections), 7.2ms
image 5/161 /content/unripe/ripe-tomatoes-1/test/images/IMG_0991_jpg.rf.2e1ed9dfa58076ba722b4f6062ba731f.jpg: 640x640 (no detections), 7.3ms
image 6/161 /content/unripe/ripe-tomatoes-1/test/images/IMG_0992_jpg.rf.0e0db44405d10e113af37299ba269ce7.jpg: 640x640 1 b_green, 7.2ms
image 7/161 /content/unripe/ripe-tomatoes-1/test/images/IMG_0993_jpg.rf.c2e035ab6d879774bdb708089711e53a.jpg: 640x640 1 l_green, 7.2ms
image 8/161 /content/unripe/ri
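
Many images in the log above report `(no detections)`. With `conf=0.5`, any box the model scores below the threshold is discarded, so a lower threshold would surface more boxes at the cost of more false positives. The following is a conceptual sketch of that filtering step, not Ultralytics' internal implementation; the scores are hypothetical values made up for illustration.

```python
def filter_by_confidence(detections, threshold=0.5):
    """Keep (label, confidence) pairs whose confidence reaches the threshold."""
    return [(label, conf) for label, conf in detections if conf >= threshold]


# Hypothetical raw scores for one image, for illustration only.
raw = [("b_green", 0.61), ("l_green", 0.32), ("b_half_ripened", 0.18)]

print(filter_by_confidence(raw))        # only the 0.61 detection remains
print(filter_by_confidence(raw, 0.25))  # a lower threshold keeps two detections
```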

In [25]:
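# Collapse the six dataset classes into two ripeness groups.
group_map = {
    "b_green": "unripe",
    "l_green": "unripe",
    "b_half_ripened": "unripe",
    "l_half_ripened": "unripe",
    "b_fully_ripened": "ripe",
    "l_fully_ripened": "ripe"
}

for r in results:
    print(f"\n📌 Results for {r.path}")
    for box in r.boxes:
        cls_id = int(box.cls[0].item())   # predicted class index
        cls_name = r.names[cls_id]        # class index -> class name
        # Fall back to "unknown" instead of raising KeyError on an unmapped class.
        grouped_label = group_map.get(cls_name, "unknown")
        print(f"Detected: {cls_name} → {grouped_label}")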


📌 Results for /content/unripe/ripe-tomatoes-1/test/images/IMG_0983_jpg.rf.cbde5d890f4a0d1cc229e34483708149.jpg
Detected: b_green → unripe

📌 Results for /content/unripe/ripe-tomatoes-1/test/images/IMG_0986_jpg.rf.26c3eed9f013cbefca6ef71e1eafadb3.jpg

📌 Results for /content/unripe/ripe-tomatoes-1/test/images/IMG_0987_jpg.rf.236db0b43093948cfaa7464b161a78c3.jpg

📌 Results for /content/unripe/ripe-tomatoes-1/test/images/IMG_0988_jpg.rf.e0cbbbae7e2f07d9871ded557e6b74fe.jpg

📌 Results for /content/unripe/ripe-tomatoes-1/test/images/IMG_0991_jpg.rf.2e1ed9dfa58076ba722b4f6062ba731f.jpg

📌 Results for /content/unripe/ripe-tomatoes-1/test/images/IMG_0992_jpg.rf.0e0db44405d10e113af37299ba269ce7.jpg
Detected: b_green → unripe

📌 Results for /content/unripe/ripe-tomatoes-1/test/images/IMG_0993_jpg.rf.c2e035ab6d879774bdb708089711e53a.jpg
Detected: l_green → unripe

📌 Results for /content/unripe/ripe-tomatoes-1/test/images/IMG_1005_jpg.rf.d5397673aedae5213b6b46348c431405.jpg

📌 Results for /content
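
Printing one line per detection becomes long for 161 images. A compact alternative is to tally how many detections fall into each ripeness group. This sketch works on plain class-name strings so that it can be tried without a trained model; `tally_groups` is an illustrative helper, and the class names are the ones listed in `data.yaml`.

```python
from collections import Counter

GROUP_MAP = {
    "b_green": "unripe",
    "l_green": "unripe",
    "b_half_ripened": "unripe",
    "l_half_ripened": "unripe",
    "b_fully_ripened": "ripe",
    "l_fully_ripened": "ripe",
}


def tally_groups(class_names, group_map=GROUP_MAP):
    """Count detections per ripeness group; unmapped classes count as 'unknown'."""
    return Counter(group_map.get(name, "unknown") for name in class_names)


print(tally_groups(["b_green", "l_green", "b_fully_ripened"]))
```

In the notebook, the input list could be built from the prediction results with `r.names[int(box.cls[0].item())]` for each box, as in the cell above.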

In [26]:
import csv
import os

csv_path = "/content/grouped_results.csv"

header = ["image", "class", "grouped_label", "confidence"]

with open(csv_path, mode="w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(header)

    group_map = {
        "b_green": "unripe",
        "l_green": "unripe",
        "b_half_ripened": "unripe",
        "l_half_ripened": "unripe",
        "b_fully_ripened": "ripe",
        "l_fully_ripened": "ripe"
    }

    for r in results:
        image_name = os.path.basename(r.path)

        if not len(r.boxes):
            writer.writerow([image_name, "none", "none", 0.0])
            print(f"📌 {image_name} → No detections")
            continue

        for box in r.boxes:
            cls_id = int(box.cls[0].item())
            cls_name = r.names[cls_id]  # use the class names carried by this result
            conf = float(box.conf[0].item())
            grouped_label = group_map.get(cls_name, "unknown")

            writer.writerow([image_name, cls_name, grouped_label, round(conf, 2)])
            print(f"Detected: {cls_name} → {grouped_label} (conf={conf:.2f})")

print(f"\n✅ Grouped results saved to {csv_path}")

Detected: b_green → unripe (conf=0.61)
📌 IMG_0986_jpg.rf.26c3eed9f013cbefca6ef71e1eafadb3.jpg → No detections
📌 IMG_0987_jpg.rf.236db0b43093948cfaa7464b161a78c3.jpg → No detections
📌 IMG_0988_jpg.rf.e0cbbbae7e2f07d9871ded557e6b74fe.jpg → No detections
📌 IMG_0991_jpg.rf.2e1ed9dfa58076ba722b4f6062ba731f.jpg → No detections
Detected: b_green → unripe (conf=0.71)
Detected: l_green → unripe (conf=0.51)
📌 IMG_1005_jpg.rf.d5397673aedae5213b6b46348c431405.jpg → No detections
Detected: b_green → unripe (conf=0.68)
📌 IMG_1013_jpg.rf.f5959d3b13f0cf62adb4f858433ab6d7.jpg → No detections
📌 IMG_1024_jpg.rf.d8eae96104196f4828c0563bfd45a3a5.jpg → No detections
📌 IMG_1025_jpg.rf.7a47583aa2dc7065f7c488e0bfe10714.jpg → No detections
📌 IMG_1029_jpg.rf.211a9b5123a0c3e7686802aa1cd841a1.jpg → No detections
Detected: b_green → unripe (conf=0.58)
📌 IMG_1031_jpg.rf.bde49488f562a8475150d98c92e8c4f8.jpg → No detections
📌 IMG_1035_jpg.rf.35e54773d8108e481e75972a873263ac.jpg → No detections
Detected: b_green → unri
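
Once the CSV exists, it can be read back to answer simple questions, such as which images contain at least one ripe tomato. The sketch below assumes the header written by the cell above (`image`, `class`, `grouped_label`, `confidence`) and skips the placeholder `none` rows; the function name `images_by_group` is illustrative.

```python
import csv
from collections import defaultdict


def images_by_group(csv_path):
    """Map each grouped label to the set of image names where it was detected."""
    groups = defaultdict(set)
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            if row["grouped_label"] == "none":
                continue  # placeholder row written for images without detections
            groups[row["grouped_label"]].add(row["image"])
    return dict(groups)
```

For example, `len(images_by_group("/content/grouped_results.csv").get("ripe", set()))` would give the number of images with at least one ripe detection.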

In [27]:
from google.colab import files

files.download("/content/grouped_results.csv")


In [29]:
import glob
import os
from google.colab import files

pt_files = glob.glob("/content/runs/detect/*/weights/best.pt")

if pt_files:
    latest_pt = max(pt_files, key=os.path.getctime)
    print("📂 Found best.pt at:", latest_pt)

    files.download(latest_pt)
else:
    print("❌ No best.pt found. Make sure you have trained the model first.")

📂 Found best.pt at: /content/runs/detect/train6/weights/best.pt



In [32]:
!find /content/runs/detect/ -name best.pt

/content/runs/detect/train6/weights/best.pt
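
When several training runs exist (`train`, `train2`, ...), the checkpoint path should not be hard-coded. The sketch below picks the most recently modified `best.pt` with `pathlib`. It uses the modification time rather than `os.path.getctime`, because on Linux the latter reports the last metadata change rather than creation time. The function name `latest_weights` is illustrative.

```python
from pathlib import Path


def latest_weights(runs_dir, filename="best.pt"):
    """Return the most recently modified <run>/weights/<filename>, or None."""
    candidates = list(Path(runs_dir).glob(f"*/weights/{filename}"))
    if not candidates:
        return None
    # Compare by modification time of each checkpoint file.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

In this notebook, `latest_weights("/content/runs/detect")` would be the call that replaces the hard-coded `train6` path.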


In [34]:
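from ultralytics import YOLO

# Load the best checkpoint from the training run.
model = YOLO("/content/runs/detect/train6/weights/best.pt")

# Validate on the split named by `val:` in data.yaml, which now points at test/images.
results = model.val()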

Ultralytics 8.3.202 🚀 Python-3.12.11 torch-2.8.0+cu126 CUDA:0 (Tesla T4, 15095MiB)
Model summary (fused): 72 layers, 3,006,818 parameters, 0 gradients, 8.1 GFLOPs
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 234.8±67.0 MB/s, size: 5.1 KB)
val: Scanning /content/unripe/ripe-tomatoes-1/test/labels.cache... 161 images, 52 backgrounds, 0 corrupt: 100% ━━━━━━━━━━━━ 161/161 316.9Kit/s 0.0s
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100% ━━━━━━━━━━━━ 11/11 2.8it/s 3.9s
                   all        161        152      0.459      0.441       0.44      0.351
       b_fully_ripened         10         12      0.255        0.5      0.355      0.293
               b_green         34         47      0.555      0.404      0.473      0.366
        b_half_ripened         19         22      0.388       0.26      0.339      0.293
       l_fully_ripened         26         34      0.582      0.676      0.631      0.502
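
The table reports precision (P) and recall (R) separately. A single number that balances the two is the F1 score, the harmonic mean of precision and recall. The sketch below computes it for the `all` row above (P = 0.459, R = 0.441), which comes out at roughly 0.45. The function name `f1_score` is an illustrative helper defined here, not an Ultralytics call.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall taken from the "all" row of the validation table.
print(round(f1_score(0.459, 0.441), 3))
```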
    

In [36]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [37]:
!mkdir -p /content/drive/MyDrive/yolo_models

In [38]:
!cp /content/runs/detect/train6/weights/best.pt /content/drive/MyDrive/yolo_models/best.pt

In [39]:
!ls /content/drive/MyDrive/yolo_models

best.pt
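
The `ls` output confirms that a file named `best.pt` exists in Drive, but not that its contents match the source. A minimal sketch of a copy step that also compares checksums is given below, using only the standard library; `sha256_of` and `copy_and_verify` are illustrative helper names.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(src, dst_dir):
    """Copy src into dst_dir and confirm the copy has the same SHA-256 digest."""
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    dst = Path(shutil.copy2(src, dst_dir))
    if sha256_of(src) != sha256_of(dst):
        raise IOError(f"Checksum mismatch after copying {src} to {dst}")
    return dst
```

In this notebook the call would be `copy_and_verify("/content/runs/detect/train6/weights/best.pt", "/content/drive/MyDrive/yolo_models")`, which covers the `mkdir -p` and `cp` cells in one step.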
