The purpose of this notebook is to convert a dataset to a format that can be used to train a YOLOv8 model.

To use, make a copy of this notebook, and adapt it to work with your specific dataset. Please save your version of this ipynb file on GitHub in *recyclo/scripts*.

(File > Save a copy in GitHub > File path = "scripts/my_filename.ipynb" to save notebook in scripts folder)

Once you've generated your YOLOv8 dataset and are confident you can train a model with it, please upload the converted dataset to the Recyclo datasets Google Drive: https://drive.google.com/drive/folders/1bUkIYQRXX08OKI5TuOSg-eqntSudGaFB.

(Why Google Drive? Because these datasets are too large for GitHub!)

In [1]:
!pip install -U ultralytics

from ultralytics import YOLO
import os

Collecting ultralytics
  Downloading ultralytics-8.3.147-py3-none-any.whl.metadata (37 kB)
Collecting ultralytics-thop>=2.0.0 (from ultralytics)
  Downloading ultralytics_thop-2.0.14-py3-none-any.whl.metadata (9.4 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.4.5.8 (from torch>=1.8.0->ultralytics)
  Downloading n

# Convert datasets for YOLOv8

### General

In general, YOLO models output the following for a given image:
* Bounding box
* Class label
* Confidence score

To train a YOLO model, we need object detection datasets that contain images of what we're looking for (trash), and annotations: class labels and bounding boxes.

### YOLOv8

In this project we will use Ultralytics YOLOv8 object detection.

YOLOv8 expects datasets in the following format:

```
dataset/
├── images/
│   ├── train/  <-- image files for training
│   └── val/    <-- image files for validation after each epoch. Must not overlap with images in train.
├── labels/
│   ├── train/  <-- one .txt file per train image. Contains class and bbox info.
│   └── val/    <-- one .txt file per val image
└── data.yaml   <-- config file
```
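
The layout above can be checked mechanically before training. The sketch below is a minimal example under stated assumptions: `check_yolo_layout` and `IMAGE_SUFFIXES` are hypothetical names (not part of Ultralytics), and the folder names are taken from the tree above. It reports missing folders, images that have no label file, and file names that appear in both train and val.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}  # assumption: extend as needed


def check_yolo_layout(dataset_root):
    """Return a dict describing problems found in a YOLO-style dataset folder."""
    root = Path(dataset_root)
    report = {"missing_dirs": [], "unlabelled": [], "overlap": []}
    names_per_split = {}

    for split in ("train", "val"):
        image_dir = root / "images" / split
        label_dir = root / "labels" / split
        for folder in (image_dir, label_dir):
            if not folder.is_dir():
                report["missing_dirs"].append(str(folder.relative_to(root)))
        if not image_dir.is_dir():
            names_per_split[split] = set()
            continue

        images = [p for p in image_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES]
        names_per_split[split] = {p.name for p in images}
        for image in images:
            # A label file shares the image's stem and has a .txt extension.
            if not (label_dir / f"{image.stem}.txt").is_file():
                report["unlabelled"].append(f"{split}/{image.name}")

    # The same file name in both splits would leak training data into validation.
    report["overlap"] = sorted(names_per_split["train"] & names_per_split["val"])
    report["unlabelled"].sort()
    return report
```

An image without a label file is not necessarily an error, since it can serve as a background example, so the function only reports these rather than raising.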

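Example labels/train file. Each line describes one object, and all four coordinates are normalized by the image width and height, so they fall between 0 and 1: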
```
<class_id> <x_center> <y_center> <width> <height>
```
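
As a worked example of one label line, the sketch below assumes the source annotation gives pixel corner coordinates (x_min, y_min, x_max, y_max). The function name `to_yolo_line` is illustrative only.

```python
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Format one pixel-space box as a YOLO label line (values normalized to 0..1)."""
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Class 1 box with corners (160, 120) and (480, 360) in a 640x480 image.
print(to_yolo_line(1, 160, 120, 480, 360, 640, 480))
# prints: 1 0.500000 0.500000 0.500000 0.500000
```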

Example data.yaml file:
```
path: /content/dataset  # Root folder
train: images/train
val: images/val

nc: 5  # number of classes
names: ['bottle', 'can', 'plastic bag', 'wrapper', 'paper']

```

### Colabs

When you open the "Files" tab on the left, you'll find yourself in a folder containing
* ..
* sample data

This is Colab's default "content" folder, which is there to get you started.
Ignore it: click the .. entry to go up a level.

---
⚠️ ***CHANGE THIS FILE FROM HERE DOWN TO SUIT YOUR DATASET*** ⚠️

The sections above apply to all dataset conversions.

---

### TACO dataset

#### Overview

The TACO dataset uses COCO-style formatting (segmentation).

Sidebar: COCO (Common Objects in Context) is an object detection, segmentation, and captioning dataset developed by Microsoft. It uses an annotations.json file to organize image data. This JSON annotation approach has become a standard that other datasets follow.

So, TACO has an annotations.json file containing:
*   "images":  List of image metadata
*   "annotations":  List of label data (type of trash, bounding box definition, segmentation data; corresponds to images list)
*   "categories":  List of the different categories this dataset uses
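
To make the three lists concrete, here is a minimal sketch of indexing a COCO-style file so that each image can be matched with its annotations. The sample JSON is made up for illustration (it is not TACO data), and `index_coco` is a hypothetical helper name. The `id`, `image_id`, and `category_id` fields follow the standard COCO layout.

```python
import json
from collections import defaultdict

# Made-up stand-in for annotations.json; a real file would be read with json.load().
SAMPLE_JSON = """
{
  "images": [
    {"id": 1, "file_name": "a.jpg", "width": 640, "height": 480},
    {"id": 2, "file_name": "b.jpg", "width": 640, "height": 480}
  ],
  "annotations": [
    {"id": 10, "image_id": 1, "category_id": 5, "bbox": [160, 120, 320, 240]},
    {"id": 11, "image_id": 1, "category_id": 7, "bbox": [0, 0, 64, 48]}
  ],
  "categories": [
    {"id": 5, "name": "bottle"},
    {"id": 7, "name": "can"}
  ]
}
"""


def index_coco(coco):
    """Build lookups: image id -> image record, category id -> name, image id -> annotations."""
    images_by_id = {img["id"]: img for img in coco["images"]}
    category_names = {cat["id"]: cat["name"] for cat in coco["categories"]}
    annotations_by_image = defaultdict(list)
    for ann in coco["annotations"]:
        annotations_by_image[ann["image_id"]].append(ann)
    return images_by_id, category_names, dict(annotations_by_image)


coco = json.loads(SAMPLE_JSON)
images_by_id, category_names, annotations_by_image = index_coco(coco)
for image_id, image in images_by_id.items():
    labels = [category_names[a["category_id"]] for a in annotations_by_image.get(image_id, [])]
    print(image["file_name"], labels)
# prints:
# a.jpg ['bottle', 'can']
# b.jpg []
```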

#### Conversion
To convert the TACO dataset to a format YOLOv8 can use, we must:
* Split the TACO images into train and val sets
* Extract label and bbox info from annotations.json, and save it in individual txt files corresponding to the image files
* Make a data.yaml file
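
The first two steps can be sketched as below. This is a hedged outline rather than the notebook's actual implementation: the helper names are hypothetical, the split uses only the standard library, and it assumes COCO boxes are stored as [x_min, y_min, width, height] in pixels. Because YOLO class IDs must start at zero, `class_index` is assumed to map each COCO category id to a zero-based class id.

```python
import random


def coco_bbox_to_yolo(bbox, img_w, img_h):
    """Convert a COCO [x_min, y_min, width, height] pixel box to normalized YOLO values."""
    x_min, y_min, box_w, box_h = bbox
    return (
        (x_min + box_w / 2) / img_w,  # x_center
        (y_min + box_h / 2) / img_h,  # y_center
        box_w / img_w,                # width
        box_h / img_h,                # height
    )


def split_ids(image_ids, train_ratio=0.8, seed=42):
    """Shuffle deterministically and split into (train, val) lists with no overlap."""
    ids = sorted(image_ids)
    random.Random(seed).shuffle(ids)
    cut = int(len(ids) * train_ratio)
    return ids[:cut], ids[cut:]


def label_lines(image, annotations, class_index):
    """Build the lines of the label .txt file for one image record."""
    lines = []
    for ann in annotations:
        xc, yc, w, h = coco_bbox_to_yolo(ann["bbox"], image["width"], image["height"])
        class_id = class_index[ann["category_id"]]
        lines.append(f"{class_id} {xc:.6f} {yc:.6f} {w:.6f} {h:.6f}")
    return lines
```

Each list returned by `label_lines` would then be written to `labels/train` or `labels/val` under the same stem as its image, depending on which side of the split the image landed on.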

To verify that your conversion worked, make sure you can train a model and that it outputs images with a bounding box and label.

In [None]:
import json
import shutil
import random
from pathlib import Path
from sklearn.model_selection import train_test_split
import yaml
import pandas as pd # Import pandas

# === IMPORT DATASET ===

# Define the target directory for the dataset
dataset_target_path = Path('/kaggle/working/AquaTrash_dataset')
# Clone the dataset only if it is not already present
if not dataset_target_path.exists():
    print(f"Cloning AquaTrash dataset to {dataset_target_path}...")
    # IPython expands {dataset_target_path}, so the path is defined in one place only
    !git clone https://github.com/Harsh9524/AquaTrash.git {dataset_target_path}
    print(f"Dataset downloaded to {dataset_target_path}\n")
else:
    print(f"Dataset directory {dataset_target_path} already exists. Skipping clone.\n")

# === CONFIG ===
input_root = Path('/kaggle/working/AquaTrash_dataset') # Update to the directory where AquaTrash was cloned
output_root = Path('/kaggle/working/aquatrash_yolo') # You can change the output directory name
train_ratio = 0.8  # 80% training, 20% validation

# === OUTPUT STRUCTURE ===
images_train = output_root / 'images' / 'train'
images_val = output_root / 'images' / 'val'
labels_train = output_root / 'labels' / 'train'
labels_val = output_root / 'labels' / 'val'

# Create folders
for folder in [images_train, images_val, labels_train, labels_val]:
    folder.mkdir(parents=True, exist_ok=True)

# === LOAD ANNOTATIONS ===
# Assuming annotations.csv contains columns: 'image_name', 'x_min', 'y_min', 'x_max', 'y_max', 'class_name'

# Initialize variables before the try block
annotations_df = None
image_names_list = [] # Use image names as identifiers
image_name_to_info = {} # Map image name to its info
annotations_by_image = {} # Group annotations by image name
category_info_list = [] # List of unique categories
category_map = {} # Map category name to class ID
category_names = [] # List of category names for data.yaml

# Optional: suffix to strip from image names in the CSV (disabled; adjust if your names differ)
# SUFFIX_TO_REMOVE = '_jpg.rf.beffaf3b548106ccf1da5dc629bc9504.jpg'

# Directory inside input_root that holds the image files
IMAGE_SOURCE_DIR = 'Images'

try:
    # Attempt to read the CSV file
    # Add dtype specification for bounding box columns to ensure they are numbers
    annotations_df = pd.read_csv(
        input_root / 'annotations.csv',
        dtype={'x_min': float, 'y_min': float, 'x_max': float, 'y_max': float}
    )

    # We need image dimensions (width and height) which are not in our CSV.
    # You will need to get these from the image files themselves.
    # Let's create a helper function to get image size.
    # This requires Pillow (PIL) library.
    try:
        from PIL import Image
    except ImportError:
        print("Pillow library not found. Installing...")
        !pip install Pillow
        from PIL import Image # Import again after installation

    def get_image_size(image_path):
        try:
            with Image.open(image_path) as img:
                return img.size # Returns (width, height)
        except FileNotFoundError:
            print(f"Warning: Image file not found when trying to get size: {image_path}") # Keep this warning if you want to know which files are missing
            return None # Indicate failure
        except Exception as e:
            print(f"Error getting size of image {image_path}: {e}")
            return None


    # Create 'images' information from unique image entries in the DataFrame
    unique_image_names_in_csv = annotations_df['image_name'].drop_duplicates().tolist()
    image_name_to_info = {}
    for image_name_in_csv in unique_image_names_in_csv:
        # Clean the image name from the CSV if the suffix is present
        # Assume the original file name is the part before the suffix
        original_image_name = image_name_in_csv
        # Check if the suffix exists and remove it. Also ensure it ends with .jpg or similar image extension
        # if original_image_name.endswith(SUFFIX_TO_REMOVE):
        #     original_image_name = original_image_name[:-len(SUFFIX_TO_REMOVE)] + '.jpg' # Assume original is .jpg
        # # You might need additional logic here if file extensions vary or if there are other suffix patterns

        # Construct the full image path using the potentially cleaned name and the correct image directory
        img_path_full = input_root / IMAGE_SOURCE_DIR / original_image_name # <--- Changed path here

        size = get_image_size(img_path_full)
        if size:
            width, height = size
            # Store the original image_name from the CSV in the info dictionary
            # This helps link back to the annotations DataFrame rows
            image_name_to_info[image_name_in_csv] = {
                'file_name': original_image_name, # Store the actual file name on disk
                'width': width,
                'height': height
            }
        else:
            # If size couldn't be obtained (file not found etc.), print a warning and skip this image
            print(f"Warning: Skipping image {image_name_in_csv} due to missing file or error getting size.")
            print(f"Attempted path: {img_path_full}")


    # Group annotations by image name (using the original image_name from CSV as the key)
    annotations_by_image = {}
    for index, row in annotations_df.iterrows():
        image_name = row['image_name']

        # Skip if image info wasn't loaded (due to missing file/size error)
        if image_name not in image_name_to_info:
            continue

        # Get bbox directly from columns
        bbox = [row['x_min'], row['y_min'], row['x_max'], row['y_max']]

        annotation = {
            'image_name': image_name, # Keep original image name from CSV
            'class_name': row['class_name'], # Use class_name directly
            'bbox': bbox # Store as [x_min, y_min, x_max, y_max]
        }
        annotations_by_image.setdefault(image_name, []).append(annotation)

    # Create 'categories' information
    unique_class_names = annotations_df['class_name'].drop_duplicates().tolist()
    # Sort category names alphabetically to ensure consistent class IDs
    unique_class_names.sort()

    category_info_list = []
    for idx, class_name in enumerate(unique_class_names):
        category_info_list.append({
            'id': idx, # Assign integer ID based on sorted order
            'name': class_name
        })

    category_map = {cat['name']: cat['id'] for cat in category_info_list} # Map name to ID
    category_names = [cat['name'] for cat in category_info_list] # List of names for data.yaml


except pd.errors.EmptyDataError:
    print(f"Error: The file {input_root / 'annotations.csv'} is empty.")
    # Do not exit, allow script to continue with empty data
except FileNotFoundError:
    print(f"Error: The file {input_root / 'annotations.csv'} was not found.")
    # Do not exit, allow script to continue with empty data
except KeyError as e:
    print(f"Error: Missing expected column in annotations.csv: {e}. Please check your column names ('image_name', 'x_min', 'y_min', 'x_max', 'y_max', 'class_name').")
    # Do not exit, allow script to continue with empty data
except Exception as e:
    print(f"An unexpected error occurred while processing the CSV file: {e}")
    # Do not exit, allow script to continue with empty data

# === SPLIT DATA ===
# Check if image_name_to_info was successfully populated
if not image_name_to_info:
    print("No image data loaded from annotations.csv or image files. Skipping data splitting and processing.")
    train_names = []
    val_names = []
else:
    # Split using the original image names from the CSV which are keys in image_name_to_info
    all_image_names = list(image_name_to_info.keys())
    # Only split if there are images
    if all_image_names:
        train_names, val_names = train_test_split(all_image_names, train_size=train_ratio, random_state=42)
    else:
        print("No valid images found after processing annotations.csv. Skipping data splitting and processing.")
        train_names = []
        val_names = []

# === PROCESS AND WRITE LABELS AND COPY IMAGES ===

def convert_bbox_to_yolo(bbox, img_w, img_h):
    # Add check for valid image dimensions to avoid division by zero
    if img_w <= 0 or img_h <= 0:
        print(f"Warning: Invalid image dimensions ({img_w}x{img_h}). Cannot convert bbox.")
        return None # Indicate failure to convert

    # Input bbox is [x_min, y_min, x_max, y_max] in pixels
    x_min, y_min, x_max, y_max = bbox

    # Clip the corners to the image first, so the center and size stay consistent
    # even when an annotation extends past the image border
    x_min, x_max = max(0.0, min(img_w, x_min)), max(0.0, min(img_w, x_max))
    y_min, y_max = max(0.0, min(img_h, y_min)), max(0.0, min(img_h, y_max))

    # Calculate YOLO format: [x_center, y_center, width, height] (normalized to [0, 1])
    x_center = ((x_min + x_max) / 2) / img_w
    y_center = ((y_min + y_max) / 2) / img_h
    width = max(0.0, x_max - x_min) / img_w
    height = max(0.0, y_max - y_min) / img_h


    # Add validation for calculated bbox values
    if width <= 1e-6 or height <= 1e-6: # Using a small epsilon for float comparison
        print(f"Warning: Calculated YOLO bbox has zero or near-zero width ({width}) or height ({height}). Original bbox: {bbox}, Image WxH: {img_w}x{img_h}")
        # Optionally, you could return None here if zero-size boxes are invalid
        # return None

    return x_center, y_center, width, height

def process_dataset_split(image_names_split, target_image_folder, target_label_folder):
    """
    Copies images to the target folder and creates YOLO label files.
    image_names_split: List of image names (from CSV) for this split (train/val).
    target_image_folder: Path to the folder where images for this split will be copied.
    target_label_folder: Path to the folder where .txt label files will be saved.
    """
    print(f"Processing split for {target_image_folder.parent.name}/{target_image_folder.name}...")
    files_processed_count = 0
    labels_written_count = 0

    for image_name_from_csv in image_names_split:
        if image_name_from_csv not in image_name_to_info:
            print(f"Warning: Skipping {image_name_from_csv}, not found in image_name_to_info (likely skipped during annotation loading).")
            continue

        img_info = image_name_to_info[image_name_from_csv]
        original_file_name = img_info['file_name'] # Actual file name on disk
        img_w, img_h = img_info['width'], img_info['height']

        source_image_path = input_root / IMAGE_SOURCE_DIR / original_file_name
        target_image_path = target_image_folder / original_file_name

        # Copy image
        if source_image_path.exists():
            shutil.copy(source_image_path, target_image_path)
        else:
            print(f"Warning: Source image not found, cannot copy: {source_image_path}")
            continue # Skip if image doesn't exist

        # Prepare label data
        yolo_data_for_file = []
        current_image_annotations = annotations_by_image.get(image_name_from_csv, [])

        # Images without annotations are still copied; no label file is written for them,
        # so they act as background images during training.

        for ann in current_image_annotations:
            class_name = ann['class_name']
            bbox = ann['bbox'] # [x_min, y_min, x_max, y_max]

            if class_name not in category_map:
                print(f"Warning: Class name '{class_name}' for image {image_name_from_csv} not in category_map. Skipping this annotation.")
                continue

            class_id = category_map[class_name]
            yolo_bbox_coords = convert_bbox_to_yolo(bbox, img_w, img_h)

            if yolo_bbox_coords:
                # yolo_bbox_coords is (x_center, y_center, width, height)
                yolo_data_for_file.append(f"{class_id} {yolo_bbox_coords[0]:.6f} {yolo_bbox_coords[1]:.6f} {yolo_bbox_coords[2]:.6f} {yolo_bbox_coords[3]:.6f}")

        # Write label file if there's any valid annotation data
        if yolo_data_for_file:
            # Use the original_file_name's stem for the .txt file
            label_file_name = Path(original_file_name).stem + '.txt'
            label_file_path = target_label_folder / label_file_name
            with open(label_file_path, 'w') as f:
                f.write('\n'.join(yolo_data_for_file))
            labels_written_count +=1

        files_processed_count += 1
        if files_processed_count % 100 == 0: # Print progress every 100 files
            print(f"  Processed {files_processed_count}/{len(image_names_split)} images for {target_image_folder.name}...")

    print(f"Finished processing for {target_image_folder.name}. Copied {files_processed_count} images. Wrote {labels_written_count} label files.")


# Process training data
if train_names:
    print("\nProcessing training data...")
    process_dataset_split(train_names, images_train, labels_train)
else:
    print("\nNo training data to process.")

# Process validation data
if val_names:
    print("\nProcessing validation data...")
    process_dataset_split(val_names, images_val, labels_val)
else:
    print("\nNo validation data to process.")

print("\nFinished copying images and writing label files.\n")


Dataset directory /kaggle/working/AquaTrash_dataset already exists. Skipping clone.


Processing training data...
Processing split for images/train...
  Processed 100/295 images for train...
  Processed 200/295 images for train...
Finished processing for train. Copied 295 images. Wrote 295 label files.

Processing validation data...
Processing split for images/val...
Finished processing for val. Copied 74 images. Wrote 74 label files.

Finished copying images and writing label files.
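
Before training, it can be worth spot-checking the label files that were just written. The sketch below is a minimal, hypothetical validator (`validate_label_file` is not an Ultralytics function). It assumes the five-field format described earlier and flags wrong field counts, non-numeric values, class IDs outside the expected range, and coordinates outside [0, 1].

```python
from pathlib import Path


def validate_label_file(label_path, num_classes):
    """Return a list of human-readable problems found in one YOLO label file."""
    problems = []
    for line_no, line in enumerate(Path(label_path).read_text().splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        parts = line.split()
        if len(parts) != 5:
            problems.append(f"line {line_no}: expected 5 fields, found {len(parts)}")
            continue
        try:
            class_id = int(parts[0])
            coords = [float(value) for value in parts[1:]]
        except ValueError:
            problems.append(f"line {line_no}: non-numeric value")
            continue
        if not 0 <= class_id < num_classes:
            problems.append(f"line {line_no}: class id {class_id} outside 0..{num_classes - 1}")
        if any(not 0.0 <= value <= 1.0 for value in coords):
            problems.append(f"line {line_no}: coordinate outside [0, 1]")
    return problems
```

In this notebook it could be called for every file in `labels_train.glob('*.txt')` and `labels_val.glob('*.txt')`, with `num_classes=len(category_names)`.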



In [None]:
data_yaml_content = {
    'train': str(images_train.resolve()),  # Absolute path to training images
    'val': str(images_val.resolve()),    # Absolute path to validation images
    'nc': len(category_names),             # Number of classes
    'names': category_names                # List of class names
}

yaml_path = output_root / 'data.yaml'
try:
    with open(yaml_path, 'w') as f:
        yaml.dump(data_yaml_content, f, default_flow_style=False, sort_keys=False)
    print(f"Successfully created {yaml_path}")
    print("Content of data.yaml:")
    # Print content for verification
    with open(yaml_path, 'r') as f:
        print(f.read())
    if not category_names:
        print("Warning: 'category_names' is empty in data.yaml. 'nc' is 0. YOLO training will likely fail due to no classes defined.")
except Exception as e:
    print(f"Error creating or writing to {yaml_path}: {e}")

model = YOLO("yolov8n.pt")  # YOLOv8 nano weights, matching the YOLOv8 target of this notebook

# --- Verification for model initialization ---
if model is not None:
    print(f"Model initialized successfully. Type: {type(model)}")
    try:
        model.info() # This usually prints model summary if loaded correctly
    except Exception as e:
        print(f"Could not get model.info(): {e}")
else:
    print("Error: Model initialization failed, 'model' is None.")
    # Consider exiting or raising an error if model initialization is critical
# --- End verification ---

results = model.train(data=output_root / 'data.yaml', epochs=5, imgsz=640)

# --- Verification for model training ---
if results is not None:
    print("Training process completed. Results object obtained.")
    # For detection, the object returned by model.train() exposes its metrics via results_dict
    metrics_dict = getattr(results, 'results_dict', None)
    if metrics_dict:
        print(f"Training metrics available: {list(metrics_dict.keys())}")
        # maps holds mAP50-95 per class (one value per class, not one per epoch)
        if getattr(results, 'maps', None) is not None:
            print(f"Per-class mAP50-95: {results.maps}")
    else:
        print("Warning: Training results metrics not found or empty in the results object.")

    # Check if the trainer object and save directory are available
    if hasattr(model, 'trainer') and model.trainer is not None:
        print(f"Trainer object is available. Save directory: {model.trainer.save_dir}")
        if hasattr(model.trainer, 'best') and model.trainer.best:
            print(f"Path to best weights: {model.trainer.best}")
        else:
            print("Warning: model.trainer.best path not found.")
    else:
        print("Error: model.trainer is None after training. Training likely failed to initialize or complete properly.")

else:
    print("Error: Training process did not return a results object. 'results' is None.")
# --- End verification ---

Successfully created /kaggle/working/aquatrash_yolo/data.yaml
Content of data.yaml:
train: /kaggle/working/aquatrash_yolo/images/train
val: /kaggle/working/aquatrash_yolo/images/val
nc: 4
names:
- glass
- metal
- paper
- plastic

Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolov8n.pt to 'yolov8n.pt'...


100%|██████████| 6.25M/6.25M [00:00<00:00, 77.2MB/s]


Model initialized successfully. Type: <class 'ultralytics.models.yolo.model.YOLO'>
YOLOv8n summary: 129 layers, 3,157,200 parameters, 0 gradients, 8.9 GFLOPs
Ultralytics 8.3.147 🚀 Python-3.11.12 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
engine/trainer: agnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=/kaggle/working/aquatrash_yolo/data.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=5, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolov8n.pt, momentum=0.937, mosaic=

100%|██████████| 755k/755k [00:00<00:00, 14.2MB/s]

Overriding model.yaml nc=80 with nc=4

                   from  n    params  module                                       arguments                     
  0                  -1  1       464  ultralytics.nn.modules.conv.Conv             [3, 16, 3, 2]                 
  1                  -1  1      4672  ultralytics.nn.modules.conv.Conv             [16, 32, 3, 2]                
  2                  -1  1      7360  ultralytics.nn.modules.block.C2f             [32, 32, 1, True]             
  3                  -1  1     18560  ultralytics.nn.modules.conv.Conv             [32, 64, 3, 2]                
  4                  -1  2     49664  ultralytics.nn.modules.block.C2f             [64, 64, 2, True]             
  5                  -1  1     73984  ultralytics.nn.modules.conv.Conv             [64, 128, 3, 2]               
  6                  -1  2    197632  ultralytics.nn.modules.block.C2f             [128, 128, 2, True]           
  7                  -1  1    295424  ultralytics




Model summary: 129 layers, 3,011,628 parameters, 3,011,612 gradients, 8.2 GFLOPs

Transferred 319/355 items from pretrained weights
Freezing layer 'model.22.dfl.conv.weight'
AMP: running Automatic Mixed Precision (AMP) checks...
Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11n.pt to 'yolo11n.pt'...


100%|██████████| 5.35M/5.35M [00:00<00:00, 61.1MB/s]

AMP: checks passed ✅
train: Fast image access ✅ (ping: 0.0±0.0 ms, read: 749.4±298.5 MB/s, size: 23.7 KB)
train: Scanning /kaggle/working/aquatrash_yolo/labels/train... 295 images, 0 backgrounds, 0 corrupt: 100%|██████████| 295/295 [00:00<00:00, 2621.47it/s]
train: New cache created: /kaggle/working/aquatrash_yolo/labels/train.cache
albumentations: Blur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01, method='weighted_average', num_output_channels=3), CLAHE(p=0.01, clip_limit=(1.0, 4.0), tile_grid_size=(8, 8))
val: Fast image access ✅ (ping: 0.0±0.0 ms, read: 577.8±137.2 MB/s, size: 154.9 KB)
val: Scanning /kaggle/working/aquatrash_yolo/labels/val... 74 images, 0 backgrounds, 0 corrupt: 100%|██████████| 74/74 [00:00<00:00, 1872.63it/s]
val: New cache created: /kaggle/working/aquatrash_yolo/labels/val.cache





Plotting labels to runs/detect/train/labels.jpg...
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically...
optimizer: AdamW(lr=0.00125, momentum=0.9) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.0005), 63 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 2 dataloader workers
Logging results to runs/detect/train
Starting training for 5 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


        1/5       2.1G      1.821      4.047      1.675         20        640: 100%|██████████| 19/19 [00:10<00:00,  1.90it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 3/3 [00:03<00:00,  1.12s/it]

                   all         74         91     0.0024      0.441     0.0763     0.0438






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


        2/5      2.57G      1.767       3.42      1.582         16        640: 100%|██████████| 19/19 [00:06<00:00,  2.73it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 3/3 [00:01<00:00,  2.47it/s]

                   all         74         91     0.0063      0.827      0.185     0.0907






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


        3/5      2.59G      1.725       3.14      1.535         13        640: 100%|██████████| 19/19 [00:07<00:00,  2.42it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 3/3 [00:01<00:00,  2.78it/s]


                   all         74         91    0.00725      0.844      0.206     0.0958

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


        4/5       2.6G       1.67      2.999      1.514         17        640: 100%|██████████| 19/19 [00:08<00:00,  2.34it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 3/3 [00:01<00:00,  2.87it/s]


                   all         74         91    0.00607      0.925      0.206     0.0858

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


        5/5      2.63G      1.716      2.909      1.553         12        640: 100%|██████████| 19/19 [00:06<00:00,  3.10it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 3/3 [00:01<00:00,  1.67it/s]

                   all         74         91      0.592     0.0748      0.255       0.11






5 epochs completed in 0.015 hours.
Optimizer stripped from runs/detect/train/weights/last.pt, 6.2MB
Optimizer stripped from runs/detect/train/weights/best.pt, 6.2MB

Validating runs/detect/train/weights/best.pt...
Ultralytics 8.3.147 🚀 Python-3.11.12 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
Model summary (fused): 72 layers, 3,006,428 parameters, 0 gradients, 8.1 GFLOPs


                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 3/3 [00:01<00:00,  2.70it/s]


                   all         74         91      0.591     0.0722      0.254       0.11
                 glass          8          8          1          0      0.198     0.0478
                 metal         16         21      0.148     0.0476       0.13     0.0527
                 paper         23         24          1     0.0809      0.434      0.202
               plastic         34         38      0.217       0.16      0.253      0.138
Speed: 0.3ms preprocess, 3.2ms inference, 0.0ms loss, 5.0ms postprocess per image
Results saved to runs/detect/train
Training process completed. Results object obtained.
Trainer object is available. Save directory: runs/detect/train
Path to best weights: runs/detect/train/weights/best.pt


If the model outputs even one image with a bounding box and label, then the dataset should work for our project! Verify this using the code below.

In [None]:
import cv2
from random import sample
import matplotlib.pyplot as plt

if model.trainer is None:
    raise RuntimeError(
        "model.trainer is None. This usually means the training process "
        "failed or was interrupted. Please check the output of the "
        "model.train() call for specific error messages from YOLO. "
        "Common causes include issues with 'data.yaml' (e.g., nc=0, incorrect paths), "
        "missing/corrupt data, or resource limitations."
    )

# Dynamically get the path to the best weights from the completed training.
# The 'model' object is the one on which 'train' was called, and its 'trainer'
# attribute stores information about the run, including the save directory.
best_weights_path = Path(model.trainer.save_dir) / 'weights' / 'best.pt'

# Load the trained model for inference using the dynamically found path.
# This reassigns the 'model' variable to the loaded inference model.
model = YOLO(best_weights_path)

# Use the 'images_train' path defined earlier for consistency.
train_images_path = images_train 

image_files = list(train_images_path.glob('*.jpg'))

# Check if images were found before attempting to sample
if not image_files:
    print(f"Error: No images found in {train_images_path}. Cannot display predictions.")
    sample_images = [] # Ensure sample_images is empty to skip the loop
elif len(image_files) < 10:
    print(f"Warning: Found only {len(image_files)} images in {train_images_path}. Using all available images for prediction.")
    sample_images = image_files # Use all found images if less than 10
else:
    sample_images = sample(image_files, 10)

for image_path in sample_images:
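    result = model(image_path)[0]
    # result.plot() returns a BGR array (OpenCV convention); convert to RGB for matplotlib
    annotated_image = cv2.cvtColor(result.plot(), cv2.COLOR_BGR2RGB)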

    plt.figure(figsize=(8, 6))
    plt.imshow(annotated_image)
    plt.title(f'Predictions: {image_path.name}')
    plt.axis('off')
    plt.show()


If the model successfully generated even one image with a bounding box and label, please download the converted dataset and upload it to Google Drive: https://drive.google.com/drive/folders/1bUkIYQRXX08OKI5TuOSg-eqntSudGaFB.
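
If the `zip` command-line tool is not available in your environment, the archive can also be built with the Python standard library. The sketch below is one possible alternative to the shell commands in the next cell. `zip_dataset` is a hypothetical helper, and its `today` parameter is an assumption added only so the date prefix can be fixed when testing.

```python
import shutil
from datetime import datetime
from pathlib import Path


def zip_dataset(dataset_dir, output_dir, today=None):
    """Zip dataset_dir into output_dir/<YYYYMMDD>_<folder name>.zip and return the path."""
    dataset_dir = Path(dataset_dir)
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    date_str = (today or datetime.now()).strftime("%Y%m%d")
    base_name = output_dir / f"{date_str}_{dataset_dir.name}"
    # make_archive appends ".zip" itself and stores paths relative to root_dir.
    return Path(shutil.make_archive(str(base_name), "zip", root_dir=str(dataset_dir)))
```

Keep `output_dir` outside `dataset_dir`, otherwise the archive would be written into the folder it is compressing.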

In [16]:
from datetime import datetime

dataset_name = 'aquatrash_yolo'

# Generate date prefix
date_str = datetime.now().strftime('%Y%m%d')
zip_name = f"{date_str}_{dataset_name}.zip"

# Change directory and zip
%cd /kaggle/working/{dataset_name}
!zip -r /content/{zip_name} .

print(f"✅ Zip created at /content/{zip_name}")

/kaggle/working/aquatrash_yolo
  adding: labels/ (stored 0%)
  adding: labels/val/ (stored 0%)
  adding: labels/val/000019_jpg.rf.e4d28c72e4ca25a59b07de7d8cc87080.txt (deflated 14%)
  adding: labels/val/000021_jpg.rf.22832d293d317b13e588374218a71e00.txt (deflated 40%)
  adding: labels/val/000009_jpg.rf.8434a803f417f51fe5ebe4f6b9697cce.txt (deflated 16%)
  adding: labels/val/000015_jpg.rf.8a658891a7ac82f12000145ec489cf4c.txt (deflated 31%)
  adding: labels/val/000094_JPG.rf.b6241bd0418850c0ca1916017d039774.txt (deflated 5%)
  adding: labels/val/000007_JPG.rf.a40debf74b3f5228742cd22b8ea9ce08.txt (deflated 8%)
  adding: labels/val/000015_jpg.rf.eebd5ba151182f6edadd573ffdf6fe85.txt (deflated 28%)
  adding: labels/val/000015_JPG.rf.df7d83766fb301cd8635988698f82196.txt (deflated 5%)
  adding: labels/val/000025_jpg.rf.2c5a598559f4ac3e19a1a3bf2a098d4a.txt (deflated 11%)
  adding: labels/val/000011_jpg.rf.8b7fff6ea8f7330e402b6a7757b1866b.txt (deflated 8%)
  adding: labels/val/000091_JPG.rf.6ab3