The purpose of this notebook is to convert a COCO-ish dataset to a format that can be used to train an Ultralytics YOLO model.

# Using this notebook: workflow

To use, make a copy of this notebook, and adapt it to work with your specific dataset. Please save your version of this ipynb file on GitHub in *recyclo/scripts*.

(File > Save a copy in GitHub > File path = "scripts/my_filename.ipynb" to save notebook in scripts folder)

Once you've generated your YOLO dataset, and are confident you can train a model with it, please upload your converted dataset to the Recyclo datasets google drive, https://drive.google.com/drive/folders/1bUkIYQRXX08OKI5TuOSg-eqntSudGaFB.

(Why Google Drive? Because these datasets are too large for GitHub!)

# What's in this notebook: contents

Notebook contents:
- intro to YOLO
- intro to COCO
- dataset specific notes (update for your specific dataset)
- convert dataset to simple COCO (update for your specific dataset)
- convert simple COCO to YOLO

# Pro tips about Colabs

When you open the "Files" tab on the left, you'll find yourself in a folder containing
* ..
* sample data

This is a colab thing, the "content" folder, to get you started.
Ignore it: click the .. to go up a level.

# Intro to YOLO

## General

In general, YOLO models output the following for a given image:
* Bounding box
* Class label
* Confidence score

To train a YOLO model, we need object detection datasets that contain images of what we're looking for (trash), and annotations: class labels and bounding boxes.

## Ultralytics YOLO

In this project we will use Ultralytics YOLO object detection, eg their YOLO11n model. YOLO11n is a pretrained object detection model developed by Ultralytics.

Ultralytics YOLO expects datasets in the following format:

```
dataset/
├── images/
│   ├── train/  <-- image files for training.
│   ├── val/    <-- image files for validation after each epoch. Must not overlap with images in train.
|   └── test/   <-- optional: can put some image files here for benchmarking.
├── labels/
│   ├── train/  <-- one .txt file per train image (must have same name). Contains class and bbox info..
│   ├── val/    <-- one .txt file per val image.
|   └── test/   <-- one .txt file per test image.
└── data.yaml   <-- config file; helps tie all the above together.
```

Example labels/train file:
```
<class_id> <x_center> <y_center> <width> <height>
```

Example data.yaml file:
```
path: /content/dataset  # Root folder
train: images/train
val: images/val

nc: 5  # number of classes
names: ['bottle', 'can', 'plastic bag', 'wrapper', 'paper']

```

# Intro to COCO

## General
COCO, Common Objects in Context, is a object detection, segmentation, and captioning dataset developed by Microsoft. It uses an annotations.json file to organize image data. This json annotation approach has become standard for other datasets to use.

Lots of datasets use COCO-style of formatting. In addition to the training images themselves, these datasets have at least one annotations json file which contains the following:
*   "images":  List of image metadata
*   "annotations":  List of label data (type of trash, bounding box definition, segmentation data; corresponds to images list)
*   "categories":  List of the different categories this dataset uses

An example of a COCO-style dataset file structure is as follows:
```
dataset/
├── annotations/
│   ├── instances_train2017.json
│   ├── instances_val2017.json
│   ├── person_keypoints_train2017.json
│   ├── captions_train2017.json
│   └── ... (other task-specific .json files)
├── images/
│   ├── train2017/
│   │   ├── 000000000009.jpg
│   │   ├── 000000000025.jpg
│   │   └── ...
│   └── val2017/
│       ├── 000000000139.jpg
│       ├── 000000000285.jpg
│       └── ...
└── LICENSE.txt (optional)
```

## Simple COCO format required by the COCO-to-YOLO conversion function


To use the COCO to YOLO conversion function below, your data set must conform to the following (vastly simplified) COCO-like directory structure and json structure. It's unlikely that your dataset will conform to these specifications out of the box, so please use the code section below to modify your data's structure to match.

The simple COCO directory structure must be as follows, with a folder called "dataset" located in your "content" directory:
```
dataset/
├── images/
│   ├── 000001.jpg  # or png or whatever
│   ├── 000002.jpg
│   └── ...
└── annotations.json
```

And the annotations.json file must contain information in the following structure, and using the following json keywords:
```
{
  "images": [
    {
      "id": 0,
      "file_name": "000000.jpg",
      "width": 640,
      "height": 480
    },
    {
      "id": 1,
      "file_name": "000001.jpg",
      "width": 800,
      "height": 600
    }
  ],
  "annotations": [
    {
      "id": 0,
      "image_id": 0,
      "category_id": 0,
      "bbox": [100, 120, 50, 60],
      "area": 3000,
      "iscrowd": 0
    },
    {
      "id": 1,
      "image_id": 1,
      "category_id": 0,
      "bbox": [20, 30, 40, 50],
      "area": 2000,
      "iscrowd": 0
    }
  ],
  "categories": [
    {
      "id": 0,
      "name": "trash"
    }
  ]
}
```
Note that there is only one category: "trash".


# Helpful functions

This section has some helpful functions you can use later in this notebook.

In [1]:
# ⚠️ DO NOT MODIFY THIS CELL
# This cell sets up a helpful function for inspecting json file contents

import json
from pprint import pprint

def show_first_two_per_category(json_path):
    """
    Prints the first two entries of each root-level list in a JSON file.

    Useful for quickly inspecting the structure and content of a COCO-style
    annotations json file.

    It pretty-prints the first two entries of each top-level key that contains a list.

    Args:
        json_path (str or Path): Path to the JSON file to inspect.

    Raises:
        FileNotFoundError: If the provided path does not point to an existing file.
        json.JSONDecodeError: If the file is not valid JSON.
    """
    json_path = Path(json_path)

    if not json_path.exists():
        print(f"File not found: {json_path}")
        return

    with open(json_path, 'r') as f:
        data = json.load(f)

    for key, value in data.items():
        print(f"\n--- {key.upper()} (showing first 2 entries) ---")
        if isinstance(value, list):
            for item in value[:2]:
                pprint(item)
        else:
            print(f"{key} is not a list, skipping.")


---
⚠️‼️ ***THE SECTION TO CHANGE FOR YOUR SPECIFIC DATASET STARTS HERE*** ‼️⚠️

The sections above apply for all dataset conversions.

---

In [2]:
# ✏️ Enter your dataset-specific code here
# This cell is for importing your dataset to the notebook, and defining its name and path.

# TACO: Sometimes you have to wait a while or run this twice for it to show up in the file tree
# import kagglehub
from pathlib import Path

dataset_name = "TACO"
taco_dataset_path = Path('/kaggle/working/AquaTrash_dataset')
if not taco_dataset_path.exists():
    print(f"Cloning AquaTrash dataset to {taco_dataset_path}...")
    !git clone https://github.com/Harsh9524/AquaTrash.git /kaggle/working/AquaTrash_dataset
    print(f"Dataset downloaded to {taco_dataset_path}\n")
else:
    print(f"Dataset directory {taco_dataset_path} already exists. Skipping clone.\n")

print(f"{dataset_name} dataset downloaded to {taco_dataset_path}\n")

Cloning AquaTrash dataset to /kaggle/working/AquaTrash_dataset...
Cloning into '/kaggle/working/AquaTrash_dataset'...
remote: Enumerating objects: 2948, done.[K
remote: Counting objects: 100% (46/46), done.[K
remote: Compressing objects: 100% (44/44), done.[K
remote: Total 2948 (delta 22), reused 6 (delta 1), pack-reused 2902 (from 1)[K
Receiving objects: 100% (2948/2948), 122.45 MiB | 16.83 MiB/s, done.
Resolving deltas: 100% (22/22), done.
Dataset downloaded to /kaggle/working/AquaTrash_dataset

TACO dataset downloaded to /kaggle/working/AquaTrash_dataset



In [3]:
# ✏️ Enter your dataset-specific code here
# This cell is for inspecting your annotations json file(s), to help discover some of the quirks your dataset has

show_first_two_per_category(taco_dataset_path / "data" / "annotations.json")

File not found: /kaggle/working/AquaTrash_dataset/data/annotations.json


# TACO dataset
✏️ Modify this section for your specific dataset.

The TACO dataset uses COCO-style formatting (it has segmentation).

## Quirks

TACO has 15 batch folders, each containing jpgs with names like 000000.jpg, 000001.jpg, 000002.jpg, etc.
* Some numbers are skipped, eg 000000.jpg, 000001.jpg, 000003.jpg.
* The name 000000.jpg is used in multiple batch folders, to name different image files.
* It contains segmentation data (will be discarded).
* It uses categories (specific) and supercategories (more general) - only the supercategories will be retained.
* It includes scene categories (eg clearn streets) - this will be discarded.
  * It might be interesting in the future to consider using a multi-tasking model - detecting trash and also classsifying the scene might improve the models performance by adding a bit of context awareness.

## Conversion
To convert the TACO dataset to a format ultralytics YOLO can use, we must:
* Give the images unique names
* Split the TACO images into train and val sets
* Extract label and bbox info from annotations.json, and save it in individual txt files corresponding to the image files
* Make a data.yaml file

In [11]:
import json
import shutil
import random
from pathlib import Path
from sklearn.model_selection import train_test_split
import yaml
import pandas as pd
from PIL import Image
from collections import defaultdict # Import defaultdict

# === IMPORT DATASET ===
dataset_target_path = Path('/kaggle/working/AquaTrash_dataset')
if not dataset_target_path.exists():
    print(f"Cloning AquaTrash dataset to {dataset_target_path}...")
    !git clone https://github.com/Harsh9524/AquaTrash.git /kaggle/working/AquaTrash_dataset
    print(f"Dataset downloaded to {dataset_target_path}\n")
else:
    print(f"Dataset directory {dataset_target_path} already exists. Skipping clone.\n")

# === CONFIG ===
input_root = Path('/kaggle/working/AquaTrash_dataset')
# Output directory for the simple COCO format data
simple_coco_output_root = Path('/kaggle/working/aquatrash_simple_coco')
dataset_name = "AquaTrash" # Define dataset name for the output YOLO dataset


# Create simple COCO output folders
simple_coco_images_dir = simple_coco_output_root / 'images'
simple_coco_images_dir.mkdir(parents=True, exist_ok=True)

# === LOAD ANNOTATIONS ===
IMAGE_SOURCE_DIR = 'Images'

try:
    annotations_df = pd.read_csv(
        input_root / 'annotations.csv',
        dtype={'x_min': float, 'y_min': float, 'x_max': float, 'y_max': float}
    )

    def get_image_size(image_path):
        try:
            with Image.open(image_path) as img:
                return img.size
        except FileNotFoundError:
            # print(f"Warning: Image file not found when trying to get size: {image_path}") # Too chatty
            return None
        except Exception as e:
            print(f"Error getting size of image {image_path}: {e}")
            return None

    # Prepare data for simple COCO format
    simple_coco_images = []
    simple_coco_annotations = []
    simple_coco_categories = []

    # Create category map
    unique_class_names = annotations_df['class_name'].drop_duplicates().tolist()
    unique_class_names.sort() # Ensure consistent class IDs
    category_map = {name: idx for idx, name in enumerate(unique_class_names)}

    simple_coco_categories = [{"id": idx, "name": name} for name, idx in category_map.items()]


    image_id_counter = 0
    annotation_id_counter = 0

    # Collect unique image names from the CSV that actually have corresponding files
    # and build the simple_coco_images list and a mapping for quick lookup
    image_name_to_simple_coco_id = {}

    # Iterate through unique image names in the CSV to find existing files and get their sizes
    processed_image_count = 0
    for original_image_name_in_csv in annotations_df['image_name'].drop_duplicates():
        # Construct the full image path using the correct image directory
        img_path_full = input_root / IMAGE_SOURCE_DIR / original_image_name_in_csv

        size = get_image_size(img_path_full)
        if size:
            width, height = size
            # Assign a new sequential ID for the simple COCO format
            simple_coco_id = image_id_counter
            image_name_to_simple_coco_id[original_image_name_in_csv] = simple_coco_id

            simple_coco_images.append({
                "id": simple_coco_id,
                "file_name": original_image_name_in_csv, # Use the name from the CSV directly for now
                "width": width,
                "height": height
            })
            image_id_counter += 1
            processed_image_count += 1
            if processed_image_count % 100 == 0:
                print(f"Processed info for {processed_image_count} unique images...")
        # else:
            # Warning about missing image is handled in get_image_size function
            # print(f"Skipping image {original_image_name_in_csv} due to missing file or size error.")


    print(f"Finished processing info for {processed_image_count} unique images that exist on disk.")


    # Create simple_coco_annotations from the DataFrame
    skipped_annotations_count = 0
    for index, row in annotations_df.iterrows():
        original_image_name_in_csv = row['image_name']

        # Check if the image associated with this annotation was successfully added to simple_coco_images
        if original_image_name_in_csv in image_name_to_simple_coco_id:
            image_id = image_name_to_simple_coco_id[original_image_name_in_csv]
            class_name = row['class_name']

            if class_name in category_map:
                category_id = category_map[class_name]
                bbox = [row['x_min'], row['y_min'], row['x_max'], row['y_max']]
                # Calculate area (simple bounding box area)
                area = (bbox[2] - bbox[0]) * (bbox[3] - bbox[1])

                simple_coco_annotations.append({
                    "id": annotation_id_counter,
                    "image_id": image_id,
                    "category_id": category_id,
                    "bbox": bbox, # Keep as [x_min, y_min, width, height] for now, conversion function expects this
                    "area": area,
                    "iscrowd": 0 # Assuming no crowded objects for simplicity
                })
                annotation_id_counter += 1
            else:
                # This case should be rare if category_map is built from the same data
                print(f"Warning: Skipping annotation for image {original_image_name_in_csv} with unknown class '{class_name}'.")
                skipped_annotations_count += 1
        else:
            # This annotation belongs to an image that was skipped (e.g., missing file)
            # print(f"Warning: Skipping annotation for image {original_image_name_in_csv} because the image was not processed.") # Too chatty
            skipped_annotations_count += 1


    print(f"Finished processing {len(simple_coco_annotations)} annotations. Skipped {skipped_annotations_count} annotations.")


    # Write the simple COCO annotations.json file
    simple_coco_json_data = {
        "images": simple_coco_images,
        "annotations": simple_coco_annotations,
        "categories": simple_coco_categories
    }

    simple_coco_json_path = simple_coco_output_root / 'annotations.json'
    with open(simple_coco_json_path, 'w') as f:
        json.dump(simple_coco_json_data, f, indent=4)

    print(f"Simple COCO annotations saved to {simple_coco_json_path}")

    # Copy image files to the simple COCO images directory
    print(f"\nCopying image files to {simple_coco_images_dir}...")
    copied_image_count = 0
    for img_info in simple_coco_images:
        original_file_name = img_info['file_name']
        source_image_path = input_root / IMAGE_SOURCE_DIR / original_file_name
        target_image_path = simple_coco_images_dir / original_file_name

        if source_image_path.exists():
            shutil.copy(source_image_path, target_image_path)
            copied_image_count += 1
            if copied_image_count % 100 == 0:
                 print(f"  Copied {copied_image_count}/{len(simple_coco_images)} images...")

    print(f"Finished copying {copied_image_count} images.")


except pd.errors.EmptyDataError:
    print(f"Error: The file {input_root / 'annotations.csv'} is empty.")
except FileNotFoundError:
    print(f"Error: The file {input_root / 'annotations.csv'} was not found.")
except KeyError as e:
    print(f"Error: Missing expected column in annotations.csv: {e}. Please check your column names ('image_name', 'x_min', 'y_min', 'x_max', 'y_max', 'class_name').")
except Exception as e:
    print(f"An unexpected error occurred while processing the CSV file: {e}")


# === CONVERT SIMPLE COCO TO YOLO USING THE PROVIDED FUNCTION ===
# Call the conversion function here with the path to the simple COCO dataset
print("\nConverting simple COCO to YOLO format...")
# The convert_coco_to_yolo function expects a Path object to the root of the simple COCO dataset
yolo_data_yaml_path = convert_coco_to_yolo(simple_coco_output_root, dataset_name)

print(f"\nYOLO dataset created at: {yolo_data_yaml_path.parent}")
print(f"Data YAML file: {yolo_data_yaml_path}")

# Store the path to the data.yaml file in a variable that can be used by subsequent cells
output_root = yolo_data_yaml_path.parent # Set output_root to the root of the generated YOLO dataset

Dataset directory /kaggle/working/AquaTrash_dataset already exists. Skipping clone.

Processed info for 100 unique images...
Processed info for 200 unique images...
Processed info for 300 unique images...
Finished processing info for 369 unique images that exist on disk.
Finished processing 469 annotations. Skipped 0 annotations.
Simple COCO annotations saved to /kaggle/working/aquatrash_simple_coco/annotations.json

Copying image files to /kaggle/working/aquatrash_simple_coco/images...
  Copied 100/369 images...
  Copied 200/369 images...
  Copied 300/369 images...
Finished copying 369 images.

Converting simple COCO to YOLO format...
Dataset split: Train=295 images, Validation=37 images, Test=37 images
Processing split for images/train...
  Processed 100/295 images for train...
  Processed 200/295 images for train...
Finished processing for train. Copied 295 images. Wrote 295 label files.
Processing split for images/val...
Finished processing for val. Copied 37 images. Wrote 37 label


---
⚠️‼️ ***THE SECTION TO CHANGE FOR YOUR SPECIFIC DATASET STOPS HERE*** ‼️⚠️

The sections below apply for all dataset conversions.

---

In [None]:
# ⚠️ DO NOT MODIFY THIS CELL
# This cell shows part of your annotations.json file to make sure the format and field names match the required simple COCO format

show_first_two_per_category(target_root / "annotations.json")


--- IMAGES (showing first 2 entries) ---
{'file_name': '000000.jpg', 'height': 2049, 'id': 0, 'width': 1537}
{'file_name': '000001.jpg', 'height': 2049, 'id': 1, 'width': 1537}

--- ANNOTATIONS (showing first 2 entries) ---
{'area': 403954.0,
 'bbox': [517.0, 127.0, 447.0, 1322.0],
 'category_id': 0,
 'id': 0,
 'image_id': 0,
 'iscrowd': 0}
{'area': 1071259.5,
 'bbox': [1.0, 457.0, 1429.0, 1519.0],
 'category_id': 0,
 'id': 1,
 'image_id': 1,
 'iscrowd': 0}

--- CATEGORIES (showing first 2 entries) ---
{'id': 0, 'name': 'trash'}


In [12]:
# ⚠️ DO NOT MODIFY THIS CELL
# This cell executes the simple COCO dataset to YOLO conversion

# Use the output_root defined in the data processing cell
# target_root = output_root # No longer needed as conversion happens in the previous cell
# dataset_name = "AquaTrash" # Define dataset_name based on the dataset being processed

# yolo_data = convert_coco_to_yolo(target_root, dataset_name) # Remove this line
print(f"YOLO data.yaml path is stored in the variable `output_root`: {output_root / 'data.yaml'}")

YOLO data.yaml path is stored in the variable `output_root`: /kaggle/working/AquaTrash_yolo/data.yaml


To verify that your conversion worked, make sure you can train a model and that it outputs images with a bounding box and label.

In [13]:
# ⚠️ DO NOT MODIFY THIS CELL
# This cell imports the ultralytics library required for training a model

!pip install -U ultralytics

from ultralytics import YOLO
import os

Collecting ultralytics
  Downloading ultralytics-8.3.152-py3-none-any.whl.metadata (37 kB)
Collecting ultralytics-thop>=2.0.0 (from ultralytics)
  Downloading ultralytics_thop-2.0.14-py3-none-any.whl.metadata (9.4 kB)
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cuda_cupti_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cudnn-cu12==9.1.0.70 (from torch>=1.8.0->ultralytics)
  Downloading nvidia_cudnn_cu12-9.1.0.70-py3-none-manylinux2014_x86_64.whl.metadata (1.6 kB)
Collecting nvidia-cublas-cu12==12.4.5.8 (from torch>=1.8.0->ultralytics)
  Downloading n

In [16]:
# ⚠️ DO NOT MODIFY THIS CELL
# This cell sets up the function that converts the simple COCO dataset to YOLO format

import shutil
import random
import yaml
from collections import defaultdict

def convert_coco_to_yolo(coco_root: Path, dataset_name: str, train_split: float = 0.8):
    """
    Converts a simple COCO dataset to YOLOv8 format, including train/val split and data.yaml generation.

    Args:
        coco_root (Path): Path to the root of the simple COCO dataset (should contain images/ and annotations.json).
        dataset_name (str): Name of the output dataset folder (e.g., "taco" -> creates "taco_yolo").
        train_split (float, optional): Fraction of images to use for training. Defaults to 0.8.
          The remaining images are split between validation and testing.

    Returns:
        Path: Path to the data.yaml file
    """
    # Paths
    coco_json_path = coco_root / 'annotations.json'
    coco_images_path = coco_root / 'images'

    # Load COCO JSON and get number of images for naming
    with open(coco_json_path, 'r') as f:
        coco = json.load(f)
    n_total = len(coco['images'])

    # Paths con't
    yolo_root = coco_root.parent / f"{dataset_name}_yolo_{n_total}"
    yolo_img_dirs = {
        'train': yolo_root / 'images' / 'train',
        'val': yolo_root / 'images' / 'val',
        'test': yolo_root / 'images' / 'test',
    }
    yolo_lbl_dirs = {
        'train': yolo_root / 'labels' / 'train',
        'val': yolo_root / 'labels' / 'val',
        'test': yolo_root / 'labels' / 'test',
    }

    # Clear and recreate folders
    for d in list(yolo_img_dirs.values()) + list(yolo_lbl_dirs.values()):
        if d.exists():
            shutil.rmtree(d)
        d.mkdir(parents=True, exist_ok=True)

    # Map image_id -> metadata
    image_info = {img['id']: (img['width'], img['height'], img['file_name']) for img in coco['images']}

    # Map image_id -> annotations
    annots_per_image = defaultdict(list)
    for ann in coco['annotations']:
        annots_per_image[ann['image_id']].append(ann)

    # Shuffle and split image IDs
    all_image_ids = list(image_info.keys())
    random.shuffle(all_image_ids)
    n_total = len(all_image_ids)
    n_train = int(n_total * train_split)
    n_val = int((n_total - n_train) / 2)
    n_test = n_total - n_train - n_val

    split_ids = {
        'train': set(all_image_ids[:n_train]),
        'val': set(all_image_ids[n_train:n_train + n_val]),
        'test': set(all_image_ids[n_train + n_val:]),
    }

    def write_labels_and_copy_images(image_ids, img_dir, lbl_dir):
        for image_id in image_ids:
            width, height, filename = image_info[image_id]
            orig_stem = Path(filename).stem
            new_stem = f"{dataset_name}_{orig_stem}"
            label_path = lbl_dir / f"{new_stem}.txt"
            image_src = coco_images_path / filename
            image_dst = img_dir / f"{new_stem}.jpg"

            # Copy image
            if image_src.exists():
                shutil.copy(image_src, image_dst)
            else:
                print(f"Warning: Image not found: {image_src}")
                continue

            # Write labels
            with open(label_path, 'w') as f:
                for ann in annots_per_image.get(image_id, []):
                    class_id = ann['category_id']
                    x, y, w, h = ann['bbox']
                    x_center = (x + w / 2) / width
                    y_center = (y + h / 2) / height
                    w /= width
                    h /= height
                    f.write(f"{class_id} {x_center:.6f} {y_center:.6f} {w:.6f} {h:.6f}\n")

    # Process splits
    for split in ['train', 'val', 'test']:
        write_labels_and_copy_images(
            split_ids[split],
            yolo_img_dirs[split],
            yolo_lbl_dirs[split]
        )

    # Build data.yaml
    categories = sorted(coco['categories'], key=lambda x: x['id'])
    names = [cat['name'] for cat in categories]
    data_yaml = {
        'path': str(yolo_root),
        'train': 'images/train', # Corrected path
        'val': 'images/val',     # Corrected path
        'test': 'images/test',
        'nc': len(names),
        'names': names
    }

    with open(yolo_root / 'data.yaml', 'w') as f:
        yaml.dump(data_yaml, f)

    print(f"YOLO conversion complete: {yolo_root}")
    print(f"  Train: {len(split_ids['train'])}, Val: {len(split_ids['val'])}, Test: {len(split_ids['test'])}")
    return yolo_root / 'data.yaml'

In [19]:
# ⚠️ DO NOT MODIFY THIS CELL
# This cell trains a YOLO model on the converted YOLO dataset to see if it's set up correctly
# Tip: inspect the output of this cell to assess whether training occured properly.

model = YOLO('yolo11n.pt')
# Use the output_root directly as the data path
results = model.train(data=str(output_root / 'data.yaml'), epochs=20, imgsz=640)  # epoch size is small - this is just to see if it can work!

Ultralytics 8.3.152 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
[34m[1mengine/trainer: [0magnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=/kaggle/working/AquaTrash_yolo/data.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=20, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolo11n.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=train2, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0, plots=T

[34m[1mtrain: [0mScanning /kaggle/working/AquaTrash_yolo/labels/train.cache... 295 images, 0 backgrounds, 0 corrupt: 100%|██████████| 295/295 [00:00<?, ?it/s]

[34m[1malbumentations: [0mBlur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01, method='weighted_average', num_output_channels=3), CLAHE(p=0.01, clip_limit=(1.0, 4.0), tile_grid_size=(8, 8))





[34m[1mval: [0mFast image access ✅ (ping: 0.0±0.0 ms, read: 505.1±222.9 MB/s, size: 174.3 KB)


[34m[1mval: [0mScanning /kaggle/working/AquaTrash_yolo/labels/val.cache... 37 images, 0 backgrounds, 0 corrupt: 100%|██████████| 37/37 [00:00<?, ?it/s]


Plotting labels to runs/detect/train2/labels.jpg... 
[34m[1moptimizer:[0m 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
[34m[1moptimizer:[0m AdamW(lr=0.002, momentum=0.9) with parameter groups 81 weight(decay=0.0), 88 weight(decay=0.0005), 87 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 2 dataloader workers
Logging results to [1mruns/detect/train2[0m
Starting training for 20 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       1/20       2.4G      2.473      3.187       2.77         22        640: 100%|██████████| 19/19 [00:10<00:00,  1.80it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  2.67it/s]

                   all         37         45    0.00396      0.978     0.0937     0.0394






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       2/20      2.79G      2.054      2.973      2.318         18        640: 100%|██████████| 19/19 [00:07<00:00,  2.63it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  2.67it/s]

                   all         37         45    0.00396      0.978      0.132     0.0427






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       3/20       2.8G      2.078      2.859      2.321         20        640: 100%|██████████| 19/19 [00:07<00:00,  2.44it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  2.58it/s]


                   all         37         45      0.045      0.178     0.0542     0.0203

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       4/20       2.8G      2.064      2.823      2.328         21        640: 100%|██████████| 19/19 [00:08<00:00,  2.36it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  2.55it/s]

                   all         37         45      0.453      0.222      0.262     0.0676






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       5/20      2.81G       2.08      2.777      2.321         18        640: 100%|██████████| 19/19 [00:06<00:00,  2.80it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  2.21it/s]

                   all         37         45     0.0575     0.0444     0.0447     0.0155






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       6/20      2.81G      1.986      2.676       2.24         22        640: 100%|██████████| 19/19 [00:07<00:00,  2.67it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  3.29it/s]

                   all         37         45     0.0528        0.2     0.0459     0.0115






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       7/20      2.82G      1.938      2.672      2.229         15        640: 100%|██████████| 19/19 [00:08<00:00,  2.28it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  3.31it/s]

                   all         37         45     0.0683     0.0889     0.0356      0.011






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       8/20      2.83G      1.897      2.627      2.204         12        640: 100%|██████████| 19/19 [00:07<00:00,  2.70it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:01<00:00,  1.89it/s]

                   all         37         45      0.257      0.267      0.183     0.0638






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


       9/20      2.83G      1.892      2.561      2.196         23        640: 100%|██████████| 19/19 [00:06<00:00,  2.96it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  3.07it/s]

                   all         37         45      0.272      0.267      0.199      0.108






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      10/20      2.83G      1.909       2.53      2.193         17        640: 100%|██████████| 19/19 [00:07<00:00,  2.48it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  2.90it/s]


                   all         37         45      0.326      0.467       0.33      0.192
Closing dataloader mosaic
[34m[1malbumentations: [0mBlur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01, method='weighted_average', num_output_channels=3), CLAHE(p=0.01, clip_limit=(1.0, 4.0), tile_grid_size=(8, 8))

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      11/20      2.83G      1.618      2.678      2.106          7        640: 100%|██████████| 19/19 [00:11<00:00,  1.67it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  3.56it/s]


                   all         37         45      0.601      0.489      0.464      0.232

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      12/20      2.83G      1.562      2.423      2.062          9        640: 100%|██████████| 19/19 [00:07<00:00,  2.62it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:01<00:00,  1.72it/s]

                   all         37         45      0.267      0.578       0.22      0.111






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      13/20      2.83G      1.454       2.35       1.93         10        640: 100%|██████████| 19/19 [00:06<00:00,  3.01it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  3.19it/s]

                   all         37         45      0.407      0.467      0.315      0.161






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      14/20      2.83G      1.482      2.263      1.937         10        640: 100%|██████████| 19/19 [00:07<00:00,  2.45it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  2.61it/s]

                   all         37         45      0.369      0.599      0.438      0.272






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      15/20      2.83G      1.532      2.279      1.968         10        640: 100%|██████████| 19/19 [00:07<00:00,  2.51it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:01<00:00,  1.96it/s]

                   all         37         45      0.672      0.556      0.592      0.358






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      16/20      2.83G      1.437      2.196      1.887          8        640: 100%|██████████| 19/19 [00:06<00:00,  3.10it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  2.98it/s]


                   all         37         45      0.423      0.667      0.377       0.21

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      17/20      2.83G      1.439       2.13      1.881         10        640: 100%|██████████| 19/19 [00:07<00:00,  2.46it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  2.57it/s]

                   all         37         45      0.383       0.62      0.338      0.168






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      18/20      2.83G      1.396      2.093      1.868          9        640: 100%|██████████| 19/19 [00:06<00:00,  2.78it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:01<00:00,  1.87it/s]

                   all         37         45      0.399      0.644      0.413      0.228






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      19/20      2.83G      1.475       2.09      1.902         10        640: 100%|██████████| 19/19 [00:06<00:00,  3.08it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  3.39it/s]

                   all         37         45      0.425      0.689      0.406      0.216






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


      20/20      2.83G      1.423      2.108      1.851          8        640: 100%|██████████| 19/19 [00:07<00:00,  2.44it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  3.19it/s]

                   all         37         45       0.41       0.71      0.425      0.239






20 epochs completed in 0.051 hours.
Optimizer stripped from runs/detect/train2/weights/last.pt, 5.5MB
Optimizer stripped from runs/detect/train2/weights/best.pt, 5.5MB

Validating runs/detect/train2/weights/best.pt...
Ultralytics 8.3.152 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
YOLO11n summary (fused): 100 layers, 2,582,347 parameters, 0 gradients, 6.3 GFLOPs


                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 2/2 [00:00<00:00,  3.48it/s]


                   all         37         45      0.672      0.556      0.593      0.359
Speed: 0.2ms preprocess, 3.6ms inference, 0.0ms loss, 2.3ms postprocess per image
Results saved to [1mruns/detect/train2[0m


If the model outputs even one image with a bounding box and label, then the dataset should work for our project! Verify this using the code below.

In [20]:
# ⚠️ DO NOT MODIFY THIS CELL
# This cell passes the trained model some images, to see if the model can identify some trash

import cv2
from random import sample
import matplotlib.pyplot as plt
import os # Moved import here

# Get the latest model
runs_detect_dir = Path('runs/detect')
train_dirs = [d for d in runs_detect_dir.iterdir() if d.is_dir() and d.name.startswith("train")]
train_dirs.sort(key=lambda d: d.stat().st_mtime, reverse=True)  # sort by modification time
latest_train_dir = train_dirs[0]
best_model_path = latest_train_dir / 'weights' / 'best.pt'
print(f"Loading {best_model_path}")

# Load the model and try it out
model = YOLO(best_model_path)
train_images_path = output_root / "images" / "train" # Corrected path to use output_root
image_files = list(train_images_path.glob('*.jpg'))

sample_images = sample(image_files, 10)

for image_path in sample_images:
    result = model(image_path)[0]
    annotated_image = result.plot()

    plt.figure(figsize=(8, 6))
    plt.imshow(annotated_image)
    plt.title(f'Predictions: {image_path.name}')
    plt.axis('off')
    plt.show()

Output hidden; open in https://colab.research.google.com to view.

If the model successfully generated even one image with a bounding box and label, please run the following code block to zip the taco_yolo dataset, download the zipped file, and upload it on Google Drive, https://www.google.com/url?q=https%3A%2F%2Fdrive.google.com%2Fdrive%2Ffolders%2F1bUkIYQRXX08OKI5TuOSg-eqntSudGaFB.

In [24]:
import zipfile
import os
from pathlib import Path
from google.colab import files

def zip_and_download_folder(folder_path, zip_name=None):
    """
    Zips a specified folder and triggers a download in Google Colab.

    Args:
        folder_path (str or Path): The path to the folder to zip.
        zip_name (str, optional): The name of the output zip file.
                                  If None, uses the folder name with a .zip extension.
    """
    folder_path = Path(folder_path)

    if not folder_path.is_dir():
        print(f"Error: Folder not found at {folder_path}")
        return

    if zip_name is None:
        zip_name = f"{folder_path.name}.zip"

    # Ensure the zip name has a .zip extension
    if not zip_name.lower().endswith('.zip'):
        zip_name += '.zip'

    print(f"Creating zip archive: {zip_name} from {folder_path}...")

    try:
        with zipfile.ZipFile(zip_name, 'w', zipfile.ZIP_DEFLATED) as zipf:
            for root, dirs, files_in_dir in os.walk(folder_path):
                # Create archive path relative to the folder_path
                arc_root = Path(root).relative_to(folder_path)
                for file in files_in_dir:
                    file_path = Path(root) / file
                    arc_path = arc_root / file
                    zipf.write(file_path, arc_path)
        print("Zip archive created successfully.")

        # Trigger download
        print(f"Downloading {zip_name}...")
        files.download(zip_name)
        print("Download triggered.")

    except Exception as e:
        print(f"An error occurred during zipping or downloading: {e}")


In [27]:
# ⚠️ DO NOT MODIFY THIS CELL
# This cell zips your converted YOLO dataset with an informative name, so you can download it and upload it to google drive.

from datetime import datetime
from pathlib import Path
import os

# Generate date prefix
date_str = datetime.now().strftime('%Y%m%d')

# Define the folder to zip (the final YOLO dataset folder)
folder_to_zip = output_root # output_root is set to the path of the generated YOLO dataset

# Define the name of the output zip file
# Using the parent directory name in the zip name might be confusing if the folder to zip is the final output
# Let's use the folder_to_zip name directly for the zip name prefix
zip_name = f"{date_str}_{folder_to_zip.name}.zip"

# Define the full path for the output zip file
output_zip_path = Path('/content') / zip_name

# Change directory to the parent of the folder to zip so the zip command includes the folder itself
# Or, more simply, use the zip command directly on the folder_to_zip path
# %cd {folder_to_zip.parent} # No need to change directory if zipping directly

print(f"Creating zip archive from {folder_to_zip}...")

# Use the zip command to archive the specific folder
# The -r flag is for recursive zipping (includes subdirectories)
# The first argument is the output zip file path
# The second argument is the folder to be zipped (relative to the current directory or absolute path)
# We will use the absolute path to the folder_to_zip
!zip -r {output_zip_path} {folder_to_zip}

print(f"Zip created at {output_zip_path}")

Creating zip archive from /kaggle/working/AquaTrash_yolo...
  adding: kaggle/working/AquaTrash_yolo/ (stored 0%)
  adding: kaggle/working/AquaTrash_yolo/data.yaml (deflated 20%)
  adding: kaggle/working/AquaTrash_yolo/labels/ (stored 0%)
  adding: kaggle/working/AquaTrash_yolo/labels/val/ (stored 0%)
  adding: kaggle/working/AquaTrash_yolo/labels/val/000098_JPG.rf.a5665b7913007535189b824dfb05aedc.txt (deflated 11%)
  adding: kaggle/working/AquaTrash_yolo/labels/val/000082_jpg.rf.aa55332900465a30687548c20145a4e4.txt (deflated 14%)
  adding: kaggle/working/AquaTrash_yolo/labels/val/000021_jpg.rf.686ef3b74a19214af6e22e15ad692bca.txt (deflated 8%)
  adding: kaggle/working/AquaTrash_yolo/labels/val/000020_JPG.rf.ebbafb64b09f4937003dcda6b94d27ee.txt (deflated 8%)
  adding: kaggle/working/AquaTrash_yolo/labels/val/000024_jpg.rf.5545363767994e0edf0a31ceda3dc6e5.txt (deflated 8%)
  adding: kaggle/working/AquaTrash_yolo/labels/val/000086_jpg.rf.21d6c190ab94e71b2f49fb97de73d378.txt (deflated 11%)

In [None]:
# Rerun the training cell with the corrected data.yaml path
# ⚠️ DO NOT MODIFY THIS CELL
# This cell trains a YOLO model on the converted YOLO dataset to see if it's set up correctly
# Tip: inspect the output of this cell to assess whether training occured properly.

from ultralytics import YOLO # Moved import here

model = YOLO('yolo11n.pt')
# Use the output_root directly as the data path
results = model.train(data=str(output_root / 'data.yaml'), epochs=4, imgsz=640)  # epoch size is small - this is just to see if it can work!

Downloading https://github.com/ultralytics/assets/releases/download/v8.3.0/yolo11n.pt to 'yolo11n.pt'...


100%|██████████| 5.35M/5.35M [00:00<00:00, 343MB/s]

Ultralytics 8.3.152 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
[34m[1mengine/trainer: [0magnostic_nms=False, amp=True, augment=False, auto_augment=randaugment, batch=16, bgr=0.0, box=7.5, cache=False, cfg=None, classes=None, close_mosaic=10, cls=0.5, conf=None, copy_paste=0.0, copy_paste_mode=flip, cos_lr=False, cutmix=0.0, data=/kaggle/working/aquatrash_yolo/data.yaml, degrees=0.0, deterministic=True, device=None, dfl=1.5, dnn=False, dropout=0.0, dynamic=False, embed=None, epochs=4, erasing=0.4, exist_ok=False, fliplr=0.5, flipud=0.0, format=torchscript, fraction=1.0, freeze=None, half=False, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, imgsz=640, int8=False, iou=0.7, keras=False, kobj=1.0, line_width=None, lr0=0.01, lrf=0.01, mask_ratio=4, max_det=300, mixup=0.0, mode=train, model=yolo11n.pt, momentum=0.937, mosaic=1.0, multi_scale=False, name=train, nbs=64, nms=False, opset=None, optimize=False, optimizer=auto, overlap_mask=True, patience=100, perspective=0.0, plots=Tru




YOLO11n summary: 181 layers, 2,590,620 parameters, 2,590,604 gradients, 6.4 GFLOPs

Transferred 448/499 items from pretrained weights
Freezing layer 'model.23.dfl.conv.weight'
[34m[1mAMP: [0mrunning Automatic Mixed Precision (AMP) checks...
[34m[1mAMP: [0mchecks passed ✅
[34m[1mtrain: [0mFast image access ✅ (ping: 0.1±0.1 ms, read: 676.5±338.7 MB/s, size: 23.7 KB)


[34m[1mtrain: [0mScanning /kaggle/working/aquatrash_yolo/labels/train.cache... 295 images, 0 backgrounds, 0 corrupt: 100%|██████████| 295/295 [00:00<?, ?it/s]

[34m[1malbumentations: [0mBlur(p=0.01, blur_limit=(3, 7)), MedianBlur(p=0.01, blur_limit=(3, 7)), ToGray(p=0.01, method='weighted_average', num_output_channels=3), CLAHE(p=0.01, clip_limit=(1.0, 4.0), tile_grid_size=(8, 8))





[34m[1mval: [0mFast image access ✅ (ping: 0.0±0.0 ms, read: 358.2±161.8 MB/s, size: 154.9 KB)


[34m[1mval: [0mScanning /kaggle/working/aquatrash_yolo/labels/val.cache... 74 images, 0 backgrounds, 0 corrupt: 100%|██████████| 74/74 [00:00<?, ?it/s]


Plotting labels to runs/detect/train/labels.jpg... 
[34m[1moptimizer:[0m 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
[34m[1moptimizer:[0m AdamW(lr=0.00125, momentum=0.9) with parameter groups 81 weight(decay=0.0), 88 weight(decay=0.0005), 87 bias(decay=0.0)
Image sizes 640 train, 640 val
Using 2 dataloader workers
Logging results to [1mruns/detect/train[0m
Starting training for 4 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


        1/4       2.4G      1.822      4.109      1.719         20        640: 100%|██████████| 19/19 [00:09<00:00,  2.03it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 3/3 [00:01<00:00,  2.31it/s]


                   all         74         91     0.0138      0.787      0.101     0.0463

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


        2/4      2.79G      1.763      3.708      1.639         16        640: 100%|██████████| 19/19 [00:08<00:00,  2.34it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 3/3 [00:01<00:00,  2.58it/s]

                   all         74         91    0.00716      0.913      0.105     0.0444






      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


        3/4       2.8G      1.743      3.432      1.617         13        640: 100%|██████████| 19/19 [00:09<00:00,  2.05it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 3/3 [00:01<00:00,  2.47it/s]


                   all         74         91    0.00675      0.822       0.17     0.0814

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size


        4/4       2.8G      1.703      3.226      1.602         17        640: 100%|██████████| 19/19 [00:08<00:00,  2.18it/s]
                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 3/3 [00:01<00:00,  2.64it/s]


                   all         74         91    0.00549      0.891      0.229      0.112

4 epochs completed in 0.013 hours.
Optimizer stripped from runs/detect/train/weights/last.pt, 5.5MB
Optimizer stripped from runs/detect/train/weights/best.pt, 5.5MB

Validating runs/detect/train/weights/best.pt...
Ultralytics 8.3.152 🚀 Python-3.11.13 torch-2.6.0+cu124 CUDA:0 (Tesla T4, 15095MiB)
YOLO11n summary (fused): 100 layers, 2,582,932 parameters, 0 gradients, 6.3 GFLOPs


                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|██████████| 3/3 [00:01<00:00,  2.10it/s]


                   all         74         91     0.0055      0.891      0.231      0.113
                 glass          8          8   0.000863          1     0.0892     0.0408
                 metal         16         21     0.0046      0.762      0.168     0.0685
                 paper         23         24    0.00327      0.958      0.422      0.223
               plastic         34         38     0.0133      0.842      0.246      0.121
Speed: 0.3ms preprocess, 3.4ms inference, 0.0ms loss, 4.8ms postprocess per image
Results saved to [1mruns/detect/train[0m


In [23]:
# Not required
import json
import shutil
import random
from pathlib import Path
from sklearn.model_selection import train_test_split
import yaml
from PIL import Image # Import Image from PIL

def convert_bbox_to_yolo(bbox, img_w, img_h):
    """
    Converts COCO bbox format [x_min, y_min, width, height] to YOLO format
    [x_center, y_center, width, height] normalized by image dimensions.

    Args:
        bbox (list): Bounding box in COCO format [x_min, y_min, width, height].
        img_w (int): Image width.
        img_h (int): Image height.

    Returns:
        tuple: Bounding box in YOLO format [x_center, y_center, width, height] normalized,
               or None if image dimensions are invalid.
    """
    # Add check for valid image dimensions to avoid division by zero
    if img_w <= 0 or img_h <= 0:
        print(f"Warning: Invalid image dimensions ({img_w}x{img_h}). Cannot convert bbox.")
        return None # Indicate failure to convert

    # Input bbox is [x_min, y_min, width, height]
    x_min, y_min, width, height = bbox

    # Calculate YOLO format: [x_center, y_center, width, height] (normalized)
    x_center = (x_min + width / 2) / img_w
    y_center = (y_min + height / 2) / img_h
    width = width / img_w
    height = height / img_h

    # Ensure values are within [0, 1] and handle potential floating point issues near boundaries
    x_center = max(0.0, min(1.0, x_center))
    y_center = max(0.0, min(1.0, y_center))
    width = max(0.0, min(1.0, width))
    height = max(0.0, min(1.0, height))

    # Add validation for calculated bbox values (optional, but good practice)
    if width < 0 or width > 1 or height < 0 or height > 1 or x_center < 0 or x_center > 1 or y_center < 0 or y_center > 1:
         print(f"Warning: Calculated YOLO bbox coordinates are outside [0, 1]. Original bbox: {bbox}, Image WxH: {img_w}x{img_h}, YOLO coords: {x_center, y_center, width, height}")
         # Depending on strictness, you might return None here or try to clamp values

    return x_center, y_center, width, height


def convert_coco_to_yolo(coco_root: Path, dataset_name: str, train_split: float = 0.8):
    """
    Converts a simple COCO-formatted dataset to YOLO format, splitting into
    train, validation, and test sets.

    Assumes the COCO data is structured as:
    coco_root/
        images/
            image1.jpg
            image2.jpg
            ...
        annotations.json

    The annotations.json should be a simple COCO format with 'images', 'annotations',
    and 'categories' keys. The 'bbox' in annotations should be [x_min, y_min, width, height].

    Args:
        coco_root (Path): Path to the root directory of the simple COCO dataset.
        dataset_name (str): Name for the output YOLO dataset directory.
        train_split (float): The proportion of the dataset to use for training (0.0 to 1.0).
                             The remaining data will be split equally between validation and test.

    Returns:
        Path: The path to the generated data.yaml file in the YOLO dataset directory.
    """

    images_dir = coco_root / 'images'
    annotations_path = coco_root / 'annotations.json'

    if not images_dir.is_dir():
        print(f"Error: Image directory not found at {images_dir}")
        return None
    if not annotations_path.is_file():
        print(f"Error: annotations.json not found at {annotations_path}")
        return None

    # Define output YOLO directory structure
    output_root = Path(f'/kaggle/working/{dataset_name}_yolo')
    images_train = output_root / 'images' / 'train'
    images_val = output_root / 'images' / 'val'
    images_test = output_root / 'images' / 'test'
    labels_train = output_root / 'labels' / 'train'
    labels_val = output_root / 'labels' / 'val'
    labels_test = output_root / 'labels' / 'test'

    # Clear and recreate folders
    for d in [images_train, images_val, images_test, labels_train, labels_val, labels_test]:
        if d.exists():
            shutil.rmtree(d)
        d.mkdir(parents=True, exist_ok=True)

    # Load COCO annotations
    with open(annotations_path, 'r') as f:
        coco_data = json.load(f)

    images_info = coco_data['images']
    annotations_info = coco_data['annotations']
    categories_info = coco_data['categories']

    # Create mappings for quick lookup
    image_id_to_info = {img['id']: img for img in images_info}
    annotations_by_image_id = defaultdict(list)
    for ann in annotations_info:
        annotations_by_image_id[ann['image_id']].append(ann)

    # Map category_id to the new single class ID (0 for 'trash')
    # We still need the original category_id to filter annotations if needed,
    # but for the YOLO labels, we'll always use the new single class ID.
    # category_id_to_name = {cat['id']: cat['name'] for cat in categories_info}
    # category_names = [cat['name'] for cat in categories_info] # For data.yaml

    # Get a list of all image IDs
    all_image_ids = list(image_id_to_info.keys())

    # Split image IDs into train, validation, and test sets
    # Split into train and the rest (val + test)
    train_ids, rest_ids = train_test_split(all_image_ids, train_size=train_split, random_state=42)
    # Split the rest into validation and test (50/50 of the remaining)
    val_ids, test_ids = train_test_split(rest_ids, test_size=0.5, random_state=42)

    print(f"Dataset split: Train={len(train_ids)} images, Validation={len(val_ids)} images, Test={len(test_ids)} images")


    # Function to process a single split
    def process_split(image_ids, target_image_folder, target_label_folder):
        print(f"Processing split for {target_image_folder.parent.name}/{target_image_folder.name}...")
        files_processed_count = 0
        labels_written_count = 0

        for img_id in image_ids:
            img_info = image_id_to_info.get(img_id)
            if not img_info:
                print(f"Warning: Image ID {img_id} not found in images_info. Skipping.")
                continue

            original_file_name = img_info['file_name']
            img_w, img_h = img_info['width'], img_info['height']

            source_image_path = images_dir / original_file_name
            target_image_path = target_image_folder / original_file_name

            # Copy image
            if source_image_path.exists():
                shutil.copy(source_image_path, target_image_path)
            else:
                print(f"Warning: Source image not found, cannot copy: {source_image_path}")
                continue # Skip if image doesn't exist

            # Prepare label data
            yolo_data_for_file = []
            current_image_annotations = annotations_by_image_id.get(img_id, [])

            for ann in current_image_annotations:
                # We don't need the original category_id for the YOLO label file,
                # as we are collapsing all categories into 'trash' (class ID 0).
                # category_id = ann['category_id']
                bbox_coco = ann['bbox'] # [x_min, y_min, width, height]

                # Convert COCO bbox to YOLO format
                yolo_bbox_coords = convert_bbox_to_yolo(bbox_coco, img_w, img_h)

                if yolo_bbox_coords:
                    # Always use class ID 0 for 'trash'
                    class_id = 0
                    # yolo_bbox_coords is (x_center, y_center, width, height)
                    yolo_data_for_file.append(f"{class_id} {yolo_bbox_coords[0]:.6f} {yolo_bbox_coords[1]:.6f} {yolo_bbox_coords[2]:.6f} {yolo_bbox_coords[3]:.6f}")

            # Write label file if there's any valid annotation data
            # Use the original_file_name's stem for the .txt file
            label_file_name = Path(original_file_name).stem + '.txt'
            label_file_path = target_label_folder / label_file_name

            # Only create label file if there are annotations
            if yolo_data_for_file:
                 with open(label_file_path, 'w') as f:
                    f.write('\n'.join(yolo_data_for_file))
                 labels_written_count += 1
            # else:
                # If no annotations for this image and you still want an empty label file:
                # Path(label_file_path).touch() # Create an empty file

            files_processed_count += 1
            if files_processed_count % 100 == 0: # Print progress every 100 files
                print(f"  Processed {files_processed_count}/{len(image_ids)} images for {target_image_folder.name}...")


        print(f"Finished processing for {target_image_folder.name}. Copied {files_processed_count} images. Wrote {labels_written_count} label files.")


    # Process each split
    if train_ids:
        process_split(train_ids, images_train, labels_train)
    else:
        print("No training data to process.")

    if val_ids:
        process_split(val_ids, images_val, labels_val)
    else:
        print("No validation data to process.")

    if test_ids:
         process_split(test_ids, images_test, labels_test)
    else:
        print("No test data to process.")


    # Create data.yaml file with a single 'trash' category
    data_yaml = {
        'path': str(output_root), # Path to the dataset root
        'train': 'images/train', # Path to training images relative to 'path'
        'val': 'images/val',     # Path to validation images relative to 'path'
        'test': 'images/test',   # Path to test images relative to 'path'
        'nc': 1,                 # Number of classes (only one: trash)
        'names': ['trash']       # List of class names
    }

    data_yaml_path = output_root / 'data.yaml'
    with open(data_yaml_path, 'w') as f:
        yaml.dump(data_yaml, f, default_flow_style=False)

    print(f"\nGenerated data.yaml at {data_yaml_path}")

    return data_yaml_path # Return the path to the data.yaml file

Helper function for zipping and downloading a folder

In [26]:
# Example usage (you can modify this to zip a specific folder like your YOLO dataset)
# Replace '/kaggle/working/AquaTrash_yolo' with the actual path to the folder you want to zip
# zip_and_download_folder('/kaggle/working/AquaTrash_yolo', 'my_yolo_dataset_archive.zip')
zip_and_download_folder('/content/runs/detect')

Creating zip archive: detect.zip from /content/runs/detect...
Zip archive created successfully.
Downloading detect.zip...


<IPython.core.display.Javascript object>

<IPython.core.display.Javascript object>

Download triggered.
