# Trash Detection using YOLOv8

## Part 1 - Data processing

-------------------------------------
The [TrashCan (1.0)](https://conservancy.umn.edu/handle/11299/214865) dataset is composed of images annotated for detecting trash, ROVs, and flora & fauna on the ocean floor.

<p align="center">
  <img style="width: 300px" src='https://learnopencv.com/wp-content/uploads/2022/11/annotated-trash-dataset-images-for-yolov6-custom-dataset-training.png' />
</p>

This dataset is part of a research project that also has an [accompanying technical article](https://arxiv.org/abs/2007.08097). It contains *7212* images annotated for instance segmentation and bounding box detection.

The **TrashCan dataset** comes in two versions:
- **TrashCan-Material**: Contains *16* different classes.
- **TrashCan-Instance**: Contains *22* different classes.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

In [2]:
%cd /content/drive/MyDrive/SELab-Tutorials/Tutorial-YOLO

/content/drive/MyDrive/SELab-Tutorials/Tutorial-YOLO


First, let's download the [TrashCan 1.0](https://conservancy.umn.edu/bitstream/handle/11299/214865/dataset.zip?sequence=12&isAllowed=y) dataset for the trash detection problem.
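
The notebook assumes `dataset.zip` is already in the working directory. One possible way to fetch it from Python is sketched below. The helper name `download_if_missing` is hypothetical (it is not part of the tutorial), and the URL is simply the download link given above.

```python
from pathlib import Path
from urllib.request import urlretrieve

# URL copied from the download link above
DATASET_URL = (
    "https://conservancy.umn.edu/bitstream/handle/11299/214865/"
    "dataset.zip?sequence=12&isAllowed=y"
)


def download_if_missing(url, dest="dataset.zip"):
    """Download `url` to `dest` unless `dest` already exists (hypothetical helper)."""
    dest = Path(dest)
    if dest.exists():
        return dest  # reuse the file from an earlier run
    urlretrieve(url, str(dest))
    return dest


# Usage (left commented out because the archive is large):
# download_if_missing(DATASET_URL)
```

Skipping the download when the file already exists keeps reruns of the notebook cheap.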


In [None]:
# extract dataset
!unzip dataset.zip

After extracting, we have the following directory structure:

    .
    |-- README.txt
    |-- instance_version
    |   |-- README.txt
    |   |-- instances_train_trashcan.json
    |   |-- instances_val_trashcan.json
    |   |-- train
    |   `-- val
    |-- material_version
    |   |-- README.txt
    |   |-- instances_train_trashcan.json
    |   |-- instances_val_trashcan.json
    |   |-- train
    |   `-- val
    |-- original_data
    |   |-- README.txt
    |   |-- annotations
    |   `-- images
    `-- scripts
        `-- trash_can_coco.py

In this tutorial, we will be utilizing the **instance_version** of this dataset, which follows the COCO format. To use it, we need to convert it to the YOLO format.
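
To make the difference between the two formats concrete: COCO stores a box as `[x_min, y_min, width, height]` in pixels, while a YOLO label line stores `class_id x_center y_center width height` with the coordinates normalized by the image size. The sketch below shows only the box arithmetic. The function name `coco_bbox_to_yolo` and the sample numbers are illustrative and not part of the tutorial, which relies on pylabel for the actual conversion.

```python
def coco_bbox_to_yolo(bbox, img_width, img_height):
    """Convert a COCO box [x_min, y_min, width, height] in pixels to
    YOLO (x_center, y_center, width, height), normalized to [0, 1]."""
    x_min, y_min, box_w, box_h = bbox
    x_center = (x_min + box_w / 2) / img_width
    y_center = (y_min + box_h / 2) / img_height
    return x_center, y_center, box_w / img_width, box_h / img_height


# Illustrative values only, not taken from the dataset
print(coco_bbox_to_yolo([120, 90, 240, 180], img_width=480, img_height=360))
# (0.5, 0.5, 0.5, 0.5)
```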

In [None]:
import logging
logging.getLogger().setLevel(logging.CRITICAL)
!pip install pylabel > /dev/null


In [None]:
from pylabel import importer
import os
import zipfile


In [None]:
def convert_to_YOLO_dataset(
    path_to_images="dataset/instance_version/train",
    path_to_annotations="dataset/instance_version/instances_train_trashcan.json",
    output_path="data/train/labels",
):
    """Import a COCO annotation file with pylabel and export YOLO-format labels."""
    # Load the COCO annotations and tell pylabel where the images live
    dataset = importer.ImportCoco(
        path_to_annotations, path_to_images=path_to_images)

    dataset.path_to_annotations = path_to_annotations
    dataset.path_to_images = path_to_images

    # Write the YOLO label files (.txt) into output_path
    dataset.export.ExportToYoloV5(output_path)

    return dataset


In [None]:
# train 
train_data = convert_to_YOLO_dataset(path_to_images="dataset/instance_version/train",
                        path_to_annotations="dataset/instance_version/instances_train_trashcan.json",
                        output_path="datasets/data/train/labels")

!mkdir -p 'datasets/data/train/images'                        
!cp dataset/instance_version/train/* 'datasets/data/train/images'

# val
val_data = convert_to_YOLO_dataset(path_to_images="dataset/instance_version/val",
                        path_to_annotations="dataset/instance_version/instances_val_trashcan.json",
                        output_path="datasets/data/val/labels")

!mkdir -p 'datasets/data/val/images'
!cp dataset/instance_version/val/* 'datasets/data/val/images'


Exporting files: 100%|██████████| 6065/6065 [00:10<00:00, 573.85it/s]
Exporting files: 100%|██████████| 1147/1147 [00:01<00:00, 700.24it/s]
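
After the export, it can be worth checking that the generated label files look sane. The sketch below is a hypothetical checker (`check_yolo_labels` is not part of the tutorial or of pylabel). It assumes detection-style labels with exactly five fields per line, `class_id x_center y_center width height`, and the 22 classes of the instance version.

```python
from pathlib import Path


def check_yolo_labels(labels_dir, num_classes=22):
    """Return (file name, line number, message) tuples for malformed label lines."""
    problems = []
    for label_file in sorted(Path(labels_dir).glob("*.txt")):
        lines = label_file.read_text().splitlines()
        for line_no, line in enumerate(lines, start=1):
            parts = line.split()
            if not parts:
                continue  # ignore blank lines
            if len(parts) != 5:
                problems.append((label_file.name, line_no, "expected 5 fields"))
                continue
            try:
                class_id = int(parts[0])
                coords = [float(p) for p in parts[1:]]
            except ValueError:
                problems.append((label_file.name, line_no, "non-numeric value"))
                continue
            if not 0 <= class_id < num_classes:
                problems.append((label_file.name, line_no, "class id out of range"))
            if any(c < 0 or c > 1 for c in coords):
                problems.append((label_file.name, line_no, "coordinate outside [0, 1]"))
    return problems


# Usage, assuming the folder layout created above:
# print(check_yolo_labels("datasets/data/train/labels"))
```

An empty list means every line passed these basic checks. It does not guarantee that the boxes match the images.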


In [None]:
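# Remove the per-split dataset.yaml files left over from the export step,
# then create a single config file for the whole dataset
!rm datasets/data/train/dataset.yaml
!rm datasets/data/val/dataset.yaml
!touch datasets/dataset.yaml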

In [None]:
# Write the given content to a file

filename = "datasets/dataset.yaml"

content = "# number of classes\nnc: 22\n\n# class names\nnames:\n- rov\n- plant\n- animal_fish\n- animal_starfish\n- animal_shells\n- animal_crab\n- animal_eel\n- animal_etc\n- trash_clothing\n- trash_pipe\n- trash_bottle\n- trash_bag\n- trash_snack_wrapper\n- trash_can\n- trash_cup\n- trash_container\n- trash_unknown_instance\n- trash_branch\n- trash_wreckage\n- trash_tarp\n- trash_rope\n- trash_net\n\n# Train/val dir\npath: data\ntrain: data/train/images\nval: data/val/images"

with open(filename, "w") as file:
    file.write(content)
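
The long one-line string above is easy to mistype. As an alternative sketch, the same text can be assembled from a list of class names, so that `nc` always matches the number of names. The helper `build_dataset_yaml` is hypothetical, and the `path`, `train`, and `val` defaults are copied unchanged from the cell above.

```python
CLASS_NAMES = [
    "rov", "plant", "animal_fish", "animal_starfish", "animal_shells",
    "animal_crab", "animal_eel", "animal_etc", "trash_clothing", "trash_pipe",
    "trash_bottle", "trash_bag", "trash_snack_wrapper", "trash_can",
    "trash_cup", "trash_container", "trash_unknown_instance", "trash_branch",
    "trash_wreckage", "trash_tarp", "trash_rope", "trash_net",
]


def build_dataset_yaml(names, path="data",
                       train="data/train/images", val="data/val/images"):
    """Build the dataset config text; nc is derived from the name list."""
    lines = ["# number of classes", f"nc: {len(names)}", "",
             "# class names", "names:"]
    lines += [f"- {name}" for name in names]
    lines += ["", "# Train/val dir",
              f"path: {path}", f"train: {train}", f"val: {val}"]
    return "\n".join(lines)


content = build_dataset_yaml(CLASS_NAMES)

# Usage: write it to the same file as the cell above
# with open("datasets/dataset.yaml", "w") as file:
#     file.write(content)
```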


And now we have the dataset in YOLO format:

In [8]:
!tree -L 2 datasets

datasets
├── data
│   ├── train
│   └── val
└── data.yaml

3 directories, 1 file


Finally, zip the processed dataset into `trashcan-yolo.zip`:

In [None]:
!zip -r trashcan-yolo.zip datasets
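
If the `zip` command is not available, a rough Python equivalent can be sketched with the standard library's `shutil.make_archive`. The helper name `zip_directory` is hypothetical. It keeps the source folder as the top-level entry in the archive, as `zip -r` does here.

```python
import shutil
from pathlib import Path


def zip_directory(src_dir, archive_stem):
    """Create <archive_stem>.zip with src_dir as its top-level folder."""
    src = Path(src_dir).resolve()
    # root_dir is the parent, base_dir the folder itself, so paths inside
    # the archive start with the folder name (e.g. "datasets/...")
    return shutil.make_archive(
        str(archive_stem), "zip", root_dir=str(src.parent), base_dir=src.name
    )


# Usage, assuming the processed dataset lives in ./datasets:
# zip_directory("datasets", "trashcan-yolo")
```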