# Fine-tune YOLOv8 for binary classification

In this practical session, we will fine-tune the YOLOv8 model for a specific task: binary classification. The goal is to teach the model to distinguish between two categories of images, trucks and non-trucks. YOLOv8 is a deep learning model known for its speed and accuracy in object detection, image segmentation, and image classification.

Use the `nvidia-smi` command to display your GPU configuration.

In [None]:
# FIXME
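One possible way to fill in the cell above, as a sketch: in a notebook, `!nvidia-smi` is enough. The plain-Python variant below runs the same command through `subprocess` and returns a short message when no NVIDIA driver is visible. The helper name `gpu_config` is hypothetical.

```python
import shutil
import subprocess


def gpu_config() -> str:
    """Return the report printed by `nvidia-smi`, or a message if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return "nvidia-smi not found: no NVIDIA driver is visible on this machine"
    result = subprocess.run(["nvidia-smi"], capture_output=True, text=True)
    # Fall back to stderr when the command printed nothing on stdout
    return result.stdout or result.stderr


print(gpu_config())
```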

## Download the dataset

The following cells download and extract the dataset.

In [None]:
# Install the tools used to download the dataset and to measure performance
# Install the library that provides the YOLO model

!pip install gdown
!pip install tqdm
!pip install scikit-learn
!pip install ultralytics

In [None]:
# Download the dataset and untar the dataset

!gdown "https://drive.google.com/uc?id=19NFEwpG0uu7gdJ9NTpQuGRw50CeVOI0R"
!tar xvf truck_dataset.tar.gz

## Split the dataset into train and test

`train`: Used to train the model

`test`: Used to test the performances of the model

By measuring performance on both sets, we can check whether the model generalizes to images it has not seen during training.

You can use this code to split the dataset into two sets (train and test).

In [None]:
from typing import List, Dict

In [None]:
import os
import shutil
from tqdm import tqdm
from sklearn.model_selection import train_test_split

def train_test_split_folder(
    dataset_base_path: str = 'truck_dataset/',
    split_size: float = 0.2,
    classes: List[str] = ['truck', 'other'],
):
    '''
    Split the dataset into train and test folder

    Parameters
    ----------
    dataset_base_path: Root path of the dataset (before 'train/' folder)
    split_size: Size of the test set (0.2 means we use 20% of the dataset for test and 80% for train)
    classes: List of the classes in the dataset
    '''
    dataset_dir_train = os.path.join(dataset_base_path, 'train')
    dataset_dir_test = os.path.join(dataset_base_path, 'test')

    # Create the test folder
    os.makedirs(dataset_dir_test, exist_ok=True)

    # Function to split and move files
    def split_and_move_files(class_name: str):
        source_dir = os.path.join(dataset_dir_train, class_name)
        # os.listdir returns files in arbitrary order; sort so the seeded split is reproducible
        files = sorted(os.listdir(source_dir))

        # Splitting files into train and test
        train_files, test_files = train_test_split(files, test_size=split_size, random_state=42)

        # Moving test files to the test directory
        for f in tqdm(test_files):
            shutil.move(os.path.join(source_dir, f), os.path.join(dataset_dir_test, class_name, f))

    # Split and move files for each class
    for class_dir in classes:
        print("Move files for class: ", class_dir)
        os.makedirs(os.path.join(dataset_dir_test, class_dir), exist_ok=True)
        split_and_move_files(class_dir)

In [None]:
# You can vary the split size
train_test_split_folder(dataset_base_path="truck_dataset/", split_size=0.15)

Move files for class:  truck


100%|██████████| 105/105 [00:00<00:00, 22107.42it/s]


Move files for class:  other


100%|██████████| 132/132 [00:00<00:00, 13607.16it/s]
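The progress bars above report how many files were moved into `test/` for each class. As a quick sanity check, the sketch below counts what is left in every split. The helper name `count_images` is hypothetical, and the folder layout `<base>/<split>/<class>` is the one used by the split function.

```python
import os
from typing import Dict, List


def count_images(
    dataset_base_path: str, splits: List[str], classes: List[str]
) -> Dict[str, Dict[str, int]]:
    """Count the files found in <base>/<split>/<class> for every split and class."""
    counts: Dict[str, Dict[str, int]] = {}
    for split in splits:
        counts[split] = {}
        for class_name in classes:
            class_dir = os.path.join(dataset_base_path, split, class_name)
            # A missing folder counts as zero files instead of raising an error
            n_files = len(os.listdir(class_dir)) if os.path.isdir(class_dir) else 0
            counts[split][class_name] = n_files
    return counts


print(count_images("truck_dataset/", ["train", "test"], ["truck", "other"]))
```

Keep in mind that running the split cell a second time moves another share of the remaining training files into `test/`, so the counts are worth checking after every run.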


## Load and finetune the pretrained model

Import the `ultralytics` library and call its `checks` function to display information about the environment: YOLO version, Python version, torch version, number of CPUs, GPU, RAM, and so on.

In [None]:
# FIXME
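A minimal sketch of this step, assuming `ultralytics` was installed by the earlier `pip` cell. The wrapper `run_checks` is a hypothetical helper that only adds a readable message when the package is missing; in the notebook, the two lines `import ultralytics` and `ultralytics.checks()` are sufficient.

```python
def run_checks() -> bool:
    """Print the Ultralytics environment report; return False if the package is missing."""
    try:
        import ultralytics
    except ImportError:
        print("ultralytics is not installed; run `pip install ultralytics` first")
        return False
    ultralytics.checks()  # prints library, Python and torch versions plus hardware details
    return True


run_checks()
```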

Load a YOLOv8 classification model. You can start with the smallest one.

Keep in mind that the pretrained YOLOv8 classification model predicts 1000 ImageNet classes. When you test the model on different images, you need to identify which of those classes can be considered trucks.

In [None]:
# FIXME

Now move the model to the GPU.

In [None]:
# FIXME
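The two steps above (loading the smallest classification checkpoint and moving it to the GPU) could be sketched as follows. The helper names `pick_device` and `load_classifier` are hypothetical; `yolov8n-cls.pt` is assumed to be the smallest classification checkpoint, and the import of `ultralytics` is kept inside the function so that the device check also runs where the library is absent.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # installed as a dependency of ultralytics
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


def load_classifier(weights: str = "yolov8n-cls.pt"):
    """Load a YOLOv8 classification checkpoint and move it to the selected device."""
    from ultralytics import YOLO  # lazy import: only needed when a model is loaded

    model = YOLO(weights)    # downloads the checkpoint on first use
    model.to(pick_device())  # stays on the CPU when no GPU is available
    return model


print("Selected device:", pick_device())
# model = load_classifier()   # uncomment in the notebook
# print(len(model.names))     # number of classes the checkpoint predicts
```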

Load an image, plot it, and run the model on it to get a prediction.
Repeat these steps for an image of each class.

In [None]:
# FIXME
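As a sketch of the prediction part, the hypothetical helper `predict_top1` below extracts the most likely class name and its confidence from the result returned by an Ultralytics classification model. It assumes `model` is a loaded `YOLO` classification model and that `image_path` points to one of the dataset images.

```python
from typing import Tuple


def predict_top1(model, image_path: str) -> Tuple[str, float]:
    """Return the top-1 class name and its confidence for one image."""
    results = model(image_path, verbose=False)  # one result object per input image
    result = results[0]
    top1_index = int(result.probs.top1)         # index of the most likely class
    return result.names[top1_index], float(result.probs.top1conf)
```

To display the image next to the prediction, opening it with `PIL.Image.open` and passing it to `matplotlib.pyplot.imshow` is the usual route.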

In [59]:
classes = [
    'moving_van',
    'minivan',
    'police_van',
    'freight_car',
    'trailer_truck',
    'crane_(machine)',
    'tow_truck',
    'garbage_truck',
    'half_track',
]
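With such a list, a 1000-class prediction can be collapsed into the two labels of this exercise. The sketch below shows one way to do it; the function name `to_binary_label` is hypothetical, and the names in `TRUCK_CLASSES` are copied from the list above on the assumption that they match the labels exposed by the model.

```python
from typing import Iterable

# Names copied from the notebook's list; assumed to match the model's own labels
TRUCK_CLASSES = frozenset({
    "moving_van", "minivan", "police_van", "freight_car", "trailer_truck",
    "crane_(machine)", "tow_truck", "garbage_truck", "half_track",
})


def to_binary_label(predicted_name: str, truck_names: Iterable[str] = TRUCK_CLASSES) -> str:
    """Map a predicted class name to "truck" or "other"."""
    return "truck" if predicted_name in set(truck_names) else "other"


print(to_binary_label("garbage_truck"))  # truck
print(to_binary_label("tabby"))          # other
```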

First, measure the performance of the pretrained model on the images. The model can read saved images directly from their file paths, so you don't need to load each one manually.

In [None]:
# FIXME
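One way to organize this measurement, sketched under assumptions: walk through `test/truck` and `test/other`, ask a prediction function for a binary label per file, and accumulate the counts. `evaluate_folder` and `predict_label` are hypothetical names; `predict_label` stands for any callable that takes an image path and returns `"truck"` or `"other"`, for example a wrapper around the model's top-1 prediction.

```python
import os
from typing import Callable, Dict, Tuple


def evaluate_folder(
    test_dir: str,
    predict_label: Callable[[str], str],
    classes: Tuple[str, str] = ("truck", "other"),
) -> Tuple[float, Dict[str, Dict[str, int]]]:
    """Return (accuracy, confusion), where confusion[true_label][predicted_label] is a count."""
    confusion = {t: {p: 0 for p in classes} for t in classes}
    for true_label in classes:
        class_dir = os.path.join(test_dir, true_label)
        for file_name in sorted(os.listdir(class_dir)):
            # predict_label must return one of the names in `classes`
            predicted = predict_label(os.path.join(class_dir, file_name))
            confusion[true_label][predicted] += 1
    total = sum(sum(row.values()) for row in confusion.values())
    correct = sum(confusion[c][c] for c in classes)
    accuracy = correct / total if total else 0.0
    return accuracy, confusion
```

The counts are computed in plain Python here to keep the sketch dependency-free; `accuracy_score` and `confusion_matrix` from `sklearn.metrics` give the same figures from two lists of true and predicted labels.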

Next, fine-tune the YOLOv8 model to try to improve performance.

In [None]:
# FIXME
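A hedged sketch of the fine-tuning call: Ultralytics classification training takes a dataset folder organized as `<split>/<class>/<image>`, which matches the `truck_dataset/train` and `truck_dataset/test` folders created earlier. The helper names `training_args` and `finetune` are hypothetical, and the number of epochs and the image size are illustrative values, not recommendations from the original notebook.

```python
import os
from typing import Any, Dict


def training_args(
    dataset_dir: str = "truck_dataset", epochs: int = 10, imgsz: int = 224
) -> Dict[str, Any]:
    """Build the keyword arguments passed to `model.train` (illustrative values)."""
    return {
        "data": os.path.abspath(dataset_dir),  # folder containing train/ and test/
        "epochs": epochs,
        "imgsz": imgsz,
    }


def finetune(weights: str = "yolov8n-cls.pt", dataset_dir: str = "truck_dataset", epochs: int = 10):
    """Fine-tune a pretrained classification checkpoint on the two-class dataset."""
    from ultralytics import YOLO  # lazy import: only needed when training starts

    model = YOLO(weights)
    model.train(**training_args(dataset_dir, epochs))
    return model


print(training_args())
# model = finetune()   # uncomment in the notebook; training takes a while
```

After fine-tuning on this folder, the model is expected to predict the two folder names (`other` and `truck`) instead of the original 1000 classes, so the mapping from the pretrained class names is no longer needed; check `model.names` to confirm.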

Now compare the accuracy and the confusion matrix before and after fine-tuning.

In [None]:
# FIXME
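To put two confusion matrices side by side, it can help to reduce each one to a few numbers. The sketch below assumes the counts are stored as a nested dictionary `confusion[true_label][predicted_label]`; the function name `summarize` is hypothetical, and "truck" is treated as the positive class.

```python
from typing import Dict


def summarize(confusion: Dict[str, Dict[str, int]], positive: str = "truck") -> Dict[str, float]:
    """Compute accuracy, precision and recall from confusion[true_label][predicted_label]."""
    labels = list(confusion)
    total = sum(confusion[t][p] for t in labels for p in labels)
    correct = sum(confusion[c][c] for c in labels)
    true_positive = confusion[positive][positive]
    predicted_positive = sum(confusion[t][positive] for t in labels)
    actual_positive = sum(confusion[positive][p] for p in labels)
    return {
        "accuracy": correct / total if total else 0.0,
        "precision": true_positive / predicted_positive if predicted_positive else 0.0,
        "recall": true_positive / actual_positive if actual_positive else 0.0,
    }
```

Calling `summarize` once on the pretrained model's matrix and once on the fine-tuned model's matrix gives two dictionaries that can be printed or compared key by key.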

Now measure the performance of other YOLOv8 models (larger ones, for instance).


You can find the list of models for classification here:

https://docs.ultralytics.com/fr/tasks/classify/

Compare the performance of all the models.

In [None]:
# FIXME
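The comparison can be organized as a loop over checkpoint names, as in the sketch below. `compare_models` is a hypothetical helper, and `evaluate` stands for any function you write that takes a checkpoint name, loads (and optionally fine-tunes) the model, and returns its accuracy on the test set. The five names listed are the YOLOv8 classification checkpoints, from smallest to largest.

```python
from typing import Callable, Iterable, List, Tuple

# YOLOv8 classification checkpoints, from smallest to largest
YOLOV8_CLS_WEIGHTS = [
    "yolov8n-cls.pt",
    "yolov8s-cls.pt",
    "yolov8m-cls.pt",
    "yolov8l-cls.pt",
    "yolov8x-cls.pt",
]


def compare_models(
    weights: Iterable[str], evaluate: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Evaluate every checkpoint and return (checkpoint, accuracy) pairs, best first."""
    scores = [(name, evaluate(name)) for name in weights]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# for name, accuracy in compare_models(YOLOV8_CLS_WEIGHTS, my_evaluate):
#     print(f"{name}: {accuracy:.3f}")   # my_evaluate is the function you provide
```

Larger checkpoints take longer to download, train, and run, so it is worth recording the time per model alongside the accuracy.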