#### Basics of Yolo v5 - Balloon detection
by @vbookshelf<br>
29 June 2021

## Introduction

These are some of the questions that this notebook will answer:

- What does the step by step Yolo v5 workflow look like?<br>
- What format does the input data need to have?<br>
- How do you load a trained model and make a prediction?<br>

A few things to keep in mind:
- Yolo v5 uses mosaic augmentation by default during training (probability = 1). Please scroll down to the end of this notebook to see a batch of mosaic training images. Also, refer to the section on using custom training settings to see how the probability of doing mosaic augmentation can be set.
- We will be changing the working directory as we move in and out of the yolov5 folder. It's helpful to understand how the command line is used to cd into a directory and to print the working directory.
- One thing I found confusing is that the results of each training (and inference) run are stored in a different folder, e.g. exp, exp2, etc. Therefore, when displaying or retrieving predictions it's important to make sure that you are getting the preds for the latest experiment. When I printed a list of train experiments, the latest experiment was the first element in the list. But when I printed a list of prediction (detect) experiments, the latest experiment was sometimes the last element and sometimes the first. This happens because os.listdir returns names in arbitrary order, so the position in the list cannot be relied on. To prevent this confusion use the --exist-ok parameter. This tells Yolo not to increment the experiment names, so each run will always be named exp. But keep in mind that if you use --exist-ok then the results of every run are appended to the same text file, i.e. the results already in the text file are not overwritten by the results from the latest run.
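
If you prefer not to use --exist-ok, one way to find the latest run is to read the number at the end of the folder name. This is only a sketch: the helper name `latest_run` is made up for illustration, and it assumes the folders follow the exp, exp2, exp3 naming described above.

```python
import os
import re
import tempfile


def latest_run(runs_dir, prefix="exp"):
    """Return the run folder name with the highest numeric suffix.

    Assumes folders are named exp, exp2, exp3, ... (hypothetical helper).
    Returns None if no matching folder is found.
    """
    pattern = re.compile(rf"^{re.escape(prefix)}(\d*)$")
    best_name, best_num = None, -1
    for name in os.listdir(runs_dir):
        match = pattern.match(name)
        if match and os.path.isdir(os.path.join(runs_dir, name)):
            # A bare "exp" has no suffix and counts as run number 1.
            num = int(match.group(1)) if match.group(1) else 1
            if num > best_num:
                best_name, best_num = name, num
    return best_name


# Demo on throwaway folders (not real Yolo runs).
with tempfile.TemporaryDirectory() as tmp:
    for name in ["exp", "exp2", "exp10", "my_model"]:
        os.mkdir(os.path.join(tmp, name))
    demo_latest = latest_run(tmp)
    print(demo_latest)
```

Comparing the numbers rather than the names matters here because a plain alphabetical sort would place exp10 before exp2.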

**1- Do I need to resize images for Yolo?**<br>
No. We give Yolo the original image sizes and the original bounding box sizes. Yolo does the resizing automatically. During training and inference we specify the --img parameter.<br>

This quote explains what the --img parameter does:<br>

*python train.py --img 640 means that the mosaic dataloader pulls up the selected image along with 3 random images, resizes them all to 640, joins all 4 at the seams into a 1280x1280 mosaic, augments them, and then crops a center 640x640 area for placement as 1 image into the batch.*<br>
https://github.com/ultralytics/yolov5/issues/46

**2- If the image does not contain any objects do I need to create a txt file for that image?**<br>
Quote:
*if no objects in image, no \*.txt file is required*<br>
https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data

**3- How can I train Yolo without applying the default mosaic augmentation?**<br>
Quote:
*you can use train.py --rect to omit mosaic, and you can set any image augmentation hyps you want in the data/hyps files.*<br>
https://github.com/ultralytics/yolov5/issues/700

**4- How do I change the training and augmentation hyperparameters?**<br>
Quote: 
https://github.com/ultralytics/yolov5/issues/607#issuecomment-680685682
- data/hyp.scratch.yaml will be automatically used by python train.py --weights ''<br>
- data/hyp.finetune.yaml will be automatically used by python train.py --weights yolov5s.pt<br>
- hyp.custom.yaml can be force-selected by python train.py --hyp hyp.custom.yaml<br>


This notebook, by Alien, shows how to use custom hyperparameters:<br>
https://www.kaggle.com/h053473666/siim-cov19-yolov5-train

You can find the hyps files inside the yolov5 folder: yolov5/data/hyps

## References and Resources

These are a few good resources that helped me understand how to use Yolo. I suggest that you start by watching the video tutorial by Abhishek Thakur.

Train custom object detection model with YOLO V5<br>
Abhishek Thakur<br>
https://www.youtube.com/watch?v=NU9Xr_NYslo

Notebook explaining yolo by Awsaf<br>
https://www.kaggle.com/awsaf49/vinbigdata-cxr-ad-yolov5-14-class-train/data?select=yolov5

Yolo v5 training notebook by Alien<br>
https://www.kaggle.com/h053473666/siim-cov19-yolov5-train

Yolo v5 inference notebook by Alien<br>
https://www.kaggle.com/h053473666/siim-cov19-efnb7-yolov5-infer

Ultralytics GitHub<br>
https://github.com/ultralytics/yolov5

Ultralytics Yolo getting started tutorial<br>
https://github.com/ultralytics/yolov5/wiki/Train-Custom-Data

Ultralytics notebook<br>
https://github.com/ultralytics/yolov5/blob/master/tutorial.ipynb

In [None]:
import pandas as pd
import numpy as np
import os

import ast
import cv2

from sklearn.model_selection import train_test_split
import shutil
from tqdm.notebook import tqdm
import tqdm.notebook as tq

import albumentations as albu
from albumentations import Compose

import matplotlib.pyplot as plt

In [None]:
os.listdir('../input/v2-balloon-detection-dataset')

In [None]:
base_path = '../input/v2-balloon-detection-dataset/'

## Set up yolov5

In [None]:
import torch
from IPython.display import Image, clear_output

clear_output()
print(f"Setup complete. Using torch {torch.__version__} ({torch.cuda.get_device_properties(0).name if torch.cuda.is_available() else 'CPU'})")

In [None]:
# Print the current working directory
!pwd

In [None]:
# List the files and folders in the working directory
!ls

In [None]:
# clone repo
!git clone https://github.com/ultralytics/yolov5.git  

# change the working directory to yolov5
#%cd yolov5
os.chdir('/kaggle/working/yolov5')

# install dependencies
%pip install -qr requirements.txt 

# Change the working directory back to /kaggle/working/
os.chdir('/kaggle/working/')

!pwd

## How to use yolov5 offline

In some Kaggle competitions the internet cannot be on during inference. In those cases it helps to know how to use Yolov5 offline. 

I have included a yolov5 folder in the v2-balloon-detection-dataset. To use yolov5 offline first comment out all the lines in the cell above. Then uncomment and run the cell below. It will work even if the internet is off. 

Installing requirements.txt is not essential on Kaggle.

In [None]:
# Ref: https://www.kaggle.com/awsaf49/vinbigdata-cxr-ad-yolov5-14-class-train/data

# Copy the yolov5 folder from the dataset to  /kaggle/working/

# Uncomment this line to set up yolov5 offline.
#shutil.copytree('../input/v2-balloon-detection-dataset/yolov5', '/kaggle/working/yolov5')
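
The offline copy step can also be wrapped in a small helper so that re-running the cell does not fail when the folder already exists (shutil.copytree raises an error if the destination is already there). This is a sketch only; `copy_repo_offline` is a made-up name and the demo uses temporary folders instead of the real Kaggle paths.

```python
import os
import shutil
import tempfile


def copy_repo_offline(src, dst):
    """Copy a local yolov5 folder to dst unless dst already exists.

    Returns True if a copy was made, False if dst was already present.
    """
    if not os.path.isdir(src):
        raise FileNotFoundError(f"yolov5 folder not found at: {src}")
    if os.path.isdir(dst):
        return False  # already set up, nothing to do
    shutil.copytree(src, dst)
    return True


# Demo with temporary folders standing in for the dataset and /kaggle/working.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "dataset", "yolov5")
    os.makedirs(src)
    open(os.path.join(src, "detect.py"), "w").close()

    dst = os.path.join(tmp, "working", "yolov5")
    first_copy = copy_repo_offline(src, dst)   # copies the folder
    second_copy = copy_repo_offline(src, dst)  # does nothing
    copied_file_exists = os.path.isfile(os.path.join(dst, "detect.py"))
    print(first_copy, second_copy, copied_file_exists)
```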


#### I used the following steps to get the yolov5 folder used in the above cell:

1- Create a new notebook<br>
2- Place the following line of code in that notebook to clone the repo. The yolov5 folder will download into the notebook.<br>

!git clone https://github.com/ultralytics/yolov5.git  

3- Commit the notebook. The yolov5 folder will be saved in the notebook's output.<br>
4- I downloaded the yolov5 folder from the output of the committed notebook to my local computer.<br>
5- I then uploaded the yolov5 folder to the v2-balloon-detection-dataset. When uploading to a dataset, don't allow the system to 'skip duplicates'. Click on the right portion of the button and select 'include duplicates'. Also, if you upload a zipped yolov5 folder then the kaggle dataset system will create a yolov5 folder within another yolov5 folder. In that case you will need to correct some of the paths below in order for the code to work.

## Load the data

In [None]:
# We need to be in /kaggle/working/ for the next cell to run.

!pwd

In [None]:
path = base_path + 'balloon-data.csv'

df_data = pd.read_csv(path)

# Convert bbox column entries from strings to lists
# "[........]" to [......]
df_data['bbox'] = df_data['bbox'].apply(ast.literal_eval)

print(df_data.shape)

df_data.head()

## Helper functions

In [None]:
# https://pythonprogramming.net/drawing-writing-python-opencv-tutorial/
# https://codeyarns.com/tech/2015-03-11-fonts-in-opencv.html
# https://stackoverflow.com/questions/60674501/how-to-make-black-background-in-cv2-puttext-with-python-opencv
# https://www.geeksforgeeks.org/python-opencv-cv2-puttext-method/
# https://pysource.com/2018/01/22/drawing-and-writing-on-images-opencv-3-4-with-python-3-tutorial-3/


def draw_bbox(image, xmin, ymin, xmax, ymax, text=None):
    
    """
    This function draws one bounding box on an image.
    
    Input: Image (numpy array)
    Output: Image with the bounding box drawn in. (numpy array)
    
    If there are multiple bounding boxes to draw then simply
    run this function multiple times on the same image.
    
    Set text=None to only draw a bbox without
    any text or text background.
    E.g. set text='Balloon' to write a 
    title above the bbox.
    
    xmin, ymin --> coords of the top left corner.
    xmax, ymax --> coords of the bottom right corner.
    
    """


    w = xmax - xmin
    h = ymax - ymin

    # Draw the bounding box
    # ......................
    
    start_point = (xmin, ymin) 
    end_point = (xmax, ymax) 
    bbox_color = (255, 0, 0) 
    bbox_thickness = 15

    image = cv2.rectangle(image, start_point, end_point, bbox_color, bbox_thickness) 
    
    
    
    # Draw the background behind the text and the text
    # .................................................
    
    # Only do this if text is not None.
    if text:
        
        # Draw the background behind the text
        text_bground_color = (0,0,0) # black
        cv2.rectangle(image, (xmin, ymin-150), (xmin+w, ymin), text_bground_color, -1)

        # Draw the text
        text_color = (255, 255, 255) # white
        font = cv2.FONT_HERSHEY_DUPLEX
        origin = (xmin, ymin-30)
        fontScale = 3
        thickness = 10

        image = cv2.putText(image, text, origin, font, 
                           fontScale, text_color, thickness, cv2.LINE_AA)



    return image

In [None]:
def display_images(df):

    # set up the canvas for the subplots
    plt.figure(figsize=(20,70))


    for i in range(1,13):

        index = i

        # Load an image
        path = base_path + 'images/' + df.loc[index, 'fname']
        image = plt.imread(path)
        #image = cv2.resize(image, (IMAGE_SIZE, IMAGE_SIZE))

        plt.subplot(10,3,i)

        plt.imshow(image)
        plt.axis('off')

## Display a few images

In [None]:
display_images(df_data)

## Display one image with bounding boxes

In [None]:
# set the figsize so the image is larger
plt.figure(figsize=(8,8))

# Choose an index.
# Change this number to see different images.
i = 4   

# Load an image
fname = df_data.loc[i, 'fname']

path = base_path + 'images/' + fname
image = plt.imread(path)

bbox_list = df_data.loc[i, 'bbox']

# Draw the bboxes on the image
for coord_dict in bbox_list:
    
    xmin = int(coord_dict['xmin'])
    ymin = int(coord_dict['ymin'])
    xmax = int(coord_dict['xmax'])
    ymax = int(coord_dict['ymax'])
    
    image = draw_bbox(image, xmin, ymin, xmax, ymax, text=None)

print(image.dtype)
print(image.min())
print(image.max())
print(image.shape)

plt.imshow(image)
plt.axis('off')
plt.show()

## Create train and val data

In [None]:
df_train, df_val = train_test_split(df_data, test_size=0.2, random_state=101)

print(df_train.shape)
print(df_val.shape)

## Create the Yolo directory structure

We need to create a directory structure inside the yolov5 folder. This is where the training and validation data will need to be stored.

In [None]:
# Note that the following folder structure must be
# located inside the yolov5 folder

# base_dir
    # images
        # train (contains image files)
        # validation (contains image files)
    # labels 
        # train (contains .txt files)
        # validation (contains .txt files)
        
# Yolo expects the bounding box dimensions to be
# normalized to have values between 0 and 1.
        
# Label format in .txt file
# class x-center y-center width height
# E.g. 0 0.1 0.2 0.3 0.4

# Each label is on a new line, in the .txt file:
# 0 0.1 0.2 0.3 0.4
# 0 0.5 0.6 0.2 0.2
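
As a worked example of the label format, here is a minimal sketch that converts one pixel bounding box into a normalized Yolo label line. The function name `to_yolo` and the box and image sizes are made up for illustration.

```python
def to_yolo(xmin, ymin, xmax, ymax, image_width, image_height):
    """Convert pixel corner coords to normalized (x-center, y-center, width, height)."""
    bbox_w = xmax - xmin
    bbox_h = ymax - ymin
    x_center = xmin + bbox_w / 2
    y_center = ymin + bbox_h / 2
    # Divide x values by the image width and y values by the image height.
    return (x_center / image_width,
            y_center / image_height,
            bbox_w / image_width,
            bbox_h / image_height)


# A 200 x 300 pixel box on a 1000 x 1000 pixel image (made-up numbers).
example = to_yolo(100, 200, 300, 500, image_width=1000, image_height=1000)

# Class id 0 followed by the four normalized values.
label_line = "0 " + " ".join(f"{v:.6f}" for v in example)
print(label_line)
```

Every value after the class id ends up between 0 and 1, which is what Yolo expects.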

In [None]:
! pwd

In [None]:
# change the working directory to yolov5
#%cd yolov5

os.chdir('/kaggle/working/yolov5')

!pwd

In [None]:
# Create a new directory (this is happening inside the yolov5 directory)

base_dir = 'base_dir'
os.mkdir(base_dir)


# Now we create folders inside 'base_dir':

# base_dir

    # images
        # train
        # validation

    # labels
        # train
        # validation

# images
images = os.path.join(base_dir, 'images')
os.mkdir(images)

# labels
labels = os.path.join(base_dir, 'labels')
os.mkdir(labels)



# Inside each folder we create separate train and validation folders

# create new folders inside images
train = os.path.join(images, 'train')
os.mkdir(train)
validation = os.path.join(images, 'validation')
os.mkdir(validation)


# create new folders inside labels
train = os.path.join(labels, 'train')
os.mkdir(train)
validation = os.path.join(labels, 'validation')
os.mkdir(validation)
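
The same folder structure can be created with a short loop. This is an alternative sketch, not the method used in this notebook: `make_yolo_dirs` is a made-up name, and it uses os.makedirs with exist_ok=True so that running the cell a second time does not raise an error the way os.mkdir does.

```python
import os
import tempfile


def make_yolo_dirs(base_dir):
    """Create images/ and labels/ folders, each with train and validation inside."""
    created = []
    for kind in ("images", "labels"):
        for split in ("train", "validation"):
            path = os.path.join(base_dir, kind, split)
            # Creates missing parent folders and ignores existing ones.
            os.makedirs(path, exist_ok=True)
            created.append(path)
    return created


# Demo inside a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    demo_dirs = make_yolo_dirs(os.path.join(tmp, "base_dir"))
    make_yolo_dirs(os.path.join(tmp, "base_dir"))  # safe to repeat
    all_dirs_exist = all(os.path.isdir(p) for p in demo_dirs)
    print(len(demo_dirs), all_dirs_exist)
```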

In [None]:
# This is the contents of the yolov5 folder
!ls

In [None]:
# check that the folders have been created
os.listdir('base_dir/images')

In [None]:
# Display the folder structure

!tree base_dir

## Process the data

Here we will write a function to process the training and validation data. 

We need to create a separate txt file for each image that contains the details of all the bounding boxes on that image.  This function will also move the training and val data into the directory structure that we created above. We won't need to do any image resizing for Yolo. It will do that automatically during training.

In [None]:
!pwd

In [None]:
# Change the working directory
# go one level up --> back to kaggle/working
#%cd ..

os.chdir('/kaggle/working/')

!pwd

In [None]:
df_train.head()

In [None]:
# Iterate through each row in the dataframe

# We run the function below separately for
# the train and val sets.
# Remember that each image gets its own text file
# containing the info for all bboxes on that image.

# For each image:
# 1- get the info for each bounding box
# 2- write the bounding box info to a txt file
# 3- save the txt file in the correct folder
# 4- copy the image to the correct folder


def process_data_for_yolo(df, data_type='train'):

    for _, row in tq.tqdm(df.iterrows(), total=len(df)):
        
        image_name = row['fname']
        bbox_list = row['bbox']
        
        image_width = row['width']
        image_height = row['height']
 
        
        # Convert into the Yolo input format
        # ...................................
        

        yolo_data = []
        
        # row by row
        for coord_dict in bbox_list:

            xmin = int(coord_dict['xmin'])
            ymin = int(coord_dict['ymin'])
            xmax = int(coord_dict['xmax'])
            ymax = int(coord_dict['ymax'])
            
            # We only have one class i.e. balloon.
            # We will set the class_id to 0 for all images.
            # Class numbers must start from 0.
            class_id = 0
            
            bbox_h = int(ymax - ymin)
            bbox_w = int(xmax - xmin)

            x_center = xmin + (bbox_w/2)
            y_center = ymin + (bbox_h/2)
            

            # Normalize
            # Yolo expects the dimensions to be normalized i.e.
            # all values between 0 and 1.

            x_center = x_center/image_width
            y_center = y_center/image_height
            bbox_w = bbox_w/image_width
            bbox_h = bbox_h/image_height

            # [class_id, x-center, y-center, width, height]
            yolo_list = [class_id, x_center, y_center, bbox_w, bbox_h]

            yolo_data.append(yolo_list)

        # convert to numpy array
        yolo_data = np.array(yolo_data)


        # save the text file
        # splitext keeps file names that contain extra dots intact
        image_id = os.path.splitext(image_name)[0]
        np.savetxt(os.path.join('yolov5/base_dir', 
                    f"labels/{data_type}/{image_id}.txt"),
                    yolo_data, 
                    fmt=["%d", "%f", "%f", "%f", "%f"]
                    ) # fmt means format the columns

        # Copy the image to images
        shutil.copyfile(
            os.path.join(base_path, f"images/{image_name}"),
            os.path.join('yolov5/base_dir', f"images/{data_type}/{image_name}")
        )
        

# Call the function    
process_data_for_yolo(df_train, data_type='train')
process_data_for_yolo(df_val, data_type='validation')
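
After writing the label files it can help to check that every line has five values and that the coordinates are normalized. The checker below is a sketch with a made-up name (`check_label_file`); the two demo files are written to a temporary folder and are not part of the dataset.

```python
import os
import tempfile


def check_label_file(path):
    """Return a list of problems found in one Yolo label file."""
    problems = []
    with open(path) as f:
        for line_no, line in enumerate(f, start=1):
            parts = line.split()
            if not parts:
                continue  # ignore blank lines
            if len(parts) != 5:
                problems.append(f"line {line_no}: expected 5 values, got {len(parts)}")
                continue
            if not parts[0].isdigit():
                problems.append(f"line {line_no}: class id is not a non-negative integer")
            try:
                coords = [float(p) for p in parts[1:]]
            except ValueError:
                problems.append(f"line {line_no}: non-numeric coordinate")
                continue
            if any(c < 0 or c > 1 for c in coords):
                problems.append(f"line {line_no}: coordinate outside 0..1")
    return problems


# Demo: one valid file and one file with pixel values left in by mistake.
with tempfile.TemporaryDirectory() as tmp:
    good_path = os.path.join(tmp, "good.txt")
    bad_path = os.path.join(tmp, "bad.txt")
    with open(good_path, "w") as f:
        f.write("0 0.200000 0.350000 0.200000 0.300000\n")
    with open(bad_path, "w") as f:
        f.write("0 0.1 0.2 200 300\n")
    good_problems = check_label_file(good_path)
    bad_problems = check_label_file(bad_path)
    print(good_problems)
    print(bad_problems)
```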

In [None]:
# Check that the files have been created

print(len(os.listdir('yolov5/base_dir/images/train')))
print(len(os.listdir('yolov5/base_dir/images/validation')))

print(len(os.listdir('yolov5/base_dir/labels/train')))
print(len(os.listdir('yolov5/base_dir/labels/validation')))

In [None]:
text_file_list = os.listdir('yolov5/base_dir/labels/train')

text_file = text_file_list[0]

text_file

In [None]:
# Display the contents of a text file

! cat 'yolov5/base_dir/labels/train/155815494_800fc9aa32_b.txt'

In [None]:
# List the images in the train folder

#os.listdir('yolov5/base_dir/images/train')

## Create the yaml file
Yolo requires that we also create a yaml file inside the yolov5 folder.

In [None]:
# Ref:
# Reading and Writing YAML to a File in Python
# https://stackabuse.com/reading-and-writing-yaml-to-a-file-in-python

In [None]:
yaml_dict = {'train': 'base_dir/images/train',   # path to the train folder
            'val': 'base_dir/images/validation', # path to the val folder
            'nc': 1,                             # number of classes
            'names': ['balloon']}                # list of label names

In [None]:
# Create the yaml file called my_data.yaml
# We will save this file inside the yolov5 folder.

import yaml

with open(r'yolov5/my_data.yaml', 'w') as file:
    documents = yaml.dump(yaml_dict, file)

In [None]:
# Check that the my_data.yaml file is in the yolov5 folder.
# It should appear in the list of files.

os.listdir('yolov5')

In [None]:
# Display the contents of the yaml file

! cat 'yolov5/my_data.yaml'
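
Because the paths in the yaml file are relative to the yolov5 folder, a quick check before training can catch a wrong path or a class count that does not match the list of names. This is a sketch: `check_dataset_config` is a made-up helper, and the demo uses a temporary folder in which the validation folder is left out on purpose.

```python
import os
import tempfile


def check_dataset_config(cfg, root):
    """Return a list of problems with a dataset config dict.

    root is the folder that train.py is run from (the yolov5 folder here).
    """
    problems = []
    if cfg["nc"] != len(cfg["names"]):
        problems.append(f"nc is {cfg['nc']} but names has {len(cfg['names'])} entries")
    for key in ("train", "val"):
        folder = os.path.join(root, cfg[key])
        if not os.path.isdir(folder):
            problems.append(f"{key} folder not found: {folder}")
    return problems


# Demo: only the train folder exists, so the val path is reported.
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "base_dir", "images", "train"))
    cfg = {"train": "base_dir/images/train",
           "val": "base_dir/images/validation",
           "nc": 1,
           "names": ["balloon"]}
    config_problems = check_dataset_config(cfg, tmp)
    print(config_problems)
```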

## Make a prediction on an existing image

This is how to make a prediction on the popular "Zidane" image. This image is not from our dataset. It comes with the yolov5 folder by default.

In [None]:
# change the working directory to yolov5
#%cd yolov5

os.chdir('/kaggle/working/yolov5')

!pwd

In [None]:
# Make a prediction.
# Note that we are typing on the command line.
# The exclamation mark (!) allows command line instructions
# to be issued from a notebook cell.

# Make a prediction
!python detect.py --weights yolov5s.pt --img 640 --conf 0.25 --source data/images/ --save-txt --save-conf --exist-ok

# Display the predicted image.
# Yolo draws the predicted bounding boxes onto the image.
Image(filename='runs/detect/exp/zidane.jpg', width=600)

In [None]:
os.listdir('runs/detect/exp/labels/')

In [None]:
# These are the predicted bounding box coords and their confidence scores.
# The results are in a txt file - one txt file per test image.
# Because we used --exist-ok each time a prediction is made
# the results get appended to the same text file (not overwritten).

# Next we will see how to put this info into a dataframe.
# Format: [class, x-center, y-center, width, height, conf-score]

# Display the contents of the text file.
# The number of lines keeps increasing each time you run the prediction cell above.
!cat 'runs/detect/exp/labels/zidane.txt'

In [None]:
# get a list of detect experiments
exp_list = os.listdir('runs/detect/')

exp_list

## How to put the predictions into a dataframe

In [None]:
# How to put the bbox info from one txt file into a dataframe
# https://stackoverflow.com/questions/21546739/load-data-from-txt-with-pandas

path = f'runs/detect/exp/labels/zidane.txt'

cols = ['class', 'x-center', 'y-center', 'width', 'height', 'conf-score']

df = pd.read_csv(path, sep=" ", header=None)

df.columns = cols

df.head()
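
The predicted values are normalized, so to draw a box or compare it with the original annotations they need to be converted back to pixel corner coordinates. Here is a minimal sketch; the function name `yolo_to_corners` and the example numbers are made up, and the image width and height must be those of the original image.

```python
def yolo_to_corners(x_center, y_center, width, height, image_width, image_height):
    """Convert normalized Yolo values to pixel (xmin, ymin, xmax, ymax)."""
    box_w = width * image_width
    box_h = height * image_height
    xmin = x_center * image_width - box_w / 2
    ymin = y_center * image_height - box_h / 2
    # Round to whole pixels so the result can be passed to drawing functions.
    return (round(xmin), round(ymin), round(xmin + box_w), round(ymin + box_h))


# One made-up prediction on a 1000 x 1000 pixel image.
corners = yolo_to_corners(0.2, 0.35, 0.2, 0.3, image_width=1000, image_height=1000)
print(corners)
```

The four integers can be passed straight to the draw_bbox helper defined earlier in this notebook.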

## Train the model

In [None]:
# We should be inside the yolov5 folder
!pwd

In [None]:
# What some of the parameters mean:

# --weights => the pre-trained model that we are using.
# The list of available pre-trained models can be found here:
# https://github.com/ultralytics/yolov5

# --save-txt => The predicted bbox coordinates get saved to a txt file. One txt file per image.
# --save-conf => The conf score gets included in the above txt file.
# --img => The image will be resized to this size before creating the mosaic.
# --conf => The confidence threshold
# --rect => Means don't use mosaic augmentation during training
# --name => Give a model a name e.g. --name my_model
# --batch => batch size
# --epochs => number of training epochs
# --data => the yaml file path
# --exist-ok => do not increment the project names with each run i.e. don't change exp to exp2, exp3 etc.
# --nosave => do not save the images/videos (helpful when deploying to a server)

# It's helpful to review the source code in detect.py to know what the above parameters mean.
# detect.py is located inside the yolov5 folder.

In [None]:
# If you uncomment and run this line you will get a request to enter a 
# wandb password. To solve this problem we include WANDB_MODE="dryrun" in
# the next line.
#! python train.py --img 1024 --batch 8 --epochs 2 --data my_data.yaml --cfg models/yolov5s.yaml --name wheatmodel

# Without using pre-trained weights
#!WANDB_MODE="dryrun" python train.py --img 1024 --batch 24 --epochs 10 --data my_data.yaml --cfg models/yolov5s.yaml --name my_model

# Using pre-trained weights and image cache i.e. loading all images into RAM
#!WANDB_MODE="dryrun" python train.py --img 1024 --batch 24 --epochs 10 --data my_data.yaml --weights yolov5s.pt --cache

# Not caching images
!WANDB_MODE="dryrun" python train.py --img 640 --batch 16 --epochs 200 --data my_data.yaml --weights yolov5s.pt
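
When the training settings need to change between runs, it can be convenient to assemble the command in Python instead of editing a long line by hand. The sketch below only builds the argument list and does not start training. The helper name `build_train_command` is made up, and it uses only the flags shown in this notebook.

```python
import os


def build_train_command(data, weights, img=640, batch=16, epochs=200,
                        hyp=None, cache=False):
    """Assemble the train.py argument list from keyword arguments."""
    cmd = ["python", "train.py",
           "--img", str(img),
           "--batch", str(batch),
           "--epochs", str(epochs),
           "--data", data,
           "--weights", weights]
    if hyp is not None:
        cmd += ["--hyp", hyp]   # path to a custom hyperparameter file
    if cache:
        cmd.append("--cache")   # load all images into RAM
    return cmd


train_cmd = build_train_command("my_data.yaml", "yolov5s.pt")
print(" ".join(train_cmd))

# Copy of the environment with the same WANDB_MODE setting as the cell above.
train_env = dict(os.environ, WANDB_MODE="dryrun")

# To launch training from inside the yolov5 folder (not run here):
# import subprocess
# subprocess.run(train_cmd, env=train_env, check=True)
```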

## How to train using custom augmentation and hyperparameter settings

By default Yolo uses the augmentation and hyperparameter settings in this file during training: hyp=data/hyps/hyp.scratch.yaml

To train with custom settings we need to create a custom hyp yaml file and then set the --hyp parameter before training. --hyp is the path to the custom file that we created.

In [None]:
# Create the custom file

# Ref:
# Reading and Writing YAML to a File in Python
# https://stackabuse.com/reading-and-writing-yaml-to-a-file-in-python


yaml_dict = {
    
'lr0': 0.01,  # initial learning rate (SGD=1E-2, Adam=1E-3)
'lrf': 0.032,  # final OneCycleLR learning rate (lr0 * lrf)
'momentum': 0.937,  # SGD momentum/Adam beta1
'weight_decay': 0.0005,  # optimizer weight decay 5e-4
'warmup_epochs': 3.0,  # warmup epochs (fractions ok)
'warmup_momentum': 0.8,  # warmup initial momentum
'warmup_bias_lr': 0.1,  # warmup initial bias lr
'box': 0.1,  # box loss gain
'cls': 1.0,  # cls loss gain
'cls_pw': 0.5,  # cls BCELoss positive_weight
'obj': 2.0,  # obj loss gain (scale with pixels)
'obj_pw': 0.5,  # obj BCELoss positive_weight
'iou_t': 0.20,  # IoU training threshold
'anchor_t': 4.0,  # anchor-multiple threshold
'anchors': 0,  # anchors per output layer (0 to ignore)
'fl_gamma': 0.0,  # focal loss gamma (efficientDet default gamma=1.5)
'hsv_h': 0,  # image HSV-Hue augmentation (fraction)
'hsv_s': 0,  # image HSV-Saturation augmentation (fraction)
'hsv_v': 0,  # image HSV-Value augmentation (fraction)
'degrees': 0,  # image rotation (+/- deg)
'translate': 0.2,  # image translation (+/- fraction)
'scale': 0.3,  # image scale (+/- gain)
'shear': 0.0,  # image shear (+/- deg)
'perspective': 0.0,  # image perspective (+/- fraction), range 0-0.001
'flipud': 0,  # image flip up-down (probability)
'fliplr': 0.5,  # image flip left-right (probability)
'mosaic': 1.0,  # image mosaic (probability)
'mixup': 0.0  # image mixup (probability)
    
}


# Create the yaml file called my_hyp.yaml
# We will save this file inside the yolov5 folder.

# change the working directory to /kaggle/working
os.chdir('/kaggle/working')

import yaml

with open(r'yolov5/my_hyp.yaml', 'w') as file:
    documents = yaml.dump(yaml_dict, file)
    

In [None]:
# Train the model

# Add the --hyp parameter with the path to the custom file.

# !WANDB_MODE="dryrun" python train.py --img 640 --batch 16 --epochs 200 --data my_data.yaml --hyp my_hyp.yaml --weights yolov5s.pt

## Get the name of the last experiment

In [None]:
# change the working directory to yolov5
os.chdir('/kaggle/working/yolov5')

!pwd

In [None]:
os.listdir('runs/train/')

In [None]:
# get a list of experiments
exp_list = os.listdir('runs/train/')

exp_list

In [None]:
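# Get the latest exp.
# os.listdir() returns names in arbitrary order, so the latest experiment
# is not guaranteed to be the first or the last item in the list.
# Pick the most recently modified folder instead.
exp = max(exp_list, key=lambda d: os.path.getmtime(os.path.join('runs/train', d)))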

exp

In [None]:
# Display the contents of the "exp" folder
os.listdir(f'runs/train/{exp}')

## Review the training results

**Please look at the list of files in the output of the above cell.**

- Yolo stores all the training curves as one png file. To view the training curves we need to display the png file.

- **IMPORTANT NOTE:** The summary displayed at the end of training shows the results for the LAST epoch. These are not the results for the BEST epoch. The results for each epoch are logged in a file called results.txt. We will load that file into a pandas dataframe and then get the best epoch and the best mAP score.

- Yolo also stores png images showing the true and predicted labels for each val batch. In these images the true and predicted bounding boxes are drawn in. One batch is shown on one image.

There's more results info available. The yolov5 folder will appear in the output of this notebook. I suggest that you download it and look at the contents of the yolov5/runs/train/exp folder.

In [None]:
# Display the contents of the "exp" folder
os.listdir(f'runs/train/{exp}')

In [None]:
# change the working directory to yolov5

os.chdir('/kaggle/working/yolov5')

!pwd

## Display the training curves

From these curves you will be able to see at which epoch the model started to overfit.

In [None]:
plt.figure(figsize = (15, 15))
plt.imshow(plt.imread(f'runs/train/{exp}/results.png'))

## Get the best mAP and best epoch

Here we will read the results.txt file and put the results into a dataframe. We will then filter this dataframe to find the max map0.5 value.

In [None]:
# Display the contents of the results.txt file

path = f'runs/train/{exp}/results.txt'

!cat $path

In [None]:
# Read the results from the training log: results.txt

# https://stackoverflow.com/questions/3277503/how-to-read-a-file-line-by-line-into-a-list
# https://stackoverflow.com/questions/65381312/how-to-convert-a-yolo-darknet-format-into-csv-file


filename = f'runs/train/{exp}/results.txt'

file_list = []

with open(filename) as f:
    # read all lines into a list, format: ['item item item', 'item item item', ...]
    file_line_list = f.readlines()
    
    
for i in range(0, len(file_line_list)):
    
    # Get line i and split it on whitespace.
    # This returns a list of all items in the line: ['item', 'item', 'item']
    line_list = file_line_list[i].split()
    
    # remove any leftover whitespace characters from each item
    line_list = [x.strip() for x in line_list]
    
    # Save the list.
    # file_list is a list of lists
    file_list.append(line_list)
    
len(file_list)

In [None]:
# Put the file data into a dataframe

df = pd.DataFrame(file_list)

df.head(10)

In [None]:
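# choose only the columns we want

col_names = ['epoch', 'P', 'R', 'map0.5', 'map0.5:0.95']

# filter out specific columns
df_results = df[[0, 8, 9, 10, 11]].copy()

# change the column names
df_results.columns = col_names

# The values were read as strings. Convert the metric columns to
# floats so that max() compares numbers and not text.
metric_cols = ['P', 'R', 'map0.5', 'map0.5:0.95']
df_results[metric_cols] = df_results[metric_cols].astype(float)

df_results.head(10)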

In [None]:
# Get the best map0.5

best_map = df_results['map0.5'].max()

print('---------------------')

print('Best map0.5:', best_map)
print()

# print the row that contains the best map0.5
df = df_results[df_results['map0.5'] == best_map]

print(df.head())

print('---------------------')

## Display one batch of train images

In [None]:
# Train
# One mosaic batch of train images with labels

plt.figure(figsize = (15, 15))
plt.imshow(plt.imread(f'runs/train/{exp}/train_batch0.jpg'))

## Display true and predicted val set bboxes

Here we will display the true and predicted bboxes for two val batches.

In [None]:
# BATCH 0 - TRUE BBOXES

plt.figure(figsize = (15, 15))
plt.imshow(plt.imread(f'runs/train/{exp}/test_batch0_labels.jpg'))

In [None]:
# BATCH 0 - PREDICTED BBOXES

plt.figure(figsize = (15, 15))
plt.imshow(plt.imread(f'runs/train/{exp}/test_batch0_pred.jpg'))

## Where is the trained model saved?

In [None]:
!pwd

In [None]:
# The printout at the end of training tells us where the weights for
# the last model and the best model are stored:

# Example:
# Optimizer stripped from runs/train/exp2/weights/last.pt, 14.5MB
# Optimizer stripped from runs/train/exp2/weights/best.pt, 14.5MB

# Display the contents of the "weights" folder.
# You will see the weights of the best model and the last model.
os.listdir(f'runs/train/{exp}/weights')

## How to do inference

In [None]:
# change the working directory to yolov5
#%cd yolov5

In [None]:
!ls

In [None]:
# Make a prediction on the test images

# Note if we had test images, the absolute path to the folder containing the
# test images can be set as follows:
# '/kaggle/input/global-wheat-detection/test'
# (note it's /kaggle/input/ and not /kaggle/working/)

# Adding --save-txt means that after prediction each image will have a txt file with bounding box info.
# https://github.com/ultralytics/yolov5/issues/388
# --save-conf means that we will also save the confidence scores for each bounding box.

# Here we are just making a prediction on images that are inside the 'images' folder.
# This is just a demo. Change the path to point to your test images.
# Note that we are using --save-txt and --save-conf because we want to save the 
# predicted bounding box coordinates and confidence scores.
!python detect.py --source '/kaggle/input/v2-balloon-detection-dataset/images' --weights 'runs/train/exp/weights/best.pt' --img 640 --save-txt --save-conf --exist-ok

In [None]:
# get a list of detect experiments
exp_list = os.listdir('runs/detect/')

#latest_index = len(exp_list) - 1

# Get the latest experiment

# ** NOTE: os.listdir() returns names in arbitrary order, so the latest
# experiment can be the first or the last element in the list. **
# Because we used --exist-ok there is only one detect folder (exp),
# so taking the first element is safe here.
detect_exp = exp_list[0]

detect_exp

In [None]:
exp_list

In [None]:
# How to display the contents of a predicted txt file

# Example:
# ! cat 'runs/detect/exp2/labels/cc3532ff6.txt'

In [None]:
# These are the predicted images with the bboxes already drawn in:

# os.listdir(f'runs/detect/{detect_exp}')

In [None]:
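# Keep only image files (the folder also contains a 'labels' subfolder)
pred_list = [f for f in os.listdir(f'runs/detect/{detect_exp}')
             if f.lower().endswith(('.jpg', '.jpeg', '.png'))]

image_id = pred_list[3]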

path = f'runs/detect/{detect_exp}/' + image_id
image = plt.imread(path)

plt.imshow(image)

## Delete base_dir to prevent a Kaggle error

If there are too many files in this notebook's output then the notebook commit may fail. That's why we need to delete the base_dir folder.

In [None]:
# change the working directory to yolov5
#%cd yolov5

os.chdir('/kaggle/working/yolov5')

!pwd

In [None]:
# Delete the folder to prevent a Kaggle error.

if os.path.isdir('base_dir'):
    shutil.rmtree('base_dir')
    

In [None]:
# Check that base_dir is no longer in the yolov5 folder

os.listdir('/kaggle/working/yolov5')

## Conclusion

Thank you for reading.