Task 2: Model Training

Implement and configure the YOLO object detection model using a suitable framework.
Train the model on the prepared training dataset using appropriate hyperparameters. Monitor the training progress and ensure convergence.

**Check the GPU and connect Google Drive**

In [1]:
# Check if NVIDIA GPU is enabled
!nvidia-smi

Tue Aug 15 19:55:33 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla V100-SXM2...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   33C    P0    24W / 300W |      0MiB / 16384MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [2]:
# Connect to google drive
from google.colab import drive
drive.mount('/content/gdrive')
!ln -s /content/gdrive/My\ Drive/ /mydrive
!ls /mydrive

Mounted at /content/gdrive
 10k.docx
 1637895821717_1637878768570_LOA.pdf
 2010Q1-house-disburse-detail.csv
'2010Q1-house-disburse-summary (1).gsheet'
'2010Q1-house-disburse-summary (2).gsheet'
'2010Q1-house-disburse-summary (3).gsheet'
 2010Q1-house-disburse-summary.csv
 2010Q1-house-disburse-summary.gsheet
 286170200-Service-Management-Operations.pdf
'4802 Personal Branding Marking Rubric SUMMER 2023SH.pdf'
'7th -  SO 2'
'7th - SO 4'
'7th - SO 5'
'7th - SO 6'
 9.1-PCA_Data_Viz.ipynb
 9.2-PCA_LogisticRegression.ipynb
'Ama Birthday'
 Auto-mpg.gsheet
 auto-mpg.txt
 B9B4983E-299A-4FD9-B8E4-B0D965B1F998.MP4
'CC Peony.pdf'
'Colab Notebooks'
'Copy of bulat reedit.jpeg'
'Copy of Freelance Invoice - A4.gslides'
'Copy of IN-05.3.jpeg'
'Copy of IN-05.jpeg'
'Copy of IN-06.2.jpeg'
'Copy of IN-07.2.jpeg'
'Copy of IN-07.jpeg'
'Copy of Leaf Earring 1.jpeg'
'Copy of Moon Earring 2.jpeg'
'Copy of Round Earring 2.jpeg'
'Copy of Round Earring 3.jpeg'
'CoV Task Force'
'Customer Feedback.gform'
'DANA Proj

**1) Clone the Darknet repository**



In [3]:
!git clone https://github.com/AlexeyAB/darknet

Cloning into 'darknet'...
remote: Enumerating objects: 15549, done.
remote: Counting objects: 100% (35/35), done.
remote: Compressing objects: 100% (29/29), done.
remote: Total 15549 (delta 10), reused 27 (delta 6), pack-reused 15514
Receiving objects: 100% (15549/15549), 14.24 MiB | 6.28 MiB/s, done.
Resolving deltas: 100% (10422/10422), done.


**2) Compile Darknet with NVIDIA GPU support**


In [4]:
# Change the Makefile to enable OpenCV, GPU, and cuDNN support
%cd darknet
!sed -i 's/OPENCV=0/OPENCV=1/' Makefile
!sed -i 's/GPU=0/GPU=1/' Makefile
!sed -i 's/CUDNN=0/CUDNN=1/' Makefile
!make

/content/darknet
mkdir -p ./obj/
mkdir -p backup
chmod +x *.sh
g++ -std=c++11 -std=c++11 -Iinclude/ -I3rdparty/stb/include -DOPENCV `pkg-config --cflags opencv4 2> /dev/null || pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN -Wall -Wfatal-errors -Wno-unused-result -Wno-unknown-pragmas -fPIC -rdynamic -Ofast -DOPENCV -DGPU -DCUDNN -I/usr/local/cudnn/include -c ./src/image_opencv.cpp -o obj/image_opencv.o
./src/image_opencv.cpp: In function ‘void draw_detections_cv_v3(void**, detection*, int, float, char**, image**, int, int)’:
  946 |                 float rgb[3];
      |                       ^~~
./src/image_opencv.cpp: In function ‘void draw_train_loss(char*, void**, int, float, float, int, int, float, int, char*, float, int, int, double)’:
 1147 |             if (iteration_old == 0)
      |             ^~
./src/image_opencv.cpp:1150:

**3) Configure Darknet network for training YOLO V3**

Because the dataset has 10 classes, the `filters` value of the convolutional layer directly before each `[yolo]` layer is set with the formula:

(number of classes + 5) * 3 = (10 + 5) * 3 = 45


`max_batches` is normally set with the formula:

number of classes * 2000 = 10 * 2000 = 20,000

However, an earlier attempt showed that this many batches was too large to process. We therefore train for 8,000 batches, which is less than half of the recommended value.
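
As a quick arithmetic check of the two formulas above, the values can be computed directly. This is only a worked example; the variable names are illustrative and do not come from Darknet.

```python
# Worked example of the two sizing formulas for a YOLOv3 config.
num_classes = 10         # classes in this dataset
boxes_per_scale = 3      # YOLOv3 predicts 3 boxes per grid cell at each scale
box_terms = 5            # x, y, width, height, objectness

filters = (num_classes + box_terms) * boxes_per_scale
recommended_max_batches = num_classes * 2000

print(filters)                         # 45
print(recommended_max_batches)         # 20000
print(8000 / recommended_max_batches)  # 0.4
```

The last line shows that 8,000 batches is 40% of the value the formula recommends.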

In [5]:
!cp cfg/yolov3.cfg cfg/yolov3_training.cfg

In [6]:
# Training batch settings and the total number of iterations (max_batches)
!sed -i 's/batch=1/batch=64/' cfg/yolov3_training.cfg
!sed -i 's/subdivisions=1/subdivisions=16/' cfg/yolov3_training.cfg
!sed -i 's/max_batches = 500200/max_batches = 8000/' cfg/yolov3_training.cfg
# Set classes=10 in each of the three [yolo] layers (edits only the given line number)
!sed -i '610 s@classes=80@classes=10@' cfg/yolov3_training.cfg
!sed -i '696 s@classes=80@classes=10@' cfg/yolov3_training.cfg
!sed -i '783 s@classes=80@classes=10@' cfg/yolov3_training.cfg
# Set filters=(10 + 5) * 3 = 45 in the convolutional layer before each [yolo] layer
!sed -i '603 s@filters=255@filters=45@' cfg/yolov3_training.cfg
!sed -i '689 s@filters=255@filters=45@' cfg/yolov3_training.cfg
!sed -i '776 s@filters=255@filters=45@' cfg/yolov3_training.cfg
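
Unanchored substitutions such as `s/subdivisions=1/subdivisions=16/` match any line that contains the pattern, so a line that already reads `subdivisions=16` would become `subdivisions=166`. The sketch below shows one way to do the same kind of edit with an anchored pattern in Python. `set_cfg_value` is a hypothetical helper written for this illustration, and the sample text is made up rather than copied from `yolov3.cfg`.

```python
import re

def set_cfg_value(cfg_text, key, value):
    """Set `key = value` on uncommented lines whose key matches exactly (hypothetical helper)."""
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}[ \t]*=[ \t]*).*$", flags=re.MULTILINE
    )
    # Keep the original "key = " prefix and replace only the value part.
    return pattern.sub(lambda m: f"{m.group(1)}{value}", cfg_text)

# Made-up sample config text (an assumption, not the real file contents).
sample = "[net]\n#batch=1\nbatch=64\nsubdivisions=16\nmax_batches = 500200\n"

# A plain substring replacement shows the pitfall.
naive = sample.replace("subdivisions=1", "subdivisions=16")

# The anchored version leaves comments and longer keys untouched.
updated = set_cfg_value(sample, "subdivisions", 16)
updated = set_cfg_value(updated, "max_batches", 8000)

print(naive)
print(updated)
```

Because the pattern is anchored at the start of the line, `batch` does not match `max_batches`, and commented lines such as `#batch=1` are skipped.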

In [7]:
!echo -e 'carrot\ncabbage\ntomato\nspinach\nbanana\norange\nlettuce\nonion\nstrawberry\npotato' > data/obj.names
!echo -e 'classes= 10\ntrain  = data/train.txt\nvalid  = data/test.txt\nnames = data/obj.names\nbackup = /mydrive/yolov3' > data/obj.data
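
Darknet reads each class index in a label file as a line position in `data/obj.names`, so the `classes` count and the name order need to stay in sync. The sketch below builds both file contents from a single Python list, which is one way to avoid a mismatch. `build_obj_files` is a hypothetical helper, and the sketch only builds strings; it does not write any files.

```python
# Class names in the order written to data/obj.names in this notebook.
class_names = ["carrot", "cabbage", "tomato", "spinach", "banana",
               "orange", "lettuce", "onion", "strawberry", "potato"]

def build_obj_files(names, train_path, valid_path, names_path, backup_dir):
    """Return (names_text, data_text) for obj.names and obj.data (hypothetical helper)."""
    names_text = "\n".join(names) + "\n"
    data_text = (
        f"classes = {len(names)}\n"
        f"train = {train_path}\n"
        f"valid = {valid_path}\n"
        f"names = {names_path}\n"
        f"backup = {backup_dir}\n"
    )
    return names_text, data_text

names_text, data_text = build_obj_files(
    class_names, "data/train.txt", "data/test.txt", "data/obj.names", "/mydrive/yolov3"
)

# Class index implied by the order of the names list.
index_of = {name: idx for idx, name in enumerate(class_names)}

print(data_text)
print(index_of)
```

With this order, `carrot` has index 0 and `potato` has index 9, so the label files must use the same numbering.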

In [8]:
!cp cfg/yolov3_training.cfg /mydrive/yolov3/yolov3_testing.cfg
!cp data/obj.names /mydrive/yolov3/classes.txt

**4) Extract Images**

The images must be packed in a zip archive named "images.zip", and the archive must be stored in the "yolov3" folder on Google Drive.

In [9]:
!mkdir data/obj
!unzip /mydrive/yolov3/images.zip -d data/obj

Archive:  /mydrive/yolov3/images.zip
   creating: data/obj/images/
  inflating: data/obj/__MACOSX/._images  
  inflating: data/obj/images/cabbage_14.jpg  
  inflating: data/obj/__MACOSX/images/._cabbage_14.jpg  
  inflating: data/obj/images/spinach_10.xml.txt  
  inflating: data/obj/images/potato_7.xml.txt  
  inflating: data/obj/images/potato_9.jpg  
  inflating: data/obj/__MACOSX/images/._potato_9.jpg  
  inflating: data/obj/images/tomato_13.xml.txt  
  inflating: data/obj/images/carrot_27.xml.txt  
  inflating: data/obj/images/tomato_5.xml.txt  
  inflating: data/obj/images/onion_9.jpg  
  inflating: data/obj/__MACOSX/images/._onion_9.jpg  
  inflating: data/obj/images/tomato_15.jpg  
  inflating: data/obj/__MACOSX/images/._tomato_15.jpg  
  inflating: data/obj/images/potato_8.jpg  
  inflating: data/obj/__MACOSX/images/._potato_8.jpg  
  inflating: data/obj/images/orange_17.xml.txt  
  inflating: data/obj/images/lettuce_10.xml.txt  
  inflating: data/obj/images/lettuce_4.xml.txt  


In [10]:
# Download the pretrained Darknet-53 convolutional weights
!wget https://pjreddie.com/media/files/darknet53.conv.74

--2023-08-15 19:57:37--  https://pjreddie.com/media/files/darknet53.conv.74
Resolving pjreddie.com (pjreddie.com)... 128.208.4.108
Connecting to pjreddie.com (pjreddie.com)|128.208.4.108|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 162482580 (155M) [application/octet-stream]
Saving to: ‘darknet53.conv.74’


2023-08-15 19:57:46 (18.7 MB/s) - ‘darknet53.conv.74’ saved [162482580/162482580]



#### Note:
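The annotations are stored as `.xml.txt` files, and the cell below writes a `.txt` label file with the same base name as each image. It copies every five-field annotation line (class index, x center, y center, width, height) into that file. Since we have 10 classes, each class index must be between 0 and 9 and must follow the line order of `data/obj.names`. Each converted path is printed to confirm that the conversion ran.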

In [11]:
import glob

# Reference mapping of class names to class indices (alphabetical order).
# NOTE: this mapping is not applied below; the class indices already stored in
# the annotation files are copied unchanged. Darknet reads each index as a line
# position in data/obj.names, whose order (carrot, cabbage, tomato, ...) differs
# from this mapping, so check which order the annotations actually use.
class_mapping = {
    'banana': 0,
    'cabbage': 1,
    'carrot': 2,
    'lettuce': 3,
    'onion': 4,
    'orange': 5,
    'potato': 6,
    'spinach': 7,
    'strawberry': 8,
    'tomato': 9
}

# Get the paths of all annotation files
xml_txt_file_paths = glob.glob("data/obj/images/*.xml.txt")

for xml_txt_file_path in xml_txt_file_paths:
    # Load content from the .xml.txt file
    with open(xml_txt_file_path, "r") as xml_txt_file:
        lines = xml_txt_file.readlines()

    # Create the corresponding .txt file path
    txt_file_path = xml_txt_file_path.replace(".xml.txt", ".txt")

    # Copy every well-formed line (5 fields) to the .txt file
    with open(txt_file_path, "w") as txt_file:
        for line in lines:
            numbers = line.strip().split()
            if len(numbers) == 5:
                class_idx = int(numbers[0])
                # No conversion is applied: the values are assumed to be
                # in YOLO format already and are written back as-is.
                x_center, y_center, width, height = map(float, numbers[1:])
                txt_file.write(f"{class_idx} {x_center} {y_center} {width} {height}\n")

    print(f"Converted {xml_txt_file_path} to {txt_file_path}")

Converted data/obj/images/orange_9.xml.txt to data/obj/images/orange_9.txt
Converted data/obj/images/carrot_12.xml.txt to data/obj/images/carrot_12.txt
Converted data/obj/images/orange_18.xml.txt to data/obj/images/orange_18.txt
Converted data/obj/images/cabbage_14.xml.txt to data/obj/images/cabbage_14.txt
Converted data/obj/images/potato_17.xml.txt to data/obj/images/potato_17.txt
Converted data/obj/images/potato_6.xml.txt to data/obj/images/potato_6.txt
Converted data/obj/images/carrot_13.xml.txt to data/obj/images/carrot_13.txt
Converted data/obj/images/spinach_5.xml.txt to data/obj/images/spinach_5.txt
Converted data/obj/images/strawberry_8.xml.txt to data/obj/images/strawberry_8.txt
Converted data/obj/images/potato_4.xml.txt to data/obj/images/potato_4.txt
Converted data/obj/images/tomato_7.xml.txt to data/obj/images/tomato_7.txt
Converted data/obj/images/spinach_20.xml.txt to data/obj/images/spinach_20.txt
Converted data/obj/images/tomato_5.xml.txt to data/obj/images/tomato_5.txt
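
Since the conversion copies values without checking them, a small validation pass can catch bad labels before training. The sketch below checks a single label line, assuming the usual Darknet label layout of `class x_center y_center width height` with coordinates normalized to the range 0 to 1. `check_label_line` is a hypothetical helper, not part of Darknet.

```python
def check_label_line(line, num_classes):
    """Return a list of problems found in one label line (empty list if none)."""
    parts = line.split()
    if len(parts) != 5:
        return [f"expected 5 fields, got {len(parts)}"]
    try:
        class_idx = int(parts[0])
        coords = [float(p) for p in parts[1:]]
    except ValueError:
        return ["non-integer class index or non-numeric coordinate"]
    problems = []
    if not 0 <= class_idx < num_classes:
        problems.append(f"class index {class_idx} out of range")
    if any(not 0.0 <= c <= 1.0 for c in coords):
        problems.append("coordinate outside [0, 1]")
    return problems

# Made-up example lines, not taken from the dataset.
print(check_label_line("3 0.5 0.5 0.2 0.3", num_classes=10))   # []
print(check_label_line("12 0.5 0.5 0.2 0.3", num_classes=10))  # class index out of range
print(check_label_line("0 120 80 40 40", num_classes=10))      # pixel values, not normalized
```

Running this over every line of every `.txt` label file would flag out-of-range class indices and coordinates that look like pixel values.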

Only the .jpg files are collected, and their paths are stored in a list called images_list.

In [12]:
import glob
images_list = glob.glob("data/obj/images/*.jpg")
print(images_list)

['data/obj/images/potato_7.jpg', 'data/obj/images/strawberry_18.jpg', 'data/obj/images/orange_8.jpg', 'data/obj/images/strawberry_13.jpg', 'data/obj/images/lettuce_12.jpg', 'data/obj/images/spinach_13.jpg', 'data/obj/images/strawberry_15.jpg', 'data/obj/images/spinach_10.jpg', 'data/obj/images/banana_15.jpg', 'data/obj/images/cabbage_12.jpg', 'data/obj/images/lettuce_9.jpg', 'data/obj/images/potato_11.jpg', 'data/obj/images/carrot_13.jpg', 'data/obj/images/tomato_6.jpg', 'data/obj/images/lettuce_4.jpg', 'data/obj/images/orange_9.jpg', 'data/obj/images/lettuce_8.jpg', 'data/obj/images/onion_18.jpg', 'data/obj/images/cabbage_17.jpg', 'data/obj/images/strawberry_7.jpg', 'data/obj/images/orange_20.jpg', 'data/obj/images/strawberry_9.jpg', 'data/obj/images/carrot_23.jpg', 'data/obj/images/spinach_20.jpg', 'data/obj/images/spinach_12.jpg', 'data/obj/images/carrot_26.jpg', 'data/obj/images/onion_7.jpg', 'data/obj/images/cabbage_8.jpg', 'data/obj/images/cabbage_10.jpg', 'data/obj/images/carrot
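
Before writing the training list, it can help to confirm that every image has a label file. The sketch below assumes, as in this notebook, that each label sits next to its image and shares its base name with a `.txt` extension. `missing_labels` is a hypothetical helper, and the example paths are made up.

```python
import os

def missing_labels(image_paths, label_paths):
    """Return the images that have no label file with the same base name."""
    label_stems = {os.path.splitext(p)[0] for p in label_paths}
    return [p for p in image_paths if os.path.splitext(p)[0] not in label_stems]

# Made-up example paths, not taken from the dataset.
images = ["data/obj/images/a_1.jpg", "data/obj/images/b_2.jpg"]
labels = ["data/obj/images/a_1.txt", "data/obj/images/b_2.xml.txt"]

print(missing_labels(images, labels))  # ['data/obj/images/b_2.jpg']
```

In the notebook this could be called with `images_list` and `glob.glob("data/obj/images/*.txt")`. Leftover `.xml.txt` files do not count as labels here, because their base name still ends in `.xml`.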

In [13]:
# Create the train.txt file that lists every training image
with open("data/train.txt", "w") as file:
    file.write("\n".join(images_list))
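
The `obj.data` file written earlier points `valid` at `data/test.txt`, but the cells shown here only write `data/train.txt`. If a held-out list is wanted, one option is to split the image paths before writing them. The sketch below is one possible approach; `split_paths`, the 20% validation fraction, and the fixed seed are assumptions made for illustration.

```python
import random

def split_paths(paths, valid_fraction=0.2, seed=0):
    """Shuffle paths reproducibly and return (train_paths, valid_paths)."""
    shuffled = sorted(paths)               # sort first so the result does not depend on input order
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for a repeatable split
    n_valid = int(len(shuffled) * valid_fraction)
    return shuffled[n_valid:], shuffled[:n_valid]

# Made-up example paths, not taken from the dataset.
example_paths = [f"data/obj/images/img_{i}.jpg" for i in range(10)]
train_paths, valid_paths = split_paths(example_paths)

print(len(train_paths), len(valid_paths))  # 8 2
```

The two lists could then be written to `data/train.txt` and `data/test.txt` in the same way as the cell above.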

**6) Start the training**

In [14]:
# Start the training
!./darknet detector train data/obj.data cfg/yolov3_training.cfg darknet53.conv.74 -dont_show

Streaming output truncated to the last 5000 lines.
 total_bbox = 504400, rewritten_bbox = 0.000000 % 
 7926: 0.020773, 0.028735 avg loss, 0.001000 rate, 2.783315 seconds, 507264 images, 0.093698 hours left
Loaded: 0.000107 seconds
v3 (mse loss, Normalizer: (iou: 0.75, obj: 1.00, cls: 1.00) Region 82 Avg (IOU: 0.902826), count: 4, class_loss = 0.000002, iou_loss = 0.116200, total_loss = 0.116201 
v3 (mse loss, Normalizer: (iou: 0.75, obj: 1.00, cls: 1.00) Region 94 Avg (IOU: 0.000000), count: 1, class_loss = 0.000000, iou_loss = 0.000000, total_loss = 0.000000 
v3 (mse loss, Normalizer: (iou: 0.75, obj: 1.00, cls: 1.00) Region 106 Avg (IOU: 0.000000), count: 1, class_loss = 0.000000, iou_loss = 0.000000, total_loss = 0.000000 
 total_bbox = 504404, rewritten_bbox = 0.000000 % 
v3 (mse loss, Normalizer: (iou: 0.75, obj: 1.00, cls: 1.00) Region 82 Avg (IOU: 0.914047), count: 4, class_loss = 0.000071, iou_loss = 0.020453, total_loss = 0