# **Using Google Colab to train YOLOv3**

## **Mount Google Drive**

In [0]:
from google.colab import drive
drive.mount("/content/drive/")

## **Download darknet to your drive**

In [0]:
%cd "drive/My Drive"

In [0]:
# After you clone the repository once, you do not need to do it again. Comment out this line or delete this cell
!git clone https://github.com/pjreddie/darknet

## **Download NVIDIA cuDNN**

You will need to sign in with an account.

https://developer.nvidia.com/compute/machine-learning/cudnn/secure/v7.5.0.56/prod/10.0_20190219/cudnn-10.0-linux-x64-v7.5.0.56.tgz

After you download the file, go to the darknet folder in your Google Drive, create a folder called cuDNN, and upload the .tgz file to it. Run the commands below to install the software needed for GPU acceleration.

To have YOLOv3 use the GPU, edit the Makefile in the darknet folder and set GPU=1 (it should be the first line of the file).
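
The Makefile edit described above can also be scripted from a notebook cell. This is a minimal sketch, assuming the Makefile contains an assignment line of the form `GPU=0`; `set_makefile_flag` and `enable_gpu` are hypothetical helper names and are not part of darknet.

```python
import re
from pathlib import Path


def set_makefile_flag(text, flag, value):
    """Return Makefile text with the `flag=...` line rewritten to `flag=value`."""
    # Match the whole assignment line (e.g. "GPU=0"), anchored at the line start
    pattern = re.compile(rf"^{re.escape(flag)}[ \t]*=.*$", flags=re.MULTILINE)
    new_text, count = pattern.subn(f"{flag}={value}", text)
    if count == 0:
        raise ValueError(f"No '{flag}=' line found")
    return new_text


def enable_gpu(makefile_path):
    """Rewrite the Makefile at makefile_path in place so that GPU=1."""
    path = Path(makefile_path)
    path.write_text(set_makefile_flag(path.read_text(), "GPU", 1))


# Example (assumes the current directory is "drive/My Drive"):
# enable_gpu("darknet/Makefile")
```

Adjust the path to match your current directory. Because the change is written to the Makefile on your Drive, it only needs to be made once.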

In [0]:
%cd darknet

/content/drive/My Drive/darknet


In [0]:
!tar -xzvf cuDNN/cudnn-10.0-linux-x64-v7.5.0.56.tgz -C /usr/local/
!chmod a+r /usr/local/cuda/include/cudnn.h

# **Compile the code**

Every time you reset the notebook or stop the training, you will need to clean and recompile. Expect a lot of gcc output.

In [0]:
!make clean
!make

# **Test darknet**

Download the pretrained weights.

In [0]:
!wget https://pjreddie.com/media/files/yolov3.weights

Test the detector. This may take a few minutes.

In [0]:
!./darknet detect cfg/yolov3.cfg yolov3.weights data/dog.jpg

After it is done, you can check the results in predictions.jpg.

# **Train Test Split**

In order to start training, we need to split our data into a training set and a testing set. Put your data folder within the data folder inside of darknet. When you run the script below, it will ask for a directory.

Ex. Directory: data/{your_data_folder_here}

In [0]:
import os

# Data directory, e.g. data/{your_data_folder_here}
obj_dir = input("Directory: ")

# Percentage of images to be used for the test set
percentage = 0.25

# Every nth annotated image goes to the test set (n = 4 for 25%)
n_test = round(1 / percentage)

# Sort the listing so the split is the same on every run
file_list = sorted(os.listdir(obj_dir))

# Count only annotated images, so the .txt files in the listing do not skew the split
num_annotated = 0

# Create and/or truncate train.txt and test.txt
with open('train.txt', 'w') as train, open('test.txt', 'w') as test:
    for name in file_list:
        if not name.endswith(".jpg"):
            continue
        image_path = obj_dir + '/' + name
        annotation_path = image_path[:-len(".jpg")] + ".txt"
        if os.path.exists(annotation_path):
            if num_annotated % n_test == 0:
                test.write(image_path + '\n')
            else:
                train.write(image_path + '\n')
            num_annotated += 1
        else:
            # These are the files that do not have an annotation file
            print(image_path)

print("DONE")

# **Training Time**

If you are training a custom model, you will need to create an obj.data file, which contains the number of classes and the paths to train.txt, test.txt, obj.names, and the backup folder. You will also need to create an obj.names file, which contains the names of the objects you are trying to detect. If you are training from scratch, download the darknet53.conv.74 weights with "!wget https://pjreddie.com/media/files/darknet53.conv.74". If you want to continue training, pass your most recent weights instead of the darknet53.conv.74 weights.

Ex. ./darknet detector train obj.data cfg/obj.cfg backup/{your_most_recent_weights_here}
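
The obj.data and obj.names files described above can be written from a notebook cell. This is a sketch under assumptions: the class names are placeholders, the paths match the train.txt and test.txt written by the split script, and the key names (`classes`, `train`, `valid`, `names`, `backup`) follow the layout shown in the AlexeyAB guide linked at the end of this notebook. `write_obj_files` is a hypothetical helper.

```python
import os


def write_obj_files(class_names, data_path="obj.data", names_path="obj.names",
                    train_list="train.txt", test_list="test.txt",
                    backup_dir="backup"):
    """Write obj.names (one class per line) and obj.data (paths and class count)."""
    # Make sure the backup folder exists so there is somewhere to save weights
    os.makedirs(backup_dir, exist_ok=True)

    # newline="\n" forces Unix line endings, even when run on Windows
    with open(names_path, "w", newline="\n") as names_file:
        names_file.write("\n".join(class_names) + "\n")

    lines = [
        f"classes = {len(class_names)}",
        f"train = {train_list}",
        f"valid = {test_list}",
        f"names = {names_path}",
        f"backup = {backup_dir}/",
    ]
    with open(data_path, "w", newline="\n") as data_file:
        data_file.write("\n".join(lines) + "\n")


# Example with placeholder class names:
# write_obj_files(["cat", "dog"])
```

The order of the names in obj.names should match the class indices used in your annotation files.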

IMPORTANT

If you are training new objects, train from scratch (use darknet53.conv.74 weights)

If training is really slow, make sure your GPU is enabled.
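
One way to check whether the runtime has a GPU attached is to ask `nvidia-smi`. This is a minimal sketch, assuming the tool is on the PATH when a GPU runtime is selected (Runtime > Change runtime type in Colab); `gpu_visible` is a hypothetical helper.

```python
import shutil
import subprocess


def gpu_visible():
    """Return True if nvidia-smi exists and lists at least one GPU."""
    if shutil.which("nvidia-smi") is None:
        return False
    # "nvidia-smi -L" prints one line per GPU, e.g. "GPU 0: ..."
    result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    return result.returncode == 0 and "GPU" in result.stdout


print("GPU visible:", gpu_visible())
```

This only confirms that the runtime has a GPU. darknet still has to be compiled with GPU=1 to use it.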

In [0]:
!dos2unix cfg/obj.cfg
!dos2unix obj.data
!dos2unix obj.names

In [0]:
!./darknet detector train obj.data cfg/obj.cfg darknet53.conv.74
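
To continue training as described above, you need the most recent weights file in the backup folder. The sketch below picks it by modification time, assuming darknet saves files ending in `.weights` into `backup/`; `latest_weights` is a hypothetical helper.

```python
import glob
import os


def latest_weights(backup_dir="backup"):
    """Return the most recently modified .weights file in backup_dir, or None."""
    candidates = glob.glob(os.path.join(backup_dir, "*.weights"))
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)


weights = latest_weights()
if weights is None:
    print("No weights found in backup/; start from darknet53.conv.74")
else:
    # Print the command to paste into a cell (prefix it with "!")
    print("./darknet detector train obj.data cfg/obj.cfg", weights)
```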

# **Common Problems**

Note: I am writing these from memory and do not remember the exact error messages.

1. Assert 0 error: run dos2unix on your cfg file to fix it.
2. Couldn't open train.txt or test.txt: run dos2unix to fix it. Run dos2unix on every file you transfer from Windows.
3. Cannot run bash commands: this is probably because you stopped YOLO while it was running. Reset the runtime, clean, and make.
4. Testing dog.jpg takes too long: just wait. It may take a few minutes, possibly due to uploading the weights to the VM.
5. Mounting Drive issues: reset the runtime and try again.
6. Notebook crashed: Google has a time limit of 12 hours (supposedly), but the notebook may crash sooner, which is why it is important to keep backups.
7. If the GPU is enabled and training is still slow, make sure the Makefile has GPU=1. Reset the runtime, clean, and make.
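
Several of the problems above come from Windows line endings. The sketch below reports which files would need dos2unix. It assumes the file names used in this notebook, and `has_crlf` is a hypothetical helper.

```python
import os


def has_crlf(path):
    """Return True if the file at path contains Windows (CRLF) line endings."""
    with open(path, "rb") as f:
        return b"\r\n" in f.read()


# Files mentioned in this notebook; missing ones are skipped
for name in ["cfg/obj.cfg", "obj.data", "obj.names", "train.txt", "test.txt"]:
    if os.path.exists(name):
        print(name, "needs dos2unix" if has_crlf(name) else "ok")
```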

Useful Links:
https://github.com/AlexeyAB/darknet/#how-to-train-to-detect-your-custom-objects