Custom dataset preparation and Training using YOLOv3 darknet
---

**Labelling from scratch**
- Create a **custom_data** folder on your local machine containing the images for training and testing. The Imageye Chrome extension can be used to download images in bulk for each class.
- Download **LabelImg** from https://github.com/tzutalin/labelImg. This is a GUI tool for drawing bounding boxes around objects and generating annotation files.
- Go to the `data` folder inside the LabelImg installation folder. Remove the contents of the predefined classes file (`predefined_classes.txt`) and add the custom classes in detection order; this order determines each class's index.
- Launch LabelImg from Anaconda (using `python labelImg.py`), go to View, and enable Auto Save mode.
- Click the PascalVOC button above Create RectBox once to switch the save format to YOLO.
- In LabelImg, select Open Dir, browse to the custom_data folder, and start annotating the first image using Create RectBox.
- Annotate every object in each image and tag it with one of the existing classes.
- Save each image once it is tagged; this creates the .txt file for that image in the custom_data folder.
  **Note:** Both the image and the .txt file should be inside the custom_data folder and share the same base name.
- Once all images are annotated, place **Rename_files.ipynb** in the custom_data folder and run it. This renames the files sequentially.
- Zip the custom_data folder.
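
Before zipping, it can help to confirm that every image has a matching label file. The helper below is only a sketch: the name `find_label_problems` is hypothetical (it is not part of LabelImg), it assumes images end in `.jpg`, `.jpeg` or `.png`, and it assumes each label line holds a class index followed by four values normalised to the 0-1 range, which is the usual YOLO text layout.

```python
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}  # assumed image extensions


def find_label_problems(folder):
    """Return a list of human-readable problems found in a labelled folder."""
    problems = []
    for image in sorted(Path(folder).iterdir()):
        if image.suffix.lower() not in IMAGE_EXTS:
            continue  # skips classes.txt and the label files themselves
        label = image.with_suffix(".txt")
        if not label.exists():
            problems.append(f"{image.name}: missing {label.name}")
            continue
        for n, line in enumerate(label.read_text().splitlines(), start=1):
            parts = line.split()
            if not parts:
                continue  # ignore blank lines
            if len(parts) != 5:
                problems.append(f"{label.name}:{n}: expected 5 fields, got {len(parts)}")
                continue
            try:
                class_id = int(parts[0])
                coords = [float(p) for p in parts[1:]]
            except ValueError:
                problems.append(f"{label.name}:{n}: non-numeric field")
                continue
            if class_id < 0 or not all(0.0 <= c <= 1.0 for c in coords):
                problems.append(f"{label.name}:{n}: value out of range")
    return problems
```

Calling `find_label_problems("custom_data")` returns an empty list when every image has a well-formed label file.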


**Google Drive Preparation**
- Create a yolo_custom_model_Training folder on Google Drive.
- Move the custom_data.zip file to this folder.
- Create a **custom_weight** folder within yolo_custom_model_Training. Download the pre-trained weights for the convolutional layers (154 MB) from https://pjreddie.com/media/files/darknet53.conv.74 and move the file to the custom_weight folder.
- Create **darknet** and **backup** folders within yolo_custom_model_Training.
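
The sub-folders listed above can also be created from a notebook cell once Drive is mounted. This is a minimal sketch; `create_training_folders` is a hypothetical helper name, and the root path you pass in depends on where your Drive folder actually lives.

```python
import os


def create_training_folders(root):
    """Create the sub-folders this guide expects under `root` and return their paths."""
    created = []
    for name in ("custom_weight", "darknet", "backup"):
        path = os.path.join(root, name)
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
        created.append(path)
    return created
```

For example, `create_training_folders('/content/drive/MyDrive/yolo_custom_model_Training')` would create all three folders in one call.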

**Access Google Drive folders and files**

In [1]:
# Load the Drive helper and mount
from google.colab import drive

# This will prompt for authorization
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
# unzip dataset
!unzip '/content/drive/MyDrive/Juice_Box/custom_data.zip' -d '/content/drive/MyDrive/Juice_Box/'

Archive:  /content/drive/MyDrive/Juice_Box/custom_data.zip
   creating: /content/drive/MyDrive/Juice_Box/custom_data/
  inflating: /content/drive/MyDrive/Juice_Box/custom_data/aluminum_foil1.jpg  
  inflating: /content/drive/MyDrive/Juice_Box/custom_data/aluminum_foil1.txt  
  inflating: /content/drive/MyDrive/Juice_Box/custom_data/aluminum_foil10.jpg  
  inflating: /content/drive/MyDrive/Juice_Box/custom_data/aluminum_foil10.txt  
  inflating: /content/drive/MyDrive/Juice_Box/custom_data/aluminum_foil100.jpg  
  inflating: /content/drive/MyDrive/Juice_Box/custom_data/aluminum_foil100.txt  
  inflating: /content/drive/MyDrive/Juice_Box/custom_data/aluminum_foil101.jpg  
 extracting: /content/drive/MyDrive/Juice_Box/custom_data/aluminum_foil101.txt  
  inflating: /content/drive/MyDrive/Juice_Box/custom_data/aluminum_foil102.jpg  
 extracting: /content/drive/MyDrive/Juice_Box/custom_data/aluminum_foil102.txt  
  inflating: /content/drive/MyDrive/Juice_Box/custom_data/aluminum_foil103.jpg

**Install modified Version of Darknet**

In [None]:
%cd '/content/drive/MyDrive/Juice_Box/darknet'
!git clone 'https://github.com/AlexeyAB/darknet' '/content/drive/MyDrive/Juice_Box/darknet'

/content/drive/MyDrive/Juice_Box/darknet
Cloning into '/content/drive/MyDrive/Juice_Box/darknet'...
remote: Enumerating objects: 14730, done.
remote: Total 14730 (delta 0), reused 0 (delta 0), pack-reused 14730
Receiving objects: 100% (14730/14730), 13.27 MiB | 6.35 MiB/s, done.
Resolving deltas: 100% (10020/10020), done.
Checking out files: 100% (2023/2023), done.


**More configurations on google drive**
- Edit the **Makefile** in the darknet folder, setting `GPU=1`, `CUDNN=1` and `OPENCV=1` (lines 1, 2 and 4 respectively).
- Upload the two scripts creating-files-data-and-name.py and creating-train-and-test-txt-files.py (shared via Google Drive under yolo_custom_model_Training > Python Notebooks) to the custom_data folder on Google Drive.
- Download classes.txt from custom_data, rename it to classes.names, and upload it back to the custom_data folder.
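
The Makefile edit can also be scripted rather than done by hand. The following is a sketch under the assumption that each flag sits on its own line in `NAME=value` form; `set_makefile_flags` is a hypothetical helper, not something shipped with darknet.

```python
import re


def set_makefile_flags(makefile_text, flags):
    """Return makefile_text with the first `NAME=...` line rewritten for each flag."""
    for name, value in flags.items():
        # Anchor on `NAME=` so that CUDNN does not also match CUDNN_HALF.
        pattern = rf"^{re.escape(name)}=.*$"
        makefile_text, count = re.subn(
            pattern, f"{name}={value}", makefile_text, count=1, flags=re.MULTILINE
        )
        if count == 0:
            raise KeyError(f"{name} not found in Makefile")
    return makefile_text


# Usage (paths are assumptions; adjust to your Drive layout):
# path = "/content/drive/MyDrive/Juice_Box/darknet/Makefile"
# with open(path) as f:
#     text = f.read()
# with open(path, "w") as f:
#     f.write(set_makefile_flags(text, {"GPU": 1, "CUDNN": 1, "OPENCV": 1}))
```

Raising on a missing flag is a deliberate choice here: a silent no-op would leave darknet compiled without GPU support and the problem would only surface as very slow training.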

In [3]:
%cd '/content/drive/MyDrive/Juice_Box/'
!python custom_data/creating-files-data-and-name.py

/content/drive/MyDrive/Juice_Box
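
The script's contents are not reproduced in this notebook, so the following is only a sketch of the kind of file it is expected to produce. Darknet reads a small `key = value` text file (here `labelled_data.data`) that names the class count, the training and validation lists, the class names file and the backup folder. The helper name `build_data_file` and the file names `train.txt` and `test.txt` are assumptions.

```python
def build_data_file(num_classes, data_dir, backup_dir):
    """Return the text of a darknet .data file for the given folders."""
    lines = [
        f"classes = {num_classes}",
        f"train = {data_dir}/train.txt",    # assumed name of the training list
        f"valid = {data_dir}/test.txt",     # assumed name of the validation list
        f"names = {data_dir}/classes.names",
        f"backup = {backup_dir}",           # where darknet writes .weights checkpoints
    ]
    return "\n".join(lines) + "\n"


# Usage (paths are assumptions):
# with open("custom_data/labelled_data.data", "w") as f:
#     f.write(build_data_file(3, "custom_data", "backup"))
```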


In [4]:
# Run only if the train and test files are not manually created and uploaded to custom_data folder
%cd '/content/drive/MyDrive/Juice_Box/'
!python custom_data/creating-train-and-test-txt-files.py

/content/drive/MyDrive/Juice_Box
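
If the train and test lists need to be built by hand instead, one possible approach is a seeded random split of the image paths. This is a sketch, not the contents of the script above: `split_train_test`, the 15% default test fraction and the accepted image extensions are all assumptions.

```python
import random
from pathlib import Path

IMAGE_EXTS = {".jpg", ".jpeg", ".png"}  # assumed image extensions


def split_train_test(image_dir, test_fraction=0.15, seed=0):
    """Return (train_paths, test_paths) as lists of image path strings."""
    images = sorted(
        str(p) for p in Path(image_dir).iterdir() if p.suffix.lower() in IMAGE_EXTS
    )
    random.Random(seed).shuffle(images)  # seeded so the split is reproducible
    n_test = int(len(images) * test_fraction)
    return images[n_test:], images[:n_test]


# Usage (file names are assumptions):
# train, test = split_train_test("custom_data")
# Path("custom_data/train.txt").write_text("\n".join(train) + "\n")
# Path("custom_data/test.txt").write_text("\n".join(test) + "\n")
```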


**Custom configurations**
Copy the yolov3.cfg file from the darknet > cfg folder and save it as yolov3_custom.cfg. Make the changes below, then upload yolov3_custom.cfg to the same folder:
- Lines 3 and 4 (testing): comment out
- Lines 6 and 7 (training): uncomment
- Line 20: change `max_batches` (the number of training iterations for YOLOv3) from 500200 to num_classes*2000. Never go below 2000. The POC used 3 classes, so this parameter was set to 6000.
- Line 22: set `steps` to about 20% less than `max_batches` (experiment with this value)
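
The arithmetic for lines 20 and 22 can be written out as a small sketch. `training_schedule` is a hypothetical helper; the 80% default simply encodes the "about 20% less" rule of thumb above and is meant to be adjusted.

```python
def training_schedule(num_classes, step_percent=80):
    """Return (max_batches, steps) following the rules of thumb in this guide."""
    max_batches = max(num_classes * 2000, 2000)   # never below 2000
    steps = max_batches * step_percent // 100     # integer maths avoids float rounding
    return max_batches, steps


print(training_schedule(3))  # (6000, 4800) for the 3-class POC
```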

**Transfer Learning Step**
- Edit the last 3 YOLO layers and their preceding convolutional layers as follows:
   - `classes` on lines 610, 696 and 783: set to 3 (for the POC)
   - `filters` on lines 603, 689 and 776: set to 24, i.e. (num_classes+5)*3
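
Because line numbers can shift between versions of the cfg file, the same edit can be done by section instead. The sketch below is a hypothetical helper (`patch_yolo_cfg`), written on the assumption that each `[yolo]` section is directly preceded by the `[convolutional]` section whose `filters` must change, and that each YOLO layer uses 3 anchor masks.

```python
def patch_yolo_cfg(cfg_text, num_classes):
    """Set classes in every [yolo] section and filters in the conv section before it."""
    filters = (num_classes + 5) * 3  # 3 = assumed number of masks per YOLO layer
    lines = cfg_text.splitlines()
    section = None
    last_conv_filters = None  # index of the filters line in the latest [convolutional]
    for i, line in enumerate(lines):
        stripped = line.strip()
        key = stripped.replace(" ", "")
        if stripped.startswith("["):
            section = stripped
            if section == "[yolo]" and last_conv_filters is not None:
                lines[last_conv_filters] = f"filters={filters}"
        elif section == "[convolutional]" and key.startswith("filters="):
            last_conv_filters = i
        elif section == "[yolo]" and key.startswith("classes="):
            lines[i] = f"classes={num_classes}"
    return "\n".join(lines) + "\n"
```

With `num_classes=3` this writes `classes=3` and `filters=24`, matching the manual edits listed above; all other `filters` values are left untouched.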

In [2]:
# Compile darknet (build)
%cd '/content/drive/MyDrive/Juice_Box/darknet'
!make clean
!make

/content/drive/MyDrive/Juice_Box/darknet
rm -rf ./obj/image_opencv.o ./obj/http_stream.o ./obj/gemm.o ./obj/utils.o ./obj/dark_cuda.o ./obj/convolutional_layer.o ./obj/list.o ./obj/image.o ./obj/activations.o ./obj/im2col.o ./obj/col2im.o ./obj/blas.o ./obj/crop_layer.o ./obj/dropout_layer.o ./obj/maxpool_layer.o ./obj/softmax_layer.o ./obj/data.o ./obj/matrix.o ./obj/network.o ./obj/connected_layer.o ./obj/cost_layer.o ./obj/parser.o ./obj/option_list.o ./obj/darknet.o ./obj/detection_layer.o ./obj/captcha.o ./obj/route_layer.o ./obj/writing.o ./obj/box.o ./obj/nightmare.o ./obj/normalization_layer.o ./obj/avgpool_layer.o ./obj/coco.o ./obj/dice.o ./obj/yolo.o ./obj/detector.o ./obj/layer.o ./obj/compare.o ./obj/classifier.o ./obj/local_layer.o ./obj/swag.o ./obj/shortcut_layer.o ./obj/activation_layer.o ./obj/rnn_layer.o ./obj/gru_layer.o ./obj/rnn.o ./obj/rnn_vid.o ./obj/crnn_layer.o ./obj/demo.o ./obj/tag.o ./obj/cifar.o ./obj/go.o ./obj/batchnorm_layer.o ./obj/art.o ./obj/region_l

In [7]:
# Check that darknet compiled correctly
!darknet/darknet
# Should print: usage: darknet/darknet <function>

usage: darknet/darknet <function>


**Train to detect your custom objects**

This will create weights in the yolo_custom_model_Training > backup folder.
Any input files rejected during training are logged in a badlist within yolo_custom_model_Training.

In [None]:
%cd '/content/drive/MyDrive/Juice_Box'
!darknet/darknet detector train custom_data/labelled_data.data darknet/cfg/yolov3_custom.cfg custom_weight/darknet53.conv.74 -dont_show

Streaming output truncated to the last 5000 lines.
v3 (mse loss, Normalizer: (iou: 0.75, obj: 1.00, cls: 1.00) Region 82 Avg (IOU: 0.483777), count: 4, class_loss = 2.749210, iou_loss = 0.623124, total_loss = 3.372334 
v3 (mse loss, Normalizer: (iou: 0.75, obj: 1.00, cls: 1.00) Region 94 Avg (IOU: 0.000000), count: 1, class_loss = 0.027490, iou_loss = 0.000000, total_loss = 0.027490 
v3 (mse loss, Normalizer: (iou: 0.75, obj: 1.00, cls: 1.00) Region 106 Avg (IOU: 0.000000), count: 1, class_loss = 0.019430, iou_loss = 0.000000, total_loss = 0.019430 
 total_bbox = 30815, rewritten_bbox = 0.162259 % 
v3 (mse loss, Normalizer: (iou: 0.75, obj: 1.00, cls: 1.00) Region 82 Avg (IOU: 0.371734), count: 4, class_loss = 2.473497, iou_loss = 1.350051, total_loss = 3.823548 
v3 (mse loss, Normalizer: (iou: 0.75, obj: 1.00, cls: 1.00) Region 94 Avg (IOU: 0.000000), count: 1, class_loss = 0.024699, iou_loss = 0.000000, total_loss = 0.024699 
v3 (mse loss, Normalizer: (iou: 0.75, obj: 1

Once training is complete, get **yolov3_custom_final.weights** from path `custom_data\backup\`


- After each set of iterations you can stop and later resume training from that point. For example, after 1000 iterations you can stop training, and later copy yolov3_custom_1000.weights from `custom_data\backup\` and pass it to the training command in place of darknet53.conv.74.
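
When several checkpoints have accumulated, picking the one with the highest iteration count can be automated. This is a sketch that assumes checkpoint files are named `<prefix>_<iterations>.weights`, as in the `yolov3_custom_1000.weights` example above; `latest_checkpoint` is a hypothetical helper.

```python
import re
from pathlib import Path


def latest_checkpoint(backup_dir, prefix="yolov3_custom"):
    """Return the Path of the numbered checkpoint with the most iterations, or None."""
    pattern = re.compile(rf"^{re.escape(prefix)}_(\d+)\.weights$")
    best = None  # (iterations, path)
    for path in Path(backup_dir).iterdir():
        match = pattern.match(path.name)
        if match and (best is None or int(match.group(1)) > best[0]):
            best = (int(match.group(1)), path)
    return best[1] if best else None
```

The returned path can then be substituted for `custom_weight/darknet53.conv.74` in the training command shown earlier. Files such as `yolov3_custom_final.weights` are ignored on purpose, since they carry no iteration number.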