# Dependencies
The Darknet framework requires several dependencies to be installed on the Linux machine:
- libopencv-dev (general OpenCV library)
- python-opencv (image processing in Python)
- ffmpeg (viewing image output)

In [None]:
%%capture
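# Dependencies needed for the Darknet framework
!apt update
!apt upgrade -y
# Report architecture, OS release, compiler and kernel versions
!uname -m && cat /etc/*release
!gcc --version
!uname -r
# -y answers the confirmation prompt so the cell does not wait for input
!apt install -y libopencv-dev python-opencv ffmpeg
!sudo apt-get install -y xxd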

# Pre-Training 

Training is performed with **Darknet**, an open-source neural network framework written in C and CUDA that supports YOLO (You Only Look Once) models. The training flow requires the following steps:

1. Download the training package (the Darknet framework plus the scripts that generate the model up to the quantized TFLite format used on our hardware).
2. Download the COCO dataset and labels (see https://cocodataset.org/#home for details about the COCO dataset). The initial download takes about 20 GB.
3. Extract classes from the COCO dataset and reduce them to two classes ("Person" and "Not Person").
4. Update the CPU, GPU and OpenCV build settings. Make sure the Google Colab GPU runtime is enabled (Runtime > Change Runtime Type > Choose GPU).
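
Before building Darknet with `GPU=1`, it can help to confirm that the runtime actually exposes a GPU. The following is a minimal sketch, assuming the NVIDIA tool `nvidia-smi` is on the `PATH` whenever a GPU runtime is active; the helper name `gpu_available` is hypothetical and not part of the training package.

```python
import shutil
import subprocess


def gpu_available() -> bool:
    """Return True if nvidia-smi exists and lists at least one GPU."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return False
    try:
        # `nvidia-smi -L` prints one "GPU <n>: ..." line per device.
        result = subprocess.run(
            [exe, "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0 and "GPU" in result.stdout


if gpu_available():
    print("GPU detected")
else:
    print("No GPU detected; check the runtime type")
```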

In [2]:
%%capture
# Download the training package that contains the model and framework
!wget -O YoloTraining.zip https://www.dropbox.com/s/u5y6jtnln16ej2s/YoloTraining.zip?dl=0
!unzip YoloTraining.zip
!rm -rf YoloTraining.zip

In [None]:
# Download COCO dataset label
%cd /content/YoloTraining
!gdown 1cXZR_ckHki6nddOmcysCuuJFM--T-Q6L
!unzip -q coco2017labels.zip
!rm -rf coco2017labels.zip

In [None]:
#Download COCO dataset (Training and validation)
%cd coco/images
!f="train2017.zip" && curl http://images.cocodataset.org/zips/$f -o $f && unzip -q $f && rm $f  # 19G, 118k images
!f="val2017.zip" && curl http://images.cocodataset.org/zips/$f -o $f && unzip -q $f && rm $f  # 1G, 5k images

In [None]:
%cd /content/YoloTraining/

# Move the dataset under ./datasets
!mkdir datasets && mv ./coco/ ./datasets/

# Update the image paths in the list files to match the new dataset location.
# The dot is escaped so that rerunning the cell does not rewrite already-updated paths.
!sed -i 's|\./images|./datasets/coco/images|g' ./datasets/coco/train2017.txt
!sed -i 's|\./images|./datasets/coco/images|g' ./datasets/coco/val2017.txt
!sed -i 's|\./images|./datasets/coco/images|g' ./datasets/coco/test-dev2017.txt
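
The three `sed` commands above prefix every image path in the list files with the new dataset location. The same rewrite can be sketched in Python. This is an illustrative equivalent rather than part of the training package, and the helper name `rewrite_image_paths` is hypothetical. The demo runs on a throwaway file with a made-up entry so it works without the dataset present.

```python
import tempfile
from pathlib import Path


def rewrite_image_paths(list_file: str,
                        old_prefix: str = "./images",
                        new_prefix: str = "./datasets/coco/images") -> int:
    """Replace old_prefix with new_prefix on every line; return how many lines changed."""
    path = Path(list_file)
    lines = path.read_text().splitlines()
    updated = [line.replace(old_prefix, new_prefix) for line in lines]
    changed = sum(1 for before, after in zip(lines, updated) if before != after)
    path.write_text("\n".join(updated) + "\n")
    return changed


# Demo on a temporary list file (the entry below is only an example path).
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "train2017.txt"
    demo.write_text("./images/train2017/000000000009.jpg\n")
    changed = rewrite_image_paths(str(demo))
    demo_result = demo.read_text()

print(changed, demo_result.strip())
```

Because the match is a literal string rather than a regular expression, running the function a second time on the same file leaves it unchanged.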

In [None]:
# Extract the person subset from the COCO dataset. All models use the same class name ("person"), so the same coco.names file can be reused.
!python ./scripts/extract_dataset_subset.py --original ./datasets/coco/ --names ./models/yolo-pico_coco_person_96x96x3/coco.names --output ./datasets/coco_person/ --classes 'person'

In [None]:
%cd /content/YoloTraining/darknet
# Enable OpenCV, GPU and cuDNN in the Makefile. When running locally, you may also need to adjust the GPU settings in the Makefile for your GPU model.
# See the comments in darknet/Makefile for more information.
!sed -i 's/OPENCV=0/OPENCV=1/g' Makefile
!sed -i 's/GPU=0/GPU=1/g' Makefile
!sed -i 's/CUDNN=0/CUDNN=1/g' Makefile

In [None]:
%%capture
# Rebuild Darknet so the new Makefile settings take effect
!make -j8 

# Training

The model is trained for 500,000 steps at a learning rate of 0.000010. Training takes around 14 hours for smaller input sizes and up to 24 hours for larger ones.

### Training Configuration
- Training is set to use one GPU, as a Colab machine provides only one GPU.
- Training parameters and layer settings can be adjusted in the model's .cfg file. For example:
  - ./models/yolo-pico_coco_person_96x96x3/yolo-pico_coco_person_96x96x3.cfg:

```
[net]
batch=32
subdivisions=1
width=96
height=96
channels=3
momentum=0.949
decay=0.0005
angle=0
saturation=1.5
exposure=1.5
hue=.1
```
- These are the training parameters that Darknet reads at training time.
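
As a rough illustration of how these parameters relate to the training run, the sketch below parses the `[net]` block of a Darknet-style `.cfg` and derives the number of images consumed per step and over the full run. The parser and the names `parse_net_section` and `MAX_STEPS` are assumptions made for illustration; Darknet reads the file with its own C parser.

```python
def parse_net_section(cfg_text: str) -> dict:
    """Collect key=value pairs from the first [net] section of a Darknet-style cfg."""
    params = {}
    in_net = False
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        if line.startswith("["):
            if in_net:
                break  # reached the section after [net]
            in_net = line.lower() == "[net]"
            continue
        if in_net and "=" in line:
            key, value = line.split("=", 1)
            params[key.strip()] = value.strip()
    return params


CFG = """
[net]
batch=32
subdivisions=1
width=96
height=96
channels=3
"""

MAX_STEPS = 500_000  # step count quoted in this notebook

net = parse_net_section(CFG)
batch = int(net["batch"])
mini_batch = batch // int(net["subdivisions"])  # images per forward/backward pass
total_images = batch * MAX_STEPS

print(f"input: {net['width']}x{net['height']}x{net['channels']}")
print(f"images per step: {batch}, per pass: {mini_batch}")
print(f"images over {MAX_STEPS} steps: {total_images}")
```

With `batch=32` and 500,000 steps, the total is 16,000,000 images, which agrees with the image counter in the training log at step 500000.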


### Training methods available
1. Training from scratch
   - Training starts from step 1 and runs up to 500K steps by default (as set in the .cfg file).
2. Using pre-trained weights
   - Training loads previously trained weights and continues from that point. In this case, our last trained model was trained up to 500K steps.
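
The two methods differ only in whether a weights file is passed to `darknet detector train`. The sketch below assembles either command from the arguments used in this notebook. The helper `build_train_command` is hypothetical and only builds the argument list; it does not start training.

```python
import shlex
from typing import List, Optional

MODEL = "yolo-pico_coco_person_96x96x3"
MODEL_DIR = f"./models/{MODEL}"


def build_train_command(weights: Optional[str] = None, gpus: str = "0") -> List[str]:
    """Return the darknet training command, optionally starting from existing weights."""
    cmd = [
        "./darknet/darknet", "detector", "train",
        f"{MODEL_DIR}/coco.data",
        f"{MODEL_DIR}/{MODEL}.cfg",
    ]
    if weights:
        cmd.append(weights)  # continue from previously trained weights
    cmd += ["-gpus", gpus, "-dont_show"]
    return cmd


from_scratch = build_train_command()
resumed = build_train_command(f"{MODEL_DIR}/{MODEL}.weights")

print(shlex.join(from_scratch))
print(shlex.join(resumed))
```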



A sample of the training output for the last three steps before completion is shown below:


```
 499998: 1.113633, 1.115932 avg loss, 0.000010 rate, 0.069034 seconds, 15999936 images, 0.002009 hours left
Loaded: 0.000079 seconds
v3 (iou loss, Normalizer: (iou: 0.07, obj: 1.00, cls: 1.00) Region 59 Avg (IOU: 0.749662), count: 19, class_loss = 0.275771, iou_loss = 0.027597, total_loss = 0.303368 
v3 (iou loss, Normalizer: (iou: 0.07, obj: 1.00, cls: 1.00) Region 66 Avg (IOU: 0.417471), count: 171, class_loss = 2.373719, iou_loss = 1.208714, total_loss = 3.582433 
 total_bbox = 84496405, rewritten_bbox = 17.515007 % 

 499999: 1.325656, 1.136904 avg loss, 0.000010 rate, 0.070176 seconds, 15999968 images, 0.001989 hours left
Loaded: 0.000121 seconds
v3 (iou loss, Normalizer: (iou: 0.07, obj: 1.00, cls: 1.00) Region 59 Avg (IOU: 0.739210), count: 19, class_loss = 0.256957, iou_loss = 0.029729, total_loss = 0.286686 
v3 (iou loss, Normalizer: (iou: 0.07, obj: 1.00, cls: 1.00) Region 66 Avg (IOU: 0.485464), count: 120, class_loss = 2.001330, iou_loss = 0.876470, total_loss = 2.877800 
 total_bbox = 84496544, rewritten_bbox = 17.514999 % 

 500000: 1.129992, 1.136213 avg loss, 0.000010 rate, 0.064493 seconds, 16000000 images, 0.001969 hours left
Saving weights to models/yolo-pico_coco_person_96x96x3_500000.weights
Saving weights to models/yolo-pico_coco_person_96x96x3_last.weights
Saving weights to models/yolo-pico_coco_person_96x96x3_final.weights
```
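
To track progress programmatically, the per-step summary lines in this output can be parsed with a regular expression. The sketch below is written against the line format shown above only; it is an assumption that other Darknet builds print the same layout, and lines with non-numeric losses would simply be skipped. The names `STEP_RE` and `parse_steps` are hypothetical.

```python
import re

# Matches lines such as:
#  499998: 1.113633, 1.115932 avg loss, 0.000010 rate, 0.069034 seconds, 15999936 images, ...
STEP_RE = re.compile(
    r"^\s*(?P<step>\d+):\s+(?P<loss>[\d.]+),\s+(?P<avg_loss>[\d.]+) avg loss,\s+"
    r"(?P<rate>[\d.]+) rate,\s+(?P<seconds>[\d.]+) seconds,\s+(?P<images>\d+) images"
)


def parse_steps(log_text: str) -> list:
    """Return one dict per training-step summary line found in the log text."""
    rows = []
    for line in log_text.splitlines():
        m = STEP_RE.match(line)
        if m:
            rows.append({
                "step": int(m["step"]),
                "loss": float(m["loss"]),
                "avg_loss": float(m["avg_loss"]),
                "rate": float(m["rate"]),
                "images": int(m["images"]),
            })
    return rows


LOG = """\
 499998: 1.113633, 1.115932 avg loss, 0.000010 rate, 0.069034 seconds, 15999936 images, 0.002009 hours left
Loaded: 0.000079 seconds
 499999: 1.325656, 1.136904 avg loss, 0.000010 rate, 0.070176 seconds, 15999968 images, 0.001989 hours left
 500000: 1.129992, 1.136213 avg loss, 0.000010 rate, 0.064493 seconds, 16000000 images, 0.001969 hours left
"""

steps = parse_steps(LOG)
for row in steps:
    print(row["step"], row["avg_loss"], row["images"])
```

The image counter grows by 32 between consecutive steps, matching `batch=32` in the `.cfg` file.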


In [None]:
%cd /content/YoloTraining
# Train the model. Specify more GPUs if available, e.g. -gpus 0,1,2,3
# Note: when training from scratch, it is advised to suppress the output with %%capture
# (as the first line of the cell), because the volume of output may hang the notebook

## 1. Training from scratch
#%%capture
# !./darknet/darknet detector train ./models/yolo-pico_coco_person_96x96x3/coco.data ./models/yolo-pico_coco_person_96x96x3/yolo-pico_coco_person_96x96x3.cfg  -gpus 0 -dont_show

## 2. Pre-trained weights
!./darknet/darknet detector train ./models/yolo-pico_coco_person_96x96x3/coco.data ./models/yolo-pico_coco_person_96x96x3/yolo-pico_coco_person_96x96x3.cfg ./models/yolo-pico_coco_person_96x96x3/yolo-pico_coco_person_96x96x3.weights -gpus 0 -dont_show



# Post-Training 

After training, we obtain the final weights, which are converted to Keras .h5 format and then to a TFLite model. Before converting, we can run a few evaluation steps to check that the model is accurate enough to deploy. The post-training steps are as follows:

### Evaluation
- Evaluate model accuracy on the validation dataset (this reports precision, recall, F1-score, IoU and mAP).
- Evaluate on a single test image (this shows the scores for person versus non-person, with the bounding box drawn on the image using OpenCV).
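
For reference, the standard definitions behind several of these metrics can be worked through in a few lines. The sketch below uses made-up boxes and counts purely for illustration and does not reproduce Darknet's own evaluation code; mAP is omitted because it additionally requires ranking detections by confidence.

```python
def iou(box_a, box_b) -> float:
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(tp: int, fp: int, fn: int):
    """Precision, recall and F1 from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical ground-truth and predicted boxes, and hypothetical detection counts.
overlap = iou((0, 0, 10, 10), (5, 5, 15, 15))
precision, recall, f1 = precision_recall_f1(tp=80, fp=20, fn=40)

print(f"IoU = {overlap:.3f}")
print(f"precision = {precision:.3f}, recall = {recall:.3f}, F1 = {f1:.3f}")
```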



### Model conversion

- The model is converted in the following order:

(final weights) > (Keras .h5 model) > (TFLite .tflite model)

- The .tflite model is quantized using post-training quantization; this quantized model is the one deployed on the hardware.
- After conversion to a quantized TFLite model, we generate a .cc file (C source) for deployment to the hardware.
- At the end of these steps, we have two files: the .cc file, which contains the model data (mostly hexadecimal bytes), and the .h file, which declares the variables used by the model.
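
For intuition about what int8 quantization does to each value, the sketch below applies the affine mapping that TFLite's int8 scheme is based on, `real = scale * (q - zero_point)`. The `scale` and `zero_point` values here are made up for illustration; during conversion they are derived from the representative images rather than chosen by hand. The function names are hypothetical.

```python
def quantize(x: float, scale: float, zero_point: int) -> int:
    """Map a real value to int8 using q = round(x / scale) + zero_point, clamped to [-128, 127]."""
    q = round(x / scale) + zero_point
    return max(-128, min(127, q))


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Recover the approximate real value: real = scale * (q - zero_point)."""
    return scale * (q - zero_point)


# Hypothetical parameters that map the range [0, 1] onto the int8 range.
scale, zero_point = 1.0 / 255.0, -128

values = [0.0, 0.25, 0.9, 1.0]
quantized = [quantize(v, scale, zero_point) for v in values]
restored = [dequantize(q, scale, zero_point) for q in quantized]
max_error = max(abs(v - r) for v, r in zip(values, restored))

print(quantized)
print([round(r, 4) for r in restored])
print(f"max abs error: {max_error:.5f}")
```

The round-trip error stays within about half a quantization step (`scale / 2`) for values inside the representable range, which is the precision given up in exchange for a smaller model.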

In [None]:
#Accuracy Evaluation of Trained Models
!./darknet/darknet detector map ./models/yolo-pico_coco_person_96x96x3/coco.data ./models/yolo-pico_coco_person_96x96x3/yolo-pico_coco_person_96x96x3.cfg ./models/yolo-pico_coco_person_96x96x3_final.weights -points 101

In [None]:
#Evaluation of Trained Models Using Test Images
!./darknet/darknet detector test ./models/yolo-pico_coco_person_96x96x3/coco.data ./models/yolo-pico_coco_person_96x96x3/yolo-pico_coco_person_96x96x3.cfg ./models/yolo-pico_coco_person_96x96x3_final.weights ./darknet/data/person.jpg -i 0 -thresh 0.25 -dont_show 

In [None]:
# View the output image after prediction. A bounding box is drawn where a person is detected.
from google.colab.patches import cv2_imshow
import cv2
img = cv2.imread('predictions.jpg', cv2.IMREAD_UNCHANGED)
cv2_imshow(img)

In [None]:
#Model Conversion to Keras H5 Format
!python ./scripts/convert.py ./models/yolo-pico_coco_person_96x96x3/yolo-pico_coco_person_96x96x3.cfg ./models/yolo-pico_coco_person_96x96x3_final.weights yolo-pico_coco_person_96x96x3.h5 -f

#Move the h5 file to model folder
!mv ./yolo-pico_coco_person_96x96x3.h5 ./models/yolo-pico_coco_person_96x96x3

In [None]:
#Model Conversion to Quantized Int8 TFLite Format
!python ./scripts/convert_darknet_tflite.py --model yolo-pico_coco_person_96x96x3 --images ./datasets/coco/val2017.txt --input_shape 96x96x3 --sample ./datasets/coco/images/val2017/000000018380.jpg

In [None]:
#Generate .cc and .h file to be used for hardware implementation
%cd /content
!xxd -i /content/YoloTraining/models/yolo-pico_coco_person_96x96x3/96x96x3/yolo-pico_coco_person_96x96x3_quant.tflite > network_model_data_tmp.cc
!wget -O convert_model_cc.py https://www.dropbox.com/s/1q8n4jm9fk4gzzf/convert_model_cc.py?dl=0
!python convert_model_cc.py --network="yolo_pico" --application="person_detect"
!rm -rf network_model_data_tmp.cc
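
The `xxd -i` step in the cell above turns the binary `.tflite` file into a C array. A rough Python approximation of that output layout is sketched below for readers without `xxd`; it is not the downloaded `convert_model_cc.py` script, whose contents are not shown here, and the helper name `to_c_array` is hypothetical. The demo uses stand-in bytes instead of a real model file.

```python
import re


def to_c_array(data: bytes, name: str, per_line: int = 12) -> str:
    """Render bytes as a C unsigned char array plus a length variable."""
    ident = re.sub(r"\W", "_", name)  # make the name a valid C identifier
    rows = []
    for i in range(0, len(data), per_line):
        chunk = data[i:i + per_line]
        rows.append("  " + ", ".join(f"0x{b:02x}" for b in chunk))
    body = ",\n".join(rows)
    return (
        f"unsigned char {ident}[] = {{\n{body}\n}};\n"
        f"unsigned int {ident}_len = {len(data)};\n"
    )


sample = bytes(range(16))  # stand-in bytes; a real run would read the .tflite file
c_source = to_c_array(sample, "model.tflite")
print(c_source)
```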