<a href="https://colab.research.google.com/github/anirudh201098/Object-detection-with-Yolo-models/blob/main/Yolo%20models/Yolo2_Darknet_.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**YOLOv2 Architecture**

![Simplified Darknet-19 architecture](https://media.geeksforgeeks.org/wp-content/uploads/20200401004021/darknet-19-simplified.jpg)

#Pre-requisites :
A) YOLOv2 requires one annotation file per image, with a .txt extension, in the following format:


```
<object-class> <x_center> <y_center> <width> <height>

<x_center> and <width> are fractions of the image width,
<y_center> and <height> are fractions of the image height,
so every value lies between 0 and 1.

For example, if tiger is object class 0 in the image "img1.png",
the annotation file is named img1.txt
and contains one line like this for each bounding box:

0 0.295491 0.631188 0.589920 0.502829
```


B) Each image and its annotation file must be in the same directory.
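Many datasets, PASCAL VOC included, store boxes as pixel corners rather than normalized centers. The sketch below shows one way to convert such a box into an annotation line; Darknet reads the four box values as x_center, y_center, width, height, each divided by the image size. The function names and the sample box are hypothetical and only serve as an illustration.

```python
def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel-space corner box to normalized (x_center, y_center, width, height)."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def yolo_line(class_id, box):
    """Format one annotation line: '<class> <x_center> <y_center> <width> <height>'."""
    return f"{class_id} " + " ".join(f"{value:.6f}" for value in box)


# Hypothetical example: a 200x100 px box with top-left corner (100, 50) in a 400x200 image.
box = voc_box_to_yolo(100, 50, 300, 150, 400, 200)
print(yolo_line(0, box))
```

Each bounding box in an image becomes one such line in that image's .txt file.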



**Introduction**

The main goal of the PASCAL VOC challenge is to recognize objects from a number of visual object classes in realistic scenes (i.e. not pre-segmented objects). It is fundamentally a supervised learning problem, in that a training set of labelled images is provided. The twenty object classes that have been selected are:

    Person: person
    Animal: bird, cat, cow, dog, horse, sheep
    Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
    Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor


**PASCAL VOC 2012 dataset: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html**



**1. Mount the drive**

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


**2. List the files and directories at current path**

In [None]:
%ls

drive/  sample_data/


**3. Move to the folder where dataset and main files are present**

In [None]:
%cd /content/drive/My\ Drive/Research_Internship



/content/drive/My Drive/Research_Internship


**4. List the files and directories at current path**

In [None]:
%ls

darknet/  Yolo-2/  Yolo-3/


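**5. Clone darknet** (uncomment the command on the first run; once the repository is cloned, comment it out again)

Note: the `detector map` command and the `-dont_show` flag used in later steps come from the AlexeyAB/darknet fork rather than pjreddie/darknet, so those steps need a build of that fork.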

In [None]:
#!git clone https://github.com/pjreddie/darknet/

In [None]:
%ls

darknet/


In [None]:
%cd darknet  

/content/drive/My Drive/Research_Internship/darknet


**6. List the files and directories at current path**

In [None]:
!ls

3rdparty	       DarknetConfig.cmake.in	predictions.jpg
backup		       darknet.py		README.md
bad.list	       darknet_video.py		result.avi
build		       data			results
build.ps1	       dataset			scripts
build.sh	       extraction.conv.weights	sed2jULfN
cfg		       image_yolov2.sh		sed2tyMGH
chart.png	       image_yolov3.sh		sedveAFiM
chart_voc-custom.png   include			sedziVXLS
cmake		       json_mjpeg_streams.sh	src
CMakeLists.txt	       LICENSE			video_v2.sh
custom-yolo2.backup    Makefile			video_yolov3.sh
darknet		       net_cam_v3.sh		yolo1_pred.avi
darknet19_448.conv.23  obj			yolov4.conv.137


**7. Enable GPU, CUDNN and OPENCV in the Makefile** (run this on the first run only, then comment it out)

In [None]:
!sed -i 's/GPU=0/GPU=1/g' Makefile
!sed -i 's/CUDNN=0/CUDNN=1/g' Makefile
!sed -i 's/OPENCV=0/OPENCV=1/g' Makefile

**8. Build the executable with make** (run this on the first run only, then comment out the make command below)

In [None]:
!make


gcc -Iinclude/ -Isrc/ -DOPENCV `pkg-config --cflags opencv`  -DGPU -I/usr/local/cuda/include/ -DCUDNN  -Wall -Wno-unused-result -Wno-unknown-pragmas -Wfatal-errors -fPIC -Ofast -DOPENCV -DGPU -DCUDNN -c ./src/gemm.c -o obj/gemm.o
./src/gemm.c: In function ‘time_gpu’:
         cudaThreadSynchronize();
         ^~~~~~~~~~~~~~~~~~~~~
In file included from /usr/local/cuda/include/cuda_runtime.h:96:0,
                 from include/darknet.h:11,
                 from ./src/utils.h:5,
                 from ./src/gemm.c:2:
/usr/local/cuda/include/cuda_runtime_api.h:957:57: note: declared here
 extern __CUDA_DEPRECATED __host__ cudaError_t CUDARTAPI cudaThreadSynchronize(void);
                                                         ^~~~~~~~~~~~~~~~~~~~~
gcc -Iinclude/ -Isrc/ -DOPENCV `pkg-config --cflags opencv` 

**9. Download the ImageNet pre-trained weights for YOLOv2**

In [None]:
#!wget https://pjreddie.com/media/files/darknet19_448.conv.23

**10. Make the darknet binary executable**

In [None]:
!chmod +x ./darknet  

**11. Setting up the configuration**

**(A) Altering .cfg file**

The .cfg file contains the model architecture and the training hyperparameters.

**Download the custom-yolo2.cfg file from the cfg folder of darknet**



**Open the file in any editor and make the following changes:**
*   set batch=64
*   set subdivisions=8
*   set height=416 and width=416, or any other multiple of 32
*   set max_batches=40000 (classes*2000, i.e. 20x2000)
*   set steps to 80% and 90% of max_batches, e.g. steps=32000,36000
*   in the [region] layer, set classes to your number of object classes (20 here)
*   in the [convolutional] layer just before [region], set filters=(classes+5)*num; with num=5 anchors and 20 classes this gives filters=125
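The arithmetic behind these settings can be sketched in a few lines of Python. The helper below is hypothetical (it is not part of darknet) and only reproduces the rules of thumb from the list. It also includes the usual YOLOv2 rule for the last convolutional layer before `[region]`, filters = (classes + 5) * num, with num = 5 anchors assumed.

```python
def yolo2_cfg_values(num_classes, num_anchors=5):
    """Derive the class-dependent .cfg values from the number of classes."""
    max_batches = num_classes * 2000
    # Learning-rate steps at 80% and 90% of max_batches (integer arithmetic).
    steps = (max_batches * 8 // 10, max_batches * 9 // 10)
    # Each anchor predicts 4 box coordinates + 1 objectness score + one score per class.
    filters = (num_classes + 5) * num_anchors
    return {"max_batches": max_batches, "steps": steps, "filters": filters}


values = yolo2_cfg_values(20)
print(values)
```

For the 20 PASCAL VOC classes this gives max_batches=40000, steps=32000,36000 and filters=125.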
 

**(B) Creating .names file**

*   It contains the class names
*   Create a new file in your editor and name it voc.names
*   Write one class name per line; the line order defines the class ids used in the annotation files, starting at 0

```
voc.names:

aeroplane
bicycle
bird
boat
bottle
bus
car
cat
chair
cow
diningtable
dog
horse
motorbike
person
pottedplant
sheep
sofa
train
tvmonitor
```
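As an alternative to typing the file by hand, the names file can be generated from a Python list. This is a minimal sketch with hypothetical helper names; it writes to a temporary directory so it can be run anywhere, and it shows that the zero-based line index is the class id used in the annotation files.

```python
import os
import tempfile

VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]


def write_names(path, classes):
    """Write one class name per line, using Unix line endings."""
    with open(path, "w", newline="\n") as f:
        f.write("\n".join(classes) + "\n")


def read_names(path):
    """Read the class names back; the list index is the class id."""
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    names_path = os.path.join(tmp, "voc.names")
    write_names(names_path, VOC_CLASSES)
    names = read_names(names_path)

print(len(names), names.index("person"))
```

To create the real file, call `write_names` with the path of voc.names inside your darknet folder instead of the temporary path.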

**(C) Creating .data file**

It contains the paths of the training list, the test list and the class names file, and the directory where the trained weights are stored.

```
voc2.data:
classes= 20
train  = data/train.txt    => Train file path
valid  = data/test.txt  => Test file path
names = data/voc.names => Classes file path
backup = backup/   => A checkpoint is saved in this directory every 100 iterations.
```
The text after `=>` is explanation only and must not be copied into the file. Make sure you save the train and test .txt files in the data folder of darknet.
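The .data file is a plain list of `key = value` lines, so it can also be produced programmatically. The sketch below uses hypothetical helper names, and the paths passed in are assumptions based on the commands used elsewhere in this notebook; adjust them to your own layout.

```python
def build_data_file(classes, train, valid, names, backup):
    """Return the text of a darknet .data file, one 'key = value' pair per line."""
    entries = [
        ("classes", classes),
        ("train", train),
        ("valid", valid),
        ("names", names),
        ("backup", backup),
    ]
    return "".join(f"{key} = {value}\n" for key, value in entries)


def parse_data_file(text):
    """Parse 'key = value' lines back into a dict, skipping lines without '='."""
    options = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            options[key.strip()] = value.strip()
    return options


# Assumed paths, relative to the darknet folder.
text = build_data_file(20, "data/train.txt", "data/test.txt", "data/voc.names", "backup/")
print(text)
```

Parsing the generated text back with `parse_data_file` is a quick way to confirm that no stray comment text ended up in a value.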

**12. Before you proceed, upload the modified files:**

*   Upload voc.names inside the data folder of darknet
*   Upload voc2.data and custom-yolo2.cfg inside the cfg folder of darknet

**13. Install dos2unix to convert files to Unix line endings**

In [None]:
!sudo apt-get install dos2unix 

Reading package lists... Done
Building dependency tree       
Reading state information... Done
dos2unix is already the newest version (7.3.4-3).
0 upgraded, 0 newly installed, 0 to remove and 11 not upgraded.


**14. Convert the uploaded files to Unix format**

If you get an error here, recheck the paths of all the files and correct them accordingly.


In [None]:
 !dos2unix /content/drive/My\ Drive/Research_Internship/darknet/data/train.txt
 !dos2unix /content/drive/My\ Drive/Research_Internship/darknet/data/test.txt
 !dos2unix /content/drive/My\ Drive/Research_Internship/darknet/cfg/voc2.data
 !dos2unix /content/drive/My\ Drive/Research_Internship/darknet/data/voc.names
 !dos2unix /content/drive/My\ Drive/Research_Internship/darknet/cfg/custom-yolo2.cfg

dos2unix: converting file /content/drive/My Drive/Research_Internship/darknet/data/train.txt to Unix format...
dos2unix: converting file /content/drive/My Drive/Research_Internship/darknet/data/test.txt to Unix format...
dos2unix: converting file /content/drive/My Drive/Research_Internship/darknet/cfg/voc2.data to Unix format...
dos2unix: converting file /content/drive/My Drive/Research_Internship/darknet/data/voc.names to Unix format...
dos2unix: converting file /content/drive/My Drive/Research_Internship/darknet/cfg/custom-yolo2.cfg to Unix format...
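When a conversion fails, it helps to find out which paths are wrong before re-running the cell. The following is a small sketch with hypothetical helper names: one function lists the paths that do not exist, the other reports whether a file still contains Windows (CRLF) line endings. The demo uses a temporary directory; in the notebook you would pass the Drive paths used above.

```python
import os
import tempfile


def missing_files(paths):
    """Return the paths that do not point to an existing file."""
    return [p for p in paths if not os.path.isfile(p)]


def has_crlf(path):
    """Return True if the file contains Windows-style CRLF line endings."""
    with open(path, "rb") as f:
        return b"\r\n" in f.read()


with tempfile.TemporaryDirectory() as tmp:
    present = os.path.join(tmp, "train.txt")
    absent = os.path.join(tmp, "test.txt")
    # Write a sample list with CRLF endings, as an editor on Windows might.
    with open(present, "wb") as f:
        f.write(b"img1.png\r\nimg2.png\r\n")
    missing = missing_files([present, absent])
    needs_conversion = has_crlf(present)

print(missing, needs_conversion)
```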


**15. Train the model** (the command in the cell below resumes training from previously saved weights)

Since I have already trained on the PASCAL VOC 2012 training data for about 39000 iterations, I resume from the saved backup file with the command in the cell below.

If you are training for the first time, start from the ImageNet pre-trained weights instead by changing the command to
```
!./darknet detector train cfg/voc2.data cfg/custom-yolo2.cfg darknet19_448.conv.23 -dont_show 

```

If your dataset has classes similar to those of the PASCAL VOC 2012 dataset, you can use the pre-trained weights from the following link: **https://drive.google.com/file/d/10bfZVCsFxu88ENTElGyl_kYdPmDSeXbe/view?usp=sharing**

In [None]:
!./darknet detector train cfg/voc2.data cfg/custom-yolo2.cfg backup/custom-yolo2.backup -dont_show 

Streaming output truncated to the last 5000 lines.
Region Avg IOU: 0.831987, Class: 0.998954, Obj: 0.789348, No Obj: 0.007125, Avg Recall: 0.933333,  count: 15
Region Avg IOU: 0.831555, Class: 0.999490, Obj: 0.910431, No Obj: 0.008985, Avg Recall: 1.000000,  count: 12
Region Avg IOU: 0.820264, Class: 0.997692, Obj: 0.747039, No Obj: 0.008619, Avg Recall: 0.954545,  count: 22
Region Avg IOU: 0.826577, Class: 0.997257, Obj: 0.821963, No Obj: 0.009908, Avg Recall: 1.000000,  count: 23
Region Avg IOU: 0.874357, Class: 0.997432, Obj: 0.890297, No Obj: 0.007548, Avg Recall: 1.000000,  count: 15
39511: 1.727652, 3.079902 avg, 0.000010 rate, 1.077664 seconds, 2528704 images
Loaded: 0.000050 seconds
Region Avg IOU: 0.704967, Class: 0.998896, Obj: 0.730774, No Obj: 0.006126, Avg Recall: 0.764706,  count: 17
Region Avg IOU: 0.844564, Class: 0.998599, Obj: 0.814527, No Obj: 0.011427, Avg Recall: 1.000000,  count: 20
Region Avg IOU: 0.862173, Class: 0.999143, Obj: 0.885512, No Obj: 0.
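
The summary line printed after each iteration can be parsed to track training progress. The sketch below assumes the usual reading of its fields: iteration number, loss of the current batch, running average loss, learning rate, time for the batch, and total images seen so far. The regular expression and function name are hypothetical and written for the line format shown above.

```python
import re

# Matches e.g. "39511: 1.727652, 3.079902 avg, 0.000010 rate, 1.077664 seconds, 2528704 images"
ITERATION_RE = re.compile(
    r"^(\d+): ([\d.]+), ([\d.]+) avg, ([\d.]+) rate, ([\d.]+) seconds, (\d+) images"
)


def parse_iteration_line(line):
    """Return the fields of an iteration summary line, or None for other lines."""
    match = ITERATION_RE.match(line.strip())
    if match is None:
        return None
    return {
        "iteration": int(match.group(1)),
        "loss": float(match.group(2)),
        "avg_loss": float(match.group(3)),
        "learning_rate": float(match.group(4)),
        "seconds": float(match.group(5)),
        "images": int(match.group(6)),
    }


sample = "39511: 1.727652, 3.079902 avg, 0.000010 rate, 1.077664 seconds, 2528704 images"
row = parse_iteration_line(sample)
# Images seen divided by iterations gives the images processed per iteration.
print(row["avg_loss"], row["images"] // row["iteration"])
```

For the sample line, the images-per-iteration figure works out to 64, which matches the batch=64 setting in the .cfg file.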

**16. Calculate the mean Average Precision (mAP) of the trained model**

In [None]:

!./darknet detector map cfg/voc2.data cfg/custom-yolo2.cfg backup/custom-yolo2_40000.weights -dont_show -points 0 


 CUDA-version: 10010 (10010), cuDNN: 7.6.5, GPU count: 1  
 OpenCV version: 3.2.0
 0 : compute_capability = 600, cudnn_half = 0, GPU: Tesla P100-PCIE-16GB 
net.optimized_memory = 0 
mini_batch = 1, batch = 8, time_steps = 1, train = 0 
   layer   filters  size/strd(dil)      input                output
   0 conv     32       3 x 3/ 1    416 x 416 x   3 ->  416 x 416 x  32 0.299 BF
   1 max                2x 2/ 2    416 x 416 x  32 ->  208 x 208 x  32 0.006 BF
   2 conv     64       3 x 3/ 1    208 x 208 x  32 ->  208 x 208 x  64 1.595 BF
   3 max                2x 2/ 2    208 x 208 x  64 ->  104 x 104 x  64 0.003 BF
   4 conv    128       3 x 3/ 1    104 x 104 x  64 ->  104 x 104 x 128 1.595 BF
   5 conv     64       1 x 1/ 1    104 x 104 x 128 ->  104 x 104 x  64 0.177 BF
   6 conv    128       3 x 3/ 1    104 x 104 x  64 ->  104 x 104 x 128 1.595 BF
   7 max                2x 2/ 2    104 x 104 x 128 ->   52 x  52 x 128 0.001 BF
   8 conv    256       3 x 3/ 1     52 x  52 x 128 ->   

**17. Calculate the Recall of the trained model**

In [None]:
!./darknet detector recall cfg/voc2.data cfg/custom-yolo2.cfg backup/custom-yolo2_40000.weights -dont_show 

layer     filters    size              input                output
    0 conv     32  3 x 3 / 1   416 x 416 x   3   ->   416 x 416 x  32  0.299 BFLOPs
    1 max          2 x 2 / 2   416 x 416 x  32   ->   208 x 208 x  32
    2 conv     64  3 x 3 / 1   208 x 208 x  32   ->   208 x 208 x  64  1.595 BFLOPs
    3 max          2 x 2 / 2   208 x 208 x  64   ->   104 x 104 x  64
    4 conv    128  3 x 3 / 1   104 x 104 x  64   ->   104 x 104 x 128  1.595 BFLOPs
    5 conv     64  1 x 1 / 1   104 x 104 x 128   ->   104 x 104 x  64  0.177 BFLOPs
    6 conv    128  3 x 3 / 1   104 x 104 x  64   ->   104 x 104 x 128  1.595 BFLOPs
    7 max          2 x 2 / 2   104 x 104 x 128   ->    52 x  52 x 128
    8 conv    256  3 x 3 / 1    52 x  52 x 128   ->    52 x  52 x 256  1.595 BFLOPs
    9 conv    128  1 x 1 / 1    52 x  52 x 256   ->    52 x  52 x 128  0.177 BFLOPs
   10 conv    256  3 x 3 / 1    52 x  52 x 128   ->    52 x  52 x 256  1.595 BFLOPs
   11 max          2 x 2 / 2    52 x  52 x 256   ->

**18. Predict bounding boxes for a new image:**

The image with the predicted boxes is saved in the darknet folder as predictions.jpg

In [None]:
!./darknet detector test cfg/voc2.data cfg/custom-yolo2.cfg backup/custom-yolo2_40000.weights data/dog.jpg -dont_show 

 CUDA-version: 10010 (10010), cuDNN: 7.6.5, GPU count: 1  
 OpenCV version: 3.2.0
 0 : compute_capability = 600, cudnn_half = 0, GPU: Tesla P100-PCIE-16GB 
net.optimized_memory = 0 
mini_batch = 1, batch = 8, time_steps = 1, train = 0 
   layer   filters  size/strd(dil)      input                output
   0 conv     32       3 x 3/ 1    416 x 416 x   3 ->  416 x 416 x  32 0.299 BF
   1 max                2x 2/ 2    416 x 416 x  32 ->  208 x 208 x  32 0.006 BF
   2 conv     64       3 x 3/ 1    208 x 208 x  32 ->  208 x 208 x  64 1.595 BF
   3 max                2x 2/ 2    208 x 208 x  64 ->  104 x 104 x  64 0.003 BF
   4 conv    128       3 x 3/ 1    104 x 104 x  64 ->  104 x 104 x 128 1.595 BF
   5 conv     64       1 x 1/ 1    104 x 104 x 128 ->  104 x 104 x  64 0.177 BF
   6 conv    128       3 x 3/ 1    104 x 104 x  64 ->  104 x 104 x 128 1.595 BF
   7 max                2x 2/ 2    104 x 104 x 128 ->   52 x  52 x 128 0.001 BF
   8 conv    256       3 x 3/ 1     52 x  52 x 128 ->   

In [None]:
!./darknet detector test cfg/voc2.data cfg/custom-yolo2.cfg backup/custom-yolo2_40000.weights data/person.jpg -dont_show 

layer     filters    size              input                output
    0 conv     32  3 x 3 / 1   416 x 416 x   3   ->   416 x 416 x  32  0.299 BFLOPs
    1 max          2 x 2 / 2   416 x 416 x  32   ->   208 x 208 x  32
    2 conv     64  3 x 3 / 1   208 x 208 x  32   ->   208 x 208 x  64  1.595 BFLOPs
    3 max          2 x 2 / 2   208 x 208 x  64   ->   104 x 104 x  64
    4 conv    128  3 x 3 / 1   104 x 104 x  64   ->   104 x 104 x 128  1.595 BFLOPs
    5 conv     64  1 x 1 / 1   104 x 104 x 128   ->   104 x 104 x  64  0.177 BFLOPs
    6 conv    128  3 x 3 / 1   104 x 104 x  64   ->   104 x 104 x 128  1.595 BFLOPs
    7 max          2 x 2 / 2   104 x 104 x 128   ->    52 x  52 x 128
    8 conv    256  3 x 3 / 1    52 x  52 x 128   ->    52 x  52 x 256  1.595 BFLOPs
    9 conv    128  1 x 1 / 1    52 x  52 x 256   ->    52 x  52 x 128  0.177 BFLOPs
   10 conv    256  3 x 3 / 1    52 x  52 x 128   ->    52 x  52 x 256  1.595 BFLOPs
   11 max          2 x 2 / 2    52 x  52 x 256   ->

**19. Prediction for videos:**

In [None]:
!./darknet detector demo cfg/voc2.data cfg/custom-yolo2.cfg backup/custom-yolo2_40000.weights data/traffic.mp4 -out_filename yolo2_pred.avi -dont_show

Streaming output truncated to the last 5000 lines.
car: 65% 
car: 55% 
car: 50% 
car: 44% 
car: 41% 
car: 41% 
car: 41% 
car: 40% 
car: 40% 
car: 40% 
car: 39% 
car: 38% 
car: 37% 
car: 28% 
car: 25% 
bus: 33% 
bus: 26% 

FPS:117.4 	 AVG_FPS:115.8

 cvWriteFrame 
Objects:

person: 30% 
car: 83% 
car: 80% 
car: 79% 
car: 77% 
car: 58% 
car: 53% 
car: 52% 
car: 47% 
car: 46% 
car: 42% 
car: 42% 
car: 41% 
car: 37% 
car: 34% 
car: 33% 
car: 31% 
car: 29% 
car: 27% 
bus: 26% 

FPS:117.4 	 AVG_FPS:115.8

 cvWriteFrame 
Objects:

car: 83% 
car: 82% 
car: 79% 
car: 71% 
car: 63% 
car: 57% 
car: 54% 
car: 52% 
car: 52% 
car: 49% 
car: 48% 
car: 44% 
car: 43% 
car: 43% 
car: 34% 
car: 33% 
car: 27% 
car: 26% 
car: 26% 
car: 26% 
bus: 39% 

FPS:116.9 	 AVG_FPS:115.8

 cvWriteFrame 
Objects:

car: 84% 
car: 82% 
car: 82% 
car: 70% 
car: 66% 
car: 61% 
car: 60% 
car: 55% 
car: 45% 
car: 45% 
car: 37% 
car: 35% 
car: 34% 
car: 34% 
car: 34% 
car: 34% 
car: 31% 
car: 28% 
car: 25% 
bus