<a href="https://colab.research.google.com/github/anirudh201098/Object-detection-with-Yolo-models/blob/main/Yolo%20models/DarknetYolo4.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

![alt text](https://miro.medium.com/max/1076/1*EVPqmfh38YT5KDGXT950tA.png)

# Pre-requisites
A) YOLOv4 requires image annotations in .txt files in the following format:


```
<object-class> <x_center> <y_center> <width> <height>
(the four box values are relative to the image size, i.e. between 0 and 1)
For example, if tiger is object class 0 in the image "img1.png",
then the annotation file is named img1.txt
and it contains one line like this for each bounding box:

0 0.295491 0.631188 0.589920 0.502829


```
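
As a rough illustration of how such a line is produced, the sketch below converts a box given in pixel corner coordinates into normalized values. It assumes the usual Darknet convention (box centre, then width, then height, each divided by the image size). The function name `to_yolo_line` and the sample numbers are hypothetical.

```python
def to_yolo_line(class_id, xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel-space box (corner coordinates) to one YOLO annotation line."""
    # Box centre, normalized by the image dimensions
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    # Box size, normalized by the image dimensions
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical box in a 400x500 image
line = to_yolo_line(0, 100, 50, 300, 250, img_w=400, img_h=500)
print(line)
```

Each bounding box in an image gets its own line in the annotation file.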


B) All the images and their respective annotation files should be present in the same directory.



 **Introduction**

The main goal of this challenge is to recognize objects from a number of visual object classes in realistic scenes (i.e. not pre-segmented objects). It is fundamentally a supervised learning problem in that a training set of labelled images is provided. The twenty object classes that have been selected are:

    Person: person
    Animal: bird, cat, cow, dog, horse, sheep
    Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
    Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor


**PASCAL VOC 2012 dataset: http://host.robots.ox.ac.uk/pascal/VOC/voc2012/index.html**


**1. Mount the drive**

In [1]:
from google.colab import drive
drive.mount('/content/drive')
# drive.mount("/content/drive", force_remount=True)

Mounted at /content/drive


**2. List the files and directories at current path**

In [2]:
%ls

drive/  sample_data/


**3. Move to the folder where dataset and main files are present**

In [3]:
%cd /content/drive/My\ Drive/Research_Internship

/content/drive/My Drive/Research_Internship


**4. List the files and directories at current path**

In [4]:
%ls

darknet/  Yolo-2/  Yolo-3/


**5. Clone darknet** (uncomment the command on the first run; once the repository is cloned, comment it out again)

In [None]:
#!git clone https://github.com/AlexeyAB/darknet/  

Cloning into 'darknet'...
remote: Enumerating objects: 14, done.
remote: Counting objects: 100% (14/14), done.
remote: Compressing objects: 100% (12/12), done.
remote: Total 13927 (delta 5), reused 6 (delta 2), pack-reused 13913
Receiving objects: 100% (13927/13927), 12.45 MiB | 8.75 MiB/s, done.
Resolving deltas: 100% (9515/9515), done.
Checking out files: 100% (2008/2008), done.


**6. List the files and directories at current path**

In [5]:
!ls  

darknet  Yolo-2  Yolo-3


In [6]:
%cd darknet  

/content/drive/My Drive/Research_Internship/darknet


In [7]:
!ls

3rdparty		darknet.py		 result.avi
backup			darknet_video.py	 results
bad.list		data			 scripts
build			dataset			 sed2jULfN
build.ps1		extraction.conv.weights  sed2tyMGH
build.sh		image_yolov2.sh		 sedveAFiM
cfg			image_yolov3.sh		 sedziVXLS
chart.png		include			 src
chart_voc-custom.png	json_mjpeg_streams.sh	 video_v2.sh
cmake			LICENSE			 video_yolov3.sh
CMakeLists.txt		Makefile		 yolo1_pred.avi
custom-yolo2.backup	net_cam_v3.sh		 yolo2_pred.avi
darknet			obj			 yolov4.conv.137
darknet19_448.conv.23	predictions.jpg
DarknetConfig.cmake.in	README.md


**7. Enable GPU, CUDNN and OPENCV in the Makefile** (uncomment the commands on the first run, then comment them out again)

In [8]:
 !sed -i 's/GPU=0/GPU=1/g' Makefile
 !sed -i 's/CUDNN=0/CUDNN=1/g' Makefile
 !sed -i 's/OPENCV=0/OPENCV=1/g' Makefile
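
For readers less familiar with `sed`, the same edit can be sketched in Python. This is only an illustration of what the three commands above do to the Makefile text; the helper name `enable_flags` and the sample Makefile fragment are hypothetical.

```python
import re


def enable_flags(makefile_text, names=("GPU", "CUDNN", "OPENCV")):
    """Return makefile_text with each `NAME=0` line switched to `NAME=1`."""
    for name in names:
        # Anchor on the whole line so that e.g. CUDNN_HALF=0 is left untouched
        makefile_text = re.sub(
            rf"^{name}=0$", f"{name}=1", makefile_text, flags=re.MULTILINE
        )
    return makefile_text


# Hypothetical Makefile fragment
sample = "GPU=0\nCUDNN=0\nCUDNN_HALF=0\nOPENCV=0\n"
updated = enable_flags(sample)
print(updated)
```

To apply it to the real file, read `Makefile`, pass its text through the function and write the result back.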

**8. Build the executable using make** (uncomment the make command on the first run, then comment it out again)

In [None]:
 !make

chmod +x *.sh
g++ -std=c++11 -std=c++11 -Iinclude/ -I3rdparty/stb/include -DOPENCV `pkg-config --cflags opencv4 2> /dev/null || pkg-config --cflags opencv` -DGPU -I/usr/local/cuda/include/ -DCUDNN -Wall -Wfatal-errors -Wno-unused-result -Wno-unknown-pragmas -fPIC -Ofast -DOPENCV -DGPU -DCUDNN -I/usr/local/cudnn/include -c ./src/image_opencv.cpp -o obj/image_opencv.o
./src/image_opencv.cpp: In function ‘void draw_detections_cv_v3(void**, detection*, int, float, char**, image**, int, int)’:
                 float rgb[3];
                       ^~~
./src/image_opencv.cpp: In function ‘void draw_train_loss(char*, void**, int, float, float, int, int, float, int, char*, float, int, int, double)’:
             if (iteration_old == 0)
             ^~
./src/image_opencv.cpp:1130:10: note: ...this statement, but the latter is misleadingly inde

**9. Change the permissions of the darknet file**

In [9]:
!chmod +x ./darknet         # change the mode of the darknet file so it can be executed (+x -> executable)

**10. Download the pre-trained convolutional weights for YOLOv4 (yolov4.conv.137)**

In [None]:
!wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.conv.137

--2020-07-11 07:09:32--  https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.conv.137
Resolving github.com (github.com)... 140.82.113.4
Connecting to github.com (github.com)|140.82.113.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://github-production-release-asset-2e65be.s3.amazonaws.com/75388965/48bfe500-889d-11ea-819e-c4d182fcf0db?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20200711%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20200711T070932Z&X-Amz-Expires=300&X-Amz-Signature=5b2d29821b81f917a17eea69377c49793ee6b6f3b9a7f8517b8f24020727ca09&X-Amz-SignedHeaders=host&actor_id=0&repo_id=75388965&response-content-disposition=attachment%3B%20filename%3Dyolov4.conv.137&response-content-type=application%2Foctet-stream [following]
--2020-07-11 07:09:32--  https://github-production-release-asset-2e65be.s3.amazonaws.com/75388965/48bfe500-889d-11ea-819e-c4d182fcf0db?X-Amz-Algorithm=AWS4-HMAC-SHA

**11. Setting up the configuration**

**(A) Altering .cfg file**

It contains the model architecture information.

**Download voc-custom.cfg file from cfg folder of darknet**



**Open the file in any editor and make the following changes:**
*   set batch=64
*   set subdivisions=32
*   set height=608 and width=608, or any other multiple of 32
*   set max_batches to classes*2000, i.e. 20x2000 = 40000
*   set steps to 80% and 90% of max_batches, e.g. steps=32000,36000
*   change classes=80 to your number of object classes in each of the 3 [yolo] layers, e.g. classes=20
*   change filters=255 to filters=(classes + 5)x3 in the last [convolutional] layer before each of the 3 [yolo] layers (only that layer, not every [convolutional] layer), e.g. filters=75
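
The arithmetic behind these values can be sketched as a small helper, which is handy when adapting the config to a different number of classes. The function name `yolo_cfg_values` is hypothetical; the formulas are the ones listed above.

```python
def yolo_cfg_values(num_classes):
    """Derive the class-dependent .cfg values from the number of classes."""
    max_batches = num_classes * 2000
    # steps are 80% and 90% of max_batches (integer arithmetic avoids rounding issues)
    steps = (max_batches * 8 // 10, max_batches * 9 // 10)
    # filters for the last [convolutional] layer before each [yolo] layer
    filters = (num_classes + 5) * 3
    return {"max_batches": max_batches, "steps": steps, "filters": filters}


values = yolo_cfg_values(20)  # PASCAL VOC has 20 classes
print(values)
```

For the 20 VOC classes this reproduces max_batches=40000, steps=32000,36000 and filters=75.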

**(B) Creating .names file**

*   It contains the class information
*   Create a new file in your editor and name it voc.names
*   Write the class names in the file, one class name per line

```
voc.names:

aeroplane
bicycle
bird
boat
bottle
bus
car
cat
chair
cow
diningtable
dog
horse
motorbike
person
pottedplant
sheep
sofa
train
tvmonitor
```
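
If you prefer to generate the file from code, a minimal sketch is shown below. It writes to a temporary directory, which is an assumption made for the example; in practice the file goes into the data folder of darknet. The class index used in the annotation files corresponds to the zero-based line number in this file.

```python
import tempfile
from pathlib import Path

VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle",
    "bus", "car", "cat", "chair", "cow",
    "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]


def write_names(path, classes):
    """Write one class name per line."""
    Path(path).write_text("\n".join(classes) + "\n")


# Temporary location used only for this example
names_path = Path(tempfile.mkdtemp()) / "voc.names"
write_names(names_path, VOC_CLASSES)

# Map each class name to the index used in the annotation files
class_ids = {name: i for i, name in enumerate(names_path.read_text().split())}
print(class_ids["person"])
```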

**(C) Creating .data file**

It contains the paths of the training list, the test list and the class-names file, and the directory where the trained weights are stored.

```
voc.data
classes= 20
train  = data/voc/train.txt    => Train file path
valid  = data/test.txt  => Test file path
names = data/voc.names => Classes file path
backup = backup/   => Weights will be saved in this directory every 100 iterations.
```
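The text after each `=>` above is explanatory only and must not appear in the actual voc.data file. Make sure the train and test .txt files are saved inside the data folder of darknet, at the paths listed in voc.data.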
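
The train and test files are plain lists with one image path per line. As a sketch, assuming a simple random split, they and voc.data could be generated as below. The helper names, the 80/20 split, the `data/obj/` image paths and the temporary output directory are all hypothetical choices for the example.

```python
import random
import tempfile
from pathlib import Path


def split_image_list(image_paths, test_fraction=0.2, seed=0):
    """Shuffle the image paths reproducibly and split them into train/test lists."""
    paths = sorted(str(p) for p in image_paths)
    random.Random(seed).shuffle(paths)
    n_test = int(len(paths) * test_fraction)
    return paths[n_test:], paths[:n_test]


def write_data_file(path, num_classes, train, valid, names, backup):
    """Write a Darknet .data file (key = value lines only, no annotations)."""
    lines = [
        f"classes = {num_classes}",
        f"train = {train}",
        f"valid = {valid}",
        f"names = {names}",
        f"backup = {backup}",
    ]
    Path(path).write_text("\n".join(lines) + "\n")


# Hypothetical image paths; in practice, list the files in your dataset folder
images = [f"data/obj/img{i}.jpg" for i in range(10)]
train, test = split_image_list(images)

work = Path(tempfile.mkdtemp())  # temporary directory used only for this example
(work / "train.txt").write_text("\n".join(train) + "\n")
(work / "test.txt").write_text("\n".join(test) + "\n")
write_data_file(work / "voc.data", 20, "data/voc/train.txt", "data/test.txt",
                "data/voc.names", "backup/")
print(len(train), len(test))
```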

**12. Before you proceed, upload the modified files:**

*   Upload voc.names and voc.data inside the data folder of darknet
*   Upload voc-custom.cfg inside the cfg folder of darknet

**13. Install dos2unix to make the files Unix-compatible**

In [10]:
!sudo apt-get install dos2unix 

Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  dos2unix
0 upgraded, 1 newly installed, 0 to remove and 11 not upgraded.
Need to get 351 kB of archives.
After this operation, 1,267 kB of additional disk space will be used.
Get:1 http://archive.ubuntu.com/ubuntu bionic/universe amd64 dos2unix amd64 7.3.4-3 [351 kB]
Fetched 351 kB in 1s (369 kB/s)
debconf: unable to initialize frontend: Dialog
debconf: (No usable dialog-like program is installed, so the dialog based frontend cannot be used. at /usr/share/perl5/Debconf/FrontEnd/Dialog.pm line 76, <> line 1.)
debconf: falling back to frontend: Readline
debconf: unable to initialize frontend: Readline
debconf: (This frontend requires a controlling tty.)
debconf: falling back to frontend: Teletype
dpkg-preconfigure: unable to re-open stdin: 
Selecting previously unselected package dos2unix.
(Reading database ... 144628 files and directories curren

**14. Convert the uploaded files to Unix format**

If you get an error here, recheck the paths for all the files and correct them accordingly.



In [11]:
 !dos2unix /content/drive/My\ Drive/Research_Internship/darknet/data/train.txt
 !dos2unix /content/drive/My\ Drive/Research_Internship/darknet/data/test.txt
 !dos2unix /content/drive/My\ Drive/Research_Internship/darknet/data/voc.data
 !dos2unix /content/drive/My\ Drive/Research_Internship/darknet/data/voc.names
 !dos2unix /content/drive/My\ Drive/Research_Internship/darknet/cfg/voc-custom.cfg

dos2unix: converting file /content/drive/My Drive/Research_Internship/darknet/data/train.txt to Unix format...
dos2unix: converting file /content/drive/My Drive/Research_Internship/darknet/data/test.txt to Unix format...
dos2unix: converting file /content/drive/My Drive/Research_Internship/darknet/data/voc.data to Unix format...
dos2unix: converting file /content/drive/My Drive/Research_Internship/darknet/data/voc.names to Unix format...
dos2unix: converting file /content/drive/My Drive/Research_Internship/darknet/cfg/voc-custom.cfg to Unix format...
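
For reference, the core of what dos2unix does to these files is to turn Windows line endings (CRLF) into Unix line endings (LF). A minimal Python sketch of that conversion is below; the function name and the sample content are hypothetical, and dos2unix itself handles more cases than this.

```python
import tempfile
from pathlib import Path


def to_unix_newlines(path):
    """Rewrite the file at `path` in place with CRLF line endings replaced by LF."""
    p = Path(path)
    data = p.read_bytes()
    p.write_bytes(data.replace(b"\r\n", b"\n"))


# Temporary file with Windows line endings, used only for this example
demo = Path(tempfile.mkdtemp()) / "voc.names"
demo.write_bytes(b"aeroplane\r\nbicycle\r\nbird\r\n")
to_unix_newlines(demo)
converted = demo.read_bytes()
print(converted)
```

Stray carriage returns in voc.data, voc.names or the image lists can otherwise end up as part of a path or class name.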


In [15]:
 #! ./darknet detector train data/voc.data cfg/voc-custom.cfg yolov4.conv.137 -dont_show

**15. Train the model** (use the command below when training from previously trained weights)

Since I have already trained on PASCAL VOC 2012, I am using the command below to continue the training from the saved weights.

If you are training for the first time, change the command to
```
 ! ./darknet detector train data/voc.data cfg/voc-custom.cfg yolov4.conv.137 -dont_show
```

If your dataset has classes similar to those of the PASCAL VOC 2012 dataset, you can use the pre-trained weights from the following link: **https://drive.google.com/file/d/1-j1scUl-Qzpg7QeBcW7fQCaBuioELxvY/view?usp=sharing**
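
The choice between the two starting points can be sketched as a small helper that resumes from a checkpoint in backup/ when one exists and otherwise falls back to yolov4.conv.137. The function `pick_weights` is hypothetical, and it assumes checkpoints follow the naming seen in this notebook (`voc-custom_last.weights`, `voc-custom_40000.weights`).

```python
import re
import tempfile
from pathlib import Path


def pick_weights(backup_dir, cfg_stem="voc-custom", pretrained="yolov4.conv.137"):
    """Return the most recent checkpoint in backup_dir, or the pre-trained weights."""
    backup = Path(backup_dir)
    last = backup / f"{cfg_stem}_last.weights"
    if last.exists():
        return str(last)
    # Otherwise pick the numbered checkpoint with the highest iteration count
    numbered = []
    for p in backup.glob(f"{cfg_stem}_*.weights"):
        m = re.fullmatch(rf"{re.escape(cfg_stem)}_(\d+)\.weights", p.name)
        if m:
            numbered.append((int(m.group(1)), p))
    if numbered:
        return str(max(numbered, key=lambda item: item[0])[1])
    return pretrained


# Temporary directory standing in for darknet/backup in this example
backup_dir = Path(tempfile.mkdtemp())
first_run = pick_weights(backup_dir)
(backup_dir / "voc-custom_1000.weights").touch()
(backup_dir / "voc-custom_40000.weights").touch()
resumed = pick_weights(backup_dir)

command = f"./darknet detector train data/voc.data cfg/voc-custom.cfg {resumed} -dont_show"
print(first_run)
print(command)
```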

In [None]:
! ./darknet detector train data/voc.data cfg/voc-custom.cfg backup/voc-custom_40000.weights -dont_show

 CUDA-version: 10010 (10010), cuDNN: 7.6.5, GPU count: 1  
 OpenCV version: 3.2.0
voc-custom
 0 : compute_capability = 600, cudnn_half = 0, GPU: Tesla P100-PCIE-16GB 
net.optimized_memory = 0 
mini_batch = 1, batch = 64, time_steps = 1, train = 1 
   layer   filters  size/strd(dil)      input                output
   0 conv     32       3 x 3/ 1    608 x 608 x   3 ->  608 x 608 x  32 0.639 BF
   1 conv     64       3 x 3/ 2    608 x 608 x  32 ->  304 x 304 x  64 3.407 BF
   2 conv     64       1 x 1/ 1    304 x 304 x  64 ->  304 x 304 x  64 0.757 BF
   3 route  1 		                           ->  304 x 304 x  64 
   4 conv     64       1 x 1/ 1    304 x 304 x  64 ->  304 x 304 x  64 0.757 BF
   5 conv     32       1 x 1/ 1    304 x 304 x  64 ->  304 x 304 x  32 0.379 BF
   6 conv     64       3 x 3/ 1    304 x 304 x  32 ->  304 x 304 x  64 3.407 BF
   7 Shortcut Layer: 4,  wt = 0, wn = 0, outputs: 304 x 304 x  64 0.006 BF
   8 conv     64       1 x 1/ 1    304 x 304 x  64 ->  304 x 304 

**16. Calculate the mean Average Precision (mAP) of the trained model**
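
For orientation, the average precision (AP) of one class can be computed as the area under its precision-recall curve, and mAP is the mean of the per-class APs. The sketch below is a generic illustration of that idea and not Darknet's own implementation. It assumes each detection has already been matched against the ground truth (true positive or not), and the sample scores are made up. The `-points 0` option used in the command below is, as far as I understand, the area-under-curve variant.

```python
def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the interpolated precision-recall curve for one class."""
    if num_ground_truth == 0:
        return 0.0
    # Visit detections from the most to the least confident
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = fp = 0
    recalls, precisions = [], []
    for i in order:
        if is_true_positive[i]:
            tp += 1
        else:
            fp += 1
        recalls.append(tp / num_ground_truth)
        precisions.append(tp / (tp + fp))
    # Interpolate: precision at a recall level is the best precision at any higher recall
    for k in range(len(precisions) - 2, -1, -1):
        precisions[k] = max(precisions[k], precisions[k + 1])
    # Sum the rectangles under the curve
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


# Made-up detections for one class with 4 ground-truth boxes
ap = average_precision([0.9, 0.8, 0.7, 0.6], [True, False, True, True], 4)
print(ap)
```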

In [12]:
 !./darknet detector map data/voc.data cfg/voc-custom.cfg backup/voc-custom_40000.weights  -dont_show -points 0  

 CUDA-version: 10010 (10010), cuDNN: 7.6.5, GPU count: 1  
 OpenCV version: 3.2.0
 0 : compute_capability = 750, cudnn_half = 0, GPU: Tesla T4 
net.optimized_memory = 0 
mini_batch = 1, batch = 32, time_steps = 1, train = 0 
   layer   filters  size/strd(dil)      input                output
   0 conv     32       3 x 3/ 1    608 x 608 x   3 ->  608 x 608 x  32 0.639 BF
   1 conv     64       3 x 3/ 2    608 x 608 x  32 ->  304 x 304 x  64 3.407 BF
   2 conv     64       1 x 1/ 1    304 x 304 x  64 ->  304 x 304 x  64 0.757 BF
   3 route  1 		                           ->  304 x 304 x  64 
   4 conv     64       1 x 1/ 1    304 x 304 x  64 ->  304 x 304 x  64 0.757 BF
   5 conv     32       1 x 1/ 1    304 x 304 x  64 ->  304 x 304 x  32 0.379 BF
   6 conv     64       3 x 3/ 1    304 x 304 x  32 ->  304 x 304 x  64 3.407 BF
   7 Shortcut Layer: 4,  wt = 0, wn = 0, outputs: 304 x 304 x  64 0.006 BF
   8 conv     64       1 x 1/ 1    304 x 304 x  64 ->  304 x 304 x  64 0.757 BF
   9 rou

**17. Predict bounding boxes for a new image**

The predicted image will be saved in the darknet folder with the name predictions.jpg

In [None]:
!./darknet detector test data/voc.data cfg/voc-custom.cfg backup/voc-custom_last.weights data/person.jpg  -dont_show 

 CUDA-version: 10010 (10010), cuDNN: 7.6.5, GPU count: 1  
 OpenCV version: 3.2.0
 0 : compute_capability = 370, cudnn_half = 0, GPU: Tesla K80 
net.optimized_memory = 0 
mini_batch = 1, batch = 64, time_steps = 1, train = 0 
   layer   filters  size/strd(dil)      input                output
   0 conv     32       3 x 3/ 1    608 x 608 x   3 ->  608 x 608 x  32 0.639 BF
   1 conv     64       3 x 3/ 2    608 x 608 x  32 ->  304 x 304 x  64 3.407 BF
   2 conv     64       1 x 1/ 1    304 x 304 x  64 ->  304 x 304 x  64 0.757 BF
   3 route  1 		                           ->  304 x 304 x  64 
   4 conv     64       1 x 1/ 1    304 x 304 x  64 ->  304 x 304 x  64 0.757 BF
   5 conv     32       1 x 1/ 1    304 x 304 x  64 ->  304 x 304 x  32 0.379 BF
   6 conv     64       3 x 3/ 1    304 x 304 x  32 ->  304 x 304 x  64 3.407 BF
   7 Shortcut Layer: 4,  wt = 0, wn = 0, outputs: 304 x 304 x  64 0.006 BF
   8 conv     64       1 x 1/ 1    304 x 304 x  64 ->  304 x 304 x  64 0.757 BF
   9 ro

**18. Prediction for videos**

In [16]:
!./darknet detector demo data/voc.data cfg/voc-custom.cfg backup/voc-custom_40000.weights data/traffic.mp4 -out_filename yolo4_pred.avi -dont_show

Streaming output truncated to the last 5000 lines.
car: 99% 
car: 97% 
car: 97% 
car: 95% 
car: 80% 
car: 53% 
car: 51% 
car: 31% 
car: 30% 

FPS:18.1 	 AVG_FPS:18.9

 cvWriteFrame 
Objects:

train: 67% 
bus: 58% 
train: 30% 
person: 95% 
person: 83% 
person: 81% 
person: 43% 
person: 32% 
person: 25% 
car: 99% 
car: 99% 
car: 99% 
car: 98% 
car: 96% 
car: 93% 
car: 89% 
car: 68% 
car: 38% 

FPS:18.2 	 AVG_FPS:18.9

 cvWriteFrame 
Objects:

person: 95% 
person: 69% 
person: 49% 
person: 47% 
car: 99% 
car: 99% 
car: 96% 
car: 95% 
car: 83% 
car: 60% 
car: 35% 
car: 35% 

FPS:18.2 	 AVG_FPS:18.9

 cvWriteFrame 
Objects:

train: 32% 
person: 92% 
person: 86% 
person: 73% 
person: 52% 
person: 41% 
car: 99% 
car: 99% 
car: 93% 
car: 40% 
car: 28% 
car: 28% 

FPS:18.0 	 AVG_FPS:18.9

 cvWriteFrame 
Objects:

person: 98% 
person: 86% 
person: 75% 
person: 67% 
person: 64% 
person: 52% 
person: 40% 
person: 39% 
person: 36% 
person: 34% 
car: 99% 
car: 95% 
car: 89% 
car: 66% 
