The YOLOv5 object detection model is provided by [Ultralytics](https://github.com/ultralytics/yolov5), so we clone its repository to run the model.

In [1]:
!rm -rf /content/yolov5

In [2]:
!git clone https://github.com/ultralytics/yolov5  # cloning Ultralytics repo
%cd yolov5

Cloning into 'yolov5'...
remote: Enumerating objects: 16413, done.
remote: Counting objects: 100% (5/5), done.
remote: Compressing objects: 100% (5/5), done.
remote: Total 16413 (delta 0), reused 3 (delta 0), pack-reused 16408
Receiving objects: 100% (16413/16413), 14.90 MiB | 8.86 MiB/s, done.
Resolving deltas: 100% (11265/11265), done.
/content/yolov5


In [3]:
!git clone https://github.com/amalnairr/buddha-poses-detection

Cloning into 'buddha-poses-detection'...
remote: Enumerating objects: 6133, done.
remote: Counting objects: 100% (1489/1489), done.
remote: Compressing objects: 100% (1041/1041), done.
remote: Total 6133 (delta 15), reused 1486 (delta 15), pack-reused 4644
Receiving objects: 100% (6133/6133), 148.05 MiB | 27.50 MiB/s, done.
Resolving deltas: 100% (867/867), done.
Updating files: 100% (12359/12359), done.


The "yolov5" folder, like most repositories of this kind, contains a text file called _requirements.txt_ that lists the dependencies needed to run the model. Here we install those dependencies, including libraries such as PyTorch that the model relies on.
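
To make the structure of a requirements file concrete, here is a minimal sketch of how its lines can be read. The helper name `parse_requirements` and the sample entries are assumptions for illustration only; they are not the actual contents of the YOLOv5 file, and `pip` itself handles many more cases than this.

```python
import re


def parse_requirements(text):
    """Return (name, specifier) pairs from requirements.txt-style text."""
    requirements = []
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if not line or line.startswith("-"):  # skip blank lines and pip options
            continue
        match = re.match(r"^([A-Za-z0-9_.\-]+)\s*(.*)$", line)
        if match:
            requirements.append((match.group(1), match.group(2).strip()))
    return requirements


# Illustrative entries only (hypothetical, not copied from the repository)
sample = """
numpy>=1.18.5
torch>=1.7.0  # deep learning framework
# tensorboard>=2.4.1
"""

for name, spec in parse_requirements(sample):
    print(name, spec or "(any version)")
```

Commented-out lines are skipped, which is why optional packages in such a file are not installed unless they are uncommented.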

In [None]:
# installing dependencies as necessary
!pip install -qr requirements.txt #ignore errors if any - rare but possible

import torch
from IPython.display import Image, clear_output  # to display plots and images
from utils.downloads import attempt_download  # to download models/datasets

# checking if connected to the GPU
print('Setup complete. Using torch %s %s' % (torch.__version__, torch.cuda.get_device_properties(0) if torch.cuda.is_available() else 'CPU'))

The dataset comes from the [Warburg Institute Iconographic Database](https://iconographic.warburg.sas.ac.uk). We chose a subsection, [Buddhist Sculptures](https://iconographic.warburg.sas.ac.uk/category/vpc-taxonomy-001747), for our project, and we attempt to detect Buddha and his poses in images of the sculptures using YOLOv5.

- However, YOLOv5 requires an annotated dataset, and the annotation format is specific to the model. We used Roboflow to annotate the images and export the dataset in the required format as a .zip file.
- What began as an annotation of the images into 7 classes was later iteratively reduced to a [dataset with 5 classes](https://universe.roboflow.com/warburg-buddha-poses/buddha-poses-detection/dataset/6). That version has **Random Gaussian Blur** applied up front as an augmentation method, because the versions without the augmentation performed worse.
  - Update: The data was then converted to have [4 classes](https://universe.roboflow.com/warburg-buddha-poses/buddha-poses-detection/dataset/8) to address the underrepresentation of some classes.
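
For reference, a YOLO label file holds one line per object: a class index followed by the box centre, width and height, all normalised by the image size. The sketch below converts a pixel-coordinate box into such a line; the function name `to_yolo_label` and the example box are hypothetical, and Roboflow performs this conversion during export.

```python
def to_yolo_label(class_id, box, image_size):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    x_center = (x_min + x_max) / 2 / img_w  # normalised box centre, x
    y_center = (y_min + y_max) / 2 / img_h  # normalised box centre, y
    width = (x_max - x_min) / img_w         # normalised box width
    height = (y_max - y_min) / img_h        # normalised box height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Hypothetical box of class 2 in a 640x480 image
print(to_yolo_label(2, (100, 50, 300, 450), (640, 480)))
# 2 0.312500 0.520833 0.312500 0.833333
```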


In [None]:
# commented out because the dataset is accessed by cloning the repository on Colab instead
# !rm -rf /content/yolov5/BuddhaPosesDataSet8
# !unzip /content/BuddhaPosesDetection.v8.zip -d /content/yolov5/

The data.yaml file has the information regarding the train, test and validation datasets, the number of classes and their names.
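
As a rough illustration of that structure, the sketch below checks a dictionary shaped like a parsed data.yaml. The paths and class names in `example_config` are placeholders chosen for this example, not the contents of the project's file, and `check_dataset_config` is a hypothetical helper.

```python
def check_dataset_config(config):
    """Return a list of problems found in a parsed data.yaml dictionary."""
    problems = []
    for key in ("train", "val", "nc", "names"):
        if key not in config:
            problems.append(f"missing key: {key}")
    # The declared class count should match the number of class names
    if "nc" in config and "names" in config and config["nc"] != len(config["names"]):
        problems.append(f"nc is {config['nc']} but {len(config['names'])} names are listed")
    return problems


# Placeholder values standing in for the real file
example_config = {
    "train": "../train/images",
    "val": "../valid/images",
    "test": "../test/images",
    "nc": 4,
    "names": ["horse-riding", "sitting", "sleeping", "standing"],
}

print(check_dataset_config(example_config))  # []
```

A mismatch between `nc` and the length of `names` is a common source of training errors, so it is worth checking before starting a long run.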

In [None]:
yaml_path = '/content/yolov5/buddha-poses-detection/buddha-detection-yolov5/version2-original/data.yaml'

In [None]:
%cat {yaml_path}

**Commented for v8 dataset**

The dataset has 4 classes representing Buddha in 4 poses: Horse Riding (as Prince Siddhartha), Sitting, Standing, and Sleeping (while attaining Nirvana).

- Two of the 4 poses already map directly to two different stages of Buddha's life.

### Choice of Model and Model Parameters

In [None]:
# define number of classes based on YAML - we already saw that the set has 4 classes in our case
import yaml
with open(yaml_path, 'r') as stream:
    num_classes = str(yaml.safe_load(stream)['nc'])

We will use the YOLOv5s (small) model for this project because we intend to run it on smartphones later, so the lighter the model, the better. Here are the model configuration parameters associated with YOLOv5s:

In [None]:
#this is the model configuration we will use for our project
%cat /content/yolov5/models/yolov5s.yaml

Here the number of classes is defined as *80*, while we have only **4**. We can write a customized copy of the file, as below:

In [None]:
# custom IPython cell magic: like %%writefile, but fills {variables} in from the notebook's globals
from IPython.core.magic import register_line_cell_magic

@register_line_cell_magic
def writetemplate(line, cell):
    with open(line, 'w') as f:
        f.write(cell.format(**globals()))

In [None]:
%%writetemplate /content/yolov5/models/custom_yolov5s.yaml

# parameters
nc: {num_classes}  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple

# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, BottleneckCSP, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 9, BottleneckCSP, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, BottleneckCSP, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 1, SPP, [1024, [5, 9, 13]]],
   [-1, 3, BottleneckCSP, [1024, False]],  # 9
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, BottleneckCSP, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, BottleneckCSP, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, BottleneckCSP, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, BottleneckCSP, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]
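
The effect of `nc`, `depth_multiple` and `width_multiple` in the configuration above can be sketched with a little arithmetic. The functions below are an approximation written for this notebook, assuming the scaling rules used by the YOLOv5 model parser (repeat counts scaled and rounded, channels rounded up to a multiple of 8); the function names are hypothetical.

```python
import math


def scaled_repeats(n, depth_multiple):
    """Number of times a module is repeated after depth scaling."""
    return max(round(n * depth_multiple), 1) if n > 1 else n


def scaled_channels(c, width_multiple, divisor=8):
    """Output channels after width scaling, rounded up to a multiple of divisor."""
    return math.ceil(c * width_multiple / divisor) * divisor


def detect_outputs_per_scale(num_classes, anchors_per_scale=3):
    """Each anchor predicts 4 box values, 1 objectness score and one score per class."""
    return anchors_per_scale * (num_classes + 5)


print(scaled_repeats(9, 0.33))       # 3
print(scaled_channels(1024, 0.50))   # 512
print(detect_outputs_per_scale(4))   # 27
print(detect_outputs_per_scale(80))  # 255
```

Under these assumptions, moving from 80 classes to 4 shrinks each detection layer from 255 to 27 output channels, while the backbone and the rest of the head are unchanged.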

### Training the model

Training the model is simple in terms of code, though not in terms of computation time. We are using the free Tesla T4 GPU provided by Google Colab.

In [None]:
import shutil

# the dataset repo was cloned inside /content/yolov5, so this matches the folder that holds yaml_path
source_directory = '/content/yolov5/buddha-poses-detection/buddha-detection-yolov5/version2-original'
destination_directory = '/content/yolov5'

# Copy the contents of the source directory to the destination directory
shutil.copytree(source_directory, destination_directory, dirs_exist_ok=True)


In [None]:
%%time
# %%time must be the first line of the cell; it reports how long training takes
# train yolov5s on custom data for 100 epochs
%cd /content/yolov5/
!python train.py --img 640 --batch 16 --epochs 100 --data /content/yolov5/data.yaml --cfg ./models/custom_yolov5s.yaml --weights '' --name yolov5s_results_v2  --cache

The arguments passed to train.py are:
- **img:** the input image size
- **batch:** the batch size
- **epochs:** the number of training epochs. (Note: runs of 3000+ epochs are common here!)
- **data:** the path to our yaml file
- **cfg:** our model configuration file
- **weights:** a custom path to weights; the empty string used above trains from randomly initialized weights. (Note: you can download weights from the Ultralytics Google Drive [folder](https://drive.google.com/open?id=1Drs_Aiu7xx6S-ix95f9kNsA6ueKRpN2J))
- **name:** the name of the results folder
- **nosave:** only save the final checkpoint (not used in the command above)
- **cache:** cache images for faster training
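
To show how these flags fit together, here is a small sketch that assembles the same command as a Python list, which is convenient when varying one setting at a time. The helper `build_train_command` is hypothetical and only builds the argument list; it does not start training.

```python
import shlex


def build_train_command(data, cfg, name, img=640, batch=16, epochs=100, weights="", cache=True):
    """Assemble the train.py command line as a list of arguments."""
    command = [
        "python", "train.py",
        "--img", str(img),
        "--batch", str(batch),
        "--epochs", str(epochs),
        "--data", data,
        "--cfg", cfg,
        "--weights", weights,  # empty string: no pretrained weights
        "--name", name,
    ]
    if cache:
        command.append("--cache")
    return command


command = build_train_command(
    data="/content/yolov5/data.yaml",
    cfg="./models/custom_yolov5s.yaml",
    name="yolov5s_results_v2",
)
print(shlex.join(command))  # shell-quoted form, requires Python 3.8+
```

The resulting list could be handed to `subprocess.run(command, cwd="/content/yolov5")` outside a notebook.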

In [None]:
import shutil
from google.colab import files

# Zip the entire results folder; make_archive appends ".zip" and returns the archive path
archive_path = shutil.make_archive("/content/version2-original", 'zip', "/content/yolov5/runs/train/yolov5s_results_v2/")

# Download the zipped results folder
files.download(archive_path)


### Plotting the results:

The training run records its metrics at every epoch and stores them in results.csv (results.txt in older YOLOv5 releases). We can also plot the values of the different metrics throughout the training process as follows:

In [None]:
from utils.plots import plot_results  # helper that plots the logged training metrics as results.png
Image(filename='/content/yolov5/runs/train/yolov5s_results_v2/results.png', width=1000)  # view results.png
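
Besides viewing the plot, the logged metrics can be inspected directly. The sketch below finds the epoch with the highest value of one metric, assuming the run folder contains a results.csv whose header includes `epoch` and `metrics/mAP_0.5` (possibly padded with spaces), as recent YOLOv5 versions write. The rows in `placeholder_csv` are made-up numbers to exercise the function, not results from this project.

```python
import csv
import io


def best_epoch(csv_text, metric="metrics/mAP_0.5"):
    """Return (epoch, value) for the row with the highest value of metric."""
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    best = None
    for row in reader:
        # Header names and values may be padded with spaces, so strip both
        cleaned = {key.strip(): value.strip() for key, value in row.items()}
        value = float(cleaned[metric])
        if best is None or value > best[1]:
            best = (int(cleaned["epoch"]), value)
    return best


# Made-up placeholder rows, only to demonstrate the parsing
placeholder_csv = """\
   epoch,   metrics/mAP_0.5
       0,   0.10
       1,   0.30
       2,   0.25
"""

print(best_epoch(placeholder_csv))  # (1, 0.3)
```

For a real run, the text would come from something like `open('runs/train/yolov5s_results_v2/results.csv').read()`.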

In [None]:
Image(filename='/content/yolov5/runs/train/yolov5s_results_v2/val_batch0_labels.jpg', width=900)

In [None]:
# print out an augmented training example
print("GROUND TRUTH AUGMENTED TRAINING DATA:")
Image(filename='/content/yolov5/runs/train/yolov5s_results_v2/train_batch0.jpg', width=900)

According to the documentation, training creates a new folder named **runs**, which holds the results of each run.

Also, **what is the output of the training?** The output is a set of weights, optimized iteratively during training; the best-performing checkpoint is stored as **best.pt**.

The script "detect.py" then uses the trained weights to detect the different classes in the test dataset. Let us see if the model can detect the classes in the test images.

In [None]:
# looking at the weights (output of train)
%ls runs/train/yolov5s_results_v2/weights

In [None]:
# we can either choose to run "detect.py" on one image, or the entire set of images
%cd /content/yolov5/
# testing on the entire test folder
!python detect.py --weights runs/train/yolov5s_results_v2/weights/best.pt --img 416 --conf 0.4 --source test/images/

In [None]:
import glob
from IPython.display import Image, display

# displaying the first 10 images found after detection to show the result of the model
for imageName in glob.glob('/content/yolov5/runs/detect/exp/*.jpg')[:10]:  # assuming JPG
    display(Image(filename=imageName))
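
The loop above reads from `runs/detect/exp`, but each further run of detect.py writes to a new numbered folder (`exp2`, `exp3`, and so on). The sketch below, with a hypothetical helper `latest_run_dir`, picks the highest-numbered folder so the display code follows the latest run; it is demonstrated on a temporary directory standing in for `runs/detect`.

```python
import re
import tempfile
from pathlib import Path


def latest_run_dir(parent, prefix="exp"):
    """Return the highest-numbered run folder (exp, exp2, exp3, ...) or None."""
    def run_number(path):
        suffix = path.name[len(prefix):]
        return int(suffix) if suffix else 1  # plain "exp" counts as run 1

    candidates = [
        p for p in Path(parent).glob(f"{prefix}*")
        if p.is_dir() and re.fullmatch(rf"{re.escape(prefix)}\d*", p.name)
    ]
    return max(candidates, key=run_number, default=None)


# Demonstration on a temporary folder standing in for runs/detect
with tempfile.TemporaryDirectory() as tmp:
    for name in ("exp", "exp2", "exp10"):
        (Path(tmp) / name).mkdir()
    print(latest_run_dir(tmp).name)  # exp10
```

In the notebook, `latest_run_dir('/content/yolov5/runs/detect')` could replace the hard-coded `exp` path in the glob pattern.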