<h2> DATA PREP and PRE-REQUISITES </h2>

- Download the script image_capture.py to your local machine. The script uses your laptop's webcam to capture images and saves them to disk.
- The script may behave slightly differently on macOS and Windows. I have tested it on macOS, where it works well.
- The images you capture depend on your use case. I chose a simple use case, detecting whether a person is wearing a mask, and captured 3 picture types (labels) for it:
-    mask
-    nomask
-    incmask

- You can easily change these to hardhat or any other safety gear. Make sure you are clear on which labels you want to capture.
- If you use more than 3 labels, you need to modify the script accordingly.
- This script will save images in the relevant folders based on your labels.
- Once you capture images using this script, create the following folder structure.
    train_data
       images
         train
         val
       labels
         train
         val
- Split your pictures so that 80% are used for training and 20% for validation, then copy them into the train and val image folders respectively.
- For the labels, we will use <a href = "https://www.makesense.ai/">Makesense.ai</a> to annotate the training and validation images; the instructor will help you with this step. You can also use Amazon SageMaker Ground Truth to label the images, but the augmented manifest file produced when the Ground Truth labeling job completes must then be converted into a format compatible with the yolov5 model. makesense.ai exports the YOLO format out of the box. From a security standpoint, the images are NOT uploaded to makesense.ai while you label them.
- Once the labels are created, download them and add them to the train and val labels folders.
- As a final step, zip the entire train_data folder and upload the zip file to your Jupyter notebook environment.
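
The 80/20 split described above can also be scripted. The following is a minimal sketch, not part of image_capture.py: the function name `split_images`, the `captured` source folder, and the accepted file extensions are assumptions made for illustration. It creates the train_data folder layout and copies a random 80% of the images into images/train and the remaining 20% into images/val.

```python
import random
import shutil
from pathlib import Path

# Assumed image extensions; adjust to whatever your capture script writes.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def split_images(source_dir, dest_root="train_data", val_fraction=0.2, seed=0):
    """Copy images from source_dir into dest_root/images/{train,val}."""
    dest_root = Path(dest_root)

    # Create the folder layout described above.
    for kind in ("images", "labels"):
        for split in ("train", "val"):
            (dest_root / kind / split).mkdir(parents=True, exist_ok=True)

    # Collect images from every label folder (mask, nomask, incmask, ...).
    images = sorted(
        p for p in Path(source_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(images)

    n_val = int(len(images) * val_fraction)
    counts = {"train": 0, "val": 0}
    for i, img in enumerate(images):
        split = "val" if i < n_val else "train"
        # Prefix with the label folder name so equal file names cannot collide.
        target = dest_root / "images" / split / f"{img.parent.name}_{img.name}"
        shutil.copy2(img, target)
        counts[split] += 1
    return counts
```

Calling `split_images("captured")` would return the number of images copied into each split. Keep the source folder outside train_data so that already copied images are not picked up again.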


Once the zip file is uploaded to Studio, unzip it using the following command:

In [None]:
!unzip train_data.zip  # the zip already contains the top-level train_data folder

<h4> You can delete the zip file once it is extracted </h4>

<h4> Clone the Ultralytics repository for yolov5. We will use the yolov5s pre-trained model. The "s" stands for small </h4>

In [None]:
!git clone https://github.com/ultralytics/yolov5.git

<h4> Change into the yolov5 directory once the repo is cloned </h4>

In [None]:
%cd yolov5

<h4>Install the required dependencies</h4>

In [None]:
%pip install -qr requirements.txt  # install

import torch
import utils
display = utils.notebook_init()  # checks

<h4> Set an environment variable to work around duplicate OpenMP library errors </h4>

In [None]:
%env KMP_DUPLICATE_LIB_OK=TRUE

<h3>Configuration Steps</h3>

- Download the coco128.yaml file located at yolov5/data.
- Modify the downloaded file to include your labels and data locations. The coco128.yaml file lists the 80 COCO classes the pre-trained yolov5 model was trained on. You can remove all of those classes and add your own, as I have done below.

------------------------------------------------------------------------------------------------------------------------
```
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]

train: ../train_data/images/train  # train images (path relative to the yolov5 directory)
val: ../train_data/images/val  # val images (path relative to the yolov5 directory)
test:  # test images (optional)

# Classes
nc: 3  # number of classes
names: ['mask','nomask','incmask']  # class names

```
------------------------------------------------------------------------------------------------------------------------

Save the file under a custom name and upload it to the yolov5/data folder. I named my file coco_dataset.yaml.
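
If you prefer to generate the dataset file rather than edit it by hand, the following minimal sketch writes the same keys as the example above. The helper name `write_dataset_yaml` is hypothetical and not part of yolov5. It derives `nc` from the list of names, so the two values cannot get out of sync.

```python
from pathlib import Path


def write_dataset_yaml(
    names,
    out_path,
    train="../train_data/images/train",
    val="../train_data/images/val",
):
    """Write a minimal dataset file with train, val, nc and names keys."""
    quoted = ", ".join(f"'{name}'" for name in names)
    text = (
        f"train: {train}\n"
        f"val: {val}\n"
        "\n"
        f"nc: {len(names)}  # number of classes\n"
        f"names: [{quoted}]  # class names\n"
    )
    Path(out_path).write_text(text)
    return text
```

For example, `write_dataset_yaml(['mask', 'nomask', 'incmask'], 'data/coco_dataset.yaml')` run from the yolov5 directory would produce a file equivalent to the one shown above. The order of the names must match the class indices used in your label files.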


<h2> TRAINING </h2>

<p> We will use transfer learning and take advantage of the pre-trained yolov5 model to train on our custom dataset. yolov5 requires each image to have an annotation (label) file with the same name as the image file. We took care of that in the pre-requisite steps. Each label file simply contains the class and the bounding box coordinates of the boxes we drew on the makesense.ai site. </p>
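
To make the label format concrete: each line of a YOLO label file has the form `class x_center y_center width height`, with the four box values normalized to the range 0 to 1 by the image width and height. The sketch below is illustrative only. The helper names `to_yolo_line` and `images_without_labels` are hypothetical, and the pixel box layout `(x_min, y_min, x_max, y_max)` is an assumption about how you might hold a box before export. makesense.ai already does this conversion for you.

```python
from pathlib import Path

# Assumed image extensions; adjust to match your captured files.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def to_yolo_line(class_id, box, image_size):
    """Convert a pixel box (x_min, y_min, x_max, y_max) to a YOLO label line."""
    x_min, y_min, x_max, y_max = box
    img_w, img_h = image_size
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def images_without_labels(images_dir, labels_dir):
    """Return the names of images that have no same-named .txt label file."""
    labelled = {p.stem for p in Path(labels_dir).glob("*.txt")}
    return sorted(
        p.name
        for p in Path(images_dir).iterdir()
        if p.suffix.lower() in IMAGE_SUFFIXES and p.stem not in labelled
    )
```

For a 400 x 400 image, `to_yolo_line(0, (100, 50, 300, 250), (400, 400))` returns `0 0.500000 0.375000 0.500000 0.500000`. Running `images_without_labels` on the train and val folders before training is a quick way to spot images you forgot to annotate.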
    
    
<p>There are many hyperparameters that can be tuned for this model. The hyperparameter files are located in yolov5/data/hyps and cover settings such as learning rate and weight decay. However, we will train the model with the default hyperparameters before we start tuning them.</p>

In [None]:
!python train.py --img 416 --batch 1 --epochs 100 --data coco_dataset.yaml --weights yolov5s.pt --cache

<h2> INFERENCING </h2>

<p> Use the weights generated by the training job after "n" epochs. The best.pt file holds the weights from the best-performing epoch for your use case. Replace exp2 in the path below with the run folder created by your own training job. You can run detection on many kinds of sources, from images and videos to RTSP streams and even YouTube videos. </p>

In [None]:
!python detect.py --weights runs/train/exp2/weights/best.pt --img 640 --conf 0.25 --source ../outpy.avi
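
Each training run writes to a new folder under runs/train (exp, exp2, exp3, and so on), so the path passed to `--weights` changes between runs. As a convenience, the sketch below picks the most recently modified best.pt. The helper name `latest_best_weights` is hypothetical and not part of yolov5.

```python
from pathlib import Path


def latest_best_weights(runs_dir="runs/train"):
    """Return the most recently modified best.pt under runs_dir, or None."""
    candidates = list(Path(runs_dir).glob("*/weights/best.pt"))
    if not candidates:
        return None
    # The newest file belongs to the latest training run that saved weights.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Run it from the yolov5 directory and pass the returned path to `--weights` in the detect.py command above.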

<h3> EXPORT </h3>

<p> Once the model is trained, it can be exported to the ONNX format. The exported model can then be compiled by Amazon SageMaker Neo for deployment to the edge. </p>

<p> Exporting to ONNX format </p>