Based on this tutorial [Train Your Own YoloV5 Object Detection Model | analyticsvidhya.com](https://www.analyticsvidhya.com/blog/2021/08/train-your-own-yolov5-object-detection-model/#h2_3)

## Creating Dataset

### Create annotations using the VIA tool
Create annotations by using [this VIA (VGG Image Annotator) tool](https://drive.google.com/file/d/1rJx0fNgnnhODM7H3GP9RQQ5QEsWdkYEd/view?usp=sharing)

Using the tool:

1. Create an attribute named `char`.
2. Upload the image files.
3. Create the annotations.
4. Export the annotations by clicking menu > annotations > export annotations as CSV.
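For orientation, a VIA CSV export carries one row per annotated region, with the box geometry and the attribute value stored as JSON strings in the `region_shape_attributes` and `region_attributes` columns. The sketch below parses one such row. The sample values (file name, size, coordinates) are invented for illustration and are not taken from the dataset used here.

```python
import io
import json

import pandas as pd

# Illustrative VIA-style export: one row per region; all values are invented.
sample_csv = '''filename,file_size,file_attributes,region_count,region_id,region_shape_attributes,region_attributes
sample.png,1234,"{}",1,0,"{""name"":""rect"",""x"":10,""y"":5,""width"":20,""height"":30}","{""char"":""A""}"
'''

sample_df = pd.read_csv(io.StringIO(sample_csv))

# Both columns hold JSON text, so json.loads turns them into dictionaries.
shape = json.loads(sample_df.loc[0, "region_shape_attributes"])
attrs = json.loads(sample_df.loc[0, "region_attributes"])
print(shape["x"], shape["y"], shape["width"], shape["height"], attrs["char"])
```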


### Convert CSV annotations to COCO format

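The following cells convert the CSV annotations into the per-image label files that YOLOv5 trains on (one `.txt` file per image with normalized `class x_center y_center width height` rows), stored in COCO-style `train2017` and `val2017` folders.

Start by importing the dependencies needed to build the dataset.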

In [None]:
!pip install wandb

In [None]:
import os
import numpy as np 
import pandas as pd
import shutil as sh
from PIL import Image
from tqdm.auto import tqdm
from pathlib import Path

In [None]:
!mkdir '/content/images/'
!mkdir '/content/images/train'

Upload the annotation `.csv` file to the second path above (`/content/images/train`).

Upload the images zip file to the main path (`/content`).

In [None]:
!nvidia-smi

Thu Mar 24 07:27:38 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla K80           Off  | 00000000:00:04.0 Off |                    0 |
| N/A   58C    P8    31W / 149W |      0MiB / 11441MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

## Prepare Data

### Preprocess .csv file

In [None]:
!wget ibmhcc.mooo.com/annotations.csv -P '/content/images/train'

--2022-03-24 07:27:38--  http://ibmhcc.mooo.com/annotations.csv
Resolving ibmhcc.mooo.com (ibmhcc.mooo.com)... 35.176.54.82
Connecting to ibmhcc.mooo.com (ibmhcc.mooo.com)|35.176.54.82|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1947523 (1.9M) [text/csv]
Saving to: ‘/content/images/train/annotations.csv’


2022-03-24 07:27:40 (1.08 MB/s) - ‘/content/images/train/annotations.csv’ saved [1947523/1947523]



In [None]:
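csv_path = '/content/images/train/annotations.csv'
df = pd.read_csv(csv_path)
# Upper-case the labels so that e.g. "a" and "A" map to the same class
df["region_attributes"] = df["region_attributes"].str.upper()
# index=False avoids writing the row index as an extra unnamed column
df.to_csv(csv_path, index=False)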

Get images

In [None]:
!wget ibmhcc.mooo.com/11kcaptchas.zip
!unzip -qq '/content/11kcaptchas.zip'

--2022-03-24 07:27:41--  http://ibmhcc.mooo.com/11kcaptchas.zip
Resolving ibmhcc.mooo.com (ibmhcc.mooo.com)... 35.176.54.82
Connecting to ibmhcc.mooo.com (ibmhcc.mooo.com)|35.176.54.82|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 34157070 (33M) [application/zip]
Saving to: ‘11kcaptchas.zip’


2022-03-24 07:27:48 (4.56 MB/s) - ‘11kcaptchas.zip’ saved [34157070/34157070]



In [None]:
!cp -r '/content/captcha/.' '/content/images/'

In [None]:
import json

data_path = '/content/images/'
df = pd.read_csv(data_path + 'train/' + csv_path.split("/")[-1])
# Parse the box geometry (top-left x, y and width, height) from the JSON column
shapes = df['region_shape_attributes'].map(json.loads)
df['x'] = [int(s['x']) for s in shapes]
df['y'] = [int(s['y']) for s in shapes]
df['w'] = [int(s['width']) for s in shapes]
df['h'] = [int(s['height']) for s in shapes]
# Use the file name without its extension as the image id
df['image_id'] = [name.split('.')[0] for name in df['filename']]
# Box centers in pixels
df['x_center'] = df['x'] + df['w'] / 2
df['y_center'] = df['y'] + df['h'] / 2
# Map each distinct label to an integer class id
labels = df['region_attributes'].unique()
labels_to_dict = dict(zip(labels, range(len(labels))))
print('Labels Directory:', labels_to_dict)
df['classes'] = df['region_attributes'].map(labels_to_dict)
df = df[['image_id', 'x', 'y', 'w', 'h', 'x_center', 'y_center', 'classes']]
# Unique image ids, used below to split into training and validation sets
index = list(set(df.image_id))

Labels Directory: {'{"CHAR":"2"}': 0, '{"CHAR":"A"}': 1, '{"CHAR":"H"}': 2, '{"CHAR":"N"}': 3, '{"CHAR":"3"}': 4, '{"CHAR":"L"}': 5, '{"CHAR":"S"}': 6, '{"CHAR":"G"}': 7, '{"CHAR":"P"}': 8, '{"CHAR":"6"}': 9, '{"CHAR":"T"}': 10, '{"CHAR":"J"}': 11, '{"CHAR":"F"}': 12, '{"CHAR":"R"}': 13, '{"CHAR":"Q"}': 14, '{"CHAR":"D"}': 15, '{"CHAR":"5"}': 16, '{"CHAR":"B"}': 17, '{"CHAR":"7"}': 18, '{"CHAR":"8"}': 19, '{"CHAR":"E"}': 20, '{"CHAR":"9"}': 21, '{"CHAR":"Y"}': 22, '{"CHAR":"4"}': 23, '{"CHAR":"W"}': 24, '{"CHAR":"K"}': 25, '{"CHAR":"U"}': 26, '{"CHAR":"X"}': 27, '{"CHAR":"V"}': 28, '{"CHAR":"C"}': 29, '{"CHAR":"M"}': 30, '{"CHAR":" Q"}': 31, '{"CHAR":"T "}': 32}
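The mapping above lists `" Q"` and `"T "` as classes separate from `"Q"` and `"T"`, which suggests stray whitespace in a few annotations. One possible cleanup, sketched here on a small made-up series rather than the real CSV, is to extract the character from the JSON value and strip it before building the class mapping. It assumes every value has the `{"CHAR": ...}` shape shown in the output; the names `chars`, `clean_labels` and `clean_labels_to_dict` are illustrative only.

```python
import json

import pandas as pd

# Made-up stand-in for df["region_attributes"] after upper-casing.
region_attributes = pd.Series(
    ['{"CHAR":"Q"}', '{"CHAR":" Q"}', '{"CHAR":"T "}', '{"CHAR":"T"}', '{"CHAR":"2"}']
)

# Pull out the character and drop surrounding whitespace.
chars = region_attributes.map(lambda s: json.loads(s)["CHAR"].strip())

# Sorting makes the class ids reproducible across runs.
clean_labels = sorted(chars.unique())
clean_labels_to_dict = {label: i for i, label in enumerate(clean_labels)}
print(clean_labels_to_dict)
```

If a cleanup like this were applied, the number of classes would change, so `nc` and `names` in the training YAML would have to be updated to match.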


In [None]:
exts = ["png", "JPG", "PNG", "jpg"]  # image extensions to try, in order
for fold in [0]:
    # Hold out one seventh of the image ids for validation
    val_index = set(index[len(index) * fold // 7 : len(index) * (fold + 1) // 7])
    for name, mini in tqdm(df.groupby("image_id")):
        path2save = "val2017" if name in val_index else "train2017"
        label_dir = "convertor/fold{}/labels/{}".format(fold, path2save)
        image_dir = "convertor/fold{}/images/{}".format(fold, path2save)
        os.makedirs(label_dir, exist_ok=True)
        os.makedirs(image_dir, exist_ok=True)

        # Locate the image file, whatever its extension
        for ext_ in exts:
            imagename = data_path + "{}.{}".format(name, ext_)
            if os.path.exists(imagename):
                break
        else:
            raise FileNotFoundError("No image found for {}".format(name))

        with Image.open(imagename) as img:
            img_width, img_height = img.size

        # Label rows: class x_center y_center width height, normalized to [0, 1]
        rows = mini[["classes", "x_center", "y_center", "w", "h"]].to_numpy(
            dtype=float, copy=True
        )
        rows[:, [1, 3]] /= img_width
        rows[:, [2, 4]] /= img_height
        with open(os.path.join(label_dir, name + ".txt"), "w") as f:
            for r in rows:
                f.write("{} {}\n".format(int(r[0]), " ".join(str(v) for v in r[1:])))

        sh.copy(imagename, os.path.join(image_dir, "{}.{}".format(name, ext_)))


  0%|          | 0/2749 [00:00<?, ?it/s]
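To make the normalization in the conversion cell concrete, here is a small worked example for a single box. It is only a sketch: the helper name `to_yolo_row` and the image and box sizes are hypothetical, chosen so the arithmetic is easy to follow.

```python
def to_yolo_row(class_id, x, y, w, h, img_width, img_height):
    """Convert a top-left pixel box to a normalized label row."""
    x_center = (x + w / 2) / img_width
    y_center = (y + h / 2) / img_height
    return (class_id, x_center, y_center, w / img_width, h / img_height)


# Hypothetical 320x96 image with a 40x48 box whose top-left corner is at (60, 24).
row = to_yolo_row(3, x=60, y=24, w=40, h=48, img_width=320, img_height=96)
print(" ".join(str(v) for v in row))  # 3 0.25 0.5 0.125 0.5
```

Each line of a label file has this shape: the class id followed by the box center and size, all expressed as fractions of the image width and height.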

In [None]:
!ls /content/convertor/fold0/images/val2017 | wc -l
!ls /content/convertor/fold0/images/train2017 | wc -l

392
2357
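These counts follow from the slice arithmetic in the conversion cell: with 2749 images and seven folds, fold 0 takes the first `2749 // 7` ids for validation. A minimal sketch that reproduces the numbers with stand-in ids (the function name `split_ids` is hypothetical):

```python
def split_ids(ids, fold=0, n_folds=7):
    """Return (train_ids, val_ids) using the same slice arithmetic as the notebook."""
    start = len(ids) * fold // n_folds
    stop = len(ids) * (fold + 1) // n_folds
    val_ids = ids[start:stop]
    val_set = set(val_ids)
    train_ids = [i for i in ids if i not in val_set]
    return train_ids, val_ids


# Stand-in ids; the notebook itself uses the 2749 unique image names.
ids = ["img{}".format(i) for i in range(2749)]
train_ids, val_ids = split_ids(ids)
print(len(val_ids), len(train_ids))  # 392 2357
```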


In [None]:
!git clone https://github.com/ultralytics/yolov5

Cloning into 'yolov5'...
remote: Enumerating objects: 11789, done.
remote: Total 11789 (delta 0), reused 0 (delta 0), pack-reused 11789
Receiving objects: 100% (11789/11789), 11.38 MiB | 14.75 MiB/s, done.
Resolving deltas: 100% (8160/8160), done.


In [None]:
%cd '/content/yolov5'

/content/yolov5


### Creating YAML file for training

TODO: Paste the output of the next cell after `names:` in the YAML cell that follows it.

In [None]:
print([l.split('"')[3].strip() for l in labels])

['2', 'A', 'H', 'N', '3', 'L', 'S', 'G', 'P', '6', 'T', 'J', 'F', 'R', 'Q', 'D', '5', 'B', '7', '8', 'E', '9', 'Y', '4', 'W', 'K', 'U', 'X', 'V', 'C', 'M', 'Q', 'T']


In [None]:
%%writefile data/coco.yml
train: /content/convertor/fold0/images/train2017
val: /content/convertor/fold0/images/val2017
nc: 33 # number of classes
names: ['2', 'A', 'H', 'N', '3', 'L', 'S', 'G', 'P', '6', 'T', 'J', 'F', 'R', 'Q', 'D', '5', 'B', '7', '8', 'E', '9', 'Y', '4', 'W', 'K', 'U', 'X', 'V', 'C', 'M', 'Q', 'T']   # index of character according to dataset

Writing data/coco.yml
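As an alternative to pasting the class names by hand, the YAML text could be generated from the label list. The sketch below is one way to do that under stated assumptions: `build_data_yaml` is a hypothetical helper, and the three-item name list is shortened for illustration. It relies on the fact that a JSON list of strings is also a valid YAML flow sequence.

```python
import json


def build_data_yaml(train_dir, val_dir, names):
    """Render a dataset YAML with train/val paths, class count and class names."""
    lines = [
        "train: {}".format(train_dir),
        "val: {}".format(val_dir),
        "nc: {}  # number of classes".format(len(names)),
        # json.dumps quotes every name, so '2' stays a string in YAML.
        "names: {}".format(json.dumps(names)),
    ]
    return "\n".join(lines) + "\n"


yaml_text = build_data_yaml(
    "/content/convertor/fold0/images/train2017",
    "/content/convertor/fold0/images/val2017",
    ["2", "A", "H"],  # shortened list for illustration
)
print(yaml_text)
```

The returned text could then be written to `data/coco.yml` with an ordinary file write, which keeps `nc` and `names` in sync automatically.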


## Training

In [None]:
!python train.py --batch 50 --img-size 320 --epochs 250 --data ./data/coco.yml --weights ./data/yolov5s.pt  #./runs/train/exp2/weights/best.pt

[34m[1mwandb[0m: (1) Create a W&B account
[34m[1mwandb[0m: (2) Use an existing W&B account
[34m[1mwandb[0m: (3) Don't visualize my results
[34m[1mwandb[0m: Enter your choice: (30 second timeout) 2
[34m[1mwandb[0m: You chose 'Use an existing W&B account'
[34m[1mwandb[0m: You can find your API key in your browser here: https://wandb.ai/authorize
[34m[1mwandb[0m: Paste an API key from your profile and hit enter, or press ctrl+c to quit: 
[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc
[34m[1mtrain: [0mweights=./data/yolov5s.pt, cfg=, data=./data/coco.yml, hyp=data/hyps/hyp.scratch-low.yaml, epochs=250, batch_size=50, imgsz=320, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, evolve=None, bucket=, cache=None, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=runs/train, name=exp, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0,

## Inference

In [None]:
# possible weights:  #./runs/train/exp7/weights/best.pt ./weights/yolov5l.pt 
!python detect.py --img 320 --source /content/convertor/fold0/images/val2017 --weights ./runs/train/exp3/weights/best.pt  \
     --conf-thres 0.35 --line-thickness 1

[34m[1mdetect: [0mweights=['./runs/train/exp3/weights/best.pt'], source=/content/convertor/fold0/images/val2017, data=data/coco128.yaml, imgsz=[320, 320], conf_thres=0.35, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=runs/detect, name=exp, exist_ok=False, line_thickness=1, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5 🚀 v6.1-61-gbc3ed95 torch 1.10.0+cu111 CUDA:0 (Tesla K80, 11441MiB)

Fusing layers... 
Model summary: 213 layers, 7099126 parameters, 0 gradients
image 1/392 /content/convertor/fold0/images/val2017/24C2Q2.png: 96x320 3 2s, 1 Q, 1 4, 1 C, Done. (0.019s)
image 2/392 /content/convertor/fold0/images/val2017/2AP6TJ.png: 96x320 1 2, 1 A, 1 P, 1 6, 1 T, 1 J, Done. (0.018s)
image 3/392 /content/convertor/fold0/images/val2017/2B7NTP.png: 96x320 1 2, 1 N, 1 P, 1 T, 1 B, 1 7, Done. (0.019s)
image 4/392 
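The log lists the detected characters per image but not their order. To recover the captcha text, one option is to sort the detections from left to right by their horizontal center. The sketch below assumes the detections are available as text rows of the form `class x_center y_center width height`, which is the layout I would expect from running `detect.py` with `--save-txt` (an assumption, since the run above has `save_txt=False`). The helper `decode_captcha` and the sample rows are made up.

```python
def decode_captcha(label_text, names):
    """Read detections left to right and join their class names."""
    detections = []
    for line in label_text.strip().splitlines():
        parts = line.split()
        class_id, x_center = int(parts[0]), float(parts[1])
        detections.append((x_center, names[class_id]))
    # Sorting the (x_center, name) pairs orders characters from left to right.
    return "".join(name for _, name in sorted(detections))


# Made-up label file contents; class ids index into `names`.
names = ["2", "A", "H"]
label_text = """1 0.60 0.5 0.1 0.6
0 0.20 0.5 0.1 0.6
2 0.85 0.5 0.1 0.6
"""
print(decode_captcha(label_text, names))  # 2AH
```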

## Testing

In [None]:
# possible weights:  #./runs/train/exp7/weights/best.pt ./weights/yolov5l.pt 
!python val.py --data ./data/coco.yml --weights ./runs/train/exp3/weights/best.pt #--conf-thres 0.35

[34m[1mval: [0mdata=./data/coco.yml, weights=['./runs/train/exp3/weights/best.pt'], batch_size=32, imgsz=640, conf_thres=0.001, iou_thres=0.6, task=val, device=, workers=8, single_cls=False, augment=False, verbose=False, save_txt=False, save_hybrid=False, save_conf=False, save_json=False, project=runs/val, name=exp, exist_ok=False, half=False, dnn=False
YOLOv5 🚀 v6.1-61-gbc3ed95 torch 1.10.0+cu111 CUDA:0 (Tesla K80, 11441MiB)

Fusing layers... 
Model summary: 213 layers, 7099126 parameters, 0 gradients
[34m[1mval: [0mScanning '/content/convertor/fold0/labels/val2017.cache' images and labels... 392 found, 0 missing, 0 empty, 0 corrupt: 100% 392/392 [00:00<?, ?it/s]
               Class     Images     Labels          P          R     mAP@.5 mAP@.5:.95: 100% 13/13 [00:07<00:00,  1.76it/s]
                 all        392       2352      0.951      0.942       0.97      0.575
                   2        392        101      0.992       0.98      0.995      0.568
                   A   

In [None]:
!zip -r detect_50b_s_2357tr_392val_250e_0-35conf.zip /content/yolov5/runs/detect/exp2