# CAR PLATE DETECTOR

My goal was to build an algorithm that detects license plates on vehicles. As part of my research, I browsed Kaggle and selected a dataset that ships with ready-made labels (https://www.kaggle.com/datasets/andrewmvd/car-plate-detection), then downloaded it into the Google Colab environment. The download requires the `opendatasets` library and a Kaggle API token (Kaggle.com -> top right corner -> Account -> Create New API Token).

For more information, see: https://blog.roboflow.com/how-to-train-yolov5-on-a-custom-dataset/?ref=ultralytics

**NOTE:** Don't forget to remove the emoji from the first line of these files:
* /content/yolov5/train.py
* /content/yolov5/detect.py
* /content/yolov5/data/hyps/hyp.scratch-... .yaml
* /content/yolov5/models/yolov5... .yaml

Otherwise Google Colab will raise an error.
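The clean-up described in the note can also be scripted. Below is a minimal sketch, assuming that dropping every non-ASCII character from the first line is acceptable; the helper name `strip_non_ascii_first_line` is my own and is not part of YOLOv5.

```python
from pathlib import Path


def strip_non_ascii_first_line(path: str) -> bool:
    """Remove non-ASCII characters (e.g. emoji) from the first line of a file.

    Hypothetical helper. Returns True if the file was modified.
    """
    file = Path(path)
    lines = file.read_text(encoding="utf8").splitlines(keepends=True)
    if not lines:
        return False

    # Encoding to ASCII with errors="ignore" silently drops anything else
    cleaned = lines[0].encode("ascii", errors="ignore").decode("ascii")
    if cleaned == lines[0]:
        return False

    lines[0] = cleaned
    file.write_text("".join(lines), encoding="utf8")
    return True


# Example for two of the files listed in the note (run after cloning YOLOv5):
# for name in ("/content/yolov5/train.py", "/content/yolov5/detect.py"):
#     strip_non_ascii_first_line(name)
```

Only the first line is touched, so emoji that appear elsewhere in the files are left alone.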

In [1]:
!pip install opendatasets

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting opendatasets
  Downloading opendatasets-0.1.22-py3-none-any.whl (15 kB)
Installing collected packages: opendatasets
Successfully installed opendatasets-0.1.22


In [14]:
!rm -r /content/sample_data

In [2]:
import opendatasets as od

od.download("https://www.kaggle.com/datasets/andrewmvd/car-plate-detection")

Downloading car-plate-detection.zip to ./car-plate-detection
100%|██████████| 203M/203M [00:01<00:00, 107MB/s]

## PREPROCESS DATASET

The downloaded labels are stored as XML files, so they need to be parsed and converted into the text format that the YOLOv5 network expects. The converted labels and the images then need to be moved into training, test and validation subfolders.

In [6]:
# Preparation of paths for training, test and validation set

import os

ANNOTATIONS = r"/content/car-plate-detection/annotations"
LABEL_DATASET_DIR = r"/content/car-plate-detection/annotations_txt"
IMG_DATASET_DIR = r"/content/car-plate-detection/images"

DATASET_MAIN_DIR = r"/content/car_plates"
IMAGES_DIR = r"/content/car_plates/images"
LABELS_DIR = r"/content/car_plates/labels"
TRAIN_IMG_DIR = r"/content/car_plates/images/train"
TRAIN_LABEL_DIR = r"/content/car_plates/labels/train"
TEST_IMG_DIR = r"/content/car_plates/images/test"
TEST_LABEL_DIR = r"/content/car_plates/labels/test"
VAL_IMG_DIR = r"/content/car_plates/images/val"
VAL_LABEL_DIR = r"/content/car_plates/labels/val"

LIST_OF_PATHS = [LABEL_DATASET_DIR, DATASET_MAIN_DIR, IMAGES_DIR, LABELS_DIR, 
                 TRAIN_IMG_DIR,  TRAIN_LABEL_DIR, TEST_IMG_DIR, TEST_LABEL_DIR, 
                 VAL_IMG_DIR, VAL_LABEL_DIR]

for path in LIST_OF_PATHS:
  # makedirs also creates missing parent folders and is safe to re-run
  os.makedirs(path, exist_ok=True)

In [7]:
import xml.etree.ElementTree as ET

def extract_info_from_xml(xml_file: str) -> dict:
  """
  Decodes information from XML file.
  
  Args:
    xml_file (str): path to XML file.
  
  Returns:
    Decoded XML file in the form of a dictionary.
  """
  root = ET.parse(xml_file).getroot()

  # Initialise the info dict
  info_dict = {}
  info_dict['bboxes'] = []

  # Parse the XML Tree
  for elem in root:
    # Get the file name
    if elem.tag == "filename":
      info_dict['filename'] = elem.text

    # Get the image size
    elif elem.tag == "size":
      image_size = []
      for subelem in elem:
        image_size.append(int(subelem.text))

      info_dict['image_size'] = tuple(image_size)

    # Get details of the bounding box
    elif elem.tag == "object":
      bbox = {}
      for subelem in elem:
        if subelem.tag == "name":
          bbox["class"] = subelem.text

        elif subelem.tag == "bndbox":
          for subsubelem in subelem:
            bbox[subsubelem.tag] = int(subsubelem.text)
          info_dict['bboxes'].append(bbox)

  return info_dict
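To make the layout that the function walks through easier to picture, here is a small sketch that parses an in-memory annotation. The XML below is a made-up sample that only mirrors the tags the function reads (`filename`, `size`, `object`, `name`, `bndbox`); the file name, class name and numbers are assumptions, not values taken from the dataset.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation with the same tag layout as the parsed files
SAMPLE_XML = """
<annotation>
    <filename>Cars0.png</filename>
    <size>
        <width>500</width>
        <height>268</height>
        <depth>3</depth>
    </size>
    <object>
        <name>licence</name>
        <bndbox>
            <xmin>226</xmin>
            <ymin>125</ymin>
            <xmax>419</xmax>
            <ymax>173</ymax>
        </bndbox>
    </object>
</annotation>
""".strip()

root = ET.fromstring(SAMPLE_XML)

# <size> holds width, height and depth, in that order in this sample
image_size = tuple(int(child.text) for child in root.find("size"))

# One dictionary of integer coordinates per <object>
boxes = [
    {child.tag: int(child.text) for child in obj.find("bndbox")}
    for obj in root.findall("object")
]

print(root.findtext("filename"), image_size, boxes)
```

Note that `image_size` has three entries, so index 0 is the width and index 1 is the height only as long as the file keeps that tag order.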

The cell below converts the data into the required format.

In [8]:
for annotation in os.listdir(ANNOTATIONS):
  file_name = os.path.splitext(annotation)[0]
  data_from_xml_file = extract_info_from_xml(
    os.path.join(ANNOTATIONS, annotation))

  # image_size is (width, height, depth)
  img_width = data_from_xml_file["image_size"][0]
  img_height = data_from_xml_file["image_size"][1]

  # One label line per bounding box, since an image may contain several plates
  lines = []
  for bbox in data_from_xml_file["bboxes"]:
    x_min, y_min = bbox["xmin"], bbox["ymin"]
    x_max, y_max = bbox["xmax"], bbox["ymax"]

    # YOLO format: box centre and box size, normalised by the image size
    x_center = (x_max + x_min) / 2 / img_width
    y_center = (y_max + y_min) / 2 / img_height
    width = (x_max - x_min) / img_width
    height = (y_max - y_min) / img_height

    lines.append(
      f"0 {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}\n")

  # mode="w" so that re-running the cell does not append duplicate labels
  with open(file=os.path.join(LABEL_DATASET_DIR, f"{file_name}.txt"),
            mode="w",
            encoding="utf8") as file:
    file.writelines(lines)
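The normalisation used above can be checked on a single box. This is a minimal sketch with a hypothetical helper `voc_to_yolo` and made-up pixel values, chosen so that the result is easy to verify by hand.

```python
def voc_to_yolo(x_min, y_min, x_max, y_max, img_width, img_height):
    """Convert corner coordinates in pixels to normalised YOLO values.

    Returns (x_center, y_center, width, height), each divided by the
    corresponding image dimension.
    """
    x_center = (x_min + x_max) / 2 / img_width
    y_center = (y_min + y_max) / 2 / img_height
    width = (x_max - x_min) / img_width
    height = (y_max - y_min) / img_height
    return x_center, y_center, width, height


# A 200x100 px box centred in a 400x200 px image
box = voc_to_yolo(100, 50, 300, 150, img_width=400, img_height=200)
print(box)  # (0.5, 0.5, 0.5, 0.5)
```

All four values should fall between 0 and 1; anything outside that range usually means that width and height were swapped or that the box lies outside the image.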

In [9]:
# Split the data into training, test and validation sets
import random

DATASET = [
  element.split(".")[0]
  for element in os.listdir(LABEL_DATASET_DIR)]

TEST_DATASET = random.sample(population=DATASET,
                             k=int(0.1 * len(DATASET)))
VAL_DATASET = random.sample(population=[element for element in DATASET 
                                        if element not in TEST_DATASET],
                            k=(int(0.1 * len(DATASET))))
TRAIN_DATASET = [element for element in DATASET 
                 if element not in TEST_DATASET 
                 if element not in VAL_DATASET]
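The split above is drawn without a seed, so it changes on every run. If a repeatable split is needed, one option is to shuffle once with a seeded generator and slice the result. This is only a sketch: the helper `split_names`, the seed value and the `Cars{i}` names are assumptions used for illustration.

```python
import random


def split_names(names, test_fraction=0.1, val_fraction=0.1, seed=42):
    """Return (train, val, test) lists that are disjoint and reproducible."""
    # Sort first so the result does not depend on the order of os.listdir
    shuffled = sorted(names)
    random.Random(seed).shuffle(shuffled)

    n_test = int(test_fraction * len(shuffled))
    n_val = int(val_fraction * len(shuffled))

    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


# 433 placeholder names, the same number of files as in this notebook
train, val, test = split_names([f"Cars{i}" for i in range(433)])
print(len(train), len(val), len(test))  # 347 43 43
```

Slicing one shuffled list guarantees that the three sets cannot overlap, which removes the need for the `not in` filters.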

In [10]:
import shutil

def split_dataset(dataset: list,
                  img_dir: str,
                  label_dir: str) -> None:
  """
  Moves selected photos and labels from the main 
  folder to subfolders.
  
  Args:
    dataset (list): list that contains file names.
    img_dir (str): Path to the folder for images.
    label_dir (str): Path to the folder for labels.
  """
  for element in dataset:
    label_path = os.path.join(LABEL_DATASET_DIR,
                              f"{element}.txt")
    img_path = os.path.join(IMG_DATASET_DIR,
                            f"{element}.png")

    shutil.move(src=label_path,
                dst=label_dir) 
    shutil.move(src=img_path,
                dst=img_dir)

In [11]:
split_dataset(dataset=TEST_DATASET,
     img_dir=TEST_IMG_DIR,
     label_dir=TEST_LABEL_DIR)

len(os.listdir(TEST_IMG_DIR)), len(os.listdir(TEST_LABEL_DIR))

(43, 43)

In [12]:
split_dataset(dataset=VAL_DATASET,
     img_dir=VAL_IMG_DIR,
     label_dir=VAL_LABEL_DIR)

len(os.listdir(VAL_IMG_DIR)), len(os.listdir(VAL_LABEL_DIR))

(43, 43)

In [13]:
split_dataset(dataset=TRAIN_DATASET,
     img_dir=TRAIN_IMG_DIR,
     label_dir=TRAIN_LABEL_DIR)

len(os.listdir(TRAIN_IMG_DIR)), len(os.listdir(TRAIN_LABEL_DIR))

(347, 347)
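Equal counts do not prove that every image has its own label file. A stricter check compares the file names without their extensions. The sketch below uses a hypothetical helper `unmatched_stems`, which is not part of the original notebook.

```python
import os


def unmatched_stems(img_dir: str, label_dir: str) -> set:
    """Return the base names that appear in only one of the two folders."""
    img_stems = {os.path.splitext(name)[0] for name in os.listdir(img_dir)}
    label_stems = {os.path.splitext(name)[0] for name in os.listdir(label_dir)}
    # Symmetric difference: images without a label and labels without an image
    return img_stems ^ label_stems


# Example with the folders defined earlier in the notebook:
# assert not unmatched_stems(TRAIN_IMG_DIR, TRAIN_LABEL_DIR)
```

An empty set means that the two folders pair up exactly.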

## TRAIN YOLOv5 MODEL

In [15]:
!git clone https://github.com/ultralytics/yolov5

Cloning into 'yolov5'...
remote: Enumerating objects: 12467, done.
remote: Counting objects: 100% (20/20), done.
remote: Compressing objects: 100% (17/17), done.
remote: Total 12467 (delta 7), reused 10 (delta 3), pack-reused 12447
Receiving objects: 100% (12467/12467), 12.21 MiB | 30.50 MiB/s, done.
Resolving deltas: 100% (8598/8598), done.


Create a .yaml file that describes the location of the datasets, the number of classes and the class names.

In [16]:
# mode="w" so that re-running the cell overwrites the file instead of appending
with open(file="/content/yolov5/data/car_plates.yaml", mode="w",
          encoding="utf8") as file:
  file.write("# dataset paths\n")
  file.write(f"train: {TRAIN_IMG_DIR}\n")
  file.write(f"val: {VAL_IMG_DIR}\n")
  file.write(f"test: {TEST_IMG_DIR}\n")
  file.write("\n# number of classes\n")
  file.write("nc: 1\n")
  file.write("\n# class names\n")
  # The class name is spelled 'cal_plates', as it appears in the detection logs
  file.write("names: ['cal_plates']\n")
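As a variation on the cell above, the file content can be built as a single string first, which makes it easy to inspect before writing and keeps `nc` in sync with the list of names. This is a sketch under the assumption that class names are simple strings without quotes; `build_dataset_yaml` is a hypothetical helper.

```python
def build_dataset_yaml(train_dir, val_dir, test_dir, class_names):
    """Return the text of a dataset .yaml file with the keys used above."""
    lines = [
        "# dataset paths",
        f"train: {train_dir}",
        f"val: {val_dir}",
        f"test: {test_dir}",
        "",
        "# number of classes",
        f"nc: {len(class_names)}",  # derived, so it cannot drift from names
        "",
        "# class names",
        f"names: {list(class_names)!r}",  # e.g. ['cal_plates']
    ]
    return "\n".join(lines) + "\n"


yaml_text = build_dataset_yaml(
    "/content/car_plates/images/train",
    "/content/car_plates/images/val",
    "/content/car_plates/images/test",
    ["cal_plates"],
)
print(yaml_text)
```

The string can then be written with `open(..., mode="w")`, so that running the cell twice leaves one clean copy of the file.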

In [19]:
!python /content/yolov5/train.py \
  --img 640 \
  --cfg /content/yolov5/models/yolov5m.yaml \
  --hyp /content/yolov5/data/hyps/hyp.scratch-low.yaml \
  --batch 32 \
  --epochs 100 \
  --data /content/yolov5/data/car_plates.yaml \
  --weights yolov5s.pt \
  --name yolo_car_plates

[34m[1mtrain: [0mweights=yolov5s.pt, cfg=/content/yolov5/models/yolov5m.yaml, data=/content/yolov5/data/car_plates.yaml, hyp=/content/yolov5/data/hyps/hyp.scratch-low.yaml, epochs=100, batch_size=32, imgsz=640, rect=False, resume=False, nosave=False, noval=False, noautoanchor=False, noplots=False, evolve=None, bucket=, cache=None, image_weights=False, device=, multi_scale=False, single_cls=False, optimizer=SGD, sync_bn=False, workers=8, project=yolov5/runs/train, name=yolo_car_plates, exist_ok=False, quad=False, cos_lr=False, label_smoothing=0.0, patience=100, freeze=[0], save_period=-1, seed=0, local_rank=-1, entity=None, upload_dataset=False, bbox_interval=-1, artifact_alias=latest
[34m[1mgithub: [0mup to date with https://github.com/ultralytics/yolov5 ✅
YOLOv5 🚀 v6.1-292-g0414637 Python-3.7.13 torch-1.11.0+cu113 CUDA:0 (Tesla T4, 15110MiB)

[34m[1mhyperparameters: [0mlr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias

Checking the detection results on my own images (hidden).

In [20]:
!python /content/yolov5/detect.py \
  --source /content/test_photo \
  --weights /content/yolov5/runs/train/yolo_car_plates3/weights/best.pt \
  --conf 0.5 \
  --name yolo_road_det

[34m[1mdetect: [0mweights=['/content/yolov5/runs/train/yolo_car_plates3/weights/best.pt'], source=/content/test_photo, data=yolov5/data/coco128.yaml, imgsz=[640, 640], conf_thres=0.5, iou_thres=0.45, max_det=1000, device=, view_img=False, save_txt=False, save_conf=False, save_crop=False, nosave=False, classes=None, agnostic_nms=False, augment=False, visualize=False, update=False, project=yolov5/runs/detect, name=yolo_road_det, exist_ok=False, line_thickness=3, hide_labels=False, hide_conf=False, half=False, dnn=False
YOLOv5 🚀 v6.1-292-g0414637 Python-3.7.13 torch-1.11.0+cu113 CUDA:0 (Tesla T4, 15110MiB)

Fusing layers... 
YOLOv5m summary: 290 layers, 20852934 parameters, 0 gradients
image 1/5 /content/test_photo/IMG_0775.JPG: 640x480 1 cal_plates, Done. (0.025s)
image 2/5 /content/test_photo/IMG_0776.JPEG: 480x640 1 cal_plates, Done. (0.026s)
image 3/5 /content/test_photo/IMG_0777.JPEG: 480x640 1 cal_plates, Done. (0.023s)
image 4/5 /content/test_photo/IMG_0778.JPEG: 480x640 1 cal_p
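The `--weights` path in the detect command points at `yolo_car_plates3` rather than `yolo_car_plates`, because YOLOv5 appends a number to the run name when a folder with that name already exists (`exist_ok=False` in the log). Instead of hard-coding the suffix, the newest `best.pt` can be looked up. This is a minimal sketch: `latest_weights` is a hypothetical helper, and it assumes that the most recently modified file is the one you want.

```python
from pathlib import Path


def latest_weights(runs_dir: str, run_name: str):
    """Return the most recently modified best.pt among runs named run_name*.

    Returns None if no matching file exists. The pattern also matches any
    other folder whose name merely starts with run_name.
    """
    candidates = Path(runs_dir).glob(f"{run_name}*/weights/best.pt")
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


# Example with the paths used in this notebook:
# weights = latest_weights("/content/yolov5/runs/train", "yolo_car_plates")
# print(weights)
```

The returned path can be passed to `detect.py` through `--weights`.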

In [None]:
import cv2
from google.colab.patches import cv2_imshow

path = "/content/yolov5/runs/detect/yolo_road_det"

for img in os.listdir(path):
  img_path = os.path.join(path, img)
  cv2_imshow(cv2.imread(img_path))