# License Plate Detection Model Using YOLOv8

This notebook presents a **License Plate Detection Model** built on **YOLOv8** (You Only Look Once, version 8), a real-time object detection model from Ultralytics for locating objects in images and video.

The model has been fine-tuned on the [Car License Plate Detection](https://www.kaggle.com/datasets/andrewmvd/car-plate-detection) dataset from Kaggle. This dataset provides a diverse set of images with annotated license plates, which is ideal for training our model.

This work also uses **Optical Character Recognition** (OCR) to read the content of the detected license plates. Two OCR libraries are supported: **pytesseract** and **EasyOCR**. Each can be applied to the region inside a predicted bounding box to extract the plate text.

## Dataset Overview

The Car License Plate Detection dataset is a rich collection of car images with annotated license plates. The annotations provide the bounding box coordinates of license plates, making it a valuable resource for training object detection models.

## Model Overview

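YOLO-family models detect objects in a single forward pass over the image. The original YOLO formulation divided the input into an S×S grid and predicted several bounding boxes and class probabilities per grid cell. YOLOv8 keeps the single-pass design but uses an anchor-free detection head that predicts box coordinates and class scores at several feature-map scales; overlapping candidates are then filtered by confidence and non-maximum suppression.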

Our model has been fine-tuned on the aforementioned dataset for the specific task of license plate detection. Fine-tuning involves training the model on a specific dataset after it has been pre-trained on a larger dataset, allowing the model to adapt to new data.

## Usage

This model can be used in various applications such as traffic surveillance, parking management, and in automated systems where vehicle identification is required.

Please note that the use of this model should comply with local laws and regulations related to privacy and data protection.

## Future Work

While the fine-tuned YOLO model performs well at locating plates, the next step is to improve the license plate OCR stage, in particular the preprocessing applied to cropped plate images before they are passed to OCR. The aim is more accurate and more consistent text extraction.

## Acknowledgment
The implementation presented in this notebook draws significant inspiration from the work shared in this [notebook](https://www.kaggle.com/code/aslanahmedov/automatic-number-plate-recognition) by Aslan Ahmedov.

We hope you find this notebook useful for your license plate detection tasks! Happy coding! 🚗🔍

In [None]:
# Install required packages

# Install Tesseract OCR
!apt install -y tesseract-ocr

# Set preferred encoding to UTF-8
import locale
def getpreferredencoding(do_setlocale=True):
    return "UTF-8"
locale.getpreferredencoding = getpreferredencoding

# Install required Python packages
!pip install pytesseract easyocr imutils onnxruntime ultralytics




In [None]:
# Imports
import os
from glob import glob
from shutil import copy

from ultralytics import YOLO
import cv2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from skimage import io
import PIL
import pytesseract as pt

import xml.etree.ElementTree as ET
import tensorflow as tf
import plotly.express as px



# Data Preprocessing

To train and fine-tune the YOLO model, we need a suitable dataset. In this project, we utilize the [Car License Plate Detection dataset](https://www.kaggle.com/datasets/andrewmvd/car-plate-detection) from Kaggle. The dataset provides a diverse collection of images annotated with car license plate information.

## Kaggle Dataset

1. Visit the [Car License Plate Detection Kaggle dataset page](https://www.kaggle.com/datasets/andrewmvd/car-plate-detection).
2. Download the dataset in a zip file format.

## Google Drive

Once the dataset is downloaded, upload it to Google Drive for convenient access from the notebook.

Now, we are ready to proceed with the data extraction and exploration.



In [None]:
# Unzip the uploaded file
!unzip /content/drive/MyDrive/Licence_plate_data/archive.zip

## Parsing Data from XML Files

The downloaded dataset is structured in XML format, with each annotation file providing information about the images, including object labels and bounding box coordinates. Here is an example of the XML file structure:

```xml
<annotation>
    <folder>images</folder>
    <filename>Cars0.png</filename>
    <size>
        <width>500</width>
        <height>268</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>licence</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <occluded>0</occluded>
        <difficult>0</difficult>
        <bndbox>
            <xmin>226</xmin>
            <ymin>125</ymin>
            <xmax>419</xmax>
            <ymax>173</ymax>
        </bndbox>
    </object>
</annotation>
```


In [None]:
# Parse XML annotations for object detection
# Extract relevant information about bounding boxes and image properties
# Organize the information into a structured dictionary


path = glob("/content/annotations/*.xml") # get a list of file paths for XML files in the /content/annotations/ directory.
labels_dict = dict(filepath=[], xmin=[], xmax=[], ymin=[], ymax=[], filename=[], width=[], height=[]) # create labels dictionary

# iterate over xml files
for file in path:
  info = ET.parse(file)
  root = info.getroot()
  member_object = root.find("object") # get object element
  labels_info = member_object.find("bndbox") #get bndbox element

  # Extract int values of bounding box dimensions
  xmin = int(labels_info.find("xmin").text)
  xmax = int(labels_info.find("xmax").text)
  ymin = int(labels_info.find("ymin").text)
  ymax = int(labels_info.find("ymax").text)

  # Extract str values for image filename
  filename = root.find("filename").text # get filename element


  size_element = root.find("size") # get size element
  # Extract int values for image size
  width = int(size_element.find("width").text)
  height = int(size_element.find("height").text)


  # Save the information in the labels_dict
  labels_dict["filepath"].append(file)
  labels_dict["xmin"].append(xmin)
  labels_dict["xmax"].append(xmax)
  labels_dict["ymin"].append(ymin)
  labels_dict["ymax"].append(ymax)
  labels_dict["filename"].append(filename)
  labels_dict["width"].append(width)
  labels_dict["height"].append(height)
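
The loop above reads only the first `<object>` element of each file via `root.find("object")`. If an annotation file lists more than one plate, the remaining boxes would be skipped. A minimal sketch of collecting every box with `findall`, assuming the same XML layout as the example above (the helper name `parse_boxes` is illustrative and not part of the notebook):

```python
import xml.etree.ElementTree as ET

def parse_boxes(xml_text):
    """Return one (xmin, ymin, xmax, ymax) tuple per <object> in the annotation."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.findall("object"):
        bnd = obj.find("bndbox")
        boxes.append(tuple(int(bnd.find(tag).text)
                           for tag in ("xmin", "ymin", "xmax", "ymax")))
    return boxes

# Trimmed copy of the Cars0.png annotation shown earlier
sample = """<annotation>
  <filename>Cars0.png</filename>
  <size><width>500</width><height>268</height><depth>3</depth></size>
  <object><name>licence</name>
    <bndbox><xmin>226</xmin><ymin>125</ymin><xmax>419</xmax><ymax>173</ymax></bndbox>
  </object>
</annotation>"""

print(parse_boxes(sample))
```

Each returned tuple could then become its own row in `labels_dict`, so that one image may contribute several label lines.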

In [None]:
# Convert this dictionary as pandas dataframe
df = pd.DataFrame(labels_dict)
df.head()

## YOLO Model Requirements for Bounding Box Representation

The transformation of bounding box coordinates in our dataset is necessary to align with the requirements of the YOLO model. YOLO expects bounding box labels to be represented in terms of the center coordinates (X, Y) and the width (W) and height (H) of the bounding box, all normalized to the dimensions of the image.
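
As a worked example of this conversion, the sketch below applies it to the `Cars0.png` annotation shown earlier (box 226, 125, 419, 173 in a 500x268 image). The function name `to_yolo` is an illustrative assumption, not part of the notebook:

```python
def to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert pixel corner coordinates to normalized (x_center, y_center, w, h)."""
    x_center = (xmin + xmax) / (2 * img_w)
    y_center = (ymin + ymax) / (2 * img_h)
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height

# Box and image size taken from the Cars0.png annotation
x, y, w, h = to_yolo(226, 125, 419, 173, img_w=500, img_h=268)
print(f"{x:.3f} {y:.3f} {w:.3f} {h:.3f}")
```

All four values fall between 0 and 1, which makes the label independent of the image resolution.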

<img src= "https://github.com/Asikpalysik/Automatic-License-Plate-Detection/blob/main/Presentation/Notebook9.png?raw=true" width="100%" align="center"  hspace="5%" vspace="5%"/>






In [None]:
# Center coordinates of the bounding box

df["x"] = (df['xmax'] + df['xmin'])/(2*df['width'])
df["y"] = (df['ymax'] + df['ymin'])/(2*df['height'])

df["w"] = (df['xmax'] - df['xmin'])/df['width']
df["h"] = (df['ymax'] - df['ymin'])/df['height']

df.head()

Let's visualize the image and draw the bounding box to verify the results.

In [None]:
image_path = "/content/images/Cars0.png"  # image that matches the XML example above

import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle
from PIL import Image

# Display the image
plt.imshow(Image.open(image_path))

# Draw the annotated box: anchor at (xmin, ymin), width xmax - xmin, height ymax - ymin
plt.gca().add_patch(Rectangle((226, 125), 419 - 226, 173 - 125,
                              edgecolor='red',
                              facecolor='none',
                              lw=1))

## Folder Structure for YOLO Models

To follow the requirements of YOLO models, we organize our data in the following folder structure:

```plaintext
data/
|-- train/
|   |-- image1.jpg
|   |-- image1.txt
|   |-- image2.jpg
|   |-- image2.txt
|   |-- ...
|
|-- test/
|   |-- image1.jpg
|   |-- image1.txt
|   |-- image2.jpg
|   |-- image2.txt
|   |-- ...
```

**Training and Test Data:** Images are divided into training and test sets.

**Label Information:** Corresponding labels are stored in .txt files, sharing the same filenames as the images.
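
Each label file holds one line per box in the form `class x_center y_center width height`, with the four box values normalized to the image size. One way to sanity-check a label is to convert it back to pixel corners and compare with the original annotation. A minimal sketch, where `label_to_pixels` is a hypothetical helper name:

```python
def label_to_pixels(label_line, img_w, img_h):
    """Parse 'class x_center y_center w h' into (class_id, xmin, ymin, xmax, ymax) pixels."""
    class_id, x, y, w, h = label_line.split()
    x, y, w, h = float(x), float(y), float(w), float(h)
    xmin = round((x - w / 2) * img_w)
    ymin = round((y - h / 2) * img_h)
    xmax = round((x + w / 2) * img_w)
    ymax = round((y + h / 2) * img_h)
    return int(class_id), xmin, ymin, xmax, ymax

# Label line built from the Cars0.png box (226, 125, 419, 173) in a 500x268 image
line = f"0 {645 / 1000} {298 / 536} {193 / 500} {48 / 268}"
print(label_to_pixels(line, img_w=500, img_h=268))
```

If the round trip does not reproduce the annotated corners, the normalization step is the first place to look.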

In [None]:
# Create directory structure for dataset
!mkdir -p /content/datasets/data_images/train /content/datasets/data_images/test

In [None]:
# Split the data into training and test set
df_train = df.iloc[:int(0.8*len(df))]
df_test = df.iloc[int(0.8*len(df)):]
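
The split above takes the first 80% of rows in whatever order `glob` returned the files. If that order happens to correlate with image content, the two sets may differ systematically. A minimal sketch of a reproducible shuffled split, shown on a plain list of hypothetical filenames; with the DataFrame, calling `df.sample(frac=1, random_state=...)` before slicing would play the same role:

```python
import random

def shuffled_split(items, train_fraction=0.8, seed=42):
    """Shuffle a copy of items with a fixed seed, then split into (train, test)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

filenames = [f"Cars{i}.png" for i in range(10)]  # hypothetical file list
train, test = shuffled_split(filenames)
print(len(train), len(test))
```

Fixing the seed keeps the split identical across notebook runs, so results stay comparable.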

In [None]:
train_folder = "/content/datasets/data_images/train"

values = df_train[['filename','x','y','w','h']].values
for fname, x,y, w, h in values:
    image_name = os.path.split(fname)[-1]
    txt_name = os.path.splitext(image_name)[0]

    dst_image_path = os.path.join(train_folder,image_name)
    dst_label_file = os.path.join(train_folder,txt_name+'.txt')

    # copy each image into the folder
    copy("/content/images/"+image_name,dst_image_path)

    #generate .txt which has label info
    label_txt = f'0 {x} {y} {w} {h}'
    with open(dst_label_file, mode='w') as f:
        f.write(label_txt)  # the with-block closes the file on exit


In [None]:
test_folder = "/content/datasets/data_images/test"

values = df_test[['filename','x','y','w','h']].values
for fname, x,y, w, h in values:
    image_name = os.path.split(fname)[-1]
    txt_name = os.path.splitext(image_name)[0]

    dst_image_path = os.path.join(test_folder,image_name)
    dst_label_file = os.path.join(test_folder,txt_name+'.txt')

    # copy each image into the folder
    copy("/content/images/"+image_name,dst_image_path)

    #generate .txt which has label info
    label_txt = f'0 {x} {y} {w} {h}'
    with open(dst_label_file, mode='w') as f:
        f.write(label_txt)  # the with-block closes the file on exit

In [None]:
# Create and write data.yaml file
data_yaml_content = """
train: data_images/train
val: data_images/test
nc: 1
names: [
    'license_plate'
]
"""

with open('data.yaml', 'w') as file:
    file.write(data_yaml_content)

# Check if the file is created successfully
!cat data.yaml

# YOLO Model Initialization and Training

In [None]:
# Load a model. Three options exist; only the last (active) one is used here.
# model = YOLO('yolov8n.yaml')  # build a new model from YAML
# model = YOLO('yolov8n.pt')    # load a pretrained model (recommended for training)
model = YOLO('yolov8n.yaml').load('yolov8n.pt')  # build from YAML and transfer weights

# Train the model
results = model.train(data='data.yaml', epochs=100, device="0")

## Export as onnx

In [None]:
# Export the model as onnx
model.export(format='onnx')

In [None]:
# Run inference on the test image with arguments
model.predict('/content/car_plate_test.jpg', save=True, conf=0.5)

In [None]:
# Set image size setting
IMAGE_WIDTH = 640
IMAGE_HEIGHT = 640

# load sample test image
img = io.imread("/content/car_plate_test.jpg")

fig = px.imshow(img)
fig.update_layout(width=700, height=400, margin=dict(l=10, r=10, b=10, t=10))
fig.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)
fig.show()


In [None]:
# LOAD YOLO MODEL
net = cv2.dnn.readNetFromONNX('/content/drive/MyDrive/Licence_plate_detection_model/runs/detect/train2/weights/best.onnx')
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CPU)

## Load the saved model

In [None]:
# Load the fine-tuned model
loaded_model = YOLO('/content/drive/MyDrive/Licence_plate_detection_model/runs/detect/train2/weights/best.pt')  # best weights saved during training

# Run inference on the test image with arguments
loaded_model.predict('/content/car_plate_test.jpg', save=True, conf=0.5)

In [None]:
import torch
import torchvision
import onnx
import onnxruntime as ort
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image
import pytesseract

# Constants
MODEL_PATH = "/content/drive/MyDrive/Licence_plate_detection_model/runs/detect/train2/weights/best.onnx"
IMAGE_PATH = "/content/car_plate_test.jpg"
IMAGE_SIZE = 640

# Load the YOLOv8 model in ONNX format
ort_session = ort.InferenceSession(MODEL_PATH)

# Preprocess the input image for YOLOv8
image = Image.open(IMAGE_PATH)  # Load the image using PIL
scale_x = image.width / IMAGE_SIZE
scale_y = image.height / IMAGE_SIZE
resized_image = image.resize((IMAGE_SIZE, IMAGE_SIZE))  # Resize to 640x640

# Convert PIL image to tensor
transform = torchvision.transforms.ToTensor()
input_tensor = transform(resized_image).unsqueeze(0)  # Add a batch dimension

# Run the YOLOv8 model
outputs = ort_session.run(None, {'images': input_tensor.numpy()})

# Post-process the output tensor to get bounding box coordinates and confidence scores
boxes = outputs[0][0]
confidences = boxes[4]  # Row 4 holds the confidence score of each candidate box

# Find the bounding box with the highest confidence
max_confidence_index = np.argmax(confidences)
x, y, w, h, c = boxes[:, max_confidence_index]

# Convert box format from [x_center, y_center, width, height] to [x_min, y_min, x_max, y_max]
x_min, y_min = (x - w / 2) * scale_x, (y - h / 2) * scale_y
x_max, y_max = (x + w / 2) * scale_x, (y + h / 2) * scale_y

# Crop the license plate from the image
license_plate_image = image.crop((x_min, y_min, x_max, y_max))

# Use pytesseract to extract text
license_plate_text = pytesseract.image_to_string(license_plate_image)
print(license_plate_text)

# Visualize the predicted bounding boxes on the image
plt.imshow(image)
current_axis = plt.gca()

# Draw the bounding box and the license plate text on the image
current_axis.add_patch(plt.Rectangle((x_min, y_min), x_max - x_min, y_max - y_min, color='red', fill=False, linewidth=2))
current_axis.text(x_min, y_min-10, f'License Plate: {license_plate_text} Confidence: {c:.2f}', bbox={'facecolor': 'white', 'alpha': 0.7})

plt.show()
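
The cell above maps the predicted box from the 640x640 model input back to the original image by multiplying with `scale_x` and `scale_y`. The sketch below isolates that step on made-up numbers. The clamping to the image bounds is an addition that the cell above does not perform, and `to_pixel_box` is a hypothetical helper name:

```python
def to_pixel_box(x, y, w, h, scale_x, scale_y, img_w, img_h):
    """Map a center-format box from model input space to clamped pixel corners."""
    x_min = max(0.0, (x - w / 2) * scale_x)
    y_min = max(0.0, (y - h / 2) * scale_y)
    x_max = min(float(img_w), (x + w / 2) * scale_x)
    y_max = min(float(img_h), (y + h / 2) * scale_y)
    return x_min, y_min, x_max, y_max

# Hypothetical 1280x960 image resized to 640x640 for inference
img_w, img_h, size = 1280, 960, 640
box = to_pixel_box(320, 320, 100, 50, img_w / size, img_h / size, img_w, img_h)
print(box)
```

Because the resize does not preserve the aspect ratio, the horizontal and vertical scale factors differ, which is why both are tracked separately.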


In [None]:
import cv2
import numpy as np
import onnxruntime as ort
import pytesseract
import torch
import torchvision
from PIL import Image

def get_detections(image_path, size, ort_session):
    # Check if image_path is a string (indicating a file path)
    if isinstance(image_path, str):
        image = Image.open(image_path)
    # Check if image_path is a NumPy array
    elif isinstance(image_path, np.ndarray):
        image = Image.fromarray(image_path)
    else:
        raise ValueError("image_path must be a file path (str) or a NumPy array.")

    scale_x = image.width / size
    scale_y = image.height / size
    resized_image = image.resize((size, size))
    transform = torchvision.transforms.ToTensor()
    input_tensor = transform(resized_image).unsqueeze(0)
    outputs = ort_session.run(None, {'images': input_tensor.numpy()})
    return image, outputs, scale_x, scale_y

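def non_maximum_supression(outputs):
    # Not full NMS: assumes one plate per image and keeps only the top-scoring box
    boxes = outputs[0][0]  # rows: x, y, w, h, confidence; one column per candidate
    confidences = boxes[4]
    max_confidence_index = np.argmax(confidences)
    return boxes[:, max_confidence_index]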

def drawings(image, boxes, scale_x, scale_y):
    x, y, w, h, c = boxes
    x_min, y_min = (x - w / 2) * scale_x, (y - h / 2) * scale_y
    x_max, y_max = (x + w / 2) * scale_x, (y + h / 2) * scale_y
    license_plate_image = image.crop((x_min, y_min, x_max, y_max))
    license_plate_text = pytesseract.image_to_string(license_plate_image)
    print(license_plate_text)
    image = cv2.rectangle(np.array(image), (int(x_min), int(y_min)), (int(x_max), int(y_max)), (0, 0, 255), 3)
    cv2.putText(image, f'License Plate: {license_plate_text}', (int(x_min), int(y_min-10)), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255,255,255), 2)
    cv2.putText(image, f'Confidence: {c:.2f}', (int(x_min), int(y_min+80)), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255,255,255), 2)
    return image

def yolo_predictions(image_path, size, ort_session):
    image, outputs, scale_x, scale_y = get_detections(image_path, size, ort_session)
    boxes = non_maximum_supression(outputs)
    result_img = drawings(image, boxes, scale_x, scale_y)
    return result_img
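
The `non_maximum_supression` helper above keeps only the single highest-scoring candidate, which is enough when each image contains one plate. For images with several plates, a greedy IoU-based suppression would be needed instead. The following is a minimal pure-Python sketch of that technique, not the notebook's method; the function names and both thresholds are illustrative assumptions:

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def greedy_nms(boxes, scores, conf_threshold=0.5, iou_threshold=0.45):
    """Return indices of kept boxes, highest score first."""
    order = sorted((i for i, s in enumerate(scores) if s >= conf_threshold),
                   key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept

# Two overlapping candidates for one plate plus one separate plate
boxes = [(100, 100, 200, 150), (105, 102, 205, 152), (400, 300, 480, 340)]
scores = [0.90, 0.80, 0.75]
print(greedy_nms(boxes, scores))
```

In this example the second box overlaps the first heavily and is dropped, while the third box survives because it covers a different region.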

In [None]:
model_path = "/content/drive/MyDrive/Licence_plate_detection_model/runs/detect/train2/weights/best.onnx"
image_path = "/content/car_plate_test.jpg"
size = 640

ort_session = ort.InferenceSession(model_path)
result_img = yolo_predictions(image_path, size, ort_session)

fig = px.imshow(result_img)
fig.update_layout(width=700, height=400, margin=dict(l=10, r=10, b=10, t=10))
fig.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)

fig.show()


In [None]:
image_path = "/content/IMG_40872.28.29PM_1024x1024@2x.webp"
size = 640

ort_session = ort.InferenceSession(model_path)
result_img = yolo_predictions(image_path, size, ort_session)

fig = px.imshow(result_img)
fig.update_layout(width=700, height=400, margin=dict(l=10, r=10, b=10, t=10))
fig.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)

fig.show()

In [None]:
image_path = "/content/lpr-tesla-license-plate-recognition-1910x1000.jpg"
size = 640

ort_session = ort.InferenceSession(model_path)
result_img = yolo_predictions(image_path, size, ort_session)

fig = px.imshow(result_img)
fig.update_layout(width=700, height=400, margin=dict(l=10, r=10, b=10, t=10))
fig.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)

fig.show()

# Cropped image processing

Our model works well; however, the OCR seems to be struggling to read the text in most of the images. Let's make the job easier for the OCR by applying some common image processing steps to the cropped image ([source](https://stackoverflow.com/questions/9480013/image-processing-to-improve-tesseract-ocr-accuracy)):

*   **Rescale the image**
*   **Convert to greyscale**
*   **Remove the noise**
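
For the greyscale step, channel order matters: crops produced by PIL are in RGB order, so the OpenCV conversion flag should match (`cv2.COLOR_RGB2GRAY` rather than `cv2.COLOR_BGR2GRAY`). The sketch below uses the standard luminance weights (0.299, 0.587, 0.114 for R, G, B) on a single pixel to show how much the result shifts when the order is misread. The helper `to_gray` is illustrative only:

```python
def to_gray(c0, c1, c2, order="RGB"):
    """Weighted luminance of one pixel; `order` says how the three channels are laid out."""
    r, g, b = (c0, c1, c2) if order == "RGB" else (c2, c1, c0)
    return round(0.299 * r + 0.587 * g + 0.114 * b)

pure_red_rgb = (255, 0, 0)  # a red pixel as PIL stores it (RGB order)
gray_correct = to_gray(*pure_red_rgb, order="RGB")  # channels read correctly
gray_swapped = to_gray(*pure_red_rgb, order="BGR")  # same bytes misread as BGR
print(gray_correct, gray_swapped)
```

The effect is largest on strongly colored plates or backgrounds, where red and blue intensities differ the most.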





In [None]:
# Cropped image processing
def ocr_image_process(img):
  # Convert PIL Image to numpy array
  img = np.array(img)
  # Rescaling the image
  img = cv2.resize(img, None, fx=1.2, fy=1.2, interpolation=cv2.INTER_CUBIC)
  # Convert to grayscale (crops come from PIL, so channels are in RGB order)
  img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
  # Remove the noise
  kernel = np.ones((1, 1), np.uint8)
  img = cv2.dilate(img, kernel, iterations=1)
  img = cv2.erode(img, kernel, iterations=1)
  # Blur and binarize; alternative combinations that were tried are kept commented out
  #img = cv2.threshold(cv2.bilateralFilter(img, 5, 75, 75), 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1] # good

  #img = cv2.threshold(cv2.medianBlur(img, 3), 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1] # better

  #img = cv2.adaptiveThreshold(cv2.GaussianBlur(img, (5, 5), 0), 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2) # good

  img = cv2.adaptiveThreshold(cv2.bilateralFilter(img, 9, 75, 75), 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 31, 2) # good

  return img


In [None]:
# Cropped image processing (this definition replaces the one in the previous cell)
def ocr_image_process(img):
  # Convert PIL Image to numpy array
  img = np.array(img)

  # Convert the image to grayscale (crops come from PIL, so channels are in RGB order)
  gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)

  # Apply bilateral filter for noise reduction while preserving character edges
  bfilter = cv2.bilateralFilter(gray, 11, 17, 17)

  # Canny edge detection and contour-based plate cropping were explored here
  # but are disabled; the filtered grayscale crop is returned as-is
  return bfilter

## Visualize cropped image before and after processing


In [None]:
# Run the pipeline again, this time returning the cropped plate before and after
# processing so both can be visualized; this requires small changes to the original functions
import cv2
import numpy as np
import onnxruntime as ort
import pytesseract
import torch
import torchvision
from PIL import Image

def get_detections(image_path, size, ort_session):
    # Check if image_path is a string (indicating a file path)
    if isinstance(image_path, str):
        # Check if the image is a PNG
        if image_path.lower().endswith('.png'):
            # Open the image file
            img = Image.open(image_path)
            # Convert the image to RGB (removes the alpha channel)
            rgb_img = img.convert('RGB')
            # Create a new file name by replacing .png with .jpg
            jpg_image_path = os.path.splitext(image_path)[0] + '.jpg'
            # Save the RGB image as a JPG
            rgb_img.save(jpg_image_path)
            # Update image_path to point to the new JPG image
            image_path = jpg_image_path

        image = Image.open(image_path)
    # Check if image_path is a NumPy array
    elif isinstance(image_path, np.ndarray):
        image = Image.fromarray(image_path)
    else:
        raise ValueError("image_path must be a file path (str) or a NumPy array.")

    scale_x = image.width / size
    scale_y = image.height / size
    resized_image = image.resize((size, size))
    transform = torchvision.transforms.ToTensor()
    input_tensor = transform(resized_image).unsqueeze(0)
    outputs = ort_session.run(None, {'images': input_tensor.numpy()})
    return image, outputs, scale_x, scale_y

def non_maximum_supression(outputs):
    boxes = outputs[0][0]
    confidences = boxes[4]
    max_confidence_index = np.argmax(confidences)
    return boxes[:, max_confidence_index]

def drawings(image, boxes, scale_x, scale_y):
    x, y, w, h, c = boxes
    x_min, y_min = (x - w / 2) * scale_x, (y - h / 2) * scale_y
    x_max, y_max = (x + w / 2) * scale_x, (y + h / 2) * scale_y
    license_plate_image = image.crop((x_min, y_min, x_max, y_max))
    processed_cropped_image = ocr_image_process(license_plate_image)
    license_plate_text = pytesseract.image_to_string(processed_cropped_image)
    print(license_plate_text)
    image = cv2.rectangle(np.array(image), (int(x_min), int(y_min)), (int(x_max), int(y_max)), (0, 0, 255), 3)
    cv2.putText(image, f'License Plate: {license_plate_text}', (int(x_min), int(y_min-10)), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255,255,255), 2)
    cv2.putText(image, f'Confidence: {c:.2f}', (int(x_min), int(y_min+80)), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255,255,255), 2)
    return image, license_plate_image, processed_cropped_image

def yolo_predictions(image_path, size, ort_session):
    image, outputs, scale_x, scale_y = get_detections(image_path, size, ort_session)
    boxes = non_maximum_supression(outputs)
    result_img, license_plate_image, processed_cropped_image = drawings(image, boxes, scale_x, scale_y)
    return result_img, license_plate_image, processed_cropped_image

In [None]:
image_path = "/content/lpr-tesla-license-plate-recognition-1910x1000.jpg"
size = 640

ort_session = ort.InferenceSession(model_path)
result_img, license_plate_image, processed_cropped_image = yolo_predictions(image_path, size, ort_session)

# Plot for result_img
fig1 = px.imshow(result_img)
fig1.update_layout(width=700, height=400, margin=dict(l=10, r=10, b=10, t=10))
fig1.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)
fig1.show()

# Plot for license_plate_image
fig2 = px.imshow(license_plate_image)
fig2.update_layout(width=700, height=400, margin=dict(l=10, r=10, b=10, t=10))
fig2.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)
fig2.show()

# Plot for processed_cropped_image
fig3 = px.imshow(processed_cropped_image)
fig3.update_layout(width=700, height=400, margin=dict(l=10, r=10, b=10, t=10))
fig3.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)
fig3.show()

In [None]:
image_path = "/content/Car+Number+Plate+-+MP+3894.jpg"
size = 640

ort_session = ort.InferenceSession(model_path)
result_img, license_plate_image, processed_cropped_image = yolo_predictions(image_path, size, ort_session)

# Plot for result_img
fig1 = px.imshow(result_img)
fig1.update_layout(width=700, height=400, margin=dict(l=10, r=10, b=10, t=10))
fig1.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)
fig1.show()

# Plot for license_plate_image
fig2 = px.imshow(license_plate_image)
fig2.update_layout(width=700, height=400, margin=dict(l=10, r=10, b=10, t=10))
fig2.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)
fig2.show()

# Plot for processed_cropped_image
fig3 = px.imshow(processed_cropped_image)
fig3.update_layout(width=700, height=400, margin=dict(l=10, r=10, b=10, t=10))
fig3.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)
fig3.show()

In [None]:
image_path = "/content/Front-Number-Plate.webp"
size = 640

ort_session = ort.InferenceSession(model_path)
result_img, license_plate_image, processed_cropped_image = yolo_predictions(image_path, size, ort_session)

# Plot for result_img
fig1 = px.imshow(result_img)
fig1.update_layout(width=700, height=400, margin=dict(l=10, r=10, b=10, t=10))
fig1.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)
fig1.show()

# Plot for license_plate_image
fig2 = px.imshow(license_plate_image)
fig2.update_layout(width=700, height=400, margin=dict(l=10, r=10, b=10, t=10))
fig2.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)
fig2.show()

# Plot for processed_cropped_image
fig3 = px.imshow(processed_cropped_image)
fig3.update_layout(width=700, height=400, margin=dict(l=10, r=10, b=10, t=10))
fig3.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)
fig3.show()

Implement EasyOCR support and provide the option to choose between PyTesseract and EasyOCR for text recognition.

In [None]:
import cv2
import numpy as np
import onnxruntime as ort
import imutils
import torch
import torchvision
from PIL import Image
import os

def get_detections(image_path, size, ort_session):
    """
    Function to get detections from the model.
    """
    # Check if image_path is a string (indicating a file path)
    if isinstance(image_path, str):
        # Check if the image is a PNG
        if image_path.lower().endswith('.png'):
            # Open the image file
            img = Image.open(image_path)
            # Convert the image to RGB (removes the alpha channel)
            rgb_img = img.convert('RGB')
            # Create a new file name by replacing .png with .jpg
            jpg_image_path = os.path.splitext(image_path)[0] + '.jpg'
            # Save the RGB image as a JPG
            rgb_img.save(jpg_image_path)
            # Update image_path to point to the new JPG image
            image_path = jpg_image_path

        image = Image.open(image_path)
    # Check if image_path is a NumPy array
    elif isinstance(image_path, np.ndarray):
        image = Image.fromarray(image_path)
    else:
        raise ValueError("image_path must be a file path (str) or a NumPy array.")

    scale_x = image.width / size
    scale_y = image.height / size
    resized_image = image.resize((size, size))
    transform = torchvision.transforms.ToTensor()
    input_tensor = transform(resized_image).unsqueeze(0)
    outputs = ort_session.run(None, {'images': input_tensor.numpy()})
    return image, outputs, scale_x, scale_y

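def non_maximum_supression(outputs):
    """
    Select the single highest-confidence box from the model output.

    Despite the name, this is not full non-maximum suppression: it assumes
    one plate per image and returns only the top-scoring candidate.
    """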
    boxes = outputs[0][0]
    confidences = boxes[4]
    max_confidence_index = np.argmax(confidences)
    return boxes[:, max_confidence_index]

def drawings(image, boxes, scale_x, scale_y, ocr="pytesseract"):
    """
    Function to draw bounding boxes and apply OCR.
    """
    x, y, w, h, c = boxes
    x_min, y_min = (x - w / 2) * scale_x, (y - h / 2) * scale_y
    x_max, y_max = (x + w / 2) * scale_x, (y + h / 2) * scale_y
    license_plate_image = image.crop((x_min, y_min, x_max, y_max))
    processed_cropped_image = ocr_image_process(license_plate_image)

    if ocr == "ez":
        import easyocr
        # Building the Reader loads the recognition model, so this is slow on every call
        reader = easyocr.Reader(['en'])
        result = reader.readtext(processed_cropped_image)
        # readtext returns an empty list when no text is found
        license_plate_text = result[0][1].upper() if result else ""
    else:
        import pytesseract
        license_plate_text = pytesseract.image_to_string(processed_cropped_image)
    print(license_plate_text)

    image = cv2.rectangle(np.array(image), (int(x_min), int(y_min)), (int(x_max), int(y_max)), (0, 0, 255), 3)
    cv2.putText(image, f'License Plate: {license_plate_text}', (int(x_min), int(y_min-10)), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255,255,255), 2)
    cv2.putText(image, f'Confidence: {c:.2f}', (int(x_min), int(y_min+80)), cv2.FONT_HERSHEY_SIMPLEX, 0.9, (255,255,255), 2)
    return image, license_plate_image, processed_cropped_image

def yolo_predictions(image_path, size, ort_session, ocr="pytesseract"):
    """
    Function to get YOLO predictions.
    """
    image, outputs, scale_x, scale_y = get_detections(image_path, size, ort_session)
    boxes = non_maximum_supression(outputs)
    result_img, license_plate_image, processed_cropped_image = drawings(image, boxes, scale_x, scale_y, ocr)
    return result_img, license_plate_image, processed_cropped_image
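
The raw strings returned by either OCR engine may contain whitespace, line breaks, or stray punctuation. A minimal sketch of normalizing the text before it is displayed or compared, assuming plates consist only of Latin letters and digits (which does not hold in every region); `clean_plate_text` is a hypothetical helper, not part of the functions above:

```python
import re

def clean_plate_text(raw_text):
    """Uppercase the OCR output and drop everything except A-Z and 0-9."""
    return re.sub(r"[^A-Z0-9]", "", raw_text.upper())

cleaned = clean_plate_text(" ab-12 cd\n\x0c")
print(cleaned)
```

Applying the same normalization to both engines' output makes their results directly comparable.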


In [None]:
# Test the EasyOCR
image_path = "/content/lpr-tesla-license-plate-recognition-1910x1000.jpg"
size = 640

ort_session = ort.InferenceSession(model_path)
result_img, license_plate_image, processed_cropped_image = yolo_predictions(image_path, size, ort_session, "ez")

# Plot for result_img
fig1 = px.imshow(result_img)
fig1.update_layout(width=700, height=400, margin=dict(l=10, r=10, b=10, t=10))
fig1.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)
fig1.show()

# Plot for license_plate_image
fig2 = px.imshow(license_plate_image)
fig2.update_layout(width=700, height=400, margin=dict(l=10, r=10, b=10, t=10))
fig2.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)
fig2.show()

# Plot for processed_cropped_image
fig3 = px.imshow(processed_cropped_image)
fig3.update_layout(width=700, height=400, margin=dict(l=10, r=10, b=10, t=10))
fig3.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)
fig3.show()

## Process and visualize predictions for multiple images from a directory

In [None]:
from google.colab.patches import cv2_imshow
import os

# Specify the directory path
dir_path = "/content/"

size=640

# Get a list of all files in the directory
image_files = [f for f in os.listdir(dir_path) if os.path.isfile(os.path.join(dir_path, f))]

# Create the ONNX Runtime session once and reuse it for every image
ort_session = ort.InferenceSession(model_path)

# Loop through all the image files
for image_file in image_files:
    # Construct the full image path
    image_path = os.path.join(dir_path, image_file)

    # Run detection and OCR with pytesseract (the default)
    result_img, license_plate_image, processed_cropped_image = yolo_predictions(image_path, size, ort_session)

    license_plate_image_resized = cv2.resize(np.array(license_plate_image), (result_img.shape[1], result_img.shape[0]//2))
    processed_cropped_image_resized = cv2.resize(np.array(processed_cropped_image), (result_img.shape[1], result_img.shape[0]//2))

    processed_cropped_image_resized_rgb = cv2.cvtColor(processed_cropped_image_resized, cv2.COLOR_GRAY2RGB)
    right_image = np.vstack((license_plate_image_resized, processed_cropped_image_resized_rgb))

    if right_image.shape[0] > result_img.shape[0]:
        result_img = cv2.resize(result_img, (result_img.shape[1], right_image.shape[0]))
    else:
        right_image = cv2.resize(right_image, (right_image.shape[1], result_img.shape[0]))

    combined_image = np.hstack((result_img, right_image))

    fig = px.imshow(combined_image)
    fig.update_layout(width=700, height=400)
    fig.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)
    fig.show()
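
The loop above passes every regular file in the directory to the detector, so a stray non-image file would likely make it fail. A minimal sketch of an extension filter follows. The helper name `list_image_files` and the set of accepted extensions are assumptions made for illustration, not part of the notebook.

```python
import os

# Assumed set of image formats; adjust to whatever the pipeline accepts
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png")

def list_image_files(dir_path, extensions=IMAGE_EXTENSIONS):
    """Return the sorted full paths of files in dir_path with an image extension."""
    paths = []
    for name in sorted(os.listdir(dir_path)):
        full_path = os.path.join(dir_path, name)
        # Skip sub-directories and files whose extension is not in the list
        if os.path.isfile(full_path) and name.lower().endswith(extensions):
            paths.append(full_path)
    return paths
```

The returned paths can be iterated directly, which removes the need to join `dir_path` and the file name inside the loop.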




### Test pytesseract on multiple images

In [None]:
# Create the ONNX Runtime session once and reuse it for every image
ort_session = ort.InferenceSession(model_path)

# Specify the directory path
dir_path = "/content/"

# Get a list of all files in the directory
image_files = [f for f in os.listdir(dir_path) if os.path.isfile(os.path.join(dir_path, f))]

# Loop through all the image files
for image_file in image_files:
    # Construct the full image path
    image_path = os.path.join(dir_path, image_file)

    # Run detection and OCR with pytesseract (the default)
    result_img, license_plate_image, processed_cropped_image = yolo_predictions(image_path, size, ort_session)

    license_plate_image_resized = cv2.resize(np.array(license_plate_image), (result_img.shape[1], result_img.shape[0]//2))
    processed_cropped_image_resized = cv2.resize(np.array(processed_cropped_image), (result_img.shape[1], result_img.shape[0]//2))

    processed_cropped_image_resized_rgb = cv2.cvtColor(processed_cropped_image_resized, cv2.COLOR_GRAY2RGB)
    right_image = np.vstack((license_plate_image_resized, processed_cropped_image_resized_rgb))

    if right_image.shape[0] > result_img.shape[0]:
        result_img = cv2.resize(result_img, (result_img.shape[1], right_image.shape[0]))
    else:
        right_image = cv2.resize(right_image, (right_image.shape[1], result_img.shape[0]))

    combined_image = np.hstack((result_img, right_image))

    fig = px.imshow(combined_image)
    fig.update_layout(width=700, height=400)
    fig.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)
    fig.show()


### Test EasyOCR on multiple images

In [None]:
size = 640

# Create the ONNX Runtime session once and reuse it for every image
ort_session = ort.InferenceSession(model_path)

# Specify the directory path
dir_path = "/content/"

# Get a list of all files in the directory
image_files = [f for f in os.listdir(dir_path) if os.path.isfile(os.path.join(dir_path, f))]

# Loop through all the image files
for image_file in image_files:
    # Construct the full image path
    image_path = os.path.join(dir_path, image_file)

    # Run detection and OCR with EasyOCR
    result_img, license_plate_image, processed_cropped_image = yolo_predictions(image_path, size, ort_session, "ez")

    license_plate_image_resized = cv2.resize(np.array(license_plate_image), (result_img.shape[1], result_img.shape[0]//2))
    processed_cropped_image_resized = cv2.resize(np.array(processed_cropped_image), (result_img.shape[1], result_img.shape[0]//2))

    processed_cropped_image_resized_rgb = cv2.cvtColor(processed_cropped_image_resized, cv2.COLOR_GRAY2RGB)
    right_image = np.vstack((license_plate_image_resized, processed_cropped_image_resized_rgb))

    if right_image.shape[0] > result_img.shape[0]:
        result_img = cv2.resize(result_img, (result_img.shape[1], right_image.shape[0]))
    else:
        right_image = cv2.resize(right_image, (right_image.shape[1], result_img.shape[0]))

    combined_image = np.hstack((result_img, right_image))

    fig = px.imshow(combined_image)
    fig.update_layout(width=700, height=400)
    fig.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)
    fig.show()

## Adaptive thresholding

In [None]:
from google.colab.patches import cv2_imshow
image_path = "/content/test_plate.png"

# If the image is a PNG, convert it to an RGB JPG (removes the alpha channel)
if image_path.lower().endswith('.png'):
    img = Image.open(image_path)
    rgb_img = img.convert('RGB')
    # Create a new file name by replacing .png with .jpg
    jpg_image_path = os.path.splitext(image_path)[0] + '.jpg'
    rgb_img.save(jpg_image_path)
    # Update image_path to point to the new JPG image
    image_path = jpg_image_path

# Load the image as a NumPy array (PIL uses RGB channel order)
image = np.array(Image.open(image_path))

# Optional rescaling of the image
#image = cv2.resize(image, None, fx=1.2, fy=1.2, interpolation=cv2.INTER_CUBIC)

# Copy the image and convert it to grayscale
newImage = image.copy()
gray = cv2.cvtColor(newImage, cv2.COLOR_RGB2GRAY)

# Apply adaptive thresholding (11x11 Gaussian-weighted neighborhood, constant C=25)
thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 25)

# Perform a morphological opening to remove small white specks
kernel = np.ones((5, 5), np.uint8)
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)

# Apply distance transform
#dist_transform = cv2.distanceTransform(opening,cv2.DIST_L2,5)
#ret, sure_fg = cv2.threshold(dist_transform,0.7*dist_transform.max(),255,0)

# Display the result (a bare `return` is not valid outside a function)
cv2_imshow(opening)



In [None]:
# Cropped image processing
def ocr_image_process(img):
  # Convert PIL Image to numpy array (PIL uses RGB channel order)
  img = np.array(img)

  # Optional rescaling of the image
  #img = cv2.resize(img, None, fx=1.2, fy=1.2, interpolation=cv2.INTER_CUBIC)

  # Convert to grayscale
  gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)

  # Apply adaptive thresholding (11x11 Gaussian-weighted neighborhood, constant C=25)
  thresh = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 11, 25)

  # Morphological opening is computed for comparison only;
  # the thresholded image is what the function returns
  kernel = np.ones((5, 5), np.uint8)
  opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel)

  return thresh
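
To make the `11` and `25` arguments of `cv2.adaptiveThreshold` more concrete, here is a rough NumPy sketch of the idea: each pixel is compared with the mean of its `block_size` x `block_size` neighborhood minus a constant `c`. This is an illustration only. It uses a plain (unweighted) mean, which corresponds to OpenCV's `ADAPTIVE_THRESH_MEAN_C` rather than the Gaussian-weighted variant used in the notebook, and it is not guaranteed to match OpenCV's output bit for bit. The function name `adaptive_mean_threshold` is hypothetical.

```python
import numpy as np

def adaptive_mean_threshold(gray, block_size=11, c=25, max_value=255):
    """Binarize a 2-D uint8 image against each pixel's local mean minus c.

    block_size is assumed to be odd, as OpenCV requires.
    """
    pad = block_size // 2
    # Replicate border pixels so every pixel has a full neighborhood
    padded = np.pad(gray.astype(np.float64), pad, mode="edge")
    # Integral image with a leading row and column of zeros
    integral = np.zeros((padded.shape[0] + 1, padded.shape[1] + 1))
    integral[1:, 1:] = padded.cumsum(axis=0).cumsum(axis=1)
    h, w = gray.shape
    k = block_size
    # Sum over each k x k window from four corners of the integral image
    window_sum = (integral[k:k + h, k:k + w] - integral[:h, k:k + w]
                  - integral[k:k + h, :w] + integral[:h, :w])
    local_mean = window_sum / (k * k)
    # A pixel becomes white when it is brighter than its local threshold
    return np.where(gray > local_mean - c, max_value, 0).astype(np.uint8)
```

Because the threshold follows the local brightness, dark characters on a plate stay dark even when one side of the plate is in shadow, which a single global threshold would not handle well.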

In [None]:
size = 640
model_path = "/content/drive/MyDrive/Licence_plate_detection_model/runs/detect/train2/weights/best.onnx"

# Create the ONNX Runtime session once and reuse it for every image
ort_session = ort.InferenceSession(model_path)

# Specify the directory path
dir_path = "/content/"

# Get a list of all files in the directory
image_files = [f for f in os.listdir(dir_path) if os.path.isfile(os.path.join(dir_path, f))]

# Loop through all the image files
for image_file in image_files:
    # Construct the full image path
    image_path = os.path.join(dir_path, image_file)

    # Run detection and OCR with pytesseract, using the updated ocr_image_process
    result_img, license_plate_image, processed_cropped_image = yolo_predictions(image_path, size, ort_session)

    license_plate_image_resized = cv2.resize(np.array(license_plate_image), (result_img.shape[1], result_img.shape[0]//2))
    processed_cropped_image_resized = cv2.resize(np.array(processed_cropped_image), (result_img.shape[1], result_img.shape[0]//2))

    processed_cropped_image_resized_rgb = cv2.cvtColor(processed_cropped_image_resized, cv2.COLOR_GRAY2RGB)
    right_image = np.vstack((license_plate_image_resized, processed_cropped_image_resized_rgb))

    if right_image.shape[0] > result_img.shape[0]:
        result_img = cv2.resize(result_img, (result_img.shape[1], right_image.shape[0]))
    else:
        right_image = cv2.resize(right_image, (right_image.shape[1], result_img.shape[0]))

    combined_image = np.hstack((result_img, right_image))

    fig = px.imshow(combined_image)
    fig.update_layout(width=700, height=400)
    fig.update_xaxes(showticklabels=False).update_yaxes(showticklabels=False)
    fig.show()