<p id="part0"></p>
<p style="font-family: Arials; line-height: 2; font-size: 24px; font-weight: bold; letter-spacing: 2px; text-align: center; color:rgb(255, 255, 255)">AUTOMATIC LICENSE NUMBER PLATE DETECTION AND RECOGNITION</p>
<img src= "https://github.com/Asikpalysik/Automatic-License-Plate-Detection/blob/main/Presentation/Notebook1.png?raw=true" width="40%" align="center"  hspace="5%" vspace="5%"/>



<p id="part2"></p>

# <span style="font-family: Arials; font-size: 20px; font-style: normal; font-weight: bold; letter-spacing: 3px; text-align: center; color:rgb(255, 255, 255); line-height:1.0">1. INTRODUCTION</span>
<hr style="height: 0.5px; border: 0; background-color:rgb(255, 255, 255)">


In today’s world, ensuring security is of paramount importance, and automating these processes can lead to significant improvements. This project demonstrates an approach to automatic license plate detection and recognition using Python. The system leverages OpenCV for detecting license plates in vehicle images and utilizes PyTesseract for extracting the alphanumeric characters from the plates. Experimental results indicate that our approach can achieve an accuracy of approximately 90–95% even on low-resolution images.


<p id="part4"></p>

# <span style="font-family: Arials; font-size: 16px; font-style: normal; font-weight: bold; letter-spacing: 3px; text-align: center; color:rgb(255, 255, 255); line-height:1.0">1.2 PROJECT ARCHITECTURE</span>

<img src= "https://github.com/Asikpalysik/Automatic-License-Plate-Detection/blob/main/Presentation/Notebook3.png?raw=true" width="40%" align="center"  hspace="5%" vspace="5%"/>

The project architecture is composed of several modules:
<ul>
  <li><strong>Data Collection:</strong> Gather vehicle images with visible license plates.</li>
  <li><strong>Data Preprocessing:</strong> Process and annotate the images to mark the license plate regions.</li>
  <li><strong>License Plate Detection:</strong> Use OpenCV-based image processing techniques to accurately detect and crop the license plate (Region of Interest).</li>
  <li><strong>OCR:</strong> Apply PyTesseract to extract the textual information from the detected license plates.</li>
  <li><strong>Pipeline Integration:</strong> Consolidate all the modules into a streamlined recognition workflow.</li>
</ul>

<img src= "https://github.com/Asikpalysik/Automatic-License-Plate-Detection/blob/main/Presentation/Notebook4.png?raw=true" width="40%" align="center"  hspace="5%" vspace="5%"/>


<p id="part6"></p>

# <span style="font-family: Arials; font-size: 16px; font-style: normal; font-weight: bold; letter-spacing: 3px; text-align: center; color:rgb(255, 255, 255); line-height:1.0">2.1 UNDERSTAND & COLLECT REQUIRED DATA</span>

For this project, a labeled dataset was used to train and evaluate the license plate recognition system. The dataset consists of numerous vehicle images where the license plate is clearly visible. Sample images from the dataset are shown below.

<img src= "https://github.com/Asikpalysik/Automatic-License-Plate-Detection/blob/main/Presentation/Notebook5.png?raw=true" width="40%" align="center"  hspace="5%" vspace="5%"/>



<p id="part8"></p>

# <span style="font-family: Arials; font-size: 16px; font-style: normal; font-weight: bold; letter-spacing: 3px; text-align: center; color:rgb(255, 255, 255); line-height:1.0">2.3 PARSING INFORMATION FROM XML</span>

An example of one of our .xml annotation files is shown below.
```
<annotation>
    <folder>images</folder>
    <filename>N1.jpeg</filename>
    <path>/Users/asik/Desktop/ANPR/imagesN1.jpeg</path>
    <source>
        <database>Unknown</database>
    </source>
    <size>
        <width>1920</width>
        <height>1080</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>number_plate</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>1093</xmin>
            <ymin>645</ymin>
            <xmax>1396</xmax>
            <ymax>727</ymax>
        </bndbox>
    </object>
</annotation>
```
The labeled dataset includes XML files that contain annotations for each image. These XML files provide crucial details such as image metadata and the bounding box coordinates (xmin, ymin, xmax, ymax) that define the location of the license plate, as in the example annotation shown above.
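
As a minimal sketch of how these fields can be read, the snippet below parses a trimmed copy of the annotation with `xml.etree.ElementTree`. The helper name `parse_annotation` and the inline `SAMPLE_XML` string are illustrative assumptions and not part of the project code.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the example annotation (illustrative only).
SAMPLE_XML = """
<annotation>
    <filename>N1.jpeg</filename>
    <size><width>1920</width><height>1080</height><depth>3</depth></size>
    <object>
        <name>number_plate</name>
        <bndbox>
            <xmin>1093</xmin><ymin>645</ymin>
            <xmax>1396</xmax><ymax>727</ymax>
        </bndbox>
    </object>
</annotation>
"""

def parse_annotation(xml_text):
    """Return the image filename and the first bounding box as integers."""
    root = ET.fromstring(xml_text)
    box = root.find("object").find("bndbox")
    coords = {tag: int(box.find(tag).text) for tag in ("xmin", "xmax", "ymin", "ymax")}
    return root.find("filename").text, coords

filename, coords = parse_annotation(SAMPLE_XML)
print(filename, coords)
```

Only the first `<object>` element is read here, which matches the one-plate-per-image layout of the example; files with several plates would need `root.findall("object")` instead.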


<p id="part9"></p>

# <span style="font-family: Arials; font-size: 16px; font-style: normal; font-weight: bold; letter-spacing: 3px; text-align: center; color:rgb(255, 255, 255); line-height:1.0">2.4 PARSING DATA FROM XML AND CONVERTING IT INTO CSV</span>


<p> To prepare the data for further processing, it is necessary to convert the XML annotations into a more convenient format such as CSV. The XML files contain the bounding box details (xmin, ymin, xmax, ymax) for the license plates. By using Python’s <code>xml.etree.ElementTree</code> library along with <code>pandas</code> and <code>glob</code>, the useful annotation details are extracted and saved to a CSV file. This CSV can then be loaded with pandas and converted into an array for subsequent processing. </p>

In [None]:
#importing necessary libraries 

import os
import cv2
import numpy as np
import pandas as pd
import tensorflow as tf
import pytesseract as pt
import plotly.express as px
import matplotlib.pyplot as plt
import xml.etree.ElementTree as xet

from glob import glob
from skimage import io
from shutil import copy
from tensorflow.keras.models import Model
from tensorflow.keras.callbacks import TensorBoard
from sklearn.model_selection import train_test_split
from tensorflow.keras.applications import InceptionResNetV2
from tensorflow.keras.layers import Dense, Dropout, Flatten, Input
from tensorflow.keras.preprocessing.image import load_img, img_to_array

In [2]:
# extracting the xmin/xmax/ymin/ymax values of the bounding boxes from the XML files
path = glob("../car-plate-/images/*.xml") # '../code/car-plate/images/*.xml'
labels_dict = dict(filepath=[],xmin=[],xmax=[],ymin=[],ymax=[])
for filename in path:

    info = xet.parse(filename)
    root = info.getroot()
    member_object = root.find('object')
    labels_info = member_object.find('bndbox')
    xmin = int(labels_info.find('xmin').text)
    xmax = int(labels_info.find('xmax').text)
    ymin = int(labels_info.find('ymin').text)
    ymax = int(labels_info.find('ymax').text)

    labels_dict['filepath'].append(filename)
    labels_dict['xmin'].append(xmin)
    labels_dict['xmax'].append(xmax)
    labels_dict['ymin'].append(ymin)
    labels_dict['ymax'].append(ymax)

In [3]:
df = pd.DataFrame(labels_dict)
df.to_csv('labels.csv',index=False)
df

| | filepath | xmin | xmax | ymin | ymax |
|---|---|---|---|---|---|
| 0 | `../car-plate-/images\N1.xml` | 1093 | 1396 | 645 | 727 |
| 1 | `../car-plate-/images\N100.xml` | 134 | 301 | 312 | 350 |
| 2 | `../car-plate-/images\N101.xml` | 31 | 139 | 128 | 161 |
| 3 | `../car-plate-/images\N102.xml` | 164 | 316 | 216 | 243 |
| 4 | `../car-plate-/images\N103.xml` | 813 | 1067 | 665 | 724 |
| ... | ... | ... | ... | ... | ... |
| 220 | `../car-plate-/images\N95.xml` | 23 | 408 | 173 | 391 |
| 221 | `../car-plate-/images\N96.xml` | 137 | 352 | 141 | 186 |
| 222 | `../car-plate-/images\N97.xml` | 175 | 290 | 228 | 255 |
| 223 | `../car-plate-/images\N98.xml` | 563 | 675 | 207 | 238 |


With the above code, we extract the diagonal corner coordinates of each bounding box and convert the data from an unstructured to a structured format. You can have a look at the data above.
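
To illustrate the CSV-to-array step mentioned earlier, here is a small sketch that reads label data with pandas and keeps only the coordinate columns. It uses two rows from the output above as inline CSV text, so it runs without `labels.csv` on disk; the names `csv_text`, `df_labels` and `boxes` are assumptions made for this example.

```python
import io
import pandas as pd

# Two rows copied from the output above, written as CSV text so the
# example runs without the dataset on disk (illustrative only).
csv_text = """filepath,xmin,xmax,ymin,ymax
N1.xml,1093,1396,645,727
N100.xml,134,301,312,350
"""

df_labels = pd.read_csv(io.StringIO(csv_text))

# Keep only the four coordinate columns and convert them to a NumPy array.
boxes = df_labels[["xmin", "xmax", "ymin", "ymax"]].to_numpy()
print(boxes.shape)
```

With the real file, `pd.read_csv('labels.csv')` would take the place of the in-memory buffer.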

For the next step, let's find the image file that corresponds to each set of bounding box coordinates.

In [4]:
filename = df['filepath'][0]

def getFilename(filename):
    filename_image = xet.parse(filename).getroot().find('filename').text
    filepath_image = os.path.join('../car-plate-/images',filename_image)
    return filepath_image
getFilename(filename)

'../car-plate-/images\\N1.jpeg'

In [5]:
image_path = list(df['filepath'].apply(getFilename))
image_path[:10] # quick check of the first 10 paths

['../car-plate-/images\\N1.jpeg',
 '../car-plate-/images\\N100.jpeg',
 '../car-plate-/images\\N101.jpeg',
 '../car-plate-/images\\N102.jpeg',
 '../car-plate-/images\\N103.jpeg',
 '../car-plate-/images\\N104.jpeg',
 '../car-plate-/images\\N105.jpeg',
 '../car-plate-/images\\N106.jpeg',
 '../car-plate-/images\\N107.jpeg',
 '../car-plate-/images\\N108.jpeg']

<p id="part10"></p>

# <span style="font-family: Arials; font-size: 16px; font-style: normal; font-weight: bold; letter-spacing: 3px; text-align: center; color:rgb(255, 255, 255); line-height:1.0">2.5 VERIFY THE DATA</span>

Because the annotations were created manually, it is important to verify that the information we extracted is valid. A simple check is to draw the bounding box on its image and confirm that it encloses the license plate. Here I use the second entry of `image_path`, N100.jpeg, whose corner coordinates (xmin=134, xmax=301, ymin=312, ymax=350) can be found in row 1 of `df`. You can see the result in *Figure 8*.

In [6]:
file_path = image_path[1] # path of our image N100.jpeg
img = io.imread(file_path) # read the image as an RGB array
fig = px.imshow(img)
fig.update_layout(width=600, height=500, margin=dict(l=10, r=10, b=10, t=10), xaxis_title='Figure 8 - N100.jpeg with bounding box')
# bounding box of N100.jpeg from df: xmin=134, xmax=301, ymin=312, ymax=350
fig.add_shape(type='rect', x0=134, x1=301, y0=312, y1=350, xref='x', yref='y', line_color='cyan')
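
Besides the visual check, a quick numeric sanity check can catch obviously broken annotations. The sketch below is an assumption about what such a check could look like; `is_valid_box` is a hypothetical helper, not part of the notebook.

```python
def is_valid_box(xmin, xmax, ymin, ymax, width, height):
    """Check that a box has positive area and lies inside the image."""
    return 0 <= xmin < xmax <= width and 0 <= ymin < ymax <= height

# Box and image size of N1.jpeg, taken from the XML example.
print(is_valid_box(1093, 1396, 645, 727, width=1920, height=1080))
# A box with swapped x coordinates should be rejected.
print(is_valid_box(1396, 1093, 645, 727, width=1920, height=1080))
```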

### Alright so at this point we will be training the model, please go check out the second notebook "model-train.ipynb" (the name kinda spoils it haha)



<p id="part18"></p>

# <span style="font-family: Arials; font-size: 20px; font-style: normal; font-weight: bold; letter-spacing: 3px; text-align: center; color:rgb(255, 255, 255); line-height:1.0">5. PIPELINE OBJECT DETECTION MODEL</span>

<p id="part19"></p>

# <span style="font-family: Arials; font-size: 16px; font-style: normal; font-weight: bold; letter-spacing: 3px; text-align: center; color:rgb(255, 255, 255); line-height:1.0">5.1 MAKE PREDICTIONS</span>

At this point, we have the pictures, the bounding boxes, and a trained model that predicts where the bounding box goes on a new picture.

This is the final step in object detection. Here we put it all together and get the prediction for a given image. First, I would like to try it on one of my test pictures of a car. Let's load our model.

In [7]:

# Load model
model = tf.keras.models.load_model('./object_detection.keras')
print('Model loaded successfully')


Model loaded successfully


Next, we load our test picture using the correct path to it. I added some more images for this purpose only, in the folder __TEST__.

In [8]:
path = '../car-plate-/images/N1.jpeg'  
image = load_img(path) # PIL object
image = np.array(image,dtype=np.uint8) # 8 bit array (0,255)     
image1 = load_img(path,target_size=(224,224))
image_arr_224 = img_to_array(image1)/255.0  # Convert into array and get the normalized output

# Size of the original image
h,w,d = image.shape
print('Height of the image =',h)
print('Width of the image =',w)


Height of the image = 1080
Width of the image = 1920


Now we can have a look at our image *Figure 13*

In [9]:
fig = px.imshow(image)
fig.update_layout(width=700, height=500,  margin=dict(l=10, r=10, b=10, t=10), xaxis_title='Figure 13 - TEST Image')

So, let's look into the shape of my image.

In [10]:
image_arr_224.shape

(224, 224, 3)

To pass this image to the model, however, we need to add a fourth dimension at the front: the batch dimension, which indicates the number of images. Here we pass a single image, so its size is 1.

In [11]:
test_arr = image_arr_224.reshape(1,224,224,3)
test_arr.shape

(1, 224, 224, 3)
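
As a side note, the same batch axis can be added without hard-coding the image size. The sketch below uses a zero-filled placeholder array instead of a real image and compares `reshape` with `np.expand_dims`; the variable names are assumptions for illustration.

```python
import numpy as np

# Stand-in for a preprocessed 224x224 RGB image (values are placeholders).
image_arr = np.zeros((224, 224, 3), dtype=np.float32)

# Both calls add a leading batch axis of size 1.
batch_a = image_arr.reshape(1, 224, 224, 3)
batch_b = np.expand_dims(image_arr, axis=0)

print(batch_a.shape, batch_b.shape)
```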

<p id="part20"></p>

# <span style="font-family: Arials; font-size: 16px; font-style: normal; font-weight: bold; letter-spacing: 3px; text-align: center; color:rgb(255, 255, 255); line-height:1.0">5.2 DE-NORMALIZE THE OUTPUT</span>

In [12]:
# Make predictions
coords = model.predict(test_arr)
coords

1/1 ━━━━━━━━━━━━━━━━━━━━ 10s 10s/step


array([[0.58089465, 0.7253995 , 0.5982538 , 0.67948294]], dtype=float32)

The model returns normalized values in the order (xmin, xmax, ymin, ymax), each between 0 and 1. During training, the original pixel coordinates were normalized by dividing the x values by the image width and the y values by the image height. To get back to pixel coordinates, we reverse that step: multiply the x values by the width and the y values by the height of the original image.

In [13]:
# Denormalize the values
denorm = np.array([w,w,h,h])
coords = coords * denorm
coords

array([[1115.31772614, 1392.76702881,  646.11408949,  733.84157181]])
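
The same step can be wrapped in a small function. This is a sketch under the assumption that the model output is ordered (xmin, xmax, ymin, ymax), as in the cells above; `denormalize` is a hypothetical helper name.

```python
import numpy as np

def denormalize(coords, width, height):
    """Scale normalized (xmin, xmax, ymin, ymax) values back to pixels."""
    scale = np.array([width, width, height, height], dtype=np.float64)
    # Broadcasting multiplies every row of coords by the four scale factors.
    return np.asarray(coords, dtype=np.float64) * scale

# Model output and image size from the cells above.
pred = np.array([[0.58089465, 0.7253995, 0.5982538, 0.67948294]])
pixels = denormalize(pred, width=1920, height=1080)
print(pixels.astype(np.int32))
```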

<p id="part21"></p>

# <span style="font-family: Arials; font-size: 16px; font-style: normal; font-weight: bold; letter-spacing: 3px; text-align: center; color:rgb(255, 255, 255); line-height:1.0">5.3 BOUNDING BOX</span>

Now we will draw the bounding box on top of the image. `cv2.rectangle` only needs the two diagonal corner points, (xmin, ymin) and (xmax, ymax), so let's build them from the predicted coordinates and draw the rectangle.

In [14]:
coords = coords.astype(np.int32)
coords

array([[1115, 1392,  646,  733]], dtype=int32)

In [15]:
# Draw the bounding box on top of the image
xmin, xmax, ymin, ymax = coords[0]
pt1 = (xmin, ymin)
pt2 = (xmax, ymax)
print(pt1, pt2)

(np.int32(1115), np.int32(646)) (np.int32(1392), np.int32(733))


In [16]:
cv2.rectangle(image,pt1,pt2,(0,255,0),3)
fig = px.imshow(image)
fig.update_layout(width=700, height=500, margin=dict(l=10, r=10, b=10, t=10))
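
To put a number on how close the prediction is to the annotation, one common measure is intersection over union (IoU). The notebook itself does not compute it, so treat the sketch below as an optional addition; `iou` is a hypothetical helper, and the two boxes are the predicted and annotated values for N1.jpeg shown earlier.

```python
def iou(box_a, box_b):
    """Intersection over union of two (xmin, xmax, ymin, ymax) boxes."""
    ax0, ax1, ay0, ay1 = box_a
    bx0, bx1, by0, by1 = box_b
    # Width and height of the overlap, clipped at zero when boxes are disjoint.
    inter_w = max(0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0

predicted = (1115, 1392, 646, 733)   # model output for N1.jpeg
annotated = (1093, 1396, 645, 727)   # ground truth from N1.xml
print(round(iou(predicted, annotated), 3))
```

A value of 1.0 means the boxes coincide exactly and 0.0 means they do not overlap at all.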

<p id="part22"></p>

# <span style="font-family: Arials; font-size: 16px; font-style: normal; font-weight: bold; letter-spacing: 3px; text-align: center; color:rgb(255, 255, 255); line-height:1.0">5.4 CREATE PIPELINE</span>

To create the pipeline, let's put everything together in a single function that lets us visualize the result and returns the image along with the coordinates of its bounding box.

In [17]:
# Create pipeline
path = '../car-plate-/images/N189.jpeg'
def object_detection(path):
    
    # Read image
    image = load_img(path) # PIL object
    image = np.array(image,dtype=np.uint8) # 8 bit array (0,255)
    image1 = load_img(path,target_size=(224,224))
    
    # Data preprocessing
    image_arr_224 = img_to_array(image1)/255.0 # Convert to array & normalized
    h,w,d = image.shape
    test_arr = image_arr_224.reshape(1,224,224,3)
    
    # Make predictions
    coords = model.predict(test_arr)
    
    # Denormalize the values
    denorm = np.array([w,w,h,h])
    coords = coords * denorm
    coords = coords.astype(np.int32)
    
    # Draw the bounding box on top of the image
    xmin, xmax,ymin,ymax = coords[0]
    pt1 =(xmin,ymin)
    pt2 =(xmax,ymax)
    print(pt1, pt2)
    cv2.rectangle(image,pt1,pt2,(0,255,0),3)
    return image, coords

image, cods = object_detection(path)

fig = px.imshow(image)
fig.update_layout(width=700, height=500, margin=dict(l=10, r=10, b=10, t=10),xaxis_title='Figure 14')

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 191ms/step
(np.int32(177), np.int32(120)) (np.int32(381), np.int32(184))


<p id="part23"></p>

# <span style="font-family: Arials; font-size: 20px; font-style: normal; font-weight: bold; letter-spacing: 3px; text-align: center; color:rgb(255, 255, 255); line-height:1.0">6. OPTICAL CHARACTER RECOGNITION - OCR</span>
<hr style="height: 0.5px; border: 0; background-color: #000000">

<p id="part24"></p>

# <span style="font-family: Arials; font-size: 16px; font-style: normal; font-weight: bold; letter-spacing: 3px; text-align: center; color:rgb(255, 255, 255); line-height:1.0">6.1 TESSERACT OCR</span>

Optical character recognition (OCR) is software that extracts text from images. Tesseract is an open-source OCR engine, and the pytesseract package provides a Python API for it. First, we need to install it. Installation is fairly simple and depends on your OS. You can find the manual and the files to download [here](https://guides.library.illinois.edu/c.php?g=347520&p=4121425).

<p id="part25"></p>

# <span style="font-family: Arials; font-size: 16px; font-style: normal; font-weight: bold; letter-spacing: 3px; text-align: center; color:rgb(255, 255, 255); line-height:1.0">6.2 LIMITATIONS OF PYTESSERACT</span>

Tesseract works best when the foreground text is cleanly segmented from the background. In practice, it can be extremely challenging to guarantee this kind of setup. There are many reasons you might not get good output from Tesseract, for example when the image has a noisy background. The better the image quality (size, contrast, lighting), the better the recognition result. A bit of preprocessing usually improves the OCR results: images need to be scaled appropriately, have as much contrast as possible, and the text must be horizontally aligned. Tesseract OCR is quite powerful but does have the following limitations.

__Tesseract limitations, summarized in a list:__
<ul>
  <li>The OCR is not as accurate as some commercial solutions available to us.</li>
  <li>Doesn't do well with images affected by artifacts including partial occlusion, distorted perspective, and complex background.</li>
  <li>It is not capable of recognizing handwriting.</li>
  <li>It may find gibberish and report this as OCR output.</li>
  <li>If a document contains languages outside of those given in the -l LANG arguments, results may be poor.</li>  
  <li>It is not always good at analyzing the natural reading order of documents. For example, it may fail to recognize that a document contains two columns, and may try to join text across columns.</li>
  <li>Poor quality scans may produce poor quality OCR.</li>
  <li>It does not expose information about what font family text belongs to.</li>
</ul>
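
The preprocessing mentioned above (scaling and contrast) can be sketched as follows. This is a deliberately simplified NumPy-only stand-in, assuming an RGB crop as input; in practice, OpenCV functions such as `cv2.cvtColor`, `cv2.resize` and `cv2.threshold` would be the more usual choice. The helper name `preprocess_for_ocr` and the synthetic input are assumptions for illustration.

```python
import numpy as np

def preprocess_for_ocr(rgb, scale=2):
    """Grayscale, upscale by an integer factor, and binarize an RGB crop."""
    # Luminance-weighted grayscale conversion.
    weights = np.array([0.299, 0.587, 0.114], dtype=np.float32)
    gray = rgb[..., :3].astype(np.float32) @ weights
    # Nearest-neighbour upscaling by repeating rows and columns.
    gray = np.repeat(np.repeat(gray, scale, axis=0), scale, axis=1)
    # Simple global threshold at the mean intensity.
    return np.where(gray > gray.mean(), 255, 0).astype(np.uint8)

# Synthetic 4x6 crop: dark left half, bright right half.
crop = np.zeros((4, 6, 3), dtype=np.uint8)
crop[:, 3:] = 255
out = preprocess_for_ocr(crop)
print(out.shape)
```

A mean threshold is only a rough heuristic; unevenly lit plates would likely need an adaptive method instead.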

<p id="part26"></p>

# <span style="font-family: Arials; font-size: 16px; font-style: normal; font-weight: bold; letter-spacing: 3px; text-align: center; color:rgb(255, 255, 255); line-height:1.0">6.3 EXTRACT NUMBER PLATE TEXT FROM IMAGE</span>

First, we load our image and convert it to an array. Then we crop the bounding box using its coordinates to get the region of interest (ROI), and have a look at the cropped image (*Figure 15*).

In [18]:
img = np.array(load_img(path))
xmin ,xmax,ymin,ymax = cods[0]
roi = img[ymin:ymax,xmin:xmax]
fig = px.imshow(roi)
fig.update_layout(width=350, height=250, margin=dict(l=10, r=10, b=10, t=10),xaxis_title='Figure 15 Cropped image')
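
Since the coordinates come from a model, they can occasionally fall outside the image. The sketch below shows one way to clamp the box before slicing. `crop_roi` is a hypothetical helper, and the 300x500 placeholder image is an assumption, because the real size of N189.jpeg is not shown in the notebook.

```python
import numpy as np

def crop_roi(img, xmin, xmax, ymin, ymax):
    """Crop a box from an image after clamping it to the image bounds."""
    h, w = img.shape[:2]
    xmin, xmax = max(0, int(xmin)), min(w, int(xmax))
    ymin, ymax = max(0, int(ymin)), min(h, int(ymax))
    if xmin >= xmax or ymin >= ymax:
        raise ValueError("empty region of interest")
    # NumPy images are indexed as [row (y), column (x)].
    return img[ymin:ymax, xmin:xmax]

# Placeholder image (height 300, width 500), not the real N189.jpeg.
placeholder = np.zeros((300, 500, 3), dtype=np.uint8)
roi_demo = crop_roi(placeholder, 177, 381, 120, 184)  # box from the pipeline output
print(roi_demo.shape)
```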

In [20]:
import pytesseract as pt

# Specify the full path to the Tesseract executable:
pt.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

In [21]:
# extract text from image
text = pt.image_to_string(roi)
print(text)

TN74 AL 5074
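
Raw OCR output often contains spaces, line breaks or stray symbols. A light cleanup step could look like the sketch below; it assumes plates consist only of Latin letters and digits, which may not hold everywhere, and `clean_plate_text` is a hypothetical helper name.

```python
import re

def clean_plate_text(raw):
    """Uppercase the OCR output and keep only letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", raw.upper())

print(clean_plate_text("TN74 AL 5074\n"))
```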



## Next steps to be taken in the future

1- I liked using the ResNet, but I'd also like to try out YOLO, as I've heard it's better (so I'd have to look into that)

2- As marked in the project architecture in the intro, I want to use Flask to create some sort of SaaS that would make the license plate recognizer work as a web app

