License Plate Character Extractor

LPCE: A simple yet useful tool built to extract only the alphanumerical characters from a license plate image.

This script was born from the need to extract only the characters from a license plate in order to fine-tune an OCR neural network model such as Tesseract: both to train the network to recognize and correctly identify license plate characters, and to apply solid preprocessing before performing OCR.

Compatibilities

The extraction works well with all kinds of European plates, and similar ones, laid out on a single line (as in the example above). Plates from Belgium, Luxembourg and the Netherlands are not supported at the moment.

Features

This script comes with five different methods of extraction, based on different needs:

Extraction methods

Plate Exact Grayscale: Given a license plate, produces an image containing only the exact characters from the original plate. Note: this method works best with high-resolution plates, since it builds a mask containing the characters cropped from the original plate.
Plate Binarization: Given a license plate, produces the most accurate binarization.
Plate Enhance Grayscale: Given a license plate, produces a cropped, enhanced version of the original plate containing mostly the area in which the characters are located.
Plate Binarization Smooth: Given a license plate, produces a smoothed and thickened version of the binarized plate.
Single Character Extraction: Extracts the characters contained in the image one by one, saving each of them as a separate image. There are two kinds of single character extraction:
- Binary: Gives output similar to Plate Binarization, but with each character in its own image.
- Exact: Gives output similar to Plate Exact Grayscale, but with each character in its own image.
ALL: Runs EACH OF THE PREVIOUS methods, calling them one by one and storing each result in the appropriate folder.
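To make the idea of binarization concrete, here is a minimal sketch in plain Python. It uses Otsu's method, a common global thresholding technique; this is an assumption for illustration only, since the README does not state which thresholding LPCE applies internally. The names otsu_threshold and binarize are hypothetical and are not part of lpce.py.

```python
def otsu_threshold(gray):
    """Return the gray level t that maximizes the between-class variance.

    gray is a list of rows of integers in 0..255. Illustrative only:
    this is not necessarily the thresholding used by LPCE.
    """
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no pixels at or below t yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is at or below t
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        between_var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarize(gray):
    """Map pixels above the Otsu threshold to 255 and the rest to 0."""
    t = otsu_threshold(gray)
    return [[255 if value > t else 0 for value in row] for row in gray]
```

With dark characters on a white plate, the characters end up as 0 and the background as 255.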

Functionalities

This script was made with efficiency in mind. Each of the proposed methods is fast, relying mostly on the NumPy and OpenCV libraries, because the preprocessing step before OCR should be as fast as possible.

Display Result: Takes an image as input and displays every step of the pre- and post-processing, which is useful for understanding what the pipeline does (out of curiosity or for debugging). NOTE: cannot be used with SINGLECHAR.
Extraction on path: Given a path (a single image or an entire folder), applies the extraction methods to the given input(s). NOTE: if a folder is passed, make sure it contains ONLY IMAGES. Other files will cause the script to fail.
.BOX file creation for Tesseract: Since training Tesseract was the main purpose of this script, it also generates a .BOX file containing the coordinates of each character in the given plate, which are needed later during Tesseract training to identify the characters. To take advantage of this functionality, name your files as follows: ID-PLATECHARACTERS.EXT
-> Example: for the plate used as the example, the name should be 0-FI764WL.ext, where the ID is a progressive number within the main folder (0, 1, 2, 3, 4 ... N), FI764WL are the characters on the plate, and ext can be any extension (jpg, png, tif, etc.). The script will then produce a .BOX file containing the coordinates Tesseract needs in the training phase. Be careful with the syntax if you want to generate .BOX files!
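The naming convention and the box output can be sketched as follows. This is a minimal illustration, not the code in lpce.py: parse_plate_filename and box_lines are hypothetical helpers, and the input bounding boxes are assumed to be (x, y, width, height) tuples with a top-left origin. The output lines follow the standard Tesseract box layout (symbol, left, bottom, right, top, page), whose origin is the bottom-left corner of the image.

```python
import os
import re

# ID-PLATECHARACTERS, e.g. "0-FI764WL" (the extension is handled separately)
NAME_PATTERN = re.compile(r"^(\d+)-([A-Za-z0-9]+)$")


def parse_plate_filename(path):
    """Split 'ID-PLATECHARACTERS.EXT' into (id, characters)."""
    stem, _ext = os.path.splitext(os.path.basename(path))
    match = NAME_PATTERN.match(stem)
    if match is None:
        raise ValueError(f"{path!r} does not follow ID-PLATECHARACTERS.EXT")
    return int(match.group(1)), match.group(2)


def box_lines(chars, boxes, image_height, page=0):
    """Build Tesseract box lines from top-left-origin (x, y, w, h) boxes.

    Tesseract measures from the bottom-left corner, so the vertical
    coordinates are flipped using the image height.
    """
    lines = []
    for char, (x, y, w, h) in zip(chars, boxes):
        left, right = x, x + w
        bottom = image_height - (y + h)
        top = image_height - y
        lines.append(f"{char} {left} {bottom} {right} {top} {page}")
    return lines
```

For instance, parse_plate_filename("plates/0-FI764WL.png") yields the ID 0 and the characters FI764WL, which can then be paired with the detected character boxes.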

Usage

First of all, download or clone the repo and extract lpce.py into a folder of your choice (default: the current folder), then follow the rest of this section. If you extract lpce.py into another directory, be sure to account for that in your script. Additionally, example.py contains some usage examples ready to be executed.
Note that the script works on images containing only the plate. Images showing an entire car or a partially visible plate will not work well.

Basic

This script was also made with ease of use in mind.
All the functions are organized into a class, which you'll need to import into your Python script to perform the extraction.
There are two main methods you can use, and both work out of the box without any tweaking: display_result, which shows the pipeline output at each step, and apply_extraction_onpath, which, given a folder or a single image, creates all the output folders that will hold the results; if requested, it can also return the processed plate.
By default, with no flags changed, Binarization is used both for the entire plate and for the single character extraction. Let's look at a practical example:
Extraction on a given path:

# Importing class module
from lpce import PlateExtractor
# Generating our instance
extractor = PlateExtractor()
# Apply extraction on a given path (image or an entire folder containing ONLY images)
extractor.apply_extraction_onpath(input_path=path)
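
Because a folder input must contain only images, a small pre-check can be run before calling apply_extraction_onpath. The sketch below is not part of lpce.py: non_image_entries is a hypothetical helper, and the extension set is an assumption that should be adjusted to the formats you actually use.

```python
import os

# Assumed set of image extensions; adjust to your own data.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".tif", ".tiff", ".bmp"}


def non_image_entries(folder):
    """Return the sorted names in folder that are not image files.

    Sub-folders count as non-image entries too, since the folder is
    expected to contain only images.
    """
    offenders = []
    for name in os.listdir(folder):
        full_path = os.path.join(folder, name)
        extension = os.path.splitext(name)[1].lower()
        if os.path.isdir(full_path) or extension not in IMAGE_EXTENSIONS:
            offenders.append(name)
    return sorted(offenders)
```

If the returned list is empty, the folder should be safe to pass as input_path; otherwise, move the listed entries elsewhere first.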

Extraction on a single image, with the return flag set (the plate is returned) and writing to disk disabled:

# Importing class module
from lpce import PlateExtractor
# Generating our instance
extractor = PlateExtractor()
# Apply extraction on a given path (image) and return the processed plate
extracted_plate = extractor.apply_extraction_onpath(input_path=path, ret=True, write=False)

Display Pipeline results:

# Importing class module
from lpce import PlateExtractor
# Generating our instance
extractor = PlateExtractor()
# Display each step of the pipeline on a SINGLE image
extractor.display_result(input_path=path)

Advanced

If you'd like to tweak the extraction or the display method instead, there are some flags you should know about. Here are the full signatures in the class, for both extraction and display:

def apply_extraction_onpath(self, input_path=None, desired_ext='png', precise_masking=True, adaptive_bands=True, ftype=FTYPE.BINARY, stype=STYPE.BINARY, write=True, ret=False)
def display_result(self, input_path=None, precise_masking=True, adaptive_bands=True, ftype=FTYPE.BINARY, stype=STYPE.BINARY, display=True)

Let's now break down the parameter list and explain what each parameter does:

List of parameters: apply_extraction_onpath

input_path: Path to an image or to an entire folder (containing only images).
desired_ext: Desired extension for the output file. default='png'
precise_masking: Selects the type of output the script produces. When True, the extraction follows the contour of each character. When False, the extraction uses the bounding box containing each character. default=True
adaptive_bands: When enabled, automatically fetches the coordinates of the blue band (if present) in the image; the script uses them to crop the image to the area of the plate that contains the characters. default=True
ftype: Type of extraction method used to process the plate into an image containing only the characters. default=FTYPE.BINARY
stype: Type of extraction method used to extract the single characters from the plate into separate images. default=STYPE.BINARY
ret: Whether the image should be returned to the caller. Currently WORKS ONLY FOR SINGLE IMAGES. default=False
write: Whether the image should be written to disk. default=True
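
The difference between the two precise_masking settings can be illustrated on a tiny grid. This is a plain-Python sketch under stated assumptions, not the OpenCV-based code in lpce.py: the character is given as a 0/1 mask, and bounding_box and extract_character are hypothetical helper names.

```python
def bounding_box(mask):
    """Return (top, left, bottom, right) of the non-zero pixels in mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return min(rows), min(cols), max(rows), max(cols)


def extract_character(gray, mask, precise=True, background=255):
    """Copy a character from gray onto a blank background.

    precise=True keeps only the pixels covered by the character mask
    (contour-level precision); precise=False keeps every pixel inside
    the character's bounding box.
    """
    top, left, bottom, right = bounding_box(mask)
    out = [[background] * len(row) for row in gray]
    for r in range(top, bottom + 1):
        for c in range(left, right + 1):
            if not precise or mask[r][c]:
                out[r][c] = gray[r][c]
    return out
```

With precise=False, background pixels that fall inside the bounding box (for example plate texture or dirt near the character) are carried over into the output, which is why the contour-based setting is the default.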

List of parameters: display_result

input_path: Same as above
precise_masking: Same as above
adaptive_bands: Same as above
ftype: Same as above
stype: Same as above
display: Flag indicating whether to display the additional pre- and post-processing steps for a full view of the pipeline. default=True

Since there are several methods, the flags for each of them are collected in an enum class. Here is the structure of both FTYPE and STYPE:

class FTYPE(Enum):
    ALL = 0
    EXACT = 1
    SMOOTH = 2
    BINARY = 3
    GRAYSCALE = 4
    SINGLECHAR = 5

class STYPE(Enum):
    BINARY = 1
    EXACT = 2

Each value corresponds to one of the methods described earlier in this README, and you will need them to use the script in a way that differs from the default behavior. Here is an example of advanced use with custom flags:
Apply extraction on path:

# Importing class module, FTYPE and STYPE enums
from lpce import PlateExtractor, FTYPE, STYPE
# Generating our instance
extractor = PlateExtractor()
# Apply extraction on a given path, with precise_masking set to false and grayscale extraction instead of binary
extractor.apply_extraction_onpath(input_path=path, precise_masking=False, ftype=FTYPE.GRAYSCALE)

OR

# Apply extraction on a given path, with adaptive_bands set to false, using ALL extraction methods
# and specifying EXACT method for the single character extraction
extractor.apply_extraction_onpath(input_path=path, adaptive_bands=False, ftype=FTYPE.ALL, stype=STYPE.EXACT)

OR

# Apply extraction on a given path using the single character method extraction
extractor.apply_extraction_onpath(input_path=path, ftype=FTYPE.SINGLECHAR)

Display Result: The approach is the same as above. Here is a single example:

# Display the pipeline results for a given image using the Smooth extraction method, with display=True to show EVERY step of the pipeline
extractor.display_result(path, ftype=FTYPE.SMOOTH, display=True)

NOTE: The flags can be combined AS YOU PREFER. There is no limit on that.

Dependencies

You'll need these modules to run this script:

OpenCV (cv2) > 3
NumPy
Imutils
TQDM
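
A quick way to confirm that these modules are importable is sketched below. It relies only on the standard library; missing_modules is a hypothetical helper, and the import names listed (cv2, numpy, imutils, tqdm) are the usual ones for these packages. It does not check versions, so the OpenCV > 3 requirement still has to be verified separately.

```python
import importlib.util

# Import names assumed for the dependencies listed above.
REQUIRED_MODULES = ["cv2", "numpy", "imutils", "tqdm"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```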

Known issues

The script was made to work with Italian plates, but it works with almost EVERY kind of EUROPEAN CAR plate, as long as they meet two conditions:

If bands are present, they must be BLUE.
The plate background should be WHITE (i.e. other backgrounds, such as yellow, are not supported).
Sometimes the median filter splits a single plate band into two distinct parts, which causes the GRAYSCALE method to perform slightly worse than expected because the fetched coordinates do not fit perfectly. Status: FIXED
