OCR_with_TFOD_and_EasyOCR

TFOD and EasyOCR for a robust OCR engine

EasyOCR

EasyOCR is a deep learning model trained for OCR (optical character recognition). Its codebase is built on the PyTorch framework, and the model can recognize text in 83+ languages.

Introduction

Optical character recognition is the conversion of images of typed, handwritten, or printed text into machine-encoded text, whether from a scanned document, a photo of a document, a scene photo, or subtitle text superimposed on an image. The OCR application developed here combines TFOD (the TensorFlow Object Detection API), which locates the text regions, with EasyOCR, which reads the text inside them, to create a robust OCR system.

This README is a brief walkthrough of the major steps carried out to create this application. Refer to TFOD_and_EasyOCR.ipynb for the full procedure.

I used the labelImg tool to label and annotate my images. The annotations are saved in the Pascal VOC format and converted to TFRecords to be fed into the TFOD pipeline.
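For readers unfamiliar with Pascal VOC: each annotation is an XML file with one `object` element per labelled box. The sketch below is not the notebook's conversion code. It only illustrates, with the standard library, how the class name and pixel box could be read from such a file before being written to a TFRecord. The sample XML and the function name `read_voc_boxes` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation, shaped like the XML labelImg writes in Pascal VOC mode.
SAMPLE_XML = """
<annotation>
  <filename>doc_001.jpg</filename>
  <size><width>800</width><height>600</height><depth>3</depth></size>
  <object>
    <name>text</name>
    <bndbox><xmin>40</xmin><ymin>60</ymin><xmax>400</xmax><ymax>120</ymax></bndbox>
  </object>
</annotation>
"""


def read_voc_boxes(xml_text):
    """Return (filename, [(label, xmin, ymin, xmax, ymax), ...]) from VOC XML."""
    root = ET.fromstring(xml_text.strip())
    filename = root.findtext("filename")
    boxes = []
    for obj in root.iter("object"):
        bnd = obj.find("bndbox")
        # Coordinates are pixel values; float() tolerates tools that write "40.0".
        coords = [int(float(bnd.findtext(tag))) for tag in ("xmin", "ymin", "xmax", "ymax")]
        boxes.append((obj.findtext("name"), *coords))
    return filename, boxes


filename, boxes = read_voc_boxes(SAMPLE_XML)
print(filename, boxes)
```

Note that Pascal VOC stores pixel coordinates as x before y, while the TFOD detections used later in this README are normalized and ordered y before x.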

Steps

Step 1 - Download the TFOD repo and requirements

import os
# Clone the TensorFlow models repo only if it is not already present.
if not os.path.exists(os.path.join(paths['APIMODEL_PATH'], 'research', 'object_detection')):
    !git clone https://github.com/tensorflow/models {paths['APIMODEL_PATH']}

After that we:

Install TFOD
Install Dependencies
Run Verification Script
Create Label map and TFRecords
Train and Evaluate the model
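As an illustration of the label map step: TFOD reads a label map as a text file of `item` blocks, each pairing a class name with an integer id starting at 1. The snippet below is a minimal sketch of generating that text. The class name `text` and the helper name `build_label_map` are assumptions, not taken from the notebook.

```python
# Hypothetical class list; replace with the labels used while annotating.
labels = [{"name": "text", "id": 1}]


def build_label_map(labels):
    """Render labels as label map text, one `item` block per class."""
    lines = []
    for label in labels:
        lines.append("item {")
        lines.append(f"  name: '{label['name']}'")
        lines.append(f"  id: {label['id']}")
        lines.append("}")
    return "\n".join(lines) + "\n"


label_map_text = build_label_map(labels)
print(label_map_text)
```

The resulting string can be written to a `.pbtxt` file and referenced from the training pipeline configuration.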

Step 2 - Install EasyOCR and import it into our environment

!pip install easyocr

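import easyocr

# Create the reader once and reuse it. The language list ['en'] is an assumption; use the languages in your documents.
reader = easyocr.Reader(['en'])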

Step 3 - Filter the detections from our TFOD model

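# thresh is the minimum detection score to keep; it must be defined beforehand.
scores = list(filter(lambda x: x > thresh, detections['detection_scores']))
# Slicing by len(scores) assumes the detections are sorted by descending score.
boxes = detections['detection_boxes'][:len(scores)]
classes = detections['detection_classes'][:len(scores)]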
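If the ordering of the detections cannot be relied on, the same filtering can be done element by element. The following is a minimal alternative sketch, not the notebook's code. The function name `filter_detections` and the sample values are hypothetical.

```python
def filter_detections(detections, thresh):
    """Keep the score, box and class of every detection scoring above thresh."""
    kept = [
        (score, box, cls)
        for score, box, cls in zip(
            detections["detection_scores"],
            detections["detection_boxes"],
            detections["detection_classes"],
        )
        if score > thresh
    ]
    scores = [item[0] for item in kept]
    boxes = [item[1] for item in kept]
    classes = [item[2] for item in kept]
    return scores, boxes, classes


# Hypothetical detections, deliberately not sorted by score.
sample = {
    "detection_scores": [0.35, 0.92, 0.81],
    "detection_boxes": [
        [0.1, 0.1, 0.2, 0.2],
        [0.3, 0.2, 0.5, 0.9],
        [0.6, 0.1, 0.7, 0.8],
    ],
    "detection_classes": [0, 0, 0],
}
scores, boxes, classes = filter_detections(sample, thresh=0.5)
print(scores, boxes, classes)
```

Because each box stays paired with its own score, a low-scoring detection in the first position no longer displaces a high-scoring one.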

Step 4 - Run inference with the OCR model

Now we loop through the detection(s) to get our final text recognition.

Note: We need to rescale the detection boxes.
The bounding box coordinates output by the TFOD pipeline must be rescaled so that they correspond to the original image size. The image fed into the TFOD model is pre-processed and resized, so the output boxes do not come in the original image's pixel units. They are expressed as [ymin, xmin, ymax, xmax] values normalized to the range 0 to 1, and multiplying them by the original image's height and width recovers pixel coordinates.
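The scaling can be sketched as a small standalone helper. This is an illustrative version only: the function name `box_to_pixels` and the clamping step are additions that are not in the notebook, whose loop applies the multiplication inline.

```python
def box_to_pixels(box, height, width):
    """Scale a normalized [ymin, xmin, ymax, xmax] box to integer pixel bounds."""
    ymin, xmin, ymax, xmax = box
    # Clamp to the image so a slightly out-of-range box cannot produce a bad slice.
    top = max(0, min(height, int(ymin * height)))
    left = max(0, min(width, int(xmin * width)))
    bottom = max(0, min(height, int(ymax * height)))
    right = max(0, min(width, int(xmax * width)))
    return top, left, bottom, right


# Hypothetical box on a 600 x 800 (height x width) image.
pixels = box_to_pixels([0.25, 0.125, 0.5, 0.875], height=600, width=800)
print(pixels)
```

The returned values can be used directly as slice bounds, for example `image[top:bottom, left:right]`.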

# Original image size in pixels
height, width = image_np_with_detections.shape[:2]

for box in boxes:
    # box is [ymin, xmin, ymax, xmax], normalized to 0..1; scale it to pixels
    roi = box * [height, width, height, width]
    # Crop the detected region and run EasyOCR on it
    region = image_np_with_detections[int(roi[0]):int(roi[2]), int(roi[1]):int(roi[3])]
    ocr_result = reader.readtext(region)
    print(ocr_result)
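With its default settings, `readtext` returns one entry per detected piece of text, each holding the bounding box corners, the recognized string, and a confidence value. A minimal sketch of turning that result into plain text follows. The helper name `extract_text`, the confidence cut-off, and the sample result are hypothetical.

```python
def extract_text(ocr_result, min_confidence=0.0):
    """Join the recognized strings whose confidence is at least min_confidence."""
    texts = [
        text
        for _bbox, text, confidence in ocr_result
        if confidence >= min_confidence
    ]
    return " ".join(texts)


# Hypothetical result, shaped like EasyOCR's default (bbox, text, confidence) output.
sample_result = [
    ([[10, 5], [120, 5], [120, 30], [10, 30]], "INVOICE", 0.98),
    ([[10, 40], [90, 40], [90, 60], [10, 60]], "N0. 1O7", 0.41),
]
text = extract_text(sample_result, min_confidence=0.5)
print(text)
```

Dropping low-confidence entries is optional; with `min_confidence=0.0` every recognized string is kept.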

Credits

EasyOCR
labelImg

Tweet me at Dike Nnamaka
