
A python service that uses pytesseract wrapper for google tesseract to optically read images and return back detected values and fields.


Rahul30032/flask_ocr_pytesseract


flask_ocr_pytesseract

A Python service that uses pytesseract, a wrapper for Google's Tesseract, to optically read images and return the detected values and fields through a Flask API. The API accepts Base64-encoded images as input, processes each image with Tesseract, and returns the detected text. This project focuses solely on detecting text (fields and values) in government ID cards such as driving licences, PAN cards, and Aadhaar cards. The Images folder holds samples for each card type: the image, its Base64 encoding, and a screenshot of the extracted text.
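The decode-then-OCR flow described above could look roughly like the sketch below. It is not the repo's app.py: the function names `decode_data_url` and `extract_text` are made up for illustration, and the OCR step assumes Pillow and pytesseract are installed alongside the Tesseract binary.

```python
import base64
import binascii
import io


def decode_data_url(data_url: str) -> bytes:
    """Return the raw image bytes from a 'data:image/<type>;base64,<payload>' string."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:image/") or not header.endswith(";base64"):
        raise ValueError("expected a data:image/<type>;base64,<payload> string")
    try:
        # validate=True rejects characters outside the Base64 alphabet.
        return base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError("payload is not valid Base64") from exc


def extract_text(data_url: str, lang: str = "eng") -> str:
    """Decode the image and run Tesseract on it (hypothetical helper)."""
    # Imported lazily so decode_data_url stays usable without the OCR stack.
    from PIL import Image
    import pytesseract

    image = Image.open(io.BytesIO(decode_data_url(data_url)))
    return pytesseract.image_to_string(image, lang=lang)
```

A Flask route would then only need to read the Base64 string from the request and pass it to `extract_text`. Rejecting malformed input before it reaches Tesseract keeps error messages readable.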

Requirements/libraries:

Testing the application:

  • Clone or download the repo and move into the repo folder.
  • Make sure all the requirements are installed (activating a virtual environment is suggested).
  • Run the app.py script: $ python3 app.py
  • Encode the image to Base64 (for example with a Base64 image encoder), or try the sample Base64 strings in the txt files in the images folder of the repo.
  • Valid Base64 input ("data:image/png;base64," followed by the encoder output, for PNG images) is processed and the extracted text is displayed (see the sample image below).

    Sample image
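If no online encoder is at hand, the input string from the steps above can be built locally. This is a minimal sketch using only the standard library; `to_data_url` and `file_to_data_url` are illustrative names, not part of the repo.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Wrap raw image bytes in a 'data:<mime>;base64,<payload>' string."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def file_to_data_url(path: str) -> str:
    """Read an image file and encode it, guessing the MIME type from its extension."""
    mime_type, _ = mimetypes.guess_type(path)
    # Fall back to PNG when the extension is unknown (an assumption of this sketch).
    return to_data_url(Path(path).read_bytes(), mime_type or "image/png")
```

For example, `file_to_data_url("pan_card.png")` (a hypothetical file name) would return a string that starts with "data:image/png;base64," and can be pasted into the app.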

Tesseract 4.x installation

On Ubuntu 16.04 or earlier, the commands in the official documentation install Tesseract 3 by default. I followed the linked guide to install Tesseract 4.x and the corresponding version of Leptonica (1.74 or higher). A second link covers adding languages to Tesseract 4.x. You can also install through a PPA:

$ sudo add-apt-repository ppa:alex-p/tesseract-ocr 
$ sudo apt-get update
$ sudo apt install tesseract-ocr
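After installing, it can be worth confirming that the `tesseract` binary on the PATH really is 4.x. The sketch below parses the banner printed by `tesseract --version`. The helper names are illustrative, and the banner format ("tesseract 4.1.1" on the first line) is assumed from typical builds.

```python
import re
import shutil
import subprocess
from typing import Optional, Tuple


def parse_major_minor(version_output: str) -> Optional[Tuple[int, int]]:
    """Pull (major, minor) out of a banner such as 'tesseract 4.1.1'."""
    match = re.search(r"tesseract\s+v?(\d+)\.(\d+)", version_output)
    return (int(match.group(1)), int(match.group(2))) if match else None


def installed_tesseract_version() -> Optional[Tuple[int, int]]:
    """Return the installed version, or None if tesseract is missing or unparseable."""
    exe = shutil.which("tesseract")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=False
    )
    # Some builds print the banner to stderr, so look at both streams.
    return parse_major_minor(result.stdout + result.stderr)


if __name__ == "__main__":
    version = installed_tesseract_version()
    if version is None:
        print("tesseract not found on PATH")
    elif version[0] < 4:
        print(f"found tesseract {version[0]}.{version[1]}; 4.x is expected here")
    else:
        print(f"found tesseract {version[0]}.{version[1]}")
```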

Links to Resources followed :

Possible Improvements/Changes:

  • The Tesseract model can be fine-tuned and trained to specialise in a specific dataset or domain (government IDs, in this case).
  • With small changes in the code, it can detect more than one language in the same image.
  • The template's appearance and its interaction with the API can be improved further (for example, by separating fields and values in the text output for better display).
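Two of the ideas above can be sketched with small helpers. Tesseract accepts several language codes joined with "+", and pytesseract passes that string through its `lang` argument. Splitting fields from values is shown here under the assumption that the OCR output uses "Field: value" lines, which real ID cards will not always follow. Both function names are illustrative.

```python
from typing import Dict, Iterable


def build_lang_arg(languages: Iterable[str]) -> str:
    """Join Tesseract language codes, e.g. ['eng', 'hin'] becomes 'eng+hin'."""
    return "+".join(languages)


def split_fields(ocr_text: str) -> Dict[str, str]:
    """Collect 'Field: value' lines from OCR output into a dictionary."""
    fields: Dict[str, str] = {}
    for line in ocr_text.splitlines():
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if sep and key and value:
            fields[key] = value
    return fields
```

The result of `build_lang_arg(["eng", "hin"])` could be passed as `lang=` to `pytesseract.image_to_string`, provided the matching language data files are installed. Lines without a colon are skipped by `split_fields`, so free-floating text such as card headers would need separate handling.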
