Using OpenCV and Tesseract OCR to process a text image to detect the words in the image

walkershashi/Text-Recognition

Text-Recognition Using OpenCV and Tesseract OCR

Computers cannot process videos and images on their own the way humans do. We can interpret almost any kind of image, but machines lack this ability.

Computer Vision gives them this ability. It is about teaching computers to see, interpret, and process digital images and videos so that they can gain a high-level understanding of them.

Fields of use:

  • Self-driving cars
  • Facial recognition
  • Malicious object detection for security purposes

OpenCV is an open-source, cross-platform computer vision library for image processing and machine learning. Some of its features and use cases are:

  • It includes interfaces for C, C++, Java, and Python
  • It is used to process static images
  • It is also used to process offline videos and/or streaming videos

Tesseract OCR is an optical character recognition engine that can recognize words and text in images.

In this project I used OpenCV and Tesseract OCR to process an image of text and detect the words in it.
The project work includes:

  • Importing an image from the local computer, or
  • Downloading an image over the Internet using the urllib.request module
  • Reading the loaded image with the imread() function from the cv2 module. imread() returns the image in BGR format by default, so it is converted to grayscale before processing for better text recognition. The grayscale image is then resized and smoothed with a Gaussian blur. Finally, the preprocessed image is passed to Tesseract OCR, which returns a string that is processed to produce the final text contained in the image.
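
The steps above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names (`load_image`, `preprocess`, `clean_text`, `recognize_text`), the 2x scale factor, and the 5x5 blur kernel are all assumptions chosen for the example, and the cleanup step is only one plausible way to process the returned string. It assumes the `opencv-python`, `numpy`, and `pytesseract` packages plus the Tesseract binary are installed. The third-party imports sit inside the functions that need them so that the cleanup helper can be used on its own.

```python
import urllib.request


def load_image(source):
    """Read an image from a local path or an http(s) URL as a BGR array."""
    import cv2  # third-party: opencv-python
    import numpy as np

    if source.startswith(("http://", "https://")):
        # Download the raw bytes and decode them in memory.
        with urllib.request.urlopen(source) as response:
            data = np.frombuffer(response.read(), dtype=np.uint8)
        image = cv2.imdecode(data, cv2.IMREAD_COLOR)
    else:
        image = cv2.imread(source)  # returns None if the file cannot be read
    if image is None:
        raise ValueError(f"could not load image from {source!r}")
    return image


def preprocess(image, scale=2.0, kernel=(5, 5)):
    """Convert a BGR image to grayscale, resize it, and apply a Gaussian blur.

    The default scale and kernel size are illustrative assumptions.
    """
    import cv2

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    resized = cv2.resize(gray, None, fx=scale, fy=scale,
                         interpolation=cv2.INTER_CUBIC)
    return cv2.GaussianBlur(resized, kernel, 0)


def clean_text(raw):
    """Strip each line of raw OCR output and drop the empty ones."""
    lines = (line.strip() for line in raw.splitlines())
    return "\n".join(line for line in lines if line)


def recognize_text(source):
    """Run the whole pipeline: load, preprocess, OCR, clean up."""
    import pytesseract  # also needs the Tesseract binary on the system

    image = preprocess(load_image(source))
    return clean_text(pytesseract.image_to_string(image))
```

Calling `recognize_text("quote.png")` (a hypothetical file name) would return the recognized text as a single string with blank lines removed. Recognition quality depends on the input image, so the scale factor and kernel size may need tuning.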

    Visual Representation


    I NEVER DREAMED ABOUT SUCCESS. I WORKED FOR IT


    My love said she would marry only me
    And Jove himself could not make her care. For what women say to lovers, you'll agree lone writes on running water or on air.

    Sin of self-love possesseth all mine eye And all my soul and all my every part;
    land for this sin there is no remedy,
    It is so grounded inward in my heart.
