
An OpenCV-based script that captures an image of a hard copy and converts it into text/audio output. The primary goal was to enable blind people to access text content through audio output.

Rapternmn/Power-Vision


Power-Vision

The primary goal was to enable blind people to access text content through audio output.

We applied the following methodology to achieve this goal:

  1. Detect and extract the page by applying the four-point transform method.

  2. Identify multiple columns in the page by applying morphological transforms (erosion + dilation).

  3. Crop the column images and pass them sequentially to pytesseract OCR to get the text output.

  4. Convert the text to speech.
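
Step 1 can be illustrated with a minimal sketch of a four-point transform. This is not the code from this repository; the function names (`order_points`, `target_size`, `four_point_transform`) are hypothetical, and it assumes the four page corners have already been found (for example from a contour approximation). It only uses OpenCV calls that are known to exist: `cv2.getPerspectiveTransform` and `cv2.warpPerspective`.

```python
import math


def order_points(pts):
    """Return four (x, y) corners as top-left, top-right, bottom-right, bottom-left."""
    pts = [(float(x), float(y)) for x, y in pts]
    # The top-left corner has the smallest x + y, the bottom-right the largest.
    tl = min(pts, key=lambda p: p[0] + p[1])
    br = max(pts, key=lambda p: p[0] + p[1])
    # The top-right corner has the smallest y - x, the bottom-left the largest.
    tr = min(pts, key=lambda p: p[1] - p[0])
    bl = max(pts, key=lambda p: p[1] - p[0])
    return [tl, tr, br, bl]


def target_size(ordered):
    """Width and height of the flattened page, taken from the longer opposite edges."""
    tl, tr, br, bl = ordered
    width = max(math.dist(tl, tr), math.dist(bl, br))
    height = max(math.dist(tl, bl), math.dist(tr, br))
    return int(round(width)), int(round(height))


def four_point_transform(image, pts):
    """Warp the quadrilateral `pts` in `image` to a top-down rectangle."""
    # Imported here so the geometry helpers above stay usable without OpenCV.
    import cv2
    import numpy as np

    ordered = order_points(pts)
    w, h = target_size(ordered)
    src = np.array(ordered, dtype="float32")
    dst = np.array(
        [[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]], dtype="float32"
    )
    matrix = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(image, matrix, (w, h))
```

The corner ordering matters because `cv2.getPerspectiveTransform` pairs source and destination points by index, so a consistent order keeps the page from being mirrored or rotated in the output.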

Example:

- Test image: half.jpg

- The sequence of cropped output images is in the folder "Crop Outputs"

These outputs show the accuracy of the boundary detection and of the cropping and sequencing of the images.
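
A rough sketch of how column cropping and sequencing could work is shown below. It is an illustration under assumptions, not the script in this repository: the function names, the kernel sizes, and the `min_area_ratio` parameter are hypothetical and would need tuning for a given page, and left-to-right ordering is assumed as the reading sequence.

```python
def sort_reading_order(boxes):
    """Sort (x, y, w, h) boxes left to right, breaking ties top to bottom."""
    return sorted(boxes, key=lambda b: (b[0], b[1]))


def find_column_boxes(page_bgr, min_area_ratio=0.02):
    """Return bounding boxes of text columns in a flattened page image."""
    import cv2

    gray = cv2.cvtColor(page_bgr, cv2.COLOR_BGR2GRAY)
    # Otsu threshold, inverted so that text pixels become white.
    _, binary = cv2.threshold(
        gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU
    )
    # Erosion with a small kernel removes isolated specks of noise.
    small = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
    cleaned = cv2.erode(binary, small, iterations=1)
    # Dilation with a tall kernel (assumed size) merges the lines of one
    # column into a single blob while keeping the column gutters open.
    tall = cv2.getStructuringElement(cv2.MORPH_RECT, (15, 45))
    merged = cv2.dilate(cleaned, tall, iterations=2)
    # [-2] picks the contour list in both OpenCV 3 and OpenCV 4.
    contours = cv2.findContours(
        merged, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE
    )[-2]
    min_area = min_area_ratio * page_bgr.shape[0] * page_bgr.shape[1]
    boxes = [cv2.boundingRect(c) for c in contours]
    boxes = [b for b in boxes if b[2] * b[3] >= min_area]
    return sort_reading_order(boxes)


def ocr_columns(page_bgr):
    """Crop each detected column and run pytesseract on the crops in order."""
    import pytesseract

    texts = []
    for x, y, w, h in find_column_boxes(page_bgr):
        crop = page_bgr[y:y + h, x:x + w]
        texts.append(pytesseract.image_to_string(crop))
    return "\n".join(texts)
```

Saving each `crop` to disk in loop order would produce a numbered sequence of images comparable to the contents of "Crop Outputs".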

Demo video : https://www.youtube.com/watch?v=CcR5tph-pm4
