Text-Extraction-Table-Image

This project extracts text from an image of a table into Python objects. Below is a result of the detection:

Prerequisites/Dependencies

OpenCV >= 2.4.8
NumPy
PyTesseract

Idea Behind The Code

I've published the documentation on my website. Please read it to understand the idea behind the code.

For Refinement

Once your algorithm detects the text successfully, you can save it into a Python object such as a dictionary or a list. Some region names (in the “Kabupaten/Kota” column) are not detected precisely because they are not included in the Tesseract training data. However, this shouldn’t be a problem, since the regions’ indexes are detected precisely. The extraction may also fail on text in other fonts, depending on the font used. If a character is misread, for example “5” detected as “8”, you can apply image processing such as erosion and dilation.
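The two refinements above could be sketched roughly as follows. This is not the code from the scripts folder, only a minimal illustration under stated assumptions: the function names `erode_then_dilate`, `rows_to_records` and `index_by`, the kernel size, the header names, and the placeholder cell strings are all hypothetical. Whether erosion, dilation, or both help depends on the image (on dark text over a light background, erosion thickens strokes and dilation thins them), so treat the order and parameters as something to experiment with.

```python
def erode_then_dilate(gray, kernel_size=2, iterations=1):
    """Erode, then dilate, a grayscale cell image before passing it to OCR.

    Assumption: `gray` is a single-channel uint8 NumPy array.
    OpenCV and NumPy are imported here so that the helpers below
    can be used even where OpenCV is not installed.
    """
    import cv2
    import numpy as np

    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    eroded = cv2.erode(gray, kernel, iterations=iterations)
    return cv2.dilate(eroded, kernel, iterations=iterations)


def rows_to_records(header, rows):
    """Turn recognized cell strings into a list of dictionaries, one per row."""
    records = []
    for row in rows:
        # OCR output often carries stray whitespace around the text.
        cells = [cell.strip() for cell in row]
        records.append(dict(zip(header, cells)))
    return records


def index_by(records, key):
    """Build a lookup dictionary keyed by one column, e.g. the region index."""
    return {record[key]: record for record in records}


# Placeholder strings standing in for OCR results (not real output).
header = ["No", "Kabupaten/Kota", "Value"]
rows = [
    ["1 ", "Region A", " 58"],
    ["2", "Region B ", "85"],
]

records = rows_to_records(header, rows)
by_index = index_by(records, "No")
print(by_index["2"]["Value"])
```

Keying the lookup on the index column rather than the region name follows from the note above: the indexes are detected reliably, while some names are not.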

My code is far from perfect. If you find an error or room for improvement, write me a comment!
