Joyli66/read-table-image

Read-table-image

This project converts images of tables into editable text and outputs a structured JSON file.

Requirements

Python 3.7
OpenCV 3.4.2
Tesseract 5.0.0
Jsonschema 3.0.2

Usage

Edit read_table_image.py as described in the steps here, then run it directly.

  1. Set imgPath to the path of your input images and target_dir to the output location.
  2. To recognize special characters, download the 'traineddata' files you need in tessdata, or fine-tune the model through Tesseract. In the img2text function, add the names of the traineddata files you need to the '-lang' parameter. Use '+' between different traineddata names.
  3. To save intermediate images during processing, follow the instructions in the code comments.
  4. The output is a structured JSON file named 'image name + _table.json'. An image showing the detected cells of the table is also produced.
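
The naming conventions in steps 2 and 4 can be sketched in a few lines of Python. This is only an illustration, not code from read_table_image.py: the helper names build_lang_arg and table_json_path are hypothetical, and it assumes that 'image name' means the input file name without its extension.

```python
import os


def build_lang_arg(traineddata_names):
    """Join traineddata names with '+', as step 2 describes.

    Hypothetical helper: the script itself expects the joined string
    to be written into img2text by hand.
    """
    names = [name.strip() for name in traineddata_names if name.strip()]
    if not names:
        raise ValueError("at least one traineddata name is required")
    return "+".join(names)


def table_json_path(img_path, target_dir):
    """Build the output path 'image name + _table.json' from step 4.

    Assumption: 'image name' is the file name with its extension removed.
    """
    image_name = os.path.splitext(os.path.basename(img_path))[0]
    return os.path.join(target_dir, image_name + "_table.json")


if __name__ == "__main__":
    # Example values only; replace them with your own traineddata and paths.
    print(build_lang_arg(["eng", "chi_sim"]))
    print(table_json_path("images/sample.png", "output"))
```

With these example values, the language string would be 'eng+chi_sim' and the JSON file would be written as sample_table.json inside the output directory.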
