Roll-Face/table_extraction: extract information from tabular data

Architecture

Table detection: detectron2
Table line detection: U-Net architecture plus rule-based post-processing
OCR: easyocr

Train

Prepare dataset:

The training data is private and is not published. Public tabular-data datasets are available online, and you can annotate your own images with labelme (wkentaro/labelme: Image Polygonal Annotation with Python, github.com).
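labelme saves one JSON file per image, with a "shapes" list in which every entry carries a "label", a "shape_type" and a list of "points". As a minimal sketch of how such annotations could be read back for training, the snippet below turns each shape into an axis-aligned bounding box. The function name load_labelme_boxes and the labels "table" and "line" are assumptions made for illustration; they are not taken from this repository.

```python
import json


def load_labelme_boxes(json_text):
    """Return (label, (x_min, y_min, x_max, y_max)) for each labelme shape."""
    data = json.loads(json_text)
    boxes = []
    for shape in data.get("shapes", []):
        xs = [p[0] for p in shape["points"]]
        ys = [p[1] for p in shape["points"]]
        boxes.append((shape["label"], (min(xs), min(ys), max(xs), max(ys))))
    return boxes


# A hand-written annotation in labelme's layout (labels are hypothetical).
example = json.dumps({
    "imagePath": "page_001.png",
    "imageHeight": 800,
    "imageWidth": 600,
    "shapes": [
        {"label": "table", "shape_type": "rectangle",
         "points": [[50.0, 100.0], [550.0, 400.0]]},
        {"label": "line", "shape_type": "line",
         "points": [[50.0, 200.0], [550.0, 200.0]]},
    ],
})

boxes = load_labelme_boxes(example)
for label, box in boxes:
    print(label, box)
```

Boxes like these would suit a detector; for the line-segmentation model the same points would more likely be rasterized into a mask.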

Reference datasets:

Training

Configuration parameters: set them in the file base_config.yaml

sh scripts/train.sh

Demo Table Line

sh scripts/infer_table_line.sh

Step 1: Table detection

Step 2: Table Line

Input:

Output:
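The README does not spell out the rule-based part of the table line step. One plausible rule, shown here purely as an illustration, is to treat a row or column of the segmentation mask as a ruling line when most of its pixels are foreground, and to merge adjacent hits into a single position. The function find_line_positions and the min_fraction threshold are hypothetical and are not the repository's code.

```python
def find_line_positions(mask, axis, min_fraction=0.8):
    """Return row (axis=0) or column (axis=1) indices that look like ruling lines.

    mask is a list of rows of 0/1 values, e.g. a thresholded segmentation output.
    A row or column counts as a line when at least min_fraction of its pixels
    are set; runs of adjacent hits are merged into their midpoint.
    """
    if axis == 1:
        mask = [list(col) for col in zip(*mask)]  # transpose: columns become rows
    hits = [i for i, row in enumerate(mask) if sum(row) >= min_fraction * len(row)]
    merged, run = [], []
    for i in hits:
        if run and i != run[-1] + 1:  # gap found: close the current run
            merged.append((run[0] + run[-1]) // 2)
            run = []
        run.append(i)
    if run:
        merged.append((run[0] + run[-1]) // 2)
    return merged


# Toy mask of a 2x2 grid; the middle horizontal line is two pixels thick.
mask = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1],
]
row_lines = find_line_positions(mask, axis=0)
col_lines = find_line_positions(mask, axis=1)
print(row_lines, col_lines)
```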

Table OCR

Step 1: Table detection

Step 2: Table line

Step 3: Crop the image into cells along the detected lines

Step 4: OCR

Step 5: Save the result as a CSV or Excel file

sh scripts/infer_table_ocr.sh

Input: ./datasets/demo_examples/demo2.png

Output: ./results/demo.csv
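Steps 3 to 5 above can be sketched as follows, assuming the earlier steps have already produced sorted row and column line coordinates. The helpers crop, cell_boxes and table_to_csv are hypothetical names, and the OCR engine is passed in as a plain callable so the sketch runs without any model; in the actual pipeline that callable would wrap the OCR library named in the Architecture section.

```python
import csv
import io


def crop(image, box):
    """Cut the (x0, y0, x1, y1) region out of an image stored as a list of rows."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def cell_boxes(row_lines, col_lines):
    """Build a grid of (x0, y0, x1, y1) cell boxes from sorted line coordinates."""
    return [
        [(x0, y0, x1, y1) for x0, x1 in zip(col_lines, col_lines[1:])]
        for y0, y1 in zip(row_lines, row_lines[1:])
    ]


def table_to_csv(image, row_lines, col_lines, ocr, out):
    """Crop each cell, run the ocr callable on it, write one CSV row per table row."""
    writer = csv.writer(out)
    for row in cell_boxes(row_lines, col_lines):
        writer.writerow([ocr(crop(image, box)) for box in row])


# Toy "image": each character stands in for a pixel column, one string per row.
image = ["NameAge", "Ann 31 "]


def fake_ocr(cell):
    """Stand-in for a real OCR call: join the cropped characters."""
    return "".join(cell).strip()


buf = io.StringIO()
table_to_csv(image, row_lines=[0, 1, 2], col_lines=[0, 4, 7], ocr=fake_ocr, out=buf)
print(buf.getvalue())
```

Writing to an open file instead of io.StringIO gives a file such as the demo.csv output listed above; opening it with newline="" is what the csv module documentation recommends.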

Docker

docker run --name table_extraction nam157/table_extraction:v1.0.0
