GVCL/ChartDecode

ChartDecode

This system adapts color-based and geometry-based data extraction for line charts, pie charts, and scatter plots and their variants (simple, dot, bubble).

Data extraction for bar charts is based on the tensor voting computational model (Tensor-field-framework-for-chart-analysis).

Text Detection and Recognition Module

This module performs text detection and recognition on the chart image. We use a deep-learning-based OCR pipeline: Character Region Awareness for Text Detection, CRAFT (Paper | Code), followed by a scene text recognition framework, STR (Paper | Code).

To run the code

Before running the code:

  1. Download the pretrained model craft_mlt_25k.pth and place it at ChartDecode/CRAFT_TextDetector/craft_mlt_25k.pth
  2. Download the pretrained model TPS-ResNet-BiLSTM-Attn.pth and place it at ChartDecode/Deep_TextRecognition/TPS-ResNet-BiLSTM-Attn.pth
  3. The code was developed and tested on Python 3.6. Use the attached requirements.txt to avoid errors caused by compatibility issues.
  4. Finally, run main.py and provide the path to your chart image file. It generates the following output files:
    1. data_filename.csv: contains the extracted data values along with additional semantic attributes such as chart_type, title, x-title, and y-title, which help in chart reconstruction and summarization
    2. Reconstructed_filename.png: the chart image reconstructed from the extracted data_filename.csv file
    3. summ_filename.txt: the chart text summary, generated with a templated NLG approach based on our user-study observations
  5. The synthetically generated test dataset for this system, along with its results, is available at ChartDecode/SYNTHETIC_DATA.
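
The setup steps above can be checked before launching main.py. The following is a minimal sketch, not part of the repository: the helper names `missing_weights` and `expected_outputs` are hypothetical, and it assumes that `filename` in the output names stands for the input image name without its extension. Where main.py actually writes its outputs is not stated here, so only the file names are derived.

```python
from pathlib import Path

# Weight locations relative to the repository root, as listed in steps 1 and 2.
REQUIRED_WEIGHTS = (
    Path("CRAFT_TextDetector") / "craft_mlt_25k.pth",
    Path("Deep_TextRecognition") / "TPS-ResNet-BiLSTM-Attn.pth",
)


def missing_weights(repo_root):
    """Return the required weight files that are absent under repo_root."""
    root = Path(repo_root)
    return [root / rel for rel in REQUIRED_WEIGHTS if not (root / rel).is_file()]


def expected_outputs(image_path):
    """Derive the three output file names for a chart image.

    Assumption: 'filename' is the image name without its extension.
    """
    stem = Path(image_path).stem
    return [
        "data_{}.csv".format(stem),
        "Reconstructed_{}.png".format(stem),
        "summ_{}.txt".format(stem),
    ]


if __name__ == "__main__":
    # "ChartDecode" is the repository root used in the paths above.
    for path in missing_weights("ChartDecode"):
        print("missing pretrained model:", path)
    # Hypothetical input image path, for illustration only.
    print(expected_outputs("charts/line_01.png"))
```

Running the check first reports a misplaced model file by path, rather than leaving it to surface later as a load error inside the detector or recognizer.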

About

MS2019005 (Thesis Code) - Data extraction of Line, Pie, and Scatter plots and their variants
