
Provided scripts for deriving training and testing data for table detection tasks.


MBAigner/Table-Detection-Data

Table Detection Data

This repository provides scripts for deriving training and test data for table detection tasks.

The data that was initially used can be requested at https://doc-analysis.github.io/. It cannot be provided here, so it has to be requested separately and stored in the folder /data/tablebank/Detection_data.
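Before running any of the scripts, it can help to confirm that the requested data is in the expected place. The following is a minimal sketch, not part of the repository: the helper name `check_data_dir` is hypothetical, and it assumes the folder path is resolved relative to the repository root.

```python
from pathlib import Path

# Assumption: the README path /data/tablebank/Detection_data is relative
# to the repository root, not to the filesystem root.
DATA_DIR = Path("data") / "tablebank" / "Detection_data"


def check_data_dir(root: Path = Path(".")) -> Path:
    """Return the data folder under `root`, or raise if it does not exist."""
    data_dir = root / DATA_DIR
    if not data_dir.is_dir():
        raise FileNotFoundError(
            f"Expected the requested data in {data_dir}; "
            "request it at https://doc-analysis.github.io/ and store it there."
        )
    return data_dir
```

Calling `check_data_dir()` from the repository root either returns the folder path or fails early with a message that points to where the data can be requested.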

The following scripts are provided:

  • TableBankGTParsing.py extracts the URLs stored in the TableBank data and writes them to a CSV file. It also re-formats the ground truth into a .csv file.
  • WordData.ipynb and TableBankWord.py check whether the Word URLs are accessible.
  • TableBankWord2PDF.ipynb converts retrieved Word files into PDF documents.
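
The accessibility check described for the Word URLs can be sketched roughly as follows. This is an illustration only and does not reproduce the code in WordData.ipynb or TableBankWord.py: the function names, the use of HEAD requests, the timeout, and the report columns are all assumptions.

```python
import csv
import http.client
import urllib.request
from typing import Callable, Iterable, List, Tuple


def url_is_accessible(url: str, timeout: float = 10.0) -> bool:
    """Return True if a HEAD request to `url` ends with a 2xx status."""
    try:
        # Assumption: a HEAD request is enough to judge accessibility;
        # some servers reject HEAD and would need a GET instead.
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError, http.client.HTTPException):
        # Covers HTTP errors, network failures, timeouts and malformed URLs.
        return False


def check_urls(
    urls: Iterable[str],
    checker: Callable[[str], bool] = url_is_accessible,
) -> List[Tuple[str, bool]]:
    """Pair every URL with the result of the accessibility check."""
    return [(url, checker(url)) for url in urls]


def write_report(rows: Iterable[Tuple[str, bool]], path: str) -> None:
    """Write the (url, accessible) pairs to a CSV file with a header row."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["url", "accessible"])  # hypothetical column names
        writer.writerows(rows)
```

Passing the checker as a parameter keeps `check_urls` testable without network access; a stricter or more lenient definition of "accessible" can be swapped in the same way.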
