mihaighidoveanu/Sparktech-Hackathon-Textract

Sparktech-Hackathon-Textract

Extract information from PDFs. Turn unstructured data into structured data. http://www.sparktech.ro/textract/

Dependencies: Python 2.7.x

Libraries: sklearn.svm (part of scikit-learn), numpy, glob (part of the Python standard library)
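
Before running the project, it can help to confirm that the libraries listed above are importable in the active interpreter. The snippet below is a minimal sketch of such a check and is not part of the repository; the function name check_dependencies and the REQUIRED_MODULES list are hypothetical, built only from the module names in this README.

```python
from __future__ import print_function

import importlib

# Module names taken from the dependency list in this README.
REQUIRED_MODULES = ["sklearn.svm", "numpy", "glob"]


def check_dependencies(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it imports, else False.

    Hypothetical helper for illustration; it is not shipped with the project.
    """
    status = {}
    for name in modules:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            # The module (or one of its parent packages) is not installed.
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in sorted(check_dependencies().items()):
        print("{0}: {1}".format(name, "ok" if ok else "MISSING"))
```

Any module reported as MISSING would need to be installed before "main.py" can be expected to run; glob ships with Python, so it should always be reported as ok.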

How to run the project: run the "main.py" Python script. The script outputs the tables found in the test PDF files.
