ExtractTable - API to extract tabular data from images and scanned PDFs
The goal is to make it easy for developers to extract tabular data from images or scanned PDF files without worrying about the table area, column coordinates, rotation, and so on.
Before we boast about the service: you MUST have an API key to use ExtractTable. FREE credits here - check data privacy in FAQ.
pip install -U ExtractTable
OK, enough selling. Let the ease of coding do the talking, and let the output encourage you to buy credits. Start a timer and count the lines of code.
from ExtractTable import ExtractTable

et_sess = ExtractTable(api_key=YOUR_API_KEY)  # Replace with your valid API key
print(et_sess.check_usage())  # Checks the API key's validity and shows the associated plan usage

table_data = et_sess.process_file(filepath=Location_of_Image_with_Tables, output_format="df")

# To process a PDF, pass the pages parameter ("1", "1,3-4", "all") to process_file
table_data = et_sess.process_file(filepath=Location_of_PDF_with_Tables, output_format="df", pages="all")
Whoa, as simple as that?!
Certainly. Did you know that current ExtractTable users run it on:
- Bank Statement
- Medical Records
- Invoice Details
- Tax forms
It's up to you now to explore other uses.
What else is in store
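The quick-start shows three shapes for the pages value: "1", "1,3-4" and "all". If you want to sanity-check such a string on your side before spending credits, a minimal sketch could look like the one below. Note that expand_pages and total_pages are hypothetical names made up for this illustration; this is not how the library parses pages internally, just one plausible reading of the format.

```python
from typing import List


def expand_pages(spec: str, total_pages: int) -> List[int]:
    """Expand a pages string such as "1", "1,3-4" or "all" into page numbers.

    Hypothetical helper: it mirrors the formats shown for the `pages`
    parameter, but it is not part of the ExtractTable library.
    """
    spec = spec.strip().lower()
    if spec == "all":
        return list(range(1, total_pages + 1))

    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty entry in page spec: {spec!r}")
        if "-" in part:
            # A range such as "3-4" covers both ends.
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
        else:
            start = end = int(part)
        if not 1 <= start <= end <= total_pages:
            raise ValueError(f"page range {part!r} is outside 1..{total_pages}")
        pages.update(range(start, end + 1))
    return sorted(pages)
```

For a five-page PDF, expand_pages("1,3-4", total_pages=5) returns [1, 3, 4], and a value like "7" raises ValueError before any request is made.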
ExtractTable._OUTPUT - check the list of available output formats
et_sess.ServerResponse.json() - check the latest server response attached to the session
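The session's ServerResponse is handy when an extraction fails and you want to see what the server actually said. Here is a rough sketch of that idea. The wrapper name extract_or_explain is hypothetical, and the sketch assumes only what appears above: a session with process_file(...) and a ServerResponse whose json() returns the decoded body. Which exceptions the library raises is not assumed, so the sketch catches broadly.

```python
def extract_or_explain(session, filepath, **options):
    """Run process_file and, on failure, attach the last server response.

    `session` is expected to behave like the et_sess object from the
    quick-start. This wrapper is an illustration, not library code.
    """
    try:
        return session.process_file(filepath=filepath, output_format="df", **options)
    except Exception as error:  # the library's exception types are not assumed
        response = getattr(session, "ServerResponse", None)
        details = None
        if response is not None:
            try:
                details = response.json()
            except ValueError:
                # The body was not valid JSON, so there is nothing to show.
                details = None
        raise RuntimeError(
            f"extraction failed for {filepath!r}: {error}; server said: {details}"
        ) from error
```

Call it as extract_or_explain(et_sess, Location_of_PDF_with_Tables, pages="all"); on success you get back whatever process_file returned, unchanged.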
Pull Requests & Rewards
Pull requests are most welcome and are rewarded with API credits.
This project is licensed under the Apache License 2.0, see the LICENSE file for details.
Follow us on social media for library updates and free credits.