Performance improvement for fixed table Pdfs #301

Closed
rannj005 opened this issue Mar 29, 2019 · 5 comments

Comments

@rannj005

rannj005 commented Mar 29, 2019

I am trying to extract table data from PDFs using the lattice flavor. The table layout is fixed: it has a fixed boundary and a fixed width for each column. I have also tried specifying the table_areas parameter. Given that the table boundaries and columns are fixed, is there a way to reduce the time taken for extraction? I have attached a sample PDF. Please suggest a way forward.
sample.pdf
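
For reference, the setup described above can be sketched roughly as follows. This is only a sketch, not the exact code from the report: the helper names `timed` and `extract_fixed_table`, the page selection, and the coordinates in the usage comment are assumptions made for illustration. `camelot.read_pdf` with `flavor="lattice"` and `table_areas` is Camelot's documented interface, where each area is a string `"x1,y1,x2,y2"` giving the top-left and bottom-right corners in PDF coordinate space. Wrapping the call in a timer gives a baseline number to compare against when trying out changes.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn(*args, **kwargs) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def extract_fixed_table(path: str, table_area: str, pages: str = "1") -> Any:
    """Extract a table with the lattice flavor, restricted to one fixed area.

    table_area is "x1,y1,x2,y2": top-left and bottom-right corners in
    PDF coordinate space. The helper name is hypothetical.
    """
    import camelot  # imported lazily so the timing helper works without it

    return camelot.read_pdf(
        path,
        pages=pages,
        flavor="lattice",
        table_areas=[table_area],
    )


# Example usage (placeholder coordinates, not measured from sample.pdf):
# tables, seconds = timed(extract_fixed_table, "sample.pdf", "50,750,550,50")
# print(f"{tables.n} table(s) extracted in {seconds:.2f} s")
```

Timing the same call with and without `table_areas`, or with different `pages` values, should show which settings actually affect the run time for a given file.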

@anakin87

What do you mean by performance?
Extraction time, quality of results...?

@rannj005
Author

The extraction takes too long. The quality of the results is perfect.

@rannj005
Author

rannj005 commented Apr 1, 2019

Please suggest ways to improve the speed of PDF table extraction.

@vinayak-mehta
Contributor

@rannj005 Let me look into this.

@vinayak-mehta
Contributor

camelot-dev/camelot#20 could improve performance.
