use table extractor with table_areas #367

CartierPierre · 2019-07-26T14:38:00Z

Hi,
I'm having an issue with table areas. When I use the web server to select an area to extract, I get the table.
But the coordinates do not match those of the same area selected in the PDF with other software (Adobe, pdfviewer, or pdf2img).

So when I pass the table_areas param with the true values (which I assume are the ones from Adobe and the others 😄), camelot extracts the wrong areas.
Is it possible to unify this?
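For reference, here is a minimal sketch of the kind of conversion that unifying the two conventions would involve. It assumes the other tool reports a box as (left, top, right, bottom) in image pixels with the origin at the top-left, rendered at a known DPI, and that camelot expects PDF points with the origin at the bottom-left, where (x1, y1) is the top-left corner and (x2, y2) the bottom-right. The function name `image_to_pdf_area` and its parameters are hypothetical, not part of camelot.

```python
def image_to_pdf_area(area_px, image_height_px, dpi=72.0):
    """Convert a pixel box (top-left origin) to PDF points (bottom-left origin).

    area_px is (left, top, right, bottom) in pixels. This helper is an
    illustration only and is not part of camelot.
    """
    left, top, right, bottom = area_px

    def to_points(value_px):
        # One PDF point is 1/72 inch, so pixels * 72 / dpi gives points.
        return value_px * 72.0 / dpi

    page_height_pt = to_points(image_height_px)
    # The x axis only needs scaling; the y axis is also flipped.
    x1 = to_points(left)
    y1 = page_height_pt - to_points(top)
    x2 = to_points(right)
    y2 = page_height_pt - to_points(bottom)
    return (x1, y1, x2, y2)


# A box selected on a 200 DPI rendering that is 2200 px tall (a US Letter page).
area = image_to_pdf_area((100, 200, 500, 600), image_height_px=2200, dpi=200)
table_area = ",".join(str(v) for v in area)  # string form used by table_areas
```

If the other tool already reports points rather than pixels, leaving `dpi` at 72 makes this a pure vertical flip.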

In addition, why use a list of strings ["area1", "area2"] instead of a list of lists [[area1], [area2]]?
A list of lists is lighter on memory and doesn't require splitting a string to extract x1,y1,x2,y2.
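Until that changes, a caller who keeps areas as lists of numbers can convert them at the call site. The sketch below assumes each area is a sequence of four numbers in the order x1, y1, x2, y2; the helper name `areas_to_strings` is hypothetical and not part of camelot.

```python
def areas_to_strings(areas):
    """Turn [[x1, y1, x2, y2], ...] into ["x1,y1,x2,y2", ...].

    Illustration only; this helper is not part of camelot.
    """
    strings = []
    for area in areas:
        if len(area) != 4:
            raise ValueError(f"expected 4 coordinates, got {len(area)}: {area!r}")
        strings.append(",".join(str(value) for value in area))
    return strings


areas = [[316, 499, 566, 337], [50, 700, 300, 600]]
table_areas = areas_to_strings(areas)

# The result can then be passed on, for example:
#   camelot.read_pdf("file.pdf", flavor="stream", table_areas=table_areas)
```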

Can you take a look?

PS: I sent you an email about a new method that mixes Lattice and Stream.

CartierPierre · 2019-07-26T14:47:31Z

camelot/camelot/utils.py

Line 193 in 0efb3ca

def scale_pdf(k, factors):

Is it possible to bypass this if the area is already scaled to the image?

vinayak-mehta · 2019-07-28T12:13:32Z

Closed in favor of camelot-dev/camelot#40.

vinayak-mehta mentioned this issue Jul 28, 2019

Unify table_area input with output from Adobe PDF Viewer camelot-dev/camelot#40

Open

vinayak-mehta closed this as completed Oct 14, 2019
