Multiple table in 1 page #22
Comments
@khun84 Yes, you can specify the page number twice, along with the area (or use the So something like You can pursue the Java approach, but it's really only useful if you know the underlying tabula Java library well, and that is not well documented anywhere.
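The inline example from the comment above did not survive on this page, so here is a minimal sketch of the "page number twice, along with the area" idea. It assumes tabulizer's `extract_tables()` accepts `pages`, `area` (a list of `c(top, left, bottom, right)` vectors, one per entry in `pages`) and `guess`. The helper `tables_on_page()`, the file name, and all coordinates are hypothetical placeholders, not values from this thread.

```r
# Hypothetical helper (not part of tabulizer): repeat the page number once per
# table region so that `pages` and `area` end up with the same length.
tables_on_page <- function(page, areas) {
  stopifnot(is.list(areas), all(lengths(areas) == 4))
  list(pages = rep(page, length(areas)), area = areas)
}

# Placeholder coordinates, each given as c(top, left, bottom, right).
# These numbers are assumptions; measure the real ones for your document.
args <- tables_on_page(
  page  = 1,
  areas = list(
    c(50, 30, 300, 560),   # region around table 1
    c(320, 30, 600, 560)   # region around table 2
  )
)

pdf_file <- "example.pdf"  # hypothetical file name

# Only attempt the extraction when the package and the file are available.
if (requireNamespace("tabulizer", quietly = TRUE) && file.exists(pdf_file)) {
  tables <- tabulizer::extract_tables(
    pdf_file,
    pages = args$pages,  # c(1, 1): page 1 listed once per table
    area  = args$area,
    guess = FALSE        # use the supplied areas instead of auto-detection
  )
  str(tables)            # expected to hold one element per requested region
}
```

If the coordinates are not known in advance, tabulizer's interactive `locate_areas()` may help to find them, though whether it suits a given workflow is a judgment call.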
Thanks for the clarification... I've tried with Is there any function that can return the entire content of the PDF in a DOM-like format? In that case, I can traverse the DOM tree and extract what I want.
Hi @leeper - I've recently run into similar issues, but with multi-page documents and a random number of tables per page. I found that the 'spreadsheet' method on the command line and/or via Tabula's interface will drag them out. The I've edited the
Yes, please send a PR!
Migrated from ropensci/tabulizerjars#1 (@khun84)
Is there a parameter that I can pass in to extract more than 1 table per page?
I have a PDF page with 2 tables:
I use the `extract_table()` function with default params and the output only has 1 table (table 1). What I can think of is to set `method = 'asis'`, but I do not know how to proceed with the output Java object. Is there any documentation I can refer to?