ValueError: max() arg is an empty sequence #29

Closed
SebastianDeLaile opened this issue Jul 10, 2019 · 6 comments

@SebastianDeLaile

SebastianDeLaile commented Jul 10, 2019

Running Camelot on this document (https://www.qao.qld.gov.au/sites/qao/files/annual-reports/annual_report_2016-17.pdf) throws the following ValueError when it reaches page 4:

import camelot
camelot.read_pdf(path, pages='3', flavor='stream')

Traceback (most recent call last):
  File "", line 2, in
  File "C:\Users\sdelail\AppData\Local\Continuum\anaconda3\envs\Financial_Extraction\lib\site-packages\camelot\io.py", line 117, in read_pdf
    **kwargs
  File "C:\Users\sdelail\AppData\Local\Continuum\anaconda3\envs\Financial_Extraction\lib\site-packages\camelot\handlers.py", line 172, in parse
    p, suppress_stdout=suppress_stdout, layout_kwargs=layout_kwargs
  File "C:\Users\sdelail\AppData\Local\Continuum\anaconda3\envs\Financial_Extraction\lib\site-packages\camelot\parsers\stream.py", line 458, in extract_tables
    cols, rows = self._generate_columns_and_rows(table_idx, tk)
  File "C:\Users\sdelail\AppData\Local\Continuum\anaconda3\envs\Financial_Extraction\lib\site-packages\camelot\parsers\stream.py", line 349, in _generate_columns_and_rows
    ncols = max(set(elements), key=elements.count)
ValueError: max() arg is an empty sequence

It's easy enough to catch with a try/except, but I thought I would post it here to let you know.
Thanks for writing this package, excellent work!
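
For anyone reading the traceback: the failing line picks the most frequent value in `elements`, and `max()` raises as soon as that list is empty. Below is a minimal sketch of that expression in isolation. The helper name `most_common` and the `default=None` fallback are illustrative assumptions, not Camelot's code or its fix.

```python
def most_common(elements, default=None):
    """Return the most frequent value in `elements`, or `default` if it is empty."""
    # Same expression as in the traceback, plus max()'s `default` keyword,
    # which is returned instead of raising when the iterable is empty.
    return max(set(elements), key=elements.count, default=default)


# With an empty list, the original expression raises ValueError.
error_message = None
try:
    empty = []
    max(set(empty), key=empty.count)
except ValueError as exc:
    error_message = str(exc)

print(error_message)           # exact wording depends on the Python version
print(most_common([3, 3, 2]))  # most frequent value
print(most_common([]))         # falls back to the default
```

Whether an empty `elements` should mean "no table on this page" is a decision for the library, so this only shows where the exception comes from.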

@akshowhini

@SebastianDeLaile That looks like a problem with the 3rd page, maybe because the page is an image rather than text.

@vinayak-mehta
Member

Yep, the 3rd page looks like an image.

@dimitern
Contributor

dimitern commented Aug 2, 2019

I also got this exception when no tables were recognized.

@amansani

I have the same problem when there is only a header with no data under it, or when the tables are not recognized. Can someone tell us how to fix this? Thanks.

@amansani

amansani commented Jul 8, 2020

As a temporary fix, I added try/except blocks to skip the table when something like this occurs.
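
A rough sketch of that kind of workaround, reading one page at a time and skipping pages that raise. The helper `read_pages_safely` is hypothetical (it is not part of Camelot); it takes the reader function as an argument, and the intended caller would pass `camelot.read_pdf`.

```python
def read_pages_safely(read_fn, path, pages, **kwargs):
    """Call `read_fn` once per page and skip pages that raise ValueError."""
    results = {}   # page -> whatever read_fn returned for that page
    skipped = []   # (page, error message) for pages that failed
    for page in pages:
        try:
            results[page] = read_fn(path, pages=str(page), **kwargs)
        except ValueError as exc:
            # e.g. "max() arg is an empty sequence" on an image-only page
            skipped.append((page, str(exc)))
    return results, skipped


# Intended use (assumes camelot is installed and `path` points to the PDF):
#   import camelot
#   results, skipped = read_pages_safely(
#       camelot.read_pdf, path, [3, 4], flavor="stream"
#   )
```

Catching only `ValueError` keeps other failures (a missing file, for instance) visible, and the `skipped` list records which pages were dropped so they can be checked by hand.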

@baridhi

baridhi commented Aug 11, 2020

I also faced this issue. I think I'm also going to work around it temporarily with a try-except block.


6 participants