Unable to extract table or text #27

Closed
atulpuri opened this issue Mar 1, 2017 · 2 comments
Comments

atulpuri commented Mar 1, 2017

Hi, I have been trying to extract tables using the extract_tables function, which was working well until I updated to the newer version.

Most functions are now returning the same error:
ValueError: Cannot convert None to Decimal.

This error occurred when I tried the functions extract_table, extract_tables, find_tables, extract_text and extract_words. I have not changed the table settings from the defaults. The PDF I tried this on was https://github.com/jsvine/pdfplumber/blob/master/examples/pdfs/ca-warn-report.pdf

Please let me know what may be causing this error and how it can be worked around.

Owner

jsvine commented Mar 1, 2017

Hi, @atulpuri, and thanks for opening this issue! Those functions seem to be working fine for me on that PDF. For instance, when I run this code:

import pdfplumber
pdf = pdfplumber.open("ca-warn-report.pdf")
print(pdf.pages[0].extract_table())

... I get this:

[['Notice Date', 'Effective', 'Received', 'Company', 'City', 'No. Of', 'Layoff/Closure'], ['06/22/2015', '0  3  / 2  5  / 2  0  16', '0  7  / 0  1  / 2  0  15', 'Maxim Integrated Product', 'San Jose', '150', 'Closure Permanent'], ['06/30/2015', '0  8  / 2  9  / 2  0  15', '0  7  / 0  1  / 2  0  15', 'McGraw-Hill Education', 'Monterey', '137', 'Layoff Unknown at this time'], ['06/30/2015', '0  8  / 3  0  / 2  0  15', '0  7  / 0  1  / 2  0  15', 'Long Beach Memorial Medical Center', 'Long Beach', '90', 'Layoff Permanent'], ['07/01/2015', '0  9  / 0  2  / 2  0  15', '0  7  / 0  1  / 2  0  15', 'Leidos', 'El Segundo', '72', 'Layoff Permanent'] [...]

In order to investigate further, it'd be helpful to have the following:

  • The particular snippet of code you're using
  • The full error message it produces
  • The version of Python you're using
  • Your operating system
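The details in the list above can be gathered in one go. What follows is only a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`); the helper names `environment_report` and `run_and_capture` are hypothetical and are not part of pdfplumber.

```python
import platform
import sys
import traceback
from importlib import metadata


def environment_report(package="pdfplumber"):
    """Collect the Python version, OS, and installed package version as a dict."""
    try:
        pkg_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        pkg_version = "not installed"
    return {
        "python": sys.version.split()[0],
        "os": platform.platform(),
        "package": package,
        "package_version": pkg_version,
    }


def run_and_capture(func, *args, **kwargs):
    """Call func; return (result, None) on success or (None, traceback text) on failure."""
    try:
        return func(*args, **kwargs), None
    except Exception:
        # format_exc() returns the full traceback, not just the final message
        return None, traceback.format_exc()


if __name__ == "__main__":
    for key, value in environment_report().items():
        print("{0}: {1}".format(key, value))
```

To capture the full error message, the failing call can be wrapped in a function and passed to `run_and_capture`; the second element of the returned tuple is the complete traceback text, ready to paste into an issue.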

Author

atulpuri commented Mar 2, 2017

Hi, @jsvine, the error I mentioned isn't appearing any longer.

The code snippet I used was the same as posted by you.
Since the error isn't appearing any longer, I cannot send the entire error message, but the error was raised in the decimalize function in utils.
I am using Python 3.5 on a Windows 7 machine.

Thank you, and sorry about the bother.
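For readers who hit the same message: the thread never identifies the root cause, so the following is only an illustrative sketch of how a decimalize-style helper could produce "Cannot convert None to Decimal." when it is handed a missing value. The function `to_decimal` is a hypothetical stand-in written for this example and should not be taken as pdfplumber's actual implementation.

```python
import numbers
from decimal import Decimal


def to_decimal(value):
    """Hypothetical decimalize-style helper (an assumption, not pdfplumber's code)."""
    if isinstance(value, Decimal):
        return value
    if isinstance(value, (list, tuple)):
        # Convert each element, preserving the container type
        return type(value)(to_decimal(v) for v in value)
    if isinstance(value, numbers.Integral):
        return Decimal(int(value))
    if isinstance(value, numbers.Real):
        # Go through str() to avoid binary floating-point noise
        return Decimal(str(value))
    # None, strings, and other types end up here
    raise ValueError("Cannot convert {0} to Decimal.".format(value))


if __name__ == "__main__":
    print(to_decimal((1, 2.5)))
    try:
        to_decimal(None)
    except ValueError as exc:
        print(exc)
```

Under this assumption, any numeric attribute that comes back as None would trigger the error regardless of which extraction function was called, which would be consistent with the same message appearing across several functions.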
