[BUG] issue just running sample code #382

Closed
2 tasks done
guocity opened this issue Apr 24, 2024 · 1 comment

Comments

guocity commented Apr 24, 2024

Summary

Running the sample code raises a CalledProcessError.

Did you read the FAQ?

  • I have read the FAQ

Did you search GitHub Discussions?

  • I have searched the discussions

(Optional) PDF URL

No response

About your environment

import tabula
pdf_path = "https://github.com/chezou/tabula-py/raw/master/tests/resources/data.pdf"
tabula.read_pdf(pdf_path, stream=True)

---------------------------------------------------------------------------
CalledProcessError                        Traceback (most recent call last)
Cell In[7], line 3
      1 import tabula
      2 pdf_path = "https://github.com/chezou/tabula-py/raw/master/tests/resources/data.pdf"
----> 3 tabula.read_pdf(pdf_path, stream=True)

File /opt/homebrew/lib/python3.11/site-packages/tabula/io.py:395, in read_pdf(input_path, output_format, encoding, java_options, pandas_options, multiple_tables, user_agent, use_raw_url, pages, guess, area, relative_area, lattice, stream, password, silent, columns, relative_columns, format, batch, output_path, force_subprocess, options)
    392     raise ValueError(f"{path} is empty. Check the file, or download it manually.")
    394 try:
--> 395     output = _run(
    396         tabula_options,
    397         java_options,
    398         path,
    399         encoding=encoding,
    400         force_subprocess=force_subprocess,
    401     )
    402 finally:
    403     if temporary:

File /opt/homebrew/lib/python3.11/site-packages/tabula/io.py:82, in _run(options, java_options, path, encoding, force_subprocess)
     79 elif set(java_options) - IGNORED_JAVA_OPTIONS:
     80     logger.warning("java_options is ignored until rebooting the Python process.")
---> 82 return _tabula_vm.call_tabula_java(options, path)
...
--> 571         raise CalledProcessError(retcode, process.args,
    572                                  output=stdout, stderr=stderr)
    573 return CompletedProcess(process.args, retcode, stdout, stderr)

CalledProcessError: Command '['java', '-Djava.awt.headless=true', '-Dfile.encoding=UTF8', '-jar', '/opt/homebrew/lib/python3.11/site-packages/tabula/tabula-1.0.5-jar-with-dependencies.jar', '--stream',
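The traceback above ends at the failing java command, so the actual Java error message is not visible. A minimal sketch of how the captured stderr of a failed subprocess can be inspected is shown here. It uses only the standard library; the helper name run_and_report and the stand-in failing command are hypothetical and are not part of tabula-py.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd; on failure return the exit code and captured stderr instead of raising."""
    try:
        done = subprocess.run(cmd, check=True, capture_output=True, text=True)
        return 0, done.stderr
    except subprocess.CalledProcessError as err:
        # CalledProcessError carries the exit code and whatever stderr was captured.
        return err.returncode, err.stderr


# Hypothetical stand-in for the failing java command:
# a child Python process that writes to stderr and exits with code 3.
failing = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"]
code, stderr_text = run_and_report(failing)
print(code, stderr_text)
```

The same pattern could presumably be applied around tabula.read_pdf by catching subprocess.CalledProcessError and printing err.stderr, assuming the exception raised there carries the Java process's stderr as the traceback suggests.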

What did you do when you faced the problem?

Installed the package with pip install tabula-py, then ran the code below, which raised the error shown above.

Code

import tabula
pdf_path = "https://github.com/chezou/tabula-py/raw/master/tests/resources/data.pdf"
tabula.read_pdf(pdf_path, stream=True)

Expected behavior

The table in the PDF is read successfully.

Actual behavior


The call raises CalledProcessError. The traceback is identical to the one shown under "About your environment" above, including the truncated java command.

Related issues

No response

Owner

chezou commented Apr 24, 2024

Please follow the issue template. You didn't provide an appropriate answer for "About your environment".
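For anyone hitting the same request, the following is a rough sketch of how the basic environment details (Python version, platform, Java version) could be gathered with the standard library alone. The function name collect_environment is hypothetical. tabula-py also provides tabula.environment_info(), which, as far as I know, is what the issue template expects to be pasted; the sketch below is not a replacement for it.

```python
import platform
import shutil
import subprocess
import sys


def collect_environment():
    """Gather Python, OS, and Java details that help diagnose a failing java call."""
    info = {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "java_path": shutil.which("java"),
    }
    if info["java_path"] is None:
        info["java_version"] = "java not found on PATH"
    else:
        # `java -version` writes its output to stderr rather than stdout.
        result = subprocess.run(
            [info["java_path"], "-version"], capture_output=True, text=True
        )
        info["java_version"] = (result.stderr or result.stdout).strip()
    return info


env = collect_environment()
for key, value in env.items():
    print(f"{key}: {value}")
```

A missing or broken Java installation is one plausible cause of a CalledProcessError from the java command, which is why the Java path and version are included here.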

@chezou chezou closed this as completed Apr 24, 2024