non-zero exit status #19
Comments
I can't reproduce your error.
~/t/tabula-py (master ☡=) (default) ipython 15:41:58
Python 3.5.2 (default, Oct 11 2016, 05:05:28)
Type "copyright", "credits" or "license" for more information.
IPython 5.1.0 -- An enhanced Interactive Python.
? -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help -> Python's own help system.
object? -> Details about 'object', use 'object??' for extra details.
In [1]: import tabula
In [2]: df = tabula.read_pdf("https://github.com/tabulapdf/tabula-java/raw/master/src/test/resources/technology/tabula/arabic.pdf")
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF-8
In [3]: df
Out[3]:
مرحبًا اسمي سلطان
0 انا من ولاية كارولينا الشمال من اين انت؟
1 1234 عندي 47 قطط
2 هل انت شباك؟ اسمي Jeremy في الانجليزية
3 Jeremy is جرمي in Arabic NaN
In [4]: print(df)
مرحبًا اسمي سلطان
0 انا من ولاية كارولينا الشمال من اين انت؟
1 1234 عندي 47 قطط
2 هل انت شباك؟ اسمي Jeremy في الانجليزية
3 Jeremy is جرمي in Arabic NaN
It seems to be a tabula-java issue. How about using tabula-java directly, as follows:
Thanks for looking into this! It turns out my terminal was running an old version of Java: http://stackoverflow.com/questions/12757558/installed-java-7-on-mac-os-x-but-terminal-is-still-using-version-6
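For anyone hitting the same NoSuchMethodError: `Integer.compare(int, int)` was added in Java 7, so the error suggests that the `java` found on PATH is Java 6 or older, even when a newer Java is installed. Below is a minimal sketch for checking which version a subprocess call would pick up. The helper names `parse_java_major` and `java_major_on_path` are hypothetical and not part of tabula-py.

```python
import re
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Java 8 and earlier report themselves as 1.x, e.g. "1.6.0_65" is Java 6.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def java_major_on_path():
    """Run the `java` that a subprocess would resolve and return its major version."""
    try:
        # `java -version` writes to stderr, so merge stderr into stdout.
        result = subprocess.run(
            ["java", "-version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
        )
    except FileNotFoundError:
        # No `java` executable on PATH at all.
        return None
    return parse_java_major(result.stdout)
```

Calling `java_major_on_path()` from the same environment that runs `tabula.read_pdf` shows which Java is actually used; a value below 7 would be consistent with the error reported here.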
Hi, I tried to run this code on Mac OSX with Anaconda and Java 8:
import tabula
df = tabula.read_pdf("https://github.com/tabulapdf/tabula-java/raw/master/src/test/resources/technology/tabula/arabic.pdf")
print(df)
I got this error message:
Exception in thread "main" java.lang.NoSuchMethodError: java.lang.Integer.compare(II)I
    at technology.tabula.TextChunk.isLtrDominant(TextChunk.java:179)
    at technology.tabula.TextElement.mergeWords(TextElement.java:266)
    at technology.tabula.TextElement.mergeWords(TextElement.java:105)
    at technology.tabula.detectors.NurminenDetectionAlgorithm.detect(NurminenDetectionAlgorithm.java:178)
    at technology.tabula.CommandLineApp.extractTables(CommandLineApp.java:161)
    at technology.tabula.CommandLineApp.main(CommandLineApp.java:60)
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    df = tabula.read_pdf("https://github.com/tabulapdf/tabula-java/raw/master/src/test/resources/technology/tabula/arabic.pdf")
  File "/Users/username/anaconda/lib/python3.5/site-packages/tabula/wrapper.py", line 54, in read_pdf_table
    output = subprocess.check_output(args)
  File "/Users/username/anaconda/lib/python3.5/subprocess.py", line 626, in check_output
    **kwargs).stdout
  File "/Users/username/anaconda/lib/python3.5/subprocess.py", line 708, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['java', '-jar', '/Users/username/anaconda/lib/python3.5/site-packages/tabula/tabula-0.9.1-jar-with-dependencies.jar', '--pages', '1', '--guess', '4931.pdf']' returned non-zero exit status 1
Any idea what might cause this? Thanks!