CParserError: Error tokenizing data. C error: Expected 2 fields in line 733, saw 3 #12
Comments
If there are multiple tables in a file, you should specify the page number with
Thanks for the help @chezou, I tried this:
out:
And: In:
Out:
As you can see, I specified the pages parameter. Any idea how to proceed? Thanks!
Could you try with tabula-java? If page 45 of your PDF includes multiple tables or has merged cells, tabula-py is likely to fail. If you use … Anyway, I can't guess any further without your PDF.
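For anyone landing here: the error in the title comes from the pandas CSV parser, which complains when a row has more fields than the rows before it. That is what you would expect when a page mixes tables of different widths. Below is a minimal, stdlib-only sketch for locating such rows, assuming the extracted table has been saved as delimited text. The helper name `find_ragged_rows` and the sample data are hypothetical and not part of tabula-py.

```python
import csv
import io


def find_ragged_rows(csv_text, delimiter=","):
    """Return (line_number, field_count) for rows whose width differs from the first row."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    expected = None  # field count of the first non-empty row
    ragged = []
    for row in reader:
        if not row:  # skip blank lines
            continue
        if expected is None:
            expected = len(row)
        elif len(row) != expected:
            ragged.append((reader.line_num, len(row)))
    return ragged


# Hypothetical data: a 2-column table with one row that has 3 fields.
sample = "name,value\na,1\nb,2,extra\nc,3\n"
print(find_ragged_rows(sample))  # [(3, 3)]
```

If the reported rows all sit on one page, that page probably holds a second table or merged cells, and it may be worth extracting it separately.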
@alonsopg Was your problem solved by the updated version? If so, I would like to close this issue.
I have the same problem with pages=all. Could anyone help me?
@RAHAAMA Set |
@chezou Thank you. There is another problem with multiple tables: I have a PDF prepared in two languages, so it has two columns (English and French). When I try to extract the tables, it treats all the text as a table. Is there any suggestion for this problem?
I also got the warnings above. I have set
I am trying to extract the tables from a number of pdf documents:
In:
Out:
I tried to use the sep parameter as \t. Nevertheless, it did not work. What can I do?