Problem with parsing PDF table with unicode characters #322

Closed
OlegGavrilov opened this issue May 2, 2019 · 2 comments

@OlegGavrilov

Hello! Sorry for reporting a minor issue, but when I tried to parse a table containing Unicode characters using the Excalibur front end, I got an error:

ERROR:root:'ascii' codec can't encode character u'\xf6' in position 376: ordinal not in range(128)
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/excalibur/tasks.py", line 123, in extract
    tables.export(f_datapath, f=f, compress=True)
  File "/usr/local/lib/python2.7/dist-packages/camelot/core.py", line 701, in export
    self._write_file(f=f, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/camelot/core.py", line 659, in _write_file
    to_format(filepath)
  File "/usr/local/lib/python2.7/dist-packages/camelot/core.py", line 594, in to_html
    f.write(html_string)
UnicodeEncodeError: 'ascii' codec can't encode character u'\xf6' in position 376: ordinal not in range(128)

I fixed that by adding .encode('utf-8') to html_string in the f.write call at core.py:594.

I don't know whether this is a good fix, but I hope it helps someone.
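For anyone hitting the same traceback, here is a minimal sketch of an alternative to calling .encode('utf-8') by hand: open the output file with an explicit encoding so the text is never passed through the default ASCII codec. This is not camelot's actual code. The helper name `write_html` and the sample table content are hypothetical, and the sketch assumes the HTML is a unicode string, as the u'\xf6' in the error suggests.

```python
import io
import os
import tempfile


def write_html(filepath, html_string):
    """Hypothetical helper: write text to disk as UTF-8, independent of locale."""
    # io.open accepts an encoding argument on both Python 2 and Python 3,
    # so no implicit ASCII encode happens when the text is written.
    with io.open(filepath, "w", encoding="utf-8") as f:
        f.write(html_string)


# Illustrative input containing u'\xf6', the character named in the traceback.
html = u"<table><tr><td>K\xf6ln</td></tr></table>"
path = os.path.join(tempfile.mkdtemp(), "table.html")
write_html(path, html)

# Read the file back with the same encoding to confirm the round trip.
with io.open(path, "r", encoding="utf-8") as f:
    round_tripped = f.read()
```

Whether this fits the real to_html method depends on how the file handle is opened there, which the traceback does not show.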

Thanks for the amazing project!

@ngenovictor

I got the same error, and the change worked for me too.

ngenovictor added a commit to ngenovictor/camelot that referenced this issue May 29, 2019
@vinayak-mehta
Contributor

Closing because there's no PDF to reproduce this issue.
