UnicodeDecodeError in compat/__init__.py with python3.3 #41

Closed
minhoryang opened this Issue Jan 8, 2014 · 2 comments

minhoryang commented Jan 8, 2014

I tried to crawl http://imnews.imbc.com/rss/news/news_00.xml and got the following error:

ERROR:earthreader.web.app:Exception on /feeds/ [POST]
Traceback (most recent call last):
  File "/home/speech/py3env/lib/python3.3/site-packages/flask/app.py", line 1817, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/speech/py3env/lib/python3.3/site-packages/flask/app.py", line 1477, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/speech/py3env/lib/python3.3/site-packages/flask/app.py", line 1381, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/speech/py3env/lib/python3.3/site-packages/flask/_compat.py", line 33, in reraise
    raise value
  File "/home/speech/py3env/lib/python3.3/site-packages/flask/app.py", line 1475, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/speech/py3env/lib/python3.3/site-packages/flask/app.py", line 1461, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/speech/py3env/lib/python3.3/site-packages/earthreader/web/app.py", line 286, in add_feed
    feed_links = autodiscovery(document, url)
  File "/home/speech/py3env/lib/python3.3/site-packages/libearth/parser/autodiscovery.py", line 60, in autodiscovery
    document = text(document)
  File "/home/speech/py3env/lib/python3.3/site-packages/libearth/compat/__init__.py", line 82, in text
    return string.decode('utf-8')
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb4 in position 91: invalid start byte

minhoryang commented Jan 8, 2014

The parser's autodiscovery can't recognize http://imnews.imbc.com/rss/news/news_00.xml as RSS 2.0.

This is caused by the feed's 'euc-kr' encoding.
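
For illustration, here is a minimal sketch of how the failure comes about and one possible way around it. It is not libearth's actual code: `decode_document` and `XML_ENCODING_RE` are hypothetical names, and the sample bytes are made up to stand in for the real feed. The idea is to read the encoding from the XML declaration instead of always assuming UTF-8.

```python
import re

# Hypothetical helper, not part of libearth: find the encoding named in
# the XML declaration at the start of the raw bytes.
XML_ENCODING_RE = re.compile(
    br'^\s*<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']'
)


def decode_document(document, default='utf-8'):
    """Decode raw feed bytes using the encoding declared in the XML prolog.

    Falls back to ``default`` when no declaration is found.  An unknown
    encoding name still raises LookupError; this sketch does not handle it.
    """
    if isinstance(document, str):
        return document  # already text, nothing to decode
    match = XML_ENCODING_RE.match(document)
    encoding = match.group(1).decode('ascii') if match else default
    return document.decode(encoding)


# Made-up EUC-KR sample standing in for the real feed.
sample = (
    '<?xml version="1.0" encoding="euc-kr"?><title>뉴스</title>'
).encode('euc-kr')

# Decoding blindly as UTF-8 fails in the same way as the traceback above.
try:
    sample.decode('utf-8')
except UnicodeDecodeError as error:
    print('utf-8 failed:', error.reason)

# Honouring the declared encoding succeeds.
print(decode_document(sample))
```

This only covers turning the bytes into text; whether the XML parser itself then accepts the document is a separate question.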

dahlia (Member) commented May 28, 2014

This seems to be caused by a bug in pyexpat.c.
