check if html is unicode before trying to encode
tknorris committed May 20, 2015
1 parent fa12f04 commit e4c9e25
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion pw_scraper.py
@@ -579,7 +579,8 @@ def __get_cached_url(self, url, cache_limit=8):
dialog.ok("Robot Check", "You must enter text in the image to continue")
wdlg.close()

- body = unicode(body, 'windows-1252', 'ignore')
+ if not isinstance(html, unicode):
+     body = unicode(body, 'windows-1252', 'ignore')
parser = HTMLParser.HTMLParser()
body = parser.unescape(body)
except Exception as e:
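
For readers following the change: in Python 2, calling `unicode(value, encoding)` on a value that is already `unicode` raises a `TypeError`, which is presumably what the new guard avoids. The guard as committed tests `html` while the conversion acts on `body`; this hunk does not show whether `html` is a separate variable in scope. The sketch below illustrates the same decode-only-if-needed pattern in Python 3 terms, testing the same value it converts. `to_text` is a hypothetical helper, not part of pw_scraper.py, and `html.unescape` stands in for the `HTMLParser.HTMLParser().unescape` call used in the original.

```python
import html


def to_text(body, encoding="windows-1252"):
    """Return body as text with HTML entities unescaped.

    Hypothetical helper for illustration; not part of pw_scraper.py.
    """
    # Decode only when handed raw bytes; a str is already decoded text,
    # and trying to decode it again would raise an error.
    if not isinstance(body, str):
        body = body.decode(encoding, "ignore")
    # Python 3 equivalent of HTMLParser.HTMLParser().unescape(body).
    return html.unescape(body)
```

Both a cached text response and a freshly fetched byte response can then go through the same call, for example `to_text(b"caf\xe9 &amp; tea")` and `to_text("café &amp; tea")`.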
