Change decoding in http to iso-8859-1 instead of unicode #68
I did not find a good way of passing the encoding as an argument, since the call is nested a few levels down. If you give me some pointers, maybe I can add that instead.
Anyway, parsing of pages in, for example, Nordic languages fails with a UnicodeDecodeError in
polyglot.as_unicode
when opening them. An example line passed to as_unicode that fails looks like
b'<div style="font-size: 28px;color: #3b3933;margin-bottom: 20px;">F\xf6rening</div>'
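The failure can be reproduced outside the library with plain `bytes.decode`. This is only a minimal sketch: it assumes the error comes from a UTF-8 decode (the internals of `as_unicode` are not shown here), and the variable names are made up for illustration.

```python
# The sample line from above: 0xf6 is "ö" in ISO-8859-1 but is not valid UTF-8.
raw = b'<div style="font-size: 28px;color: #3b3933;margin-bottom: 20px;">F\xf6rening</div>'

# Assumption: the failing code path decodes the bytes as UTF-8.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# ISO-8859-1 maps every byte value to a code point, so this decode cannot fail.
text = raw.decode("iso-8859-1")
print(utf8_ok, text)
```

Because ISO-8859-1 accepts any byte sequence, it never raises, but it will silently mis-decode pages that really are UTF-8, which is why passing the encoding as an argument would be the cleaner fix.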
This is the traceback