fix non-ASCII header exception in Python 3 #2357
Closed
Hi there!
This pull request contains a fix for non-ASCII (emoji, CJK characters, etc.) HTTP headers in Python 3.
For example, refer to the following code:
The above code should make the browser download
hello😁.txt
instead of opening it. However, it raises the following exception:
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 45-46: ordinal not in range(256)
The same exception is raised when non-ASCII characters appear anywhere in a header, whether in the name or the value,
for example:
self.set_header("laugh😁", "emoji")
The error occurs at line 385 of http1connection.py:
lines.extend(l.encode('latin1') for l in header_lines)
Encoding an emoji or a CJK character as latin1 necessarily fails, because latin1 only covers code points 0 to 255.
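The failure can be reproduced without Tornado. This is a minimal sketch, assuming a `Content-Disposition` header (the header name and the `try_encode` helper are illustrative, only the file name comes from this description):

```python
# Hypothetical header line; the Content-Disposition header is an assumption,
# only the file name comes from the description above.
header_line = "Content-Disposition: attachment; filename=hello\U0001F601.txt"


def try_encode(line: str):
    """Return the latin1-encoded bytes, or the UnicodeEncodeError if encoding fails."""
    try:
        return line.encode("latin1")
    except UnicodeEncodeError as exc:
        return exc


# The emoji (U+1F601) lies outside the latin1 range, so encoding fails.
emoji_result = try_encode(header_line)

# A character inside the latin1 range (U+00E9) encodes without error.
latin_result = try_encode("X-Name: caf\u00e9")

print(type(emoji_result).__name__)
print(latin_result)
```

Note that characters within the latin1 range do not trigger the exception, so the problem only shows up for characters above U+00FF.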
There are two possible solutions: either call
url_escape(file_name)
in application code, or fix the implementation in http1connection.py.
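The first option, escaping the file name in application code, can be sketched as follows. To keep the example self-contained it uses the standard library's `urllib.parse.quote` in place of Tornado's `url_escape`; the `ascii_safe_filename` helper and the `filename*=UTF-8''` form of the header (as described in RFC 6266) are assumptions for illustration, not part of this PR:

```python
from urllib.parse import quote


def ascii_safe_filename(file_name: str) -> str:
    """Percent-encode the UTF-8 bytes of file_name so the result is pure ASCII."""
    # safe="" means reserved characters such as "/" are escaped as well.
    return quote(file_name, safe="")


file_name = "hello\U0001F601.txt"
escaped = ascii_safe_filename(file_name)

# Hypothetical header line using the extended filename* parameter.
header_line = "Content-Disposition: attachment; filename*=UTF-8''" + escaped

# The escaped header is pure ASCII, so encoding as latin1 no longer raises.
encoded = header_line.encode("latin1")
print(escaped)
```

With this approach the header never contains a character outside ASCII, so the `latin1` encoding step in http1connection.py is not affected.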
Note that, according to RFC 7230 section 3.2.4,
newly defined header fields SHOULD limit their field values to US-ASCII octets.
This PR is therefore more of a temporary solution, given that the Python 2 version simply sends the header as raw bytes.
Anyway, thanks for your precious time in reviewing this.