UnicodeEncodeError on python3 #1151

benoitc opened this Issue Nov 23, 2015


benoitc commented Nov 23, 2015

I wonder whether anyone is really using Python 3, or at least testing gunicorn with it, because the tests show a regression introduced via #1102. The Python 2 version is not affected. This is a blocker for 19.4.

Error handling request /
Traceback (most recent call last):
  File "/Users/benoitc/Projects/gunicorn/gunicorn_py3/gunicorn/gunicorn/workers/sync.py", line 130, in handle
    self.handle_request(listener, req, client, addr)
  File "/Users/benoitc/Projects/gunicorn/gunicorn_py3/gunicorn/gunicorn/workers/sync.py", line 177, in handle_request
  File "/Users/benoitc/Projects/gunicorn/gunicorn_py3/gunicorn/gunicorn/http/wsgi.py", line 324, in write
  File "/Users/benoitc/Projects/gunicorn/gunicorn_py3/gunicorn/gunicorn/http/wsgi.py", line 320, in send_headers
    util.write(self.sock, util.to_latin1(header_str))
  File "/Users/benoitc/Projects/gunicorn/gunicorn_py3/gunicorn/gunicorn/util.py", line 517, in to_latin1
    return value.encode("latin-1")
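
The last frame of the traceback can be reproduced in isolation. This is only a minimal sketch: `to_latin1` is copied from the one-line body shown above, and the header strings are made-up examples, not the ones used by the gunicorn test app.

```python
def to_latin1(value):
    # Same one-line body as the helper shown in the traceback.
    return value.encode("latin-1")


# Hypothetical header values, chosen only for illustration.
in_range = "X-Name: caf\u00e9"      # e-acute exists in latin-1
out_of_range = "X-Name: \u2603"     # snowman does not

print(to_latin1(in_range))

try:
    to_latin1(out_of_range)
except UnicodeEncodeError as exc:
    # Any code point above U+00FF cannot be represented in latin-1.
    print("encode failed:", exc.reason)
```

So a header that only contains latin-1 characters goes through, while anything beyond that range raises in `send_headers`.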

To reproduce it, do the following:

  1. Launch the test example with the following command line:
$ gunicorn -w3 test:app
  2. Then launch curl on it:
$ curl
benoitc commented Nov 23, 2015

According to RFC 7230:

Historically, HTTP has allowed field content with text in the ISO-8859-1 charset [ISO-8859-1], supporting other charsets only through use of [RFC2047] encoding. In practice, most HTTP header field values use only a subset of the US-ASCII charset [USASCII]. Newly defined header fields SHOULD limit their field values to US-ASCII octets. A recipient SHOULD treat other octets in field content (obs-text) as opaque data.


I am not sure what to do yet. Either we let gunicorn return an error, as it does right now, and fix the test (we should also restrict the encoding to US-ASCII only, sigh), or we quote the header value by default.


benoitc commented Nov 23, 2015

To complete my comment above: this is more about deciding whether, as a server, we should leave this alone and let the application handle the issue, or whether we should fix the header encoding whatever the application gives us.
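
The two options discussed above can be sketched side by side. This is not gunicorn's code: `encode_strict` and `encode_quoted` are hypothetical names, and percent-encoding is only one possible reading of "quote the header value".

```python
from urllib.parse import quote


def encode_strict(header_str):
    # Option 1: encode to US-ASCII only and let the error propagate,
    # so the application stays responsible for its header values.
    return header_str.encode("ascii")


def encode_quoted(name, value):
    # Option 2 (assumption): percent-encode anything outside a small
    # ASCII-safe set before the header is written to the socket.
    safe_value = quote(value, safe=" /;,=")
    return "{}: {}\r\n".format(name, safe_value).encode("ascii")


print(encode_quoted("X-Name", "caf\u00e9"))

try:
    encode_strict("X-Name: caf\u00e9\r\n")
except UnicodeEncodeError as exc:
    print("strict mode refuses the header:", exc.reason)
```

The strict variant keeps the server a plain gateway; the quoting variant silently changes what the application sent, which is the trade-off in question.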

benoitc commented Nov 24, 2015
benoitc commented Nov 25, 2015


@benoitc don't return utf8 header in example
Since the updated RFC 7230 implies that new header keys and values should be
sent as US-ASCII only, don't try to test utf8 headers in examples.

We now only encode them to ascii. Gunicorn will fail if it's unable to encode
them, leaving the responsibility to the application to correctly encode the
response (we are just a gateway).

While I'm here, simplify the code to not create an extra function only used in
one place.

NOTE: if anyone comes up with a better solution, I am happy to revisit it in the
next release.

fix #1151
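
With the server encoding headers as ASCII only, the application has to hand over ASCII-safe values. Below is a rough sketch of one way an app might do that, using the RFC 2047 encoding mentioned in the RFC 7230 quote. The `ascii_safe` helper, the `X-Example` header and the app itself are hypothetical, and whether a given client decodes RFC 2047 in HTTP headers is not guaranteed.

```python
from email.header import Header, decode_header


def ascii_safe(value):
    # Pass plain ASCII through untouched; otherwise wrap the value as an
    # RFC 2047 encoded-word. Long values may be folded across lines by
    # Header.encode(), which this sketch does not handle.
    try:
        value.encode("ascii")
        return value
    except UnicodeEncodeError:
        return Header(value, "utf-8").encode()


def app(environ, start_response):
    # Minimal WSGI app whose header values are all ASCII-encodable.
    body = b"ok\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
        ("X-Example", ascii_safe("caf\u00e9")),
    ])
    return [body]
```

An app written this way never trips the server-side ASCII check, at the cost of pushing the decoding problem to the client.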
