UnicodeEncodeError on python3 #1151

Closed
benoitc opened this Issue Nov 23, 2015 · 4 comments

benoitc commented Nov 23, 2015

I wonder if anyone is really using Python 3, or at least testing gunicorn with it, but the tests show a regression introduced via #1102. The Python 2 version is not affected. This is a blocker for 19.4.

Error handling request /
Traceback (most recent call last):
  File "/Users/benoitc/Projects/gunicorn/gunicorn_py3/gunicorn/gunicorn/workers/sync.py", line 130, in handle
    self.handle_request(listener, req, client, addr)
  File "/Users/benoitc/Projects/gunicorn/gunicorn_py3/gunicorn/gunicorn/workers/sync.py", line 177, in handle_request
    resp.write(item)
  File "/Users/benoitc/Projects/gunicorn/gunicorn_py3/gunicorn/gunicorn/http/wsgi.py", line 324, in write
    self.send_headers()
  File "/Users/benoitc/Projects/gunicorn/gunicorn_py3/gunicorn/gunicorn/http/wsgi.py", line 320, in send_headers
    util.write(self.sock, util.to_latin1(header_str))
  File "/Users/benoitc/Projects/gunicorn/gunicorn_py3/gunicorn/gunicorn/util.py", line 517, in to_latin1
    return value.encode("latin-1")

To reproduce it, do the following:

  1. Launch the test example with the following command line:
$ gunicorn -w3 test:app
  2. Then run curl against it:
$ curl http://127.0.0.1:8000/
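
The last frame of the traceback can be reproduced in isolation. This is a minimal sketch, not gunicorn's code: `to_latin1` only mirrors the one line shown in the traceback, and the header value is a hypothetical one, assuming the test example returns a header containing characters outside Latin-1.

```python
# Hypothetical header line; any character outside Latin-1 triggers the failure.
header_str = "Test: test \u0442\u0435\u0441\u0442\r\n"


def to_latin1(value):
    # Mirrors the last frame of the traceback (util.py, to_latin1).
    return value.encode("latin-1")


try:
    to_latin1(header_str)
except UnicodeEncodeError as exc:
    # Cyrillic code points have no Latin-1 representation.
    error = exc
    print("encoding failed:", exc.reason)
else:
    error = None
```

Values limited to Latin-1 code points (for example "caf\u00e9") still encode without error, so only headers with characters beyond that range hit the failure.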

@benoitc benoitc added this to the R19.4 milestone Nov 23, 2015

benoitc commented Nov 23, 2015

According to RFC 7230:

Historically, HTTP has allowed field content with text in the ISO-8859-1 charset [ISO-8859-1], supporting other charsets only through use of [RFC2047] encoding. In practice, most HTTP header field values use only a subset of the US-ASCII charset [USASCII]. Newly defined header fields SHOULD limit their field values to US-ASCII octets. A recipient SHOULD treat other octets in field content (obs-text) as opaque data.

https://tools.ietf.org/html/rfc7230#section-3.2.4

I am not sure what to do yet. Either we let gunicorn return an error, as it does right now, and fix the test (we should also restrict the encoding to US-ASCII only, sigh), or we quote the header value by default.

Thoughts?
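
To make the two options concrete, here is a rough sketch of what each could look like. Both function names are hypothetical and neither is taken from gunicorn. The comment does not say what "quote" means exactly, so percent-encoding via `urllib.parse.quote` is an assumption.

```python
from urllib.parse import quote, unquote


def encode_header_strict(value):
    """Option 1: accept US-ASCII only; raises UnicodeEncodeError otherwise."""
    return value.encode("ascii")


def encode_header_quoted(value):
    """Option 2: percent-encode anything unsafe so the result is always ASCII.

    Non-ASCII characters are encoded as UTF-8 and then percent-escaped.
    The `safe` set is an arbitrary choice for this sketch.
    """
    return quote(value, safe=" /:;,=").encode("ascii")


value = "test \u0442\u0435\u0441\u0442"  # hypothetical header value

try:
    encode_header_strict(value)
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True  # the application has to deal with it

quoted = encode_header_quoted(value)
```

With option 1 the server stays a plain gateway and the error surfaces to the application. With option 2 the server silently changes what the application sent, and the client has to know to reverse the quoting (here `unquote(quoted.decode("ascii"))`).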

benoitc commented Nov 23, 2015

To complete my comment above: this is more about deciding whether, as a server, we should leave this alone and let the application handle the issue, or whether we should fix the header encoding regardless of what the application gives us.

benoitc commented Nov 24, 2015

benoitc commented Nov 25, 2015

bump.

@benoitc benoitc closed this in 5f4ebd2 Nov 25, 2015

benoitc added a commit that referenced this issue Nov 25, 2015

fofanov pushed a commit to fofanov/gunicorn that referenced this issue Mar 16, 2018

don't return utf8 header in example
Since the updated RFC 7230 implies that new header keys and values should be
sent as US-ASCII only, don't try to test utf8 headers in examples.

We now only encode them to ASCII. Gunicorn will fail if it's unable to encode
them, leaving the responsibility to the application to correctly encode the
response (we are just a gateway).

While I'm here, simplify the code to not create an extra function only used in
one place.

NOTE: if anyone comes up with a better solution, I am happy to revisit it in the
next release.

fix benoitc#1151
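
Since the commit leaves encoding to the application, an application that needs non-ASCII text in a header has to make the value ASCII-safe before handing it to the server. The sketch below is one possible approach, not something the commit prescribes: it uses the RFC 2047 encoded-word form mentioned in the RFC 7230 excerpt, via the standard library's `email.header`. The helper name `ascii_safe_header` and the `X-Test` header are hypothetical, and whether a given client decodes encoded words is an assumption that has to be checked per client.

```python
from email.header import Header


def ascii_safe_header(value):
    """Return value unchanged if it is ASCII, else its RFC 2047 encoded form."""
    try:
        value.encode("ascii")
        return value
    except UnicodeEncodeError:
        # Short values only; long values may be folded across lines by Header.
        return Header(value, "utf-8").encode()


def app(environ, start_response):
    # Minimal WSGI application whose header values are all ASCII-encodable.
    body = b"hello\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
        ("X-Test", ascii_safe_header("test \u0442\u0435\u0441\u0442")),
    ]
    start_response("200 OK", headers)
    return [body]
```

Every header value produced this way passes a strict `value.encode("ascii")`, so the server-side check introduced by the commit would not reject it.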
