WSGI environ string should be latin-1 encoding #468

Closed
StoneMoe opened this Issue Mar 6, 2018 · 3 comments

StoneMoe commented Mar 6, 2018

Relevant section of PEP-0333 on encoding: https://www.python.org/dev/peps/pep-0333/#unicode-issues

Patch pull request: #467

Known issue:
not compatible with werkzeug

Traceback (most recent call last):
  File "/home/stonemoe/Desktop/Code/test/venv/lib/python3.6/site-packages/eventlet/wsgi.py", line 539, in handle_one_response
    result = self.application(self.environ, start_response)
  File "/home/stonemoe/Desktop/Code/test/venv/lib/python3.6/site-packages/flask/app.py", line 1997, in __call__
    return self.wsgi_app(environ, start_response)
  File "/home/stonemoe/Desktop/Code/test/venv/lib/python3.6/site-packages/flask_socketio/__init__.py", line 43, in __call__
    start_response)
  File "/home/stonemoe/Desktop/Code/test/venv/lib/python3.6/site-packages/engineio/middleware.py", line 49, in __call__
    return self.wsgi_app(environ, start_response)
  File "/home/stonemoe/Desktop/Code/test/venv/lib/python3.6/site-packages/werkzeug/debug/__init__.py", line 468, in __call__
    request.path == self.console_path:
  File "/home/stonemoe/Desktop/Code/test/venv/lib/python3.6/site-packages/werkzeug/utils.py", line 73, in __get__
    value = self.func(obj)
  File "/home/stonemoe/Desktop/Code/test/venv/lib/python3.6/site-packages/werkzeug/wrappers.py", line 596, in path
    self.charset, self.encoding_errors)
  File "/home/stonemoe/Desktop/Code/test/venv/lib/python3.6/site-packages/werkzeug/_compat.py", line 176, in wsgi_decoding_dance
    return s.encode('latin1').decode(charset, errors)
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 11-12: ordinal not in range(256)

127.0.0.1 - - [06/Mar/2018 13:31:44] "GET /test/%E4%BD%A0%E5%A5%BD HTTP/1.0" 500 1603 0.013243
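For readers following along, here is a minimal sketch of why the traceback above ends in `UnicodeEncodeError`. It assumes the server percent-decodes the request path and then has to choose an encoding for the resulting bytes. `path_info_from_request_target` is a hypothetical helper written for illustration, not eventlet's actual code and not necessarily how the patch in #467 does it. `wsgi_decoding_dance` here is a standalone reimplementation of the single line shown in the traceback.

```python
from urllib.parse import unquote_to_bytes


def path_info_from_request_target(raw_path: str) -> str:
    """Hypothetical server-side helper: percent-decode to bytes, then
    expose those bytes as a native string via Latin-1, as the PEP asks."""
    return unquote_to_bytes(raw_path).decode("latin-1")


def wsgi_decoding_dance(s: str, charset: str = "utf-8", errors: str = "replace") -> str:
    """Mirrors the line from the traceback: recover the raw bytes via
    Latin-1, then decode them with the charset the application wants."""
    return s.encode("latin1").decode(charset, errors)


raw = "/test/%E4%BD%A0%E5%A5%BD"

# Latin-1 native string: one character per byte, so the round trip works.
path_info = path_info_from_request_target(raw)
decoded = wsgi_decoding_dance(path_info)

# If the server decodes the bytes as UTF-8 instead, the string holds code
# points above 255 and the application-side encode('latin1') fails.
utf8_path_info = unquote_to_bytes(raw).decode("utf-8")
try:
    wsgi_decoding_dance(utf8_path_info)
    failed = False
except UnicodeEncodeError:
    failed = True
```

With the Latin-1 form, `decoded` is the original Unicode path. With the UTF-8 form, `failed` is `True`, which matches the error werkzeug raises in the log above.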
temoto commented Mar 6, 2018

Thank you.

StoneMoe commented Mar 7, 2018

temoto commented Mar 10, 2018

The fix is merged in 4e576cb.

Thank you @StoneMoe

@temoto temoto closed this Mar 10, 2018

tipabu added a commit to tipabu/eventlet that referenced this issue Jun 7, 2018

wsgi: Use byte strings on py2 and unicode strings on py3
Closes eventlet#488; for some additional context, see eventlet#468 and eventlet#467.

PEP-0333 says [1]

> all strings referred to in this specification **must** be of type ``str``

but muddied that message earlier with a bunch of talk about

> all strings passed to or from the server must be standard Python byte
> strings, not Unicode objects

PEP-3333 sought to clarify [2] with

> WSGI therefore defines two kinds of "string":
>
> * "Native" strings (which are always implemented using the type
>   named ``str``) that are used for request/response headers and
>   metadata
>
> * "Bytestrings" (which are implemented using the ``bytes`` type
>   in Python 3, and ``str`` elsewhere), that are used for the bodies
>   of requests and responses (e.g. POST/PUT input data and HTML page
>   outputs).
>
> Do not be confused however: even if Python's ``str`` type is actually
> Unicode "under the hood", the *content* of native strings must
> still be translatable to bytes via the Latin-1 encoding!

And later, in "Unicode Issues" [3], it adds

> For values referred to in this specification as "bytestrings"
> (i.e., values read from ``wsgi.input``, passed to ``write()``
> or yielded by the application), the value **must** be of type
> ``bytes`` under Python 3, and ``str`` in earlier versions of
> Python.

So the upshot seems to be

  - All request and response bodies must be bytes, regardless of Python
    version.
  - All headers and other WSGI environment keys need to be native
    strings; i.e. byte strings on Python 2 and unicode strings on
    Python 3.

[1] https://www.python.org/dev/peps/pep-0333/#unicode-issues
[2] https://www.python.org/dev/peps/pep-3333/#a-note-on-string-types
[3] https://www.python.org/dev/peps/pep-3333/#unicode-issues
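
The two bullet points in the commit message above can be sketched as a tiny WSGI application on Python 3. This is only an illustration under stated assumptions: `app`, `start_response`, and `captured` are names invented for this example, and the environ is built with `wsgiref.util.setup_testing_defaults` from the standard library rather than by eventlet.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    # Environ values are native strings whose content is Latin-1 encodable;
    # the application re-decodes them with the charset it actually expects.
    path = environ["PATH_INFO"].encode("latin-1").decode("utf-8", "replace")

    # Response bodies are bytes on every Python version.
    body = ("path: " + path).encode("utf-8")

    # Status and header names/values are native strings (str on Python 3).
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


# Build a test environ and set PATH_INFO the way a compliant server would:
# UTF-8 bytes of the path, exposed as a Latin-1 native string.
environ = {}
setup_testing_defaults(environ)
environ["PATH_INFO"] = "/test/\u4f60\u597d".encode("utf-8").decode("latin-1")

captured = {}


def start_response(status, headers, exc_info=None):
    captured["status"] = status
    captured["headers"] = headers
    # PEP 3333 requires start_response to return a write() callable.
    return lambda data: None


chunks = list(app(environ, start_response))
```

After running this, every item in `chunks` is a `bytes` object, while `captured["status"]` and every header name and value are `str`.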

temoto added a commit that referenced this issue Jul 25, 2018

wsgi: Use byte strings on py2 and unicode strings on py3
Closes #488; for some additional context, see #468 and #467.
