New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reading negative length in pywsgi wsgi.input never returns #1274

Closed
tzickel opened this Issue Aug 29, 2018 · 7 comments

Comments

Projects
None yet
2 participants
@tzickel
Contributor

tzickel commented Aug 29, 2018

  • gevent version: master

Description:

Python sometimes uses read length -1 for reading all (like in file i/o) and sometimes None (like in http)...

When reading a non chunked body in pywsgi from wsgi.input with negative length (or -1) it gets stuck in the read command, the chunked code path has a check for this:

if length is not None and length < 0:

Maybe it's worth adding it in regular read as well before:

if length is None:

Also a minor nitpick (although it seems other frameworks have this issue as well):

If application already sets a Transfer-Encoding: chunked in the headers in start_response, pywsgi will not check it, and add another one (if it deems to use it), resulting in a Transfer-Encoding: chunked, chunked (Might be the same for Content-Length). Maybe those headers should be removed if they are conflicting (when deciding which headers to send) ?

What I've run:

from gevent.pywsgi import WSGIServer

def app(env, start_response):
  print env['wsgi.input'].read(-1)
  start_response('204 OK', [])
  return []

WSGIServer(('', 1234), app).serve_forever()
@jamadden

This comment has been minimized.

Member

jamadden commented Aug 29, 2018

Maybe it's worth adding it in regular read as well before:

Maybe? The specification only says that "A server should allow read() to be called without an argument, and return the remainder of the client's input stream.". It doesn't say anything about what "invalid" argument values mean, but the argument for symmetry is strong.

A PR that moves the check from _do_chunked_read up to read() should be pretty simple and not reduce test coverage; bonus points if it added a specific test for the non-chunked.

If application already sets a Transfer-Encoding: chunked in the headers in start_response,

I believe that means the application is buggy. The specification says " applications and middleware must not apply any kind of Transfer-Encoding to their output, such as chunking or gzipping; as "hop-by-hop" operations, these encodings are the province of the actual web server/gateway."

@tzickel

This comment has been minimized.

Contributor

tzickel commented Sep 12, 2018

What do you suggest on readline ? The wsgiref validator that wraps Input in the testing framework does not allow for passing arguments to readline

@jamadden

This comment has been minimized.

Member

jamadden commented Sep 12, 2018

Portable applications cannot rely on readline accepting an argument at all. Anything there is implementation defined.

@jamadden

This comment has been minimized.

Member

jamadden commented Sep 12, 2018

Oh, did you meant for testing it? I think there are unit tests for the Input class itself that should work, or there's TestInputReadline that specifically disables the validator for app level tests.

@tzickel

This comment has been minimized.

Contributor

tzickel commented Sep 13, 2018

Just a quick question before this issue is closed (unrelated, but I want to know if to open a new issue or it's a wontfix).

When trying to read from Input using env['wsgi.input'].read(size) where size is a number > 31bit (i.e. 2GB) or content_length is > 31bit this will trigger:
https://github.com/gevent/gevent/blob/master/src/gevent/pywsgi.py#L182

Shouldn't we be able to handle reading inputs larger than 2GB ? (The comment is wrong, it's 31bit, since the if cStringIO does not have the required data in buffer, it would try passing the read length to the original socket .recv command which can only do 2GB at a time).

@jamadden

This comment has been minimized.

Member

jamadden commented Sep 13, 2018

Shouldn't we be able to handle reading inputs larger than 2GB ?

In a single go? It's under-specified. gevent.pywsgi and wsgiref both have the OverflowError issue on CPython 2.7 only; it works on PyPy 2.7 and CPython 3. On CPython 2.7, gunicorn appears to allow it (though allocating a single 2GB string may in the end prove challenging), while waitress rejects it via a configurable policy.

@tzickel

This comment has been minimized.

Contributor

tzickel commented Sep 13, 2018

Ok, but the comment is still misleading there " # Expecting to read more than 64 bits of data. Ouch!" for someone who is reading the code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment