
Support chunked transfer encoding #1198

Merged
merged 13 commits into master from transfer-encoding-chunked on Nov 28, 2017

Conversation

@jan-auer
Copy link
Contributor

commented Nov 27, 2017

This PR introduces support for requests with chunked transfer encoding to WSGIRequestHandler by wrapping the input in a DechunkedInput. Chunks are unrolled in the following way:

  • Read a line to get the chunk's size in bytes
  • Read the chunk data, but only to the length of the buffer
  • Reduce the chunk size by the number of bytes read
  • At the end of the chunk, skip the chunk's final newline

Once a chunk of size zero is encountered, a final newline is read, and the input is marked as done. All further calls to readinto exit early.
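
A rough sketch of that loop as an io.RawIOBase wrapper is below; the attribute names and error handling here are simplified and illustrative rather than the exact code in this PR:

    import io

    class DechunkedInput(io.RawIOBase):
        """Sketch of the dechunking wrapper described above."""

        def __init__(self, rfile):
            self._rfile = rfile   # raw input stream from the request handler
            self._done = False    # set once the zero-sized chunk has been seen
            self._len = 0         # bytes left to read in the current chunk

        def readable(self):
            return True

        def readinto(self, buf):
            read = 0
            while not self._done and read < len(buf):
                if self._len == 0:
                    # Read a line to get the next chunk's size (hexadecimal).
                    line = self._rfile.readline().split(b';', 1)[0].strip()
                    self._len = int(line, 16)
                    if self._len == 0:
                        # A zero-sized chunk ends the stream: read the final
                        # newline and mark the input as done.
                        self._done = True
                        self._rfile.readline()
                else:
                    # Read chunk data, but only up to the end of the buffer.
                    n = min(len(buf) - read, self._len)
                    data = self._rfile.read(n)
                    buf[read:read + len(data)] = data
                    read += len(data)
                    self._len -= len(data)
                    if self._len == 0:
                        # At the end of the chunk, skip its trailing newline.
                        self._rfile.readline()
            return read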

There is only a test for the success case. It needs to issue a manual HTTP request to produce a proper chunked multipart payload.
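
Issuing such a request by hand could look roughly like this, using http.client on Python 3; the host, port, path and payload are placeholders, and a plain text body stands in for multipart to keep the sketch short:

    import http.client

    conn = http.client.HTTPConnection('localhost', 5000)
    conn.putrequest('POST', '/')
    conn.putheader('Transfer-Encoding', 'chunked')
    conn.putheader('Content-Type', 'text/plain')
    conn.endheaders()

    for chunk in (b'hello ', b'chunked ', b'world'):
        # Each chunk is: size in hex, CRLF, data, CRLF.
        conn.send(b'%x\r\n%s\r\n' % (len(chunk), chunk))
    conn.send(b'0\r\n\r\n')  # a zero-sized chunk terminates the body

    response = conn.getresponse()
    print(response.status, response.read())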

Note: For some reason, werkzeug crashes when parsing the request body if a Content-Length header is present. If you would like this fixed in this PR too, let me know. RFC 2616, Section 4.4 states that the Content-Length header must be ignored with any Transfer-Encoding other than identity.

Fixes #1149

jan-auer added some commits Nov 27, 2017

@davidism

Member

commented Nov 27, 2017

Thanks for working on this!

@davidism

Member

commented Nov 27, 2017

There were some issues with our tests on Travis; you'll need to merge in master to fix the unrelated failing tests.

@@ -54,7 +54,7 @@ Install Werkzeug in editable mode::

 Install the minimal test requirements::

-    pip install pytest pytest-xprocess requests
+    pip install pytest pytest-xprocess requests six

@davidism

davidism Nov 27, 2017

Member

We do not use six. Add the specific thing you need to werkzeug._compat.

@jan-auer

jan-auer Nov 27, 2017

Author Contributor

Can do, it's only working in Python 2.7 anyway from what I've seen. Will have a look at this tomorrow.

@davidism requested a review from mitsuhiko Nov 27, 2017

@mitsuhiko

Member

commented Nov 27, 2017

Generally looks good. Needs some fixes for the test failures and I will look into what uses the content length currently.

@davidism

Member

commented Nov 27, 2017

Hold off merging until I roll back the changes mentioned in #1149 (#1126), since they're related.

jan-auer added some commits Nov 28, 2017

Improve DechunkedInput implementation and add comments
 - Try to read multiple chunks until the buffer is filled
 - Add comments to make control flow more clear
 - Add doc comment for DechunkedInput
@jan-auer

Contributor Author

commented Nov 28, 2017

Pushed some updates:

  • The DechunkedInput used to read only one chunk at a time and then immediately return the buffer. Now it tries to fill the buffer with as many chunks (or parts of chunks) as possible.
  • Also added some comments and a docstring to make the control flow clearer. It still looks a bit odd at first sight, but someone familiar with the spec should understand it easily.
  • Fixed the bug where the Content-Length header crashed the form parser. The problem is that the Content-Length value covers the chunk headers and newlines, but the payload returned by DechunkedInput does not include them. The form parser, however, wraps the input in a LimitedStream if a Content-Length header is present, which expects the two lengths to match. By removing the content length, the form parser no longer applies a LimitedStream and the problem goes away. This is also more spec compliant (see the sketch after this list).
  • Replaced six with an explicit import based on the Python version. I could have put it in _compat as suggested, but since httplib/http.client is only used in the test, it's easier this way.
  • Finally, the test had some unused code, which has been removed.
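
A minimal sketch of that wiring, assuming the environ and raw input stream come from the dev server's request handler; the function name and the HTTP_TRANSFER_ENCODING key used here are illustrative assumptions, not necessarily the PR's exact code:

    def apply_chunked_input(environ, rfile):
        te = environ.get('HTTP_TRANSFER_ENCODING', '')
        if te.strip().lower() == 'chunked':
            # RFC 2616, section 4.4: Content-Length must be ignored for any
            # Transfer-Encoding other than "identity".  Dropping it keeps the
            # form parser from wrapping the input in a LimitedStream whose
            # length would not match the dechunked payload.
            environ.pop('CONTENT_LENGTH', None)
            environ['wsgi.input'] = DechunkedInput(rfile)
        return environ
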
@@ -38,7 +38,7 @@
 def default_stream_factory(total_content_length, filename, content_type,
                            content_length=None):
     """The stream factory that is used per default."""
-    if total_content_length > 1024 * 500:
+    if total_content_length is None or total_content_length > 1024 * 500:

@jan-auer

jan-auer Nov 28, 2017

Author Contributor

Beware of this. Since we have no information on content length, the total_content_length parameter will be None. I decided to default to TemporaryFile since we can't possibly know how large the file is going to become...

@mitsuhiko

mitsuhiko Nov 28, 2017

Member

I think it might make sense to use the rollover tempfile that lives in the stdlib. It keeps data in memory up to a certain size and then flushes to the filesystem.

@jan-auer

jan-auer Nov 28, 2017

Author Contributor

Suppose you mean SpooledTemporaryFile?

@mitsuhiko

mitsuhiko Nov 28, 2017

Member

Yep, that's the one.

@davidism

davidism Nov 28, 2017

Member

👍 Would remove the check here and just always return a spooled temp file.
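
A minimal sketch of that suggestion, keeping the existing default_stream_factory signature; this is illustrative only, not necessarily what ends up being merged here:

    from tempfile import SpooledTemporaryFile

    def default_stream_factory(total_content_length, filename, content_type,
                               content_length=None):
        # Always return a spooled temp file: it buffers in memory up to
        # max_size and rolls over to a real temporary file on disk beyond it.
        return SpooledTemporaryFile(max_size=1024 * 500, mode='wb+')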

@davidism

Member

commented Nov 28, 2017

Reverted #1126 in #1200, please merge master. Still working on unrelated test failures.

jan-auer added some commits Nov 28, 2017

Merge branch 'master' into transfer-encoding-chunked
* master:
  Revert "Allow chunk request"
  fix flake8 errors
  codecov needs argparse on 2.6
  Fix redis tests
@jan-auer

Contributor Author

commented Nov 28, 2017

Alright, here you go. FWIW, I also ran some manual tests with uwsgi 2.0.15 and master in both Python 2.7 and 3.6, and it seemed to work just fine.

One thing I'm still wondering about is whether we need to take care of infinite streams in DechunkedInput, or whether something else is already dealing with that?

@davidism

Member

commented Nov 28, 2017

This shouldn't affect chunked encoding support in other WSGI servers, but thanks for testing it.

#1126 was supposed to handle infinite streams by setting a hard limit on the content read in, but that was based on some incorrect assumptions on my part. I can't recall any special handling in Gunicorn from when I looked a while ago, but it might be good to check.

davidism added some commits Nov 28, 2017

fix 2.6 test, flake8
body in send instead of endheaders
@davidism

Member

commented Nov 28, 2017

I merged master again and fixed some 2.6 and style issues. Travis should pass now.

@jan-auer

Contributor Author

commented Nov 28, 2017

Whoop, it's all green! 🎉

@davidism

Member

commented Nov 28, 2017

Would you mind looking at how this behaves with infinite streams (not chunked, but no content length) versus how Gunicorn behaves?

@jan-auer

Contributor Author

commented Nov 28, 2017

Sure 👍 Please let me know if you or @mitsuhiko have a perspective on this, though.

@mitsuhiko

Member

commented Nov 28, 2017

I will merge this in its current state now. I think we can look at the edge cases once we see how others are doing it. This is basically just the local dev setup anyway.

@ghost referenced this pull request Oct 19, 2018

Closed

Flask update needed #896
