Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed #33699 -- Made ASGI request read body on-demand. #15704

Conversation

Flauschbaellchen
Copy link

@Flauschbaellchen Flauschbaellchen commented May 18, 2022

Prior to this change, the request has been written into a spooled temporary file
as the HTTPRequest class depends on a byte/io-like stream to process and
parse its content as a whole.
As the ASGI request is given in seperate chunks,
those chunks has been concatenated first using this file.

However, doing so resulted in an increasing latency on the server side,
as the file needed to be always written and re-read to be parsed afterwards.
This was especially bad for big file uploads resulting in (Gateway)
Timeouts if the Django server was running behind a reverse-proxy.

This change fixes this issue by wrapping the ASGI request and providing
a byte/io-like stream interface to it.
Doing so, reading the request's content is delayed until it needs to be
parsed and than directly delivered into the upload-handlers, resulting
in a reduced latency.

@Flauschbaellchen
Copy link
Author

This PR references the discussion on https://groups.google.com/g/django-developers/c/fu6ZSmu-YJE.

@github-actions
Copy link

Hello @Flauschbaellchen! Thank you for your contribution 💪

As it's your first contribution be sure to check out the patch review checklist.

If you're fixing a ticket from Trac make sure to set the "Has patch" flag and include a link to this PR in the ticket!

If you have any design or process questions then you can ask in the Django forum.

Welcome aboard ⛵️!

@carltongibson
Copy link
Member

Hi @Flauschbaellchen — thanks for this! First thing is to address the CI failures so we've got a clean slate — Looks like asgi.tests.ASGITest.test_disconnect (plus the lint errors).

@carltongibson carltongibson changed the title Process ASGI request input directly as a stream Fixed #33699 -- Made ASGI request consumed body on-demand. May 18, 2022
@carltongibson carltongibson changed the title Fixed #33699 -- Made ASGI request consumed body on-demand. Fixed #33699 -- Made ASGI request read body on-demand. May 18, 2022
Prior to this change, the request has been written into a spooled temporary file
as the HTTPRequest class depends on a byte/io-like stream to process and
parse its content as a whole.
As the ASGI request is given in seperate chunks,
those chunks has been concatenated first using this file.

However, doing so resulted in an increasing latency on the server side,
as the file needed to be always written and re-read to be parsed afterwards.
This was especially bad for big file uploads resulting in (Gateway)
Timeouts if the Django server was running behind a reverse-proxy.

This change fixes this issue by wrapping the ASGI request and providing
a byte/io-like stream interface to it.
Doing so, reading the request's content is delayed until it needs to be
parsed and than directly delivered into the upload-handlers, resulting
in a reduced latency.
@Flauschbaellchen
Copy link
Author

@carltongibson Can I ask you to take a look into the asgi.tests.ASGITest.test_disconnect test and help me out to fix this?
As the reading is now delayed until the HTTPRequest is parsing the body, the ASGIStream and its receive() is not called currently within the test - and thus no TimeoutError is raised.
I think the AsyncRequestFactory/ApplicationCommunicator needs to be adjusted, but I cannot see what I'd need to change.

@carltongibson
Copy link
Member

@Flauschbaellchen — OK, let me take a look. (🤹 — Might be next week.)

@carltongibson carltongibson self-requested a review May 18, 2022 13:17
@carltongibson
Copy link
Member

Hey @Flauschbaellchen — I'm picking this up now.

Can I ask, could you put together a minimal app showing the upload/download example you raised in the discussion? You're creating a 1GB file; uploading; waiting for what? The same file sent back? etc. — If you could put together the minimal Django app for that, we can run it with gunicorn/uvicorn/Daphne/etc. and see the differences.
(There's a wider need to set up such benchmarks, so this would tie into that too.)

Thanks

Copy link
Member

@carltongibson carltongibson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I ask you to take a look into the asgi.tests.ASGITest.test_disconnect test and help me out to fix this?

The issue is the http.disconnect handling in _receive_more_data() is not triggered until the the input queue is read from (at least once). We'll need to consider how to handle this — maybe some kind of lookahead... 🤔 — but I need to ponder more.

Comment on lines +46 to +50
message = async_to_sync(self.receive)()
if message["type"] == "http.disconnect":
# Early client disconnect.
self._has_more = False # safeguard against trying to call receive again
raise RequestAborted()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't triggered until the input queue is read at least once.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this is the intended behavior, maybe a better approach would be to test ASGIStream separately for this test case as a model test instead of the current integration test?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, happy to look at a refactor.

But that integration test should pass no? (I do expect a timeout after a disconnect...?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're not currently handling this correctly I think. Follow up at https://code.djangoproject.com/ticket/33738

@carltongibson
Copy link
Member

Closing as per ticket.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants