Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sans-IO MultiPartParser #1330

Open
wants to merge 5 commits into
base: master
from

Conversation

3 participants
@edk0
Copy link
Member

commented Jul 2, 2018

While this PR doesn't add any asyncio code, the motivation here is making it easier to support asyncio in the form parser, which is the major piece of synchronous-only code in werkzeug.

The idea here is to rewrite the parser to be force-fed pieces of buffer by something else. MultiPartParser's existing, synchronous API is then implemented by reading a file and feeding it in. This does necessitate making all parser state explicit, but the total LOC change is surprisingly light.

An effort to add async support might subclass MultiPartParser, and the only change required would be to drive the basic parsing logic with data acquired asynchronously.

One thing I'm not very happy with is that we need to keep parse_multipart_headers, as part of the documented API, despite the fact that we can't use it.

@edk0 edk0 force-pushed the edk0:sansio branch 3 times, most recently from 19cc4dc to d97f49c Jul 2, 2018

@tomchristie

This comment has been minimized.

Copy link

commented Jul 5, 2018

Nice work.

One other complication here that I'm not sure how to tackle best is that stream_factory presents a blocking interface when saving file data. Really you also want those operations to happen asynchronously, too. (As well as ingesting the request input stream asynchronously, which this PR already provides for)

@edk0

This comment has been minimized.

Copy link
Member Author

commented Jul 6, 2018

Yeah. asyncio provides no standard way to deal with files asynchronously, so I expect we'll just have to invoke synchronous write operations via the executor. I do need to think about how to split that out cleanly, though.

@tomchristie

This comment has been minimized.

Copy link

commented Jul 8, 2018

Makes sense - Helps that we’d be able to pass a custom stream_factory argument for the async case.

I’d be interested in seeing this implementation in Starlette https://github.com/encode/starlette too - since it doesn’t yet support form parsing.

@edk0 edk0 force-pushed the edk0:sansio branch from 6fc36d0 to b797272 Jul 9, 2018

@edk0

This comment has been minimized.

Copy link
Member Author

commented Jul 9, 2018

I've pulled out the bit that touches files into another class.

Regarding Starlette, you're welcome to it and I can try to remember to make a PR there when this is done with, but if I were you I'd take advantage of being async-only and write a natively async implementation.

@davidism

This comment has been minimized.

Copy link
Member

commented Aug 10, 2018

Sorry for taking so long. This looks good.

Would you add more comments to the code so it's easier to maintain later? For example, there's a lot of intermediate blocks and methods that are used in the parser, and it would be nice to know what string is being parsed at that point and what gets produced. It would also be good to document the "entry points" for parsing so another developer could understand when and how to override these classes.

@davidism

This comment has been minimized.

Copy link
Member

commented Aug 10, 2018

You mention keeping parse_multipart_headers. Can the code in the function be replaced with the new interface, or is there some other difference that we need to maintain?

@davidism davidism added this to In Progress in Async Aug 29, 2018

@davidism davidism requested a review from pgjones Aug 29, 2018

self._filename, self._file = self.parent.start_file_streaming(
filename, self._headers, self.content_length)
elif ev == 'cont':
self._file.write(data)

This comment has been minimized.

Copy link
@tomchristie

tomchristie Oct 15, 2018

Point of caution: In an async context we wouldn't want to have blocking file writes like this.

@davidism davidism added the ASGI label Mar 21, 2019

edk0 added some commits Jul 3, 2018

@edk0 edk0 force-pushed the edk0:sansio branch from b797272 to ec87584 Apr 4, 2019

@edk0 edk0 force-pushed the edk0:sansio branch 2 times, most recently from 24438fd to 7aa5510 Apr 25, 2019

@edk0

This comment has been minimized.

Copy link
Member Author

commented Apr 25, 2019

I guess we can implement parse_multipart_headers in terms of the new stuff. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.