Multipart feature #273

Merged
merged 5 commits into from
Mar 11, 2015

Conversation

@kxepal (Member) commented Feb 12, 2015

I would like to contribute the multipart feature from my aiocouchdb project.

This PR introduces the following things:

  • Multipart reader. It was designed with stream processing in mind and with the ability to skip body parts that don't need to be processed, with almost no footprint. It works like this:
resp = yield from aiohttp.request(...)

# First, we wrap our response via a special classmethod of MultipartReader.
# This keeps the MultipartReader implementation separated from the response
# and connection routines, so its implementation stays portable.
reader = MultipartReader.from_response(resp)

while True:
    # Fetch the next part.
    part = yield from reader.next()
    # The returned BodyPartReader instance holds only the headers and the
    # response stream pointing at its content.

    if part is None:
        # No more parts left to process, so break out of the loop.
        break

    # `filename` is a special property which automagically extracts the filename
    # param from the Content-Disposition header. You may also try to do this yourself
    # through the part.headers CIMultiDict, but believe me, the logic around this
    # header value is full of pain; see http://greenbytes.de/tech/tc2231
    if part.filename != 'foo.txt':
        # Here we continue the loop without reading the part body.
        # On the next `reader.next()` call its body will be read to the void and
        # a new part will be returned.
        continue

    # The BodyPartReader API is very similar to the ClientResponse one, but with its
    # own specifics. The `decode` flag lets the content pass through the decoding
    # routines that take care of the values defined in the Content-Encoding and
    # Content-Transfer-Encoding headers. Other methods like `.json`, `.form` and
    # `.text` decode the content automagically. Additionally, there is a
    # `.read_chunk` coroutine method to read the body in chunks, obviously without
    # any decoding magic.
    data = yield from part.read(decode=False)

    # You can also decode the data on your own.
    data = part.decode(data)
    break

# To release the connection and read the whole remaining response payload to the void
# you may either call
# yield from resp.release()
# or
yield from reader.release()
  • Multipart writer. It was also designed with streaming in mind, with helpful automagic in the box:
with MultipartWriter('mixed') as writer:
    # Add a file object as a part. The name, content length and type
    # will be automagically recognised for you.
    part = writer.append(open(__file__), {ETAG: '0xABCDEFG'})
    # You may also set additional headers on your own.
    part.headers[CONTENT_ENCODING] = 'gzip'

    # If you are a man of JSON, here is a special helper
    # which sets the content type and applies the encoding for you.
    writer.append_json({'foo': 'bar'})

    # Multipart is a recursive format, so nesting is supported.
    with MultipartWriter('form-data') as subwriter:
        part = subwriter.append_form({'bar': 'baz'})
        # Content-Disposition is a tricky header, so there is a helper to work with it.
        part.set_content_disposition('form-data', name='abc')
    writer.append(subwriter)

# That's all you need. All the magic happens internally during serialization.
# All JSON will be encoded properly via the json module, and all files will be read and
# encoded with respect to the specified encoding. Parts with Content-Encoding or
# Content-Transfer-Encoding headers will get additionally encoded per those instructions.
yield from aiohttp.request(data=writer)
  • FormData remains, but it is now based on multipart.MultipartWriter. Removing it would break a lot of code and many existing ways aiohttp is used; I'm not a fan of those, but compatibility is preserved (see the FormData sketch after this list).
  • One nasty bug was found in aiohttp.web, but it is left untouched here:

cgi.FieldStorage doesn't like body parts that carry a Content-Length header but
were sent via chunked transfer encoding. This fix doesn't save aiohttp.web from
failing with third-party clients, but at least it keeps working correctly with
the aiohttp one.
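
To illustrate the compatibility point in the FormData bullet above, here is a minimal sketch of existing-style FormData usage that should keep working unchanged; the URL, field names and values are made up for the example:

import asyncio
import aiohttp

@asyncio.coroutine
def upload():
    # FormData keeps its old surface; under the hood it is now built on top of
    # multipart.MultipartWriter, so existing code like this keeps working.
    data = aiohttp.FormData()
    data.add_field('name', 'value')
    data.add_field('file',
                   open(__file__, 'rb'),
                   filename='example.py',
                   content_type='text/plain')

    # Example URL only; any endpoint accepting multipart/form-data will do.
    resp = yield from aiohttp.request('post', 'http://example.com/upload', data=data)
    yield from resp.release()
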
@asvetlov (Member)

I like the idea in general.

About the .next() method: I don't like returning None at the end of the post data.
Maybe an exception would be a more appropriate solution?

@asvetlov (Member)

Should aiohttp.web use the new multipart parser?

@kxepal (Member, Author) commented Feb 13, 2015

About the .next() method: I don't like returning None at the end of the post data.
Maybe an exception would be a more appropriate solution?

I didn't find any good standard exception that could be used in this place. StopIteration might be the only good candidate, but, imho, it's not what is expected here. TBH, I borrowed this pattern from some python-tulip thread where Guido replied to the question "how to iterate over a coroutine". In aiocouchdb I used it a lot for iterating over streaming data and never found any need to replace it with something else. What would you suggest?
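
For comparison, a minimal sketch of how the two looping styles read at the call site. The exception-based variant and its EndOfMultipart class are hypothetical, used only to show the shape such code would take; they are not part of this PR:

import asyncio

class EndOfMultipart(Exception):
    """Hypothetical end-of-parts signal, used only in this sketch."""

@asyncio.coroutine
def consume_with_sentinel(reader):
    # Current pattern from the PR: None as the end-of-parts sentinel.
    part = yield from reader.next()
    while part is not None:
        yield from part.read()
        part = yield from reader.next()

@asyncio.coroutine
def consume_with_exception(reader):
    # Hypothetical alternative under discussion: reader.next() raises a dedicated
    # exception once there are no parts left, instead of returning None.
    try:
        while True:
            part = yield from reader.next()
            yield from part.read()
    except EndOfMultipart:
        pass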

Should aiohttp.web use the new multipart parser?

It's worth doing, but I haven't worked with aiohttp.web enough (shame on me!) to understand where to hook it in. I'd need to implement something on top of it that does a multipart data exchange to figure that out.

@fafhrd91 (Member)

@kxepal I merged your changes. Would you mind adding a new section to the docs?

@fafhrd91 (Member)

@kxepal is it possible to do something like this, without skipping data:

reader = MultipartReader.from_response(resp)
data = {}

part = yield from reader.next()
while part is not None:
    data[part.name] = part
    part = yield from reader.next()

itemData = yield from data["name"].read()

@kxepal (Member, Author) commented Mar 11, 2015

Oh, great! Will send PR shortly.
