Chunked stream memory leak #3631

Closed
@mvanderkroon

Description

Long story short

I'm trying to stream data (via POST) to an aiohttp server instance, potentially hundreds of gigabytes. My data is typically not stored in files and may be generated on the fly in client processes. A multipart upload does not fit my needs, and I am not using the data = await request.post() shorthand, which the docs make clear will OOM for large files.

I'm trying to use the underlying StreamReader (request._payload) to iterate over the stream line by line. When I do, the aiohttp server consumes more and more memory until the application OOMs.

Expected behaviour

Processing a stream of data in an aiohttp server should not cause OOMs.

Actual behaviour

The aiohttp server OOMs on large streams of data.

Steps to reproduce

aiohttp server

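# server.py
from aiohttp import web

async def process_stream(request):
    async for line in request._payload:
        pass

    return web.json_response({"STATUS": "OK"})

app = web.Application()
app.router.add_post("/", process_stream)
web.run_app(app, port=8080)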

requests client

# client.py
import requests

def generate_data():
    # Endless stream of newline-terminated lines.
    while True:
        yield b"hello world\n"

r = requests.post("http://localhost:8080/", data=generate_data())
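For a reproduction that terminates on its own, a bounded variant of the generator may be handy. This is only a sketch: the name generate_bounded_data and the total_bytes parameter are hypothetical and not part of the original report.

```python
def generate_bounded_data(total_bytes, line=b"hello world\n"):
    """Yield copies of `line` until adding one more would exceed `total_bytes`."""
    sent = 0
    while sent + len(line) <= total_bytes:
        yield line
        sent += len(line)
```

It can be passed to requests the same way as the endless generator, e.g. data=generate_bounded_data(10 * 1024 ** 3) for roughly 10 GiB.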

Additional info

I found a resource relating to asyncio and StreamReader/StreamWriter backpressure. I have done my best to read through the aiohttp source, but it looks like the fixes mentioned in that document are already in place, so I'm not sure why this is not working.

In fact, I'm not sure whether the memory increase is due to aiohttp (or an underlying library) holding references to elements in memory, or whether the producing process is simply pushing data into the queue faster than aiohttp is consuming it. The latter case would suggest a problem with backpressure.
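To make the backpressure hypothesis concrete, here is a minimal sketch of what working backpressure looks like in plain asyncio. It is not aiohttp code and does not reproduce the leak. A bounded asyncio.Queue stands in for the stream buffer, and the names producer, consumer, run_demo and high_water are illustrative assumptions.

```python
import asyncio


async def producer(queue, n_items, high_water):
    # put() suspends once the queue is full, so the producer
    # cannot run ahead of the consumer by more than maxsize items.
    for _ in range(n_items):
        await queue.put(b"hello world\n")
        high_water[0] = max(high_water[0], queue.qsize())
    await queue.put(None)  # sentinel: end of stream


async def consumer(queue):
    count = 0
    while True:
        item = await queue.get()
        if item is None:
            return count
        count += 1
        await asyncio.sleep(0)  # yield to the event loop, simulating work


async def run_demo(n_items=1000, maxsize=8):
    queue = asyncio.Queue(maxsize=maxsize)
    high_water = [0]  # largest queue size observed by the producer
    task = asyncio.create_task(producer(queue, n_items, high_water))
    consumed = await consumer(queue)
    await task
    return consumed, high_water[0]
```

If backpressure were working end to end, I would expect the buffered data on the server to stay bounded in the same way, regardless of how fast the client produces it.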

Your environment

server
aiohttp 3.5.4
alpine 3.7.0
