Memory leak with aiohttp.request #1756

Closed
ddunlop opened this Issue Mar 27, 2017 · 19 comments

ddunlop commented Mar 27, 2017

Long story short

I'm working on a long-running program that makes a significant number of HTTPS requests, and the memory of the program increases steadily.

Expected behaviour

I would expect the memory to remain relatively constant.

Actual behaviour

For a sample run of 500 requests, the memory increases by approximately 8 MB just from making the requests.

Steps to reproduce

import logging
import asyncio
import aiohttp
import gc


async def get():
    # single request via the module-level helper (no shared ClientSession)
    response = await aiohttp.request('get', 'https://ddunlop.com')
    logging.info(response.status)
    response.close()


async def main(loop):
    await asyncio.sleep(10)  # pause so a baseline memory reading can be taken
    for x in range(0, 500):
        await get()

    logging.info('done fetching')
    gc.collect()
    await asyncio.sleep(60)  # keep the process alive so memory can be inspected after GC


if __name__ == '__main__':
    logging.getLogger().setLevel('DEBUG')
    loop = asyncio.get_event_loop()
    loop.run_until_complete(main(loop))

Your environment

aiohttp 2.0.2 inside a python:3.6.0-slim docker container.
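One way to put a number on the growth from inside the script is to sample the process's peak RSS around the request loop. A minimal sketch using the standard-library resource module (the sampling points are only a suggestion):

import logging
import resource


def peak_rss_kib():
    # ru_maxrss is the peak resident set size: KiB on Linux, bytes on macOS
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


# e.g. inside main(), around the request loop:
#     before = peak_rss_kib()
#     for x in range(0, 500):
#         await get()
#     logging.info('peak RSS grew by %d KiB', peak_rss_kib() - before)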

fafhrd91 added a commit that referenced this issue Mar 27, 2017

fafhrd91 added a commit that referenced this issue Mar 27, 2017

Member

fafhrd91 commented Mar 27, 2017

Should be fixed, will release a new version today.
Thanks for the report!

@fafhrd91 fafhrd91 closed this Mar 27, 2017

Member

fafhrd91 commented Mar 27, 2017

2.0.4 is released

ddunlop commented Mar 28, 2017

Thanks for jumping on this.

Running my example code, I still see the memory climb as the requests are made and then not drop after garbage collection, so I suspect there are more memory issues.

Member

fafhrd91 commented Mar 28, 2017

Does it grow all the time? gc does not show any leaked objects.

ddunlop commented Mar 28, 2017

As the loop of 500 requests runs, the memory for the Python process grows; then, while the sleep is happening, it stays static.

@fafhrd91 fafhrd91 reopened this Mar 28, 2017

Contributor

alex-eri commented Mar 28, 2017

Are connections closed after each request? Look at netstat
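If netstat isn't convenient, a rough in-process check works too. A sketch that assumes the third-party psutil package is installed:

import psutil

# count TCP connections still held open by this process
conns = psutil.Process().connections(kind='tcp')
established = [c for c in conns if c.status == psutil.CONN_ESTABLISHED]
print(len(established), 'established TCP connections')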

ddunlop commented Mar 28, 2017

@alex-eri I checked netstat and it looks like there aren't any connections left around.

Rapptz commented Mar 29, 2017

I was looking into this.

After messing around with objgraph I found an abnormal growth in memory (33 MiB to 500 MiB in a timespan of 12 hours). I originally thought it was my code but some inspection led me to this library instead.

Looking at the growth of object references over time I found the following:

dict                       27731    +22390
deque                       9730     +8927
method                      9802     +8926
ServerDisconnectedError     4848     +4465
weakref                     6668     +4464
SSLProtocol                 4850     +4464
_SSLProtocolTransport       4850     +4464
ResponseHandler             4849     +4464
_SSLPipe                    4850     +4464
SSLObject                   4850     +4464

Which was odd given that I don't store references to any of those cases.

I began by investigating the ServerDisconnectedError reference chain, since I found it odd to store references to those, but that just led me back to ResponseHandler. I decided to print a reference graph to see where the backreferences lead.

It ends at some list. I checked out the list, and it's just a list of _SSLProtocolTransport objects that grows indefinitely. I am not sure what the source of this list is. I'm still investigating, but I figured I'd post some progress and information here.
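For anyone who wants to reproduce this kind of analysis, a minimal objgraph sketch (the type name, depth, and output file are only examples; rendering the graph requires graphviz):

import gc
import objgraph

gc.collect()
objgraph.show_growth(limit=10)  # prints a delta table like the one above

# follow back-references from a few leaked objects to whatever is holding them
leaked = objgraph.by_type('ResponseHandler')[:3]
objgraph.show_backrefs(leaked, max_depth=6, filename='responsehandler-backrefs.png')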

I should note that I do not use aiohttp.request; I use ClientSession.request instead, keeping the ClientSession alive myself.

Apologies if this is not helpful.
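The long-lived-session pattern mentioned above looks roughly like this (a sketch; the URL and request count are placeholders):

import asyncio
import aiohttp


async def main():
    # one ClientSession (and therefore one connector) reused for every request,
    # instead of the module-level aiohttp.request() helper
    async with aiohttp.ClientSession() as session:
        for _ in range(500):
            async with session.get('https://example.com') as response:
                await response.read()


if __name__ == '__main__':
    asyncio.get_event_loop().run_until_complete(main())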

Member

fafhrd91 commented Mar 29, 2017

I guess asyncio's SSL transport implementation is just bad. Could you try uvloop?
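For reference, switching to uvloop is a small change made before the loop is created (a sketch; assumes uvloop is installed):

import asyncio
import uvloop

# install uvloop's event loop policy before any event loop is created
asyncio.set_event_loop_policy(uvloop.EventLoopPolicy())

loop = asyncio.get_event_loop()  # now a uvloop loop
# loop.run_until_complete(main(loop))  # run the reproduction as before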

Rapptz commented Mar 29, 2017

@fafhrd91 More investigation points this to aiohttp.connector.TCPConnector. It seems the list in question is Connector._cleanup_closed_transports, and it is never cleaned up; it just grows indefinitely.
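A rough way to watch that list from application code, using the private attribute names mentioned in this thread (these are connector internals, so this is a debugging sketch only and may break between releases):

import logging


def log_connector_state(session):
    # _cleanup_closed_transports and _cleanup_closed_handle are private
    # TCPConnector attributes; read them only for debugging
    connector = session.connector
    transports = getattr(connector, '_cleanup_closed_transports', [])
    handle = getattr(connector, '_cleanup_closed_handle', None)
    logging.info('pending closed transports: %d', len(transports))
    if handle is not None:
        logging.info('cleanup timer scheduled for loop time %s', handle._when)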

Member

fafhrd91 commented Mar 29, 2017

Do you see this with 2.0.4?

Rapptz commented Mar 29, 2017

2.0.4 is what I'm using.

Member

fafhrd91 commented Mar 29, 2017

ok, I will check

Noctem commented Mar 29, 2017

I'm using uvloop but also seem to be experiencing a memory leak. My application has been running for 30 hours making roughly 10 requests per second and my memory usage is ~450MB when it would normally be ~150MB.

Rapptz commented Mar 29, 2017

@fafhrd91 Some more investigation that I did out of curiosity.

It seems to me that you intended Connector._cleanup_closed_handle to be called repeatedly. However, some inspection makes it seem like it is only called once and never again.

Checking its _when attribute relative to loop.time() yields something like (352676, 363576.031), which would place it as having been called once but never again in the 11000 seconds afterwards (around ~3 hours, since I started running the script).

Making sure that _cleanup_closed_handle keeps running is potentially the fix for this issue. I would do a PR myself, but I do not really know the code structure. I am not sure how _cleanup is supposed to run, as it instantly cancels the timer and then attempts to recreate it, but it seems it never gets the chance to.

Hope these findings help you.
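The periodic behaviour described above comes down to a timer callback that re-arms itself. A generic sketch of that pattern (not aiohttp's actual implementation):

import asyncio


def schedule_cleanup(loop, interval, state):
    # a callback scheduled with call_later runs exactly once; to run periodically
    # it has to reschedule itself, which is the step suspected to be missing here
    def cleanup():
        state['runs'] += 1
        state['handle'] = loop.call_later(interval, cleanup)

    state['handle'] = loop.call_later(interval, cleanup)


async def main():
    loop = asyncio.get_event_loop()
    state = {'runs': 0, 'handle': None}
    schedule_cleanup(loop, 1.0, state)
    await asyncio.sleep(5)
    print('cleanup ran', state['runs'], 'times')  # roughly 4-5 with a 1 second interval
    state['handle'].cancel()


if __name__ == '__main__':
    asyncio.get_event_loop().run_until_complete(main())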

Member

fafhrd91 commented Mar 29, 2017

I found the problem, working on a fix.

Member

fafhrd91 commented Mar 29, 2017

Pushed a fix to the 2.0 branch.

@Rapptz @Noctem could you test?

Rapptz commented Mar 29, 2017

@fafhrd91 I have tested it and it does seem to fix the issue.

Member

fafhrd91 commented Mar 29, 2017

ok, good

@fafhrd91 fafhrd91 closed this Mar 29, 2017
