Asynchronous generator memory leak #85401

Closed
zkonge mannequin opened this issue Jul 7, 2020 · 8 comments
Labels
3.8 (EOL) end of life 3.9 only security fixes 3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage topic-asyncio

Comments

@zkonge
Mannequin

zkonge mannequin commented Jul 7, 2020

BPO 41229
Nosy @terryjreedy, @njsmith, @asvetlov, @1st1, @achimnol, @achimnol, @miss-islington, @zkonge
PRs
  • bpo-41229: Update docs for explicit aclose()-required cases and add contextlib.aclosing() method #21545
  • bpo-41229: Add a whatsnew entry about contextlib.aclosing #23217
Files
  • leak.py
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2020-07-19.10:17:19.128>
    created_at = <Date 2020-07-07.12:08:08.160>
    labels = ['interpreter-core', '3.8', '3.9', '3.10', 'performance', 'invalid', 'expert-asyncio']
    title = 'Asynchronous generator memory leak'
    updated_at = <Date 2020-11-10.07:57:19.738>
    user = 'https://github.com/zkonge'

    bugs.python.org fields:

    activity = <Date 2020-11-10.07:57:19.738>
    actor = 'Joongi Kim'
    assignee = 'none'
    closed = True
    closed_date = <Date 2020-07-19.10:17:19.128>
    closer = 'njs'
    components = ['Interpreter Core', 'asyncio']
    creation = <Date 2020-07-07.12:08:08.160>
    creator = 'zkonge'
    dependencies = []
    files = ['49302']
    hgrepos = []
    issue_num = 41229
    keywords = []
    message_count = 8.0
    messages = ['373221', '373498', '373942', '373947', '373951', '373953', '373954', '380192']
    nosy_count = 8.0
    nosy_names = ['terry.reedy', 'njs', 'asvetlov', 'yselivanov', 'Joongi Kim', 'achimnol', 'miss-islington', 'zkonge']
    pr_nums = ['21545', '23217']
    priority = 'normal'
    resolution = 'not a bug'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue41229'
    versions = ['Python 3.8', 'Python 3.9', 'Python 3.10']

    @zkonge
    Mannequin Author

    zkonge mannequin commented Jul 7, 2020

    The resources held by an asynchronous generator are not released properly when it is used with the "asend" method.

    Besides, in Python 3.7 and earlier, a RuntimeError is raised when asyncio.run completes, but the message is puzzling:
    RuntimeError: can't send non-None value to a just-started coroutine

    In Python 3.8+, no exception is shown.

    Python 3.5 does not support yield in async functions, so it seems to be unaffected?

    @zkonge zkonge mannequin added 3.7 (EOL) end of life 3.10 only security fixes 3.8 (EOL) end of life 3.9 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) topic-asyncio performance Performance or resource usage labels Jul 7, 2020
    @terryjreedy terryjreedy removed 3.7 (EOL) end of life labels Jul 11, 2020
    @terryjreedy
    Member

    Only 3.8+ for bug fixes.

    @achimnol
    Mannequin

    achimnol mannequin commented Jul 19, 2020

    From the given example, if I add "await q.aclose()" after "await q.asend(123456)" it does not leak the memory.

    This is a good example showing that we should always wrap async generators with an explicit "aclosing" context manager (which does not exist yet in the stdlib).
    I'm already doing so by writing a custom library:
    https://github.com/achimnol/aiotools/blob/ef7bf0ce/src/aiotools/context.py#L152

    We may need to update the documentation to recommend explicit aclosing of async generators.

    @achimnol
    Mannequin

    achimnol mannequin commented Jul 19, 2020

    I've searched the Python documentation and the docs must be updated to explicitly state the necessity of aclose().

    References:
    https://docs.python.org/3/reference/expressions.html#asynchronous-generator-functions
    https://www.python.org/dev/peps/pep-0525/

    I'm not sure what the original authors' intention was, but to me it looks like calling aclose() is optional, and the responsibility for calling aclose() on async generators is left to the asyncgen-shutdown handler of the event loop.

    The example in this issue shows that we need to aclose asyncgens as soon as we are done with them, even long before shutting down the event loop.
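
As a rough illustration of the shutdown handler described in this comment: `asyncio.run()` calls `loop.shutdown_asyncgens()` before closing the loop, which closes any async generator that was started and never finished. The names `numbers`, `finalized`, and `keep_alive` below are hypothetical and exist only for this sketch.

```python
import asyncio

finalized = []   # names of generators whose finally block has run
keep_alive = []  # strong reference, so the generator is never garbage collected


async def numbers(name):
    try:
        n = 0
        while True:
            yield n
            n += 1
    finally:
        finalized.append(name)


async def main():
    gen = numbers("leftover")
    keep_alive.append(gen)
    await gen.__anext__()       # start the generator, then abandon it
    return list(finalized)      # snapshot taken while the loop is still running


during_run = asyncio.run(main())
after_run = list(finalized)     # asyncio.run() has closed the generator by now
```

Here the cleanup is deferred until loop shutdown, which is the behavior an explicit aclose() avoids.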

    @njsmith
    Contributor

    njsmith commented Jul 19, 2020

    Huh, this is very weird. I can confirm that the async generator objects aren't cleaned up until loop shutdown on asyncio.

    On the trio main branch, we don't yet use the set_asyncgen_hooks mechanism, and the async generator objects are cleaned up immediately.

    However, if I check out this PR that will add it: python-trio/trio#1564

    ...then we see the same bug happening with Trio: all the async generators are kept around until loop shutdown.

    Also, it doesn't seem to be a circular references issue – if I explicitly call gc.collect(), then the asyncgen destructors are still not called; only shutting down the loop does it.

    This doesn't make any sense, because asyncio/trio only keep weak references to the async generator objects, so they should still be freed.

    So maybe the set_asyncgen_hooks code introduces a reference leak on async generator objects, or something?
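
For readers unfamiliar with the mechanism under discussion, here is a minimal sketch of `sys.set_asyncgen_hooks` used without any event loop. The hook functions, the `drive` helper, and the `events` and `pending` lists are hypothetical names for illustration; this is not how asyncio or Trio implement their hooks. It assumes CPython, where `del` on the last reference finalizes the object immediately.

```python
import sys

events = []   # order in which things happen
pending = []  # generators handed to the finalizer, still waiting for aclose()


def firstiter(agen):
    events.append("firstiter")


def finalizer(agen):
    # Storing agen keeps it alive until something runs aclose() on it.
    events.append("finalizer")
    pending.append(agen)


def drive(awaitable):
    """Run an awaitable that never really suspends, without an event loop."""
    try:
        awaitable.send(None)
    except StopIteration as stop:
        return stop.value
    raise RuntimeError("awaitable suspended unexpectedly")


async def agen():
    try:
        yield 1
    finally:
        events.append("cleanup")


old_hooks = sys.get_asyncgen_hooks()
sys.set_asyncgen_hooks(firstiter=firstiter, finalizer=finalizer)
try:
    g = agen()
    drive(g.__anext__())           # first iteration: the firstiter hook fires
    del g                          # last reference dropped: the finalizer hook fires
    events.append("after del")
    drive(pending.pop().aclose())  # only now does the finally block run
finally:
    sys.set_asyncgen_hooks(*old_hooks)
```

The point of the sketch is that the finalizer hook receives the generator instead of destroying it, so cleanup happens only when whoever installed the hook gets around to running aclose().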

    @njsmith
    Contributor

    njsmith commented Jul 19, 2020

    ...On closer examination, it looks like that Trio PR has at least one test that checks that async generators are collected promptly after they stop being referenced, and that test passes:

    https://github.com/python-trio/trio/pull/1564/files#diff-c79a78487c2f350ba99059813ea0c9f9R38

    So, I have no idea what's going on here.

    @njsmith
    Contributor

    njsmith commented Jul 19, 2020

    Oh! I see it. This is actually working as intended.

    What's happening is that the event loop will clean up async generators when they're garbage collected... but this requires that the event loop get a chance to run. In the demonstration program, the main task creates lots of async generator objects, but never returns to the main loop. So they're all queued up to be collected, but it can't actually happen until you perform a real async operation. For example, try adding 'await asyncio.sleep(1)' before the input() call so that the event loop has a chance to run, and you'll see that the objects are collected immediately.

    So this is a bit tricky, but this is actually expected behavior, and falls under the general category of "don't block the event loop, it will break stuff".
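
The behavior described above can be reproduced with a small sketch. It is not the reporter's leak.py: `ticker` and `finalized` are hypothetical names, and the sketch assumes CPython, where dropping the last reference triggers finalization right away.

```python
import asyncio

finalized = []  # names of generators whose finally block has run


async def ticker(name):
    try:
        while True:
            yield name
    finally:
        finalized.append(name)


async def main():
    gen = ticker("a")
    await gen.__anext__()       # start the generator; it is now suspended at yield
    del gen                     # last reference gone: asyncio schedules aclose()

    before = list(finalized)    # the loop has not run yet, so nothing is cleaned up
    await asyncio.sleep(0.01)   # yield to the event loop so the scheduled aclose() runs
    after = list(finalized)
    return before, after


before, after = asyncio.run(main())
```

A code path that keeps creating and dropping generators without ever awaiting something that suspends would stay in the `before` state indefinitely.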

    @miss-islington
    Contributor

    New changeset 6e8dcda by Joongi Kim in branch 'master':
    bpo-41229: Update docs for explicit aclose()-required cases and add contextlib.aclosing() method (GH-21545)
    6e8dcda

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022