Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion: ASYNC119 when using an async context manager in an async generator #211

Open
alicederyn opened this issue Feb 27, 2024 · 23 comments

Comments

@alicederyn
Copy link

alicederyn commented Feb 27, 2024

Antipattern: The following code is unsafe:

async def yield_some_values():
    async with some_async_context_manager():
        for _ in range(5):
            ...
            yield some_value  # error: `yield` inside `async with` block

async def bad_method():
    async for value in yield_some_values():
        raise Exception("exiting early")

Explanation: If the iterator is not fully consumed, the cleanup path will only be triggered by the garbage collector, which will not allow the async cleanup logic to await (it will raise GeneratorExit when it tries). A similar issue arises if you try to await in a finally block in an async generator, or an except block that can intercept asyncio.exceptions.CancelledError.

The only exception to this that I'm aware of is if the async generator is decorated with @contextlib.asynccontextmanager.

Fix: I think unfortunately the only fix is to refactor the API to separate the context management from the iteration, e.g.

@contextlib.asynccontextmanager
async def yield_some_values_safely():
    async with some_async_context_manager():
        async def inner_iterator():
            for _ in range(5):
                ...
                yield some_value
        yield inner_iterator  # safe because this is in an asynccontextmanager

async def bad_method():
    async with yield_some_values_safely() as values:
        async for value in values:
            raise Exception("exiting early")

I'm not aware of any existing lint check for this antipattern (I may have missed it though!) — do the maintainers of this repo think it would be valuable?

@alicederyn alicederyn changed the title Suggestion: error when using an async context manager in an async iterator Suggestion: error when using an async context manager in an async generator Feb 27, 2024
@alicederyn
Copy link
Author

alicederyn commented Feb 27, 2024

Actually, I'm wondering if this is exactly ASYNC101 and I just don't understand what a "nursery or cancel scope" is? In which case this may just be a request to modify the wording in the README to say "async with" block as well?

@Zac-HD
Copy link
Member

Zac-HD commented Feb 27, 2024

Oh boy, this is much harder than it sounds - see https://peps.python.org/pep-0533/ and python-trio/trio#265 for some of the details. Additionally you can get similar issues if you yield across a CancelScope (a sync context manager), including having exceptions including cancellations delivered to the wrong task 😱

My personal view is that async generators are simply too dangerous and too difficult to use correctly, and so at work we've banned them entirely via the TRIO900 lint rule - soon to be ASYNC900. To make async generators reasonably safe, I think we'd need to

  1. revive something like PEP-533, which would be a considerable effort given the compatibility effects and general sense among most Python users and developers that the status quo is fine. There is some interest, but I'm not sure it's feasible.
  2. write a new PEP, proposing a mechanism by which a frame can be marked "error on attempt to yield across this", and have CancelScope apply such a marker. @njsmith had a draft of this a while ago we might be able to revive?
  3. Add an optional lint rule as proposed here; it'd trigger on a (large?) subset of cases caught by ASYNC900 but I guess it's technically possible to write correct async generators, at least in small codebases and teams. (If I sound bitter about this, it's because I tried really hard and it really didn't work)

Overall, designing around channels instead of generators can feel clumsy at first, but I've found it leads to cleaner and safer designs. And unrelated but relevant, consider using anyio rather than asyncio for better cancellation semantics?

@jakkdl
Copy link
Member

jakkdl commented Feb 29, 2024

Actually, I'm wondering if this is exactly ASYNC101 and I just don't understand what a "nursery or cancel scope" is? In which case this may just be a request to modify the wording in the README to say "async with" block as well?

nursery/cancelscope are anyio/trio constructs. I can try to clarify that in the docs.
Can also try to clarify the reason behind why you might want to enable ASYNC900 check.

The proposed check here would be:

Error if yield inside an async context manager, or inside a CancelScope, unless the method is marked with asynccontextmanager

?

A similar issue arises if you try to await in a finally block in an async generator or an except block that can intercept asyncio.exceptions.CancelledError.

This is ASYNC102, no? Except ASYNC102 hasn't been updated to also check for asyncio.exceptions.CancelledError

@alicederyn
Copy link
Author

alicederyn commented Feb 29, 2024

The proposed check here would be:

Error if yield inside an async context manager, or inside a CancelScope, unless the method is marked with asynccontextmanager

?

Yes to async context manager, defer to @Zac-HD on CancelScope

This is ASYNC102, no? Except ASYNC102 hasn't been updated to also check for asyncio.exceptions.CancelledError

Yes it is! Thank you

@Zac-HD
Copy link
Member

Zac-HD commented Feb 29, 2024

Oops, it took me long enough to write my comment that I missed Alice's - sorry!

Agree that ASYNC102 + asyncio support is the solution to that part.

I'm planning to write up some more detail about each of our error codes in a long doc, hopefully this weekend, which will include much more about why you'd want to enable each of the optional rules in addition to the existing material. For ASYNC900 that'll partly be based on my comment above. Should also be useful for ruff docs too 🙂

@Zac-HD
Copy link
Member

Zac-HD commented Mar 17, 2024

OK, writing up some notes on what we can implement here. I'm still pretty dubious about having async generators at all, but if you've already either put async with aclosing(...) as ait: everywhere or are using asyncio + given up on prompt resource cleanup, I guess it's not that bad. Avoiding context managers does mean you can't yield across cancel scopes, so we at least avoid that additional problem.

ASYNC119: async generator cleanup may be delayed until await is no longer allowed. Avoid yield inside (async) context managers, awaiting in finally: or cancel-catching except: blocks. We strongly encourage you to read PEP-533 and use async with aclosing(...), or better yet avoid async generators entirely (see ASYNC-900) in favor of context managers which return an iterable channel/queue.

@Zac-HD Zac-HD changed the title Suggestion: error when using an async context manager in an async generator Suggestion: ASYNC119 when using an async context manager in an async generator Mar 17, 2024
@alicederyn
Copy link
Author

alicederyn commented Mar 17, 2024

notes on what we can implement here...awaiting in finally: or cancel-catching except: blocks

Just to check, is it not possible for the plugin to only prevent the user awaiting in finally/except blocks that did a yield in the try? I think the following is safe:

async def foo():
  while True:
    foo = foo()
    try:
      await foo.aopen()
      value = await foo.get_result()
    finally:
      await foo.aclose()
    yield value

(At this point, it wouldn't surprise me immensely to learn it's actually not safe, but if so I'd love to know why...)

iterable channel/queue

I don't know what an iterable channel is, and people may not have come across asyncio.Queue, so it would be great if these could link to definitions.

better yet avoid async generators entirely (see ASYNC-900) in favor of context managers which return an iterable channel/queue.

Isn't returning an async generator which consumes from an async queue just as safe as returning that queue? (Asking because I own code that does that pattern.)

@Zac-HD
Copy link
Member

Zac-HD commented Mar 18, 2024

only prevent the user awaiting in finally/except blocks that did a yield in the try

Yeah, I think this is safe, though I'm not especially confident.

I don't know what an iterable channel is, and people may not have come across asyncio.Queue, so it would be great if these could link to definitions.

We should definitely put links in our docs; for the error messages I think using fully-qualified names and matching the error message to the library you're using would be sufficient. Trio and anyio use channels; it's basically decomposing the send-end and recieve-end of a queue into separate objects, and recieve-channels are async iterables.

Isn't returning an async generator which consumes from an async queue just as safe as returning that queue? (Asking because I own code that does that pattern.)

IIRC the problems with this pattern are if you yield across the boundary of a CancelScope. The closest asyncio interface is with asyncio.timeout(...): and probably also TaskGroup; I'm not sure whether you'd actually observe any bad behavior in asyncio and without an async-iterable interface to queues it sounds like an important bit of ergonomics.

@jakkdl
Copy link
Member

jakkdl commented Apr 15, 2024

So the check to be implemented is:

Error on yield, if inside any context manager, in an async function, that is not marked @asynccontextmanager.

And separately, ASYNC102 could be updated to only error if there's a yield inside the corresponding try?
Current ASYNC102 is

ASYNC102: It's unsafe to await inside finally: or except BaseException/trio.Cancelled/anyio.get_cancelled_exc_class()/asyncio.exceptions.CancelledError unless you use a shielded cancel scope with a timeout. This is currently not able to detect asyncio shields.

@Zac-HD
Copy link
Member

Zac-HD commented Apr 15, 2024

Check to be implemented: yes.

I think #210 was the only change needed for ASYNC102 though. If there's something else, can someone propose a code block that will trigger a linter false-alarm?

@alicederyn
Copy link
Author

alicederyn commented Apr 15, 2024

I haven't checked if this actually passes or fails ASYNC102, but this is an example that the current text looks like it would reject:

async def foo():
  bar = Bar()
  bar.aopen()
  try
    values = await bar.fetch()
  finally:
    await bar.aclose()
  for value in values: 
    yield await baz(value)

This would be totally fine, as there's no yield in the try block.

@Zac-HD
Copy link
Member

Zac-HD commented Apr 15, 2024

Hmm. I think you can still be cancelled while waiting for bar.fetch(), so the intent of ASYNC102 still applies here.

More generally this looks like Bar should be (or have) an async context manager, in which case you'd write:

async def foo():
    async with Bar() as bar:
        values = await bar.fetch()
    for value in values: 
        yield await baz(value)

and avoid the lint warning with shorter and less-mistake-prone code.

@alicederyn
Copy link
Author

alicederyn commented Apr 16, 2024

I think you can still be cancelled while waiting for bar.fetch(), so the intent of ASYNC102 still applies here.

Does the cancellation not get raised in the fetch() call in this case though, allowing the cleanup to await safely? I think the root issue is only triggered when cancellation happens during a yield, resulting in cleanup running in the GC.

More generally this looks like Bar should be (or have) an async context manager

The finally could also be a catch.

@Zac-HD
Copy link
Member

Zac-HD commented Apr 16, 2024

🤦‍♂️ my bad, I forgot that asyncio uses edge-triggered rather than anyio/trio's level-triggered cancellation semantics - in the latter, await-ing while cancelled will immediately raise another Cancelled exception. Plausibly ASYNC102 just isn't worth it for asyncio code, at least in the current form? That's really a separate issue though.

@alicederyn
Copy link
Author

alicederyn commented Apr 16, 2024

Plausibly ASYNC102 just isn't worth it for asyncio code, at least in the current form?

If the choice is between having it as-is and removing it completely, I'd say keep it. Better to catch real problems and leave the user to refactor or suppress edge cases.

in the latter, await-ing while cancelled will immediately raise another Cancelled exception

Wow. That seems deeply problematic. I assume there are ways to schedule asynchronous cleanup code somehow though, even when cancelled?

@Zac-HD
Copy link
Member

Zac-HD commented Apr 17, 2024

Yes, you can shield Trio and anyio cancel scopes, and that's what ASYNC102 is recommending 🙂

@jakkdl
Copy link
Member

jakkdl commented Apr 18, 2024

Plausibly ASYNC102 just isn't worth it for asyncio code, at least in the current form?

If the choice is between having it as-is and removing it completely, I'd say keep it. Better to catch real problems and leave the user to refactor or suppress edge cases.

I could make it not warn on awaits inside except asyncio.exceptions.CancelledError, though making it not warn inside finally probably is more trouble than it`s worth.

Yes, you can shield Trio and anyio cancel scopes, and that's what ASYNC102 is recommending 🙂

asyncio also has shields: https://docs.python.org/3/library/asyncio-task.html#asyncio.shield that I was going to add support for, but given that this check is not applicable for asyncio I don't think I should do that anymore and instead update the warning text that it can be ignored for asyncio.

@alicederyn
Copy link
Author

alicederyn commented Apr 18, 2024

given that this check is not applicable for asyncio

It is applicable, just too wide.

update the warning text that it can be ignored for asyncio

As long as the text is clear that it can only sometimes be ignored (and when), that SGTM

asyncio also has shields

I don't think they're comparable -- a shield doesn't actually prevent the task being cancelled unless you also hold a hard reference to the shielded task. Honestly asyncio.shield is very strange, I can think of a bunch of extra rules to catch issues trying to use it.

@Zac-HD
Copy link
Member

Zac-HD commented Apr 18, 2024

Given that the semantics are substantially different, I think I'd rather split the rule between the current trio/anyio support, and a new 3xx rule specific to asyncio. That's plausibly blocked on finding an asyncio-experienced contributor though, unless you'd be interested in joining the team?

@alicederyn
Copy link
Author

I can investigate whether that's possible with my current company (I work at Bloomberg), I know they have a process.

@jakkdl
Copy link
Member

jakkdl commented May 1, 2024

Uh, looking at ASYNC101 for #215 I think ASYNC119 will warn on all instances of ASYNC101 except for sync context managers - of which I can only think of [trio/anyio].CancelScope(). Do we want both codes to stick around, seeing as they kind of largely do the same thing - or should they be merged?

@Zac-HD
Copy link
Member

Zac-HD commented May 2, 2024

I think we should keep both; they're conceptually distinct. e.g. PEP-533 would resolve ASYNC119, but not 101 (that's a at-some-point-forthcoming PEP, cf this thread). And concretely, some users will want to suppress 119 but really shouldn't disable 101.

@jakkdl
Copy link
Member

jakkdl commented May 3, 2024

I think we should keep both; they're conceptually distinct. e.g. PEP-533 would resolve ASYNC119, but not 101 (that's a at-some-point-forthcoming PEP, cf this thread). And concretely, some users will want to suppress 119 but really shouldn't disable 101.

cool. I'll see about adding some text to that effect in the docs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants