A nursery might inject a Cancelled even when none of the tasks received a Cancelled #1457

vxgmichel · 2020-04-19T10:30:43Z

I noticed that nurseries might inject Cancelled exception even when none of the tasks have received a Cancelled. I have two different examples reproducing this issue.

A nursery with a single task raising an exception

The following code runs a nursery with a single task raising an exception while the outer scope has been cancelled:

async def main():
    with trio.CancelScope() as scope:
        try:
            async with trio.open_nursery() as nursery:
                scope.cancel()
                raise RuntimeError
        except trio.MultiError as exc:
            print("Exceptions:", exc)
            print("Cancelled caught:", nursery.cancel_scope.cancelled_caught)

It prints:

Exceptions: RuntimeError(), Cancelled()
Cancelled caught: False

The Cancelled() exception is injected here:

trio/trio/_core/_run.py

Lines 845 to 851 in fbe3bd3

    
           # Nothing to wait for, so just execute a checkpoint -- but we 
        
           # still need to mix any exception (e.g. from an external 
        
           # cancellation) in with the rest of our exceptions. 
        
           try: 
        
               await checkpoint() 
        
           except BaseException as exc: 
        
               self._add_exc(exc)

A potential fix could to be to use cancel_shielded_checkpoint() instead of checkpoint().

A nursery with a two tasks, both raising an exception

The following code runs a nursery with two tasks both raising an exception while the outer scope has been cancelled:

async def main():
    with trio.CancelScope() as scope:
        try:
            async with trio.open_nursery() as nursery:
                nursery.start_soon(_raise, RuntimeError(1))
                scope.cancel()
                raise RuntimeError(2)
        except trio.MultiError as exc:
            print("Exceptions:", exc)
            print("Cancelled caught:", nursery.cancel_scope.cancelled_caught)

It prints:

Exceptions: RuntimeError(2), Cancelled(), RuntimeError(1)
Cancelled caught: False

The Cancelled() exception is injected here:

trio/trio/_core/_run.py

Lines 835 to 840 in fbe3bd3

    
           # If we get cancelled (or have an exception injected, like 
        
           # KeyboardInterrupt), then save that, but still wait until our 
        
           # children finish. 
        
           def aborted(raise_cancel): 
        
               self._add_exc(capture(raise_cancel).error) 
        
               return Abort.FAILED

I don't see the point of injecting a Cancelled here, since the cancellation is going to be ignored anyway: isn't it a rule of trio that a cancelled (in the sense of cancel_called) but completed operation should not raise a Cancelled?

Raising Cancelled means that the operation did not happen.
https://trio.readthedocs.io/en/latest/reference-core.html#cancellable-primitives

I feel like it would be more consistent with the trio cancellation semantics to see the nursery cleanup as:

try:
    yield nursery
finally:
    with trio.CancelScope(shield=True):
        await nursery.join_tasks()

The text was updated successfully, but these errors were encountered:

njsmith · 2020-08-25T07:42:27Z

This bit @catern today: https://gitter.im/python-trio/general?at=5f447dd3ec534f584fbcf7d7

I think this issue is correct, and nursery __aexit__ should only raise Cancelled if one of the child tasks raises Cancelled.

@oremanj

Previously, a Nursery would checkpoint on exit, regardless of whether it was running anything. This is a bit annoying, see python-trio#1457, so let's change it to not checkpoint on exit. Now it won't be a checkpoint at all if there's no tasks running inside. However, this doesn't fully solve python-trio#1457, because we'll still inject a cancel even if none of the child tasks raise Cancelled. This seems to break trio/_core/tests/test_asyncgen.py::test_last_minute_gc_edge_case such that it no longer ever hits the edge case. That seems like a pretty complicated test, so maybe @oremanj can say better why exactly it's breaking with this change.

oremanj · 2020-08-26T19:03:03Z

I think exiting a nursery should always be a schedule point, and should only be a cancellation point if the nursery didn't run any tasks (note this is different from the "no tasks were still running when we exited the async with block" in #1696). That means an async function that opens a nursery and runs zero or more tasks in it will have at least one full checkpoint as long as each individual task had at least one full checkpoint. And that makes it easier to reason about whether we're upholding the "all async functions provided by Trio execute at least one checkpoint" property.

catern · 2020-08-26T19:28:50Z

For me, my problems wouldn't be solved if the nursery injected a Canceled when none of the tasks exited with a Canceled. I think injecting a cancel in that way makes it harder for user code to provide the correct Canceled semantics...

Raising Cancelled means that the operation did not happen.

...as mentioned at the start of this issue. Most code using nurseries, that wants to provide the correct Cancelled semantics, will need to catch Cancelled, as I do here https://github.com/catern/rsyscall/blob/concur/python/rsyscall/tasks/clone.py#L83 and here https://github.com/catern/rsyscall/blob/concur/python/rsyscall/concurrency.py#L213 and probably need to do in other places as well.

So I think nurseries shouldn't be a cancellation point at all, because otherwise I have this pain. I don't really care about whether it's a schedule point, but it seems unnecessary for it to be a schedule point, so I'd also get rid of that.

oremanj · 2020-08-26T19:36:57Z

My goal is to depart as little as possible from Trio's semantics of "every async function in the trio namespace executes at least one checkpoint, and every async context manager executes at least one on either entry or exit". I don't care that much what happens in a nursery that has already run any tasks, because those tasks should have provided the checkpoint. I do care that an infinite loop of "run all these tasks in a nursery" doesn't hang the program if there are no tasks. (And that requires that a nursery-containing-no-tasks be a full checkpoint.)

catern · 2020-08-26T19:44:10Z

Hm, okay, how about a checkpoint on nursery entry, then?

oremanj · 2020-08-26T19:55:02Z

Checkpoint on nursery entry is a non-starter because there is lots of code that relies on the current behavior of checkpointing on exit only. "Start a task in a cancelled scope" is very different from "don't even start the task", especially when you consider that aclose() methods have well-defined semantics (that are not "do nothing") when run in a cancelled scope.

It sounds like we're in agreement that a nursery that runs any tasks shouldn't raise Cancelled if none of the tasks raised Cancelled. Can you explain why the case of a nursery that runs no tasks is important to you?

I would even be willing to consider letting a nursery that runs no tasks skip the checkpoint if any checkpoints occurred inside the async with block, though that's getting a little magical for my tastes.

njsmith · 2020-08-26T19:55:14Z

@oremanj How do you feel about this not containing any checkpoints?

async def noop():
    pass

async with trio.open_nursery() as nursery:
   nursery.start_soon(noop)

?

Every nursery that contains a child task has to contain at least one schedule point (proof: the nursery can't exit until the child task has been scheduled), so adding a schedule point to the __aexit__ for always-empty-nurseries does strengthen that guarantee to "every nursery contains at least one schedule point". But adding a cancel point to always-empty-nurseries doesn't.

I would hesitate to add a checkpoint to nursery entry because:

async def handler(stream: Stream):
    async with stream:
        ...

stream = await open_stream(...)
async with trio.open_nursery() as nursery:
    nursery.start_soon(handler, stream)

Right now, the above code guarantees that stream will be closed; if we add a checkpoint to nursery entry then that guarantee goes away.

oremanj · 2020-08-26T20:01:33Z

@njsmith I'm fine with the noop example not containing any cancel points, because I think of nurseries as a compositional primitive: if the elements (tasks) obey some property of interest, then their composition should too. I'm more worried about patterns like:

async def run_all(funcs):
    async with trio.open_nursery() as nursery:
        for func in funcs:
            nursery.start_soon(func)

and then await run_all([]) now not executing any checkpoints. It's not a hard no, just a concern.

Maybe a more realistic example of "has child tasks, but can't actually take a cancellation" involves child tasks that run in a shielded scope.

catern · 2020-08-26T20:48:12Z

It sounds like we're in agreement that a nursery that runs any tasks shouldn't raise Cancelled if none of the tasks raised Cancelled. Can you explain why the case of a nursery that runs no tasks is important to you?

Ah, I didn't understand. Yes, that's fine - I don't care about the case of a nursery that runs no tasks, it can have a cancellation point or not.

I thought you were referring to the situation where a nursery was exiting when all its tasks had already exited. But I see now that we agree that in that situation, a nursery should not provide a cancellation point.

await run_all([])

IMO, this shouldn't raise Cancelled, because it successfully performed its task: It did nothing, successfully. But I don't feel too strongly about it, since it isn't an issue for me right now.

njsmith · 2020-08-26T23:42:37Z

I think of nurseries as a compositional primitive: if the elements (tasks) obey some property of interest, then their composition should too.

Yeah, that's also the intuition for why, maybe, an always-empty nursery should not do a checkpoint :-)

I.e. in general we expect these to be mostly equivalent:

await foo()

async with trio.open_nursery() as nursery:
    nursery.start_soon(foo)

Right now they're not, b/c the nursery injects a checkpoint. Injecting a schedule point is unavoidable, but more-or-less harmless, since schedule points have minimal effect on user-visible behavior. Injecting a cancel point is avoidable, though, and as this issue shows it causes practical problems, and it's just kind of surprising b/c of how it breaks the equivalence above.

Generalizing, we expect these to be more-or-less similar:

for f in fs:
    await f()

async with trio.open_nursery() as nursery:
    for f in fs:
        nursery.start_soon(f)

Obviously the latter runs the functions concurrently rather than sequentially, and there are some semantic differences that follow directly from that – in particular, it means that if two calls raise exceptions, we have to report them both, inside of the first one breaking out of the loop. And again it forces at least one schedule point. But again again, you wouldn't expect it to inject a cancellation point.

So by this argument, you would expect these two trivial cases to be equivalent:

pass  # don't call anything

async with trio.open_nursery() as nursery:
    pass  # don't start anything

...but of course that's exactly what putting a checkpoint in always-empty-nurseries would change, so we have two sets of principles that we all agree on but that contradict each other here.

I guess one way to look at it is: is a nursery block an "operation" (so should be a checkpoint), or just scaffolding for arranging other operations?

oremanj · 2020-08-26T23:49:33Z

Mm, I see your point. We've generally taken the point of view that an async context manager must perform a checkpoint-y "operation" on either entry or exit (or both), but I think nurseries deserve to break that rule if anyone does. I'm happy for nursery exit to be an unconditional schedule point and unconditional lack of a cancellation point, if that sounds good to you. (We could drop the schedule point if the nursery has ever started a task, but I'm not sure that helps any practical use case enough to pay for the weirdness -- dropping it is maybe better for performance, but that seems better served by having a more general mechanism for dropping schedule points if we've scheduled recently enough.)

njsmith · 2020-09-01T13:08:06Z

I'm happy for nursery exit to be an unconditional schedule point and unconditional lack of a cancellation point, if that sounds good to you.

Let's do it.

We could drop the schedule point if the nursery has ever started a task, but I'm not sure that helps any practical use case enough to pay for the weirdness -- dropping it is maybe better for performance, but that seems better served by having a more general mechanism for dropping schedule points if we've scheduled recently enough.

Yeah, let's not bother with trying to micro-optimize this right now. (And yeah, eliding schedule points if we've scheduled recently is probably a good idea, but orthogonal to the rest of this.)

Previously, a Nursery could raise Cancelled even if all its children had completed successfully. This is undesirable, as discussed in python-trio#1457, so now a Nursery will shield itself from cancels in __aexit__. A Nursery with children can still be cancelled fine: if any of its children raise Cancelled, then the whole Nursery will raise Cancelled. If none of the Nursery's children raise Cancelled, then the Nursery will never raise Cancelled. As another consequence, a Nursery which never had any child tasks will never raise Cancelled. As yet another consequence, since an internal Nursery is used in the implementation of Nursery.start(), if start() is cancelled, but the task it's starting does not raise Cancelled, start() won't raise Cancelled either.

catern · 2020-09-01T15:52:49Z

I went ahead and made the additional changes in #1696 but there are some additional implications.

As yet another consequence, since an internal Nursery is used in the
implementation of Nursery.start(), if start() is cancelled, but the
task it's starting does not raise Cancelled, start() won't raise
Cancelled either.

See the updated tests in the PR as well. I think I argue that this is correct.

catern · 2020-09-01T15:54:45Z

Incidentally, all this also reduces the rate of MultiErrors, which is nice.

richardsheridan · 2023-10-21T17:09:51Z

Fixed in #1696

vxgmichel changed the title ~~Nurserys injecting Cancelled even when none of the tasks received a Cancelled~~ A nursery might inject a Cancelled even when none of the tasks received a Cancelled Apr 19, 2020

oremanj added the cancellation label May 12, 2020

catern mentioned this issue Aug 26, 2020

Nursery: don't act as a checkpoint when not running anything #1696

Merged

oremanj mentioned this issue Jan 24, 2023

Cancellation blocks task_status.started #2544

Open

Zac-HD mentioned this issue Mar 17, 2023

strict_exception_groups=True "leaks" the implementation-detail nursery in nursery.start(fn) #2611

Closed

oremanj mentioned this issue Jul 14, 2023

Updated cancellation semantics to better match the new trio and asyncio ones agronholm/anyio#586

Merged

richardsheridan closed this as completed Oct 21, 2023

ghost mentioned this issue Nov 4, 2023

I am having trouble with basic Sockets. #2690

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

A nursery might inject a Cancelled even when none of the tasks received a Cancelled #1457

A nursery might inject a Cancelled even when none of the tasks received a Cancelled #1457

vxgmichel commented Apr 19, 2020 •

edited

njsmith commented Aug 25, 2020

oremanj commented Aug 26, 2020

catern commented Aug 26, 2020

oremanj commented Aug 26, 2020 •

edited

catern commented Aug 26, 2020

oremanj commented Aug 26, 2020

njsmith commented Aug 26, 2020

oremanj commented Aug 26, 2020

catern commented Aug 26, 2020

njsmith commented Aug 26, 2020

oremanj commented Aug 26, 2020

njsmith commented Sep 1, 2020

catern commented Sep 1, 2020

catern commented Sep 1, 2020

richardsheridan commented Oct 21, 2023

A nursery might inject a Cancelled even when none of the tasks received a Cancelled #1457

A nursery might inject a Cancelled even when none of the tasks received a Cancelled #1457

Comments

vxgmichel commented Apr 19, 2020 • edited

A nursery with a single task raising an exception

A nursery with a two tasks, both raising an exception

njsmith commented Aug 25, 2020

oremanj commented Aug 26, 2020

catern commented Aug 26, 2020

oremanj commented Aug 26, 2020 • edited

catern commented Aug 26, 2020

oremanj commented Aug 26, 2020

njsmith commented Aug 26, 2020

oremanj commented Aug 26, 2020

catern commented Aug 26, 2020

njsmith commented Aug 26, 2020

oremanj commented Aug 26, 2020

njsmith commented Sep 1, 2020

catern commented Sep 1, 2020

catern commented Sep 1, 2020

richardsheridan commented Oct 21, 2023

vxgmichel commented Apr 19, 2020 •

edited

oremanj commented Aug 26, 2020 •

edited