Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

implement open_nursery with explicit context manager #612

Merged
merged 10 commits into from
Aug 22, 2018

Conversation

belm0
Copy link
Member

@belm0 belm0 commented Aug 18, 2018

avoid use of @asynccontextmanager and @async_generator since they cause bugs as well as extraneous exception stack frames

TODO:

@belm0 belm0 requested review from smurfix and njsmith August 18, 2018 15:28
@belm0
Copy link
Member Author

belm0 commented Aug 18, 2018

I suspect we're lacking tests for open_nursery with a non-BaseException. When I fiddle with that case in the code it doesn't affect test output.

Copy link
Member

@njsmith njsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't looked at the tests yet because I'm on my phone, but two quick comments I noticed.

#
# def open_nursery():
# return NurseryManager()
nested_child_exc = exc if isinstance(exc, BaseException) else None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line is equivalent to nested_child_exc = exc :-). The context manager protocol says that exc can be either an exception object, or None. And all exception objects are instances of BaseException, by definition.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

def __enter__(self):
raise RuntimeError(
"use 'async with {func_name}(...)', not 'with {func_name}(...)'".
format(func_name=self._wrapper_func_name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wrapper func name is always open_nursery, so it can be hard coded here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@belm0
Copy link
Member Author

belm0 commented Aug 19, 2018

Regarding the two failing tests, it's clearly regarding the case of nursery being cancelled before a start task can respond with started(), but I don't understand the trio machinery well enough for the fix to be obvious. My observations:

  • in both cases, Nursery _nested_child_finished() is throwing a MultiError of all Cancelled, and the cancel scope __exit__ proceeds with propagating the exception (i.e. returns False).
  • it doesn't appear sufficient to test for this case and suppress the exception within NurseryManager, as it causes several other test failures

My tentative conclusion is that this can't be solved in NurseryManager; rather Nursery is needing some extra logic.

update: has something to do with the system nursery's entry_queue. The entry_queue.spawn() task also raises Cancelled for the case in question, and the cancel scope appears to not be properly suppressing the exception

--> exc._scope is self trickery in CancelScope's MultiError filter is implicated

@njsmith
Copy link
Member

njsmith commented Aug 19, 2018

Just pushed a fix. How I tracked it down:

I looked at the two failing tests. The one testing start is super complicated (and start itself is ... kind of a mess, honestly), but the test_serve_tcp is simpler so I focused on it. If you look at the traceback in the pytest logs, it's this weird Cancelled exception that shouldn't be escaping – it should be getting caught by the NurseryManager. And while this test does involve a background task and start... the traceback for the stray exception shows it starting out in the main task. The things that show up are:

async with trio.open_nursery() as nursery:
    async with stream:
        nursery.cancel_scope.cancel()
    # The dedent here exits the 'async with stream' block, which calls 'await stream.aclose()'
    # This call raises 'Cancelled' (as it should, since we just cancelled the surrounding scope).
    # And then this exception escapes the nursery block, which is the bug.

So I tried writing a minimal version of this:

import trio

async def main():
    async with trio.open_nursery() as nursery:
        nursery.cancel_scope.cancel()
        await trio.sleep(0)

trio.run(main)

and indeed, this crashes the same way. So now we have a minimal reproducer.

I still had absolutely no idea what was going on though. How come this Cancelled exception wasn't being caught by the cancel scope? So next I stuck a pdb breakpoint at the top of NurseryManager.__aexit__, and started single-stepping. I checked that the exception was actually being passed into __aexit__, and the exc._scope matched nursery.cancel_scope, so it should be caught by the cancel scope... I eventually stepped into the MultiError.catch call inside open_cancel_scope... and discovered that it actually did catch the exception correctly. So... somehow the exception is later being magically resurrected out of nowhere?

So I squinted hard at NurseryManager.__aexit__ again, trying to figure out which path it would take: if self._scope_manager.__exit__ actually is catching the exception... then it should be returning True, to suppress it... and so the if not should not be taken so... then we fall through and return None, which tells Python to let the original exception keep propagating, which in this case... oh yeah, that's our resurrected exception, duh.

@codecov
Copy link

codecov bot commented Aug 19, 2018

Codecov Report

Merging #612 into master will increase coverage by 0.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #612      +/-   ##
==========================================
+ Coverage   99.31%   99.33%   +0.01%     
==========================================
  Files          91       91              
  Lines       10800    11073     +273     
  Branches      751      816      +65     
==========================================
+ Hits        10726    10999     +273     
  Misses         56       56              
  Partials       18       18
Impacted Files Coverage Δ
trio/_core/tests/test_run.py 100% <100%> (ø) ⬆️
trio/_core/_run.py 100% <100%> (ø) ⬆️
trio/_core/_traps.py 100% <0%> (ø) ⬆️
trio/_core/_parking_lot.py 100% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a1b56c6...4be156d. Read the comment docs.

@belm0
Copy link
Member Author

belm0 commented Aug 20, 2018

@njsmith thank for the fix and debug tips! I lost most of a day on this-- yes unfortunately dug into the more complex of the two failing tests.

How important is it to not explicitly re-raise if _nested_child_finished() throws the original exception? That case is missing test coverage: https://codecov.io/gh/python-trio/trio/pull/612/diff?src=pr&el=tree#D1-316. Would be nice to remove if new_exc is exc if possible.

@njsmith
Copy link
Member

njsmith commented Aug 20, 2018

@belm0

How important is it to not explicitly re-raise if _nested_child_finished() throws the original exception?

I'm actually not sure. I stole this from the implementation of @contextmanager, which carefully does the same if new_exc is exc check... I guess you could try it and see if anything different happens? Does it affect the traceback you get?

I suppose it would be easy to write a test that hits that branch, just something like:

with pytest.raises(KeyError):
    async with trio.open_nursery() as nursery:
        raise KeyError()

Maybe we should do that anyway...

@njsmith
Copy link
Member

njsmith commented Aug 20, 2018

Did a little experiment:

class CM:
    def __enter__(self):
        pass

    def __exit__(self, exc_type, exc_value, exc_tb):
        # <insert code here>

with CM():
    raise KeyError

When __exit__ is return False:

Traceback (most recent call last):
  File "/tmp/asdf.py", line 9, in <module>
    raise KeyError
KeyError

When __exit__ is raise exc_value:

Traceback (most recent call last):
  File "/tmp/asdf.py", line 9, in <module>
    raise KeyError
  File "/tmp/asdf.py", line 6, in __exit__
    raise v
  File "/tmp/asdf.py", line 9, in <module>
    raise KeyError
KeyError

When __exit__ is bare raise:

Traceback (most recent call last):
  File "/tmp/asdf.py", line 9, in <module>
    raise KeyError
  File "/tmp/asdf.py", line 9, in <module>
    raise KeyError
KeyError

So raise saves a traceback frame compared to raise exc_value, but return False gives an even shorter traceback.

I suppose the next step in reducing traceback clutter will be to refactor NurseryManager.__aexit__ friends so that it calls so that we use raise as little as possible... the obvious step would be to make _nested_child_finished return the collected exception instead of raising it.

@njsmith njsmith mentioned this pull request Aug 20, 2018
11 tasks
@belm0
Copy link
Member Author

belm0 commented Aug 20, 2018

@njsmith hitting that if new_exc is exc path isn't straightforward. An explicit exception from the nursery block will get combined with Cancelled into a MultiError, so it would never match the original exception.

Actually the exception comes from the cancel scope __exit__ itself and is uncaught, so we never get to the if new_exc is exc test.

@belm0
Copy link
Member Author

belm0 commented Aug 20, 2018

@njsmith So we can put an additional try-catch on the cancel scope __exit__() call, and test for match to original exception. That saves about 8 frames in your KeyError example.

@njsmith
Copy link
Member

njsmith commented Aug 21, 2018

Okay, yeah, I see it. Wow, this is complicated. It's looking like #607 will involve refactoring cancel scope entry/exit to use the context manager protocol instead of @contextmanager; maybe when we do that we can also replace the use of MultiError.catch and make the whole nursery __aexit__ path raise-free, and that should simplify things enormously. But for now this looks like a good first step... what else do you want to do before merging? I see you have a few boxes unticked still... and also, you should add a newsfragment to boast about your work :-). (And how big is the impact anyway? want to post some before/afters in this thread?)

@belm0
Copy link
Member Author

belm0 commented Aug 21, 2018

Before/after using the same trace I referenced in #56 (comment)

As a user I'd much prefer the general eliding I posted there, though I understand that this work clears some other bugs.

Before (39 frames):

Traceback (most recent call last):
  File "demo.py", line 58, in <module>
    test_multiplexer_with_error()
  File "demo.py", line 55, in test_multiplexer_with_error
    trio.run(runner2)
  File "/.../site-packages/trio/_core/_run.py", line 1277, in run
    return result.unwrap()
  File "/.../site-packages/outcome/_sync.py", line 107, in unwrap
    raise self.error
  File "/.../site-packages/trio/_core/_run.py", line 1387, in run_impl
    msg = task.context.run(task.coro.send, next_send)
  File "/.../site-packages/contextvars/__init__.py", line 38, in run
    return callable(*args, **kwargs)
  File "/.../site-packages/trio/_core/_run.py", line 970, in init
    self.entry_queue.spawn()
  File "/.../site-packages/async_generator/_util.py", line 42, in __aexit__
    await self._agen.asend(None)
  File "/.../site-packages/async_generator/_impl.py", line 366, in step
    return await ANextIter(self._it, start_fn, *args)
  File "/.../site-packages/async_generator/_impl.py", line 202, in send
    return self._invoke(self._it.send, value)
  File "/.../site-packages/async_generator/_impl.py", line 209, in _invoke
    result = fn(*args)
  File "/.../site-packages/trio/_core/_run.py", line 317, in open_nursery
    await nursery._nested_child_finished(nested_child_exc)
  File "/.../contextlib.py", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/.../site-packages/trio/_core/_run.py", line 202, in open_cancel_scope
    yield scope
  File "/.../site-packages/trio/_core/_multierror.py", line 144, in __exit__
    raise filtered_exc
  File "/.../site-packages/trio/_core/_run.py", line 202, in open_cancel_scope
    yield scope
  File "/.../site-packages/trio/_core/_run.py", line 317, in open_nursery
    await nursery._nested_child_finished(nested_child_exc)
  File "/.../site-packages/trio/_core/_run.py", line 428, in _nested_child_finished
    raise MultiError(self._pending_excs)
  File "/.../site-packages/trio/_core/_run.py", line 1387, in run_impl
    msg = task.context.run(task.coro.send, next_send)
  File "/.../site-packages/contextvars/__init__.py", line 38, in run
    return callable(*args, **kwargs)
  File "demo.py", line 52, in runner2
    nursery.start_soon(writer2, mx, (7,9))
  File "/.../site-packages/async_generator/_util.py", line 42, in __aexit__
    await self._agen.asend(None)
  File "/.../site-packages/async_generator/_impl.py", line 366, in step
    return await ANextIter(self._it, start_fn, *args)
  File "/.../site-packages/async_generator/_impl.py", line 202, in send
    return self._invoke(self._it.send, value)
  File "/.../site-packages/async_generator/_impl.py", line 209, in _invoke
    result = fn(*args)
  File "/.../site-packages/trio/_core/_run.py", line 317, in open_nursery
    await nursery._nested_child_finished(nested_child_exc)
  File "/.../contextlib.py", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/.../site-packages/trio/_core/_run.py", line 202, in open_cancel_scope
    yield scope
  File "/.../site-packages/trio/_core/_multierror.py", line 144, in __exit__
    raise filtered_exc
  File "/.../site-packages/trio/_core/_run.py", line 202, in open_cancel_scope
    yield scope
  File "/.../site-packages/trio/_core/_run.py", line 317, in open_nursery
    await nursery._nested_child_finished(nested_child_exc)
  File "/.../site-packages/trio/_core/_run.py", line 428, in _nested_child_finished
    raise MultiError(self._pending_excs)
  File "/.../site-packages/trio/_core/_run.py", line 1387, in run_impl
    msg = task.context.run(task.coro.send, next_send)
  File "/.../site-packages/contextvars/__init__.py", line 38, in run
    return callable(*args, **kwargs)
  File "demo.py", line 15, in reader
    raise e
  File "demo.py", line 9, in reader
    value = await mx[key]
  File "multiplexer.py", line 41, in __getitem__
    value = await trio.hazmat.wait_task_rescheduled(abort_fn)
  File "/.../site-packages/trio/_core/_traps.py", line 159, in wait_task_rescheduled
    return (await _async_yield(WaitTaskRescheduled(abort_func))).unwrap()
  File "/.../site-packages/outcome/_sync.py", line 107, in unwrap
    raise self.error
Exception: Ka-Boom!

After (31 frames, -8):

Traceback (most recent call last):
  File "demo.py", line 58, in <module>
    test_multiplexer_with_error()
  File "demo.py", line 55, in test_multiplexer_with_error
    trio.run(runner2)
  File "/.../site_packages/trio/_core/_run.py", line 1264, in run
    return result.unwrap()
  File "/.../site-packages/outcome/_sync.py", line 107, in unwrap
    raise self.error
  File "/.../site_packages/trio/_core/_run.py", line 1374, in run_impl
    msg = task.context.run(task.coro.send, next_send)
  File "/.../site-packages/contextvars/__init__.py", line 38, in run
    return callable(*args, **kwargs)
  File "/.../site_packages/trio/_core/_run.py", line 957, in init
    self.entry_queue.spawn()
  File "/.../site_packages/trio/_core/_run.py", line 313, in __aexit__
    type(new_exc), new_exc, new_exc.__traceback__
  File "/.../contextlib.py", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/.../site_packages/trio/_core/_run.py", line 200, in open_cancel_scope
    yield scope
  File "/.../site_packages/trio/_core/_multierror.py", line 144, in __exit__
    raise filtered_exc
  File "/.../site_packages/trio/_core/_run.py", line 200, in open_cancel_scope
    yield scope
  File "/.../site_packages/trio/_core/_run.py", line 309, in __aexit__
    await self._nursery._nested_child_finished(exc)
  File "/.../site_packages/trio/_core/_run.py", line 415, in _nested_child_finished
    raise MultiError(self._pending_excs)
  File "/.../site_packages/trio/_core/_run.py", line 1374, in run_impl
    msg = task.context.run(task.coro.send, next_send)
  File "/.../site-packages/contextvars/__init__.py", line 38, in run
    return callable(*args, **kwargs)
  File "demo.py", line 52, in runner2
    nursery.start_soon(writer2, mx, (7,9))
  File "/.../site_packages/trio/_core/_run.py", line 313, in __aexit__
    type(new_exc), new_exc, new_exc.__traceback__
  File "/.../contextlib.py", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/.../site_packages/trio/_core/_run.py", line 200, in open_cancel_scope
    yield scope
  File "/.../site_packages/trio/_core/_multierror.py", line 144, in __exit__
    raise filtered_exc
  File "/.../site_packages/trio/_core/_run.py", line 200, in open_cancel_scope
    yield scope
  File "/.../site_packages/trio/_core/_run.py", line 309, in __aexit__
    await self._nursery._nested_child_finished(exc)
  File "/.../site_packages/trio/_core/_run.py", line 415, in _nested_child_finished
    raise MultiError(self._pending_excs)
  File "/.../site_packages/trio/_core/_run.py", line 1374, in run_impl
    msg = task.context.run(task.coro.send, next_send)
  File "/.../site-packages/contextvars/__init__.py", line 38, in run
    return callable(*args, **kwargs)
  File "demo.py", line 15, in reader
    raise e
  File "demo.py", line 9, in reader
    value = await mx[key]
  File "multiplexer.py", line 41, in __getitem__
    value = await trio.hazmat.wait_task_rescheduled(abort_fn)
  File "/.../site_packages/trio/_core/_traps.py", line 159, in wait_task_rescheduled
    return (await _async_yield(WaitTaskRescheduled(abort_func))).unwrap()
  File "/.../site-packages/outcome/_sync.py", line 107, in unwrap
    raise self.error
Exception: Ka-Boom!

@belm0
Copy link
Member Author

belm0 commented Aug 21, 2018

@njsmith ready to go I think
I included the whole async_zip example as a test, I'm sure it could be distilled.

Let me know if you'd use merge, squash, or rebase in this case.

@njsmith
Copy link
Member

njsmith commented Aug 22, 2018

Let me know if you'd use merge, squash, or rebase in this case.

I'm usually lazy and just merge...

ready to go I think

I agree! Let's do this.

@njsmith njsmith merged commit 7f0dd7d into python-trio:master Aug 22, 2018
@belm0 belm0 deleted the nursery_context_manager branch August 22, 2018 03:44
@thejohnfreeman
Copy link

thejohnfreeman commented Oct 2, 2018

Just wanted to respond to a comment above:

So raise saves a traceback frame compared to raise exc_value [in __exit__], but return False gives an even shorter traceback.

This made me go back and read some documentation on __exit__ to see if they mention any aspect of its behavior with regards to exceptions, and I found this:

Note that __exit__() methods should not reraise the passed-in exception; this is the caller’s responsibility.

I hope this is helpful.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants