Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Elide task.context.run() and contextvars.callable() frames from tracebacks #631

Merged
merged 4 commits into from
Aug 28, 2018

Conversation

belm0
Copy link
Member

@belm0 belm0 commented Aug 25, 2018

for #56

TODO:

  • write tests

@belm0 belm0 requested a review from njsmith August 25, 2018 08:05
@codecov
Copy link

codecov bot commented Aug 25, 2018

Codecov Report

Merging #631 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #631      +/-   ##
==========================================
+ Coverage   99.28%   99.28%   +<.01%     
==========================================
  Files          91       91              
  Lines       10763    10781      +18     
  Branches      768      770       +2     
==========================================
+ Hits        10686    10704      +18     
  Misses         58       58              
  Partials       19       19
Impacted Files Coverage Δ
trio/_core/tests/test_run.py 100% <100%> (ø) ⬆️
trio/_core/_run.py 99.69% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b926f5d...c9087e9. Read the comment docs.

final_result = Error(task_exc)
# Store for later, removing uninteresting top frames:
# 1. task.context.run()
# 2. contextvars.callable() (< Python 3.7 only)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I thought the top two frames were trio._core._run.run_impl and (on <3.7 only) contextvars.Context.run. Do you know where the discrepancy comes from? am I confused? :-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

here's what you wrote in #56:

  File "/.../site_packages/trio/_core/_run.py", line 1374, in run_impl
    msg = task.context.run(task.coro.send, next_send)

This is noise... when we catch a user exception in run_impl, we should unconditionally discard the top frame in the traceback (which will be this one).

  File "/.../site-packages/contextvars/__init__.py", line 38, in run
    return callable(*args, **kwargs)

This line will disappear on 3.7, where contextvars.Context.run is implemented in C, so maybe not worth worrying about. Or we could discard it at the same time we fix the run_impl frame above.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I'm enumerating the child functions rather than the parents.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahhhh yeah, right, so we're just looking at different lines in the traceback :-). that makes sense!

@njsmith
Copy link
Member

njsmith commented Aug 25, 2018

For a test I guess it'd work do so something like:

async def my_child_task():
    raise KeyError()

async def test_frame_removal():
    try:
        # Trick: For now cancel/nursery scopes still leave a bunch of tb gunk behind.
        # But if there's a MultiError, they leave it on the MultiError, which lets us get
        # a clean look at the KeyError itself. Someday I guess this will always be
        # a MultiError (#611), but for now we can force it by raising two exceptions.
        async with trio.open_nursery() as nursery:
            nursery.start_soon(my_child_task)
            nursery.start_soon(my_child_task)
    except trio.MultiError as exc:
        assert isinstance(exc.exceptions[0], KeyError)
        # The top frame in the exception traceback should be inside the child task,
        # not trio/contextvars internals. And there's only one frame inside the child task,
        # so this will also detect if our frame-removal is too eager.
        assert exc.exceptions[0].__traceback__.tb_frame.f_code is my_child_task.__code__

@belm0
Copy link
Member Author

belm0 commented Aug 25, 2018

Thanks for the test guidance. It almost works, but appears to be testing the frame at the wrong end of the trace. Investigating...

@njsmith
Copy link
Member

njsmith commented Aug 25, 2018

...but there should only be 1 frame in the trace...?

@njsmith
Copy link
Member

njsmith commented Aug 25, 2018

(import traceback; traceback.print_tb(exc.__traceback__) may be a useful incantation to have up your sleeve.)

@belm0
Copy link
Member Author

belm0 commented Aug 25, 2018

The first exception of the MultiError is as follows:

  File "/Users/john/dev/trio/trio/_core/_run.py", line 201, in open_cancel_scope
    yield scope
  File "/Users/john/dev/trio/trio/_core/_run.py", line 319, in __aexit__
    await self._nursery._nested_child_finished(exc)
  File "/Users/john/dev/trio/trio/_core/_run.py", line 425, in _nested_child_finished
    raise MultiError(self._pending_excs)
  File "/Users/john/dev/trio/trio/_core/tests/test_run.py", line 1903, in my_child_task
    raise KeyError()

Copy link
Member

@njsmith njsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first exception of the MultiError is as follows:

Oh, that's unfortunate. (And confusing!)

It wouldn't be the end of the world if we had to wait until after we did the next step of #55 (which will clean up those frames you quoted). It'll be much easier to write a test to make sure the tracebacks contain exactly the frames we expect, instead of writing a test to assert particular frames are missing.

# 1. trio._core._run.run_impl()
# 2. contextvars.Context.run() (< Python 3.7 only)
tb_next = task_exc.__traceback__.tb_next
if sys.version_info < (3, 7): # pragma: no cover
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pragma: no cover is for cases where we don't even want to have test coverage. Here, not only is it useful to have tests that cover both branches of the if, we actually have those tests :-). (We test on multiple python versions and combine the coverage.) So we shouldn't use pragma: no cover here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's good-- I assumed the coverage testing was looking at each version run individually rather than the aggregate.

assert isinstance(first_exc, KeyError)
tb_text = ''.join(traceback.format_tb(first_exc.__traceback__))
for r in ('/trio/_core/.* in run_impl$', '/contextvars/.* in run$'):
assert not re.search(r, tb_text, re.MULTILINE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're doing it this way, then I think you can simplify to just

try:
    async with trio.open_nursery() as nursery:
        nursery.start_soon(my_child_task)
except KeyError as exc:
    ...

(Of course that will then need adjusting when we're working on #611, but everything will, so I wouldn't worry about that.)

this is less fragile, e.g. should python 3.7 contextvars be
backported, or should trio core frames vary by code path
@njsmith
Copy link
Member

njsmith commented Aug 27, 2018

I mentioned this a bit in chat, but let me say it again here with more carefully collected thoughts :-).

So the traceback frames that are relevant for this PR are:

  • the frame that appears when run_impl catches the exception (always exactly one, guaranteed by the python language)

  • The frame added by older versions of Context.run (doesn't apply on python 3.7+)

  • The frames that older versions of Context.run might or might not add in the future, in case Yury changes the backport library (e.g. by adding another frame, or rewriting it in C so it stops adding any frames).

At one point I was confused and thought that this was also attempting to address the frames that the nursery aexit machinery is adding, but that was wrong, it can't and doesn't affect those either way. It's just about the frames mentioned above.

So some options include:

  • Remove one frame, ignore Context.run. Extremely simple, works perfectly for recent enough python (and eventually that's everyone), degrades gracefully on older pythons. Not perfect but not too bad.

  • Remove one frame on py37+, remove 2 frames otherwise. Provides perfect results on all current deployments. Might need adjusting in the future if contextvars changes. We could catch this with a test so we're at least alerted if it happens.

  • Check how many frames Context.run actually uses at startup, and later remove 1+that many frames. Extremely robust, and keeps the main loop simple and tight, but at the cost of ~5-10 extra lines of code to do the calculation.

These all seem like viable options, with slightly different complexity/reliability tradeoffs.

The current PR has a different approach, based on calling extract_tb and looking at filenames. This unfortunately has a few issues:

  • Unlike the options above, it has the potential for producing incorrect results in innocent-seeming circumstances. For example, what if the code in the task actually is part of trio/_core, like nursery.start_soon(wait_readable, fd)? What if someone has a module mylib/contextvars/blah.py? These aren't terribly common situations, but with the other options above these aren't even possibilities.

  • Performance: extract_tb is way heavier than is appropriate for this part of the code. It does disk I/O! We could avoid this by writing our own filename extraction code, but that would add more complexity.

  • Correctness: I'm pretty sure that the current PR doesn't work on Windows, where the filename contains trio\_core. Again, fixable, but only by adding more complexity. And there's something just... disconcerting... about the idea that core exception propagation code needs to know about os.sep.

So I don't think we should be checking filenames here.

On an unrelated note, the test isn't actually testing anything right now, I think? Like, it passes on current master, right, regardless of this change? If we want to test this change specifically we could iterate over the traceback and assert that run_impl and contextvars don't show up there... Or wait for the next change to land them both together, that's a little naughty but probably fine under the circumstances.

change test to confirm frame count and bottom-most
watermark frame
@belm0
Copy link
Member Author

belm0 commented Aug 27, 2018

@njsmith ready to go-- thanks for your review and patience

@njsmith njsmith merged commit fc7c7e0 into python-trio:master Aug 28, 2018
@njsmith
Copy link
Member

njsmith commented Aug 28, 2018

Thank you for your patience :-)

njsmith added a commit to njsmith/trio that referenced this pull request Aug 28, 2018
This is kind of ridiculous overkill, but I was thinking about how you
could do it, then wrote some code to experiment, then realized that
I had implemented it so we might as well use it...

See also: python-trio#631
belm0 pushed a commit that referenced this pull request Sep 1, 2018
Count how many frames Context.run adds to tracebacks and use this info 
to elide frames in run_impl().

See also: #631
@njsmith njsmith mentioned this pull request Sep 2, 2018
11 tasks
@belm0 belm0 deleted the run_impl_frames branch October 1, 2020 14:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants