New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tidy up the tracebacks in the case of multiple failures #848

Open
alexwlchan opened this Issue Sep 12, 2017 · 7 comments

Comments

Projects
None yet
4 participants
@alexwlchan
Member

alexwlchan commented Sep 12, 2017

Consider the following test:

#!/usr/bin/env python
# -*- encoding: utf-8

from hypothesis import given
from hypothesis.strategies import integers

@given(integers())
def test_x_is_not_integer(x):
    if x > 100:
        raise ValueError("This number is too big!")
    elif x < -100:
        raise RuntimeError("This number is too small!")
    elif isinstance(x, int):
        raise ZeroDivisionError("This number is too close to 0")
    else:
        assert False, "How did you even get here?"

This is the output in Hypothesis 3.25.0, and there's a lot of it:

Falsifying example: test_x_is_not_integer(x=-101)
Traceback (most recent call last):
  File "/Users/alexwlchan/.virtualenvs/tempenv-340c26170176d/lib/python3.6/site-packages/hypothesis/core.py", line 631, in run
    print_example=True, is_final=True
  File "/Users/alexwlchan/.virtualenvs/tempenv-340c26170176d/lib/python3.6/site-packages/hypothesis/executors.py", line 58, in default_new_style_executor
    return function(data)
  File "/Users/alexwlchan/.virtualenvs/tempenv-340c26170176d/lib/python3.6/site-packages/hypothesis/core.py", line 116, in run
    return test(*args, **kwargs)
  File "foo.py", line 12, in test_x_is_not_integer
    raise RuntimeError("This number is too small!")
RuntimeError: This number is too small!

Falsifying example: test_x_is_not_integer(x=101)
Traceback (most recent call last):
  File "/Users/alexwlchan/.virtualenvs/tempenv-340c26170176d/lib/python3.6/site-packages/hypothesis/core.py", line 631, in run
    print_example=True, is_final=True
  File "/Users/alexwlchan/.virtualenvs/tempenv-340c26170176d/lib/python3.6/site-packages/hypothesis/executors.py", line 58, in default_new_style_executor
    return function(data)
  File "/Users/alexwlchan/.virtualenvs/tempenv-340c26170176d/lib/python3.6/site-packages/hypothesis/core.py", line 116, in run
    return test(*args, **kwargs)
  File "foo.py", line 10, in test_x_is_not_integer
    raise ValueError("This number is too big!")
ValueError: This number is too big!

Falsifying example: test_x_is_not_integer(x=0)
Traceback (most recent call last):
  File "/Users/alexwlchan/.virtualenvs/tempenv-340c26170176d/lib/python3.6/site-packages/hypothesis/core.py", line 631, in run
    print_example=True, is_final=True
  File "/Users/alexwlchan/.virtualenvs/tempenv-340c26170176d/lib/python3.6/site-packages/hypothesis/executors.py", line 58, in default_new_style_executor
    return function(data)
  File "/Users/alexwlchan/.virtualenvs/tempenv-340c26170176d/lib/python3.6/site-packages/hypothesis/core.py", line 116, in run
    return test(*args, **kwargs)
  File "foo.py", line 14, in test_x_is_not_integer
    raise ZeroDivisionError("This number is too close to 0")
ZeroDivisionError: This number is too close to 0

Traceback (most recent call last):
  File "foo.py", line 18, in <module>
    test_x_is_not_integer()
  File "foo.py", line 8, in test_x_is_not_integer
    def test_x_is_not_integer(x):
  File "/Users/alexwlchan/.virtualenvs/tempenv-340c26170176d/lib/python3.6/site-packages/hypothesis/core.py", line 774, in wrapped_test
    state.run()
  File "/Users/alexwlchan/.virtualenvs/tempenv-340c26170176d/lib/python3.6/site-packages/hypothesis/core.py", line 692, in run
    len(self.falsifying_examples,)))
hypothesis.errors.MultipleFailures: Hypothesis found 3 distinct failures.

A bunch of Hypothesis traceback is mixed into the test. It might be nicer if we could render it as follows:

Falsifying example: test_x_is_not_integer(x=-101)
Traceback (most recent call last):
  File "foo.py", line 12, in test_x_is_not_integer
    raise RuntimeError("This number is too small!")
RuntimeError: This number is too small!

Falsifying example: test_x_is_not_integer(x=101)
Traceback (most recent call last):
  File "foo.py", line 10, in test_x_is_not_integer
    raise ValueError("This number is too big!")
ValueError: This number is too big!

Falsifying example: test_x_is_not_integer(x=0)
Traceback (most recent call last):
  File "foo.py", line 14, in test_x_is_not_integer
    raise ZeroDivisionError("This number is too close to 0")
ZeroDivisionError: This number is too close to 0

Traceback (most recent call last):
  File "foo.py", line 18, in <module>
    test_x_is_not_integer()
  File "foo.py", line 8, in test_x_is_not_integer
    def test_x_is_not_integer(x):
  File "/Users/alexwlchan/.virtualenvs/tempenv-340c26170176d/lib/python3.6/site-packages/hypothesis/core.py", line 774, in wrapped_test
    state.run()
  File "/Users/alexwlchan/.virtualenvs/tempenv-340c26170176d/lib/python3.6/site-packages/hypothesis/core.py", line 692, in run
    len(self.falsifying_examples,)))
hypothesis.errors.MultipleFailures: Hypothesis found 3 distinct failures.

So the traceback for each falsifying example starts and ends in user code.

I have no idea how hard this is to do, but it’s an idea.

@alexwlchan

This comment has been minimized.

Show comment
Hide comment
@alexwlchan

alexwlchan Sep 12, 2017

Member

I’m not going to look at this more now, but going after the traceback.format_exc() lines added in #836 is probably a good start.

Member

alexwlchan commented Sep 12, 2017

I’m not going to look at this more now, but going after the traceback.format_exc() lines added in #836 is probably a good start.

@DRMacIver

This comment has been minimized.

Show comment
Hide comment
@DRMacIver

DRMacIver Sep 14, 2017

Member

+1 on tidying this up.

I think there are two different parts to doing this:

  • The stack traces we print ourselves we'll need to filter ourselves.
  • The final traceback is printed by pytest and I think we should think about just enabling traceback hiding for the entirety of core.py - we already have it in some individual methods, but I'm not sure we ever want core.py appearing in a user stack trace unless they're specifically debugging Hypothesis.

P.S. You may or may not have realised this, but that example would work even if all of the exceptions were of the same type - disambiguation is by both type and line

Member

DRMacIver commented Sep 14, 2017

+1 on tidying this up.

I think there are two different parts to doing this:

  • The stack traces we print ourselves we'll need to filter ourselves.
  • The final traceback is printed by pytest and I think we should think about just enabling traceback hiding for the entirety of core.py - we already have it in some individual methods, but I'm not sure we ever want core.py appearing in a user stack trace unless they're specifically debugging Hypothesis.

P.S. You may or may not have realised this, but that example would work even if all of the exceptions were of the same type - disambiguation is by both type and line

@Zac-HD

This comment has been minimized.

Show comment
Hide comment
@Zac-HD

Zac-HD Mar 18, 2018

Member

Fixing this will probably be in the run and execute methods of hypothesis.core.StateForActualGivenExecution, or by filtering the traceback in the final except: block in the evaluate_test_data method.

I suggest starting by replacing the traceback with a completely different string, and checking that it works as you'd expect with the simple example above. Then, you can filter out lines associated with something that is_hypothesis_file.

Member

Zac-HD commented Mar 18, 2018

Fixing this will probably be in the run and execute methods of hypothesis.core.StateForActualGivenExecution, or by filtering the traceback in the final except: block in the evaluate_test_data method.

I suggest starting by replacing the traceback with a completely different string, and checking that it works as you'd expect with the simple example above. Then, you can filter out lines associated with something that is_hypothesis_file.

@Zac-HD

This comment has been minimized.

Show comment
Hide comment
@Zac-HD

Zac-HD Apr 12, 2018

Member

I've tried this suggested format in #1225, and it looks good!

There is some repetition remaining though - I suspect because the outer function has been replaced by a proxy. An optional part of this feature would be to detect adjacent duplicate lines and remove one.

Either way, the filtering should be skipped when in verbose or debug mode.

Member

Zac-HD commented Apr 12, 2018

I've tried this suggested format in #1225, and it looks good!

There is some repetition remaining though - I suspect because the outer function has been replaced by a proxy. An optional part of this feature would be to detect adjacent duplicate lines and remove one.

Either way, the filtering should be skipped when in verbose or debug mode.

@banool

This comment has been minimized.

Show comment
Hide comment
@banool

banool Aug 27, 2018

Looking into this.

banool commented Aug 27, 2018

Looking into this.

@Zac-HD

This comment has been minimized.

Show comment
Hide comment
@Zac-HD

Zac-HD Sep 12, 2018

Member

https://vorpus.org/blog/beautiful-tracebacks-in-trio-v070/

Some really impressive work here, with a lovely example and (importantly) links to the issues and PRs where the code was designed and done. Short tips - we'll probably want to do all of these:

  • Remove unneeded stack frames at the top of a traceback before storing it to reraise later
  • Implement context managers by hand, not with a decorator (and some fancy tips for this)
  • Example tests for traceback frame removal
  • Look at master, not PRs - they evolved a fair bit over time

But my main takeaway is that this can work really, really well!

Member

Zac-HD commented Sep 12, 2018

https://vorpus.org/blog/beautiful-tracebacks-in-trio-v070/

Some really impressive work here, with a lovely example and (importantly) links to the issues and PRs where the code was designed and done. Short tips - we'll probably want to do all of these:

  • Remove unneeded stack frames at the top of a traceback before storing it to reraise later
  • Implement context managers by hand, not with a decorator (and some fancy tips for this)
  • Example tests for traceback frame removal
  • Look at master, not PRs - they evolved a fair bit over time

But my main takeaway is that this can work really, really well!

@Zac-HD Zac-HD self-assigned this Sep 15, 2018

@Zac-HD

This comment has been minimized.

Show comment
Hide comment
@Zac-HD

Zac-HD Sep 15, 2018

Member

master...Zac-HD:short-tracebacks

Before: see above. After:

Traceback (most recent call last):
  File "c:\users\zac\documents\github\hypothesis\hypothesis-python\src\hypothesis\core.py", line 935, in wrapped_test
    raise clip_traceback(error)
hypothesis.errors.MultipleFailures: Hypothesis found 3 distinct failures.
--------------------------------- Hypothesis ----------------------------------
Falsifying example: test(x=-101)
Traceback (most recent call last):
  File "C:\Users\Zac\Documents\GitHub\hypothesis\test.py", line 19, in test_x_is_not_integer
    raise RuntimeError("This number is too small!")
RuntimeError: This number is too small!

Falsifying example: test(x=101)
Traceback (most recent call last):
  File "C:\Users\Zac\Documents\GitHub\hypothesis\test.py", line 17, in test_x_is_not_integer
    raise ValueError("This number is too big!")
ValueError: This number is too big!

Falsifying example: test(x=1)
Traceback (most recent call last):
  File "C:\Users\Zac\Documents\GitHub\hypothesis\test.py", line 21, in test_x_is_not_integer
    raise ZeroDivisionError("This number is too close to 0")
ZeroDivisionError: This number is too close to 0

Still proof-of-concept, but I'm pretty happy with ~three hours work 😄

Member

Zac-HD commented Sep 15, 2018

master...Zac-HD:short-tracebacks

Before: see above. After:

Traceback (most recent call last):
  File "c:\users\zac\documents\github\hypothesis\hypothesis-python\src\hypothesis\core.py", line 935, in wrapped_test
    raise clip_traceback(error)
hypothesis.errors.MultipleFailures: Hypothesis found 3 distinct failures.
--------------------------------- Hypothesis ----------------------------------
Falsifying example: test(x=-101)
Traceback (most recent call last):
  File "C:\Users\Zac\Documents\GitHub\hypothesis\test.py", line 19, in test_x_is_not_integer
    raise RuntimeError("This number is too small!")
RuntimeError: This number is too small!

Falsifying example: test(x=101)
Traceback (most recent call last):
  File "C:\Users\Zac\Documents\GitHub\hypothesis\test.py", line 17, in test_x_is_not_integer
    raise ValueError("This number is too big!")
ValueError: This number is too big!

Falsifying example: test(x=1)
Traceback (most recent call last):
  File "C:\Users\Zac\Documents\GitHub\hypothesis\test.py", line 21, in test_x_is_not_integer
    raise ZeroDivisionError("This number is too close to 0")
ZeroDivisionError: This number is too close to 0

Still proof-of-concept, but I'm pretty happy with ~three hours work 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment