
Integration between pytest-trio and hypothesis #1257

Closed
njsmith opened this issue Apr 26, 2018 · 12 comments

njsmith commented Apr 26, 2018

pytest-trio is a pytest plugin to help with testing trio code; the main feature is that you can write:

@pytest.mark.trio
async def test_whatever(...):
    ...

and it automagically arranges to run the test using trio.run, with various bells and whistles to handle async fixtures and so forth; in the future it will probably also be able to detect async tests and add the pytest.mark.trio marker automatically. (It's generally similar to pytest-asyncio, in case anyone's more familiar with that.)
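
Under the hood, the marker boils down to something like this simplified sketch – not pytest-trio's actual code, and the hypothetical _sync_wrapper ignores fixtures entirely:

from functools import partial, wraps

import trio

def _sync_wrapper(async_test_fn):
    # Present pytest with a synchronous callable; each call spins up a
    # fresh Trio run for the async test.
    @wraps(async_test_fn)
    def wrapper(**kwargs):
        # trio.run only passes positional args, so bind kwargs first.
        trio.run(partial(async_test_fn, **kwargs))
    return wrapper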

As you might expect, this turns out to interact badly with hypothesis, since both pytest-trio and hypothesis want to take over the actual running of the test: python-trio/pytest-trio#42

From the analysis in that issue, I think hypothesis almost has enough to make this work already, via the execute_example hook. The problem here is that we would want to automatically configure hypothesis to use our custom execute_example hook without having to create a class etc. If there were some way to pass this in via e.g. a thread-local, then that would be awesome.

The other tricky bit will be when we add auto-detection of async functions, we'll need to be able to "see through" hypothesis's wrappers. This might be possible right now via some hacks like walking the __wrapped__ chain, but alternatively, maybe hypothesis could expose an .is_async_hypothesis_test attribute or something, so we don't have to?


Zac-HD commented Apr 26, 2018

We would really, really like to support async test runners. This would obviously include Trio (this issue), as well as Twisted (#703) and asyncio (#968)!

The main reason we don't support this yet is that none of the Hypothesis core team uses any of these frameworks, and we really don't want to commit to an API that turns out to be horrible to use or only narrowly compatible. If you're offering to help out though, I think between us we could find a solution 😄

  • The execute_example hook, and the executors interface more generally, should allow Hypothesis to correctly execute an async function. #292 ("Coroutine tests do not work") demonstrates that it's possible (and ugly) to do this via a class; see the sketch at the end of this comment.

  • Deeper support for async tests - e.g. having two Hypothesis tests share an event loop, or having @given return anything but a sync function - is probably out of scope for the foreseeable future. Moving to the forthcoming Rust backend and refactoring the many-layered ancient internals might make this practical. Fortunately, I think shallower integration would still be useful!

  • I'm not keen on the idea of an .is_async_hypothesis_test attribute - I would expect inspect.unwrap is enough to get at the interesting stuff, and providing library-specific hooks is a road I don't want to go down unless absolutely unavoidable.

  • Our pytest plugin might be a good place to start. If we come up with a nice way to set the runner for a particular test, you could use a variant on this loop:

def pytest_collection_modifyitems(items):
    for item in items:
        if not isinstance(item, pytest.Function):
            continue
        if getattr(item.function, 'is_hypothesis_test', False):
            item.add_marker('hypothesis')

TLDR - all we need to provide is a way to set the Hypothesis test_runner (see core.py) for a particular test; the rest should be fairly straightforward to implement downstream and I'm happy to help with that.
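
For reference, the class-based approach from #292 looks roughly like this – a sketch assuming Hypothesis picks up execute_example from the test's class, where it receives a zero-argument callable for each example (class and test names are illustrative):

import trio
from hypothesis import given, strategies as st

class TrioExecutorDemo:
    def execute_example(self, f):
        # f() returns the test's coroutine object, which trio.run awaits.
        return trio.run(f)

    @given(st.integers())
    async def test_int_roundtrip(self, n):
        await trio.sleep(0)
        assert n == int(str(n))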

Zac-HD added the enhancement ("it's not broken, but we want it to be better") and opinions-sought ("tell us what you think about these ones!") labels Apr 26, 2018

njsmith commented Apr 30, 2018

Deeper support for async tests - e.g. having two Hypothesis tests share an event loop, or having @given return anything but a sync function - is probably out of scope for the foreseeable future. Moving to the forthcoming Rust backend and refactoring the many-layered ancient internals might make this practical. Fortunately, I think shallower integration would still be useful!

I'm actually having trouble thinking of any ways that "deep" integration would be better than "shallow" :-). I guess in Trio's case threading the async/await annotations into Hypothesis would let us share async fixtures across multiple examples, instead of reinstantiating them each time? But maybe you prefer to reinstantiate them each time anyway...

I would expect inspect.unwrap is enough to get at the interesting stuff

I've had mixed experiences with unwrap – it turns out that people use all kinds of decorators, and it's hard to know how many layers of abstraction you want to "peek through". The question isn't really "is there some async function buried in here somewhere", it's "is the first function after the hypothesis wrapping async".
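
For instance – a self-contained demonstration, where sync_adapter is a made-up decorator standing in for any async-to-sync shim:

import inspect
from functools import wraps

import trio

def sync_adapter(async_fn):
    # After this decorator the test is effectively synchronous...
    @wraps(async_fn)  # ...but @wraps also sets wrapper.__wrapped__
    def wrapper(**kwargs):
        return trio.run(lambda: async_fn(**kwargs))
    return wrapper

@sync_adapter
async def test_fn():
    pass

# unwrap peels every __wrapped__ layer and finds the coroutine function
# at the bottom, even though the outermost callable is sync:
assert inspect.iscoroutinefunction(inspect.unwrap(test_fn))
assert not inspect.iscoroutinefunction(test_fn)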

This is a pretty edge-casey kind of thing to worry about; we could probably get away with ignoring this problem and most people wouldn't notice. But I hear that you like to worry about edge cases around here :-).

providing library-specific hooks is a road I don't want to go down unless absolutely unavoidable

"Is the function you're wrapping async" is just a question about python built-in types, right? Not sure what's library-specific about it?

TLDR - all we need to provide is a way to set the Hypothesis test_runner (see core.py) for a particular test

Agreed that this is the main issue. The first API that comes to mind is to use some thread-local to pass this through – probably this would mean hypothesis exposing some kind of test_runner context manager (that internally uses a thread-local), and it would be used like:

def run_async_test(test_fn):
    if getattr(test_fn, 'is_hypothesis_test', False):
        with hypothesis.test_runner(real_async_test_runner):
            return test_fn()
    else:
        return real_async_test_runner(test_fn)
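
For concreteness, the context manager side could be tiny. This is a hypothetical sketch of the proposed hypothesis.test_runner, not anything Hypothesis exposes today:

import threading
from contextlib import contextmanager

_local = threading.local()

@contextmanager
def test_runner(runner):
    # Stash the runner in a thread-local so the @given wrapper can look
    # it up when the wrapped test is called inside this block.
    previous = getattr(_local, 'runner', None)
    _local.runner = runner
    try:
        yield
    finally:
        _local.runner = previous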

What do you think?

Do we care about the case where someone uses a library like pytest-trio and sets an explicit execute_example? The desired semantics here are pretty unclear, so my inclination would be to defer handling this until someone shows up with a use case.


Zac-HD commented Apr 30, 2018

I'm actually having trouble thinking of any ways that "deep" integration would be better than "shallow" :-). I guess in Trio's case threading the async/await annotations into Hypothesis would let us share async fixtures across multiple examples, instead of reinstantiating them each time? But maybe you prefer to reinstantiate them each time anyway...

To me, a shallow integration is one that hides async-ness from Hypothesis internals, while a deep integration actually handles async stuff.

Shallow is much easier to support in a Python 2.7-compatible way, since we're stuck with that through to 2020. On the other hand, deep integration might (I don't know!) be important for things like magic clocks or interleaved execution, if the framework (asyncio, Twisted, etc.) doesn't support them natively. Performance aside, it's usually a bad idea to share a fixture between examples, especially with mutable state - that tends to violate all the assumptions shrinking is premised on!

[other comments]

unwrap is looking like the wrong tool, but we do like to cover edge cases! Async is not library-specific, sorry, and I should have been more precise - I would prefer a solution which is more general than "sync or not" but could be used for other integrations (e.g. pytest-django; I don't have any particular use-cases in mind, but it's an interface principle). If in doubt, such as when runners are doubled up, we like to make it an explicit error - it's always backwards-compatible to allow it later 😄


API idea, just to get a sense of the constraints: would replacing @given with a pytest-trio version work? If so we could expose enough hooks to unwrap a Hypothesis test and rewrap it with something like this; if not it would be good to understand why.

# Obviously a super sketchy version, but I hope you get the idea...
from functools import partial

import trio
from hypothesis import given

def given_(*args, **kwargs):
    def inner(func):
        return given(*args, **kwargs)(partial(trio.run, func))
    return inner


njsmith commented Apr 30, 2018

Performance aside, it's usually a bad idea to share a fixture between examples, especially with mutable state - that tends to violate all the assumptions shrinking is premised on!

Yeah. There is a weird quirk about how pytest-trio works: for regular fixtures, they're instantiated outside of the test and would be kept the same through all of hypothesis's examples because, well, that's just the standard pytest behavior. But for async fixtures, we have to play a game to get them inside the call to trio.run, which involves some hacks where from pytest's point of view the fixtures produce an opaque delayed call object, and then when pytest-trio goes to run the test, it resolves all the delayed calls. Anyway, the end result is that if we do this in the naive way, then regular fixtures will be shared across examples, while async fixtures will be reinstantiated. It's ... quirky, but probably fine?

I guess #377 is to enable the async-style delayed evaluation for all fixtures in hypothesis, so maybe I shouldn't worry about this – pytest-trio is just ahead in the game, and hopefully pytest-hypothesis will catch up ;-).

unwrap is looking like the wrong tool, but we do like to cover edge cases! Async is not library-specific, sorry, and I should have been more precise - I would prefer a solution which is more general than "sync or not" but could be used for other integrations

we could expose enough hooks to unwrap a Hypothesis test and rewrap it

Okay, here's an idea that might kill two birds with one stone: what if hypothesis-wrapped tests exposed the underlying function as an attribute, like testobj.hypothesis_wrapped_test? Then if we wanted to detect async-ness or whatever, we could introspect this object; and if we wanted to rewrite how it's done, we could re-assign it. This is basically how pytest-trio works right now, just with pytest "items" instead of hypothesis "givens":

https://github.com/python-trio/pytest-trio/blob/8a52d49aa8d84b3649d6e5e1d20bdd366ffbb54f/pytest_trio/plugin.py#L217-L222

It feels a little janky and fragile to have these different libraries poking at each other's attributes like this. I think a certain amount of jankiness is unavoidable though, given the problem...

would replacing @given with a pytest-trio version work?

There are a few problems with that specific code, but yeah, the basic idea is that we have a function that takes an async test function, and converts it into a synchronous test function, so we "just" need to somehow wedge it into the appropriate place inside given.

If in doubt, such as doubling up on runners, we like to make it an explicit error - it's always backwards-compatible to allow it later

Hmm, good point. And if we did go with mutating testobj.hypothesis_wrapped_test as the mechanism for applying the runner to the test, then it wouldn't be possible to detect when we were doubling up on runners (or at least not without hacks), so we couldn't give an error. Also, the semantics you'd get would be that any execute_example receives the already-wrapped-by-pytest-trio test, which seems like the more confusing option if we did want to support this.

I guess another problem with mutating .hypothesis_wrapped_test is that if there are any other wrappers getting involved – which, ugh, I hope not, but who knows – then @functools.wraps does copy attributes over to the new wrapper, so any of hypothesis's magic attributes would still be readable; but if we tried mutating them, we'd actually be mutating the attribute on the wrapper object, which is not where hypothesis would be looking for it.

Maybe we want a testobj.hypothesis_add_runner function we can call? So pytest-trio would do:

import pytest
from inspect import iscoroutinefunction

@pytest.hookimpl(hookwrapper=True)
def pytest_runtest_call(item):
    if 'trio' in item.keywords:
        if getattr(item.obj, 'is_hypothesis_test', False):
            if not iscoroutinefunction(item.obj.hypothesis_wrapped_test):
                pytest.fail("expected an async test")
            item.obj.hypothesis_add_runner(trio_test_runner)
        else:
            # regular non-hypothesis branch
            ...
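
Here trio_test_runner is hypothetical too; assuming the runner receives the inner async test plus its resolved keyword arguments, a minimal version might be:

from functools import partial

import trio

def trio_test_runner(async_test_fn, **kwargs):
    # trio.run only passes positional args, so bind kwargs first.
    trio.run(partial(async_test_fn, **kwargs))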


njsmith commented Apr 30, 2018

Also, maybe opening a can of worms: is @given the only thing we need to worry about?

What about... stateful tests?


Zac-HD commented Apr 30, 2018

Anyway, the end result is that if we do [fixtures] in the naive way, then regular fixtures will be shared across examples, while async fixtures will be reinstantiated. It's ... quirky, but probably fine?

Sounds good to me! Keep Trio in the future where it belongs 😄

Okay, here's an idea, that might kill two birds with one stone: what if hypothesis-wrapped tests exposed the underlying function as an attribute, like testobj.hypothesis_wrapped_test. Then if we wanted to detect async-ness or whatever, we could introspect this object; and if we want to rewrite how it's done, we could re-assign it. This is basically how pytest-trio works right now...

Yep, that's pretty much what I was thinking of! Pseudocode:

# inside pytest_runtest_call
if not iscoroutinefunction(item.obj.hypothesis_wrapped_test):
    pytest.fail("expected an async test")
rewrapper = item.obj.hypothesis_rewrapper  # todo: better name
# rewrapper == given(*args, **kwargs)  # ie you wrap it around a newly-sync test
item.obj = rewrapper(_some_trio_magic(item.obj.hypothesis_wrapped_test))  # wedge into place!

Hopefully, replacing the object instead of mutating it will avoid all the problems associated with reading state from the wrong layer, which would be very, very difficult to debug. (We also like high-visibility errors, including detecting possible misuse.)

Also, maybe opening a can of worms: is @given the only thing we need to worry about? What about... stateful tests?

Um. To a first approximation: not gonna work. However, it's probably not too awful to reimplement - there's a pretty direct translation layer between the defined class and hypothesis.find; I imagine we could manage either an async-flavoured version (i.e. define AsyncRuleBasedStateMachine), or maybe work out how to hook into the internals for that too. Let's just get @given working first, though!


Zac-HD commented Jun 13, 2018

After talking it over with David, I think we have an even simpler option: allow Trio (and anyone else) to assign to f.hypothesis_wrapped_test. Example code:

# Note: trio.run doesn't accept keyword args, so @given must be passed positional args
item.obj.hypothesis_wrapped_test = functools.partial(trio.run, item.obj.hypothesis_wrapped_test)

So all we need to do internally is ensure that we always get the wrapped test from this attribute, rather than holding onto a direct reference, and it should "just work".
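
Internally, that might look something like this – a hypothetical sketch of the @given wrapper, not Hypothesis's real internals:

def make_wrapped_test(test):
    def wrapped_test(*args, **kwargs):
        # Look the inner test up on every call, so a plugin that assigned
        # to .hypothesis_wrapped_test takes effect automatically.
        inner = wrapped_test.hypothesis_wrapped_test
        return inner(*args, **kwargs)  # the real engine runs many examples
    wrapped_test.hypothesis_wrapped_test = test
    return wrapped_test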


DRMacIver commented Jun 13, 2018 via email


Zac-HD commented Jun 13, 2018

Ah, heck. I think the concept is still fine, but the wrapper will probably be more complicated than just dropping in partial.

Ping @njsmith; you probably (hopefully) have a better idea of how hard that will be in Trio than we do and whether this idea would help.


njsmith commented Jun 13, 2018

I don't think kwargs are an issue. Pytest also likes to pass kwargs. It makes things a bit more convoluted, but it's fine. Something like:

wrapped_test = lambda **kwargs: trio.run(partial(test_func, **kwargs))

There were two other concerns that I raised above (#1257 (comment)), though not very clearly.

One is: if we're mutating testobj.hypothesis_wrapped_test, then hypothesis has no visibility into the different layers of wrapping. One consequence is that it can't enforce any particular semantics for how execute_example interacts with other wrappers. Up above Zac suggested that using execute_example and pytest-trio together should be an error until we figured out what semantics we wanted. If we start mutating hypothesis_wrapped_test then it won't be an error, we'll just get some random semantics. (Probably not the ones we would have chosen, either.)

The other is: suppose there's another decorator in play.

from functools import wraps

def trivial_wrapper(fn):
    @wraps(fn)
    def wrapped(*args, **kwargs):
        return fn(*args, **kwargs)
    return wrapped

@trivial_wrapper
@given(...)
@pytest.mark.trio
async def test_something(...):
    ...

This is pretty arcane, dunno if it would ever happen or why. But let me point out a strange thing that will happen if anyone does write this.

First, @given will create and return a callable that has the special attributes like hypothesis_wrapped_test. Then that gets wrapped by @trivial_wrapper. What pytest-trio ultimately sees is a test whose testobj is the wrapped function created by @trivial_wrapper.

Now, a fun thing about @wraps is that it indiscriminately copies all attributes from the wrapped function to the wrapper function. Therefore, when pytest-trio looks at this testobj, it will find that it does have a hypothesis_wrapped_test attribute, even though it's not actually the object returned by @given. For introspection, this is fine. But for mutation, not so much: pytest-trio will mutate wrapped.hypothesis_wrapped_test and be happy that it's set things up, but in fact this is a no-op, because it mutated the wrong object.
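
This is easy to demonstrate standalone – given_like below is a stand-in for @given's wrapping, not real Hypothesis code:

from functools import wraps

def given_like(fn):
    def wrapped(*args, **kwargs):
        return wrapped.hypothesis_wrapped_test(*args, **kwargs)
    wrapped.hypothesis_wrapped_test = fn
    return wrapped

def trivial_wrapper(fn):
    @wraps(fn)  # copies fn.__dict__, including hypothesis_wrapped_test
    def outer(*args, **kwargs):
        return fn(*args, **kwargs)
    return outer

@trivial_wrapper
@given_like
def test():
    return "original"

# Mutating the copied attribute on the outer wrapper is a silent no-op:
test.hypothesis_wrapped_test = lambda: "replaced"
assert test() == "original"  # the inner wrapper reads its own attribute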

That's why I suggested that we might want something like testobj.hypothesis_add_test_wrapper(wrap_fn): a callable that applies a wrapper to the wrapped test. Calling it would be basically the same thing as doing testobj.hypothesis_wrapped_test = wrap_fn(testobj.hypothesis_wrapped_test), so implementation-wise the complexity is pretty similar. But it would give hypothesis more visibility into what was going on, so that it could e.g. make an informed decision about how to stack together execute_example and these wrappers; also, it would work even in the weird @trivial_wrapper case (because hypothesis_add_test_wrapper would get copied over onto the wrapper, pytest-trio would call it, and it would go mutate the correct state).

These are pretty weird edge cases. I think for introspection we definitely want to expose hypothesis_wrapped_test, and for mutation then either allowing people to patch hypothesis_wrapped_test or a slightly more structured API like hypothesis_add_test_wrapper will work well enough. I currently have a weak preference for hypothesis_add_test_wrapper.
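
A hypothetical shape for that API, with names invented for illustration:

def make_wrapped_test(test):
    state = {'inner_test': test}

    def wrapped_test(*args, **kwargs):
        return state['inner_test'](*args, **kwargs)  # engine elided

    def hypothesis_add_test_wrapper(wrap_fn):
        # Mutating closed-over state (not an attribute on wrapped_test)
        # stays correct even when @wraps copies these attributes onto an
        # outer decorator: the copied function still closes over `state`.
        state['inner_test'] = wrap_fn(state['inner_test'])

    wrapped_test.hypothesis_wrapped_test = test
    wrapped_test.hypothesis_add_test_wrapper = hypothesis_add_test_wrapper
    return wrapped_test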

njsmith closed this as completed Jun 13, 2018

njsmith commented Jun 13, 2018

Ugh, misclicked, sorry – let me reopen this and then edit my comment above real quick.

njsmith reopened this Jun 13, 2018
@Zac-HD
Copy link
Member

Zac-HD commented Jun 13, 2018

  • Yep, that's the wrapper I was imagining (though it's probably a good idea to handle *args too)

  • @trivial_wrapper is commonly spelled @pytest.mark.parametrize(...) - it works well with Hypothesis iff it's outside @given - so this pattern is too widely used to ignore.

  • Just throwing ideas out now, but we could have a .hypothesis attribute which holds the .wrapped_test attribute; it's PEP20-compliant and adding a layer of indirection should actually solve this problem 😇
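
A sketch of that indirection, with hypothetical names following the proposal above:

class HypothesisHandle:
    # One namespace object shared by reference: @wraps copies the
    # .hypothesis attribute itself, so every decoration layer sees and
    # mutates the same state.
    def __init__(self, inner_test):
        self.wrapped_test = inner_test

def make_wrapped_test(test):
    handle = HypothesisHandle(test)

    def wrapped_test(*args, **kwargs):
        return handle.wrapped_test(*args, **kwargs)  # engine elided

    wrapped_test.hypothesis = handle
    return wrapped_test

A plugin could then assign item.obj.hypothesis.wrapped_test = ... safely, even through an outer @wraps-copied decorator, because .hypothesis is the same object at every layer.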
