
Warn if strategies depend on the random module without using st.random_module() #3810

Closed

AdrianSosic opened this issue Dec 12, 2023 · 3 comments · Fixed by #3871
Labels
legibility (make errors helpful and Hypothesis grokable)

Comments

@AdrianSosic

Hi,

I'm not sure whether the following is a "bug" or just a consequence of the framework's design, with my example simply using hypothesis the wrong way. But when I use other sources of random numbers (which is perhaps simply forbidden?), the test mechanism no longer seems to work. Consider this piece of code:

import random

import hypothesis.strategies as st
from hypothesis import given


@st.composite
def random_number(draw):
    return random.randint(0, 1)  # draws from the global random module, not from draw()
    # return draw(st.integers(min_value=0, max_value=1))  # this alternative works


@given(random_number())
def test_number(number):
    assert number != 0

On my machine, the test passes without errors, meaning that a "0" is never generated in any of the runs. In fact, the "debug" output shows that only a single test case is executed, which happens to produce a "1". The alternative return statement (see the comment in the code), on the other hand, works as expected. This makes me believe that hypothesis does not expect any variation to happen in the first case, probably because the source of randomness provided by the random module is not controlled by hypothesis?

If my understanding is correct, is there any part of the docs where this is explained? Or am I just missing an important piece of the puzzle?

@Zac-HD Zac-HD added the legibility (make errors helpful and Hypothesis grokable) label Dec 12, 2023
@Zac-HD
Member

Zac-HD commented Dec 12, 2023

when I use other sources of random numbers (which is perhaps simply forbidden?)

Correct - when you're using Hypothesis, all your random choices should be managed by Hypothesis.

The problem here is that Hypothesis makes random code deterministic by seeding (and resetting) the global random state - see those docs for the proper way to manage randomness. As-is, your code is therefore deterministic, and it just happens that, with the random seed we use by default, your test will never fail.
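
If you really do want random-module-style calls inside a strategy, you can draw a seeded random.Random instance with st.randoms() instead - a minimal sketch of how the example above could be rewritten:

import hypothesis.strategies as st
from hypothesis import given


@st.composite
def random_number(draw):
    rng = draw(st.randoms())  # a Random instance seeded and controlled by Hypothesis
    return rng.randint(0, 1)


@given(random_number())
def test_number(number):
    assert number != 0  # Hypothesis now explores both values, so this fails

Because every call on rng is backed by Hypothesis' data stream, the failing case both reproduces deterministically and shrinks.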

(as an aside, when I run your example code, I get HypothesisDeprecationWarning: There is no reason to use @st.composite on a function which does not call the provided draw() function internally - which should hint that something is wrong here, and will in fact be an error in future)

@AdrianSosic
Author

Hi @Zac-HD, thank you very much for your explanation, which confirms my ... hypothesis :D (pun intended).

In hindsight, this makes a lot of sense, but note that it only became clear to me after thinking through how randomness must be controlled in the framework, a thought process that was itself only triggered by a bug I noticed very late in my development process.

In particular, note that the example above was only for demonstration; the real one contained more building blocks like:

...
values = draw(categories)
active_values = random.sample(values, random.randint(0, len(values)))
...

My point here is that draw was actually used, so I did not even get the warning but ran straight into a silent bug, which is sort of the worst-case scenario.
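
(For the record, my fix routes both random choices through Hypothesis as well - a sketch, assuming categories is a strategy producing a non-empty list of distinct, hashable values:

import hypothesis.strategies as st


@st.composite
def active_values(draw, categories):
    values = draw(categories)
    # subset membership is now drawn through built-in strategies, so it is
    # reproducible and can shrink to a minimal failing example
    return draw(st.lists(st.sampled_from(values), unique=True, max_size=len(values)))

This matches the commit referenced below, which switched to randomness from built-in strategies and thereby also gained shrinking.)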

And while I now see that there is some mention of this in the docs, I still must say:

  • Given that this is really a core concept that people need to understand, the text you linked does not seem to be in a sufficiently central position, considering its importance. In other words: I did not notice it when I skimmed through the docs, and I guess other people might not, either.
  • Mentioning that hypothesis controls the random seed is one thing, but bridging the mental gap to its implications (that is: you must not use sources of randomness other than Hypothesis' own inside your composites) is really a different thing. Perhaps it would be worth making this explicitly clear (e.g. in the form of a "note") exactly at the place where it becomes relevant: the composite strategy.

Anyway, these are just some suggestions. Apart from that, I really enjoy working with the framework so far! And I'm happy that I can now fix the bug (and some others of the same kind!) 👍🏼

AdrianSosic added a commit to emdgroup/baybe that referenced this issue Dec 13, 2023
* use only randomness from built-in strategies: HypothesisWorks/hypothesis#3810
* the lower bound should be 1
* the now used hypothesis strategy further allows for shrinkage
@Zac-HD
Member

Zac-HD commented Dec 13, 2023

Absolutely agreed! Thanks for your notes on documentation too.

I've actually still got the tab (and issue) open while I think over how we can make this a runtime warning of some kind - having a silently weaker-than-expected test is our worst case for API design, and here we are. It's just rather difficult to see how to make it precise enough to catch the problems without introducing considerable friction when testing code that happens to use random numbers...

Got it: we warn if drawing from a strategy changes the random state (the state of any registered random instance? I think that probably isn't worth the execution time), but continue to allow use of random from the test function itself - with our automatic seeding for determinism.
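
Illustratively - this is just a sketch of the idea, not the eventual implementation - the check could snapshot the global random state around each draw:

import random
import warnings


def warn_if_global_random_used(draw_fn, *args, **kwargs):
    # hypothetical helper: detect strategies that consume randomness from the
    # global random module instead of going through Hypothesis
    before = random.getstate()
    result = draw_fn(*args, **kwargs)
    if random.getstate() != before:
        warnings.warn(
            "Strategy used the global random module; use st.randoms() or "
            "st.random_module() so Hypothesis controls this randomness.",
            stacklevel=2,
        )
    return result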

@Zac-HD Zac-HD changed the title from "Strategies using non-builtin generators" to "Warn if strategies depend on the random module without using st.random_module()" Dec 16, 2023