Stateful: sometimes no steps are run? #376

Closed
Tinche opened this issue Oct 10, 2016 · 2 comments · Fixed by #1188
Labels: enhancement (it's not broken, but we want it to be better)

Comments


Tinche commented Oct 10, 2016

I've noticed that sometimes my stateful tests run zero rules: the rule-based state machine's __init__ gets run and that's it. Each such run still counts against the limit of 200 valid examples.

Here's a braindead stateful test to demonstrate: https://gist.github.com/Tinche/e1339718ff9d5b56e1c50773f3a7a8a2 (if it's a little quaint, it's because it's modeled after what I actually use).
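For readers who don't follow the gist link, a machine of roughly this shape reproduces the pattern (a hypothetical sketch: the names and rules here are illustrative, not the gist's actual code):

from hypothesis.stateful import RuleBasedStateMachine, precondition, rule

class Machine(RuleBasedStateMachine):
    def __init__(self):
        super().__init__()
        self.opened = False
        print("init")  # one marker line per run, counted by the grep below

    @rule()
    def open(self):
        self.opened = True

    @precondition(lambda self: self.opened)
    @rule()
    def close(self):
        self.opened = False

TestMachine = Machine.TestCase  # collected by pytest from t.py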

> set -x HYPOTHESIS_VERBOSITY_LEVEL debug
> pytest t.py -s | grep "init" | wc -l
200
> pytest t.py -s | grep "Step #1:" | wc -l
157

The second number varies between runs, roughly 130 to 160, so around a quarter of the tests do nothing but initialize the class. That's a little weird and not very useful IMO. The default number of steps is 50, but very few runs actually reach 50 (5 to 15 out of 200, so roughly 5%).

I think it'd be better if all runs executed at least one rule, or at least if Hypothesis didn't count runs that hit no rules against max_examples, simply because I'm betting that just running __init__ doesn't actually test anything for most people.
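For context, both limits mentioned above are plain settings. A minimal sketch of how they can be configured on a machine's TestCase, assuming a Hypothesis version that exposes the stateful_step_count setting (the values shown are the defaults under discussion):

from hypothesis import settings
from hypothesis.stateful import RuleBasedStateMachine, rule

class Machine(RuleBasedStateMachine):
    @rule()
    def step(self):
        pass

TestMachine = Machine.TestCase
# max_examples caps the number of runs; stateful_step_count caps rules per run
TestMachine.settings = settings(max_examples=200, stateful_step_count=50)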

Zac-HD (Member) commented Apr 14, 2017

#498 may help with this; we should check again once it's been merged.

Update: that helped somewhat, but not enough: I now get 200 inits, 175 to 185 runs reaching step one, and only 10 to 15 eventually reaching step fifty. A clear improvement, but I wouldn't call this issue closed.

Zac-HD added the enhancement label on Apr 20, 2017
Zac-HD (Member) commented Oct 4, 2017

Code with exact step reporting:

from collections import Counter

from hypothesis.stateful import RuleBasedStateMachine, \
    run_state_machine_as_test, rule, precondition

# Maps id(machine) -> number of rule steps run by that machine.
# Note: id() values may be reused after garbage collection, so counts are approximate.
cache = dict()

class StatefulTest(RuleBasedStateMachine):
    def increment(self):
        # __init__'s call creates the entry at zero; each rule adds one, so
        # the cached value is the number of rule steps executed by that run.
        if id(self) not in cache:
            cache[id(self)] = 0
        else:
            cache[id(self)] += 1

    def __init__(self):
        super().__init__()
        self.a = False
        self.b = False
        self.c = False
        self.increment()

    @rule()
    def open_a(self):
        self.a = True
        self.increment()

    @precondition(lambda self: self.a)
    @rule()
    def open_b(self):
        self.b = True
        self.increment()

    @precondition(lambda self: self.b)
    @rule()
    def open_c(self):
        self.c = True
        self.increment()

    @precondition(lambda self: self.c)
    @rule()
    def final(self):
        self.increment()

run_state_machine_as_test(StatefulTest)
print(Counter(cache.values()))  # histogram: steps per run -> number of runs

As of Hypothesis 3.31.2 the situation is still basically the same, with 15-20% of runs ending before running any steps at all, and ~40% running two or fewer steps.
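Those percentages can be read straight off the Counter from the snippet above; a small follow-up helper (hypothetical, assuming the snippet has just been run in the same session):

counts = Counter(cache.values())
total = sum(counts.values())
# fraction of runs that executed zero rules, and two or fewer rules
print("no steps:", counts.get(0, 0) / total)
print("<= 2 steps:", sum(v for k, v in counts.items() if k <= 2) / total)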

As a solution, I suggest that we run all stateful tests until either stateful_step_count is reached or a bug is found.

The other problem is that without some memory of previous runs, we are likely to retread the same sections of the tree of possible actions, especially when there are few transition rules. This may be handled by coverage-guided testing, but I'm unsure how it interacts with stateful testing in practice.
