Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add initialization rules to RuleBasedStateMachine #1216

Closed
DRMacIver opened this issue Apr 6, 2018 · 10 comments
Closed

Add initialization rules to RuleBasedStateMachine #1216

DRMacIver opened this issue Apr 6, 2018 · 10 comments
Labels
enhancement it's not broken, but we want it to be better

Comments

@DRMacIver
Copy link
Member

A common usage pattern in RuleBasedStateMachine is that you have a a single mutable data structure instance that is shared across the whole test.

This is currently only well supported if you can just straightforwardly initialize that data structure without relying on Hypothesis provided data and do that in __init__.

If you need Hypothesis to provide the data, currently the best emulation is something like the following:

from hypothesis.stateful import RuleBasedStateMachine, rule, precondition
import hypothesis.strategies as st


class NeedsInit(RuleBasedStateMachine):
    def __init__(self):
       super(NeedsInit, self).__init__()
       self.some_data = None

   @precondition(lambda: self.some_data is None)
   @rule(data=st.lists(st.integers())
   def init_data(self, data):
       self.some_data =data

   @precondition(lambda: self.some_data is not None)
   @rule()
   def do_stuff(self):
       ...

(disclaimer: I wrote that from memory without running it).

This is obviously quite awkward! You have to add preconditions to each rule.

It would be nice if we could instead write:

from hypothesis.stateful import RuleBasedStateMachine, rule, initialize
import hypothesis.strategies as st

class NeedsInit(RuleBasedStateMachine):
   @initialize(data=st.lists(st.integers())
   def init_data(self, data):
       self.some_data =data

   @rule()
   def do_stuff(self):
       ...

Suggested semantics/implementation:

  • We add a state_machine_init method or something like that to GenericStateMachine as the relevant hook.
  • state_machine_init is called with a st.data() strategy before the first check_invariants call
  • RuleBasedStateMachine implements state_machine_init to run each @initialize decorated method exactly once in an arbitrary order (possibly an explicitly randomised order using permutations? That way people won't depend on the order).
@DRMacIver DRMacIver added enhancement it's not broken, but we want it to be better good first issue labels Apr 6, 2018
@bukzor
Copy link
Contributor

bukzor commented Apr 6, 2018

You can close #1214 as duplicate if you like. This would fill my feature request.

@bukzor
Copy link
Contributor

bukzor commented Apr 6, 2018

The API I had imagined is below. Do you prefer yours?

from hypothesis.stateful import RuleBasedStateMachine, rule
import hypothesis.strategies as st

class NeedsInit(RuleBasedStateMachine):
   def __init__(self, some_data=st.lists(st.integers()):
       self.some_data = some_data

   @rule()
   def do_stuff(self):
       ...
```

@DRMacIver
Copy link
Member Author

DRMacIver commented Apr 12, 2018

Do you prefer yours?

Yes, pretty strongly. I'd possibly be OK with an __init__ based approach (though I think separating out the initialize rules is probably better), but I'm very against using default arguments to indicate strategies - it's inconsistent with pretty much everywhere else we use strategies.

@Zac-HD
Copy link
Member

Zac-HD commented Apr 13, 2018

+1 to @initialize rules - this will also allow a single class to test multiple constructors for an object.

@DRMacIver
Copy link
Member Author

this will also allow a single class to test multiple constructors for an object.

Hmm. You may have accidentally just argued against this feature, because there are two different interpretations here and we seem to have hit on different ones.

Possible semantics of multiple initialization rules:

  • Exactly one of them is run
  • All of them are run, in some arbitrary order

I had intended the latter and you seem to have inferred the former. This might be fine if we explicitly document it, but it might lead to some significant user confusion.

@alexwlchan
Copy link
Contributor

So I might almost be tempted to lean a third way, and suggest @initialize, but at most one per class, then different constructors get tested by different classes.

So if you have different constructors, you push your rules into a mixin, and create variants of that mixin with different @initialize rules. Each constructor gets a dedicated test case.

This is just spitballing, I’m not convinced this is actually a good idea. 🙃

@DRMacIver
Copy link
Member Author

FWIW I think the use-case of different constructors can be solved by just branching within the initialize method. So you can have something like:

@initialize(data=st.data())
def setup(self, data):
    if data.draw(st.booleans()):
        self.thing = ...
    else:
        self.thing = ...

So I'm not convinced that "What should we do about multiple constructors?" is an interesting question for shaping the design of this.

I don't have a problem with allowing at most one @initialize per class though. It might reduce the confusion, simplify the implementation, and is an easy restriction for us to lift later if someone comes up with a compelling use case.

@alexwlchan
Copy link
Contributor

I'm not convinced that "What should we do about multiple constructors?" is an interesting question for shaping the design of this.

I think I agree. Most classes I write don’t have multiple constructors, and of the ones that do, most of them have a “primary” constructor, and others that just tidy up the inputs (e.g. taking a str instead of a datetime, and parsing it before sending to the primary constructor).

@DRMacIver
Copy link
Member Author

Most classes I write don’t have multiple constructors,

I mean in Python no classes have multiple constructors. The only real way to get something like that is to have multiple subclasses.

@Zac-HD
Copy link
Member

Zac-HD commented Apr 13, 2018

Or classmethods!

I meant "you can define multiple methods with @initialize, of which exactly one will be run per iteration".

Take Pandas dataframes for example; they have an __init__ method but also classmethods from_records, from_dict, and from_items. Having an @initialize for each option would let you test that holds no matter how you create the object.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement it's not broken, but we want it to be better
Projects
None yet
Development

No branches or pull requests

4 participants