Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a versus_test method to TestPlayer #875

Merged
merged 32 commits into from
Mar 9, 2017
Merged

Add a versus_test method to TestPlayer #875

merged 32 commits into from
Mar 9, 2017

Conversation

drvinceknight
Copy link
Member

Addresses #874 with a tweaked testing framework.

The versus_test method removes the option to set player histories and can be used to test strategies (and player attributes during matches) by creating a match between the player in question and an opponent.

The opponent can either be an actual player from the library or is defined as a cycle by a passed sequence of actions (there are examples of this in the docs and test_titfortat.py).

I've refactored all the tests in test_titfortat.py (which is now one module that raises no warnings).

When/if we're happy with this as a suggestion, we should open an issue about refactoring all the tests to use this approach. (This would be a big/important piece of work so we should ask for small contributions at a time to make sure we don't miss anything with the reviews).

Once that refactor is complete we could potentially remove unused methods in the TestPlayer class.

@marcharper
Copy link
Member

I like most of this. Some quick thoughts:

  • Having opponent be a player or a sequence is a bit confusing -- using a MockPlayer would be better IMO, that's its purpose.
  • We might be able to get away with only having init_kwargs, or it may be better to tackle that issue before we rewrite a bunch of tests.

@drvinceknight
Copy link
Member Author

Having opponent be a player or a sequence is a bit confusing -- using a MockPlayer would be better IMO, that's its purpose.

Yeah I wasn't completely sure about having two input possibilities. I'll make this tweak.

We might be able to get away with only having init_kwargs, or it may be better to tackle that issue before we rewrite a bunch of tests.

Can you expand on this a bit more? I'm not too sure I follow which issue you mean.

@drvinceknight
Copy link
Member Author

I've also just pushed a refactor of the Random strategy. I've used init_kwargs there.

@marcharper
Copy link
Member

Issue #706 -- maybe we could make all the Player args into kwargs, which I think would help with #706 as well.

@drvinceknight
Copy link
Member Author

Issue #706 -- maybe we could make all the Player args into kwargs, which I think would help with #706 as well.

Sure thing, I'll remove the init_args option for now so that it will push us in that direction.

@marcharper
Copy link
Member

It's ok to use test_reset.

@drvinceknight
Copy link
Member Author

It's ok to use test_reset.

Did you mean to type that here or in the other issue? Surely that's overwriting the test_reset method of the parent class?

@@ -45,7 +45,7 @@ def test_strategy(self):
init_kwargs={"grumpy_threshold": 3,
"nice_threshold": 0})

def test_reset(self):
def test_reset_state(self):
Copy link
Member

@marcharper marcharper Mar 7, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this test is unnecessary -- the default test catches any attribute changes.

actions is passed, a Mock Player is created that cycles over that
sequence.
expected_outcomes: List
the expected outcomes of the match (list of tuples of actions).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

T

`{length:-1}` implies that the players do not know the length of the
match.
attrs: dict
dictionary of internal attributes to check at the end of all plays
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D

self.responses_test([C], [C, D, C], [C, C, D], seed=1)

opponent = axelrod.MockPlayer()
outcomes = [(C, C), (D, C), (D, C)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This tests the histories above but not the next move, so it's not exactly equivalent. Doesn't matter in this case of course.

self.second_play_test(rCC=C, rCD=D, rDC=C, rDD=D)

# Play against opponents
outcomes = [(C, C), (C, D), (D, C), (C, D)]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as before re: not checking the last move anymore.

@drvinceknight
Copy link
Member Author

@marcharper the particular test in test_grumpy was not covered by the default case as it's checking a non default value but your comment got me to find a bug in our default test case. It was in effect not testing anything (it was simply comparing the attributes of the clone to the attributes of the clone and not the reset player).

I've fixed this (cff2e53 and a refactor here: b7ce33a) and ended up having to fix some strategies that were then failing.

This is worth a slow and thorough review in case I've missed anything so no rush :)

for k, v in clone.__dict__.items():
if isinstance(v, np.ndarray):
self.assertTrue(np.array_equal(v, getattr(clone, k)))
for attribute, value in player.__dict__.items():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would player == clone work here now?

player.play(opponent)
player.reset()
for i, p in enumerate(player.team):
self.assertEqual(len(p.history), 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use player == clone here?

axelrod.seed(seed)
player.play(opponent)

player.reset()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

player == clone?

@marcharper
Copy link
Member

That's interesting. I remember catching a subtle bug with MetaPlayer with that test so that's surprising that it worked at all! I like the new additions, looks like maybe we could remove some redundant code by using player == clone in a few places.

We might consider updating MetaPlayer's clone method to clone all the team members, or just take a closer look at it.

@drvinceknight
Copy link
Member Author

That's interesting. I remember catching a subtle bug with MetaPlayer with that test so that's surprising that it worked at all! I like the new additions, looks like maybe we could remove some redundant code by using player == clone in a few places.

You might be able to tell from the commit history that I implemented a player __eq__ but felt that that was stepping a bit too far for this PR so I reverted it (we should squash these commits for with the merge). Without the __eq__ method player == clone would just look at instance equality (so be equivalent to player is clone). That could be added at another date.

We might consider updating MetaPlayer's clone method to clone all the team members, or just take a closer look at it.

Perhaps for another PR?

This does it by specifically creating the Matches.
- Refactor test for Tf2Ts.
- Refactor tests for 2TfT
- Refactor tests for Bully.
- Refactor tests for SneakyTitForTat.
- Refactor test for suspicious TfT.
- Refactor test for AntiTitForTat
- Refactor tests for HardTfT
- Refactor tests for HardTf2T
- Refactor OmegaTfT tests.
- Refactor tests for Gradual.
- Refactor tests for SlowTitForTwoTats
- Refactor tests for AdaptiveTfT
- Refactor tests for SpitefulTfT

Also implement correct check for seed is not None.

If we check `if seed` then when seed is passed as 0 it will not be set.
@drvinceknight
Copy link
Member Author

I've just rebased on to master and caught a misclassification of stalker :)

I'm not sure if this means I'm going to miss any comments you wrote but I think all of them were suggesting player == clone which (unless I'm missing something) won't work because of the fact that we haven't got a __eq__ method implemented but that could be another issue to raise/discuss?

@marcharper
Copy link
Member

OK. __eq__ was still there when I was commenting. I would vote to add it back in, but it can be another PR.

marcharper
marcharper previously approved these changes Mar 8, 2017
@@ -57,6 +64,7 @@ def __init__(self, transitions=None, initial_state=None,
initial_action = C
super().__init__()
self.initial_state = initial_state
self.state = initial_state
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of this line?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needed for the reset test. The cloned player did not have a self.state attribute if it had not played yet.

@marcharper
Copy link
Member

Do we want to change expected_outcomes to expected_actions or something else more inline with the terminology elsewhere in the library?

@drvinceknight
Copy link
Member Author

OK. eq was still there when I was commenting. I would vote to add it back in, but it can be another PR.

Cool. Once this is in I'll open a couple of issues:

  1. To refactor the tests;
  2. To add an __eq__ method in to the players and refactor some of the base tests;
  3. Once 1. is complete to refactor the base tests to remove anything that is not needed anymore.

Do we want to change expected_outcomes to expected_actions or something else more inline with the terminology elsewhere in the library?

Good call. I've pushed that.

@drvinceknight
Copy link
Member Author

@marcharper in bc398f2 I've gone a step further and removed first_play_test and second_play_test from the docs and test_tit_for_tat, test_rand. (I also added a couple more versus_tests in test_tit_for_tat to make sure all states were being tested.)

My thinking is that we can test those properties using matches and it moves us away from the potential history mistmatches that can still occur with second_play_test.

Can always revert if we don't think it's a good idea.

@marcharper
Copy link
Member

I'd prefer to keep those tests -- the warnings for those are suppressed already. Or -- let's take that discussion to a new issue / PR?

@drvinceknight
Copy link
Member Author

I'd prefer to keep those tests -- the warnings for those are suppressed already. Or -- let's take that discussion to a new issue / PR?

Fine by me, I've reverted the commit.

@meatballs
Copy link
Member

This one looks ready to roll to me. Anyone object?

@drvinceknight
Copy link
Member Author

This one looks ready to roll to me. Anyone object?

Not from me :)

@meatballs meatballs merged commit ab01b8c into master Mar 9, 2017
@meatballs meatballs deleted the 874 branch March 9, 2017 11:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants