ML strategies #803

Merged
merged 29 commits into from
Jan 5, 2017


@marcharper marcharper commented Jan 2, 2017

This PR adds several new strategies.

  • Two strategies from the literature: Winner12 and Winner21
  • A new class of strategies based on hidden Markov models
  • Newly trained versions of players based on finite state machines, HMM, EvolvedLookerUp, PSOGambler, and Evolved ANN

Of particular note:

  • EvolvedFSM16, a 16 node FSM player that is the new "best" strategy
  • EvolvedFSM16Noise05, a 16 node FSM player that is the new best strategy for noisy tournaments (and quite good in the noise free tournaments)
  • PSOGamblerMem1, a memory one strategy trained with the PSO algorithm
  • A revised evolved looker up that performs well
  • EvolvedANN5 (with a smaller inner layer)
  • EvolvedHMM, a hidden Markov model based strategy

It's likely that these strategies are not the best possible, and that better versions can be evolved in the future. I trained many strategies, some of which are not included, particularly:

  • Preliminary models for the Moran process (not added to the standard list, and likely to be improved)
  • Models trained to win rather than achieve a high net score

There are also other conceivable training modes, e.g. maximum total score (self + opponent). The point is: we may decide to add or remove trained strategies in the future.

Also in this PR:

  • Some of the strategy classes were refactored to allow better training. See the code in the axelrod-evolver repository for reference and training code.
  • Model data for the trained strategies is stored in a subdirectory allowing for easy addition of new strategies, and to prevent storing data directly in strategy files. The exception is HMM (for which I've only included one strategy).
  • Because there are many models of some types (e.g. many LookerUp models), the strategy classes are created at runtime.
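
The runtime class creation mentioned in the last bullet can be sketched as follows. This is only an illustration of the general technique, not the code in this PR: `BasePlayer`, `MODEL_DATA`, and `make_player_class` are hypothetical names, and the parameters are made up (in the PR they would come from the stored model data).

```python
class BasePlayer:
    """Hypothetical stand-in for a parameterised strategy base class."""

    def __init__(self, params):
        self.params = params


# Hypothetical model data: one entry per trained model.
MODEL_DATA = {
    "EvolvedModelA": [0.1, 0.9],
    "EvolvedModelB": [0.5, 0.5],
}


def make_player_class(name, params):
    """Build a subclass of BasePlayer whose __init__ bakes in `params`."""

    def __init__(self):
        BasePlayer.__init__(self, params)

    # type(name, bases, namespace) creates a new class object at runtime.
    return type(name, (BasePlayer,), {"__init__": __init__, "name": name})


# One class per data entry, created when the module is imported.
player_classes = {
    name: make_player_class(name, params) for name, params in MODEL_DATA.items()
}

player = player_classes["EvolvedModelA"]()
print(type(player).__name__, player.params)
```

With this approach, adding a model means adding a data entry rather than writing a new class by hand.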

@drvinceknight
Member

Oooof this is a biggy!

@drvinceknight drvinceknight left a comment

This is a first shallow sweep, there's a lot here: could take me a little while but it looks like awesome work!

"""

name = 'Worse and Worse'
classifier = {
'memory_depth': float('inf'),
Member

It needs to know the current turn so does that not equate to knowing how long the game has been? (Thus infinite memory).

Member Author

I'm actually not sure about this one, so I've changed it back to float('inf') for now. The question is whether using the round number counts as using history or not. Let's discuss elsewhere.

@@ -0,0 +1,123 @@
"""Tests for Finite State Machine Strategies."""
Member

hidden markov model strategies

Member Author

done

@@ -22,6 +22,10 @@ documentation.
.. [Li2009] Li, J. & Kendall, G. (2009). A Strategy with Novel Evolutionary Features for the Iterated Prisoner’s Dilemma. Evolutionary Computation 17(2): 257–274.
.. [Li2011] Li, J., Hingston, P., Member, S., & Kendall, G. (2011). Engineering Design of Strategies for Winning Iterated Prisoner ’ s Dilemma Competitions, 3(4), 348–360.
.. [Li2014] Li, J. and Kendall, G. (2016). The Effect of Memory Size on the Evolutionary Stability of Strategies in Iterated Prisoner's Dilemma. IEEE Transactions on Evolutionary Computation, 18(6) 819-826
.. [Mathieu2015] Mathieu, P. and Delahaye, J. New Winning Strategies
Member

When the docs build does this look right? (Just doesn't match the one line formatting used for all others).

Member Author

Citation updated

url='http://axelrod.readthedocs.org/',
license='The MIT License (MIT)',
description='Reproduce the Axelrod iterated prisoners dilemma tournament',
include_package_data=True,
Member

If you haven't, could you check that this pip installs? (eg pip installing in to a virtual env from the local dir)

Member Author

It does locally, and the tests would fail on travis otherwise.

Member

I don't think travis specifically tests this (just fyi) but appveyor does test the setup install (because we had some windows problems with that at some point...).

@@ -0,0 +1,45 @@
import pkg_resources
Member

Could we have unit tests for these please.

Member Author

Not sure how to better test these than simply loading the data and testing the strategies. I added a few integrity checks in ANN and LookerUp to make sure the data is of the expected length.

Member

I was thinking that we could have a tests/unit/test_load_data.py file that just checks that these functions run and that the data is of the expected format separately.

Member Author

I'm not sure that we'd be testing anything further in that case -- if the format or data types are wrong the strategies will fail when constructed or played.

Member

I agree that this wouldn't test anything further, it just consolidates things. For example, in the future these ML strategies could (hypothetically) be changed to no longer read the data and their tests adjusted, after which an error could creep into these reader functions unnoticed. That's a weird case but I think my point holds?

Member Author

But if the players no longer read the data then the data and these functions are unnecessary (and their coverage will disappear). So wouldn't we just delete the data and these functions in that case, unless something else is using them? And if something else is using them then a bad change will still break those things.

Member

Very good point, I'd still say too many tests is better than too few but I won't insist. :) 👍

Member Author

Maybe type annotations are a good check here, so when we start annotating we'll get a little extra coverage.
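
For reference, the kind of standalone format check proposed in this thread could look roughly like the sketch below. It does not use the PR's actual loader functions: `load_weights` is a hypothetical stand-in, and `SAMPLE` is made-up data that only follows the "name, features, hidden_layer_size, weights..." layout of the data file header.

```python
import unittest

# Made-up sample data in the layout described by the data file header.
SAMPLE = """# name, features, hidden_layer_size, weights...
model_a, 2, 1, 0.5, -0.5, 1.0
model_b, 1, 2, 0.1, 0.2, 0.3, 0.4
"""


def load_weights(text):
    """Hypothetical loader: parse rows into {name: (features, hidden, weights)}."""
    data = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and the commented header
        name, features, hidden, *weights = [s.strip() for s in line.split(",")]
        data[name] = (int(features), int(hidden), [float(w) for w in weights])
    return data


class TestLoadData(unittest.TestCase):
    def test_rows_have_expected_format(self):
        data = load_weights(SAMPLE)
        self.assertEqual(set(data), {"model_a", "model_b"})
        for features, hidden, weights in data.values():
            self.assertIsInstance(features, int)
            self.assertIsInstance(hidden, int)
            self.assertTrue(all(isinstance(w, float) for w in weights))


# Run the test case explicitly so the sketch works outside a test runner.
suite = unittest.TestLoader().loadTestsFromTestCase(TestLoadData)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```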

"""Implementation of a basic Hidden Markov Model. We assume that the
transition matrix is conditioned on the opponent's last action, so there
are two transition matrices. Emission distributions are stored as Bernoulli
probabilities for each state. This is essentially a stochastic FSM.
Member

Names

    - SimpleHMM: ...

Member Author

This one isn't a strategy but I updated the HMM Players with names

Member

Sorry I got over zealous :)
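
The model described in the docstring under discussion (two transition matrices selected by the opponent's last action, plus one Bernoulli emission probability per state) can be sketched roughly as below. This is not the PR's implementation: the class name `SimpleHMMSketch`, the order of operations (transition first, then emit), and the reading of the emission probability as the probability of cooperating are all assumptions made for illustration.

```python
import random

C, D = "C", "D"


class SimpleHMMSketch:
    """Illustrative stochastic FSM; names and update order are assumptions."""

    def __init__(self, transitions_C, transitions_D, emission_probabilities,
                 initial_state, seed=None):
        self.transitions_C = transitions_C
        self.transitions_D = transitions_D
        self.emission_probabilities = emission_probabilities
        self.state = initial_state
        self.rng = random.Random(seed)

    def move(self, opponent_action):
        # Pick the transition matrix conditioned on the opponent's last action.
        matrix = self.transitions_C if opponent_action == C else self.transitions_D
        row = matrix[self.state]
        # Sample the next state from the current state's row of probabilities.
        self.state = self.rng.choices(range(len(row)), weights=row)[0]
        # Emit C with the Bernoulli probability attached to the new state.
        p_cooperate = self.emission_probabilities[self.state]
        return C if self.rng.random() < p_cooperate else D


# With only 0/1 probabilities the model is deterministic; this one copies
# the opponent's last action.
copier = SimpleHMMSketch([[1, 0], [1, 0]], [[0, 1], [0, 1]], [1, 0],
                         initial_state=0, seed=0)
moves = [copier.move(a) for a in [C, D, D, C]]
```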



class HMMPlayer(Player):
"""Abstract base class for Hidden Markov Model players."""
Member

Same comment names ...

Member Author

done

self.hmm.state = self.initial_state


class EvolvedHMM5(HMMPlayer):
Member

names

Member Author

done

@@ -14,7 +14,7 @@ class TestEvolvedANN(TestPlayer):
expected_classifier = {
'memory_depth': float('inf'),
'stochastic': False,
'makes_use_of': set(["length"]),
Member

Is the number of turns no longer a feature?

Member Author

Nope! Just the round number.

Member

Cool :)

def test_malformed_params(self):
# Test a malformed table
t_C = [[1, 0.5], [0, 1]]
self.assertFalse(is_stochastic_matrix(t_C))
Member

Re my request for a test for this function, could it be pulled out of here and tested independently? (With a True assertion as well).

Member Author

done
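
An independent test of the kind requested here, with both a True and a False assertion, might look like the sketch below. The function body is an assumption about what `is_stochastic_matrix` checks (entries in [0, 1] and each row summing to 1 within a tolerance); the `ep` parameter and the test class name are hypothetical.

```python
import unittest


def is_stochastic_matrix(m, ep=1e-8):
    """Return True if every row holds probabilities that sum to 1 (within ep)."""
    for row in m:
        if any(x < 0 or x > 1 for x in row):
            return False
        if abs(sum(row) - 1) > ep:
            return False
    return True


class TestIsStochasticMatrix(unittest.TestCase):
    def test_valid_matrix(self):
        self.assertTrue(is_stochastic_matrix([[0.5, 0.5], [0, 1]]))

    def test_malformed_matrix(self):
        # The first row sums to 1.5, as in the malformed table in the PR test.
        self.assertFalse(is_stochastic_matrix([[1, 0.5], [0, 1]]))


suite = unittest.TestLoader().loadTestsFromTestCase(TestIsStochasticMatrix)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```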

@@ -0,0 +1,5 @@
# name, features, hidden_layer_size, weights...
Member

Small suggestion: remove the # and list full header (potentially useful for other analysis?). I expect this would need to be done on the axelrod-evolver repo and not for this PR.

Member Author

I think we should cross this bridge later. The number of columns isn't constant so there isn't really a proper header.

Member

Fine to leave for later, you could have the max number of columns with headers and have NANs in the other ones? (Not suggesting that for now).
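
The idea floated above (a header sized to the widest row, with NaNs filling the shorter rows) could be sketched like this. Nothing here is from the PR: `pad_rows`, `make_header`, the `weight_N` column names, and the sample rows are hypothetical.

```python
import math

# Made-up rows in the "name, features, hidden_layer_size, weights..." layout.
rows = [
    ["model_a", 2, 1, 0.5, -0.5],
    ["model_b", 1, 2, 0.1, 0.2, 0.3, 0.4],
]


def pad_rows(rows, fill=float("nan")):
    """Pad every row on the right so all rows share the widest row's length."""
    width = max(len(row) for row in rows)
    return [list(row) + [fill] * (width - len(row)) for row in rows]


def make_header(width, fixed=("name", "features", "hidden_layer_size")):
    """Name the fixed columns, then number the remaining weight columns."""
    return list(fixed) + ["weight_{}".format(i) for i in range(width - len(fixed))]


padded = pad_rows(rows)
header = make_header(len(padded[0]))
```

The trade-off is that every consumer of the data then has to strip the NaN padding before using the weights.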

@drvinceknight drvinceknight left a comment

This is a fantastic addition to the library @marcharper :)

My requests are mainly docstrings and more tests as well as a couple of questions.

import pkg_resources


def load_file(filename, directory):
Member

Docstring for completeness. Also numpy style for this and the rest?


If the mutation_rate is 0, the population will eventually fixate on
exactly one player type. In this case a StopIteration exception is
raised and the play stops. If mutation_rate is not zero, then the
Member

the mutation_rate

hidden_layer_size
)
ANN.__init__(self, i2h, h2o, bias)
num_features, num_hidden, weights = nn_weights['1']
Member

Could we change the 1 in the dataset and here to be 10? This would be for readability and corresponding to the size of the hidden layer.


Names:

- EvolvedANN5: : Original name by Marc Harper.
Member

Extra :

@@ -158,7 +160,8 @@ def strategy(self, opponent):

class EvolvedANN(ANN):
"""
- A strategy based on a pre-trained neural network.
+ A strategy based on a pre-trained neural network with 17 features and a
Member

Would it be worth including a bullet point list of the 17 features here?

for m in [self.hmm.transitions_C, self.hmm.transitions_D]:
for row in m:
values.update(row)
if not values.issubset({0, 1}):
Member

return not values.issubset({0, 1}).

(This is stylistic, I don't feel strongly about it.)
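
As a sketch of the suggested single-expression return: the function below collects the transition values and returns whether any of them is something other than 0 or 1. The standalone function name `is_stochastic` and its signature are hypothetical (the quoted code is a method reading `self.hmm`), and whether the original also inspects the emission probabilities is not visible in the quoted lines, so including them is an assumption.

```python
def is_stochastic(transitions_C, transitions_D, emission_probabilities):
    """Return True if any probability is strictly between 0 and 1."""
    # Assumption: emission probabilities are part of the check.
    values = set(emission_probabilities)
    for m in [transitions_C, transitions_D]:
        for row in m:
            values.update(row)
    # Only the values 0 and 1 means every choice is deterministic.
    return not values.issubset({0, 1})


deterministic = is_stochastic([[1, 0], [1, 0]], [[0, 1], [0, 1]], [1, 0])
noisy = is_stochastic([[0.5, 0.5], [1, 0]], [[0, 1], [0, 1]], [1, 0])
```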


Names

- EvolvedHMM5
Member

: Original name...

player = self.player([[1]], [[1]], [0], initial_state=0)
player.hmm.state = -1
player.reset()
self.assertFalse(player.hmm.state == -1)
Member

Could we have a specific test for the EvolvedHMM5.

player = self.player([[1]], [[1]], [0], initial_state=0)
player.hmm.state = -1
player.reset()
self.assertFalse(player.hmm.state == -1)
Member

I think assertNotEqual would be better here.

And/or even assertEqual with 0.
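
A sketch of the stricter assertion suggested here. `StubHMM` and `StubPlayer` are hypothetical stand-ins so the example runs on its own; the real test would construct the player via `self.player(...)` as in the quoted lines.

```python
import unittest


class StubHMM:
    """Hypothetical stand-in holding only the current state."""

    def __init__(self, state):
        self.state = state


class StubPlayer:
    """Hypothetical stand-in whose reset() restores the initial state."""

    def __init__(self, initial_state=0):
        self.initial_state = initial_state
        self.hmm = StubHMM(initial_state)

    def reset(self):
        self.hmm.state = self.initial_state


class TestReset(unittest.TestCase):
    def test_reset_restores_initial_state(self):
        player = StubPlayer(initial_state=0)
        player.hmm.state = -1
        player.reset()
        # assertEqual pins the exact expected state, and on failure it
        # reports both values rather than just "True is not false".
        self.assertEqual(player.hmm.state, 0)


suite = unittest.TestLoader().loadTestsFromTestCase(TestReset)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```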

@@ -19,7 +19,7 @@ class MetaPlayer(Player):
classifier = {
'memory_depth': float('inf'), # Long memory
'stochastic': True,
- 'makes_use_of': set(),
+ 'makes_use_of': {'game', 'length'},
Member

Could you walk me through this one please?

Member Author

The default player set has members that use both the game and the match length.

import pkg_resources


def load_file(filename, directory):
Member

Wouldn't these be better as a set of pandas dataframes? There would be far less code and it would be quicker too.

I know it's another dependency, but we're already dependent on numpy, so the precedent has been set.

Member Author

Yep! I prefer using pandas actually if the extra dependency is ok with @drvinceknight .

Member

I'm not averse to the extra dependency. 👍

Could change the output of the results_set.summarize to be a data frame too (not for this PR, another issue :)).

Member Author

Cool. I think anyone who uses anaconda or can pip install numpy should have access to the rest of the scientific stack (for sure at least scipy and pandas).

Member Author

I'm punting on this one since the number of columns isn't constant in all cases.

Member

Fine by me.

I'm not entirely sure using dfs would make things simpler in this case: the data as is would need to be pivoted for the df to be advantageous, or the data could be stored with rows corresponding to "genes" and columns to different strategies... Perhaps not a bad idea (but I don't think necessary for this PR).

@marcharper
Member Author

I believe that I addressed all the comments sufficiently, let me know if you agree!

@drvinceknight drvinceknight left a comment

All looks good to me! Nice job :)

@meatballs meatballs merged commit 611bfb7 into master Jan 5, 2017
@meatballs meatballs deleted the ml_strategies branch January 5, 2017 10:40

3 participants