-
Notifications
You must be signed in to change notification settings - Fork 262
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implemented Leyvraz, K86R from Axelrod's second #1153
Conversation
axelrod/strategies/axelrod_second.py
Outdated
2. If opponent Defected three turns ago, then Cooperate. | ||
3. If opponent Defected two turns ago, then Defect. | ||
4. If opponent Defected last turn, then Defect with prob 50%. | ||
5. Otherwise (all Coopertaions), then Cooperate. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: Cooperations
# Otherwise Defect, Cooperate, Cooperate, and increase `prob` | ||
self.prob += 0.05 | ||
self.more_coop, self.last_generous_n_turns_ago = 2, 1 | ||
return self.try_return(D, lower_flags=False) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happened here? This doesn't look like it should have changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, maybe I shouldn't have slipped this in here. I got rid of a useless else.
axelrod/strategies/axelrod_second.py
Outdated
return D | ||
if recent_history[-1] == D: | ||
return random_choice(0.5) | ||
return C # recent_history == [C, C, C] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a suggestion to consider rather than a request, but might this be easier to read with a dict mapping recent_history to action? Something like:
history_to_action = {
[C, C, C]: C,
[C, C, D]: random_choice(0.5)
etc....
}
return history_to_action[recent_history]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this actually is not ideal in this case (unless I'm misunderstanding):
[C, C, D]: random_choice(0.5)
Implies history_to_action
would need to be redefined at every call of strategy
to ensure the random sample is different which in turn means we'd be using the random module even when not needed (which bumps the seed and makes the strategy slower).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One way around it (if you did want to use a dict
) would be to use the dict to map to probabilities:
{
(C, C, C): 1,
(C, C, D): 0.5
}
and then use random_choice(history_to_action[recent_history])
(the random_choice
is smart so if passed 0 or 1 doesn't sample).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went ahead and implemented this. I agree that it's a little more readable.
Strategy logic looks good and a correct implementation to me. Will leave the stylistic issue of the |
axelrod/strategies/axelrod_second.py
Outdated
(D, D, C): 1.0, | ||
(D, D, D): 0.5 | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice!
1. If opponent Defected last two turns, then Defect with prob 75%. | ||
2. If opponent Defected three turns ago, then Cooperate. | ||
3. If opponent Defected two turns ago, then Defect. | ||
4. If opponent Defected last turn, then Defect with prob 50%. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the docstrings don't match the description:
(D, D, D): 0.5
That doesn't correspond to If opponent Defected last turn, then Defect with prob 50%.
as indicated for example by (D, C, D): 1.0
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies: I see you said "ordered" rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. I was hoping the tests would catch any issues. I'll go through this more carefully and re-run the fingerprints.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No sorry apologies again, that's still not exact then, according to the ordered description: (D, D, D)
should defect with prob 75%.
Now that we have this dict
, we could almost just describe the strategy by saying:
Depending on the 3 most recent actions by the opponent the strategy cooperates with probability:
- If (C, C, C): probability 1;
- If (C, C, D): probability 0.5;
...
and essentially write out the 8 cases that correspond to the dict
(which indeed really helps with readability here 👍).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think (D, D, D) should be 0.25, by the first rule. (These rules get applied in order.) These others seem okay; (D, C, D) is 1.0 by rule 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are correct:
>>> import axelrod as axl
>>> import axelrod_fortran as axlf
>>> axl.seed(0)
>>> players = (axl.Defector(), axlf.Player("k68r"))
>>> match = axl.Match(players, turns=1000)
>>> results = match.play()
>>> match.cooperation()
(0,240)
so k68r
is cooperating 25% of the time against D, D, D
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whereas your current branch (as of 890d930):
>>> players = (axl.Defector(), axl.Leyvraz())
>>> match = axl.Match(players, turns=1000)
>>> axl.seed(0)
>>> results = match.play()
>>> match.cooperation()
(0, 508)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A direct map of the previous implementation:
- if recent_history[-1] == D and recent_history[-2] == D:
- return random_choice(0.25)
- if recent_history[-3] == D:
- return C
- if recent_history[-2] == D:
- return D
- if recent_history[-1] == D:
- return random_choice(0.5)
- return C # recent_history == [C, C, C]
gives:
{
(C, D, D): 0.25,
(D, D, D): 0.25,
(D, C, C): 1,
(D, C, D): 1,
(D, D, C): 1,
(C, D, C): 0,
(C, C, D): 0.5,
(C, C, C): 1
}
which I believe gives the same as you have, apart from, as you say (D, D, D)
.
Would be good to confirm the fps just in case 👍
FPs look good. |
'manipulates_source': False, | ||
'manipulates_state': False | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we add a test_init_of_dict(self)
just to add a second anchor for the internal dictionary (to help avoid any future changes that might re introduce the error we just caught).
def test_init_of_dict(self):
player = self.player()
expected_prob_coop = {...}
self.assertEqual(player.prob_coop, expected_proob_coop)
For completeness have rerun at dacc9c8: >>> players = (axl.Defector(), axl.Leyvraz())
>>> match = axl.Match(players, turns=1000)
>>> axl.seed(0)
>>> results = match.play()
>>> match.cooperation()
(0, 248) 👍 |
axelrod/strategies/axelrod_second.py
Outdated
if len(opponent.history) >= go_back: | ||
recent_history[-go_back] = opponent.history[-go_back] | ||
|
||
return random_choice(self.prob_coop[tuple(recent_history)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've hunted down why the type testing is failing on travis. It doesn't know what size tuple
tuple(recent_history)
returns so you need to change this to be:
1508 return random_choice(self.prob_coop[(recent_history[-3], │
+ 1509 recent_history[-2], │
+ 1510 recent_history[-1])])
So specifically say this is a tuple of size 3, which is a bit of a mouthful.
Thanks for making all the changes 👍 |
Fingerprints below.
Pretty direct translation. Their funny if statements reduced a lot.
J1 and J
J2 cannot be 1, because then the whole left side would exceed 1. Ditto J. So this becomes J1=1.
Similarly J1 and J2 must equal zero. So this becomes J=1.