Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a new DynamicTwoTitsForTat strategy #1030

Merged
merged 6 commits into from
May 30, 2017

Conversation

FakeNameSE
Copy link
Member

Adds the new strategy DynamicTwoTitsForTat, based off of the TwoTitsForTat stategy by adding a probability of forgiveness based off of the ratio of the opponent's cooperations to total moves (so their rough probability of cooperation).

Link to original request: #1027

…ing TwoTitsForTat stategy by adding a probability of forgiveness based off of the ratio of the opponent's cooperations to total moves (so their rough probability of cooperation).

Signed-off-by: FakeNameSE <grantlycee@gmail.com>
@FakeNameSE
Copy link
Member Author

My edits passed the doctest and my unittest locally.

@FakeNameSE FakeNameSE changed the title Added a new DynamicTwoTitsForTat strategy which modifies the preexist… Add a new DynamicTwoTitsForTat strategy May 28, 2017
Copy link
Member

@drvinceknight drvinceknight left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking really good @FakeNameSE!

A couple of little things from me. The tests are failing at an integration test that checks that all deterministic strategies act the same way when repeated. However this strategy does not but that's simply because it hasn't been classified correctly (I've noted where it should have "stochastic": True).

I think I'm ok with the name because it is punishing a defection with 2 defections. 👍

Let me know if anything I've asked for isn't clear. Looking forward to getting this strategy in :)

if D in opponent.history[-2:]:
# Probability of turning the other cheek based off of
# opponent's probability of defection
if random.random() < (opponent.cooperations / len(opponent.history)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replace with random_choice(opponent.cooperations / len(opponent.history))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are internal reasons why we use random_choice. It is implemented in a way that it won't sample a random number if the probability is 0 or 1, which in turn means that the random state won't be offset when strategies act deterministicaly :)

For example: if we were to play a Defector there would not actually be anything random here (the probability would always be 0) , so random_choice would not actually sample any random numbers.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Unfortunately, after changing that I now get:

FAIL: test_strategy (axelrod.tests.strategies.test_titfortat.TestDynamicTwoTitsForTat)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/grant/Axelrod/axelrod/tests/strategies/test_titfortat.py", line 132, in test_strategy
    self.second_play_test(rCC=C, rCD=D, rDC=C, rDD=D)
  File "/home/grant/Axelrod/axelrod/tests/strategies/test_player.py", line 484, in second_play_test
    rCD, C, D, seed=seed)
  File "/home/grant/Axelrod/axelrod/tests/strategies/test_player.py", line 361, in test_responses
    test_class.assertEqual(s1, response)
AssertionError: 'C' != 'D'
- C
+ D

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your player tests are testing things sufficiently so I'd just remove the second_play_test here (the fact that it failed is based on the history being different and the probabilities being sampled correctly).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I now get

======================================================================
FAIL: test_strategy (axelrod.tests.strategies.test_titfortat.TestDynamicTwoTitsForTat)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/grant/Axelrod/axelrod/tests/strategies/test_titfortat.py", line 139, in test_strategy
    self.versus_test(opponent, expected_actions=actions)
  File "/home/grant/Axelrod/axelrod/tests/strategies/test_player.py", line 535, in versus_test
    self.assertEqual(match.play(), expected_actions)
AssertionError: Lists differ: [('C', 'D'), ('C', 'C'), ('C', 'C'), ('C', 'D'), ('C', 'C')] != [('C', 'D'), ('D', 'C'), ('D', 'C'), ('C', 'D'), ('D', 'C')]

First differing element 1:
('C', 'C')
('D', 'C')

- [('C', 'D'), ('C', 'C'), ('C', 'C'), ('C', 'D'), ('C', 'C')]
+ [('C', 'D'), ('D', 'C'), ('D', 'C'), ('C', 'D'), ('D', 'C')]

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I always get (C,C) for the last iteration.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please push your code, it's difficult for me to debug it without seeing it (there could be an error in the strategy now or a number of other things). I'm going to sleep now but will look at it in the morning. We'll figure it out.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How should I push it? Otherwise, it is here: https://github.com/FakeNameSE/Axelrod

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That link shows me the same branch at the same stage of what I'm seeing on this PR. Commit your changes to the same branch and push them (they'll automatically appear here). Let me know if you don't know what I mean and I'll give a detailed explanation when I have a moment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I forgot to, Done

# Will defect twice when last turn of opponent was defection.
opponent = axelrod.MockPlayer(actions=[D, C, C, D, C])
actions = [(C, D), (D, C), (D, C), (C, D), (D, C)]
self.versus_test(opponent, expected_actions=actions)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mock player test will need to be seeded: http://axelrod.readthedocs.io/en/stable/tutorials/contributing/strategy/writing_test_for_the_new_strategy.html

self.versus_test(axelrod.Alternator(), expected_actions=actions, seed=0)

(I expect the others won't as their probabilities are 0 or 1 right?)

Once seeded can you find two seeds where the player acts differently for different seeds (thus showing that it's stochastic).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow. I choose a different seed, and then do I repeat the same test?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow. I choose a different seed, and then do I repeat the same test?

That's right with different actions.

To illustrate:

>>> p = 0.5 # for example opponent.history = [C, C, D, D]
>>> axl.seed(0)
>>> axl.random_choice(p)
'D'
>>> axl.seed(1)
>>> axl.random_choice(p)
'C'

So with different seeds the random choice gives different results.

So in your case you'd have something like:

actions = [..., (D, D)]
self.versus_test(opponent=opponent, expected_actions=actions, seed=1)
actions = [..., (C, D)]  # Different actions
self.versus_test(opponent=opponent, expected_actions=actions, seed=0)  # Different seed

This just shows that the strategy does act randomly.

name = 'Dynamic Two Tits For Tat'
classifier = {
'memory_depth': 2, # Long memory, memory-2
'stochastic': False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This strategy is stochastic (it acts randomly).

return C
if D in opponent.history[-2:]:
# Probability of turning the other cheek based off of
# opponent's probability of defection
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is ambiguous. I would remove it.

opponent with a dynamic bias based off of the opponents ratio of
cooperations to total moves (so their current probability of
cooperating towards cooporating regardless of the move
(aka: forgiveness)."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing a closing bracket.

@@ -92,6 +94,39 @@ class TwoTitsForTat(Player):
def strategy(opponent: Player) -> Action:
return D if D in opponent.history[-2:] else C

class DynamicTwoTitsForTat(Player):
"""A player starts by cooperating and then mimics previous move by
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This strategy does not mimic the previous move by the opponent. It seeks revenge if the opponent has defected in the previous two rounds.

opponent with a dynamic bias based off of the opponents ratio of
cooperations to total moves (so their current probability of
cooperating towards cooporating regardless of the move
(aka: forgiveness)."""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add (at the end of this docstring):

Names:

- DynamicTwoTitsForTat: Original name by <your name if you wish>

Similar to http://axelrod.readthedocs.io/en/stable/reference/all_strategies.html#axelrod.strategies.ann.EvolvedANN for example.

@@ -1,6 +1,8 @@
from axelrod.actions import Actions, Action
from axelrod.player import Player
from axelrod.strategy_transformers import TrackHistoryTransformer, FinalTransformer
from axelrod.random_ import random_choice
import random
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once you use random_choice, remove import random.

@drvinceknight
Copy link
Member

drvinceknight commented May 28, 2017 via email

return C
if D in opponent.history[-2:]:
# Probability of cooperating regardless
if random_choice(opponent.cooperations / len(opponent.history)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

random_choice does not return a boolean but returns either a C or a D. This should be:

125         if D in opponent.history[-2:]:                                                                                                          
126             # Probability of cooperating regardless                                                                                             
127             return random_choice(opponent.cooperations / len(opponent.history))                                                                 
128         return C     

I have checked this locally and it all works, you'll need to adjust the tests and the actions. For example the following are passing tests for this strategy:

134         opponent = axelrod.MockPlayer(actions=[D, C, D, D, C])                                                                                  
135         actions = [(C, D), (D, C), (C, D), (D, D), (D, C)]                                                                                      
136         self.versus_test(opponent, expected_actions=actions, seed=1)                                                                            
137                                                                                                                                                 
138         actions = [(C, D), (D, C), (D, D), (D, D), (C, C)]                                                                                      
139         self.versus_test(opponent, expected_actions=actions, seed=2) 

As well as those two I'd just add tests with the Defector and Cooperator (the case when this strategy does not act randomly):

140                                                                                                                                                 
141         actions = [(C, C), (C, C), (C, C), (C, C), (C, C)]                                                                                      
142         self.versus_test(axelrod.Cooperator(), expected_actions=actions)                                                                        
143                                                                                                                                                 
144         actions = [(C, D), (D, D), (D, D), (D, D), (D, D)]                                                                                      
145         self.versus_test(axelrod.Defector(), expected_actions=actions)  

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you

defectiions by opponent with a dynamic bias based off of the
opponents ratio of cooperations to total moves (so their current
probability of cooperating towards cooporating regardless of the
move (aka: forgiveness)).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blank line before Names please.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@FakeNameSE
Copy link
Member Author

FakeNameSE commented May 29, 2017

At last, thank you for your help. Everything is fixed now

@drvinceknight
Copy link
Member

At last, thank you for your help. Everything is fixed now

Great! Thanks for the work on it. I've given it a 👍, we have a two core reviewer policy so another of the core devs will take another look when they have a moment (they might raise some other points).

Looking forward to getting it in 👍

defectiions by opponent with a dynamic bias based off of the
opponents ratio of cooperations to total moves (so their current
probability of cooperating towards cooporating regardless of the
move (aka: forgiveness)).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typos and grammar:
defectiions -> defections
opponents ratio -> opponent's ratio
based off of the -> based on

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"its opponent's defectiions by opponent"

Should this be either 'its opponent's defections' or 'defections by opponent'?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"(so their current probability of cooperating towards cooporating regardless of the move (aka: forgiveness))."

Sorry, but I don't understand what that line means.

@staticmethod
def strategy(opponent):
# First move
if len(opponent.history) == 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this would be better as if not opponent.history:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI @FakeNameSE this gives some good background: https://stackoverflow.com/questions/53513/best-way-to-check-if-a-list-is-empty

(Apologies, as is was my suggestion, thanks to @meatballs for pointing it out!)

@FakeNameSE
Copy link
Member Author

Sorry about the typos. Should be fixed now.

probability of cooperating towards cooporating regardless of the
move (aka: forgiveness)).
defections with defections, but with a dynamic bias towards cooperating
based off of the opponent's ratio of cooperations to total moves
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"based off of the..." -> "based on the..."

@meatballs
Copy link
Member

Looking good. Just the one minor docstring improvement left for me.

thanks!

@meatballs
Copy link
Member

Many thanks @FakeNameSE !!

I'll send you an invitation to join the project team and you can then have the logo displayed on your github profile by visiting https://github.com/orgs/Axelrod-Python/people and changing your membership from private to public.

We also have a chat room for the project: https://gitter.im/Axelrod-Python/Axelrod

@meatballs meatballs merged commit 9edd676 into Axelrod-Python:master May 30, 2017
@FakeNameSE
Copy link
Member Author

Thank you.

@drvinceknight
Copy link
Member

Thank you.

Thanks for the contribution! 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants