Implemented Harrington (k75r from Axelrod's Second) #1146

Merged
merged 12 commits into from
Dec 10, 2017
3 changes: 2 additions & 1 deletion axelrod/strategies/_strategies.py
@@ -9,7 +9,7 @@
    UnnamedStrategy, SteinAndRapoport, TidemanAndChieruzzi)
from .axelrod_second import (
    Champion, Eatherley, Tester, Gladstein, Tranquilizer, MoreGrofman,
-    Kluepfel, Borufsen, Cave, WmAdams, GraaskampKatzen, Weiner)
+    Kluepfel, Borufsen, Cave, WmAdams, GraaskampKatzen, Weiner, Harrington)
from .backstabber import BackStabber, DoubleCrosser
from .better_and_better import BetterAndBetter
from .bush_mosteller import BushMosteller
@@ -186,6 +186,7 @@
    HardProber,
    HardTitFor2Tats,
    HardTitForTat,
+    Harrington,
    HesitantQLearner,
    Hopeless,
    Inverse,
294 changes: 294 additions & 0 deletions axelrod/strategies/axelrod_second.py
@@ -975,3 +975,297 @@ def strategy(self, opponent: Player) -> Action:
self.defect_padding = 0

return self.try_return(opponent.history[-1])


class Harrington(Player):
    """
    Strategy submitted to Axelrod's second tournament by Paul Harrington (K75R)
    and came in eighth in that tournament.

    This strategy has three modes: Normal, Fair-weather, and Defect. These
    mode names were not present in Harrington's submission.

    In Normal and Fair-weather modes, the strategy begins by:

    - Update history
    - Detects random if turn is multiple of 15 and >=30.
Member

randomly

Member

Detects random opponent? (Not overly fussed)

    - Check if `burned` flag should be raised.
    - Check for Fair-weather opponent if turn is 38.

    Updating history means to increment the correct cell of the `move_history`.
    `move_history` is a matrix where the columns are the opponent's previous
    move and rows are indexed by the combo of this player and the opponent's
    moves two turns ago*. [The upper-left cell must be all cooperations, but
    otherwise order doesn't matter.] * If the player is exiting Defect mode,
    then the history to determine the row is taken from before the turn that
    the player entered Defect mode. (That is, the turn that started in Normal
    mode, but ended in Defect mode.)

    If the turn is a multiple of 15 and >=30, then attempt to detect random.
    If random is detected, enter Defect mode and defect immediately. If the
    player was previously in Defect mode, then do not re-enter. The random
    detection logic is a modified Pearson's Chi Squared test, with some
    additional checks. [More details in `detect_random` docstrings.]

    Some of this player's moves are marked as "generous." If this player made
    a generous move two turns ago and the opponent replied with a Defect, then
    raise the `burned` flag. This will stop certain generous moves later.

    The player mostly plays Tit-for-Tat for the first 36 moves, then defects on
    the 37th move. If the opponent cooperates on the first 36 moves, and
    defects on the 37th move also, then enter Fair-weather mode and cooperate
    this turn. Entering Fair-weather mode is extremely rare, since this can
    only happen if the opponent cooperates for the first 36 then defects
    unprovoked on the 37th. (That is, this player's first 36 moves are also
    Cooperations, so there's nothing really to trigger an opponent Defection.)

    Next in Normal Mode:

    1. Check for defect and parity streaks.
    2. Check if cooperations are scheduled.
    3. Otherwise,

       - If turn < 37, Tit-for-Tat.
       - If turn = 37, defect, mark this move as generous, and schedule two
         more cooperations**.
       - If turn > 37, then if `burned` flag is raised, then Tit-for-Tat.
         Otherwise, Tit-for-Tat with probability 1 - `prob`. And with
         probability `prob`, defect, schedule two cooperations, mark this move
         as generous, and increase `prob` by 5%.

    ** Scheduling two cooperations means to set `more_coop` flag to two. If in
    Normal mode and no streaks are detected, then the player will cooperate and
    lower this flag, until hitting zero. It's possible that the flag can be
    overwritten. Notable on the 37th turn defect, this is set to two, but the
    38th turn Fair-weather check will set this.

    If the opponent's last twenty moves were defections, then defect this turn.
    Then check for a parity streak, by flipping the parity bit (there are two
    streaks that get tracked which are something like odd and even turns, but
    this flip bit logic doesn't get run every turn), then incrementing the
    parity streak that we're pointing to. If the parity streak that we're
    pointing to is then greater than `parity_limit` then reset the streak and
    cooperate immediately. `parity_limit` is initially set to five, but after
    its been hit eight times, it decreases to three. The parity streak that
Member

it has

    we're pointing to also gets incremented if in normal mode and WE defect but
Member

we

    not on turn 38, unless the result of a defect streak. Note that the parity
Member

unless we are defecting as a result of defect streak?

Member Author

Right. I'll clarify.

    streaks reset but the defect streak doesn't.
Member

resets


    If `more_coop` >= 1, then we cooperate and lower that flag here, in Normal
    mode after checking streaks. Still lower this flag if cooperating as the
    result of a parity streak or in Fair-weather mode.

    Then use the logic based on turn from above.

    In Fair-Weather mode after running the code from above, check if opponent
    defected last turn. If so, exit Fair-Weather mode, and proceed THIS TURN
    with Normal mode. Otherwise cooperate.

    In Defect mode, update the `exit_defect_meter` (originally zero) by
    incrementing if opponent defected last turn and decreasing by three
    otherwise. If `exit_defect_meter` is then 11, then set mode to Normal (for
    future turns), cooperate and schedule two more cooperations. [Note that
    this move is not marked generous.]

    Names:

    - Harrington: [Axelrod1980b]_
    """

    name = "Harrington"
    classifier = {
        'memory_depth': float('inf'),
        'stochastic': True,
        'makes_use_of': set(),
        'long_run_time': False,
        'inspects_source': False,
        'manipulates_source': False,
        'manipulates_state': False
    }

    def __init__(self):
Member

Some inline comments about these variables would be helpful

        super().__init__()
        self.mode = "Normal"
        self.recorded_defects = 0
        self.exit_defect_meter = 0
        self.coops_in_first_36 = None
        self.was_defective = False

        self.prob = 0.25

        self.move_history = np.zeros([4, 2])
        self.chi_squared = None
        self.history_row = 0

        self.more_coop = 0
        self.generous_n_turns_ago = 3
        self.burned = False

        self.defect_streak = 0
        self.parity_streak = [0, 0]
        self.parity_bit = 0
        self.parity_limit = 5
        self.parity_hits = 0

    def try_return(self, to_return, lower_flags=True, inc_parity=False):
        if lower_flags and to_return == C:
Member

docstring please

            self.more_coop -= 1
            self.generous_n_turns_ago += 1

        if inc_parity and to_return == D:
            self.parity_streak[self.parity_bit] += 1

        return to_return

    def detect_random(self, turn):
        """
        Calculates a modified Pearson's Chi Squared statistic on self.history,
        and returns True (is random) if and only if the statistic is less than
        or equal to 3.

        Pearson's Chi Squared statistic = sum[ (E_i-O_i)^2 / E_i ], where O_i
        are the observed matrix values, and E_i is calculated as number (of
        defects) in the row times the number in the column over (total number
        in the matrix minus 1).

        We say this is modified because it differs from a usual Chi-Squared
        test in that:

        - It divides by turns minus 2 to get expected, whereas usually we'd
Member

to get expected ______?

Member Author

Actually this whole bullet should be removed; I was miscounting.

          divide by matrix total. Total equals turns minus 1, unless Defect
          mode has been entered at any point.
        - Terms where expected counts are less than 1 get excluded.
        - There's a check at the beginning on the first cell of the matrix.
        - There's a check at the beginning for the recorded number of defects.

        """
        denom = turn - 2

        if self.move_history[0, 0] / denom >= 0.8:
            return False
        if self.recorded_defects / denom < 0.25 or self.recorded_defects / denom > 0.75:
            return False

        expected_matrix = np.outer(self.move_history.sum(axis=1), \
Member

no need for , please align spacing

Member Author

No need for... Backslash? Space?

Member

backslash \

                                   self.move_history.sum(axis=0))

        chi_squared = 0.0
        for i in range(4):
Member

A comment about what's going on in this nested loop would be helpful

            for j in range(2):
                expct = expected_matrix[i, j] / denom
Member

let's use expect or something other than "expct"

                if expct > 1.0:
                    chi_squared += (expct - self.move_history[i, j]) ** 2 / expct

        self.chi_squared = round(chi_squared, 3) # For testing
Member

Is this round really necessary?

Member Author

I'm not sure if this is proper, but I'm using self.chi_squared just to help testing. I'm asserting in the test file that it equals the value I'd expect; which is why I rounded. I can remove this if you want.

Member

It would be better to break this out into a function that can be directly tested; otherwise let's at least have a more detailed comment like "Caching value only for testing purposes, not used otherwise" (and also in the __init__ function). I'd prefer a separate function.

Member

I'd prefer a separate function.

I agree. 👍

Member Author

I'd prefer a separate function.

I'm not sure what you mean by this.

Member

Ok cool, so the idea would be to break def detect_random(self, turn): into two separate functions.

It would look something like this:

def detect_random(self, turn):
    chi_squared = calculate_chi_squared(...)
    return chi_squared <= 3

Then we can test the calculate_chi_squared function by itself without needing to set self.chi_squared:

    actions += [(D, C)]
    self.versus_test(axelrod.Random(0.5), expected_actions=actions, seed=10)
    # The history matrix will be [[0, 2], [5, 6], [3, 6], [4, 2]]

    self.assertEqual(round(axelrod_second.calculate_chi_squared([relevant attributes]), 3), 2.395)

The calculate_chi_squared function would take relevant attributes as inputs and could just be a function in the axelrod_second module (not necessarily a method).

Does that make sense?
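
For reference, the split suggested above might look roughly like the following. This is a dependency-free sketch, not the code that was merged: `calculate_chi_squared` with a `(move_history, turn)` signature is an assumed shape for the helper, plain nested lists stand in for the numpy matrix, and the two preliminary checks (first cell and recorded defects) are left out. The turn number 30 is also an assumption, chosen because the example matrix holds 28 entries and the code divides by `turn - 2`.

```python
def calculate_chi_squared(move_history, turn):
    """Modified Pearson chi-squared statistic for a 4x2 move-history matrix.

    Hypothetical helper: `move_history` is a list of four [C, D] count rows
    and `turn` is the current turn number.
    """
    denom = turn - 2
    row_sums = [sum(row) for row in move_history]
    col_sums = [sum(col) for col in zip(*move_history)]

    chi_squared = 0.0
    for i, row in enumerate(move_history):
        for j, observed in enumerate(row):
            expected = row_sums[i] * col_sums[j] / denom
            # Cells with an expected count of 1 or less are skipped,
            # matching the `expct > 1.0` guard in the diff.
            if expected > 1.0:
                chi_squared += (expected - observed) ** 2 / expected
    return chi_squared


def detect_random(move_history, turn):
    """Return True when the statistic is at most 3 (opponent looks random)."""
    return calculate_chi_squared(move_history, turn) <= 3


history = [[0, 2], [5, 6], [3, 6], [4, 2]]
statistic = round(calculate_chi_squared(history, turn=30), 3)
print(statistic)
```

Under those assumptions the rounded statistic agrees with the 2.395 quoted in the comment above, so the helper can be asserted on directly without caching a value on the player.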

Member Author

It does... However, I'm using the chi-squared value to test both the chi-squared calculation and the history matrix. I tried passing history_matrix as an attrs to the versus_test, but I guess assert_equal has a hard time with numpy matrices...

Member

It does... However, I'm using the chi-squared value to test both the chi-squared calculation and the history matrix. I tried passing history_matrix as an attrs to the versus_test, but I guess assert_equal has a hard time with numpy matrices...

Yup, testing numpy matrix equality isn't always what you'd hope:

>>> A = np.array([[1, 6], [4, 3]])
>>> A == A
array([[ True,  True],
       [ True,  True]], dtype=bool)
>>> np.array_equal(A, A)
True

but I don't think that's important here, you can still use those inputs to the chi_squared calculation? Don't worry if you don't pass them to the versus_test.

(A note for later/elsewhere: it would be straightforward to add the ability to test array_equal for numpy arrays in versus_test: just check the type etc...)

Member Author

Now that I know it's working, it's fine, but when I was testing, I wanted to know that the history_matrix was getting filled out correctly. [Lots of different matrices will give chi_squared < 3.]
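
One way to check the matrix independently of the statistic is to rebuild it from the two move lists. The sketch below is an illustration only: `build_move_history` is a hypothetical helper, it uses nested lists rather than numpy, and it assumes Defect mode is never entered (the diff skips history updates while in that mode). The row index follows the diff's `history_row` rule (add 1 if the opponent defected, add 2 if this player defected) and the column is the opponent's following move.

```python
C, D = "C", "D"


def build_move_history(own_moves, opp_moves):
    """Rebuild the 4x2 count matrix from two equal-length move lists.

    Hypothetical helper that ignores Defect mode. The row is taken from both
    players' moves on one turn; the column is the opponent's next move.
    """
    matrix = [[0, 0] for _ in range(4)]
    for k in range(1, len(opp_moves)):
        row = (2 if own_moves[k - 1] == D else 0) + (1 if opp_moves[k - 1] == D else 0)
        col = 1 if opp_moves[k] == D else 0
        matrix[row][col] += 1
    return matrix


own = [C, C, D, D, C]
opp = [C, D, D, C, C]
matrix = build_move_history(own, opp)
print(matrix)  # [[0, 1], [0, 1], [1, 0], [1, 0]]
```

Five moves give four entries, which is consistent with the code dividing by `turn - 2` on turn six.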


        if chi_squared > 3:
            return False
        return True

    def detect_streak(self, last_move):
        """
        Return if and only if the opponent's last twenty moves are defects.
Member

defections

"""

        if last_move == D:
            self.defect_streak += 1
        else:
            self.defect_streak = 0
        if self.defect_streak >= 20:
            return True
        return False

    def detect_parity_streak(self, last_move):
        self.parity_bit = 1 - self.parity_bit # Flip bit
Member

docstring please

Member

two spaces before # (and several more below)

        if last_move == D:
            self.parity_streak[self.parity_bit] += 1
        else:
            self.parity_streak[self.parity_bit] = 0
        if self.parity_streak[self.parity_bit] >= self.parity_limit:
            return True

    def strategy(self, opponent: Player) -> Action:
        turn = len(self.history) + 1

        if turn == 1:
            return C

        if self.mode == "Defect":
            if opponent.history[-1] == D:
                self.exit_defect_meter += 1
            else:
                self.exit_defect_meter -= 3
            if self.exit_defect_meter >= 11:
                self.mode = "Normal"
Member

Coverage is failing because we don't have a test case that runs this code block:

axelrod/strategies/axelrod_second.py                       436      4    99%   1196-1199

(That's the output of coveralls showing that lines 1196-1199 are not hit.)

Member Author

Oh, interesting! I'll work on it.
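
To see what kind of opponent sequence would reach the uncovered block, the meter can be simulated on its own. This is a sketch under assumptions: `turns_until_exit` is a hypothetical helper, it starts counting from the first turn spent in Defect mode, and it models only the meter (plus one for an opponent defection, minus three for a cooperation, exit at 11 or more).

```python
C, D = "C", "D"


def turns_until_exit(opp_moves, threshold=11):
    """Return how many opponent moves are seen before Defect mode would end.

    Hypothetical helper mirroring the meter in the diff; returns None if the
    threshold is never reached.
    """
    meter = 0
    for seen, move in enumerate(opp_moves, start=1):
        meter += 1 if move == D else -3
        if meter >= threshold:
            return seen
    return None


print(turns_until_exit([D] * 11))        # 11
print(turns_until_exit([C] + [D] * 14))  # 15
print(turns_until_exit([D] * 10))        # None
```

Against an opponent that cooperates half the time the meter drifts down by one per turn on average, so a test that hits this branch probably needs an opponent that defects for a long stretch after random play has been detected.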

                self.was_defective = True
                self.more_coop = 2
                return self.try_return(to_return=C, lower_flags=False)

            return self.try_return(D)


        # If not Defect mode, proceed to update history and check for random,
        # check if burned, and check if opponent's fairweather.

        # History only gets updated outside of Defect mode.
        if turn > 2:
            if opponent.history[-1] == D:
                self.recorded_defects += 1
            opp_col = 1 if opponent.history[-1] == D else 0
            self.move_history[self.history_row, opp_col] += 1

        # Detect random
        if turn % 15 == 0 and turn > 15 and not self.was_defective:
            if self.detect_random(turn):
                self.mode = "Defect"
                return self.try_return(D, lower_flags=False) # Lower_flags not used here.

        # history_row only gets updated if not in Defect mode AND not entering
        # Defect mode.
        self.history_row = 1 if opponent.history[-1] == D else 0
        if self.history[-1] == D:
            self.history_row += 2

        # If generous 2 turn ago and opponent defected last turn
Member

2 turns ago

        if self.generous_n_turns_ago == 2 and opponent.history[-1] == D:
Member

Does "generous_n_turns_ago" mean "last generous n turns ago" or something else? It's ambiguous IMO.

Member Author

Yes. I can add "last_" here.

            self.burned = True

        if turn == 38 and opponent.history[-1] == D and opponent.cooperations == 36:
            self.mode = "Fair-weather"
            return self.try_return(to_return=C, lower_flags=False)


        if self.mode == "Fair-weather":
            if opponent.history[-1] == D:
                self.mode = "Normal" # Post-Defect is not possible
                #Continue below
Member

Remove this comment

Member Author

I'm trying to express here that this is the only place in the code where, following a mode-switch, we don't immediately return a value. Instead we actually treat this turn as a "Normal" mode code. I'll take your advice if you think it's really not worth mentioning. (Or if I should say more clearly.)

Member

I don't feel strongly about it but it's clear that the code continues to run IMO

            else:
                # Never defect against a fair-weather opponent
                return self.try_return(C)

        # Continue with Normal mode

        # Check for streaks
        if self.detect_streak(opponent.history[-1]):
            return self.try_return(D, inc_parity=True)
        if self.detect_parity_streak(opponent.history[-1]):
            self.parity_streak[self.parity_bit] = 0
            self.parity_hits += 1
            if self.parity_hits >= 8:
                self.parity_limit = 3
            return self.try_return(C, inc_parity=True) # Inc parity won't get used here.

        if self.more_coop >= 1:
            return self.try_return(C, inc_parity=True)

        if turn < 37:
            return self.try_return(opponent.history[-1], inc_parity=True)
        if turn == 37:
            self.more_coop, self.generous_n_turns_ago = 2, 1
            return self.try_return(D, lower_flags=False)
        if self.burned or random.random() > self.prob:
            return self.try_return(opponent.history[-1], inc_parity=True)
        else:
            self.prob += 0.05
            self.more_coop, self.generous_n_turns_ago = 2, 1
            return self.try_return(D, lower_flags=False)