In [1]:
import pandas as pd
import sympy as sym
import numpy as np
import matplotlib.pyplot as plt
import axelrod as axl
import axelrod.interaction_utils as iu

import testzd as zd

C, D = axl.Action.C, axl.Action.D

# Investigate whether or not a strategy is zero determinant.

Press and Dyson [1] introduced Zero Determinant (ZD) strategies for matches between two memory-one strategies. Their result shows that a player $p\in\mathbb{R}^4$ playing against a player $q\in\mathbb{R}^4$ can force a linear relationship between the two players' scores.

Assuming the following:

- The utilities for player $p$: $S_x = (R, S, T, P)$ and for player $q$: $S_y = (R, T, S, P)$.
- The normalised long run score for player $p$: $s_x$ and for player $q$: $s_y$.
- Given $p=(p_1, p_2, p_3, p_4)$ a transformed (but equivalent) vector: $\tilde p=(p_1 - 1, p_2 - 1, p_3, p_4)$, similarly: $\tilde q=(q_1 - 1, q_3, q_2 - 1, q_4)$ (the entries of $q$ are reordered because $q$ is indexed from the second player's point of view).

The main result of [1] is that:

if $\tilde p = \alpha S_x + \beta S_y + \gamma 1$ **or** if $\tilde q = \alpha S_x + \beta S_y + \gamma 1$ then:

$$
\alpha s_x + \beta s_y + \gamma = 0
$$

where $\alpha, \beta, \gamma \in \mathbb{R}$ and the $1$ in the two conditions above denotes the vector of ones $(1, 1, 1, 1)$.

The question arises:

**Given a strategy $p$, is it a zero determinant strategy?**

This is equivalent to finding $\alpha, \beta, \gamma \in \mathbb{R}$ such that $\tilde p = \alpha S_x + \beta S_y + \gamma 1$.

Recall that $\tilde p, S_x, S_y, 1\in\mathbb{R}^{4\times 1}$, so this corresponds to an overdetermined linear system of 4 equations in 3 unknowns:

$$\tilde p=Mx$$

where $x = (\alpha, \beta, \gamma)^T$ and:

$$
M = \begin{pmatrix}S_x, S_y, 1\end{pmatrix}\in\mathbb{R}^{4\times 3}
$$
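Because $M$ has more rows than columns, a solution exists only when $\tilde p$ lies in the column space of $M$. The following is a minimal sketch of that check using NumPy. It assumes the conventional payoffs $(R, S, T, P) = (3, 0, 5, 1)$, the helper names `transform` and `payoff_matrix` are hypothetical (they are not part of `testzd`), and the vector `p` is an arbitrary illustration.

```python
import numpy as np

# Assumed payoffs: the conventional Prisoner's Dilemma values.
R, S, T, P = 3, 0, 5, 1


def transform(p):
    """Return tilde p = (p1 - 1, p2 - 1, p3, p4)."""
    return np.asarray(p, dtype=float) - np.array([1.0, 1.0, 0.0, 0.0])


def payoff_matrix(R, S, T, P):
    """Return M, whose columns are S_x, S_y and the vector of ones."""
    S_x = [R, S, T, P]
    S_y = [R, T, S, P]
    return np.column_stack([S_x, S_y, np.ones(4)])


M = payoff_matrix(R, S, T, P)
tilde_p = transform([0.9, 0.5, 0.3, 0.1])  # arbitrary illustrative strategy

# M x = tilde p is solvable exactly when appending tilde p as an extra
# column does not increase the rank.
rank_M = np.linalg.matrix_rank(M)
rank_augmented = np.linalg.matrix_rank(np.column_stack([M, tilde_p]))
is_exactly_zd = rank_M == rank_augmented
```

With these payoffs the three columns of $M$ are linearly independent, so when a solution exists it is unique.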

As an example consider the `extort-2` strategy defined in [2]. This is given by:

$$p=(8/9, 1/2, 1/3, 0)$$

It is defined to ensure:

$$
\begin{aligned}
s_x - P &= 2(s_y - P)\\
s_x - 2s_y + P&=0
\end{aligned}
$$

Let us solve $Mx=\tilde p$:

In [2]:
R, S, T, P = sym.S(3), sym.S(0), sym.S(5), sym.S(1)

tilde_p = sym.Matrix([sym.S(8) / 9 - 1, sym.S(1) / 2 - 1, sym.S(1) / 3, sym.S(0)])
M = sym.Matrix([[R, R, 1], 
                [S, T, 1],
                [T, S, 1], 
                [P, P, 1]])

In [3]:
system = (M, tilde_p)
symbols = sym.symbols("alpha, beta, gamma")
sym.linsolve(system, symbols)

{(1/18, -1/9, 1/18)}

This gives $\alpha = 1 / 18$, $\beta = -1/9$ and $\gamma = 1 / 18$ which ensures:

$$
\frac{1}{18} s_x - \frac{1}{9} s_y + \frac{1}{18} = 0
$$

multiplying this by 18 gives:


$$
s_x -2 s_y + 1 = 0
$$

which is the relationship described above.
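The relationship can also be checked numerically. The following is a minimal sketch under stated assumptions: it uses the payoffs $(R, S, T, P) = (3, 0, 5, 1)$ from above, an arbitrarily chosen memory-one opponent `q`, and a hypothetical helper `stationary_distribution` (not part of `testzd` or `axelrod`). It builds the Markov chain on the states (CC, CD, DC, DD) and computes the long run scores from the stationary distribution.

```python
import numpy as np

R, S, T, P = 3, 0, 5, 1
S_x = np.array([R, S, T, P], dtype=float)
S_y = np.array([R, T, S, P], dtype=float)


def stationary_distribution(p, q):
    """Stationary distribution over (CC, CD, DC, DD), seen from player p.

    Assumes the chain has a unique stationary distribution.
    """
    p = np.asarray(p, dtype=float)
    # q is indexed from its own point of view, so swap its CD and DC entries.
    q = np.asarray(q, dtype=float)[[0, 2, 1, 3]]
    # Row i holds the probabilities of moving from state i to CC, CD, DC, DD.
    transitions = np.column_stack(
        [p * q, p * (1 - q), (1 - p) * q, (1 - p) * (1 - q)]
    )
    # Solve v (transitions - I) = 0 together with sum(v) = 1.
    A = np.vstack([transitions.T - np.eye(4), np.ones(4)])
    b = np.array([0.0, 0.0, 0.0, 0.0, 1.0])
    v, *_ = np.linalg.lstsq(A, b, rcond=None)
    return v


extort_2 = [8 / 9, 1 / 2, 1 / 3, 0]
q = [0.7, 0.2, 0.6, 0.3]  # arbitrary opponent, chosen only for illustration

v = stationary_distribution(extort_2, q)
s_x, s_y = v @ S_x, v @ S_y
violation = s_x - 2 * s_y + 1  # should vanish up to floating point error
```

Changing `q` to any other vector with entries strictly between 0 and 1 should leave `violation` at zero up to rounding, which is the sense in which the relationship is enforced unilaterally.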

Note that in practice a vector $p$ might not be known exactly: it could, for example, be measured from observation. Its entries are then noisy real values rather than exact rationals: $p\in\mathbb{R}^{4\times 1}$ but $p\notin\mathbb{Q}^{4\times 1}$. Because $M$ is not a square matrix (4 equations in 3 unknowns), the linear system will then in general have no exact solution. In this case, we can find the best fitting $\bar x=(\bar\alpha, \bar\beta, \bar\gamma)$ which minimises:

$$
\delta = \|M \bar x-\tilde p\|_2= \sqrt{\sum_{i=1}^{4}\left((M\bar x)_i-\tilde p_i\right)^2}
$$

Note that $\delta$ itself becomes a measure of how close $p$ is to being a ZD strategy.

Thus we define a $\delta$-ZD strategy as a strategy for which there exists $\bar x = \text{argmin}_x\|M x-\tilde p\|_2$ such that $\|M \bar x-\tilde p\|_2\leq \delta$.
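As a minimal sketch of this definition, the least squares fit can be computed with `np.linalg.lstsq`. The function name `least_squares_fit` is hypothetical and the payoffs $(3, 0, 5, 1)$ are assumed. Whether `testzd` computes the residual in exactly this way (for example as a norm or as a sum of squares) is not confirmed here.

```python
import numpy as np


def least_squares_fit(p, R=3, S=0, T=5, P=1):
    """Return (x_bar, delta): the best fitting (alpha, beta, gamma) and the
    Euclidean norm of the residual M x_bar - tilde p."""
    M = np.array(
        [[R, R, 1],
         [S, T, 1],
         [T, S, 1],
         [P, P, 1]],
        dtype=float,
    )
    tilde_p = np.asarray(p, dtype=float) - np.array([1.0, 1.0, 0.0, 0.0])
    x_bar, *_ = np.linalg.lstsq(M, tilde_p, rcond=None)
    delta = np.linalg.norm(M @ x_bar - tilde_p)
    return x_bar, delta


# An exactly ZD strategy has a residual of (numerically) zero ...
x_bar, delta = least_squares_fit([8 / 9, 1 / 2, 1 / 3, 0])

# ... while perturbing a single entry produces a clearly positive residual.
_, delta_perturbed = least_squares_fit([8 / 9, 1, 1 / 3, 0])
```

Under this sketch, a strategy is $\delta$-ZD for every threshold at or above the returned residual norm.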

We can see that `Extort-2` is $\delta$-ZD for a very low value of $\delta$:

In [5]:
p = np.array([8 / 9, 1 / 2, 1 / 3, 0])
zd.is_delta_ZD(p, delta=10 ** -7)

True

Note that the following vector is not:

$$p = (8 / 9, 1, 1 / 3, 0)$$

In [8]:
zd.is_delta_ZD(np.array([8 / 9, 1, 1 / 3, 0]), delta=10 ** -7)

False

Furthermore, we can simulate a match between strategies and measure the probabilities empirically:

In [9]:
players = axl.ZDExtort2(), axl.Alternator()
match = axl.Match(players, turns=10 ** 6)
axl.seed(0)
interactions = match.play()

In [10]:
state_counter = iu.compute_normalised_state_to_action_distribution(interactions)[0]
p = np.array([state_counter[(state, C)] for state in ((C, C), (C, D), (D, C), (D, D))])
p

array([0.88787388, 0.49963841, 0.33404832, 0.        ])
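The measurement above amounts to counting, for each state, how often the player cooperates on the following turn. A dependency-free sketch of that counting is given below. The function name `estimate_memory_one` and the six-turn history are made up for illustration, and returning `nan` for states that were never visited is a choice of this sketch rather than documented behaviour of `axelrod`.

```python
from collections import Counter

STATES = (("C", "C"), ("C", "D"), ("D", "C"), ("D", "D"))


def estimate_memory_one(interactions):
    """Estimate P(C | state) for the first player.

    `interactions` is a list of (own action, opponent action) pairs.
    """
    visits = Counter()
    cooperations = Counter()
    # Pair every turn with the turn that follows it.
    for state, (next_action, _) in zip(interactions, interactions[1:]):
        visits[state] += 1
        if next_action == "C":
            cooperations[state] += 1
    return [
        cooperations[state] / visits[state] if visits[state] else float("nan")
        for state in STATES
    ]


history = [("C", "C"), ("C", "D"), ("D", "C"), ("C", "C"), ("D", "D"), ("D", "C")]
estimate = estimate_memory_one(history)
```

The final turn of the history contributes no observation because it has no successor.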

We see that the $p$ measured from the simulated match is not $\delta$-ZD for $\delta=10^{-7}$:

In [12]:
zd.is_delta_ZD(p, delta=10 ** -7)

False

However, it is for $\delta=10^{-2}$:

In [13]:
zd.is_delta_ZD(p, delta=10 ** -2)

True

In fact the lowest $\delta$ for which $p$ is $\delta$-ZD is $\delta=10^{-4}$:

In [15]:
zd.find_lowest_delta(p)

0.0001

Let us consider a few other strategies and tournaments and evaluate how close the observed behaviour is to ZD:

## Empirical observation

Let us consider the latest tournament of the Axelrod project: awaiting data collection.

## Evaluate the Stewart and Plotkin tournament

Firstly let us look at the tournament of [2].

In [16]:
import dask as da
import dask.dataframe as dd

In [12]:
columns = ["Player index", 
           "Opponent index", 
           "Player name", 
           "Opponent name", 
           "Turns", 
           "Score", 
           "CC count",
           "CD count", 
           "DC count",
           "DD count",
           "CC to C count",
           "CC to D count",
           "CD to C count",
           "CD to D count",
           "DC to C count",
           "DC to D count",
           "DD to C count",
           "DD to D count",]
ddf = dd.read_csv("./data/stewart_plotkin_tournament/interactions/std/main.csv")[columns]

In [38]:
groups = ["Player index", "Opponent index"]
counts = ["CC count",
          "CD count", 
          "DC count",
          "DD count",
          "CC to C count",
          "CC to D count",
          "CD to C count",
          "CD to D count",
          "DC to C count",
          "DC to D count",
          "DD to C count",
          "DD to D count",]
summation = ddf.groupby(groups)[counts].sum()
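# Run the lazy dask computation with the synchronous scheduler and
# turn the group keys back into ordinary columns.
df = summation.compute(scheduler="synchronous").reset_index()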

In [39]:
df.head()

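| | Player index | Opponent index | CC count | CD count | DC count | DD count | CC to C count | CC to D count | CD to C count | CD to D count | DC to C count | DC to D count | DD to C count | DD to D count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 400000 | 0 | 0 | 0 | 399800 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 0 | 200000 | 0 | 0 | 0 | 0 | 199900 | 0 | 0 | 0 | 0 | 0 |
| 2 | 0 | 2 | 150500 | 49500 | 0 | 0 | 150424 | 0 | 49476 | 0 | 0 | 0 | 0 | 0 |
| 3 | 0 | 3 | 199900 | 100 | 0 | 0 | 199800 | 0 | 100 | 0 | 0 | 0 | 0 | 0 |
| 4 | 0 | 4 | 180052 | 19948 | 0 | 0 | 179958 | 0 | 19942 | 0 | 0 | 0 | 0 | 0 |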


In [40]:
df["complete"] = df["CC count"] * df["CD count"] * df["DC count"] *  df["DD count"] > 0
columns = ["Player index", "Opponent index", "complete"]
for state in ("CC", "CD", "DC", "DD"):
    column = f"P(C|{state})"
    columns.append(column)
    df[column] = df[f"{state} to C count"] / (df[f"{state} to C count"] + df[f"{state} to D count"])
df = df[columns]
df

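| | Player index | Opponent index | complete | P(C\|CC) | P(C\|CD) | P(C\|DC) | P(C\|DD) |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | False | 1.000000 | NaN | NaN | NaN |
| 1 | 0 | 1 | False | NaN | 1.000000 | NaN | NaN |
| 2 | 0 | 2 | False | 1.000000 | 1.000000 | NaN | NaN |
| 3 | 0 | 3 | False | 1.000000 | 1.000000 | NaN | NaN |
| 4 | 0 | 4 | False | 1.000000 | 1.000000 | NaN | NaN |
| 5 | 0 | 5 | False | 1.000000 | NaN | NaN | NaN |
| 6 | 0 | 6 | False | 1.000000 | NaN | NaN | NaN |
| 7 | 0 | 7 | False | 1.000000 | NaN | NaN | NaN |
| 8 | 0 | 8 | False | 1.000000 | NaN | NaN | NaN |
| 9 | 0 | 9 | False | 1.000000 | NaN | NaN | NaN |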


In [45]:
np.NaN

nan

In [46]:
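# Columns holding the measured memory-one probabilities.
probabilities = ["P(C|CC)", "P(C|CD)", "P(C|DC)", "P(C|DD)"]
epsilons = []
for index, row in df.iterrows():
    # A missing probability means that state was never visited,
    # so no estimate of p is available for this pair of players.
    if row.isnull().any():
        epsilons.append(np.nan)
    else:
        epsilons.append(zd.find_lowest_epsilon(row[probabilities]))
df["epsilon"] = epsilons

In [48]:
df[df["complete"]]

| | Player index | Opponent index | complete | P(C\|CC) | P(C\|CD) | P(C\|DC) | P(C\|DD) | epsilon |
|---|---|---|---|---|---|---|---|---|
| 30 | 2 | 2 | True | 0.889151 | 0.476562 | 0.292969 | 0.000000 | 0.0428 |
| 31 | 2 | 3 | True | 0.888672 | 0.538462 | 0.333962 | 0.000000 | 0.0263 |
| 32 | 2 | 4 | True | 0.892458 | 0.478632 | 0.301587 | 0.000000 | 0.0390 |
| 33 | 2 | 5 | True | 0.889247 | 0.448276 | 0.321101 | 0.000000 | 0.0430 |
| 34 | 2 | 6 | True | 0.892711 | 0.534884 | 0.347339 | 0.000000 | 0.0288 |
| 35 | 2 | 7 | True | 0.901355 | 0.525424 | 0.371069 | 0.000000 | 0.0297 |
| 36 | 2 | 8 | True | 0.889012 | 0.480769 | 0.270000 | 0.000000 | 0.0552 |
| 37 | 2 | 9 | True | 0.890116 | 0.505582 | 0.333448 | 0.000000 | 0.0026 |
| 38 | 2 | 10 | True | 0.889754 | 0.342857 | 0.312865 | 0.000000 | 0.1193 |
| 39 | 2 | 11 | True | 0.884793 | 0.498139 | 0.331810 | 0.000000 | 0.0019 |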


In [49]:
import axelrod as axl

In [50]:
players = [s() for s in axl.strategies]

In [51]:
len(players)

225

In [53]:
len(players)

204

## References

[1] Press, William H., and Freeman J. Dyson. "Iterated Prisoner’s Dilemma contains strategies that dominate any evolutionary opponent." Proceedings of the National Academy of Sciences 109.26 (2012): 10409-10413.

[2] Stewart, Alexander J., and Joshua B. Plotkin. "Extortion and cooperation in the Prisoner’s Dilemma." Proceedings of the National Academy of Sciences 109.26 (2012): 10134-10135.