# The iterated prisoner's dilemma

Author : Philippe Mathieu, [CRISTAL Lab](http://www.cristal.univ-lille.fr), [SMAC team](https://www.cristal.univ-lille.fr/?rubrique27&eid=17), [Lille University](http://www.univ-lille.fr), email : philippe.mathieu@univ-lille.fr

Contributors : Louisa Fodil (CRISTAL/SMAC), Céline Petitpré (CRISTAL/SMAC)

Creation : 18/01/2018

## Introduction

The Prisoner's Dilemma (PD) is a **simultaneous**, **two-player**, **non-zero-sum** game, introduced by Merrill Flood & Melvin Dresher in 1950 to show that a Nash equilibrium is not always ideal. The iterated version of the game (IPD) makes it possible to express strategies that are based on the game history, and therefore to learn from the past. In 1980 Robert Axelrod organized a competition for the iterated version of the game in which one of the participants, Anatol Rapoport, submitted the famous TFT strategy. This iterated version and the TFT strategy were popularized in 1984 in Robert Axelrod's book "The Evolution of Cooperation". Since then, thousands of publications have been produced on this subject, in all fields!

This notebook aims to show how to define and compare strategies for this game.

We consider here a `Game` class that encodes a game.

----
### Game(tab, actions)
- `tab` : the list of score pairs
- `actions` : the list of possible actions (the pure strategies of the one-shot game)

**Methods**
- `getDominantStrategies(self, strict='True')` which prints a list of indexes of non-dominated strategies and returns a new Game with this new matrix
- `getNash(self)` which returns a list of Nash equilibrium indexes.
- `getPareto(self)` which returns a list of indexes of the Pareto optima
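The `Game` class itself lives in `../src/Game.py` and is not shown here. As a rough, independent illustration of what `getNash` and `getPareto` are meant to find, the following sketch checks every cell of the Prisoner's Dilemma by brute force. It assumes the score pairs are listed row by row, i.e. (C,C), (C,D), (D,C), (D,D); the helper names `pure_nash` and `pareto_optimal` are hypothetical and are not part of `Game`.

```python
from itertools import product

# Score pairs in the order assumed for the notebook's `dip` list:
# (C,C), (C,D), (D,C), (D,D); each pair is (row player, column player).
payoffs = [(3, 3), (0, 5), (5, 0), (1, 1)]
actions = ['C', 'D']
n = len(actions)

def payoff(i, j):
    """Score pair when the row player plays actions[i] and the column player actions[j]."""
    return payoffs[i * n + j]

def pure_nash():
    """Hypothetical helper: cells where no player gains by deviating alone."""
    found = []
    for i, j in product(range(n), repeat=2):
        row_ok = all(payoff(k, j)[0] <= payoff(i, j)[0] for k in range(n))
        col_ok = all(payoff(i, k)[1] <= payoff(i, j)[1] for k in range(n))
        if row_ok and col_ok:
            found.append((actions[i], actions[j]))
    return found

def pareto_optimal():
    """Hypothetical helper: cells that no other cell improves for one player without hurting the other."""
    cells = list(product(range(n), repeat=2))
    found = []
    for i, j in cells:
        a = payoff(i, j)
        dominated = any(
            payoff(k, l)[0] >= a[0] and payoff(k, l)[1] >= a[1] and payoff(k, l) != a
            for k, l in cells
        )
        if not dominated:
            found.append((actions[i], actions[j]))
    return found

print(pure_nash())       # [('D', 'D')]
print(pareto_optimal())  # [('C', 'C'), ('C', 'D'), ('D', 'C')]
```

The only Nash equilibrium, mutual defection, is also the only outcome that is not Pareto optimal, which is exactly the dilemma.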

In [None]:
%run ../src/Game.py

dip =[(3,3),(0,5),(5,0),(1,1)]   # Prisoner's dilemma
g = Game(dip,['C','D'])
g.getNash()

# A strategy

A strategy decides which move to play. In addition to the payoff matrix, the information available to a strategy is the moves played by both players in the past. The simplest strategies are obviously those that ignore this past, such as strategies that periodically play the same sequence of moves. To preserve the autonomy of each agent, a strategy not only provides its next move, but also takes care of storing its previous moves if necessary.


#### Let's create a class of strategies of this category
The class we define here is very simple: strategies that periodically play moves, without thinking much! For the moment, it is not linked to any game since this kind of behaviour can be found in any game.

In [None]:
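from abc import abstractmethod

class Strategy():
    # Base class: each strategy overrides the methods it needs
    def setMemory(self,mem):
        pass
    
    def getAction(self,tick):
        # returns the move to play at round `tick`
        pass
    
    def clone(self):
        # returns a fresh copy; Meeting calls clone() on both strategies
        pass

    def update(self,x,y):
        # called after each round with my move x and the opponent's move y
        pass
    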

class Periodic(Strategy):
    def __init__(self, sequence, name=None):
        super().__init__()
        self.sequence = sequence.upper()
        self.step = 0
        self.name = "per_"+sequence if (name == None) else name

    def getAction(self,tick):
        return self.sequence[tick % len(self.sequence)]

    def clone(self):
        # avoid shadowing the built-in name `object`
        return Periodic(self.sequence, self.name)

    def update(self,x,y):
        pass
    
print("All is ok")

#### Let's test our Periodic class

In [None]:
s1 = Periodic("abc")
print(s1.name,end="\t")
for i in range (0,10):
    print(s1.getAction(i), end=' ')
# there must be 10 moves. it starts with A and ends with A.    

# One meeting
A meeting pits two strategies against each other for a given number of moves in a fixed game. The score of each strategy is the sum of the scores it obtains on each move, according to the game matrix.
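To make the scoring rule concrete, here is a minimal stand-alone sketch that does not use the notebook's `Game` or `Meeting` classes. The names `PAYOFF`, `periodic_moves` and `score_meeting` are introduced only for this illustration; the payoff values are those of the Prisoner's Dilemma used throughout.

```python
# Payoff pairs (player 1, player 2) for each pair of moves
PAYOFF = {
    ('C', 'C'): (3, 3),
    ('C', 'D'): (0, 5),
    ('D', 'C'): (5, 0),
    ('D', 'D'): (1, 1),
}

def periodic_moves(sequence, length):
    """Moves of a periodic strategy that repeats `sequence` for `length` rounds."""
    return [sequence[t % len(sequence)] for t in range(length)]

def score_meeting(moves1, moves2):
    """Sum the per-move payoffs of both players."""
    s1 = s2 = 0
    for a, b in zip(moves1, moves2):
        p1, p2 = PAYOFF[(a, b)]
        s1 += p1
        s2 += p2
    return s1, s2

result = score_meeting(periodic_moves("CCD", 10), periodic_moves("DDC", 10))
print(result)  # (15, 35)
```

Over 10 moves, "CCD" defects 3 times against a cooperating opponent (3 × 5 = 15) and is exploited 7 times, while "DDC" collects 7 × 5 = 35.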

In [None]:
class Meeting :      
    def __init__(self,game,s1,s2,length=1000):
        self.game = game
        self.s1=s1.clone()
        self.s2=s2.clone()
        self.length=length
        self.nb_cooperation_s1 = 0
        self.nb_cooperation_s2 = 0
        
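    def reinit(self):
        # reset everything a run accumulates, including the cooperation counters
        self.s1_score=0
        self.s2_score=0
        self.s1_rounds=""
        self.s2_rounds=""
        self.nb_cooperation_s1 = 0
        self.nb_cooperation_s2 = 0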
    
    def run(self):
        self.reinit()
        for tick in range(0,self.length):
            c1=self.s1.getAction(tick).upper()
            c2=self.s2.getAction(tick).upper()
            if (c1 == "C"):
                self.nb_cooperation_s1 +=1
            if (c2 == "C"):
                self.nb_cooperation_s2 +=1
            self.s1_rounds+=c1
            self.s2_rounds+=c2
            self.s1.update(c1,c2)
            self.s2.update(c2,c1)
            act=self.game.actions
            self.s1_score+=self.game.scores['x'][act.index(c1),act.index(c2)]
            self.s2_score+=self.game.scores['y'][act.index(c1),act.index(c2)]
            
print("All is ok")

A meeting between two strategies is trivial: we initialize a game, we create 2 strategies, and we pass them on to the Meeting.

In [None]:
dip =[(3,3),(0,5),(5,0),(1,1)]   # Prisoner's dilemma
g = Game(dip,['C','D'])
s1=Periodic("CCD")
s2=Periodic("DDC")
m = Meeting(g,s1,s2,10)
m.run()
print(m.s1.name+"\t"+m.s1_rounds+" "+str(m.s1_score))
print(m.s2.name+"\t"+m.s2_rounds+" "+str(m.s2_score))
# We must get 15,35
print()
print("Number of cooperations : " )
print (m.s1.name+"\t" + str(m.nb_cooperation_s1))
print (m.s2.name+"\t" + str(m.nb_cooperation_s2))

# A tournament

A tournament is applied to a set of strategies. It consists of bringing together every pair of strategies in a meeting, including each strategy against itself. This kind of tournament is called a "round-robin" tournament. In this way, a square matrix of scores is filled in. In such a tournament, the score of each strategy is the sum of the scores it has obtained. The winning strategy is the one with the highest score.

A tournament is now defined using 4 parameters: the game to which it is applied, all the strategies evaluated, the length of the games, the number of repetitions performed if needed.
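The round-robin principle can be sketched independently of the notebook's classes. The version below is a simplified illustration restricted to periodic strategies given as strings; `PAYOFF`, `my_score` and `round_robin` are names chosen for this sketch only.

```python
# Payoff of the Prisoner's Dilemma, keyed by (my move, opponent's move)
PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

def my_score(mine, other, length):
    """Score of the periodic sequence `mine` against `other` over `length` moves."""
    return sum(PAYOFF[(mine[t % len(mine)], other[t % len(other)])]
               for t in range(length))

def round_robin(sequences, length):
    """Total of each sequence against every sequence, itself included."""
    return {a: sum(my_score(a, b, length) for b in sequences) for a in sequences}

totals = round_robin(['C', 'D', 'DDC', 'CCD'], 10)
ranking = sorted(totals.items(), key=lambda item: item[1], reverse=True)
print(ranking)  # [('D', 120), ('DDC', 102), ('CCD', 78), ('C', 60)]
```

Each total is a row sum of the score matrix, so the meeting of a strategy against itself is counted once.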

In [None]:
import pandas as pd
import numpy as np

class Tournament:
    def __init__(self, game, strategies, length=1000, repeat=1):
        self.strategies = strategies
        self.game = game
        self.length=length
        self.repeat=repeat
        size=len(strategies);
        df = pd.DataFrame(np.zeros((size,size+1),dtype=np.int32))
        df.columns, df.index = [s.name for s in self.strategies]+["Total"], [s.name for s in self.strategies]
        self.matrix = df
        df2 = pd.DataFrame(np.zeros((size,size+1),dtype=np.int32))
        df2.columns, df2.index = [s.name for s in self.strategies]+["Total"], [s.name for s in self.strategies]
        self.cooperations = df2

    def run(self):
        for k in range(self.repeat):
            for i in range (0,len(self.strategies)):
                for j in range (i,len(self.strategies)):
                    meet = Meeting(self.game, self.strategies[i], self.strategies[j], self.length)
                    meet.run()
                    self.matrix.at[self.strategies[i].name, self.strategies[j].name] = meet.s1_score
                    self.matrix.at[self.strategies[j].name, self.strategies[i].name] = meet.s2_score
                    self.cooperations.at[self.strategies[i].name, self.strategies[j].name] = meet.nb_cooperation_s1
                    self.cooperations.at[self.strategies[j].name, self.strategies[i].name] = meet.nb_cooperation_s2
        self.matrix["Total"] = self.matrix.sum(axis=1)
        self.matrix.sort_values(by='Total', ascending=False, inplace=True)
        rows = list(self.matrix.index) + ["Total"]
        self.matrix = self.matrix.reindex(columns=rows)
        self.cooperations["Total"] = self.cooperations.sum(axis=1)
        self.cooperations.sort_values(by='Total', ascending=False, inplace=True)
        rows = list(self.cooperations.index) + ["Total"]
        self.cooperations = self.cooperations.reindex(columns=rows)
        
print("All is ok")

#### Let's do a tournament.
The Prisoner's Dilemma is a **non-zero-sum** game: what one player wins is not what the other loses, and the total payoff varies from one situation to another. It is important to note that in the tournament we do not count the number of wins but the total scores!

In [None]:
bag = []
bag.append(Periodic('C'))
bag.append(Periodic('D'))
bag.append(Periodic('DDC'))
bag.append(Periodic('CCD'))
t=Tournament(g,bag,10)
t.run()
print("The score matrix: ")
print(t.matrix)
print()
# OVER 10 MOVES: [('per_D', 120), ('per_DDC', 102), ('per_CCD', 78), ('per_C', 60)]
print("The cooperation matrix: ")
print(t.cooperations)

## Generate sets of strategies

Putting together a "soup" of strategies to test the performance of a given strategy can be considered subjective. The best-case scenario is to build indisputable soups, for example by building the set of all strategies that meet a global constraint.
For example, we can build the set of all periodic strategies of period 1 (there are 2) and/or period 2 (there are $2^2=4$) and/or period 3 (there are $2^3=8$), or even all of them together.
(Note that this also generates CCC, which is obviously equivalent to CC or C.)
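If these duplicates matter, one possible way to detect them is to reduce each sequence to its shortest repeating block. The helper `minimal_period` below is a hypothetical addition for illustration, not something the notebook defines.

```python
from itertools import product

def minimal_period(sequence):
    """Shortest block whose repetition gives `sequence` (e.g. 'CCC' -> 'C')."""
    n = len(sequence)
    for p in range(1, n + 1):
        if n % p == 0 and sequence[:p] * (n // p) == sequence:
            return sequence[:p]

# All sequences of period 1, 2 and 3 over the moves C and D
all_sequences = [''.join(p) for k in (1, 2, 3) for p in product('CD', repeat=k)]
distinct = sorted({minimal_period(s) for s in all_sequences})

print(len(all_sequences))  # 14
print(len(distinct))       # 10
```

Of the 14 generated sequences, CC, DD, CCC and DDD reduce to C or D, which leaves 10 genuinely different behaviours. Sequences such as CD and DC remain distinct because they differ on the first move.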

In [None]:
import itertools
cards = ['C','D']
periodics = [p for p in itertools.product(cards, repeat=1)]+[p for p in itertools.product(cards, repeat=2)] + [p for p in itertools.product(cards, repeat=3)]
strats = [Periodic(''.join(p)) for p in periodics] # join to transform in strings
print(str(len(strats))+" strategies generated")
# 14 are generated: 2 with a period length of 1, 4 with a period length of 2, 8 with a period length of 3

# Ecological competitions

Tournaments give an interesting, but insufficient ranking. They do not study the robustness of a strategy with respect to the number of representatives of its opponents. Ecological competition addresses this problem: it makes it possible to vary the populations of each strategy. The principle of ecological competition is very simple. Initially, we consider `n` representatives of each of the `s` strategies evaluated. The `n*s` representatives all play against each other in a tournament. At each following step, the number of representatives of each strategy is determined in proportion to its success in the previous step. The better you are, the more descendants you will have. An ecological competition therefore requires a strategy to be robust to changes in the number of opponents, which makes an ecological ranking more "robust" than a tournament ranking. It also follows that, to be well ranked, it is preferable to play very well against your own kind, since with a little luck they will become more and more numerous.
An ecological competition can then be represented on a temporal graph, with the generations on the abscissa and the populations of each strategy on the ordinate. We will use `matplotlib` to draw these graphs.
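One generation of this process can be sketched as follows. This is a simplified stand-alone version, not the notebook's `Ecological` class: `next_generation` is a hypothetical helper, the score matrix is written by hand for `All_C` and `All_D` over meetings of 1000 moves, and integer division stands in for the rounding down of the new populations.

```python
def next_generation(pops, matrix, base):
    """New populations, proportional to each strategy's share of the total score.

    pops   : dict strategy name -> current population
    matrix : matrix[a][b] = score of one `a` in a meeting against one `b`
    base   : total population, kept (approximately) constant
    """
    scores = {}
    for s, n_s in pops.items():
        # one representative of s meets everybody except itself
        scores[s] = sum((n_o - 1 if o == s else n_o) * matrix[s][o]
                        for o, n_o in pops.items()) if n_s > 0 else 0
    total = sum(scores[s] * pops[s] for s in pops)
    if total == 0:
        return dict(pops)
    return {s: base * pops[s] * scores[s] // total for s in pops}

# Scores over 1000 moves with the payoffs 3 (C,C), 0 (C,D), 5 (D,C), 1 (D,D)
matrix = {
    "All_C": {"All_C": 3000, "All_D": 0},
    "All_D": {"All_C": 5000, "All_D": 1000},
}
pops = {"All_C": 100, "All_D": 100}
gen1 = next_generation(pops, matrix, base=200)
print(gen1)  # {'All_C': 66, 'All_D': 133}
```

Because of the rounding down, the total population can shrink slightly from one generation to the next (here from 200 to 199).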

In [None]:
import pandas
import copy
import datetime   # needed by saveData
import math
import matplotlib.pyplot as plt
%matplotlib inline

class Ecological:
    def __init__(self, game, strategies, length=1000, repeat=1, pop=100):
        self.strategies = strategies
        self.pop = pop
        self.game = game
        self.length = length
        self.generation = 0 # Number of the current generation
        self.base = pop*len(strategies)
        self.historic = pandas.DataFrame(columns = [strat.name for strat in strategies])
        self.historic.loc[0] = [pop for x in range (len(strategies))]
        self.extinctions = dict((s.name,math.inf) for s in strategies)
        self.cooperations =  dict((s.name,0) for s in strategies)
        self.listeCooperations = list()
        self.scores = dict((s.name,0) for s in strategies)
        self.tournament = Tournament(self.game, self.strategies,length,repeat)
        self.tournament.run()
        
    def run(self):
        dead = 0
        stab = False
        while ((self.generation < 1000) and (stab==False)):
            parents = list(copy.copy(self.historic.loc[self.generation]))
            for i in range (len(self.strategies)):
                strat=self.strategies[i].name
                if (self.historic.at[self.generation, strat] != 0):
                    score = 0
                    cooperations = 0
                    for j in range(len(self.strategies)): 
                        strat2 = self.strategies[j].name
                        if (self.historic.at[self.generation, strat2] != 0):
                            if i==j:
                                score+=(self.historic.at[self.generation, strat]-1)*self.tournament.matrix.at[strat,strat2]
                                cooperations+=(self.historic.at[self.generation, strat]-1)*self.tournament.cooperations.at[strat,strat2]
                            else:
                                score+=self.historic.at[self.generation, strat2]*self.tournament.matrix.at[strat,strat2]
                                cooperations+=self.historic.at[self.generation, strat2]*self.tournament.cooperations.at[strat,strat2]
                        self.scores[strat] = score
                        self.cooperations[strat] = cooperations
                    
            total = 0
            totalCooperations = 0
            for strat in self.strategies:
                total+=self.scores[strat.name]*self.historic.at[self.generation, strat.name]
                totalCooperations += self.cooperations[strat.name]*self.historic.at[self.generation, strat.name]
            for strat in self.strategies:        
                parent = self.historic.at[self.generation, strat.name]
                if (self.scores[strat.name] != 0):
                    self.historic.at[self.generation+1, strat.name] = math.floor(self.base*parent*self.scores[strat.name]/total)
                elif (self.scores[strat.name] == 0):
                    self.historic.at[self.generation+1, strat.name] = 0
                    dead += 1
                if ((parent!=0) and (self.historic.at[self.generation+1, strat.name] == 0)):
                    self.extinctions[strat.name] = self.generation+1
                elif (self.historic.at[self.generation+1, strat.name] != 0):
                    self.extinctions[strat.name] = self.historic.at[self.generation+1, strat.name]*1000
                if (dead == len(self.strategies) - 1):
                    stab = True
            self.listeCooperations.append(totalCooperations/(self.base*self.length*len(self.strategies)))
            self.generation+=1
            if (parents == list(self.historic.loc[self.generation])):stab = True
        trie = sorted(self.extinctions.items(), key=lambda t:t[1], reverse=True)
        df_trie = pandas.DataFrame()
        for t in trie :
            df_trie[t[0]]=self.historic[t[0]]
        self.historic = df_trie
        return self.historic

    def saveData(self):
        date = datetime.datetime.now()
        self.historic.to_csv(str(date)+'.csv', sep=';', encoding='utf-8')

    def drawPlot(self,nbCourbes=None,nbLegends=None):
        nbCourbes = len(self.strategies) if (nbCourbes==None) else nbCourbes
        nbLegends = len(self.strategies) if (nbLegends==None) else nbLegends
        strat = self.historic.columns.tolist()
        for i in range(nbCourbes):
            plt.plot(self.historic[strat[i]], label=strat[i] if (i<nbLegends) else '_nolegend_')
        plt.legend(bbox_to_anchor=(0, 1), loc=2, borderaxespad=0.)
        plt.ylabel('Population')
        plt.xlabel('Generation')
        plt.show()
        #date = datetime.datetime.now()
        #plt.savefig(str(date)+'.png', dpi=1000)
    
    def drawCooperations(self):
        plt.plot(self.listeCooperations)
        plt.ylabel('Percentage of cooperations')
        plt.xlabel('Generation')
        plt.ylim(0, 101)
        plt.show()
      
print("All is ok")

#### Organize an ecological competition with All_C (which always cooperates) and All_D (which always betrays)
Once the competition is over, it is possible to display the evolution of the population and the evolution of cooperation.

In [None]:
gentille = Periodic("C","All_C")
mechante = Periodic("D","All_D")
eco = Ecological(g, [gentille, mechante])
eco.run()
print("Evolution of the population")
eco.drawPlot()
print("History of the population")
print(eco.historic)
print("Evolution of cooperations")
eco.drawCooperations()
print(eco.scores)

In [None]:
# EXERCISE

# Compute an ecological competition with All_C, All_D, Periodic("CDD") and Periodic("CCD")
gentille = Periodic("C","All_C")
mechante = Periodic("D","All_D")
eco = Ecological(g, [gentille, mechante, Periodic('CDD'), Periodic("CCD")])
eco.run()
print("Evolution of the population")
eco.drawPlot()
print("History of the population")
print(eco.historic)
print("Evolution of cooperations")
eco.drawCooperations()

# Reactive strategies
Strategies are called "reactive" if their actions depend on the opponent's past actions. Some of them are very simple to understand. Among the most famous are:
- `Tft` (abbreviation of "tit for tat", or "donnant-donnant" as we would say in French), which starts by cooperating and then plays whatever the opponent played on the previous round
- `Spiteful`, which cooperates as long as the opponent has cooperated, but never forgives a single betrayal

In [None]:
class Tft(Strategy):
    def __init__(self):
        super().__init__()
        self.name = "tft"
        self.hisPast=""
        
    def getAction(self,tick):
        return 'C' if (tick==0) else self.hisPast[-1]

    def clone(self):
        return Tft()

    def update(self,my,his):
        self.hisPast+=his
    
    
class Spiteful(Strategy):
    def __init__(self):
        super().__init__()
        self.name = "spiteful"
        self.hisPast=""
        self.myPast=""
        
    def getAction(self,tick):
        if (tick==0):
            return 'C'
        if (self.hisPast[-1]=='D' or self.myPast[-1]=='D') :
            return 'D'
        else :
            return 'C'

    def clone(self):
        return Spiteful()

    def update(self,my,his):
        self.myPast+=my
        self.hisPast+=his

print("All is ok")

#### Behaviour of these reactive strategies
Let's check the behavior of these two new strategies against `Periodic("CCD")` in a Meeting.

In [None]:
m = Meeting(g,Tft(),Periodic("CCD"),10)
m.run()
print(m.s1.name+"\t"+m.s1_rounds+" "+str(m.s1_score))
print(m.s2.name+"\t"+m.s2_rounds+" "+str(m.s2_score))
print("")
m = Meeting(g,Spiteful(),Periodic("CCD"),10)
m.run()
print(m.s1.name+"\t"+m.s1_rounds+" "+str(m.s1_score))
print(m.s2.name+"\t\t"+m.s2_rounds+" "+str(m.s2_score))


In [None]:
# Exercise

# Compute an ecological competition with 4 strategies: All_C, All_D, Tft and Periodic("CCD")
 
eco = Ecological(g, [Periodic('C',"All_C"),Periodic('D',"All_D"),Periodic('CCD'),Tft()])
eco.run()
plt.figure(figsize=(10,8))    # set the figure size
eco.drawPlot()
eco.drawCooperations()
# IN THIS EXPERIMENT, All_D WINS THE TOURNAMENT, BUT IT IS TFT THAT WINS THE ECOLOGICAL COMPETITION!

This experiment clearly illustrates the prey/predator phenomena that can occur. `All_D` wins the tournament, so its population increases. But its gain is mainly at the expense of `All_C`. Once the latter dies out, `All_D` declines too, for lack of strategies to exploit. The result is a magnificent trend reversal. Winning a tournament is not the same as winning an ecological competition.
If we look at the evolution of cooperation, another phenomenon appears, the emergence of generalized cooperation: without any regulatory system, we have gone from 60% cooperation, to less than 40% cooperation and finally to 100% cooperation!

Some fundamental points:
- There are an infinite number of strategies.
- There is no perfect strategy in the absolute. There are only strategies that behave well "in general". You can't play optimally against everyone, especially because of the first round.
- `Tft` never wins against anyone.
- `All_D` never loses against anyone.
- It's a non-zero-sum game: the important thing is not to win meetings but to win points! `All_D` never loses, but at what cost! Always waging war earns very few points.
- Axelrod already said it: to be good at this game you have to:
    - not be aggressive (never be the first to betray)
    - be responsive
    - know how to forgive
- This is the case of `Tft`, which behaves very well in general, but since Rapoport much better strategies have been found!
- Without any regulatory system, most of the time there is an **emergence of cooperation**.
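The two claims about `Tft` and `All_D` can be checked on a few opponents with a small stand-alone sketch. It is only an illustration on a handful of cases, not a proof, and it represents strategies as plain functions of the two histories rather than with the notebook's classes; all names below are chosen for this sketch.

```python
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5), ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def tft(my_past, its_past):
    return 'C' if not its_past else its_past[-1]

def all_c(my_past, its_past):
    return 'C'

def all_d(my_past, its_past):
    return 'D'

def periodic(sequence):
    """Build a strategy that repeats `sequence` whatever the opponent does."""
    def strategy(my_past, its_past):
        return sequence[len(my_past) % len(sequence)]
    return strategy

def meet(s1, s2, length=10):
    """Play `length` rounds and return both total scores."""
    past1, past2 = [], []
    score1 = score2 = 0
    for _ in range(length):
        a, b = s1(past1, past2), s2(past2, past1)
        past1.append(a)
        past2.append(b)
        p1, p2 = PAYOFF[(a, b)]
        score1 += p1
        score2 += p2
    return score1, score2

opponents = {"all_c": all_c, "all_d": all_d, "tft": tft, "per_CCD": periodic("CCD")}
tft_results = {name: meet(tft, opp) for name, opp in opponents.items()}
alld_results = {name: meet(all_d, opp) for name, opp in opponents.items()}
print(tft_results)   # tft's score is never above its opponent's
print(alld_results)  # all_d's score is never below its opponent's
```

Against `all_d`, for instance, `tft` scores 9 points against 14: it loses the meeting by the first round only, then both players are stuck with 1 point per move.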

In [None]:
# Accessing DataFrame data

# Careful! With a DataFrame df,
# each named column can be used as an attribute.
# With single brackets, you get a column,
# so tournoi.Total and tournoi['Total'] are the same thing.
# 
# With chained brackets, it is df[col][row],
# which can also be written df.col[row].
# With the loc, iloc and at accessors, it is df.iloc[row,col],
# so tournoi['Total'][1] is equivalent to tournoi.iloc[1,4].
# Besides, by default the DataFrame uses integers as index.
# In iloc, one gives indexes ... unless the a:b notation is used, in which case they are relative positions.

tournoi=eco.tournament.matrix
#print("--- The complete sorted tournament matrix")
#print(tournoi) 
#print("--- The tournament winners")
#print(tournoi['Total']) 
#print("--- The top 3 of the tournament")
#print(tournoi['Total'][0:3])
#print("--- The winners that scored more than 10000")
#print(tournoi['Total'][tournoi['Total']>10000])


evol=eco.historic
#print("--- The complete sorted history")
#print(evol)
#print("--- The final populations, ranked")
#print(evol.iloc[eco.generation])
#print(evol.iloc[-1])
#print(evol.tail(1))
#print("--- The top 2 of the competition")
#print(evol.iloc[-1][0:2]) 
#print("--- The last survivors")
#print(evol.iloc[-1][evol.iloc[-1]>0]) 
#print("--- the row where tft=340 ?")
#evol.loc[evol.tft==340]
#print("--- At which index do per_C and per_D cross ?")
#print(evol.loc[evol.per_C > evol.per_D].loc[evol.per_D!=0])
# Write the equivalent of select * from evol where ...
#evol.loc[(evol.tft>300) & (evol.per_D>0)]
# since pandas 0.13  ... this can be written
#evol.query('tft>300 & per_D>0')

#eco.drawPlot()


#### Two other classical reactive strategies
This time, the strategies are based on the move most often played by the opponent:
- `SoftMajority`: plays the move the opponent has played most often in the past. In case of a tie, it cooperates.
- `HardMajority`: plays the move the opponent has played most often in the past. In case of a tie, it defects.

The difference between `SoftMajority` and `HardMajority` is slight. It should be noted that `HardMajority` is spontaneously aggressive: with an empty history there is a tie, so it defects on the first move. This characteristic means that it will generally behave much less well than `SoftMajority`.

In [None]:
# EXERCISE

# Encode SoftMajority which plays what its opponent played in majority. In the event of a tie, it shall play Cooperate

class SoftMajority(Strategy):
    def __init__(self):
        super().__init__()
        self.name = "softmajo"
        self.nbCooperations = 0
        self.nbTrahisons = 0
        
    def getAction(self,tick):
        if (self.nbCooperations >= self.nbTrahisons):
            return 'C'
        else :
            return 'D'

    def clone(self):
        return SoftMajority()

    def update(self,my,his):
        if (his == 'C'):
            self.nbCooperations += 1
        elif (his == 'D'):
            self.nbTrahisons += 1
            
# Encode HardMajority which plays what its opponent played in majority. In the event of a tie, it shall Defect
            
class HardMajority(Strategy):
    def __init__(self):
        super().__init__()
        self.name = "hardmajo"
        self.nbCooperations = 0
        self.nbTrahisons = 0
        
    def getAction(self,tick):
        if (self.nbCooperations > self.nbTrahisons):
            return 'C'
        else :
            return 'D'

    def clone(self):
        return HardMajority()

    def update(self,my,his):
        if (his == 'C'):
            self.nbCooperations += 1
        elif (his == 'D'):
            self.nbTrahisons += 1
            
# Encode Gradual. Gradual cooperates on the first move, then if the opponent has just betrayed it 
# for the nth time, it enters a period of retaliation (n successive Defect moves) followed by 2 
# cooperation moves 

class Gradual(Strategy):
    def __init__(self):
        super().__init__()
        self.name = "gradual"
        self.nbTrahisons = 0
        self.punish = 0
        self.calm = 0
    def getAction(self,tick):
        if (tick==0) : return 'C'
        if self.punish > 0 :
            self.punish-=1
            return 'D'
        if self.calm > 0 :
            self.calm-=1
            return 'C'
        if self.hisLast=='D' : 
            self.punish=self.nbTrahisons - 1
            self.calm=2
            return 'D'
        else: return 'C'

    def clone(self):
        return Gradual()

    def update(self,my,his):
        self.hisLast=his
        if (his == 'D'):
            self.nbTrahisons += 1
  
print("All is ok")

In [None]:
# Test HardMajority and SoftMajority against each other in a Meeting of 20 rounds (we must obtain DCDCDCDCDCDCDCDCDCDC for hard and CDCDCDCDCDCDCDCDCDCD for soft)
m = Meeting(g,HardMajority(),SoftMajority(),20)
m.run()
print(m.s1.name+"\t"+m.s1_rounds+" "+str(m.s1_score))
print(m.s2.name+"\t"+m.s2_rounds+" "+str(m.s2_score))
print()

# test Gradual against Periodic ("CD") on 40 rounds
m = Meeting(g,Periodic("CD"),Gradual(),40)
m.run()
print(m.s1.name+"\t"+m.s1_rounds+" "+str(m.s1_score))
print(m.s2.name+"\t"+m.s2_rounds+" "+str(m.s2_score))


## The memory family

In [None]:
class Mem(Strategy):
    def __init__(self, x, y, genome, name=None):
        self.name = name
        self.x = x
        self.y = y
        self.genome = genome
        if (name == None): # The default name is the genome, used if the user does not define one
            self.name = genome
        self.myMoves = []  # contains my x last moves
        self.itsMoves = [] # contains its y last moves

    def clone(self):
        return Mem(self.x, self.y, self.genome, self.name)

    def getAction(self, tick):
        if (tick < max(self.x, self.y)):
            return self.genome[tick]
        cpt = 0
        for i in range(self.x-1,-1,-1):
            cpt*=2
            if (self.myMoves[i] == 'D'):
                cpt+=1
        for i in range(self.y-1,-1,-1):
            cpt*=2
            if (self.itsMoves[i] == 'D'):
                cpt+=1
        cpt += max(self.x, self.y)
        return self.genome[cpt]

    def update(self, myMove, itsMove):
        if (self.x > 0):
            if(len(self.myMoves) == self.x):
                del self.myMoves[0]
            self.myMoves.append(myMove)
        if (self.y > 0):
            if(len(self.itsMoves) == self.y):
                del self.itsMoves[0]
            self.itsMoves.append(itsMove)
            
print("All is ok")

It should be noted that a large number of well-known strategies are described in the form of a `memory(X,Y)`<br>
Mem(0,0,'C','allc')<br>
Mem(0,0,'D','alld')<br>
Mem(1,0,'cDC','percd')<br>
Mem(1,0,'dDC','perdc')<br>
Mem(0,1,'cCD','tft')<br>
Mem(0,1,'dCD','mistrust')<br>
Mem(1,1,'cCDDD','spiteful')<br>
Mem(1,1,'cCDDC','pavlov')<br>
Mem(0,2,'ccCCCD','tf2t')<br>
Mem(0,2,'ccCDDD','hard_tft')<br>
Mem(1,2,'ccCCCDCDDD','slow_tft')<br>
Mem(1,2,'ccCDCDDCDD','winner12')<br>
Mem(1,2,'','tft_spiteful')<br>
Mem(1,2,'ccCDDDDDDD','spiteful_cc')<br>
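To read these genomes, it helps to reproduce the index computation of `Mem.getAction` outside the class. The sketch below is an illustration only: `genome_index` and `decode_mem11` are hypothetical helpers that mirror the loops of `getAction`. The first `max(x,y)` letters of the genome are the opening moves; the remaining ones are the answers to each possible recent past.

```python
def genome_index(x, y, my_moves, its_moves):
    """Index in the genome of the answer to the given recent past.

    my_moves / its_moves hold the last x / y moves, oldest first,
    as stored by Mem.update.
    """
    cpt = 0
    for i in range(x - 1, -1, -1):        # my moves, most recent first
        cpt = 2 * cpt + (my_moves[i] == 'D')
    for i in range(y - 1, -1, -1):        # its moves, most recent first
        cpt = 2 * cpt + (its_moves[i] == 'D')
    return cpt + max(x, y)                # skip the opening moves

def decode_mem11(genome):
    """Answer of a Mem(1,1) genome to each (my last move, its last move)."""
    return {(mine, its): genome[genome_index(1, 1, [mine], [its])]
            for mine in 'CD' for its in 'CD'}

print(decode_mem11('cCDDD'))  # spiteful: cooperates only after (C, C)
print(decode_mem11('cCDDC'))  # pavlov: cooperates after (C, C) and (D, D)
```

For a `Mem(0,2)` genome such as `'ccCCCD'` (`tf2t`), the same function gives index 5 only when the opponent's last two moves are both D, which is the only case where the strategy defects.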

In [None]:
# Exercise :  Small equivalence test: M
# Make two tournaments with All_C, All_D, Tft, Spiteful, Periodic('CD') and Periodic('DC').
# The first one using Periodics, Tft() and Spiteful() classes
# The second one using only Mem(x,y,"",name) to code them.
# Check the equivalence by printing the tournament matrix

bag1 = [Periodic('C'),Periodic('D'),Tft(),Spiteful(),Periodic('CD'),Periodic('DC')]
t1=Tournament(g,bag1,100)
t1.run()
print(t1.matrix)
print()

bag2 = [Mem(0,0,'C','allc'),Mem(0,0,'D','alld'),Mem(0,1,'cCD','tft'),
        Mem(1,1,'cCDDD','spiteful'),Mem(1,0,'cDC','percd'),Mem(1,0,'dDC','perdc')]
t2=Tournament(g,bag2,100)
t2.run()
print(t2.matrix)

## Generate them all
For a `Mem(x,y)` family, the genome is of size `max(x,y)` for the first rounds plus `2^(x+y)` for all situations `s` of the past on `x` moves of one player and `y` moves of the other. So there are `2^(max(x,y)+2^(x+y))` strategies to generate. To obtain all these elements, it is therefore sufficient to compute all the possible instantiations of C and D in the genome, which is done, once again, with a Cartesian product.


| family | genome length | number of strategies |
| :-: | :-: | :-: |
| mem(0,1) | 1+2^1 = 3        | 2^3 = 8 |
| mem(1,0) | 1+2^1 = 3        | 2^3 = 8 |
| mem(1,1) | 1+2^2 = 5        | 2^5 = 32 |
| mem(2,0) | 2+2^2 = 6        | 2^6 = 64 |
| mem(1,2) | 2+2^3 = 10       | 2^10 = 1024 |
| mem(2,1) | 2+2^3 = 10       | 2^10 = 1024 |
| mem(2,2) | 2+2^4 = 18       | 2^18 = 262144 |
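The table can be recomputed directly from the two formulas above. The function names `genome_length` and `family_size` are introduced only for this check.

```python
def genome_length(x, y):
    """max(x,y) opening moves plus one answer per possible recent past."""
    return max(x, y) + 2 ** (x + y)

def family_size(x, y):
    """Number of genomes over the two letters C and D."""
    return 2 ** genome_length(x, y)

for x, y in [(0, 1), (1, 0), (1, 1), (2, 0), (1, 2), (2, 1), (2, 2)]:
    print(f"mem({x},{y}): genome length {genome_length(x, y)}, "
          f"{family_size(x, y)} strategies")
```

The growth is doubly exponential in `x+y`, which is why enumerating complete families quickly becomes impractical.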


In [None]:
def getAllMemory(x,y):
    if (x+y > 4):
        return "Not computable"
    len_genome = max(x,y)+2**(x+y)
    permut = [p for p in itertools.product(['C','D'], repeat=len_genome)]
    genomes = [''.join(p) for p in permut]
    return [Mem(x,y,gen) for gen in genomes]


print("In Mem(1,1) there are "+ str(len(getAllMemory(1,1))) + " strategies")

## The Mem(1,1) competition

In [None]:
bag3=getAllMemory(1,1)
e2=Ecological(g,bag3)
e2.run()
e2.drawPlot(None,4)
evol=e2.historic
print(evol.iloc[-1][evol.iloc[-1]>0])
# Only 4 survive : mem11_cCDDD-spite 2126  , mem11_cCDCD-tft 701 , mem11_cCDDC-pavlov 214 , mem11_cCDCC 158

# Exercises

In [None]:
# What is the common name of the strategy that wins the Mem(1,1) competition?

In [None]:
# Study the phenomena of invasion
# In particular, measure empirically (by few tests) the number of ALL_D required 
# to invade a family of 100 ALL_C in an ecological competition.

In [None]:
# Master-Slave Strategies
# A set of strategies is said to be Master-Slave if a Master strategy tries to recognize 
# its Slaves from an opening sequence, to better exploit them afterwards.
# Develop a Master strategy that plays TFT unless the opponent has played consecutively 1 time C, 
# 50 times D, 1 time C, in which case it always plays D
# Develop a Slave strategy that plays 1 time C, 50 times D, then always C
# Each additional Slave brings an advantage to the Master!
# Add these strategies to the mem(1,1) competition by putting in enough Slaves for the Master to 
# win this competition 

# Evaluation by subclass synthesis

Ecological competitions offer a fairly reliable tool for measuring the robustness of a strategy, but they are still insufficient. It is possible, for example, that some strategies sacrifice themselves for others in a *master-slave* scheme. A synthesis of hundreds or even thousands of ecological competitions in which certain strategies have been removed probably gives a better measure of robustness. One of the simplest ideas is to compute the `n` possible competitions obtained by removing 1 strategy from a set of `n` strategies. This technique is called the subclass technique.
We define here 3 functions to build these subclasses.
- `subClasses(soup, n)` which evaluates all possible subsets of size n in the soup
- `subClassesWithOneStrat(soup, n, strat)` which evaluates strat in all possible subsets of size n in the soup by systematically adding the strategy strat
- `subclassesRandomWithOneStrat(p, soup, n, Strat)` which organizes competitions of n randomly selected strategies in the soup in which Strat has been systematically added

The evaluations carried out in these functions are ecological competitions, but of course we could just as well use tournaments.

These functions return at the end a table with for each strategy, its best place, its worst place, its average and its standard deviation.
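The enumeration and the final summary can be illustrated without running any competition. In the sketch below, the soup is just a list of names, `summarize` is a hypothetical helper, and the ranks passed to it are a toy input chosen for the example, not results of a real competition.

```python
import statistics
from itertools import combinations
from math import comb

soup = ["All_C", "All_D", "tft", "spiteful", "per_CCD"]

# Removing 1 strategy from a soup of n strategies gives n competitions
leave_one_out = list(combinations(soup, len(soup) - 1))
print(len(leave_one_out))   # 5
print(comb(len(soup), 3))   # 10 subsets of size 3

def summarize(ranks):
    """Best rank, worst rank, average and standard deviation of a list of ranks."""
    return {
        "BestRank": min(ranks),
        "WorstRank": max(ranks),
        "RankAvg": statistics.mean(ranks),
        "RankStd": statistics.stdev(ranks),   # needs at least 2 ranks
    }

print(summarize([1, 2, 3]))  # toy ranks, for illustration only
```

Each strategy appears in `n-1` of the `n` leave-one-out subsets, so its statistics are computed over `n-1` ranks.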

In [None]:
import statistics
import random

def subClasses(soupe, n):
    if (n > len(soupe)):
        print ("n must not exceed the soup size")
        return   
    res = pd.DataFrame(np.nan,[s.name for s in soupe], ["BestRank","WorstRank", "RankAvg", "RankStd"])   
    for s in soupe:
        res.at[s.name, "BestRank"] = len(soupe)
    ranks = dict()
    sousEnsembles = list(itertools.combinations(soupe, n))
    for s in sousEnsembles:
        e = Ecological(g, s)
        e.run()
        classements = e.historic.iloc[e.generation].rank(0, method="min", ascending=False)
        for strat in s : 
            classement = classements[strat.name]
            if (math.isnan(res.at[strat.name, "BestRank"]) or classement < res.at[strat.name, "BestRank"]):
                res.at[strat.name, "BestRank"] = classement
            if (math.isnan(res.at[strat.name, "WorstRank"]) or classement > res.at[strat.name, "WorstRank"]):
                res.at[strat.name, "WorstRank"] = classement  
            if (strat.name in ranks.keys()):
                ranks[strat.name].append(classement)
            if (strat.name not in ranks.keys()):
                ranks[strat.name] = [classement]      
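    for strat in soupe : 
        res.at[strat.name, "RankAvg"] = statistics.mean(ranks[strat.name])
        # stdev needs at least two ranks (only one subset exists when n == len(soupe))
        if (len(ranks[strat.name]) > 1):
            res.at[strat.name, "RankStd"] = statistics.stdev(ranks[strat.name])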
    print(res.sort_values(by = ['RankAvg', 'BestRank', 'RankStd', 'WorstRank'],  ascending = [True, True, True, True ]))
          

def subClassesWithOneStrat(soupe, n, strategy, printAll = False):
    if (n > len(soupe)):
        print ("n must not exceed the soup size")
        return     
    res = pd.DataFrame(np.nan,[s.name for s in soupe+[strategy]], ["BestRank", "WorstRank", "RankAvg", "RankStd"])   
    sousEnsembles = list(itertools.combinations(soupe, n))
    ranks = dict()
    bestComp = []
    worstComp = []
    for s in sousEnsembles:  
        e = Ecological(g, list(s) + [strategy])
        e.run()
        classements = e.historic.iloc[e.generation].rank(0, method="min", ascending=False)
        for strat in  list(s) + [strategy] : 
            classement = classements[strat.name]
            if (math.isnan(res.at[strat.name, "BestRank"]) or classement < res.at[strat.name, "BestRank"]):
                res.at[strat.name, "BestRank"] = classement
                if (strat == strategy):
                    bestComp = list(s) + [strategy]
            if (math.isnan(res.at[strat.name, "WorstRank"]) or classement > res.at[strat.name, "WorstRank"]):
                res.at[strat.name, "WorstRank"] = classement 
                if (strat == strategy):
                    worstComp = list(s) + [strategy]
            if (strat.name in ranks.keys()):
                ranks[strat.name].append(classement)
            if (strat.name not in ranks.keys()):
                ranks[strat.name] = [classement]
    for s in soupe+[strategy]:
        if (s.name in ranks.keys()):
            res.at[s.name, "RankAvg"] = statistics.mean(ranks[s.name])
            if (len(ranks[s.name]) > 1):
                res.at[s.name, "RankStd"] = statistics.stdev(ranks[s.name])
    if (printAll) :       
        print(res.sort_values(by = ['RankAvg', 'BestRank', 'RankStd', 'WorstRank'],  ascending = [True, True, True, True ]))
    else : 
        print("Strategy ranking : "+strategy.name)
        print(res.loc[strategy.name,:])
    return bestComp, worstComp, strategy



def subClassesRandomWithOneStrat(p, soupe, n, strategy, printAll = False ):
    if (n > len(soupe)):
        print ("n must not exceed the soup size")
        return  
    res = pd.DataFrame(np.nan,[s.name for s in soupe+[strategy]], ["BestRank","WorstRank", "RankAvg", "RankStd"])
    ranks = dict()
    bestComp = []
    worstComp = []
    for i in range (0, p) : 
        #print("Competition "+str(i+1)+ "/"+str(p))
        # the tested strategy plus n distinct strategies drawn without replacement
        strategies = [strategy] + random.sample(soupe, n)
        #print("The strategies playing are : ")
        #for s in strategies :
            #print(s.name)
        e = Ecological(g, strategies)
        e.run()
        classements = e.historic.iloc[e.generation].rank(0, method="min", ascending=False)
        for strat in strategies : 
            classement = classements[strat.name]
            if (math.isnan(res.at[strat.name, "BestRank"]) or classement < res.at[strat.name, "BestRank"]):
                res.at[strat.name, "BestRank"] = classement
                if (strat == strategy):
                    bestComp = strategies
            if (math.isnan(res.at[strat.name, "WorstRank"]) or classement > res.at[strat.name, "WorstRank"]):
                res.at[strat.name, "WorstRank"] = classement  
                if (strat == strategy):
                    worstComp = strategies
            if (strat.name in ranks.keys()):
                ranks[strat.name].append(classement)
            if (strat.name not in ranks.keys()):
                ranks[strat.name] = [classement]
    for s in soupe+[strategy]:
        if (s.name in ranks.keys()):
            res.at[s.name, "RankAvg"] = statistics.mean(ranks[s.name])
            if (len(ranks[s.name]) > 1):
                res.at[s.name, "RankStd"] = statistics.stdev(ranks[s.name])
    if (printAll) :   
        print(res.sort_values(by = ['RankAvg', 'BestRank', 'RankStd', 'WorstRank'],  ascending = [True, True, True, True ]))
    else : 
        print("Strategy ranking : "+strategy.name)
        print(res.loc[strategy.name,:])
    return bestComp, worstComp, strategy

print("All is OK")
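
The three functions above rank the final populations with `rank(method="min", ascending=False)`. As I understand that call, the largest population gets rank 1 and tied strategies share the smallest rank of their group. The sketch below reproduces this rule in plain Python; `competition_rank` and the population figures are hypothetical and only serve the illustration.

```python
def competition_rank(scores):
    """Rank names by decreasing score; tied names share the smallest rank."""
    ranks = {}
    for name, score in scores.items():
        # rank = 1 + number of strictly better scores
        ranks[name] = 1 + sum(other > score for other in scores.values())
    return ranks

# Made-up final populations of an ecological competition
final_population = {"Tft": 400, "Spiteful": 400, "All_C": 200, "All_D": 0}
print(competition_rank(final_population))
# → {'Tft': 1, 'Spiteful': 1, 'All_C': 3, 'All_D': 4}
```

Two strategies tied for first place both get rank 1, and the next one gets rank 3, so a tie never improves the rank of the strategies below it.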

#### A simple example: all competitions of 3 strategies among the classics

In [None]:
All_C = Periodic('C')
All_D = Periodic('D')
soupe = [All_C, All_D, Tft(), Spiteful(), Gradual(), SoftMajority(), HardMajority()]
subClasses(soupe, 3)

#### A larger case: all Mem(1,1) strategies with one less strategy each time
Since there are 32 `mem(1,1)` strategies, this operation performs 32 competitions of 31 strategies each. Note that with this method every strategy is present (and absent) exactly the same number of times.

In [None]:
soup = getAllMemory(1,1)
subClasses(soup, len(soup)-1)
# Warning: this experiment takes about 1 minute

#### Testing the Spiteful strategy with all classic triplets
With `subClassesWithOneStrat` and `subClassesRandomWithOneStrat`, only the strategy passed as a parameter participates in every competition. With the first one it is added to all possible subsets of the given size (feasible for sets that are not too large, like mem(1,1)), while with `subClassesRandomWithOneStrat` it is added to a fixed number of randomly drawn subsets of the same size (usable on larger sets such as mem(2,2)).
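
A rough count shows why the exhaustive version does not scale. The sketch below uses the 32 strategies of mem(1,1) stated above. The size taken for mem(2,2), `2**18`, is an assumption made for the illustration, and strategies are replaced by integer indexes.

```python
import math
import random

size_mem11 = 32        # number of mem(1,1) strategies
size_mem22 = 2 ** 18   # assumed number of mem(2,2) strategies

# Number of competitions an exhaustive enumeration would have to run
exhaustive_small = math.comb(size_mem11, 3)    # all triplets of mem(1,1)
exhaustive_large = math.comb(size_mem22, 10)   # all 10-subsets of mem(2,2)
print(exhaustive_small)
print(len(str(exhaustive_large)), "digits")

# Random alternative: p subsets of n distinct indexes, drawn independently
random.seed(0)  # fixed seed so the sketch is reproducible
p, n = 100, 10
draws = [random.sample(range(size_mem22), n) for _ in range(p)]
print(len(draws), "competitions to run instead")
```

Drawing `p` subsets at random keeps the cost fixed, at the price of a result that depends on the draw.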

In [None]:
All_C = Periodic('C')
All_D = Periodic('D')
soup = [All_C, All_D, Tft(), Gradual(), SoftMajority(), HardMajority()]
res  = subClassesWithOneStrat(soup, 3, Spiteful())
# To view the entire table : 
# res = subClassesWithOneStrat(soup, 3, Spiteful(), True)

Note that the functions `subClassesWithOneStrat` and `subClassesRandomWithOneStrat` return the best and the worst competition for the strategy passed as a parameter.
After displaying the subclass ranking, it is therefore possible to display the set of strategies that was the most favorable or the most unfavorable to that strategy.

In [None]:
bestComp, worstComp, strategy = res
print("The best competition for strategy "+strategy.name +" is : ")
for strat in bestComp :
    print(strat.name)
 

#### 100 experiments of 10 strategies randomly selected in mem(2,2) against Gradual()
For `subClassesRandomWithOneStrat`, if a strategy has played only once then it has no standard deviation (NaN); if it has not played at all then all its values are NaN in the table.
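
To get a feeling for how often this happens, one can tally how many times each strategy is drawn. This is a sketch under stated assumptions: the soup is a hypothetical list of 5000 names, far smaller than mem(2,2), and no competition is run.

```python
import random
from collections import Counter

random.seed(1)  # fixed seed so the sketch is reproducible
soup_names = [f"s{i}" for i in range(5000)]  # hypothetical strategy names
p, n = 100, 10

# Tally the participations over p random draws of n distinct strategies
played = Counter()
for _ in range(p):
    played.update(random.sample(soup_names, n))

never_played = len(soup_names) - len(played)
played_once = sum(1 for count in played.values() if count == 1)
print(never_played, "never played;", played_once, "played exactly once")
```

Since at most `p * n` distinct strategies can be drawn, most rows of the full table stay at NaN as soon as the soup is much larger than `p * n`.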

In [None]:
soup = getAllMemory(2,2)
res = subClassesRandomWithOneStrat(100,soup, 10, Gradual())
# To view the entire table : 
# subClassesRandomWithOneStrat(100, soup, 10, Gradual(), True)

# Warning: this experiment takes about 1 minute

We can also look at the most "unfavourable" competition for Gradual 
(the subsets are drawn at random, so two executions will not always give the same result)

In [None]:
bestComp, worstComp, strategy = res

In [None]:
soup = worstComp
e2=Ecological(g,soup)
e2.run()
e2.drawPlot(None,None)
evol=e2.historic
print(evol.iloc[-1])
print()
print(e2.historic.iloc[e2.generation].rank(0, method="min", ascending=False))

In [None]:
# Warning: this experiment takes about 1 minute
soup = getAllMemory(1,1)
res  = subClassesWithOneStrat(soup,len(soup)-1, Gradual())

# Bibliography

- Robert Axelrod. *The Evolution of Cooperation*. (New York: Basic Books, 1984).
- William Poundstone. *Prisoner's Dilemma*. 1st Anchor Books edition.
- JP Delahaye and P Mathieu. *Des surprises dans le monde de la coopération*. Pour la Science, special issue "Les mathématiques sociales", pp. 58-66, July 1999.
- Philippe Mathieu, Jean-Paul Delahaye. [New Winning Strategies for the Iterated Prisoner's Dilemma](http://jasss.soc.surrey.ac.uk/20/4/12.html). J. Artificial Societies and Social Simulation 20(4) (2017)
- Bruno Beaufils, Jean-Paul Delahaye and Philippe Mathieu. *Our Meeting with Gradual: A Good Strategy for the Iterated Prisoner's Dilemma*. Intern. Conf. on Artificial Life V (ALIFE V), pp. 159-165, 16-18 May 1996, Nara (Japan).
- Martin Nowak and K. Sigmund. *TIT for TAT in Heterogeneous Populations*. Nature, vol. 355, no. 16, pp. 250-253, January 1992.
- Nowak M., May R., Sigmund K. *L'arithmétique de l'entraide*. Pour la Science no. 214, August 1995, pp. 56-61.