## Background 
 -- *From Romero and Rosokha (2018,2019a,2019b) and Rosokha and Wei (2020)*

The strategy frequency estimation method (SFEM), proposed by Dal Bó and Fréchette (2011), is a finite-mixture approach to estimating the proportions of strategies in experimental data. The method starts by specifying the set of $K$ strategies considered by the modeler. Then, for each subject $n\in \{1,...,N\}$ and each strategy $k\in \{1,...,K\}$, it compares subject $n$'s actual play with how strategy $k$ would have played in her place. Let $C(k,n)$ denote the number of periods in which subject $n$'s play matches the play of strategy $k$, and let $C$ denote the $K \times N$ matrix of these match counts for all combinations of strategies and subjects. Similarly, let $E$ denote the $K \times N$ matrix of mismatch counts, that is, the number of periods in which a subject's play differs from what the strategy would do in her place. Then define the matrix $P$ using element-wise exponentiation and the Hadamard (element-wise) product $\circ$:

\begin{equation}
    P = \beta^{C}\circ(1-\beta)^{E},
\end{equation}

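where $\beta$ is the probability that a subject plays according to a strategy and $(1-\beta)$ is the probability that the subject deviates from that strategy. Thus, each entry $P(k,n)$ is the likelihood that the observed choices of subject $n$ were generated by strategy $k$. Let $\phi$ denote the $K \times 1$ vector of strategy proportions, which sum to one, and let $\mathbf{1}$ denote an $N \times 1$ vector of ones. Then, using the matrix product and taking the logarithm element-wise, we define the log-likelihood function $\mathcal{L}$: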

\begin{equation}
    \mathcal{L}(\beta,\phi)= \ln \big( \phi' \cdot P \big) \cdot  \mathbf{1} .
\end{equation}
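
To make the two equations concrete, here is a minimal numerical sketch. The counts, `beta`, and `phi` below are made-up toy values (not taken from the RR2018 data), chosen only to show the shapes involved.

```python
import numpy as np

# Toy inputs (hypothetical): K = 2 strategies, N = 3 subjects, 12 periods each
C = np.array([[10, 6, 2],
              [2, 6, 10]])      # matches, shape (K, N)
E = 12 - C                      # mismatches, shape (K, N)
beta = 0.9                      # probability of playing according to the strategy
phi = np.array([0.5, 0.5])      # strategy proportions, sum to one

# Element-wise powers and Hadamard product: P[k, n] = beta**C[k, n] * (1 - beta)**E[k, n]
P = beta**C * (1 - beta)**E

# phi' P is a length-N vector of mixture likelihoods; summing its logs gives the log-likelihood
log_lik = np.log(phi @ P).sum()
print(P.shape, log_lik)
```

Each column of `P` corresponds to one subject, so `phi @ P` mixes over strategies subject by subject before the logs are summed.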

For this example estimation, the set of strategies consists of the five most common strategies found in the literature on the repeated Prisoner's Dilemma (Dal Bó and Fréchette, 2018). In particular, the strategies included in strategies.py are Always Cooperate (ALLC), Always Defect (ALLD), Grim Trigger (GRIM), Tit-for-Tat (TFT), and Suspicious Tit-for-Tat (DTFT).
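
The contents of strategies.py are not reproduced in this notebook. As an illustration only, the sketch below shows what two of these strategies could look like. It assumes the call signature `strats[k](otherChoice, periodData)` used later in this notebook, actions coded as 1 for C and 0 for D, NaN padding at the end of each row, and `period == 1` marking the start of each supergame. The names `tft_sketch` and `grim_sketch` are hypothetical, and the actual implementations may differ.

```python
import numpy as np

def tft_sketch(other, period):
    """Tit-for-Tat: cooperate in period 1, then copy the opponent's previous action."""
    choice = np.full(len(other), np.nan)
    for t in range(len(other)):
        if np.isnan(period[t]):          # padding cell, leave as NaN
            continue
        choice[t] = 1.0 if period[t] == 1 else other[t - 1]
    return choice

def grim_sketch(other, period):
    """Grim Trigger: cooperate until the opponent defects, then defect until the supergame ends."""
    choice = np.full(len(other), np.nan)
    triggered = False
    for t in range(len(other)):
        if np.isnan(period[t]):          # padding cell, leave as NaN
            continue
        if period[t] == 1:               # new supergame resets the trigger
            triggered = False
        elif other[t - 1] == 0:          # opponent defected in the previous period
            triggered = True
        choice[t] = 0.0 if triggered else 1.0
    return choice

# Two supergames (3 and 4 periods) followed by one padding cell
other = np.array([0, 1, 1, 1, 0, 1, 1, np.nan])
period = np.array([1, 2, 3, 1, 2, 3, 4, np.nan])
tft_play = tft_sketch(other, period)
grim_play = grim_sketch(other, period)
print(tft_play)
print(grim_play)
```

The two strategies differ only after a defection: TFT returns to cooperation once the opponent does, while GRIM keeps defecting until the next supergame begins.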

## Load libraries

In [1]:
import pandas as pd
import numpy as np
import sys
from scipy.optimize import minimize

## Get Data

In [2]:
#For this example we will use data from Romero & Rosokha (2018)
data=pd.read_csv('input\\action_data_RR2018.csv')
data.head()

Unnamed: 0,subject,supergame,period,action,opponentAction
0,T1S1S01,1,1,C,D
1,T1S1S01,1,2,D,D
2,T1S1S01,1,3,C,D
3,T1S1S01,1,4,D,D
4,T1S1S01,1,5,C,D


## Clean Data

In [3]:
# Subset data to supergames and periods of interest. 
# In RR2018 we focus on supergames 31-50 and restrict attention to at most the first 20 periods.
# This is done to avoid numerical issues when raising beta to large powers.
data = data[(data.supergame>30) & (data.supergame<51) & (data.period<21)] 
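
The numerical issue mentioned in the comment above is floating-point underflow. The sketch below illustrates it with a hypothetical value of $1-\beta = 0.05$. With 20 supergames of at most 20 periods, the exponents are capped at 400, which limits the problem rather than removing it entirely.

```python
import numpy as np

one_minus_beta = 0.05   # hypothetical value, i.e. beta = 0.95

# 20 supergames x at most 20 periods gives at most 400 observations per subject
small = np.power(one_minus_beta, 200.0)   # tiny, but still representable in float64
zero = np.power(one_minus_beta, 400.0)    # underflows to exactly 0.0
print(small, zero)
```

Once a likelihood term underflows to zero, its logarithm is no longer finite, which is why the objective function later in this notebook guards the argument of `np.log`.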

In [4]:
# Get number of subjects and maximum number of actions for each subject 
n_subjects = len(data.subject.unique())
n_actions = data.groupby('subject').count().action.max()

In [5]:
#Create nan matrices of actions, matches, periods, and opponent's actions
actions=np.empty((n_subjects,n_actions))
actions.fill(np.nan)
matches=np.empty((n_subjects,n_actions))
matches.fill(np.nan)
periods=np.empty((n_subjects,n_actions))
periods.fill(np.nan)
others=np.empty((n_subjects,n_actions))
others.fill(np.nan)

In [6]:
# Get the data into matrix format 
for i,sub in enumerate(data.subject.unique()):
    
    a = [x=='C' for x in data.action[data.subject==sub].tolist()]
    m = data.supergame[data.subject==sub].tolist()
    p = data.period[data.subject==sub].tolist()
    o = [x=='C' for x in data.opponentAction[data.subject==sub].tolist()]
    
    n = len(a)
    
    actions[i,:n] = a
    matches[i,:n] = m
    periods[i,:n] = p
    others[i,:n]  = o    

## Simulate Strategies Against Opponent Actions

In [7]:
sys.path.append('input\\')
import strategies

In [8]:
strats = []
strat_names = []
for i in dir(strategies):
    s = getattr(strategies,i)
    if callable(s):
        strats.append(s)
        strat_names.append(s.__name__)
        

n_strats = len(strats)
print("There are",n_strats,'strategies in the strategies.py file. The strategies are:',strat_names)

There are 5 strategies in the strategies.py file. The strategies are: ['ALLC', 'ALLD', 'DTFT', 'GRIM', 'TFT']


In [9]:
# For each subject n and each strategy k compare subject n's actual play with how strategy k would have played.
C = np.zeros((n_strats,n_subjects)) #Number of periods in which play matches
E = np.zeros((n_strats,n_subjects)) #Number of periods in which play does not match
for n in range(n_subjects):
    for k in range(n_strats): 

        subChoice = actions[n]
        otherChoice = others[n]
        periodData = periods[n]

        stratChoice = strats[k](otherChoice,periodData)

        C[k,n]=np.sum(subChoice==stratChoice)
        E[k,n]=np.sum((1-subChoice)==stratChoice)
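
The counts above rely on how NaN behaves in comparisons: any comparison involving NaN is False, so padded cells are counted neither as matches nor as mismatches. A small sketch with made-up arrays:

```python
import numpy as np

sub = np.array([1.0, 0.0, 1.0, np.nan])    # subject plays C, D, C, then one padding cell
strat = np.array([1.0, 1.0, 1.0, np.nan])  # hypothetical strategy output over the same cells

n_match = np.sum(sub == strat)         # cells where the actions agree
n_miss = np.sum((1 - sub) == strat)    # cells where the opposite action agrees
print(n_match, n_miss)
```

Because `1 - np.nan` is also NaN, the padding cell drops out of both sums, and `n_match + n_miss` equals the number of periods the subject actually played.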

## Set up the loglikelihood function

In [10]:
# The objective takes a vector x = (beta, proportions of strategies) and returns the negative log-likelihood.
# Note that the match and mismatch matrices C and E are passed in through args.
def objective(x,args):
    
    C = args[0]
    E = args[1]
    
    bc=np.power(x[0],C) #beta to the power of C
    be=np.power(1-x[0],E) #(1-beta) to the power of E
    prodBce = np.multiply(bc,be) #Hadamard product
    
    #maximum is taken so that there is no log(0) warning/error
    res = np.log(np.maximum(np.dot(x[1:],prodBce),np.nextafter(0,1))).sum() 
    
    return -res

def constraint1(x):
    
    return x[1:].sum()-1

#Set up the boundaries and constraints
b0 = (np.nextafter(0.5,1),1-np.nextafter(0,1))
b1 = (np.nextafter(0,1),1-np.nextafter(0,1))
bnds = (b0,)+(b1,)*n_strats #Beta is at least .5; one bound per strategy proportion
con1 = {'type': 'eq', 'fun': constraint1} 
cons = ([con1])
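
As an alternative to clamping the argument of the logarithm, the same objective can be evaluated in log space. The following is a minimal sketch, assuming `scipy.special.logsumexp` is available. The name `objective_logspace` and the toy counts are hypothetical and not part of the original notebook.

```python
import numpy as np
from scipy.special import logsumexp

def objective_logspace(x, args):
    """Negative log-likelihood, computed without forming beta**C directly."""
    C, E = args
    beta, phi = x[0], np.asarray(x[1:])
    # log P[k, n] = C[k, n] * log(beta) + E[k, n] * log(1 - beta)
    log_P = C * np.log(beta) + E * np.log1p(-beta)
    # log(sum_k phi[k] * P[k, n]) for each subject n, evaluated stably
    log_mix = logsumexp(log_P, axis=0, b=phi[:, None])
    return -log_mix.sum()

# Toy check (hypothetical counts): 2 strategies, 2 subjects, 400 periods each
C_toy = np.array([[0.0, 390.0], [10.0, 200.0]])
E_toy = 400.0 - C_toy
x_toy = np.array([0.95, 0.4, 0.6])
value = objective_logspace(x_toy, [C_toy, E_toy])
print(value)
```

In this toy case both likelihood terms for the first subject would underflow to zero if computed directly, whereas the log-space version stays finite. For moderate counts the two formulations should agree up to rounding.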

## Run likelihood maximization

In [11]:
#Some random starting point
x0 = np.zeros(n_strats+1)
x0[0] = .5+.5*np.random.random()
temp = np.random.random(n_strats)
x0[1:]=temp/temp.sum()

bestX=x0
bestObjective=objective(x0,[C,E])

for k in range(30): #Repeat from many random starting points to reduce the chance of getting stuck in a local optimum

    x0 = np.zeros(n_strats+1)
    x0[0] = .5+.5*np.random.random()
    temp = np.random.random(n_strats)
    x0[1:]=temp/temp.sum()

    #Notice that we are minimizing the negative
    solution = minimize(objective,x0,method='SLSQP',bounds=bnds,constraints=cons,args=([C,E]))
    x = solution.x
    obj = solution.fun

    if bestObjective>obj:
        bestObjective=obj
        bestX=x
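
The multi-start loop above can be made reproducible by seeding the random number generator, and runs that the optimizer does not report as converged can be skipped. Below is a self-contained sketch on synthetic data. The toy counts, the seed, the number of restarts, and the name `neg_log_lik` are all assumptions made for illustration, and the `sol.success` filter is a design choice that the loop above does not apply.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)   # seeded so the restarts are reproducible

# Toy data (hypothetical): 2 strategies, 40 subjects, 50 periods each.
# Each subject matches their "true" strategy in 45 periods and the other in 25.
K, N, T = 2, 40, 50
true_type = rng.integers(0, K, size=N)
C = np.where(np.arange(K)[:, None] == true_type, 45, 25).astype(float)
E = T - C

def neg_log_lik(x, C, E):
    # x = (beta, phi_1, ..., phi_K)
    P = np.power(x[0], C) * np.power(1 - x[0], E)
    return -np.log(np.maximum(x[1:] @ P, np.nextafter(0, 1))).sum()

bounds = [(0.5 + 1e-9, 1 - 1e-9)] + [(1e-9, 1 - 1e-9)] * K
cons = [{'type': 'eq', 'fun': lambda x: x[1:].sum() - 1}]

best = None
for _ in range(10):
    # Random start: beta in (0.55, 0.95), proportions drawn from a flat Dirichlet
    x0 = np.concatenate(([rng.uniform(0.55, 0.95)], rng.dirichlet(np.ones(K))))
    sol = minimize(neg_log_lik, x0, args=(C, E), method='SLSQP',
                   bounds=bounds, constraints=cons)
    # Keep only runs that the optimizer reports as converged
    if sol.success and (best is None or sol.fun < best.fun):
        best = sol

print(best.x, -best.fun)
```

On this synthetic data the estimated `beta` should land near 45/50, since each subject matches their assigned strategy in 45 of 50 periods.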

## Output results

In [12]:
results=pd.DataFrame(bestX.round(4).tolist()+[np.round(-bestObjective,4)],index=['beta']+strat_names+['LL'])
print(results)
results.to_csv("output\\01-sfem_estimates.csv")

              0
beta     0.9477
ALLC     0.0855
ALLD     0.1341
DTFT     0.1219
GRIM     0.2139
TFT      0.4445
LL   -4643.5154


## References

- Dal Bó, P. and Fréchette, G.R., 2011. The evolution of cooperation in infinitely repeated games: Experimental evidence. American Economic Review, 101(1), pp.411-29.

- Dal Bó, P. and Fréchette, G.R., 2018. On the determinants of cooperation in infinitely repeated games: A survey. Journal of Economic Literature, 56(1), pp.60-114.

- Romero, J. and Rosokha, Y., 2018. Constructing strategies in the indefinitely repeated prisoner’s dilemma game. European Economic Review, 104, pp.185-219.

- Romero, J. and Rosokha, Y., 2019a. The Evolution of Cooperation: The Role of Costly Strategy Adjustments. American Economic Journal: Microeconomics, 11(1), pp.299-328.

- Romero, J. and Rosokha, Y., 2019b. Mixed Strategies in the Indefinitely Repeated Prisoner's Dilemma. Available at SSRN 3290732.

- Rosokha, Y. and Wei, C., 2020. Cooperation in Queueing Systems. Available at SSRN 3526505.
