# Characterizing Nash Equilibrium

## Definition:

As seen in class, an informal definition of a Nash equilibrium is a set of choices, one per player, such that no 
player can do better by changing only their own choice, given what every other player is choosing.
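
The informal definition can also be stated symbolically. The following is a sketch in common textbook notation; the symbols $s_i$, $s_{-i}$ and $u_i$ are introduced here for illustration and are not taken from the class notes. Let $s_i$ be the strategy of player $i$, $s_{-i}$ the strategies of all the other players, and $u_i$ the pay-off of player $i$. A strategy profile $(s_1^*,\dots,s_n^*)$ is a Nash equilibrium if, for every player $i$ and every alternative strategy $s_i$,

$$
u_i(s_i^*, s_{-i}^*) \;\ge\; u_i(s_i, s_{-i}^*).
$$

When the pay-offs are costs to be minimized, such as prison sentences, the inequality is reversed.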

## A Toy Model:

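To understand the characteristics of a Nash equilibrium, a simple toy model of the Prisoner's Dilemma was discussed in class. The model has the following pay-off matrix, where Calvin chooses the row, Klein chooses the column, and each entry lists the sentences in years (y), Calvin's first: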

| Calvin/Klein | Confess | Not Confess  |
| --- | --- | --- |
| Confess |   5y,5y   |  0y,15y |
| Not Confess  | 15y,0y | 1y,1y |

The Nash equilibrium of this game is for both of them to confess. Even if one of them learns the other's decision, they have no reason to change their strategy of confessing (under the assumption that the prisoners are rational): whatever the other prisoner does, confessing gives the shorter sentence (0y instead of 1y if the other does not confess, 5y instead of 15y if the other confesses).
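
The claim above can be checked by brute force. The following is a minimal sketch, assuming the entries are sentences in years that each prisoner wants to minimize; the names `YEARS` and `pure_nash_equilibria` are hypothetical and used only for this illustration.

```python
# Sentences in years as (row player, column player).
# Index 0 = Confess, index 1 = Not Confess (same layout as the table above).
YEARS = [[(5, 5), (0, 15)],
         [(15, 0), (1, 1)]]

def pure_nash_equilibria(years):
    """Return every (row, col) profile where neither player can shorten
    their own sentence by changing only their own choice."""
    n_rows, n_cols = len(years), len(years[0])
    equilibria = []
    for r in range(n_rows):
        for c in range(n_cols):
            # Row player compares against every alternative row, column fixed.
            row_ok = all(years[r][c][0] <= years[alt][c][0] for alt in range(n_rows))
            # Column player compares against every alternative column, row fixed.
            col_ok = all(years[r][c][1] <= years[r][alt][1] for alt in range(n_cols))
            if row_ok and col_ok:
                equilibria.append((r, c))
    return equilibria

print(pure_nash_equilibria(YEARS))  # prints [(0, 0)], i.e. (Confess, Confess)
```

Only the (Confess, Confess) cell survives the check, even though (Not Confess, Not Confess) would give both prisoners a shorter sentence.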

Here, we iterate such a model $N$ times and estimate the average pay-off for each player.

[Reference: Effective Choice in the Prisoner's Dilemma ](https://www.jstor.org/stable/pdf/173932.pdf?refreqid=excelsior%3Aeef7a653a4ae47e904b41d8d57f55b12)

### Theory to Code:

In [1]:
import numpy as np
from random import randint
import pandas as pd

In [2]:
#Utilities needed.
mean_of_list = lambda List:sum(List)/len(List)
transpose    = lambda List:zip(*List)

In [3]:
class Classic_Prisoners:
    '''A class that describes the strategies our prisoners undertake.'''
    def __init__(self,prob_confess):   #All the prior information assumed for our prisoners goes here. 
        self.p = prob_confess          #The probability (in percent, 0 to 100) that they confess.
    
    def choose(self):
        #We assume our player randomly chooses one of the two strategies (confess or not) in each round.
        #To do this we generate a random integer r in 0..99, so that r < p holds with probability p/100.
        r = randint(0,99)
        if r < self.p:
            return 0      #Chooses strategy 1 (Confess).
        else:
            return 1      #Chooses strategy 2 (Not Confess).
    

In [4]:
class Two_Person_Game:
    '''A class describing the game to be played by our agents.'''
    def __init__(self,pay_off_mat,player1,player2):
        self.pmat = pay_off_mat              #pmat[c1][c2] = (pay-off of player 1, pay-off of player 2).
        self.players = [player1,player2]
        self.payoff_dict = dict((p,0) for p in self.players)
        self.choices = []                    #One (choice of player 1, choice of player 2) pair per round.
        
    
    def play_game(self,N):    #N denotes the number of times the game should be played.
        p1,p2 = self.players
        for _ in range(N):    #Play exactly N rounds.
            p1_choice = p1.choose()
            p2_choice = p2.choose()
            self.choices.append((p1_choice,p2_choice))
    
    def pay_off(self):
        #Average pay-off per round of each player, taken over all the rounds played so far.
        payoffs = (self.pmat[c1][c2] for (c1,c2) in self.choices)
        pay_p1, pay_p2 = transpose(payoffs)
        return {'Player-1' : mean_of_list(pay_p1),'Player-2': mean_of_list(pay_p2)}
    
            
            

#### Analysis of the results of the algorithm:

We look at how the estimated pay-off of each prisoner changes for different probabilities of confessing. 
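
As a reference point for the simulated averages, the expected pay-off per round can be computed exactly when the two prisoners choose independently. The following is a sketch under that assumption; the names `PAY_OFF` and `expected_pay_off` are hypothetical and not part of the notebook, and probabilities are given in percent to match `Classic_Prisoners`.

```python
# Same values as the toy model: entry [c1][c2] = (pay-off of player 1, pay-off of player 2),
# with index 0 = Confess and index 1 = Not Confess.
PAY_OFF = [[(5, 5), (0, 15)],
           [(15, 0), (1, 1)]]

def expected_pay_off(pay_off_mat, p1_confess, p2_confess):
    """Exact expected pay-off per round for both players.

    p1_confess and p2_confess are confession probabilities in percent (0 to 100).
    Assumes the two players choose independently of each other.
    """
    p, q = p1_confess / 100, p2_confess / 100
    # Probability of each of the four outcomes.
    prob = [[p * q, p * (1 - q)],
            [(1 - p) * q, (1 - p) * (1 - q)]]
    e1 = sum(prob[i][j] * pay_off_mat[i][j][0] for i in range(2) for j in range(2))
    e2 = sum(prob[i][j] * pay_off_mat[i][j][1] for i in range(2) for j in range(2))
    return e1, e2

print(expected_pay_off(PAY_OFF, 50, 50))   # both players expect 5.25 years
print(expected_pay_off(PAY_OFF, 0, 100))   # player 1 expects 15 years, player 2 expects 0
```

With enough rounds, the simulated averages should scatter around these values, and the size of the scatter gives a feel for the sampling noise of the simulation.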

In [5]:
pay_off_mat_prisoners = [ [(5,5),(0,15)] , [(15,0),(1,1)] ]

In [6]:
def varying_prob_results(iterator,pay_off_matrix,player):
    #For each pair (p1,p2) of confession probabilities (in percent), simulate the game
    #and record the average pay-off of each player.
    pay_off_dict = {}
    for (p1,p2) in iterator:
        prisoner1 = player(p1)
        prisoner2 = player(p2)
        Prisoners_Dilemma = Two_Person_Game(pay_off_matrix,prisoner1,prisoner2)   #Use the matrix passed as argument.
        Prisoners_Dilemma.play_game(1000)
        key = '('+str(p1)+ '|'+str(p2) + ')'+'%'
        pay_off_dict[key] = Prisoners_Dilemma.pay_off()
    df_pay_off =  pd.DataFrame(pay_off_dict)
    return df_pay_off

In [7]:
seq_gen1 = ((i*10,i*10) for i in range(11))       #A generator of (p1,p2) pairs with p1 = p2, from 0% to 100%.
df1_pay_off = varying_prob_results(seq_gen1,pay_off_mat_prisoners,Classic_Prisoners)
df1_pay_off.round(2).head()

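| | (0\|0)% | (10\|10)% | (20\|20)% | (30\|30)% | (40\|40)% | (50\|50)% | (60\|60)% | (70\|70)% | (80\|80)% | (90\|90)% | (100\|100)% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Player-1 | 1.0 | 2.33 | 3.34 | 4.08 | 4.92 | 5.24 | 5.69 | 5.54 | 5.42 | 5.41 | 5.0 |
| Player-2 | 1.0 | 2.56 | 3.25 | 3.87 | 4.53 | 5.26 | 5.65 | 5.81 | 5.93 | 5.41 | 5.0 |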


In [8]:
seq_gen2 = ((i*10,(10-i)*10) for i in range(11))  #p1 rises from 0% to 100% while p2 falls from 100% to 0%.
df2_pay_off = varying_prob_results(seq_gen2,pay_off_mat_prisoners,Classic_Prisoners)
df2_pay_off.round(2).head()

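| | (0\|100)% | (10\|90)% | (20\|80)% | (30\|70)% | (40\|60)% | (50\|50)% | (60\|40)% | (70\|30)% | (80\|20)% | (90\|10)% | (100\|0)% |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Player-1 | 15.0 | 12.42 | 10.84 | 8.35 | 6.86 | 5.19 | 3.48 | 2.49 | 1.73 | 0.59 | 0.0 |
| Player-2 | 0.0 | 0.71 | 1.4 | 2.67 | 3.96 | 5.25 | 7.04 | 8.81 | 10.36 | 12.89 | 15.0 |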
