# Bayesian Nash Equilibrium in double bi-matrix games

This notebook introduces a solution method for analyzing bimatrix games in which one of the players can have multiple types. The method was suggested by William Spaniel in this video: https://youtu.be/E0_CA9TwZ8c. Watching the video is recommended in order to fully understand how and why the method works.

In [1]:
import pandas as pd 
import numpy as np 
import itertools
import nashpy
import bimatrix

Player types:
* Player 1: always a Stag Hunt (SH) type.
* Player 2: a Prisoner's Dilemma (PD) type with probability `p`, and an SH type with probability `1-p`.

Payoffs: each player has a *list* of payoff matrices, one for each type that player 2 can have.
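
The expected payoffs computed by the function below can be written compactly. The notation here is introduced only for illustration: $U_k^{\theta}$ is player $k$'s payoff matrix when player 2 is type $\theta \in \{0, 1\}$ (type 0 occurring with probability $p$), $a_1$ is player 1's action, and $(a_2^0, a_2^1)$ is player 2's plan of one action per type.

$$
t_k\big(a_1, (a_2^0, a_2^1)\big) = p \, U_k^0(a_1, a_2^0) + (1-p) \, U_k^1(a_1, a_2^1), \qquad k = 1, 2.
$$

Each pair $(a_2^0, a_2^1)$ becomes one column of the wide-form matrices `t1` and `t2`, which is why player 2 ends up with `nA2**2` columns.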

In [8]:
def compute_full_matrix(U1, U2, p, action_names=None): 
    '''
        Build the wide-form (type-contingent) payoff matrices.

        Assumes that only player 2's type varies: player 1 has one 
        strategy per row of U1 (nA1 in total), while player 2 has 
        nA2**2 strategies (one action for each of her two types). 
        Both players have one utility matrix for each realization 
        of player 2's type. 
         
        INPUTS: 
            U1: list of 2 payoff matrices for player 1 (row player)
            U2: list of 2 payoff matrices for player 2 (column player)
            p: (scalar) Probability that player 2 is the first type 
            action_names: [optional] 2-list of lists of action names 
                          (of lengths nA1 and nA2)
        OUTPUTS: 
            t1, t2: wide-form payoff matrices suitable for finding the NE 
            A1, A2: names of actions 
    '''
    assert len(U1) == 2
    assert len(U2) == 2 
    assert np.isscalar(p)
    nA1, nA2 = U1[0].shape
    
    t1 = np.empty((nA1, nA2*nA2))
    t2 = np.empty((nA1, nA2*nA2))
    
    # player 1 chooses an action without knowing what type 2 is 
    for ia1 in range(nA1): 
        i_col = 0 
        
        # player 2 chooses an action conditional on observing her type 
        for a2_1 in range(nA2): 
            for a2_2 in range(nA2): 
                t1[ia1,i_col] = p * U1[0][ia1,a2_1] + (1.-p) * U1[1][ia1,a2_2]
                t2[ia1,i_col] = p * U2[0][ia1,a2_1] + (1.-p) * U2[1][ia1,a2_2]
                
                i_col += 1
                
    if action_names is None: 
        A1 = [f'{i}' for i in range(nA1)]
        A2 = [f'{a}{b}' for a in range(nA2) for b in range(nA2)]
    else: 
        assert len(action_names) == 2 
        A1 = action_names[0]
        assert len(A1) == nA1, 'Incorrect # of action names for player 1'
        a2 = action_names[1]
        assert len(a2) == nA2, 'Incorrect # of action names for player 2'
        
        A2 = [f'{a}{b}' for a in a2 for b in a2]
        
    return t1, t2, A1, A2

# Spaniel's example

* Player 1 has just one type and receives Stag Hunt payoffs regardless of player 2's type.
* Player 2 can be one of two types: the Prisoner's Dilemma type (probability 0.2) or the Stag Hunt type (probability 0.8).

In [2]:
# Pr(player 2 is the PD type)
p = 0.2

# player 1 
u1  = np.array([[3,0], [2,1]])
U1 = [u1, u1] # player 1 has same payoffs regardless of 2's type 
A1 = ['U', 'D']

# player 2
u21 = np.array([[3,4],[1,2]]) # prisoner's dilemma 
u22 = np.array([[3,2],[0,1]]) # stag hunt 
U2 = [u21, u22]
a2 = ['L', 'R']
A2 = [f'{a}{b}' for a in a2 for b in a2]

In [3]:
print(f'--- If P2 is type 0 ---')
bimatrix.print_payoffs([u1, u21], [A1, a2])

--- If P2 is type 0 ---


|   | L | R |
|---|---|---|
| U | (3, 3) | (0, 4) |
| D | (2, 1) | (1, 2) |


In [4]:
print(f'--- If P2 is type 1 ---')
bimatrix.print_payoffs([u1, u22], [A1, a2])

--- If P2 is type 1 ---


|   | L | R |
|---|---|---|
| U | (3, 3) | (0, 2) |
| D | (2, 0) | (1, 1) |


## Wide form
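We first convert the game to wide matrix form. Each column is a type-contingent strategy for player 2: `LR`, for example, means "play `L` as type 0 (PD) and `R` as type 1 (SH)". Each cell holds the two players' expected payoffs, averaged over player 2's type.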

In [9]:
t1, t2, A1, A2 = compute_full_matrix(U1, U2, p, [A1, a2])

In [10]:
bimatrix.print_payoffs([t1, t2], [A1,  A2], 3)

|   | LL | LR | RL | RR |
|---|---|---|---|---|
| U | (3.0, 3.0) | (0.6, 2.2) | (2.4, 3.2) | (0.0, 2.4) |
| D | (2.0, 0.2) | (1.2, 1.0) | (1.8, 0.4) | (1.0, 1.2) |
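
As a sanity check on the numbers above, the same expected-payoff arithmetic can be spelled out with plain Python lists. This is only a sketch of what `compute_full_matrix` does for this example. The names `wide`, `rows` and `acts` are introduced here for illustration, and results are rounded to three decimals to hide floating-point noise.

```python
# Spaniel's example, written with plain lists so the arithmetic is explicit.
p = 0.2                       # Pr(player 2 is type 0, the PD type)
u1 = [[3, 0], [2, 1]]         # player 1 (same for both types of player 2)
u2 = [[[3, 4], [1, 2]],       # player 2, type 0 (PD)
      [[3, 2], [0, 1]]]       # player 2, type 1 (SH)
rows, acts = ['U', 'D'], ['L', 'R']

wide = {}                     # (row, column) -> (payoff 1, payoff 2)
for i, r in enumerate(rows):
    for a0, n0 in enumerate(acts):      # action chosen by type 0
        for a1, n1 in enumerate(acts):  # action chosen by type 1
            e1 = p * u1[i][a0] + (1 - p) * u1[i][a1]
            e2 = p * u2[0][i][a0] + (1 - p) * u2[1][i][a1]
            wide[(r, n0 + n1)] = (round(e1, 3), round(e2, 3))

print(wide[('U', 'LR')])   # (0.6, 2.2)
print(wide[('D', 'RL')])   # (1.8, 0.4)
```

For instance, in cell (U, LR) player 1 earns 3 with probability 0.2 (the PD type plays `L`) and 0 with probability 0.8 (the SH type plays `R`), giving 0.6.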


## Removing strictly dominated strategies

Looking for strictly dominated strategies: we know that if P2 is the Prisoner's Dilemma type, playing $R$ (defecting) is a strictly dominant strategy. This is confirmed by running IESDS (iterated elimination of strictly dominated strategies) on the wide-form matrix representation of the game: every strategy in which the PD type plays $L$ is eliminated.

In [14]:
A_, T_ = bimatrix.IESDS([A1, A2], [t1, t2], DOPRINT=True)

Player 2: LL is dominated by RL
Player 2: LR is dominated by RR
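
The two dominance relations reported above can be checked by hand. The sketch below is not the `bimatrix.IESDS` implementation. It is a single pass over player 2's wide-form payoffs that tests domination by pure strategies only, with the helper `strictly_dominates` introduced here for illustration.

```python
# Player 2's wide-form payoffs (rows: U, D; columns: LL, LR, RL, RR).
cols = ['LL', 'LR', 'RL', 'RR']
t2 = [[3.0, 2.2, 3.2, 2.4],
      [0.2, 1.0, 0.4, 1.2]]

def strictly_dominates(payoffs, j, k):
    """True if column j pays strictly more than column k in every row."""
    return all(row[j] > row[k] for row in payoffs)

# Collect every (dominated, dominating) pair of columns.
found = [(cols[k], cols[j])
         for k in range(len(cols))
         for j in range(len(cols))
         if j != k and strictly_dominates(t2, j, k)]

for dominated, dominating in found:
    print(f'{dominated} is strictly dominated by {dominating}')
```

Player 1 has no strictly dominated row here (U is better against `LL` and `RL`, D against `LR` and `RR`), so a second round of elimination on the reduced game removes nothing further.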


Since those actions were strictly dominated, it suffices to focus on this reduced version of the game. 

In [15]:
bimatrix.print_payoffs(T_, A_, 3)

|   | RL | RR |
|---|---|---|
| U | (2.4, 3.2) | (0.0, 2.4) |
| D | (1.8, 0.4) | (1.0, 1.2) |


## Solving 
Now we simply call a game theory solver to find all equilibria of the game. 

In [16]:
eqs = list(nashpy.Game(T_[0], T_[1]).support_enumeration())
print(f'Found {len(eqs)} equilibria')
for i,eq in enumerate(eqs): 
    print(f'{i+1}: s1 = {eq[0]}, s2 = {eq[1]}')

Found 3 equilibria
1: s1 = [1. 0.], s2 = [1. 0.]
2: s1 = [0. 1.], s2 = [0. 1.]
3: s1 = [0.5 0.5], s2 = [0.625 0.375]
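
The third (mixed) equilibrium can be reproduced from the usual indifference conditions of a 2x2 game: each player mixes so that the other is indifferent between her two pure strategies. A minimal sketch, with `A`, `B`, `q` and `r` as names chosen here for illustration:

```python
# Reduced game (rows: U, D; columns: RL, RR).
A = [[2.4, 0.0], [1.8, 1.0]]   # player 1's payoffs
B = [[3.2, 2.4], [0.4, 1.2]]   # player 2's payoffs

# q = Pr(player 1 plays U) makes player 2 indifferent between RL and RR:
#   q*B[0][0] + (1-q)*B[1][0] = q*B[0][1] + (1-q)*B[1][1]
q = (B[1][1] - B[1][0]) / (B[0][0] - B[1][0] - B[0][1] + B[1][1])

# r = Pr(player 2 plays RL) makes player 1 indifferent between U and D:
#   r*A[0][0] + (1-r)*A[0][1] = r*A[1][0] + (1-r)*A[1][1]
r = (A[1][1] - A[0][1]) / (A[0][0] - A[0][1] - A[1][0] + A[1][1])

print(round(q, 3), round(r, 3))   # 0.5 0.625
```

This agrees with `s1 = [0.5 0.5]` and `s2 = [0.625 0.375]`. In terms of types, the PD type always plays `R`, while the SH type plays `L` with probability 0.625.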


## A warning: calling `nashpy` on the full game

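Interestingly, nashpy's `support_enumeration` seems to get confused when it is given the full game rather than the game with strictly dominated strategies already eliminated. It returns only the two pure-strategy equilibria (player 2's strategy vector now runs over `LL`, `LR`, `RL`, `RR`), misses the mixed equilibrium found above, and warns that the game is degenerate.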

In [17]:
G = nashpy.Game(t1, t2)

eqs = list(G.support_enumeration())
print(f'Found {len(eqs)} equilibria')
for i,eq in enumerate(eqs): 
    print(f'{i+1}: s1 = {eq[0]}, s2 = {eq[1]}')

Found 2 equilibria
1: s1 = [1. 0.], s2 = [0. 0. 1. 0.]
2: s1 = [0. 1.], s2 = [0. 0. 0. 1.]


An even number of (2) equilibria was returned. This
indicates that the game is degenerate. Consider using another algorithm
to investigate.
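
One way to confirm that an equilibrium is missing from this list is to take the mixed equilibrium of the reduced game, put zero weight on `LL` and `LR`, and check the best-response conditions in the full game directly. The sketch below does this in plain Python. The helper `is_best_response` and the tolerance `tol` are assumptions made here for illustration and are not part of nashpy.

```python
# Full wide-form game (rows: U, D; columns: LL, LR, RL, RR).
t1 = [[3.0, 0.6, 2.4, 0.0], [2.0, 1.2, 1.8, 1.0]]
t2 = [[3.0, 2.2, 3.2, 2.4], [0.2, 1.0, 0.4, 1.2]]

s1 = [0.5, 0.5]                  # mixed equilibrium of the reduced game ...
s2 = [0.0, 0.0, 0.625, 0.375]    # ... with zero weight on LL and LR

# Expected payoff of each pure strategy against the opponent's mix.
v1 = [sum(t1[i][j] * s2[j] for j in range(4)) for i in range(2)]
v2 = [sum(t2[i][j] * s1[i] for i in range(2)) for j in range(4)]

def is_best_response(values, strategy, tol=1e-9):
    """Every pure strategy played with positive probability must be optimal."""
    best = max(values)
    return all(best - v <= tol for v, w in zip(values, strategy) if w > 0)

is_ne = is_best_response(v1, s1) and is_best_response(v2, s2)
print([round(v, 3) for v in v1], [round(v, 3) for v in v2], is_ne)
```

Player 1 is indifferent between U and D, and `RL` and `RR` both pay player 2 strictly more than `LL` and `LR`, so the profile is a Nash equilibrium of the full game as well. This supports eliminating strictly dominated strategies before calling the solver, or trying another algorithm as the warning suggests.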
                  
