In [1]:
import numpy as np
import nashpy as nash

A **(mixed) strategy** for a player is a probability distribution over that player's set of (pure) strategies. When the players randomize independently, each player's expected score is the expected value of their payoff under the resulting joint distribution.
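
As a minimal sketch of this definition, the check below tests whether a vector is a valid mixed strategy: non-negative entries that sum to 1. The helper name `is_mixed_strategy` and the tolerance `tol` are assumptions made for illustration; they are not part of the notebook or of any library.

```python
import numpy as np

def is_mixed_strategy(sigma, tol=1e-9):
    """Hypothetical helper: True if sigma is a probability vector."""
    sigma = np.asarray(sigma, dtype=float)
    # A mixed strategy is 1-D, has no negative entries, and sums to 1.
    return bool(
        sigma.ndim == 1
        and np.all(sigma >= -tol)
        and abs(sigma.sum() - 1.0) <= tol
    )

print(is_mixed_strategy([0.2, 0.8]))  # mixes two pure strategies
print(is_mixed_strategy([1.0, 0.0]))  # a pure strategy is a degenerate mixed strategy
print(is_mixed_strategy([0.7, 0.7]))  # entries do not sum to 1
```

Note that a pure strategy passes the check: it is the special case where all probability sits on one entry.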

An upshot is that for a 2-player game with $m\times n$ payoff matrices $(A,B)$, given mixed strategies $\sigma^k = \{\sigma^k_j\}_{j\in S_k}$ for player $k$ (so $\sigma^1$ has $m$ entries and $\sigma^2$ has $n$), the utility for player 1 is
$$ u^1 = \sum_{i=1}^m\sum_{j=1}^n A_{i,j}\sigma^1_i\sigma^2_j $$
and the utility for player 2 is
$$ u^2 = \sum_{i=1}^m\sum_{j=1}^n B_{i,j}\sigma^1_i\sigma^2_j $$

**Example**: Consider the coin matching game (matching pennies). Given strategies $\sigma^1=(0.2, 0.8), \sigma^2=(0.6, 0.4)$, we compute the expected payoffs to be...

In [2]:
# payoff matrices
A = np.array([[1, -1], [-1, 1]])
B = np.array([[-1, 1], [1, -1]])
# player strategies
sigma_1 = np.array([0.2, 0.8])
sigma_2 = np.array([0.6, 0.4])

def compute_payoff_naive(A, B, sigma_1, sigma_2):
    expected_payoff_1 = 0
    expected_payoff_2 = 0
    for i in range(len(sigma_1)):
        for j in range(len(sigma_2)):
            expected_payoff_1 += A[i][j] * sigma_1[i] * sigma_2[j]
            expected_payoff_2 += B[i][j] * sigma_1[i] * sigma_2[j]
    return np.array([expected_payoff_1, expected_payoff_2])

compute_payoff_naive(A, B, sigma_1, sigma_2)

array([-0.12,  0.12])
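
To see where this number comes from, the double sum for player 1 can be expanded term by term, using the matrices and strategies defined in the cell above:

$$ u^1 = (1)(0.2)(0.6) + (-1)(0.2)(0.4) + (-1)(0.8)(0.6) + (1)(0.8)(0.4) = 0.12 - 0.08 - 0.48 + 0.32 = -0.12 $$

Since $B = -A$ in this game, every term flips sign for player 2, so $u^2 = -u^1 = 0.12$.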

This is recognizable algebraically: it's just a bilinear form, i.e. a vector-matrix-vector product! Here, $u^1 = (\sigma^1)^T A\sigma^2$ and $u^2 = (\sigma^1)^T B\sigma^2$.

In [10]:
def compute_payoff(A, B, sigma_1, sigma_2):
    # `.T` is a no-op on 1-D arrays; `@` already treats sigma_1 as a row vector.
    expected_payoff_1 = sigma_1 @ A @ sigma_2
    expected_payoff_2 = sigma_1 @ B @ sigma_2
    return np.array([expected_payoff_1, expected_payoff_2])

compute_payoff(A, B, sigma_1, sigma_2)

array([-0.12,  0.12])
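
As a sanity check that the double sum and the matrix product agree beyond the $2\times 2$ case, here is a small sketch on a randomly generated $3\times 4$ game. The seed, the matrix shape, and the use of a Dirichlet draw to get random probability vectors are all arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

m, n = 3, 4  # player 1 has m pure strategies, player 2 has n
A = rng.integers(-5, 6, size=(m, n))
B = rng.integers(-5, 6, size=(m, n))
# A Dirichlet sample is non-negative and sums to 1, i.e. a mixed strategy.
sigma_1 = rng.dirichlet(np.ones(m))
sigma_2 = rng.dirichlet(np.ones(n))

# Explicit double sum, as in the formulas for u^1 and u^2.
naive_1 = sum(A[i, j] * sigma_1[i] * sigma_2[j] for i in range(m) for j in range(n))
naive_2 = sum(B[i, j] * sigma_1[i] * sigma_2[j] for i in range(m) for j in range(n))

# The same quantities as vector-matrix-vector products.
matrix_1 = sigma_1 @ A @ sigma_2
matrix_2 = sigma_1 @ B @ sigma_2

agree = bool(np.isclose(naive_1, matrix_1) and np.isclose(naive_2, matrix_2))
print(agree)
```

The comparison uses `np.isclose` rather than `==` because the two computations add floating-point terms in a different order.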

We add this computation to the `game` class as a `payoff` method.

In [14]:
class game:
    """
    2-player normal-form game
    """
    def __init__(self, payoff_A, payoff_B=None):
        self.payoff_A = np.array(payoff_A)
        if payoff_B is None:
            # zero-sum game
            self.payoff_B = -self.payoff_A
        else:
            self.payoff_B = np.array(payoff_B)
        
    def __repr__(self):
        return "player 1 payoff:\n" + str(self.payoff_A) + "\n\n" + \
               "player 2 payoff:\n" + str(self.payoff_B)
    
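    def payoff(self, sigma_1, sigma_2):
        """Expected payoffs (u^1, u^2) under mixed strategies sigma_1, sigma_2."""
        # Accept plain lists as well as arrays.
        sigma_1, sigma_2 = np.asarray(sigma_1), np.asarray(sigma_2)
        expected_payoff_1 = sigma_1 @ self.payoff_A @ sigma_2
        expected_payoff_2 = sigma_1 @ self.payoff_B @ sigma_2
        return np.array([expected_payoff_1, expected_payoff_2])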

In [15]:
coins = game(A, B)
coins

player 1 payoff:
[[ 1 -1]
 [-1  1]]

player 2 payoff:
[[-1  1]
 [ 1 -1]]

In [16]:
coins.payoff(sigma_1, sigma_2)

array([-0.12,  0.12])

In [17]:
"""testing a zero-sum game"""
A = [[1, -2, 4], [2, -1, 2], [7, -7, 6]]
g = game(A)
g

player 1 payoff:
[[ 1 -2  4]
 [ 2 -1  2]
 [ 7 -7  6]]

player 2 payoff:
[[-1  2 -4]
 [-2  1 -2]
 [-7  7 -6]]
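
In a zero-sum game the two expected payoffs should cancel for any pair of mixed strategies, since $B = -A$ implies $u^2 = -u^1$. The sketch below checks this for the $3\times 3$ game above. To stay self-contained it works with the matrices directly rather than through the `game` class, and the two strategies are arbitrary choices made for illustration.

```python
import numpy as np

# Payoff matrix of the 3x3 zero-sum game; player 2's payoff is its negation.
A = np.array([[1, -2, 4], [2, -1, 2], [7, -7, 6]])
B = -A

# Illustrative mixed strategies (assumed, not taken from the notebook).
sigma_1 = np.full(3, 1 / 3)             # player 1 mixes uniformly over rows
sigma_2 = np.array([0.5, 0.25, 0.25])   # player 2 favours the first column

u1 = sigma_1 @ A @ sigma_2
u2 = sigma_1 @ B @ sigma_2

print(u1, u2)
print(np.isclose(u1 + u2, 0.0))  # zero-sum: the expected payoffs cancel
```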