In [2]:
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import linprog

# 4. Computing Solution Concepts of Normal-form games

This chapter covers algorithms for computing solution concepts of normal-form games, such as Nash equilibria and dominated strategies. It starts with the easiest case, the two-player zero-sum game:

## 4.1 Computing Nash equilibria of two-player, zero-sum games

The previous chapter introduced minmax and maxmin strategies, and noted that in a two-player zero-sum game each player receives their minmax value (which equals their maxmin value) at equilibrium. We can use this to compute the equilibrium by linear programming.

Player 1's minmax value is the lowest payoff that player 2 can hold player 1 to, given that player 1 responds with their best action. Say I am player 1 and my best action is $A_1^j$. Then player 2 looks for a mixed strategy $s_2$ that minimises my expected return:

$$\sum_{k} u_1(A_1^j,A_2^k)s_2^k$$

Note: Here $k$ is the index of each action player 2 could take.

This can be re-written in linear-programming style as:

$$
\begin{align*}
\text{minimise} \quad & U_1 \\
\text{subject to} \quad & \sum_{k} u_1(A_1^j,A_2^k)s_2^k \leq U_1 \\
\text{} \quad & \sum_{k} s_2^k=1 \\
\text{} \quad & s_2^k\geq 0 \\
\end{align*}
$$

(Why? Well, $\min f(x)$ is the same as $\min y$ subject to $f(x)\leq y$.)

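Of course, we don't actually know which action $j$ is best for player 1, so we add one constraint per action. $U_1$ must then be at least player 1's payoff from every action, and minimising it leaves it equal to the payoff of the best one: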

$$
\begin{align*}
\text{minimise} \quad & U_1 \\
\text{subject to} \quad & \sum_{k} u_1(A_1^j,A_2^k)s_2^k \leq U_1 \quad \forall A_1^j \\
\text{} \quad & \sum_{k} s_2^k=1 \\
\text{} \quad & s_2^k\geq 0 \\
\end{align*}
$$
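The program above can be built mechanically from a payoff matrix. Below is a minimal sketch, assuming player 1's payoffs are stored in an array `u1` with one row per player 1 action and one column per player 2 action. The function name `minmax_strategy` and the 2x2 payoff matrix are hypothetical and only serve as an illustration. `linprog` expects inequality constraints in the form $A_{ub}\cdot x \leq b_{ub}$ and makes every variable non-negative unless told otherwise, so $U_1$ is given explicit free bounds.

```python
import numpy as np
from scipy.optimize import linprog

def minmax_strategy(u1):
    """Return (s_2, U_1) for the minmax LP, given player 1's payoff matrix.

    u1[j, k] is player 1's payoff when player 1 plays action j and
    player 2 plays action k. (Hypothetical helper, for illustration.)
    """
    u1 = np.asarray(u1, dtype=float)
    n_rows, n_cols = u1.shape
    # Decision variables: x = [s_2^1, ..., s_2^n, U_1]
    c = np.zeros(n_cols + 1)
    c[-1] = 1.0  # minimise U_1
    # One constraint per player 1 action j: sum_k u1[j, k] * s_2^k - U_1 <= 0
    A_ub = np.hstack([u1, -np.ones((n_rows, 1))])
    b_ub = np.zeros(n_rows)
    # The probabilities sum to 1 (U_1 is not part of the sum)
    A_eq = np.append(np.ones(n_cols), 0.0).reshape(1, -1)
    b_eq = np.array([1.0])
    # Probabilities are non-negative; U_1 may take any sign
    bounds = [(0, None)] * n_cols + [(None, None)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:n_cols], res.x[-1]

# Hypothetical zero-sum game (player 1's payoffs), chosen to have a non-zero value
u1_example = np.array([[2, -1],
                       [-1, 1]])
s2, value = minmax_strategy(u1_example)
print("Player 2 strategy:", s2.round(2))
print("Player 1 utility:", round(value, 2))
```

For this made-up game, working by hand gives player 2 mixing $(0.4, 0.6)$ and a value of $\frac{1}{5}$ for player 1, which is what the solver should reproduce up to numerical tolerance.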

Consider the Matching Pennies problem again:

$$
\begin{array}{c|cc}
\text{} & \text{H} & \text{T} \\
\hline
\text{H} & 1,-1 & -1,1 \\
\text{T} & -1,1 & 1,-1 \\
\end{array}
$$

Applying the formulation above from player 1's point of view gives the following program (where $s_2^1$ is the probability that player 2 plays heads and $s_2^2$ the probability that they play tails):

$$
\begin{align*}
\text{minimise} \quad & U_1 \\
\text{subject to} \quad & s_2^1 - s_2^2 \leq U_1 \quad \text{(player 1 goes heads)} \\
\text{} \quad & -s_2^1+s_2^2 \leq U_1 \quad \text{(player 1 goes tails)} \\
\text{} \quad & s_2^1 + s_2^2=1 \\
\text{} \quad & s_2^1, s_2^2 \geq 0 \\
\end{align*}
$$

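Adding the two inequality constraints gives $0 \leq 2U_1$, so $U_1 \geq 0$, and $U_1 = 0$ is only attainable when $s_2^1 - s_2^2 = 0$. The answer here is therefore $s_2^1=s_2^2=\frac{1}{2}$, with $U_1 = 0$.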

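Here is an example for the more complicated game paper/scissors/rock. Note that scipy's `linprog` solves $\min c\cdot x$ subject to $A_{ub}\cdot x \leq b_{ub}$, which is the form our constraints already have, so they can be passed in directly. By default `linprog` also restricts every variable to be non-negative, so we set `bounds` explicitly to leave $U_1$ free (in general the value of a game can be negative).

In [32]:
# Variables: x = [s_2(paper), s_2(scissors), s_2(rock), U_1]
c = np.array([0,0,0,1])
A_ub = np.array([
    [0,-1,1,-1], # Player 1 plays PAPER: gets 0 if player 2 goes paper, -1 if scissors, 1 if rock.
    [1,0,-1,-1], # SCISSORS. Note, as above the -1 at the end is because a+b+c<=d is turned into a+b+c-d<=0
    [-1,1,0,-1], # ROCK.
])
b_ub = np.array([0,0,0])
A_eq = np.array([[1,1,1,0]]) # probabilities sum to 1; U_1 is not part of the sum
b_eq = np.array([1])
bounds = [(0,None)]*3 + [(None,None)] # probabilities are non-negative, U_1 is free
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)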
print("Player 2 strategy at equilibrium:",res["x"].round(2)[:3])
print("Player 1 utility:",res["x"].round(2)[-1])

Player 2 strategy at equilibrium: [0.33 0.33 0.33]
Player 1 utility: 0.0


There we go. Unsurprisingly, the utility is 0 and the best option is to play each action with equal probability.
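A quick way to see why the uniform strategy is an equilibrium is to compute player 1's expected payoff for each pure action against it. The sketch below does that with a matrix product, and repeats the calculation for a hypothetical skewed strategy (made up for illustration) to show that any bias can be exploited.

```python
import numpy as np

# Player 1's payoffs: rows are player 1's action, columns are player 2's
# action, both in the order paper, scissors, rock.
u1 = np.array([
    [ 0, -1,  1],  # player 1 plays paper
    [ 1,  0, -1],  # player 1 plays scissors
    [-1,  1,  0],  # player 1 plays rock
])
actions = ["paper", "scissors", "rock"]

# Against the uniform strategy every pure action earns the same payoff,
# so player 1 has no profitable deviation.
uniform = np.full(3, 1 / 3)
payoffs_vs_uniform = u1 @ uniform
print("Payoffs vs uniform:", payoffs_vs_uniform.round(2))

# Hypothetical skewed strategy for player 2 that over-plays paper.
skewed = np.array([0.5, 0.25, 0.25])
payoffs_vs_skewed = u1 @ skewed
best = int(np.argmax(payoffs_vs_skewed))
print("Payoffs vs skewed:", payoffs_vs_skewed.round(2))
print("Best response:", actions[best])
```

Against the skewed strategy, scissors earns a positive expected payoff, so player 2 would do worse than the equilibrium value of 0.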

We can also work out the strategy for player 1. One option is to repeat the procedure above with the players' roles swapped. Another is to solve the maxmin problem instead, which corresponds to the dual problem:

$$
\begin{align*}
\text{maximise} \quad & U_1 \\
\text{subject to} \quad & \sum_{j} u_1(A_1^j,A_2^k)s_1^j \geq U_1 \quad \forall A_2^k \\
\text{} \quad & \sum_{j} s_1^j=1 \\
\text{} \quad & s_1^j\geq 0 \\
\end{align*}
$$
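The maxmin program can be handed to `linprog` as well. The sketch below is one way to do it for paper/scissors/rock, reading each constraint as "player 1's expected payoff against player 2's action $k$ is at least $U_1$". Since `linprog` only minimises and only accepts $\leq$ constraints, the objective and the constraints are both negated. The variable names are illustrative.

```python
import numpy as np
from scipy.optimize import linprog

# Player 1's payoffs: rows are player 1's action, columns are player 2's
# action, both in the order paper, scissors, rock.
u1 = np.array([
    [ 0, -1,  1],
    [ 1,  0, -1],
    [-1,  1,  0],
])
n_rows, n_cols = u1.shape

# Decision variables: x = [s_1(paper), s_1(scissors), s_1(rock), U_1]
c = np.zeros(n_rows + 1)
c[-1] = -1.0  # maximising U_1 is the same as minimising -U_1

# For each player 2 action k:
#   sum_j u1[j, k] * s_1^j >= U_1   <=>   -sum_j u1[j, k] * s_1^j + U_1 <= 0
A_ub = np.hstack([-u1.T, np.ones((n_cols, 1))])
b_ub = np.zeros(n_cols)

# Player 1's probabilities sum to 1
A_eq = np.append(np.ones(n_rows), 0.0).reshape(1, -1)
b_eq = np.array([1.0])

# Probabilities are non-negative; U_1 may take any sign
bounds = [(0, None)] * n_rows + [(None, None)]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
s1 = res.x[:n_rows]
maxmin_value = res.x[-1]
print("Player 1 strategy at equilibrium:", s1.round(2))
print("Player 1 utility:", maxmin_value.round(2))
```

Because the game is symmetric, player 1's strategy should again come out uniform with a value of 0, matching the minmax result.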

As a final note, it can be useful to reformulate the minmax program from before (or the maxmin program just above) by adding slack variables $r_1^j$, which turn the inequalities into equalities:

$$
\begin{align*}
\text{minimise} \quad & U_1 \\
\text{subject to} \quad & \sum_{k} u_1(A_1^j,A_2^k)s_2^k + r_1^j = U_1 \quad \forall A_1^j \\
\text{} \quad & \sum_{k} s_2^k=1 \\
\text{} \quad & s_2^k\geq 0 \\
\text{} \quad & r_1^j\geq 0 \\
\end{align*}
$$
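As a sanity check, the slack form can be solved with equality constraints only. The sketch below does this for paper/scissors/rock, stacking the variables as $[s_2, r_1, U_1]$; the layout and names are assumptions made for this illustration.

```python
import numpy as np
from scipy.optimize import linprog

# Player 1's payoffs: rows are player 1's action, columns are player 2's
# action, both in the order paper, scissors, rock.
u1 = np.array([
    [ 0, -1,  1],
    [ 1,  0, -1],
    [-1,  1,  0],
])
n_rows, n_cols = u1.shape

# Decision variables: x = [s_2 (n_cols entries), r_1 (n_rows entries), U_1]
c = np.zeros(n_cols + n_rows + 1)
c[-1] = 1.0  # minimise U_1

# One equality per player 1 action j:
#   sum_k u1[j, k] * s_2^k + r_1^j - U_1 = 0
payoff_rows = np.hstack([u1, np.eye(n_rows), -np.ones((n_rows, 1))])
# Player 2's probabilities sum to 1 (slacks and U_1 are excluded)
prob_row = np.concatenate([np.ones(n_cols), np.zeros(n_rows + 1)]).reshape(1, -1)
A_eq = np.vstack([payoff_rows, prob_row])
b_eq = np.concatenate([np.zeros(n_rows), [1.0]])

# Probabilities and slacks are non-negative; U_1 may take any sign
bounds = [(0, None)] * (n_cols + n_rows) + [(None, None)]

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
s2 = res.x[:n_cols]
slack = res.x[n_cols:n_cols + n_rows]
value = res.x[-1]
print("Player 2 strategy at equilibrium:", s2.round(2))
print("Slack per player 1 action:", slack.round(2))
print("Player 1 utility:", value.round(2))
```

A slack of zero for action $j$ means that action earns exactly $U_1$, so it is a best response for player 1. Here all three slacks should be zero, since every action does equally well against the uniform strategy.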
