# <center>Nash equilibria in bimatrix games</center>
### <center>Alfred Galichon (NYU & Sciences Po) and Antoine Jacquet (Sciences Po)</center>
## <center>'math+econ+code' masterclass series</center>
#### <center>With Python code examples</center>
© 2018–2024 by Alfred Galichon.
Past and present support from NSF grant DMS-1716489 and ERC grant CoG-866274 is acknowledged, as are inputs from contributors listed [here](http://www.math-econ-code.org/team).

**If you reuse material from this masterclass, please cite as:**<br>
Alfred Galichon and Antoine Jacquet, 'Nash equilibria in bimatrix games', 'math+econ+code' masterclass series. https://www.math-econ-code.org/


### References

* Mangasarian and Stone (1964). "Two-person nonzero-sum games and quadratic programming." *Journal of Mathematical Analysis and Applications*.  
* Lemke and Howson (1964). "Equilibrium points of bimatrix games." *SIAM Journal on Applied Mathematics*.
* Codenotti (2011). "Lecture 11: The Lemke–Howson Algorithm." *Computational Aspects of Game Theory*, Bertinoro Spring School. http://wwwold.iit.cnr.it/staff/bruno.codenotti/lecture11p.pdf

### Learning objectives

* LCP formulation of Nash equilibria
* Mangasarian–Stone quadratic programming formulation
* Lemke–Howson algorithm


In [1]:
#!pip install mec --upgrade
import numpy as np
import gurobipy as grb


# Nash equilibrium in a two-player bimatrix game

Consider a two-player game where if player 1 plays $i \in \mathcal I$ and player 2 plays $j \in \mathcal J$, the payoff to player 1 is $A_{ij} > 0$ and the payoff to player 2 is $B_{ij} > 0$. (Assuming positive payoffs is without loss of generality.)
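
To see why the positivity assumption is harmless, note that adding a constant to every entry of $A$ shifts the expected payoff of each pure strategy of player 1 by that same constant, so best responses do not change. The following is a minimal sketch; the helper `shift_positive` and the example matrix are illustrative assumptions, not part of the original notebook.

```python
import numpy as np

def shift_positive(M):
    """Return M shifted by a constant so that its smallest entry equals 1."""
    M = np.asarray(M, dtype=float)
    return M - M.min() + 1.0

# Illustrative payoff matrix with nonpositive entries (an assumption for this example)
A = np.array([[-1.0, 2.0],
              [0.0, -3.0]])
A_pos = shift_positive(A)

q = np.array([0.25, 0.75])                 # any mixed strategy of player 2
payoffs, payoffs_pos = A @ q, A_pos @ q    # expected payoff of each pure strategy i

# Every pure strategy's expected payoff moves by the same constant ...
shift = payoffs_pos - payoffs
# ... so the best response of player 1 is the same in both games.
same_best_response = np.argmax(payoffs) == np.argmax(payoffs_pos)
```

The same argument applies to $B$ and player 2.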


Let's begin by defining a Python class for bimatrix games.

In [2]:
class Bimatrix_game:
    def __init__(self, A_i_j, B_i_j):
        if A_i_j.shape != B_i_j.shape:
            raise ValueError("A_i_j and B_i_j must be of the same size.")
        self.A_i_j, self.B_i_j = A_i_j, B_i_j
        self.nbi,self.nbj = A_i_j.shape
    

Now recall the definition of a Nash equilibrium: it is a pair of vectors $p = (p_i)_{i \in \mathcal I}$ and $q = (q_j)_{j \in \mathcal J}$ which simultaneously solve

\begin{equation}
\max_{p \geq 0} \left\{ p^\top A q \mid \textstyle\sum_i p_i = 1 \right\},
\qquad
\max_{q \geq 0} \left\{ p^\top B q \mid \textstyle\sum_j q_j = 1 \right\}.
\end{equation}

By writing the KKT conditions for these two linear programs we obtain necessary and sufficient conditions for $p$ and $q$ to be a Nash equilibrium: there must exist two real numbers $\alpha$ and $\beta$ (the dual variables associated with the equality constraints) such that

\begin{align}
p_i &\geq 0 \\
q_j &\geq 0 \\
\textstyle\sum_i p_i &= 1 \\
\textstyle\sum_j q_j &= 1 \\
\alpha - (A q)_i &\geq 0 \quad (\forall i) \\
\beta - (B^\top p)_j &\geq 0 \quad (\forall j) \\
p_i \big( \alpha - (Aq)_i \big) &= 0 \quad (\forall i) \\
q_j \big( \beta - (B^\top p)_j \big) &= 0 \quad (\forall j).
\end{align}
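
As a sanity check, these conditions can be evaluated numerically for a candidate $(p, q, \alpha, \beta)$. The sketch below is illustrative only: the function name `kkt_residuals` and the small $2 \times 2$ game are assumptions introduced for this example.

```python
import numpy as np

def kkt_residuals(A, B, p, q, alpha, beta):
    """Largest violation of each group of KKT conditions (0 means satisfied)."""
    A, B = np.asarray(A, dtype=float), np.asarray(B, dtype=float)
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    slack_p = alpha - A @ q      # alpha - (A q)_i, required to be >= 0
    slack_q = beta - B.T @ p     # beta - (B^T p)_j, required to be >= 0
    return {
        'nonnegativity': float(max(0.0, -p.min(), -q.min())),
        'simplex': float(max(abs(p.sum() - 1), abs(q.sum() - 1))),
        'dual_feasibility': float(max(0.0, -slack_p.min(), -slack_q.min())),
        'complementarity': float(max(np.abs(p * slack_p).max(),
                                     np.abs(q * slack_q).max())),
    }

# Illustrative 2x2 game (an assumption), with positive payoffs
A = np.array([[2.0, 1.0], [1.0, 2.0]])
B = np.array([[1.0, 2.0], [2.0, 1.0]])

# Uniform mixing satisfies all conditions with alpha = beta = 1.5
res_mixed = kkt_residuals(A, B, p=[0.5, 0.5], q=[0.5, 0.5], alpha=1.5, beta=1.5)

# The pure profile (1, 1) violates complementarity for player 2
res_pure = kkt_residuals(A, B, p=[1.0, 0.0], q=[1.0, 0.0], alpha=2.0, beta=2.0)
```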



*Question.* What is the interpretation of $\alpha$ and $\beta$?

## LCP formulation

The above problem is not an LCP in the strict sense for two reasons:  
(i) there are no nonnegativity constraints on $\alpha$ and $\beta$ a priori,  
(ii) there are equality constraints $\sum_i p_i = 1$ and $\sum_j q_j = 1$.  
This is sometimes called a *mixed linear complementarity problem*.
Still, it can be turned into a bona fide LCP:

$0 \leq p \perp \alpha 1_{\mathcal I} - Aq \geq 0$

$0 \leq q \perp \beta 1_{\mathcal J} - B^\top p \geq 0$ 

$0 \leq \alpha \perp - 1 + 1_{\mathcal I}^\top p \geq 0$

$0 \leq \beta \perp - 1 + 1_{\mathcal J}^\top q \geq 0$.

The difference with the previous formulation is that we enforce $\alpha \geq 0$ and $\beta \geq 0$, and we relax the equalities to inequalities: $\sum_i p_i \geq 1$ and $\sum_j q_j \geq 1$.

*Exercise.* Show that the solutions $(p, q, \alpha, \beta)$ of this LCP coincide with those of the KKT conditions above. (Recall that $A_{ij} > 0$ and $B_{ij} > 0$.)
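
To make the LCP structure explicit, one can stack $z = (p, q, \alpha, \beta)$ and write the four conditions as $0 \leq z \perp Mz + r \geq 0$. The sketch below assembles $M$ and $r$ with NumPy and tests a candidate $z$; the names `nash_lcp` and `solves_lcp`, the tolerance, and the $2 \times 2$ game are assumptions made for illustration.

```python
import numpy as np

def nash_lcp(A, B):
    """Assemble (M, r) so that the LCP reads 0 <= z perp M z + r >= 0, z = (p, q, alpha, beta)."""
    A, B = np.asarray(A, dtype=float), np.asarray(B, dtype=float)
    nbi, nbj = A.shape
    M = np.block([
        [np.zeros((nbi, nbi)), -A,                   np.ones((nbi, 1)),  np.zeros((nbi, 1))],
        [-B.T,                 np.zeros((nbj, nbj)), np.zeros((nbj, 1)), np.ones((nbj, 1))],
        [np.ones((1, nbi)),    np.zeros((1, nbj)),   np.zeros((1, 1)),   np.zeros((1, 1))],
        [np.zeros((1, nbi)),   np.ones((1, nbj)),    np.zeros((1, 1)),   np.zeros((1, 1))],
    ])
    r = np.concatenate([np.zeros(nbi + nbj), [-1.0, -1.0]])
    return M, r

def solves_lcp(M, r, z, tol=1e-9):
    """Check z >= 0, w = M z + r >= 0 and z'w = 0 up to a tolerance."""
    z = np.asarray(z, dtype=float)
    w = M @ z + r
    return bool(z.min() >= -tol and w.min() >= -tol and abs(z @ w) <= tol)

# Illustrative 2x2 game (an assumption) and its mixed equilibrium
A = np.array([[2.0, 1.0], [1.0, 2.0]])
B = np.array([[1.0, 2.0], [2.0, 1.0]])
M, r = nash_lcp(A, B)
z_eq = np.array([0.5, 0.5, 0.5, 0.5, 1.5, 1.5])   # (p, q, alpha, beta)
ok_eq = solves_lcp(M, r, z_eq)
ok_zero = solves_lcp(M, r, np.zeros(6))           # z = 0 violates sum(p) >= 1
```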
<!--
In practice this is not a problem for our solutions.
Indeed, since $- 1 + 1_{\mathcal J}^\top q \geq 0$ there must be at least one $q_j > 0$.  
Then, because $A_{ij} > 0$ for all $ij$ and $\alpha 1_{\mathcal I} - Aq \geq 0$, we must have $\alpha > 0$, hence imposing $\alpha \geq 0$ is not an issue.  
Furthermore, since $\alpha > 0$ the equality $- 1 + 1_{\mathcal I}^\top p = 0$ must hold.  
By a similar argument, we must have $\beta > 0$ and $- 1 + 1_{\mathcal J}^\top q = 0$.
-->


## Mangasarian and Stone formulation

Let's look at the quadratic program associated with this LCP. It is

\begin{align}
\min_{p \geq 0, q \geq 0, \alpha \geq 0, \beta \geq 0}
&\Big\{
p^\top (\alpha 1_{\mathcal I} - Aq )
+ q^\top (\beta 1_{\mathcal J} - B^\top p)
+ \alpha (-1 + 1_{\mathcal I}^\top p) + \beta (-1 + 1_{\mathcal J}^\top q) 
\Big\} \\
\text{s.t.} ~ & \alpha 1_{\mathcal I} - Aq \geq 0 \\
              & \beta 1_{\mathcal J} - B^\top p \geq 0 \\
              & -1 + 1_{\mathcal I}^\top p \geq 0 \\
              & -1 + 1_{\mathcal J}^\top q \geq 0.
\end{align}

The constraint $-1 + 1_{\mathcal I}^\top p \geq 0$ must bind: if not, we have $1_{\mathcal I}^\top p > 1$, and we can take $\tilde p_i = p_i / (1_{\mathcal I}^\top p) \leq p_i$ and $\tilde \beta = \beta / (1_{\mathcal I}^\top p) < \beta$, which remain feasible and decrease our objective.  
Similarly, the constraint $-1 + 1_{\mathcal J}^\top q \geq 0$ must also bind.

Thus, after simplification we obtain Mangasarian and Stone's formulation:

\begin{align}
\min_{p \geq 0, q \geq 0, \alpha, \beta}
&\left\{
\alpha + \beta - p^\top (A + B) q
\right\} \\
\text{s.t.} ~ & \alpha 1_{\mathcal I} - Aq \geq 0 \\
              & \beta 1_{\mathcal J} - B^\top p \geq 0 \\
              & -1 + 1_{\mathcal I}^\top p = 0 \\
              & -1 + 1_{\mathcal J}^\top q = 0.
\end{align}

*Remark.* This formulation is actually valid even when the matrices $A$ and $B$ are not positive.
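
A useful way to read this objective: for fixed $(p, q)$, the smallest feasible values are $\alpha = \max_i (Aq)_i$ and $\beta = \max_j (B^\top p)_j$, so the objective is the sum of the two players' gains from their best pure deviations. It is therefore nonnegative, and equal to zero exactly at a Nash equilibrium. The sketch below evaluates it; the function name `ms_gap` and the $2 \times 2$ game are illustrative assumptions.

```python
import numpy as np

def ms_gap(A, B, p, q):
    """Mangasarian-Stone objective at (p, q), using the smallest feasible alpha and beta."""
    A, B = np.asarray(A, dtype=float), np.asarray(B, dtype=float)
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    alpha = (A @ q).max()       # smallest alpha with alpha 1 - A q >= 0
    beta = (B.T @ p).max()      # smallest beta with beta 1 - B^T p >= 0
    return float(alpha + beta - p @ (A + B) @ q)

# Illustrative 2x2 game (an assumption)
A = np.array([[2.0, 1.0], [1.0, 2.0]])
B = np.array([[1.0, 2.0], [2.0, 1.0]])

gap_eq = ms_gap(A, B, [0.5, 0.5], [0.5, 0.5])     # mixed equilibrium: gap is 0
gap_pure = ms_gap(A, B, [1.0, 0.0], [1.0, 0.0])   # player 2 gains 1 by deviating
```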


## Abridged LCP representation


*Exercise.* Show that if $(p,q)$ is a Nash equilibrium with associated values $\alpha$ and $\beta$, then $x$ and $y$ defined by $x_i = \dfrac{p_i}{\beta}$ and $y_j = \dfrac{q_j}{\alpha}$ satisfy the LCP:  
$0 \leq x \perp 1_{\mathcal I} - A y \geq 0$

$0 \leq y \perp 1_{\mathcal J} - B^\top x \geq 0$.

Conversely, show that if $(x,y)$ satisfies this LCP, then either:
- $x=0$ and $y=0$, or 
- $p_i = \dfrac{x_i}{\sum_k x_k}$ and $q_j = \dfrac{y_j}{\sum_l y_l}$ is a Nash equilibrium, and in that case, $\alpha = \dfrac{1}{\sum_l y_l}$ and $\beta = \dfrac{1}{\sum_k x_k}$.
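
The change of variables in this exercise can be checked numerically. The sketch below maps $(p, q, \alpha, \beta)$ to $(x, y)$, tests the abridged LCP, and maps back; all function names, the tolerance, and the $2 \times 2$ game are assumptions introduced for illustration.

```python
import numpy as np

def nash_to_abridged(p, q, alpha, beta):
    """x_i = p_i / beta and y_j = q_j / alpha."""
    return np.asarray(p, dtype=float) / beta, np.asarray(q, dtype=float) / alpha

def abridged_to_nash(x, y):
    """Inverse map; returns None for the trivial solution x = 0, y = 0."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    if x.sum() == 0 and y.sum() == 0:
        return None
    return x / x.sum(), y / y.sum(), 1 / y.sum(), 1 / x.sum()   # p, q, alpha, beta

def is_abridged_solution(A, B, x, y, tol=1e-9):
    """Check 0 <= x perp 1 - A y >= 0 and 0 <= y perp 1 - B^T x >= 0."""
    r = 1 - A @ y
    s = 1 - B.T @ x
    feasible = min(x.min(), y.min(), r.min(), s.min()) >= -tol
    return bool(feasible and abs(x @ r) <= tol and abs(y @ s) <= tol)

# Illustrative 2x2 game (an assumption) with equilibrium p = q = (1/2, 1/2), alpha = beta = 3/2
A = np.array([[2.0, 1.0], [1.0, 2.0]])
B = np.array([[1.0, 2.0], [2.0, 1.0]])
x, y = nash_to_abridged([0.5, 0.5], [0.5, 0.5], alpha=1.5, beta=1.5)
ok = is_abridged_solution(A, B, x, y)
p, q, alpha, beta = abridged_to_nash(x, y)
```

The trivial solution $x = 0$, $y = 0$ also passes `is_abridged_solution`, which is why `abridged_to_nash` treats it separately.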


# Zero-sum games

Assume $B = c - A$ with $c$ a constant, so that the game is constant-sum and therefore strategically equivalent to a zero-sum game.

We use the data from Palacios–Huerta (2003) on penalty kicks. (See the [lecture on zero-sum games](https://www.math-econ-code.org/matrix-games).)

In [3]:
penalty_data = np.array([[53.21, 71.35, 93.80], 
                         [90.26, 42.81, 86.12], 
                         [96.88, 100.0, 75.43]])

penalty_zero_sum = Bimatrix_game(A_i_j = penalty_data, B_i_j = 100 - penalty_data)
print('A_i_j =\n', penalty_zero_sum.A_i_j)

A_i_j =
 [[ 53.21  71.35  93.8 ]
 [ 90.26  42.81  86.12]
 [ 96.88 100.    75.43]]


Here only $A_{ij}$ is used: $B_{ij}$ is ignored, and the game is solved as a zero-sum game, i.e. as an LP from the perspective of one of the two players.  
Recall for instance, from the perspective of player 1:
\begin{align}
\min_{x_i \geq 0} ~& \textstyle\sum_i x_i \\
\text{s.t.} ~& \textstyle\sum_i A_{ij} x_i \geq 1 \quad [y_j \geq 0].
\end{align}

We then recover the strategy using $p_i = \dfrac{x_i}{\sum_i x_i}$, $q_j = \dfrac{y_j}{\sum_i x_i}$ (since $\sum_i x_i = \sum_j y_j$ by strong duality).

## Solution using Gurobi

We can adapt the Gurobi solver from the `Matrix_game` class.

In [4]:
from mec.gt import Matrix_game

def Bimatrix_game_zero_sum_solve(self, verbose=0):
    return Matrix_game(self.A_i_j).solve(verbose)

Bimatrix_game.zero_sum_solve = Bimatrix_game_zero_sum_solve

In [5]:
from scipy import sparse

In [6]:
penalty_zero_sum.zero_sum_solve()

Set parameter Username
Academic license - for non-commercial use only - expires 2025-01-21


{'p_i': array([0.3033884 , 0.15180727, 0.54480433]),
 'q_j': array([0.21908541, 0.10161508, 0.67929951]),
 'val': 82.62606457928055}

# Solving non-zero-sum games

Now we consider a version of the game with another $B$, such that the game is not zero-sum anymore.  
Imagine for instance that the goalkeeper gets a bonus if he jumps in the correct direction of the penalty kick, whether or not he catches the ball.

In [7]:
penalty_nonzero_sum = Bimatrix_game(A_i_j = penalty_data,
                                    B_i_j = np.array([[150, 100, 100],
                                                      [100, 150, 100],
                                                      [100, 100, 150]]) - penalty_data)
print('A_i_j =\n', penalty_nonzero_sum.A_i_j)
print('B_i_j =\n', penalty_nonzero_sum.B_i_j)

A_i_j =
 [[ 53.21  71.35  93.8 ]
 [ 90.26  42.81  86.12]
 [ 96.88 100.    75.43]]
B_i_j =
 [[ 96.79  28.65   6.2 ]
 [  9.74 107.19  13.88]
 [  3.12   0.    74.57]]


## Mangasarian–Stone

We implement the Mangasarian–Stone formulation using Gurobi.

In [8]:
tst = np.array([1] ).reshape(1)
tst * np.ones((3,1))

array([[1.],
       [1.],
       [1.]])

In [9]:
def Bimatrix_game_mangasarian_stone_solve(self, verbose=0):
    model = grb.Model()
    model.Params.OutputFlag = 0
    model.Params.NonConvex = 2  # the objective is bilinear in (p, q), hence nonconvex
    p_i = model.addMVar(shape = self.nbi)  # default lower bound is 0
    q_j = model.addMVar(shape = self.nbj)
    α = model.addMVar(shape = 1, lb = -grb.GRB.INFINITY)  # α and β are free variables
    β = model.addMVar(shape = 1, lb = -grb.GRB.INFINITY)
    # objective: α + β - p'(A + B)q
    model.setObjective(α + β - p_i @ (self.A_i_j + self.B_i_j) @ q_j, sense=grb.GRB.MINIMIZE)
    # constraints: α 1 - A q >= 0 and β 1 - B'p >= 0
    model.addConstr(np.ones((self.nbi,1)) @ α - self.A_i_j @ q_j >= 0)
    model.addConstr(np.ones((self.nbj,1)) @ β - self.B_i_j.T @ p_i >= 0)
    # p and q are probability vectors
    model.addConstr(p_i.sum() == 1)
    model.addConstr(q_j.sum() == 1)
    model.optimize()
    # the solution vector follows the order in which variables were added: p_i, q_j, α, β
    sol = np.array(model.getAttr('x'))
    if verbose > 0: print('p_i =', sol[:self.nbi], '\nq_j =', sol[self.nbi:(self.nbi+self.nbj)])
    return {'p_i': sol[:self.nbi], 'q_j': sol[self.nbi:(self.nbi+self.nbj)],
            'val1': sol[-2], 'val2': sol[-1]}

Bimatrix_game.mangasarian_stone_solve = Bimatrix_game_mangasarian_stone_solve

In [10]:
penalty_nonzero_sum.mangasarian_stone_solve()

{'p_i': array([0.3374337 , 0.24917907, 0.41338723]),
 'q_j': array([0.21908541, 0.10161508, 0.67929951]),
 'val1': 82.62606457928057,
 'val2': 36.37698012002112}

In [11]:
penalty_zero_sum.mangasarian_stone_solve()

{'p_i': array([0.3033884 , 0.15180727, 0.54480433]),
 'q_j': array([0.21908541, 0.10161508, 0.67929951]),
 'val1': 82.62606457928057,
 'val2': 17.373935420719445}

Notice that the goalkeeper (player 2) keeps the same strategy, and his payoff increases.  
Conversely, the kicker (player 1) changes strategy, but his payoff stays the same. Why?
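
One way to explore this question numerically: in an equilibrium where both players mix over all of their strategies, complementarity forces $Aq = \alpha 1_{\mathcal I}$ and $B^\top p = \beta 1_{\mathcal J}$, so $q$ is pinned down by $A$ alone and $p$ by $B$ alone. The sketch below solves these two linear systems with NumPy on the penalty data. The helper `solve_indifference` is an illustrative assumption; it presumes a fully mixed equilibrium and a square, nonsingular system, and it does not check that the result is nonnegative.

```python
import numpy as np

def solve_indifference(M):
    """Solve M v = t 1 together with sum(v) = 1, and return (v, t)."""
    M = np.asarray(M, dtype=float)
    n = M.shape[0]                                   # M is assumed square
    K = np.block([[M, -np.ones((n, 1))],
                  [np.ones((1, n)), np.zeros((1, 1))]])
    rhs = np.concatenate([np.zeros(n), [1.0]])
    sol = np.linalg.solve(K, rhs)
    return sol[:n], sol[n]

penalty_data = np.array([[53.21, 71.35, 93.80],
                         [90.26, 42.81, 86.12],
                         [96.88, 100.0, 75.43]])
B_zero = 100 - penalty_data                          # zero-sum version
B_nonzero = 100 + 50 * np.eye(3) - penalty_data      # version with the goalkeeper's bonus

q, alpha = solve_indifference(penalty_data)          # A q = alpha 1: same in both games
p_zero, beta_zero = solve_indifference(B_zero.T)     # B^T p = beta 1
p_nonzero, beta_nonzero = solve_indifference(B_nonzero.T)
```

Since $A$ is the same in both games, `q` and `alpha` coincide, while `p` and `beta` change with $B$.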

## Lemke–Howson algorithm

The Lemke–Howson algorithm was developed in 1964 to solve bimatrix games. 
It was later extended by Lemke to solve LCPs (see previous lecture).

This algorithm finds a non-trivial solution to the abridged representation LCP seen above:

$0 \leq x \perp 1_{\mathcal I} - A y \geq 0$

$0 \leq y \perp 1_{\mathcal J} - B^\top x \geq 0$.

Write $z = \binom{x}{y}$ and define
\begin{equation}
    C =
    \begin{pmatrix}
    0_{\mathcal I \times \mathcal I} & A \\
    B^\top & 0_{\mathcal J \times \mathcal J}
    \end{pmatrix}.
\end{equation}
Then the problem is to find $z$ and $w$ such that

\begin{equation}
    z \geq 0,
    \qquad
    w \geq 0,
    \qquad
    w + Cz = 1_{\mathcal I + \mathcal J},
    \qquad
    z_k w_k = 0 \quad (\forall k).
\end{equation}

Because the right-hand vector in $w + Cz = 1_{\mathcal I + \mathcal J}$ is positive, we can begin by setting the basic variables as $(w_1, \dots, w_{\mathcal I + \mathcal J})$, and the nonbasic variables as $(z_1, \dots, z_{\mathcal I + \mathcal J})$.
This is a complementary basis, in the sense that it includes exactly one variable from each pair $(z_k, w_k)$.
Its associated solution is the trivial one, $z = 0$ (which is the only solution not corresponding to a Nash equilibrium).

We use the equation $w + Cz = 1_{\mathcal I + \mathcal J}$ to build an initial tableau associated with our LCP. 
We then choose an arbitrary variable $z_k$ to enter the basis, and we perform complementary pivot operations until we find another complementary basis.
This end basis will necessarily correspond to a nontrivial solution.

*Question.* Why can't this algorithm end with ray termination?


---

*Lemke–Howson algorithm*

**Step 0.**
Choose some nonbasic variable, for instance $z_1$, which enters the basis.

**Step 1.**
Determine the departing variable using the minimum-ratio rule in the current tableau, and pivot to update the tableau for the new basis.  
Then check whether the new basis is complementary:
if yes, we have found a solution;
if not, go to step 2.

**Step 2.**
The new entering variable is the complement of the one which just left. 
Go back to step 1.

---

Note that since we maintain almost-complementarity of the basis throughout, it is enough to check that:  
(i) the complement of the variable which just departed *is* in the basis,  
(ii) the complement of the variable which just entered *is not* in the basis.

*Remark.* The Lemke–Howson algorithm actually works for LCPs not only associated with bimatrix games, but for any matrix $C$ which is nonnegative with at least one positive entry per column, and a nonnegative right-hand vector.
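
The steps above can be sketched with plain NumPy, without relying on a tableau class. This is a minimal illustration rather than a robust implementation: the function name `lemke_howson_sketch`, the tolerance `1e-12` and the iteration cap are assumptions, ties in the minimum-ratio rule are broken by lowest row index, and degenerate games are not handled.

```python
import numpy as np

def lemke_howson_sketch(A, B, start=0, max_iter=1000):
    """Follow the complementary-pivot path from z = 0; return mixed strategies (p, q)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    A = A - A.min() + 1.0                      # shift payoffs so both matrices are positive
    B = B - B.min() + 1.0
    nbi, nbj = A.shape
    n = nbi + nbj
    C = np.block([[np.zeros((nbi, nbi)), A],
                  [B.T, np.zeros((nbj, nbj))]])
    # Tableau for w + C z = 1; columns are (w_0..w_{n-1}, z_0..z_{n-1}, right-hand side)
    T = np.hstack([np.eye(n), C, np.ones((n, 1))])
    basis = list(range(n))                     # initially every w_k is basic
    complement = lambda k: (k + n) % (2 * n)   # pairs w_k with z_k
    entering = n + start                       # Step 0: z_start enters
    for _ in range(max_iter):
        # Step 1: minimum-ratio rule over rows with a positive entry in the entering column
        col = T[:, entering]
        positive = col > 1e-12
        if not positive.any():
            raise RuntimeError('no positive entry in the entering column')
        ratios = np.full(n, np.inf)
        ratios[positive] = T[positive, -1] / col[positive]
        row = int(np.argmin(ratios))
        leaving = basis[row]
        # Pivot: make the entering variable basic in the chosen row
        T[row] = T[row] / T[row, entering]
        for r in range(n):
            if r != row:
                T[r] = T[r] - T[r, entering] * T[row]
        basis[row] = entering
        # Complementarity test on the two variables that just moved
        if complement(entering) not in basis and complement(leaving) in basis:
            break
        entering = complement(leaving)         # Step 2
    else:
        raise RuntimeError('iteration cap reached')
    z = np.zeros(2 * n)
    z[basis] = T[:, -1]                        # nonbasic variables stay at zero
    x, y = z[n:n + nbi], z[n + nbi:]
    return x / x.sum(), y / y.sum()

# Illustrative 2x2 game (an assumption) whose only equilibrium is uniform mixing
A = np.array([[2.0, 1.0], [1.0, 2.0]])
B = np.array([[1.0, 2.0], [2.0, 1.0]])
p, q = lemke_howson_sketch(A, B)
```

The sketch uses dense row operations for readability; the `Tableau` class from `mec.lp` used in the next cells plays the same role.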

In [12]:
def Bimatrix_game_is_nondegenerate(self):
    Aprime = np.block( [[self.A_i_j,np.ones((self.nbi,1))], [-np.eye(self.nbj),np.zeros((self.nbj,1))]] )
    Bprime = np.block( [[-np.eye(self.nbi),self.B_i_j], [np.zeros((1,self.nbi)),np.ones((1,self.nbj))]] )
    return (np.linalg.matrix_rank(Aprime) == min(Aprime.shape)) and (np.linalg.matrix_rank(Bprime) == min(Bprime.shape))

Bimatrix_game.is_nondegenerate = Bimatrix_game_is_nondegenerate

In [13]:
penalty_nonzero_sum.is_nondegenerate()

True

In [14]:
from mec.lp import Tableau

def Bimatrix_game_lemke_howson_solve(self,verbose = 0):
    A_i_j = self.A_i_j - np.min(self.A_i_j) + 1     # ensures that matrices are positive
    B_i_j = self.B_i_j - np.min(self.B_i_j) + 1
    zks = ['x_' + str(i+1) for i in range(self.nbi)] + ['y_' + str(j+1) for j in range(self.nbj)]
    wks = ['r_' + str(i+1) for i in range(self.nbi)] + ['s_' + str(j+1) for j in range(self.nbj)]
    complements = list(len(zks)+np.arange(len(zks))) + list(np.arange(len(zks)))
    C_k_l = np.block([[np.zeros((self.nbi, self.nbi)), A_i_j],
                      [B_i_j.T, np.zeros((self.nbj, self.nbj))]])
    tab = Tableau(C_k_l, np.ones(self.nbi + self.nbj), np.zeros(self.nbi + self.nbj), wks, zks)
    kent = len(wks) # z_1 enters
    while True:
        kdep = tab.determine_departing(kent)
        if verbose > 1:
            print('Basis: ', [(wks+zks)[i] for i in tab.k_b])
            print((wks+zks)[kent], 'enters,', (wks+zks)[kdep], 'departs')
        tab.update(kent, kdep)
        if (complements[kent] not in tab.k_b) and (complements[kdep] in tab.k_b):
            break
        else:
            kent = complements[kdep]
    z_k, _, _ = tab.solution() # solution() returns: x_j, y_i, x_j @ self.c_j
    x_i, y_j = z_k[:self.nbi], z_k[self.nbi:]
    α = 1 / y_j.sum()
    β = 1 / x_i.sum()
    p_i = x_i * β
    q_j = y_j * α
    return {'p_i': p_i, 'q_j': q_j,
            'val1': α + np.min(self.A_i_j) - 1,
            'val2': β + np.min(self.B_i_j) - 1}

Bimatrix_game.lemke_howson_solve = Bimatrix_game_lemke_howson_solve

In [15]:
penalty_nonzero_sum.lemke_howson_solve(verbose=2)

Basis:  ['r_1', 'r_2', 'r_3', 's_1', 's_2', 's_3']
x_1 enters, s_1 departs
Basis:  ['r_1', 'r_2', 'r_3', 'x_1', 's_2', 's_3']
y_1 enters, r_3 departs
Basis:  ['r_1', 'r_2', 'y_1', 'x_1', 's_2', 's_3']
x_3 enters, s_3 departs
Basis:  ['r_1', 'r_2', 'y_1', 'x_1', 's_2', 'x_3']
y_3 enters, r_2 departs
Basis:  ['r_1', 'y_3', 'y_1', 'x_1', 's_2', 'x_3']
x_2 enters, s_2 departs
Basis:  ['r_1', 'y_3', 'y_1', 'x_1', 'x_2', 'x_3']
y_2 enters, r_1 departs


{'p_i': array([0.3374337 , 0.24917907, 0.41338723]),
 'q_j': array([0.21908541, 0.10161508, 0.67929951]),
 'val1': 82.62606457928055,
 'val2': 36.37698012002113}

To compare with the solution from Mangasarian–Stone:

In [16]:
penalty_nonzero_sum.mangasarian_stone_solve()

{'p_i': array([0.3374337 , 0.24917907, 0.41338723]),
 'q_j': array([0.21908541, 0.10161508, 0.67929951]),
 'val1': 82.62606457928057,
 'val2': 36.37698012002112}

# Benchmarking

## Pritchard's example

In [17]:
# Example from the lecture notes of David Pritchard (EPFL).
pritchard_ex = Bimatrix_game(np.array([[1,3,0],[0,0,2],[2,1,1]]),
                             np.array([[2,1,0],[1,3,1],[0,0,3]])) 

In [18]:
pritchard_ex.lemke_howson_solve(verbose=2)

Basis:  ['r_1', 'r_2', 'r_3', 's_1', 's_2', 's_3']
x_1 enters, s_1 departs
Basis:  ['r_1', 'r_2', 'r_3', 'x_1', 's_2', 's_3']
y_1 enters, r_3 departs
Basis:  ['r_1', 'r_2', 'y_1', 'x_1', 's_2', 's_3']
x_3 enters, s_3 departs
Basis:  ['r_1', 'r_2', 'y_1', 'x_1', 's_2', 'x_3']
y_3 enters, r_2 departs
Basis:  ['r_1', 'y_3', 'y_1', 'x_1', 's_2', 'x_3']
x_2 enters, s_2 departs
Basis:  ['r_1', 'y_3', 'y_1', 'x_1', 'x_2', 'x_3']
y_2 enters, r_1 departs


{'p_i': array([0.46153846, 0.23076923, 0.30769231]),
 'q_j': array([0.11111111, 0.33333333, 0.55555556]),
 'val1': 1.1111111111111112,
 'val2': 1.1538461538461537}

In [19]:
pritchard_ex.mangasarian_stone_solve()

{'p_i': array([0.46153846, 0.23076923, 0.30769231]),
 'q_j': array([0.11111111, 0.33333333, 0.55555556]),
 'val1': 1.1111111111111112,
 'val2': 1.1538461538461537}

## Random example

In [20]:
I, J = 5, 4
random_game = Bimatrix_game(A_i_j = 100*np.random.rand(I,J),
                            B_i_j = 100*np.random.rand(I,J))

In [21]:
random_game.mangasarian_stone_solve() 

{'p_i': array([1., 0., 0., 0., 0.]),
 'q_j': array([1., 0., 0., 0.]),
 'val1': 93.72236043429095,
 'val2': 89.56188907069242}

In [22]:
sol = random_game.lemke_howson_solve(verbose=2)
sol

Basis:  ['r_1', 'r_2', 'r_3', 'r_4', 'r_5', 's_1', 's_2', 's_3', 's_4']
x_1 enters, s_1 departs
Basis:  ['r_1', 'r_2', 'r_3', 'r_4', 'r_5', 'x_1', 's_2', 's_3', 's_4']
y_1 enters, r_1 departs


{'p_i': array([1., 0., 0., 0., 0.]),
 'q_j': array([1., 0., 0., 0.]),
 'val1': 93.72236043429095,
 'val2': 89.56188907069243}

In [23]:
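def Bimatrix_game_is_NashEq(self, p_i, q_j, tol=1e-5):
    # (p_i, q_j) is a Nash equilibrium if no pure strategy is a profitable deviation
    for i in range(self.nbi):
        # payoff to player 1 from pure strategy i against q_j vs. payoff from p_i
        if np.eye(self.nbi)[i] @ self.A_i_j @ q_j > p_i @ self.A_i_j @ q_j + tol:
            print('Pure strategy i =', i, 'beats p_i.')
            return False
    for j in range(self.nbj):
        # payoff to player 2 from pure strategy j against p_i vs. payoff from q_j
        if p_i @ self.B_i_j @ np.eye(self.nbj)[j] > p_i @ self.B_i_j @ q_j + tol:
            print('Pure strategy j =', j, 'beats q_j.')
            return False
    return True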

Bimatrix_game.is_NashEq = Bimatrix_game_is_NashEq

In [24]:
random_game.is_NashEq(sol['p_i'], sol['q_j'])

True