Implement lp formulation of minimax theorem (#217)
* Write formulation of LP in docs.

* Write functions to build components of LP

* Implement LP formulation

* Add more tests and documentation.

* Add how to documentation.

Closes #23
drvinceknight committed Aug 1, 2023
1 parent be4530a commit f215795
Showing 8 changed files with 369 additions and 0 deletions.
1 change: 1 addition & 0 deletions docs/how-to/index.rst
@@ -10,6 +10,7 @@ How to:
create-a-game.rst
calculate-utilities.rst
check-best-responses.rst
use-minimax.rst
solve-with-support-enumeration.rst
solve-with-vertex-enumeration.rst
solve-with-lemke-howson.rst
29 changes: 29 additions & 0 deletions docs/how-to/use-minimax.rst
@@ -0,0 +1,29 @@
.. _how-to-use-minimax:

Use the minimax theorem
=======================

One of the algorithms implemented in :code:`Nashpy` is based on :ref:`the
minimax theorem <the-minimax-theorem>`. It is implemented as the
:code:`linear_program` method on the :code:`Game` class. First create a zero
sum game::

    >>> import nashpy as nash
    >>> import numpy as np
    >>> A = np.array([[1, -1], [-1, 1]])
    >>> matching_pennies = nash.Game(A)

The method returns a Nash equilibrium by solving the underlying :ref:`linear
program <formulation-of-linear-program>`::

    >>> matching_pennies.linear_program()
    (array([0.5, 0.5]), array([0.5, 0.5]))

Note that this is only defined for :ref:`Zero sum games <zero-sum-games>`::

    >>> A = np.array([[1, -1], [-1, 1]])
    >>> B = np.array([[2, -2], [-2, 2]])
    >>> game = nash.Game(A, B)
    >>> game.linear_program()
    Traceback (most recent call last):
        ...
    ValueError: The Linear Program corresponding to the minimax theorem is defined only for Zero Sum games.
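
The condition can be illustrated with a small NumPy check. This is a minimal sketch, assuming that "zero sum" means the two payoff matrices sum to zero entrywise; the helper name ``is_zero_sum`` is hypothetical and is not part of Nashpy, and how :code:`Game.zero_sum` is computed is not shown in this commit.

```python
import numpy as np


def is_zero_sum(A, B):
    """Return True if the payoff matrices satisfy A + B = 0 entrywise.

    Hypothetical helper for illustration only, not a Nashpy function.
    """
    return bool(np.array_equal(A, -B))


A = np.array([[1, -1], [-1, 1]])

# Matching pennies: the column player's payoffs are the negation of A.
print(is_zero_sum(A, -A))

# The game from the example above, where B = 2A, is not zero sum.
print(is_zero_sum(A, 2 * A))
```

Under that assumption, the first call reports a zero sum game and the second does not, which matches the ``ValueError`` raised above.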
1 change: 1 addition & 0 deletions docs/text-book/index.rst
@@ -7,6 +7,7 @@ Nashpy Game Theory Text book
normal-form-games.rst
strategies.rst
best-responses.rst
zero-sum-games.rst
support-enumeration.rst
vertex-enumeration.rst
extensive-form-games.rst
87 changes: 87 additions & 0 deletions docs/text-book/zero-sum-games.rst
@@ -0,0 +1,87 @@
.. _zero-sum-games:

Zero Sum Games
==============

.. _motivating-example-zero-sum-games:

Motivating example: changing Rock Paper Scissors
------------------------------------------------

We modify the game of :ref:`Rock Paper Scissors
<motivating-example-strategy-for-rps>` by adding a single new strategy, "Spock":

- Spock smashes Scissors and Rock.
- Paper disproves Spock.

The other modification is that the game is no longer symmetric: only the row
player can use Spock.

The mathematical representation of this game is given by:

.. math::

    A = \begin{pmatrix}
        0 & -1 & 1 \\
        1 & 0 & -1\\
        -1 & 1 & 0\\
        1 & -1 & 1
    \end{pmatrix}

Is there a way that the Row player can play that guarantees a particular
expected proportion of wins?
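
One way to make this question concrete: for a fixed row strategy :math:`x`, the payoff the row player is guaranteed is the smallest entry of :math:`xA`, because the column player can always pick the column that is worst for the row player. The following is a minimal NumPy sketch; the function name ``worst_case_payoff`` is hypothetical and not part of Nashpy.

```python
import numpy as np

# Row player payoff matrix: rows are Rock, Paper, Scissors, Spock;
# columns are Rock, Paper, Scissors.
A = np.array(
    [
        [0, -1, 1],
        [1, 0, -1],
        [-1, 1, 0],
        [1, -1, 1],
    ]
)


def worst_case_payoff(x, A):
    """Expected payoff of row strategy x against the column that hurts it most."""
    return float(np.min(x @ A))


uniform = np.array([1 / 4, 1 / 4, 1 / 4, 1 / 4])
candidate = np.array([0, 2 / 9, 4 / 9, 1 / 3])

print(worst_case_payoff(uniform, A))
print(worst_case_payoff(candidate, A))
```

Playing all four strategies uniformly guarantees only :math:`-1/4`, whereas the candidate strategy :math:`(0, 2/9, 4/9, 1/3)` guarantees :math:`1/9`, so the choice of strategy changes what the row player can secure.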

Optimising worst case outcomes
------------------------------

The value of a game
-------------------

.. _formulation-of-linear-program:

Formulation of the linear program
---------------------------------

In a :ref:`Zero Sum Game <definition-of-zero-sum-game>`, given a row player
payoff matrix :math:`A` with :math:`m` rows and :math:`n` columns, the following
linear programme will give a strategy for the row player that ensures the best
possible utility as well as the value of the game:

.. math::

    \max_{x\in\mathbb{R}^{(m + 1)\times 1}} cx

Subject to:

.. math::

    \begin{align}
        M_{\text{ub}}x &\leq b_{\text{ub}} \\
        M_{\text{eq}}x &= b_{\text{eq}} \\
        x_i &\geq 0&&\text{ for }i\leq m
    \end{align}

Where the parameters of the linear programme are defined by:

.. math::

    \begin{align}
        c &= (\underbrace{0, \dots, 0}_{m}, 1) && c\in\{0, 1\}^{1 \times (m + 1)}\\
        M_{\text{ub}} &= \begin{pmatrix}(-A^T)_{11}&\dots&(-A^T)_{1m}&1\\
                                        \vdots     &\ddots&\vdots     &\vdots\\
                                        (-A^T)_{n1}&\dots&(-A^T)_{nm}&1\end{pmatrix} && M_{\text{ub}}\in\mathbb{R}^{n\times (m + 1)}\\
        b_{\text{ub}} &= (\underbrace{0, \dots, 0}_{n})^T && b_{\text{ub}}\in\{0\}^{n\times 1}\\
        M_{\text{eq}} &= (\underbrace{1, \dots, 1}_{m}, 0) && M_{\text{eq}}\in\{0, 1\}^{1\times(m + 1)}\\
        b_{\text{eq}} &= 1
    \end{align}

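
The parameters above can be assembled with NumPy and handed to ``scipy.optimize.linprog``. The following is a sketch rather than the library's implementation: it assumes SciPy is installed, the function name ``solve_row_player_lp`` is hypothetical, and because ``linprog`` minimises its objective, the last entry of :math:`c` is negated.

```python
import numpy as np
import scipy.optimize


def solve_row_player_lp(A):
    """Solve the maxmin linear program for the row player payoff matrix A.

    Hypothetical helper for illustration. Returns the row strategy (the
    first m variables) and the value of the game (the last variable).
    """
    m, n = A.shape

    # Objective: maximise the last variable, so minimise its negation.
    c = np.zeros(m + 1)
    c[-1] = -1

    # Inequalities: for every column j, v - (x A)_j <= 0.
    M_ub = np.hstack((-A.T, np.ones((n, 1))))
    b_ub = np.zeros(n)

    # Equality: the first m variables form a probability vector.
    M_eq = np.ones((1, m + 1))
    M_eq[0, -1] = 0
    b_eq = np.array([1.0])

    # Probabilities are non-negative, the value is unbounded.
    bounds = [(0, None)] * m + [(None, None)]

    res = scipy.optimize.linprog(
        c=c, A_ub=M_ub, b_ub=b_ub, A_eq=M_eq, b_eq=b_eq, bounds=bounds
    )
    return res.x[:-1], res.x[-1]


A = np.array([[1, -1], [-1, 1]])
strategy, value = solve_row_player_lp(A)
print(strategy, value)
```

For matching pennies this gives a strategy close to :math:`(1/2, 1/2)` and a value close to 0, up to solver tolerance.
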
.. _the-minimax-theorem:

The minimax theorem
-------------------

Using Nashpy
------------

See :ref:`how-to-use-minimax` for guidance on how to use Nashpy to
find the Nash equilibria of zero sum games using the minimax theorem.
24 changes: 24 additions & 0 deletions src/nashpy/game.py
@@ -5,6 +5,7 @@
from .algorithms.lemke_howson import lemke_howson
from .algorithms.support_enumeration import support_enumeration
from .algorithms.vertex_enumeration import vertex_enumeration
from .linalg.minimax import linear_program
from .egt.moran_process import moran_process, fixation_probabilities
from .learning.fictitious_play import fictitious_play
from .learning.replicator_dynamics import (
@@ -398,3 +399,26 @@ def fixation_probabilities(
interaction_graph_adjacency_matrix=interaction_graph_adjacency_matrix,
replacement_stochastic_matrix=replacement_stochastic_matrix,
)

    def linear_program(self):
        """
        Returns the Nash Equilibrium for a zero sum game by solving the Linear
        Program that corresponds to the minimax theorem.

        Returns
        -------
        tuple
            The Nash equilibria

        Raises
        ------
        ValueError
            A value error is raised if the game is not zero sum
        """
        if self.zero_sum is False:
            raise ValueError(
                "The Linear Program corresponding to the minimax theorem is defined only for Zero Sum games."
            )
        A, B = self.payoff_matrices
        row_strategy = linear_program(row_player_payoff_matrix=A)
        # The column player solves the same LP with the transpose of their
        # own payoff matrix playing the role of the row player's matrix.
        column_strategy = linear_program(row_player_payoff_matrix=B.T)
        return row_strategy, column_strategy
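
A pair of strategies returned this way can be sanity checked without any solver: in a zero sum game, ``(x, y)`` is an equilibrium when no row beats ``x`` against ``y`` and no column does better for the column player against ``x``. The sketch below uses only NumPy; the function name ``is_equilibrium`` and the tolerance are assumptions for illustration, not part of Nashpy.

```python
import numpy as np


def is_equilibrium(A, x, y, tol=1e-9):
    """Check the saddle point condition for row payoff matrix A.

    Hypothetical helper: x is the row strategy, y the column strategy.
    """
    value = x @ A @ y
    # The row player cannot gain by switching to any pure row.
    row_cannot_improve = np.max(A @ y) <= value + tol
    # The column player cannot push the row payoff lower with any pure column.
    column_cannot_improve = np.min(x @ A) >= value - tol
    return bool(row_cannot_improve and column_cannot_improve)


A = np.array([[1, -1], [-1, 1]])
mixed = np.array([0.5, 0.5])
pure = np.array([1.0, 0.0])

print(is_equilibrium(A, mixed, mixed))
print(is_equilibrium(A, pure, pure))
```

For matching pennies the uniform pair passes the check, while the pair of pure strategies does not, since the column player would switch columns.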
134 changes: 134 additions & 0 deletions src/nashpy/linalg/minimax.py
@@ -0,0 +1,134 @@
"""
This module contains functions to build and solve the linear program that
corresponds to the minimax theorem.
"""
import numpy as np
import numpy.typing as npt
import scipy.optimize


def get_c(number_of_rows: int) -> npt.NDArray:
    """
    Return the coefficient vector for the objective function of the LP that
    corresponds to the minimax theorem.

    Parameters
    ----------
    number_of_rows : int
        The number of rows in the payoff matrix

    Returns
    -------
    array
        A vector with m 0s followed by a single -1 where m is the number of
        rows in the payoff matrix. The final entry is negated because
        scipy.optimize.linprog minimises its objective.
    """
    c = np.zeros(shape=(number_of_rows + 1))
    c[-1] = -1
    return c


def get_A_ub(row_player_payoff_matrix: npt.NDArray) -> npt.NDArray:
    """
    Return the coefficient matrix of the upper bound (inequality) constraints
    of the LP that corresponds to the minimax theorem.

    Parameters
    ----------
    row_player_payoff_matrix : array
        The payoff matrix

    Returns
    -------
    array
        A matrix that corresponds to the upper bound.
    """
    _, number_of_columns = row_player_payoff_matrix.shape
    return np.hstack(
        (-row_player_payoff_matrix.T, np.ones(shape=(number_of_columns, 1)))
    )


def get_b_ub(number_of_columns: int) -> npt.NDArray:
    """
    Return the upper bound vector for the LP that corresponds to the minimax
    theorem.

    Parameters
    ----------
    number_of_columns : int
        The number of columns in the payoff matrix

    Returns
    -------
    array
        A vector of zeros
    """
    return np.zeros(shape=(number_of_columns, 1))


def get_A_eq(number_of_rows: int) -> npt.NDArray:
    """
    Return the equality linear coefficients for the LP that corresponds to the
    minimax theorem.

    Parameters
    ----------
    number_of_rows : int
        The number of rows in the payoff matrix

    Returns
    -------
    array
        A vector with m 1s followed by a single 0 where m is the number of rows
        in the payoff matrix.
    """
    A_eq = np.ones(shape=(1, number_of_rows + 1))
    A_eq[0, -1] = 0
    return A_eq


def get_bounds(number_of_rows: int) -> list:
    """
    Return the bounds for each variable of the LP that corresponds to the
    minimax theorem.

    Parameters
    ----------
    number_of_rows : int
        The number of rows in the payoff matrix

    Returns
    -------
    list
        A list of tuples, each tuple contains the lower and upper bound for each
        variable.
    """
    return [(0, None) for _ in range(number_of_rows)] + [(None, None)]


def linear_program(row_player_payoff_matrix: npt.NDArray) -> npt.NDArray:
    """
    The Linear Program that corresponds to the minimax theorem. This builds and
    solves the LP and returns the row player's strategy.

    Parameters
    ----------
    row_player_payoff_matrix : array
        The payoff matrix

    Returns
    -------
    array
        The row player maxmin strategy
    """
    number_of_rows, number_of_columns = row_player_payoff_matrix.shape
    c = get_c(number_of_rows=number_of_rows)
    A_ub = get_A_ub(row_player_payoff_matrix=row_player_payoff_matrix)
    b_ub = get_b_ub(number_of_columns=number_of_columns)
    A_eq = get_A_eq(number_of_rows=number_of_rows)
    b_eq = 1
    bounds = get_bounds(number_of_rows=number_of_rows)

    res = scipy.optimize.linprog(
        c=c,
        A_ub=A_ub,
        b_ub=b_ub,
        A_eq=A_eq,
        b_eq=b_eq,
        bounds=bounds,
    )
    # The last variable is the value of the game; drop it to keep the strategy.
    return res.x[:-1]
15 changes: 15 additions & 0 deletions tests/unit/test_game.py
@@ -638,3 +638,18 @@ def test_fixation_probabilities_seed_1(self):
assert probabilities == expected_probabilities

# TODO Add tests for graphs.

    def test_linear_program_for_non_zero_sum_games(self):
        A = np.array([[-1, 0], [-1, 1]])
        B = np.array([[3, 0], [1, -1]])
        g = nash.Game(A, B)
        with pytest.raises(ValueError):
            g.linear_program()

    def test_linear_program_for_zero_sum_games(self):
        A = np.array([[-1, 0], [-1, 1]])
        B = np.array([[1, 0], [1, -1]])
        g = nash.Game(A, B)
        equilibria = g.linear_program()
        expected_equilibria = (np.array([1, 0]), np.array([1, 0]))
        assert np.array_equal(equilibria, expected_equilibria)
