Skip to content

Latest commit

 

History

History
322 lines (222 loc) · 9.88 KB

best-responses.rst

File metadata and controls

322 lines (222 loc) · 9.88 KB

Best responses

Motivating example: Best Responses in Matching Pennies

Considering the game matching-pennies:

$$\begin{aligned} A = \begin{pmatrix} 1 & -1\\\ -1 & 1 \end{pmatrix} \qquad B = \begin{pmatrix} -1 & 1\\\ 1 & -1 \end{pmatrix} \end{aligned}$$

If the row player knows that the column player is playing the strategy <strategies-discussion> σc = (0, 1) the utility of the row player is maximised by playing σr = (0, 1).

In this case σr is referred to as a best response to σc.

Alternatively, if the column player knows that the row player is playing the strategy <strategies-discussion> σr = (0, 1) the column player's best response is σc = (1, 0).

Definition of a best response in a normal form game

In a two player game (A, B) ∈ ℝm × n2 a strategy σr* of the row player is a best response to a column players' strategy σc if and only if:


σr* = argmaxσr ∈ 𝒮1σrAσcT.

Where 𝒮1 denotes the space of all strategies<definition-of-strategy-spaces-in-normal-form-games> for the first player.

Similarly a mixed strategy σc* of the column player is a best response to a row players' strategy σr if and only if:


σc* = argmaxσc ∈ 𝒮2σrBσcT.

Question

For the Prisoners Dilemma <prisoners-dilemma>:

What is the row player's best response to either of the actions of the column player?

Answer

Recalling that A is given by:

$$\begin{aligned} A = \begin{pmatrix} 3 & 0\\\ 5 & 1 \end{pmatrix} \end{aligned}$$

Against the first action of the column player the best response is to choose the second action which gives a utility of 5. This can be expressed as:


argmaxi ∈ 𝒮1Ai1 = 2

Against the second action of the column player the best response is to choose the second action which gives a utility of 1. This can be expressed as:


argmaxi ∈ 𝒮1Ai2 = 2

The row player's best response to either of the actions of the column player is σr* = (1, 0). This can be expressed as:


argmaxi ∈ 𝒮1Aij = 2 for all j ∈ 𝒜2

Generic best responses in 2 by 2 games

In two player normal form games with |A1| = |A2| = 2: a 2 by 2 game, the utility of a row player playing σr = (x, 1 − x) against a strategy σc = (y, 1 − y) is linear in x:

$$\begin{aligned} u_r(\sigma_r, \sigma_c) &= (x, 1 - x) A (y, 1 - y) ^T \\\ &= A_{11}xy + A_{12}x(1-y) + A_{21}(1-x)y + A_{22}(1-x)(1-y) \\\ &= a x + b \end{aligned}$$

where:

$$\begin{aligned} a &= A_{11}y + A_{12}(1 - y) - A_{21}y - A_{22}(1 - y)\\\ b &= A_{21}y + A_{22}(1 - y) \end{aligned}$$

This observation allows us to obtain the best response σr* against any σc = (y, 1 − y).

For example, consider matching-pennies. Below is a plot of ur(σr, σc) as a function of y for σr ∈ {(1, 0), (0, 1)}.

import matplotlib.pyplot as plt import nashpy as nash import numpy as np

A = np.array([[1, -1], [-1, 1]]) game = nash.Game(A) ys = [0, 1] sigma_rs = [(1, 0), (0, 1)] u_rs = [[game[sigma_r, (y, 1 - y)][0] for y in ys] for sigma_r in sigma_rs] plt.plot(ys, u_rs[0], label="$(Asigma_c^T)_1$") plt.plot(ys, u_rs[1], label="$(Asigma_c^T)_2$") plt.xlabel("$sigma_c=(y, 1-y)$") plt.title("Utility to row player") plt.legend()

Given that the utilities in both cases are linear, the best response to any value of y ≠ 1/2 is either (1, 0) or (0, 1. The best response σr* is given by:

$$\begin{aligned} \sigma_r ^* = \begin{cases} (1, 0),& \text{ if } y > 1/2\\\ (0, 1),& \text{ if } y < 1/2\\\ \text{indifferent},& \text{ if } y=1/2 \end{cases} \end{aligned}$$

Question

For the matching-pennies game:

What is the column player's best response as a function of x where σr = (x, 1 − x).

Answer

Recalling that B is given by:

$$\begin{aligned} B = \begin{pmatrix} -1 & 1\\\ 1 & -1 \end{pmatrix} \end{aligned}$$

This gives:

$$\begin{aligned} u_c(\sigma_r, (1, 0)) =& -x + (1-x)= 1 - 2x\\\ =& x - (1-x)= -1 + 2x \end{aligned}$$

Here is a plot of the utilities:

import matplotlib.pyplot as plt import nashpy as nash

xs = np.array([0, 1]) u_cs = [1 - 2 * xs, - 1 + 2 * xs] plt.plot(xs, u_cs[0], label="$(sigma_rB)_1$") plt.plot(xs, u_cs[1], label="$(sigma_rB)_2$") plt.xlabel("$sigma_r=(x, 1-x)$") plt.title("Utility to column player") plt.legend()

General condition for a best response

In a two player game (A, B) ∈ ℝm × n2 a strategy σr* of the row player is a best response to a column players' strategy σc if and only if:


σr*i > 0 ⇒ (AσcT)i = maxk ∈ 𝒜2(AσcT)k for all i ∈ 𝒜1

Proof

(AσcT)i is the utility of the row player when they play their ith action. Thus:

$$\sigma_rA\sigma_c^T=\sum_{i=1}^{m}{\sigma_r}_i(A\sigma_c^T)_i$$

Let u = maxk(AσcT)k giving:

$$\begin{aligned} \sigma_rA\sigma_c^T&=\sum_{i=1}^{m}{\sigma_r}_i(u - u + (A\sigma_c^T)_i)\\\ &=\sum_{i=1}^{m}{\sigma_r}_iu - \sum_{i=1}^{m}{\sigma_r}_i(u - (A\sigma_c^T)_i)\\\ &=u - \sum_{i=1}^{m}{\sigma_r}_i(u - (A\sigma_c^T)_i) \end{aligned}$$

We know that u − (AσcT)i ≥ 0, thus the largest σrAσcT can be is u which occurs if and only if σri > 0 ⇒ (AσcT)i = u as required.

Question

For the Rock Paper Scissors <motivating-example-strategy-for-rps> game:

Which of the following pairs of strategies are best responses to each other:

  1. σr = (0, 0, 1) and σc = (0, 1/2, 1/2)
  2. σr = (1/3, 1/3, 1/3) and σc = (0, 1/2, 1/2)
  3. σr = (1/3, 1/3, 1/3) and σc = (1/3, 1/3, 1/3)

Answer

Recalling that A and B are given by:

$$\begin{aligned} A = \begin{pmatrix} 0 & -1 & 1 \\\ 1 & 0 & -1\\\ -1 & 1 & 0\\\ \end{pmatrix} \end{aligned}$$

$$\begin{aligned} B = - A = \begin{pmatrix} 0 & 1 & -1 \\\ -1 & 0 & 1\\\ 1 & -1 & 0\\\ \end{pmatrix} \end{aligned}$$

We can apply the best response condition to each pairs of strategies:

  1. $A\sigma_c^T = \begin{pmatrix}0\\ -1/2\\ 1/2\\\end{pmatrix}$. max(AσcT) = 1/2. The only i for which σri > 0 is i = 3 and (AσcT)3 = max(AσcT) thus σr is a best response to σc. σrB = (1,  − 1, 0). max(σrB) = 1. The values of i for which σci > 0 are i = 2 and i = 3 but (σrB)2 ≠ max(σrB) thus σc is not a best response to σr.
  2. $A\sigma_c^T = \begin{pmatrix}0\\ -1/2\\ 1/2\\\end{pmatrix}$. max(AσcT) = 1/2. The values of i for which σri > 0 are i = 1, i = 2 and i = 3 however, (AσcT)2 ≠ max(AσcT) thus σr is not a best response to σc. σrB = (0, 0, 0). max(σrB) = 0. The values of i for which σci > 0 are i = 2 and i = 3 and (σrB)2 = (σrB)3 = max(σrB) thus σc is a best response to σr.
  3. $A\sigma_c^T = \begin{pmatrix}0\\ 0\\ 0\\\end{pmatrix}$. max(AσcT) = 0. The values of i for which σri > 0 are i = 1, i = 2 and i = 3 and (AσcT)1 = (AσcT)2 = (AσcT)3 = max(AσcT) thus σr is a best response to σc. σrB = (0, 0, 0). max(σrB) = 0. The values of i for which σci > 0 are i = 1, i = 2 and i = 3 and (σrB)1 = (σrB)2 = (σrB)3 = max(σrB) thus σc is a best response to σr.

Definition of Nash equilibrium

In a two player game (A, B) ∈ ℝm × n2, (σr, σc) is a Nash equilibria if σr is a best response to σc and σc is a best response to σr.

Using Nashpy

See how-to-check-best-responses for guidance of how to use Nashpy to check if a strategy is a best response.