Best responses

Motivating example: Best Responses in Matching Pennies

Consider the Matching Pennies game:

$$\begin{aligned} A = \begin{pmatrix} 1 & -1\\\ -1 & 1 \end{pmatrix} \qquad B = \begin{pmatrix} -1 & 1\\\ 1 & -1 \end{pmatrix} \end{aligned}$$

If the row player knows that the column player is playing the strategy σ_c = (0, 1), the utility of the row player is maximised by playing σ_r = (0, 1).

In this case σ_r is referred to as a best response to σ_c.

Alternatively, if the column player knows that the row player is playing the strategy σ_r = (0, 1), the column player's best response is σ_c = (1, 0).
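Both claims can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the variable names are illustrative and are not part of Nashpy:

```python
import numpy as np

# Payoff matrices for Matching Pennies, as given above.
A = np.array([[1, -1], [-1, 1]])
B = np.array([[-1, 1], [1, -1]])

# The column player plays their second action.
sigma_c = np.array([0, 1])
row_utilities = A @ sigma_c  # utility of each row action against sigma_c
best_row_action = int(np.argmax(row_utilities))  # 0-based index

# The row player plays their second action.
sigma_r = np.array([0, 1])
column_utilities = sigma_r @ B  # utility of each column action against sigma_r
best_column_action = int(np.argmax(column_utilities))  # 0-based index
```

The indices are 0-based, so `best_row_action == 1` corresponds to σ_r = (0, 1) and `best_column_action == 0` corresponds to σ_c = (1, 0).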

Definition of a best response in a normal form game

In a two player game (A, B) ∈ (ℝ^{m × n})², a strategy σ_r^* of the row player is a best response to a column player's strategy σ_c if and only if:

σ_r^* = argmax_{σ_r ∈ 𝒮₁}σ_rAσ_c^T.

Here 𝒮₁ denotes the space of all strategies for the first player.

Similarly, a mixed strategy σ_c^* of the column player is a best response to a row player's strategy σ_r if and only if:

σ_c^* = argmax_{σ_c ∈ 𝒮₂}σ_rBσ_c^T.
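To make the definition concrete, the sketch below evaluates σ_rAσ_c^T for the row player in Matching Pennies and searches a coarse grid of mixed strategies for the maximiser. The helper `row_utility` and the grid are assumptions made for illustration; a grid search only approximates the argmax over 𝒮₁ and is not how a solver would proceed.

```python
import numpy as np


def row_utility(sigma_r, A, sigma_c):
    """Expected utility sigma_r A sigma_c^T of the row player."""
    return float(np.asarray(sigma_r) @ np.asarray(A) @ np.asarray(sigma_c))


A = np.array([[1, -1], [-1, 1]])  # Matching Pennies, row player
sigma_c = (0, 1)

# Approximate the strategy space of the row player by a grid of
# mixed strategies (x, 1 - x) for x = 0, 0.1, ..., 1.
grid = [(x, 1 - x) for x in np.linspace(0, 1, 11)]
best_on_grid = max(grid, key=lambda sigma_r: row_utility(sigma_r, A, sigma_c))
```

Here the utility against σ_c = (0, 1) is 1 − 2x, so the grid search selects x = 0, that is σ_r = (0, 1). The column player's case is analogous with B in place of A and the maximisation taken over σ_c.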

Question

For the Prisoner's Dilemma:

What is the row player's best response to either of the actions of the column player?

Answer

Recalling that A is given by:

$$\begin{aligned} A = \begin{pmatrix} 3 & 0\\\ 5 & 1 \end{pmatrix} \end{aligned}$$

Against the first action of the column player the best response is to choose the second action which gives a utility of 5. This can be expressed as:

argmax_{i ∈ 𝒜₁}A_{i1} = 2

Against the second action of the column player the best response is to choose the second action which gives a utility of 1. This can be expressed as:

argmax_{i ∈ 𝒜₁}A_{i2} = 2

The row player's best response to either of the actions of the column player is therefore the second action, σ_r^* = (0, 1). This can be expressed as:

argmax_{i ∈ 𝒜₁}A_{ij} = 2 for all j ∈ 𝒜₂
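The column-wise argmax can be reproduced with NumPy. This is a small sketch under the assumption that NumPy is available; the variable names are illustrative:

```python
import numpy as np

# Row player's payoffs in the Prisoner's Dilemma.
A = np.array([[3, 0], [5, 1]])

# For each column j, find the row i that maximises A[i, j].
best_rows = np.argmax(A, axis=0)  # 0-based indices

# Convert to the 1-based action labels used in the text.
best_actions = [int(i) + 1 for i in best_rows]
```

Both entries of `best_actions` are 2: the second action is the best response whichever action the column player chooses.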

Generic best responses in 2 by 2 games

In two player normal form games with |𝒜₁| = |𝒜₂| = 2 (a 2 by 2 game), the utility of a row player playing σ_r = (x, 1 − x) against a strategy σ_c = (y, 1 − y) is linear in x:

$$\begin{aligned} u_r(\sigma_r, \sigma_c) &= (x, 1 - x) A (y, 1 - y) ^T \\\ &= A_{11}xy + A_{12}x(1-y) + A_{21}(1-x)y + A_{22}(1-x)(1-y) \\\ &= a x + b \end{aligned}$$

where:

$$\begin{aligned} a &= A_{11}y + A_{12}(1 - y) - A_{21}y - A_{22}(1 - y)\\\ b &= A_{21}y + A_{22}(1 - y) \end{aligned}$$

This observation allows us to obtain the best response σ_r^* against any σ_c = (y, 1 − y).
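The coefficients a and b can be checked against the direct matrix product. The sketch below uses a hypothetical helper `linear_coefficients` and, purely as an example, the Prisoner's Dilemma matrix with y = 0.25:

```python
import numpy as np


def linear_coefficients(A, y):
    """Return (a, b) such that u_r((x, 1 - x), (y, 1 - y)) = a * x + b."""
    A = np.asarray(A, dtype=float)
    a = A[0, 0] * y + A[0, 1] * (1 - y) - A[1, 0] * y - A[1, 1] * (1 - y)
    b = A[1, 0] * y + A[1, 1] * (1 - y)
    return a, b


A = np.array([[3, 0], [5, 1]])  # any 2 by 2 matrix works here
y = 0.25
a, b = linear_coefficients(A, y)

# Compare a * x + b with the direct computation sigma_r A sigma_c^T.
for x in (0.0, 0.3, 1.0):
    direct = np.array([x, 1 - x]) @ A @ np.array([y, 1 - y])
    assert np.isclose(a * x + b, direct)
```

For this choice a = −1.25 and b = 2, so the utility decreases in x and the row player does best with x = 0.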

For example, consider Matching Pennies. Below is a plot of u_r(σ_r, σ_c) as a function of y for σ_r ∈ {(1, 0), (0, 1)}.

    import matplotlib.pyplot as plt
    import nashpy as nash
    import numpy as np

    # Matching Pennies: a single matrix gives a zero-sum game in Nashpy.
    A = np.array([[1, -1], [-1, 1]])
    game = nash.Game(A)

    # The utilities are linear in y, so the two end points determine each line.
    ys = [0, 1]
    sigma_rs = [(1, 0), (0, 1)]
    u_rs = [[game[sigma_r, (y, 1 - y)][0] for y in ys] for sigma_r in sigma_rs]

    plt.plot(ys, u_rs[0], label=r"$(A\sigma_c^T)_1$")
    plt.plot(ys, u_rs[1], label=r"$(A\sigma_c^T)_2$")
    plt.xlabel(r"$\sigma_c=(y, 1-y)$")
    plt.title("Utility to row player")
    plt.legend()

Given that the utilities in both cases are linear, the best response to any value of y ≠ 1/2 is either (1, 0) or (0, 1). The best response σ_r^* is given by:

$$\begin{aligned} \sigma_r ^* = \begin{cases} (1, 0),& \text{ if } y > 1/2\\\ (0, 1),& \text{ if } y < 1/2\\\ \text{indifferent},& \text{ if } y=1/2 \end{cases} \end{aligned}$$
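The case analysis follows from the sign of the slope a. A minimal sketch, assuming a hypothetical helper `row_best_response_2x2` that returns `None` in the indifferent case:

```python
import numpy as np


def row_best_response_2x2(A, y, tol=1e-12):
    """Best response of the row player to sigma_c = (y, 1 - y) in a 2 by 2 game.

    Returns (1, 0), (0, 1), or None when the row player is indifferent.
    """
    A = np.asarray(A, dtype=float)
    # Slope of u_r in x: utility of the first row minus utility of the second.
    a = (A[0, 0] - A[1, 0]) * y + (A[0, 1] - A[1, 1]) * (1 - y)
    if a > tol:
        return (1, 0)
    if a < -tol:
        return (0, 1)
    return None  # every x in [0, 1] gives the same utility


A = np.array([[1, -1], [-1, 1]])  # Matching Pennies
```

For Matching Pennies the slope is 4y − 2, so `row_best_response_2x2(A, 0.75)` gives (1, 0), `row_best_response_2x2(A, 0.25)` gives (0, 1), and `row_best_response_2x2(A, 0.5)` gives `None`. The tolerance `tol` is an assumption to guard against floating point noise.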

Question

For the Matching Pennies game:

What is the column player's best response as a function of x, where σ_r = (x, 1 − x)?

Answer

Recalling that B is given by:

$$\begin{aligned} B = \begin{pmatrix} -1 & 1\\\ 1 & -1 \end{pmatrix} \end{aligned}$$

This gives:

$$\begin{aligned} u_c(\sigma_r, (1, 0)) =& -x + (1-x)= 1 - 2x\\\ u_c(\sigma_r, (0, 1)) =& x - (1-x)= -1 + 2x \end{aligned}$$

Here is a plot of the utilities:

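    import matplotlib.pyplot as plt
    import numpy as np

    # The utilities are linear in x, so the two end points determine each line.
    xs = np.array([0, 1])
    u_cs = [1 - 2 * xs, -1 + 2 * xs]

    plt.plot(xs, u_cs[0], label=r"$(\sigma_rB)_1$")
    plt.plot(xs, u_cs[1], label=r"$(\sigma_rB)_2$")
    plt.xlabel(r"$\sigma_r=(x, 1-x)$")
    plt.title("Utility to column player")
    plt.legend()

The two lines cross at x = 1/2, so the best response σ_c^* is given by:

$$\begin{aligned} \sigma_c ^* = \begin{cases} (1, 0),& \text{ if } x < 1/2\\\ (0, 1),& \text{ if } x > 1/2\\\ \text{indifferent},& \text{ if } x=1/2 \end{cases} \end{aligned}$$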
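The same comparison can be written as a small function of x. This is a sketch only; `column_best_response` is a hypothetical helper, and it returns `None` when both actions give the same utility:

```python
import numpy as np

# Column player's payoffs in Matching Pennies.
B = np.array([[-1, 1], [1, -1]])


def column_best_response(x, tol=1e-12):
    """Best response of the column player to sigma_r = (x, 1 - x).

    Returns (1, 0), (0, 1), or None when the column player is indifferent.
    """
    utilities = np.array([x, 1 - x]) @ B  # (1 - 2x, -1 + 2x)
    difference = utilities[0] - utilities[1]
    if difference > tol:
        return (1, 0)
    if difference < -tol:
        return (0, 1)
    return None
```

For instance, `column_best_response(0.25)` gives (1, 0), `column_best_response(0.75)` gives (0, 1), and `column_best_response(0.5)` gives `None`.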

General condition for a best response

In a two player game (A, B) ∈ (ℝ^{m × n})², a strategy σ_r^* of the row player is a best response to a column player's strategy σ_c if and only if:

{σ_r^*}_i > 0 ⇒ (Aσ_c^T)_i = max_{k ∈ 𝒜₁}(Aσ_c^T)_k for all i ∈ 𝒜₁

Proof

(Aσ_c^T)_i is the utility of the row player when they play their i^th action. Thus:

$$\sigma_rA\sigma_c^T=\sum_{i=1}^{m}{\sigma_r}_i(A\sigma_c^T)_i$$

Let u = max_k(Aσ_c^T)_k giving:

$$\begin{aligned} \sigma_rA\sigma_c^T&=\sum_{i=1}^{m}{\sigma_r}_i(u - u + (A\sigma_c^T)_i)\\\ &=\sum_{i=1}^{m}{\sigma_r}_iu - \sum_{i=1}^{m}{\sigma_r}_i(u - (A\sigma_c^T)_i)\\\ &=u - \sum_{i=1}^{m}{\sigma_r}_i(u - (A\sigma_c^T)_i) \end{aligned}$$

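We know that u − (Aσ_c^T)_i ≥ 0 and {σ_r}_i ≥ 0, so the sum being subtracted is non-negative. Thus the largest σ_rAσ_c^T can be is u, which occurs if and only if every term of that sum is zero, that is, if and only if {σ_r}_i > 0 ⇒ (Aσ_c^T)_i = u, as required.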
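The decomposition used in the proof can be verified numerically. The sketch below uses the Rock Paper Scissors row matrix and an arbitrary strategy σ_r = (0.2, 0.3, 0.5) chosen only for illustration:

```python
import numpy as np

A = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]])  # Rock Paper Scissors
sigma_r = np.array([0.2, 0.3, 0.5])  # arbitrary illustrative strategy
sigma_c = np.array([0, 0.5, 0.5])

row_utilities = A @ sigma_c  # (A sigma_c^T)_i for each action i
u = row_utilities.max()

# Non-negative shortfall relative to the best action, weighted by sigma_r.
shortfall = float(sigma_r @ (u - row_utilities))
utility = float(sigma_r @ row_utilities)
```

Here `utility` equals `u - shortfall`, and the shortfall is strictly positive because σ_r puts weight on the first two actions, neither of which attains the maximum u = 1/2.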

Question

For the Rock Paper Scissors game:

Which of the following pairs of strategies are best responses to each other:

σ_r = (0, 0, 1) and σ_c = (0, 1/2, 1/2)
σ_r = (1/3, 1/3, 1/3) and σ_c = (0, 1/2, 1/2)
σ_r = (1/3, 1/3, 1/3) and σ_c = (1/3, 1/3, 1/3)

Answer

Recalling that A and B are given by:

$$\begin{aligned} A = \begin{pmatrix} 0 & -1 & 1 \\\ 1 & 0 & -1\\\ -1 & 1 & 0\\\ \end{pmatrix} \end{aligned}$$

$$\begin{aligned} B = - A = \begin{pmatrix} 0 & 1 & -1 \\\ -1 & 0 & 1\\\ 1 & -1 & 0\\\ \end{pmatrix} \end{aligned}$$

We can apply the best response condition to each pair of strategies:

For σ_r = (0, 0, 1) and σ_c = (0, 1/2, 1/2): $A\sigma_c^T = \begin{pmatrix}0\\ -1/2\\ 1/2\\\end{pmatrix}$, so max(Aσ_c^T) = 1/2. The only i for which {σ_r}_i > 0 is i = 3 and (Aσ_c^T)₃ = max(Aσ_c^T), thus σ_r is a best response to σ_c. σ_rB = (1, −1, 0), so max(σ_rB) = 1. The values of i for which {σ_c}_i > 0 are i = 2 and i = 3 but (σ_rB)₂ ≠ max(σ_rB), thus σ_c is not a best response to σ_r.
For σ_r = (1/3, 1/3, 1/3) and σ_c = (0, 1/2, 1/2): $A\sigma_c^T = \begin{pmatrix}0\\ -1/2\\ 1/2\\\end{pmatrix}$, so max(Aσ_c^T) = 1/2. The values of i for which {σ_r}_i > 0 are i = 1, i = 2 and i = 3; however, (Aσ_c^T)₂ ≠ max(Aσ_c^T), thus σ_r is not a best response to σ_c. σ_rB = (0, 0, 0), so max(σ_rB) = 0. The values of i for which {σ_c}_i > 0 are i = 2 and i = 3 and (σ_rB)₂ = (σ_rB)₃ = max(σ_rB), thus σ_c is a best response to σ_r.
For σ_r = (1/3, 1/3, 1/3) and σ_c = (1/3, 1/3, 1/3): $A\sigma_c^T = \begin{pmatrix}0\\ 0\\ 0\\\end{pmatrix}$, so max(Aσ_c^T) = 0. The values of i for which {σ_r}_i > 0 are i = 1, i = 2 and i = 3 and (Aσ_c^T)₁ = (Aσ_c^T)₂ = (Aσ_c^T)₃ = max(Aσ_c^T), thus σ_r is a best response to σ_c. σ_rB = (0, 0, 0), so max(σ_rB) = 0. The values of i for which {σ_c}_i > 0 are i = 1, i = 2 and i = 3 and (σ_rB)₁ = (σ_rB)₂ = (σ_rB)₃ = max(σ_rB), thus σ_c is a best response to σ_r.
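The three checks can be reproduced with NumPy by applying the best response condition directly. This is a sketch, not Nashpy's own implementation; the helper `supported_on_maximisers` and the tolerance `tol` are assumptions introduced here:

```python
import numpy as np

A = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]])
B = -A


def supported_on_maximisers(probabilities, utilities, tol=1e-12):
    """True if every action with positive probability attains the maximum utility."""
    probabilities = np.asarray(probabilities, dtype=float)
    utilities = np.asarray(utilities, dtype=float)
    return bool(np.all(utilities[probabilities > tol] >= utilities.max() - tol))


pairs = [
    ((0, 0, 1), (0, 1 / 2, 1 / 2)),
    ((1 / 3, 1 / 3, 1 / 3), (0, 1 / 2, 1 / 2)),
    ((1 / 3, 1 / 3, 1 / 3), (1 / 3, 1 / 3, 1 / 3)),
]

results = []
for sigma_r, sigma_c in pairs:
    # Is sigma_r a best response to sigma_c, and vice versa?
    row_ok = supported_on_maximisers(sigma_r, A @ np.array(sigma_c))
    column_ok = supported_on_maximisers(sigma_c, np.array(sigma_r) @ B)
    results.append((row_ok, column_ok))
```

Each entry of `results` is a pair (σ_r is a best response, σ_c is a best response). The three entries are `(True, False)`, `(False, True)` and `(True, True)`, in agreement with the hand calculation.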

Definition of Nash equilibrium

In a two player game (A, B) ∈ (ℝ^{m × n})², (σ_r, σ_c) is a Nash equilibrium if σ_r is a best response to σ_c and σ_c is a best response to σ_r.
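The definition translates into a check that combines the best response condition for both players. The following sketch uses a hypothetical function `is_nash_equilibrium` (not a Nashpy function) and a floating point tolerance chosen as an assumption:

```python
import numpy as np


def is_nash_equilibrium(A, B, sigma_r, sigma_c, tol=1e-12):
    """Check that sigma_r and sigma_c are best responses to each other."""
    sigma_r = np.asarray(sigma_r, dtype=float)
    sigma_c = np.asarray(sigma_c, dtype=float)
    row_utilities = np.asarray(A) @ sigma_c
    column_utilities = sigma_r @ np.asarray(B)
    # Every action played with positive probability must attain the maximum.
    row_ok = np.all(row_utilities[sigma_r > tol] >= row_utilities.max() - tol)
    column_ok = np.all(
        column_utilities[sigma_c > tol] >= column_utilities.max() - tol
    )
    return bool(row_ok and column_ok)


A = np.array([[1, -1], [-1, 1]])  # Matching Pennies
B = -A
```

For Matching Pennies, `is_nash_equilibrium(A, B, (0.5, 0.5), (0.5, 0.5))` is `True`, whereas `is_nash_equilibrium(A, B, (0, 1), (0, 1))` is `False`: σ_r = (0, 1) is a best response to σ_c = (0, 1), but the column player would rather switch to (1, 0).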

Using Nashpy

See how-to-check-best-responses for guidance on how to use Nashpy to check if a strategy is a best response.
