Considering the game matching-pennies
:
If the row player knows that the column player is playing the strategy
<strategies-discussion>
σc = (0, 1) the utility of the row player is maximised by playing σr = (0, 1).
In this case σr is referred to as a best response to σc.
Alternatively, if the column player knows that the row player is playing the strategy <strategies-discussion>
σr = (0, 1) the column player's best response is σc = (1, 0).
In a two player game (A, B) ∈ ℝm × n2 a strategy σr* of the row player is a best response to a column players' strategy σc if and only if:
σr* = argmaxσr ∈ 𝒮1σrAσcT.
Where 𝒮1 denotes the space of all
strategies<definition-of-strategy-spaces-in-normal-form-games>
for the first player.
Similarly a mixed strategy σc* of the column player is a best response to a row players' strategy σr if and only if:
σc* = argmaxσc ∈ 𝒮2σrBσcT.
Question
For the Prisoners Dilemma <prisoners-dilemma>
:
What is the row player's best response to either of the actions of the column player?
Answer
Recalling that A is given by:
Against the first action of the column player the best response is to choose the second action which gives a utility of 5. This can be expressed as:
argmaxi ∈ 𝒮1Ai1 = 2
Against the second action of the column player the best response is to choose the second action which gives a utility of 1. This can be expressed as:
argmaxi ∈ 𝒮1Ai2 = 2
The row player's best response to either of the actions of the column player is σr* = (1, 0). This can be expressed as:
argmaxi ∈ 𝒮1Aij = 2 for all j ∈ 𝒜2
In two player normal form games with |A1| = |A2| = 2: a 2 by 2 game, the utility of a row player playing σr = (x, 1 − x) against a strategy σc = (y, 1 − y) is linear in x:
where:
This observation allows us to obtain the best response σr* against any σc = (y, 1 − y).
For example, consider matching-pennies
. Below is a plot of ur(σr, σc) as a function of y for σr ∈ {(1, 0), (0, 1)}.
import matplotlib.pyplot as plt import nashpy as nash import numpy as np
A = np.array([[1, -1], [-1, 1]]) game = nash.Game(A) ys = [0, 1] sigma_rs = [(1, 0), (0, 1)] u_rs = [[game[sigma_r, (y, 1 - y)][0] for y in ys] for sigma_r in sigma_rs] plt.plot(ys, u_rs[0], label="$(Asigma_c^T)_1$") plt.plot(ys, u_rs[1], label="$(Asigma_c^T)_2$") plt.xlabel("$sigma_c=(y, 1-y)$") plt.title("Utility to row player") plt.legend()
Given that the utilities in both cases are linear, the best response to any value of y ≠ 1/2 is either (1, 0) or (0, 1. The best response σr* is given by:
Question
For the matching-pennies
game:
What is the column player's best response as a function of x where σr = (x, 1 − x).
Answer
Recalling that B is given by:
This gives:
Here is a plot of the utilities:
import matplotlib.pyplot as plt import nashpy as nash
xs = np.array([0, 1]) u_cs = [1 - 2 * xs, - 1 + 2 * xs] plt.plot(xs, u_cs[0], label="$(sigma_rB)_1$") plt.plot(xs, u_cs[1], label="$(sigma_rB)_2$") plt.xlabel("$sigma_r=(x, 1-x)$") plt.title("Utility to column player") plt.legend()
In a two player game (A, B) ∈ ℝm × n2 a strategy σr* of the row player is a best response to a column players' strategy σc if and only if:
σr*i > 0 ⇒ (AσcT)i = maxk ∈ 𝒜2(AσcT)k for all i ∈ 𝒜1
(AσcT)i is the utility of the row player when they play their ith action. Thus:
Let u = maxk(AσcT)k giving:
We know that u − (AσcT)i ≥ 0, thus the largest σrAσcT can be is u which occurs if and only if σri > 0 ⇒ (AσcT)i = u as required.
Question
For the Rock Paper Scissors <motivating-example-strategy-for-rps>
game:
Which of the following pairs of strategies are best responses to each other:
- σr = (0, 0, 1) and σc = (0, 1/2, 1/2)
- σr = (1/3, 1/3, 1/3) and σc = (0, 1/2, 1/2)
- σr = (1/3, 1/3, 1/3) and σc = (1/3, 1/3, 1/3)
Answer
Recalling that A and B are given by:
We can apply the best response condition to each pairs of strategies:
- $A\sigma_c^T = \begin{pmatrix}0\\ -1/2\\ 1/2\\\end{pmatrix}$. max(AσcT) = 1/2. The only i for which σri > 0 is i = 3 and (AσcT)3 = max(AσcT) thus σr is a best response to σc. σrB = (1, − 1, 0). max(σrB) = 1. The values of i for which σci > 0 are i = 2 and i = 3 but (σrB)2 ≠ max(σrB) thus σc is not a best response to σr.
- $A\sigma_c^T = \begin{pmatrix}0\\ -1/2\\ 1/2\\\end{pmatrix}$. max(AσcT) = 1/2. The values of i for which σri > 0 are i = 1, i = 2 and i = 3 however, (AσcT)2 ≠ max(AσcT) thus σr is not a best response to σc. σrB = (0, 0, 0). max(σrB) = 0. The values of i for which σci > 0 are i = 2 and i = 3 and (σrB)2 = (σrB)3 = max(σrB) thus σc is a best response to σr.
- $A\sigma_c^T = \begin{pmatrix}0\\ 0\\ 0\\\end{pmatrix}$. max(AσcT) = 0. The values of i for which σri > 0 are i = 1, i = 2 and i = 3 and (AσcT)1 = (AσcT)2 = (AσcT)3 = max(AσcT) thus σr is a best response to σc. σrB = (0, 0, 0). max(σrB) = 0. The values of i for which σci > 0 are i = 1, i = 2 and i = 3 and (σrB)1 = (σrB)2 = (σrB)3 = max(σrB) thus σc is a best response to σr.
In a two player game (A, B) ∈ ℝm × n2, (σr, σc) is a Nash equilibria if σr is a best response to σc and σc is a best response to σr.
See how-to-check-best-responses
for guidance of how to use Nashpy to check if a strategy is a best response.