Stochastic fictitious play

The stochastic fictitious play algorithm implemented in Nashpy is based on the one given in [Hofbauer2002].

As explained in [Fudenberg1998] stochastic fictitious play "avoids the discontinuity inherent in standard fictitious play, where a small change in the data can lead to an abrupt change in behaviour."

The algorithm is designed to converge in cases where fictitious play does not converge. Note that in some cases this will require a thoughtful choice of the etha and epsilon_bar parameters.

For a game (A, B) ∈ ℝ^m × n define κ_tⁱ : S^− 1 → ℕ to be a function that in a given time interval t for a player i maps a strategy s from the opponent's strategy space S^− 1 to a number of total times the opponent has played s.

As per standard fictitious-play, each player assumes their opponent is playing a mixed strategy based on κ_t − 1. If no play has taken place, then the probability of playing each action is assumed to be equal. The assumed mixed strategies of a player's opponent are multplied by the player's own payoff matrices to calculate the expected payoff of each action.

A stochastic pertubation ϵ_i is added to each expected payoff π_i to give a pertubated payoff. Each ϵ_i is independent of each π_i and is a random number between 0 and epsilon_bar.

A logit choice function is used to map the pertubated payoff to a non-negative probability distribution, corresponding to the probability with which each strategy is chosen by the player. The logit choice function can be seen below:

$$L_i( \pi ) = \frac{\exp (\eta ^{-1} \pi_i )}{\sum_{j}\exp (\eta ^{-1} \pi_j)}$$

Discussion

Using the same game from the fictitious play discussion section, we can visualise a lack of convergence when using the default value of epsilon_bar:

>>> import numpy as np
>>> import nashpy as nash
>>> A = np.array([[0, 1, 0], [0, 0, 1], [1, 0, 0]])
>>> B = np.array([[0, 0, 1], [1, 0, 0], [0, 1, 0]])
>>> game = nash.Game(A, B)
>>> iterations = 10000
>>> np.random.seed(0)
>>> play_counts_and_distribuions = tuple(game.stochastic_fictitious_play(iterations=iterations))
>>> play_counts, distributions = play_counts_and_distribuions[-1]
>>> print(play_counts)
[array([3937., 1907., 4156.]), array([2823., 5458., 1719.])]

>>> import matplotlib.pyplot as plt
>>> plt.figure() # doctest: +SKIP
>>> probabilities = [
...     row_play_counts / np.sum(row_play_counts)
...     if np.sum(row_play_counts) != 0
...     else row_play_counts + 1 / len(row_play_counts)
...     for (row_play_counts, col_play_counts), _ in play_counts_and_distribuions]
>>> for number, strategy in enumerate(zip(*probabilities)):
...     plt.plot(strategy, label=f"$s_{number}$") # doctest: +SKIP
>>> plt.xlabel("Iteration") # doctest: +SKIP
>>> plt.ylabel("Probability") # doctest: +SKIP
>>> plt.title("Actions taken by row player") # doctest: +SKIP
>>> plt.legend() # doctest: +SKIP

Observe below that the game converges when passing values for etha and epsilon_bar:

>>> A = np.array([[1 / 2, 1, 0], [0, 1 / 2, 1], [1, 0, 1 / 2]])
>>> B = np.array([[1 / 2, 0, 1], [1, 1 / 2, 0], [0, 1, 1 / 2]])
>>> game = nash.Game(A, B)
>>> iterations = 10000
>>> etha = 0.1
>>> epsilon_bar = 10**-1
>>> np.random.seed(0)
>>> play_counts_and_distribuions = tuple(game.stochastic_fictitious_play(iterations=iterations, etha=etha, epsilon_bar=epsilon_bar))
>>> play_counts_and_distribuions[-1]
([array([3300., 3293., 3407.]), array([3320., 3372., 3308.])], [array([0.33502382, 0.41533594, 0.24964024]), array([0.18890743, 0.42793694, 0.38315563])])
>>> import matplotlib.pyplot as plt
>>> plt.figure() # doctest: +SKIP
>>> probabilities = [
...     row_play_counts / np.sum(row_play_counts)
...     if np.sum(row_play_counts) != 0
...     else row_play_counts + 1 / len(row_play_counts)
...     for (row_play_counts, col_play_counts), _ in play_counts_and_distribuions]
>>> for number, strategy in enumerate(zip(*probabilities)):
...     plt.plot(strategy, label=f"$s_{number}$") # doctest: +SKIP
>>> plt.xlabel("Iteration") # doctest: +SKIP
>>> plt.ylabel("Probability") # doctest: +SKIP
>>> plt.title("Actions taken by row player") # doctest: +SKIP
>>> plt.legend() # doctest: +SKIP

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

stochastic-fictitious-play.rst

stochastic-fictitious-play.rst

Stochastic fictitious play

Discussion

Files

stochastic-fictitious-play.rst

Latest commit

History

stochastic-fictitious-play.rst

File metadata and controls

Stochastic fictitious play

Discussion