# A first attempt at a Fictitious Play Game Solver

The below implementations follow the conventions given in Thomas Ferguson's "Game Theory", part II, section 4.7, pages 44 through 46. We briefly recap them here:

 * $A(i,j)$ is an $m \times n$ payoff matrix  with strategies $1, \ldots, m$ and $1, \ldots, n$ for players I and II, respectively.
 * We start with an arbitrary pure strategy for player **I** (defaulting to strategy 1). The players then alternate selecting best-response strategies, each assuming that the opposing player mixes over pure strategies with probabilities equal to the frequencies with which those strategies have occurred so far.
     * *(Specifically, if player I has strategies (1,2,3) and has played (1,1,3,1,2), then player II selects a best response assuming that player I's mixed strategy is (3/5, 1/5, 1/5).)*
 * The upper value  $\overline{V}_k$ and lower value $\underline{V}_k$ are calculated after each turn. These converge to the value of the game, but do not converge monotonically. Convergence is thought to be on the order of $1/\sqrt{k}.$
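
The frequency assumption in the parenthetical example can be checked numerically. The following is a minimal sketch, not part of Ferguson's text: the variable names are illustrative, and the payoff matrix is the example matrix from the Testing section below.

```python
import numpy as np

# Player I's history from the example above (1-indexed, as in the text)
history = np.array([1, 1, 3, 1, 2])
m = 3  # number of pure strategies for player I

# Empirical mixed strategy: count of each pure strategy / number of plays
freq = np.bincount(history - 1, minlength=m) / len(history)
print(freq)  # [0.6 0.2 0.2]

# Example payoff matrix (rows: player I, columns: player II)
A = np.array([[2, -1, 6], [0, 1, -1], [-2, 2, 1]])

# Player II picks the column with the smallest expected payoff
expected = freq @ A
best_response_II = int(np.argmin(expected)) + 1  # back to 1-indexed
```

Against this history, player II's best response is column 2, whose expected payoff is 0 (up to floating-point rounding), compared with 0.8 for column 1 and 3.6 for column 3.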

## Testing

In the book, the example matrix is given (on page 46) as

$$ A = \begin{pmatrix} 2 & -1 & 6 \\ 0 & 1 & -1 \\ -2 & 2 & 1 \end{pmatrix}$$

The game has value $.5$ and optimal mixed strategies of $(.25,.75,0)$ and $(.5,.5,0)$.
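
As a quick sanity check of the stated value and strategies, the sketch below (variable names are illustrative) confirms that player I's mix guarantees at least 0.5 against every column and that player II's mix concedes at most 0.5 against every row.

```python
import numpy as np

A = np.array([[2, -1, 6], [0, 1, -1], [-2, 2, 1]])
p = np.array([0.25, 0.75, 0.0])  # player I's stated optimal mix
q = np.array([0.5, 0.5, 0.0])    # player II's stated optimal mix

# Worst case for player I over player II's pure strategies
lower = (p @ A).min()
# Worst case for player II over player I's pure strategies
upper = (A @ q).max()
print(lower, upper)  # 0.5 0.5
```

Since the two bounds coincide, 0.5 is the value of the game and both mixes are optimal.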

A full table of relevant calculations up to round 15 is given on page 46. We omit it here.

If a deterministic fictitious play is used (by selecting the lowest-numbered strategy whenever several strategies tie on a round of play), the optimal strategy for player **II** is found on the second round. 

On the other hand, it appears that the algorithm is much slower to converge for player **I**. Two values of note are given in the book, and we use these to test our algorithms:
    
 * On round 13, the greatest lower value found so far, $\max_l \underline{V}_l$, is 5/13 = 0.3846..., yielding the mixed strategy $(5/13, 6/13, 2/13)$
 * On round 91, the greatest lower value found so far, $\max_l \underline{V}_l$, is 44/91 = .4835..., yielding the mixed strategy $(25/91, 63/91, 3/91)$
    


-------------------

## First implementation

Our first implementation, `naive_fp`, implements the algorithm as a serial calculation mirroring the calculations done in the book. It also implements a "pretty printing" of the resulting calculations to make it easier to verify that the calculations match up at each step to those given in the book (or done by hand).

The algorithm is 0-indexed. That is, the strategies for player **I** are given as $(0, \ldots, m-1),$ and the game starts on round 0.

The algorithm maintains its historical state in the following variables:
 * *payoff\_matrix*: A payoff matrix of arbitrary size  
 * $k$: the round index, beginning at 0
 * $i$: an array of the pure strategies selected for player I at each round; i.e., $i[4]$ is the strategy selected for player I as a best response to player II's first four plays $j_0, \ldots, j_3$. Given mathematically as the $i$ that maximizes the expectation $$\left( \frac{1}{k+1}\right) \sum_{l=0}^k A(i,j_l),$$ which is then played as $i_{k+1}$.
 * $j$: an array of the pure strategies selected for player II at each round. Given mathematically as the $j$ that minimizes the expectation $$\left( \frac{1}{k+1}\right) \sum_{l=0}^k A(i_l, j).$$
 * $s$: an array of incremental payoffs for player II. This is given as $$s_k(j) = \sum_{l=0}^k A(i_l,j),$$ which allows $j_k$ to be defined as $$ j_k = \text{argmin}_j \, s_k(j).$$
 * $t$: an array of incremental payoffs for player I. This is given as $$ t_{k}(i) = \sum_{l=0}^k A(i,j_l),$$ which yields $$i_{k+1} = \text{argmax}_i \, t_k(i).$$
 * *v_lower*: A list of lower bounds given on round $k$, defined as $$\left(\frac{1}{k+1}\right) s_{k}(j_k).$$
 * *v_upper*: A list of upper bounds given on round $k$, defined as $$\left(\frac{1}{k+1}\right) t_{k}(i_{k+1}).$$
 * *sup\_v\_upper*: Defined as the pair *(min(v\_upper), argmin(v\_upper))*; the least upper bound on the game's value found so far and the index at which it first occurred.
 * *inf\_v\_lower*: Defined as the pair *(max(v\_lower), argmax(v\_lower))*; the greatest lower bound on the game's value found so far and the index at which it first occurred.
 * *pI\_strategy*: Player I's empirical mixed strategy (the frequency of each pure strategy played) up to the round at which *inf\_v\_lower* occurred.
 * *pII\_strategy*: Player II's empirical mixed strategy (the frequency of each pure strategy played) up to the round at which *sup\_v\_upper* occurred.




The algorithm proceeds as follows:

 * An object is instantiated and the first round of play is conducted with the given initial value of $i$ (defaulting to $i = 0$).
 * For each round of play, we do the following in order:
  * Increment $k$ (the round index)
  * Calculate the incremental payoffs for player II as $s[k] = s[k-1] + A(i[k],:)$ and append to the array $s$; i.e., the previous row of incremental payoff plus the row of payoffs as selected by the current $i$.
  * Calculate the best-response for player II, $j[k]$, given $s[k].$ This is $\text{argmin }s[k].$
  * Calculate the incremental payoffs for player I as $t[k] = t[k-1] + A(:,j[k])$ and append to $t$.
  * Calculate the best-response for player $I$, $i[k+1]$, given $t[k]$.
  * Calculate *v_upper* and *v_lower* as the payoff given the best-response pure strategy on round $k$ divided by the number of rounds played, $k+1$ (for each player), and append.
  * Calculate *sup_v_upper* and *inf_v_lower* as the min/max of *v_upper/v_lower*.
  * Calculate the strategy for each player by counting the number of times each pure strategy was played on the rounds up to the last update of *sup_v_upper* and *inf_v_lower*, divided by the number of those rounds.
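
The steps above can be sketched as a single function that keeps only the running sums and the two lists of bounds. This is an illustrative condensation, not the implementation that follows; the name `fictitious_play_bounds` is hypothetical, and ties are broken by the lowest index because that is what `np.argmin` and `np.argmax` return.

```python
import numpy as np

def fictitious_play_bounds(A, rounds, initial_i=0):
    """Hypothetical helper: return (v_lower, v_upper) for rounds 0..rounds."""
    s = A[initial_i].astype(float)   # s[0]: incremental payoffs for player II
    j = int(np.argmin(s))            # j[0]: player II's best response
    t = A[:, j].astype(float)        # t[0]: incremental payoffs for player I
    i_next = int(np.argmax(t))       # i[1]: player I's best response
    v_lower, v_upper = [s[j]], [t[i_next]]

    for k in range(1, rounds + 1):
        s = s + A[i_next]            # s[k] = s[k-1] + A(i[k], :)
        j = int(np.argmin(s))        # j[k] = argmin s[k]
        t = t + A[:, j]              # t[k] = t[k-1] + A(:, j[k])
        i_next = int(np.argmax(t))   # i[k+1] = argmax t[k]
        v_lower.append(s[j] / (k + 1))
        v_upper.append(t[i_next] / (k + 1))
    return v_lower, v_upper

A = np.array([[2, -1, 6], [0, 1, -1], [-2, 2, 1]])
v_lower, v_upper = fictitious_play_bounds(A, rounds=4)
```

For the example matrix, the first five lower values are -1, 0, 0, 0.25, 0 and the first five upper values are 2, 1/2, 2/3, 1, 3/5.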


In [1]:
import numpy as np
from tabulate import tabulate

# The testing array
A = np.array([[2,-1,6],[0,1,-1],[-2,2,1]])


class naive_fp():

    def __init__(self, 
                 payoff_matrix, 
                 initial_i=0,
                 print_head=5,
                 print_tail=5,
                 print_indexing_at_1 = False):
        """Variables named according to Ferguson's 'Game Theory',
        part II, chapter 4, page 46.
        
        Note: to ease indexing, the strategies begin at 0; i.e.,
        a 3x3 matrix will have strategies 0, 1, and 2 rather than
        1, 2, and 3. This also means that the index for the `round`
        starts at k = 0 (meaning only the initial play has occurred).
        """
        
        # Set how many lines to pretty-print
        self.print_head, self.print_tail = print_head, print_tail 
        # Set whether the indexing for printing starts at 0 (default) or 1
        # (makes it easier to see if the values match the values in texts that
        # start at k = 1)
        self.print_indexing_at_1 = print_indexing_at_1
        
        
        self.payoff_matrix = payoff_matrix
        
        # Round index
        self.k = 0
        
        # Pure strategy selections for player I
        self.i = np.array([initial_i], dtype = int)
        # Incremental payoffs for player II
        self.s = np.array([self.payoff_matrix[self.i[self.k]]])
     
        # Pure strategy selection for player II
        self.j = np.array([np.argmin(self.s[self.k])], dtype = int)
        # Incremental payoffs for player I
        self.t = np.array([self.payoff_matrix[:, self.j[self.k]]])
        
        # Select best-response for player I; choose i_{k+1}
        self.i = np.append(self.i, np.argmax(self.t[self.k]))
        
        # Game value lists
        self.v_lower = np.array([self.s[self.k][self.j[self.k]]], dtype = np.float64)
        self.v_upper = np.array([self.t[self.k][self.i[self.k + 1]]], dtype = np.float64)
        
        
        # supremum of v_upper (same dict layout that next_rounds produces)
        self.sup_v_upper = {'value' : self.v_upper[0], 'index' : 0}
        # infimum of v_lower (same dict layout that next_rounds produces)
        self.inf_v_lower = {'value' : self.v_lower[0], 'index' : 0}
        
        ## Best strategies on round k
        # Player I's initial strategy is row initial_i with probability 1
        self.pI_strategy = np.zeros([payoff_matrix.shape[0]])
        self.pI_strategy[initial_i] = 1
        # Player II's initial strategy is j[0] = argmin(s[0]) with probability 1
        self.pII_strategy = np.zeros([payoff_matrix.shape[1]])
        self.pII_strategy[self.j[0]] = 1
        
        
    def next_rounds(self, rounds=1):
        """The rounds are calculated as follows:
        At the end of each round, all values of i[k], s[k], j[k], t[k], v_upper[k] and v_lower[k]
        have been updated.
        
        During each round, they are updated in that order.
        """

        
        for _ in range(rounds):   
            # Increment round
            self.k += 1
            
            # Increment payoffs for player II; calculate s_k
            self.s = np.append(self.s, [self.s[self.k - 1] + self.payoff_matrix[self.i[self.k]]], axis=0)
            # Select best-response for player II; choose j_k
            self.j = np.append(self.j, np.argmin(self.s[self.k]))
        
            # Increment payoffs for player I: calculate t_k
            self.t = np.append(self.t, [self.t[self.k - 1] + self.payoff_matrix[:, self.j[self.k]]], axis = 0)
            # Select best-response for player I; choose i_{k+1}
            self.i = np.append(self.i, np.argmax(self.t[self.k]))
       
    
            ## Change value bounds
            # Calculate v_upper_k
            self.v_upper = np.append(self.v_upper, 1 / (self.k+1) * self.t[self.k][self.i[self.k+1]])
            # Calculate v_lower_k
            self.v_lower = np.append(self.v_lower, 1 / (self.k + 1) * self.s[self.k][self.j[self.k]])
            
            
        
        # Compute supremum and infimum
        self.sup_v_upper = {'value' : min(self.v_upper), 'index' : np.argmin(self.v_upper)}
        self.inf_v_lower = {'value' : max(self.v_lower), 'index' : np.argmax(self.v_lower)}
      
        ## Update strategies    
        # Zero out the strategies
        self.pI_strategy = np.zeros(self.payoff_matrix.shape[0])
        # Get the number of times each pure strategy was played up to the inf
        strats_I, counts_I = np.unique(self.i[: self.inf_v_lower['index'] + 1], return_counts = True)
        # Convert the counts into frequencies (the empirical mixed strategy)
        for i,j in zip(strats_I, counts_I):
            self.pI_strategy[i] = j / (self.inf_v_lower['index']+1)
    
        # Same for player II
        self.pII_strategy = np.zeros(self.payoff_matrix.shape[1])
        # Get the number of times each pure strategy was played up to the sup
        strats_II, counts_II = np.unique(self.j[: self.sup_v_upper['index'] + 1], return_counts = True)
        for i,j in zip(strats_II, counts_II):
            self.pII_strategy[i] = j / (self.sup_v_upper['index']+1)   
    
    
    def __str__(self):
        ph = self.print_head
        pt = self.print_tail
        
        headers = ('k', 'i_k', 's_k', 'v_lower_k', 'j_k', 't_k', 'v_upper_k')
        
        bounds_str = "\n\nUpper game value: " + str(self.sup_v_upper) + "\n" \
            "Lower game value: " + str(self.inf_v_lower) + "\n\n"
        
        strategies_str = "Player I optimal strategy: \n\t" + str(list(self.pI_strategy)) + \
                         "\nPlayer II optimal strategy: \n\t" + str(list(self.pII_strategy)) +\
                         "\n\n"
        
        # Build up a return value as we go
        rv = bounds_str + strategies_str
        
        if ph + pt >= self.k+1:
            # If the number of rounds played is less than or equal to the total
            # number of lines to print, we don't need to print a line of '[...]'.
            
            # Return table with indices starting at 1
            if self.print_indexing_at_1 == True:
                data = zip(
                    list(range(1, self.k + 2)), 
                    self.i + 1, 
                    self.s, 
                    self.v_lower, 
                    self.j + 1, 
                    self.t, 
                    self.v_upper)
                rv += tabulate(data, headers = headers)
                return rv
                
            # Return table with indices starting at 0
            data = zip(
                list(range(self.k + 1)), 
                self.i, 
                self.s, 
                self.v_lower, 
                self.j, 
                self.t, 
                self.v_upper)
            rv += tabulate(data, headers = headers)
            return rv
        
        # Total number of rounds exceeds ph+pt; print a line break
        line_break = "\n\n[...]\n" + str(self.k - ph - pt) + " lines skipped... \n" + "[...]\n\n"
        
        if self.print_indexing_at_1 == True:
                data_head = list(zip(
                    list(range(1, ph + 1)), 
                    self.i[:ph] + 1, 
                    self.s[:ph], 
                    self.v_lower[:ph], 
                    self.j[:ph] + 1, 
                    self.t[:ph], 
                    self.v_upper[:ph]))
                 
                rv += tabulate(data_head, headers = headers)
                rv += line_break
                
                data_tail = list(zip(
                    list(range(self.k - pt + 1, self.k + 2)), 
                    self.i[self.k - pt:self.k] + 1, 
                    self.s[self.k - pt:self.k], 
                    self.v_lower[self.k - pt:self.k], 
                    self.j[self.k - pt:self.k] + 1, 
                    self.t[self.k - pt:self.k], 
                    self.v_upper[self.k - pt:self.k]))
                
                rv += tabulate(data_tail, headers = headers)
                
                return rv
            
        # Return table with indices starting at 0
        data_head = list(zip(
                    list(range(ph)), 
                    self.i[:ph], 
                    self.s[:ph], 
                    self.v_lower[:ph], 
                    self.j[:ph], 
                    self.t[:ph], 
                    self.v_upper[:ph]))
        
        rv += tabulate(data_head, headers = headers)            
        rv += line_break
        
        data_tail = list(zip(
                    list(range(self.k  - pt, self.k + 2)), 
                    self.i[self.k - pt:self.k], 
                    self.s[self.k - pt:self.k], 
                    self.v_lower[self.k - pt:self.k], 
                    self.j[self.k - pt:self.k], 
                    self.t[self.k - pt:self.k], 
                    self.v_upper[self.k - pt:self.k]))
        
        rv += tabulate(data_tail, headers = headers)
        return rv

Let's test with matrix $A$. On round 13, we should get the values:

In [2]:
print("Upper game value = .5, index = 1")
print("Lower game value = ", 5/13, " index = 12")
print("pI strategy = ", (5/13, 6/13, 2/13))
print("pII strategy = (.5, .5, 0)")

Upper game value = .5, index = 1
Lower game value =  0.38461538461538464  index = 12
pI strategy =  (0.38461538461538464, 0.46153846153846156, 0.15384615384615385)
pII strategy = (.5, .5, 0)


In [3]:
A_naive = naive_fp(A)
A_naive.next_rounds(12)
print(A_naive)



Upper game value: {'value': 0.5, 'index': 1}
Lower game value: {'value': 0.38461538461538464, 'index': 12}

Player I optimal strategy: 
	[0.38461538461538464, 0.46153846153846156, 0.15384615384615385]
Player II optimal strategy: 
	[0.5, 0.5, 0.0]

  k    i_k  s_k           v_lower_k    j_k  t_k           v_upper_k
---  -----  ----------  -----------  -----  ----------  -----------
  0      0  [ 2 -1  6]        -1         1  [-1  1  2]     2
  1      2  [0 1 7]            0         0  [1 1 0]        0.5
  2      0  [ 2  0 13]         0         1  [0 2 2]        0.666667
  3      1  [ 2  1 12]         0.25      1  [-1  3  4]     1
  4      2  [ 0  3 13]         0         0  [1 3 2]        0.6

[...]
2 lines skipped... 
[...]

  k    i_k  s_k           v_lower_k    j_k  t_k        v_upper_k
---  -----  ----------  -----------  -----  -------  -----------
  7      0  [ 4  2 24]     0.25          1  [4 4 0]     0.5
  8      0  [ 6  1 30]     0.111111      1  [3 5 2]     0.555556
  9      

These values check out, and the values of the table check out (when adjusted for the 0-indexing). Let's try round 91. We should get a value of 44/91 = .4835... and an optimal strategy of (25/91, 63/91, 3/91).

In [4]:
(25/91, 63/91, 3/91)

(0.27472527472527475, 0.6923076923076923, 0.03296703296703297)

In [5]:
A_naive.next_rounds(91-13)

In [6]:
print(A_naive)



Upper game value: {'value': 0.5, 'index': 1}
Lower game value: {'value': 0.4835164835164836, 'index': 90}

Player I optimal strategy: 
	[0.27472527472527475, 0.6923076923076923, 0.03296703296703297]
Player II optimal strategy: 
	[0.5, 0.5, 0.0]

  k    i_k  s_k           v_lower_k    j_k  t_k           v_upper_k
---  -----  ----------  -----------  -----  ----------  -----------
  0      0  [ 2 -1  6]        -1         1  [-1  1  2]     2
  1      2  [0 1 7]            0         0  [1 1 0]        0.5
  2      0  [ 2  0 13]         0         1  [0 2 2]        0.666667
  3      1  [ 2  1 12]         0.25      1  [-1  3  4]     1
  4      2  [ 0  3 13]         0         0  [1 3 2]        0.6

[...]
80 lines skipped... 
[...]

  k    i_k  s_k           v_lower_k    j_k  t_k           v_upper_k
---  -----  ----------  -----------  -----  ----------  -----------
 85      1  [44 39 95]     0.453488      1  [37 45  8]     0.523256
 86      1  [44 40 94]     0.45977       1  [36 46 10]     0.

Our naive algorithm works!

## Serial Improvements
A number of improvements to the algorithm are immediately apparent:

 * Storing the entire history is unnecessary. 
     * The values of $s,t$ are calculated recursively. We only need to keep the current value of each.
     * The actual values of $i$ and $j$ do not need to be kept; only their cumulative counts.
     * The upper/lower values of $v$ do not need to be kept; only the sup/inf need to be updated.
 * The sup/inf of $v$ can be updated as needed, rather than calculated by a full search of all historical values.
 * The optimal strategies can be stored with a divisor; the division only has to occur once.
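
The last two points can be illustrated in isolation. The sketch below is only a toy: the lower values are the first five from the table above, the counts and divisor are the round-13 figures quoted earlier, and all variable names are hypothetical.

```python
import numpy as np

# First five lower values from the table above
v_lower = [-1.0, 0.0, 0.0, 0.25, 0.0]

# Running update: one comparison per round, no search through the history
best_value, best_index = float("-inf"), None
for k, v in enumerate(v_lower):
    # The strict comparison keeps the first round at which the best value occurred
    if v > best_value:
        best_value, best_index = v, k

print(best_value, best_index)  # 0.25 3

# Store raw counts plus a divisor; divide only when the strategy is read
counts, divisor = np.array([5, 6, 2]), 13
strategy = counts / divisor
```

The running update gives the same answer as `max(v_lower)` and `np.argmax(v_lower)` on the full history.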

The serial algorithm maintains its state in the following variables:
 * $k$: the round index
 * *payoff\_matrix*: an $m \times n$ payoff matrix
 * *i\_counts* and *j\_counts*: lists of size $m$ and $n$, respectively, whose $l$th entry corresponds to the number of times strategy $l$ was played on rounds $(0,\ldots, k)$ (this is an invariant).


In [30]:
class serial_fp():
    def __init__(self, payoff_matrix, initial_i = 0):
        self.payoff_matrix = payoff_matrix
        
        # Round index
        self.k = 0
        
        # Create counts
        self.i_counts = np.zeros(payoff_matrix.shape[0])      
        self.j_counts = np.zeros(payoff_matrix.shape[1])

        ## Initial play
        # Initial i
        self.i_counts[initial_i] += 1
        
        # Recursively generated payoffs for player II: s_0
        self.s = self.payoff_matrix[initial_i]
        
        # Initial best response j_0 for player II; count it so that
        # j_counts always covers rounds 0..k
        j = np.argmin(self.s)
        self.j_counts[j] += 1
        
        # Recursively generated payoffs for player I: t_0
        self.t = self.payoff_matrix[:,j]
        
        # The infimum of the v_lower calculated so far
        self.inf_v_lower = float("-inf")
        # The supremum of the v_upper calculated so far
        self.sup_v_upper = float("inf")
        
        # Optimal strategies at round k
        self.player_I_optimal_strategy_with_divisor = [None, None]
        self.player_II_optimal_strategy_with_divisor = [None, None]
        
    def _calculate_player_I_strategy(self, i_counts, sj, k):
        # v_lower_k = s_k(j_k) / (k + 1)
        if sj/(k + 1) > self.inf_v_lower:
            self.inf_v_lower = sj/(k + 1)
            # Store a copy: i_counts keeps changing on later rounds
            self.player_I_optimal_strategy_with_divisor = [i_counts.copy(), k + 1]

    def _calculate_player_II_strategy(self, j_counts, ti, k):
        # v_upper_{k-1} = t_{k-1}(i_k) / k
        if ti/(k) < self.sup_v_upper:
            self.sup_v_upper = ti/(k)
            # Store a copy: j_counts keeps changing on later rounds
            self.player_II_optimal_strategy_with_divisor = [j_counts.copy(), k]
            
    def next_rounds(self, rounds = 1):
        for _ in range(rounds):
            # Increment round
            self.k += 1
            
            # Calculate the best-response i_k and update the count for player I
            i = np.argmax(self.t)
            self.i_counts[i] += 1
            
            # Update player II's bound before j_k is counted: at this point
            # j_counts holds exactly the k plays j_0..j_{k-1} behind t_{k-1}
            self._calculate_player_II_strategy(self.j_counts, self.t[i], self.k)
            
            # Recursively generate incremental payoffs for player II: s_k
            self.s = self.s + self.payoff_matrix[i]
            
            # Calculate the best-response j_k and update the count for player II
            j = np.argmin(self.s)
            self.j_counts[j] += 1
            
            self._calculate_player_I_strategy(self.i_counts, self.s[j], self.k)
                        
            # Recursively generate incremental payoffs for player I: t_k
            self.t = self.t + self.payoff_matrix[:, j]   

Let us test with $A$:

In [20]:
A_sfp = serial_fp(A)
A_sfp.next_rounds(12)

In [21]:
print("Supremum value on round 13: ", A_sfp.sup_v_upper)
print("Infimum value on round 13: ", A_sfp.inf_v_lower)

print("\nPlayer I optimal strategy on round 13: ", A_sfp.player_I_optimal_strategy_with_divisor[0]/A_sfp.player_I_optimal_strategy_with_divisor[1])
print("Player II optimal strategy on round 13: ", A_sfp.player_II_optimal_strategy_with_divisor[0]/A_sfp.player_II_optimal_strategy_with_divisor[1])

Supremum value on round 13:  0.5
Infimum value on round 13:  0.38461538461538464

Player I optimal strategy on round 13:  [0.38461538 0.46153846 0.15384615]
Player II optimal strategy on round 13:  [0.5 0.5 0. ]


In [22]:
A_naive = naive_fp(A)
A_naive.next_rounds(12)
print(A_naive)



Upper game value: {'value': 0.5, 'index': 1}
Lower game value: {'value': 0.38461538461538464, 'index': 12}

Player I optimal strategy: 
	[0.38461538461538464, 0.46153846153846156, 0.15384615384615385]
Player II optimal strategy: 
	[0.5, 0.5, 0.0]

  k    i_k  s_k           v_lower_k    j_k  t_k           v_upper_k
---  -----  ----------  -----------  -----  ----------  -----------
  0      0  [ 2 -1  6]        -1         1  [-1  1  2]     2
  1      2  [0 1 7]            0         0  [1 1 0]        0.5
  2      0  [ 2  0 13]         0         1  [0 2 2]        0.666667
  3      1  [ 2  1 12]         0.25      1  [-1  3  4]     1
  4      2  [ 0  3 13]         0         0  [1 3 2]        0.6

[...]
2 lines skipped... 
[...]

  k    i_k  s_k           v_lower_k    j_k  t_k        v_upper_k
---  -----  ----------  -----------  -----  -------  -----------
  7      0  [ 4  2 24]     0.25          1  [4 4 0]     0.5
  8      0  [ 6  1 30]     0.111111      1  [3 5 2]     0.555556
  9      

The values match! 
Let's see how much faster we've become:

In [27]:
A_naive = naive_fp(A)
%time A_naive.next_rounds(10)
%time A_naive.next_rounds(1000)
%time A_naive.next_rounds(10000)
A_naive.pI_strategy

CPU times: user 2.73 ms, sys: 985 µs, total: 3.72 ms
Wall time: 3.77 ms
CPU times: user 92.3 ms, sys: 945 µs, total: 93.3 ms
Wall time: 92.6 ms
CPU times: user 1.34 s, sys: 0 ns, total: 1.34 s
Wall time: 1.34 s


array([2.50204341e-01, 7.49523204e-01, 2.72454818e-04])

In [32]:
A_sfp = serial_fp(A)
%time A_sfp.next_rounds(10)
%time A_sfp.next_rounds(1000)
%time A_sfp.next_rounds(10000)
A_sfp.player_I_optimal_strategy_with_divisor[0]/A_sfp.player_I_optimal_strategy_with_divisor[1]

CPU times: user 733 µs, sys: 79 µs, total: 812 µs
Wall time: 564 µs
CPU times: user 44 ms, sys: 4.66 ms, total: 48.6 ms
Wall time: 39.5 ms
CPU times: user 291 ms, sys: 62.8 ms, total: 354 ms
Wall time: 275 ms


array([2.50204341e-01, 7.49523204e-01, 2.72454818e-04])

This is a pretty good speedup! Let's see if we can parallelize.

## Parallelization
*(in progress)*
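
The hand-off pattern this section relies on (a producer that queues values and a worker thread that keeps a running extremum) can be sketched on its own. This is a generic illustration using only the standard library, not the class below: the worker function, the `None` sentinel, and the `result` dict are all hypothetical choices.

```python
import queue
import threading

def running_max_worker(q, result):
    """Consume (value, round) pairs and keep the largest value seen."""
    while True:
        item = q.get()
        if item is None:          # sentinel: no more work
            q.task_done()
            break
        value, k = item
        if value > result["value"]:
            result["value"], result["index"] = value, k
        q.task_done()             # lets q.join() know this item is finished

q = queue.Queue()
result = {"value": float("-inf"), "index": None}
worker = threading.Thread(target=running_max_worker, args=(q, result), daemon=True)
worker.start()

# Producer: queue the first five lower values from the table above
for k, v in enumerate([-1.0, 0.0, 0.0, 0.25, 0.0]):
    q.put((v, k))
q.put(None)

q.join()       # block until every queued item has been processed
worker.join()
print(result)  # {'value': 0.25, 'index': 3}
```

Calling `q.join()` before reading `result` matters: without it, the main thread may read the running extremum while items are still waiting in the queue.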

In [25]:
import threading
import queue

class threaded_fp():
    def __init__(self, payoff_matrix, initial_i = 0):
        self.payoff_matrix = payoff_matrix
        
        # Round index
        self.k = 0
        
        # Create counts and FIFO queues
        self.i_counts = np.zeros(payoff_matrix.shape[0])
        self.i_counts[initial_i] += 1
        
        self.j_counts_ti_k_queue = queue.Queue()
        
        self.j_counts = np.zeros(payoff_matrix.shape[1])
        self.i_counts_sj_k_queue = queue.Queue()
        
        # Recursively generated payoffs for player II: s_0
        self.s = self.payoff_matrix[initial_i]
        
        # Initial best response j_0 for player II
        j = np.argmin(self.s)
        self.j_counts[j] += 1
        # Queue the round-0 lower value with a snapshot of the counts
        self.i_counts_sj_k_queue.put((self.i_counts.copy(), self.s[j], 0))
        
        # Recursively generated payoffs for player I: t_0
        self.t = self.payoff_matrix[:,j]
        
        # The infimum of the v_lower calculated so far
        self.inf_v_lower = float("-inf")
        # The supremum of the v_upper calculated so far
        self.sup_v_upper = float("inf")
        
        # Optimal strategies at round k
        self.player_I_optimal_strategy_with_divisor = [None, None]
        self.player_II_optimal_strategy_with_divisor = [None, None]
        
        # Start one worker thread per player, once per object, so that
        # repeated calls to next_rounds do not pile up competing workers
        self.pI_strategies = threading.Thread(
                                group = None,
                                target = self._calculate_player_I_strategy,
                                name = 'pI_strategies')
        self.pI_strategies.daemon = True
        self.pI_strategies.start()
            
        self.pII_strategies = threading.Thread(
                                group = None,
                                target = self._calculate_player_II_strategy,
                                name = 'pII_strategies')
        self.pII_strategies.daemon = True
        self.pII_strategies.start()
        
    def _calculate_player_I_strategy(self):
        while True:
            i_counts, sj, k = self.i_counts_sj_k_queue.get()                
            if sj/(k + 1) > self.inf_v_lower:
                self.inf_v_lower = sj/(k + 1)
                self.player_I_optimal_strategy_with_divisor = [i_counts, k + 1]
            # Mark the item as processed so that queue.join() can return
            self.i_counts_sj_k_queue.task_done()

    def _calculate_player_II_strategy(self):
        while True:
            j_counts, ti, k = self.j_counts_ti_k_queue.get()
            if ti/(k) < self.sup_v_upper:
                self.sup_v_upper = ti/(k)
                self.player_II_optimal_strategy_with_divisor = [j_counts, k]
            # Mark the item as processed so that queue.join() can return
            self.j_counts_ti_k_queue.task_done()
            
    def next_rounds(self, rounds = 1):
        """At the start of each round, we've calculated:
        i_k, s_k, j_k, t_k"""
        for _ in range(rounds):
            
            # Increment round
            self.k += 1
            
            i = np.argmax(self.t)
            self.i_counts[i] += 1
            
            # Queue v_upper_{k-1} = t_{k-1}(i_k) / k with a snapshot of the
            # k plays j_0..j_{k-1}; the copy keeps later rounds from changing it
            self.j_counts_ti_k_queue.put((self.j_counts.copy(), self.t[i], self.k))
            
            # Recursively generated payoffs for player II
            self.s = self.s + self.payoff_matrix[i]
            
            j = np.argmin(self.s)
            self.j_counts[j] += 1
            
            # Queue v_lower_k = s_k(j_k) / (k + 1) with a snapshot of i_0..i_k
            self.i_counts_sj_k_queue.put((self.i_counts.copy(), self.s[j], self.k))
                        
            # Recursively generated payoffs for player I
            self.t = self.t + self.payoff_matrix[:, j]           
        
        # Wait until both workers have processed everything queued so far,
        # so that results read after this call are complete
        self.i_counts_sj_k_queue.join()
        self.j_counts_ti_k_queue.join()

In [29]:
A_tfp = threaded_fp(A)
%time A_tfp.next_rounds(10)
%time A_tfp.next_rounds(1000)
%time A_tfp.next_rounds(10000)
A_tfp.player_I_optimal_strategy_with_divisor[0]/A_tfp.player_I_optimal_strategy_with_divisor[1]

CPU times: user 1.31 ms, sys: 1.05 ms, total: 2.36 ms
Wall time: 2.45 ms
CPU times: user 61.1 ms, sys: 11.5 ms, total: 72.6 ms
Wall time: 62.8 ms
CPU times: user 743 ms, sys: 178 ms, total: 922 ms
Wall time: 754 ms


array([2.50204341e-01, 7.49523204e-01, 2.72454818e-04])

In [15]:
R = np.random.rand(10000,10000)

In [16]:
R_naive = naive_fp(R)
R_serial = serial_fp(R)
R_threaded = threaded_fp(R)

In [17]:
%time R_threaded.next_rounds(1000)

CPU times: user 284 ms, sys: 38.3 ms, total: 323 ms
Wall time: 272 ms


In [18]:
%time R_serial.next_rounds(1000)

CPU times: user 197 ms, sys: 531 µs, total: 198 ms
Wall time: 206 ms
