# Payoff Matrices (part 1)

> This module contains payoff matrices for different evolutionary games
>
> Part 1 contains payoff matrices for the following games
> - DSAIR
> - DSAIR with peer punishment or reward
> - DSAIR with voluntary commitments
> - DSAIR with collective risk
>
> Note that all of the payoff matrices here are replications of the models from Han et al. 2020, 2021, 2022.

In [1]:
#| default_exp payoffs

In [2]:
#| hide
#| export
from nbdev.showdoc import *
from fastcore.test import test_eq, test_close
import collections
import functools
from gh_pages_example.utils import *
from gh_pages_example.types import *
from gh_pages_example.methods import *
from gh_pages_example.model_utils import *
import itertools
import math
import typing

import fastcore.test
import more_itertools
import numpy as np
import nptyping



In [3]:
np.set_printoptions(suppress=True) # don't use scientific notation

## DSAIR Model Parameters

| keyword | value type | range | optional | description | 
|---------|------------|-------|----------|-------------|
| b | NDArray | b > 0| | The size of the per round benefit of leading the AI development race|
| c | NDArray | c > 0| | The cost of implementing safety recommendations per round|
| s | NDArray | s > 1| | The speed advantage from choosing to ignore safety recommendations|
| p | NDArray | [0, 1]| | The probability that unsafe firms avoid an AI disaster|
| B | NDArray | B >> b| | The size of the prize from winning the AI development race|
| W | NDArray | $$[10, 10^6]$$| | The anticipated timeline until the development race has a winner if everyone behaves safely|
| pfo | NDArray | [0, 1]|Yes| The probability that firms who ignore safety precautions are found out|
| epsilon | NDArray | ϵ > 0|Yes| The cost of setting up a voluntary commitment|
| ω | NDArray | [0, 1]|Yes| Noise in arranging an agreement, with some probability they fail to succeed in making an agreement|

In [4]:
show_doc(ModelTypeDSAIR)

---

[source](https://github.com/PaoloBova/gh-pages-example/blob/main/gh_pages_example/types.py#LNone){target="_blank" style="float:right; font-size:smaller"}

### ModelTypeDSAIR

>      ModelTypeDSAIR (b:gh_pages_example.types.Array1D,
>                      c:gh_pages_example.types.Array1D,
>                      s:gh_pages_example.types.Array1D,
>                      p:gh_pages_example.types.Array1D,
>                      B:gh_pages_example.types.Array1D,
>                      W:gh_pages_example.types.Array1D,
>                      pfo:gh_pages_example.types.Array1D=None,
>                      α:gh_pages_example.types.Array1D=None,
>                      γ:gh_pages_example.types.Array1D=None,
>                      epsilon:gh_pages_example.types.Array1D=None,
>                      ω:gh_pages_example.types.Array1D=None,
>                      collective_risk:gh_pages_example.types.Array1D=None)

This is the schema for the inputs to a DSAIR model.

Note: This schema is not enforced and is here purely for documentation
purposes.

|    | **Type** | **Default** | **Details** |
| -- | -------- | ----------- | ----------- |
| b | Array1D |  | benefit: The size of the per round benefit of leading the AI development race, b>0 |
| c | Array1D |  | cost: The cost of implementing safety recommendations per round, c>0 |
| s | Array1D |  | speed: The speed advantage from choosing to ignore safety recommendations, s>1 |
| p | Array1D |  | avoid_risk: The probability that unsafe firms avoid an AI disaster, p ∈ [0, 1] |
| B | Array1D |  | prize: The size of the prize from winning the AI development race, B>>b |
| W | Array1D |  | timeline: The anticipated timeline until the development race has a winner if everyone behaves safely, W ∈ [10, 10**6] |
| pfo | Array1D | None | detection risk: The probability that firms who ignore safety precautions are found out, pfo ∈ [0, 1] |
| α | Array1D | None | the cost of rewarding/punishing a peer |
| γ | Array1D | None | the effect of a reward/punishment on a developer's speed |
| epsilon | Array1D | None | commitment_cost: The cost of setting up and maintaining a voluntary commitment, ϵ > 0 |
| ω | Array1D | None | noise: Noise in arranging an agreement, with some probability they fail to succeed in making an agreement, ω ∈ [0, 1] |
| collective_risk | Array1D | None | The likelihood that a disaster affects all actors |

In [5]:
show_doc(Array1D)

---

[source](https://github.com/PaoloBova/gh-pages-example/blob/main/gh_pages_example/types.py#LNone){target="_blank" style="float:right; font-size:smaller"}

### Array1D

>      Array1D (ModelVector:NDArray[Shape['N_models'],Any])

An alias for a 1D numpy array.

|    | **Type** | **Details** |
| -- | -------- | ----------- |
| ModelVector | NDArray | A 1D numpy array suitable for stacks of scalar parameter values |

In [6]:
#| export
valid_dtypes = typing.Union[float, list[float], np.ndarray, dict]
def build_DSAIR(b:valid_dtypes=4, # benefit: The size of the per round benefit of leading the AI development race, b>0
                c:valid_dtypes=1, # cost: The cost of implementing safety recommendations per round, c>0
                s:valid_dtypes={"start":1, # speed: The speed advantage from choosing to ignore safety recommendations, s>1
                                "stop":5.1,
                                "step":0.1}, 
                p:valid_dtypes={"start":0, # avoid_risk: The probability that unsafe firms avoid an AI disaster, p ∈ [0, 1]
                                "stop":1.02,
                                "step":0.02}, 
                B:valid_dtypes=10**4, # prize: The size of the prize from winning the AI development race, B>>b
                W:valid_dtypes=100, # timeline: The anticipated timeline until the development race has a winner if everyone behaves safely, W ∈ [10, 10**6]
                pfo:valid_dtypes=0, # detection risk: The probability that firms who ignore safety precautions are found out, pfo ∈ [0, 1]
                α:valid_dtypes=0, # the cost of rewarding/punishing a peer
                γ:valid_dtypes=0, # the effect of a reward/punishment on a developer's speed
                epsilon:valid_dtypes=0, # commitment_cost: The cost of setting up and maintaining a voluntary commitment, ϵ > 0
                ω:valid_dtypes=0, # noise: Noise in arranging an agreement, with some probability they fail to succeed in making an agreement, ω ∈ [0, 1]
                collective_risk:valid_dtypes=0, # The likelihood that a disaster affects all actors
                β:valid_dtypes=0.01, # learning_rate: the rate at which players imitate each other
                Z:int=100, # population_size: the number of players in the evolutionary game
                strategy_set:list[str]=["AS", "AU"], # the set of available strategies
                exclude_args:list[str]=['Z', 'strategy_set'], # a list of arguments that should be returned as they are
                override:bool=False, # whether to build the grid if it is very large
                drop_args:list[str]=['override', 'exclude_args', 'drop_args'], # a list of arguments to drop from the final result
               ) -> dict: # A dictionary containing items from `ModelTypeDSAIR` and `ModelTypeEGT`
    """Initialise baseline DSAIR models for all combinations of the provided
    parameter values. By default, we create models for replicating Figure 1
    of Han et al. 2021."""
    
    saved_args = locals()
    models = model_builder(saved_args,
                           exclude_args=exclude_args,
                           override=override,
                           drop_args=drop_args)
    return models

## DSAIR Payoff Matrix (Short Run)

| Strategy | Safe | Unsafe |
|----------|---|---|
| **Safe** | $$\frac{b}{2} - c$$|  $$\frac{b}{s+1} - c$$ |
| **Unsafe** | $$b \frac{s}{s+1}$$| $$\frac{b}{2} $$|
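
As a quick numerical check of the table above, the sketch below evaluates the four entries for a single parameter set. The helper name `short_run_matrix` and the values `b=4`, `c=1`, `s=1.5` are illustrative assumptions and are not part of the exported module.

```python
import numpy as np

def short_run_matrix(b: float, c: float, s: float) -> np.ndarray:
    """Hypothetical scalar helper mirroring the short-run table."""
    safe_vs_safe = b / 2 - c            # equal shares, Safe pays the cost c
    safe_vs_unsafe = b / (s + 1) - c    # Safe gets the smaller share
    unsafe_vs_safe = b * s / (s + 1)    # Unsafe gets the larger share
    unsafe_vs_unsafe = b / 2            # equal shares, no safety cost
    return np.array([[safe_vs_safe, safe_vs_unsafe],
                     [unsafe_vs_safe, unsafe_vs_unsafe]])

example_sr = short_run_matrix(b=4, c=1, s=1.5)
print(example_sr)
```

With these values the Unsafe row dominates in the short run (2.4 > 1 and 2 > 0.6), which is why the long-run prize and disaster risk matter for the dilemma.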

In [7]:
#| export
def payoffs_sr(models:dict, # A dictionary containing the items in `ModelTypeDSAIR`
              ) -> dict : # The `models` dictionary with added payoff matrix `payoffs_sr`
    """The short run payoffs for the DSAIR game."""
    s, b, c = [models[k] for k in ['s', 'b', 'c']]
    πAA = -c + b/2
    πAB = -c + b/(s+1)
    πBA = s*b/(s+1)
    πBB = b/2
    
    # Promote all stacks to 3D arrays
    πAA = πAA[:, None, None]
    πAB = πAB[:, None, None]
    πBA = πBA[:, None, None]
    πBB = πBB[:, None, None]
    matrix = np.block([[πAA, πAB], 
                       [πBA, πBB]])
    return {**models, 'payoffs_sr':matrix}

In [8]:
show_doc(payoffs_sr)

---

[source](https://github.com/PaoloBova/gh-pages-example/blob/main/gh_pages_example/payoffs.py#L63){target="_blank" style="float:right; font-size:smaller"}

### payoffs_sr

>      payoffs_sr (models:dict)

The short run payoffs for the DSAIR game.

|    | **Type** | **Details** |
| -- | -------- | ----------- |
| models | dict | A dictionary containing the items in `ModelTypeDSAIR` |
| **Returns** | **dict** | **The `models` dictionary with added payoff matrix `payoffs_sr`** |
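
The "promote to 3D, then `np.block`" step inside `payoffs_sr` may be easier to follow on a toy input. The sketch below uses made-up entry values for three models; only the shapes matter.

```python
import numpy as np

# One value per model for each of the four matrix entries (made-up numbers).
aa = np.array([1.0, 2.0, 3.0])
ab = np.array([0.1, 0.2, 0.3])
ba = np.array([10.0, 20.0, 30.0])
bb = np.array([5.0, 6.0, 7.0])

# Promote each stack from shape (N,) to shape (N, 1, 1) ...
aa3, ab3, ba3, bb3 = (x[:, None, None] for x in (aa, ab, ba, bb))

# ... so that np.block joins them along the last two axes,
# giving one 2x2 payoff matrix per model: shape (N, 2, 2).
stacked = np.block([[aa3, ab3],
                    [ba3, bb3]])

print(stacked.shape)
print(stacked[1])  # the 2x2 matrix belonging to the second model
```

Keeping the model index on the first axis means later functions can slice single entries with `[:, i:i+1, j:j+1]` and still broadcast against parameters of shape `(N, 1, 1)`.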

## DSAIR Payoff Matrix (Short Run) with probability of being found out

| Strategy | Safe | Unsafe |
|----------|---|---|
| **Safe** | $$\frac{b}{2} - c$$|  $$(1 - p_{fo}) \frac{b}{s+1} + p_{fo} b - c$$ |
| **Unsafe** | $$ (1 - p_{fo}) b \frac{s}{s+1}$$| $$(1 - p_{fo}^2) \frac{b}{2} $$|
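
The sketch below evaluates this table at the two extremes of $p_{fo}$. It is a scalar illustration only: the helper name `short_run_matrix_pfo` and the parameter values are assumptions chosen for the example.

```python
import numpy as np

def short_run_matrix_pfo(b: float, c: float, s: float, pfo: float) -> np.ndarray:
    """Hypothetical scalar helper mirroring the table with detection risk."""
    safe_vs_safe = b / 2 - c
    safe_vs_unsafe = (1 - pfo) * b / (s + 1) + pfo * b - c
    unsafe_vs_safe = (1 - pfo) * b * s / (s + 1)
    unsafe_vs_unsafe = (1 - pfo**2) * b / 2
    return np.array([[safe_vs_safe, safe_vs_unsafe],
                     [unsafe_vs_safe, unsafe_vs_unsafe]])

no_detection = short_run_matrix_pfo(b=4, c=1, s=1.5, pfo=0.0)
certain_detection = short_run_matrix_pfo(b=4, c=1, s=1.5, pfo=1.0)
print(no_detection)
print(certain_detection)
```

Setting `pfo=0` recovers the baseline short-run matrix, while `pfo=1` leaves unsafe firms with no per-round benefit at all.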

In [9]:
#| export

def payoffs_sr_pfo_extension(models):
    """The short run payoffs for the DSAIR game with a chance of unsafe
    behaviour being spotted."""
    s, b, c, pfo = [models[k] for k in ['s', 'b', 'c', 'pfo']]
    πAA = -c + b/2
    πAB = -c + b/(s+1) * (1 - pfo) + pfo * b
    πBA = (1 - pfo) * s * b / (s+1)
    πBB = (1 - pfo**2) * b/2
    
    # Promote all stacks to 3D arrays
    πAA = πAA[:, None, None]
    πAB = πAB[:, None, None]
    πBA = πBA[:, None, None]
    πBB = πBB[:, None, None]
    matrix = np.block([[πAA, πAB],
                       [πBA, πBB]])
    return {**models, 'payoffs_sr':matrix}

## DSAIR Payoff Matrix (Long Run)

Denote $\pi$ as one of the short run payoff matrices discussed above with rows and columns indexed by letters A, B, ...

| Strategy | Always Safe | Always Unsafe |
|----------|---|---|
| **Always Safe** | $$πAA + \frac{B}{2W}$$|  $$πAB$$ |
| **Always Unsafe** | $$p \, (s \frac{B}{W} + πBA)$$| $$p \, (s \frac{B}{2W} + πBB)$$|

*Note: In a model where we suffer a collective risk of an AI disaster if the winner is unsafe, payoffs for firms who play safe when facing an unsafe firm are also multiplied by $p$.*
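
To make the long-run table concrete, here is a scalar sketch for one parameter set. The helper `long_run_matrix` is hypothetical, and the short-run entries are hard-coded for `b=4`, `c=1`, `s=1.5` with no detection risk.

```python
import numpy as np

def long_run_matrix(short_run, s: float, p: float, B: float, W: float) -> np.ndarray:
    """Hypothetical helper: long-run averages from a 2x2 short-run matrix."""
    (aa, ab), (ba, bb) = short_run
    return np.array([
        [aa + B / (2 * W), ab],                               # Always Safe row
        [p * (s * B / W + ba), p * (s * B / (2 * W) + bb)],   # Always Unsafe row
    ])

short_run = np.array([[1.0, 0.6],
                      [2.4, 2.0]])
example_lr = long_run_matrix(short_run, s=1.5, p=0.25, B=10**4, W=100)
print(example_lr)
```

In this example mutual safety (51) beats mutual unsafety (19.25), yet an unsafe firm facing a safe one still earns 38.1 against the safe firm's 0.6.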

In [10]:
#| export

def payoffs_lr(models:dict, # A dictionary containing the items in `ModelTypeDSAIR`
              ) -> dict : # The `models` dictionary with added payoff matrix `payoffs`
    """The long run average payoffs for the DSAIR game."""
    # All 1D arrays must be promoted to 3D Arrays for broadcasting
    s, p, B, W = [models[k][:, None, None]
                  for k in ['s', 'p', 'B', 'W']]
    πAA,πAB,πBA,πBB = [models['payoffs_sr'][:, i:i+1, j:j+1]
                       for i in range(2) for j in range(2)]    
    πAA = πAA + B/(2*W)
    πAB = πAB
    πBA = p*(s*B/W + πBA)
    πBB = p*(s*B/(2*W) + πBB)
    payoffs = np.block([[πAA, πAB],
                        [πBA, πBB]])
    return {**models, 'payoffs': payoffs}

## DSAIR Payoff Matrix with punishments (Long Run)

Denote $\pi$ as one of the short run payoff matrices discussed above with rows and columns indexed by letters A, B, ...

**Always Safe** and **Always Unsafe** play as they usually do.

**Punish Unsafe** always plays Safe. However, they will pay a cost to punish their co-player if the co-player plays Unsafe.


| Strategy | Always Safe | Always Unsafe | Punish Unsafe |
|----------|---|---|---|
| **Always Safe** | $$πAA + \frac{B}{2W}$$|  $$πAB$$ | $$πAA + \frac{B}{2W}$$ |
| **Always Unsafe** | $$p \, (s \frac{B}{W} + πBA)$$| $$p \, (s \frac{B}{2W} + πBB)$$| punished_payoff|
| **Punish Unsafe** | $$πAA + \frac{B}{2W}$$| sanctioner_payoff | $$πAA + \frac{B}{2W}$$ |

In [11]:
#| export

def punished_and_sanctioned_payoffs(models:dict, # A dictionary containing the items in `ModelTypeDSAIR`
                                   ) -> dict : # The `models` dictionary with added `sanctioner_payoff` and `punished_payoff` entries
    """Compute the payoffs for the punished and sanctioner players in a DSAIR
    model with peer punishment."""
    # All 1D arrays must be promoted to 3D Arrays for broadcasting
    s,b,c, p, B, W, pfo = [models[k][:, None, None]
                      for k in ['s', 'b', 'c', 'p', 'B', 'W', 'pfo']]
    α, γ = [models[k][:, None, None] for k in ['α', 'γ']]
    πAA,πAB,πBA,πBB = [models['payoffs_sr'][:, i:i+1, j:j+1]
                       for i in range(2) for j in range(2)]
    
    s_punished = s - γ
    s_sanctioner = 1 - α
    sum_of_speeds = np.maximum(1e-20, s_punished + s_sanctioner)
    punished_wins = (s_punished > 0) & (((W-s)*np.maximum(0, s_sanctioner))
                                        <= ((W-1) * s_punished))
    punished_draws = (s_punished > 0) & (((W-s) * s_sanctioner)
                                         == ((W-1) * s_punished))
    sanctioner_wins = (s_sanctioner > 0) & (((W-s) * s_sanctioner)
                                            >= ((W-1)*np.maximum(0,s_punished)))
    no_winner = (s_punished <= 0) & (s_sanctioner <= 0)

    both_speeds_positive = (s_punished > 0) & (s_sanctioner > 0)
    only_sanctioner_speed_positive = (s_punished <= 0) & (s_sanctioner > 0)
    only_punisher_speed_positive = (s_punished > 0) & (s_sanctioner <= 0)

    p_loss = np.where(punished_wins | punished_draws, p, 1)
    R = np.where(no_winner,
                 1e50,
                 1 + np.minimum((W-s)/ np.maximum(s_punished, 1e-10),
                                (W-1)/ np.maximum(s_sanctioner, 1e-10)))
    # Check draws first: on an exact tie both `*_wins` masks are also True.
    B_s = np.where(punished_draws, B/2, np.where(sanctioner_wins, B, 0))
    B_p = np.where(punished_draws, B/2, np.where(punished_wins, B, 0))
    b_s = np.where(both_speeds_positive,
                   (1-pfo) * b * s_sanctioner / sum_of_speeds + pfo * b,
                   np.where(only_sanctioner_speed_positive, b, 0))
    b_p = np.where(both_speeds_positive,
                   (1-pfo) * b * s_punished / sum_of_speeds,
                   np.where(only_punisher_speed_positive, (1 - pfo)*b, 0))
    sanctioner_payoff = (1 / R) * (πAB + B_s - (b_s - c)) + (b_s - c)
    # sanctioner_payoff = (1 / R) * (πAB + B_s + (R-1)*(b_s - c))
    punished_payoff = (p_loss / R) * (πBA + B_p - b_p) + p_loss * b_p
    # punished_payoff = (p_loss / R) * (πBA + B_p + (R-1)*b_p)
    return {**models,
            'sanctioner_payoff':sanctioner_payoff,
            'punished_payoff':punished_payoff}

Below, I test that we produce the expected results for the punished and sanctioner payoffs.

In [12]:
models = build_DSAIR(b=4,
                     c=1,
                     p=0.25,
                     s=1.5,
                     B=10**4,
                     W=10**2,
                     pfo=0,
                     α=np.array([0]),
                     γ=np.array([0]),
                     β=0.01,
                     Z=100,
                     strategy_set=["AS", "AU", "PS"],
                     collective_risk=0)

results = thread_macro(models,
                       payoffs_sr,
                       punished_and_sanctioned_payoffs)

expected_result = (1/4 * 3 / 200 * (12/5 + 10**4 + 197/3 * 12/5))
test_eq(results['punished_payoff'], expected_result)

In [13]:
models = build_DSAIR(b=4,
                     c=1,
                     p=0.25,
                     s=1.5,
                     B=10**4,
                     W=10**2,
                     pfo=0,
                     α= np.arange(0, 3, 0.1),
                     γ= np.arange(0, 3, 0.1),
                     β=0.01,
                     Z=100,
                     strategy_set=["AS", "AU", "PS"],
                     collective_risk=0)

results = thread_macro(models,
                       payoffs_sr,
                       punished_and_sanctioned_payoffs)

In [14]:
def expected_fn1(α, γ):
    p_punish = np.where((3/2 - γ) * (100 - 1) > (1 - α) * (100 - 3/2),
                        1/4,
                        1)
    origin_speed = np.where((3/2 - γ) * (100 - 1) > (1 - α) * (100 - 3/2),
                         3/2, 
                         1)
    win_speed = np.where((3/2 - γ) * (100 - 1) > (1 - α) * (100 - 3/2),
                         (3/2 - γ), 
                         (1 - α))
    Bp = np.where((3/2 - γ) * (100 - 1) > (1 - α) * (100 - 3/2),
                  10**4,
                  np.where((3/2 - γ) * (100 - 1) == (1 - α) * (100 - 3/2),
                           10**4 / 2,
                           0))
    sum_of_speeds = np.maximum(1e-20, (3/2 - γ) + (1 - α))
    b_p = np.where((3/2 > γ) & (1 > α),
                   4 * (3/2 - γ) / sum_of_speeds,
                   np.where((3/2 > γ),
                            4,
                            0))
    R_inv = (np.maximum(0, win_speed) 
                          / (100 - origin_speed + np.maximum(0, win_speed)))
    punished_payoff = (p_punish * R_inv * (12/5 + Bp)
                       + p_punish * b_p
                       - p_punish * b_p * R_inv
                      )
    return punished_payoff

In [15]:
test_close(results['punished_payoff'][:, 0, 0],
           expected_fn1(results['α'], results['γ']))

In [16]:
def expected_fn2(α, γ):
    origin_speed = np.where((3/2 - γ) * (100 - 1) > (1 - α) * (100 - 3/2),
                         3/2, 
                         1)
    win_speed = np.where((3/2 - γ) * (100 - 1) > (1 - α) * (100 - 3/2),
                         (3/2 - γ), 
                         (1 - α))
    Bs = np.where((3/2 - γ) * (100 - 1) < (1 - α) * (100 - 3/2),
                  10**4,
                  np.where((3/2 - γ) * (100 - 1) == (1 - α) * (100 - 3/2),
                           10**4 / 2,
                           0))
    sum_of_speeds = np.maximum(1e-20, (3/2 - γ) + (1 - α))
    b_s = np.where((3/2 > γ) & (1 > α),
                   4 * (1 - α) / sum_of_speeds,
                   np.where((1 > α),
                            4,
                            0))
    R_inv = (np.maximum(0, win_speed) 
             / (100 - origin_speed + np.maximum(0, win_speed)))
    sanctioner_payoff = (R_inv * (3/5 + Bs)
                         + (b_s - 1)
                         - (b_s - 1) * R_inv
                        )
    return sanctioner_payoff

In [17]:
test_close(results['sanctioner_payoff'][:, 0, 0],
           expected_fn2(results['α'], results['γ']))

In [18]:
#| export
def payoffs_lr_peer_punishment(models:dict, # A dictionary containing the items in `ModelTypeDSAIR`
              ) -> dict : # The `models` dictionary with added payoff matrix `payoffs`:
    """The long run average payoffs for the DSAIR game with peer punishment."""
    # All 1D arrays must be promoted to 3D Arrays for broadcasting
    s,b,c, p, B, W = [models[k][:, None, None]
                      for k in ['s', 'b', 'c', 'p', 'B', 'W']]
    α, γ = [models[k][:, None, None] for k in ['α', 'γ']]
    πAA,πAB,πBA,πBB = [models['payoffs_sr'][:, i:i+1, j:j+1]
                       for i in range(2) for j in range(2)]
    models = punished_and_sanctioned_payoffs(models)
    
    ΠAA = πAA + B/(2*W)
    ΠAB = πAB
    ΠAC = πAA + B/(2*W)
    ΠBA = p*(s*B/W + πBA)
    ΠBB = p*(s*B/(2*W) + πBB)
    ΠBC = models["punished_payoff"]
    ΠCA = πAA + B/(2*W)
    ΠCB = models["sanctioner_payoff"]
    ΠCC = πAA + B/(2*W)
    matrix = np.block([[ΠAA, ΠAB, ΠAC], 
                       [ΠBA, ΠBB, ΠBC],
                       [ΠCA, ΠCB, ΠCC],
                       ])
    return {**models, 'payoffs':matrix}

### Expressions for the sanctioner and punished payoffs

For convenience we denote a number of new variables to simplify the expressions for the sanctioner and punished payoffs.

\begin{equation}
\text{sanctioner payoff} = \frac{1}{R} (πAB + B_s + (R-1) (b_s - c))
\end{equation}

\begin{equation}
\text{punished payoff} = \frac{p_{punish}}{R} (πBA + B_p + (R-1) b_p)
\end{equation}

*Note: In a model where we suffer a collective risk of an AI disaster if the winner is unsafe, payoffs for firms who play safe when facing an unsafe firm are also multiplied by $p_{punish}$.*

We can read the above payoffs as telling us the average payoffs over the R rounds of the race for each firm, assuming the punishment is levied at the end of the first round and the remaining $R - 1$ rounds are played with the punishment in effect.

Note that $s_{\beta} = s - \gamma$ denotes the new speed of the firm who is punished and $s_{\alpha} = 1 - \alpha$ denotes the new speed of the firm who levies the punishment.

Below we denote the four possible outcomes (ignoring disaster) of a race between a sanctioner and a punished firm:

\begin{equation}
\text{punished wins} = (s_{\beta} > 0) \, \& \, (\frac{W-s}{s_{\beta}} \leq \frac{W-1}{s_{\alpha}})
\end{equation}

\begin{equation}
\text{sanctioner wins} = (s_{\alpha} > 0) \, \& \, (\frac{W-1}{s_{\alpha}} \leq \frac{W-s}{s_{\beta}})
\end{equation}

\begin{equation}
\text{draw} = (s_{\beta} > 0) \, \& \, (\frac{W-s}{s_{\beta}} = \frac{W-1}{s_{\alpha}})
\end{equation}

\begin{equation}
\text{no winner} = (s_{\beta} \leq 0)  \, \& \,  (s_{\alpha} \leq 0)
\end{equation}

We can use the above expressions to define the following variables:

$p_{punish}$ is the probability of avoiding an AI disaster if a punishment is levied and depends on who wins the race.

\begin{equation}
p_{punish} = \begin{cases} p & \text{punished wins | draw} \\
1 & \text{otherwise}
\end{cases}
\end{equation}

$R$ is the number of rounds that the race lasts for, counting the first round (played before the punishment takes effect); the race ends when the first firm reaches the finish line.

\begin{equation}
R = \begin{cases} \infty & \text{no winner} \\
1 + \frac{W - 1}{s_{\alpha}} & \text{sanctioner wins} \\
1 + \frac{W - s}{s_{\beta}} & \text{punished wins | draw} \\
\end{cases}
\end{equation}

$B_s$ is the prize that the sanctioner receives at the end of the race.

\begin{equation}
B_s = \begin{cases} B & \text{sanctioner wins} \\
\frac{B}{2} & \text{draw} \\
0 & otherwise \\
\end{cases}
\end{equation}

$B_p$ is the prize that the punished receives at the end of the race.

\begin{equation}
B_p = \begin{cases} B & \text{punished wins} \\
\frac{B}{2} & \text{draw} \\
0 & otherwise \\
\end{cases}
\end{equation}

$b_s$ is the benefit the sanctioner receives each round. They only gain a benefit if their speed is positive, and they gain the whole benefit if they are the only firm with positive speed.

\begin{equation}
b_s = \begin{cases} p_{fo} b + (1-p_{fo}) b \frac{s_{\alpha}}{s_{\alpha} + s_{\beta}} & s_{\alpha}, s_{\beta} > 0\\
b & s_{\alpha} > 0 \geq s_{\beta} \\
0 & s_{\alpha} \leq 0 \\
\end{cases}
\end{equation}

$b_p$ is the benefit the punished firm receives each round. They only gain a benefit if their speed is positive; if they are the only firm with positive speed, they gain the whole benefit unless they are found out.

\begin{equation}
b_p = \begin{cases} (1-p_{fo}) b \frac{s_{\beta}}{s_{\alpha} + s_{\beta}} & s_{\alpha}, s_{\beta} > 0\\
(1-p_{fo}) b & s_{\beta} > 0 \geq s_{\alpha} \\
0 & s_{\beta} \leq 0 \\
\end{cases}
\end{equation}
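
The definitions above can be traced through for a single pair of firms. The sketch below is a scalar reading of the exported `punished_and_sanctioned_payoffs` function, not a replacement for it: the name `punishment_race` is hypothetical, and, as in that function, `R` counts the first round played before the punishment takes effect.

```python
def punishment_race(b, c, s, p, B, W, alpha, gamma, pfo=0.0):
    """Hypothetical scalar sketch: returns (R, sanctioner_payoff, punished_payoff)."""
    s_punished = s - gamma      # speed of the punished firm after round one
    s_sanctioner = 1 - alpha    # speed of the sanctioner after round one
    if s_punished <= 0 and s_sanctioner <= 0:
        raise ValueError("no winner: neither firm ever finishes the race")

    # Rounds still needed after the first round for each firm to finish.
    t_punished = (W - s) / s_punished if s_punished > 0 else float("inf")
    t_sanctioner = (W - 1) / s_sanctioner if s_sanctioner > 0 else float("inf")
    R = 1 + min(t_punished, t_sanctioner)

    draw = t_punished == t_sanctioner
    punished_wins = t_punished < t_sanctioner
    B_p = B / 2 if draw else (B if punished_wins else 0.0)
    B_s = B / 2 if draw else (0.0 if punished_wins else B)
    # Disaster risk only bites if the unsafe (punished) firm wins or draws.
    p_punish = p if (punished_wins or draw) else 1.0

    # Per-round benefits once the punishment is in effect.
    if s_punished > 0 and s_sanctioner > 0:
        total = s_punished + s_sanctioner
        b_p = (1 - pfo) * b * s_punished / total
        b_s = (1 - pfo) * b * s_sanctioner / total + pfo * b
    elif s_punished > 0:
        b_p, b_s = (1 - pfo) * b, 0.0
    else:
        b_p, b_s = 0.0, b

    # First-round (short-run) payoffs, before the punishment is levied.
    pi_AB = (1 - pfo) * b / (s + 1) + pfo * b - c
    pi_BA = (1 - pfo) * b * s / (s + 1)

    sanctioner = (pi_AB + B_s + (R - 1) * (b_s - c)) / R
    punished = p_punish * (pi_BA + B_p + (R - 1) * b_p) / R
    return R, sanctioner, punished

# With a costless, ineffective punishment the baseline entries are recovered.
R0, sanctioner0, punished0 = punishment_race(4, 1, 1.5, 0.25, 10**4, 100,
                                             alpha=0, gamma=0)
print(R0, sanctioner0, punished0)
```

When `alpha = gamma = 0` the punished payoff equals $p \, (s \frac{B}{W} + πBA)$ and the sanctioner payoff equals $πAB$, which is a useful sanity check on the definition of $R$.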

## DSAIR Payoff Matrix with rewards (Long Run)

Denote $\pi$ as one of the short run payoff matrices discussed above with rows and columns indexed by letters A, B, ...

**Always Safe** and **Always Unsafe** play as they usually do.

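**Reward Safe** always plays Safe. However, they will pay a cost to reward their co-player if the co-player plays Safe. In the table below, $s_{\beta}$ is the speed bonus received by a rewarded firm (`γ` in the code) and $s_{\alpha}$ is the speed cost paid by the firm giving the reward (`α` in the code).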

| Strategy | Always Safe | Always Unsafe | Reward Safe |
|----------|---|---|---|
| **Always Safe** | $$πAA + \frac{B}{2W}$$|  $$πAB$$ | $$πAA + \frac{B (1 + s_{\beta})}{W}$$ |
| **Always Unsafe** | $$p \, (s \frac{B}{W} + πBA)$$| $$p \, (s \frac{B}{2W} + πBB)$$| $$p \, (s \frac{B}{W} + πBA)$$|
| **Reward Safe** | $$ πAA $$| $$ πAB $$| $$πAA + \frac{B (1 + s_{\beta} - s_{\alpha})}{2W}$$ |
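
For a concrete instance of the reward model, the sketch below fills in the 3x3 matrix for one parameter set. It assumes, as the exported `payoffs_lr_peer_reward` does, that a rewarded firm moves at speed `1 + γ` and that two Reward Safe firms each move at `max(0, 1 + γ - α)`. The helper name and parameter values are illustrative assumptions.

```python
import numpy as np

def reward_matrix(short_run, s, p, B, W, alpha, gamma) -> np.ndarray:
    """Hypothetical scalar sketch of the long-run payoffs with peer reward."""
    (aa, ab), (ba, bb) = short_run
    s_rewarded = 1 + gamma                         # speed of a rewarded firm
    s_collaborative = max(0.0, 1 + gamma - alpha)  # both firms reward each other
    unsafe_vs_safe = p * (s * B / W + ba)
    return np.array([
        # vs Always Safe    vs Always Unsafe            vs Reward Safe
        [aa + B / (2 * W),  ab,                         aa + B * s_rewarded / W],
        [unsafe_vs_safe,    p * (s * B / (2 * W) + bb), unsafe_vs_safe],
        [aa,                ab,                         aa + B * s_collaborative / (2 * W)],
    ])

short_run = np.array([[1.0, 0.6],
                      [2.4, 2.0]])  # b=4, c=1, s=1.5
example_reward = reward_matrix(short_run, s=1.5, p=0.25, B=10**4, W=100,
                               alpha=0.2, gamma=0.5)
print(example_reward)
```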

In [19]:
#| export

def payoffs_lr_peer_reward(models:dict, # A dictionary containing the items in `ModelTypeDSAIR`
              ) -> dict : # The `models` dictionary with added payoff matrix `payoffs`:
    """The long run average payoffs for the DSAIR game with peer reward."""
    # All 1D arrays must be promoted to 3D Arrays for broadcasting
    s,b,c, p, B, W = [models[k][:, None, None]
                      for k in ['s', 'b', 'c', 'p', 'B', 'W']]
    α, γ = [models[k][:, None, None] for k in ['α', 'γ']]
    πAA,πAB,πBA,πBB = [models['payoffs_sr'][:, i:i+1, j:j+1]
                       for i in range(2) for j in range(2)]
    
    s_rewarded = 1 + γ
    s_helper = np.maximum(0, 1 - α)
    s_collaborative = np.maximum(0, 1 + γ - α)
    ΠAA = πAA + B/(2*W)
    ΠAB = πAB  # a Safe firm facing an Unsafe co-player, as in the table
    ΠAC = πAA + B * s_rewarded / W
    ΠBA = p*(s*B/W + πBA)
    ΠBB = p*(s*B/(2*W) + πBB)
    ΠBC = p*(s*B/W + πBA)
    ΠCA = πAA
    ΠCB = πAB
    ΠCC = πAA + B * s_collaborative/(2*W)
    matrix = np.block([[ΠAA, ΠAB, ΠAC], 
                       [ΠBA, ΠBB, ΠBC],
                       [ΠCA, ΠCB, ΠCC],
                       ])
    return {**models, 'payoffs':matrix}

## DSAIR Payoff Matrix with voluntary commitments (Long Run)

Denote $\pi$ as one of the short run payoff matrices discussed above with rows and columns indexed by letters A, B, ...

The strategies below are less obvious than in earlier models. **Always Safe Out** and **Always Unsafe Out** are the same strategies we are used to.

**Always Safe In** is willing to form a commitment to play Safe. Otherwise, they will always play Unsafe.

**Always Unsafe In** is willing to form a commitment but will violate it by always playing Unsafe. This way, they anticipate that they can encourage other firms to play safe and so pull ahead of them in the race.

**Punish Violator** is willing to form a commitment to play Safe. Otherwise, they will always play Unsafe. If the co-player violates the commitment by playing Unsafe, then this player pays a cost to levy a punishment on the violator.

| Strategy| Always Safe Out | Always Unsafe Out |  Always Safe In | Always Unsafe In  | Punish Violator |
|----------|---|---|---|---|---|
| **Always Safe Out** | $$πAA + \frac{B}{2W}$$| $$πAB$$ | $$πAB$$ | $$πAB$$ | $$πAB$$|
| **Always Unsafe Out** | $$p \, (s \frac{B}{W} + πBA)$$| $$p \, (s \frac{B}{2W} + πBB)$$| $$p \, (s \frac{B}{2W} + πBB)$$| $$p \, (s \frac{B}{2W} + πBB)$$ | $$p \, (s \frac{B}{2W} + πBB)$$ |
| **Always Safe In** | $$p \, (s \frac{B}{W} + πBA)$$|  $$p \, (s \frac{B}{2W} + πBB)$$ | $$πAA + \frac{B}{2W} - \epsilon$$| $$πAB - \epsilon$$| $$πAA + \frac{B}{2W} - \epsilon$$ |
| **Always Unsafe In** | $$p \, (s \frac{B}{W} + πBA)$$| $$p \, (s \frac{B}{2W} + πBB)$$| $$p \, (s \frac{B}{W} + πBA) - \epsilon$$| $$p \, (s \frac{B}{2W} + πBB) - \epsilon$$ | punished_payoff - ϵ |
| **Punish Violator** | $$p \, (s \frac{B}{W} + πBA)$$| $$p \, (s \frac{B}{2W} + πBB)$$| $$πAA + \frac{B}{2W} - \epsilon$$ | sanctioner_payoff - ϵ| $$πAA + \frac{B}{2W} - \epsilon$$ |

The punished and sanctioner payoffs above are exactly the same as in the model with punishments, so I do not repeat them here.
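
The strategy descriptions above can be read as a small decision rule: a commitment forms only if both firms are willing to join, and each firm's action then depends on whether it formed. The sketch below encodes that reading; the strategy labels and the helper `play` are hypothetical names used only for this illustration.

```python
# Which strategies are willing to join a commitment (labels are illustrative).
ACCEPTS_COMMITMENT = {"AS-Out": False, "AU-Out": False,
                      "AS-In": True, "AU-In": True, "PV": True}

def play(row: str, col: str) -> tuple[bool, str, str, bool]:
    """Return (commitment_formed, row_action, col_action, row_punishes)."""
    formed = ACCEPTS_COMMITMENT[row] and ACCEPTS_COMMITMENT[col]

    def action(strategy: str) -> str:
        if strategy == "AS-Out":
            return "Safe"
        if strategy in ("AU-Out", "AU-In"):
            return "Unsafe"
        # AS-In and PV honour a commitment, otherwise they play Unsafe.
        return "Safe" if formed else "Unsafe"

    row_action, col_action = action(row), action(col)
    # Only Punish Violator sanctions a co-player who breaks a commitment.
    row_punishes = formed and row == "PV" and col_action == "Unsafe"
    return formed, row_action, col_action, row_punishes

print(play("AS-In", "AS-Out"))  # no commitment, so AS-In plays Unsafe
print(play("PV", "AU-In"))      # commitment formed and violated, PV punishes
```

Each cell of the table then follows from the pair of actions: the commitment cost ϵ is subtracted only when `commitment_formed` is true, and the sanctioner and punished payoffs apply only when `row_punishes` (or its mirror image) is true.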

In [20]:
#| export

def payoffs_lr_voluntary(models:dict, # A dictionary containing the items in `ModelTypeDSAIR`
              ) -> dict : # The `models` dictionary with added payoff matrix `payoffs`:
    """The long run average payoffs for the DSAIR game with voluntary
    commitments."""
    # All 1D arrays must be promoted to 3D Arrays for broadcasting
    s,b,c, p, B, W = [models[k][:, None, None]
                      for k in ['s', 'b', 'c', 'p', 'B', 'W']]
    α, γ, ϵ = [models[k][:, None, None] for k in ['α', 'γ', 'epsilon']]
    πAA,πAB,πBA,πBB = [models['payoffs_sr'][:, i:i+1, j:j+1]
                       for i in range(2) for j in range(2)]
    models = punished_and_sanctioned_payoffs(models)
    
    ΠAA = πAA + B/(2*W)
    ΠAB = πAB
    ΠAC = πAB
    ΠAD = πAB
    ΠAE = πAB
    ΠBA = p*(s*B/W + πBA)
    ΠBB = p*(s*B/(2*W) + πBB)
    ΠBC = p*(s*B/(2*W) + πBB)
    ΠBD = p*(s*B/(2*W) + πBB)
    ΠBE = p*(s*B/(2*W) + πBB)
    ΠCA = p*(s*B/W + πBA)
    ΠCB = p*(s*B/(2*W) + πBB)
    ΠCC = πAA + B/(2*W) - ϵ
    ΠCD = πAB - ϵ
    ΠCE = πAA + B/(2*W) - ϵ
    ΠDA = p*(s*B/W + πBA)
    ΠDB = p*(s*B/(2*W) + πBB)
    ΠDC = p*(s*B/W + πBA) - ϵ
    ΠDD = p*(s*B/(2*W) + πBB) - ϵ
    ΠDE = models['punished_payoff'] - ϵ
    ΠEA = p*(s*B/W + πBA)  # no commitment forms with an Out player, so no ϵ is paid
    ΠEB = p*(s*B/(2*W) + πBB)
    ΠEC = πAA + B/(2*W) - ϵ
    ΠED = models['sanctioner_payoff'] - ϵ
    ΠEE = πAA + B/(2*W) - ϵ
    matrix = np.block([[ΠAA, ΠAB, ΠAC, ΠAD, ΠAE], 
                       [ΠBA, ΠBB, ΠBC, ΠBD, ΠBE],
                       [ΠCA, ΠCB, ΠCC, ΠCD, ΠCE],
                       [ΠDA, ΠDB, ΠDC, ΠDD, ΠDE],
                       [ΠEA, ΠEB, ΠEC, ΠED, ΠEE]
                       ])
    return {**models, 'payoffs':matrix}

## DSAIR Payoff Matrix (Long Run) with collective risk

Denote $\pi$ as one of the short run payoff matrices discussed above with rows and columns indexed by letters A, B, ...

| Strategy | Always Safe | Always Unsafe |
|----------|---|---|
| **Always Safe** | $$πAA + \frac{B}{2W}$$|  $$p \, πAB$$ |
| **Always Unsafe** | $$p \, (s \frac{B}{W} + πBA)$$| $$p^2 \, (s \frac{B}{2W} + πBB)$$|
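
The table corresponds to the case where a disaster is certain to hit every firm. The code in `payoffs_lr_collective` also accepts a `collective_risk` value between 0 and 1; the scalar sketch below (with a hypothetical helper name and illustrative parameter values) shows how that value interpolates between the baseline matrix and the table above.

```python
import numpy as np

def collective_risk_matrix(short_run, s, p, B, W, collective_risk) -> np.ndarray:
    """Hypothetical scalar sketch of the long-run payoffs with collective risk."""
    (aa, ab), (ba, bb) = short_run
    # Chance that a firm facing an unsafe co-player escapes a shared disaster.
    survive_shared = 1 - (1 - p) * collective_risk
    return np.array([
        [aa + B / (2 * W), ab * survive_shared],
        [p * (s * B / W + ba), p * (s * B / (2 * W) + bb) * survive_shared],
    ])

short_run = np.array([[1.0, 0.6],
                      [2.4, 2.0]])  # b=4, c=1, s=1.5
params = dict(s=1.5, p=0.25, B=10**4, W=100)
no_shared_risk = collective_risk_matrix(short_run, collective_risk=0.0, **params)
full_shared_risk = collective_risk_matrix(short_run, collective_risk=1.0, **params)
print(no_shared_risk)
print(full_shared_risk)
```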

In [21]:
def payoffs_lr_collective(models:dict, # A dictionary containing the items in `ModelTypeDSAIR`
              ) -> dict : # The `models` dictionary with added payoff matrix `payoffs`:
    """Long run average payoffs for the DSAIR model with collective risk."""
    # All 1D arrays must be promoted to 3D Arrays for broadcasting
    s,b,c, p, B, W = [models[k][:, None, None]
                      for k in ['s', 'b', 'c', 'p', 'B', 'W']]
    risk_shared = models["collective_risk"][:, None, None]
    πAA,πAB,πBA,πBB = [models['payoffs_sr'][:, i:i+1, j:j+1]
                       for i in range(2) for j in range(2)]
    πAA = πAA + B/(2*W)
    πAB = πAB * (1 - (1-p)*risk_shared)
    πBA = p*(s*B/W + πBA)
    πBB = p*(s*B/(2*W) + πBB) * (1 - (1-p)*risk_shared)
    matrix = np.block([[πAA, πAB],
                       [πBA, πBB]])
    return {**models, 'payoffs':matrix}

# Payoff Matrices (part 2)

> This module contains payoff matrices for different evolutionary games
>
> Part 2 contains payoff matrices for the following games
> - Encarnação et al. 2016
> - Vasconcelos et al. 2014
> - Stochastic payoffs, à la Hilbe et al. 2018

In [22]:
# | export
def payoffs_encanacao_2016(models):
    names = ['b_r', 'b_s', 'c_s', 'c_t', 'σ']
    b_r, b_s, c_s, c_t, σ = [models[k] for k in names]
    payoffs = {}
    n_players = 3
    n_sectors = 3
    n_strategies_per_sector = [2, 2, 2]
    n_strategies_total = 6
    # All players are from the first sector, playing that sector's first strategy
    index_min = "0-0-0"
    # All players are from the third sector, playing that sector's second strategy
    index_max = "5-5-5"
    # Note: The separator makes it easy to represent games where n_strategies_total >= 10.

    # It is also trivial to define a vector which maps these indexes to strategy profiles
    # As sector order is fixed we could neglect to mention subscripts for each sector
    strategy_names = ["D", "C", "D", "C", "D", "C"]

    zero = np.zeros(b_r.shape[0])
    # As in the main text
    payoffs["C-C-C"] = {"P3": b_r-2*c_s,
                        "P2": σ+b_s-c_t,
                        "P1": σ+b_s}
    payoffs["C-C-D"] = {"P3": -c_s,
                        "P2": b_s-c_t,
                        "P1": zero}
    payoffs["C-D-C"] = {"P3": b_r-c_s,
                        "P2": zero,
                        "P1": b_s}
    payoffs["C-D-D"] = {"P3": zero,
                        "P2": σ,
                        "P1": σ}
    payoffs["D-C-C"] = {"P3": zero,
                        "P2": σ-c_t,
                        "P1": σ}
    payoffs["D-C-D"] = {"P3": zero,
                        "P2": -c_t,
                        "P1": zero}
    payoffs["D-D-C"] = {"P3": zero,
                        "P2": zero,
                        "P1": zero}
    payoffs["D-D-D"] = {"P3": zero,
                        "P2": σ,
                        "P1": σ}

    # The following indexes capture all strategy profiles where each player is fixed to a unique sector
    # (and player order does not matter, so we need only consider one ordering of sectors).
    payoffs["4-2-0"] = payoffs["D-D-D"]
    payoffs["4-2-1"] = payoffs["D-D-C"]
    payoffs["4-3-0"] = payoffs["D-C-D"]
    payoffs["4-3-1"] = payoffs["D-C-C"]
    payoffs["5-2-0"] = payoffs["C-D-D"]
    payoffs["5-2-1"] = payoffs["C-D-C"]
    payoffs["5-3-0"] = payoffs["C-C-D"]
    payoffs["5-3-1"] = payoffs["C-C-C"]
    return {**models, "payoffs": payoffs}
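
The index strings used above can be translated to strategy profiles mechanically. The following is a minimal sketch, assuming the same `strategy_names` vector as in the function; `index_to_profile` is a hypothetical helper for illustration and is not part of the exported module.

```python
# Strategy names by global strategy index: two strategies (D, C) per sector.
strategy_names = ["D", "C", "D", "C", "D", "C"]


def index_to_profile(index: str) -> str:
    """Map an index string such as "5-3-1" to a profile such as "C-C-C".

    Hypothetical helper for illustration only.
    """
    return "-".join(strategy_names[int(i)] for i in index.split("-"))


for index in ["4-2-0", "4-2-1", "5-3-0", "5-3-1"]:
    print(index, "->", index_to_profile(index))
```

Each printed pair should agree with the corresponding `payoffs["4-2-0"] = payoffs["D-D-D"]` style assignment in the function.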


## Vasconcelos et al. 2014

Vasconcelos et al. (2014) introduce a model of a Collective Risk Dilemma. It
is a variant of the public goods game in which players must reach a target
level of contributions; if they fall short, they risk a disaster which
destroys the group's endowments.

We compute payoffs for a game with up to $n$ participants in which each player
contributes either $0$ or a fixed proportion $c$ of their endowment. To do
this, we first compute the payoffs as a function of the number of
contributors, then apply that function to each relevant strategy profile.

In [23]:
# | export
@multi
def build_payoffs(models: dict):
    return models.get('payoffs_key')


@method(build_payoffs, 'vasconcelos_2014_primitives')
def build_payoffs(models: dict):
    names = ['payoffs_state', 'c', 'T', 'b_r', 'b_p', 'r']
    payoffs_state, c, T, b_r, b_p, r = [models[k] for k in names]
    strategy_counts = payoffs_state['strategy_counts']
    n_r = strategy_counts["2"]
    n_p = strategy_counts["4"]
    risk = r * (n_r * c * b_r + n_p * c * b_p < T)
    # The payoffs must be computed for each strategy type in the interaction.
    # In games where we employ hypergeometric sampling, we usually do not
    # care about player order in the interaction. If order did matter, then
    # we would represent the payoffs per strategy still but it would capture
    # the expected payoffs given how likely a player of that strategy was to
    # play in each node of the extensive-form game. Non-players of type 0
    # usually do not have payoffs.
    payoffs = {"1": (1 - risk) * b_r,  # rich_free_rider
               "2": (1 - risk) * c * b_r,  # rich_contributor
               "3": (1 - risk) * b_p,  # poor_free_rider
               "4": (1 - risk) * c * b_p}  # poor_contributor
    return {**models, "payoff_primitives": payoffs}


@method(build_payoffs, 'vasconcelos_2014')
def build_payoffs(models: dict):
    profiles = create_profiles({'n_players': models.get('n_players', 5),
                                'n_strategies': [2, 2]})['profiles']
    payoffs = {}
    for profile in profiles:
        profile_tuple = thread_macro(profile,
                                     (str.split, "-"),
                                     (map, int, "self"),
                                     list,
                                     reversed,
                                     list,
                                     np.array,
                                     )
        strategy_counts = {f"{i}": np.sum(
            profile_tuple == i) for i in range(5)}
        payoffs_state = {'strategy_counts': strategy_counts}
        primitives = thread_macro(models,
                                  (assoc,
                                   'payoffs_state', payoffs_state,
                                   'payoffs_key', "vasconcelos_2014_primitives"),
                                  build_payoffs,
                                  (get, "payoff_primitives"),
                                  )
        payoffs[profile] = {}
        for i, strategy in enumerate(profile_tuple):
            if strategy == 0:
                continue
            elif strategy == 1:
                payoffs[profile][f"P{i+1}"] = primitives['1']
            elif strategy == 2:
                payoffs[profile][f"P{i+1}"] = primitives['2']
            elif strategy == 3:
                payoffs[profile][f"P{i+1}"] = primitives['3']
            elif strategy == 4:
                payoffs[profile][f"P{i+1}"] = primitives['4']
            else:
                continue
    return {**models, "payoffs": payoffs}


Here are a few simple tests of the payoff primitives for their model.

In [24]:
models = {'payoffs_state': {'strategy_counts': {"2": 2,
                                                "4": 4}},
          'c': 0.5,
          'T': 2,
          'b_r': 4,
          'b_p': 2,
          'r': 0.5,
          'payoffs_key': 'vasconcelos_2014_primitives'}
models = build_payoffs(models)
fastcore.test.test_eq(models['payoff_primitives'],
                      {'1': 4,
                       '2': 2,
                       '3': 2,
                       '4': 1})
models = {**models,
          'payoffs_state': {'strategy_counts': {"2": 0,
                                                "4": 1}}, }
models = build_payoffs(models)
fastcore.test.test_eq(models['payoff_primitives'],
                      {'1': 2,
                       '2': 1,
                       '3': 1,
                       '4': 0.5})


We quickly check that we can generate payoffs for each of the 4**5 possible
interactions in their model.

In [25]:
models = {'c': 0.5,
          'T': 2,
          'b_r': 4,
          'b_p': 2,
          'r': 0.5,
          'payoffs_key': 'vasconcelos_2014'}
models = build_payoffs(models)
fastcore.test.test_eq(len(models['payoffs']), 4**5)


If we are unwilling to use all 4**5 possible strategy profiles when computing
the transition matrices for the evolutionary system, we can restrict our
attention to the payoffs given the number of contributors from each sector.
We often use hypergeometric sampling anyway when computing the success of
each strategy in the evolutionary system.
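
As a rough sketch of that restriction: the risk term used in `vasconcelos_2014_primitives` depends only on the number of rich and poor contributors, so it can be tabulated per pair of counts instead of per profile. The parameter values mirror the tests above. The helper name `risk_given_counts` and the fixed group size of 5 are assumptions made for illustration.

```python
def risk_given_counts(n_r, n_p, c=0.5, T=2, b_r=4, b_p=2, r=0.5):
    """Risk of losing the endowment given the contributor counts.

    Hypothetical helper; mirrors the `risk` expression in the primitives.
    """
    return r * (n_r * c * b_r + n_p * c * b_p < T)


n_players = 5  # assumed group size
risk_by_counts = {(n_r, n_p): risk_given_counts(n_r, n_p)
                  for n_r in range(n_players + 1)
                  for n_p in range(n_players + 1 - n_r)}

# Compare with the 4**5 profiles generated in the test above.
print(len(risk_by_counts), "count pairs versus", 4**5, "profiles")
```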

## General Payoff Wrapper

In [26]:
# | export
@method(build_payoffs, 'payoff_function_wrapper')
def build_payoffs(models: dict):
    profiles = create_profiles(models)['profiles']
    profile_payoffs_key = models['profile_payoffs_key']
    payoffs_state = models.get("payoffs_state", {})
    payoffs = {}
    for profile in profiles:
        profile_tuple = string_to_tuple(profile)
        strategy_counts = dict(zip(*np.unique(profile_tuple,
                                              return_counts=True)))
        payoffs_state = {**payoffs_state,
                         'strategy_counts': strategy_counts}
        profile_models = {**models,
                          "strategy_profile": profile,
                          "payoffs_state": payoffs_state,
                          "payoffs_key": profile_payoffs_key}
        profile_payoffs = thread_macro(profile_models,
                                       build_payoffs,
                                       (get, "profile_payoffs"),
                                       )
        payoffs[profile] = {}
        for i, strategy in enumerate(profile_tuple):
            if strategy == 0:
                # A strategy of 0 is reserved for missing players, missing
                # players do not have payoffs.
                continue
            elif str(strategy) in profile_payoffs.keys():
                payoffs[profile][f"P{i+1}"] = profile_payoffs[f"{strategy}"]
            else:
                continue
    return {**models, "payoffs": payoffs}


## Stochastic Payoffs


We can compute the payoffs of stochastic games with state-action transition
matrix, $M$, and state-action utilities, $u$, and discount factor, $\delta$,
as follows:

$v = (1 - \delta) v^0 (I - \delta M)^{-1}$ \
$\text{payoffs} = v \cdot u$

When $\delta \rightarrow 1$, we instead compute $v$ as the eigenvector of $M$
with associated eigenvalue $1$.
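
To make the formula concrete, here is a small numerical sketch with a made-up chain over two state-actions. The matrix `M`, the initial distribution `v0`, and the utilities `u` are illustrative values, not taken from any model in this module, and `discounted_occupancy` is a hypothetical helper name.

```python
import numpy as np

# Illustrative inputs (assumptions, not model values).
M = np.array([[0.9, 0.1],
              [0.5, 0.5]])       # row-stochastic transition matrix
v0 = np.array([1.0, 0.0])        # initial distribution over state-actions
u = np.array([[3.0, 3.0],
              [5.0, 0.0]])       # u[s, i]: flow payoff to player i in s


def discounted_occupancy(M, v0, delta):
    """v = (1 - delta) * v0 @ (I - delta * M)^{-1}."""
    identity = np.eye(M.shape[0])
    return (1 - delta) * v0 @ np.linalg.inv(identity - delta * M)


v = discounted_occupancy(M, v0, delta=0.5)
payoffs = v @ u                  # expected payoff per player

# As delta -> 1, v approaches the left eigenvector of M with eigenvalue 1.
eigenvalues, eigenvectors = np.linalg.eig(M.T)
stationary = np.real(eigenvectors[:, np.argmin(np.abs(eigenvalues - 1))])
stationary = stationary / stationary.sum()
v_patient = discounted_occupancy(M, v0, delta=1 - 1e-6)
print(v, payoffs, stationary, v_patient)
```

With a discount factor very close to 1, `v_patient` and `stationary` should agree to several decimal places, which is why the eigenvector can be used in the undiscounted case.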

$M$ is the product of a transition matrix and a matrix containing the
probabilities with which each action profile occurs (i.e. a matrix of player
(mixed) strategies). $M$ has size $2mk + 1$, where $m$ is the number of states
and $k$ is the number of strategies available to each player.

We first need to define our flow payoffs, that is, at each state-action
combination, what are the payoffs to each type of player.

In [27]:
# | export
@method(build_payoffs, "flow_payoffs_wrapper")
def build_payoffs(models):
    "Build the flow payoffs for each state-action in a stochastic game."
    state_actions = models['state_actions']
    payoffs_state = models.get('payoffs_state', {})
    flow_payoffs = collections.defaultdict()
    for state_action in state_actions:
        state, action_profile = str.split(state_action, ":")
        action_tuple = string_to_tuple(action_profile)
        action_counts = dict(zip(*np.unique(action_tuple,
                                            return_counts=True)))
        payoffs_state = {**payoffs_state,
                         'strategy_counts': action_counts,
                         'state': state}
        payoffs_flow_key = models['payoffs_flow_key']
        profile_models = {**models,
                          "payoffs_state": payoffs_state,
                          "payoffs_key": payoffs_flow_key}
        flow_payoffs[state_action] = thread_macro(profile_models,
                                                  build_payoffs,
                                                  (get, "flow_payoffs"),
                                                  )
    return {**models, "flow_payoffs": flow_payoffs}


#| hide

Specify Q as a map of state-action keys to probabilities for each next state \
Specify P as a map of state-action keys to probabilities for each next action that each player could take.
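
A minimal sketch of these two maps for a two-player, two-state game follows. The key formats (`"state:action-profile"`, `"P1"`, `"A1"`) follow the conventions used in this module. The numbers, the left-to-right player order, and the helper `transition_likelihood` are assumptions made for illustration.

```python
import math

state_actions = [f"{s}:{a1}-{a2}"
                 for s in (0, 1) for a1 in (1, 2) for a2 in (1, 2)]

# Q: state-action -> {next_state: probability}. Illustrative rule: the game
# moves to state 0 only if both players chose action 1.
Q = {sa: ({"0": 1.0, "1": 0.0} if sa.endswith("1-1")
          else {"0": 0.0, "1": 1.0})
     for sa in state_actions}


def strategy(sa):
    """Choose action 1 with probability 0.95 after mutual cooperation."""
    p = 0.95 if sa.endswith("1-1") else 0.05
    return {"A1": p, "A2": 1 - p}


# P: player -> state-action -> {action: probability}.
P = {player: {sa: strategy(sa) for sa in state_actions}
     for player in ("P1", "P2")}


def transition_likelihood(start, end):
    """Probability of moving from state-action `start` to state-action `end`."""
    next_state, action_profile = end.split(":")
    actions = action_profile.split("-")
    p_actions = math.prod(P[f"P{i + 1}"][start].get(f"A{a}", 0)
                          for i, a in enumerate(actions))
    return p_actions * Q[start][next_state]


print(transition_likelihood("0:1-1", "0:1-1"))
```

Summing `transition_likelihood(start, end)` over every `end` should give 1 for each `start`, which is a quick sanity check on any hand-written `P` and `Q`.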

Storing a transition for every pair of state_actions can cause memory issues:
- The transition matrix over state_actions has (n_states * n_choices^n_players)^2 entries
- If the game is anonymous (so player order does not matter), this reduces to (n_states * ncr(n_choices + n_players - 1, n_choices - 1))^2 entries

The second count is much smaller when n_players is large relative to n_choices. Unfortunately, we
still need to add up the likelihood of each possible action profile, so computation may still
be very slow, even if the result fits in memory.
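
The two counts can be compared directly. The sketch below only evaluates the formulas above for a few hypothetical game sizes; `math.comb` plays the role of `ncr`, and both function names are made up for this example.

```python
import math


def n_transitions_ordered(n_states, n_choices, n_players):
    """Entries in the transition matrix when player order matters."""
    return (n_states * n_choices**n_players) ** 2


def n_transitions_anonymous(n_states, n_choices, n_players):
    """Entries in the transition matrix when only action counts matter."""
    n_profiles = math.comb(n_choices + n_players - 1, n_choices - 1)
    return (n_states * n_profiles) ** 2


# Two states and four choices, for an increasing number of players.
for n_players in (2, 5, 10):
    print(n_players,
          n_transitions_ordered(2, 4, n_players),
          n_transitions_anonymous(2, 4, n_players))
```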

If we ever need to consider many players when computing the transition
probabilities for a stochastic game, we should instead use a Monte Carlo
simulation to estimate the transition probabilities and payoffs.
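
A Monte Carlo estimate of a single row of the transition matrix might look like the sketch below. It is not this module's implementation: the two-player game, the probabilities, and the function `sample_next_state_action` are assumptions chosen so that the exact answer is easy to check by hand.

```python
import collections

import numpy as np

rng = np.random.default_rng(seed=0)

# Assumed inputs for one starting state-action: each player's action
# distribution and the distribution over next states.
action_probs = {"P1": {"1": 0.95, "2": 0.05},
                "P2": {"1": 0.95, "2": 0.05}}
next_state_probs = {"0": 0.7, "1": 0.3}


def sample_next_state_action():
    """Draw one next state-action key such as "0:1-2"."""
    actions = [rng.choice(list(probs), p=list(probs.values()))
               for probs in action_probs.values()]
    state = rng.choice(list(next_state_probs),
                       p=list(next_state_probs.values()))
    return f"{state}:{'-'.join(actions)}"


n_samples = 10_000
counts = collections.Counter(sample_next_state_action()
                             for _ in range(n_samples))
estimates = {key: count / n_samples for key, count in counts.items()}

# Exact value for comparison: 0.7 * 0.95 * 0.95 = 0.63175
print(estimates["0:1-1"])
```

The estimate only needs one sampled path per draw, so its cost grows with the number of samples rather than with the number of possible action profiles.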

In [28]:
# | export
@multi
def compute_transition(models):
    "Compute the transition likelihood for the given transition."
    return models.get('compute_transition_key')

@method(compute_transition, 'anonymous_actions')
def compute_transition(models):
    """Compute transition likelihood when we are only passed anonymous action
    profiles (i.e. order does not matter)."""
    P, Q = [models[k] for k in ['P', 'Q']]
    transition_start, transition_end = [models[k] for k in ['transition_start',
                                                            'transition_end']]
    next_state, action_profile = transition_end.split(":")
    action_tuple = string_to_tuple(action_profile)
    action_count = dict(zip(*np.unique(action_tuple, return_counts=True)))
    profiles = create_profiles({**models,
                                "profiles_rule": "from_strategy_count",
                                "strategy_count": action_count})['profiles']
    profile_tuples = map(string_to_tuple, profiles)
    p = [np.prod([P[f"P{player + 1}"][transition_start].get(f"A{action}", 0)
                  for player, action in enumerate(profile_tuple)])
         for profile_tuple in profile_tuples]
    return np.sum(p) * Q[transition_start][next_state]


@method(compute_transition)
def compute_transition(models):
    "Compute transition likelihood given the states and action profiles."
    P, Q = [models[k] for k in ['P', 'Q']]
    transition_start, transition_end = [models[k] for k in ['transition_start',
                                                            'transition_end']]
    next_state, action_profile = transition_end.split(":")
    action_tuple = string_to_tuple(action_profile)
    p = np.prod([P[f"P{player + 1}"][transition_start].get(f"A{action}", 0)
                 for player, action in enumerate(action_tuple)])
    return p * Q[transition_start][next_state]

In [29]:
# | export
@method(build_payoffs, "stochastic-no-discounting")
def build_payoffs(models: dict):
    """Compute the payoffs for a stochastic game with the given flow_payoffs,
    state_transitions, strategies, and strategy_profile, when there is no
    discounting."""
    u = models['flow_payoffs']
    Q = models['state_transitions']
    strategy_profile = models['strategy_profile'].split("-")[::-1]
    strategies = models['strategies']
    P = {f"P{player + 1}": strategies[strategy_key]
         for player, strategy_key in enumerate(strategy_profile)}
    state_actions = list(Q.keys())
    M = np.zeros((len(state_actions), len(state_actions)))
    for row, transition_start in enumerate(state_actions):
        for col, transition_end in enumerate(state_actions):
            transition_data = {**models,
                               "P": P,
                               "Q": Q,
                               "transition_start": transition_start,
                               "transition_end": transition_end}
            M[row, col] = compute_transition(transition_data)
    v = thread_macro({**models, "transition_matrix": np.array([M])},
                     find_ergodic_distribution,
                     (get, "ergodic"))[0]
    u = np.array([[u[s][f"{i+1}"] for i in range(len(u[s]))]
                  for s in state_actions])
    for _ in range(v.ndim, u.ndim):
        v = v[:, None]
    payoffs = np.sum(v * u, axis=0)
    profile_payoffs = {f"{i+1}": pi for i, pi in enumerate(payoffs)}
    return {**models, "profile_payoffs": profile_payoffs}


@method(build_payoffs, "stochastic-with-discounting")
def build_payoffs(models: dict):
    """Compute the payoffs for a stochastic game with the given flow_payoffs,
    state_transitions, strategies, and strategy_profile."""
    u = models['flow_payoffs']
    Q = models['state_transitions']
    d = models['discount_rate']
    v0 = models['initial_state_action_distribution']
    strategy_profile = models['strategy_profile'].split("-")[::-1]
    strategies = models['strategies']
    P = {f"P{player + 1}": strategies[strategy_key]
         for player, strategy_key in enumerate(strategy_profile)}
    state_actions = list(Q.keys())
    M = np.zeros((len(state_actions), len(state_actions)))
    for row, transition_start in enumerate(state_actions):
        for col, transition_end in enumerate(state_actions):
            transition_data = {**models,
                               "P": P,
                               "Q": Q,
                               "transition_start": transition_start,
                               "transition_end": transition_end}
            M[row, col] = compute_transition(transition_data)
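    # v = (1 - d) v0 (I - d M)^{-1}: np.eye needs an int, and `@` is the matrix product.
    v = (1 - d) * np.asarray(v0) @ np.linalg.inv(np.eye(M.shape[0]) - d * M)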
    u = np.array([[u[s][f"{i+1}"] for i in range(len(u[s]))]
                  for s in state_actions])
    for _ in range(v.ndim, u.ndim):
        v = v[:, None]
    payoffs = np.sum(v * u, axis=0)
    profile_payoffs = {f"{i+1}": pi for i, pi in enumerate(payoffs)}
    return {**models, "profile_payoffs": profile_payoffs}


#### Tests for "flow_payoffs_wrapper" method of `build_payoffs`

Here is an example of flow payoffs.

In [30]:
# | export
@method(build_payoffs, 'vasconcelos_2014_flow')
def build_payoffs(models: dict):
    names = ['payoffs_state', 'c', 'T', 'b_r', 'b_p', 'r', 'g']
    payoffs_state, c, T, b_r, b_p, r, g = [models[k] for k in names]
    strategy_counts = payoffs_state['strategy_counts']
    state = payoffs_state['state']
    reward_bonus = g if state=='1' else 1
    n_r = strategy_counts.get("2", 0)
    n_p = strategy_counts.get("4", 0)
    risk = r * (n_r * c * b_r + n_p * c * b_p < T)
    payoffs = {"1": (1 - risk) * b_r * reward_bonus,  # rich_free_rider
               "2": (1 - risk) * c * b_r * reward_bonus,  # rich_contributor
               "3": (1 - risk) * b_p * reward_bonus,  # poor_free_rider
               "4": (1 - risk) * c * b_p * reward_bonus}  # poor_contributor
    return {**models, "flow_payoffs": payoffs}

In [31]:
models = {"allowed_sectors": {"P1": ["S1", "S2"],
                              "P2": ["S1", "S2"]},
          "sector_strategies": {"S1": [1, 2],
                                "S2": [3, 4]},
          "profiles_rule": "allowed_sectors",}
action_profiles = create_profiles(models)["profiles"]
n_states = 2
state_actions = []
for profile in action_profiles:
    for state in range(n_states):
        state_actions.append(f"{state}:{profile}")

models = {"payoffs_flow_key": "vasconcelos_2014_flow",
          "payoffs_key": "flow_payoffs_wrapper",
          "state_actions": state_actions,
          'c': 0.5,
          'T': 2,
          'b_r': 4,
          'b_p': 2,
          'r': 0.8,
          'g': 2,
          }
flow_payoffs = build_payoffs(models)['flow_payoffs']

In [32]:
flow_payoffs

defaultdict(None,
            {'0:1-1': {'1': 0.7999999999999998,
              '2': 0.3999999999999999,
              '3': 0.3999999999999999,
              '4': 0.19999999999999996},
             '1:1-1': {'1': 1.5999999999999996,
              '2': 0.7999999999999998,
              '3': 0.7999999999999998,
              '4': 0.3999999999999999},
             '0:1-2': {'1': 0.7999999999999998,
              '2': 0.3999999999999999,
              '3': 0.3999999999999999,
              '4': 0.19999999999999996},
             '1:1-2': {'1': 1.5999999999999996,
              '2': 0.7999999999999998,
              '3': 0.7999999999999998,
              '4': 0.3999999999999999},
             '0:1-3': {'1': 0.7999999999999998,
              '2': 0.3999999999999999,
              '3': 0.3999999999999999,
              '4': 0.19999999999999996},
             '1:1-3': {'1': 1.5999999999999996,
              '2': 0.7999999999999998,
              '3': 0.7999999999999998,
              '4': 0.39

#### State transition functions

In [33]:
# | export
@multi
def state_transition(models):
    "Compute the likelihood of the given state_transition."
    return models.get('state_transition_key')

@method(state_transition, 'ex1')
def state_transition(models):
    """Compute transition likelihood for a model with 2 states and an arbitrary
    number of players. The next state is the good state, 0, only if all players
    choose to cooperate, i.e. action 1 (rich) or action 3 (poor)."""
    state_action, next_state = [models[k] for k in ['state_action',
                                                    'next_state']]
    current_state, action_profile = state_action.split(":")
    action_tuple = string_to_tuple(action_profile)
    action_count = dict(zip(*np.unique(action_tuple, return_counts=True)))
    n_players = len(action_tuple)
    n_cooperators = action_count.get(1, 0) + action_count.get(3, 0)
    if (current_state == '0'
        and next_state == '1'
        and n_cooperators != n_players):
        transition_likelihood = 1
    elif (current_state == '1'
          and next_state == '0'
          and n_cooperators == n_players):
        transition_likelihood = 1
    elif (current_state == '0'
          and next_state == '0'
          and n_cooperators == n_players):
        transition_likelihood = 1
    elif (current_state == '1'
          and next_state == '1'
          and n_cooperators != n_players):
        transition_likelihood = 1
    else:
        transition_likelihood = 0
    return transition_likelihood
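
The branches above reduce to a single rule: the next state is `'0'` exactly when every player cooperates, regardless of the current state. A compact sketch of that rule follows, with a simplified parser standing in for `string_to_tuple` (an assumption) and a hypothetical function name.

```python
COOPERATIVE_ACTIONS = {1, 3}  # rich and poor cooperation, as in 'ex1'


def ex1_transition_likelihood(state_action: str, next_state: str) -> int:
    """Return 1 if `next_state` follows `state_action` under the 'ex1' rule.

    Illustrative re-statement; the current state does not affect the result.
    """
    _, action_profile = state_action.split(":")
    actions = [int(a) for a in action_profile.split("-")]
    all_cooperate = all(a in COOPERATIVE_ACTIONS for a in actions)
    return int(next_state == ("0" if all_cooperate else "1"))


print({s: ex1_transition_likelihood("1:3-1", s) for s in ("0", "1")})
```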

In [34]:
# | export
def build_state_transitions(models):
    state_actions = models['state_actions']
    n_states = models['n_states']
    state_transitions = {}
    for state_action in state_actions:
        state_transitions[state_action] = {}
        for next_state in [f"{i}" for i in range(n_states)]:
            likelihood = state_transition({**models,
                                           "state_action": state_action,
                                           "next_state": next_state})
            state_transitions[state_action][next_state] = likelihood
    return {**models, "state_transitions": state_transitions}

#### Tests for `build_state_transitions`

In [35]:
models = {"allowed_sectors": {"P1": ["S1", "S2"],
                              "P2": ["S1", "S2"]},
          "sector_strategies": {"S1": [1, 2],
                                "S2": [3, 4]},
          "profiles_rule": "allowed_sectors", }
action_profiles = create_profiles(models)["profiles"]
n_states = 2
state_actions = [f"{state}:{a}"
                 for a in action_profiles
                 for state in range(n_states)]
models = {'n_states':n_states,
          'state_actions': state_actions,
          'state_transition_key': 'ex1'}
result = build_state_transitions(models)['state_transitions']
expected = {'0:1-1': {'0': 1, '1': 0},
 '1:1-1': {'0': 1, '1': 0},
 '0:1-2': {'0': 0, '1': 1},
 '1:1-2': {'0': 0, '1': 1},
 '0:1-3': {'0': 1, '1': 0},
 '1:1-3': {'0': 1, '1': 0},
 '0:1-4': {'0': 0, '1': 1},
 '1:1-4': {'0': 0, '1': 1},
 '0:2-1': {'0': 0, '1': 1},
 '1:2-1': {'0': 0, '1': 1},
 '0:2-2': {'0': 0, '1': 1},
 '1:2-2': {'0': 0, '1': 1},
 '0:2-3': {'0': 0, '1': 1},
 '1:2-3': {'0': 0, '1': 1},
 '0:2-4': {'0': 0, '1': 1},
 '1:2-4': {'0': 0, '1': 1},
 '0:3-1': {'0': 1, '1': 0},
 '1:3-1': {'0': 1, '1': 0},
 '0:3-2': {'0': 0, '1': 1},
 '1:3-2': {'0': 0, '1': 1},
 '0:3-3': {'0': 1, '1': 0},
 '1:3-3': {'0': 1, '1': 0},
 '0:3-4': {'0': 0, '1': 1},
 '1:3-4': {'0': 0, '1': 1},
 '0:4-1': {'0': 0, '1': 1},
 '1:4-1': {'0': 0, '1': 1},
 '0:4-2': {'0': 0, '1': 1},
 '1:4-2': {'0': 0, '1': 1},
 '0:4-3': {'0': 0, '1': 1},
 '1:4-3': {'0': 0, '1': 1},
 '0:4-4': {'0': 0, '1': 1},
 '1:4-4': {'0': 0, '1': 1}}
fastcore.test.test_eq(result, expected)

In [36]:
models = {"allowed_sectors": {"P1": ["S1", "S2"],
                              "P2": ["S1", "S2"]},
          "sector_strategies": {"S1": ["1", "2"],
                                "S2": ["3", "4"]},
          "profiles_rule": "anonymous", }
action_profiles = create_profiles(models)["profiles"]
n_states = 2
state_actions = [f"{state}:{a}"
                 for a in action_profiles
                 for state in range(n_states)]
models = {'n_states':n_states,
          'state_actions': state_actions,
          'state_transition_key': 'ex1'}
result = build_state_transitions(models)['state_transitions']
expected = {'0:4-4': {'0': 0, '1': 1},
 '1:4-4': {'0': 0, '1': 1},
 '0:4-3': {'0': 0, '1': 1},
 '1:4-3': {'0': 0, '1': 1},
 '0:3-3': {'0': 1, '1': 0},
 '1:3-3': {'0': 1, '1': 0},
 '0:4-2': {'0': 0, '1': 1},
 '1:4-2': {'0': 0, '1': 1},
 '0:3-2': {'0': 0, '1': 1},
 '1:3-2': {'0': 0, '1': 1},
 '0:2-2': {'0': 0, '1': 1},
 '1:2-2': {'0': 0, '1': 1},
 '0:4-1': {'0': 0, '1': 1},
 '1:4-1': {'0': 0, '1': 1},
 '0:3-1': {'0': 1, '1': 0},
 '1:3-1': {'0': 1, '1': 0},
 '0:2-1': {'0': 0, '1': 1},
 '1:2-1': {'0': 0, '1': 1},
 '0:1-1': {'0': 1, '1': 0},
 '1:1-1': {'0': 1, '1': 0}}
fastcore.test.test_eq(result, expected)

#### Strategy construction

In [37]:
# | export
@multi
def build_strategy(models):
    "Build the desired strategy"
    return models.get('strategy_key')

@method(build_strategy, 'ex1_rich_cooperator')
def build_strategy(models):
    """A rich player who cooperates with 95% probability if everyone currently
    cooperates, otherwise defects with 95% probability."""
    state_action = models['state_action']
    current_state, action_profile = state_action.split(":")
    action_tuple = string_to_tuple(action_profile)
    action_count = dict(zip(*np.unique(action_tuple, return_counts=True)))
    n_players = len(action_tuple)
    n_cooperators = action_count.get(1, 0) + action_count.get(3, 0)
    if (current_state == '0'
        and n_cooperators == n_players):
        strategy = {"A1": 0.95, "A2": 0.05}
    elif (current_state == '0'
          and n_cooperators != n_players):
        strategy = {"A1": 0.05, "A2": 0.95}
    elif (current_state == '1'
          and n_cooperators == n_players):
        strategy = {"A1": 0.95, "A2": 0.05}
    elif (current_state == '1'
          and n_cooperators != n_players):
        strategy = {"A1": 0.05, "A2": 0.95}
    return strategy

@method(build_strategy, 'ex1_rich_defector')
def build_strategy(models):
    """A rich player who defects with 95% probability no matter what others
    do, nor what state they are in."""
    state_action = models['state_action']
    current_state, action_profile = state_action.split(":")
    action_tuple = string_to_tuple(action_profile)
    action_count = dict(zip(*np.unique(action_tuple, return_counts=True)))
    n_players = len(action_tuple)
    n_cooperators = action_count.get(1, 0) + action_count.get(3, 0)
    if (current_state == '0'
        and n_cooperators == n_players):
        strategy = {"A1": 0.05, "A2": 0.95}
    elif (current_state == '0'
          and n_cooperators != n_players):
        strategy = {"A1": 0.05, "A2": 0.95}
    elif (current_state == '1'
          and n_cooperators == n_players):
        strategy = {"A1": 0.05, "A2": 0.95}
    elif (current_state == '1'
          and n_cooperators != n_players):
        strategy = {"A1": 0.05, "A2": 0.95}
    return strategy

@method(build_strategy, 'ex1_poor_cooperator')
def build_strategy(models):
    """A poor player who cooperates with 95% probability if everyone currently
    cooperates, otherwise defects with 95% probability."""
    state_action = models['state_action']
    current_state, action_profile = state_action.split(":")
    action_tuple = string_to_tuple(action_profile)
    action_count = dict(zip(*np.unique(action_tuple, return_counts=True)))
    n_players = len(action_tuple)
    n_cooperators = action_count.get(1, 0) + action_count.get(3, 0)
    if (current_state == '0'
        and n_cooperators == n_players):
        strategy = {"A3": 0.95, "A4": 0.05}
    elif (current_state == '0'
          and n_cooperators != n_players):
        strategy = {"A3": 0.05, "A4": 0.95}
    elif (current_state == '1'
          and n_cooperators == n_players):
        strategy = {"A3": 0.95, "A4": 0.05}
    elif (current_state == '1'
          and n_cooperators != n_players):
        strategy = {"A3": 0.05, "A4": 0.95}
    return strategy

@method(build_strategy, 'ex1_poor_defector')
def build_strategy(models):
    """A poor player who defects with 95% probability no matter what others
    do, nor what state they are in."""
    state_action = models['state_action']
    current_state, action_profile = state_action.split(":")
    action_tuple = string_to_tuple(action_profile)
    action_count = dict(zip(*np.unique(action_tuple, return_counts=True)))
    n_players = len(action_tuple)
    n_cooperators = action_count.get(1, 0) + action_count.get(3, 0)
    if (current_state == '0'
        and n_cooperators == n_players):
        strategy = {"A3": 0.05, "A4": 0.95}
    elif (current_state == '0'
          and n_cooperators != n_players):
        strategy = {"A3": 0.05, "A4": 0.95}
    elif (current_state == '1'
          and n_cooperators == n_players):
        strategy = {"A3": 0.05, "A4": 0.95}
    elif (current_state == '1'
          and n_cooperators != n_players):
        strategy = {"A3": 0.05, "A4": 0.95}
    return strategy
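
The four strategies differ only in which action labels they use and whether they respond to full cooperation. The sketch below captures that pattern with a hypothetical factory, `make_ex1_strategy`. It illustrates the shared structure and is not registered with `build_strategy`.

```python
def make_ex1_strategy(cooperate_label, defect_label, conditional):
    """Return a function mapping a state-action key to action probabilities.

    `conditional=True` cooperates with probability 0.95 only after everyone
    cooperated (actions 1 or 3); `conditional=False` always mostly defects.
    """
    def strategy(state_action):
        _, action_profile = state_action.split(":")
        actions = [int(a) for a in action_profile.split("-")]
        everyone_cooperated = all(a in (1, 3) for a in actions)
        if conditional and everyone_cooperated:
            p_cooperate, p_defect = 0.95, 0.05
        else:
            p_cooperate, p_defect = 0.05, 0.95
        return {cooperate_label: p_cooperate, defect_label: p_defect}
    return strategy


rich_cooperator = make_ex1_strategy("A1", "A2", conditional=True)
poor_defector = make_ex1_strategy("A3", "A4", conditional=False)
print(rich_cooperator("0:3-1"), poor_defector("0:1-1"))
```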

In [38]:
# | export
def build_strategies(models):
    "Build a dictionary containing the specified strategies in `models`"
    state_actions, strategy_keys = [models[k] for k in ["state_actions",
                                                        "strategy_keys"]]
    strategies = {f"{i+1}": {s: build_strategy({"strategy_key": strategy_key,
                                                "state_action": s})
                             for s in state_actions}
                  for i, strategy_key in enumerate(strategy_keys)}
    return {**models, "strategies": strategies}

#### Tests for `build_strategy`

In [39]:
models = {"allowed_sectors": {"P1": ["S1", "S2"],
                              "P2": ["S1", "S2"]},
          "sector_strategies": {"S1": ["1", "2"],
                                "S2": ["3", "4"]},
          "profiles_rule": "anonymous", }
action_profiles = create_profiles(models)["profiles"]
n_states = 2
state_actions = [f"{state}:{a}"
                 for a in action_profiles
                 for state in range(n_states)]
strategy_keys = ["ex1_rich_cooperator",
                 "ex1_rich_defector",
                 "ex1_poor_cooperator",
                 "ex1_poor_defector",]
strategies = {f"{i+1}": {s: build_strategy({"strategy_key": strategy_key,
                                            "state_action": s})
                         for s in state_actions}
              for i, strategy_key in enumerate(strategy_keys)}
expected = {'1': {'0:4-4': {'A1': 0.05, 'A2': 0.95},
  '1:4-4': {'A1': 0.05, 'A2': 0.95},
  '0:4-3': {'A1': 0.05, 'A2': 0.95},
  '1:4-3': {'A1': 0.05, 'A2': 0.95},
  '0:3-3': {'A1': 0.95, 'A2': 0.05},
  '1:3-3': {'A1': 0.95, 'A2': 0.05},
  '0:4-2': {'A1': 0.05, 'A2': 0.95},
  '1:4-2': {'A1': 0.05, 'A2': 0.95},
  '0:3-2': {'A1': 0.05, 'A2': 0.95},
  '1:3-2': {'A1': 0.05, 'A2': 0.95},
  '0:2-2': {'A1': 0.05, 'A2': 0.95},
  '1:2-2': {'A1': 0.05, 'A2': 0.95},
  '0:4-1': {'A1': 0.05, 'A2': 0.95},
  '1:4-1': {'A1': 0.05, 'A2': 0.95},
  '0:3-1': {'A1': 0.95, 'A2': 0.05},
  '1:3-1': {'A1': 0.95, 'A2': 0.05},
  '0:2-1': {'A1': 0.05, 'A2': 0.95},
  '1:2-1': {'A1': 0.05, 'A2': 0.95},
  '0:1-1': {'A1': 0.95, 'A2': 0.05},
  '1:1-1': {'A1': 0.95, 'A2': 0.05}},
 '2': {'0:4-4': {'A1': 0.05, 'A2': 0.95},
  '1:4-4': {'A1': 0.05, 'A2': 0.95},
  '0:4-3': {'A1': 0.05, 'A2': 0.95},
  '1:4-3': {'A1': 0.05, 'A2': 0.95},
  '0:3-3': {'A1': 0.05, 'A2': 0.95},
  '1:3-3': {'A1': 0.05, 'A2': 0.95},
  '0:4-2': {'A1': 0.05, 'A2': 0.95},
  '1:4-2': {'A1': 0.05, 'A2': 0.95},
  '0:3-2': {'A1': 0.05, 'A2': 0.95},
  '1:3-2': {'A1': 0.05, 'A2': 0.95},
  '0:2-2': {'A1': 0.05, 'A2': 0.95},
  '1:2-2': {'A1': 0.05, 'A2': 0.95},
  '0:4-1': {'A1': 0.05, 'A2': 0.95},
  '1:4-1': {'A1': 0.05, 'A2': 0.95},
  '0:3-1': {'A1': 0.05, 'A2': 0.95},
  '1:3-1': {'A1': 0.05, 'A2': 0.95},
  '0:2-1': {'A1': 0.05, 'A2': 0.95},
  '1:2-1': {'A1': 0.05, 'A2': 0.95},
  '0:1-1': {'A1': 0.05, 'A2': 0.95},
  '1:1-1': {'A1': 0.05, 'A2': 0.95}},
 '3': {'0:4-4': {'A3': 0.05, 'A4': 0.95},
  '1:4-4': {'A3': 0.05, 'A4': 0.95},
  '0:4-3': {'A3': 0.05, 'A4': 0.95},
  '1:4-3': {'A3': 0.05, 'A4': 0.95},
  '0:3-3': {'A3': 0.95, 'A4': 0.05},
  '1:3-3': {'A3': 0.95, 'A4': 0.05},
  '0:4-2': {'A3': 0.05, 'A4': 0.95},
  '1:4-2': {'A3': 0.05, 'A4': 0.95},
  '0:3-2': {'A3': 0.05, 'A4': 0.95},
  '1:3-2': {'A3': 0.05, 'A4': 0.95},
  '0:2-2': {'A3': 0.05, 'A4': 0.95},
  '1:2-2': {'A3': 0.05, 'A4': 0.95},
  '0:4-1': {'A3': 0.05, 'A4': 0.95},
  '1:4-1': {'A3': 0.05, 'A4': 0.95},
  '0:3-1': {'A3': 0.95, 'A4': 0.05},
  '1:3-1': {'A3': 0.95, 'A4': 0.05},
  '0:2-1': {'A3': 0.05, 'A4': 0.95},
  '1:2-1': {'A3': 0.05, 'A4': 0.95},
  '0:1-1': {'A3': 0.95, 'A4': 0.05},
  '1:1-1': {'A3': 0.95, 'A4': 0.05}},
 '4': {'0:4-4': {'A3': 0.05, 'A4': 0.95},
  '1:4-4': {'A3': 0.05, 'A4': 0.95},
  '0:4-3': {'A3': 0.05, 'A4': 0.95},
  '1:4-3': {'A3': 0.05, 'A4': 0.95},
  '0:3-3': {'A3': 0.05, 'A4': 0.95},
  '1:3-3': {'A3': 0.05, 'A4': 0.95},
  '0:4-2': {'A3': 0.05, 'A4': 0.95},
  '1:4-2': {'A3': 0.05, 'A4': 0.95},
  '0:3-2': {'A3': 0.05, 'A4': 0.95},
  '1:3-2': {'A3': 0.05, 'A4': 0.95},
  '0:2-2': {'A3': 0.05, 'A4': 0.95},
  '1:2-2': {'A3': 0.05, 'A4': 0.95},
  '0:4-1': {'A3': 0.05, 'A4': 0.95},
  '1:4-1': {'A3': 0.05, 'A4': 0.95},
  '0:3-1': {'A3': 0.05, 'A4': 0.95},
  '1:3-1': {'A3': 0.05, 'A4': 0.95},
  '0:2-1': {'A3': 0.05, 'A4': 0.95},
  '1:2-1': {'A3': 0.05, 'A4': 0.95},
  '0:1-1': {'A3': 0.05, 'A4': 0.95},
  '1:1-1': {'A3': 0.05, 'A4': 0.95}}}
fastcore.test.test_eq(strategies, expected)

In [40]:
models = {"allowed_sectors": {"P1": ["S1", "S2"],
                              "P2": ["S1", "S2"]},
          "sector_strategies": {"S1": ["1", "2"],
                                "S2": ["3", "4"]},
          "profiles_rule": "anonymous", }
action_profiles = create_profiles(models)["profiles"]
n_states = 2
state_actions = [f"{state}:{a}"
                 for a in action_profiles
                 for state in range(n_states)]
strategy_keys = ["ex1_rich_cooperator",
                 "ex1_rich_defector",
                 "ex1_poor_cooperator",
                 "ex1_poor_defector",]
models = {**models,
          "strategy_keys": strategy_keys,
          "state_actions": state_actions}
strategies = build_strategies(models)['strategies']
expected = {'1': {'0:4-4': {'A1': 0.05, 'A2': 0.95},
  '1:4-4': {'A1': 0.05, 'A2': 0.95},
  '0:4-3': {'A1': 0.05, 'A2': 0.95},
  '1:4-3': {'A1': 0.05, 'A2': 0.95},
  '0:3-3': {'A1': 0.95, 'A2': 0.05},
  '1:3-3': {'A1': 0.95, 'A2': 0.05},
  '0:4-2': {'A1': 0.05, 'A2': 0.95},
  '1:4-2': {'A1': 0.05, 'A2': 0.95},
  '0:3-2': {'A1': 0.05, 'A2': 0.95},
  '1:3-2': {'A1': 0.05, 'A2': 0.95},
  '0:2-2': {'A1': 0.05, 'A2': 0.95},
  '1:2-2': {'A1': 0.05, 'A2': 0.95},
  '0:4-1': {'A1': 0.05, 'A2': 0.95},
  '1:4-1': {'A1': 0.05, 'A2': 0.95},
  '0:3-1': {'A1': 0.95, 'A2': 0.05},
  '1:3-1': {'A1': 0.95, 'A2': 0.05},
  '0:2-1': {'A1': 0.05, 'A2': 0.95},
  '1:2-1': {'A1': 0.05, 'A2': 0.95},
  '0:1-1': {'A1': 0.95, 'A2': 0.05},
  '1:1-1': {'A1': 0.95, 'A2': 0.05}},
 '2': {'0:4-4': {'A1': 0.05, 'A2': 0.95},
  '1:4-4': {'A1': 0.05, 'A2': 0.95},
  '0:4-3': {'A1': 0.05, 'A2': 0.95},
  '1:4-3': {'A1': 0.05, 'A2': 0.95},
  '0:3-3': {'A1': 0.05, 'A2': 0.95},
  '1:3-3': {'A1': 0.05, 'A2': 0.95},
  '0:4-2': {'A1': 0.05, 'A2': 0.95},
  '1:4-2': {'A1': 0.05, 'A2': 0.95},
  '0:3-2': {'A1': 0.05, 'A2': 0.95},
  '1:3-2': {'A1': 0.05, 'A2': 0.95},
  '0:2-2': {'A1': 0.05, 'A2': 0.95},
  '1:2-2': {'A1': 0.05, 'A2': 0.95},
  '0:4-1': {'A1': 0.05, 'A2': 0.95},
  '1:4-1': {'A1': 0.05, 'A2': 0.95},
  '0:3-1': {'A1': 0.05, 'A2': 0.95},
  '1:3-1': {'A1': 0.05, 'A2': 0.95},
  '0:2-1': {'A1': 0.05, 'A2': 0.95},
  '1:2-1': {'A1': 0.05, 'A2': 0.95},
  '0:1-1': {'A1': 0.05, 'A2': 0.95},
  '1:1-1': {'A1': 0.05, 'A2': 0.95}},
 '3': {'0:4-4': {'A3': 0.05, 'A4': 0.95},
  '1:4-4': {'A3': 0.05, 'A4': 0.95},
  '0:4-3': {'A3': 0.05, 'A4': 0.95},
  '1:4-3': {'A3': 0.05, 'A4': 0.95},
  '0:3-3': {'A3': 0.95, 'A4': 0.05},
  '1:3-3': {'A3': 0.95, 'A4': 0.05},
  '0:4-2': {'A3': 0.05, 'A4': 0.95},
  '1:4-2': {'A3': 0.05, 'A4': 0.95},
  '0:3-2': {'A3': 0.05, 'A4': 0.95},
  '1:3-2': {'A3': 0.05, 'A4': 0.95},
  '0:2-2': {'A3': 0.05, 'A4': 0.95},
  '1:2-2': {'A3': 0.05, 'A4': 0.95},
  '0:4-1': {'A3': 0.05, 'A4': 0.95},
  '1:4-1': {'A3': 0.05, 'A4': 0.95},
  '0:3-1': {'A3': 0.95, 'A4': 0.05},
  '1:3-1': {'A3': 0.95, 'A4': 0.05},
  '0:2-1': {'A3': 0.05, 'A4': 0.95},
  '1:2-1': {'A3': 0.05, 'A4': 0.95},
  '0:1-1': {'A3': 0.95, 'A4': 0.05},
  '1:1-1': {'A3': 0.95, 'A4': 0.05}},
 '4': {'0:4-4': {'A3': 0.05, 'A4': 0.95},
  '1:4-4': {'A3': 0.05, 'A4': 0.95},
  '0:4-3': {'A3': 0.05, 'A4': 0.95},
  '1:4-3': {'A3': 0.05, 'A4': 0.95},
  '0:3-3': {'A3': 0.05, 'A4': 0.95},
  '1:3-3': {'A3': 0.05, 'A4': 0.95},
  '0:4-2': {'A3': 0.05, 'A4': 0.95},
  '1:4-2': {'A3': 0.05, 'A4': 0.95},
  '0:3-2': {'A3': 0.05, 'A4': 0.95},
  '1:3-2': {'A3': 0.05, 'A4': 0.95},
  '0:2-2': {'A3': 0.05, 'A4': 0.95},
  '1:2-2': {'A3': 0.05, 'A4': 0.95},
  '0:4-1': {'A3': 0.05, 'A4': 0.95},
  '1:4-1': {'A3': 0.05, 'A4': 0.95},
  '0:3-1': {'A3': 0.05, 'A4': 0.95},
  '1:3-1': {'A3': 0.05, 'A4': 0.95},
  '0:2-1': {'A3': 0.05, 'A4': 0.95},
  '1:2-1': {'A3': 0.05, 'A4': 0.95},
  '0:1-1': {'A3': 0.05, 'A4': 0.95},
  '1:1-1': {'A3': 0.05, 'A4': 0.95}}}
fastcore.test.test_eq(strategies, expected)
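
The keys of `expected` pair a state index with an action profile. A minimal sketch of how the comprehension above orders those keys is shown below. The profile list is hard-coded from the keys of `expected` rather than produced by `create_profiles`, so the ordering is an assumption about what that function returns.

```python
# Profiles copied from the keys of `expected` (assumption: this matches the
# order returned by `create_profiles` under the "anonymous" rule).
action_profiles = ["4-4", "4-3", "3-3", "4-2", "3-2",
                   "2-2", "4-1", "3-1", "2-1", "1-1"]
n_states = 2

# The profile loop is the outer loop, so both states of a given profile
# end up next to each other in the resulting list.
state_actions = [f"{state}:{a}"
                 for a in action_profiles
                 for state in range(n_states)]

print(len(state_actions))
print(state_actions[:4])
```

Each strategy in `expected` therefore needs one entry per state-action key, which is why every inner dictionary has the same set of keys.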

#### Stochastic payoffs test

In [41]:
models = {"allowed_sectors": {"P1": ["S1", "S2"],
                              "P2": ["S1", "S2"]},
          "sector_strategies": {"S1": ["1", "2"],
                                "S2": ["3", "4"]},
          "profiles_rule": "anonymous", }
action_profiles = create_profiles(models)["profiles"]
n_states = 2
state_actions = [f"{state}:{a}"
                 for a in action_profiles
                 for state in range(n_states)]

strategy_keys = ["ex1_rich_cooperator",
                 "ex1_rich_defector",
                 "ex1_poor_cooperator",
                 "ex1_poor_defector",]
models = {**models,
          "strategy_keys": strategy_keys,
          "state_actions": state_actions}
strategies = build_strategies(models)['strategies']
strategy_profile = "1-2-3"
models = {"payoffs_flow_key": "vasconcelos_2014_flow",
          "payoffs_key": "flow_payoffs_wrapper",
          "state_actions": state_actions,
          "strategies": strategies,
          "strategy_profile": strategy_profile,
          'n_states':n_states,
          'state_transition_key': 'ex1',
          'compute_transition_key': "anonymous_actions",
          'c': 0.5,
          'T': 2,
          'b_r': 4,
          'b_p': 2,
          'r': 0.8,
          'g': 2,
          }

models = build_state_transitions(models)
models = build_payoffs(models)
models = {**models,
          "payoffs_key": "stochastic-no-discounting"}
results = build_payoffs(models)
expected = {'1': 1.5979057591623032,
 '2': 0.7989528795811516,
 '3': 0.7989528795811516,
 '4': 0.3994764397905758}
for k, v in results['profile_payoffs'].items():
    fastcore.test.test_close(v, expected[k])

In [42]:
models = {"allowed_sectors": {"P1": ["S1", "S2"],
                              "P2": ["S1", "S2"]},
          "sector_strategies": {"S1": ["1", "2"],
                                "S2": ["3", "4"]},
          "profiles_rule": "anonymous", }
action_profiles = create_profiles(models)["profiles"]
n_states = 2
state_actions = [f"{state}:{a}"
                 for a in action_profiles
                 for state in range(n_states)]

strategy_keys = ["ex1_rich_cooperator",
                 "ex1_rich_defector",
                 "ex1_poor_cooperator",
                 "ex1_poor_defector",]
models = {**models,
          "strategy_keys": strategy_keys,
          "state_actions": state_actions}
strategies = build_strategies(models)['strategies']
strategy_profile = "1-2-3"
models = {**models,
          "payoffs_flow_key": "vasconcelos_2014_flow",
          "payoffs_key": "flow_payoffs_wrapper",
          "state_actions": state_actions,
          "strategies": strategies,
          "strategy_profile": strategy_profile,
          'n_states':n_states,
          'state_transition_key': 'ex1',
          'compute_transition_key': "anonymous_actions",
          'c': 0.5,
          'T': 2,
          'b_r': 4,
          'b_p': 2,
          'r': 0.8,
          'g': 2,
          }
models = build_state_transitions(models)
models = build_payoffs(models)
models = {**models,
          "payoffs_key": "payoff_function_wrapper",
          "profile_payoffs_key": "stochastic-no-discounting"}
results = build_payoffs(models)
results['payoffs']

{'4-4': {'P1': 0.39949999999999986, 'P2': 0.39949999999999986},
 '4-3': {'P1': 0.7989528795811517, 'P2': 0.39947643979057584},
 '3-3': {'P1': 0.7899999999999998, 'P2': 0.7899999999999998},
 '4-2': {'P1': 0.7989999999999997, 'P2': 0.39949999999999986},
 '3-2': {'P1': 0.7989528795811518, 'P2': 0.7989528795811518},
 '2-2': {'P1': 0.7989999999999997, 'P2': 0.7989999999999997},
 '4-1': {'P1': 1.5979057591623036, 'P2': 0.3994764397905759},
 '3-1': {'P1': 1.5799999999999994, 'P2': 0.7899999999999997},
 '2-1': {'P1': 1.5979057591623034, 'P2': 0.7989528795811517},
 '1-1': {'P1': 1.5799999999999994, 'P2': 1.5799999999999994}}

# Payoff Matrices (part 3)

> This module contains payoff matrices for different evolutionary games
>
> Part 3 contains payoff matrices for the following games
> - Regulatory Markets

In [43]:
# | export
@method(build_payoffs, "regulatory_markets_v1_reward_before")
def build_payoffs(models):
    """Regulatory market payoffs when incentives are given in advance and only
    taken away if firms act unsafely."""
    names1 = ['b', 'c', 's', 'p', 'B', 'W']
    names2 = ['pfo_l', 'pfo_h', 'λ', 'r_l', 'r_h', 'g']
    b, c, s, p, B, W = [models[k] for k in names1]
    pfo_l, pfo_h, λ, r_l, r_h, g = [models[k] for k in names2]
    collective_risk = models.get('collective_risk', 0)
    risk_shared = (1 - (1-p)*collective_risk)
    
    Π_h11 = B / (2*W) + b/2 - c
    Π_h12 = b / (s+1) * (1 - pfo_h) * risk_shared + (b + B / W) * pfo_h - c
    Π_h21 = p * ( 1 - pfo_h) * (s*b / (s + 1) + s * B / W)
    Π_h22 = p * ( 1 - pfo_h**2) * (b/2 + s*B/(2*W)) * risk_shared
    
    Π_l11 = B / (2*W) + b/2 - c
    Π_l12 = b / (s+1) * (1 - pfo_l) * risk_shared + (b + B / W) * pfo_l - c
    Π_l21 = p * ( 1 - pfo_l) * (s*b / (s + 1) + s * B / W)
    Π_l22 = p * ( 1 - pfo_l**2) * (b/2 + s*B/(2*W)) * risk_shared
    
    λ_h = λ * (1 - p) * (1 - pfo_h)
    λ_l = λ * (1 - p) * (1 - pfo_l)
    
    Ω_11 = r_h + g
    Ω_12 = r_l + g
    Ω_21 = r_h + g * pfo_h**2 - λ_h
    Ω_22 = r_l + g * pfo_l**2 - λ_l
    Ω_31 = r_h + g
    Ω_32 = r_l + g * pfo_l**2 - λ_l
    
    payoffs = {}
    payoffs["4-1-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-1-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-1-3"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-2-1"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-2-2"] = {"P3": Ω_21,
                        "P2": Π_h22,
                        "P1": Π_h22}
    payoffs["4-2-3"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-3-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-3-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-3-3"] = {"P3": Ω_31,
                        "P2": Π_h11,
                        "P1": Π_h11}
    
    payoffs["5-1-1"] = {"P3": Ω_12,
                        "P2": Π_l11,
                        "P1": Π_l11}
    payoffs["5-1-2"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-1-3"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-2-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-2-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-2-3"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-3-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-3"] = {"P3": Ω_32,
                        "P2": Π_l22,
                        "P1": Π_l22}

    return {**models, "payoffs": payoffs}
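
To get a feel for the firm payoffs above, the sketch below evaluates the four `Π` expressions for a single regulator type using plain floats. The helper name `firm_payoffs`, the dictionary labels, and all parameter values are hypothetical. The labels assume that action `1` means safe and action `2` means unsafe, which is how `compute_game_welfare` later reads the `4-1-1` and `4-2-2` profiles.

```python
def firm_payoffs(b, c, s, p, B, W, pfo, collective_risk=0.0):
    """Firm payoffs under a regulator with detection probability `pfo`.

    Mirrors Π_x11, Π_x12, Π_x21 and Π_x22 from the cell above.
    """
    risk_shared = 1 - (1 - p) * collective_risk
    return {
        # Both firms safe: share the prize and the benefit, pay the cost.
        "safe_vs_safe": B / (2 * W) + b / 2 - c,
        # Safe firm facing an unsafe firm.
        "safe_vs_unsafe": (b / (s + 1) * (1 - pfo) * risk_shared
                           + (b + B / W) * pfo - c),
        # Unsafe firm facing a safe firm.
        "unsafe_vs_safe": p * (1 - pfo) * (s * b / (s + 1) + s * B / W),
        # Both firms unsafe.
        "unsafe_vs_unsafe": (p * (1 - pfo**2) * (b / 2 + s * B / (2 * W))
                             * risk_shared),
    }


# Hypothetical parameter values, chosen only for illustration.
example = firm_payoffs(b=4, c=1, s=1.5, p=0.25, B=10_000, W=100, pfo=0.9)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

With a high detection probability the unsafe payoffs shrink sharply, since they are scaled by `(1 - pfo)` or `(1 - pfo**2)`.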


In [44]:
# | export
@method(build_payoffs, "regulatory_markets_v1_reward_after")
def build_payoffs(models):
    """Regulatory market payoffs when there is only a reward after catching
    unsafe firms."""
    names1 = ['b', 'c', 's', 'p', 'B', 'W']
    names2 = ['pfo_l', 'pfo_h', 'λ', 'r_l', 'r_h', 'g']
    b, c, s, p, B, W = [models[k] for k in names1]
    pfo_l, pfo_h, λ, r_l, r_h, g = [models[k] for k in names2]
    collective_risk = models.get('collective_risk', 0)
    risk_shared = (1 - (1-p)*collective_risk)
    
    Π_h11 = B / (2*W) + b/2 - c
    Π_h12 = b / (s+1) * (1 - pfo_h) * risk_shared + (b + B / W) * pfo_h - c
    Π_h21 = p * ( 1 - pfo_h) * (s*b / (s + 1) + s * B / W)
    Π_h22 = p * ( 1 - pfo_h**2) * (b/2 + s*B/(2*W)) * risk_shared
    
    Π_l11 = B / (2*W) + b/2 - c
    Π_l12 = b / (s+1) * (1 - pfo_l) * risk_shared + (b + B / W) * pfo_l - c
    Π_l21 = p * ( 1 - pfo_l) * (s*b / (s + 1) + s * B / W)
    Π_l22 = p * ( 1 - pfo_l**2) * (b/2 + s*B/(2*W)) * risk_shared
    
    λ_h = λ * (1 - p) * (1 - pfo_h)
    λ_l = λ * (1 - p) * (1 - pfo_l)
    
    # No ex-ante reward for regulators
    Ω_11 = r_h
    Ω_12 = r_l
    # Expect to catch n * pfo unsafe firms, where n=2 is the number of firms
    # and pfo is the regulator's detection probability (pfo_h or pfo_l).
    # Regulators may expect to be penalised if a disaster occurs under their
    # watch, but by default the penalty λ may be 0.
    Ω_21 = r_h + g * 2 * pfo_h - λ_h
    Ω_22 = r_l + g * 2 * pfo_l - λ_l
    Ω_31 = r_h
    Ω_32 = r_l + g * 2 * pfo_l - λ_l
    
    payoffs = {}
    payoffs["4-1-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-1-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-1-3"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-2-1"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-2-2"] = {"P3": Ω_21,
                        "P2": Π_h22,
                        "P1": Π_h22}
    payoffs["4-2-3"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-3-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-3-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-3-3"] = {"P3": Ω_31,
                        "P2": Π_h11,
                        "P1": Π_h11}
    
    payoffs["5-1-1"] = {"P3": Ω_12,
                        "P2": Π_l11,
                        "P1": Π_l11}
    payoffs["5-1-2"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-1-3"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-2-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-2-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-2-3"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-3-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-3"] = {"P3": Ω_32,
                        "P2": Π_l22,
                        "P1": Π_l22}

    return {**models, "payoffs": payoffs}


In [45]:
# | export
@method(build_payoffs, "regulatory_markets_v1a_reward_after")
def build_payoffs(models):
    """An alternative payoff scheme for regulatory markets which more closely
    matches the DSAIR model"""
    names1 = ['b', 'c', 's', 'p', 'B', 'W']
    names2 = ['pfo_l', 'pfo_h', 'λ', 'r_l', 'r_h', 'g']
    b, c, s, p, B, W = [models[k] for k in names1]
    pfo_l, pfo_h, λ, r_l, r_h, g = [models[k] for k in names2]
    collective_risk = models.get('collective_risk', 0)
    risk_shared = (1 - (1-p)*collective_risk)
    
    Π_h11 = B / (2*W) + b/2 - c
    Π_h12 = b / (s+1) * (1 - pfo_h) * risk_shared + (b) * pfo_h - c
    Π_h21 = p * (s*b / (s + 1)  * ( 1 - pfo_h) + s * B / W)
    Π_h22 = p * (b/2 * ( 1 - pfo_h**2) + s*B/(2*W)) * risk_shared
    
    Π_l11 = B / (2*W) + b/2 - c
    Π_l12 = b / (s+1) * (1 - pfo_l) * risk_shared + (b) * pfo_l - c
    Π_l21 = p * (s*b / (s + 1) * ( 1 - pfo_l) + s * B / W)
    Π_l22 = p * (b/2 * ( 1 - pfo_l**2) + s*B/(2*W)) * risk_shared
    
    λ_h = λ * (1 - p) * (1 - pfo_h)
    λ_l = λ * (1 - p) * (1 - pfo_l)
    
    # No ex-ante reward for regulators
    Ω_11 = r_h
    Ω_12 = r_l
    # Expect to catch n * pfo unsafe firms, where n=2 is the number of firms
    # and pfo is the regulator's detection probability (pfo_h or pfo_l).
    # Regulators may expect to be penalised if a disaster occurs under their
    # watch, but by default the penalty λ may be 0.
    Ω_21 = r_h + g * 2 * pfo_h - λ_h
    Ω_22 = r_l + g * 2 * pfo_l - λ_l
    Ω_31 = r_h
    Ω_32 = r_l + g * 2 * pfo_l - λ_l
    
    payoffs = {}
    payoffs["4-1-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-1-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-1-3"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-2-1"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-2-2"] = {"P3": Ω_21,
                        "P2": Π_h22,
                        "P1": Π_h22}
    payoffs["4-2-3"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-3-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-3-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-3-3"] = {"P3": Ω_31,
                        "P2": Π_h11,
                        "P1": Π_h11}
    
    payoffs["5-1-1"] = {"P3": Ω_12,
                        "P2": Π_l11,
                        "P1": Π_l11}
    payoffs["5-1-2"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-1-3"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-2-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-2-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-2-3"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-3-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-3"] = {"P3": Ω_32,
                        "P2": Π_l22,
                        "P1": Π_l22}

    return {**models, "payoffs": payoffs}


# | export
@method(build_payoffs, "regulatory_markets_v1a_reward_before")
def build_payoffs(models):
    """Alternative (v1a) regulatory market payoffs, closer to the DSAIR model,
    when incentives are given in advance and only taken away if firms act
    unsafely."""
    names1 = ['b', 'c', 's', 'p', 'B', 'W']
    names2 = ['pfo_l', 'pfo_h', 'λ', 'r_l', 'r_h', 'g']
    b, c, s, p, B, W = [models[k] for k in names1]
    pfo_l, pfo_h, λ, r_l, r_h, g = [models[k] for k in names2]
    collective_risk = models.get('collective_risk', 0)
    risk_shared = (1 - (1-p)*collective_risk)
    
    Π_h11 = B / (2*W) + b/2 - c
    Π_h12 = b / (s+1) * (1 - pfo_h) * risk_shared + (b) * pfo_h - c
    Π_h21 = p * (s*b / (s + 1)  * ( 1 - pfo_h) + s * B / W)
    Π_h22 = p * (b/2 * ( 1 - pfo_h**2) + s*B/(2*W)) * risk_shared
    
    Π_l11 = B / (2*W) + b/2 - c
    Π_l12 = b / (s+1) * (1 - pfo_l) * risk_shared + (b) * pfo_l - c
    Π_l21 = p * (s*b / (s + 1) * ( 1 - pfo_l) + s * B / W)
    Π_l22 = p * (b/2 * ( 1 - pfo_l**2) + s*B/(2*W)) * risk_shared
    
    λ_h = λ * (1 - p) * (1 - pfo_h)
    λ_l = λ * (1 - p) * (1 - pfo_l)
    
    Ω_11 = r_h + g
    Ω_12 = r_l + g
    Ω_21 = r_h + g * pfo_h**2 - λ_h
    Ω_22 = r_l + g * pfo_l**2 - λ_l
    Ω_31 = r_h + g
    Ω_32 = r_l + g * pfo_l**2 - λ_l
    
    payoffs = {}
    payoffs["4-1-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-1-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-1-3"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-2-1"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-2-2"] = {"P3": Ω_21,
                        "P2": Π_h22,
                        "P1": Π_h22}
    payoffs["4-2-3"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-3-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-3-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-3-3"] = {"P3": Ω_31,
                        "P2": Π_h11,
                        "P1": Π_h11}
    
    payoffs["5-1-1"] = {"P3": Ω_12,
                        "P2": Π_l11,
                        "P1": Π_l11}
    payoffs["5-1-2"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-1-3"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-2-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-2-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-2-3"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-3-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-3"] = {"P3": Ω_32,
                        "P2": Π_l22,
                        "P1": Π_l22}

    return {**models, "payoffs": payoffs}


In [46]:
# | export
@method(build_payoffs, "regulatory_markets_v1_reward_mixed")
def build_payoffs(models):
    """Regulatory market payoffs when regulators receive a mix of incentives
    given in advance and rewards for catching unsafe firms, weighted by
    `incentive_mix`."""
    names1 = ['b', 'c', 's', 'p', 'B', 'W']
    names2 = ['pfo_l', 'pfo_h', 'λ', 'r_l', 'r_h', 'g']
    b, c, s, p, B, W = [models[k] for k in names1]
    pfo_l, pfo_h, λ, r_l, r_h, g = [models[k] for k in names2]
    collective_risk = models.get('collective_risk', 0)
    risk_shared = (1 - (1-p)*collective_risk)
    mix = models.get('incentive_mix', 0)
    
    Π_h11 = B / (2*W) + b/2 - c
    Π_h12 = b / (s+1) * (1 - pfo_h) * risk_shared + (b + B / W) * pfo_h - c
    Π_h21 = p * ( 1 - pfo_h) * (s*b / (s + 1) + s * B / W)
    Π_h22 = p * ( 1 - pfo_h**2) * (b/2 + s*B/(2*W)) * risk_shared
    
    Π_l11 = B / (2*W) + b/2 - c
    Π_l12 = b / (s+1) * (1 - pfo_l) * risk_shared + (b + B / W) * pfo_l - c
    Π_l21 = p * ( 1 - pfo_l) * (s*b / (s + 1) + s * B / W)
    Π_l22 = p * ( 1 - pfo_l**2) * (b/2 + s*B/(2*W)) * risk_shared
    
    λ_h = λ * (1 - p) * (1 - pfo_h)
    λ_l = λ * (1 - p) * (1 - pfo_l)
    
    Ω_11 = r_h + g * mix
    Ω_12 = r_l + g * mix
    Ω_21 = r_h + g * (pfo_h**2 * mix + pfo_h * (1 - mix)) - λ_h
    Ω_22 = r_l + g * (pfo_l**2 * mix + pfo_l * (1 - mix)) - λ_l
    Ω_31 = r_h + g * mix
    Ω_32 = r_l + g * (pfo_l**2 * mix + pfo_l * (1 - mix)) - λ_l
    
    payoffs = {}
    payoffs["4-1-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-1-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-1-3"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-2-1"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-2-2"] = {"P3": Ω_21,
                        "P2": Π_h22,
                        "P1": Π_h22}
    payoffs["4-2-3"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-3-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-3-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-3-3"] = {"P3": Ω_31,
                        "P2": Π_h11,
                        "P1": Π_h11}
    
    payoffs["5-1-1"] = {"P3": Ω_12,
                        "P2": Π_l11,
                        "P1": Π_l11}
    payoffs["5-1-2"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-1-3"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-2-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-2-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-2-3"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-3-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-3"] = {"P3": Ω_32,
                        "P2": Π_l22,
                        "P1": Π_l22}

    return {**models, "payoffs": payoffs}
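
The `incentive_mix` weight interpolates the regulator's reward term between two forms. The sketch below isolates the `Ω_21` / `Ω_22` expression from the cell above. The function name and the parameter values are hypothetical.

```python
def regulator_payoff_when_unsafe(r, g, pfo, lam, p, mix):
    """Regulator payoff when both firms are unsafe (the Ω_21 / Ω_22 form)."""
    # Expected penalty if a disaster happens and the firms were not caught.
    expected_penalty = lam * (1 - p) * (1 - pfo)
    return r + g * (pfo**2 * mix + pfo * (1 - mix)) - expected_penalty


# Hypothetical values, chosen only for illustration.
args = dict(r=0.0, g=1.0, pfo=0.8, lam=0.0, p=0.25)
for mix in (0.0, 0.5, 1.0):
    print(mix, regulator_payoff_when_unsafe(mix=mix, **args))
```

At `mix = 1` the expression reduces to the `regulatory_markets_v1_reward_before` form `r + g * pfo**2 - λ`. At `mix = 0` the reward term is `g * pfo`, whereas `regulatory_markets_v1_reward_after` uses `g * 2 * pfo`, so the two models do not coincide exactly at that end.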


In [47]:
# | export
@multi
def compute_game_welfare(models):
    "Compute the welfare generated by the game in each state."
    return models.get('game_welfare_rule')

@method(compute_game_welfare, "regulatory_markets_v1_reward_before")
def compute_game_welfare(models):
    "Compute the welfare generated by the game in each state."
    names = ['payoffs', 'ergodic']
    payoffs, ergodic = [models[k] for k in names]
    p, g, pfo_h, pfo_l = [models[k] for k in ['p', 'g', 'pfo_h', 'pfo_l']]
    consumer_surplus = models.get('consumer_surplus', 0)
    externality = models.get('externality', 0)

    company_payoffs_safe_hq = payoffs['4-1-1']["P1"] + payoffs['4-1-1']["P2"]
    company_payoffs_safe_lq = payoffs['5-1-1']["P1"] + payoffs['5-1-1']["P2"]
    company_payoffs_unsafe_hq = payoffs['4-2-2']["P1"] + payoffs['4-2-2']["P2"]
    company_payoffs_unsafe_lq = payoffs['5-2-2']["P1"] + payoffs['5-2-2']["P2"]
    company_payoffs_vetted_hq = payoffs['4-3-3']["P1"] + payoffs['4-3-3']["P2"]
    company_payoffs_vetted_lq = payoffs['5-3-3']["P1"] + payoffs['5-3-3']["P2"]

    welfare_safe_hq = (company_payoffs_safe_hq * (1 + consumer_surplus)
                       - g)
    welfare_unsafe_hq = (company_payoffs_unsafe_hq * (1 + consumer_surplus)
                         - (1-p)*externality
                         - g * pfo_h**2)
    welfare_vetted_hq = (company_payoffs_vetted_hq * (1 + consumer_surplus)
                         - g)
    welfare_safe_lq = (company_payoffs_safe_lq * (1 + consumer_surplus)
                       - g)
    welfare_unsafe_lq = (company_payoffs_unsafe_lq * (1 + consumer_surplus)
                       - (1-p)*externality
                       - g * pfo_l**2)
    welfare_vetted_lq = (company_payoffs_vetted_lq * (1 + consumer_surplus)
                       - (1-p)*externality
                       - g * pfo_l**2)
    welfares = [welfare_safe_hq,
                welfare_unsafe_hq,
                welfare_vetted_hq,
                welfare_safe_lq,
                welfare_unsafe_lq,
                welfare_vetted_lq]
    game_welfare = np.sum([welfare * state_frequency
                           for welfare, state_frequency in zip(welfares,
                                                               ergodic.T)],
                          axis=0)
    return {**models, "game_welfare": game_welfare}

@method(compute_game_welfare, "regulatory_markets_v1_reward_after")
def compute_game_welfare(models):
    "Compute the welfare generated by the game in each state."
    names = ['payoffs', 'ergodic']
    payoffs, ergodic = [models[k] for k in names]
    p, g, pfo_h, pfo_l = [models[k] for k in ['p', 'g', 'pfo_h', 'pfo_l']]
    consumer_surplus = models.get('consumer_surplus', 0)
    externality = models.get('externality', 0)

    company_payoffs_safe_hq = payoffs['4-1-1']["P1"] + payoffs['4-1-1']["P2"]
    company_payoffs_safe_lq = payoffs['5-1-1']["P1"] + payoffs['5-1-1']["P2"]
    company_payoffs_unsafe_hq = payoffs['4-2-2']["P1"] + payoffs['4-2-2']["P2"]
    company_payoffs_unsafe_lq = payoffs['5-2-2']["P1"] + payoffs['5-2-2']["P2"]
    company_payoffs_vetted_hq = payoffs['4-3-3']["P1"] + payoffs['4-3-3']["P2"]
    company_payoffs_vetted_lq = payoffs['5-3-3']["P1"] + payoffs['5-3-3']["P2"]

    welfare_safe_hq = (company_payoffs_safe_hq * (1 + consumer_surplus))
    welfare_unsafe_hq = (company_payoffs_unsafe_hq * (1 + consumer_surplus)
                         - (1-p)*externality
                         - g * pfo_h)
    welfare_vetted_hq = (company_payoffs_vetted_hq * (1 + consumer_surplus))
    welfare_safe_lq = (company_payoffs_safe_lq * (1 + consumer_surplus))
    welfare_unsafe_lq = (company_payoffs_unsafe_lq * (1 + consumer_surplus)
                         - (1-p)*externality
                         - g * pfo_l)
    welfare_vetted_lq = (company_payoffs_vetted_lq * (1 + consumer_surplus)
                         - (1-p)*externality
                         - g * pfo_l)
    welfares = [welfare_safe_hq,
                welfare_unsafe_hq,
                welfare_vetted_hq,
                welfare_safe_lq,
                welfare_unsafe_lq,
                welfare_vetted_lq]
    game_welfare = np.sum([welfare * state_frequency
                           for welfare, state_frequency in zip(welfares,
                                                               ergodic.T)],
                          axis=0)
    return {**models, "game_welfare": game_welfare}

@method(compute_game_welfare, "regulatory_markets_v1_reward_mixed")
def compute_game_welfare(models):
    "Compute the welfare generated by the game in each state."
    names = ['payoffs', 'ergodic']
    payoffs, ergodic = [models[k] for k in names]
    p, g, pfo_h, pfo_l = [models[k] for k in ['p', 'g', 'pfo_h', 'pfo_l']]
    consumer_surplus = models.get('consumer_surplus', 0)
    externality = models.get('externality', 0)
    mix = models['incentive_mix']

    company_payoffs_safe_hq = payoffs['4-1-1']["P1"] + payoffs['4-1-1']["P2"]
    company_payoffs_safe_lq = payoffs['5-1-1']["P1"] + payoffs['5-1-1']["P2"]
    company_payoffs_unsafe_hq = payoffs['4-2-2']["P1"] + payoffs['4-2-2']["P2"]
    company_payoffs_unsafe_lq = payoffs['5-2-2']["P1"] + payoffs['5-2-2']["P2"]
    company_payoffs_vetted_hq = payoffs['4-3-3']["P1"] + payoffs['4-3-3']["P2"]
    company_payoffs_vetted_lq = payoffs['5-3-3']["P1"] + payoffs['5-3-3']["P2"]

    welfare_safe_hq = (company_payoffs_safe_hq * (1 + consumer_surplus)
                       - g * mix)
    welfare_unsafe_hq = (company_payoffs_unsafe_hq * (1 + consumer_surplus)
                         - (1-p) * (1 - pfo_h**2) * externality
                         - g * (mix * pfo_h**2 + (1 - mix) * pfo_h))
    welfare_vetted_hq = (company_payoffs_vetted_hq * (1 + consumer_surplus)
                         - g * mix)
    welfare_unsafe_lq = (company_payoffs_unsafe_lq * (1 + consumer_surplus)
                         - (1-p) * (1 - pfo_l**2) * externality
                         - g * (mix * pfo_l**2 + (1 - mix) * pfo_l))
    welfare_safe_lq = (company_payoffs_safe_lq * (1 + consumer_surplus)
                       - g * mix)
    welfare_vetted_lq = (company_payoffs_vetted_lq * (1 + consumer_surplus)
                         - (1-p) * (1 - pfo_l**2) * externality
                         - g * (mix * pfo_l**2 + (1 - mix) * pfo_l))
    welfares = [welfare_safe_hq,
                welfare_unsafe_hq,
                welfare_vetted_hq,
                welfare_safe_lq,
                welfare_unsafe_lq,
                welfare_vetted_lq]
    game_welfare = np.sum([welfare * state_frequency
                           for welfare, state_frequency in zip(welfares,
                                                               ergodic.T)],
                          axis=0)
    return {**models, "game_welfare": game_welfare}
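
In each of the methods above, `game_welfare` is the per-state welfare weighted by the ergodic distribution over the six monomorphic states. The sketch below shows that weighted sum for a single parameter setting in plain Python. The source instead uses NumPy and zips the welfare list with `ergodic.T` so the same sum runs across many parameter settings at once. The helper name and all numbers here are hypothetical.

```python
def expected_welfare(welfares, state_frequencies):
    """Frequency-weighted sum of per-state welfare values."""
    if len(welfares) != len(state_frequencies):
        raise ValueError("one frequency is needed per state")
    return sum(w * f for w, f in zip(welfares, state_frequencies))


# Hypothetical welfare values for the six states, in the order used above:
# safe_hq, unsafe_hq, vetted_hq, safe_lq, unsafe_lq, vetted_lq
welfares = [10.0, 2.0, 10.0, 9.0, 1.0, 1.0]
# Hypothetical ergodic distribution over those states (sums to 1).
frequencies = [0.5, 0.1, 0.2, 0.1, 0.05, 0.05]

print(expected_welfare(welfares, frequencies))
```

The order of `welfares` has to match the column order of `ergodic`, since the pairing is purely positional.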
      

In [48]:
# | export
@method(build_payoffs, "regulatory_markets_v2_reward_mixed")
def build_payoffs(models):
    """Regulatory market payoffs when there is only a reward after catching
    unsafe firms."""
    names1 = ['b', 'c', 's', 'p', 'B', 'W']
    names2 = ['pfo_l', 'pfo_h', 'λ', 'r_l', 'r_h', 'g']
    b, c, s, p, B, W = [models[k] for k in names1]
    pfo_l, pfo_h, λ, r_l, r_h, g = [models[k] for k in names2]
    collective_risk = models.get('collective_risk', 0)
    risk_shared = (1 - (1-p)*collective_risk)
    mix = models.get('incentive_mix', 0)
    
    k = models.get('decisiveness', 100)
    phi_h = models.get('phi_h', (1 - pfo_h))
    phi_l = models.get('phi_l', (1 - pfo_l))
    caught_loses_h = ((s * phi_h)**k + 1)**(-1)
    caught_loses_l = ((s * phi_l)**k + 1)**(-1)
    
    Π_h11 = B / (2*W) + b/2 - c
    Π_h12 = ((1 - pfo_h) * b / (s+1) * risk_shared
             + pfo_h * caught_loses_h * (b + B / W)
             - c)
    Π_h21 = (p * (1 - pfo_h) * (s*b / (s + 1) + s * B / W)
             + pfo_h * (1 - caught_loses_h) * B / W)
    Π_h22 = (p * (1 - pfo_h**2) * (b/2 + s*B/(2*W)) * risk_shared
             + pfo_h**2 * B/(2*W))
    
    Π_l11 = B / (2*W) + b/2 - c
    Π_l12 = ((1 - pfo_l) * b / (s+1) * risk_shared
             + pfo_l * caught_loses_l * (b + B / W)
             - c)
    Π_l21 = (p * (1 - pfo_l) * (s*b / (s + 1) + s * B / W)
             + pfo_l * (1 - caught_loses_l) * B / W)
    Π_l22 = (p * (1 - pfo_l**2) * (b/2 + s*B/(2*W)) * risk_shared
             + pfo_l**2 * B/(2*W))
    
    λ_h = λ * (1 - p) * (1 - pfo_h)
    λ_l = λ * (1 - p) * (1 - pfo_l)
    
    Ω_11 = r_h + g * mix
    Ω_12 = r_l + g * mix
    Ω_21 = r_h + g * (pfo_h**2 * mix + pfo_h * (1 - mix)) - λ_h
    Ω_22 = r_l + g * (pfo_l**2 * mix + pfo_l * (1 - mix)) - λ_l
    Ω_31 = r_h + g * mix
    Ω_32 = r_l + g * (pfo_l**2 * mix + pfo_l * (1 - mix)) - λ_l
    
    payoffs = {}
    payoffs["4-1-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-1-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-1-3"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-2-1"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-2-2"] = {"P3": Ω_21,
                        "P2": Π_h22,
                        "P1": Π_h22}
    payoffs["4-2-3"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-3-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-3-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-3-3"] = {"P3": Ω_31,
                        "P2": Π_h11,
                        "P1": Π_h11}
    
    payoffs["5-1-1"] = {"P3": Ω_12,
                        "P2": Π_l11,
                        "P1": Π_l11}
    payoffs["5-1-2"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-1-3"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-2-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-2-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-2-3"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-3-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-3"] = {"P3": Ω_32,
                        "P2": Π_l22,
                        "P1": Π_l22}

    return {**models, "payoffs": payoffs}
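
The `caught_loses_*` terms above have the form of a Tullock contest success function with decisiveness `k`. A minimal sketch of how that expression behaves, assuming scalar inputs (the helper name `caught_loses` and the values of `s` and `phi` are illustrative only):

```python
import math

def caught_loses(s, phi, k=100):
    """The expression used above: ((s * phi)**k + 1)**(-1)."""
    return ((s * phi) ** k + 1) ** (-1)

s = 2.0  # speed advantage of an unsafe firm (illustrative value)

even = caught_loses(s, phi=1 / s)   # s * phi == 1: a coin flip
slowed = caught_loses(s, phi=0.25)  # s * phi < 1: caught firm almost surely loses
fast = caught_loses(s, phi=1.0)     # s * phi > 1: caught firm almost surely wins

print(even, slowed, fast)
assert math.isclose(even, 0.5)
```

With the default `k = 100` the function is close to a step at `s * phi = 1`, so small changes in `phi` around `1 / s` flip the outcome. Lower values of `k` would give a smoother transition.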


In [49]:
# | export
@method(build_payoffs, "regulatory_markets_v3_reward_mixed")
def build_payoffs(models):
    """Regulatory market payoffs when there is only a reward after catching
    unsafe firms."""
    names1 = ['b', 'c', 's', 'p', 'B', 'W']
    names2 = ['pfo_l', 'pfo_h', 'λ', 'r_l', 'r_h', 'g']
    b, c, s, p, B, W = [models[k] for k in names1]
    pfo_l, pfo_h, λ, r_l, r_h, g = [models[k] for k in names2]
    collective_risk = models.get('collective_risk', 0)
    risk_shared = (1 - (1-p)*collective_risk)
    mix = models.get('incentive_mix', 0)
    
    k = models.get('decisiveness', 100)
    phi_h = models.get('phi_h', 1/s)
    phi_l = models.get('phi_l', 1/s)
    caught_loses_h = ((s * phi_h)**k + 1)**(-1)
    caught_loses_l = ((s * phi_l)**k + 1)**(-1)
    
    Π_h11 = B / (2*W) + b/2 - c
    Π_h12 = ((1 - pfo_h) * b / (s+1) * risk_shared
             + pfo_h * caught_loses_h * (b + B / W)
             - c)
    Π_h21 = (p * (1 - pfo_h) * (s*b / (s + 1) + s * B / W)
             + pfo_h * (1 - caught_loses_h) * B / W)
    Π_h22 = p * (1 - pfo_h**2) * (b/2 + s*B/(2*W)) * risk_shared
    
    Π_l11 = B / (2*W) + b/2 - c
    Π_l12 = ((1 - pfo_l) * b / (s+1) * risk_shared
             + pfo_l * caught_loses_l * (b + B / W)
             - c)
    Π_l21 = (p * (1 - pfo_l) * (s*b / (s + 1) + s * B / W)
             + pfo_l * (1 - caught_loses_l) * B / W)
    Π_l22 = p * (1 - pfo_l**2) * (b/2 + s*B/(2*W)) * risk_shared
    
    λ_h = λ * (1 - p) * (1 - pfo_h)
    λ_l = λ * (1 - p) * (1 - pfo_l)
    
    Ω_11 = r_h + g * mix
    Ω_12 = r_l + g * mix
    Ω_21 = r_h + g * (pfo_h**2 * mix + pfo_h * (1 - mix)) - λ_h
    Ω_22 = r_l + g * (pfo_l**2 * mix + pfo_l * (1 - mix)) - λ_l
    Ω_31 = r_h + g * mix
    Ω_32 = r_l + g * (pfo_l**2 * mix + pfo_l * (1 - mix)) - λ_l
    
    payoffs = {}
    payoffs["4-1-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-1-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-1-3"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-2-1"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-2-2"] = {"P3": Ω_21,
                        "P2": Π_h22,
                        "P1": Π_h22}
    payoffs["4-2-3"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-3-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-3-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-3-3"] = {"P3": Ω_31,
                        "P2": Π_h11,
                        "P1": Π_h11}
    
    payoffs["5-1-1"] = {"P3": Ω_12,
                        "P2": Π_l11,
                        "P1": Π_l11}
    payoffs["5-1-2"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-1-3"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-2-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-2-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-2-3"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-3-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-3"] = {"P3": Ω_32,
                        "P2": Π_l22,
                        "P1": Π_l22}

    return {**models, "payoffs": payoffs}
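
The regulator payoffs `Ω_21`, `Ω_22` and `Ω_32` above share one expression in which `incentive_mix` interpolates between a term in `pfo` and a term in `pfo**2`. A minimal sketch of that expression for scalar inputs (the function name and the numbers are illustrative, not part of the module):

```python
import math

def regulator_payoff_unsafe(r, g, pfo, mix, lam, p):
    """r + g * (pfo**2 * mix + pfo * (1 - mix)) - lam * (1 - p) * (1 - pfo)."""
    incentive = g * (pfo**2 * mix + pfo * (1 - mix))
    risk_penalty = lam * (1 - p) * (1 - pfo)
    return r + incentive - risk_penalty

params = dict(r=1.0, g=2.0, pfo=0.5, lam=4.0, p=0.5)
payoff_mix0 = regulator_payoff_unsafe(mix=0.0, **params)  # incentive = g * pfo
payoff_mix1 = regulator_payoff_unsafe(mix=1.0, **params)  # incentive = g * pfo**2
print(payoff_mix0, payoff_mix1)
```

Because `pfo` lies in [0, 1], `pfo**2 <= pfo`, so for `g > 0` a larger `mix` never increases the regulator's payoff in these states. In the all-safe states the incentive term is `g * mix` instead, which moves in the opposite direction.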


In [50]:
# | export
@method(build_payoffs, "regulatory_markets_v4_reward_mixed")
def build_payoffs(models):
    """Regulatory market payoffs for a mix of incentives and allows a
    schedule of measures to apply to firms detected as unsafe."""
    names1 = ['b', 'c', 's', 'p', 'B', 'W']
    names2 = ['pfo_l', 'pfo_h', 'λ', 'r_l', 'r_h', 'g']
    b, c, s, p, B, W = [models[k] for k in names1]
    pfo_l, pfo_h, λ, r_l, r_h, g = [models[k] for k in names2]
    collective_risk = models.get('collective_risk', 0)
    risk_shared = (1 - (1-p)*collective_risk)
    mix = models.get('incentive_mix', 0)
    
    k = models.get('decisiveness', 100)
    phi_h = models.get('phi_h', 1/s)
    phi2_h = models.get('phi2_h', 1/s)
    phi_l = models.get('phi_l', 1/s)
    phi2_l = models.get('phi2_l', 1/s)
    caught_loses_h = ((s * phi_h)**k + 1)**(-1)
    caught_loses_l = ((s * phi_l)**k + 1)**(-1)
    both_caught_lose_h = ((s * phi2_h)**k + 1)**(-1)
    both_caught_lose_l = ((s * phi2_l)**k + 1)**(-1)
    
    Π_h11 = B / (2*W) + b/2 - c
    Π_h12 = ((1 - pfo_h) * b / (s+1) * risk_shared
             + pfo_h * caught_loses_h * (b + B / W)
             - c)
    Π_h21 = (p * (1 - pfo_h) * (s*b / (s + 1) + s * B / W)
             + pfo_h * (1 - caught_loses_h) * B / W)
    Π_h22 = (p * (1 - pfo_h**2) * (b/2 + s*B/(2*W)) * risk_shared
             + pfo_h**2 * both_caught_lose_h * B/(2*W))
    
    Π_l11 = B / (2*W) + b/2 - c
    Π_l12 = ((1 - pfo_l) * b / (s+1) * risk_shared
             + pfo_l * caught_loses_l * (b + B / W)
             - c)
    Π_l21 = (p * (1 - pfo_l) * (s*b / (s + 1) + s * B / W)
             + pfo_l * (1 - caught_loses_l) * B / W)
    Π_l22 = (p * (1 - pfo_l**2) * (b/2 + s*B/(2*W)) * risk_shared
             + pfo_l**2 * both_caught_lose_l * B/(2*W))
    
    λ_h = λ * (1 - p) * (1 - pfo_h)
    λ_l = λ * (1 - p) * (1 - pfo_l)
    
    Ω_11 = r_h + g * mix
    Ω_12 = r_l + g * mix
    Ω_21 = r_h + g * (pfo_h**2 * mix + pfo_h * (1 - mix)) - λ_h
    Ω_22 = r_l + g * (pfo_l**2 * mix + pfo_l * (1 - mix)) - λ_l
    Ω_31 = r_h + g * mix
    Ω_32 = r_l + g * (pfo_l**2 * mix + pfo_l * (1 - mix)) - λ_l
    
    payoffs = {}
    payoffs["4-1-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-1-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-1-3"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-2-1"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-2-2"] = {"P3": Ω_21,
                        "P2": Π_h22,
                        "P1": Π_h22}
    payoffs["4-2-3"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-3-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-3-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-3-3"] = {"P3": Ω_31,
                        "P2": Π_h11,
                        "P1": Π_h11}
    
    payoffs["5-1-1"] = {"P3": Ω_12,
                        "P2": Π_l11,
                        "P1": Π_l11}
    payoffs["5-1-2"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-1-3"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-2-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-2-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-2-3"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-3-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-3"] = {"P3": Ω_32,
                        "P2": Π_l22,
                        "P1": Π_l22}

    return {**models, "payoffs": payoffs}
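
The payoff dictionaries above use keys of the form `"sector-P2 strategy-P1 strategy"`, and swapping the two firms' strategies swaps their payoffs while leaving the regulator's payoff (`P3`) unchanged. A minimal sketch of a sanity check for that property, assuming every mirrored key is present (the helper `is_mirror_symmetric` and the toy values are hypothetical):

```python
import numpy as np

def is_mirror_symmetric(payoffs):
    """Check that swapping the firms' strategies swaps their payoffs."""
    for key, entry in payoffs.items():
        sector, p2_strategy, p1_strategy = key.split("-")
        mirrored = payoffs[f"{sector}-{p1_strategy}-{p2_strategy}"]
        if not (np.allclose(entry["P1"], mirrored["P2"])
                and np.allclose(entry["P2"], mirrored["P1"])
                and np.allclose(entry["P3"], mirrored["P3"])):
            return False
    return True

# Toy table with one sector and two firm strategies.
toy = {
    "4-1-1": {"P3": 1.0, "P2": 3.0, "P1": 3.0},
    "4-1-2": {"P3": 0.5, "P2": 0.0, "P1": 5.0},
    "4-2-1": {"P3": 0.5, "P2": 5.0, "P1": 0.0},
    "4-2-2": {"P3": 0.0, "P2": 1.0, "P1": 1.0},
}
symmetric = is_mirror_symmetric(toy)

# Corrupt one entry to show the check failing.
broken = {**toy, "4-2-1": {"P3": 0.5, "P2": 5.0, "P1": 9.0}}
asymmetric = is_mirror_symmetric(broken)
print(symmetric, asymmetric)
```

A check like this can catch copy-paste slips when a new version of the payoff table is written out by hand. It uses `np.allclose` so that it works for both scalars and arrays of parameter values.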


In [51]:
# | export
@method(build_payoffs, "regulatory_markets_v5_reward_mixed")
def build_payoffs(models):
    """Regulatory market payoffs for a mix of incentives and allows a
    schedule of measures to apply to firms detected as unsafe. Investigating
    a potential correction to the probability of a firm winning."""
    names1 = ['b', 'c', 's', 'p', 'B', 'W']
    names2 = ['pfo_l', 'pfo_h', 'λ', 'r_l', 'r_h', 'g']
    b, c, s, p, B, W = [models[k] for k in names1]
    pfo_l, pfo_h, λ, r_l, r_h, g = [models[k] for k in names2]
    collective_risk = models.get('collective_risk', 0)
    risk_shared = (1 - (1-p)*collective_risk)
    mix = models.get('incentive_mix', 0)
    
    k = models.get('decisiveness', 100)
    # Speed impact of regulators when they catch 1 or 2 safety violators
    phi_h = models.get('phi_h', 1/s)
    phi2_h = models.get('phi2_h', 1/s)
    phi_l = models.get('phi_l', 1/s)
    phi2_l = models.get('phi2_l', 1/s)
    # Tullock contest to determine which firm wins after
    # one safety violator is caught
    caught_loses_h = ((s * phi_h)**k + 1)**(-1)
    caught_loses_l = ((s * phi_l)**k + 1)**(-1)
    # Tullock contest to determine whether any firm wins if they are both
    # safety violators who were caught by the regulator
    both_caught_fail_h = ((s * phi2_h)**k + 1)**(-1)
    both_caught_fail_l = ((s * phi2_l)**k + 1)**(-1)
    
    Π_h11 = B / (2*W) + b/2 - c
    Π_h12 = ((1 - pfo_h) * b / (s+1) * risk_shared
             + pfo_h * caught_loses_h * (b + B / W)
             - c)
    Π_h21 = (p * (1 - pfo_h) * (s*b / (s + 1) + s * B / W)
             + (pfo_h * (1 - caught_loses_h)
                * B / W))
    Π_h22 = (p * (1 - pfo_h**2) * (b/2 + s*B/(2*W)) * risk_shared
             + (pfo_h**2 * (1 - both_caught_fail_h)
                * B/(2*W)))
    
    Π_l11 = B / (2*W) + b/2 - c
    Π_l12 = ((1 - pfo_l) * b / (s+1) * risk_shared
             + pfo_l * caught_loses_l * (b + B / W)
             - c)
    Π_l21 = (p * (1 - pfo_l) * (s*b / (s + 1) + s * B / W)
             + (pfo_l * (1 - caught_loses_l)
                * B / W))
    Π_l22 = (p * (1 - pfo_l**2) * (b/2 + s*B/(2*W)) * risk_shared
             + (pfo_l**2 * (1 - both_caught_fail_l)
                * B/(2*W)))
    
    λ_h = λ * (1 - p) * (1 - pfo_h)
    λ_l = λ * (1 - p) * (1 - pfo_l)
    
    Ω_11 = r_h + g * mix
    Ω_12 = r_l + g * mix
    Ω_21 = r_h + g * (pfo_h**2 * mix + pfo_h * (1 - mix)) - λ_h
    Ω_22 = r_l + g * (pfo_l**2 * mix + pfo_l * (1 - mix)) - λ_l
    Ω_31 = r_h + g * mix
    Ω_32 = r_l + g * (pfo_l**2 * mix + pfo_l * (1 - mix)) - λ_l
    
    payoffs = {}
    payoffs["4-1-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-1-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-1-3"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-2-1"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-2-2"] = {"P3": Ω_21,
                        "P2": Π_h22,
                        "P1": Π_h22}
    payoffs["4-2-3"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-3-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-3-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-3-3"] = {"P3": Ω_31,
                        "P2": Π_h11,
                        "P1": Π_h11}
    
    payoffs["5-1-1"] = {"P3": Ω_12,
                        "P2": Π_l11,
                        "P1": Π_l11}
    payoffs["5-1-2"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-1-3"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-2-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-2-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-2-3"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-3-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-3"] = {"P3": Ω_32,
                        "P2": Π_l22,
                        "P1": Π_l22}

    return {**models, "payoffs": payoffs}
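
The v4 and v5 models differ in how the Tullock term enters the payoff when both unsafe firms are caught: v4 multiplies the shared prize by the Tullock value, v5 by its complement. A minimal sketch of the two terms side by side, leaving out the common `pfo**2` factor and using illustrative values for `s`, `B` and `W`:

```python
import math

def tullock(s, phi, k=100):
    """((s * phi)**k + 1)**(-1), as used for the both-caught case above."""
    return ((s * phi) ** k + 1) ** (-1)

s, B, W = 2.0, 10_000.0, 100.0
half_prize = B / (2 * W)

terms = {}
for phi2 in (0.25, 1 / s, 1.0):
    t = tullock(s, phi2)
    v4_term = t * half_prize        # v4: both_caught_lose * B/(2*W)
    v5_term = (1 - t) * half_prize  # v5: (1 - both_caught_fail) * B/(2*W)
    terms[phi2] = (v4_term, v5_term)
    print(f"phi2={phi2}: v4={v4_term:.3f}, v5={v5_term:.3f}")
```

The two versions agree at the default `phi2 = 1 / s` and move in opposite directions away from it, which is one way to see what the correction investigated in v5 changes.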


In [52]:
# | export
@method(build_payoffs, "regulatory_markets_v6_reward_mixed")
def build_payoffs(models):
    """Regulatory market payoffs for a mix of incentives and allows a
    schedule of measures to apply to firms detected as unsafe. The speed of
    caught safety violators is not necessarily 1 and the risk is not
    necessarily 0."""
    names1 = ['b', 'c', 's', 'p', 'B', 'W']
    names2 = ['pfo_l', 'pfo_h', 'λ', 'r_l', 'r_h', 'g']
    b, c, s, p, B, W = [models[k] for k in names1]
    pfo_l, pfo_h, λ, r_l, r_h, g = [models[k] for k in names2]
    collective_risk = models.get('collective_risk', 0)
    risk_shared = (1 - (1-p)*collective_risk)
    mix = models.get('incentive_mix', 0)
    
    k = models.get('decisiveness', 100)
    # Win impact of regulators when they catch 1 or 2 safety violators
    phi_h = models.get('phi_h', 1/s)
    phi2_h = models.get('phi2_h', 1/s)
    phi_l = models.get('phi_l', 1/s)
    phi2_l = models.get('phi2_l', 1/s)
    # Speed impact of regulators when they catch 1 or 2 safety violators
    theta_h = models.get('theta_h', 1/s)
    theta2_h = models.get('theta2_h', 1/s)
    theta_l = models.get('theta_l', 1/s)
    theta2_l = models.get('theta2_l', 1/s)
    # Risk impact of regulators when they catch 1 or 2 safety violators
    gamma_h = models.get('gamma_h', phi_h)
    gamma2_h = models.get('gamma2_h', phi2_h)
    gamma_l = models.get('gamma_l', phi_l)
    gamma2_l = models.get('gamma2_l', phi2_l)
    # Tullock contest to determine which firm wins after
    # one safety violator is caught
    caught_loses_h = ((s * phi_h)**k + 1)**(-1)
    caught_loses_l = ((s * phi_l)**k + 1)**(-1)
    # Tullock contest to determine whether any firm wins if they are both
    # safety violators who were caught by the regulator
    both_caught_fail_h = ((s * phi2_h)**k + 1)**(-1)
    both_caught_fail_l = ((s * phi2_l)**k + 1)**(-1)
    risk_shared_reg2_h = (1 - (1-p)*collective_risk * gamma2_h)
    risk_shared_reg2_l = (1 - (1-p)*collective_risk * gamma2_l)
    
    Π_h11 = B / (2*W) + b/2 - c
    Π_h12 = ((1 - pfo_h) * b / (s+1) * risk_shared
             + pfo_h * caught_loses_h * (b + B / W)
             - c)
    Π_h21 = (p * (1 - pfo_h) * (s*b / (s + 1) + s * B / W)
             + ((1 - (1 - p) * gamma_h)
                * pfo_h * (1 - caught_loses_h)
                * theta_h * s
                * B / W))
    Π_h22 = (p * (1 - pfo_h**2) * (b/2 + s*B/(2*W)) * risk_shared
             + ((1 - (1 - p) * gamma2_h) * risk_shared_reg2_h
                * pfo_h**2 * (1 - both_caught_fail_h)
                * theta2_h * s
                * B/(2*W)))
    
    Π_l11 = B / (2*W) + b/2 - c
    Π_l12 = ((1 - pfo_l) * b / (s+1) * risk_shared
             + pfo_l * caught_loses_l * (b + B / W)
             - c)
    Π_l21 = (p * (1 - pfo_l) * (s*b / (s + 1) + s * B / W)
             + ((1 - (1 - p) * gamma_l)
                * pfo_l * (1 - caught_loses_l)
                * theta_l * s
                * B / W))
    Π_l22 = (p * (1 - pfo_l**2) * (b/2 + s*B/(2*W)) * risk_shared
             + ((1 - (1 - p) * gamma2_l) * risk_shared_reg2_l
                * pfo_l**2 * (1 - both_caught_fail_l)
                * theta2_l * s
                * B/(2*W)))
    
    λ_h = λ * (1 - p) * (1 - pfo_h)
    λ_l = λ * (1 - p) * (1 - pfo_l)
    
    Ω_11 = r_h + g * mix
    Ω_12 = r_l + g * mix
    Ω_21 = r_h + g * (pfo_h**2 * mix + pfo_h * (1 - mix)) - λ_h
    Ω_22 = r_l + g * (pfo_l**2 * mix + pfo_l * (1 - mix)) - λ_l
    Ω_31 = r_h + g * mix
    Ω_32 = r_l + g * (pfo_l**2 * mix + pfo_l * (1 - mix)) - λ_l
    
    payoffs = {}
    payoffs["4-1-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-1-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-1-3"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-2-1"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-2-2"] = {"P3": Ω_21,
                        "P2": Π_h22,
                        "P1": Π_h22}
    payoffs["4-2-3"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h21,
                        "P1": Π_h12}
    payoffs["4-3-1"] = {"P3": Ω_11,
                        "P2": Π_h11,
                        "P1": Π_h11}
    payoffs["4-3-2"] = {"P3": (Ω_11 + Ω_21) / 2,
                        "P2": Π_h12,
                        "P1": Π_h21}
    payoffs["4-3-3"] = {"P3": Ω_31,
                        "P2": Π_h11,
                        "P1": Π_h11}
    
    payoffs["5-1-1"] = {"P3": Ω_12,
                        "P2": Π_l11,
                        "P1": Π_l11}
    payoffs["5-1-2"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-1-3"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l12,
                        "P1": Π_l21}
    payoffs["5-2-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-2-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-2-3"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-1"] = {"P3": (Ω_12 + Ω_22) / 2,
                        "P2": Π_l21,
                        "P1": Π_l12}
    payoffs["5-3-2"] = {"P3": Ω_22,
                        "P2": Π_l22,
                        "P1": Π_l22}
    payoffs["5-3-3"] = {"P3": Ω_32,
                        "P2": Π_l22,
                        "P1": Π_l22}

    return {**models, "payoffs": payoffs}
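
The optional parameters above are read with `dict.get`, so a key that does not match exactly falls back to its default without raising an error, and the `gamma_*` defaults inherit whatever value `phi_*` resolved to. A minimal sketch of both behaviours, with illustrative values and a deliberately misspelt key:

```python
models = {"s": 2.0, "phi_h": 0.8}
s = models["s"]

phi_h = models.get("phi_h", 1 / s)      # key present: uses the supplied 0.8
phi_typo = models.get("phi_h,", 1 / s)  # stray comma in the key: default 1/s
gamma_h = models.get("gamma_h", phi_h)  # key absent: inherits phi_h

print(phi_h, phi_typo, gamma_h)
```

Because a misspelt key is silent, it can be worth comparing the keys supplied in `models` against the names the function expects before running a large parameter sweep.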


In [None]:
# | export
@method(build_payoffs, "group_competition_v1")
def build_payoffs(models):
    """A group competition model of blocs competing over AI when using
    a regulatory market policy."""
    
    # Payoffs for an economic bloc are the sum of expected EGT payoffs for all
    # AI labs in their economy given the bloc's choice of g, phi (and λ).
    # Blocs engage in pairwise interactions with other blocs where they lose
    # if they have the lower payoff. For simplicity, we capture this using
    # social learning only. Bloc payoffs are independent.
    
    # We first need to define the model of interactions between AI labs and
    # auditors in each bloc.
    models_inner = models["models_inner"]

    # The user should define a list of policy_bundles to evaluate the bloc
    # payoffs for.
    policy_bundles = models['policy_bundles']
    bloc_payoffs = []
    for policy in policy_bundles:
        results = thread_macro({**models_inner, **policy},
                               payoffs_sr_pfo_extension,
                               create_profiles,
                               apply_profile_filters,
                               build_payoffs,
                               build_transition_matrix,
                               find_ergodic_distribution,
                               calculate_sd_helper,
                               compute_game_welfare,
                               )
        bloc_payoffs.append(results['game_welfare'])
    
    # The bloc interaction is a 2 player game where players choose among 4
    # strategies.
    
     
    payoffs = {}
    for i in range(len(bloc_payoffs)):
        for j in range(len(bloc_payoffs)):
            payoffs[f"{j+1}-{i+1}"] = {"P2": bloc_payoffs[j],
                                       "P1": bloc_payoffs[i]}

    return {**models, "payoffs": payoffs}
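
The final loop of the function above turns a list of per-policy welfare values into a two-player payoff table in which each bloc's payoff depends only on its own policy. A minimal sketch of that construction, using made-up welfare values in place of the `game_welfare` results:

```python
# Hypothetical welfare for three policy bundles (stand-ins for the values
# that the inner model would return).
bloc_payoffs = [1.0, 2.5, 0.3]

payoffs = {}
for i in range(len(bloc_payoffs)):
    for j in range(len(bloc_payoffs)):
        # Key order follows the other tables: P2's strategy, then P1's.
        payoffs[f"{j+1}-{i+1}"] = {"P2": bloc_payoffs[j],
                                   "P1": bloc_payoffs[i]}

print(len(payoffs), payoffs["2-1"])
```

With `n` policy bundles the table has `n * n` entries, and the payoff to a bloc is the same in every entry where it plays the same policy, which matches the comment that bloc payoffs are independent.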


# Payoff Matrices (part 4)

> This module contains payoff matrices for different evolutionary games
>
> Part 4 contains payoff matrices for the following games
> - Game 1

# Payoffs Part 5

In [53]:
#| hide
import nbdev; nbdev.nbdev_export()