In [1]:
import numpy as np
import matplotlib.pyplot as plt

<br>

# Game forms
---

<br>

### Normal form

Represent the game as a matrix (for 2 players): the pure strategies of player 1 label the rows and those of player 2 label the columns (in a one-stage game, a strategy is just a single action). Each cell shows the payoffs of player 1 and player 2 for that pair of strategies, separated by a slash:

<div style="width:80%">
    <div style="width:33%;float:left;">
        <div style="text-align:center">Prisoner's dilemma</div>
        <table style="text-align:center;width:150px;margin-left:auto;margin-right:auto">
          <tr>
            <th></th>
            <th>D</th>
            <th>S</th>
          </tr>
          <tr>
            <th>D</th>
            <td>-4/-4</td>
            <td>0/-5</td>
          </tr>
          <tr>
            <th>S</th>
            <td>-5/0</td>
            <td>-1/-1</td>
          </tr>
        </table>
    </div>
    <div style="width:33%;float:left;">
        <div style="text-align:center">Matching pennies</div>
        <table style="text-align:center;width:150px;margin-left:auto;margin-right:auto">
          <tr>
            <th></th>
            <th>A</th>
            <th>B</th>
          </tr>
          <tr>
            <th>A</th>
            <td>1/-1</td>
            <td>-1/1</td>
          </tr>
          <tr>
            <th>B</th>
            <td>-1/1</td>
            <td>1/-1</td>
          </tr>
        </table>
    </div>
    <div style="width:33%;float:right;">
        <div style="text-align:center">Battle of the sexes</div>
        <table style="text-align:center;width:150px;margin-left:auto;margin-right:auto">
          <tr>
            <th></th>
            <th>O</th>
            <th>F</th>
          </tr>
          <tr>
            <th>O</th>
            <td>2/1</td>
            <td>0/0</td>
          </tr>
          <tr>
            <th>F</th>
            <td>0/0</td>
            <td>1/2</td>
          </tr>
        </table>
    </div>
</div>

In the prisoner's dilemma, **D** is the denounce option and **S** is the silence option. In matching pennies, player 1 wins if both actions match and loses otherwise. The Battle of the Sexes is a coordination game in which both players want to go to the same place, either **O** for the opera or **F** for the football club.
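
As a minimal sketch, the three tables above can be stored as pairs of NumPy arrays, one array per player. The variable names and the `payoff` helper are illustrative choices, not part of the original notes; rows are assumed to be player 1's actions and columns player 2's, in the same order as in the tables.

```python
import numpy as np

# Each game is a pair (payoffs of player 1, payoffs of player 2).
# Rows: actions of player 1. Columns: actions of player 2.
prisoners_dilemma = (
    np.array([[-4, 0], [-5, -1]]),   # order of actions: D, S
    np.array([[-4, -5], [0, -1]]),
)
matching_pennies = (
    np.array([[1, -1], [-1, 1]]),    # order of actions: A, B
    np.array([[-1, 1], [1, -1]]),
)
battle_of_sexes = (
    np.array([[2, 0], [0, 1]]),      # order of actions: O, F
    np.array([[1, 0], [0, 2]]),
)

def payoff(game, row, col):
    """Return the (player 1, player 2) payoffs for a pair of pure strategies."""
    p1, p2 = game
    return int(p1[row, col]), int(p2[row, col])

# Player 1 denounces (row 0) while player 2 stays silent (column 1).
print(payoff(prisoners_dilemma, 0, 1))  # (0, -5)
```

Keeping one array per player makes it easy to take row-wise or column-wise maxima later, at the cost of having to keep the two arrays aligned.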

<br>

### Extended form

The extended form of a game (more commonly called the **extensive form**) is a **game tree** representation, particularly useful when the game is made of several stages, or when the actions are sequential rather than simultaneous.

* Each circled node corresponds to a player's choice of move (or to a chance node if chance is involved)
* Each leaf corresponds to a payoff (the cumulative result at the end of the game)
* Labels along the edges represent the actions performed by the players
* Dashed lines between nodes represent uncertainty about which of the states we are in

![title](img/ExtendedForm.png)

An **information set** is the set of nodes that a player cannot tell apart when choosing its next action: it captures the information available to the player at that stage of the game. A singleton information set contains a single node, meaning the player knows exactly where it is in the tree (it has full knowledge of the past). Dashed lines represent uncertainty about the state and correspond to information sets that contain several nodes.

Erasing information about past moves allows **simultaneous move** games to be encoded in extended form, while preserving the state information allows us to represent **sequential move** games.

*Note: each extended form has a unique equivalent normal form (where each strategy on the side is a complete plan, giving one action for each information set of the player), but the converse does not hold: a normal form can correspond to several extended forms.*

<br>

### Types of games

A **perfect information game** is a game whose extended form contains no uncertainty regarding the information sets (each player knows the past) and no chance node. Games that do not fulfil these requirements are called **imperfect information games**.

**Static games** are games in which the actions of one player cannot influence the actions of the other players. Typically, these are one-stage games with simultaneous moves like the ones shown above. **Dynamic games** are multi-stage games in which the available actions depend on the past (for example Chess or Go). These games are necessarily at least partially sequential.

**Multi-stage games** are games composed of a series of independent sub-games (for instance, playing the prisoner's dilemma followed by matching pennies). **Repeated games** are multi-stage games in which the same sub-game is repeated.

<br>

# Solving one-stage static games
---

<br>

### Strategy and Mixed Strategy

A **pure strategy** for a game is a function that returns the action to play for each information set. A **mixed strategy** is a probability distribution over pure strategies. A **behavioral strategy** is a function that returns, for each information set, a probability distribution over the actions to play. It turns out that mixed strategies and behavioral strategies are mostly equivalent: in games with perfect recall, we can always represent one with the other.
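
To make the definition concrete, here is a small sketch for one-stage games, where a mixed strategy reduces to a probability vector over actions. The function name `expected_payoffs` and the argument names are assumptions made for this illustration; the payoff arrays are those of matching pennies.

```python
import numpy as np

def expected_payoffs(p1, p2, x, y):
    """Expected payoffs of both players under mixed strategies x and y.

    x[i] is the probability that player 1 plays row i,
    y[j] is the probability that player 2 plays column j.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Sum over all cells of P(row i) * P(column j) * payoff(i, j).
    return float(x @ p1 @ y), float(x @ p2 @ y)

# Matching pennies, actions ordered A, B.
mp1 = np.array([[1, -1], [-1, 1]])
mp2 = np.array([[-1, 1], [1, -1]])

# A pure strategy is the special case of a one-hot probability vector.
print(expected_payoffs(mp1, mp2, [1, 0], [1, 0]))          # (1.0, -1.0)
print(expected_payoffs(mp1, mp2, [0.5, 0.5], [0.5, 0.5]))  # (0.0, 0.0)
```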

A **Pareto optimum** of a game is a combination of strategies whose outcome cannot be improved for one player without making another player worse off. For instance, in the prisoner's dilemma, both players staying silent is a Pareto optimum, and the one with the best joint outcome. Unfortunately, strategies that lead to this optimum do not necessarily correspond to what players would actually play, because each player pursuing their own best interest does not necessarily lead to the overall good.

<br>

### Dominance and Best response

A **strictly dominated** strategy is one for which another strategy is strictly better whatever the opponents play. A player should never play a strictly dominated strategy, at least in single-stage games (we will see later for multi-stage games). Recursively eliminating strictly dominated strategies can sometimes leave a single pure strategy for each player:

<div style="width:80%">
    <div style="width:33%;float:left;">
        <table style="text-align:center;width:150px;margin-left:auto;margin-right:auto">
          <tr>
            <th></th>
            <th>A</th>
            <th>B</th>
          </tr>
          <tr>
            <th>C</th>
            <td>4/2</td>
            <td>5/3</td>
          </tr>
          <tr>
            <th>D</th>
            <td>5/4</td>
            <td>6/0</td>
          </tr>
        </table>
    </div>
    <div style="width:33%;float:left;">
        <table style="text-align:center;width:150px;margin-left:auto;margin-right:auto">
          <tr>
            <th></th>
            <th>A</th>
            <th>B</th>
          </tr>
          <tr>
            <th><strike>&nbsp;C&nbsp;</strike></th>
            <td><strike>4/2</strike></td>
            <td><strike>5/3</strike></td>
          </tr>
          <tr>
            <th>D</th>
            <td>5/4</td>
            <td>6/0</td>
          </tr>
        </table>
    </div>
    <div style="width:33%;float:right;">
        <table style="text-align:center;width:150px;margin-left:auto;margin-right:auto">
          <tr>
            <th></th>
            <th>A</th>
            <th><strike>&nbsp;B&nbsp;</strike></th>
          </tr>
          <tr>
            <th><strike>&nbsp;C&nbsp;</strike></th>
            <td><strike>4/2</strike></td>
            <td><strike>5/3</strike></td>
          </tr>
          <tr>
            <th>D</th>
            <td>5/4</td>
            <td><strike>6/0</strike></td>
          </tr>
        </table>
    </div>
</div>

A **best response** to an opponent's strategy is a strategy whose payoff against it is at least as good as that of any other available strategy. After removing all the strictly dominated strategies, each remaining strategy is a best response to some strategy of the opponent.
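
The elimination shown in the tables above can be sketched as follows. This is a minimal version that only checks domination by another pure strategy; the function name and the array layout (rows for player 1, columns for player 2) are assumptions of this example.

```python
import numpy as np

def iterated_elimination(p1, p2):
    """Iteratively remove strictly dominated pure strategies.

    p1[i, j] and p2[i, j] are the payoffs of players 1 and 2 when player 1
    plays row i and player 2 plays column j.
    Returns the lists of surviving row and column indices.
    """
    rows = list(range(p1.shape[0]))
    cols = list(range(p1.shape[1]))
    changed = True
    while changed:
        changed = False
        # A row is dominated if another row is strictly better in every remaining column.
        for r in list(rows):
            if any(all(p1[o, c] > p1[r, c] for c in cols) for o in rows if o != r):
                rows.remove(r)
                changed = True
        # A column is dominated if another column is strictly better in every remaining row.
        for c in list(cols):
            if any(all(p2[r, o] > p2[r, c] for r in rows) for o in cols if o != c):
                cols.remove(c)
                changed = True
    return rows, cols

# Game from the tables above: rows C, D for player 1 and columns A, B for player 2.
p1 = np.array([[4, 5], [5, 6]])
p2 = np.array([[2, 3], [4, 0]])
print(iterated_elimination(p1, p2))  # ([1], [0]), i.e. player 1 plays D and player 2 plays A
```

Note that column B only becomes dominated once row C has been removed, which is why the elimination has to be repeated until nothing changes.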

<br>

### Pure strategy Nash equilibrium

A Nash equilibrium corresponds to a pair of strategies that are **stable**:

* if player 1 believes that player 2 will play their equilibrium strategy, then player 1 has no better response than their own
* if player 2 believes that player 1 will play their equilibrium strategy, then player 2 has no better response than their own

There might be one pure strategy Nash equilibrium (D-D for the prisoner's dilemma), several pure strategy Nash equilibria (O-O or F-F for the Battle of the Sexes) or no pure strategy Nash equilibrium (in matching pennies):

<div style="width:80%">
    <div style="width:33%;float:left;">
        <div style="text-align:center">Prisoner's dilemma</div>
        <table style="text-align:center;width:150px;margin-left:auto;margin-right:auto">
          <tr>
            <th></th>
            <th>D</th>
            <th>S</th>
          </tr>
          <tr>
            <th>D</th>
            <td><b>-4/-4</b></td>
            <td>0/-5</td>
          </tr>
          <tr>
            <th>S</th>
            <td>-5/0</td>
            <td>-1/-1</td>
          </tr>
        </table>
    </div>
    <div style="width:33%;float:left;">
        <div style="text-align:center">Matching pennies</div>
        <table style="text-align:center;width:150px;margin-left:auto;margin-right:auto">
          <tr>
            <th></th>
            <th>A</th>
            <th>B</th>
          </tr>
          <tr>
            <th>A</th>
            <td>1/-1</td>
            <td>-1/1</td>
          </tr>
          <tr>
            <th>B</th>
            <td>-1/1</td>
            <td>1/-1</td>
          </tr>
        </table>
    </div>
    <div style="width:33%;float:right;">
        <div style="text-align:center">Battle of the sexes</div>
        <table style="text-align:center;width:150px;margin-left:auto;margin-right:auto">
          <tr>
            <th></th>
            <th>O</th>
            <th>F</th>
          </tr>
          <tr>
            <th>O</th>
            <td><b>2/1</b></td>
            <td>0/0</td>
          </tr>
          <tr>
            <th>F</th>
            <td>0/0</td>
            <td><b>1/2</b></td>
          </tr>
        </table>
    </div>
</div>

A Nash equilibrium corresponds to a **stable pair of strategies** in the sense that **no player would want to deviate from their strategy if their belief that the opponent plays the matching strategy is correct**. Pure strategy equilibria are quite easy to find: for each player, look at the best possible action against each of the opponent's strategies (look for the maximum payoff) and see where they match.
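
The search described above can be sketched with NumPy: mark the best responses of each player, then keep the cells where both marks coincide. The function name `pure_nash_equilibria` is an assumption of this example; the arrays encode the three tables above with rows for player 1 and columns for player 2.

```python
import numpy as np

def pure_nash_equilibria(p1, p2):
    """List the (row, column) cells where both players best-respond to each other."""
    # Player 1 chooses the row: best payoff within each column.
    best_for_p1 = p1 == p1.max(axis=0, keepdims=True)
    # Player 2 chooses the column: best payoff within each row.
    best_for_p2 = p2 == p2.max(axis=1, keepdims=True)
    return [(int(r), int(c)) for r, c in np.argwhere(best_for_p1 & best_for_p2)]

games = {
    "prisoner's dilemma": (np.array([[-4, 0], [-5, -1]]), np.array([[-4, -5], [0, -1]])),
    "matching pennies": (np.array([[1, -1], [-1, 1]]), np.array([[-1, 1], [1, -1]])),
    "battle of the sexes": (np.array([[2, 0], [0, 1]]), np.array([[1, 0], [0, 2]])),
}
for name, (p1, p2) in games.items():
    print(name, pure_nash_equilibria(p1, p2))
# prisoner's dilemma [(0, 0)]
# matching pennies []
# battle of the sexes [(0, 0), (1, 1)]
```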

<br>

### Mixed Nash equilibrium

For some games, there are **mixed strategy Nash equilibria**. For instance, in the matching pennies game, if we note $p$ and $q$ the probabilities that players 1 and 2 play move $A$, then $(p=\frac{1}{2}, q=\frac{1}{2})$ is a mixed Nash equilibrium:

* player 1 would not benefit from increasing or decreasing $p$
* player 2 would not benefit from increasing or decreasing $q$

Note that if either $p$ or $q$ were not equal to one half, then the other player would benefit from playing a pure strategy instead. The way to identify the mixed Nash equilibrium here is therefore to solve for $p$ and $q$ such that both players are indifferent between playing A and B ($q$ must make player 1 indifferent, and $p$ must make player 2 indifferent).

In general, we should look for combinations of moves such that each move played with positive probability has the same expected payoff, under the belief that the other player will play a given pure or mixed strategy.
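
For a 2x2 game, the indifference conditions are two linear equations with a closed-form solution. The sketch below assumes a fully mixed equilibrium is being sought (both actions played with positive probability) and returns `None` otherwise; the function name is hypothetical.

```python
import numpy as np

def mixed_equilibrium_2x2(p1, p2):
    """Solve the indifference conditions of a 2x2 game.

    Returns (p, q), the probabilities that players 1 and 2 play their first
    action, or None when there is no fully mixed solution.
    """
    # q must make player 1 indifferent between its two rows:
    #   q * p1[0,0] + (1-q) * p1[0,1] == q * p1[1,0] + (1-q) * p1[1,1]
    denom_q = p1[0, 0] - p1[0, 1] - p1[1, 0] + p1[1, 1]
    # p must make player 2 indifferent between its two columns:
    #   p * p2[0,0] + (1-p) * p2[1,0] == p * p2[0,1] + (1-p) * p2[1,1]
    denom_p = p2[0, 0] - p2[0, 1] - p2[1, 0] + p2[1, 1]
    if denom_q == 0 or denom_p == 0:
        return None
    q = (p1[1, 1] - p1[0, 1]) / denom_q
    p = (p2[1, 1] - p2[1, 0]) / denom_p
    if not (0 < p < 1 and 0 < q < 1):
        return None
    return float(p), float(q)

# Matching pennies, actions ordered A, B.
mp1 = np.array([[1, -1], [-1, 1]])
mp2 = np.array([[-1, 1], [1, -1]])
print(mixed_equilibrium_2x2(mp1, mp2))  # (0.5, 0.5)
```

On the prisoner's dilemma payoffs the function returns `None`: denouncing is strictly better whatever the opponent does, so no probability can make a player indifferent.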

<br>

### Nash equilibrium vs Pareto optimum

A **stable position does not mean that the outcome is the best for both players** (the Pareto optimum). For instance, in the prisoner's dilemma, the Nash equilibrium is when both players denounce each other. Indeed, any other position is unstable:

* if both are silent, one of them could benefit from talking
* if one is talking, the other should start talking as well

If you think that this does not represent reality, it is either because the payoffs do not represent your true values (not betraying your partners) or because you think in terms of a repeated game in which you are accountable for your bad actions, which might erode the trust people have in you: in repeated games, it becomes possible to play something other than a Nash equilibrium of the one-stage game.

Similarly, a **mixed Nash equilibrium does not necessarily have a better payoff than a pure one**: for instance in the Battle of the Sexes, the mixed Nash equilibrium (player 1 plays O with probability 2/3, player 2 with probability 1/3) gives each player an expected payoff of 2/3, which is less than the payoffs of 2/1 or 1/2 of the pure Nash equilibria.
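
The gap between stability and efficiency in the prisoner's dilemma can be checked numerically. The sketch below assumes the textbook definition of Pareto optimality (a cell is optimal when no other cell is at least as good for both players and strictly better for one); the function name is hypothetical.

```python
import numpy as np

def pareto_optimal_cells(p1, p2):
    """Return the (row, column) cells that no other cell Pareto-dominates."""
    cells = [(r, c) for r in range(p1.shape[0]) for c in range(p1.shape[1])]

    def dominates(a, b):
        # a is at least as good as b for both players and strictly better for one.
        at_least_as_good = p1[a] >= p1[b] and p2[a] >= p2[b]
        strictly_better = p1[a] > p1[b] or p2[a] > p2[b]
        return at_least_as_good and strictly_better

    return [b for b in cells if not any(dominates(a, b) for a in cells)]

# Prisoner's dilemma, actions ordered D, S.
pd1 = np.array([[-4, 0], [-5, -1]])
pd2 = np.array([[-4, -5], [0, -1]])
print(pareto_optimal_cells(pd1, pd2))  # [(0, 1), (1, 0), (1, 1)]
```

Under this definition, the only cell that is not Pareto optimal is (0, 0), where both players denounce, which is precisely the Nash equilibrium.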

<br>

### Continuous actions

* example of the production or prices

<br>

# Dynamic games
---

* backward induction (dynamic programming) for games with no chance
* sub-game perfect Nash equilibrium
* one-stage un-improvable
* example of the centipede game: the Nash strategy is not the "best for both" strategy (as in the prisoner's dilemma)
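
As a rough sketch of backward induction, the function below solves a perfect-information game tree without chance nodes by evaluating the leaves first and moving up. The tree encoding (dicts for decision nodes, tuples for leaves) and the payoffs of this 4-step centipede are hypothetical values chosen for illustration, not taken from the notes.

```python
def backward_induction(node):
    """Return (payoffs, path) for a game tree without chance nodes.

    A leaf is a tuple of payoffs (one per player). A decision node is a dict
    {"player": index, "moves": {action: child}}. The returned path lists the
    (player, action) pairs actually played; ties go to the first action listed.
    """
    if isinstance(node, tuple):
        return node, []
    player = node["player"]
    best = None
    for action, child in node["moves"].items():
        payoffs, path = backward_induction(child)
        # Keep the action that maximises the payoff of the player who moves here.
        if best is None or payoffs[player] > best[0][player]:
            best = (payoffs, [(player, action)] + path)
    return best

# Hypothetical centipede game: "take" stops the game, "pass" hands over the move.
centipede = {"player": 0, "moves": {
    "take": (1, 0),
    "pass": {"player": 1, "moves": {
        "take": (0, 2),
        "pass": {"player": 0, "moves": {
            "take": (3, 1),
            "pass": {"player": 1, "moves": {
                "take": (2, 4),
                "pass": (3, 3),
            }},
        }},
    }},
}}

payoffs, path = backward_induction(centipede)
print(payoffs, path)  # (1, 0) [(0, 'take')]
```

With these payoffs, player 0 takes immediately and the game ends at (1, 0), even though passing all the way would give (3, 3), which is better for both players.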

<br>

# Multi-stage / repeated games
---

* accountability
* the last stage should always end with a Nash equilibrium, but earlier stages can be bargained with it
* if there are several Nash equilibria at the end, you can use that to bargain (depends on the discount factor)
