# Collaboration and Equilibrium
## The Prisoners' Dilemma
The prisoners' dilemma is a classic problem in game theory that illustrates the importance of collaborative strategies.
In its classical form, two criminals are suspected of committing a crime and are held in solitary confinement in separate cells. The police do not have enough evidence to convict them on the main charges, but they have enough to convict both on lesser charges. The police officers therefore devise a scheme to make the prisoners confess, offering each prisoner a bargain:

> If you confess to the crime, you will go free, and your partner in crime will take the full charges.

With this bargain, if both prisoners betray each other, they both go to prison, share the main charges, and each serve a 2-year sentence.

However, if only one of the prisoners betrays the other, he or she goes free, while the partner in crime takes the full charges and goes to prison for 3 years. If they both remain silent, they are convicted only of the lesser crime and go to prison for 1 year each.

Let us illustrate the game in a tabular form: 

| Prisoner A/B | Confess | Do not confess |
|----------|----------------|---------|
|   Confess   |   $(-2, -2)$   |   $(0,-3)$  |
|   Do not confess   |   $(-3, 0)$   |   $(-1, -1)$  | 

In the tabular form, each year in prison is counted as a loss of one unit. The values make clear what the police were trying to accomplish: the alternative of cooperating with the police, **Confess, dominates** the alternative of not confessing. No matter what the other prisoner decides, it is better to confess. As described in decision theory, the decision problem can be simplified by applying dominance, that is, by removing the dominated alternatives from the problem. From each prisoner's perspective, the best alternative is always to confess.
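
The dominance argument can be checked mechanically. The following is a minimal sketch, assuming the payoff table above is stored as a dictionary keyed by (move of A, move of B); the helper name `strictly_dominates` is hypothetical and introduced only for illustration.

```python
# Payoffs (A, B) in years of prison, written as losses, copied from the table above.
payoffs = {
    ("Confess", "Confess"): (-2, -2),
    ("Confess", "Do not confess"): (0, -3),
    ("Do not confess", "Confess"): (-3, 0),
    ("Do not confess", "Do not confess"): (-1, -1),
}
moves = ["Confess", "Do not confess"]


def strictly_dominates(a, b):
    """True if move a gives prisoner A a strictly better payoff than move b
    for every possible move of prisoner B (hypothetical helper)."""
    return all(payoffs[(a, other)][0] > payoffs[(b, other)][0] for other in moves)


print(strictly_dominates("Confess", "Do not confess"))  # True
print(strictly_dominates("Do not confess", "Confess"))  # False
```

Since the table is symmetric, the same check holds from prisoner B's point of view.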

However, the prisoners could have represented the problem differently, treating it as a collaborative game against the police. In this set-up, the prisoners evaluate the total number of years that they would jointly spend in prison, as a form of zero-sum game against the police (although the alternatives for the police as a player are not strictly the same, we keep them for the sake of clarity):

| Prisoner A/B | Confess | Do not confess |
|----------|----------------|---------|
|   Confess   |   $-4$   |   $-3$  |
|   Do not confess   |   $-3$   |   $-2$  | 

Now, the dominant option for the prisoners is not to confess!
This dilemma is used to highlight the importance of analysing cooperation in games, and it is used extensively to design incentive programs based on game theory.
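
The joint table can be reproduced by adding the two individual payoffs in each cell. This is only a sketch under the same assumed dictionary representation of the original payoff table; the names `joint` and `best_cell` are illustrative.

```python
# Individual payoffs (A, B) in years of prison, written as losses.
payoffs = {
    ("Confess", "Confess"): (-2, -2),
    ("Confess", "Do not confess"): (0, -3),
    ("Do not confess", "Confess"): (-3, 0),
    ("Do not confess", "Do not confess"): (-1, -1),
}

# Joint payoff: total years both prisoners spend in prison in each cell.
joint = {cell: a + b for cell, (a, b) in payoffs.items()}

# The cell with the smallest joint loss is the best collaborative outcome.
best_cell = max(joint, key=joint.get)
print(best_cell)  # ('Do not confess', 'Do not confess')
```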

## Game Equilibrium
A game has an *equilibrium* when both players select the same cell in the tabular representation of the problem using a given criterion, such as the MinMax criterion. Under this condition, one can assume that the players will play the same move over and over again.
Conversely, if the players do not choose the same matrix cell, the game has no equilibrium. In this case, players might be interested in changing their decision depending on the opponent's decision in consecutive game moves.
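
One way to test this condition is sketched below, assuming a zero-sum game stored as a list of rows with payoffs to the row player: the row player takes the best of the row minima, the column player takes the best of the column maxima, and the game has an equilibrium when both values coincide. The function name `pure_equilibrium` and the two example matrices are hypothetical.

```python
def pure_equilibrium(matrix):
    """Return (row, col, value) if both players' MinMax choices meet in one cell,
    otherwise None. matrix[i][j] is the payoff to the row player."""
    row_mins = [min(row) for row in matrix]          # worst case of each row
    col_maxs = [max(col) for col in zip(*matrix)]    # worst case of each column
    maximin = max(row_mins)   # best guaranteed payoff for the row player
    minimax = min(col_maxs)   # best guaranteed cap for the column player
    if maximin != minimax:
        return None
    return row_mins.index(maximin), col_maxs.index(minimax), maximin


print(pure_equilibrium([[4, 2], [3, 1]]))  # (0, 1, 2)
print(pure_equilibrium([[3, 1], [2, 4]]))  # None
```

In the second matrix the two values differ (2 versus 3), so neither player can settle on a single move.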

When this happens, players must adopt a strategy to obtain a high **mean game value**. A strategy is defined as the set of selection probabilities that determines how the player should select the moves, possibly conditioned on the decisions made by the other player. The mean game value is defined as the value that the player will get out of the game over a given set of moves (finite or infinite) with a given strategy.

Furthermore, **mixed strategies** are strategies that provide an optimal mean game value for both players.

The following section describes how to derive the selection probabilities that define the mixed strategies in $m \times n$ zero-sum games.

## Mixed Strategies in mxn Games
Given an $m \times n$ zero-sum game,

| Player A /B | $b_1$ | $b_2$ | ... | $b_n$ |
|----------|----------------|---------|----------------|---------|
|   $a_1$   |   $a_{11}$   |   $a_{12}$  | ...   |   $a_{1n}$  |
|   $a_2$   |   $a_{21}$   |   $a_{22}$  | ...   |   $a_{2n}$  |
|   ...   |   ...   |   ...  | ...   |   ...  |
|   $a_m$   |   $a_{m1}$   |   $a_{m2}$  | ...   |   $a_{mn}$  |

The objective is to calculate the optimal selection probabilities for each player.
Let us assume for now, without loss of generality, that all $a_{ij} \geq 0$ and that the game value is strictly positive, so that dividing by it later is well defined.
Let us denote by $p_i$ the probability that player A selects alternative $i$ and by $q_j$ the probability that player B selects alternative $j$. The **expected game value** is:

$v = \sum_{i=1}^{m}{\sum_{j=1}^{n}{p_i*q_j*a_{ij}}}$

That is, the expected game value is the expected payoff of the game, taking into account the probabilities with which each player selects each of their alternatives.
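
The double sum translates directly into code. The sketch below uses exact fractions; the matrix `A` and the probability vectors `p` and `q` are assumed example values, not taken from the text.

```python
from fractions import Fraction


def expected_value(matrix, p, q):
    """Expected game value: sum over all cells of p_i * q_j * a_ij."""
    return sum(
        p[i] * q[j] * matrix[i][j]
        for i in range(len(p))
        for j in range(len(q))
    )


# Assumed example: a 2x2 game and one mixed strategy per player.
A = [[3, 1], [2, 4]]
p = [Fraction(1, 2), Fraction(1, 2)]   # player A's selection probabilities
q = [Fraction(3, 4), Fraction(1, 4)]   # player B's selection probabilities

print(expected_value(A, p, q))  # 5/2
```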

### Strategy for Player A
Let us now define a strategy for player A such that, no matter what move player B makes, there is a lower bound for the expected game value, that is:

$\sum_{i=1}^{m}{\sum_{j=1}^{n}{p_i*q_j*a_{ij}}} \geq v_{min}$

Let us now ensure that the game value is at least this lower bound whichever alternative player B selects. That is, let us ensure that the expected game value is at least the lower bound in the event that player B always selects alternative 1 (i.e. $q_1 = 1$), in the event that player B always selects alternative 2 (i.e. $q_2 = 1$), and so on:

$\sum_{i=1}^{m}{p_i*a_{i1}} \geq v_{min}$

$\sum_{i=1}^{m}{p_i*a_{i2}} \geq v_{min}$

$...$

$\sum_{i=1}^{m}{p_i*a_{in}} \geq v_{min}$

That is, 

$\sum_{i=1}^{m}{p_i*a_{ij}} \geq v_{min} \quad \forall j$

Now, let us apply a change of variable so that the RHS of the constraints is equal to 1: 


$x_i = \frac{p_i}{v_{min}}$

$\sum_{i=1}^{m}{x_i*a_{ij}} \geq 1 \quad \forall j$

Now, recall that, in order to be consistent, the probabilities must add to one, that is: 

$\sum_{i=1}^{m}{p_i} = 1$

Let us use this expression to define the objective function of a CLP to calculate the optimal strategy of player A. Since the objective is to maximise $v$, let us try to maximise its lower bound $v_{min}$ or equivalently, to minimize $\frac{1}{v_{min}}$:

$\min z = \frac{1}{v_{min}} = \frac{\sum_{i=1}^{m}{p_i}}{v_{min}} = \sum_{i=1}^{m}{x_i}$

Hence, the optimal strategy for player A can be obtained by solving the following CLP:

$\min z = \sum_{i=1}^{m}{x_i}$

$\text{s.t.}$

$\sum_{i=1}^{m}{x_i*a_{ij}} \geq 1 \quad \forall j$

$x_i \geq 0 \quad \forall i$

Once the values are obtained, we can find the lower bound for the game value as $v_{min} = \frac{1}{z}$ and calculate the probabilities as $p_i = \frac{x_i}{\sum_{i=1}^{m}{x_i}}$.
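
To make the procedure concrete, here is a minimal sketch that solves player A's problem for small games. It does not use the simplex method. Instead it enumerates candidate corner points by brute force (every choice of $m$ constraints made tight), which is only practical for tiny matrices. It assumes all $a_{ij} > 0$ and non-negative $x_i$; the function names and the example matrix are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def solve_square(rows, rhs):
    """Solve a square linear system exactly (Gauss-Jordan); None if singular."""
    n = len(rows)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(rows, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


def optimal_strategy_A(matrix):
    """Return (p, v_min) for the row player, assuming all entries are positive."""
    m, n = len(matrix), len(matrix[0])
    # One constraint per column j: sum_i x_i * a_ij >= 1 ...
    cons = [([matrix[i][j] for i in range(m)], 1) for j in range(n)]
    # ... plus non-negativity of every x_i.
    cons += [([1 if k == i else 0 for k in range(m)], 0) for i in range(m)]
    best = None
    for active in combinations(cons, m):
        x = solve_square([c[0] for c in active], [c[1] for c in active])
        if x is None:
            continue
        feasible = all(sum(a * xi for a, xi in zip(row, x)) >= b for row, b in cons)
        if feasible and (best is None or sum(x) < sum(best)):
            best = x
    if best is None:
        raise ValueError("no feasible corner point found")
    z = sum(best)
    return [xi / z for xi in best], 1 / z   # p_i = x_i / z, v_min = 1 / z


p, v_min = optimal_strategy_A([[3, 1], [2, 4]])
print([str(pi) for pi in p], v_min)  # ['1/2', '1/2'] 5/2
```

For realistic problem sizes, a dedicated linear programming solver would be the natural choice; the enumeration above is meant only to keep the example dependency-free.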

### Strategy for Player B
Now, let us try to derive the strategy for Player B in a similar way. Since it is a zero-sum game, for Player B, no matter what move player A makes, there must be an upper bound for the expected game value, that is:

$\sum_{i=1}^{m}{\sum_{j=1}^{n}{p_i*q_j*a_{ij}}} \leq v_{max}$

Let us now ensure that the game value is at most this upper bound whichever alternative player A selects. That is, let us ensure that the expected game value is at most the upper bound in the event that player A always selects alternative 1 (i.e. $p_1 = 1$), in the event that player A always selects alternative 2 (i.e. $p_2 = 1$), and so on:

$\sum_{j=1}^{n}{q_j*a_{1j}} \leq v_{max}$

$\sum_{j=1}^{n}{q_j*a_{2j}} \leq v_{max}$

$...$

$\sum_{j=1}^{n}{q_j*a_{mj}} \leq v_{max}$

That is, 

$\sum_{j=1}^{n}{q_j*a_{ij}} \leq v_{max} \quad \forall i$

Now, let us again apply a change of variable so that the RHS of the constraints is equal to 1: 


$u_j = \frac{q_j}{v_{max}}$

$\sum_{j=1}^{n}{u_j*a_{ij}} \leq 1 \quad \forall i$

Now, again, recall that, in order to be consistent, the probabilities for Player B must also add to one, that is: 

$\sum_{j=1}^{n}{q_j} = 1$

Let us use this expression to define the objective function of a CLP to calculate the optimal strategy of player B. For Player B, the objective is to minimise $v$, thus, let us try to minimise its upper bound $v_{max}$ or equivalently, to maximize $\frac{1}{v_{max}}$:

$\max z = \frac{1}{v_{max}} = \frac{\sum_{j=1}^{n}{q_j}}{v_{max}} = \sum_{j=1}^{n}{u_j}$

Hence, the optimal strategy for player B can be obtained by solving the following CLP:

$\max z = \sum_{j=1}^{n}{u_j}$

$\text{s.t.}$

$\sum_{j=1}^{n}{u_j*a_{ij}} \leq 1 \quad \forall i$

$u_j \geq 0 \quad \forall j$

Note that this CLP is the dual of the CLP that defines the strategy of player A! Hence, we can read the strategy of player B from the dual solution of player A's problem. By strong duality, both problems have the same optimal objective value, so $v_{min} = v_{max}$.
Again, once the values are obtained, we can find the upper bound for the game value for player B as $v_{max} = \frac{1}{z}$ and calculate the probabilities as $q_j = \frac{u_j}{\sum_{j=1}^{n}{u_j}}$.
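
A quick numerical check of the two bounds is sketched below. The matrix `A` and the candidate strategies `p` and `q` are assumed example values; the helper names are hypothetical. If the floor guaranteed by `p` equals the ceiling guaranteed by `q`, neither player can improve, so both strategies are optimal.

```python
from fractions import Fraction


def floor_for_A(matrix, p):
    """Smallest expected payoff player A can receive over B's pure moves."""
    n = len(matrix[0])
    return min(sum(p[i] * matrix[i][j] for i in range(len(p))) for j in range(n))


def ceiling_for_B(matrix, q):
    """Largest expected payoff player B can concede over A's pure moves."""
    return max(sum(q[j] * row[j] for j in range(len(q))) for row in matrix)


# Assumed example game and candidate mixed strategies.
A = [[3, 1], [2, 4]]
p = [Fraction(1, 2), Fraction(1, 2)]
q = [Fraction(3, 4), Fraction(1, 4)]

v_min = floor_for_A(A, p)
v_max = ceiling_for_B(A, q)
print(v_min, v_max)  # 5/2 5/2

# Rescaled variables of player B's problem: u_j = q_j / v_max.
u = [qj / v_max for qj in q]
print(sum(u))  # 2/5
```

In this example the rescaled variables sum to $\frac{2}{5}$, which is the reciprocal of the game value $\frac{5}{2}$.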

If any of the values $a_{ij} < 0$, it suffices to add a constant $k$ to every entry such that all values become non-negative (and the game value positive), solve the problem, and subtract $k$ from the derived expected game value. The optimal probabilities are not affected by the shift.
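
As a small worked illustration of the shift, consider the following game, which is an assumed example (matching pennies) and not taken from the text above:

$A = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}$

Adding $k = 2$ to every entry gives a matrix with positive values only:

$A' = \begin{pmatrix} 3 & 1 \\ 1 & 3 \end{pmatrix}$

Player A's problem for $A'$, with non-negative variables, reads:

$\min z = x_1 + x_2$

$\text{s.t.}$

$3x_1 + x_2 \geq 1$

$x_1 + 3x_2 \geq 1$

$x_1, x_2 \geq 0$

Its optimum is $x_1 = x_2 = \frac{1}{4}$ with $z = \frac{1}{2}$, so the shifted game value is $\frac{1}{z} = 2$ and $p_1 = p_2 = \frac{1}{2}$. Subtracting $k$ recovers the value of the original game, $2 - 2 = 0$.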
 