# MSDM5058 Tutorial 9-1 - Non-cooperative Games

## Contents

1. Non-zero sum game
2. Dominant-strategy equilibrium
3. Nash equilibrium
4. Pareto optimal


---

Historically, game theory studied only zero-sum games (with von Neumann as the leading figure) from the 1930s to the 1940s, until John Nash generalized the theory in 1950. Zero-sum and non-zero-sum games are collectively called non-cooperative games. They contrast with cooperative games, in which the players can negotiate before playing and thus form coalitions with others.

In this tutorial, we will go through two-player games in more general terms.

---

# 1. Non-zero sum game

As mentioned in the last tutorial, a game of two players can be expressed as a payoff bi-matrix 

$$
(\mathbf{A},\mathbf{B}^\intercal) = 
\left(\begin{array}{c|cccc}
& B_1 & B_2 & \cdots & B_n\\ \hline
A_1 & (a_{11},b_{11}) & (a_{12},b_{12}) & \cdots & (a_{1n},b_{1n}) \\
A_2 & (a_{21},b_{21}) & (a_{22},b_{22}) & \cdots & (a_{2n},b_{2n}) \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
A_m & (a_{m1},b_{m1}) & (a_{m2},b_{m2}) & \cdots & (a_{mn},b_{mn})
\end{array}\right)
$$

In general, $\mathbf{A}+\mathbf{B}^\intercal\neq 0$. 

Furthermore, we can express the players' strategies as vectors $\mathbf{x} = \left(x_1,x_2,...,x_m\right)^\intercal$ and $\mathbf{y} = \left(y_1,y_2,...,y_n\right)^\intercal$, where $x_i, y_j\geq 0$ and $\sum x_i = \sum y_j = 1$, whether the strategies are pure or mixed. The payoffs of the players are calculated as:

- $A$'s payoff = $E_A(\mathbf{x},\mathbf{y}) = \mathbf{x}^\intercal \mathbf{A}\mathbf{y}$.
- $B$'s payoff = $E_B(\mathbf{x},\mathbf{y}) = \mathbf{y}^\intercal \mathbf{B}\mathbf{x}=\mathbf{x}^\intercal \mathbf{B}^\intercal\mathbf{y}$
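
The two payoff formulas can be checked numerically. The following is a minimal sketch in Python with NumPy; the function name `expected_payoffs` and the $2\times 2$ payoff entries are hypothetical, chosen only for illustration, and `BT` stores $\mathbf{B}^\intercal$ (the second entries of the bi-matrix).

```python
import numpy as np

def expected_payoffs(A, BT, x, y):
    """Return (E_A, E_B) = (x^T A y, x^T B^T y) for strategy vectors x and y."""
    A, BT, x, y = (np.asarray(v, dtype=float) for v in (A, BT, x, y))
    return float(x @ A @ y), float(x @ BT @ y)

# Hypothetical 2x2 bi-matrix: A holds the first entries, BT the second entries.
A = [[3, 0],
     [5, 1]]
BT = [[3, 5],
      [0, 1]]

# A mixed profile: A plays (1/2, 1/2), B plays (1/4, 3/4).
E_A, E_B = expected_payoffs(A, BT, [0.5, 0.5], [0.25, 0.75])

# A pure profile (A_1, B_2) is a pair of unit vectors and picks out (a_12, b_12).
pure = expected_payoffs(A, BT, [1, 0], [0, 1])
print(E_A, E_B, pure)
```

Because a pure strategy is just a unit vector, the same formula covers both pure and mixed play.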

Usually, when we talk about the "solution" of a game, we mean that the game has reached an **equilibrium**: a strategy profile $(\mathbf{x}^*,\mathbf{y}^*)$ consisting of a best strategy for each of the players in the game.


Note that an equilibrium is not guaranteed to be unique and, depending on the equilibrium concept and on whether mixed strategies are allowed, it may not exist at all.

---

# 2. Dominant-strategy equilibrium

## 2.1. Dominance: revisit

Here we define dominance in a more general way. Let $E_p(\mathbf{x},\mathbf{y})$ denote the payoff of player $p$ $(=A\text{ or }B)$ when the game is played with the strategy profile $(\mathbf{x},\mathbf{y})$. For example, if $A$'s payoffs from two of his strategies $\mathbf{x},\mathbf{x}'$ satisfy

$$E_A(\mathbf{x}', \mathbf{y}) \geq E_A(\mathbf{x}, \mathbf{y})$$

for any strategy $\mathbf{y}$ played by $B$, then we say

- $\mathbf{x}'$ is the **dominant strategy**.
- $\mathbf{x}$ is the **dominated strategy**.


If the inequality is strict, the relation is **strict dominance**; otherwise it is **weak dominance**.

Note that a player’s dominant strategy is a best response to every action of the other player, even a wildly irrational one.

## 2.2. Dominant strategy equilibrium

A dominant-strategy equilibrium is a strategy profile found by iteration (hence it is sometimes called an "iterated-dominance equilibrium"):

- Deleting a (weakly) dominated strategy from the strategy set of one of the players.
- Recalculating to find which remaining strategies are (weakly) dominated.
- Deleting one of the re-calculated (weakly) dominated strategies.
- ...

and continuing the process until only one strategy remains for each player. In fact, we have already seen this in zero-sum games.


Note that a dominant-strategy equilibrium may not exist.

#### Example: Non zero-sum game with dominated strategies

Consider the following payoff bimatrix. Is there a dominant strategy equilibrium?

$$
(\mathbf{A},\mathbf{B}^\intercal) = 
\left(\begin{array}{c|cccc}
& B_1 & B_2 & B_3\\ \hline
A_1 & (4,3) & (5,1) &  (6,2) \\
A_2 & (2,1) & (8,4) & (3,6) \\
A_3 & (3,0) & (9,6) & (2,8)
\end{array}\right)
$$


**Solution.**

1. A quick observation reveals that the column $B_2$ $(1,4,6)^\intercal$ is strictly dominated by column $B_3$ $(2,6,8)^\intercal$. So column $B_2$ is removed. 

 $$
(\mathbf{A},\mathbf{B}^\intercal) \rightarrow
\left(\begin{array}{c|cccc}
& B_1  & B_3\\ \hline
A_1 & (4,3) & (6,2) \\
A_2 & (2,1) & (3,6) \\
A_3 & (3,0) & (2,8)
\end{array}\right)
$$

2. Then we can see that row $A_1$ $(4,6)$ strictly dominates $A_2$ $(2,3)$ and $A_3$ $(3,2)$. So only row $A_1$ remains:

 $$
(\mathbf{A},\mathbf{B}^\intercal) \rightarrow
\left(\begin{array}{c|cccc}
& B_1  & B_3\\ \hline
A_1 & (4,3) & (6,2) \\
\end{array}\right)
$$


3. Finally, column $B_1$ $(3)$ dominates $B_3$ $(2)$. So the dominant-strategy equilibrium is $(A_1,B_1)$ and the payoffs are $(4,3)$.
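
The elimination steps above can be sketched in code. This is an illustrative Python/NumPy sketch, not a definitive implementation: the function name `iterated_strict_dominance` is hypothetical, and it only removes pure strategies that are *strictly* dominated by another pure strategy (every step in this example is strict). Indices are 0-based, so row `0` is $A_1$ and column `0` is $B_1$.

```python
import numpy as np

def iterated_strict_dominance(A, BT):
    """Repeatedly delete strictly dominated rows (for A) and columns (for B)."""
    A = np.asarray(A, dtype=float)
    BT = np.asarray(BT, dtype=float)
    rows = list(range(A.shape[0]))   # surviving strategies of A
    cols = list(range(A.shape[1]))   # surviving strategies of B
    changed = True
    while changed:
        changed = False
        # Row r is dominated if another row k is better in every surviving column.
        for r in rows:
            if any(np.all(A[k, cols] > A[r, cols]) for k in rows if k != r):
                rows.remove(r)
                changed = True
                break
        # Column c is dominated if another column k is better in every surviving row.
        for c in cols:
            if any(np.all(BT[rows, k] > BT[rows, c]) for k in cols if k != c):
                cols.remove(c)
                changed = True
                break
    return rows, cols

# First and second entries of the bi-matrix in the example above.
A = [[4, 5, 6],
     [2, 8, 3],
     [3, 9, 2]]
BT = [[3, 1, 2],
      [1, 4, 6],
      [0, 6, 8]]

rows, cols = iterated_strict_dominance(A, BT)
print(rows, cols)  # surviving row and column indices
```

If more than one strategy survives for either player, the game has no equilibrium of this kind under strict dominance.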


---

# 3. Nash equilibrium

## 3.1. Best response strategy

First we need to define the best response strategy. Suppose $A$ knows or assumes that $B$ is using strategy $\mathbf{y}$, whether or not it is optimal for $B$. In this case, $A$ should play a strategy $\mathbf{x}$ that maximizes his payoff $E_A(\mathbf{x}, \mathbf{y})$ for the given $\mathbf{y}$. The strategy $\mathbf{x}$ is then a **best response strategy** to the use of $\mathbf{y}$ by $B$.

In mathematical terms, player $A$'s strategy $\mathbf{x}^*$ is a best response strategy towards a particular strategy $\mathbf{y}$ played by $B$ if 

$$E_A(\mathbf{x}^*,\mathbf{y}) \geq E_A(\mathbf{x},\mathbf{y}) \quad\text{for any }\mathbf{x}$$

and similarly, player $B$'s strategy $\mathbf{y}^*$ is a best response strategy towards a particular strategy $\mathbf{x}$ played by $A$ if 

$$E_B(\mathbf{x},\mathbf{y}^*) \geq E_B(\mathbf{x},\mathbf{y}) \quad\text{for any }\mathbf{y}$$
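
Since $E_A(\mathbf{x},\mathbf{y})$ is linear in $\mathbf{x}$, the maximum is always attained by at least one pure strategy, so it is enough to compare the pure strategies. The sketch below (Python/NumPy) does this for the bi-matrix of the Section 2 example; the function names `best_responses_A` and `best_responses_B` are hypothetical, and indices are 0-based.

```python
import numpy as np

def best_responses_A(A, y, tol=1e-12):
    """Pure strategies of A that maximize E_A(., y) for a fixed y."""
    u = np.asarray(A, dtype=float) @ np.asarray(y, dtype=float)  # u[i] = E_A(A_i, y)
    return [i for i in range(len(u)) if u[i] >= u.max() - tol]

def best_responses_B(BT, x, tol=1e-12):
    """Pure strategies of B that maximize E_B(x, .) for a fixed x."""
    u = np.asarray(x, dtype=float) @ np.asarray(BT, dtype=float)  # u[j] = E_B(x, B_j)
    return [j for j in range(len(u)) if u[j] >= u.max() - tol]

# Bi-matrix of the Section 2 example (first and second entries).
A = [[4, 5, 6],
     [2, 8, 3],
     [3, 9, 2]]
BT = [[3, 1, 2],
      [1, 4, 6],
      [0, 6, 8]]

# B mixes B_1 and B_2 equally: A's pure payoffs are (4.5, 5, 6).
br_A = best_responses_A(A, [0.5, 0.5, 0.0])
# A plays A_2: B's pure payoffs are (1, 4, 6).
br_B = best_responses_B(BT, [0, 1, 0])
print(br_A, br_B)
```

When several pure strategies are returned, any mixture of them is also a best response.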


## 3.2. Nash equilibrium

The definition of Nash equilibrium is based on the best response strategy: **_the strategy profile $(\mathbf{x}^*,\mathbf{y}^*)$ is a Nash equilibrium if each player is playing a best response strategy against the other_**, i.e. 

$$
\Big(E_A(\mathbf{x}^*,\mathbf{y}^*) \geq E_A(\mathbf{x},\mathbf{y}^*) \quad \text{for all }\mathbf{x}\neq \mathbf{x}^*\Big) \quad\text{and}\quad \Big(E_B(\mathbf{x}^*,\mathbf{y}^*) \geq E_B(\mathbf{x}^*,\mathbf{y}) \quad \text{for all }\mathbf{y}\neq \mathbf{y}^*\Big)
$$


In a Nash equilibrium, no player has an incentive to deviate from his Nash strategy as long as the other player does not deviate: a unilateral deviation from the best response to the other player's Nash strategy cannot increase his payoff.



**Note 1:** Every dominant-strategy equilibrium is a Nash equilibrium, but not every Nash equilibrium is a dominant-strategy equilibrium.
- If a strategy is dominant, it is a best response to any strategies the other players pick, including their Nash equilibrium strategies.
- If a strategy is part of a Nash equilibrium, it needs only be a best response to the other players’ Nash equilibrium strategies.


**Note 2:** In particular in a zero-sum game, the strategy profile $(\mathbf{x}^*,\mathbf{y}^*)$ is a saddle point of the payoff matrix $\mathbf{A}$ if and only if $(\mathbf{x}^*,\mathbf{y}^*)$ is a Nash equilibrium of payoff bi-matrix $(\mathbf{A},-\mathbf{A})$.

## 3.3. Nash theorem

Nash's theorem states that every non-cooperative game with finitely many players and finitely many pure strategies possesses at least one Nash equilibrium if mixed strategies are allowed. A Nash equilibrium is pure if the players use pure strategies; otherwise it is mixed.

**_The trick to find a pure Nash equilibrium is to mark every largest_** $a_{ij}$ **_(the first element) on each column and every largest_** $b_{ij}$ **_(the second element) on each row - any entry with both_** $a_{ij}$ **_and_** $b_{ij}$ **_marked is a pure Nash equilibrium._**

#### Example 1: A pure equilibrium

Consider the same matrix in the example of dominant-strategy equilibrium. Which strategy profile is a Nash equilibrium?

$$
(\mathbf{A},\mathbf{B}^\intercal) = 
\left(\begin{array}{c|cccc}
& B_1 & B_2 & B_3\\ \hline
A_1 & (4,3) & (5,1) &  (6,2) \\
A_2 & (2,1) & (8,4) & (3,6) \\
A_3 & (3,0) & (9,6) & (2,8)
\end{array}\right)
$$

**Solution.** Mark every largest $a_{ij}$ (the first element) on each column and every largest $b_{ij}$ (the second element) on each row with boxes. 

$$
(\mathbf{A},\mathbf{B}^\intercal) = 
\left(\begin{array}{c|cccc}
& B_1 & B_2 & B_3\\ \hline
A_1 & (\boxed{4},\boxed{3}) & (5,1) &  (\boxed{6},2) \\
A_2 & (2,1) & (8,4) & (3,\boxed{6}) \\
A_3 & (3,0) & (\boxed{9},6) & (2,\boxed{8})
\end{array}\right)
$$


Because $(A_1,B_1)$, with payoffs $(4, 3)$, has two boxes, it is the only pure Nash equilibrium. This is guaranteed because it is in fact the dominant-strategy equilibrium. 

#### Example 2: Multiple pure equilibria

Consider the following matrix. Does it have any pure Nash equilibrium?

$$
(\mathbf{A}, \mathbf{B}^\intercal) =
\left(\begin{array}{c|ccc} 
&B_1 &B_2 &B_3 \\
\hline
A_1 &(1,2) &(-1,0) &(0,2) \\
A_2 &(1,1) &(2,2) &(3,3) \\
A_3 &(0,3) &(0,2) &(4,3)
\end{array}\right)
$$
 

**Solution.** Mark every largest $a_{ij}$ (the first element) on each column and every largest $b_{ij}$ (the second element) on each row with boxes. 

$$
(\mathbf{A}, \mathbf{B}^\intercal) =
\left(\begin{array}{c|ccc} 
&B_1 &B_2 &B_3 \\
\hline
A_1 &(\boxed{1},\boxed{2}) &(-1,0) &(0,\boxed{2}) \\
A_2 &(\boxed{1},1) &(\boxed{2},2) &(3,\boxed{3}) \\
A_3 &(0,\boxed{3}) &(0,2) &(\boxed{4},\boxed{3})
\end{array}\right)
$$

We can see that there are two pure Nash equilibria: $(A_1, B_1)$ with payoffs $(1,2)$ and $(A_3, B_3)$ with payoffs $(4,3)$. 
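
The marking trick translates directly into array operations. Below is a minimal Python/NumPy sketch (the function name `pure_nash_equilibria` is hypothetical): it marks the column-wise maxima of the first entries and the row-wise maxima of the second entries, then returns the cells marked twice. Indices are 0-based, so `(0, 0)` stands for $(A_1,B_1)$.

```python
import numpy as np

def pure_nash_equilibria(A, BT):
    """Return all cells (i, j) where a_ij is a column maximum and b_ij is a row maximum."""
    A = np.asarray(A)
    BT = np.asarray(BT)
    a_marked = A == A.max(axis=0, keepdims=True)    # largest a_ij in each column
    b_marked = BT == BT.max(axis=1, keepdims=True)  # largest b_ij in each row
    return [(int(i), int(j)) for i, j in np.argwhere(a_marked & b_marked)]

# Example 1 (same bi-matrix as the dominant-strategy example).
A1 = [[4, 5, 6], [2, 8, 3], [3, 9, 2]]
BT1 = [[3, 1, 2], [1, 4, 6], [0, 6, 8]]

# Example 2.
A2 = [[1, -1, 0], [1, 2, 3], [0, 0, 4]]
BT2 = [[2, 0, 2], [1, 2, 3], [3, 2, 3]]

ne1 = pure_nash_equilibria(A1, BT1)
ne2 = pure_nash_equilibria(A2, BT2)
print(ne1)  # [(0, 0)]
print(ne2)  # [(0, 0), (2, 2)]
```

Ties are marked on every maximal entry, exactly as in the boxed matrices above.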

## 3.4. Mixed Nash equilibrium

Recall that in a zero-sum game, if the payoff matrix cannot be reduced any further, the mixed saddle point has to be solved from two systems of inequalities, and the standard approach is linear programming. For a mixed Nash equilibrium in a non-zero-sum game, we have a similar pair of systems of inequalities (stating that no pure strategy does better than the optimal mixed strategy):

$$
\begin{array}{c}
\begin{cases}
E_A(\mathbf{i}(i=1),\mathbf{y}^*) = \displaystyle\sum_{j=1}^n a_{1j}y^*_j \leq v_A \\
\qquad\vdots \\
E_A(\mathbf{i}(i=m),\mathbf{y}^*) = \displaystyle\sum_{j=1}^n a_{mj}y^*_j \leq v_A
\end{cases}\\[0.5em]
\text{subject to }\displaystyle\sum_{j=1}^n y_j^*=1
\end{array}
\qquad \text{and}\qquad
\begin{array}{c}
\begin{cases}
E_B(\mathbf{x}^*,\mathbf{j}(j=1)) = \displaystyle\sum_{i=1}^m x^*_i b_{i1}\leq v_B \\
\qquad\vdots \\
E_B(\mathbf{x}^*,\mathbf{j}(j=n)) = \displaystyle\sum_{i=1}^m x^*_i b_{in}\leq v_B
\end{cases}\\[0.5em]
\text{subject to }\displaystyle\sum_{i=1}^m x_i^*=1
\end{array}
$$

where $v_A=E_A(\mathbf{x}^*,\mathbf{y}^*)$ and $v_B=E_B(\mathbf{x}^*,\mathbf{y}^*)$ are unknowns to be solved together with $\mathbf{x}^*$ and $\mathbf{y}^*$. 
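
These inequalities also give a direct test for whether a candidate profile is a Nash equilibrium. The following Python/NumPy sketch (the function name `is_nash` and the tolerance `tol` are assumptions for illustration) evaluates both systems for a given $(\mathbf{x},\mathbf{y})$, using the bi-matrix of Example 2 in Section 3.3.

```python
import numpy as np

def is_nash(A, BT, x, y, tol=1e-9):
    """Check that no pure strategy beats the profile (x, y) for either player."""
    A, BT, x, y = (np.asarray(v, dtype=float) for v in (A, BT, x, y))
    v_A = x @ A @ y    # A's payoff at the candidate profile
    v_B = x @ BT @ y   # B's payoff at the candidate profile
    rows_ok = np.all(A @ y <= v_A + tol)    # sum_j a_ij y_j <= v_A for all i
    cols_ok = np.all(x @ BT <= v_B + tol)   # sum_i x_i b_ij <= v_B for all j
    return bool(rows_ok and cols_ok)

# Bi-matrix of Example 2 in Section 3.3.
A = [[1, -1, 0], [1, 2, 3], [0, 0, 4]]
BT = [[2, 0, 2], [1, 2, 3], [3, 2, 3]]

check_11 = is_nash(A, BT, [1, 0, 0], [1, 0, 0])  # (A_1, B_1)
check_22 = is_nash(A, BT, [0, 1, 0], [0, 1, 0])  # (A_2, B_2)
check_33 = is_nash(A, BT, [0, 0, 1], [0, 0, 1])  # (A_3, B_3)
print(check_11, check_22, check_33)
```

The check only verifies a given profile; it does not search for one.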

> Finding a mixed Nash equilibrium in a non-zero-sum game is more cumbersome than in a zero-sum game because it is a non-linear programming problem. A standard formulation of the non-linear program, whose objective is at most $0$ under the constraints and equals $0$ exactly at a Nash equilibrium, is 
>
> $$
\begin{align*}
\underset{\mathbf{x},\mathbf{y},v_A,v_B}{\text{maximize}}\quad &\mathbf{x}^\intercal \mathbf{A}\mathbf{y} + \mathbf{x}^\intercal \mathbf{B}^\intercal\mathbf{y} - v_A - v_B \\
\text{subject to}\quad &\sum_j a_{ij}y_j\leq v_A \quad \text{for all }i\\
&\sum_i x_i b_{ij}\leq v_B \quad \text{for all }j \\
& x_i\geq 0,\ y_j\geq 0,\ \sum_i x_i = \sum_j y_j = 1
\end{align*}
$$


## 3.5. Equality of payoffs theorem

Luckily, if the optimal strategy profile is strictly positive (i.e. all $x_i^*>0$ and $y_j^*>0$), then we can turn the above systems of inequalities into equalities (every pure strategy played with positive probability must earn exactly the equilibrium payoff), and the solution (if valid) is a mixed Nash equilibrium. This is known as the **equality of payoffs theorem**. It is the non-zero-sum counterpart of the equilibrium theorem for zero-sum games. 



#### Example: Equality of payoffs theorem
Consider the following payoff bi-matrix. 

$$
\left(\begin{array}{c|cc}
&B_1 &B_2 \\
\hline
A_1 &(-4,1) & (2,0) \\
A_2 &(2,2) & (1,3)
\end{array}\right)
$$

By the equality of payoffs theorem, we get the two systems of equations:

$$
\begin{cases}
-4y^*_1 + 2y^*_2 = v_A \\
2y^*_1 + y^*_2 = v_A \\
y^*_1 + y^*_2 = 1
\end{cases}
\quad\text{and}\quad
\begin{cases}
x^*_1 + 2x^*_2 = v_B \\
3x^*_2 = v_B \\
x^*_1 + x^*_2 = 1
\end{cases}
$$

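On solving, we get $(y^*_1, y^*_2)=\left(\frac{1}{7},\frac{6}{7}\right)$ with $v_A = \frac{8}{7}$, and also $(x^*_1,x^*_2)=\left(\frac{1}{2},\frac{1}{2}\right)$ with $v_B=\frac{3}{2}$. All components are strictly positive, so the solution is valid and $(\mathbf{x}^*,\mathbf{y}^*)$ is a mixed Nash equilibrium.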
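
The two linear systems can also be solved numerically. The sketch below uses `numpy.linalg.solve`; the function name `interior_mixed_equilibrium` is hypothetical, and the code assumes a square game (same number of strategies for both players) in which each system has a unique solution. A result is only a valid mixed Nash equilibrium if all probabilities come out strictly positive.

```python
import numpy as np

def interior_mixed_equilibrium(A, BT):
    """Solve the equality-of-payoffs systems; returns (x, y, v_A, v_B)."""
    A = np.asarray(A, dtype=float)
    BT = np.asarray(BT, dtype=float)
    m, n = A.shape  # assumption: m == n, so both systems are square

    # Unknowns (y_1..y_n, v_A): sum_j a_ij y_j - v_A = 0 for all i, sum_j y_j = 1.
    M_y = np.block([[A, -np.ones((m, 1))],
                    [np.ones((1, n)), np.zeros((1, 1))]])
    sol_y = np.linalg.solve(M_y, np.r_[np.zeros(m), 1.0])

    # Unknowns (x_1..x_m, v_B): sum_i x_i b_ij - v_B = 0 for all j, sum_i x_i = 1.
    M_x = np.block([[BT.T, -np.ones((n, 1))],
                    [np.ones((1, m)), np.zeros((1, 1))]])
    sol_x = np.linalg.solve(M_x, np.r_[np.zeros(n), 1.0])

    return sol_x[:-1], sol_y[:-1], sol_y[-1], sol_x[-1]

# Bi-matrix of the example above.
A = [[-4, 2],
     [2, 1]]
BT = [[1, 0],
      [2, 3]]

x, y, v_A, v_B = interior_mixed_equilibrium(A, BT)
valid = bool(np.all(x > 0) and np.all(y > 0))  # strictly positive, hence valid
print(x, y, v_A, v_B, valid)
```

If `valid` is false, the strictly positive assumption fails and the game has to be handled with the inequality systems instead.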

---

# 4. Pareto optimal

Pareto optimality is not an equilibrium concept, but rather a description of whether a strategy profile is "good" for the players as a whole. Here are the formal definitions: 

- A strategy profile $(\mathbf{x}', \mathbf{y}')$ is said to **Pareto-dominate** another strategy profile $(\mathbf{x},\mathbf{y})$ if the payoff of every player under $(\mathbf{x}', \mathbf{y}')$ is at least as high as the payoff under $(\mathbf{x},\mathbf{y})$, i.e. 

 $$E_p(\mathbf{x}', \mathbf{y}') \geq E_p(\mathbf{x},\mathbf{y}) \quad\text{for both }p=A \text{ and }B$$

 If the inequality is strict, it is **strict Pareto dominance**; otherwise it is **weak Pareto dominance**.
 
- A strategy profile is **Pareto optimal** if it is not Pareto-dominated, i.e. there is no other strategy profile where some players can increase their current payoffs without decreasing the current payoff of other players. 


#### Example: Pareto optimal

Here are 3 payoff bimatrices. Which of their pure strategy profiles are Pareto optimal?

$$
\begin{array}{ccccc}
\left(\begin{array}{c|cc}
&B_1 &B_2 \\
\hline 
A_1 &(2,3) &(3,2)\\
A_2 &(1,0) &(0,1)\\
\end{array}\right)
%
&\qquad&
%
\left(\begin{array}{c|cc}
&B_1 &B_2 \\
\hline 
A_1 &(1,1) &(2,5)\\
A_2 &(5,2) &(-1,-1)\\
\end{array}\right)
%
&\qquad&
%
\left(\begin{array}{c|cc}
&B_1 &B_2 \\
\hline 
A_1 &(2,4) &(1,0)\\
A_2 &(3,1) &(0,4)\\
\end{array}\right)\\
%
(a) & &(b) & &(c)
\end{array}
$$

**Solution.** A strategy profile is Pareto-dominated if some other strategy profile gives both players payoffs at least as high, with at least one of them strictly higher. Remove it. The remaining strategy profiles are all Pareto optimal.

$$
\require{cancel}
\begin{array}{ccccc}
\left(\begin{array}{c|cc}
&B_1 &B_2 \\
\hline 
A_1 &(2,3) &(3,2)\\
A_2 &\xcancel{(1,0)} &\xcancel{(0,1)}\\
\end{array}\right)
%
&\qquad&
%
\left(\begin{array}{c|cc}
&B_1 &B_2 \\
\hline 
A_1 &\xcancel{(1,1)} &(2,5)\\
A_2 &(5,2) &\xcancel{(-1,-1)}\\
\end{array}\right)
%
&\qquad&
%
\left(\begin{array}{c|cc}
&B_1 &B_2 \\
\hline 
A_1 &(2,4) &\xcancel{(1,0)}\\
A_2 &(3,1) &\xcancel{(0,4)}\\
\end{array}\right)\\
%
(a) & &(b) & &(c)
\end{array}
$$
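
The crossing-out procedure can be sketched as a pairwise comparison of cells. This is an illustrative Python/NumPy sketch (the function name `pareto_optimal_profiles` is hypothetical) that compares pure strategy profiles only, as in the example, and treats a cell as dominated when another cell is at least as good for both players and strictly better for at least one. Indices are 0-based.

```python
import numpy as np

def pareto_optimal_profiles(A, BT):
    """Return the pure profiles (i, j) that no other pure profile Pareto-dominates."""
    A = np.asarray(A)
    BT = np.asarray(BT)
    m, n = A.shape
    profiles = [(i, j) for i in range(m) for j in range(n)]

    def dominated(i, j):
        # Some cell (k, l) is no worse for both players and better for at least one.
        return any(
            A[k, l] >= A[i, j] and BT[k, l] >= BT[i, j]
            and (A[k, l] > A[i, j] or BT[k, l] > BT[i, j])
            for k, l in profiles
        )

    return [p for p in profiles if not dominated(*p)]

# The three bi-matrices (a), (b), (c) of the example.
opt_a = pareto_optimal_profiles([[2, 3], [1, 0]], [[3, 2], [0, 1]])
opt_b = pareto_optimal_profiles([[1, 2], [5, -1]], [[1, 5], [2, -1]])
opt_c = pareto_optimal_profiles([[2, 1], [3, 0]], [[4, 0], [1, 4]])
print(opt_a)  # [(0, 0), (0, 1)]
print(opt_b)  # [(0, 1), (1, 0)]
print(opt_c)  # [(0, 0), (1, 0)]
```

In (c), the cell $(0,4)$ is removed only through weak dominance by $(2,4)$, since $B$'s payoff is equal in both cells.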

