# BL40A2010 Introduction to IoT-Based Systems

## Assignment 6, 25.9.2023

### Author: Hamed Ahmadinia

**Prisoner's dilemma** is a standard example of a game analyzed in game theory that shows why two completely rational individuals might not cooperate, even if it appears that it is in their best interests to do so. It was originally framed by Merrill Flood and Melvin Dresher while working at RAND in 1950. Albert W. Tucker formalized the game with prison sentence rewards and named it "prisoner's dilemma", presenting it as follows:

"Two members of a criminal gang are arrested and imprisoned. Each prisoner is in solitary confinement with no means of communicating with the other. The prosecutors lack sufficient evidence to convict the pair on the principal charge, but they have enough to convict both on a lesser charge. Simultaneously, the prosecutors offer each prisoner a bargain. Each prisoner is given the opportunity either to betray the other by testifying that the other committed the crime, or to cooperate with the other by remaining silent. The possible outcomes are:

- If A and B each betray the other (not-cooperating with each other), each of them serves $z$ years in prison (payoff of $-z$).
- If A betrays B (not-cooperating with B) but B remains silent (cooperating with A), A will serve $y$ years in prison (payoff of $-y$) and B will serve $w$ years (payoff of $-w$).
- If B betrays A (not-cooperating with A) but A remains silent (cooperating with B), B will serve $y$ years in prison (payoff of $-y$) and A will serve $w$ years (payoff of $-w$).
- If A and B both remain silent, both of them will serve $x$ years in prison (payoff of $-x$)."

The payoff table is presented below. 

|                | $B$ cooperates  | $B$ not-cooperating   |
|----------------|:---------------:|--------------:|
| $A$ cooperates |  $A \rightarrow -x$   | $A\rightarrow -w$  |
|                |  $B\rightarrow -x$   | $B\rightarrow -y$  |
|                |                 |               |
| $A$ not-cooperating   |  $A\rightarrow -y$   | $A\rightarrow -z$  |
|                |  $B\rightarrow -w$   | $B\rightarrow -z$  |

**However, this is a *Prisoner's Dilemma* game only for a given relation between the years in prison (payoffs), which is studied next.**

ps. Text adapted from [Wikipedia](https://en.wikipedia.org/wiki/Prisoner's_dilemma).

**(1) Consider the Prisoner's dilemma description given above.**

**(a) What is the relation between the payoff values $x\geq 0$, $y\geq 0$, $w\geq 0$ and $z \geq 0$ so that the game can be classified as [Prisoner's Dilemma](https://en.wikipedia.org/wiki/Prisoner's_dilemma)?**

**(b) Verify the results (i.e., the proposed inequality) with numerical examples using [nashpy](https://nashpy.readthedocs.io/en/stable/index.html). Please provide one example when the inequality holds and one when it does not (check my example for the Dove and Hawk game).**
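The relation asked for in (a) can be sketched from the payoff table, assuming each player cares only about minimising their own years in prison. Not cooperating has to be the better reply whatever the other player does, and mutual cooperation has to be better for both than mutual non-cooperation:

$$
\begin{aligned}
-y &> -x && \text{(not cooperating is better when the other cooperates)}\\
-z &> -w && \text{(not cooperating is better when the other does not cooperate)}\\
-x &> -z && \text{(mutual cooperation beats mutual non-cooperation)}
\end{aligned}
$$

Chaining the three conditions gives $-y > -x > -z > -w$, that is, $w > z > x > y \geq 0$ in years of prison. This is the inequality checked numerically in the examples that follow.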

In [20]:
# Verification with nashpy
# Install the missing library
!pip install nashpy==0.0.21

Defaulting to user installation because normal site-packages is not writeable


In [21]:
import numpy as np
import nashpy as nash

### Example 1: (Inequality Holds)
#### Let's consider x=1, y=0, w=3, z=2 
#### This satisfies w>z>x>y


|                | $B$ cooperates  | $B$ not-cooperating   |
|----------------|:---------------:|--------------:|
| $A$ cooperates |  $A \rightarrow -1$   | $A \rightarrow -3$  |
|                |  $B \rightarrow -1$   | $B\rightarrow 0$  |
|                |                 |               |
| $A$ not-cooperating |  $A\rightarrow 0$      | $A\rightarrow -2$  |
|                |  $B\rightarrow -3$   | $B\rightarrow -2$  |


$$
A =
\begin{pmatrix}
    -1 & -3\\
    0 & -2
\end{pmatrix}\qquad
B =
\begin{pmatrix}
    -1 & 0\\
    -3 & -2
\end{pmatrix}
$$

In [22]:
# Example 1 (Inequality Holds)
# Test 1: Let's consider x=1, y=0, w=3, z=2
x = 1
y = 0
w = 3
z = 2

# This satisfies w>z>x>y

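# Rows are A's strategies, columns are B's (index 0 = cooperate, 1 = not cooperate)
A = np.array([[-x, -w], [-y, -z]])  # payoffs of the row player A
B = np.array([[-x, -y], [-w, -z]])  # payoffs of the column player B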

hawk_dove = nash.Game(A, B)
hawk_dove

Bi matrix game with payoff matrices:

Row player:
[[-1 -3]
 [ 0 -2]]

Column player:
[[-1  0]
 [-3 -2]]

In [23]:
eqs = hawk_dove.support_enumeration()
list(eqs)

[(array([0., 1.]), array([0., 1.]))]

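The output is a single Nash equilibrium in pure strategies: both players choose their second strategy (not cooperating) with probability 1. Not cooperating is strictly dominant for each player, so each serves $z = 2$ years even though mutual cooperation would cost each only $x = 1$ year, which is exactly the dilemma.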
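As a cross-check that does not rely on nashpy, the pure-strategy equilibria can also be found by brute force. The sketch below is only an illustration: the helper names `payoff_matrices`, `is_prisoners_dilemma` and `pure_nash_equilibria` are hypothetical and not part of nashpy or the assignment, and it assumes the same indexing as above (0 = cooperate, 1 = not cooperate).

```python
from itertools import product


def payoff_matrices(x, y, w, z):
    """Hypothetical helper: payoffs for row player A and column player B.

    Index 0 = cooperate, index 1 = not cooperate.
    """
    A = [[-x, -w], [-y, -z]]
    B = [[-x, -y], [-w, -z]]
    return A, B


def is_prisoners_dilemma(x, y, w, z):
    """Check the ordering of the years in prison, w > z > x > y."""
    return w > z > x > y


def pure_nash_equilibria(A, B):
    """Return every (row, col) cell where neither player gains by deviating alone."""
    equilibria = []
    for r, c in product(range(2), repeat=2):
        row_is_best = all(A[r][c] >= A[alt][c] for alt in range(2))
        col_is_best = all(B[r][c] >= B[r][alt] for alt in range(2))
        if row_is_best and col_is_best:
            equilibria.append((r, c))
    return equilibria


# Example 1 values: x=1, y=0, w=3, z=2
A, B = payoff_matrices(x=1, y=0, w=3, z=2)
print(is_prisoners_dilemma(1, 0, 3, 2))  # True
print(pure_nash_equilibria(A, B))        # [(1, 1)]: both players do not cooperate
```

The cell `(1, 1)` corresponds to the pair `(array([0., 1.]), array([0., 1.]))` reported by `support_enumeration()`.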

### Example 2: (Inequality Does Not Hold)
#### Now consider x=1, y=2, w=3, z=0
#### This does not satisfy w>z>x>y.


|                | $B$ cooperates  | $B$ not-cooperating   |
|----------------|:---------------:|--------------:|
| $A$ cooperates |  $A \rightarrow -1$   | $A \rightarrow -3$  |
|                |  $B \rightarrow -1$   | $B\rightarrow -2$  |
|                |                 |               |
| $A$ not-cooperating |  $A\rightarrow -2$      | $A\rightarrow 0$  |
|                |  $B\rightarrow -3$   | $B\rightarrow 0$  |

$$
A =
\begin{pmatrix}
    -1 & -3\\
    -2 & 0
\end{pmatrix}\qquad
B =
\begin{pmatrix}
    -1 & -2\\
    -3 & 0
\end{pmatrix}
$$

In [26]:
# Example 2 (Inequality Does Not Hold)
# Test 2: Now consider x=1, y=2, w=3, z=0
x = 1
y = 2
w = 3
z = 0

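# This does not satisfy w>z>x>y (here z < x and y > x)
# Rows are A's strategies, columns are B's (index 0 = cooperate, 1 = not cooperate)
A = np.array([[-x, -w], [-y, -z]])  # payoffs of the row player A
B = np.array([[-x, -y], [-w, -z]])  # payoffs of the column player B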

hawk_dove2 = nash.Game(A, B)
hawk_dove2

Bi matrix game with payoff matrices:

Row player:
[[-1 -3]
 [-2  0]]

Column player:
[[-1 -2]
 [-3  0]]

In [27]:
eqs = hawk_dove2.support_enumeration()
list(eqs)

[(array([1., 0.]), array([1., 0.])),
 (array([0., 1.]), array([0., 1.])),
 (array([0.75, 0.25]), array([0.75, 0.25]))]

The output lists three Nash equilibria. In the first, both players cooperate; in the second, neither cooperates; in the third, each player mixes, cooperating with probability 0.75 and not cooperating with probability 0.25. Because not cooperating is no longer a dominant strategy and the equilibrium is not unique, this game is not a Prisoner’s Dilemma, which has a single Nash equilibrium where neither player cooperates.
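The mixed equilibrium `(0.75, 0.25)` in the output above can be cross-checked by hand with the indifference condition: each player mixes so that the other is indifferent between cooperating and not cooperating. The following is a minimal sketch under that assumption; the function name `indifference_probability` is hypothetical, and it only applies when the denominator is non-zero (it is zero for the Example 1 payoffs, where no interior mixed equilibrium exists).

```python
from fractions import Fraction


def indifference_probability(M):
    """Probability that the opponent plays strategy 0 that leaves a player indifferent.

    M is the player's own 2x2 payoff table with rows = own strategy and
    columns = opponent's strategy. Solves
        q*M[0][0] + (1-q)*M[0][1] == q*M[1][0] + (1-q)*M[1][1]
    for q. Assumes the denominator is non-zero.
    """
    numerator = M[1][1] - M[0][1]
    denominator = M[0][0] - M[0][1] - M[1][0] + M[1][1]
    return Fraction(numerator, denominator)


# Example 2 payoffs: x=1, y=2, w=3, z=0
A = [[-1, -3], [-2, 0]]  # row player A
B = [[-1, -2], [-3, 0]]  # column player B

# Transpose B so that its rows index B's own strategy.
B_own = [list(col) for col in zip(*B)]

q = indifference_probability(A)      # B's probability of cooperating
p = indifference_probability(B_own)  # A's probability of cooperating
print(p, q)  # 3/4 3/4
```

Both probabilities equal 3/4, which agrees with the third equilibrium returned by `support_enumeration()`.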

**(2) Justify why the game from the previous exercise is or is not a good (reasonable) model when $A$ and $B$ are:**

**1. Two trained members from the army when they are in prison.**


**2. Competitive companies in the market discussing standardization.**


**3. Two different autonomous IoT-based home energy management algorithms that are focused on energy efficiency.**


**4. Two different autonomous IoT-based home energy management algorithms that are focused on profit maximization.**

**ps. You need to think about the assumption used in Game Theory and in the Prisoner's dilemma problem setting.**

Answer: Prisoner's dilemma as a model

**1.** Yes. Army members are trained to be loyal and cooperative with one another, which aligns with the scenario in the Prisoner’s Dilemma where cooperation could lead to a better overall outcome. However, the uncertainty and lack of communication might drive them to betray each other, each fearing the other might do the same.

**2.** No. In the Prisoner’s Dilemma, players cannot communicate, which is not a realistic assumption for companies discussing standardization. Companies can and do communicate, negotiate, and collaborate on standardization, so the game is not a suitable model.

**3.** No. These algorithms focus on energy efficiency, not on competing interests. The Prisoner's Dilemma is built on the assumption of competing interests (to betray or not), which may not apply here. Cooperation between the two to achieve maximum energy efficiency is the likely scenario, without the risk of betrayal.

**4.** Yes. In this scenario, both algorithms aim to maximize profit, creating a potential for conflict similar to the Prisoner's Dilemma. They might have to choose between cooperation for collective benefit and betrayal to outperform each other, making the game a reasonable model for this context.