# Game theory

Formalised by John von Neumann and Oskar Morgenstern in *Theory of Games and Economic Behavior* (1944) and widely applied during the Cold War


Game theory is the standard quantitative tool for analysing the interactions of **multiple decision makers**

The decision-making process is framed as a deterministic dynamical system, adding scientific rigor to the analysis of human behaviour

Games are contrived strategic scenarios, but they still shed light on human motivation and behaviour, and they have been the focus of extensive experimental research in areas ranging from social psychology to pure mathematics
 
We will explore Game Theory via a particular example that was first discussed in the 1950s: the prisoner's dilemma.

Of course many complex systems involve multiple decision makers and game theory is therefore a natural framework for modelling them
 
The interactions are *games* where each player chooses to 'cooperate' with the other or to 'defect', and where the reward/punishment depends on what the opponent chooses

<center> 
<img src="Complex_systems_organizational_map.jpeg" width="400"/>
</center>

Figure from Sayama

<!-- Figure 1.1: Visual, organizational map of complex systems science broken into seven topical areas. The three circles on the left (Nonlinear Dynamics, Systems Theory, and Game Theory) are the historical roots of complex systems science, while the other four circles (Pattern Formation, Evolution and Adaptation, Networks, and Collective Behavior) are the more recently studied topical areas. -->

# Game
<!-- Set - a collection of things (things could be sets themselves). Think 'container' -->
A game specifies who makes the decisions, what decisions they can make, and what reward each player gets as a result of the decisions of *all* players

Formally:
$$
G = \{P, A, U\}
$$
- $P$: players
    - A set of decision makers in a game $\{p_1, p_2, \ldots, p_n\}$ (this is an n-player game)
- $A$: actions 
    - A set that tells us what each player can do in the game $\{a_1,a_2, \ldots, a_n\}$
    - e.g. the actions of player 1 $a_1=\{\text{up}, \text{down}\}$
- $U$: utility/payoff 
    - A set of rewards (or punishments) $\{u_1, u_2, \ldots, u_n\}$
    - Each element is a function for each player
    - e.g. the utility of player 1 in a 2-player game if $p_1$ plays the action $\text{up}$ and $p_2$ plays the action $\text{down}$ is $u_1(\text{up}, \text{down})=2$
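The triple can be sketched in Python as follows. This is only an illustration: the names `players`, `actions`, `known_payoffs_p1` and `game` are hypothetical, and the only payoff filled in is the single value quoted above, $u_1(\text{up}, \text{down})=2$.

```python
# Players P: labels for the decision makers.
players = ["p1", "p2"]

# Actions A: one collection of available actions per player.
actions = {
    "p1": ("up", "down"),
    "p2": ("up", "down"),
}

# Utility U: one function per player, each taking the actions of ALL players.
# Only the value quoted in the text is known; other profiles are left undefined.
known_payoffs_p1 = {("up", "down"): 2}

def u1(a1, a2):
    """Utility of player 1 for the action profile (a1, a2)."""
    return known_payoffs_p1[(a1, a2)]

game = {"P": players, "A": actions, "U": {"p1": u1}}
print(game["U"]["p1"]("up", "down"))  # 2
```

Note that `u1` takes both players' actions as arguments: a player's reward depends on everyone's choice, not only their own.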



# Prisoner's dilemma

This is the most well-known game within Game Theory

Two criminals $P=\{p_1,p_2\}$ are caught by police but the police don't have enough evidence to charge them for the full crime

They are interrogated separately and can either cooperate ($C$) and stay silent/protect or defect ($D$) and blab/betray, i.e. $A=\{\{C,D\},\{C,D\}\}$ 

The outcome is a certain number of years in jail, i.e. $U=\{u_1,u_2\}$ with $u_1(a_1,a_2)=\text{time in jail}$ for $p_1$
<!-- <center> 
<img src="MidJ_Prisoners.png" width="400"/>
</center> -->

## The rules of the game
The police offer the following Faustian bargain (Faust sold his soul to the devil for unlimited knowledge and worldly pleasures. His name is associated with sacrificing spiritual values for power, knowledge, or material gain.):
- If $p_1$ betrays but $p_2$ remains silent, $p_1$ will be set free and $p_2$ will serve 3 years in prison and vice versa $u_1(D,C)=0$, $u_1(C,D)=3$
- If $p_1$ and $p_2$ betray each other, they'll both serve 2 years $u_1(D,D)=2$
- If $p_1$ and $p_2$ both remain silent, they'll both only serve 1 year $u_1(C,C)=1$

Player 2's payoffs mirror player 1's, i.e. $u_2(a_1,a_2)=u_1(a_2,a_1)$, so this is referred to as a symmetric game
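As a sketch, the rules can be written as a lookup table of years in jail for $p_1$, with $p_2$'s payoff obtained by swapping the roles. The names `YEARS_P1`, `u1` and `u2` are illustrative choices, and lower values are better because the payoff is time in jail.

```python
# Years in jail for player 1, indexed by (a1, a2); lower is better.
YEARS_P1 = {
    ("C", "C"): 1,
    ("C", "D"): 3,
    ("D", "C"): 0,
    ("D", "D"): 2,
}

def u1(a1, a2):
    """Years in jail for player 1."""
    return YEARS_P1[(a1, a2)]

def u2(a1, a2):
    """Years in jail for player 2: same table with the roles swapped."""
    return YEARS_P1[(a2, a1)]

for profile in YEARS_P1:
    print(profile, (u1(*profile), u2(*profile)))
```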

Note that confess ($C$)/don't confess ($D$) are also used (confusingly) and that the utility is sometimes represented as time saved off your sentence

## Assumptions

The players are **rational utility maximisers** 
- **Rational** means they don't make mistakes when they choose the action
- **Utility maximiser** means they always choose the action that's best for them

    Within this major assumption are the following:
    - Both players understand the game
    - Players prefer the better payoff 
    - No other considerations (e.g. reputation, loyalty, opportunity for retribution)
    - No communication (no negotiating, making promises or threats)

i.e. players are only concerned with the immediate goal of minimising their own sentence

Behavioural Game Theory relaxes these assumptions.

## Payoff matrix (normal form)
Defined over strategy space $S=\{C,D\}\times\{C,D\}=\{(C,C),(C,D),(D,C),(D,D)\}$
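The strategy space is a Cartesian product, which can be generated directly. This is a small sketch using the standard-library `itertools.product`; the variable names are illustrative.

```python
from itertools import product

actions_p1 = ("C", "D")
actions_p2 = ("C", "D")

# Cartesian product of the two action sets: every joint choice of the players.
strategy_space = list(product(actions_p1, actions_p2))
print(strategy_space)
# [('C', 'C'), ('C', 'D'), ('D', 'C'), ('D', 'D')]
```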

All of this can be summarised as:

<center> 
<img src="StrategySpace_PD.png" width="300"/>
</center>

Player: Time in prison

Highly stylised and parsimonious description (abstract the vital information and suppress the irrelevant)

Alternatively we can represent things in payoff space

<center> 
<img src="PayoffSpace_PD.png" width="400"/>
</center>

Points are linked by a player's choice (e.g. the line connecting (0,3) and (2,2) joins the two outcomes in which player 1 defects; player 2's choice moves between them)

(The dual of the normal form)

## So what actions should the players choose?
From $p_1$'s perspective (same goes for $p_2$ from symmetry)
<center> 
<img src="StrategySpace_PD.png" width="300"/>
</center>

<!-- What should $p_1$ do? -->

- If $p_2$ cooperates: $p_1$ is better off defecting; she would go free rather than serve 1 year
- If $p_2$ defects: $p_1$ is still better off defecting; she would serve only 2 years rather than 3
- No matter what $p_2$ does, $p_1$ is better off defecting 


Defecting is the best choice in all cases
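The same case-by-case reasoning can be checked mechanically. The sketch below assumes the jail-time table from the rules of the game (so smaller is better); `best_response` and `strictly_dominates` are hypothetical helper names.

```python
YEARS_P1 = {("C", "C"): 1, ("C", "D"): 3, ("D", "C"): 0, ("D", "D"): 2}
ACTIONS = ("C", "D")

def best_response(opponent_action):
    """Action of player 1 that minimises her jail time against a fixed opponent action."""
    return min(ACTIONS, key=lambda a: YEARS_P1[(a, opponent_action)])

def strictly_dominates(a, b):
    """True if a gives player 1 strictly less jail time than b whatever player 2 does."""
    return all(YEARS_P1[(a, o)] < YEARS_P1[(b, o)] for o in ACTIONS)

print({o: best_response(o) for o in ACTIONS})  # {'C': 'D', 'D': 'D'}
print(strictly_dominates("D", "C"))            # True
```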

Equally, you could assume a particular set of actions (position of the normal form) and assess whether a player would opt to change their mind when informed what the other player has chosen. 

This is very clear in the payoff space when lines are replaced with arrows:

<center> 
<img src="PayoffSpace_Analysed.png" width="300"/>
</center>

The temptation for competitive advantage is strong!

## Nash equilibrium 
Defecting is the (strictly) *'dominant strategy'* for each player, so both players defecting, $(D,D)$, is the **Nash equilibrium**

The Nash equilibrium is often phrased as 'No player has an incentive to unilaterally deviate', i.e. there's nothing to gain by changing personal strategy. It is named for Nobel Laureate John Nash

'Unilaterally' $\implies $ a player can only change their own strategy (not that of others).

Formally:

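Let the actions of $p_1$ and $p_2$ be $a_1=\{\alpha_1,\alpha_2, \ldots, \alpha_m\}$ and $a_2=\{\beta_1,\beta_2, \ldots, \beta_k\}$ respectively.

Then $(\alpha^*, \beta^*)$ is a Nash equilibrium if:
$$
u_1(\alpha^*,\beta^*)\geq u_1(\alpha',\beta^*) \quad \forall \alpha'\in a_1
$$

and 

$$
u_2(\alpha^*,\beta^*)\geq u_2(\alpha^*,\beta') \quad \forall \beta' \in a_2
$$

Here a larger utility is better; if the utility is time in jail, the inequalities are reversed. If the inequalities are strict for every $\alpha'\neq\alpha^*$ and $\beta'\neq\beta^*$, the equilibrium is called strict.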

Note that an action profile fails to be a Nash equilibrium as soon as one player has an incentive to deviate; at $(C,C)$ in the prisoner's dilemma, both players do
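A brute-force check of the definition for the prisoner's dilemma might look like the sketch below. It assumes the jail-time table from the rules of the game, negated so that a larger utility is better, and uses the weak inequality $\geq$; `is_nash` is a hypothetical helper name.

```python
from itertools import product

YEARS_P1 = {("C", "C"): 1, ("C", "D"): 3, ("D", "C"): 0, ("D", "D"): 2}
ACTIONS = ("C", "D")

def u1(a1, a2):
    return -YEARS_P1[(a1, a2)]   # negate jail time so that larger is better

def u2(a1, a2):
    return -YEARS_P1[(a2, a1)]   # symmetric game: swap the roles

def is_nash(a1, a2):
    """No unilateral deviation improves either player's utility."""
    p1_ok = all(u1(a1, a2) >= u1(alt, a2) for alt in ACTIONS)
    p2_ok = all(u2(a1, a2) >= u2(a1, alt) for alt in ACTIONS)
    return p1_ok and p2_ok

equilibria = [s for s in product(ACTIONS, ACTIONS) if is_nash(*s)]
print(equilibria)  # [('D', 'D')]
```

Each profile is tested by holding the other player's action fixed, which is exactly what 'unilaterally' means.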

<!-- Khan Academy has a tutorial: https://www.khanacademy.org/economics-finance-domain/microeconomics/nash-equilibrium-tutorial -->

## What about the greater good?
There is nothing the prisoners can do to achieve the outcome they both prefer: mutual cooperation (a **Pareto Optimal** outcome that leaves both better off than the Nash equilibrium)
    
People acting in their own self-interest do not create the best outcome in this game
    
The irony is they are 'hurting themselves when they're actually only thinking about themselves'
    
For example, the tragedy of the commons arises because shared (common-pool) resources are prone to over-use when each user acts in their own interest.
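Pareto comparisons can also be checked by enumeration. In this sketch (jail-time table assumed from the rules of the game, lower is better; `pareto_dominates` is a hypothetical helper), an outcome is Pareto optimal if no other outcome is at least as good for both players and strictly better for one. Under that definition mutual defection is the only outcome that is not Pareto optimal, and it is dominated by mutual cooperation.

```python
from itertools import product

YEARS_P1 = {("C", "C"): 1, ("C", "D"): 3, ("D", "C"): 0, ("D", "D"): 2}
ACTIONS = ("C", "D")
PROFILES = list(product(ACTIONS, ACTIONS))

def years(profile):
    """Jail time (player 1, player 2) for an action profile; lower is better."""
    a1, a2 = profile
    return (YEARS_P1[(a1, a2)], YEARS_P1[(a2, a1)])

def pareto_dominates(x, y):
    """x is at least as good as y for both players and strictly better for one."""
    yx, yy = years(x), years(y)
    return all(a <= b for a, b in zip(yx, yy)) and any(a < b for a, b in zip(yx, yy))

pareto_optimal = [p for p in PROFILES
                  if not any(pareto_dominates(q, p) for q in PROFILES)]
print(pareto_optimal)                            # [('C', 'C'), ('C', 'D'), ('D', 'C')]
print(pareto_dominates(("C", "C"), ("D", "D")))  # True
```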
 

## Generalising the payoff

<center> 
<img src="General_PD.png" width="400"/>
</center>

In general, if: 
$$S<P<R<T$$
then the game is classified as a prisoner's dilemma.

- $R>P$ implies that mutual cooperation is superior to mutual defection
- $T>R$ and $P>S$ imply that defection is the dominant strategy for both agents.

Note: '$>$' here implies 'better', which will depend on the question context (e.g. if the utility is time in jail vs time spared from jail)
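The ordering test is easy to automate. The sketch below assumes the conventional reading of the symbols ($T$: defect while the other cooperates, $R$: mutual cooperation, $P$: mutual defection, $S$: cooperate while the other defects); the function name and the `higher_is_better` flag are hypothetical, and the point values in the second and third calls are illustrative only.

```python
def is_prisoners_dilemma(T, R, P, S, higher_is_better=True):
    """Check the ordering S < P < R < T, where '<' means 'worse than'."""
    if not higher_is_better:
        # e.g. years in jail: flip the sign so that larger means better
        T, R, P, S = -T, -R, -P, -S
    return S < P < R < T

# Jail-time example from the rules of the game: T=0, R=1, P=2, S=3 years.
print(is_prisoners_dilemma(T=0, R=1, P=2, S=3, higher_is_better=False))  # True
print(is_prisoners_dilemma(T=5, R=3, P=1, S=0))                          # True
print(is_prisoners_dilemma(T=3, R=5, P=1, S=0))                          # False
```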

## Prisoner's dilemmas outside of the prison
Many social interactions can be modelled as variations on the Prisoner's Dilemma

Some classic examples include:
- Business competition between two companies deciding whether to lower their prices to attract more customers or keep prices high to maximise profits
- Environmental conservation between neighbouring countries that share a common water resource
- Arms race between countries building up their military capabilities 
- Traffic congestion: https://www.youtube.com/watch?v=cALezV_Fwi0
- International diplomacy
- Common-pool resource management
- ...


And if it doesn't fit the Prisoner's Dilemma framework then there is another game it will fit!
<center> 
<img src="2x2Games.jpeg" width="400"/>
</center>

https://upload.wikimedia.org/wikipedia/commons/3/32/2x2chart110602.pdf

A systematic approach to (strict/indifferent ordinal) $2\times 2$ games was provided by Robinson and Goforth [2005]
<!-- Indifferent rules out ties -->

Payoffs are strictly ordinal (no ties), so each player has $4!=24$ ways to rank the four outcomes

Since the rankings of Player 1 and Player 2 are independent, the total number of possible strict ordinal 2x2 games is $24\times 24=576$

Many games are essentially the same when considering symmetries:
- No swaps (original game)
- Swap Player 1's strategies (flip rows)
- Swap Player 2's strategies (flip columns)
- Swap both Player 1 and Player 2's strategies (flip both rows and columns).

Hence, there is a factor of 4 reduction, and the number of distinct strict ordinal 2x2 games is $576/4=144$.
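The count can be verified by brute force. The sketch below enumerates every pair of strict rankings and groups games that differ only by flipping rows and/or columns; the encoding (cells listed row by row, ranks 1 to 4) and the helper names are assumptions made for illustration.

```python
from itertools import permutations

CELLS = [(r, c) for r in range(2) for c in range(2)]

def all_games():
    """Each game pairs a strict ranking 1..4 of the four cells for each player."""
    for rank1 in permutations(range(1, 5)):
        for rank2 in permutations(range(1, 5)):
            yield tuple((rank1[i], rank2[i]) for i in range(4))

def relabel(game, flip_rows, flip_cols):
    """Return the game after optionally swapping the rows and/or the columns."""
    payoff = dict(zip(CELLS, game))
    return tuple(payoff[(r ^ flip_rows, c ^ flip_cols)] for r, c in CELLS)

def canonical(game):
    """Smallest of the four relabelled versions, used as a label for the class."""
    return min(relabel(game, fr, fc) for fr in (0, 1) for fc in (0, 1))

games = list(all_games())
classes = {canonical(g) for g in games}
print(len(games), len(classes))  # 576 144
```

The division by 4 is exact because, with strict rankings, no game is left unchanged by a flip, so every class contains exactly four games.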

In payoff space:

<center> 
<img src="PatternsWirings.png" width="400"/>
</center>


# Reality
Game theory tells us what a perfectly rational agent *should* do
    
It is much harder to predict what real people *actually* do

<!-- I remember doing this in my class once. If everyone raises their hand, everyone gets 1 candy, if no one raises, everyone gets 2, if 1 raises, he gets 10 and no one else gets anything, if more than one raises, no one gets anything. It was a fun day. We were all betrayed. -->

See this 'real life' example, the Golden Balls: https://www.youtube.com/watch?v=S0qjK3TWZE8



# Is it so surprising that people are not rational agents?

How can we explain people not doing what Game Theory says they should?
- people are not smart enough to understand the scenario?
- people are knowingly acting contrary to their own interest?
- something else?

We will take a small detour via evolution and genetic algorithms and come back to this 'something else'...

# Evolution
To evolve is to change over time

In biology, evolution refers to the gradual process by which species change genetically across generations i.e. it's an aggregate effect: *individuals don't evolve, populations do*

The theory of evolution claims that new species are created and existing species change due to **natural selection**

Natural selection is the process where inherited variations between individuals cause differences in survival and reproduction

## Darwin

<center> 
<img src="DarwinsFinches.jpeg" width="300"/>
</center>

Finches from the Galapagos inspired Darwin during his visit to the islands in 1835: They are thought to have evolved from a single finch species that came to the islands more than a million years ago. Differences between species are in the size and shape of their beaks, which are highly adapted to different food sources available on the different islands

Darwin proposed that all organisms share a common ancestor (current estimates place it roughly 3.5 to 4 billion years ago) and that there's no hierarchy -- each organism alive now is the most evolved of its kind. 

### Beware the linear depiction of ape to man!

<center> 
<img src="Evolution_ApeMan_Linear.png" width="300"/>
</center>

Image: http://creationwiki.org/Devolution

Many variations of this picture - some have a humorous spirit, but most aim to ridicule the monkey to man theory

Linear depictions like this may confirm false preconceptions about evolution, such as intelligent design (the idea that life has an intelligent creator behind it)
<!-- Evolution is not some goal-directed process with humans as the final product/pinnacle -->

The reality is that each is a unique species with its own evolutionary path and humans are no more special than any other primate.

### Evolution is not linear

<center> 
<img src="Evolution_ApeMan_Proper.jpeg" width="300"/>
</center>

Image from: doi:10.1038/nature01400
 
Evolution is a process of continuous branching and divergence of populations of organisms

Humans did not evolve from chimpanzees or any of the other great apes that live today

We share a common ancestor that lived $\approx$10 million years ago - asking why modern chimps don't look more like humans is like asking why the children of your cousins don't look more like you than their parents - they're on an entirely different evolutionary path.

<!-- Some good reading and some really bad reading on this topic exists. Do your own homework.  -->
<!-- https://www.washingtonpost.com/news/speaking-of-science/wp/2016/07/25/dear-science-answers-your-questions-about-evolution/ -->

## Explaining global/macroscopic observations
i.e. the output

The theory of evolution is powerful because it explains phenomena we see in the natural world:
- **Adaptation**: Natural selection favors adaptations that best enable creatures to survive under the circumstances in which they live. (This *does not* mean they should all be becoming more human!)
<!--     - How ridiculously elitist to think we are the goal and culmination of 4.5 billion years of evolution.
    - We continue to evolve - lactose tolerance came long after we were making cheese. Malaria resilience in parts of the world etc.
    - Our species could continue to evolve and change for many millennia to come, but we could just as easily go extinct, as have 99 percent of all life forms that came before us.
    - Many features of natural systems seem as if they were designed. -->
- **Increasing diversity**: Over time the number of species on earth has generally increased (despite periods of mass extinction)
- **Increasing complexity**: The history of life on earth starts with relatively simple life forms, with more complex organisms appearing later in the geological record.


## Mechanisms for evolution
i.e. the inputs required

<!-- Gradual change happens on timescales that vary from organism to organism based on the length of a generation -->

To model evolution the following three mechanisms are sufficient:
1. **Variation**: variability in the population, i.e. differences between individuals
2. **Differential survival or reproduction**: differences between individuals affect their ability to survive or reproduce
3. **Replicators**: a population of agents that can reproduce in some way (perfect copies of themselves or imperfect copying, i.e. mutation)

# Modelling evolution

The (typical) process is:
- Agents have genetic information, called their "genotype" 
- Based on their genotype a "fitness" is calculated
- Some agents die
- Some agents are born 

There are modelling decisions and choices everywhere here!

e.g.:
- What is the genotype?
- What is fitness? How does it depend on the agent genotype?
- Who dies - dependent on what? Fitness? How many die?
- Who is born - how many? What are their properties? Are they children of existing agents? Who gets to reproduce?
- etc etc etc...
## An example
https://www.complexity-explorables.org/explorables/maggots-in-the-wiggle-room/

### Fitness
Evolution is a change in a population's distribution of genotypes (may be hard to visualise)

As genotypes change, we expect fitness to change 

Therefore, we use changes in the distribution of fitness as evidence of evolution (easier to visualise as it's just a single value)

##### Fitness landscapes

We saw fitness landscapes in the context of particle swarm optimisation

In genetic algorithms the fitness landscape, sometimes called an evolutionary landscape, is a function that maps genotype to fitness
<!-- a relationship between genotype and fitness is needed for evolution but it can be any relationship -- even a totally random fitness landscape will work. -->
        
- The agent's fitness is the 'height' of the landscape at that location, which relates to an agent's ability to survive or reproduce
- Increasing fitness (of the population) means that the species is getting better at surviving in its environment
- In biological terms, the fitness landscape represents information about how the genotype of an organism is related to its physical form and capabilities, called its phenotype, and how the phenotype interacts with its environment

##### Remember...
Agents don't move over the fitness landscape (the genotype of an organism doesn't change)
    
When an agent dies, it can leave a location unoccupied. And when a mutation occurs, it can occupy a new location 
        
As agents disappear from some locations and appear in others, the *population* migrates across the landscape (like a glider in Game of Life) 
        
i.e. organisms don't evolve; populations do

## Building the required mechanisms for adaptation into the model
1. Variation: initialise the population with a variety of genotypes (or could simply rely on mutation)
2. Differential survival and reproduction: genotype is mapped to fitness and a function converts the fitness to agent's ability to survive or reproduce
3. Replication: the genotype is copied by agents that reproduce. Mutation of the genotype is included to increase diversity
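The three mechanisms can be combined into a very small simulation. Everything specific here is an assumption made for illustration: genotypes are bit strings, the fitness landscape is simply the fraction of 1s, survival probability is `0.5 + 0.4 * fitness`, and the parameter values are arbitrary.

```python
import random

N_LOCI, N_AGENTS, MUTATION_RATE = 8, 100, 0.05

def fitness(genotype):
    """Hypothetical landscape: fraction of 1s in the genotype (between 0 and 1)."""
    return sum(genotype) / len(genotype)

def mutate(genotype, rng):
    """Imperfect copy: each locus flips with probability MUTATION_RATE."""
    return tuple(g ^ (rng.random() < MUTATION_RATE) for g in genotype)

def step(population, rng):
    # Differential survival: fitter agents are more likely to survive.
    survivors = [g for g in population if rng.random() < 0.5 + 0.4 * fitness(g)]
    if not survivors:              # guard against total extinction
        survivors = population[:1]
    # Replication: refill the population with mutated copies of survivors.
    births = [mutate(rng.choice(survivors), rng)
              for _ in range(len(population) - len(survivors))]
    return survivors + births

def run(generations=200, seed=0):
    rng = random.Random(seed)
    # Variation: start from random genotypes.
    population = [tuple(rng.randint(0, 1) for _ in range(N_LOCI))
                  for _ in range(N_AGENTS)]
    history = []
    for _ in range(generations):
        history.append(sum(map(fitness, population)) / len(population))
        population = step(population, rng)
    return history

history = run()
print(round(history[0], 2), round(history[-1], 2))  # mean fitness: first vs last generation
```

Surviving agents keep their genotype unchanged; only newborn copies can mutate, so it is the population, not any individual, that moves across the landscape.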

### How would you test these mechanisms?

They should result in: adaptation, increasing diversity, increasing complexity

For example:
- Increasing fitness would mean that the species is getting better at surviving in its environment (and hence adapting)
- Diversity could be captured by number of/distribution of genotypes
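These checks reduce to a few summary statistics recorded each generation. The sketch below shows one possible set; the function names are hypothetical, and Shannon entropy of the genotype distribution is just one of several reasonable diversity measures.

```python
from collections import Counter
from math import log

def mean_fitness(population, fitness):
    """Average fitness of the population under a given fitness function."""
    return sum(fitness(g) for g in population) / len(population)

def n_distinct(population):
    """Number of different genotypes present."""
    return len(set(population))

def shannon_diversity(population):
    """Entropy of the genotype distribution (0 when all agents are identical)."""
    counts = Counter(population)
    n = len(population)
    return -sum((c / n) * log(c / n) for c in counts.values())

population = [(0, 1), (0, 1), (1, 1), (1, 0)]
print(n_distinct(population))                               # 3
print(mean_fitness(population, lambda g: sum(g) / len(g)))  # 0.625
print(round(shannon_diversity(population), 3))              # 1.04
```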

This model is not meant to be realistic - evolution in natural systems is obviously much more complicated than this
        
Rather, it is a demonstration that the features of the model are *sufficient* to produce the behaviour we are trying to explain

This doesn't prove that evolution in nature is caused by these mechanisms alone. But it is reasonable to think that they at least contribute to natural evolution
                
Similarly, it doesn't prove that these mechanisms always cause evolution. But the results are fairly robust.

# Altruistic genes (and selfish ones) with an evolutionary model

What do you expect of this computational model...?:

>*In an evolving population where some agents have genes that encourage them to help others, even to their own detriment, and other agents are purely selfish...*

>*... it seems like the selfish ones would benefit, the altruistic ones would suffer, and the genes for altruism would be driven to extinction -- Allen Downey*


This apparent conflict (between natural selection, which suggests that animals live in a state of constant competition, and altruism, which is the tendency of many animals to help other animals, even to their own detriment) is the 'problem/paradox of altruism'

Among biologists, there are many possible explanations, including reciprocal altruism, sexual selection, kin selection, and group selection. 

# Iterated prisoner's dilemma

There is no altruism in the Prisoner's Dilemma, and yet, when real people play these games we frequently observe them not chasing maximal utility in a rational way

Here's a third option for why people are not rational: altruism is simply adaptive (Occam's razor - again!) i.e. genes for altruism make people more likely to survive and reproduce

Robert Axelrod's tournament played many different strategies against each other in *repeated games* (200 rounds)

14 different strategies (+ random) were submitted by participants

<center> 
<img src="Axelrod_PD.png" width="300"/>
</center>

## Successful strategies
Successful strategies in the tournament had the following properties:
- **Nice**: The strategies that do well cooperate during the first round, and generally cooperate as often as they defect in subsequent rounds (in Axelrod's terms, a nice strategy is never the first to defect)
- **Retaliating**: Strategies that cooperate all the time did not do as well as strategies that retaliate if the opponent defects (avoiding further losses and disincentivising defecting) 
- **Forgiving**: But strategies that were too vindictive tended to punish themselves as well as their opponents
- **Non-envious**: Some of the most successful strategies seldom outscore their opponents and aren't tempted; they are successful because they do well enough against a wide variety of opponents

A brain that is wired to be nice, tempered by a balance of retaliation and forgiveness, with foresight beyond the immediate future will tend to do well in a wide variety of circumstances.

### Tit for Tat
One strategy, called Tit for Tat, cooperated in the first round and then copied the opponent's move from the previous round of play
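A minimal sketch of Tit for Tat in a repeated game is given below. The per-round points (5 for defecting against a cooperator, 3 each for mutual cooperation, 1 each for mutual defection, 0 for cooperating against a defector) are an assumption, as are the function names; here higher scores are better.

```python
# Points per round (higher is better); these particular values are an assumption.
POINTS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

def tit_for_tat(my_history, their_history):
    """Cooperate first, then copy the opponent's previous move."""
    return "C" if not their_history else their_history[-1]

def always_defect(my_history, their_history):
    return "D"

def play_match(strategy_a, strategy_b, rounds=200):
    """Play a repeated game and return the total points of each player."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        pts_a, pts_b = POINTS[(move_a, move_b)]
        score_a += pts_a
        score_b += pts_b
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

print(play_match(tit_for_tat, tit_for_tat))    # (600, 600)
print(play_match(tit_for_tat, always_defect))  # (199, 204)
```

Against a pure defector, Tit for Tat loses only the first round and then matches its opponent, so it never outscores the player it faces but scores highly whenever cooperation is returned.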

Does Tit for Tat have the properties of a successful strategy? (nice, retaliatory, forgiving, non-envious)
<!-- nice - cooperates in first round
retaliates - tit for tat
forgiving - tit for tat. But being more forgiving can be even better e.g. tit for 2tat avoids echo effect but can be taken advantage of once people know this strategy exists
non-envious - can only tie or lose against a single opponent -->

This strategy actually won the tournament but better strategies have been uncovered since.
<center> 
<img src="Iterated_Prisoners_Dilemma_Venn-Diagram.svg" width="600"/>
</center>





<center> 
<img src="Axelrod_TournamentResults.png" width="600"/>
</center>

## Evolving a successful strategy

Axelrod's tournaments and the strategies that were played were designed by people - they didn't evolve 
        
Assume the points accrued at the end of one of Axelrod's tournaments are a kind of resource that can be used for reproduction

- Genotype: iterated PD strategy (we'll need a way to encode the PD strategy as a genotype)
- Generation: an Axelrod tournament (200 rounds against each opponent)
- Fitness: points accrued
- Reproduction: function of fitness such that the number of players with successful strategies increases in successive generations
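One way these four ingredients could fit together is sketched below. All specifics are assumptions for illustration: the genotype is a three-letter tuple (first move, reply to the opponent's C, reply to the opponent's D), the per-round points are 5/3/1/0, parents are sampled in proportion to fitness, and the mutation rate and population size are arbitrary.

```python
import random

POINTS = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}  # assumed values

# Genotype: (first move, reply to opponent's C, reply to opponent's D).
TIT_FOR_TAT = ("C", "C", "D")
ALWAYS_DEFECT = ("D", "D", "D")

def move(genotype, their_last):
    if their_last is None:
        return genotype[0]
    return genotype[1] if their_last == "C" else genotype[2]

def match(g_a, g_b, rounds=200):
    last_a = last_b = None
    score_a = score_b = 0
    for _ in range(rounds):
        m_a, m_b = move(g_a, last_b), move(g_b, last_a)
        p_a, p_b = POINTS[(m_a, m_b)]
        score_a, score_b = score_a + p_a, score_b + p_b
        last_a, last_b = m_a, m_b
    return score_a, score_b

def tournament(population):
    """Fitness: total points of each agent against every other agent."""
    fitness = [0] * len(population)
    for i in range(len(population)):
        for j in range(i + 1, len(population)):
            s_i, s_j = match(population[i], population[j])
            fitness[i] += s_i
            fitness[j] += s_j
    return fitness

def mutate(genotype, rng, rate=0.01):
    flip = {"C": "D", "D": "C"}
    return tuple(flip[g] if rng.random() < rate else g for g in genotype)

def next_generation(population, rng):
    """Reproduction: parents are sampled in proportion to their fitness."""
    fitness = tournament(population)
    parents = rng.choices(population, weights=fitness, k=len(population))
    return [mutate(p, rng) for p in parents]

rng = random.Random(1)
population = [TIT_FOR_TAT] * 10 + [ALWAYS_DEFECT] * 10
for _ in range(20):
    population = next_generation(population, rng)
print(sum(g == TIT_FOR_TAT for g in population), "of", len(population), "play Tit for Tat")
```

This encoding only remembers the opponent's last move, so it covers eight strategies; longer memories need a longer genotype.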


## Analysis
What sorts of questions might we try and answer with a computational model of the tournament?
- Can genes for niceness, retribution, and forgiveness appear by mutation? 
- Can they successfully invade a population of other strategies?
- Can they resist being invaded by subsequent mutations?

I'll leave it with you to think about modelling this to evolve a winning strategy and the sorts of behaviours that might emerge from this Complex System.

```python
print("Lectures completed successfully!")
```


>*"Tiger got to hunt, bird got to fly; Man got to sit and wonder 'why, why, why?' Tiger got to sleep, bird got to land; Man got to tell himself he understand."* - Kurt Vonnegut

I hope that you've enjoyed this unit and that you think about it and what you have learned as you go into the next phase of your working life and education

Good luck for the upcoming Test and your final Project submissions :)