# Getting started with tomsup

tomsup (Theory of Mind Simulation using Python) is a Python package for agent-based simulations. It includes 1) a framework for running agent-based simulations using 2 by 2 payoff matrices and, most notably, 2) an implementation of game theory of mind in an agent-based framework, following the implementation of [Devaine, et al. (2017)](http://dx.plos.org/10.1371/journal.pcbi.1005833).

This tutorial introduces the tomsup framework. For an introduction to the theory of mind (ToM) agent, see *introduction_to_tom.ipynb*.

It is also possible to create your own agents; for an introduction to this, see *creating_an_agent.ipynb*.

Lastly, we have created a brief introduction to each of the simpler agents; see *introduction_to_basic_agents.ipynb*.


In [2]:
# Assuming you are in the GitHub folder, change the path (not relevant if tomsup is installed via pip)
import os
os.chdir("..") # go out of the tutorials folder

In [3]:
import tomsup as ts

To get an overview of the available agents, you can use the following function. It gives you a brief description of each agent's strategy as well as a reference for further reading.

In [4]:
ts.valid_agents()

{'RB': {'name': 'Random Bias',
  'shorthand': 'RB',
  'example': 'RB(bias = 0.7)',
  'reference': 'Devaine, et al. (2017)',
  'strategy': 'Chooses 1 randomly based on a probability or bias'},
 'WSLS': {'name': 'Win-stay, lose-switch',
  'shorthand': 'WSLS',
  'example': 'WSLS()',
  'reference': 'Nowak & Sigmund (1993)',
  'strategy': 'If it win it chooses the same option again, if it lose it change to another'},
 'TFT': {'name': 'Tit-for-Tat',
  'shorthand': 'TFT',
  'example': 'TFT()',
  'reference': 'Shelling (1981)',
  'strategy': 'Intended the prisoners dilemma. It starts out cooperating and then simply copies it opponents action.'},
 'QL': {'name': 'Q-Learning Model',
  'shorthand': 'QL',
  'example': 'QL(learning_rate = 0.5, b_temp = 1)',
  'reference': 'Watkinns (1992)',
  'strategy': 'A simple reinforcement learning model, which is more choose e.g. 1 if 1 have previously been shown to yield positive result.'},
 'TOM': {'name': 'Theory of Mind',
  'shorthand': 'TOM',
  'example'

---
## 1) Creating an agent
First we will set up a Random Bias (RB) agent. This agent simply chooses randomly with a given bias.
There are two ways to set up an agent: either using the agent class (e.g. `RB`) or using the ```create_agents()``` function. We will start by calling the agent class `RB` directly. For a full list of valid agents, use ```ts.valid_agents()```.

In [5]:
jung = ts.RB(bias = 0.7, save_history = True) #calling the agent subclass RB - for more on save_history see '3) inspecting Agent and AgentGroup'

# Let's examine jung
print(f"jung is an instance of type: {type(jung)}")
if isinstance(jung, ts.Agent):
    print("but jung is also an instance of the parent class ts.Agent")

# let us have jung make a choice
choice = jung.compete()

print(f"jung chose {choice} and his probability for choosing 1 was {jung.get_bias()}.")

jung is an instance of type: <class 'tomsup.agent.RB'>
but jung is also an instance of the parent class ts.Agent
jung chose 0 and his probability for choosing 1 was 0.7.
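
To make the behaviour of a random bias agent concrete, here is a minimal sketch in plain Python. The class `SketchRB` and its internals are hypothetical and are not tomsup's `RB` implementation; only the idea (choose 1 with probability `bias`) is taken from the description above.

```python
import random


class SketchRB:
    """Hypothetical stand-in for a random bias agent (not tomsup's RB class)."""

    def __init__(self, bias=0.5, seed=None):
        if not 0.0 <= bias <= 1.0:
            raise ValueError("bias must be a probability between 0 and 1")
        self.bias = bias
        self._rng = random.Random(seed)  # private generator so runs are reproducible

    def compete(self):
        # Choose 1 with probability `bias`, otherwise choose 0.
        return 1 if self._rng.random() < self.bias else 0


agent = SketchRB(bias=0.7, seed=1)
choices = [agent.compete() for _ in range(10_000)]
share_of_ones = sum(choices) / len(choices)
print(f"share of 1s over {len(choices)} choices: {share_of_ones:.2f}")
```

With a bias of 0.7 a single draw can still come out as 0, as it did for jung above; only the long-run share of 1s approaches 0.7.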


As previously mentioned, you can also create agents using the `create_agents()` function. Here we will create skinner as a Q-learning agent, which is a simple reinforcement learning agent; see Watkins (1992) for more.

In [6]:
skinner = ts.create_agents(agents = "QL", start_params = {'save_history': True}) # create a reinforcement learning agent

Since skinner is a reinforcement learning agent, his compete function requires him to know which game he is playing, so that he can choose based on the payoffs. He also needs to know his opponent's move in the previous round, so that he can update his beliefs about his opponent's choices.
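
One common way to write such a learner is to keep a value estimate per action, nudge the chosen action's estimate toward the reward received, and pick actions with a softmax rule. The sketch below illustrates that idea only. The function names and the `temperature` parameter are hypothetical, and whether tomsup's `QL` agent uses exactly this update is an assumption. In tomsup the reward would follow from the payoff matrix and both agents' moves; here it is passed in directly.

```python
import math
import random


def softmax_prob_of_one(q_values, temperature=1.0):
    """Probability of choosing action 1 given value estimates for actions 0 and 1."""
    diff = (q_values[1] - q_values[0]) / temperature
    return 1.0 / (1.0 + math.exp(-diff))


def update(q_values, action, reward, learning_rate=0.5):
    """Move the estimate for the chosen action toward the observed reward."""
    new_q = list(q_values)
    new_q[action] += learning_rate * (reward - new_q[action])
    return new_q


q = [0.0, 0.0]                      # no preference before the first round
p_before = softmax_prob_of_one(q)   # both actions equally likely
q = update(q, action=1, reward=1)   # action 1 paid off, so its estimate rises
p_after = softmax_prob_of_one(q)    # action 1 is now the more likely choice

rng = random.Random(0)
choice = 1 if rng.random() < p_after else 0
print(q, round(p_before, 2), round(p_after, 2), choice)
```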

Let us have jung and skinner play the matching pennies game. We can fetch the game using the `PayoffMatrix` class:

In [7]:
help(ts.PayoffMatrix) # check which payoff matrices are implemented in tomsup

penny = ts.PayoffMatrix(name = "penny_competitive") # fetch the competitive matching pennies game.
#It is also possible to create a custom payoff matrix by inputting the desired values into the PayoffMatrix function

#print the payoff matrix
print(penny)

#fetch the underlying numpy matrix
print(penny.get_matrix())

Help on class PayoffMatrix in module tomsup.payoffmatrix:

class PayoffMatrix(builtins.object)
 |  PayoffMatrix(name, predefined=None)
 |  
 |  A class of 2 by 2 payoff matrices.
 |  
 |  Currently include the following games:
 |  The staghunt game: 
 |      'staghunt'
 |  The matching pennies game (coop and competive): 
 |      'penny_competive'
 |      'penny_cooperative'
 |  The party dilemma:
 |      'party'
 |  The Battle of the sexes:
 |      'sexes'
 |  The chicken game:
 |      'chicken'
 |  The deadlock:
 |      'deadlock'
 |  
 |  Example:
 |  >>> staghunt = PayoffMatrix(name="staghunt")
 |  >>> staghunt.payoff(action_agent0 = 1, action_agent1 = 1 , agent = 0)
 |  5
 |  >>> staghunt.payoff(action_agent0 = 1, action_agent1 = 0 , agent = 0)
 |  0
 |  >>> staghunt.payoff(action_agent0 = 0, action_agent1 = 1 , agent = 0)
 |  3
 |  >>> chicken = PayoffMatrix(name="chicken")
 |  >>> chicken.payoff(0, 1 , 0)
 |  -1
 |  >>> dead = PayoffMatrix(name="deadlock")
 |  >>> dead.payoff(1, 

Let us try to have skinner and jung compete in the matching pennies game:

In [7]:
jung_a = jung.compete() # a for action
skinner_a = skinner.compete(p_matrix = penny, agent = 1, op_choice = None) #Note that op_choice can be unspecified (or None) in the first round

jung_p = penny.payoff(action_agent0 = jung_a, action_agent1 = skinner_a, agent = 0)
skinner_p = penny.payoff(action_agent0 = jung_a, action_agent1 = skinner_a, agent = 1)

print(f"jung chose {jung_a} and skinner chose {skinner_a}, which results in a payoff for jung of {jung_p} and skinner of {skinner_p}.")
# Note that you might get different results simply by chance

jung chose 1 and skinner chose 0, which results in a payoff for jung of -1 and skinner of 1.
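
The payoffs printed in this tutorial are consistent with agent 0 winning when the two choices match and agent 1 winning when they differ. As a sketch, that lookup can be written as a plain nested tuple. The `payoff` function below is a hypothetical stand-in and not `ts.PayoffMatrix`; the values are inferred from the outputs shown in this tutorial, and the entry for both agents choosing 1 is assumed by symmetry.

```python
# Payoffs indexed as PAYOFFS[agent][choice_agent0][choice_agent1].
PAYOFFS = (
    ((1, -1), (-1, 1)),   # agent 0 wins when the two choices match
    ((-1, 1), (1, -1)),   # agent 1 wins when they differ
)


def payoff(action_agent0, action_agent1, agent):
    """Return the payoff for `agent` (0 or 1) given both actions (0 or 1)."""
    return PAYOFFS[agent][action_agent0][action_agent1]


# Print the full table: actions of both agents, then their payoffs.
for a0 in (0, 1):
    for a1 in (0, 1):
        print(a0, a1, payoff(a0, a1, agent=0), payoff(a0, a1, agent=1))
```

Because the game is competitive, the two payoffs in every cell sum to zero.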


---
## 2) Running a tournament
In the above case we saw how to have two agents compete for a single round. However, we rarely need only one round, and while the above functionality can be wrapped in a for loop, we have made a `compete()` function for convenience. In this section we will also examine the class `AgentGroup`, which allows you to run tournaments with multiple agents.
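
For intuition, wrapping single rounds in a for loop could look like the sketch below. It does not use tomsup: the two choice functions are hypothetical stand-ins (a biased coin for agent 0 and a win-stay, lose-switch rule for agent 1), and the payoff function assumes a competitive matching pennies game in which agent 0 wins on a match.

```python
import random

rng = random.Random(42)


def payoff(a0, a1, agent):
    """Competitive matching pennies: agent 0 wins on a match, agent 1 otherwise."""
    agent0_payoff = 1 if a0 == a1 else -1
    return agent0_payoff if agent == 0 else -agent0_payoff


def biased_choice(bias=0.7):
    """Choose 1 with probability `bias`."""
    return 1 if rng.random() < bias else 0


def wsls_choice(last_choice, last_payoff):
    """Win-stay, lose-switch: repeat after a win, switch after a loss."""
    if last_choice is None:          # first round: no history yet
        return rng.choice((0, 1))
    return last_choice if last_payoff > 0 else 1 - last_choice


rows = []
last_choice, last_payoff = None, None
for round_nr in range(30):
    a0 = biased_choice()
    a1 = wsls_choice(last_choice, last_payoff)
    p0, p1 = payoff(a0, a1, 0), payoff(a0, a1, 1)
    rows.append({"round": round_nr, "choice_agent0": a0, "choice_agent1": a1,
                 "payoff_agent0": p0, "payoff_agent1": p1})
    last_choice, last_payoff = a1, p1   # agent 1 remembers its own last round

total0 = sum(r["payoff_agent0"] for r in rows)
total1 = sum(r["payoff_agent1"] for r in rows)
print(f"agent 0: {total0}, agent 1: {total1}")
```

The bookkeeping (tracking the last move, collecting one row per round) is the part that a convenience function takes off your hands.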

Let us start by having the two agents compete for 30 rounds in the matching pennies game:

In [8]:
results = ts.compete(jung, skinner, p_matrix = penny, n_rounds = 30)
print(type(results))

jung_sum = results['payoff_agent0'].sum()
skinner_sum = results['payoff_agent1'].sum()

print(f"jung seemed to get a total of {jung_sum} points, while skinner got a total of {skinner_sum}.")

results.head() #inspect the first 5 rows of the df

<class 'pandas.core.frame.DataFrame'>
jung seemed to get a total of -16 points, while skinner got a total of 16.


| | round | choice_agent0 | choice_agent1 | payoff_agent0 | payoff_agent1 |
|---|---|---|---|---|---|
| 0 | 0 | 1 | 0 | -1 | 1 |
| 1 | 1 | 1 | 0 | -1 | 1 |
| 2 | 2 | 0 | 1 | -1 | 1 |
| 3 | 3 | 0 | 0 | 1 | -1 |
| 4 | 4 | 0 | 1 | -1 | 1 |


We see that the output of the compete function is a pandas dataframe. It is possible to change this to a list by specifying `return_val = "list"`, but having it as a dataframe allows for convenient methods such as `mean()` and `sum()`.

The above is the simplest possible use of the `compete()` function. You can also specify the number of simulations (`n_sim`), whether the agents should be reset after each simulation (`reset_agent`; this is recommended) and whether it should print which simulation it is running (`silent`).




In [9]:
results = ts.compete(jung, skinner, penny, n_rounds = 30, n_sim = 3, reset_agent = True, return_val = 'df', silent = False)
results.head()

Running simulation 1 out of 3
	Running simulation 2 out of 3
	Running simulation 3 out of 3


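| | n_sim | round | choice_agent0 | choice_agent1 | payoff_agent0 | payoff_agent1 |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 1 | 0 | -1 | 1 |
| 1 | 0 | 1 | 0 | 0 | 1 | -1 |
| 2 | 0 | 2 | 1 | 0 | -1 | 1 |
| 3 | 0 | 3 | 1 | 0 | -1 | 1 |
| 4 | 0 | 4 | 1 | 0 | -1 | 1 |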


**Note** that by adding simulations the dataframe now also has a column called 'n_sim', which indicates which simulation each row belongs to.
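
With the 'n_sim' column in place, results can be summarized per simulation. The sketch below does this in plain Python over a few made-up rows (illustrative values, not tomsup output) whose keys mirror the column names; with a pandas dataframe, grouping by 'n_sim' would play the same role.

```python
from collections import defaultdict

# Made-up rows for illustration only; the keys mirror the result columns.
rows = [
    {"n_sim": 0, "round": 0, "payoff_agent0": -1, "payoff_agent1": 1},
    {"n_sim": 0, "round": 1, "payoff_agent0": 1, "payoff_agent1": -1},
    {"n_sim": 1, "round": 0, "payoff_agent0": -1, "payoff_agent1": 1},
    {"n_sim": 1, "round": 1, "payoff_agent0": -1, "payoff_agent1": 1},
]

# Accumulate each agent's total payoff separately for every simulation.
totals = defaultdict(lambda: {"payoff_agent0": 0, "payoff_agent1": 0})
for row in rows:
    for key in ("payoff_agent0", "payoff_agent1"):
        totals[row["n_sim"]][key] += row[key]

for sim, total in sorted(totals.items()):
    print(sim, total)
```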

### AgentGroup
Now, as promised, let us take a look at tournaments with multiple agents. We will start off by creating a group of agents using a list of the desired agents as well as a list of their starting parameters. If you are in doubt about how to specify these, you can always get the starting parameters from an existing agent using `jung.get_start_params()`.

In [11]:
agents = ['RB', 'QL', 'WSLS'] # create a list of agents
start_params = [{'bias': 0.7}, {'learning_rate': 0.5}, {}] # create a list of their starting parameters (an empty dictionary {} simply assumes defaults)

group = ts.create_agents(agents, start_params) # create a group of agents
print(group)
print("\n----\n") # to space out the outputs

group.set_env(env = 'round_robin') # round_robin, i.e. each agent will play against all other agents

# make them compete
results = group.compete(p_matrix = penny, n_rounds = 20, n_sim = 2)
results.head() #examine the first 5 rows in results

<Class AgentGroup, envinment = None 

QL_0	 | 	{'bias': 0.7}
RB_0	 | 	{'learning_rate': 0.5}
WSLS_0	 | 	{}

----

Currently the pair, ('QL_0', 'RB_0'), is competing for 2 simulations, each containg 20 rounds.
	Running simulation 1 out of 2
	Running simulation 2 out of 2
Currently the pair, ('QL_0', 'WSLS_0'), is competing for 2 simulations, each containg 20 rounds.
	Running simulation 1 out of 2
	Running simulation 2 out of 2
Currently the pair, ('RB_0', 'WSLS_0'), is competing for 2 simulations, each containg 20 rounds.
	Running simulation 1 out of 2
	Running simulation 2 out of 2
Simulation complete


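| | n_sim | round | choice_agent0 | choice_agent1 | payoff_agent0 | payoff_agent1 | agent0 | agent1 |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 1 | -1 | 1 | QL_0 | RB_0 |
| 1 | 0 | 1 | 0 | 1 | -1 | 1 | QL_0 | RB_0 |
| 2 | 0 | 2 | 0 | 0 | 1 | -1 | QL_0 | RB_0 |
| 3 | 0 | 3 | 0 | 0 | 1 | -1 | QL_0 | RB_0 |
| 4 | 0 | 4 | 0 | 0 | 1 | -1 | QL_0 | RB_0 |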


As you can see, once the group is created and the environment is set, it is easy to have the agents compete with one another.

(For more possible environments, see `help(group.set_env)`.)
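
In a round robin every agent meets every other agent once, so n agents give n(n-1)/2 pairings. A minimal sketch of enumerating these pairs with the standard library follows. Whether tomsup builds its pairs this way internally is an assumption, although the result matches the pairs printed in the output above.

```python
from itertools import combinations

agent_names = ["QL_0", "RB_0", "WSLS_0"]

# Every unordered pair of distinct agents plays once.
pairs = list(combinations(agent_names, 2))
print(pairs)
# [('QL_0', 'RB_0'), ('QL_0', 'WSLS_0'), ('RB_0', 'WSLS_0')]
```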

---
## 3) Inspecting Agent and AgentGroup
So let's examine some of the attributes of the agents, which apply to all agents. In this section we will also look a bit into how to extract an agent from an `AgentGroup` and how to examine the environment.

In [0]:
# What if I want to know the starting parameters?
print("These are the starting parameters of jung: ",    jung.get_start_params()) # Note that it also prints out default parameters
print("These are the starting parameters of skinner: ", skinner.get_start_params())

# What if I want to know the agent's last choice?
print("This is jung's last choice: ",    jung.get_choice())
print("This is skinner's last choice: ", skinner.get_choice())

# What if I want to know the agent's strategy?
print("jung's strategy is: ", jung.get_strategy())
print("skinner's strategy is: ", skinner.get_strategy())

We can also get the history. Recall that we specified `save_history = True` for skinner (and jung). This means we can go back and see all of an agent's previous states; for the Random Bias (RB) agent this only includes the choice. The history is returned as a dataframe by default. Note that `save_history` is `False` by default to save memory.

In [0]:
# What is the history of jung and skinner (i.e. what are their choices and internal states)?

history = jung.get_history(format = "df")
print(history.head())

print("\n --- \n") # for spacing

history = skinner.get_history(format = "df")
print(history.head(15)) # the first 15 rows

In the above we can see the history of the two agents, including their internal states: there are none for the random bias agent (RB) and two for the reinforcement learning agent (QL). The two values for the QL agent indicate the expected value of choosing 0 and of choosing 1, respectively. As expected, we see that the expected value of choosing 0 increases due to jung's bias of 0.7.

**Note** that if the agents compete over multiple simulations, they are reset after each simulation; consequently, their history is also reset.
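
The interplay between saving history and resetting can be sketched as follows. The class `SketchAgent`, its `history` list and its `reset()` method are hypothetical illustrations and not tomsup's internals; only the `save_history` idea and the fact that a reset clears the history come from the text above.

```python
import random


class SketchAgent:
    """Hypothetical agent that optionally records its choices (not a tomsup class)."""

    def __init__(self, bias=0.5, save_history=False, seed=None):
        self.bias = bias
        self.save_history = save_history
        self._rng = random.Random(seed)
        self.history = []

    def compete(self):
        choice = 1 if self._rng.random() < self.bias else 0
        if self.save_history:        # skip the bookkeeping unless asked for
            self.history.append({"choice": choice})
        return choice

    def reset(self):
        # Start the next simulation from a clean slate.
        self.history = []


agent = SketchAgent(bias=0.7, save_history=True, seed=3)
for _ in range(5):
    agent.compete()
rounds_recorded = len(agent.history)     # one entry per round played

agent.reset()
rounds_after_reset = len(agent.history)  # nothing left after the reset
print(rounds_recorded, rounds_after_reset)
```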