# Take-home assignment: Agent-based Cognitive Modelling

Please complete each of the conceptual questions in a markdown cell (in written text), and each of the coding questions using code cells (in combination with markdown cells if a written part is also required for answering the question).

It is important that you complete each of the exercises in this take-home assignment **individually**. If we see signs of answers being shared between students, we will investigate.

**Exercise 1 (Conceptual question):**

This question is about agent-based cognitive modelling in general.

Below are four research questions. For each of these, write down: 
- Whether or not you think _agent-based_ modelling would be a sensible approach to address that research question, and explain why. 
- If your answer is that agent-based modelling would be a sensible approach, also write down whether the agents in this model would have to be _cognitive_ agents, and explain why.

1. Is categorisation in humans exemplar-based or feature-based? (In simple terms, _exemplar-based_ means: if a novel stimulus is similar to other dogs I've seen, it's probably a dog, whereas _feature-based_ means: if it barks, has four legs, and a wagging tail, it's probably a dog.)


2. Do more extreme ideas spread through a population more quickly than more moderate ideas?


3. When two individuals do a task like moving a sofa together (i.e., _joint action_), how do they coordinate?


4. How does eye colour spread through a population? (Assuming, for example, that the allele for blue eyes is recessive and the allele for brown eyes is dominant.)

**Exercise 2 (Conceptual question):**

This question is about game theory and pay-off matrices as a representation of social coordination situations.

Imagine the following situation:
- An employer has to make a decision about whether to pay their employee a low or a high salary, without knowing how much effort the employee is going to put into their work. If the employee puts in a high amount of effort, the high salary is worth it. If, instead, the employee puts in a low amount of effort, it's better (from the employer's point of view) to pay them a low salary.
- Simultaneously, the employee has to make a decision about whether to put high effort or low effort into their work, without knowing how much the employer is going to pay them. If the employer decides to pay them a high salary, the high amount of effort is worth it. If, instead, the employer decides to pay them a low salary, it's better (from the employee's point of view) to put in a low amount of effort.

Below is an empty pay-off matrix (game-theory style). Translate the situation above into a pay-off matrix by filling in each of the cells in the table below. Replace each of the A's and B's in the table with the pay-off values for Player A (the employer) and Player B (the employee), using the following values: $[0, 1, 2, 3]$. Also write out and explain the considerations that went into deciding which numbers to put in each cell of the pay-off matrix.


| B (Employee):     | High effort | Low effort |
|-------------------|-------------|------------|
| **A (Employer):** |             |            |
| **High salary**   |     A, B    |    A, B    |
| **Low salary**    |     A, B    |    A, B    |

**Exercise 3 (Coding question):**

Use the tomsup package to simulate the following situation:

- agent0 = A ```'1-ToM'``` agent with default parameter settings
- agent1 = A ```'2-ToM'``` agent with default parameter settings
- game = ```'party'```
- environment = ```'round-robin'```
- n_sim = 10

Run 10 simulations of this interaction for a number of rounds that seems reasonable to you, and use the ```group.plot_p_k()``` method to plot how agent1's belief about agent0's ToM level changes over time. (The three code cells below make a start by loading in the relevant packages.)

Find out whether there is a certain number of rounds after which each of the 10 simulations reaches a point where agent1 has a fully accurate model of their opponent's _k_-level (and has reached maximum certainty about that).

Show a plot to back up your answer, and also explain your answer fully in words.

In [None]:
!pip install tomsup

In [None]:
import tomsup as ts
import numpy as np
import matplotlib.pyplot as plt

In [None]:
party = ts.PayoffMatrix(name='party')

print(party)

**Exercise 4 (Conceptual question):**

This question is about both Waade et al. (2022) and de Weerd et al. (2015), and the comparison between these two models.

In both the Waade et al. (2022) model and the de Weerd et al. (2015) model, the $k$-ToM agent has a belief about the $k$-level of the agent they're interacting with, and updates this belief over the course of the interactions. Describe the main similarities and differences in how this belief-updating about the other agent's $k$-level works in these two different models.


**Exercise 5 (Conceptual question):**

This question is about de Weerd et al. (2015).

Imagine you want to know whether actual humans adapt their strategy in the Tacit Communication Game depending on their estimate of the level of ToM that their interlocutor uses. Imagine that your experiment consists of the following two conditions (where in both conditions, the participant is _told_ that they are playing the game with another participant, even though in reality, their interlocutor is being simulated by a computer): 
- In the _0-ToM_ condition, participants interact with a computer that is implemented as a zero-order ToM agent from the de Weerd et al. (2015) model.
- In the _1-ToM_ condition, participants interact with a computer that is implemented as a 1st-order ToM agent from the de Weerd et al. (2015) model.

Now imagine that you have collected your experimental data and you are ready to analyse your results. These are the specific hypotheses you would like to test:

- _Null hypothesis_: Human participants do not adapt their communication strategy to the ToM-level of their interlocutor, and will always behave like a 2-ToM sender.
- _Alternative hypothesis_: Human participants _do_ adapt their communication strategy to the ToM-level of their interlocutor. Specifically, participants in the _0-ToM_ condition will behave more like a _1-ToM_ sender, and human participants in the _1-ToM_ condition will behave more like a _2-ToM_ sender.

Explain how you could use the model described in de Weerd et al. (2015) to test these hypotheses, _using_ the data you've collected in your experiment as described above. Describe the process step-by-step.

**Exercise 6 (Implementation/code-related question):**

This question is about Cuskley et al. (2018).

Imagine that you want to extend the Cuskley et al. (2018) model to look at the effect of social network structure on morphological complexity. To give you an idea of what this might look like, three different possible social network types are described at the bottom of this exercise (```"fully_connected"```, ```"small-world"``` and ```"scale-free"```). Describe _in words_ how you would have to adapt the code of Computer Lab 3 in order to be able to create populations with these three different social network types. More specifically, answer the three questions below:

**a)** How would you go about giving structure to the population (i.e., specifying which agent is connected with which other agents)? The code in Computer Lab 3 consists of several different classes. What attributes would you have to add to which of these classes in order to create such structure?


**b)** Assume you have now added attributes to the relevant classes in order to specify, for each agent, which other agents in the population it is connected to. To which class would you have to add a method that can initialise a population with a specified type of social network structure (i.e., a method that takes the type of social network structure as one of its input arguments; an input argument that can be set to ```"fully_connected"```, ```"small-world"``` or ```"scale-free"```)?


**c)** Finally, once you've specified steps **a)** and **b)**, which method of which class would you have to adapt to make sure that agents only interact with agents they are connected to?


_Fully connected network_: This network is maximally dense, such that all possible connections are realized (i.e., all agents in the population get to interact with each other). It is also homogeneous, in the sense that every agent has the same number of connections.

_Small-world network_: This network is also relatively homogeneous, such that every agent has approximately the same number of connections, yet it is much sparser than the fully connected network and realizes only half of the possible connections. This network type has the small-world property of "strangers" being indirectly linked by a short chain of individuals.

_Scale-free network_: This network is as sparse as the small-world network, with the same number of connections overall. However, it is not homogeneous: not every agent has the same number of connections. While some agents are highly connected, others are more isolated. The distribution of connections in this network roughly follows a power-law distribution, with few agents having many connections (forming "hubs" who interact with almost everyone in the population), and a tail of agents having very few connections.
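
To make the density figures in these descriptions concrete, the sketch below counts possible connections. It assumes that connections are undirected and that agents are not connected to themselves; the descriptions above do not state either, so treat both as assumptions. The function names are hypothetical and not part of Computer Lab 3.

```python
def n_possible_connections(n_agents):
    """Number of distinct undirected pairs among n_agents agents."""
    return n_agents * (n_agents - 1) // 2


def n_realized_connections(n_agents, density):
    """Number of connections realized at a given density (rounded to a whole number)."""
    return round(density * n_possible_connections(n_agents))


# A fully connected population of 10 agents realizes all possible connections.
print(n_possible_connections(10))  # 45

# A population of 4 agents at half density realizes 3 of its 6 possible connections.
print(n_realized_connections(4, 0.5))  # 3
```

Note that the small-world and scale-free networks differ in how these connections are distributed over agents, not in how many there are.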

**Exercise 7 (Coding question):**

This question is about Mudd et al. (2022).

Imagine that you'd want to adapt the model of Mudd et al. (2022) to change the way in which agents update their vocabulary when they find out that they are using two different forms for the same concept. Instead of only the receiver doing _bit update_ to make their form more similar to that of the sender, you want to adapt the model such that _both_ the sender and the receiver update their language representation by adapting to each other; specifically, by finding a middle ground between their two forms. (Like two speakers of English who start out with the two different forms _sofa_ and _couch_, and adapt by both updating their form to a middle ground form like _souch_.)

Below is an empty skeleton of a function called ```update_forms()```. Complete this function so that it finds a middle ground between the producer's form and the comprehender's form, and updates both the producer's and the comprehender's language representation with this new form. **Note** that the resulting form that the two agents update their language representation with should still (or again) be a vector containing bits (i.e., only 0s and 1s). (Because changing this to continuous values would require more extensive changes to the rest of the code and model.)

_For context:_ this new ```update_forms()``` function would be replacing the ```update_comprehender_concept()``` function (final code cell in section 1.3 of Computer Lab 4).

In [None]:
# CODE SKELETON TO COMPLETE FOR EXERCISE 7:

# DEFINITIONS:
# producer.language_rep[producer_concept_choice][1] = producer's form for the current concept, as in Lab 4
# comprehender.language_rep[producer_concept_choice][1] = comprehender's form for the current concept, as in Lab 4

def update_forms(producer, producer_concept_choice, comprehender):
    # Step 1: Find the middle ground between the producer's form and the comprehender's form:
    
    # NOTE: Before moving on to Step 2, make sure that the middle ground form created above is 
    # (or gets converted back into) a bit vector (i.e., containing only 0s and 1s)
    
    # Step 2: Update the producer's and comprehender's language representations to the new form:
    
    pass

**Exercise 8 (Model design question):**

This question is about agent-based cognitive modelling in general.

Below is a research question, followed by a verbal explanation of what this research question is trying to get at. How would you design an agent-based model to answer this research question? (More specific instructions for what to specify follow below.)

_Research question:_ What is more important for successful problem-solving: expertise or diversity?

_Explanation of the question:_ Imagine a population in which individuals can develop different strategies for solving a particular problem. For ease of explanation, let's imagine the problem is something practical, and the solution is to design and build a particular tool that consists of different components. Imagine each component has a value that represents how much it contributes to solving the problem, and that an optimal solution to the problem requires a tool that combines several optimal components. Imagine individuals in this population can have one of two possible strategies:

1. Select one individual from the population who you want to learn from, spend a lot of time to perfectly acquire their solution (i.e., how to make their variant of the tool), and innovate that variant.
2. Take in examples from many different individuals in the population, and try to combine their solutions (i.e., their variants of the tool).

This research question is getting at a trade-off between accuracy of learning and diversity of input. An important assumption of your model should be that each individual has the same limited amount of time. Spending more time on learning one variant perfectly (as in strategy 1) means that you will be able to reproduce the variant more accurately, and understand better how it works (which should allow you to make more targeted innovations), but it comes at the cost of not being able to see a diverse set of solutions to the problem. Vice versa, taking in examples from different individuals in the population (strategy 2) has as an advantage that you'll be able to take in a diverse set of possible solutions to the problem (which should allow you to combine the good parts of the different solutions), but that comes at the cost of not being able to acquire/reproduce these perfectly (because you can't spend as much time learning about each individual solution).

Specify, in bullet points, what the three major components of the model should consist of:
- the agents
- the interactions (agent-agent interactions and/or agent-environment interactions)
- the environment

If you think one of these three components is not relevant for answering this research question, write "not relevant" and briefly explain _why_ you believe this component is not relevant to the question.

**Exercise 9 (Coding question):**

This question is about Mudd et al. (2022).

Mudd et al. (2022) find that population size has an effect on the degree of lexical variability in the population, where larger populations lead to less lexical variability (i.e., more convergence). Their explanation for this effect is that this is a result of the feedback loop illustrated in Figure 12 in the paper (copy-pasted below):

<img src="Fig_12_feedback_loop_Mudd_et_al.png" alt="drawing" width="600"/>

However, Mudd et al. (2022) do not show plots with the proportion of language game results to illustrate what this hypothesised feedback loop would look like in a single simulation. So that is what you are going to try and do below.

**a)** Imagine you run a simulation contrasting a population of 5 agents with a population of 100 agents, and these simulations would behave according to the feedback loop described in Figure 12 of Mudd et al. (2022). Now imagine you would generate plots of the proportion of language game results (similar to Figures 7 and 8 from the Mudd et al. paper) for each of these populations. Describe in words what you think these plots should look like to illustrate the feedback loop. How would you expect the plots for the small and large population to be different? Explain why.


**b)** Now actually run these simulations and generate the corresponding plots with the proportion of language game results, by adapting the final three code cells in this notebook. Instead of contrasting ```n_groups``` = 1 with ```n_groups``` = 10, your code should contrast ```n_agents = 5``` with ```n_agents = 100```.
Set the ```n_groups``` parameter to ```n_groups = 5``` (for both simulations).
Run these simulations about 10 times, until you find an example case that looks like what you've described for part **a)** of this exercise. **Note** that this will definitely not happen every time you run the simulations and compare a single simulation with population size 5 with a single simulation with population size 100, but it will happen _sometimes_ (maybe around 1/3 of the time). The difference does not have to be very stark, but it does have to be visible.

## Necessary installations and imports:

In [None]:
pip install mesa

In [None]:
import random
import numpy as np
import itertools
from math import sqrt
import time
from mesa import Agent, Model
from mesa.datacollection import DataCollector
from mesa.time import RandomActivation
from mesa.batchrunner import BatchRunner, FixedBatchRunner
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

## Parameter settings:


In [None]:
################# PARAMETER SETTINGS: ################# 

test_params = dict(
    n_concepts=10, # int: number of concepts
    n_bits=10,  # int: number of bits (determining length of forms and culturally-salient feature vectors)
    n_agents=10, # int: number of agents in the population
    n_groups=1,  # determines how many different semantic groups there are
    initial_degree_of_overlap=0.9,  # degree of overlap between the form and meaning components
    n_steps=2000  # number of timesteps to run the simulation for (called "model stages" in the paper)
)

## Initialising the population and their language representations:

In [None]:
def language_skeleton(n_concepts, n_bits):
    """ initiate language with n_concepts and n_bits
    in the form {0: [meaning, form], 1: [meaning, form], ...}
    the meaning and form components are initiated with None """
    skeleton_concept_meaning_form = {}
    for n in range(n_concepts):
        skeleton_concept_meaning_form[n] = [[None] * n_bits] * 2
    return skeleton_concept_meaning_form
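
A note on the expression `[[None] * n_bits] * 2`: in Python, multiplying a list of lists repeats references to the same inner list, so the meaning and form slots of a fresh skeleton initially point to one shared list. The standalone sketch below illustrates this on a toy list (the variable names are illustrative only). This appears harmless here as long as each slot is replaced wholesale, as `language_add_meaning()` and `language_add_form()` do, but an in-place edit on a fresh skeleton would change both slots at once.

```python
n_bits = 3

# Same construction as in language_skeleton(): the outer list holds two
# references to ONE inner list.
shared = [[None] * n_bits] * 2
print(shared[0] is shared[1])  # True

# Editing one slot in place also changes the other slot.
shared[0][0] = 1
print(shared)  # [[1, None, None], [1, None, None]]

# Replacing a slot wholesale breaks the link, so the slots become independent.
shared[1] = [0, 0, 0]
print(shared)  # [[1, None, None], [0, 0, 0]]

# A list comprehension creates two separate inner lists from the start.
independent = [[None] * n_bits for _ in range(2)]
print(independent[0] is independent[1])  # False
```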

In [None]:
def language_create_meanings(n_concepts, n_bits, n_groups):
    """ generate the meaning representation for each group
    returns a dictionary with group: meaning representation
    ex. {0: [[1, 1, 0, 1, 1], [0, 0, 0, 1, 0]], 1: [[0, 0, 0, 1, 1], [1, 1, 1, 1, 0]]} """
    group_meaning_dic = {}
    for n in range(n_groups):
        condition = False
        while condition == False:
            single_group_meaning_list = []
            for concept in range(n_concepts):
                single_group_meaning_list.append(random.choices([0, 1], k=n_bits))  # list of len n_bits
            if len(set(tuple(row) for row in single_group_meaning_list)) == len(
                    single_group_meaning_list):
                condition = True
                group_meaning_dic[n] = single_group_meaning_list
                
    return group_meaning_dic

In [None]:
def language_add_meaning(agent, meaning_dic):
    """ takes in the language skeleton and adds the meaning component depending on group of agent """
    counter = 0  # to keep track of which meaning component in meaning_dic values
    for concept, meaning_form in agent.language_rep.items():
        meaning_form[0] = meaning_dic[agent.group][counter]  # meaning_form[0] is the meaning only
        counter += 1
    return agent

In [None]:
def language_add_form(agent, initial_degree_of_overlap):
    """ start with meaning representation and assign form representation
    depending on the desired degree of overlap """
    for concept, meaning_form in agent.language_rep.items():
        forms = []
        for bit in meaning_form[0]:
            my_choice = np.random.choice([True, False], p=[initial_degree_of_overlap, 1 - initial_degree_of_overlap])  # p = weights
            if not my_choice:  # if my_choice == False
                random_choice = np.random.choice([0, 1])
                forms.append(random_choice)  # random choice 0 or 1 if False (random)
            else:
                forms.append(bit)  # append the same bit (iconic)
        meaning_form[1] = forms
    return agent
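
Because the non-copied bits are drawn at random, they still match the meaning bit about half of the time. Under that reading of `language_add_form()`, the expected proportion of matching bits is `initial_degree_of_overlap + (1 - initial_degree_of_overlap) / 2`, so an overlap setting of 0.9 should give an initial iconicity of roughly 0.95 rather than 0.9. The standalone sketch below checks this using only the standard library; the function names are hypothetical and not part of the lab code.

```python
import random


def expected_initial_iconicity(initial_degree_of_overlap):
    """Expected proportion of form bits that equal the corresponding meaning bit."""
    return initial_degree_of_overlap + (1 - initial_degree_of_overlap) / 2


def simulate_initial_iconicity(initial_degree_of_overlap, n_bits=100_000, seed=0):
    """Estimate the same proportion by mimicking the copy-or-randomise rule bit by bit."""
    rng = random.Random(seed)
    matches = 0
    for _ in range(n_bits):
        meaning_bit = rng.choice([0, 1])
        if rng.random() < initial_degree_of_overlap:
            form_bit = meaning_bit  # copy the meaning bit (iconic)
        else:
            form_bit = rng.choice([0, 1])  # random bit, which may match by chance
        matches += form_bit == meaning_bit
    return matches / n_bits


print(expected_initial_iconicity(0.9))
print(simulate_initial_iconicity(0.9))
```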

## Running a language game and updating the agents' language representations:

In [None]:
def language_game(sorted_agent_list):
    """ takes agent list sorted by group
    lets each agent in turn be the producer in one language game
    and counts how often each of the three outcomes occurs """
    form_success = 0
    meaning_success = 0
    bit_update = 0

    for a in sorted_agent_list:
        what_is_updated = language_game_structure(a, sorted_agent_list)

        if what_is_updated == "3a":
            form_success += 1
        elif what_is_updated == "3b1":
            meaning_success += 1
        else:  # "3b2"
            bit_update += 1

    language_game_stats = {"form_success": form_success, "meaning_success": meaning_success, "bit_update": bit_update}
    return language_game_stats
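
Since every agent acts as producer exactly once per call to `language_game()`, the three counts sum to the number of agents. A hypothetical helper like the one below (not part of the lab code) turns the counts into proportions, which is one way to compare populations of different sizes. The example counts are made up for illustration.

```python
def language_game_proportions(language_game_stats):
    """Convert outcome counts into proportions of all language games in one step."""
    total = sum(language_game_stats.values())
    if total == 0:  # no games played, so avoid dividing by zero
        return {outcome: 0.0 for outcome in language_game_stats}
    return {outcome: count / total for outcome, count in language_game_stats.items()}


# Made-up counts for a population of 10 agents.
example_stats = {"form_success": 6, "meaning_success": 3, "bit_update": 1}
print(language_game_proportions(example_stats))
# {'form_success': 0.6, 'meaning_success': 0.3, 'bit_update': 0.1}
```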

In [None]:
def language_game_structure(producer, all_agents):
    comprehender = random.choice(all_agents)
    producer_concept_choice = random.choice(list(producer.language_rep))  # 1
    form_match_answer = does_closest_form_match(producer, producer_concept_choice, comprehender)  # 2
    if form_match_answer == False:  # 3b
        meaning_match_answer = does_closest_meaning_match(producer, producer_concept_choice, comprehender)
        if meaning_match_answer == False:  # 3b2
            update_comprehender_concept(producer, producer_concept_choice, comprehender)
            return "3b2"
        else:  # 3b1
            # None
            return "3b1"
    else:  # 3a
        # None
        return "3a"

In [None]:
def does_closest_form_match(producer, producer_concept_choice, comprehender):
    produced_form = producer.language_rep[producer_concept_choice][1]

    distance_from_produced_form = {}
    for concept, meaning_form in comprehender.language_rep.items():
        # compare produced concept and all comp concepts, calculate distance between each
        distance = sum([abs(prod_bit - comp_bit) for prod_bit, comp_bit in zip(produced_form, meaning_form[1])])
        distance_from_produced_form[concept] = distance

    min_distance = min(distance_from_produced_form.values())
    comp_closest_form_list = [concept for concept, distance in distance_from_produced_form.items() if distance == min_distance]
    comp_chosen_form = random.choice(comp_closest_form_list)  # because there can be multiple, randomly choose from list

    return producer_concept_choice == comp_chosen_form  # returns True or False
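
The distance used above is the Hamming distance: the number of positions at which two bit vectors differ. The standalone sketch below shows the same computation on toy forms, including a tie between two equally close concepts. The helper names and the toy lexicon are illustrative and not part of the lab code.

```python
def hamming_distance(form_a, form_b):
    """Number of positions at which two equal-length bit vectors differ."""
    return sum(abs(a - b) for a, b in zip(form_a, form_b))


def closest_concepts(produced_form, forms_by_concept):
    """Return all concepts whose form is at minimal distance from produced_form."""
    distances = {concept: hamming_distance(produced_form, form)
                 for concept, form in forms_by_concept.items()}
    min_distance = min(distances.values())
    return [concept for concept, distance in distances.items() if distance == min_distance]


toy_forms = {0: [1, 0, 1, 1], 1: [1, 1, 1, 0], 2: [0, 0, 1, 1]}

print(hamming_distance([1, 0, 1, 1], [1, 1, 1, 0]))  # 2
print(closest_concepts([1, 0, 1, 0], toy_forms))     # [0, 1] (a tie)
```

When there is a tie like this, `does_closest_form_match()` picks one of the closest concepts at random, so the same pair of agents can succeed in one game and fail in the next.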

In [None]:
def does_closest_meaning_match(producer, producer_concept_choice, comprehender):
    produced_form = producer.language_rep[producer_concept_choice][1]

    distance_from_produced_form = {}
    for concept, meaning_form in comprehender.language_rep.items():
        # compare produced concept and all comp concepts, calculate distance between each
        distance = sum([abs(prod_bit - comp_bit) for prod_bit, comp_bit in zip(produced_form, meaning_form[0])])
        distance_from_produced_form[concept] = distance
    comp_closest_meaning = min(distance_from_produced_form, key=distance_from_produced_form.get)

    return comp_closest_meaning == producer_concept_choice  # returns True or False

In [None]:
def update_comprehender_concept(producer, producer_concept_choice, comprehender):
    """ update comprehender form
    compare the producer's and comprehender's forms for this concept bit by bit, find the bits that don't match
    of the bits that don't match, choose one at random and flip this bit of the comprehender's form """
    comparison_list = ([(p_bit == c_bit) for p_bit, c_bit in zip(producer.language_rep[producer_concept_choice][1], comprehender.language_rep[producer_concept_choice][1])])
    # to prevent case where correct concept has a match for form producer and comprehender
    # this could happen if comprehender has 2 forms which both == form producer and the non-matching concept one gets chosen
    if all(comparison_list) == True:
        pass
    else:
        correctable_indexes = [i for i, comparison in enumerate(comparison_list) if comparison == False]  # get False indices
        chosen_index_to_correct = random.choice(correctable_indexes)
        comprehender.language_rep[producer_concept_choice][1][chosen_index_to_correct] = abs(1 - (comprehender.language_rep[producer_concept_choice][1][chosen_index_to_correct]))

    return None
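
Each bit update moves the comprehender's form exactly one step closer to the producer's form. The standalone sketch below shows this on plain lists rather than agent objects; the function name and the toy forms are illustrative and not part of the lab code.

```python
import random


def flip_one_mismatched_bit(producer_form, comprehender_form, rng):
    """Flip one randomly chosen bit of comprehender_form that differs from producer_form."""
    mismatched = [i for i, (p_bit, c_bit) in enumerate(zip(producer_form, comprehender_form))
                  if p_bit != c_bit]
    if mismatched:  # nothing to do if the forms are already identical
        index = rng.choice(mismatched)
        comprehender_form[index] = 1 - comprehender_form[index]
    return comprehender_form


rng = random.Random(0)
producer_form = [1, 0, 1, 1, 0]
comprehender_form = [0, 0, 1, 0, 1]

before = sum(p != c for p, c in zip(producer_form, comprehender_form))
flip_one_mismatched_bit(producer_form, comprehender_form, rng)
after = sum(p != c for p, c in zip(producer_form, comprehender_form))

print(before, after)  # 3 2
```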

## Data-collector functions:

In [None]:
# lexical variability
def calculate_pop_lex_var(agent_list, n_concepts):
    pairs_of_agents = itertools.combinations(agent_list, r=2)

    pairs_lex_var = []

    for pair in pairs_of_agents:
        pair_lex_var = calculate_distance(pair, n_concepts)
        pairs_lex_var.append(pair_lex_var)

    pop_av_lex_var = sum(pairs_lex_var) / len(pairs_lex_var)  # mean over all agent pairs
    return pop_av_lex_var

In [None]:
def calculate_distance(pair, n_concepts):
    """ per concept per agent pair, distance = 0 if concepts are the same, distance = 1 if concepts are different
    add up concept distances and divide by total number of concepts """
    concept_lex_var_total = 0  # running count of concepts for which the two agents' forms differ
    for n in range(n_concepts):
        if pair[0].language_rep[n][1] != pair[1].language_rep[n][1]:
            concept_lex_var_total += 1  # if concepts don't match, add 1 to distance

    pair_mean_lex_var = concept_lex_var_total / n_concepts
    return pair_mean_lex_var
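
As a worked example of this measure, the standalone sketch below computes lexical variability for three toy lexicons with two concepts each. It uses plain dictionaries of the form `{concept: form}` instead of agent objects; the function names and the toy data are illustrative and not part of the lab code.

```python
import itertools


def pair_lex_var(lexicon_a, lexicon_b):
    """Proportion of concepts for which two lexicons have different forms."""
    n_different = sum(lexicon_a[concept] != lexicon_b[concept] for concept in lexicon_a)
    return n_different / len(lexicon_a)


def population_lex_var(lexicons):
    """Mean pairwise lexical variability over all pairs of lexicons."""
    pair_values = [pair_lex_var(a, b) for a, b in itertools.combinations(lexicons, r=2)]
    return sum(pair_values) / len(pair_values)


agent_a = {0: [0, 0], 1: [1, 1]}
agent_b = {0: [0, 0], 1: [1, 0]}
agent_c = {0: [0, 1], 1: [1, 0]}

# Pairs: a-b differ on 1 of 2 concepts, a-c on 2 of 2, b-c on 1 of 2.
print(pair_lex_var(agent_a, agent_b))  # 0.5
print(pair_lex_var(agent_a, agent_c))  # 1.0
print(population_lex_var([agent_a, agent_b, agent_c]))  # mean of 0.5, 1.0 and 0.5
```

Note that two forms count as different even if they differ in a single bit, so this measure does not distinguish between near-identical and completely different forms.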

In [None]:
# iconicity
def calculate_degree_of_iconicity(agent):
    concept_iconicity_vals = []

    for concept, meaning_form in agent.language_rep.items():
        comparison_list = ([(p_bit == c_bit) for p_bit, c_bit in zip(meaning_form[0], meaning_form[1])])  # returns True or False for each comparison
        concept_iconicity_val = sum(comparison_list) / len(comparison_list)
        concept_iconicity_vals.append(concept_iconicity_val)

    mean_agent_iconicity = sum(concept_iconicity_vals) / len(concept_iconicity_vals)
    return mean_agent_iconicity

In [None]:
def calculate_prop_iconicity(agent_list):
    iconicity_list = [a.prop_iconicity for a in agent_list]
    return sum(iconicity_list) / len(agent_list)

## Defining the agent and the model as a whole (using the Mesa package):


In [None]:
class ContextAgent(Agent):
    def __init__(self, unique_id, model, n_concepts, n_bits, n_groups):
        super().__init__(unique_id, model)
        self.group = random.choice(range(n_groups))
        self.language_rep = language_skeleton(n_concepts, n_bits)  # dic = {concept: [[meaning], [form]]}
        self.prop_iconicity = None

    def describe(self):
        #print(f"id = {self.unique_id}, prop iconicity = {self.prop_iconicity}, group = {self.group}, language = {self.language_rep}")
        print(self.language_rep)

    def step(self):
        self.prop_iconicity = calculate_degree_of_iconicity(self)
        

In [None]:
class ContextModel(Model):
    """A model with some number of agents."""
    def __init__(self, n_agents, n_concepts, n_bits, n_groups, initial_degree_of_overlap, n_steps, viz_on=False):
        super().__init__()
        self.placement_counter = 0
        self.n_agents = n_agents
        self.n_groups = n_groups
        self.n_concepts = n_concepts
        self.n_bits = n_bits
        self.n_steps = n_steps

        self.current_step = 0
        self.schedule = RandomActivation(self)
        self.running = True  # for server
        self.group_meanings_dic = language_create_meanings(n_concepts, n_bits, n_groups) # set up language structure (maybe eventually a class)

        self.width_height = int(sqrt(n_agents))
        self.coordinate_list = list(itertools.product(range(self.width_height), range(self.width_height)))  # generate coordinates for grid

        # language game successes and failures
        self.lg_form_success = 0
        self.lg_meaning_success = 0
        self.lg_bit_update = 0
        self.language_game_stats = {'form_success': None, 'meaning_success': None, 'bit_update': None}

        # for datacollector
        self.pop_iconicity = None
        self.pop_lex_var = None
        self.datacollector = DataCollector({'pop_iconicity': 'pop_iconicity',
                                            'pop_lex_var': 'pop_lex_var',
                                            'current_step': 'current_step',
                                            'lg_form_success': 'lg_form_success',
                                            'lg_meaning_success': 'lg_meaning_success',
                                            'lg_bit_update': 'lg_bit_update'},
                                           {'group': lambda agent: agent.group,
                                            'language': lambda agent: agent.language_rep,
                                            'prop iconicity': lambda agent: agent.prop_iconicity})

        # create agents
        for i in range(self.n_agents):
            a = ContextAgent(i, self, self.n_concepts, self.n_bits, self.n_groups)  # make a new agent
            language_add_meaning(a, self.group_meanings_dic)  # add meaning to language skeleton
            language_add_form(a, initial_degree_of_overlap)  # add form to language skeleton

            self.schedule.add(a)  # add agent to list of agents
            a.prop_iconicity = calculate_degree_of_iconicity(a)

        self.sorted_agents = sorted(self.schedule.agents, key=lambda agent: agent.group)  # sort agents by group

    def collect_data(self):
        self.pop_iconicity = calculate_prop_iconicity(self.schedule.agents)
        self.pop_lex_var = calculate_pop_lex_var(self.schedule.agents, self.n_concepts)
        self.current_step = self.current_step
        self.lg_form_success = self.language_game_stats['form_success']
        self.lg_meaning_success = self.language_game_stats['meaning_success']
        self.lg_bit_update = self.language_game_stats['bit_update']
        self.datacollector.collect(self)

    def tests(self, a):
        assert len(self.group_meanings_dic) == self.n_groups
        assert len(self.group_meanings_dic[0]) == self.n_concepts
        assert len(self.group_meanings_dic[0][0]) == self.n_bits
        assert len(a.language_rep) == self.n_concepts

    def step(self):
        """ Advance the model by one step """
        self.collect_data()  # set up = year 0

        if self.current_step == 0:
            self.tests(random.choice(self.schedule.agents))  # run tests on a random agent

        self.current_step += 1
        self.language_game_stats = language_game(self.sorted_agents)  # language game (only after the set up = year 0)

        #if self.current_step == self.n_steps:
        #    upgma_df = pd.DataFrame()

        #    for i in self.schedule.agents:
        #        for key, value in i.language_rep.items():
        #            new_row = {'id': i.unique_id, 'concept': key, 'form': value[1]}
        #            upgma_df = upgma_df.append(new_row, ignore_index=True)

        #    upgma_df.to_csv("upgma_data.csv")

        self.schedule.step()

In [None]:
n_groups = 1

start_time = time.time()

context_model = ContextModel(test_params["n_agents"], test_params["n_concepts"], test_params["n_bits"], 
                             n_groups, test_params["initial_degree_of_overlap"], test_params["n_steps"])

for i in range(test_params["n_steps"]+1):  # set up = year 0 + x years
    print(i)
    context_model.step()

print("Simulation(s) took %s minutes to run" % round(((time.time() - start_time) / 60.), 2))  # ADDED BY MW

df_model_output_1_group = context_model.datacollector.get_model_vars_dataframe()
## alternative option for the agents is get_agent_vars_dataframe(), indexed by ['Step', 'AgentID'] with columns ['group', 'language', 'prop iconicity']

csv_save_as = "n_concepts_"+str(test_params["n_concepts"])+"_n_bits_"+str(test_params["n_bits"])+"_n_agents_"+str(test_params["n_agents"])+"_n_groups_"+str(n_groups)+"_overlap_"+str(test_params["initial_degree_of_overlap"])+"_n_steps_"+str(test_params["n_steps"])
df_model_output_1_group = pd.DataFrame(df_model_output_1_group.to_records())  # gets rid of multiindex
df_model_output_1_group.to_csv(f"{csv_save_as}.csv")
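
The run loop above is repeated for each value of `n_groups`. As a minimal sketch, it could be wrapped in a small helper. The names `run_model` and `StubModel` are hypothetical and not part of the notebook; the stub only stands in for `ContextModel` so that the example runs on its own.

```python
import time


class StubModel:
    """Stand-in for ContextModel (hypothetical): it only counts step() calls."""

    def __init__(self):
        self.calls = 0

    def step(self):
        self.calls += 1


def run_model(model, n_steps):
    """Call model.step() for the set up (year 0) plus n_steps further steps.

    Returns the elapsed wall-clock time in minutes.
    """
    start_time = time.time()
    for _ in range(n_steps + 1):
        model.step()
    return (time.time() - start_time) / 60.0


stub = StubModel()
minutes = run_model(stub, n_steps=5)
print(f"{stub.calls} calls to step() took {round(minutes, 2)} minutes")
```

With the real model, the same helper would presumably be called as `run_model(context_model, test_params["n_steps"])`.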

In [None]:
n_groups = 10

start_time = time.time()

context_model = ContextModel(test_params["n_agents"], test_params["n_concepts"], test_params["n_bits"], 
                             n_groups, test_params["initial_degree_of_overlap"], test_params["n_steps"])

for i in range(test_params["n_steps"]+1):  # set up = year 0 + x years
    print(i)
    context_model.step()

print("Simulation(s) took %s minutes to run" % round(((time.time() - start_time) / 60.), 2))  # ADDED BY MW

df_model_output_10_groups = context_model.datacollector.get_model_vars_dataframe()
## alternative option for the agents is get_agent_vars_dataframe(), returns ['Step', 'AgentID', 'neighborhood', 'language', 'prop iconicity']

csv_save_as = (
    f"n_concepts_{test_params['n_concepts']}"
    f"_n_bits_{test_params['n_bits']}"
    f"_n_agents_{test_params['n_agents']}"
    f"_n_groups_{n_groups}"
    f"_overlap_{test_params['initial_degree_of_overlap']}"
    f"_n_steps_{test_params['n_steps']}"
)
df_model_output_10_groups = pd.DataFrame(df_model_output_10_groups.to_records())  # gets rid of multiindex
df_model_output_10_groups.to_csv(f"{csv_save_as}.csv")
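
Both simulation cells build the same output file name from the parameters. A possible way to keep the two in sync is a small helper like the one sketched here. `build_csv_name` and the values in `example_params` are made up for illustration; only the dictionary keys are taken from the cells above.

```python
def build_csv_name(params, n_groups):
    """Return the CSV base name (without extension) for one simulation run."""
    return (
        f"n_concepts_{params['n_concepts']}"
        f"_n_bits_{params['n_bits']}"
        f"_n_agents_{params['n_agents']}"
        f"_n_groups_{n_groups}"
        f"_overlap_{params['initial_degree_of_overlap']}"
        f"_n_steps_{params['n_steps']}"
    )


# Made-up parameter values, only to show the resulting name.
example_params = {
    "n_concepts": 10,
    "n_bits": 10,
    "n_agents": 100,
    "initial_degree_of_overlap": 0.9,
    "n_steps": 2000,
}

csv_name = build_csv_name(example_params, n_groups=10)
print(f"{csv_name}.csv")
```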

In [None]:
%matplotlib inline

# colormap
cmap = plt.cm.viridis
cmaplist = [cmap(i) for i in range(cmap.N)]

# set up 2 column figure
fig, (ax0, ax1) = plt.subplots(ncols=2, constrained_layout=True)
fig.set_size_inches(9,3)

# FIG 1 GROUP EXAMPLE RUN
# 1 group, 10 stages on ax0

# Uncomment the line below if you want to load in your dataframe from a .csv file:
# model_output = pd.read_csv("", index_col=0)

model_output = df_model_output_1_group

model_output = model_output[['current_step', 'lg_form_success', 'lg_meaning_success', 'lg_bit_update']]
model_output = model_output.rename(columns={"lg_form_success": "form_success", "lg_meaning_success": "culturally_salient_features_success", "lg_bit_update": "update_bit"})
model_output = model_output.iloc[1:11].copy()  # rows 1-10 (skips the set up row); copy so the assignment below does not write to a slice
model_output[["form_success", "culturally_salient_features_success", "update_bit"]] = model_output[["form_success", "culturally_salient_features_success", "update_bit"]].div(10, axis=0)

# https://www.python-graph-gallery.com/13-percent-stacked-barplot
# From raw value to percentage
totals = [i+j+k for i, j, k in zip(model_output['update_bit'], model_output['culturally_salient_features_success'], model_output['form_success'])]
bit_bars = [i / j for i,j in zip(model_output['update_bit'], totals)]
features_bars = [i / j for i,j in zip(model_output['culturally_salient_features_success'], totals)]
form_bars = [i / j for i,j in zip(model_output['form_success'], totals)]

steps = range(model_output["current_step"].min(), model_output["current_step"].max() + 1)  # min, max steps in df
ax0.bar(steps, bit_bars, color=cmaplist[0], width=1, edgecolor="none", label="bit update")  # bottom layer
ax0.bar(steps, features_bars, bottom=bit_bars, color=cmaplist[128], width=1, edgecolor="none", label="CS features success")  # middle layer
ax0.bar(steps, form_bars, bottom=[i + j for i, j in zip(bit_bars, features_bars)], color=cmaplist[-1], width=1, edgecolor="none", label="form success")  # top layer

# axes
ax0.set_xlabel("Model stage", fontsize=15)
ax0.set_ylim(0,1)
ax0.set_ylabel("Proportion", fontsize=15)
ax0.set_xticks(np.arange(1, 11, 1))


# 1 group, 2000 stages on ax1

# Uncomment the line below if you want to load in your dataframe from a .csv file:
# model_output = pd.read_csv("", index_col=0)

model_output = df_model_output_1_group

model_output = model_output[['current_step', 'lg_form_success', 'lg_meaning_success', 'lg_bit_update']]
model_output = model_output.rename(columns={"lg_form_success": "form_success", "lg_meaning_success": "culturally_salient_features_success", "lg_bit_update": "update_bit"})
model_output = model_output.drop([0])
model_output[["form_success", "culturally_salient_features_success", "update_bit"]] = model_output[["form_success", "culturally_salient_features_success", "update_bit"]].div(10, axis=0)

# add column with the block of 50 steps each row belongs to (index // 50: 1-49 -> 0, 50-99 -> 1, etc.)
model_output["hist_block"] = model_output.index // 50

model_output_grouped = model_output.groupby(["hist_block"]).mean()
model_output_grouped["original_index"] = model_output_grouped.index * 50
model_output = model_output_grouped[["form_success", "culturally_salient_features_success", "update_bit", "original_index"]]

# https://www.python-graph-gallery.com/13-percent-stacked-barplot
# From raw value to percentage
totals = [i+j+k for i, j, k in zip(model_output['update_bit'], model_output['culturally_salient_features_success'], model_output['form_success'])]
bit_bars = [i / j for i,j in zip(model_output['update_bit'], totals)]
features_bars = [i / j for i,j in zip(model_output['culturally_salient_features_success'], totals)]
form_bars = [i / j for i,j in zip(model_output['form_success'], totals)]

steps = range(int(model_output.index.min()), int(model_output.index.max() + 1))  # min, max block index in df
ax1.bar(steps, bit_bars, color=cmaplist[0], width=1, edgecolor="none", label="bit update")  # bottom layer
ax1.bar(steps, features_bars, bottom=bit_bars, color=cmaplist[128], width=1, edgecolor="none", label="CS features success")  # middle layer
ax1.bar(steps, form_bars, bottom=[i + j for i, j in zip(bit_bars, features_bars)], color=cmaplist[-1], width=1, edgecolor="none", label="form success")  # top layer

# legend
handles, labels = ax1.get_legend_handles_labels()
handles = [handles[2], handles[1], handles[0]]
labels = [labels[2], labels[1], labels[0]]
ax1.legend(handles, labels, loc='center left', bbox_to_anchor=(1, 0.5))

# axes
ax1.set_xlabel("Model stage", fontsize=15)
ax1.set_ylim(0,1)
ax1.set_ylabel("", fontsize=15)
ax1.set_xticks(np.arange(0, 41, step=10))  # one tick per 10 blocks of 50 steps, i.e. every 500 steps
ax1.set_xticklabels([0,500,1000,1500,2000])

plt.suptitle("1 group", fontsize=18, x=0.4, y=1.1)

plt.savefig("barplot_1group.png", dpi=1000, bbox_inches="tight")



# FIG 10 GROUPS EXAMPLE RUN
# set up 2 column figure
fig, (ax0, ax1) = plt.subplots(ncols=2, constrained_layout=True)
fig.set_size_inches(9,3)

# 10 groups, 10 stages on ax0

# Uncomment the line below if you want to load in your dataframe from a .csv file:
# model_output = pd.read_csv("", index_col=0)

model_output = df_model_output_10_groups

model_output = model_output[['current_step', 'lg_form_success', 'lg_meaning_success', 'lg_bit_update']]
model_output = model_output.rename(columns={"lg_form_success": "form_success", "lg_meaning_success": "culturally_salient_features_success", "lg_bit_update": "update_bit"})
model_output = model_output.iloc[1:11].copy()  # rows 1-10 (skips the set up row); copy so the assignment below does not write to a slice
model_output[["form_success", "culturally_salient_features_success", "update_bit"]] = model_output[["form_success", "culturally_salient_features_success", "update_bit"]].div(10, axis=0)

# https://www.python-graph-gallery.com/13-percent-stacked-barplot
# From raw value to percentage
totals = [i+j+k for i, j, k in zip(model_output['update_bit'], model_output['culturally_salient_features_success'], model_output['form_success'])]
bit_bars = [i / j for i,j in zip(model_output['update_bit'], totals)]
features_bars = [i / j for i,j in zip(model_output['culturally_salient_features_success'], totals)]
form_bars = [i / j for i,j in zip(model_output['form_success'], totals)]

steps = range(model_output["current_step"].min(), model_output["current_step"].max() + 1)  # min, max steps in df
ax0.bar(steps, bit_bars, color=cmaplist[0], width=1, edgecolor="none", label="bit update")  # bottom layer
ax0.bar(steps, features_bars, bottom=bit_bars, color=cmaplist[128], width=1, edgecolor="none", label="CS features success")  # middle layer
ax0.bar(steps, form_bars, bottom=[i + j for i, j in zip(bit_bars, features_bars)], color=cmaplist[-1], width=1, edgecolor="none", label="form success")  # top layer

# axes
ax0.set_xlabel("Model stage", fontsize=15)
ax0.set_ylim(0,1)
ax0.set_ylabel("Proportion", fontsize=15)
ax0.set_xticks(np.arange(1, 11, 1))

# 10 groups, 2000 stages on ax1

# Uncomment the line below if you want to load in your dataframe from a .csv file:
# model_output = pd.read_csv("", index_col=0)

model_output = df_model_output_10_groups

model_output = model_output[['current_step', 'lg_form_success', 'lg_meaning_success', 'lg_bit_update']]
model_output = model_output.rename(columns={"lg_form_success": "form_success", "lg_meaning_success": "culturally_salient_features_success", "lg_bit_update": "update_bit"})
model_output = model_output.drop([0])
model_output[["form_success", "culturally_salient_features_success", "update_bit"]] = model_output[["form_success", "culturally_salient_features_success", "update_bit"]].div(10, axis=0)

# add column with the block of 50 steps each row belongs to (index // 50: 1-49 -> 0, 50-99 -> 1, etc.)
model_output["hist_block"] = model_output.index // 50

model_output_grouped = model_output.groupby(["hist_block"]).mean()
model_output_grouped["original_index"] = model_output_grouped.index * 50
model_output = model_output_grouped[["form_success", "culturally_salient_features_success", "update_bit", "original_index"]]

# https://www.python-graph-gallery.com/13-percent-stacked-barplot
# From raw value to percentage
totals = [i+j+k for i, j, k in zip(model_output['update_bit'], model_output['culturally_salient_features_success'], model_output['form_success'])]
bit_bars = [i / j for i,j in zip(model_output['update_bit'], totals)]
features_bars = [i / j for i,j in zip(model_output['culturally_salient_features_success'], totals)]
form_bars = [i / j for i,j in zip(model_output['form_success'], totals)]

steps = range(int(model_output.index.min()), int(model_output.index.max() + 1))  # min, max block index in df
ax1.bar(steps, bit_bars, color=cmaplist[0], width=1, edgecolor="none", label="bit update")  # bottom layer
ax1.bar(steps, features_bars, bottom=bit_bars, color=cmaplist[128], width=1, edgecolor="none", label="CS features success")  # middle layer
ax1.bar(steps, form_bars, bottom=[i + j for i, j in zip(bit_bars, features_bars)], color=cmaplist[-1], width=1, edgecolor="none", label="form success")  # top layer

# legend
handles, labels = ax1.get_legend_handles_labels()
handles = [handles[2], handles[1], handles[0]]
labels = [labels[2], labels[1], labels[0]]
ax1.legend(handles, labels, loc='center left', bbox_to_anchor=(1, 0.5))

# axes
ax1.set_xlabel("Model stage", fontsize=15)
ax1.set_ylim(0,1)
ax1.set_ylabel("", fontsize=15)
ax1.set_xticks(np.arange(0, 41, step=10))  # one tick per 10 blocks of 50 steps, i.e. every 500 steps
ax1.set_xticklabels([0,500,1000,1500,2000])

plt.suptitle("10 groups", fontsize=18, x=0.4, y=1.1)

plt.savefig("barplot_10groups.png", dpi=1000, bbox_inches="tight")
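
The plotting cell above repeats the same "raw value to proportion" computation four times. The sketch below shows one way that step could be factored out, using plain Python lists so it runs without the model output. The function names `stacked_proportions` and `stack_bottoms` are hypothetical, and returning an empty bar when a step has no recorded outcomes is an assumption (the cell above divides by the total without such a check).

```python
def stacked_proportions(update_bit, features_success, form_success):
    """Rescale three count series so that each step's values sum to 1."""
    layers = {"bit update": [], "CS features success": [], "form success": []}
    for counts in zip(update_bit, features_success, form_success):
        total = sum(counts)
        # Assumption: a step with no recorded outcomes gets an empty bar
        # instead of a division by zero.
        shares = [c / total if total else 0.0 for c in counts]
        for name, share in zip(layers, shares):
            layers[name].append(share)
    return layers


def stack_bottoms(layers):
    """Return, for each layer, the height at which its bars start."""
    n_bars = len(next(iter(layers.values())))
    running = [0.0] * n_bars
    bottoms = {}
    for name, heights in layers.items():
        bottoms[name] = list(running)
        running = [r + h for r, h in zip(running, heights)]
    return bottoms


# Made-up counts for two steps; the second step has no outcomes at all.
layers = stacked_proportions([2, 0], [3, 0], [5, 0])
bottoms = stack_bottoms(layers)
for name in layers:
    print(name, layers[name], bottoms[name])
```

Each layer could then be drawn with `ax.bar(steps, layers[name], bottom=bottoms[name], ...)`, which stacks the layers in the same order as in the figures above.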