# Simulating Language 11, Learned signalling (lab) (some answers)

The simulation in this notebook implements the learning of a signalling system. In the previous simulations, an individual agent’s signalling system was provided innately, and didn’t change in its lifetime. Populations of agents evolved through natural selection according to the fitness function we specified, to be ‘optimal’ in some way for communication.

In this simulation, we’re ignoring evolution, and instead allowing the weights in an agent’s signalling system to change through learning, as a result of their experiences.

The first section of the code is similar to the code we used in our first simulation, when we introduced the following:
- a signalling system represented as a list of lists, which you can think of as a matrix or as a neural network;
- how to produce a signal to express a meaning, using winner-take-all;
- how to decide which meaning a received signal is expressing, using winner-take-all;
- communication as a measure of how well the speaker’s meaning matches the hearer’s meaning after being transmitted via a signal.

These should all be very familiar by now. We have made one major change to the code though. In the earlier models we had separate matrices for production and reception. From now on we are going to use a model where we just have a single matrix which handles both processes. There are some small changes to the code to accomplish this.

*Identify the changes required to go from a two-matrix model to a one-matrix model, and figure out why they have been made.*

In [1]:
import random
%matplotlib inline
import matplotlib.pyplot as plt
from matplotlib_inline.backend_inline import set_matplotlib_formats  # replaces the deprecated IPython.display import
set_matplotlib_formats('svg', 'pdf')

In [2]:
def wta(items):
    """Winner-take-all: return the index of the largest weight, breaking ties at random."""
    maxweight = max(items)
    candidates = []
    for i in range(len(items)):
        if items[i] == maxweight:
            candidates.append(i)
    return random.choice(candidates)

def reception_weights(system, signal):
    """Return the column of weights for `signal`, one weight per meaning."""
    weights = []
    for row in system:
        weights.append(row[signal])
    return weights

def communicate(speaker_system, hearer_system, meaning):
    """Return 1 if the hearer recovers the speaker's meaning, otherwise 0."""
    speaker_signal = wta(speaker_system[meaning])
    hearer_meaning = wta(reception_weights(hearer_system, speaker_signal))
    if meaning == hearer_meaning:
        return 1
    else:
        return 0
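
As a rough illustration of how one matrix can serve both processes, the sketch below reads along a row for production and down a column for reception. The matrix `example_system` and the variable names are made up for this example; `wta` and `reception_weights` are compact re-statements of the functions above so that the block runs on its own.

```python
import random

def wta(items):
    """Return the index of the largest weight, breaking ties at random."""
    maxweight = max(items)
    candidates = [i for i, w in enumerate(items) if w == maxweight]
    return random.choice(candidates)

def reception_weights(system, signal):
    """Collect the column of weights for one signal, one entry per meaning."""
    return [row[signal] for row in system]

# Hypothetical system: 3 meanings (rows) by 3 signals (columns).
example_system = [[3, 0, 0],
                  [0, 0, 2],
                  [0, 1, 0]]

# Production reads along a row: the strongest signal for meaning 1.
produced_signal = wta(example_system[1])

# Reception reads down a column: the strongest meaning for that signal.
column = reception_weights(example_system, produced_signal)
understood_meaning = wta(column)

print(produced_signal, column, understood_meaning)  # 2 [0, 2, 0] 1
```

Because the same weights are read in both directions, anything an agent learns about producing a signal also affects how it interprets that signal.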

### New code

It's really remarkable how little extra code is needed to build a model of learning. Here it is:

In [3]:
def learn(system, meaning, signal):
    system[meaning][signal] += 1

Simple, huh? In learning, agents store the association between a meaning and a signal, so one simple function is all we need. The function `learn` takes three arguments and is just two lines of code. The arguments are:
1. a signalling system
2. a meaning 
3. a signal

The function finds the appropriate cell in the signalling system matrix indexed by the meaning and signal, and adds one to the value of the weight in this cell.
Make sure you understand how this learning function works, what the parameters mean, and how the function updates the correct cell in the matrix.

Let's try out the function. Start with an "empty" agent that doesn't know anything about meanings and signals:

```python
s = [[0, 0, 0], 
     [0, 0, 0],
     [0, 0, 0],
     [0, 0, 0]]
```

(This agent has four meanings and three signals.)

We can train this agent with commands like these:

```python
learn(s,0,2)
learn(s,1,1)
learn(s,0,2)
learn(s,3,0)
```

And look at the resulting matrix using:

```python
print(s)
```

Make sure you understand how and why the matrix has changed.

In [4]:
s = [[0, 0, 0], 
     [0, 0, 0],
     [0, 0, 0],
     [0, 0, 0]]

learn(s,0,2)
learn(s,1,1)
learn(s,0,2)
learn(s,3,0)

print(s)

[[0, 0, 2], [0, 1, 0], [0, 0, 0], [1, 0, 0]]


### Training

Rather than entering each learning episode individually (which is a bit laborious), we can give an agent a list of meaning-signal pairs and have it learn them all through the single function `train`. This function goes through the list and learns each meaning-signal pair in turn.

In [6]:
def train(system, ms_pair_list):
    for pair in ms_pair_list:
        learn(system, pair[0], pair[1])

To make sure you understand how this works, create a signalling system, then provide it with a list of learning exposures, and check that the system has learnt from the data you have given it. Does it do what you expect?

Hint: you'll need to use the `wta` function to find out what your trained agent would produce/understand. For example, if you had an agent called `network`, then: `wta(network[1])` would tell you what signal that agent would produce for meaning 1, and `wta(reception_weights(network, 2))` would tell you what meaning that agent would understand for signal 2. (Make sure you understand why these are the right things to type in!)

In [14]:
s = [[0, 0, 0], 
     [0, 0, 0],
     [0, 0, 0],
     [0, 0, 0]]

train(s, [[0, 2], [1, 1], [0, 2], [3, 0]])

for meaning in range(4):
    print("meaning ", end="")
    print(meaning, end=" -> signal ")
    print(wta(s[meaning]))

print()
for signal in range(3):
    print("signal ", end="")
    print(signal, end=" -> meaning ")
    print(wta(reception_weights(s, signal)))
    

meaning 0 -> signal 2
meaning 1 -> signal 1
meaning 2 -> signal 0
meaning 3 -> signal 0

signal 0 -> meaning 3
signal 1 -> meaning 1
signal 2 -> meaning 0
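
Note that meaning 2 was never trained, so its row is all zeros and `wta` has to break a three-way tie at random; the signal printed for it can differ from run to run. The sketch below is one way to see this, using a compact re-statement of `wta`. The names `untrained_row`, `trained_row`, `tie_counts` and `trained_counts` are invented for this illustration.

```python
import random
from collections import Counter

def wta(items):
    """Return the index of the largest weight, breaking ties at random."""
    maxweight = max(items)
    candidates = [i for i, w in enumerate(items) if w == maxweight]
    return random.choice(candidates)

# An untrained row: every signal is tied, so each should be chosen some of the time.
untrained_row = [0, 0, 0]
tie_counts = Counter(wta(untrained_row) for _ in range(3000))
print(sorted(tie_counts.items()))

# A trained row (like meaning 0 above): signal 2 always wins.
trained_row = [0, 0, 2]
trained_counts = Counter(wta(trained_row) for _ in range(3000))
print(sorted(trained_counts.items()))
```

The exact counts for the untrained row will vary between runs, but each signal should be picked roughly a third of the time.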


### Learning in a population

The next part of the code allows us to go from a single agent to a population (if we wish). `pop_learn` takes a list of signalling systems, a list of utterances (i.e. `[meaning, signal]` pairs), and a number of learning episodes. For each learning episode, it trains a random individual in the population on a random utterance picked from the list of data.

The reason we need a function like `pop_learn` might not be immediately obvious, but will be clear when we come to the next notebook! For the time being, you can use this function to train a single agent by simply building a population that has a single agent in it. Alternatively, you can use it to look at whether two or more agents may end up speaking similar languages when exposed to utterances picked at random from a set of training data.

In [5]:
def pop_learn(population, data, no_learning_episodes):
    for n in range(no_learning_episodes):
        ms_pair = random.choice(data)
        learn(random.choice(population), ms_pair[0], ms_pair[1])

Try the following code out:

```python
p = [[[0, 0], [0, 0]]]
pop_learn(p, [[0, 0], [1, 1]], 100)
print(p)
```

Why are there three square brackets at the start of the variable `p`? Try different data. How can we use this way of training to model different frequencies of different types of utterance?

In [6]:
p = [[[0, 0], [0, 0]]]
pop_learn(p, [[0, 0], [1, 1]], 100)
print(p)

[[[49, 0], [0, 51]]]
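
One possible way to model different utterance frequencies is to list a pair more than once in the data, since `random.choice` picks each list entry with equal probability. The sketch below assumes this approach; `skewed_data` and `skewed_pop` are names made up for the example, and `learn` and `pop_learn` are repeated so that the block runs on its own.

```python
import random

def learn(system, meaning, signal):
    system[meaning][signal] += 1

def pop_learn(population, data, no_learning_episodes):
    for n in range(no_learning_episodes):
        ms_pair = random.choice(data)
        learn(random.choice(population), ms_pair[0], ms_pair[1])

# Hypothetical data: [0, 0] is listed three times, so it should be drawn
# about three times as often as [1, 1].
skewed_data = [[0, 0], [0, 0], [0, 0], [1, 1]]

# A population containing a single untrained agent (2 meanings, 2 signals).
skewed_pop = [[[0, 0], [0, 0]]]
pop_learn(skewed_pop, skewed_data, 1000)

print(skewed_pop)
```

The weight for meaning 0 with signal 0 should end up at roughly three times the weight for meaning 1 with signal 1, although the exact numbers change on every run.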


We have a way for a population to learn from some data, but how about getting them to produce data, in order to evaluate how well they have learnt? `pop_produce` carries out this function. It takes a population and a required number of productions, and returns a list of utterances (meaning-signal pairs) generated by individuals picked randomly from the population:

In [27]:
def pop_produce(population, no_productions):
    ms_pairs = []
    for n in range(no_productions):
        speaker = random.choice(population)
        meaning = random.randrange(len(speaker))
        signal = wta(speaker[meaning])
        ms_pairs.append([meaning, signal])  
    return ms_pairs

Finally, we have a population-based version of our Monte Carlo measure of communicative accuracy: `ca_monte_pop`. This takes a population and a number of trials, and returns a Monte Carlo estimate of the chance that a random communication between members of the population will be successful. Note that it returns a single value rather than a list of values, unlike previous implementations of Monte Carlo evaluation.

In [28]:
def ca_monte_pop(population, trials):
    total = 0.
    for n in range(trials):
        speaker = random.choice(population)
        hearer = random.choice(population)
        total += communicate(speaker, hearer, random.randrange(len(speaker)))
    return total / trials
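
To see how these pieces might fit together, the sketch below compares an untrained two-agent population with one trained by `pop_learn`, then samples a few utterances with `pop_produce`. The functions are compact re-statements of the ones defined in this notebook so that the block runs on its own; the population names and the choice of 200 learning episodes are arbitrary assumptions for the illustration.

```python
import random

def wta(items):
    maxweight = max(items)
    candidates = [i for i, w in enumerate(items) if w == maxweight]
    return random.choice(candidates)

def reception_weights(system, signal):
    return [row[signal] for row in system]

def communicate(speaker_system, hearer_system, meaning):
    speaker_signal = wta(speaker_system[meaning])
    hearer_meaning = wta(reception_weights(hearer_system, speaker_signal))
    return 1 if meaning == hearer_meaning else 0

def learn(system, meaning, signal):
    system[meaning][signal] += 1

def pop_learn(population, data, no_learning_episodes):
    for n in range(no_learning_episodes):
        ms_pair = random.choice(data)
        learn(random.choice(population), ms_pair[0], ms_pair[1])

def pop_produce(population, no_productions):
    ms_pairs = []
    for n in range(no_productions):
        speaker = random.choice(population)
        meaning = random.randrange(len(speaker))
        ms_pairs.append([meaning, wta(speaker[meaning])])
    return ms_pairs

def ca_monte_pop(population, trials):
    total = 0.
    for n in range(trials):
        speaker = random.choice(population)
        hearer = random.choice(population)
        total += communicate(speaker, hearer, random.randrange(len(speaker)))
    return total / trials

# Two untrained agents: every choice is a random tie-break.
untrained_pop = [[[0, 0], [0, 0]], [[0, 0], [0, 0]]]
untrained_accuracy = ca_monte_pop(untrained_pop, 10000)

# Two agents trained on a consistent language: meaning 0 -> signal 0, meaning 1 -> signal 1.
trained_pop = [[[0, 0], [0, 0]], [[0, 0], [0, 0]]]
pop_learn(trained_pop, [[0, 0], [1, 1]], 200)
produced = pop_produce(trained_pop, 5)
trained_accuracy = ca_monte_pop(trained_pop, 10000)

print(untrained_accuracy)
print(produced)
print(trained_accuracy)
```

With two meanings and two signals, the untrained estimate should hover around 0.5 (chance), while the trained population should communicate successfully on essentially every trial, provided both agents saw both pairs during training.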

### Questions

Answering questions 1-3 involves playing with the model - for question 4, you can just think about it (although you can have a go at coding your ideas up if you like).

1. How good is this model of learning? How can you test it? What does "good" even mean in this context? (Hint: this is a deeper and more important question than it first appears!)
2. Can you write some code to test how well an agent has learnt a particular language?

In [26]:
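def production_test(agent, data, trials):
    """Train `agent` on `data` (in place), then return the proportion of trials
    in which it produces the observed signal for a randomly chosen pair."""
    train(agent, data)
    score = 0
    for i in range(trials):
        datum = random.choice(data)
        if wta(agent[datum[0]]) == datum[1]:
            score += 1
    return score/trials

def reception_test(agent, data, trials):
    """Train `agent` on `data` (in place), then return the proportion of trials
    in which it recovers the observed meaning for a randomly chosen pair."""
    train(agent, data)
    score = 0
    for i in range(trials):
        datum = random.choice(data)
        if wta(reception_weights(agent, datum[1])) == datum[0]:
            score += 1
    return score/trials

print(production_test([[0, 0], [0, 0]], [[0, 0], [0, 1], [1, 1]], 10000))
print(reception_test([[0, 0], [0, 0]], [[0, 0], [0, 1], [1, 1]], 10000))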

0.6565
0.6673


3. Learning is implemented as a frequency count of associations. Are there other reasonable ways of updating the matrix? How else might you change weights in response to observations? 
4. In answering questions 1-3, you have probably been training agents on data that *you* provided. In a proper model of language learning, where would this data come from? Could you use the code above to model this?
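
For question 4, one possibility is that the data comes from another agent rather than from you. The sketch below is only one way to set this up, assuming a hypothetical `teacher` whose signalling system is fixed by hand and a `learner` that starts empty; `wta`, `learn`, `train` and `pop_produce` are repeated from above so that the block runs on its own.

```python
import random

def wta(items):
    maxweight = max(items)
    candidates = [i for i, w in enumerate(items) if w == maxweight]
    return random.choice(candidates)

def learn(system, meaning, signal):
    system[meaning][signal] += 1

def train(system, ms_pair_list):
    for pair in ms_pair_list:
        learn(system, pair[0], pair[1])

def pop_produce(population, no_productions):
    ms_pairs = []
    for n in range(no_productions):
        speaker = random.choice(population)
        meaning = random.randrange(len(speaker))
        ms_pairs.append([meaning, wta(speaker[meaning])])
    return ms_pairs

# Hypothetical teacher: meaning i is always expressed with signal i.
teacher = [[1, 0, 0],
           [0, 1, 0],
           [0, 0, 1]]

# The learner starts with no associations at all.
learner = [[0, 0, 0],
           [0, 0, 0],
           [0, 0, 0]]

# The teacher produces utterances; the learner is trained on what it observes.
observed = pop_produce([teacher], 100)
train(learner, observed)

print(learner)
```

Since the teacher has no ties in its system, the learner should only ever build up weights on the diagonal, so it ends up producing the same signal as the teacher for every meaning it has observed.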