# Linguistic Analyses for Compositional Abstractions
## Notebook 3: Forming conventions to talk about shared abstractions

This notebook was written by Robert Hawkins.  
Original modeling by Robert Hawkins, Will McCarthy, Cameron Holdaway, Haoliang Wang, and Judy Fan.

### *NOTE: THIS NOTEBOOK SERVES AS THE INSTRUCTOR VERSION*

In the previous notebook we studied a **concept learning** problem. We inferred libraries of program fragments corresponding to a collection of possible concepts people may have in their heads as they advance through the task. For each trial, we then generated a set of programs expressing the scene in different ways using different abstractions. 

In this notebook, we extend our model to address the **communication** problem. That is, given a tower scene and a set of concepts in the Architect's head, what linguistic instructions should they give to the Builder that will allow them to successfully reconstruct that scene? 

We approach this problem in a **probabilistic modeling** framework that extends the model of convention formation described by [Hawkins et al. (2023)](https://cocosci.princeton.edu/papers/hawkinspartners.pdf). 

This notebook is divided into 3 sections that incrementally build up to the full model.

**Section 1** begins by implementing the core notion of a *mental lexicon* -- a mapping from words to concepts.

**Section 2** implements a basic Architect and Builder agent that make decisions based on fixed lexicons.  

**Section 3** equips these agents with the ability to *learn* and update their *beliefs* about the lexicon over time.

**Section 4** finally reaches the core theoretical question of why speakers prefer one level of abstraction over another.

## Setup 

In [1]:
import os
import sys

import functools
import itertools
import json

import pandas as pd
import numpy as np

from numpy.random import choice

In [2]:
# import classes for our model
sys.path.append("../model/convention_formation/")
from distribution import *
from lexicon import *

In [3]:
%load_ext autoreload
%autoreload 2

## Section 1: Lexicons

### Representing lexicons

The basic building block of our model is an agent's **mental lexicon**, a particular set of correspondances between concepts and words. To define an agent model, we must first define the mental lexicon. 

We have created a class for the purpose called `BlockLexicon()` which manages the mapping between program primitives (as defined in **[the previous section](https://github.com/cogtoolslab/compositional_abstractions_tutorial/blob/main/notebooks/ca_programs.ipynb)**) to words and phrases. If you're interested in digging into the nitty-gritty details, we've put this class in a helper library __[here](https://github.com/cogtoolslab/compositional_abstractions_tutorial/blob/main/model/convention_formation/lexicon.py)__).

For our purposes here, though, we don't need to know exactly what's going on under the hood. Let's take the lexicon class out for a drive. We need to initialize it with two parameters, essentially corresponding to the base entities on each side of the mapping we want to define.

(1) `dsl`: a list of concepts in the DSL that might be expressed as words,

(2) `lexemes`: a list of words that are available to bind to new concepts. 

To build up our intuitions, we're going to work with a very simplified example lexicon, where the set of concepts is the library on trial 10 of the task (retrieved from the file we saved out in the previous section) and the list of lexemes is just a bunch of nonsense words that won't start with any meaning.

In [4]:
# Pull out the DSL primitives accessible at trial 10
d = pd.read_json('../data/model/programs_for_you/programs_ppt_1.json')
dsl = d['dsl'][10]

# Define a set of meaningless placeholder words available to be bound to meanings
lexemes = ['blah', 'blab', 'bloop', 'bleep', 'floop'] 
l = BlockLexicon(dsl, lexemes)

This lexicon object `l` that we've created has a few basic functions we can call. We can look up which word to use for any element of the `dsl`: 

In [5]:
print(dsl[0], '->', l.dsl_to_language(dsl[0]))
print(dsl[10], '->', l.dsl_to_language(dsl[10]))
print(dsl[-1], '->', l.dsl_to_language(dsl[-1]))

h -> place a horizontal block.
l_8 -> move to the left by 8
chunk_C -> place a blab.


and we can also go in the other direction, converting from a linguistic expression to a corresponding primitive in the DSL.
We went ahead and 'baked in' correspondences for the basic elements of our DSL, because we're assuming these meanings are pretty much deterministic; there's not much wiggle room about what 'place a horizontal block' or 'move to the left by 8' corresponds to. 

In [6]:
print('place a horizontal block. ->', l.language_to_dsl('place a horizontal block.'))
print('move to the left by 8 ->', l.language_to_dsl('move to the left by 8'))

place a horizontal block. -> h
move to the left by 8 -> l_8


> **sidenote**: if we wanted a more realistic model that worked on generic natural language, e.g. if we wanted to pair our agent with a real user writing in a chat box, we would want something more robust that doesn't require exact string matching. We might use a simple algorithm to find the nearest string in the lexicon, or we might use a fancier model operating over utterance embeddings.

There's a subtlety to notice about our `BlockLexicon()` class. If we pass in an unfamiliar utterance that isn't found in the list of lexemes we provided, it will return one of the concepts that doesn't already have a word assigned (and vice versa). We can think of this like the agent randomly 'guessing' rather than erroring.

In [11]:
print('place a blah. ->', l.language_to_dsl('place a blah.'))
print('place a flomp. ->', l.language_to_dsl('place a womp.'))

place a blah. -> chunk_C
place a flomp. -> chunk_L


### Representing beliefs about lexicons

The lexicon `l` we've been playing with so far is a deterministic data structure; it's just a single mapping. 

In a probabilistic model, however, we want to be able to talk in a mathematically rigorous way about an agent's subjective **beliefs** about a mapping. In other words, we want to be able to define a **probability distribution** over all possible lexicons $\mathcal{L}$. This will provide a precise definition for an agent's uncertainty about the lexicon in their partner's head, i.e. they assume their partner must be using one of these mappings $\mathcal{L}^*$, but do not know ahead of time exactly what it is.

We're going to introduce another custom class we've written called a `Distribution()`, which maintains the probabilities assigned to each possible lexicon. (Again, for nitty-gritty details, see the helper library __[here](https://github.com/cogtoolslab/compositional_abstractions_tutorial/blob/main/model/convention_formation/distribution.py)__). 

We'll define a **prior** distribution (the agent's initial beliefs) as a uniform distribution over all possible ways of binding elements in the DSL to the lexemes. 

In [15]:
# [f(x) for x in collection] is called a list comprehension 
# it creates a new list by applying some transformation to each element in a collection
# here we're relying on the fact that `BlockLexicon` uses the *order* of the list of lexemes
prior = LexiconPrior(dsl, lexemes)

We've also created some handy helper functions to work with distributions. For example, we can 'marginalize' to look at the possible mappings for any single chunk.

In [34]:
print('possible utterances for chunk_L : ', 
      prior.marginalize(lambda d : d.dsl_to_language('chunk_L')))

print('possible utterances for chunk_L : ', 
      prior.marginalize(lambda d : d.language_to_dsl('place a blah.')))

possible utterances for chunk_L :  {
    "place a blab.": 0.24999999999999997,
    "place a bloop.": 0.24999999999999997,
    "place a bleep.": 0.24999999999999997,
    "place a blah.": 0.24999999999999997
}
possible utterances for chunk_L :  {
    "chunk_C": 0.24999999999999997,
    "chunk_L": 0.24999999999999997,
    "chunk_Pi": 0.24999999999999997,
    "chunk_8": 0.24999999999999997
}


Here, we can verify that 'chunk_L' in the DSL is initially equally likely to be mapped to any utterance: $p = 1/|U|$ The equal probabilities for each expression tell us that the Agents think each expression is an equally good (or bad) translation of "chunk_L". This makes sense for this artificial langauge (where "bleeps" and "bloops" are used to refer to abstractions). Real people likely have strong priors about what words will mean (we can make quite a lot of sense of "build an L" before any shared experience, even if there's some remaining uncertainty about properties of the L like its size and width, etc). But a uniform prior helps us understand the dynamics of coordination in the most extreme case. 

Here are a few other things you can do with a distribution object:

In [35]:
# d.support() returns the support of the distribution d, i.e. the list of values that it is defined over
# e.g. we can print out the first lexicon in the support as an example of what lexicons look like
print('example element:', prior.support()[3])

example element: {
    "h": "place a horizontal block.",
    "v": "place a vertical block.",
    "l_0": "move to the left by 0",
    "l_1": "move to the left by 1",
    "l_2": "move to the left by 2",
    "l_3": "move to the left by 3",
    "l_4": "move to the left by 4",
    "l_5": "move to the left by 5",
    "l_6": "move to the left by 6",
    "l_7": "move to the left by 7",
    "l_8": "move to the left by 8",
    "l_9": "move to the left by 9",
    "l_10": "move to the left by 10",
    "l_11": "move to the left by 11",
    "l_12": "move to the left by 12",
    "r_0": "move to the right by 0",
    "r_1": "move to the right by 1",
    "r_2": "move to the right by 2",
    "r_3": "move to the right by 3",
    "r_4": "move to the right by 4",
    "r_5": "move to the right by 5",
    "r_6": "move to the right by 6",
    "r_7": "move to the right by 7",
    "r_8": "move to the right by 8",
    "r_9": "move to the right by 9",
    "r_10": "move to the right by 10",
    "r_11": "move to the

In [36]:
# d.score(val) returns the probability of val in the distribution d
# e.g. this is the probability assigned to the lexicon we just printed out
print(len(prior.support()))
print('P(^ that lexicon) = 1/24 = ', prior.score(prior.support()[0]))

24
P(^ that lexicon) = 1/24 =  0.041666666666666664


## Section 2: Simulating Agents

Now we have a way of representing an agent's beliefs over lexicons, we can define an Architect and Builder.

Both maintain their own belief distribution about the *other agent's* lexicon.

The **Architect** agent makes choices about what to say based by imaginging how the Builder will interpret them.  

The **Builder** takes actions based on its beliefs about what the Architect would say in different situations. 

In [37]:
class FixedAgent() :
    def __init__(self, role, trial) :
        '''
        Args: 
           * role: string giving agent's role in the task ('architect' or 'builder')
           * trial: dictionary of meta-data about the current trial 
        '''
        self.role = role
        self.actions = trial['dsl']

        # initialize beliefs to a uniform prior over possible lexicons, as above
        self.possible_lexicons = set([BlockLexicon(self.actions, list(mapping)) 
                                      for mapping in itertools.permutations(lexemes)])
        self.beliefs = UniformDistribution(self.possible_lexicons)
        self.utterances = list(self.possible_lexicons)[0].utterances
        
    def act(self, observation) :
        '''
        produce an action based on role and current beliefs
        '''
        if self.role == 'architect' :
            # Architect is going to build up a distribution over utterances to say
            utt_dist = self.beliefs.marginalize(lambda l : l.dsl_to_language(observation))
            return utt_dist.sample()

        if self.role == 'builder' :
            # get P(a | utt) by marginalizing over lexicons 
            action_dist = self.beliefs.marginalize(lambda l : l.language_to_dsl(observation))
            return action_dist.sample()

Because our focus is on modeling abstractions learned during the task, we assume that Architect Agents and Builder Agents can unambiguously communicate about the base DSL-- moving left and right and placing individual blocks. I.e. when an Architect wants a Builder to place a block they will always say "place a horizontal block", and the Builder will correctly interpret this utterance.

The only thing that varies across different lexicons is the words used for *learned* program fragments. In practice (for this example) only five of these were learned across all participants' trial sequences. The set of possible lexicons is therefore fully defined by the set of possible mappings from these fragments to the `lexemes` defined above.

For the built-in primitives like the horizontal block, this should be a deterministic choice:

In [38]:
first_trial = d.loc[0].to_dict()
architect = FixedAgent('architect', first_trial)
print('architect utterance: `h` -> ', architect.act('h'))

builder = FixedAgent('builder', first_trial)
print('builder choice: "place a horizontal block" -> ', builder.act('place a horizontal block.'))

architect utterance: `h` ->  place a horizontal block.
builder choice: "place a horizontal block" ->  h


### Running simulations

Now we have our agents, we need to run them forward through the trial sequence.

In [46]:
def run_simulation() :
    output = pd.DataFrame({"utt": [], "response": [], "target_program": [], "target_length" : [], "acc": []})
    for i, trial in d.iterrows() :
        architect = FixedAgent('architect', trial)
        builder = FixedAgent('builder', trial)

        # architect randomly selects which of the program representations to comunicate
        possiblePrograms = list(trial['programs_with_length'].keys())
        target_program = choice(possiblePrograms)

        # loop through steps of target program one at a time
        utts, responses, accs = [], [], []
        for step in target_program.split(' ') :
            utt = architect.act(step)
            response = builder.act(utt)
            utts.append(utt)
            responses.append(response)
            accs.append(1.0 * (response == step))

        output = pd.concat([output, pd.DataFrame({
            "trial": int(i),
            "utt": utts,
            "response": responses,
            "acc": accs,
            "target_program": target_program,
            "target_length" : trial['programs_with_length'][target_program],
        })])
    return output

run_0 = run_simulation()
display(run_0)

Unnamed: 0,utt,response,target_program,target_length,acc,trial
0,place a horizontal block.,h,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,1.0,0.0
1,move to the left by 4,l_4,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,1.0,0.0
2,place a horizontal block.,h,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,1.0,0.0
3,move to the left by 1,l_1,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,1.0,0.0
4,place a vertical block.,v,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,1.0,0.0
...,...,...,...,...,...,...
4,place a horizontal block.,h,v r_6 v l_5 h r_4 h r_9 chunk_L,9.0,1.0,11.0
5,move to the right by 4,r_4,v r_6 v l_5 h r_4 h r_9 chunk_L,9.0,1.0,11.0
6,place a horizontal block.,h,v r_6 v l_5 h r_4 h r_9 chunk_L,9.0,1.0,11.0
7,move to the right by 9,r_9,v r_6 v l_5 h r_4 h r_9 chunk_L,9.0,1.0,11.0


In [47]:
# let's inspect the first trial
print('target program: \n', run_0.query('trial==0').loc[0,'target_program'])
run_0.query('trial==0')[['utt','response','acc']]

target program: 
 h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h


Unnamed: 0,utt,response,acc
0,place a horizontal block.,h,1.0
1,move to the left by 4,l_4,1.0
2,place a horizontal block.,h,1.0
3,move to the left by 1,l_1,1.0
4,place a vertical block.,v,1.0
5,place a vertical block.,v,1.0
6,move to the right by 9,r_9,1.0
7,place a vertical block.,v,1.0
8,move to the right by 6,r_6,1.0
9,place a vertical block.,v,1.0


In [48]:
# let's inspect the final trial
print('target program: \n', run_0.query('trial==11').loc[0,'target_program'])
run_0.query('trial==11')[['utt','response','acc']]

target program: 
 v r_6 v l_5 h r_4 h r_9 chunk_L


Unnamed: 0,utt,response,acc
0,place a vertical block.,v,1.0
1,move to the right by 6,r_6,1.0
2,place a vertical block.,v,1.0
3,move to the left by 5,l_5,1.0
4,place a horizontal block.,h,1.0
5,move to the right by 4,r_4,1.0
6,place a horizontal block.,h,1.0
7,move to the right by 9,r_9,1.0
8,place a bloop.,chunk_8,0.0


### <span style="color: orange"> Exercise: explore how accuracy changes </span>

Wait, why is the accuracy so bad? Well, our agents aren't actually *learning* -- they're continuing to use their initial uniform priors.

## Section 3: Simulating learning

In a probabilistic framework, learning is equivalent to updating one's beliefs given new observations. We thus need to extend the agent class with a `update_beliefs()` function, using Bayes rule to turn their prior into a posterior using the outcomes of the previous trials. Here's Bayes rule, which gives us a blueprint to handle this:

$$P(\mathcal{L} | o) = \frac{P(o | \mathcal{L})P(\mathcal{L})}{\sum_{\mathcal{L}} P(o | \mathcal{L}) P(\mathcal{L})}$$

We already have the prior term $P(\mathcal{L})$, but we're missing the likelihood term $P(o | \mathcal{L})$ which scores how likely a given utterance or action $o$ would be under different lexicons $\mathcal{L}$. Following Hawkins et al. (2023), we use a simple Rational Speech Act (RSA) likelihood, where each agent reasons about a (simplified) mental model of the other agent, and asks how likely they would be to make a given choice if they had a given lexicon in their head. Specifically, the architect tries to design utterances to maximize the probability of success assuming a literal builder $B_0(a | u)$ which chooses among actions that are literally consistent with the utterance they hear. Meanwhile, the builder tries to pick actions assuming the architect is choosing utterances that are literally consistent with the target step they are trying to convey.

> **Sidenote**: As an implementational detail, note that we are creating a new agent at each step with the correspondingly updated beliefs rather than repeatedly updating the same agent.

In [61]:
class LearningAgent(FixedAgent) :
    def __init__(self, role, curr_trial, previous_trial_df) :
        super().__init__(role, curr_trial)
        combined_primitives = set().union(*previous_trial_df['dsl']) if not previous_trial_df.empty else self.actions
        self.possible_lexicons = set([BlockLexicon(set().union(combined_primitives), list(mapping)) 
                                      for mapping in itertools.permutations(lexemes)])
        self.update_beliefs(previous_trial_df)

    def B0(self, utt, lexicon) :
        '''
        simple builder agent that has equal probability
        of building anything that's literally consistent with the utterance
        '''
        builder_dist = EmptyDistribution()
        for action in self.actions :
            builder_dist.update({action : 1 if action == lexicon.language_to_dsl(utt) else 0.01})
        builder_dist.renormalize()
        return builder_dist
        
    def A0(self, target, lexicon) :
        '''
        simple architect agent that has equal probability
        of saying anything that's literally consistent with the target
        '''
        architect_dist = EmptyDistribution()
        for utt in self.utterances :
            architect_dist.update({utt : 1 if utt == lexicon.dsl_to_language(target) else 0.01})
        architect_dist.renormalize()
        return architect_dist

    def update_beliefs(self, previous_trial_df) :
        '''
        run bayes rule given observations in previous trials
        note that we run the calculation in log space because it's more numerically stable
        '''
        posterior = EmptyDistribution()
        posterior.to_logspace()

        # for each data point, we calculate the likelihood of the datapoint under each lexicon, 
        # weighted by the prior probability of that lexicon: 
        # P(l | obs) \propto P(l) * \prod_{o \in obs} P(o | l)
        # log P(l|obs) \propto log P(l) + \sum_{o \in obs} log P(o | l)
        for lexicon in self.beliefs.support() :
            prior_term = np.log(self.beliefs.score(lexicon))
            likelihood_term = sum([
                np.log(self.A0(step.target, lexicon).score(step.utterance)) 
                if self.role == 'builder'
                else np.log(self.B0(step.utterance, lexicon).score(step.response))
                for i, step in previous_trial_df.iterrows()
            ])
            posterior.update({lexicon : prior_term + likelihood_term})
        posterior.renormalize()
        posterior.from_logspace()
        self.beliefs = posterior

In [62]:
def run_learning_simulation(verbose = False) :
    output = pd.DataFrame({
        "utterance": [], "response": [], "target": [], "full_program" : [], "target_length" : [], "dsl" : [], "acc": []
    })
    for i, trial in d.iterrows() :
        architect = LearningAgent('architect', trial, output) # create agent with updated beliefs
        builder = LearningAgent('builder', trial, output)     # create agent with updated beliefs

        # random program selected from the options
        possiblePrograms = list(trial['programs_with_length'].keys())
        target_program = choice(possiblePrograms)

        # loop through steps of target program one at a time
        target_steps, utts, responses, accs = [], [], [], []
        for step in target_program.split(' ') :
            # get utterance from architect
            utt = architect.act(step)
            # get response from builder
            response = builder.act(utt)
            # update records
            target_steps.append(step)
            utts.append(utt)
            responses.append(response)
            accs.append(response == step)

        if verbose:
            print('beliefs about chunk_C meaning', 
                  architect.beliefs.marginalize(lambda d : d.dsl_to_language('chunk_C')) 
                  if 'chunk_C' in trial['dsl'] else None)
            print('trial', i)
            print(pd.DataFrame({'utts' : utts, 'responses' : responses, 'correct' : accs, 'target' : target_steps}))

        output = pd.concat([output, pd.DataFrame({
            "trial": i,
            "utterance": utts,
            "response": responses,
            "acc": accs,
            "target_steps" : target_steps,
            "full_program": target_program,
            "dsl" : [trial['dsl']] * len(utts),
            "target_length" : trial['programs_with_length'][target_program],
        })])
    return output

In [63]:
learning_run_0 = run_learning_simulation(verbose = True)
learning_run_0

beliefs about chunk_C meaning None
trial 0
                         utts responses  correct target
0   place a horizontal block.         h     True      h
1       move to the left by 4       l_4     True    l_4
2   place a horizontal block.         h     True      h
3       move to the left by 1       l_1     True    l_1
4     place a vertical block.         v     True      v
5     place a vertical block.         v     True      v
6      move to the right by 9       r_9     True    r_9
7     place a vertical block.         v     True      v
8      move to the right by 6       r_6     True    r_6
9     place a vertical block.         v     True      v
10      move to the left by 5       l_5     True    l_5
11  place a horizontal block.         h     True      h
12     move to the right by 4       r_4     True    r_4
13  place a horizontal block.         h     True      h


  output = pd.concat([output, pd.DataFrame({


beliefs about chunk_C meaning None
trial 1
                         utts responses  correct target
0   place a horizontal block.         h     True      h
1       move to the left by 4       l_4     True    l_4
2   place a horizontal block.         h     True      h
3       move to the left by 1       l_1     True    l_1
4     place a vertical block.         v     True      v
5     place a vertical block.         v     True      v
6     move to the right by 12      r_12     True   r_12
7   place a horizontal block.         h     True      h
8       move to the left by 1       l_1     True    l_1
9     place a vertical block.         v     True      v
10    place a vertical block.         v     True      v
11     move to the right by 1       r_1     True    r_1
12  place a horizontal block.         h     True      h
beliefs about chunk_C meaning None
trial 2
                         utts responses  correct target
0   place a horizontal block.         h     True      h
1       move to th

Unnamed: 0,utterance,response,target,full_program,target_length,dsl,acc,trial,target_steps
0,place a horizontal block.,h,,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",1.0,0.0,h
1,move to the left by 4,l_4,,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",1.0,0.0,l_4
2,place a horizontal block.,h,,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",1.0,0.0,h
3,move to the left by 1,l_1,,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",1.0,0.0,l_1
4,place a vertical block.,v,,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",1.0,0.0,v
...,...,...,...,...,...,...,...,...,...
4,place a horizontal block.,h,,v r_6 v l_5 h r_4 h r_9 chunk_L,9.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",True,11.0,h
5,move to the right by 4,r_4,,v r_6 v l_5 h r_4 h r_9 chunk_L,9.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",True,11.0,r_4
6,place a horizontal block.,h,,v r_6 v l_5 h r_4 h r_9 chunk_L,9.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",True,11.0,h
7,move to the right by 9,r_9,,v r_6 v l_5 h r_4 h r_9 chunk_L,9.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",True,11.0,r_9


## Section 4: Choosing programs.

In principle, words allow us to communicate arbitrarily complex concepts, and our mental representation of linguistic meaning is flexible enough to be updated over time. To the extent that both Architects and Builders are learning to 'chunk' blocks over time, each  program fragment) could, in principle, be assigned a new word or phrase, and conveyed directly through the Architects' instructions. Well, almost. There is no guarantee that the words the Architect chooses to pick out a new concept would invoke the same concept in someone else. There is actually uncertainty about how the Builder will interpret the Architects' instructions and this, we suggest, might change what the Architect chooses to say.

Our hypothesis is that Architects trade-off communicative *efficiency* with communicative *effectiveness*. While they generally want to say things concisely, if there is too much uncertainty about how the Builder will interpret their words, they will choose a less ambiguous (if more wordy) way of expressing the same information. Concretely, if the Architect wanted to say "build an L" but they thought that the Builder wouldn't get what "L" meant, they might spell out the steps to make an L-shaped tower instead.

Wow, this is great! We can see our agents are updating their beliefs about the lexicon over time and able to get somewhat more accurate as they coordinate. But one of the most interesting things about our empirical data is that speakers seem to be strategically choosing which representation of the tower to convey -- our best current theory of why participants reduce the length of their utterances over time is that even when new library chunks come online, architects don't always try to refer to them right away. They aren't confident enough that their partner will understand, as the block-level descriptions are much safer. However, the block-level descriptions are also much *costlier* in terms of time and effect because they have to laboriously describe one action at a time. 

So far, we just used a placeholder for how the speaker picks which representation to communication: they just randomly pick from the list of candidates, slightly preferring shorter programs. However, there are other considerations that ought to go into this decision, namely the estimated likelihood that the listener will do the right thing.

As an exercise, add one line of code to the simulation to weight the target program according to its utility

In [77]:
from scipy.special import softmax

class StrategicArchitect(LearningAgent) :
    def __init__(self, curr_trial, previous_trial_df) :
        super().__init__('architect', curr_trial, previous_trial_df) 
        self.alpha = 1
        
    def program_utility(self, target_program) :
        '''
        comptutes the utility of 
        '''
        program_length = len(target_program)
        expected_step_infs = []
        for step in target_program :
            # calculate expected inf(u) = \sum_L P(L) * ln P_B(a* | u, L) 
            expected_utt_utility = [
                sum([self.beliefs.score(l) * np.log(self.B0(utt, l).score(step))
                     for l in self.beliefs.support()])
                for utt in self.utterances
            ]
            # weight inf(u) by how likely softmax speaker is to actually say it
            expected_step_infs.append(
                sum(expected_utt_utility * softmax(self.alpha * expected_utt_utility))
            )
        # take average expected informativity at each steps of program (penalizing length)
        return np.mean(expected_step_infs) - program_length
        
    def speak(self, possible_programs) :
        '''
        produce an action based on role and current beliefs
        '''
        # Architect is going to build up a distribution over utterances to say
        # Architect selects which program representation to comunicate proportional to informativity and length
        raw_utilities = [self.program_utility(p.split(' ')) for p in possible_programs]
        print(raw_utilities)
        print(softmax(self.alpha * raw_utilities))
        chosen_p = choice(a = possible_programs, p = softmax(self.alpha * raw_utilities))
        print(chosen_p)
        return chosen_p, [self.act(step) for step in chosen_p.split(' ')]

def run_strategic_simulation() :
    output = pd.DataFrame({"utterance": [], "response": [], "target": [], "full_program" : [], "target_length" : [], "dsl" : [], "acc": []})
    for i, trial in d.iterrows() :
        architect = StrategicArchitect(trial, output)
        builder = LearningAgent('builder', trial, output)

        possible_programs = list(trial['programs_with_length'].keys())
        chosen_program, utt_seq = architect.speak(possible_programs)

        # loop through steps of target program one at a time
        target_steps, utts, responses, accs = [], [], [], []
        for step, utt in zip(chosen_program.split(' '), utt_seq) :
            response = builder.act(utt)
            target_steps.append(step)
            utts.append(utt)
            responses.append(response)
            accs.append(response == step)

        print('trial', i)
        print(pd.DataFrame({'utts' : utts, 'responses' : responses, 'correct' : accs, 'target' : target_steps}))
        output = pd.concat([output, pd.DataFrame({
            "utterance": utts,
            "response": responses,
            "acc": accs,
            "target" : target_steps,
            "full_program": chosen_program,
            "dsl" : [trial['dsl']] * len(utts),
            "target_length" : trial['programs_with_length'][chosen_program],
        })])
    return output
print(run_strategic_simulation())

[-15.218068829775046]
[1.]
h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h
trial 0
                         utts responses  correct target
0   place a horizontal block.         h     True      h
1       move to the left by 4       l_4     True    l_4
2   place a horizontal block.         h     True      h
3       move to the left by 1       l_1     True    l_1
4     place a vertical block.         v     True      v
5     place a vertical block.         v     True      v
6      move to the right by 9       r_9     True    r_9
7     place a vertical block.         v     True      v
8      move to the right by 6       r_6     True    r_6
9     place a vertical block.         v     True      v
10      move to the left by 5       l_5     True    l_5
11  place a horizontal block.         h     True      h
12     move to the right by 4       r_4     True    r_4
13  place a horizontal block.         h     True      h


  output = pd.concat([output, pd.DataFrame({


[-14.218068829775046]
[1.]
h l_4 h l_1 v v r_12 h l_1 v v r_1 h
trial 1
                         utts responses  correct target
0   place a horizontal block.         h     True      h
1       move to the left by 4       l_4     True    l_4
2   place a horizontal block.         h     True      h
3       move to the left by 1       l_1     True    l_1
4     place a vertical block.         v     True      v
5     place a vertical block.         v     True      v
6     move to the right by 12      r_12     True   r_12
7   place a horizontal block.         h     True      h
8       move to the left by 1       l_1     True    l_1
9     place a vertical block.         v     True      v
10    place a vertical block.         v     True      v
11     move to the right by 1       r_1     True    r_1
12  place a horizontal block.         h     True      h
[-15.218068829775046]
[1.]
h l_1 v v r_1 h r_6 v r_6 v l_5 h r_4 h
trial 2
                         utts responses  correct target
0   place a h

### Summary


Congratulations! You've finished the tutorial!

We hope you found it fun and informative.