# Linguistic Analyses for Compositional Abstractions (COSMOS)
## Notebook 3: forming conventions to talk about learned abstractions

In the previous notebook we inferred libraries of program fragments that correspond to the collection of part concepts people have in their heads as they step through the trials. For each trial we also generated programs that use these abstractions to express each scene more efficiently. Now we tackle the challenge of communicating these programs to another person. 

This is where language comes in. Words allow us to communicate arbitrarily complex concepts. Assuming that Architects and Builders learn the concepts over a trial sequence, each learned concept (each program fragment) could, in principle, be assigned a new word or phrase and conveyed directly through the Architect's instructions. Well, almost. There is no guarantee that the words the Architect chooses for a new concept will evoke the same concept in someone else. The Architect is uncertain about how the Builder will interpret their instructions, and this, we suggest, might change what the Architect chooses to say.

Our hypothesis is that Architects trade off communicative *efficiency* against communicative *effectiveness*. While they generally want to say things concisely, if there is too much uncertainty about how the Builder will interpret their words, they will choose a less ambiguous (if wordier) way of expressing the same information. Concretely, if the Architect wanted to say "build an L" but thought that the Builder wouldn't know what "L" meant, they might spell out the steps to make an L-shaped tower instead.
 
 


## Setup 

In [1]:
import os
import sys

import functools
import itertools
import json

import pandas as pd
import numpy as np

from numpy.random import choice

d = pd.read_json('../../../model/lib_learning_output/synthesis_output_cogsci_revised/ca_synthesis_cogsci_21_ppt_1.json')

In [2]:
# import classes for our model
sys.path.append("./model/convention_formation/")
from distribution import *
from lexicon import *

## Constructing the lexicon

First we need a way to represent the mapping of part concepts to words in the Architects' and Builders' heads. 

We have defined a BlockLexicon class in `lexicon.py`, which stores the mapping of program primitives (from the previous section) to words and phrases.

Let's take this class out for a drive. 

We initialize it with the primitives of the agent's DSL on a given trial and an (ordered) list of available lexemes.

In [3]:
# lexemes for program abstractions learned in previous section
# in our example, only 5 distinct abstractions were learned, so we only need 5 additional lexemes
lexemes = ['blah', 'blab', 'bloop', 'bleep', 'floop'] 

In [4]:
dsl = d['dsl'][10]
l = BlockLexicon(dsl, lexemes)
print(dsl[0], '->', l.dsl_to_language(dsl[0]))
print(dsl[10], '->', l.dsl_to_language(dsl[10]))
print(dsl[-1], '->', l.dsl_to_language(dsl[-1]))

h -> place a horizontal block.
l_8 -> move to the left by 8
chunk_C -> place a blab.


We can also go in the other direction, converting from language back to programmatic "concepts".

In [5]:
print('place a horizontal block. ->', l.language_to_dsl('place a horizontal block.'))
print('move to the left by 8 ->', l.language_to_dsl('move to the left by 8'))
print('place a blah. ->', l.language_to_dsl('place a blah.'))
# 'womp' is not one of our lexemes, so this probes how unknown words are handled
print('place a womp. ->', l.language_to_dsl('place a womp.'))

place a horizontal block. -> h
move to the left by 8 -> l_8
place a blah. -> chunk_Pi
place a womp. -> chunk_Pi
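
To make the two directions of the mapping concrete, here is a minimal stand-in written for illustration only. `ToyLexicon` is a hypothetical class, not the `BlockLexicon` from `lexicon.py`, and its handling of unknown phrases (returning `None`) is an assumption that differs from the output shown above.

```python
# Hypothetical stand-in for BlockLexicon; the real class lives in lexicon.py
# and may be implemented quite differently.
class ToyLexicon(dict):
    """Maps DSL primitives to phrases; learned chunks get one lexeme each, in order."""

    def __init__(self, dsl, lexemes):
        super().__init__()
        chunk_words = iter(lexemes)
        for primitive in dsl:
            if primitive == 'h':
                self[primitive] = 'place a horizontal block.'
            elif primitive == 'v':
                self[primitive] = 'place a vertical block.'
            elif primitive.startswith('l_'):
                self[primitive] = 'move to the left by ' + primitive[2:]
            elif primitive.startswith('r_'):
                self[primitive] = 'move to the right by ' + primitive[2:]
            else:
                # learned chunk: assign the next unused lexeme
                self[primitive] = 'place a ' + next(chunk_words) + '.'

    def dsl_to_language(self, primitive):
        return self[primitive]

    def language_to_dsl(self, phrase):
        # invert the mapping; unknown phrases return None (an assumption)
        inverse = {words: primitive for primitive, words in self.items()}
        return inverse.get(phrase)


toy_dsl = ['h', 'v', 'l_8', 'r_9', 'chunk_Pi', 'chunk_C']
toy = ToyLexicon(toy_dsl, ['blah', 'blab'])
print(toy.dsl_to_language('l_8'))            # move to the left by 8
print(toy.dsl_to_language('chunk_C'))        # place a blab.
print(toy.language_to_dsl('place a blah.'))  # chunk_Pi
```

Because the lexemes are assigned in order, swapping the order of the lexeme list swaps which chunk each word refers to. That ordering is what distinguishes one candidate lexicon from another.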


## Representing beliefs about lexicons

To model the Architect's word choice, we explicitly model their beliefs about the Builder's lexicon. We represent these beliefs as a distribution over a set of possible lexicons. For this example, we manually construct the distribution as a dictionary from lexicons to probabilities (see `distribution.py` if you're interested in how this is implemented).

Because our focus is on modeling abstractions learned during the task, we assume that Architect and Builder agents can communicate unambiguously about the base DSL: moving left and right and placing individual blocks. That is, when an Architect wants a Builder to place a horizontal block, they will always say "place a horizontal block.", and the Builder will always interpret this utterance correctly.

The only thing that varies across different lexicons is the words used for *learned* program fragments. In practice (for this example) only five of these were learned across all participants' trial sequences. The set of possible lexicons is therefore fully defined by the set of possible mappings from these fragments to the `lexemes` defined above.

In [6]:
# construct full lexicon for all possible mappings
possible_lexicons = [BlockLexicon(dsl, list(mapping)) for mapping in itertools.permutations(lexemes)]

We define the Architects' and Builders' priors (their initial beliefs) as the uniform distribution over these lexicons:

In [7]:
prior = UniformDistribution(possible_lexicons)

print('Example lexicon')

print('lexicon:', json.dumps(prior.support()[0], indent = 4))

print('P(lexicon) =', prior.score(prior.support()[0]))

Example lexicon
lexicon: {
    "lexemes": [
        "blah",
        "blab",
        "bloop",
        "bleep",
        "floop"
    ],
    "h": "place a horizontal block.",
    "v": "place a vertical block.",
    "l_0": "move to the left by 0",
    "l_1": "move to the left by 1",
    "l_2": "move to the left by 2",
    "l_3": "move to the left by 3",
    "l_4": "move to the left by 4",
    "l_5": "move to the left by 5",
    "l_6": "move to the left by 6",
    "l_7": "move to the left by 7",
    "l_8": "move to the left by 8",
    "l_9": "move to the left by 9",
    "l_10": "move to the left by 10",
    "l_11": "move to the left by 11",
    "l_12": "move to the left by 12",
    "r_0": "move to the right by 0",
    "r_1": "move to the right by 1",
    "r_2": "move to the right by 2",
    "r_3": "move to the right by 3",
    "r_4": "move to the right by 4",
    "r_5": "move to the right by 5",
    "r_6": "move to the right by 6",
    "r_7": "move to the right by 7",
    "r_8": "move to the

We can also marginalize to look at values of any particular chunk:

In [8]:
print('possible values of chunk_L : ', 
      json.dumps(prior.marginalize(lambda d : d['chunk_L']), indent = 4))

possible values of chunk_L :  {
    "place a bloop.": 0.19999999999999998,
    "place a bleep.": 0.19999999999999998,
    "place a floop.": 0.19999999999999998,
    "place a blab.": 0.19999999999999998,
    "place a blah.": 0.19999999999999998
}


The equal probabilities tell us that the agents consider each expression an equally good (or bad) translation of "chunk_L". This makes sense for this artificial language (where "bleeps" and "bloops" are used to refer to abstractions). Real people likely have strong priors about what words will mean (we can make quite a lot of sense of "build an L" before any shared experience), but a uniform prior is a reasonable place to start.
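
The value of 0.2 can be checked by hand: five lexemes give 5! = 120 possible mappings, and each lexeme is assigned to any given chunk in 24 of them. The sketch below reproduces this with plain dictionaries and exact fractions instead of the notebook's `UniformDistribution`. The chunk ordering and the fifth chunk name are placeholders, not taken from the data.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

lexemes = ['blah', 'blab', 'bloop', 'bleep', 'floop']
# chunk names: the order and 'chunk_X' are placeholders for illustration
chunks = ['chunk_Pi', 'chunk_C', 'chunk_L', 'chunk_8', 'chunk_X']

# one candidate mapping per ordering of the lexemes
mappings = [dict(zip(chunks, perm)) for perm in permutations(lexemes)]
prior = Fraction(1, len(mappings))  # uniform prior over mappings

# marginal distribution over the word assigned to chunk_L
marginal = Counter()
for mapping in mappings:
    marginal[mapping['chunk_L']] += prior

print(len(mappings))  # 120
for word, p in marginal.items():
    print(word, p)    # each word has probability 1/5
```

The `0.19999999999999998` in the notebook output is presumably the same 1/5 after floating-point rounding.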

## Create agents

Now that we have a way of representing beliefs over lexicons, we can define our Architect and Builder agents.

Both maintain a set of possible lexicons and a belief distribution over the *other agent's* lexicon.

The **Architect** agent chooses what to say based on its beliefs about how the **Builder** will interpret its utterances.  

The **Builder** takes actions based on its beliefs about what the **Architect's** utterances mean.

In [9]:
class FixedAgent() :
    def __init__(self, role, trial) :
        self.role = role
        self.actions = trial['dsl']

        # initialize beliefs to uniform prior over lexicons
        self.possible_lexicons = set([BlockLexicon(self.actions, list(mapping)) 
                                      for mapping in itertools.permutations(lexemes)])
        
        self.beliefs = UniformDistribution(self.possible_lexicons)
        self.utterances = set(list(self.possible_lexicons)[0].values())
        
    def act(self, observation) :
        if self.role == 'architect' :
            # get P(utt | target) by marginalizing over lexicons 
            utt_dist = EmptyDistribution()
            for lexicon in self.beliefs.support() :
                utt_dist.update({lexicon.dsl_to_language(observation) : self.beliefs.score(lexicon)})
            return choice(a = [*utt_dist.support()], 
                          p = [utt_dist.score(u) for u in utt_dist.support()])

        if self.role == 'builder' :
            # get P(a | utt) by marginalizing over lexicons 
            action_dist = EmptyDistribution()
            for lexicon in self.beliefs.support() :
                action_dist.update({lexicon.language_to_dsl(observation) : self.beliefs.score(lexicon)})
            return choice(a = [*action_dist.support()], 
                          p = [action_dist.score(a) for a in action_dist.support()])

In [10]:
architect = FixedAgent('architect', d.loc[0].to_dict())
print('architect choice: ', architect.act('h'))

builder = FixedAgent('builder', d.loc[0].to_dict())
print('builder choice: ', builder.act('place a horizontal block.'))

architect choice:  place a horizontal block.
builder choice:  h


In [37]:
architect = FixedAgent('architect', d.loc[0].to_dict())
print('architect choice: ', architect.act('chunk_L'))

architect choice:  blab
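
The marginalization inside `act` can be illustrated without the notebook's classes. This is a sketch under stated assumptions: the three candidate lexicons and their belief weights are made up, plain dictionaries stand in for `BlockLexicon` and the distribution classes, and `random.choices` stands in for `numpy.random.choice`.

```python
import random

# Hypothetical candidate lexicons and belief weights, for illustration only
candidate_lexicons = [
    {'chunk_L': 'place a blah.', 'chunk_C': 'place a blab.', 'chunk_Pi': 'place a bloop.'},
    {'chunk_L': 'place a blah.', 'chunk_C': 'place a bloop.', 'chunk_Pi': 'place a blab.'},
    {'chunk_L': 'place a blab.', 'chunk_C': 'place a blah.', 'chunk_Pi': 'place a bloop.'},
]
beliefs = [0.5, 0.25, 0.25]


def utterance_distribution(target, lexicons, probs):
    """P(utt | target): sum P(lexicon) over the lexicons that map target to utt."""
    dist = {}
    for lexicon, p in zip(lexicons, probs):
        utt = lexicon[target]
        dist[utt] = dist.get(utt, 0.0) + p
    return dist


utt_dist = utterance_distribution('chunk_L', candidate_lexicons, beliefs)
print(utt_dist)  # {'place a blah.': 0.75, 'place a blab.': 0.25}

# sample one utterance in proportion to its marginal probability
rng = random.Random(0)
utt = rng.choices(list(utt_dist), weights=list(utt_dist.values()), k=1)[0]
print(utt)
```

Two of the three lexicons agree that `chunk_L` is called "blah", so their probability mass pools onto the same utterance. The Builder's side works the same way, with utterances and actions swapped.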


## Run simulation

Now that we have our agents, we need to run them forward through the trial sequence.

In [38]:
alpha = 0.04 # length-penalty weight: larger values make the Architect prefer shorter programs more strongly
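
To get a feel for what `alpha` does, the sketch below applies the same softmax over negative program length that the simulation uses, written with the standard library only. `length_softmax` is a helper name introduced here. The lengths 14, 9 and 3 are program lengths that appear in this notebook's outputs, used only as an example.

```python
import math


def length_softmax(lengths, alpha):
    """Probability of choosing each program, proportional to exp(-alpha * length)."""
    weights = [math.exp(-alpha * n) for n in lengths]
    total = sum(weights)
    return [w / total for w in weights]


lengths = [14, 9, 3]
probs = length_softmax(lengths, 0.04)

# alpha = 0 ignores length entirely; larger alpha favors short programs more
for a in (0.0, 0.04, 1.0):
    print(a, [round(p, 3) for p in length_softmax(lengths, a)])
```

With `alpha = 0.04` the preference for the shortest program is mild, which is why the Architect still often picks long block-level programs.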

In [60]:
def run_simulation() :
    output = pd.DataFrame({"utt": [], "response": [], "target_program": [], "target_length" : [], "acc": []})
    for i, trial in d.iterrows() :
        architect = FixedAgent('architect', trial)
        builder = FixedAgent('builder', trial)

        # architect samples which program representation to communicate; shorter programs are more likely
        possiblePrograms = list(trial['programs_with_length'].keys())
        possibleLengths = np.array(list(trial['programs_with_length'].values()))
        utilities = np.exp(-alpha * possibleLengths) / sum(np.exp(-alpha * possibleLengths))
        target_program = choice(a = possiblePrograms, p = utilities)

        # loop through steps of target program one at a time
        utts, responses, accs = [], [], []
        for step in target_program.split(' ') :
            utt = architect.act(step)
            response = builder.act(utt)
            utts.append(utt)
            responses.append(response)
            accs.append(1.0 * (response == step))

        output = pd.concat([output, pd.DataFrame({
            "trial": int(i),
            "utt": utts,
            "response": responses,
            "acc": accs,
            "target_program": target_program,
            "target_length" : trial['programs_with_length'][target_program],
        })])
    return output

run_0 = run_simulation()
display(run_0)

Unnamed: 0,utt,response,target_program,target_length,acc,trial
0,place a horizontal block.,h,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,1.0,0.0
1,move to the left by 4,l_4,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,1.0,0.0
2,place a horizontal block.,h,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,1.0,0.0
3,move to the left by 1,l_1,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,1.0,0.0
4,place a vertical block.,v,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,1.0,0.0
...,...,...,...,...,...,...
7,move to the right by 9,r_9,v r_6 v l_5 h r_4 h r_9 chunk_L,9.0,1.0,10.0
8,place a bloop.,chunk_Pi,v r_6 v l_5 h r_4 h r_9 chunk_L,9.0,0.0,10.0
0,place a blah.,chunk_8,chunk_Pi r_9 chunk_L,3.0,0.0,11.0
1,move to the right by 9,r_9,chunk_Pi r_9 chunk_L,3.0,1.0,11.0


In [74]:
# let's inspect the first trial
print('target program: \n', run_0.query('trial==0').loc[0,'target_program'])
run_0.query('trial==0')[['utt','response','acc']]

target program: 
 h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h


Unnamed: 0,utt,response,acc
0,place a horizontal block.,h,1.0
1,move to the left by 4,l_4,1.0
2,place a horizontal block.,h,1.0
3,move to the left by 1,l_1,1.0
4,place a vertical block.,v,1.0
5,place a vertical block.,v,1.0
6,move to the right by 9,r_9,1.0
7,place a vertical block.,v,1.0
8,move to the right by 6,r_6,1.0
9,place a vertical block.,v,1.0


In [75]:
# let's inspect the final trial
print('target program: \n', run_0.query('trial==11').loc[0,'target_program'])

run_0.query('trial==11')[['utt','response','acc']]

target program: 
 chunk_Pi r_9 chunk_L


Unnamed: 0,utt,response,acc
0,place a blah.,chunk_8,0.0
1,move to the right by 9,r_9,1.0
2,place a bleep.,chunk_Pi,0.0


### <span style="color: orange"> Exercise: explore how accuracy changes </span>

Wait, why is the accuracy so bad? Well, our agents aren't actually *learning* -- they're continuing to use their initial uniform priors.
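
How bad should we expect accuracy on chunk words to be? The sketch below computes chance accuracy exactly for a smaller toy case with three lexemes, assuming (as in `FixedAgent`) that both agents act on independent uniform beliefs. The chunk names are placeholders.

```python
from fractions import Fraction
from itertools import permutations

lexemes = ['blah', 'blab', 'bloop']
chunks = ['chunk_A', 'chunk_B', 'chunk_C']  # placeholder names
lexicons = [dict(zip(chunks, perm)) for perm in permutations(lexemes)]
p_lex = Fraction(1, len(lexicons))  # uniform belief over lexicons


def chance_accuracy(target):
    """P(Builder recovers target) when both agents sample from uniform beliefs."""
    total = Fraction(0)
    for speaker_lex in lexicons:        # lexicon the Architect happens to speak with
        word = speaker_lex[target]
        for listener_lex in lexicons:   # lexicon the Builder happens to interpret with
            inverse = {w: c for c, w in listener_lex.items()}
            if inverse[word] == target:
                total += p_lex * p_lex
    return total


print(chance_accuracy('chunk_A'))  # 1/3
```

By the same argument, with five lexemes a chunk word is interpreted correctly only one time in five, while base DSL steps are always understood.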

## Update beliefs

To have our agents learn, we need to extend the agent class to do Bayesian inference.

Here we add an `update_beliefs` method, which performs a Bayesian update of the agent's beliefs about the other agent's lexicon based on the outcomes of all previous trials. Note that this update happens when the class is initialized, so really we're defining a new agent on each trial with updated beliefs.

In [76]:
class LearningAgent(FixedAgent) :
    def __init__(self, role, curr_trial, previous_trial_df) :
        super().__init__(role, curr_trial)
        combined_primitives = set().union(*previous_trial_df['dsl']) if not previous_trial_df.empty else self.actions
        self.possible_lexicons = set([BlockLexicon(set().union(combined_primitives), list(mapping)) 
                                      for mapping in itertools.permutations(lexemes)])
        self.utterances = set(list(self.possible_lexicons)[0].values())
        self.update_beliefs(previous_trial_df)

    def update_beliefs(self, previous_trial_df) :
        # Initialize posterior 
        posterior = EmptyDistribution()
        posterior.to_logspace()

        # for each data point, calculate the marginal likelihood under lexicon distribution
        # P(l | obs) = 1/Z * P(l) * \prod_{o \in obs} P(o | l)
        # log P(l|obs) = -log Z + log P(l) + \sum_{o \in obs} log P(o | l)
        for lexicon in self.beliefs.support() :
            prior_term = np.log(self.beliefs.score(lexicon))
            likelihood_term = 0
            for i, step in previous_trial_df.iterrows() :
                if self.role == 'builder' :
                    likelihood_term += np.log(self.A1(step.target, lexicon).score(step.utterance))
                elif self.role == 'architect' :
                    likelihood_term += np.log(self.B0(step.utterance, lexicon).score(step.response))
            posterior.update({lexicon : prior_term + likelihood_term})
        posterior.renormalize()
        posterior.from_logspace()
        self.beliefs = posterior
        
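    def B0(self, utt, lexicon) :
        # literal Builder: P(action | utt, lexicon)
        # the action this lexicon assigns to utt gets weight 1; every other action gets noise weight 0.01
        builder_dist = EmptyDistribution()
        for action in self.actions :
            builder_dist.update({action : 1 if action == lexicon.language_to_dsl(utt) else 0.01})
        builder_dist.renormalize()
        return builder_dist
        
    def A1(self, target, lexicon) :
        # Architect model: P(utt | target, lexicon)
        # the utterance this lexicon assigns to target gets weight 1; every other utterance gets noise weight 0.01
        architect_dist = EmptyDistribution()
        for utt in self.utterances :
            architect_dist.update({utt : 1 if utt == lexicon.dsl_to_language(target) else 0.01})
        architect_dist.renormalize()
        return architect_dist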
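
The update rule in the comments above can be traced on a toy problem with two chunks and two lexemes. This is a simplified sketch, not the notebook's implementation: plain lists replace the distribution classes, utterances are bare lexemes, and only the Builder-side likelihood (the analogue of `A1`) is modeled.

```python
import math
from itertools import permutations

lexemes = ['blah', 'blab']
chunks = ['chunk_L', 'chunk_C']
# lexicons[0]: chunk_L -> blah, chunk_C -> blab; lexicons[1]: the reverse
lexicons = [dict(zip(chunks, perm)) for perm in permutations(lexemes)]

NOISE = 0.01  # unnormalized weight for non-matching options, as in B0 / A1


def likelihood(utt, target, lexicon):
    """P(utt | target, lexicon): weight 1 for the lexicon's word, NOISE otherwise."""
    weights = {w: (1.0 if w == lexicon[target] else NOISE) for w in lexemes}
    return weights[utt] / sum(weights.values())


def posterior(observations):
    """log P(l | obs) = log P(l) + sum_o log P(o | l) - log Z, then exponentiate."""
    log_scores = []
    for lexicon in lexicons:
        log_p = math.log(1.0 / len(lexicons))  # uniform prior
        for target, utt in observations:
            log_p += math.log(likelihood(utt, target, lexicon))
        log_scores.append(log_p)
    # normalize with log-sum-exp for numerical stability
    m = max(log_scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in log_scores))
    return [math.exp(s - log_z) for s in log_scores]


# the Builder sees the Architect call chunk_L "blab" once
post = posterior([('chunk_L', 'blab')])
print([round(p, 4) for p in post])  # [0.0099, 0.9901]
```

A single observation already moves almost all of the probability mass onto the lexicon in which `chunk_L` is called "blab". The small noise weight keeps the other lexicon from being ruled out completely.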

In [80]:
def run_learning_simulation(verbose = False) :
    output = pd.DataFrame({"utterance": [], "response": [], "target": [], "full_program" : [], "target_length" : [], "dsl" : [], "acc": []})
    for i, trial in d.iterrows() :
        architect = LearningAgent('architect', trial, output) # create agent with updated beliefs
        builder = LearningAgent('builder', trial, output)     # create agent with updated beliefs
        
        # architect samples which program representation to communicate; shorter programs are more likely
        possiblePrograms = list(trial['programs_with_length'].keys())
        possibleLengths = np.array(list(trial['programs_with_length'].values()))
        utilities = np.exp(-alpha * possibleLengths) / sum(np.exp(-alpha * possibleLengths))
        target_program = choice(a = possiblePrograms, p = utilities)

        # loop through steps of target program one at a time
        target_steps, utts, responses, accs = [], [], [], []
        for step in target_program.split(' ') :
            utt = architect.act(step)
            response = builder.act(utt)
            target_steps.append(step)
            utts.append(utt)
            responses.append(response)
            accs.append(response == step)

        if verbose:
            print('trial', i)
            print(pd.DataFrame({'utts' : utts, 'responses' : responses, 'correct' : accs, 'target' : target_steps}))
            print('beliefs about chunk_C meaning', 
                  json.dumps(architect.beliefs.marginalize(lambda d : d['chunk_C'] if 'chunk_C' in d else None), indent = 4))
        
        output = pd.concat([output, pd.DataFrame({
            "trial": i,
            "utterance": utts,
            "response": responses,
            "acc": accs,
            "target" : target_steps,
            "full_program": target_program,
            "dsl" : [trial['dsl']] * len(utts),
            "target_length" : trial['programs_with_length'][target_program],
        })])
    return output

learning_run_0 = run_learning_simulation()



In [82]:
learning_run_0

Unnamed: 0,utterance,response,target,full_program,target_length,dsl,acc,trial
0,place a horizontal block.,h,h,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",1.0,0.0
1,move to the left by 4,l_4,l_4,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",1.0,0.0
2,place a horizontal block.,h,h,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",1.0,0.0
3,move to the left by 1,l_1,l_1,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",1.0,0.0
4,place a vertical block.,v,v,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",1.0,0.0
...,...,...,...,...,...,...,...,...
4,place a horizontal block.,h,h,v r_6 v l_5 h r_4 h r_9 chunk_L,9.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",True,11.0
5,move to the right by 4,r_4,r_4,v r_6 v l_5 h r_4 h r_9 chunk_L,9.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",True,11.0
6,place a horizontal block.,h,h,v r_6 v l_5 h r_4 h r_9 chunk_L,9.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",True,11.0
7,move to the right by 9,r_9,r_9,v r_6 v l_5 h r_4 h r_9 chunk_L,9.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",True,11.0



## Jointly choose program and utterance

Wow, this is great! We can see that our agents update their beliefs about the lexicon over time and become somewhat more accurate as they coordinate. But one of the most interesting things about our empirical data is that speakers seem to choose strategically which representation of the tower to convey. Our best current theory of why participants reduce the length of their utterances over time is that, even when new library chunks come online, Architects don't always try to refer to them right away. They aren't yet confident that their partner will understand, and the block-level descriptions are much safer. However, the block-level descriptions are also much *costlier* in terms of time and effort, because the Architect has to laboriously describe one action at a time. 

So far, we have used a placeholder for how the speaker picks which representation to communicate: they randomly pick from the list of candidates, slightly preferring shorter programs. However, other considerations ought to go into this decision, namely the estimated likelihood that the listener will do the right thing.
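
One possible way to formalize this trade-off is sketched below. It is an illustration under stated assumptions, not the model's actual utility: `program_utility`, `choose_probs`, the weight `beta`, and the per-chunk success probabilities are all hypothetical. The two candidate programs are ones that appear in this notebook's outputs for the final trial.

```python
import math


def program_utility(program, p_understood, alpha, beta):
    """Hypothetical utility: beta * log P(all steps understood) - alpha * length."""
    steps = program.split(' ')
    # steps missing from p_understood (base DSL) are assumed always understood
    log_success = sum(math.log(p_understood.get(step, 1.0)) for step in steps)
    return beta * log_success - alpha * len(steps)


def choose_probs(programs, p_understood, alpha=0.04, beta=1.0):
    """Softmax over program utilities."""
    utils = [program_utility(p, p_understood, alpha, beta) for p in programs]
    m = max(utils)
    weights = [math.exp(u - m) for u in utils]
    total = sum(weights)
    return [w / total for w in weights]


programs = ['chunk_Pi r_9 chunk_L',
            'v r_6 v l_5 h r_4 h r_9 h l_4 h l_1 v v']

# early on, chunk words are understood at chance; later, almost always
early = choose_probs(programs, {'chunk_Pi': 0.2, 'chunk_L': 0.2})
late = choose_probs(programs, {'chunk_Pi': 0.95, 'chunk_L': 0.95})
print('early:', [round(p, 3) for p in early])
print('late: ', [round(p, 3) for p in late])
```

Under these made-up numbers, the long block-level program is preferred while chunk words are unreliable, and the short chunk-level program takes over once the Architect expects to be understood.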

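As an exercise, edit the `utilities` line in the simulation below so that each candidate program is weighted by its utility, taking into account both its length and how likely the Builder is to interpret it correctly.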

In [83]:
def run_strategic_simulation() :
    output = pd.DataFrame({"utterance": [], "response": [], "target": [], "full_program" : [], "target_length" : [], "dsl" : [], "acc": []})
    for i, trial in d.iterrows() :
        architect = LearningAgent('architect', trial, output)
        builder = LearningAgent('builder', trial, output)
        
        # architect samples which program representation to communicate; shorter programs are more likely
        possiblePrograms = list(trial['programs_with_length'].keys())
        possibleLengths = np.array(list(trial['programs_with_length'].values()))
        
        ## EXERCISE: the line below is the length-only baseline from the previous simulation.
        ## Modify it so that the utility also reflects how likely the Builder is to respond correctly.
        utilities = np.exp(-alpha * possibleLengths) / sum(np.exp(-alpha * possibleLengths))
        
        target_program = choice(a = possiblePrograms, p = utilities)

        # loop through steps of target program one at a time
        target_steps, utts, responses, accs = [], [], [], []
        for step in target_program.split(' ') :
            utt = architect.act(step)
            response = builder.act(utt)
            target_steps.append(step)
            utts.append(utt)
            responses.append(response)
            accs.append(response == step)

        print('trial', i)
        print(pd.DataFrame({'utts' : utts, 'responses' : responses, 'correct' : accs, 'target' : target_steps}))
        print('beliefs about chunk_C meaning', 
              json.dumps(architect.beliefs.marginalize(lambda d : d['chunk_C'] if 'chunk_C' in d else None), indent = 4))
        output = pd.concat([output, pd.DataFrame({
            "utterance": utts,
            "response": responses,
            "acc": accs,
            "target" : target_steps,
            "full_program": target_program,
            "dsl" : [trial['dsl']] * len(utts),
            "target_length" : trial['programs_with_length'][target_program],
        })])
    return output
display(run_learning_simulation())



Unnamed: 0,utterance,response,target,full_program,target_length,dsl,acc,trial
0,place a horizontal block.,h,h,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",1.0,0.0
1,move to the left by 4,l_4,l_4,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",1.0,0.0
2,place a horizontal block.,h,h,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",1.0,0.0
3,move to the left by 1,l_1,l_1,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",1.0,0.0
4,place a vertical block.,v,v,h l_4 h l_1 v v r_9 v r_6 v l_5 h r_4 h,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",1.0,0.0
...,...,...,...,...,...,...,...,...
9,move to the left by 4,l_4,l_4,v r_6 v l_5 h r_4 h r_9 h l_4 h l_1 v v,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",True,11.0
10,place a horizontal block.,h,h,v r_6 v l_5 h r_4 h r_9 h l_4 h l_1 v v,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",True,11.0
11,move to the left by 1,l_1,l_1,v r_6 v l_5 h r_4 h r_9 h l_4 h l_1 v v,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",True,11.0
12,place a vertical block.,v,v,v r_6 v l_5 h r_4 h r_9 h l_4 h l_1 v v,14.0,"[h, v, l_0, l_1, l_2, l_3, l_4, l_5, l_6, l_7,...",True,11.0
