# Linguistic Analyses for Compositional Abstractions
## Notebook 3: Forming conventions to talk about shared abstractions

This notebook was written by Robert Hawkins.  
Original modeling by Robert Hawkins, Will McCarthy, Cameron Holdaway, Haoliang Wang, and Judy Fan.

### *NOTE: THIS NOTEBOOK SERVES AS THE INSTRUCTOR VERSION*

In the previous notebook we studied a **concept learning** problem. We inferred libraries of program fragments corresponding to a collection of possible concepts people may have in their heads as they advance through the task. For each trial, we then generated a set of programs expressing the scene in different ways using different abstractions. 

In this notebook, we extend our model to address the **communication** problem. That is, given a tower scene and a set of concepts in the Architect's head, what linguistic instructions should they give to the Builder that will allow them to successfully reconstruct that scene? 

We approach this problem in a **probabilistic modeling** framework that extends the model of convention formation described by [Hawkins et al. (2023)](https://cocosci.princeton.edu/papers/hawkinspartners.pdf). 

This notebook is divided into 4 sections that incrementally build up to the full model.

**Section 1** begins by implementing the core notion of a *mental lexicon* -- a mapping from words to concepts.

**Section 2** implements a basic Architect and Builder agent that make decisions based on fixed lexicons.  

**Section 3** equips these agents with the ability to *learn* and update their *beliefs* about the lexicon over time.

**Section 4** finally reaches the core theoretical question of why speakers prefer one level of abstraction over another.

## Setup 

In [4]:
import os
import sys

import functools
import itertools
import json

import pandas as pd
import numpy as np

from numpy.random import choice

In [6]:
# import classes for our model
sys.path.append("../model/convention_formation/")
from distribution import *
from lexicon import *

## Section 1: Lexicons

### Representing lexicons

The basic building block of our model is an agent's **mental lexicon**, a particular set of correspondences between concepts and words. To define an agent model, we must first define the mental lexicon. 

We have created a class for this purpose called `BlockLexicon()`, which manages the mapping from program primitives (as defined in **[the previous section](https://github.com/cogtoolslab/compositional_abstractions_tutorial/blob/main/notebooks/ca_programs.ipynb)**) to words and phrases. If you're interested in digging into the nitty-gritty details, we've put this class in a helper library __[here](https://github.com/cogtoolslab/compositional_abstractions_tutorial/blob/main/model/convention_formation/lexicon.py)__.

For our purposes here, though, we don't need to know exactly what's going on under the hood. Let's take the lexicon class out for a drive. We need to initialize it with two parameters, essentially corresponding to the base entities on each side of the mapping we want to define.

(1) `dsl`: a list of concepts in the DSL that might be expressed as words,

(2) `lexemes`: a list of words that are available to bind to new concepts. 

To build up our intuitions, we're going to work with a very simplified example lexicon, where the set of concepts is the library on trial 10 of the task (retrieved from the file we saved out in the previous section) and the list of lexemes is just a bunch of nonsense words that won't start with any meaning.

In [11]:
# Pull out the DSL primitives accessible at trial 10
d = pd.read_json('../data/model/dsls/2/programs_for_you/programs_ppt_1.json')
dsl = d['dsl'][10]

# Define a set of meaningless placeholder words available to be bound to meanings
lexemes = ['blah', 'blab', 'bloop', 'bleep', 'floop'] 
l = BlockLexicon(dsl, lexemes)

This lexicon object `l` that we've created has a few basic functions we can call. We can look up the language for any element of the `dsl`: 

In [None]:
print(dsl[0], '->', l.dsl_to_language(dsl[0]))
print(dsl[10], '->', l.dsl_to_language(dsl[10]))
print(dsl[-1], '->', l.dsl_to_language(dsl[-1]))

and we can also go in the other direction, converting from a linguistic expression to a corresponding primitive in the DSL.
We went ahead and 'baked in' correspondences for the basic elements of our DSL, because we're assuming these meanings are pretty much deterministic; there's not much wiggle room about what 'place a horizontal block' or 'move to the left by 8' corresponds to. 

> sidenote: if we wanted a more realistic model that worked on generic natural language, e.g. if we wanted to pair our agent with a real user writing in a chat box, we would want something more robust that doesn't require exact string matching. We might use a simple algorithm to find the nearest string in the lexicon, or we might use a fancier model operating over utterance embeddings.

In [None]:
print('place a horizontal block. ->', l.language_to_dsl('place a horizontal block.'))
print('move to the left by 8 ->', l.language_to_dsl('move to the left by 8'))
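
To make the sidenote about robust matching a little more concrete, here is a minimal sketch of the "nearest string" idea using Python's standard `difflib` module. The `toy_lexicon` dictionary, the primitive name `l_8`, and the helper `fuzzy_language_to_dsl` are hypothetical stand-ins for illustration; none of this is part of `BlockLexicon`.

```python
import difflib

# Hypothetical toy lexicon: exact phrases mapped to DSL primitives.
# The phrases and primitive names are illustrative assumptions.
toy_lexicon = {
    'place a horizontal block.': 'h',
    'place a vertical block.': 'v',
    'move to the left by 8': 'l_8',
}

def fuzzy_language_to_dsl(utterance, lexicon, cutoff=0.6):
    """Return the primitive for the closest known phrase, or None if nothing is close."""
    matches = difflib.get_close_matches(utterance, list(lexicon.keys()), n=1, cutoff=cutoff)
    return lexicon[matches[0]] if matches else None

# a typo still resolves to the intended phrase
print(fuzzy_language_to_dsl('place a horizontl block', toy_lexicon))
# an unrelated utterance falls below the similarity cutoff
print(fuzzy_language_to_dsl('sing a song', toy_lexicon))
```

The `cutoff` argument controls how forgiving the match is; an embedding-based model would play the same role with a learned similarity measure instead of character overlap.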

There's one more interesting feature to notice about our `BlockLexicon()` class. If we pass in an unfamiliar utterance that isn't found in the list of lexemes we provided, it will return one of the concepts that doesn't already have a word assigned (and vice versa). We can think of this like the agent randomly 'guessing' rather than erroring.

In [None]:
print('place a blah. ->', l.language_to_dsl('place a blah.'))
print('place a flomp. ->', l.language_to_dsl('place a flomp.'))
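
If it helps to have a mental picture of what such a class might be doing, here is a deliberately tiny sketch of a two-way lexicon with a "guess" fallback. It is our own toy, written under the assumption that chunks and lexemes are paired by list position; `ToyLexicon` and the chunk name `chunk_Pi` are hypothetical, and the real `BlockLexicon` may well work differently.

```python
import random

class ToyLexicon:
    """Illustrative two-way word/concept mapping (not the real BlockLexicon)."""

    def __init__(self, fixed, chunks, lexemes):
        # fixed: dict of primitive -> phrase whose meaning is baked in
        self.to_language = dict(fixed)
        # chunks and lexemes are paired by position, so list order defines the mapping
        for chunk, word in zip(chunks, lexemes):
            self.to_language[chunk] = 'place a {}.'.format(word)
        self.to_dsl = {phrase: concept for concept, phrase in self.to_language.items()}
        # chunks left over once the lexemes run out have no word yet
        self.unassigned = [c for c in chunks if c not in self.to_language]

    def dsl_to_language(self, concept):
        return self.to_language[concept]

    def language_to_dsl(self, utterance):
        if utterance in self.to_dsl:
            return self.to_dsl[utterance]
        # unfamiliar utterance: guess an unassigned chunk instead of raising an error
        return random.choice(self.unassigned) if self.unassigned else None

toy = ToyLexicon(
    fixed={'h': 'place a horizontal block.'},
    chunks=['chunk_L', 'chunk_C', 'chunk_Pi'],
    lexemes=['blah', 'bloop'],
)
print(toy.dsl_to_language('chunk_L'))
print(toy.language_to_dsl('place a flomp.'))
```

Because there are three chunks but only two lexemes here, exactly one chunk is left without a word, so the guess for an unfamiliar utterance is forced.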

### Representing beliefs about lexicons

Our lexicon `l` so far has been a deterministic data structure; it's just a single mapping. 

In a probabilistic model, however, we want to be able to talk in a mathematically rigorous way about an agent's subjective **beliefs**. In other words, we want to be able to define a **probability distribution** over possible lexicons. This will provide a precise definition for an agent's **uncertainty** about the lexicon in their partner's head, i.e. they assume their partner is using some mapping `l*`, but do not know ahead of time exactly what it is.

For this example, we're going to construct a distribution as an object that maintains the probabilities for each possible lexicon. (Again, for nitty-gritty details, see the helper library __[here](https://github.com/cogtoolslab/compositional_abstractions_tutorial/blob/main/model/convention_formation/distribution.py)__). 

We'll define a **prior** distribution (the agent's initial beliefs) as a uniform distribution over all possible ways of binding elements in the DSL to the lexemes. 

In [16]:
# [f(x) for x in collection] is called a list comprehension 
# it creates a new list by applying some transformation to each element in a collection
# here we're relying on the fact that `BlockLexicon` uses the *order* of the list of lexemes
prior = UniformDistribution(
    [BlockLexicon(dsl, list(mapping)) for mapping in itertools.permutations(lexemes)]
)

We've also created some handy helper functions on distributions. For example, we can 'marginalize' to look at the possible mappings for any single chunk.

In [None]:
print('Example lexicon')
print('possible values of chunk_L : ', 
      json.dumps(prior.marginalize(lambda d : d['chunk_L']), indent = 4))

Here, we can see that 'chunk_L' in the DSL is initially equally likely to be mapped to any utterance: $P([[u]] = L) = 1/|U|$. The equal probabilities for each expression tell us that the agents think each expression is an equally good (or bad) translation of "chunk_L". This makes sense for this artificial language (where "bleeps" and "bloops" are used to refer to abstractions). Real people likely have strong priors about what words will mean (we can make quite a lot of sense of "build an L" before any shared experience, even if there's some remaining uncertainty about properties of the L like its size and width). But a uniform prior helps us understand the dynamics of coordination in the most extreme case. 
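
The uniform marginal above can be checked by hand on a smaller example. The sketch below uses plain dictionaries rather than the `UniformDistribution` helper, with three chunks and three words (the name `chunk_Pi` and the `marginalize` function are hypothetical, chosen only for this illustration).

```python
import itertools
from collections import defaultdict
from fractions import Fraction

chunks = ['chunk_L', 'chunk_C', 'chunk_Pi']   # hypothetical chunk names
words = ['blah', 'bloop', 'bleep']            # nonsense lexemes

# each candidate lexicon pairs the chunks with one ordering of the words
lexicons = [dict(zip(chunks, perm)) for perm in itertools.permutations(words)]

# uniform prior: every lexicon gets probability 1 / (number of lexicons)
prior = [Fraction(1, len(lexicons)) for _ in lexicons]

def marginalize(lexicons, probs, f):
    """Sum the probability of every lexicon that yields the same value of f(lexicon)."""
    out = defaultdict(Fraction)
    for lexicon, p in zip(lexicons, probs):
        out[f(lexicon)] += p
    return dict(out)

marginal = marginalize(lexicons, prior, lambda lex: lex['chunk_L'])
print(len(lexicons), 'lexicons')
print(marginal)
```

With three words there are $3! = 6$ lexicons, and each word is bound to `chunk_L` in exactly two of them, which is where the $1/|U|$ marginal comes from.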

Here are a few other things you can do with a distribution object:

In [17]:
# d.support() returns the support of the distribution d, i.e. the list of values that it is defined over
# e.g. we can print out the first lexicon in the support as an example of what lexicons look like
print('example element:', json.dumps(prior.support()[0], indent = 4))

# d.score(val) returns the probability of val in the distribution d
# e.g. this is the probability assigned to the lexicon we just printed out
print('P(lexicon) =', prior.score(prior.support()[0]))

## Section 2: Simulating Agents

Now that we have a way of representing an agent's beliefs over lexicons, we can define an Architect and a Builder.

Both maintain their own belief distribution about the *other agent's* lexicon.

The **Architect** agent makes choices about what to say by imagining how the Builder will interpret them.  

The **Builder** takes actions based on its beliefs about what the Architect would say in different situations. 

In [15]:
class FixedAgent() :
    def __init__(self, role, trial) :
        '''
        Args: 
           * role: string giving agent's role in the task ('architect' or 'builder')
           * trial: dictionary of meta-data about the current trial 
        '''
        self.role = role
        self.actions = trial['dsl']

        # initialize beliefs to a uniform prior over possible lexicons, as above
        self.possible_lexicons = set([BlockLexicon(self.actions, list(mapping)) 
                                      for mapping in itertools.permutations(lexemes)])
        self.beliefs = UniformDistribution(self.possible_lexicons)
        self.utterances = set(list(self.possible_lexicons)[0].values())
        
    def act(self, observation) :
        '''
        produce an action based on role and current beliefs
        '''
        if self.role == 'architect' :
            # Architect is going to build up a distribution over utterances to say
            utt_dist = EmptyDistribution()
            for lexicon in self.beliefs.support() :
                # They imagine what they would say under each possible lexicon 
                # and weight it by the likelihood that the builder is actually using that lexicon
                utt_dist.update({lexicon.dsl_to_language(observation) : self.beliefs.score(lexicon)})
            return choice(a = [*utt_dist.support()], 
                          p = [utt_dist.score(u) for u in utt_dist.support()])

        if self.role == 'builder' :
            # get P(a | utt) by marginalizing over lexicons 
            action_dist = EmptyDistribution()
            for lexicon in self.beliefs.support() :
                action_dist.update({lexicon.language_to_dsl(observation) : self.beliefs.score(lexicon)})
            return choice(a = [*action_dist.support()], 
                          p = [action_dist.score(a) for a in action_dist.support()])

Because our focus is on modeling abstractions learned during the task, we assume that Architect and Builder agents can unambiguously communicate about the base DSL: moving left and right and placing individual blocks. That is, when an Architect wants a Builder to place a horizontal block they will always say "place a horizontal block", and the Builder will correctly interpret this utterance.

The only thing that varies across different lexicons is the words used for *learned* program fragments. In practice (for this example) only five of these were learned across all participants' trial sequences. The set of possible lexicons is therefore fully defined by the set of possible mappings from these fragments to the `lexemes` defined above.
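
The two branches of `act` both boil down to the same computation: sum the belief in each lexicon over whatever that lexicon would produce. Here is a stripped-down sketch of that marginalization with two chunks and two words, using plain dictionaries in place of `BlockLexicon` and `EmptyDistribution` (the function names `utterance_dist` and `action_dist` are our own, for illustration only).

```python
import itertools
from collections import defaultdict

chunks = ['chunk_L', 'chunk_C']
words = ['blah', 'bloop']
# two candidate lexicons: (L->blah, C->bloop) and (L->bloop, C->blah)
lexicons = [dict(zip(chunks, perm)) for perm in itertools.permutations(words)]

def utterance_dist(target, beliefs):
    """Architect side: P(word | target) = sum over lexicons of P(lexicon) * [lexicon maps target to word]."""
    dist = defaultdict(float)
    for lexicon, p in zip(lexicons, beliefs):
        dist[lexicon[target]] += p
    return dict(dist)

def action_dist(word, beliefs):
    """Builder side: P(chunk | word) = sum over lexicons of P(lexicon) * [lexicon maps chunk to word]."""
    dist = defaultdict(float)
    for lexicon, p in zip(lexicons, beliefs):
        inverse = {w: c for c, w in lexicon.items()}
        dist[inverse[word]] += p
    return dict(dist)

uniform = [0.5, 0.5]     # no idea which lexicon the partner uses
confident = [0.9, 0.1]   # fairly sure the partner uses the first lexicon

print(utterance_dist('chunk_L', uniform))
print(action_dist('blah', confident))
```

Under the uniform beliefs every word is equally likely to be chosen for a chunk, so whether the two agents happen to agree is a matter of luck; once beliefs concentrate on one lexicon, the choices become correspondingly predictable.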


In [None]:
architect = FixedAgent('architect', d.loc[0].to_dict())
print('architect choice: ', architect.act('h'))

builder = FixedAgent('builder', d.loc[0].to_dict())
print('builder choice: ', builder.act('place a horizontal block.'))

In [None]:
architect = FixedAgent('architect', d.loc[0].to_dict())
print('architect choice: ', architect.act('chunk_L'))

### Running simulations

Now that we have our agents, we need to run them forward through the trial sequence.

In [None]:
alpha = 0.04 # softmax weight on program length: larger values make the Architect favor shorter programs more strongly
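
To get a feel for what `alpha` does in the simulations below, here is a small standalone sketch of the same length-based softmax, $P(\text{program}) \propto \exp(-\alpha \cdot \text{length})$, evaluated at a few values of `alpha`. The program lengths are made up for illustration, and `length_softmax` is a hypothetical helper rather than part of the model code.

```python
import math

def length_softmax(lengths, alpha):
    """Probability of choosing each program, decreasing exponentially in its length."""
    weights = [math.exp(-alpha * n) for n in lengths]
    total = sum(weights)
    return [w / total for w in weights]

lengths = [4, 12, 20]   # hypothetical program lengths
for a in [0.0, 0.04, 1.0]:
    print('alpha =', a, '->', [round(p, 3) for p in length_softmax(lengths, a)])
```

At `alpha = 0` length is ignored and all programs are equally likely; as `alpha` grows the shortest program takes nearly all of the probability mass. The value `0.04` used here sits close to the indifferent end, which is why the text later describes the Architect as only "slightly preferring" shorter programs.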

In [None]:
def run_simulation() :
    output = pd.DataFrame({"utt": [], "response": [], "target_program": [], "target_length" : [], "acc": []})
    for i, trial in d.iterrows() :
        architect = FixedAgent('architect', trial)
        builder = FixedAgent('builder', trial)

        # architect selects which program representation to communicate, with probability decreasing in length
        possiblePrograms = list(trial['programs_with_length'].keys())
        possibleLengths = np.array(list(trial['programs_with_length'].values()))
        utilities = np.exp(-alpha * possibleLengths) / sum(np.exp(-alpha * possibleLengths))
        target_program = choice(a = possiblePrograms, p = utilities)

        # loop through steps of target program one at a time
        utts, responses, accs = [], [], []
        for step in target_program.split(' ') :
            utt = architect.act(step)
            response = builder.act(utt)
            utts.append(utt)
            responses.append(response)
            accs.append(1.0 * (response == step))

        output = pd.concat([output, pd.DataFrame({
            "trial": int(i),
            "utt": utts,
            "response": responses,
            "acc": accs,
            "target_program": target_program,
            "target_length" : trial['programs_with_length'][target_program],
        })])
    return output

run_0 = run_simulation()
display(run_0)

In [None]:
# let's inspect the first trial
print('target program: \n', run_0.query('trial==0').loc[0,'target_program'])
run_0.query('trial==0')[['utt','response','acc']]

In [None]:
# let's inspect the final trial
print('target program: \n', run_0.query('trial==11').loc[0,'target_program'])

run_0.query('trial==11')[['utt','response','acc']]

### <span style="color: orange"> Exercise: explore how accuracy changes </span>

Wait, why is the accuracy so bad? Well, our agents aren't actually *learning* -- they're continuing to use their initial uniform priors.

## Section 3: Simulating learning

To have our agents learn, we need to extend the agent class to do Bayesian inference.

Here we add an `update_beliefs` function, which performs a Bayesian update on the agent's beliefs about the other agent's lexicon based on the outcomes of the preceding trials. Note that this update happens when the class is initialized, so we are really defining a new agent on each trial with updated beliefs.

In [None]:
class LearningAgent(FixedAgent) :
    def __init__(self, role, curr_trial, previous_trial_df) :
        super().__init__(role, curr_trial)
        combined_primitives = set().union(*previous_trial_df['dsl']) if not previous_trial_df.empty else self.actions
        self.possible_lexicons = set([BlockLexicon(set().union(combined_primitives), list(mapping)) 
                                      for mapping in itertools.permutations(lexemes)])
        self.utterances = set(list(self.possible_lexicons)[0].values())
        self.update_beliefs(previous_trial_df)

    def update_beliefs(self, previous_trial_df) :
        # Initialize posterior 
        posterior = EmptyDistribution()
        posterior.to_logspace()

        # for each candidate lexicon, accumulate the log likelihood of every observed step
        # P(l | obs) = 1/Z * P(l) * \prod_{o \in obs} P(o | l)
        # log P(l|obs) = -log Z + log P(l) + \sum_{o \in obs} log P(o | l)
        for lexicon in self.beliefs.support() :
            prior_term = np.log(self.beliefs.score(lexicon))
            likelihood_term = 0
            for i, step in previous_trial_df.iterrows() :
                if self.role == 'builder' :
                    likelihood_term += np.log(self.A1(step.target, lexicon).score(step.utterance))
                elif self.role == 'architect' :
                    likelihood_term += np.log(self.B0(step.utterance, lexicon).score(step.response))
            posterior.update({lexicon : prior_term + likelihood_term})
        posterior.renormalize()
        posterior.from_logspace()
        self.beliefs = posterior
        
    def B0(self, utt, lexicon) :
        builder_dist = EmptyDistribution()
        for action in self.actions :
            builder_dist.update({action : 1 if action == lexicon.language_to_dsl(utt) else 0.01})
        builder_dist.renormalize()
        return builder_dist
        
    def A1(self, target, lexicon) :
        architect_dist = EmptyDistribution()
        for utt in self.utterances :
            architect_dist.update({utt : 1 if utt == lexicon.dsl_to_language(target) else 0.01})
        architect_dist.renormalize()
        return architect_dist
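
The update in `update_beliefs` can be traced by hand on a two-lexicon example. The sketch below takes the Builder's perspective and mirrors the structure of `A1` (weight 1 for the word the lexicon predicts, 0.01 for every other word), but uses plain lists and dictionaries instead of the distribution helpers; `architect_likelihood` and `update` are hypothetical names for this illustration.

```python
import itertools
import math

chunks = ['chunk_L', 'chunk_C']
words = ['blah', 'bloop']
# lexicon 0: L->blah, C->bloop ; lexicon 1: L->bloop, C->blah
lexicons = [dict(zip(chunks, perm)) for perm in itertools.permutations(words)]
NOISE = 0.01  # same soft-likelihood weight that appears in A1 and B0

def architect_likelihood(utt, target, lexicon):
    """P(utt | target, lexicon): most mass on the word the lexicon assigns to target."""
    weights = {w: (1.0 if w == lexicon[target] else NOISE) for w in words}
    return weights[utt] / sum(weights.values())

def update(log_prior, observations):
    """Return the normalized posterior over lexicons given (target, utterance) pairs."""
    log_post = []
    for lp, lexicon in zip(log_prior, lexicons):
        log_lik = sum(math.log(architect_likelihood(u, t, lexicon)) for t, u in observations)
        log_post.append(lp + log_lik)
    # normalize in log space (log-sum-exp) before exponentiating
    m = max(log_post)
    log_z = m + math.log(sum(math.exp(x - m) for x in log_post))
    return [math.exp(x - log_z) for x in log_post]

log_prior = [math.log(0.5), math.log(0.5)]
one_obs = update(log_prior, [('chunk_L', 'blah')])
two_obs = update(log_prior, [('chunk_L', 'blah'), ('chunk_C', 'bloop')])
print(one_obs)
print(two_obs)
```

A single observation of "blah" being used for `chunk_L` already moves almost all of the belief onto the first lexicon, because the competing lexicon can only explain it through the small noise term; a second consistent observation sharpens the posterior further.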

In [None]:
def run_learning_simulation(verbose = False) :
    output = pd.DataFrame({"utterance": [], "response": [], "target": [], "full_program" : [], "target_length" : [], "dsl" : [], "acc": []})
    for i, trial in d.iterrows() :
        architect = LearningAgent('architect', trial, output) # create agent with updated beliefs
        builder = LearningAgent('builder', trial, output)     # create agent with updated beliefs
        
        # architect selects which program representation to communicate, with probability decreasing in length
        possiblePrograms = list(trial['programs_with_length'].keys())
        possibleLengths = np.array(list(trial['programs_with_length'].values()))
        utilities = np.exp(-alpha * possibleLengths) / sum(np.exp(-alpha * possibleLengths))
        target_program = choice(a = possiblePrograms, p = utilities)

        # loop through steps of target program one at a time
        target_steps, utts, responses, accs = [], [], [], []
        for step in target_program.split(' ') :
            utt = architect.act(step)
            response = builder.act(utt)
            target_steps.append(step)
            utts.append(utt)
            responses.append(response)
            accs.append(response == step)

        if verbose:
            print('trial', i)
            print(pd.DataFrame({'utts' : utts, 'responses' : responses, 'correct' : accs, 'target' : target_steps}))
            print('beliefs about chunk_C meaning', 
                  json.dumps(architect.beliefs.marginalize(lambda d : d['chunk_C'] if 'chunk_C' in d else None), indent = 4))
        
        output = pd.concat([output, pd.DataFrame({
            "trial": i,
            "utterance": utts,
            "response": responses,
            "acc": accs,
            "target" : target_steps,
            "full_program": target_program,
            "dsl" : [trial['dsl']] * len(utts),
            "target_length" : trial['programs_with_length'][target_program],
        })])
    return output

learning_run_0 = run_learning_simulation()

In [None]:
learning_run_0

## Section 4: Choosing programs.

In principle, words allow us to communicate arbitrarily complex concepts, and our mental representation of linguistic meaning is flexible enough to be updated over time. To the extent that both Architects and Builders are learning to 'chunk' blocks over time, each chunk (i.e. each program fragment) could, in principle, be assigned a new word or phrase and conveyed directly through the Architect's instructions. Well, almost. There is no guarantee that the words the Architect chooses to pick out a new concept will evoke the same concept in someone else. The Architect is genuinely uncertain about how the Builder will interpret their instructions, and this, we suggest, might change what the Architect chooses to say.

Our hypothesis is that Architects trade off communicative *efficiency* against communicative *effectiveness*. While they generally want to say things concisely, if there is too much uncertainty about how the Builder will interpret their words, they will choose a less ambiguous (if more wordy) way of expressing the same information. Concretely, if the Architect wanted to say "build an L" but thought that the Builder wouldn't get what "L" meant, they might spell out the steps to make an L-shaped tower instead.

Wow, this is great! We can see that our agents are updating their beliefs about the lexicon over time and are able to get somewhat more accurate as they coordinate. But one of the most interesting things about our empirical data is that speakers seem to be strategically choosing which representation of the tower to convey. Our best current theory of why participants reduce the length of their utterances over time is that, even when new library chunks come online, Architects don't always try to refer to them right away: they aren't yet confident that their partner will understand, and the block-level descriptions are much safer. However, the block-level descriptions are also much *costlier* in terms of time and effort, because the Architect has to laboriously describe one action at a time. 

So far, we have just used a placeholder for how the speaker picks which representation to communicate: they randomly pick from the list of candidates, slightly preferring shorter programs. However, there are other considerations that ought to go into this decision, namely the estimated likelihood that the listener will do the right thing.
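
To see how the two considerations might pull against each other, here is one possible way to score a program that combines them. This is a sketch under our own assumptions, not necessarily the intended solution to the exercise: we take the utility to be the log probability that every step is understood, minus `alpha` times the program's length. The helper names, the step probabilities, and the program lengths are all hypothetical.

```python
import math

def program_utility(length, p_understood, alpha):
    """log P(every step is interpreted correctly) minus a cost proportional to length."""
    return sum(math.log(p) for p in p_understood) - alpha * length

def choice_probs(utilities):
    """Softmax over utilities, shifted by the max for numerical stability."""
    m = max(utilities)
    weights = [math.exp(u - m) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

alpha = 0.04
# block-level program: 8 steps, each understood with certainty
block_level = program_utility(length=8, p_understood=[1.0] * 8, alpha=alpha)
# chunk-level program: 1 step, understood by chance early on and reliably later
chunk_early = program_utility(length=1, p_understood=[0.2], alpha=alpha)
chunk_late = program_utility(length=1, p_understood=[0.95], alpha=alpha)

print('early:', choice_probs([chunk_early, block_level]))
print('late: ', choice_probs([chunk_late, block_level]))
```

Early on, when the chunk word is understood only at chance, the longer but safer block-level program is preferred; once the Architect becomes confident the word will be understood, the preference flips toward the shorter program. In the full simulation the probability of being understood would come from the Architect's current beliefs rather than being fixed by hand.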

As an exercise, add one line of code to the simulation below so that the target program is weighted according to its utility.

In [None]:
def run_strategic_simulation() :
    output = pd.DataFrame({"utterance": [], "response": [], "target": [], "full_program" : [], "target_length" : [], "dsl" : [], "acc": []})
    for i, trial in d.iterrows() :
        architect = LearningAgent('architect', trial, output)
        builder = LearningAgent('builder', trial, output)
        
        # architect selects which program representation to communicate
        possiblePrograms = list(trial['programs_with_length'].keys())
        possibleLengths = np.array(list(trial['programs_with_length'].values()))
        
        ## EXERCISE: modify the line below so that the weights also reflect the estimated
        ## likelihood that the Builder will interpret each program correctly.
        ## As written, it is the length-only baseline used in the previous section.
        utilities = np.exp(-alpha * possibleLengths) / sum(np.exp(-alpha * possibleLengths))
        
        target_program = choice(a = possiblePrograms, p = utilities)

        # loop through steps of target program one at a time
        target_steps, utts, responses, accs = [], [], [], []
        for step in target_program.split(' ') :
            utt = architect.act(step)
            response = builder.act(utt)
            target_steps.append(step)
            utts.append(utt)
            responses.append(response)
            accs.append(response == step)

        print('trial', i)
        print(pd.DataFrame({'utts' : utts, 'responses' : responses, 'correct' : accs, 'target' : target_steps}))
        print('beliefs about chunk_C meaning', 
              json.dumps(architect.beliefs.marginalize(lambda d : d['chunk_C'] if 'chunk_C' in d else None), indent = 4))
        output = pd.concat([output, pd.DataFrame({
            "trial": i,
            "utterance": utts,
            "response": responses,
            "acc": accs,
            "target" : target_steps,
            "full_program": target_program,
            "dsl" : [trial['dsl']] * len(utts),
            "target_length" : trial['programs_with_length'][target_program],
        })])
    return output

# run the strategic simulation defined above
display(run_strategic_simulation())

### Summary


Congratulations! You've finished the tutorial!

We hope you found it fun and informative.