Welcome to the Distributional Bayesian Metaphor (DBM) Notebook. First, let's load some code:

In [75]:
from __future__ import division
import scipy
from utils.load_data import *
from utils.helperfunctions import *
from tensorflow_rsa import tf_rsa
import numpy as np
import pickle
import itertools
from utils.refine_vectors import h_dict,processVecMatrix
import nltk
import glob
import os
from dbm import *
from utils.load_adj_nouns import load_adj_nouns

from utils.load_data import get_words

frequencies,nouns,adjectives,verbs = get_words(preprocess=False)
# Sort each word list from most to least frequent.
nouns = sorted(nouns, key=lambda x: frequencies[x], reverse=True)
adjectives = sorted(adjectives, key=lambda x: frequencies[x], reverse=True)
verbs = sorted(verbs, key=lambda x: frequencies[x], reverse=True)
with open("data/word_vectors/glove.6B.mean_vecs50",'rb') as f:
    vecs = pickle.load(f,encoding='latin1')



In [76]:
from importlib import reload
import tensorflow_rsa
import dbm
reload(dbm)
reload(tensorflow_rsa)
from tensorflow_rsa import tf_rsa

There are several varieties of the model. All share the same L0 and S1. See the paper for details, but roughly speaking, the L0 receives a word as an utterance and updates its world state (a multidimensional normal in the GloVe vector space) in the direction of the vector corresponding to that word. The S1 starts from a uniform categorical distribution over some set of possible utterances and updates it to favour those words which increase the likelihood of the speaker's world state (a point in the GloVe space) being drawn from the L0's posterior world state after hearing the given word.
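
As a rough illustration of the L0 and S1 just described, here is a minimal NumPy sketch. It is not the code in `tensorflow_rsa`: the function names are hypothetical, the covariances are assumed to be isotropic, and `sig1` and `sig2` are assumed to be the standard deviations of the L0 prior and of the utterance likelihood respectively.

```python
import numpy as np

def l0_posterior(prior_mean, utterance_vec, sig1, sig2):
    """Conjugate Gaussian update (assumes isotropic covariances)."""
    prec_prior = 1.0 / sig1 ** 2   # precision of the prior over worlds
    prec_lik = 1.0 / sig2 ** 2     # precision of the utterance likelihood
    post_var = 1.0 / (prec_prior + prec_lik)
    post_mean = post_var * (prec_prior * prior_mean + prec_lik * utterance_vec)
    return post_mean, post_var

def s1_distribution(world, utterance_vecs, prior_mean, sig1, sig2):
    """Uniform prior over utterances, reweighted by how likely `world`
    is under the L0 posterior for each utterance."""
    log_scores = []
    for u in utterance_vecs:
        mean, var = l0_posterior(prior_mean, u, sig1, sig2)
        # Log density up to a constant that is shared by all utterances.
        log_scores.append(-0.5 * np.sum((world - mean) ** 2) / var)
    log_scores = np.array(log_scores)
    log_scores -= log_scores.max()  # for numerical stability
    probs = np.exp(log_scores)
    return probs / probs.sum()

prior_mean = np.zeros(3)
utterances = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
world = np.array([0.9, 0.0, 0.0])
probs = s1_distribution(world, utterances, prior_mean, sig1=10.0, sig2=0.1)
```

With a wide prior and a narrow likelihood, the L0 posterior mean lands almost exactly on the utterance vector, so the S1 strongly prefers the utterance nearest to its world state.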

Three variations of the model are displayed in this notebook, all relating to the L1. The first, referred to as the categorical L1, has a prior over QUDs consisting of a uniform categorical distribution over points in the GloVe space corresponding to a supplied set of words (for instance, a set of adjectives).

The second, the non-categorical L1, takes as its QUD prior a continuous uniform distribution over unit vectors (or hyperplanes; it generalizes nicely) in the GloVe space. Unit vectors suffice because the length of the vector specifying the projection doesn't matter.
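
To make the point about unit vectors concrete, here is a small sketch (hypothetical helper names, not taken from `dbm`). It draws directions uniformly from the unit sphere by normalizing Gaussian samples, and checks that projecting onto a direction gives the same result however that direction is scaled.

```python
import numpy as np

def sample_unit_vectors(n, dim, rng):
    """Draw n directions uniformly from the unit sphere in `dim` dimensions
    by normalizing standard Gaussian samples."""
    g = rng.standard_normal((n, dim))
    return g / np.linalg.norm(g, axis=1, keepdims=True)

def project(world, qud):
    """Orthogonal projection of `world` onto the line spanned by `qud`."""
    qud = np.asarray(qud, dtype=float)
    return (world @ qud) / (qud @ qud) * qud

rng = np.random.default_rng(0)
quds = sample_unit_vectors(4, 50, rng)
world = rng.standard_normal(50)

# Rescaling the QUD vector leaves the projection unchanged.
same = np.allclose(project(world, quds[0]), project(world, 3.0 * quds[0]))
```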

The third variation, the trivial QUD L1, is like the categorical L1, but has an extra possible QUD (in addition to whatever set of words is supplied), one which doesn't project down to a subspace at all. This trivial QUD represents literal meaning, so the idea is that if this L1 gives a high score to the trivial QUD, the predication ("A is a B") in question is most likely to be literal, not metaphorical. Thus, this version of the L1 is used for the task of QUD identification.

Now for examples. The following cell will run an inference of the categorical L1. The possible utterances are a set of animals taken from Justine and Leon's original paper on Bayesian metaphor, and the QUDs are the corresponding animal features from that paper.

**Categorical L1**

When you run the following cell, you should see, after a few minutes:

    null model results: these are the top few of the supplied quds, ordered by their cosine distance to the mean/sum/product of the subject and predicate (i.e. A and B in "A is a B", respectively)
    
    world movement (euclidean and cosine): this is the set of candidate quds ordered by how much closer the mean of the listener's world gets to them, as it is updated from prior to posterior
    
    categorical results: the top elements (by probability mass) of a categorical distribution, the L1 posterior distribution over QUDs. NB: the topics may have the opposite polarity to expectation - that is, "independent" might be a good topic for sheep, since sheep are not independent. As an answer to "is the man independent?", "the man is a sheep" would be a good answer.
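
The null model baseline described above can be sketched as follows. This is an illustration under assumptions, not the notebook's implementation: `null_model` and `cosine_distance` are hypothetical names, and "mult" is assumed to mean the elementwise product. Because cosine distance ignores vector length, the mean and the sum of the two vectors always yield the same ranking.

```python
import numpy as np

def cosine_distance(a, b):
    """1 - cosine similarity of two non-zero vectors."""
    return 1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def null_model(subject, predicate, qud_vecs, combine="mean", top=5):
    """Rank candidate QUD words by cosine distance to a combination
    of the subject and predicate vectors (closest first)."""
    combos = {
        "mean": (subject + predicate) / 2.0,
        "sum": subject + predicate,
        "mult": subject * predicate,  # elementwise product (an assumption)
    }
    target = combos[combine]
    scored = [(word, cosine_distance(vec, target)) for word, vec in qud_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1])[:top]

# Toy three-dimensional vectors, purely for illustration.
subject = np.array([1.0, 0.0, 1.0])
predicate = np.array([0.0, 1.0, 1.0])
qud_vecs = {
    "a": np.array([1.0, 1.0, 2.0]),
    "b": np.array([1.0, 0.0, 0.0]),
    "c": np.array([-1.0, -1.0, 0.0]),
}
by_mean = null_model(subject, predicate, qud_vecs, combine="mean")
by_sum = null_model(subject, predicate, qud_vecs, combine="sum")
```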

In [19]:
#catforl
for weight in [0.0]:
    for word in ['sheep']:
        for i in range(1):
            compute_results(subject=vecs['man'],predicate=word,
            quds=animal_features,possible_utterances=animals,
            mean_vecs=True,pca_remove_top_dims=False,
            sig1=10.0,sig2=0.1,
            qud_weight=weight,freq_weight=weight,
            categorical=True,
            vec_length=50,vec_type="glove.6B.",
            out_file="animals_pca.txt",
            sample_number = 1000,
            number_of_qud_dimensions=1,
            burn_in=500,
            pickle_name="test_sample",
            seed=False,
            trivial_qud_prior=False)


loading vecs: glove.6B.mean_vecs50
the top mean null model predictions:  [('wild', 0.296809230645663), ('blind', 0.3005889050592907), ('small', 0.31577211343047829), ('large', 0.39584122567305913), ('happy', 0.39834227299171376)]
the top mult null model predictions:  [('slow', 0.47231478291624029), ('smooth', 0.48745485726873017), ('strong', 0.49825868683108954), ('full', 0.51326623191391896), ('weak', 0.51536785884469305)]
the top sum null model predictions:  [('wild', 0.296809230645663), ('blind', 0.3005889050592907), ('small', 0.31577211343047829), ('large', 0.39584122567305913), ('happy', 0.39834227299171376)]
Running categorical RSA with 34 possible utterances and 91 possible quds.
Mean centered=True and pca_removal=False

bouncy  not found
1000/1000 [100%] ██████████████████████████████ Elapsed: 33s | Acceptance Rate: 0.573
THE RESULTS OF L1 INFERENCE FOR: predicate=sheep


cosine world movement:  [('blind', -0.7452818824590004), ('dangerous', -0.73161238273586171), ('ugly', -0.

**Non-categorical L1**

The example here of the non-categorical L1 supplies as possible utterances 900 of the most common nouns in English. There is an option of weighting their importance by frequency, which is not taken here (but feel free to try it - it's a Jupyter notebook after all).

When you run the following cell, you should see, after a few minutes:

    null model results: these are the top few of the supplied quds, ordered by their cosine distance to the mean/sum/product of the subject and predicate (i.e. A and B in "A is a B", respectively)
    
    world movement (euclidean and cosine): this is the set of candidate quds ordered by how much closer the mean of the listener's world gets to them, as it is updated from prior to posterior
    
    non-categorical results: the supplied list of words, ordered by how close they are (by cosine distance) to the mean of the posterior of the L1 QUD distribution: this seems to produce bad results, unlike world movement. Our conjecture is that cosine distance is not an appropriate metric here.
    
Results are still a little unstable here, but I *think* it's working. Proper testing imminent.
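
For reference, the world movement measure listed above can be sketched like this. The helper is hypothetical and is not the `world_movement` method called later in this notebook. The sign convention (posterior distance minus prior distance, so negative values mean the world mean moved closer) is an assumption.

```python
import numpy as np

def cosine_distance(a, b):
    return 1.0 - (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

def euclidean_distance(a, b):
    return np.linalg.norm(a - b)

def world_movement_sketch(prior_mean, posterior_mean, qud_vecs, metric="cosine"):
    """Rank candidate QUDs by the change in distance between the listener's
    world mean and each QUD vector, from prior to posterior.
    Most negative first, i.e. the QUDs the world moved towards the most."""
    dist = {"cosine": cosine_distance, "euclidean": euclidean_distance}[metric]
    moves = [
        (word, dist(posterior_mean, vec) - dist(prior_mean, vec))
        for word, vec in qud_vecs.items()
    ]
    return sorted(moves, key=lambda pair: pair[1])

# Toy two-dimensional example; the prior mean must be non-zero for cosine.
prior_mean = np.array([1.0, 0.0])
posterior_mean = np.array([1.0, 1.0])
qud_vecs = {"up": np.array([0.0, 1.0]), "right": np.array([1.0, 0.0])}
ranking = world_movement_sketch(prior_mean, posterior_mean, qud_vecs)
```

Here the posterior mean has rotated towards "up" and away from "right", so "up" gets a negative score and is ranked first.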

In [77]:
from lm_1b_eval import predict
prob_dict = predict("The man is a")
filtered_adjs = [x for x in adjectives if x in vecs and x in prob_dict]
filtered_nouns = [x for x in nouns if x in vecs and x in prob_dict]
sorted_nouns = sorted(filtered_nouns,key = lambda x : prob_dict[x],reverse=True)
sorted_adjs = sorted(filtered_adjs,key = lambda x : prob_dict[x],reverse=True)


data/graph-2016-09-10.pbtxt


Recovering graph.


INFO:tensorflow:Recovering Graph data/graph-2016-09-10.pbtxt


Recovering checkpoint data/ckpt-*


['etcetera', 'while', 'are', 'over', 'at', 'radio', 'will', 'magazine', 'can', 'format', 'now', 'then', 'sound', 'series', 'might', 'gaff', 'show', 'do', 'scours', 'being']


the 


['day', 'night', 'morning', 'week', 'while', 'following', 'way', 'few', 'year', 'couple', 'moment', 'first', 'afternoon', 'latter', 'whole', 'record', 'evening', 'little', 'past', 'hour']


the man 


['at', 'making', 'he', 'being', 'will', 'sitting', 'can', 'must', 'then', 'going', 'driving', 'standing', 'now', 'are', 'do', 'thought', 'working', 'works', 'out', 'running']


the man is 


['now', 'still', 'at', 'white', 'no', 'there', 'sitting', 'dead', 'accused', 'then', 'back', 'black', 'going', 'being', 'right', 'lying', 'over', 'crazy', 'out', 'standing']


the man is a 


['man', 'good', 'real', 'genius', 'liar', 'great', 'fraud', 'criminal', 'bit', 'former', 'dead', 'friend', 'member', 'terrorist', 'victim', 'lot', 'suspect', 'waste', 'little', 'bad']


the man is a Christian 


['man', 'good', 'real', 'genius', 'liar', 'great', 'fraud', 'criminal', 'bit', 'former', 'dead', 'friend', 'member', 'terrorist', 'victim', 'lot', 'suspect', 'waste', 'little', 'bad']


In [None]:
#for leon
predicate = 'tree'
possible_utterances=sorted_nouns[:sorted_nouns.index(predicate)+1]

for i in range(1,2):
    for animal in ["tree"]:
        man_tree_noncat = DistRSAInference(
            subject=['man'],predicate=animal,
            quds=['strong','stable','wooden','connected','leafy','unyielding'],
            possible_utterances=possible_utterances,
            object_name="animals_spec",
            mean_vecs=True,
            pca_remove_top_dims=False,
            sig1=10.0,sig2=0.01,
            qud_weight=0.0,freq_weight=1.0,
            categorical="non-categorical",
            vec_length=50,vec_type="glove.6B.",
            sample_number = 50000,
            number_of_qud_dimensions=1,
            burn_in=40000,
            seed=False,trivial_qud_prior=False,
            step_size=0.005,
            frequencies=prob_dict
            )
        
        


loading vecs: glove.6B.mean_vecs50


In [None]:
for x in range(5):
    man_tree_noncat.compute_results(load=0,save=False)
    print(man_tree_noncat.qud_results())
    print(man_tree_noncat.qud_results(comparanda=[x for x in adjectives if x in vecs]))
#     print(man_tree_noncat.qud_results(comparanda=[x for x in nouns if x in vecs]))
#     print(man_tree_noncat.world_movement("cosine",comparanda=['strong','stable','wooden','connected','leafy','unyielding']))
#     print(man_tree_noncat.world_movement("cosine",comparanda=[x for x in adjectives if x in vecs]))
    print(man_tree_noncat.world_movement("cosine",do_projection=True,comparanda=['strong','stable','wooden','connected','leafy','unyielding']))
    print(man_tree_noncat.world_movement("cosine",do_projection=True,comparanda=[x for x in adjectives if x in vecs]))

NAME:  animals_spec
subject: ['man']
predicate tree
Running non-categorical RSA with 505 possible utterances and 6 possible quds.
Mean centered=True and pca_removal=False

qud_combinations 6
quds 6
(1, 505) S1 SHAPE
50000/50000 [100%] ██████████████████████████████ Elapsed: 667s | Acceptance Rate: 0.997
[(('unyielding', 0.98946789177234717),), (('strong', 1.0203005058981345),), (('connected', 1.0637341543238157),), (('stable', 1.128868600595698),), (('wooden', 1.2797596768586947),), (('leafy', 1.4909948004569631),)]
[(('hypoglycemic', 0.49328643914401649),), (('tortious', 0.52502387561748354),), (('negligent', 0.55697179443155531),), (('psychokinetic', 0.57230346590972458),), (('hypovolemic', 0.58189961420066449),), (('empathic', 0.59889722386753985),), (('excused', 0.60400462287788126),), (('unforced', 0.6085408109854753),), (('unrighteous', 0.61482521592369621),), (('anaphylactic', 0.62130693491078004),), (('virological', 0.62298873191132631),), (('nearsighted', 0.62323069186652225),

**Trivial QUD L1**

When you run the following cell, you should see, after a few minutes:

    null model results: these are the top few of the supplied quds, ordered by their cosine distance to the mean/sum/product of the subject and predicate (i.e. A and B in "A is a B", respectively)
    
    world movement (euclidean and cosine): this is the set of candidate quds ordered by how much closer the mean of the listener's world gets to them, as it is updated from prior to posterior
    
    the supplied list of qud words, ordered by probability mass in a categorical distribution, up to the point where the trivial qud appears
    
(NB: this model isn't currently working, as far as the results are concerned: still tuning!)
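
The last item above amounts to truncating the ranked posterior at the trivial QUD. A minimal sketch, assuming the posterior is available as a word-to-probability dictionary and that the trivial QUD is stored under a label such as `"TRIVIAL"` (both are assumptions for illustration, not the format `compute_results` actually uses):

```python
def quds_above_trivial(posterior, trivial_label="TRIVIAL"):
    """Return (qud, probability) pairs in descending order of probability,
    stopping just before the trivial QUD."""
    ranked = sorted(posterior.items(), key=lambda pair: pair[1], reverse=True)
    above = []
    for qud, prob in ranked:
        if qud == trivial_label:
            break
        above.append((qud, prob))
    return above

# Made-up probabilities, purely for illustration.
toy_posterior = {"fierce": 0.4, "TRIVIAL": 0.3, "wet": 0.2, "blue": 0.1}
above = quds_above_trivial(toy_posterior)
```

An empty result would mean the trivial QUD is ranked first, which under the reading above suggests a literal rather than metaphorical predication.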

In [None]:
for word in ['shark']:
    for i in range(2):
#         np.mean([vecs['sales'],vecs['had'],vecs['to']])
        compute_results(subject=vecs['man'],predicate=word,
#                         ["feminine","unmanly","womanly","girly","weak","fertile"]
        quds=adjectives[50:500],possible_utterances=nouns[50:1000]+['shark'],
        mean_vecs=True,pca_remove_top_dims=False,
        sig1=10.0,sig2=0.1,
        qud_weight=1.0,freq_weight=0.0,
        categorical=True,
        vec_length=50,vec_type="glove.6B.",
        out_file="animals_pca.txt",
        sample_number = 1000,
        number_of_qud_dimensions=1,
        burn_in=500,
        # pickle_name=str(i),
        seed=False,
        trivial_qud_prior=True)


loading vecs: glove.6B.mean_vecs50
removed noun:  mass_noun
the top mean null model predictions:  [('wild', 0.29110804031977178), ('animal', 0.30280776963224054), ('dead', 0.33326118279874473), ('young', 0.33577260479700666), ('turned', 0.33855197016096639)]
the top mult null model predictions:  [('essential', 0.5086325852288327), ('basic', 0.51584907435357341), ('external', 0.51854717531096584), ('internal', 0.52627306917003003), ('existing', 0.53272955590960724)]
the top sum null model predictions:  [('wild', 0.29110804031977178), ('animal', 0.30280776963224054), ('dead', 0.33326118279874473), ('young', 0.33577260479700666), ('turned', 0.33855197016096639)]
Running categorical RSA with 950 possible utterances and 450 possible quds.
Mean centered=True and pca_removal=False

1000/1000 [100%] ██████████████████████████████ Elapsed: 1016s | Acceptance Rate: 0.876
time taken by L1 inference was 1016.9331903457642 seconds
THE RESULTS OF L1 INFERENCE FOR: predicate=shark


cosine world mov