# Part 1: Compositionality with Math

Compositional semantics is the study of how to construct the meaning of complex phrases and sentences from their component parts.

Before moving into language, we will start with the following simple example from math: What is the meaning of the following sentence?

**two plus three times five equals seventeen**

To start off, the easiest things to deal with in this sentence are the numbers. For example, the meaning of "three" is the number 3. We indicate this by writing [[three]] = 3, where the double brackets are used to represent the meaning of whatever is contained in the brackets. Run the following cell to store this meaning of the word "three" in this notebook's memory:

In [1]:
%%lamb
||three|| = 3

Now replace the question marks with the proper meanings for the remaining relevant numbers:

In [2]:
%%lamb
||two|| = 2
||five|| = 5
||seventeen|| = 17

Now we come to a somewhat harder question: What is the meaning of "times"? It's pretty easy to tell what the meaning of "three times five" should be, namely 15. However, in the semantics framework we are using, you may only combine two words at a time, whereas "three times five" contains three words. Therefore, we need to find some way to build up the phrase "three times five" from smaller units. 

We will do this by first creating a meaning for just the phrase "times five." What can "times five" mean? Well, we know that when we insert "three" we get "three times five", so we can think of "times five" as being a function that takes an argument (in this case 3) and multiplies it by 5. It is easy to write such a function in Python:

In [3]:
def times_five(x):
    return x * 5

In [4]:
times_five(3) # This should give 15

15

Equivalently, the Python function can be written with a lambda expression as follows:

In [5]:
(lambda x: x * 5)(3)

15

The first component of that line (lambda x: x \* 5) defines a function, which takes x as its argument and returns x\*5. Then the 3 in parentheses is an argument that has been passed to this function, and the result of passing this argument is 15.

Now we can ask again what "times" should mean. Since "times five" should give us something like the Python function (lambda x: x \* 5), we want "times" to give us something that, when given 5 as an argument, returns (lambda x: x \* 5). To accomplish this, we can simply add another lambda layer as follows:

(lambda y: (lambda x: x \* y))

When this function is given 5 as an argument, the outer lambda layer is evaluated, and the value that is returned is the function (lambda x: x \* 5), which is the same as our times_five function from before. If you then pass 3 as an argument to this output function, you get 15 as before, as demonstrated in the following cell:

In [6]:
(lambda y: (lambda x: x* y))(5)(3)

15

Therefore, we now can write the meaning for "times" using this lambda notation (the subscript $n$ indicates that $x$ and $y$ are of type number):

In [7]:
%%lamb
||times|| = L y_n : L x_n : x * y

Now we can compose together "times" and "five" to get the meaning of "times five". In our notation, we use the "\*" symbol to compose together the meanings of two words:

In [8]:
times * five

The output should tell you that the meaning of "times five" is a function that takes an argument $x$ and returns $x * 5$. The meaning of "three times five" should also behave as expected:

In [9]:
three * (times * five)

In the cell below, fill in the proper meaning for "plus":

In [10]:
%%lamb
||plus|| = L y_n : L x_n : x + y

Finally, we just need to define "equals" (in this notation, you have to use "<=>" for equality):

In [11]:
%%lamb
||equals|| = L y_n : L x_n : x <=> y

Now we can evaluate the meaning of our entire original equation. The equation is true, so the expression will evaluate as True.

In [12]:
((two * (plus * (three * (times * five)))) * (equals * seventeen))
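
For readers who want to check the arithmetic outside of the lamb notation, the same composition can be sketched in plain Python. This is only an illustration, assuming that composing two meanings with "\*" amounts to applying the function to its argument; the variable names mirror the word meanings above, and `==` stands in for `<=>`.

```python
# Plain-Python stand-ins for the word meanings defined above (illustrative only).
two, three, five, seventeen = 2, 3, 5, 17

# Each operator takes its right-hand argument first, then its left-hand one.
times = lambda y: lambda x: x * y
plus = lambda y: lambda x: x + y
equals = lambda y: lambda x: x == y

times_five = times(five)                  # "times five"
three_times_five = times_five(three)      # "three times five"
left_side = plus(three_times_five)(two)   # "two plus three times five"
sentence = equals(seventeen)(left_side)   # the whole sentence

print(three_times_five, left_side, sentence)  # 15 17 True
```

Note that the bracketing matters: grouping "two plus three" first would give 25 on the left-hand side, and the sentence would come out false.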

# Part 2: Compositionality with Language

### a. Proper nouns

Now that we've sorted out numbers, we can extend the formalism to the more interesting case of natural language. The linguistic components with the simplest meanings are proper nouns; much as number words have numbers as their meanings, proper nouns have entities in the real world as their meanings. For example, the meaning of "John" is the person John (shown here with a subscript $e$ because John is an entity):

In [13]:
%%lamb
||John|| = John_e

### b. Verbs

Next up are verbs. We will think of verbs as functions that take entities as their arguments and return either true or false depending on whether the verb applies to that argument. In pseudocode, we might write this sort of function as follows:

```
def Walked(x):
    if x walked:
        return True
    if x did not walk:
        return False
```

In our notation, we would write the meaning of "walked" as follows:

In [14]:
%%lamb
||walked|| = L x: Walked(x)

INFO (meta): Coerced guessed type for 'Walked_t' into <e,t>, to match argument 'x_e'


We can now compute the meaning of the short sentence "John walked":

In [15]:
John * walked

In semantics terminology, the meaning of the sentence "John walked" is the proposition "Walked(John)", which is evaluated as true if John did indeed walk or false otherwise. A proposition is simply some claim about the state of the world. For example, in this case, "Walked(John)" is making the claim that the world is such that John walked.
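
One way to make "evaluated as true if John did indeed walk" concrete is to fix a small model of the world and check the proposition against it. The sketch below is a toy illustration in plain Python, not part of the lamb notation; the set `WALKERS` and the entity "Mary" are made up for the example.

```python
# A hypothetical toy world: the set of entities that walked.
WALKERS = {"John"}

# Entities are represented by strings (an assumption for this sketch).
john = "John"
mary = "Mary"

# [[walked]]: a function from an entity to a truth value.
walked = lambda x: x in WALKERS

print(walked(john))  # True
print(walked(mary))  # False
```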

### c. Common nouns

We've dealt with proper nouns (such as "John"), but what about common nouns, like "dog"? Whereas "John" refers to a specific entity in the world, "dog" refers to a more general class of things. It turns out that we can treat common nouns just like verbs: as functions that take an entity and return true or false. For example, we would say that [[John is a dog]] = Dog(John). To allow this sort of interpretation, we will give "dog" the following meaning:

In [16]:
%%lamb
||dog|| = L x: Dog(x)

INFO (meta): Coerced guessed type for 'Dog_t' into <e,t>, to match argument 'x_e'


Even though "dog" does not refer to a specific entity, the phrase "the dog" does refer to a specific entity - namely, whichever dog is currently most relevant to the conversation. To facilitate this, we give the following definition for "the", where $\iota x$ means "the unique conversationally relevant entity x" (that symbol is the Greek letter iota):

In [17]:
%%lamb
||the|| = L f_<e,t> : Iota x_e : f(x) 

Thus, for example, we can view the meaning of "the dog" as follows, where the output of the cell should be read as "the unique conversationally relevant entity x such that x is a dog":

In [18]:
the * dog
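
To see what the iota operator is doing, it can help to evaluate it in a small, made-up world. The following plain-Python sketch assumes a finite domain of entities and treats "conversationally relevant" as simply "in the domain"; `DOMAIN`, `DOGS`, and `iota` are hypothetical names for this illustration.

```python
# Hypothetical toy world: the entities under discussion, and which of them are dogs.
DOMAIN = {"John", "Fido", "Whiskers"}
DOGS = {"Fido"}

dog = lambda x: x in DOGS  # [[dog]]: entity -> truth value

def iota(f):
    """Return the unique entity in DOMAIN satisfying f; raise if there is not exactly one."""
    matches = [x for x in DOMAIN if f(x)]
    if len(matches) != 1:
        raise ValueError(f"expected exactly one match, found {len(matches)}")
    return matches[0]

the = lambda f: iota(f)  # [[the]]: noun meaning -> entity

print(the(dog))  # Fido
```

If the world contained two dogs, or none, `iota` would raise an error in this sketch, which mirrors the intuition that "the dog" only makes sense when exactly one dog is relevant.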

### d. Transitive verbs

Try running the following cell:

In [19]:
John * (walked * (the * dog))

You should have gotten an error. This is because, when we defined "walked" above, we did not allow it to have a direct object; it was a function that only took one argument (its subject). Therefore, when we tried to give it two arguments as in the above cell, we got an error. To fix this, we'll need to create a second definition of "walked" where it is transitive. Think back to how we defined "times" as a two-layer lambda function; transitive verbs are handled similarly.

In [20]:
%%lamb
||walked2|| = L y: L x: Walked2(x,y)

INFO (meta): Coerced guessed type for 'Walked2_t' into <(e,e),t>, to match argument '(x_e, y_e)'


Now we can get the meaning for "John walked the dog" without an error:

In [21]:
John * (walked2 * (the * dog))
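
The currying pattern is the same one used for "times", just with entities instead of numbers. As a rough plain-Python illustration (the set `WALKED_PAIRS` and the entity strings are invented for the example), a transitive verb takes its object first and its subject second:

```python
# Hypothetical toy world: (walker, thing walked) pairs.
WALKED_PAIRS = {("John", "Fido")}

# [[walked2]]: takes the object y first, then the subject x.
walked2 = lambda y: lambda x: (x, y) in WALKED_PAIRS

walked_fido = walked2("Fido")   # "walked the dog", assuming the dog is Fido
print(walked_fido("John"))      # True:  John walked Fido
print(walked2("John")("Fido"))  # False: Fido did not walk John
```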

### e. Prepositional phrases

So far, most of the meanings we've built up may have seemed pretty obvious. But it becomes more interesting when we add in modifiers. For example, what should the meaning of "the dog near John" be? With just "the dog", we had the meaning $\iota$x . Dog(x), but now we want to add in the fact that the dog is near John. We would represent this as $\iota$x . Dog(x) $\wedge$ Near(x, John) (to be read as "the unique conversationally relevant entity x such that x is a dog and x is near John"; the $\wedge$ symbol means "and"). In the cell below, we give a definition for "near" that accomplishes this:

In [22]:
%%lamb
||near|| = L z: L y_<e,t>: L x: y(x) & Near(x, z)

INFO (meta): Coerced guessed type for 'Near_t' into <(e,e),t>, to match argument '(x_e, z_e)'


In [23]:
the * (dog * (near * John))
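
The key point is that "near John" takes a noun meaning and returns a more restrictive noun meaning. A small plain-Python sketch of this, using an invented toy world (`DOGS`, `NEAR_PAIRS`, and the entity names are assumptions for illustration):

```python
# Hypothetical toy world.
DOGS = {"Fido", "Rex"}
NEAR_PAIRS = {("Fido", "John")}  # Fido is near John; Rex is not

dog = lambda x: x in DOGS  # [[dog]]

# [[near]]: takes an entity z, then a noun meaning y, and returns a new noun meaning.
near = lambda z: lambda y: lambda x: y(x) and (x, z) in NEAR_PAIRS

dog_near_john = near("John")(dog)
print(sorted(d for d in DOGS if dog_near_john(d)))  # ['Fido']
```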

Here are a few more words; fill in the relevant meanings:

In [24]:
%%lamb
||inside|| = L z: L y_<e,t>: L x: y(x) & Inside(x, z)
||on|| = L z: L y_<e,t>: L x: y(x) & On(x, z)
||sandwich|| = L x: Sandwich(x)
||house|| = L x: House(x)
||table|| = L x: Table(x)

INFO (meta): Coerced guessed type for 'Inside_t' into <(e,e),t>, to match argument '(x_e, z_e)'
INFO (meta): Coerced guessed type for 'On_t' into <(e,e),t>, to match argument '(x_e, z_e)'
INFO (meta): Coerced guessed type for 'Sandwich_t' into <e,t>, to match argument 'x_e'
INFO (meta): Coerced guessed type for 'House_t' into <e,t>, to match argument 'x_e'
INFO (meta): Coerced guessed type for 'Table_t' into <e,t>, to match argument 'x_e'


Now you should be able to run the following cell to get the meaning of "the sandwich inside the house on the table":

In [25]:
(the * (sandwich * (inside * (the * (house * (on * (the * table)))))))

But wait a second! Examine the above output carefully; it says that the house is on the table, which seems wrong. What we really want to say is that the sandwich is inside the house and that the sandwich is on the table. In other words, we really want to have [[the sandwich inside the house on the table]] = $\iota$x . Sandwich(x) $\wedge$ Inside(x, $\iota$x1 . House(x1)) $\wedge$ On(x, $\iota$x2.Table(x2)). Edit the cell above so that it gives you the correct meaning.

### f. Adjectives

Adjectives are another type of common modifier for nouns. Based on the prepositional phrase examples above, write the meaning for "blue" in the cell below so that "the blue dog" in the next cell evaluates properly as [[the blue dog]] = $\iota$x . Dog(x) $\wedge$ Blue(x):

In [26]:
%%lamb
||blue|| = L y_<e,t>: L x: y(x) & Blue(x)

INFO (meta): Coerced guessed type for 'Blue_t' into <e,t>, to match argument 'x_e'


In [27]:
the * (blue * dog)

### g. Quantifiers

The final types of words we'll discuss are the quantifiers "every" and "a", defined below:

In [28]:
%%lamb
||every|| = L f_<e,t> : L g_<e,t> : Forall x_e : f(x) >> g(x)
||a|| = L f_<e,t> : L g_<e,t> : Exists x_e : f(x) & g(x)

These definitions introduce a few more symbols. The symbol $\forall$ means "for all", the symbol $\exists$ means "there exists", and the symbol $\rightarrow$ means "implies". For example, running the cell below shows that the meaning of "every dog walked" is "for all entities x, the fact that x is a dog implies that x walked":

In [29]:
(every * dog) * walked

And the next cell shows that the meaning of "a dog walked" is "there exists an entity x such that x is a dog and x walked":

In [30]:
(a * dog) * walked
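
Over a finite domain, these two definitions correspond closely to Python's built-in `all` and `any`. The sketch below is a toy illustration under that assumption; `DOMAIN`, `DOGS`, and `WALKERS` are made-up sets, and the implication f(x) $\rightarrow$ g(x) is written as `(not f(x)) or g(x)`.

```python
# Hypothetical toy world.
DOMAIN = {"John", "Fido", "Rex"}
DOGS = {"Fido", "Rex"}
WALKERS = {"John", "Fido"}

dog = lambda x: x in DOGS
walked = lambda x: x in WALKERS

# [[every]]: for all x, f(x) implies g(x).
every = lambda f: lambda g: all((not f(x)) or g(x) for x in DOMAIN)
# [[a]]: there exists an x such that f(x) and g(x).
a = lambda f: lambda g: any(f(x) and g(x) for x in DOMAIN)

print(every(dog)(walked))  # False: Rex is a dog that did not walk
print(a(dog)(walked))      # True:  Fido is a dog that walked
```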

### h. Some puzzles

Below are meaning representations for several English words. Figure out which word goes with each representation (there may be multiple correct answers):

1. $\lambda$f . $\lambda$ x. f(x) $\wedge$ $\forall$ y . f(y) $\rightarrow$ height(y) $\leq$ height(x)
2. $\lambda$f . $\iota$ x. f(x) $\wedge$ possesses(you, x)
3. $\lambda$f . $\lambda$ x. f(x,x) 



# Part 3: Word embeddings

In [31]:
import numpy as np

# Load the GloVe vectors into a dictionary mapping each word to its vector.
glove_dict = {}
with open("/Users/tommccoy/srproj/data/glove.6B.50d.txt", encoding="utf-8") as glove:
    for line in glove:
        parts = line.split()
        glove_dict[parts[0]] = np.array(list(map(float, parts[1:])))

In [32]:
def cos_sim(vec_a, vec_b):
    # Cosine of the angle between the two vectors (1 means same direction).
    return np.dot(vec_a, vec_b)/(np.linalg.norm(vec_a) * np.linalg.norm(vec_b))
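
As a quick sanity check on what cosine similarity measures, the following self-contained sketch applies the same formula to a few hand-picked vectors (the vectors are invented for illustration and are not GloVe embeddings). Vectors pointing in the same direction score 1, perpendicular vectors score 0, and opposite vectors score -1, regardless of their lengths.

```python
import numpy as np

def cos_sim(vec_a, vec_b):
    # Dot product divided by the product of the vector lengths.
    return np.dot(vec_a, vec_b) / (np.linalg.norm(vec_a) * np.linalg.norm(vec_b))

a = np.array([1.0, 0.0])
same = cos_sim(a, np.array([3.0, 0.0]))       # same direction, different length
perp = cos_sim(a, np.array([0.0, 2.0]))       # perpendicular
opposite = cos_sim(a, np.array([-1.0, 0.0]))  # opposite direction

print(same, perp, opposite)  # 1.0 0.0 -1.0
```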



In [33]:
cos_sim(glove_dict["the"], glove_dict["you"])

In [34]:
def find_closest(word_a):
    # Return the vocabulary word (other than word_a) most similar to word_a.
    max_sim = 0.0
    closest = ""

    for word_b in glove_dict:
        sim = cos_sim(glove_dict[word_a], glove_dict[word_b])
        if sim > max_sim and word_b != word_a:
            max_sim = sim
            closest = word_b

    return closest
        

In [35]:
def find_closest_vec(vec, word_a, word_b, word_c):
    # Return the vocabulary word most similar to vec, excluding the three input words.
    max_sim = 0.0
    closest = ""

    for word_d in glove_dict:
        sim = cos_sim(vec, glove_dict[word_d])
        if sim > max_sim and word_d not in (word_a, word_b, word_c):
            max_sim = sim
            closest = word_d

    return closest

In [36]:
find_closest("king")


In [37]:
cos_sim(glove_dict["cat"], glove_dict["dog"])


In [38]:
def analogy(word_a, word_b, word_c):
    new_vec = glove_dict[word_a] - glove_dict[word_b] + glove_dict[word_c]
    return find_closest_vec(new_vec, word_a, word_b, word_c)

In [39]:
analogy("kitten", "cat", "dog")


In [40]:
analogy("king", "man", "woman")


In [41]:
analogy("man", "men", "women")


In [42]:
analogy("ate", "eat", "run")

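
Because the cells above depend on a local copy of the GloVe file, here is a self-contained sketch of the same analogy arithmetic on a tiny, hand-made vocabulary. The two-dimensional vectors below are invented so that one axis loosely tracks royalty and the other gender; they are not real embeddings, and `toy_vecs` and `toy_analogy` are hypothetical names.

```python
import numpy as np

# Invented 2-d "embeddings": (royalty, gender). Not real GloVe vectors.
toy_vecs = {
    "king":   np.array([1.0,  1.0]),
    "queen":  np.array([1.0, -1.0]),
    "prince": np.array([0.9,  0.9]),
    "man":    np.array([0.1,  1.0]),
    "woman":  np.array([0.1, -1.0]),
}

def cos_sim(vec_a, vec_b):
    return np.dot(vec_a, vec_b) / (np.linalg.norm(vec_a) * np.linalg.norm(vec_b))

def toy_analogy(word_a, word_b, word_c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the three inputs."""
    target = toy_vecs[word_a] - toy_vecs[word_b] + toy_vecs[word_c]
    candidates = [w for w in toy_vecs if w not in (word_a, word_b, word_c)]
    return max(candidates, key=lambda w: cos_sim(target, toy_vecs[w]))

print(toy_analogy("king", "man", "woman"))  # queen
```

Here "king" minus "man" plus "woman" lands exactly on the invented vector for "queen"; with real embeddings the match is only approximate, which is why the nearest neighbor is used.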

# Part 4: Tree RNNs

We can now combine these two ideas to create sentence representations that a computer can actually use! The idea is to start from a tree for the sentence, initialize its leaves with word embeddings, and train an LSTM to compose the vectors up the tree into a single vector for the whole sentence.
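
As a rough sketch of the composition step, the code below walks a binary tree bottom-up, looks up an embedding at each leaf, and merges the two child vectors at each internal node. Several things here are simplifying assumptions rather than the method described above: the embeddings and weights are random instead of GloVe vectors and trained parameters, and a single `tanh` layer stands in for the LSTM cell. All names (`DIM`, `toy_embeddings`, `compose`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 4  # toy embedding size

# Random stand-ins for pretrained word embeddings.
toy_embeddings = {w: rng.normal(size=DIM) for w in ["the", "dog", "walked"]}

# Untrained composition parameters: map two child vectors to one parent vector.
W = rng.normal(scale=0.1, size=(DIM, 2 * DIM))
b = np.zeros(DIM)

def compose(tree):
    """Return a vector for a tree given as a word (leaf) or a (left, right) pair."""
    if isinstance(tree, str):
        return toy_embeddings[tree]
    left, right = tree
    children = np.concatenate([compose(left), compose(right)])
    return np.tanh(W @ children + b)

# The tree ((the dog) walked)
sentence_vec = compose((("the", "dog"), "walked"))
print(sentence_vec.shape)  # (4,)
```

The structural parallel with Part 2 is that each internal node combines exactly two children, just as the semantic framework combines two meanings at a time.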