# Semantics and Pragmatics, KIK-LG103

## Lab session 3, Part 3: Compositional distributional lexical semantics

---

<font color="red">**This page contains interactive graphics. It only works properly if you change to the "classic notebook" user interface. Start by selecting *Launch Classic Notebook* from the *Help* menu.**</font>

---

The final topic of today's lab is compositionality of meaning elements within words.

First, remember to import the necessary modules.

In [None]:
%matplotlib notebook
import sys
sys.path.append("../../../sem-prag-2025/src")
import plot_utils

embeddings, mapping = plot_utils.get_embeddings()

### Section 3.1: Vector arithmetic with Word2Vec

In the slides we saw some interesting properties of Word2Vec and GloVe embeddings under the heading *Compositional meaning in Word2Vec and GloVe*. The examples show how relations between word embeddings in vector space match our intuitions. For example, subtracting the embedding for *man* from the one for *king* and adding the one for *woman* yields, in the ideal case, a vector close to *queen*.

    king - man + woman = queen

In this section we will try to visualize this process.
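
The arithmetic itself can be sketched outside the plotting helper. The snippet below is a minimal illustration, not the code behind `plot_utils`: the three-dimensional vectors in `toy` are made up for this example (they are not real Word2Vec values), and `cosine` and `analogy` are hypothetical helper names. It computes `king - man + woman` and picks the remaining word whose vector has the highest cosine similarity to the result.

```python
import numpy as np

# Made-up toy vectors, NOT real Word2Vec values.
# Dimensions loosely mean: royalty, gender (male > 0, female < 0), youth.
toy = {
    "king":     np.array([0.9,  0.8, 0.1]),
    "queen":    np.array([0.9, -0.8, 0.1]),
    "man":      np.array([0.1,  0.9, 0.0]),
    "woman":    np.array([0.1, -0.9, 0.0]),
    "prince":   np.array([0.8,  0.8, 0.9]),
    "princess": np.array([0.8, -0.8, 0.9]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def analogy(base, minus, plus, vectors):
    """Return the word closest to base - minus + plus (input words excluded)."""
    target = vectors[base] - vectors[minus] + vectors[plus]
    candidates = [w for w in vectors if w not in (base, minus, plus)]
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

print(analogy("king", "man", "woman", toy))  # queen (with these toy values)
```

Excluding the three input words from the candidates is a common convention in analogy tests. With real embeddings, the word nearest to the result is otherwise often the base word itself.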

---

**Exercise 3.1.1** Run the code cell below. You will see a single vector for the word *king*. Now change the function argument `minus` from `None` to `"man"`. You should see a red vector starting at the end point of the `king` vector. The red vector corresponds to `man`, but it points in the opposite direction to the plain `man` vector, so it represents `-man`. If you follow the vectors for `king` and `-man` in turn, you end up at the point for `king - man`. A yellow vector going directly from the origin to the end point of `king - man` is shown as well.

---

**Exercise 3.1.2** Now change the argument `plus` to `"woman"`. You should see a new blue vector, which is the vector for `woman`. The blue vector starts at the end point of `king - man`. Again, there is a yellow vector, which points at the final result:

    king - man + woman

---

**Exercise 3.1.3** Change the argument `results` to `[ "queen", "princess", "prince", "maid" ]`. You will now see dots indicating where these four words *actually* lie in this two-dimensional projection of the Word2Vec space. If the composition works as it should, the yellow vector should point towards the dot labeled *queen*. Is that what happens?

---

In [None]:
plot_utils.plot_w2v_algebra(
    embeddings=embeddings,
    mapping=mapping,
    base="king",
    minus=None,
    plus=None,
    results=[]
)

---

**Exercise 3.1.4** In the code cell below, try out vector arithmetic on some words of your own choice. Try to come up with equations of the following form:

    king - man + woman = queen
    paris - france + germany = berlin
    bigger - big + small = smaller
    greece - warm + cold = ?

Do the results make any sense? Evaluating them is not that simple when all you have is a two-dimensional figure. You can choose how many words to put in the `results` field.

Leave a nice result in the code cell and continue to the next exercise.
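
One reason a flat figure is hard to judge is that a projection discards information. The sketch below illustrates this with made-up three-dimensional vectors (`target`, `word_a` and `word_b` are hypothetical names, not entries from the lab's embeddings). Dropping the third coordinate serves as a crude stand-in for a projection; it is not meant to reproduce whatever method `plot_utils` uses. The candidate that looks closest in two dimensions is not the closest one in the full space.

```python
import numpy as np

# Made-up vectors for illustration only.
target = np.array([1.0, 0.0, 1.0])
candidates = {
    # Same as target in the first two dimensions, opposite in the third.
    "word_a": np.array([1.0, 0.0, -1.0]),
    # Slightly off in the first two dimensions, same in the third.
    "word_b": np.array([0.8, 0.3, 1.0]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def rank(target, candidates, keep_dims=None):
    """Rank candidates by cosine similarity to target, best first.

    keep_dims=None uses every dimension; keep_dims=2 keeps only the
    first two coordinates, mimicking a two-dimensional view.
    """
    scores = [
        (word, cosine(target[:keep_dims], vec[:keep_dims]))
        for word, vec in candidates.items()
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

print("full space:", rank(target, candidates))
print("2D view:   ", rank(target, candidates, keep_dims=2))
```

In the full space `word_b` ranks first, while in the two-dimensional view `word_a` does. A mismatch between the yellow vector and the expected dot in the plot therefore does not by itself show that the composition failed.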

---

In [None]:
plot_utils.plot_w2v_algebra(
    embeddings=embeddings,
    mapping=mapping,
    base="???",      # change this
    minus=None,      # change this
    plus=None,       # change this
    results=[]       # change this
)

---

**Exercise 3.1.5** In the next code cell, find a good example of *additive* compositionality, that is, one with no "minus" word at all.

    word1 + word2 = ?

---

In [None]:
plot_utils.plot_w2v_algebra(
    embeddings=embeddings,
    mapping=mapping,
    base="???",      # change this
    minus=None,      # don't change this line here
    plus=None,       # change this
    results=[]       # change this
)

---

**Exercise 3.1.6** In the next code cell, find a good example of *subtractive* compositionality, that is, one with no "plus" word at all.

    word1 - word2 = ?

(In this exercise, please do not just reverse some of the compositions you found above for additive compositionality.)

---

In [None]:
plot_utils.plot_w2v_algebra(
    embeddings=embeddings,
    mapping=mapping,
    base="???",      # change this
    minus=None,      # change this
    plus=None,       # don't change this line here
    results=[]       # change this
)

---

**Exercise 3.1.7** Try to find some examples of *prejudice* or *bias* in the word2vec embeddings. The bias can be related, for instance, to gender, race or ethnicity, such as:

    doctor - man + woman = nurse
    
Come up with your own example.
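
One possible way to put a number on such a bias, sketched here under stated assumptions, is to compare each word with the direction `woman - man`. The two-dimensional vectors in `toy` are made up, with the second coordinate set by hand to mimic a skew. They are not real Word2Vec values and say nothing about what the lab's embeddings contain. `gender_lean` is a hypothetical helper name.

```python
import numpy as np

# Made-up toy vectors, NOT real Word2Vec values.
# The second coordinate is hand-picked to imitate a gender skew.
toy = {
    "man":    np.array([0.0,  1.0]),
    "woman":  np.array([0.0, -1.0]),
    "doctor": np.array([1.0,  0.3]),
    "nurse":  np.array([1.0, -0.6]),
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Direction pointing from `man` towards `woman`.
gender_direction = toy["woman"] - toy["man"]

def gender_lean(word, vectors=toy):
    """Positive: the word leans towards `woman`; negative: towards `man`."""
    return cosine(vectors[word], gender_direction)

for word in ("doctor", "nurse"):
    print(word, round(gender_lean(word), 2))
```

With these toy values `doctor` gets a negative score and `nurse` a positive one. Whether real embeddings show the same pattern is exactly what the exercise asks you to explore.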

---

In [None]:
plot_utils.plot_w2v_algebra(
    embeddings=embeddings,
    mapping=mapping,
    base="doctor",      # change this
    minus="man",        # change this
    plus="woman",       # change this
    results=[ "nurse", "professor", "gynecologist",
              "midwife", "wife", "scientist", "nanny" ] # change this
)

After this, you can continue with the home assignment.