<a href="https://colab.research.google.com/github/sarabartl/SemanticProjectionMetaphor/blob/main/UpIsDownProjections.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Semantic Projection for Conventional Metaphor

In this notebook, I adapt Grand et al.'s (2022) method of 'semantic projection' to metaphor. As a case study, I use the previously attested mapping between vertical orientation in space and emotional valence. Like Grand et al. (2022), I use the 300-dimensional GloVe embeddings trained on 42B tokens.

# 1. Setting Up

In [None]:
!pip install gensim
!pip install numpy

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [None]:
import numpy
from numpy import linalg
import pandas as pd

In [None]:
!wget 'https://huggingface.co/stanfordnlp/glove/resolve/main/glove.42B.300d.zip'

--2023-05-03 08:15:50--  https://nlp.stanford.edu/data/glove.42B.300d.zip
Resolving nlp.stanford.edu (nlp.stanford.edu)... 171.64.67.140
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://downloads.cs.stanford.edu/nlp/data/glove.42B.300d.zip [following]
--2023-05-03 08:15:50--  https://downloads.cs.stanford.edu/nlp/data/glove.42B.300d.zip
Resolving downloads.cs.stanford.edu (downloads.cs.stanford.edu)... 171.64.64.22
Connecting to downloads.cs.stanford.edu (downloads.cs.stanford.edu)|171.64.64.22|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1877800501 (1.7G) [application/zip]
Saving to: ‘glove.42B.300d.zip’


2023-05-03 08:21:44 (5.08 MB/s) - ‘glove.42B.300d.zip’ saved [1877800501/1877800501]



In [None]:
!unzip glove.42B.300d.zip
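from gensim.models import KeyedVectors
glove_file = 'glove.42B.300d.txt'
# GloVe text files lack the word2vec header line, hence no_header=True (needs gensim >= 4.0)
# Bound to `model` because the cells below index the embeddings as model[word]
model = KeyedVectors.load_word2vec_format(glove_file, binary=False, no_header=True)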
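Indexing a `KeyedVectors` object with a word that is not in its vocabulary raises a `KeyError`, so it can help to check the scale and target words before projecting. The following is a minimal sketch: `missing_words` is a hypothetical helper, not part of gensim, and the toy dictionary only stands in for the loaded embeddings so that the snippet runs without the 1.7G download. The `in` test works the same way on a `KeyedVectors` object.

```python
def missing_words(vectors, words):
    """Return the words in `words` that have no entry in `vectors`.

    `vectors` can be any mapping that supports `in`, e.g. a dict or
    a gensim KeyedVectors object.
    """
    return [w for w in words if w not in vectors]


# Toy stand-in for the loaded embeddings (assumption, for illustration only)
toy_vectors = {"up": [1.0, 0.0], "down": [-1.0, 0.0], "best": [0.8, 0.3]}

needed = ["up", "down", "best", "worst"]
absent = missing_words(toy_vectors, needed)
print(absent)  # ['worst']
```

With the real embeddings, the call would be `missing_words(model, ['up', 'down', 'high', 'low', 'worst', 'worse', 'better', 'best'])`, and an empty list means every lookup below is safe.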

# 2. Semantic Projections


Conceptual Metaphor Theory posits that we use concrete domains of knowledge to think and talk about more abstract domains. One example of a concrete domain is orientation in space (e.g. horizontal or vertical orientation). According to both experimental and linguistic evidence, we understand emotional valence (good/bad) in terms of vertical orientation (up/down). Linguistically, this shows in expressions such as "feeling up", meaning feeling good.

I construct two semantic scales to represent the source domain of vertical orientation (up-down, high-low) and project four lexical items from the target domain of emotional valence (best, better, worse, worst) onto them. Each scale is the difference between the embeddings of its two pole words, and a word's position on a scale is the dot product of its embedding with that difference vector. The higher the dot product, the closer the word lies to the 'up'/'high' end of the scale; the lower the dot product, the closer it lies to the 'down'/'low' end.
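Written out, the score for a word on the up-down scale is a plain dot product with the difference of the two pole vectors. The symbols $s$ and $\mathbf{v}_w$ are notation introduced here for illustration and are not taken from Grand et al. (2022):

$$
s_{\text{up-down}}(w) = \mathbf{v}_w \cdot (\mathbf{v}_{\text{up}} - \mathbf{v}_{\text{down}}) = \mathbf{v}_w \cdot \mathbf{v}_{\text{up}} - \mathbf{v}_w \cdot \mathbf{v}_{\text{down}}
$$

A word therefore scores high when its unnormalised dot product with 'up' exceeds that with 'down'. Since none of the vectors is length-normalised here, the scores are best compared within one scale. If a common unit across scales is needed, one option is to divide by $\lVert \mathbf{v}_{\text{up}} - \mathbf{v}_{\text{down}} \rVert$, which gives the scalar projection onto the scale.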

In [None]:
up_down = model['up'] - model['down'] # semantic scale 1
high_low = model['high'] - model['low'] # semantic scale 2

valence = ['worst', 'worse', 'better', 'best'] #target domain vocabulary to project

up_down_dots = [] #empty list for projections on up/down scale
high_low_dots = [] #empty list for projections on high/low scale

#### Up/Down Projections

In [None]:
for i in valence: #get the dotproduct for each word in valence list
  dotproduct = numpy.dot(model[i], up_down)
  up_down_dots.append(dotproduct) #write dotproduct to list
#create a df with the words and their dotproducts
up_down_projections = pd.DataFrame(list(zip(valence, up_down_dots)),
                          columns = ['valence', 'dotproduct'])
up_down_projections.sort_values('dotproduct') #order words by dotproduct value

index,valence,dotproduct
1,worse,0.165529
0,worst,0.478536
2,better,2.944689
3,best,3.72085


#### High/Low Projections

In [None]:
#repeat the same for high/low scale
for i in valence:
  dotproduct = numpy.dot(model[i], high_low)
  high_low_dots.append(dotproduct)
high_low_projections = pd.DataFrame(list(zip(valence, high_low_dots)),
                                    columns = ['valence', 'dotproduct'])
high_low_projections.sort_values('dotproduct')

index,valence,dotproduct
1,worse,-0.996801
0,worst,1.15819
2,better,2.046423
3,best,3.038534
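The two cells above repeat the same loop with a different scale. As a sketch of how this could be folded into one reusable function, assuming only that the embeddings can be indexed by word: `project_onto_scale` is a hypothetical helper, and the two-dimensional toy vectors are made up so that the snippet runs without the GloVe file.

```python
import numpy as np


def project_onto_scale(vectors, words, pos, neg):
    """Score each word by its dot product with vectors[pos] - vectors[neg].

    Returns (word, score) pairs sorted from the `neg` end to the `pos` end.
    """
    scale = np.asarray(vectors[pos]) - np.asarray(vectors[neg])
    scores = {w: float(np.dot(vectors[w], scale)) for w in words}
    return sorted(scores.items(), key=lambda pair: pair[1])


# Made-up 2-d vectors standing in for the real embeddings
toy = {
    "up": np.array([1.0, 0.0]),
    "down": np.array([-1.0, 0.0]),
    "best": np.array([0.9, 0.2]),
    "worst": np.array([-0.7, 0.1]),
}

ranking = project_onto_scale(toy, ["best", "worst"], "up", "down")
print(ranking)
```

With the real embeddings, `project_onto_scale(model, valence, 'high', 'low')` would cover the high/low case. Because the scores are built fresh on every call, re-running a cell does not append duplicate values to a list defined in an earlier cell.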


# 3. Visualisation

Visualisation of the target-domain tokens projected onto the two semantic scales: to come.
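One possible shape for this plot, as a rough sketch only: each scale is drawn as a number line with the four valence words placed at their dot products. The values are copied from the two output tables in section 2. The function name `plot_scales`, the panel titles, and the layout are assumptions rather than the planned design.

```python
import matplotlib.pyplot as plt

# Dot products copied from the output tables in section 2
up_down_scores = {"worse": 0.165529, "worst": 0.478536,
                  "better": 2.944689, "best": 3.720850}
high_low_scores = {"worse": -0.996801, "worst": 1.158190,
                   "better": 2.046423, "best": 3.038534}


def plot_scales(scales):
    """Draw one horizontal number line per scale, labelled with the words."""
    fig, axes = plt.subplots(len(scales), 1,
                             figsize=(8, 1.8 * len(scales)), squeeze=False)
    for ax, (title, scores) in zip(axes[:, 0], scales.items()):
        values = list(scores.values())
        ax.scatter(values, [0] * len(values))
        for word, value in scores.items():
            # Put each label slightly above its point
            ax.annotate(word, (value, 0), xytext=(0, 8),
                        textcoords="offset points", ha="center")
        ax.axvline(0, linewidth=0.8, linestyle="--")  # mark the zero point
        ax.set_yticks([])
        ax.set_title(title)
        ax.set_xlabel("dot product")
    fig.tight_layout()
    return fig


fig = plot_scales({"down to up": up_down_scores,
                   "low to high": high_low_scores})
```

Keeping the two scales in separate panels avoids suggesting that their unnormalised dot products share a unit.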
