# Lab2: Assignment - the lexicon of emotionally loaded words

Emotions are difficult to grasp, and the language we use to express them is subtle and nuanced. Which words can we use to express emotions?

In this assignment, you will find emotion-expressing words through WordNet (via NLTK) and through word embeddings.

We start from the six basic emotions that Ekman and his team claimed to hold universally across the facial expressions of people from many different cultures:

<img src="images/ekman-faces.png" width="500" height="600">

The emotions are: ```anger```, ```disgust```, ```fear```, ```joy```, ```sadness``` and ```surprise```. Many emotion detection modules in NLP use these six emotions.

Reference: Ekman, Paul, Wallace V. Friesen, Maureen O'Sullivan, Anthony Chan, Irene Diacoyanni-Tarlatzis, Karl Heider, Rainer Krause et al. "Universals and cultural differences in the judgments of facial expressions of emotion." Journal of Personality and Social Psychology 53, no. 4 (1987): 712.


## 1. Finding emotional words in WordNet

When you ask ChatGPT how to extract emotion words from WordNet-NLTK, it will give you more or less the following answer and code:

### Text generated by ChatGPT
To get **all words related to basic emotion terms** (like *sadness, anger, disgust, surprise, fear, joy*) using **WordNet (NLTK)**, you can:

1. Loop over each basic emotion term.
2. Query **adjective, noun, and verb synsets** (since emotions can span multiple parts of speech).
3. Collect **related words**:

   * Synonyms (`lemmas`)
   * Antonyms (`lemma.antonyms()`)
   * Similar adjectives (`similar_tos()` for adjectives)
   * Derivationally related forms
   * Hypernyms, hyponyms, meronyms, etc. (for nouns and verbs)

---


In [1]:
### Code generated by ChatGPT
### ✅ General Code to Expand Emotion Terms

from nltk.corpus import wordnet as wn

def expand_emotion_word(word):
    related_words = set()
    
    # Check all parts of speech
    for pos in [wn.NOUN, wn.ADJ, wn.VERB]:
        synsets = wn.synsets(word, pos=pos)
        for syn in synsets:
            # Synonyms
            for lemma in syn.lemmas():
                related_words.add(lemma.name())

                # Antonyms
                for antonym in lemma.antonyms():
                    related_words.add(antonym.name())

                # Derivationally related forms
                for deriv in lemma.derivationally_related_forms():
                    related_words.add(deriv.name())

            # Adjective similarity
            if pos == wn.ADJ:
                for similar in syn.similar_tos():
                    for lemma in similar.lemmas():
                        related_words.add(lemma.name())

            # Hypernyms, hyponyms, meronyms, holonyms
            for related_syn in syn.hypernyms() + syn.hyponyms() + syn.part_meronyms() + syn.member_holonyms():
                for lemma in related_syn.lemmas():
                    related_words.add(lemma.name())

    return sorted(related_words)

In [2]:
### Code generated by ChatGPT

basic_emotions = ['sadness', 'anger', 'disgust', 'surprise', 'fear', 'joy']

wordnet_emotion_expansions = {}

for emotion in basic_emotions:
    words = expand_emotion_word(emotion)
    wordnet_emotion_expansions[emotion] = words

for emotion, words in wordnet_emotion_expansions.items():
    print(f"{emotion.upper()}:{len(words)}")
    print(words)
    print()

SADNESS:37
['bereavement', 'cheerlessness', 'dejectedness', 'depression', 'desolation', 'dispiritedness', 'dolefulness', 'downheartedness', 'feeling', 'forlornness', 'gloominess', 'gloomy', 'happiness', 'heaviness', 'loneliness', 'low-spiritedness', 'lowness', 'lugubrious', 'lugubriousness', 'melancholy', 'misery', 'mourning', 'poignance', 'poignancy', 'regret', 'rue', 'ruefulness', 'sad', 'sadness', 'sorrow', 'sorrowful', 'sorrowfulness', 'tearfulness', 'uncheerfulness', 'unhappiness', 'unhappy', 'weepiness']

ANGER:57
['aggravate', 'anger', 'angriness', 'angry', 'annoyance', 'arouse', 'bad_temper', 'bridle', 'chafe', 'choler', 'choleric', 'combust', 'dander', 'deadly_sin', 'elicit', 'emotion', 'emotional_arousal', 'enkindle', 'enrage', 'enragement', 'evoke', 'exacerbate', 'exasperate', 'experience', 'feel', 'fire', 'fury', 'gall', 'hackles', 'huffiness', 'ill_temper', 'incense', 'indignation', 'infuriate', 'infuriation', 'ira', 'ire', 'irk', 'kindle', 'madden', 'madness', 'miff', 'mo

## ASSIGNMENT:

The basic emotion terms are all nouns, except for ```surprise``` which can also be a verb. This may have an impact on the expansion because the ```nets``` for each Part-of-Speech are poorly connected.
Answer the following questions:

1. Create another list called ```adjectival_basic_emotions``` that contains six adjectives that correspond to the six nouns.
2. Extend the dictionary ```wordnet_emotion_expansions``` with six adjectival expansions.
3. For each noun-adjective pair, how much overlap is there, and is the overlap more consistent?
4. Which words in these pairs are inconsistent with the emotion or do not imply this emotion?
5. Which of the WordNet relations causes these inconsistencies in your opinion? How could you make the function more consistent?

TIPS:

- list the nouns and adjectives in the order of the basic emotions and pair the lists using the ```zip``` function, e.g. ```for noun, adjective in zip(basic_emotions, adjectival_basic_emotions):```
- to get the overlap of two lists `a` and `b`, you can use the following expression: ```overlap = list(set(a)&set(b))```, which first turns each list into a set and then applies the ```&``` (set intersection) operator.
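
The two tips can be combined as in the minimal sketch below. It uses short, made-up word lists instead of real WordNet output, and the names `toy_expansions` and `pair_overlap` are hypothetical, chosen only for this illustration.

```python
# Toy stand-ins for real WordNet expansions (made-up, shortened lists).
basic_emotions = ['sadness', 'anger']
adjectival_basic_emotions = ['sad', 'angry']

toy_expansions = {
    'sadness': ['sad', 'sorrow', 'unhappiness', 'happiness'],
    'sad': ['sad', 'sorrowful', 'unhappiness', 'deplorable'],
    'anger': ['anger', 'angry', 'fury', 'ire'],
    'angry': ['angry', 'furious', 'fury', 'mad'],
}

def pair_overlap(expansions, nouns, adjectives):
    """Return {(noun, adjective): sorted shared words} for each aligned pair."""
    result = {}
    # zip pairs the i-th noun with the i-th adjective, so both lists
    # must be in the same emotion order.
    for noun, adjective in zip(nouns, adjectives):
        shared = set(expansions[noun]) & set(expansions[adjective])
        result[(noun, adjective)] = sorted(shared)
    return result

overlaps = pair_overlap(toy_expansions, basic_emotions, adjectival_basic_emotions)
for (noun, adjective), shared in overlaps.items():
    print(f"{noun}/{adjective}: {len(shared)} shared -> {shared}")
```

Sorting the intersection is optional; it only makes the printed lists easier to compare across pairs.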

## [YOUR CODE AND DISCUSSION GO HERE]

## 2. Get emotion words through word embeddings

## ASSIGNMENT:

Word embeddings can also be used to expand words to related words. Create two more dictionaries for emotion words derived from ```Wiki2Vec``` and ```Leipzig2Vec```.

1. Create a ```wiki2vec_emotion_dict``` dictionary by expanding the six nouns and six adjectives to their 50 most similar words using the Wiki2Vec embeddings.
2. Create a ```leipzig2vec_emotion_dict``` dictionary by expanding the six nouns and six adjectives to their 50 most similar words using the Leipzig2Vec embeddings.
3. For each noun-adjective pair, print the 50 most similar words for each Word2Vec model.
4. How correct are these most similar words?
5. How different are the lists across the models and compared to the WordNet expansion?

### 2.1 Wiki2Vec expansion:

Download the Wiki2Vec embeddings for your target language and load the model using the Gensim package. If you cannot load the complete model, load part of it.

## TIP: 
You need to check whether each emotion word is actually included in the vocabulary of the embedding model. If it is not, add an empty list for that word to the dictionary.
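
The vocabulary check could look roughly like the sketch below. It assumes a Gensim 4 `KeyedVectors` object, which exposes the vocabulary as `key_to_index` and the neighbour lookup as `most_similar`. The helper name `expand_with_embeddings` and the file path in the usage comment are hypothetical.

```python
def expand_with_embeddings(model, words, topn=50):
    """Map each word to its `topn` nearest neighbours, or [] if out of vocabulary.

    `model` is assumed to behave like a Gensim 4 KeyedVectors object:
    `model.key_to_index` is the vocabulary mapping and
    `model.most_similar(word, topn=n)` returns (word, similarity) tuples.
    """
    expansions = {}
    for word in words:
        if word in model.key_to_index:
            neighbours = model.most_similar(word, topn=topn)
            expansions[word] = [neighbour for neighbour, _score in neighbours]
        else:
            # Out-of-vocabulary word: keep the key, but with an empty list.
            expansions[word] = []
    return expansions

# Possible usage (not run here; the file name is a placeholder):
# from gensim.models import KeyedVectors
# model = KeyedVectors.load_word2vec_format("path/to/wiki2vec.txt", binary=False, limit=200000)
# wiki2vec_emotion_dict = expand_with_embeddings(model, ['sadness', 'sad'], topn=50)
```

The `limit` argument in the usage comment reads only the first vectors of the file, which is one way to load part of a model that does not fit in memory.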

## [HERE COMES YOUR CODE TO EXPAND THE WORDS USING WIKI2VEC]

### 2.2 Embeddings from the Leipzig corpus:

Download a text corpus from the Leipzig Corpora Collection. Build an embedding model from that corpus as explained in the notebook **Lab2.3.Creating_Wordembeddings**, or load the model from disk if you already built and saved it.

Get the words that are most similar to the six basic emotions in the same way as you did for the Wiki2Vec embeddings.

## TIP: 
You need to check whether each emotion word is actually included in the vocabulary of the embedding model. If it is not, add an empty list for that word to the dictionary.

## [HERE COMES YOUR CODE TO EXPAND THE WORDS USING LEIPZIG2VEC]

### 2.3. Comparison

## [HERE COMES YOUR CODE TO PRINT THE EXPANDED WORDS FOR EACH MODEL AND THE OVERLAP ACROSS THE TWO MODELS]

## [HERE COMES YOUR ANALYSIS AND DISCUSSION]

### 2.4 Lexical baseline

Create a dictionary with an entry for each basic emotion, and for each entry add the list of combined unique words obtained from the WordNet expansions and the Word2Vec expansions. Save this dictionary to a JSON file on disk because you need it later for your final assignment.

In [5]:
# [HERE IS SOME DUMMY CODE. USE IT TO SAVE YOUR DICTIONARY OF EMOTION WORDS DERIVED FROM WORDNET AND THE WORD2VEC MODELS]
import json
### dummy dictionary
dictionary = {'sadness': ["sad"], 'anger': ["angry"], 'disgust': ["disgusting"], 'surprise': ["surprise"], 'fear': ['fear'], 'joy': ['joy']}

### saving the dictionary to a JSON file
filename = "my_emotion_dictionary.json"
with open(filename, "w") as f:
    json.dump(dictionary, f)

### loading the dictionary from the file
with open('my_emotion_dictionary.json', 'r') as file:
    loaded_dictionary = json.load(file)
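
The dummy dictionary above stands in for the merged word lists. A minimal sketch of the merging step follows, assuming each expansion dictionary is keyed by the noun and adjective forms as in the earlier sections. The helper name `merge_expansions` and the toy input are hypothetical.

```python
def merge_expansions(noun_adjective_pairs, *expansion_dicts):
    """Combine several expansion dictionaries into one list of unique words per emotion.

    `noun_adjective_pairs` is a list of (noun, adjective) tuples; the noun is
    used as the key of the merged dictionary.
    """
    merged = {}
    for noun, adjective in noun_adjective_pairs:
        unique_words = set()
        for expansions in expansion_dicts:
            # .get(..., []) tolerates keys that are missing from a source.
            unique_words.update(expansions.get(noun, []))
            unique_words.update(expansions.get(adjective, []))
        # Sorted lists are JSON-serialisable; sets are not.
        merged[noun] = sorted(unique_words)
    return merged

# Toy input with made-up words, standing in for the real expansions.
toy_wordnet = {'joy': ['joy', 'delight'], 'joyful': ['joyful', 'glad']}
toy_word2vec = {'joy': ['delight', 'happiness'], 'joyful': []}
combined = merge_expansions([('joy', 'joyful')], toy_wordnet, toy_word2vec)
print(combined)
```

Converting each set back to a list matters here because `json.dump` cannot serialise Python sets.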


Create a function that scores a text by counting, for each basic emotion, how many of that emotion's words it contains. This function is called a ```lexical baseline``` and you will use it in your final assignment.
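
As one possible starting point, the counting idea can be sketched as follows. This is not the required solution: the function name `count_emotion_words`, the regular-expression tokenizer, and the toy dictionary are all assumptions made for the illustration.

```python
import re

def count_emotion_words(emotion_dictionary, text):
    """Count, per emotion, how many tokens of `text` occur in that emotion's word list."""
    # Lowercase the text and keep runs of letters and apostrophes as tokens,
    # which drops punctuation such as "!" and ",".
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = {}
    for emotion, words in emotion_dictionary.items():
        vocabulary = set(words)  # set membership tests are fast
        scores[emotion] = sum(1 for token in tokens if token in vocabulary)
    return scores

# Toy dictionary with made-up entries, standing in for the saved JSON file.
toy_dictionary = {'anger': ['angry', 'furious'], 'joy': ['happy', 'fun']}
toy_scores = count_emotion_words(toy_dictionary, "I am an angry, angry chatbot")
print(toy_scores)
```

A plain token count like this ignores negation, so a turn such as "You cannot make me happy" still scores for joy, and multiword WordNet lemmas such as `bad_temper` never match a single token.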

In [6]:
#[HERE COMES YOUR CODE TO DEFINE THE LEXICAL BASELINE AND APPLY IT TO THE TURNS]
my_turns = ["I am an angry chatbot", "You scare me!", "You cannot make me happy", "That does not surprise me", "I have a lot of fun talking to you", "You are disgusting, human"]

def lexical_baseline(dictionary={}, text=""):
    print(text)
    #[HERE COMES THE FUNCTION THAT CAN SCORE A TEXT FOR THE BASIC EMOTIONS]

#[APPLY THIS FUNCTION TO my_turns]

for turn in my_turns:
    lexical_baseline(loaded_dictionary, turn)

I am an angry chatbot
You scare me!
You cannot make me happy
That does not surprise me
I have a lot of fun talking to you
You are disgusting, human


# End of notebook