# NRC Emotion Lexicon

This is the [NRC Emotion Lexicon](http://saifmohammad.com/WebPages/NRC-Emotion-Lexicon.htm): "The NRC Emotion Lexicon is a list of English words and their associations with eight basic emotions (anger, fear, anticipation, trust, surprise, sadness, joy, and disgust) and two sentiments (negative and positive). The annotations were manually done by crowdsourcing."

I don't trust it, but everyone uses it.

In [1]:
import pandas as pd

In [2]:
filepath = "./NRC-Sentiment-Emotion-Lexicons/NRC-Emotion-Lexicon-v0.92/NRC-Emotion-Lexicon-Wordlevel-v0.92.txt"
emolex_df = pd.read_csv(filepath,  names=["word", "emotion", "association"], skiprows=45, sep='\t')
emolex_df.head(12)

Unnamed: 0,word,emotion,association
0,abandonment,joy,0
1,abandonment,negative,1
2,abandonment,positive,0
3,abandonment,sadness,1
4,abandonment,surprise,1
5,abandonment,trust,0
6,abate,anger,0
7,abate,anticipation,0
8,abate,disgust,0
9,abate,fear,0


Seems kind of simple: a column for a word, a column for an emotion, and a column saying whether they're associated or not. You see each word repeated down the `word` column because there's a row for every word-emotion pair.
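To make the one-row-per-pair layout concrete, here is a minimal sketch that rebuilds the ten rows from the `head()` output by hand and asks which emotions are flagged for one word. The name `toy_df` is made up for illustration and stands in for `emolex_df`; it is not the full lexicon.

```python
import pandas as pd

# Rows copied from the head() output; toy_df is a hypothetical stand-in
# for emolex_df, not the full lexicon
toy_df = pd.DataFrame(
    [
        ("abandonment", "joy", 0),
        ("abandonment", "negative", 1),
        ("abandonment", "positive", 0),
        ("abandonment", "sadness", 1),
        ("abandonment", "surprise", 1),
        ("abandonment", "trust", 0),
        ("abate", "anger", 0),
        ("abate", "anticipation", 0),
        ("abate", "disgust", 0),
        ("abate", "fear", 0),
    ],
    columns=["word", "emotion", "association"],
)

# Keep only the rows for one word where the association flag is 1
is_word = toy_df.word == "abandonment"
is_associated = toy_df.association == 1
abandonment_emotions = toy_df[is_word & is_associated].emotion.tolist()

print(abandonment_emotions)  # ['negative', 'sadness', 'surprise']
```

The same two-condition filter should work on the real `emolex_df`, assuming it loaded with the column names used above.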

## What emotions are covered?

Let's look at the 'emotion' column. What can we talk about?

In [3]:
emolex_df.emotion.unique()

array(['joy', 'negative', 'positive', 'sadness', 'surprise', 'trust',
       'anger', 'anticipation', 'disgust', 'fear'], dtype=object)

In [4]:
emolex_df.emotion.value_counts()

joy             14178
sadness         14178
positive        14178
trust           14178
negative        14178
surprise        14178
disgust         14177
anticipation    14177
fear            14177
anger           14177
Name: emotion, dtype: int64

## How many words does each emotion have?

Each emotion doesn't have 14,000-odd words associated with it, unfortunately! Those counts are just the number of rows per emotion. `1` means "is associated" and `0` means "is not associated."

We're only going to care about "is associated."

In [5]:
emolex_df[emolex_df.association == 1].emotion.value_counts()

negative        3322
positive        2312
fear            1473
anger           1245
trust           1230
sadness         1189
disgust         1058
anticipation     839
joy              689
surprise         534
Name: emotion, dtype: int64
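Since the association column only holds 0s and 1s, summing it per emotion should give the same counts as filtering and then calling `value_counts()`. A minimal sketch on a hand-built frame (`toy_df` is a hypothetical stand-in for `emolex_df`, with rows taken from the earlier `head()` output):

```python
import pandas as pd

# Toy stand-in for emolex_df; rows copied from the head() output
toy_df = pd.DataFrame(
    [
        ("abandonment", "joy", 0),
        ("abandonment", "negative", 1),
        ("abandonment", "positive", 0),
        ("abandonment", "sadness", 1),
        ("abandonment", "surprise", 1),
        ("abandonment", "trust", 0),
        ("abate", "fear", 0),
    ],
    columns=["word", "emotion", "association"],
)

# Summing a 0/1 column counts the 1s, so this is "associated words per emotion"
associated_counts = (
    toy_df.groupby("emotion")["association"]
    .sum()
    .sort_values(ascending=False)
)

print(associated_counts)
```

One difference worth knowing: the `groupby` version keeps emotions whose count is zero, while the filter-then-count version drops them.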

In theory things could be *kind of* angry or *kind of* joyous, but this lexicon doesn't work like that: every association is a flat 0 or 1. If you want to spend a few hundred dollars on Mechanical Turk, though, *your own personal version can.*

## What if I just want the fear words?

In [6]:
emolex_df[(emolex_df.association == 1) & (emolex_df.emotion == 'fear')].word

89          abduction
129             abhor
139         abhorrent
249        abominable
259       abomination
289          abortion
389           absence
589             abuse
629             abyss
779          accident
789        accidental
1079         accursed
1109          accused
1119          accuser
1129         accusing
1389          acrobat
1609            adder
1859       adjudicate
2089       admonition
2209           adrift
2249          advance
2359          adverse
2369        adversity
2739          afflict
2759       affliction
2799          affront
2849           afraid
2879        aftermath
2919              aga
3039       aggression
             ...     
139159         weight
139189        weighty
139219         weirdo
139469        whimper
139529      whirlpool
139539      whirlwind
139719         wicked
139889     wilderness
139899       wildfire
139969           wimp
139979          wimpy
139989          wince
140249          witch
140259     witchcraft
140319    

## Reshaping

You can also reshape the data to look at it a slightly different way, with one row per word and one column per emotion.

In [7]:
emolex_words = emolex_df.pivot(index='word', columns='emotion', values='association').reset_index()
emolex_words.head()

emotion,word,anger,anticipation,disgust,fear,joy,negative,positive,sadness,surprise,trust
0,,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,abandonment,,,,,0.0,1.0,0.0,1.0,1.0,0.0
2,abate,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,abatement,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,abba,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0


You can now pull out individual words...

In [8]:
# If you didn't reset_index you could do this more easily
# by doing emolex_words.loc['charitable']
emolex_words[emolex_words.word == 'charitable']

emotion,word,anger,anticipation,disgust,fear,joy,negative,positive,sadness,surprise,trust
1998,charitable,0.0,1.0,0.0,0.0,1.0,0.0,1.0,0.0,0.0,1.0
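The comment in that cell mentions skipping `reset_index`. Here is a minimal sketch of that variant on a hand-built frame (`toy_df` and `toy_words` are hypothetical names, with rows copied from the earlier `head()` output). It also shows why some cells in the pivoted table are `NaN`: a word-emotion pair that has no row in the long data has nothing to fill its cell.

```python
import pandas as pd

# Toy stand-in for emolex_df; rows copied from the head() output
toy_df = pd.DataFrame(
    [
        ("abandonment", "joy", 0),
        ("abandonment", "negative", 1),
        ("abandonment", "positive", 0),
        ("abandonment", "sadness", 1),
        ("abandonment", "surprise", 1),
        ("abandonment", "trust", 0),
        ("abate", "anger", 0),
        ("abate", "anticipation", 0),
        ("abate", "disgust", 0),
        ("abate", "fear", 0),
    ],
    columns=["word", "emotion", "association"],
)

# No reset_index here, so 'word' stays as the index
toy_words = toy_df.pivot(index="word", columns="emotion", values="association")

# Look a word up directly by its index label
abandonment_row = toy_words.loc["abandonment"]
print(abandonment_row)

# 'abandonment' has no 'anger' row in the toy data, so that cell is NaN
print(pd.isna(toy_words.loc["abandonment", "anger"]))
```

The tradeoff is that `word` is no longer a regular column, so filters like `emolex_words.word == 'charitable'` would need to go through the index instead.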


...or individual emotions...

In [9]:
emolex_words[emolex_words.anger == 1].head()

emotion,word,anger,anticipation,disgust,fear,joy,negative,positive,sadness,surprise,trust
14,abhor,1.0,0.0,1.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0
15,abhorrent,1.0,0.0,1.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0
24,abolish,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0
27,abomination,1.0,0.0,1.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0
60,abuse,1.0,0.0,1.0,1.0,0.0,1.0,0.0,1.0,0.0,0.0


...or multiple emotions!

In [10]:
emolex_words[(emolex_words.joy == 1) & (emolex_words.negative == 1)].head()

emotion,word,anger,anticipation,disgust,fear,joy,negative,positive,sadness,surprise,trust
58,abundance,0.0,1.0,1.0,0.0,1.0,1.0,1.0,0.0,0.0,1.0
1015,balm,0.0,1.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,0.0
1379,boisterous,1.0,1.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,0.0
1913,celebrity,1.0,1.0,1.0,0.0,1.0,1.0,1.0,0.0,1.0,1.0
2001,charmed,0.0,0.0,0.0,0.0,1.0,1.0,1.0,0.0,0.0,0.0
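If the parentheses and `&` get hard to read, `DataFrame.query` is another way to write the same kind of filter. A minimal sketch on a hand-built wide frame (`toy_words` is a hypothetical name; the values are copied from the output tables above and only a few emotion columns are kept):

```python
import pandas as pd

# A few rows in the wide, one-column-per-emotion shape
toy_words = pd.DataFrame(
    [
        ("abhor", 1.0, 0.0, 1.0, 0.0),
        ("abundance", 0.0, 1.0, 1.0, 1.0),
        ("balm", 0.0, 1.0, 1.0, 1.0),
        ("charitable", 0.0, 1.0, 0.0, 1.0),
    ],
    columns=["word", "anger", "joy", "negative", "positive"],
)

# Same idea as (toy_words.joy == 1) & (toy_words.negative == 1)
joyful_but_negative = toy_words.query("joy == 1 and negative == 1")

print(joyful_but_negative.word.tolist())  # ['abundance', 'balm']
```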


The useful part is going to be just getting words for a **single emotion.**

In [12]:
# Fear words
emolex_words[emolex_words.fear == 1].word

10         abduction
14             abhor
15         abhorrent
26        abominable
27       abomination
30          abortion
40           absence
60             abuse
64             abyss
79          accident
80        accidental
109         accursed
112          accused
113          accuser
114         accusing
140          acrobat
162            adder
187       adjudicate
210       admonition
222           adrift
226          advance
237          adverse
238        adversity
275          afflict
277       affliction
281          affront
286           afraid
289        aftermath
293              aga
305       aggression
            ...     
13916         weight
13919        weighty
13922         weirdo
13947        whimper
13953      whirlpool
13954      whirlwind
13972         wicked
13989     wilderness
13990       wildfire
13997           wimp
13998          wimpy
13999          wince
14025          witch
14026     witchcraft
14032      withstand
14037            woe
14077        