# NLTK

The Natural Language Toolkit (NLTK) is a Python library for NLP tasks. It provides easy-to-use
interfaces, text corpora, and lexical resources, and it supports classification, tokenization, stemming, tagging, parsing, and semantic reasoning. In this starter file, we will simply try out the VADER sentiment analysis tool.

In [1]:
import nltk

### Using NLTK's Pre-Trained Sentiment Analyzer
VADER (Valence Aware Dictionary and sEntiment Reasoner) is a sentiment analysis model bundled with NLTK. It is lexicon- and rule-based, so it is ready to use without any training step, and it is best suited to the language of social media: short sentences with slang and abbreviations. Although it is less accurate on longer, more structured, essay-like sentences, it is a useful starting tool because it gives initial results with little overhead.

In [13]:
nltk.download("vader_lexicon")


[nltk_data] Downloading package vader_lexicon to
[nltk_data]     /Users/weilezheng/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


True

VADER is the sentiment analysis tool, the VADER lexicon is the dictionary it uses, and SentimentIntensityAnalyzer is the NLTK class that applies VADER to perform sentiment analysis.

In [16]:
from nltk.sentiment.vader import SentimentIntensityAnalyzer
analyzer = SentimentIntensityAnalyzer()
print(analyzer.polarity_scores("Love you!"))
print(analyzer.polarity_scores("I love data science!"))

{'neg': 0.0, 'neu': 0.182, 'pos': 0.818, 'compound': 0.6696}
{'neg': 0.0, 'neu': 0.308, 'pos': 0.692, 'compound': 0.6696}


The negative, neutral, and positive scores are proportions that add up to 1. The compound score is normalized to the range -1 to 1: a positive value indicates overall positive sentiment, and a negative value indicates overall negative sentiment.
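The compound score is squashed into the -1 to 1 range by a simple normalization. Here is a minimal sketch of that step, assuming it matches the `normalize` helper in VADER's reference implementation (the raw valence sum divided by the square root of its square plus a constant `alpha` of 15). The raw sum `3.492` below is an assumed illustrative value, not something printed in this notebook.

```python
import math


def normalize(raw_score, alpha=15):
    """Squash an unbounded valence sum into the open interval (-1, 1)."""
    return raw_score / math.sqrt(raw_score * raw_score + alpha)


# Assumed raw valence sum, for illustration only.
raw = 3.492

print(round(normalize(raw), 4))   # positive raw sum -> positive compound
print(round(normalize(-raw), 4))  # same magnitude, opposite sign
print(normalize(0.0))             # no scored words -> 0.0
```

Because the compound score depends only on the summed valence of the words VADER recognizes, two sentences whose scored words match can share a compound value even when their neg/neu/pos proportions differ.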

Try it on a few more sentences, and tweak the variables to see how the scores change.

In [18]:
# Obvious Positive
positive = "I love Professor Juett's class"
print(f"Obvious Positive {analyzer.polarity_scores(positive)}")

# Obvious Negative
negative = "I hate EECS376"
print(f"Obvious Negative {analyzer.polarity_scores(negative)}")

# Sarcasm
sarcastic = "Oh yes, clearly adding more meetings to my schedule is exactly what I need to boost my productivity."
print(f"Sarcastic {analyzer.polarity_scores(sarcastic)}")

# Words with different meaning in different context
context1 = "He was killing it at the game last week"
context2 = "The math homework is killing me"
print(f"Context 1 {analyzer.polarity_scores(context1)}")
print(f"Context 2 {analyzer.polarity_scores(context2)}")


Obvious Positive {'neg': 0.0, 'neu': 0.417, 'pos': 0.583, 'compound': 0.6369}
Obvious Negative {'neg': 0.787, 'neu': 0.213, 'pos': 0.0, 'compound': -0.5719}
Sarcastic {'neg': 0.0, 'neu': 0.633, 'pos': 0.367, 'compound': 0.7964}
Context 1 {'neg': 0.355, 'neu': 0.645, 'pos': 0.0, 'compound': -0.6597}
Context 2 {'neg': 0.468, 'neu': 0.532, 'pos': 0.0, 'compound': -0.6597}
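
The compound scores printed above can be turned into coarse labels to make the results easier to compare. The sketch below uses a hypothetical helper, `label_from_compound`, with cutoffs of ±0.05. Those cutoffs are a commonly used convention and are an assumption here, not something defined in this notebook. The scores are copied from the output above so that the block runs without NLTK.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound score to a coarse label (threshold is an assumed cutoff)."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Compound scores copied from the notebook output above.
results = {
    "Obvious Positive": 0.6369,
    "Obvious Negative": -0.5719,
    "Sarcastic": 0.7964,
    "Context 1": -0.6597,
    "Context 2": -0.6597,
}

for name, compound in results.items():
    print(f"{name}: {label_from_compound(compound)}")
```

With these cutoffs the sarcastic sentence is labeled positive and "killing it" is labeled negative, which shows where a lexicon-based approach struggles: it scores individual words and does not capture sarcasm or context-dependent meaning.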
