# VADER Sentiment Analysis with Python and NLTK

In this lecture we're going to show you how to use VADER sentiment analysis with Python and NLTK.

OK, let's begin by importing NLTK:

In [1]:
import nltk

Next, you need to download the VADER lexicon. You only need to do this once.

In [2]:
nltk.download('vader_lexicon')

[nltk_data] Downloading package vader_lexicon to
[nltk_data]     /Users/marcosaguilerakeyser/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


True

Once you've downloaded the VADER lexicon, you should be able to import the `SentimentIntensityAnalyzer` class:

In [24]:
from nltk.sentiment.vader import SentimentIntensityAnalyzer

And then create an instance of it:

In [33]:
sid = SentimentIntensityAnalyzer()

VADER's `SentimentIntensityAnalyzer` simply takes in a string and returns a dictionary of scores in four categories:

- Negative
- Neutral
- Positive
- A compound score, which is computed by summing the valence scores of the words in the text, adjusting them according to VADER's rules, and normalizing the result to a value between -1 (most negative) and +1 (most positive)

So let's create a really simple string:

In [34]:
a = "This is a good movie"

In [35]:
a

'This is a good movie'

Now we are going to pass in the string:

In [36]:
sid.polarity_scores(a)

{'neg': 0.0, 'neu': 0.508, 'pos': 0.492, 'compound': 0.4404}

You get back a dictionary with a negative value, a neutral value, a positive value, and a compound value, which is a single normalized score summarizing the overall sentiment of the string.

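So, as we expect, there is no negative value, since this is a good movie. It has some neutral words or tones in it, and it also has some positive tones. The maximum value for any of these four scores is 1.0: the negative, neutral, and positive scores are proportions between 0 and 1, while the compound score ranges from -1 to 1.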
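To make the compound score less of a black box, here is a minimal sketch of the normalization step. It assumes the raw score is the sum of the lexicon valences of the words in the text and that the normalization constant is 15, as in VADER's reference implementation. The function name `normalize_compound` and the valence of 1.9 for "good" are assumptions for illustration, not values shown in this lecture, and the sketch leaves out VADER's rule-based adjustments for punctuation, capitalization, and negation.

```python
import math

def normalize_compound(raw_score, alpha=15):
    """Squash an unbounded summed valence into the range (-1, 1).

    alpha=15 is assumed here to match VADER's default constant.
    """
    return raw_score / math.sqrt(raw_score * raw_score + alpha)

# Assumed lexicon valence for "good"; the other words in
# "This is a good movie" are treated as neutral (valence 0).
good_valence = 1.9
compound = round(normalize_compound(good_valence), 4)
print(compound)
```

Under those assumptions the printed value is 0.4404, the same compound score the analyzer returned for "This is a good movie".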

So now let's try a more complicated string. Notice we're going to capitalize "ever made" and have three exclamation points.

In [37]:
a = "This was the best, most awesome movie EVER MADE!!!"

As we previously mentioned, VADER is smart enough to understand things like repeated punctuation and capitalization.

In [38]:
sid.polarity_scores(a)

{'neg': 0.0, 'neu': 0.425, 'pos': 0.575, 'compound': 0.8877}

Here we can see it's even more positive than the previous one.

The compound score is also much higher (0.8877 versus 0.4404), and the neutral share has dropped.

Finally let's go ahead and have a very negative string.

In [39]:
a = "This was the WORST movie that has ever disgraced the screen"

So, quite a negative review. Let's see if VADER picks it up.

In [40]:
sid.polarity_scores(a)

{'neg': 0.465, 'neu': 0.535, 'pos': 0.0, 'compound': -0.8331}

Here we can see that there is now no positive score, only neutral and negative, so the compound score becomes negative.

So, to summarize how to read the compound score:

- A compound score of zero is completely neutral
- A compound score above zero indicates some degree of positive sentiment
- A compound score below zero indicates some degree of negative sentiment
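The rules above can be sketched as a small helper that turns a compound score into a label. The function name `label_from_compound` and the `threshold` parameter are hypothetical additions for illustration. The default of 0.0 follows the rules in this lecture, though a small band around zero (for example 0.05) is a commonly used alternative.

```python
def label_from_compound(compound, threshold=0.0):
    """Map a VADER compound score to a coarse sentiment label.

    threshold is a hypothetical knob: scores within +/- threshold
    of zero are treated as neutral.
    """
    if compound > threshold:
        return "positive"
    if compound < -threshold:
        return "negative"
    return "neutral"

# Compound scores taken from the three examples in this lecture.
for score in (0.4404, 0.8877, -0.8331):
    print(score, label_from_compound(score))
```

In practice you would pass in `sid.polarity_scores(a)['compound']` rather than a hard-coded number.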

## Analyze Amazon reviews with VADER

So what we're going to do next is show you how you can use VADER to analyze Amazon reviews.

OK.

Coming up next we'll go ahead and run a sentiment analysis project.

We'll see you there.