# Introduction to VADER Sentiment Analyzer in Python

### Getting Started

This guide uses two libraries: pandas and vaderSentiment. Make sure both are
installed on your machine before continuing.
<br>
Run the following commands on the command line to install them:
- pip install pandas<br>
- pip install vaderSentiment

### Download Dataset

<p>You can download the YouTube dataset from Kaggle at: https://www.kaggle.com/general/181714</p>

<p>Place the .csv file in the same directory as your Python script. The code in this guide reads it as youtube_dataset.csv, so rename the file if needed.<br>
Now we are ready to write code.</p>

### Import Libraries
<br>
First we need to import the following libraries:

In [1]:
import pandas as pd
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

### How VADER Works

<div style="width:60%;">
<p>To use VADER, first create a SentimentIntensityAnalyzer object, then call its polarity_scores() method to evaluate the sentiment of a piece of text.</p>
</div>

In [3]:
import pandas as pd
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

# Create Sentiment Intensity Analyzer Object
sia = SentimentIntensityAnalyzer()

# Get polarity scores for an input string
sentiment_dict = sia.polarity_scores("Wow the food at this restaurant looks delicious!")
print(sentiment_dict)

{'neg': 0.0, 'neu': 0.435, 'pos': 0.565, 'compound': 0.8313}


<div style="width:85%;">
<p>polarity_scores() is a method of SentimentIntensityAnalyzer that takes a string as input and returns
a dictionary containing four scores:
</p>
</div>
<div>
<ul>
<li>negative (key: neg)</li>
<li>neutral (key: neu)</li>
<li>positive (key: pos)</li>
<li>compound (key: compound)</li>
</ul>
</div>
<div style="width:85%;" >
<p style="">
The method scores the given string against VADER's lexicon. The negative, neutral, and positive scores are the proportions of the text that fall into each category, so they add up to roughly 1. The compound score is computed separately: the valence scores of the individual words are summed, adjusted according to VADER's rules, and normalized so that the result always falls between -1 and 1. Compound scores closer to +1 indicate more positive sentiment, and compound scores closer to -1 indicate more negative sentiment. These scores can be used to classify the sentiment of a sentence. In the example above, the compound score is 0.8313, indicating a very positive sentence. Feel free to test this method with different strings to see what values it assigns to different sentences.
</p>
    
</div>
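
To make the -1 to 1 range concrete, here is a minimal sketch of the normalization step as we understand it from the vaderSentiment source: the summed word valences are divided by the square root of their square plus a constant alpha (15 by default). The name `normalize_valence` and the sample inputs are illustrative only and are not part of the library's public API.

```python
import math

def normalize_valence(total_valence, alpha=15):
    """Squash an unbounded valence sum into the range [-1, 1].

    Assumption: this mirrors the normalization used inside vaderSentiment
    (alpha = 15); the function name here is illustrative.
    """
    score = total_valence / math.sqrt(total_valence ** 2 + alpha)
    # Clamp to guard against floating-point drift past the bounds.
    return max(-1.0, min(1.0, score))

# Larger sums move toward +1 or -1 but never pass them.
for total in (-10, -2, 0, 2, 10):
    print(total, round(normalize_valence(total), 4))
```

Because of this squashing, a text with many strongly positive words approaches +1 without ever exceeding it.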

### Get the YouTube Comments from the Dataset

<div style="width:85%">
<p>Now that we have the modules we need and know how to use the SentimentIntensityAnalyzer object, we're ready to use VADER on comments from a dataset. We can use the pandas module to read comments from a csv file.</p>
</div>

In [4]:
import pandas as pd
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

# get a dataframe containing the comments using pandas
df = pd.read_csv("youtube_dataset.csv")
comments = df['Comment'][:3]

# print each comment
for i, comment in enumerate(comments, 1):
    print("comment", i, ":")
    print(comment)
    print()

comment 1 :
The people who liked this comment is officially before 7B views

comment 2 :
- Wait, it's 7B views
- Always has been

comment 3 :
*Teacher: What is the population of the Earth?*

*Me: Around one Despacito*



<div style="width:84%;">
<p>Here we used pandas to read the dataset and select the first 3 comments, which are stored as a pandas Series. Then we printed each comment to the screen.</p>
</div>
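
Real comment columns often contain empty cells, which pandas reads as NaN rather than text. The following is a minimal sketch of cleaning the column before scoring. It uses a small in-memory CSV in place of the real file; the sample rows and the `Likes` column are made up for illustration, and only the `Comment` column name comes from the dataset used in this guide.

```python
import io

import pandas as pd

# A tiny in-memory CSV standing in for the real dataset (made-up rows).
sample_csv = io.StringIO(
    "Likes,Comment\n"
    "10,Great video!\n"
    "3,\n"
    "7,I did not like this\n"
)

df = pd.read_csv(sample_csv)

# Drop missing comments and make sure every remaining value is a string.
comments = df["Comment"].dropna().astype(str)

print(len(df), "rows read;", len(comments), "comments kept")
```

With the real file, the same two steps would follow `pd.read_csv("youtube_dataset.csv")`.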

### Using VADER to Analyze Comments in the Dataset

<div style="">
<p>Now let's use the SentimentIntensityAnalyzer on comments from our dataset to get their sentiment scores.</p>
</div>

In [5]:
import pandas as pd
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

df = pd.read_csv("youtube_dataset.csv")
comments = df['Comment'][:3]

# Create Sentiment Intensity Analyzer object from vaderSentiment
sia = SentimentIntensityAnalyzer()


# Print each comment followed by its sentiment scores
for i, comment in enumerate(comments, 1):
    sentiment_dict = sia.polarity_scores(comment)

    print("comment", i, ":")
    print(comment)
    print("sentiment_dict", sentiment_dict)
    print()

comment 1 :
The people who liked this comment is officially before 7B views
sentiment_dict {'neg': 0.0, 'neu': 0.781, 'pos': 0.219, 'compound': 0.4215}

comment 2 :
- Wait, it's 7B views
- Always has been
sentiment_dict {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}

comment 3 :
*Teacher: What is the population of the Earth?*

*Me: Around one Despacito*
sentiment_dict {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}



<br>

### Classifying Sentiment of YouTube Comments

<div style="">
<p>Now that we have the compound score, we can choose thresholds for what we classify as positive, negative,
and neutral. For this example, we will classify comments using these rules, which match the thresholds in the code below: <br><br>
negative : compound_score <= -0.05<br>
neutral  : -0.05 < compound_score < 0.05 <br>
positive : compound_score >= 0.05 <br><br>
</p>
</div>

In [9]:
import pandas as pd
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

# Counters to keep track of positive, negative, and neutral sentiments
pos = 0
neg = 0
neutral = 0

df = pd.read_csv("youtube_dataset.csv")
comments = df['Comment'][:3]

sia = SentimentIntensityAnalyzer()

# Loop through the first 3 comments in the dataset and rate the sentiment of each.
for i, comment in enumerate(comments, 1):

    # Display the comment     
    print("Comment " + str(i) + ":")
    print(comment) 
    
    # polarity_scores is a method of SentimentIntensityAnalyzer
    # and returns a dictionary containing pos, neg, neu, and compound scores.
    sentiment_dict = sia.polarity_scores(comment)
    
    print("Overall sentiment dictionary is :", sentiment_dict)
    print("Sentence Overall Rated As ")
    
    # Classify the comment as positive, negative, or neutral based on the
    # compound score, and update the matching counter.
    if sentiment_dict['compound'] >= 0.05:
        print("Positive")
        pos += 1

    elif sentiment_dict['compound'] <= -0.05:
        print("Negative")
        neg += 1

    else:
        print("Neutral")
        neutral += 1
    print()

# Display the number of comments rated positive, negative, or neutral
print("positive comments = ", pos, " negative comments = ", neg, " neutral comments = " , neutral)

Comment 1:
The people who liked this comment is officially before 7B views
Overall sentiment dictionary is : {'neg': 0.0, 'neu': 0.781, 'pos': 0.219, 'compound': 0.4215}
Sentence Overall Rated As 
Positive

Comment 2:
- Wait, it's 7B views
- Always has been
Overall sentiment dictionary is : {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}
Sentence Overall Rated As 
Neutral

Comment 3:
*Teacher: What is the population of the Earth?*

*Me: Around one Despacito*
Overall sentiment dictionary is : {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}
Sentence Overall Rated As 
Neutral

positive comments =  1  negative comments =  0  neutral comments =  2
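
The if/elif/else logic above can be factored into small reusable helpers. This is a minimal sketch and not part of vaderSentiment: `classify_compound` and `tally_sentiments` are hypothetical names, the default threshold of 0.05 is the value used in the loop above, and the scorer is passed in as a parameter so the sketch can run with a stand-in instead of the real analyzer.

```python
from collections import Counter

def classify_compound(compound, threshold=0.05):
    """Map a compound score to a label using a symmetric threshold."""
    if compound >= threshold:
        return "Positive"
    if compound <= -threshold:
        return "Negative"
    return "Neutral"

def tally_sentiments(comments, score_fn, threshold=0.05):
    """Count sentiment labels for an iterable of comments.

    score_fn is any callable that returns a dict with a 'compound' key,
    for example SentimentIntensityAnalyzer().polarity_scores.
    """
    return Counter(
        classify_compound(score_fn(text)["compound"], threshold)
        for text in comments
    )

# Stand-in scorer with made-up values so the sketch runs without the dataset.
fake_scores = {"good": 0.44, "bad": -0.54, "ok": 0.0}
counts = tally_sentiments(fake_scores, lambda text: {"compound": fake_scores[text]})
print(counts)
```

With the real data, the call would look like `tally_sentiments(comments, sia.polarity_scores)`, and changing `threshold` makes it easy to experiment with stricter or looser rules.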


<br>
<div>
<p>Now you should understand how the VADER Sentiment Intensity Analyzer works. Remember that VADER is most accurate when analyzing social media posts.
However, VADER struggles to give accurate sentiment ratings for long pieces of text and can miss subtle nuances of natural language such as sarcasm, irony, and certain negations.</p>

</div>
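
One common workaround for long text is to score each sentence separately and average the compound scores. The sketch below illustrates that idea under stated assumptions: `average_compound` is a hypothetical helper, the regular expression is a rough stand-in for a proper sentence tokenizer, and the scorer is passed in so the example runs with a stub instead of the real analyzer.

```python
import re

def average_compound(text, score_fn):
    """Score each sentence separately and return the mean compound score.

    score_fn is any callable returning a dict with a 'compound' key,
    for example SentimentIntensityAnalyzer().polarity_scores.
    The regex split is a rough stand-in for a real sentence tokenizer.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    if not sentences:
        return 0.0
    return sum(score_fn(s)["compound"] for s in sentences) / len(sentences)

# Stub scorer with made-up values, used only to show the call shape.
def stub_score(sentence):
    return {"compound": 0.5 if "good" in sentence else -0.5}

print(average_compound("This part is good. This part is bad.", stub_score))
```

With the real analyzer, the call would be `average_compound(long_text, sia.polarity_scores)`.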
<br>