# Rule-based Movie Sentiment Analysis

VADER (Valence Aware Dictionary and sEntiment Reasoner) is a lexicon- and rule-based sentiment analysis tool. Its lexicon assigns each word a valence score: the sign marks the word as positive or negative, and the magnitude reflects how strongly. For example, "best" has a higher positive score than "better". VADER produces four sentiment metrics: positive, neutral, negative, and compound.

In this notebook, we run VADER on our own sets of positive and negative reviews to check the accuracy of its sentiment analysis.

In [1]:
import nltk
from nltk.sentiment import vader
# If the VADER lexicon is not installed yet, run nltk.download('vader_lexicon') once.

In [2]:
sia = vader.SentimentIntensityAnalyzer()

The polarity_scores method returns the four sentiment scores for a piece of text.

In [3]:
sia.polarity_scores('Good')

{'compound': 0.4404, 'neg': 0.0, 'neu': 0.0, 'pos': 1.0}

In [4]:
sia.polarity_scores('Terrible')

{'compound': -0.4767, 'neg': 1.0, 'neu': 0.0, 'pos': 0.0}
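
The compound values above can be reproduced by hand, which helps show what the score means. The sketch below assumes that the lexicon valences for "good" and "terrible" are 1.9 and -2.1, and that the compound score is the valence sum squashed with a normalization constant `alpha = 15`, as in the reference VADER implementation. The function name `normalize` and both valences are illustrative assumptions, not values taken from this notebook.

```python
import math

def normalize(valence_sum, alpha=15):
    """Squash an unbounded valence sum into the range [-1, 1]."""
    score = valence_sum / math.sqrt(valence_sum * valence_sum + alpha)
    return max(-1.0, min(1.0, score))

# Assumed lexicon valences for the two single-word inputs above.
print(round(normalize(1.9), 4))   # 0.4404
print(round(normalize(-2.1), 4))  # -0.4767
```

Because the normalization flattens out for large sums, a long review with many sentiment words approaches -1 or 1 but never exceeds it.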

In [5]:
pos_review = 'C:/Users/roush/Downloads/Python Sentiment Analysis/rt-polaritydata/rt-polaritydata/rt-polarity.pos'

In [6]:
with open(pos_review,'r') as f:
    positive_review = f.readlines()

In [7]:
positive_review[:5]

['the rock is destined to be the 21st century\'s new " conan " and that he\'s going to make a splash even greater than arnold schwarzenegger , jean-claud van damme or steven segal . \n',
 'the gorgeously elaborate continuation of " the lord of the rings " trilogy is so huge that a column of words cannot adequately describe co-writer/director peter jackson\'s expanded vision of j . r . r . tolkien\'s middle-earth . \n',
 'effective but too-tepid biopic\n',
 'if you sometimes like to go to the movies to have fun , wasabi is a good place to start . \n',
 "emerges as something rare , an issue movie that's so honest and keenly observed that it doesn't feel like one . \n"]

In [8]:
sia.polarity_scores(positive_review[0])

{'compound': 0.3612, 'neg': 0.0, 'neu': 0.918, 'pos': 0.082}

In [9]:
neg_review = 'C:/Users/roush/Downloads/Python Sentiment Analysis/rt-polaritydata/rt-polaritydata/rt-polarity.neg'

In [10]:
with open(neg_review,'r') as f:
    negative_review = f.readlines()

In [11]:
negative_review[:5]

['simplistic , silly and tedious . \n',
 "it's so laddish and juvenile , only teenage boys could possibly find it funny . \n",
 'exploitative and largely devoid of the depth or sophistication that would make watching such a graphic treatment of the crimes bearable . \n',
 '[garbus] discards the potential for pathological study , exhuming instead , the skewed melodrama of the circumstantial situation . \n',
 'a visually flashy but narratively opaque and emotionally vapid exercise in style and mystification . \n']
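
Both files are read with readlines, which keeps the trailing space and newline on every review and relies on the platform's default text encoding. A possible alternative is a small helper that strips each line and names the encoding explicitly. The helper name `load_reviews` and the `latin-1` default are assumptions made for this sketch; `latin-1` is chosen only because it can decode any byte sequence without raising an error.

```python
def load_reviews(path, encoding="latin-1"):
    """Return one stripped review per non-empty line of the file at `path`."""
    with open(path, "r", encoding=encoding) as f:
        # Iterating over the file yields one line at a time, newline included.
        return [line.strip() for line in f if line.strip()]
```

With this helper, the two loading cells would reduce to `positive_review = load_reviews(pos_review)` and `negative_review = load_reviews(neg_review)`.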

In [12]:
def vaderSentiment(review):
    return sia.polarity_scores(review)['compound']
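
The compound score is a number rather than a label, so some cutoff is needed to turn it into a class. This notebook uses zero as the cutoff. Another common convention, suggested in VADER's own documentation, treats scores between -0.05 and 0.05 as neutral. A minimal sketch of that convention follows; the function name `label_from_compound` and the `threshold` parameter are hypothetical.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound score to 'positive', 'negative', or 'neutral'."""
    if compound >= threshold:
        return 'positive'
    if compound <= -threshold:
        return 'negative'
    return 'neutral'

print(label_from_compound(0.3612))   # positive
print(label_from_compound(0.0))      # neutral
print(label_from_compound(-0.4767))  # negative
```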

In [13]:
def getSentiment(vaderSentiment):
    # Score every review with the supplied scoring function.
    # positive_review and negative_review are the lists loaded above.
    posReviewResult = [vaderSentiment(posReview) for posReview in positive_review]
    negReviewResult = [vaderSentiment(negReview) for negReview in negative_review]
    return {'resultsPositive': posReviewResult, 'resultsNegative': negReviewResult}
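
getSentiment reads positive_review and negative_review from the notebook's global scope, which makes it hard to try on other data. One possible variant passes the review lists in as arguments, so the same function can be checked with a small toy scorer before running VADER. The names `score_reviews` and `toy_score` are hypothetical and exist only for this sketch.

```python
def score_reviews(score_fn, positive_reviews, negative_reviews):
    """Apply score_fn to every review and keep the two classes separate."""
    return {
        'resultsPositive': [score_fn(r) for r in positive_reviews],
        'resultsNegative': [score_fn(r) for r in negative_reviews],
    }

# Toy stand-in scorer, for illustration only: +1 if "good" appears, else -1.
def toy_score(review):
    return 1.0 if 'good' in review else -1.0

results = score_reviews(toy_score, ['a good film'], ['a dull film'])
print(results)  # {'resultsPositive': [1.0], 'resultsNegative': [-1.0]}
```

The returned dictionary uses the same keys as getSentiment, so it can be passed to runDiagnostics unchanged.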

In [14]:
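def runDiagnostics(reviewResult):
    positiveReviewResult = reviewResult['resultsPositive']
    negativeReviewResult = reviewResult['resultsNegative']
    # A positive review is classified correctly when its compound score is above zero.
    numTruePositive = sum(x > 0 for x in positiveReviewResult)
    # A negative review is classified correctly when its compound score is below zero.
    numTrueNegative = sum(x < 0 for x in negativeReviewResult)
    pctTruePositive = float(numTruePositive) / len(positiveReviewResult)
    pctTrueNegative = float(numTrueNegative) / len(negativeReviewResult)
    totalAccuracy = float(numTruePositive + numTrueNegative)
    total = len(positiveReviewResult) + len(negativeReviewResult)

    print('Accuracy of Positive reviews =' + '%.2f' % (pctTruePositive * 100) + '%')
    print('Accuracy of Negative reviews =' + '%.2f' % (pctTrueNegative * 100) + '%')
    print('Overall Accuracy =' + '%.2f' % ((totalAccuracy / total) * 100) + '%')

In [15]:
vaderResults = getSentiment(vaderSentiment)

In [16]:
runDiagnostics(vaderResults)

Accuracy of Positive reviews =69.44%
Accuracy of Negative reviews =42.24%
Overall Accuracy =55.84%

These figures were recorded with the x>0 test applied to the negative reviews as well. Under that test, 42.24% is the share of negative reviews that VADER scored above zero, that is, the share it got wrong, and the overall figure inherits the same error. Only the positive-review figure (69.44%) is unaffected; the negative and overall figures need to be regenerated with the x<0 test shown in runDiagnostics.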
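
A compound score of exactly zero is neither above nor below the cutoff, so a review scored 0.0 counts as a miss for whichever class it belongs to. Counting those reviews separately makes the accuracy figures easier to interpret. The sketch below uses made-up scores, not results from this dataset, and the function name `summarize` is hypothetical.

```python
def summarize(scores, expect_positive):
    """Count correct, wrong, and zero-scored reviews for one class."""
    if expect_positive:
        correct = sum(s > 0 for s in scores)
    else:
        correct = sum(s < 0 for s in scores)
    zero = sum(s == 0 for s in scores)
    wrong = len(scores) - correct - zero
    return {'correct': correct, 'wrong': wrong, 'zero': zero}

# Made-up compound scores, for illustration only.
pos_scores = [0.36, 0.0, -0.2, 0.7]
neg_scores = [-0.5, 0.0, 0.4, -0.1]

pos = summarize(pos_scores, expect_positive=True)
neg = summarize(neg_scores, expect_positive=False)
accuracy = (pos['correct'] + neg['correct']) / (len(pos_scores) + len(neg_scores))

print(pos)       # {'correct': 2, 'wrong': 1, 'zero': 1}
print(neg)       # {'correct': 2, 'wrong': 1, 'zero': 1}
print(accuracy)  # 0.5
```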
