# When Rotten Tomatoes Isn't Enough: Twitter Sentiment Analysis with DSE

### Things to Set Up
* Create a Twitter account and get API access: https://developer.twitter.com/en/docs/ads/general/guides/getting-started.html
* Install DSE: https://docs.datastax.com/en/install/doc/install60/installTOC.html
* Start the DSE Analytics cluster: dse cassandra -k (the -k option is required for Analytics)
* Set and source the Twitter environment variables in the shell you will start Jupyter from:
    * CONSUMER_KEY
    * CONSUMER_SECRET
    * ACCESS_TOKEN
    * ACCESS_TOKEN_SECRET
* Install Anaconda and Jupyter (Anaconda is not required, but it makes installing Jupyter easier)
* Start Jupyter through DSE so it picks up all the environment variables: dse exec jupyter notebook
* !pip install cassandra-driver
* !pip install tweepy
* !pip install pattern
* Counterintuitively, do not install pyspark! The notebook uses the pyspark that ships with DSE.
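
Before starting Jupyter, it can help to confirm that the four Twitter variables listed above are actually visible to Python. The following is a minimal sketch; the helper name `missing_twitter_vars` is made up for illustration and is not part of tweepy or DSE.

```python
import os

# The four variable names listed in the setup steps above
REQUIRED_VARS = ["CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET"]


def missing_twitter_vars(env):
    """Return the required variable names that are absent or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_twitter_vars(os.environ)
if missing:
    print("Missing Twitter variables: " + ", ".join(missing))
else:
    print("All Twitter variables are set.")
```

Running this in the first notebook cell gives a readable message instead of a KeyError later, when the credentials are read from os.environ.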

#### Add the library paths needed to find the DSE version of pyspark

In [1]:
# Needed to be able to find the pyspark libraries
import sys
sys.path.append("/Users/amanda.moran/cassandra/dse-6.0.1/resources/spark/python/lib/pyspark.zip")
sys.path.append("/Users/amanda.moran/cassandra/dse-6.0.1/resources/spark/python/lib/py4j-0.10.4-src.zip")
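
The two paths above are specific to one machine and one DSE release. As a hedged alternative, the sketch below builds the same paths from an install directory. `DSE_HOME` is a hypothetical environment variable name chosen for this example (DSE does not necessarily set it), and `spark_python_paths` is an illustrative helper, not a DSE function.

```python
import glob
import os
import sys


def spark_python_paths(dse_home):
    """Build the pyspark and py4j archive paths under a DSE install directory."""
    lib_dir = os.path.join(dse_home, "resources", "spark", "python", "lib")
    paths = [os.path.join(lib_dir, "pyspark.zip")]
    # The py4j version in the file name varies by release, so match it with a wildcard
    paths.extend(sorted(glob.glob(os.path.join(lib_dir, "py4j-*-src.zip"))))
    return paths


# DSE_HOME is an assumed variable name; point it at your own install directory
dse_home = os.environ.get("DSE_HOME")
if dse_home:
    sys.path.extend(spark_python_paths(dse_home))
```

The wildcard avoids hard-coding the py4j version, which is the part of the path most likely to change between installs.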

#### Import python packages -- all are required

In [2]:
import pandas
import cassandra
import pyspark
import tweepy
import re
import os
from IPython.display import display, HTML
from pyspark.sql import SparkSession
from pyspark.ml.feature import Tokenizer, RegexTokenizer, StopWordsRemover
from pyspark.sql.functions import col, udf
from pyspark.sql.types import IntegerType
from pattern.en import sentiment, positive

#### Helper function to have nicer formatting of Spark DataFrames

In [3]:
#Helper for pretty formatting for Spark DataFrames
def showDF(df, limitRows =  5, truncate = True):
    if(truncate):
        pandas.set_option('display.max_colwidth', 50)
    else:
        pandas.set_option('display.max_colwidth', -1)
    pandas.set_option('display.max_rows', limitRows)
    display(df.limit(limitRows).toPandas())
    pandas.reset_option('display.max_rows')

### Creating Tables, Pulling Tweets, and Loading Tables

#### Connect to DSE Analytics Cluster

In [4]:
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1']) #If you have a locally installed DSE cluster
session = cluster.connect()

#### Create Demo Keyspace 

In [5]:
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS dseanalyticsdemo 
    WITH REPLICATION = 
    { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }"""
)

<cassandra.cluster.ResultSet at 0x107369b10>

#### Set keyspace 

In [6]:
session.set_keyspace('dseanalyticsdemo')

#### Set Movie Title variable --Change this to search for different movies!

In [58]:
movieTitle = "mamamia2"

In [59]:
positiveNegative = ["pos", "sad"] 

#### Create two tables in Cassandra for the movie title: one for negative tweets and one for positive tweets. Twitter returns a lot of information with each call, but this demo uses only the Twitter id (as the primary key, since it is unique) and the text of the tweet.
#### Is the Twitter id the right value to distribute by? Consider your data model when choosing your primary key.

In [60]:
for emotion in positiveNegative: 
    
    query = "CREATE TABLE IF NOT EXISTS movie_tweets_%s_%s (twitterid bigint, tweet text, PRIMARY KEY (twitterid))" % (movieTitle, emotion)
    print query
    session.execute(query)


CREATE TABLE IF NOT EXISTS movie_tweets_mamamia2_pos (twitterid bigint, tweet text, PRIMARY KEY (twitterid))
CREATE TABLE IF NOT EXISTS movie_tweets_mamamia2_sad (twitterid bigint, tweet text, PRIMARY KEY (twitterid))
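
Because the table name is built with string formatting (table names cannot be passed as bound query parameters), it is worth checking the movie title before it reaches the query. The sketch below is one way to do that; `tweet_table_name` is a hypothetical helper, and the allowed character set is an assumption chosen to be conservative.

```python
import re

# Letters, digits, and underscores only (a deliberately conservative assumption)
SAFE_NAME_PART = re.compile(r"\A[A-Za-z0-9_]+\Z")


def tweet_table_name(movie_title, emotion):
    """Return the demo table name, rejecting parts that are unsafe to interpolate."""
    for part in (movie_title, emotion):
        if not SAFE_NAME_PART.match(part):
            raise ValueError("Unsafe table name part: %r" % part)
    # Unquoted Cassandra identifiers are case-insensitive, so normalize to lower case
    return "movie_tweets_%s_%s" % (movie_title.lower(), emotion)


print(tweet_table_name("mamamia2", "pos"))
```

A title such as "mamma mia 2" would be rejected here, which is a reminder to pick a single hashtag-style word for the search.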


#### Set up the search terms for gathering tweets from Twitter's API. The happy :) and sad :( faces are Twitter search operators for finding positive and negative tweets.

In [61]:
searchTermSad= movieTitle + " :("
searchTermPos= movieTitle + " :)"

searchTerms = [searchTermSad, searchTermPos]

#### Function to clean up each tweet before it is inserted into Cassandra.
#### Removing: 
* emojis 
* flags 
* special characters 
* URLs 
* RT (for Retweet)

In [62]:
#Code from: https://stackoverflow.com/questions/33404752/removing-emojis-from-a-string-in-

def cleanUpTweet(tweet):
    
    emoji_pattern = re.compile(
    u"(\ud83d[\ude00-\ude4f])|"
    u"(\ud83c[\udf00-\uffff])|"  
    u"(\ud83d[\u0000-\uddff])|" 
    u"(\ud83d[\ude80-\udeff])|"  
    u"(\ud83c[\udde0-\uddff])" 
    "+", flags=re.UNICODE)

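    # Inside [...] every character is literal, so "|" only needs to appear once
    removeSpecial = re.compile(r'[\n|#@!.?,"]')
    # Match http and https links wherever they appear, including at the end of the tweet
    removeHttp = re.compile(r"https?\S+")
    # Strip only the standalone retweet marker, not "RT" inside words such as "PARTY"
    removeRetweet = re.compile(r"\bRT\b")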
    
    noemoji = emoji_pattern.sub(r'', tweet)
    nospecial = removeSpecial.sub(r'', noemoji)
    nohttp = removeHttp.sub(r'', nospecial)
    noretweet = removeRetweet.sub(r'', nohttp)
    
    cleanTweet=noretweet
    
    return cleanTweet
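
The function above targets Python 2, where emojis appear as surrogate pairs. For readers on Python 3, the following is a rough sketch of the same clean-up steps, not a drop-in replacement. The name `clean_tweet` and the exact code point ranges are assumptions; the ranges only approximate the blocks covered by the original pattern, so newer emojis outside them are left in place.

```python
import re

# Approximate emoji and flag blocks as single code points (Python 3 only)
EMOJI = re.compile("[\U0001F300-\U0001F6FF\U0001F1E0-\U0001F1FF]+")
URL = re.compile(r"https?://\S+")
RETWEET = re.compile(r"\bRT\b")
SPECIAL = re.compile(r'[\n#@!.?,"]')


def clean_tweet(text):
    """Remove links, emojis, the retweet marker, and special characters."""
    text = URL.sub("", text)       # remove links first, while their dots are intact
    text = EMOJI.sub("", text)
    text = RETWEET.sub("", text)   # word boundaries keep words like PARTY whole
    text = SPECIAL.sub("", text)
    return text.strip()


print(clean_tweet("RT @fan: Loved it! PARTY time #MamaMia2 https://t.co/abc"))
```

Removing links before special characters matters in this sketch: once the dots are stripped, a link no longer looks like a link.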

#### Required from Twitter: 
* consumer_key= ''
* consumer_secret= ''
* access_token=''
* access_token_secret=''

In [63]:

consumer_key = os.environ['CONSUMER_KEY']
consumer_secret = os.environ['CONSUMER_SECRET']

access_token = os.environ['ACCESS_TOKEN']
access_token_secret = os.environ['ACCESS_TOKEN_SECRET']

auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)

api = tweepy.API(auth)

#### This cell pulls tweets from Twitter. The maximum number of tweets returned for free in one call is 100. 
#### Run this code a couple of times to get more data! 
#### Once the tweets are collected, loop over the list, clean up each tweet, and insert it into the table. An outer for loop makes one call for positive tweets and one call for negative tweets. The happy and sad faces have been URL encoded: :) = "%20%3A%29" and :( = "%20%3A%28" (the %20 is the leading space).

In [64]:
for emotion in positiveNegative:
    print emotion
    public_tweets = 0
    query = "INSERT INTO movie_tweets_%s_%s (twitterid, tweet)" % (movieTitle, emotion)
    query = query + " VALUES (%s, %s)"
    
    if emotion == "pos":
        searchTermPos= movieTitle + "%20%3A%29"
        public_tweets = api.search(q=searchTermPos, lang="en", count="100")
    if emotion == "sad":
        searchTermSad= movieTitle + "%20%3A%28"
        public_tweets = api.search(q=searchTermSad, lang="en", count="100")

    for tweet in public_tweets:
        cleanTweet = cleanUpTweet(tweet.text)
        session.execute(query, (tweet.id, cleanTweet))
        print(cleanTweet)

pos
A festering wheel of stinky stinky Stilton but the best movie I have watched in ages  Loved it MamaMia2
MamaMia2 was amazing 
 hhgarcia41: Wow Thank you China &amp; thank you world In our 2nd week out our skyscrapermovie is not only the 1 movie on the planet be…
Mama Mia 2 was a good laugh and a great singalong notashamed famnightout MamaMia2
It was everything Go see it and have the time of your life MamaMia2 mamamiachallenge
 BCDlane: Let the PAY🥳 commence Join us for BREAKFAST LUNCH or DINNER at BigCityDiner at WardVillage Pearlridge or WindwardMall…
Just saw mamamia2 and it was  good I will be blaring Abba for the next month and daydreaming about trips to Greece
 pearcegardner2: The New Testament MamaMia2
Mama Mia was AMAZING Seriously recommend everyone to go and see it MamaMia2
Someone get me a young Bill Anderson please  MamaMia2
Dangit now I wanna go blonde and move to Greece MamaMia2
DrDPrabhat I am considering MamaMia2 I watched the first one in the O2 with my mother I won't s
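
The encoded search terms used in the cell above do not have to be written by hand. As a small sketch, the standard library can produce them; `search_term` is an illustrative helper name, not a tweepy function.

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2


def search_term(movie_title, emoticon):
    """Percent-encode the movie title, a space, and the emoticon operator."""
    return quote(movie_title + " " + emoticon, safe="")


print(search_term("mamamia2", ":)"))
print(search_term("mamamia2", ":("))
```

The two printed strings match the hand-encoded values in the cell: the title followed by %20%3A%29 for the happy face and %20%3A%28 for the sad face.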

#### Do a select * on each table and verify that the tweets have been inserted into each Cassandra table

In [65]:
for emotion in positiveNegative:
    print emotion
    query = 'SELECT * FROM movie_tweets_%s_%s limit 10' % (movieTitle, emotion)
    rows = session.execute(query)
    for user_row in rows:
        print (user_row.twitterid, user_row.tweet)

pos
(1021866373692248065, u'Went to see MamaMia2 &amp; it was so so good I laughed the entire way through Fab dancing and cast proper feel good\u2026')
(1021707917354000384, u' omid9: I LOVE BLOCK CAPITALSTHERE SHOULD BE AN ANNUAL WORLD SHOUTY DAYGO SEE MamaMia2 AND ME marlowetheatre THIS SATURDAY GOO\u2026')
(1022213545105281024, u'Mamma 3rd time in 4 days \xbf MamaMia2')
(1021874001377984512, u' _oli01_: if my future boyfriend doesn\u2019t sing waterloo to me in a restaurant full of people or sing to me on a boat i don\u2019t want him tha\u2026')
(1022189556869750784, u'I cannot stop jamming to ABBA Thank you mamamia2 for reminding me how great they are brb dancing in the mirror')
(1021782859315593221, u' is_quin: The last 5 minutes of MamaMia2 deserves an Oscar')
(1022148733981650945, u' _oli01_: if my future boyfriend doesn\u2019t sing waterloo to me in a restaurant full of people or sing to me on a boat i don\u2019t want him tha\u2026')
(1022066415522787328, u' hhgarcia41: Wow Tha

### Finally time for Apache Spark! 

#### Create a spark session that is connected to Cassandra. From there load each table into a Spark Dataframe and take a count of the number of rows in each.

In [66]:
countTokens = udf(lambda words: len(words), IntegerType())

spark = SparkSession.builder.appName('demo').master("local").getOrCreate()

tableNamePos = "movie_tweets_%s_pos" % (movieTitle.lower())
tableNameSad = "movie_tweets_%s_sad" % (movieTitle.lower())
tablepos = spark.read.format("org.apache.spark.sql.cassandra").options(table=tableNamePos, keyspace="dseanalyticsdemo").load()
tablesad = spark.read.format("org.apache.spark.sql.cassandra").options(table=tableNameSad, keyspace="dseanalyticsdemo").load()

print "Positive Table Count: "
print tablepos.count()
print "Negative Table Count: "
print tablesad.count()


Positive Table Count: 
376
Negative Table Count: 
370


#### Use Tokenizer to break up the sentences into individual words

In [67]:
tokenizerPos = Tokenizer(inputCol="tweet", outputCol="tweetwords")
tokenizedPos = tokenizerPos.transform(tablepos)

dfPos = tokenizedPos.select("tweet", "tweetwords").withColumn("tokens", countTokens(col("tweetwords")))

showDF(dfPos)

tokenizerSad = Tokenizer(inputCol="tweet", outputCol="tweetwords")
tokenizedSad = tokenizerSad.transform(tablesad)

dfSad = tokenizedSad.select("tweet", "tweetwords").withColumn("tokens", countTokens(col("tweetwords")))

showDF(dfSad)

Unnamed: 0,tweet,tweetwords,tokens
0,Went to see MamaMia2 &amp; it was so so good I...,"[went, to, see, mamamia2, &amp;, it, was, so, ...",23
1,omid9: I LOVE BLOCK CAPITALSTHERE SHOULD BE A...,"[, omid9:, i, love, block, capitalsthere, shou...",21
2,Mamma 3rd time in 4 days ¿ MamaMia2,"[mamma, 3rd, time, in, 4, days, ¿, mamamia2]",8
3,_oli01_: if my future boyfriend doesn’t sing ...,"[, _oli01_:, if, my, future, boyfriend, doesn’...",29
4,I cannot stop jamming to ABBA Thank you mamami...,"[i, cannot, stop, jamming, to, abba, thank, yo...",21


Unnamed: 0,tweet,tweetwords,tokens
0,Went to see MamaMia2 &amp; it was so so good I...,"[went, to, see, mamamia2, &amp;, it, was, so, ...",23
1,omid9: I LOVE BLOCK CAPITALSTHERE SHOULD BE A...,"[, omid9:, i, love, block, capitalsthere, shou...",21
2,Mamma 3rd time in 4 days ¿ MamaMia2,"[mamma, 3rd, time, in, 4, days, ¿, mamamia2]",8
3,_oli01_: if my future boyfriend doesn’t sing ...,"[, _oli01_:, if, my, future, boyfriend, doesn’...",29
4,I cannot stop jamming to ABBA Thank you mamami...,"[i, cannot, stop, jamming, to, abba, thank, yo...",21
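
Some token lists above start with an empty string. That is expected when a tweet begins with a space: Spark's Tokenizer lowercases the text and splits it on whitespace. The plain-Python sketch below approximates that behavior so it can be tried without a cluster. `simple_tokenize` is a made-up name, and dropping trailing empty strings is an assumption about how the underlying split behaves.

```python
import re


def simple_tokenize(text):
    """Lowercase the text and split on single whitespace characters."""
    tokens = re.split(r"\s", text.lower())
    # Assumption: trailing empty strings are dropped, leading ones are kept
    while tokens and tokens[-1] == "":
        tokens.pop()
    return tokens


print(simple_tokenize(" omid9: I LOVE BLOCK"))
print(len(simple_tokenize("Mamma 3rd time in 4 days ¿ MamaMia2")))
```

If the empty tokens are unwanted, stripping leading whitespace in the tweet clean-up step is one option; RegexTokenizer, which the notebook already imports, is another.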


#### Use StopWordsRemover to remove all stop words. Interesting to see: people don't use many stop words on Twitter!

In [68]:
removerPos = StopWordsRemover(inputCol="tweetwords", outputCol="tweetnostopwords")
removedPos = removerPos.transform(dfPos)

dfPosStop = removedPos.select("tweet", "tweetwords", "tweetnostopwords").withColumn("tokens", countTokens(col("tweetwords"))).withColumn("notokens", countTokens(col("tweetnostopwords")))

showDF(dfPosStop)

removerSad = StopWordsRemover(inputCol="tweetwords", outputCol="tweetnostopwords")
removedSad = removerSad.transform(dfSad)

dfSadStop = removedSad.select("tweet", "tweetwords", "tweetnostopwords").withColumn("tokens", countTokens(col("tweetwords"))).withColumn("notokens", countTokens(col("tweetnostopwords")))

showDF(dfSadStop)

Unnamed: 0,tweet,tweetwords,tweetnostopwords,tokens,notokens
0,So I saw MamaMia2 today and omfg I was not pre...,"[so, i, saw, mamamia2, today, and, omfg, i, wa...","[saw, mamamia2, today, omfg, prepared, feels, ...",27,14
1,DoctorWho_FR_: MattSmith MamaMia2 MamaMiaHere...,"[, doctorwho_fr_:, mattsmith, mamamia2, mamami...","[, doctorwho_fr_:, mattsmith, mamamia2, mamami...",18,14
2,hhgarcia41: Wow Thank you China &amp; thank y...,"[, hhgarcia41:, wow, thank, you, china, &amp;,...","[, hhgarcia41:, wow, thank, china, &amp;, than...",27,15
3,Mama Mia 2 was absolutely delightful Anyone wh...,"[mama, mia, 2, was, absolutely, delightful, an...","[mama, mia, 2, absolutely, delightful, anyone,...",15,11
4,hhgarcia41: Wow Thank you China &amp; thank y...,"[, hhgarcia41:, wow, thank, you, china, &amp;,...","[, hhgarcia41:, wow, thank, china, &amp;, than...",27,15


Unnamed: 0,tweet,tweetwords,tweetnostopwords,tokens,notokens
0,DoctorWho_FR_: MattSmith MamaMia2 MamaMiaHere...,"[, doctorwho_fr_:, mattsmith, mamamia2, mamami...","[, doctorwho_fr_:, mattsmith, mamamia2, mamami...",18,14
1,hhgarcia41: Wow Thank you China &amp; thank y...,"[, hhgarcia41:, wow, thank, you, china, &amp;,...","[, hhgarcia41:, wow, thank, china, &amp;, than...",27,15
2,Mama Mia 2 was absolutely delightful Anyone wh...,"[mama, mia, 2, was, absolutely, delightful, an...","[mama, mia, 2, absolutely, delightful, anyone,...",15,11
3,hhgarcia41: Wow Thank you China &amp; thank y...,"[, hhgarcia41:, wow, thank, you, china, &amp;,...","[, hhgarcia41:, wow, thank, china, &amp;, than...",27,15
4,Vis Island in Croatia is beautiful Who wants t...,"[vis, island, in, croatia, is, beautiful, who,...","[vis, island, croatia, beautiful, wants, visit...",17,10
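
To make the difference between the tokens and notokens columns concrete, here is a plain-Python sketch of stop-word filtering applied to the start of the first tweet shown above. The stop list is a tiny illustrative set chosen for this example; Spark's default English list is much longer, and `remove_stop_words` is a hypothetical helper.

```python
# A tiny illustrative stop list, not Spark's default list
ILLUSTRATIVE_STOP_WORDS = {"so", "i", "and", "was", "not", "you", "to", "it"}


def remove_stop_words(tokens, stop_words=ILLUSTRATIVE_STOP_WORDS):
    """Keep only the tokens that are not in the stop list (case-insensitive)."""
    return [t for t in tokens if t.lower() not in stop_words]


tokens = ["so", "i", "saw", "mamamia2", "today", "and", "omfg", "i", "was", "not", "prepared"]
kept = remove_stop_words(tokens)
print(kept)
print("%d tokens -> %d tokens" % (len(tokens), len(kept)))
```

Note that empty-string tokens and user names pass through untouched, which is why they still appear in the tweetnostopwords column.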


### Sentiment Analysis using Python package Pattern

#### Convert each Spark DataFrame to a Pandas DataFrame. From there, loop over each row and get the sentiment score (anything above 0 is positive and anything negative or 0 is negative). The "positive" function returns True if the tweet is positive. For more info on how the scores are calculated: https://www.clips.uantwerpen.be/pages/pattern-en#sentiment
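
Pattern's sentiment function returns a (polarity, subjectivity) pair, and the positive function compares the polarity against a threshold. The sketch below mimics that comparison in plain Python so the rule can be checked without installing Pattern. `is_positive` is a stand-in written for this example, and the "greater than or equal" comparison is an assumption that agrees with the output recorded in this notebook, where a polarity of 0.1 is reported as True and 0.05 as False.

```python
def is_positive(polarity, threshold=0.1):
    """Stand-in for the threshold check: polarity at or above the threshold counts as positive."""
    return polarity >= threshold


# Polarity values taken from the notebook output
for polarity in [0.4875, 0.1, 0.05, 0.0, -0.2]:
    print("%s -> %s" % (polarity, is_positive(polarity)))
```

This also explains why a tweet with a small positive polarity such as 0.05 is not counted as positive: it is above zero but below the 0.1 threshold.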

#### Negative Tweets

In [69]:
pandaSad = dfSadStop.toPandas()
movieScoreSad = 0
countSad = 0
numSadTweets = 0

for index, row in pandaSad.iterrows():
    print row['tweet']
    print sentiment(row["tweetnostopwords"])
    print positive(row["tweetnostopwords"], .1)
    if positive(row["tweetnostopwords"], .1):
        print "This is a negative tweet! Analysis is wrong :("
        countSad = countSad + 1
    scoreSad = sentiment(row['tweetnostopwords'])[0]
    if scoreSad <= 0:
        movieScoreSad = scoreSad + movieScoreSad

 DoctorWho_FR_: MattSmith MamaMia2 MamaMiaHereWeGoAgain LilyJamesSo excited about this Matt and Lily looks amazing as always  ht…
(0.48750000000000004, 0.825)
True
This is a negative tweet! Analysis is wrong :(
 hhgarcia41: Wow Thank you China &amp; thank you world In our 2nd week out our skyscrapermovie is not only the 1 movie on the planet be…
(0.05, 0.5)
False
Mama Mia 2 was absolutely delightful Anyone who says otherwise is a joyless monster MamaMia2
(1.0, 1.0)
True
This is a negative tweet! Analysis is wrong :(
 hhgarcia41: Wow Thank you China &amp; thank you world In our 2nd week out our skyscrapermovie is not only the 1 movie on the planet be…
(0.05, 0.5)
False
Vis Island in Croatia is beautiful Who wants to visit this beautiful island where MamaMia2 was filmed…
(0.6333333333333333, 0.7000000000000001)
True
This is a negative tweet! Analysis is wrong :(
 DanteHarker: Off to see MamaMia2 have you seen it What did you think
(0.0, 0.0)
False
I will also say that upon watching MamaM

This is a negative tweet! Analysis is wrong :(
MamaMia2 is amazing and lily James is amazing
(0.6000000000000001, 0.9)
True
This is a negative tweet! Analysis is wrong :(
It’s timeMamaMia2
(0.0, 0.0)
False
 hhgarcia41: Wow Thank you China &amp; thank you world In our 2nd week out our skyscrapermovie is not only the 1 movie on the planet be…
(0.05, 0.5)
False
 DoctorWho_FR_: MattSmith MamaMia2 MamaMiaHereWeGoAgain LilyJamesSo excited about this Matt and Lily looks amazing as always  ht…
(0.48750000000000004, 0.825)
True
This is a negative tweet! Analysis is wrong :(
 hhgarcia41: Wow Thank you China &amp; thank you world In our 2nd week out our skyscrapermovie is not only the 1 movie on the planet be…
(0.05, 0.5)
False
Just watched MamaMia so I could cope up with the Part 2 and damnson It was soooooo good I'm gonna have to watch…
(0.7, 0.6000000000000001)
True
This is a negative tweet! Analysis is wrong :(
 hhgarcia41: Wow Thank you China &amp; thank you world In our 2nd week out our sky

MattSmith MamaMia2 MamaMiaHereWeGoAgain LilyJamesSo excited about this Matt and Lily looks amazing as always…
(0.48750000000000004, 0.825)
True
This is a negative tweet! Analysis is wrong :(
 omid9: MamaMia2 TV ad now cut down to only this &amp; has crashed Coincidence HahahaNo Well played Universa…
(0.0, 0.0)
False
Only just realised that young Rosie was played by AlexaLaura in MamaMia2 I didn’t even clock it and I was OBSESS…
(0.1, 0.4)
True
This is a negative tweet! Analysis is wrong :(
MERYL STREEP DESERVED BETTER MamaMia2
(0.5, 0.5)
True
This is a negative tweet! Analysis is wrong :(
 DoctorWho_FR_: MattSmith MamaMia2 MamaMiaHereWeGoAgain LilyJamesSo excited about this Matt and Lily looks amazing as always  ht…
(0.48750000000000004, 0.825)
True
This is a negative tweet! Analysis is wrong :(
 ROUND 2MamaMia2
(-0.2, 0.4)
False
 pearcegardner2: The New Testament MamaMia2
(0.13636363636363635, 0.45454545454545453)
True
This is a negative tweet! Analysis is wrong :(
That CGI Cher in Ma

#### Positive Tweets
#### Also adding up all the sentiment scores of all the tweets

In [70]:
pandaPos = dfPosStop.toPandas()
movieScore = 0
countPos = 0
for index, row in pandaPos.iterrows():
    print row['tweet']
    print sentiment(row["tweetnostopwords"])
    print positive(row["tweetnostopwords"])
    if not positive(row["tweetnostopwords"]) and sentiment(row["tweetnostopwords"])[0] != 0.0:
        print "This is a postive tweet! Analysis is wrong :("
        countPos = countPos + 1
    score = sentiment(row['tweetnostopwords'])[0]
    movieScore = score + movieScore



Buzzing to see mamamia2 for the second time may even watch it a third time who knows 🤷‍♀️
(0.0, 0.0)
False
*walks into meeting at 9AM &amp; clears throat* “my my at Waterloo Napoleon did surrender oh yeah” abba…
(0.0, 0.0)
False
Nothing better than falling up the steps at the cinema in the middle of MamaMia2 and your mum telling the taxi man on the way home
(0.25, 0.25)
True
 samanthabaines: I feel like my cher impression from MamaMia2 hasn't got enough praise 🤣 What do you think sticktothedayjob
(0.0, 0.5)
False
That feel when all you want to do is go see MamaMia2 but it's sold out so you have to go to the next one 
(0.0, 0.0)
False
 hhgarcia41: Wow Thank you China &amp; thank you world In our 2nd week out our skyscrapermovie is not only the 1 movie on the planet be…
(0.05, 0.5)
False
This is a postive tweet! Analysis is wrong :(
When you are in a rush to see mammamiamovie MamaMia2
(0.0, 0.0)
False
🤗date night tonight with my bestest ever sonmamamia2  nd yes i did have to bribe him wi

(0.28409090909090906, 0.4375)
True
The New Testament MamaMia2
(0.13636363636363635, 0.45454545454545453)
True
 omid9: MamaMia2 TV ad now cut down to only this &amp; has crashed Coincidence HahahaNo Well played Universa…
(0.0, 0.0)
False
I feel like my cher impression from MamaMia2 hasn't got enough praise 🤣 What do you think sticktothedayjob
(0.0, 0.5)
False
Let the PAY🥳 commence Join us for BREAKFAST LUNCH or DINNER at BigCityDiner at WardVillage Pearlridge or…
(0.0, 0.0)
False
Seeing MamaMia2 - Good Having Waterloo and supertrooper in my head at 4:00 AM - Bad
(5.551115123125783e-17, 0.6333333333333333)
False
This is a postive tweet! Analysis is wrong :(
Going to see MamaMia2 with my mama mia
(0.0, 0.0)
False
 hhgarcia41: Wow Thank you China &amp; thank you world In our 2nd week out our skyscrapermovie is not only the 1 movie on the planet be…
(0.05, 0.5)
False
This is a postive tweet! Analysis is wrong :(
MamaMia2 Great movie feel good factor It is the holidays
(0.75, 0.675)
True
Wel

(-0.75, 0.75)
False
This is a postive tweet! Analysis is wrong :(
 SuzanneCordeiro: Absolutely loved everything about MamaMia2 and would want to go see it again 
(0.7, 0.8)
True
 omid9: I LOVE BLOCK CAPITALSTHERE SHOULD BE AN ANNUAL WORLD SHOUTY DAYGO SEE MamaMia2 AND ME marlowetheatre THIS SATURDAY GOO…
(0.5, 0.6)
True
don’t think there’s a dry eye in the cinema mamamia2 was soooo good🤧
(-0.06666666666666665, 0.6)
False
This is a postive tweet! Analysis is wrong :(
I fell in love with Lily James MamaMia2
(0.5, 0.6)
True
 omid9: I LOVE BLOCK CAPITALSTHERE SHOULD BE AN ANNUAL WORLD SHOUTY DAYGO SEE MamaMia2 AND ME marlowetheatre THIS SATURDAY GOO…
(0.5, 0.6)
True
 hhgarcia41: Wow Thank you China &amp; thank you world In our 2nd week out our skyscrapermovie is not only the 1 movie on the planet be…
(0.05, 0.5)
False
This is a postive tweet! Analysis is wrong :(
cher You did an amazing job love the film MamaMia2 ❤️❤️
(0.55, 0.75)
True
DrDPrabhat I am considering MamaMia2 I watched the fir

### Alright! Should I see this movie???

In [71]:
posrating = movieScore/(dfPos.count() - countPos)

print "Positive Rating Average Score: " 
print posrating
print "Number of Tweets Classified Incorrectly:"
print countPos 
if dfSad.count() != 0:
    sadrating = movieScoreSad/(dfSad.count() - countSad)
else: 
    sadrating = 0
print "Negative Rating Average Score:"
print sadrating
print "Number of Tweets Classified Incorrectly:"
print countSad


if posrating > abs(sadrating):
    print "People like this movie!"
elif posrating == abs(sadrating):
    print "People are split on this movie! Take a risk!"
elif posrating < abs(sadrating):
    print "People do not like this movie!"


Positive Rating Average Score: 
0.267460661319
Number of Tweets Classified Incorrectly:
77
Negative Rating Average Score:
-0.0549773982923
Number of Tweets Classified Incorrectly:
189
People like this movie!
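
The final cell divides each total score by the number of tweets that were not flagged as misclassified, then compares the two averages. The sketch below restates that logic as two small functions with a guard for the case where every tweet was flagged, which would otherwise divide by zero. The function names are illustrative, and returning 0.0 in the guarded case is an assumption, not something the notebook specifies.

```python
def average_score(total_score, tweet_count, misclassified):
    """Average the score over the tweets that were not flagged as misclassified."""
    denominator = tweet_count - misclassified
    if denominator <= 0:
        return 0.0  # assumed fallback when there is nothing left to average over
    return float(total_score) / denominator


def verdict(pos_rating, sad_rating):
    """Compare the positive average with the size of the negative average."""
    if pos_rating > abs(sad_rating):
        return "People like this movie!"
    if pos_rating == abs(sad_rating):
        return "People are split on this movie! Take a risk!"
    return "People do not like this movie!"


# The two averages reported in the output above
print(verdict(0.267460661319, -0.0549773982923))
```

With the averages from this run, the sketch reaches the same conclusion as the notebook.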
