# When Rotten Tomatoes Isn't Enough: Twitter Sentiment Analysis with DSE
------
<img src="images/allLogos.png" width="250" height="250">
#### A demo using DataStax Enterprise Analytics, Apache Cassandra, Apache Spark, Python, Jupyter Notebooks, Twitter tweets, pattern, and Sentiment Analysis

### Things to Set Up
#### Please work through the ***Installation of DSE and Jupyter Notebook*** for setup instructions


##### In your free time, try to get the Twitter Dev API up and running. Use the other notebooks for this. This example uses CSV files.

#### Add some variables so Python can find the DSE version of pyspark. Edit these variables to match your paths.

In [1]:
pysparkzip = "/opt/dse/resources/spark/python/lib/pyspark.zip"
py4jzip = "/opt/dse/resources/spark/python/lib/py4j-0.10.4-src.zip"

In [2]:
# Needed to be able to find the pyspark libraries
import sys
sys.path.append(pysparkzip)
sys.path.append(py4jzip)
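
The py4j zip name above carries a version number (`0.10.4` here) that can differ between DSE releases. As a minimal sketch, assuming the same directory layout as the paths above, the helper below looks the file up with a glob and fails early if either archive is missing. The name `add_pyspark_to_path` is hypothetical and is not part of DSE or pyspark.

```python
import glob
import os
import sys


def add_pyspark_to_path(spark_python_lib):
    """Append pyspark.zip and the py4j source zip found in spark_python_lib to sys.path.

    Hypothetical helper for illustration; assumes both archives live in one directory.
    """
    pyspark_zip = os.path.join(spark_python_lib, "pyspark.zip")
    # Match any py4j version instead of hard-coding one
    py4j_matches = sorted(glob.glob(os.path.join(spark_python_lib, "py4j-*-src.zip")))

    if not os.path.isfile(pyspark_zip) or not py4j_matches:
        raise IOError("pyspark.zip or py4j-*-src.zip not found in %s" % spark_python_lib)

    added = []
    for path in (pyspark_zip, py4j_matches[-1]):
        if path not in sys.path:
            sys.path.append(path)
        added.append(path)
    return added
```

With the paths used in this notebook, the call would be `add_pyspark_to_path("/opt/dse/resources/spark/python/lib")`.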

#### Import python packages -- all are required
##### Ignore any errors shown

In [26]:
import pandas
import cassandra
import pyspark
import re
import os
from IPython.display import display, Markdown
from pyspark.sql import SparkSession
from pyspark.ml.feature import Tokenizer, RegexTokenizer, StopWordsRemover
from pyspark.sql.functions import col, udf
from pyspark.sql.types import IntegerType
from pattern.en import sentiment, positive

#### Helper function to have nicer formatting of Spark DataFrames

In [4]:
#Helper for pretty formatting for Spark DataFrames
def showDF(df, limitRows =  5, truncate = True):
    if(truncate):
        pandas.set_option('display.max_colwidth', 50)
    else:
        pandas.set_option('display.max_colwidth', -1)
    pandas.set_option('display.max_rows', limitRows)
    display(df.limit(limitRows).toPandas())
    pandas.reset_option('display.max_rows')

# DataStax Enterprise Analytics
<img src="images/datastaxlogo.png" width="200" height="200">

### Creating Tables and Loading Tweets

#### Connect to DSE Analytics Cluster

In [6]:
from cassandra.cluster import Cluster

cluster = Cluster(['dse'])
session = cluster.connect()

#### Create Demo Keyspace 

In [7]:
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo1 
    WITH REPLICATION = 
    { 'class' : 'SimpleStrategy', 'replication_factor' : 1 }"""
)

<cassandra.cluster.ResultSet at 0x7feb7440e8d0>

#### Set keyspace 

In [8]:
session.set_keyspace('demo1')

#### Set the movie title variable. Change this to search for different movies!
##### Choices are: MamaMia2, FirstMan, AStarIsBorn, and MissionImpossible

In [10]:
movieTitle = "AStarIsBorn"

In [11]:
positiveNegative = ["pos", "sad"] 

#### Create two tables in Cassandra for the movie title: one for negative tweets and one for positive tweets. Twitter returns a lot of information with each call, but for this demo we will just use the twitter id (as our primary key, since it is unique) and the actual tweet. 
#### Is the twitter id the right value to distribute by? Consider your data model when choosing your primary key. 

In [12]:
for emotion in positiveNegative: 
    
    query = "CREATE TABLE IF NOT EXISTS movie_tweets_%s_%s (twitterid bigint, tweet text, PRIMARY KEY (twitterid))" % (movieTitle, emotion)
    print query
    session.execute(query)


CREATE TABLE IF NOT EXISTS movie_tweets_AStarIsBorn_pos (twitterid bigint, tweet text, PRIMARY KEY (twitterid))
CREATE TABLE IF NOT EXISTS movie_tweets_AStarIsBorn_sad (twitterid bigint, tweet text, PRIMARY KEY (twitterid))
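
Two details of the table names are easy to miss. The names are formatted straight into the CQL string, and unquoted CQL identifiers are folded to lowercase, which is why the Spark cells later in this notebook look the tables up with `movieTitle.lower()`. The sketch below builds the lowercase name once and rejects anything that is not a plain identifier. The helper name `tweet_table_name` is hypothetical, and the 48-character limit is an assumption about Cassandra's table-name rules that is worth checking against your version.

```python
import re

# Assumed rule: a letter followed by letters, digits or underscores, at most 48 characters
_IDENTIFIER = re.compile(r"^[a-z][a-z0-9_]{0,47}$")


def tweet_table_name(movie_title, emotion):
    """Return the lowercase table name used for a movie title and emotion ('pos' or 'sad')."""
    name = "movie_tweets_%s_%s" % (movie_title.lower(), emotion.lower())
    if not _IDENTIFIER.match(name):
        raise ValueError("not a safe table name: %r" % name)
    return name
```

For example, `tweet_table_name("AStarIsBorn", "pos")` gives `movie_tweets_astarisborn_pos`, the name Spark needs to find the table.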


### Load Twitter Tweets
#### Pulled from twitter and stored in CSV file
<img src="images/twitterlogo.png" width="100" height="100">

#### Load Negative Tweets from CSV file

In [13]:
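fileName = 'data/' + movieTitle + '_sad.csv'
# Build the INSERT once; only the table name is formatted in,
# the values are bound as parameters by the driver
query = "INSERT INTO movie_tweets_%s_sad (twitterid, tweet)" % (movieTitle)
query = query + " VALUES (%s, %s)"
# The with block closes the file when the loop finishes
with open(fileName, 'r') as input_file:
    for line in input_file:
        tweets = line.split(',')
        session.execute(query, (int(tweets[0]), tweets[1]))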

#### Load Positive Tweets from CSV File

In [14]:
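fileName1 = 'data/' + movieTitle + '_pos.csv'
# Build the INSERT once; only the table name is formatted in,
# the values are bound as parameters by the driver
query = "INSERT INTO movie_tweets_%s_pos (twitterid, tweet)" % (movieTitle)
query = query + " VALUES (%s, %s)"
# The with block closes the file when the loop finishes
with open(fileName1, 'r') as input_file1:
    for line in input_file1:
        tweets = line.split(',')
        session.execute(query, (int(tweets[0]), tweets[1]))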
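
The loaders above split each line on every comma and keep only the second field, so a tweet that itself contains a comma would be cut short. The sample files appear to have commas stripped already, so this does not bite here. If your own CSV files keep them, a sketch along these lines, using the standard `csv` module, keeps the whole tweet. The function name `parse_tweet_rows` is hypothetical, and the sketch is written for Python 3.

```python
import csv


def parse_tweet_rows(lines):
    """Yield (twitterid, tweet) pairs from CSV lines of the form 'id,tweet text'.

    Quoted fields are handled by csv.reader; unquoted extra commas are
    rejoined so the tweet text is not truncated. Blank lines are skipped.
    """
    for fields in csv.reader(lines):
        if len(fields) < 2:
            continue
        yield int(fields[0]), ",".join(fields[1:])
```

Each pair can then be passed to `session.execute(query, pair)` as in the cells above. Note that `csv.reader` drops the trailing line ending, so the stored tweets would no longer end in `\r\n`.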

#### Do a select * on each table and verify that the tweets have been inserted into each Cassandra table

In [15]:
for emotion in positiveNegative:
    print emotion
    query = 'SELECT * FROM movie_tweets_%s_%s limit 10' % (movieTitle, emotion)
    rows = session.execute(query)
    for user_row in rows:
        print (user_row.twitterid, user_row.tweet)

pos
(1052299039172059137, u' JamesArthur23: Watched AStarIsBorn last week and was quite moved by this song so thought I\u2019d have a crack at it Look out for the ful\u2026\r\n')
(1052297682151198720, u' GagaMediaDotNet: Lady Gaga promoting AStarIsBorn release in different countries is the cutest thing ever \u2764\ufe0f\r\n')
(1052298334055137281, u'Saw \u201cA Star is Born\u201d for the second time Why was I more emotional this time around than I was the first AStarIsBorn\r\n')
(1052314159805816832, u'That look  Haven\u2019t seen the movie but the music is good enough for me I can barely handle it \u2665\ufe0f AStarIsBorn\r\n')
(1052299060181266433, u"Okay here's my pitch for AStarIsBorn sequel First of all Ally actually DOES love again because the song she sang\u2026\r\n")
(1052298053493895169, u'A star is born was absolutely unbelievable Such a beautiful yet saddening storyline AStarIsBorn\r\n')
(1052312800637288448, u' erikarabara: You are so talented ladygaga  AStarIsBorn\r\n')
(1

## DSE Analytics with Apache Spark
<img src="images/sparklogo.png" width="150" height="200">

### Finally time for Apache Spark! 

#### Create a Spark session that is connected to Cassandra. From there, load each table into a Spark DataFrame and count the number of rows in each.

In [16]:
countTokens = udf(lambda words: len(words), IntegerType())

spark = SparkSession.builder.appName('demo').master("dse://dse:9042").getOrCreate()

tableNamePos = "movie_tweets_%s_pos" % (movieTitle.lower())
tableNameSad = "movie_tweets_%s_sad" % (movieTitle.lower())
tablepos = spark.read.format("org.apache.spark.sql.cassandra").options(table=tableNamePos, keyspace="demo1").load()
tablesad = spark.read.format("org.apache.spark.sql.cassandra").options(table=tableNameSad, keyspace="demo1").load()

print "Positive Table Count: "
print tablepos.count()
print "Negative Table Count: "
print tablesad.count()


Positive Table Count: 
29
Negative Table Count: 
31


#### Use Tokenizer to break up the sentences into individual words

In [17]:
tokenizerPos = Tokenizer(inputCol="tweet", outputCol="tweetwords")
tokenizedPos = tokenizerPos.transform(tablepos)

dfPos = tokenizedPos.select("tweet", "tweetwords").withColumn("tokens", countTokens(col("tweetwords")))

showDF(dfPos)

tokenizerSad = Tokenizer(inputCol="tweet", outputCol="tweetwords")
tokenizedSad = tokenizerSad.transform(tablesad)

dfSad = tokenizedSad.select("tweet", "tweetwords").withColumn("tokens", countTokens(col("tweetwords")))

showDF(dfSad)

Unnamed: 0,tweet,tweetwords,tokens
0,Finally get to see AStarIsBorn tomorrow \r\n,"[finally, get, to, see, astarisborn, tomorrow]",6
1,reenielarsen: I saw AStarIsBorn on Saturday a...,"[, reenielarsen:, i, saw, astarisborn, on, sat...",24
2,Just got back from seeing A Star is Born and c...,"[just, got, back, from, seeing, a, star, is, b...",23
3,JamesArthur23 ladygaga Cried so much when she ...,"[jamesarthur23, ladygaga, cried, so, much, whe...",22
4,BreakfastNews: It's completely overwhelming a...,"[, breakfastnews:, it's, completely, overwhelm...",18


Unnamed: 0,tweet,tweetwords,tokens
0,Finally get to see AStarIsBorn tomorrow \r\n,"[finally, get, to, see, astarisborn, tomorrow]",6
1,A Star Is Born was great but had no idea the e...,"[a, star, is, born, was, great, but, had, no, ...",17
2,I'm probably the only one in the world who has...,"[i'm, probably, the, only, one, in, the, world...",21
3,I'll never love again :( AStarIsBorn\r\n,"[i'll, never, love, again, :(, astarisborn]",6
4,i can't express this enought buT YOU ALL NEED ...,"[i, can't, express, this, enought, but, you, a...",22
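
Some rows above start with an empty token (for example `[, reenielarsen:, ...]`) and that empty string is counted in `tokens`. The plain-Python sketch below approximates what Spark's `Tokenizer` appears to do, which is to lowercase the text and split on single whitespace characters, to show where the empty token comes from. It is an illustration only, not Spark's implementation, and `spark_like_tokenize` is a hypothetical name.

```python
import re


def spark_like_tokenize(text):
    """Approximate Tokenizer: lowercase, split on each whitespace character.

    A leading space produces a leading empty token; trailing empty
    tokens are dropped, mirroring the output shown above.
    """
    tokens = re.split(r"\s", text.lower())
    while tokens and tokens[-1] == "":
        tokens.pop()
    return tokens


def whitespace_tokenize(text):
    """Alternative that never produces empty tokens: split on runs of whitespace."""
    return text.lower().split()
```

If the empty tokens matter for your counts, the `RegexTokenizer` already imported at the top of this notebook is worth trying in place of `Tokenizer`. As far as I know its defaults split on runs of whitespace and drop empty tokens, but check this against your Spark version.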


#### Use StopWordsRemover to remove all stop words. Compare the `tokens` and `notokens` columns: short tweets lose few words, while the longer tweets shown here lose close to half.

In [18]:
removerPos = StopWordsRemover(inputCol="tweetwords", outputCol="tweetnostopwords")
removedPos = removerPos.transform(dfPos)

dfPosStop = removedPos.select("tweet", "tweetwords", "tweetnostopwords").withColumn("tokens", countTokens(col("tweetwords"))).withColumn("notokens", countTokens(col("tweetnostopwords")))

showDF(dfPosStop)

removerSad = StopWordsRemover(inputCol="tweetwords", outputCol="tweetnostopwords")
removedSad = removerSad.transform(dfSad)

dfSadStop = removedSad.select("tweet", "tweetwords", "tweetnostopwords").withColumn("tokens", countTokens(col("tweetwords"))).withColumn("notokens", countTokens(col("tweetnostopwords")))

showDF(dfSadStop)

Unnamed: 0,tweet,tweetwords,tweetnostopwords,tokens,notokens
0,Finally get to see AStarIsBorn tomorrow \r\n,"[finally, get, to, see, astarisborn, tomorrow]","[finally, get, see, astarisborn, tomorrow]",6,5
1,reenielarsen: I saw AStarIsBorn on Saturday a...,"[, reenielarsen:, i, saw, astarisborn, on, sat...","[, reenielarsen:, saw, astarisborn, saturday, ...",24,13
2,Just got back from seeing A Star is Born and c...,"[just, got, back, from, seeing, a, star, is, b...","[got, back, seeing, star, born, challenge, rea...",23,12
3,JamesArthur23 ladygaga Cried so much when she ...,"[jamesarthur23, ladygaga, cried, so, much, whe...","[jamesarthur23, ladygaga, cried, much, sang, m...",22,11
4,BreakfastNews: It's completely overwhelming a...,"[, breakfastnews:, it's, completely, overwhelm...","[, breakfastnews:, completely, overwhelming, b...",18,11


Unnamed: 0,tweet,tweetwords,tweetnostopwords,tokens,notokens
0,Finally get to see AStarIsBorn tomorrow \r\n,"[finally, get, to, see, astarisborn, tomorrow]","[finally, get, see, astarisborn, tomorrow]",6,5
1,itsfuzzzy: Why do all Jacks die so sad :/ tit...,"[, itsfuzzzy:, why, do, all, jacks, die, so, s...","[, itsfuzzzy:, jacks, die, sad, :/, titanic, t...",20,15
2,_jamiemac: Wow what an emotional movie AStarI...,"[, _jamiemac:, wow, what, an, emotional, movie...","[, _jamiemac:, wow, emotional, movie, astarisb...",8,6
3,AStarIsBorn was brilliant and heartbreaking Is...,"[astarisborn, was, brilliant, and, heartbreaki...","[astarisborn, brilliant, heartbreaking, anythi...",12,7
4,Why do all Jacks die so sad :/ titanic thisisu...,"[why, do, all, jacks, die, so, sad, :/, titani...","[jacks, die, sad, :/, titanic, thisisus, astar...",18,13


### Sentiment Analysis using Python package Pattern

#### Convert each Spark DataFrame to a pandas DataFrame. This works as-is because we are working with a small dataset; for larger datasets, only convert to pandas if the data fits in memory. Then loop over each row and get the sentiment (polarity) score: anything above 0 is treated as positive and anything at or below 0 as negative. The `positive` function returns True if the tweet's polarity is at or above a threshold (0.1 by default). The `assessments` attribute shows which words were used to judge the tweet and the score of each word. For more info on how the scores are calculated: https://www.clips.uantwerpen.be/pages/pattern-en#sentiment

#### Negative Tweets

In [19]:
pandaSad = dfSadStop.toPandas()
movieScoreSad = 0
countSad = 0
numSadTweets = 0
sadList = list()

for index, row in pandaSad.iterrows():
    if positive(row["tweetnostopwords"], .1):
        countSad = countSad + 1
    scoreSad = sentiment(row['tweetnostopwords'])[0]
    if scoreSad <= 0:
        #print row['tweet']
        #print sentiment(row['tweetnostopwords'])[0]
        sadList.append((row['tweet'], sentiment(row["tweetnostopwords"]), positive(row["tweetnostopwords"]), \
                         sentiment(row['tweetnostopwords']).assessments))
        movieScoreSad = scoreSad + movieScoreSad
        
labels = ['Original Tweet', 'Sentiment Score', 'Positive', 'Assessments']
sadTweetScores = pandas.DataFrame.from_records(sadList, columns=labels)

sadTweetScores

Unnamed: 0,Original Tweet,Sentiment Score,Positive,Assessments
0,Finally get to see AStarIsBorn tomorrow \r\n,"(0.0, 1.0)",False,"[([finally], 0.0, 1.0, None)]"
1,itsfuzzzy: Why do all Jacks die so sad :/ tit...,"(-0.5, 1.0)",False,"[([sad], -0.5, 1.0, None), ([:/], -0.25, 1.0, ..."
2,Why do all Jacks die so sad :/ titanic thisisu...,"(-0.5, 1.0)",False,"[([sad], -0.5, 1.0, None), ([:/], -0.25, 1.0, ..."
3,High-key needs to watch AStarIsBorn :(((\r\n,"(0.0, 0.0)",False,[]
4,unpopular opinion: all the (what seemed like 5...,"(0.0, 0.0)",False,"[([overall], 0.0, 0.0, None)]"
5,Gotta wait til payday to go see AStarIsBorn ag...,"(-0.75, 1.0)",False,"[([:(], -0.75, 1.0, mood)]"
6,wow astarisborn was so sad but the part that k...,"(-0.3875, 0.9)",False,"[([wow], 0.1, 1.0, None), ([sad], -0.5, 1.0, N..."
7,ladygaga Spotify I Would like to watch AStarIs...,"(-0.75, 1.0)",False,"[([:(], -0.75, 1.0, mood)]"
8,I freaking refuse She belongs with BradleyCoop...,"(-0.75, 1.0)",False,"[([:(], -0.75, 1.0, mood)]"
9,Dreamt that I was watching AStarIsBorn but th...,"(-0.75, 1.0)",False,"[([:(], -0.75, 1.0, mood)]"


#### Positive Tweets
#### Also adding up all the sentiment scores of all the tweets

In [20]:
pandaPos = dfPosStop.toPandas()
movieScore = 0
countPos = 0
poslist = list()

for index, row in pandaPos.iterrows():
    if not positive(row["tweetnostopwords"]) and sentiment(row["tweetnostopwords"])[0] != 0.0:
        countPos = countPos + 1
    score = sentiment(row['tweetnostopwords'])[0]
    if score > 0:
        #print row['tweet']
        #print sentiment(row['tweetnostopwords'])[0]
        poslist.append((row['tweet'], sentiment(row["tweetnostopwords"]), positive(row["tweetnostopwords"]), \
                         sentiment(row['tweetnostopwords']).assessments))
        movieScore = score + movieScore
        
labels = ['Original Tweet', 'Sentiment Score', 'Positive', 'Assessments']
positiveTweetScores = pandas.DataFrame.from_records(poslist, columns=labels)

positiveTweetScores

Unnamed: 0,Original Tweet,Sentiment Score,Positive,Assessments
0,Just got back from seeing A Star is Born and c...,"(0.1, 0.15)",True,"[([back], 0.0, 0.0, None), ([real], 0.2, 0.3, ..."
1,JamesArthur23 ladygaga Cried so much when she ...,"(0.2, 0.2)",True,"[([much], 0.2, 0.2, None)]"
2,BreakfastNews: It's completely overwhelming a...,"(0.5, 1.0)",True,"[([completely, overwhelming], 0.5, 1.0, None)]"
3,I've just watched AStarIsBorn Now the question...,"(0.5, 0.6)",True,"[([much, love], 0.5, 0.6, None)]"
4,Saw “A Star is Born” for the second time Why w...,"(0.0833333333333, 0.327777777778)",False,"[([second], 0.0, 0.0, None), ([emotional], 0.0..."
5,That look Haven’t seen the movie but the musi...,"(0.25, 0.4)",True,"[([good], 0.7, 0.6, None), ([enough], 0.0, 0.5..."
6,Okay here's my pitch for AStarIsBorn sequel Fi...,"(0.416666666667, 0.477777777778)",True,"[([okay], 0.5, 0.5, None), ([first], 0.25, 0.3..."
7,A star is born was absolutely unbelievable Suc...,"(0.3, 1.0)",True,"[([absolutely, unbelievable], -0.25, 1.0, None..."
8,Well I just saw AStarIsBorn and I am not okay ...,"(0.5, 0.5)",True,"[([okay], 0.5, 0.5, None)]"
9,If someone could love me the way Jackson loves...,"(0.5, 0.6)",True,"[([love], 0.5, 0.6, None)]"


### Alright! Should I see this movie???

In [21]:
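# Guard on the real denominator so a table with no usable tweets cannot divide by zero
posdenominator = dfPos.count() - countPos
posrating = movieScore/posdenominator if posdenominator != 0 else 0

display(Markdown('**{}**  \n{}'.format("Positive Rating Average Score", posrating)))

saddenominator = dfSad.count() - countSad
sadrating = movieScoreSad/saddenominator if saddenominator != 0 else 0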

display(Markdown('**{}**  \n{}'.format("Negative Rating Average Score", sadrating)))

if posrating > abs(sadrating):
    display(Markdown('**{}**  \n'.format("People Like This Movie!")))
elif posrating == abs(sadrating):
    display(Markdown('**{}**  \n'.format("People are split! Take a chance!")))
elif posrating < abs(sadrating):
    display(Markdown('***{}***  \n'.format("People Do Not Like This Movie!")))
    

**Positive Rating Average Score**  
0.298913043478

**Negative Rating Average Score**  
-0.387698412698

***People Do Not Like This Movie!***  
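
The rating arithmetic is spread across three cells, so here is one reading of it as plain functions over a list of polarity scores. This is a sketch of how the cells above appear to compute the numbers, not a recommended scoring method. The function names are hypothetical, and the 0.1 threshold is taken from Pattern's `positive` default.

```python
def positive_rating(polarities, threshold=0.1):
    """Sum of scores above 0, divided by the row count minus the rows
    that are non-zero but below the threshold (the notebook's countPos)."""
    total = sum(p for p in polarities if p > 0)
    excluded = sum(1 for p in polarities if p < threshold and p != 0.0)
    denominator = len(polarities) - excluded
    return total / denominator if denominator else 0.0


def negative_rating(polarities, threshold=0.1):
    """Sum of scores at or below 0, divided by the row count minus the rows
    at or above the threshold (the notebook's countSad)."""
    total = sum(p for p in polarities if p <= 0)
    excluded = sum(1 for p in polarities if p >= threshold)
    denominator = len(polarities) - excluded
    return total / denominator if denominator else 0.0


def verdict(pos_rating, neg_rating):
    """Compare the positive rating with the magnitude of the negative rating."""
    if pos_rating > abs(neg_rating):
        return "People Like This Movie!"
    if pos_rating == abs(neg_rating):
        return "People are split! Take a chance!"
    return "People Do Not Like This Movie!"
```

Feeding the two averages shown above into `verdict(0.298913043478, -0.387698412698)` reproduces the "People Do Not Like This Movie!" result. Note that the denominators are not simple tweet counts, so the "average" is sensitive to how many tweets fall on each side of the threshold.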


### Is this answer what you were expecting? Either way, go back and take a look at 

#### What if we try this again, but remove some extra stop words? Let's remove: 
* Movie Title
* :)
* :(
* mission, impossible, star, first

#### Use StopWordsRemover with a custom list of stop words (but note that passing `stopWords` replaces the default list, so the default English stop words are no longer removed!)

In [22]:
stopwordList = [movieTitle,":(",":)", "mission", "impossible", "Star", "first"]

removerPos = StopWordsRemover(inputCol="tweetwords", outputCol="tweetnostopwords", stopWords=stopwordList)
removedPos = removerPos.transform(dfPos)

dfPosStop = removedPos.select("tweet", "tweetwords", "tweetnostopwords").withColumn("tokens", countTokens(col("tweetwords"))).withColumn("notokens", countTokens(col("tweetnostopwords")))

showDF(dfPosStop)

removerSad = StopWordsRemover(inputCol="tweetwords", outputCol="tweetnostopwords", stopWords=stopwordList)
removedSad = removerSad.transform(dfSad)

dfSadStop = removedSad.select("tweet", "tweetwords", "tweetnostopwords").withColumn("tokens", countTokens(col("tweetwords"))).withColumn("notokens", countTokens(col("tweetnostopwords")))

showDF(dfSadStop)

Unnamed: 0,tweet,tweetwords,tweetnostopwords,tokens,notokens
0,Finally get to see AStarIsBorn tomorrow \r\n,"[finally, get, to, see, astarisborn, tomorrow]","[finally, get, to, see, tomorrow]",6,5
1,reenielarsen: I saw AStarIsBorn on Saturday a...,"[, reenielarsen:, i, saw, astarisborn, on, sat...","[, reenielarsen:, i, saw, on, saturday, and, i...",24,23
2,Just got back from seeing A Star is Born and c...,"[just, got, back, from, seeing, a, star, is, b...","[just, got, back, from, seeing, a, is, born, a...",23,22
3,JamesArthur23 ladygaga Cried so much when she ...,"[jamesarthur23, ladygaga, cried, so, much, whe...","[jamesarthur23, ladygaga, cried, so, much, whe...",22,22
4,BreakfastNews: It's completely overwhelming a...,"[, breakfastnews:, it's, completely, overwhelm...","[, breakfastnews:, it's, completely, overwhelm...",18,18


Unnamed: 0,tweet,tweetwords,tweetnostopwords,tokens,notokens
0,Finally get to see AStarIsBorn tomorrow \r\n,"[finally, get, to, see, astarisborn, tomorrow]","[finally, get, to, see, tomorrow]",6,5
1,crinda54: :( Well at least Shallow can qualif...,"[, crinda54:, :(, well, at, least, shallow, ca...","[, crinda54:, well, at, least, shallow, can, q...",16,13
2,:( Well at least Shallow can qualify if not AS...,"[:(, well, at, least, shallow, can, qualify, i...","[well, at, least, shallow, can, qualify, if, n...",14,11
3,itsfuzzzy: Why do all Jacks die so sad :/ tit...,"[, itsfuzzzy:, why, do, all, jacks, die, so, s...","[, itsfuzzzy:, why, do, all, jacks, die, so, s...",20,18
4,_jamiemac: Wow what an emotional movie AStarI...,"[, _jamiemac:, wow, what, an, emotional, movie...","[, _jamiemac:, wow, what, an, emotional, movie]",8,7
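
In the output above, words like `to` survive because the custom list replaced the defaults rather than extending them. The plain-Python sketch below mimics the case-insensitive filtering to show the difference between using only the custom list and using the defaults plus the custom list. `DEFAULT_STOP_WORDS` is a tiny illustrative subset, not Spark's real list, and `remove_stop_words` is a hypothetical helper.

```python
# Tiny illustrative subset; Spark's real default English list is much longer
DEFAULT_STOP_WORDS = ["a", "an", "and", "i", "is", "on", "so", "the", "to", "was"]

CUSTOM_STOP_WORDS = ["AStarIsBorn", ":(", ":)", "mission", "impossible", "Star", "first"]


def remove_stop_words(tokens, stop_words):
    """Drop tokens that match any stop word, ignoring case."""
    lowered = set(word.lower() for word in stop_words)
    return [token for token in tokens if token.lower() not in lowered]


tokens = ["finally", "get", "to", "see", "astarisborn", "tomorrow"]

# Custom list only: the movie title goes, but "to" stays
custom_only = remove_stop_words(tokens, CUSTOM_STOP_WORDS)

# Defaults plus custom list: both are removed
combined = remove_stop_words(tokens, DEFAULT_STOP_WORDS + CUSTOM_STOP_WORDS)
```

In PySpark the same effect should be achievable by passing `StopWordsRemover.loadDefaultStopWords("english") + stopwordList` as `stopWords`, though the sentiment scores will then differ from the ones recorded below.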


#### Sad Tweets: Convert to Pandas, use Pattern to get sentiment, and get the average

In [23]:
pandaSad = dfSadStop.toPandas()
movieScoreSad = 0
countSad = 0
numSadTweets = 0
sadList = list()

for index, row in pandaSad.iterrows():
    if positive(row["tweetnostopwords"], .1):
        countSad = countSad + 1
    scoreSad = sentiment(row['tweetnostopwords'])[0]
    if scoreSad <= 0:
        sadList.append((row['tweet'], sentiment(row["tweetnostopwords"]), positive(row["tweetnostopwords"]), \
                         sentiment(row['tweetnostopwords']).assessments))
        movieScoreSad = scoreSad + movieScoreSad
        
labels = ['Original Tweet', 'Sentiment Score', 'Positive', 'Assessments']
sadTweetScores = pandas.DataFrame.from_records(sadList, columns=labels)

sadTweetScores

Unnamed: 0,Original Tweet,Sentiment Score,Positive,Assessments
0,Finally get to see AStarIsBorn tomorrow \r\n,"(0.0, 1.0)",False,"[([finally], 0.0, 1.0, None)]"
1,I'm probably the only one in the world who has...,"(-0.2, 0.8)",False,"[([only], 0.0, 1.0, None), ([simply, poor], -0..."
2,I'll never love again :( AStarIsBorn\r\n,"(-0.25, 0.6)",False,"[([never, love], -0.25, 0.6, None)]"
3,crinda54: :( Well at least Shallow can qualif...,"(-0.316666666667, 0.45)",False,"[([least], -0.3, 0.4, None), ([shallow], -0.33..."
4,:( Well at least Shallow can qualify if not AS...,"(-0.316666666667, 0.45)",False,"[([least], -0.3, 0.4, None), ([shallow], -0.33..."
5,Gotta wait til payday to go see AStarIsBorn ag...,"(0.0, 0.0)",False,[]
6,wow astarisborn was so sad but the part that k...,"(-0.075, 0.775)",False,"[([wow], 0.1, 1.0, None), ([sad], -0.5, 1.0, N..."
7,ladygaga Spotify I Would like to watch AStarIs...,"(0.0, 0.0)",False,[]
8,I freaking refuse She belongs with BradleyCoop...,"(0.0, 0.0)",False,[]
9,Dreamt that I was watching AStarIsBorn but th...,"(0.0, 0.0)",False,[]


#### Positive Tweets: Convert to Pandas, use Pattern to get sentiment, and get the average

In [24]:
pandaPos = dfPosStop.toPandas()
movieScore = 0
countPos = 0
poslist = list()

for index, row in pandaPos.iterrows():
    if not positive(row["tweetnostopwords"]) and sentiment(row["tweetnostopwords"])[0] != 0.0:
        countPos = countPos + 1
    score = sentiment(row['tweetnostopwords'])[0]
    if score > 0:
        poslist.append((row['tweet'], sentiment(row["tweetnostopwords"]), positive(row["tweetnostopwords"]), \
                         sentiment(row['tweetnostopwords']).assessments))
        movieScore = score + movieScore
        
labels = ['Original Tweet', 'Sentiment Score', 'Positive', 'Assessments']
positiveTweetScores = pandas.DataFrame.from_records(poslist, columns=labels)

positiveTweetScores

Unnamed: 0,Original Tweet,Sentiment Score,Positive,Assessments
0,Just got back from seeing A Star is Born and c...,"(0.1, 0.15)",True,"[([back], 0.0, 0.0, None), ([real], 0.2, 0.3, ..."
1,erikarabara: You are so talented ladygaga AS...,"(0.7, 0.9)",True,"[([talented], 0.7, 0.9, None)]"
2,ladygaga and Bradley Cooper have sent me on an...,"(0.3, 0.775)",True,"[([emotional], 0.0, 0.65, None), ([amazing], 0..."
3,_jamiemac: Wow what an emotional movie AStarI...,"(0.05, 0.825)",False,"[([wow], 0.1, 1.0, None), ([emotional], 0.0, 0..."
4,AStarIsBorn was brilliant and heartbreaking Is...,"(0.95, 1.0)",True,"[([brilliant], 0.9, 1.0, None), ([perfectly], ..."
5,Wow first ladygaga mastered pop then she maste...,"(0.1, 1.0)",True,"[([wow], 0.1, 1.0, None)]"
6,JamesArthur23 ladygaga Cried so much when she ...,"(0.2, 0.2)",True,"[([much], 0.2, 0.2, None)]"
7,BreakfastNews: It's completely overwhelming a...,"(0.5, 1.0)",True,"[([completely, overwhelming], 0.5, 1.0, None)]"
8,I've just watched AStarIsBorn Now the question...,"(0.35, 0.4)",True,"[([much], 0.2, 0.2, None), ([love], 0.5, 0.6, ..."
9,A Star Is Born was made in 1937 1954 1976 and ...,"(0.25, 1.0)",True,"[([sadly, not], 0.25, 1.0, None)]"


#### Okay let's see if there was a difference! 

In [25]:
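# Guard on the real denominator so a table with no usable tweets cannot divide by zero
posdenominator1 = dfPos.count() - countPos
posrating1 = movieScore/posdenominator1 if posdenominator1 != 0 else 0

display(Markdown('**{}**  \n{}'.format("Positive Rating Original Average Score", posrating)))
display(Markdown('**{}**  \n{}'.format("Positive Rating Average Score", posrating1)))

saddenominator1 = dfSad.count() - countSad
sadrating1 = movieScoreSad/saddenominator1 if saddenominator1 != 0 else 0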

display(Markdown('**{}**  \n{}'.format("Negative Rating Original Average Score", sadrating)))
display(Markdown('**{}**  \n{}'.format("Negative Rating Average Score", sadrating1)))

if posrating1 > abs(sadrating1):
    display(Markdown('**{}**  \n'.format("People Like This Movie!")))
elif posrating1 == abs(sadrating1):
    display(Markdown('**{}**  \n'.format("People are split! Take a chance!")))
elif posrating1 < abs(sadrating1):
    display(Markdown('***{}***  \n'.format("People Do Not Like This Movie!")))

**Positive Rating Original Average Score**  
0.298913043478

**Positive Rating Average Score**  
0.254398148148

**Negative Rating Original Average Score**  
-0.387698412698

**Negative Rating Average Score**  
-0.133796296296

**People Like This Movie!**  
