# Sentiment Analysis
## Use *TextBlob*

Generate sentiment scores for newspaper articles covering events in Syria from 2010 to 2017.

In [1]:
%matplotlib inline

In [13]:
import pandas as pd
import numpy as np
from numpy import nan
import seaborn as sns
import matplotlib
import matplotlib.pyplot as plt
import os

sns.set_context('notebook')
sns.set_style('whitegrid')

## Data Loading

In [3]:
df = pd.read_csv('CleanLexisNexis.csv', parse_dates=['date'])

In [34]:
df.dtypes

publication                object
date               datetime64[ns]
title                      object
length                      int64
publicationtype            object
text                       object
year                        int64
month                       int64
day                         int64
dtype: object

In [5]:
df.head(4)

Unnamed: 0,publication,date,title,length,publicationtype,text,year,month,day
0,The Atlanta Journal-Constitution,2010-01-03,Five pressing questions to answer in 2010,747,Newspapers,Will President Barack Obama regain his momentu...,2010,1,3
1,BBC,2010-01-04,"Saudi foreign minister says Israel ""spoiled ch...",2196,Transcript,Text of report by Saudi-owned leading pan-Arab...,2010,1,4
2,BBC,2010-01-08,Highlights of Iran parliamentary session.,1123,Transcript,Excerpt from report on parliamentary proceedin...,2010,1,8
3,Right Vision News,2010-01-09,Jordan:Way out for Obama,852,Newspaper,"Pakistan, Jan. 09 -- These are the worst of ti...",2010,1,9


## 1. Sentiment Analysis

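This pass uses TextBlob's default settings. The default analyzer (`PatternAnalyzer`) scores text against a built-in lexicon and returns a polarity in [-1, 1] and a subjectivity in [0, 1]. The alternative `NaiveBayesAnalyzer`, not used here, is the one trained on movie reviews.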

In [16]:
from textblob import TextBlob
from textblob.classifiers import NaiveBayesClassifier

In [41]:
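# Score every article with TextBlob's default analyzer.
# polarity lies in [-1, 1]; subjectivity lies in [0, 1].
df['polarity'] = df['text'].apply(lambda text: TextBlob(text).sentiment.polarity)
df['subjectivity'] = df['text'].apply(lambda text: TextBlob(text).sentiment.subjectivity)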
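
Each article above is parsed twice, once per column. A minimal sketch of scoring each text once and splitting the result into both columns follows. `add_sentiment_columns` and its `score` parameter are hypothetical names introduced for illustration; they are not part of TextBlob or pandas.

```python
import pandas as pd


def add_sentiment_columns(frame, score, text_col="text"):
    """Return a copy of `frame` with polarity and subjectivity columns.

    `score` is any callable mapping a string to a (polarity, subjectivity)
    pair; with TextBlob that could be `lambda t: TextBlob(t).sentiment`.
    """
    # Score each text exactly once and keep both values.
    pairs = [tuple(score(text)) for text in frame[text_col]]
    out = frame.copy()
    out["polarity"] = [polarity for polarity, _ in pairs]
    out["subjectivity"] = [subjectivity for _, subjectivity in pairs]
    return out
```

With the notebook's data this could be called as `df = add_sentiment_columns(df, lambda t: TextBlob(t).sentiment)`.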


In [42]:
df.loc[2000:2010]  # .ix was removed in pandas 1.0; .loc slices by index label

Unnamed: 0,publication,date,title,length,publicationtype,text,year,month,day,polarity,subjectivity
2000,The Christian Science Monitor,2012-07-23,Obama vs. Romney: VFW hosting campaign side tr...,779,Newspaper,President Obama told the Veterans of Foreign W...,2012,7,23,0.04717,0.302023
2001,The Times (London),2012-07-24,Romney will be courted by Labour as he forges ...,613,Newspaper,Ed Miliband will try to find common cause with...,2012,7,24,0.098021,0.384447
2002,The Times (London),2012-07-24,Romney will be courted by Labour as he forges ...,615,Newspaper,Ed Miliband will try to find common cause with...,2012,7,24,0.098021,0.384447
2003,"The Age (Melbourne, Australia)",2012-07-24,America's foreign fantasy,970,Newspaper,Washington's global ambitions exceed its power...,2012,7,24,0.060325,0.386745
2004,BBC,2012-07-24,Iran MP says foreign interference in Syria to ...,470,Transcript,Text of report on interview with Javad Jahangi...,2012,7,24,0.060412,0.327679
2005,BBC,2012-07-24,Programme summary of Afghan Tolo TV news in Da...,621,Transcript,A. News Headlines B. Home News 1. 0030 A repor...,2012,7,24,-0.03142,0.26537
2006,Brattleboro Reformer (Vermont),2012-07-25,World in Brief,1347,Newspaper,"Wednesday July 25, 2012 Nonpartisan budget off...",2012,7,25,0.097203,0.410523
2007,Scotsman,2012-07-25,Tavish Scott: This diplomatic stand-off result...,464,Newspaper,The greatest show on Earth is about to begin. ...,2012,7,25,0.147222,0.331574
2008,The Christian Science Monitor,2012-07-25,What would 'President Romney' do about Syria?;...,951,Newspaper,"Judging from headlines, one might think Mitt R...",2012,7,25,0.074726,0.363221
2009,Charleston Gazette (West Virginia),2012-07-25,NATIONAL Briefs,932,Newspaper,"Romney says Obama threat to security RENO, Nev...",2012,7,25,0.089305,0.389729


In [50]:
testimonial = TextBlob("Textblob is kind of terrible. What great fun! Of course it could be that they don't know how to use it. Look at the cats...they're so fun.")

l = []
for s in testimonial.sentences:
    l.append(s.sentiment.polarity)
    
print(l)
print("\n", "Average of sentences:", sum(l)/len(l))
print("\n", "Total score:", testimonial.sentiment.polarity)

# Averaging the per-sentence scores gives a different result than scoring the whole text at once.

[-0.2, 0.5875, 0.0, 0.3]

 Average of sentences: 0.171875

 Total score: 0.215
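
The gap between the two numbers is what one would expect if the whole-text score pools every scored phrase, so that sentences with more scored phrases carry more weight, while the sentence average gives each sentence equal weight. The sketch below illustrates that arithmetic with made-up phrase scores. It is an assumption about the mechanism, not a reproduction of TextBlob's internals or of the output above.

```python
def mean(values):
    """Arithmetic mean, defined as 0.0 for an empty list (an unscored sentence)."""
    return sum(values) / len(values) if values else 0.0


# Hypothetical phrase-level scores, one inner list per sentence.
sentences = [
    [-0.5],        # one scored phrase
    [0.75, 0.25],  # two scored phrases
    [],            # nothing scored
    [0.5],         # one scored phrase
]

# Equal weight per sentence: average the per-sentence means.
mean_of_sentence_means = mean([mean(s) for s in sentences])

# Equal weight per phrase: pool all phrases, then average once.
pooled_mean = mean([score for s in sentences for score in s])

print(mean_of_sentence_means, pooled_mean)  # prints: 0.125 0.25
```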


In [51]:
testimonial = TextBlob("Textblob is kind of terrible. What great fun! Of course it could be that they don't know how to use it. Look at the cats...they're so fun.", classifier=cl)
testimonial.classify()
l = []
for s in testimonial.sentences:
    l.append(s.sentiment.polarity)
    
print(l)
print("\n", "Average of sentences:", sum(l)/len(l))
print("\n", "Total score:", testimonial.sentiment.polarity)

# classifier=cl (trained in section 2) only changes the result of classify();
# .sentiment still uses the default analyzer, so these scores match the previous cell.

[-0.2, 0.5875, 0.0, 0.3]

 Average of sentences: 0.171875

 Total score: 0.215


## 2. Sentiment Analysis

Use the positive and negative word lists provided by Andy Kim, author of *Can Big Data Forecast North Korean Military Aggression?*

#### Append the positive and negative lists together

In [29]:
os.chdir('/Users/laurieottehenning/Documents/Georgetown Data Science /Capstone/Harvard Pos:Neg')

pos = pd.read_csv('Harvard_Positive.csv', header=None)
neg = pd.read_csv('Harvard_Negative.csv', header=None)

wordlist = pd.concat([pos, neg])
wordlist.head(2)

wordlist.to_csv("wordlist.csv", header=False, index=False)
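
TextBlob's CSV format for classifier training expects one `text,label` pair per row. If the two Harvard files each hold a single column of words (an assumption; their layout is not shown here), a plain concatenation would lose which list a word came from. A minimal sketch of attaching labels before writing follows. The function name and the `pos`/`neg` labels are hypothetical choices.

```python
import csv


def write_labeled_wordlist(pos_words, neg_words, path):
    """Write `word,label` rows, one per word, and return the row count."""
    rows = [(word, "pos") for word in pos_words]
    rows += [(word, "neg") for word in neg_words]
    with open(path, "w", newline="", encoding="utf-8") as fp:
        csv.writer(fp).writerows(rows)
    return len(rows)
```

Under the single-column assumption, the call would look like `write_labeled_wordlist(pos[0], neg[0], "wordlist.csv")`, where `pos[0]` is the first column of a frame read with `header=None`.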

#### Train a classifier on the word list, to be applied to the newspaper data

In [30]:
with open('wordlist.csv') as fp:
    cl = NaiveBayesClassifier(fp, format="csv")

In [39]:
# Note: classifier=cl only affects TextBlob.classify(); .sentiment is still computed
# by the default analyzer, so these columns equal the values from section 1.
df['polarity'] = df.apply(lambda x: TextBlob(x['text'], classifier=cl).sentiment.polarity, axis=1)
df['subjectivity'] = df.apply(lambda x: TextBlob(x['text']).sentiment.subjectivity, axis=1)
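
Because `.sentiment` does not consult the `classifier` argument, scores from the trained classifier have to be requested from it directly. The sketch below turns class probabilities into a signed score in [-1, 1]. It assumes the training labels are `pos` and `neg` (the label names in `wordlist.csv` are not shown) and relies on the `prob_classify` method of TextBlob classifiers, whose result exposes `prob(label)`. `classifier_polarity` is a hypothetical helper name.

```python
def classifier_polarity(text, classifier, pos_label="pos", neg_label="neg"):
    """Signed score in [-1, 1]: P(pos_label) minus P(neg_label).

    `classifier` is any object whose `prob_classify(text)` returns a
    distribution with a `prob(label)` method, as TextBlob's
    NaiveBayesClassifier does. The label names are assumptions.
    """
    dist = classifier.prob_classify(text)
    return dist.prob(pos_label) - dist.prob(neg_label)
```

Applied to the notebook's data, this might read `df['nb_polarity'] = df['text'].apply(lambda t: classifier_polarity(t, cl))`, where `nb_polarity` is an illustrative column name.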

In [40]:
df.loc[2000:2010]  # .ix was removed in pandas 1.0; .loc slices by index label

Unnamed: 0,publication,date,title,length,publicationtype,text,year,month,day,polarity,subjectivity
2000,The Christian Science Monitor,2012-07-23,Obama vs. Romney: VFW hosting campaign side tr...,779,Newspaper,President Obama told the Veterans of Foreign W...,2012,7,23,0.04717,0.302023
2001,The Times (London),2012-07-24,Romney will be courted by Labour as he forges ...,613,Newspaper,Ed Miliband will try to find common cause with...,2012,7,24,0.098021,0.384447
2002,The Times (London),2012-07-24,Romney will be courted by Labour as he forges ...,615,Newspaper,Ed Miliband will try to find common cause with...,2012,7,24,0.098021,0.384447
2003,"The Age (Melbourne, Australia)",2012-07-24,America's foreign fantasy,970,Newspaper,Washington's global ambitions exceed its power...,2012,7,24,0.060325,0.386745
2004,BBC,2012-07-24,Iran MP says foreign interference in Syria to ...,470,Transcript,Text of report on interview with Javad Jahangi...,2012,7,24,0.060412,0.327679
2005,BBC,2012-07-24,Programme summary of Afghan Tolo TV news in Da...,621,Transcript,A. News Headlines B. Home News 1. 0030 A repor...,2012,7,24,-0.03142,0.26537
2006,Brattleboro Reformer (Vermont),2012-07-25,World in Brief,1347,Newspaper,"Wednesday July 25, 2012 Nonpartisan budget off...",2012,7,25,0.097203,0.410523
2007,Scotsman,2012-07-25,Tavish Scott: This diplomatic stand-off result...,464,Newspaper,The greatest show on Earth is about to begin. ...,2012,7,25,0.147222,0.331574
2008,The Christian Science Monitor,2012-07-25,What would 'President Romney' do about Syria?;...,951,Newspaper,"Judging from headlines, one might think Mitt R...",2012,7,25,0.074726,0.363221
2009,Charleston Gazette (West Virginia),2012-07-25,NATIONAL Briefs,932,Newspaper,"Romney says Obama threat to security RENO, Nev...",2012,7,25,0.089305,0.389729
