# Sentiment Analysis Using TextBlob

Let's take a look at the polarity (positive or negative) and the subjectivity of a piece of text.

![](img/Covfefe-Donald-Trump-811379.jpg)


In [1]:
import pandas as pd
import random
from textblob import TextBlob, Word
import requests
from bs4 import BeautifulSoup
import time
import urllib
import sys

from selenium import webdriver
from selenium.webdriver.common.keys import Keys

import os
chromedriver = "/Applications/chromedriver"
os.environ["webdriver.chrome.driver"] = chromedriver


In [2]:
# Fetch Donald Trump's Twitter page with requests, then parse the tweets out of it with Beautiful Soup

donald_url = 'https://twitter.com/realDonaldTrump?ref_src=twsrc%5Egoogle%7Ctwcamp%5Eserp%7Ctwgr%5Eauthor'
donald_response = requests.get(donald_url)

In [3]:
donald_page = donald_response.text

In [4]:
soup = BeautifulSoup(donald_page,"lxml")

In [5]:
dt_tweets = []

tweets = soup.find_all(class_="TweetTextSize TweetTextSize--normal js-tweet-text tweet-text")


In [6]:
# Collect the text of each scraped tweet, then report how many were found

for t in tweets:
    dt_tweets.append(t.text)

len(dt_tweets)


20

In [7]:
# Print a random tweet from the set of tweets that were scraped

# randrange(n) yields a valid index 0..n-1; randint(1, n) never picks index 0 and can return n (IndexError)
x = random.randrange(len(dt_tweets))
print(x)

dt_tweets[x]

3


'As I head out to a very important NATO meeting, I see that FBI Lover/Agent Lisa Page is dodging a Subpoena & is refusing to show up and testify. What can she possibly say about her statements and lies. So much corruption on the other side. Where is the Attorney General? @FoxNews'

In [8]:
# TextBlob implementation

blob = []

for d in dt_tweets:
    blob.append(TextBlob(d))

In [9]:
# Sanity check: len(blob) == len(tweets)?
len(blob)

20

In [10]:
blob[x].sentences

[Sentence("As I head out to a very important NATO meeting, I see that FBI Lover/Agent Lisa Page is dodging a Subpoena & is refusing to show up and testify."),
 Sentence("What can she possibly say about her statements and lies."),
 Sentence("So much corruption on the other side."),
 Sentence("Where is the Attorney General?"),
 Sentence("@FoxNews")]

In [11]:
# POS Tagging
# for tag info: https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html

for word, tag in blob[x].tags:
    print(word, tag)

As IN
I PRP
head VBP
out RP
to TO
a DT
very RB
important JJ
NATO NNP
meeting NN
I PRP
see VBP
that IN
FBI NNP
Lover/Agent NNP
Lisa NNP
Page NNP
is VBZ
dodging VBG
a DT
Subpoena NNP
& CC
is VBZ
refusing VBG
to TO
show VB
up RP
and CC
testify VB
What WP
can MD
she PRP
possibly RB
say VBP
about IN
her PRP$
statements NNS
and CC
lies NNS
So RB
much JJ
corruption NN
on IN
the DT
other JJ
side NN
Where WRB
is VBZ
the DT
Attorney NNP
General NNP
@ JJ
FoxNews NNS
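
Penn Treebank tags share prefixes: `NN`, `NNS`, `NNP` and `NNPS` are all nouns, and `JJ`, `JJR` and `JJS` are all adjectives. As a minimal sketch, the tagger output can be grouped by prefix in plain Python. The pairs below are copied from the end of the output above, and `words_with_tag_prefix` is a hypothetical helper, not part of TextBlob.

```python
# (word, tag) pairs copied from the tail of the tagger output above.
tagged = [
    ("So", "RB"), ("much", "JJ"), ("corruption", "NN"), ("on", "IN"),
    ("the", "DT"), ("other", "JJ"), ("side", "NN"), ("Where", "WRB"),
    ("is", "VBZ"), ("the", "DT"), ("Attorney", "NNP"), ("General", "NNP"),
]


def words_with_tag_prefix(pairs, prefix):
    """Return the words whose Penn Treebank tag starts with `prefix`."""
    return [word for word, tag in pairs if tag.startswith(prefix)]


nouns = words_with_tag_prefix(tagged, "NN")       # matches NN, NNS, NNP, NNPS
adjectives = words_with_tag_prefix(tagged, "JJ")  # matches JJ, JJR, JJS

print(nouns)
print(adjectives)
```

Matching on the prefix rather than on an exact tag keeps plural and proper nouns, which an equality test against `'NN'` would drop.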


In [12]:
# 'phrase' avoids shadowing the conventional numpy alias 'np'
for phrase in blob[x].noun_phrases:
    print(phrase)

important nato meeting
lover/agent lisa page
subpoena
foxnews


In [23]:
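# n-gram analysis
# Note: the output below comes from a run in which x indexed a different tweet than the one printed earlier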

for ngram in blob[x].ngrams(2):
    print(ngram)

['I', 'am']
['am', 'in']
['in', 'Brussels']
['Brussels', 'but']
['but', 'always']
['always', 'thinking']
['thinking', 'about']
['about', 'our']
['our', 'farmers']
['farmers', 'Soy']
['Soy', 'beans']
['beans', 'fell']
['fell', '50']
['50', 'from']
['from', '2012']
['2012', 'to']
['to', 'my']
['my', 'election']
['election', 'Farmers']
['Farmers', 'have']
['have', 'done']
['done', 'poorly']
['poorly', 'for']
['for', '15']
['15', 'years']
['years', 'Other']
['Other', 'countries']
['countries', '’']
['’', 'trade']
['trade', 'barriers']
['barriers', 'and']
['and', 'tariffs']
['tariffs', 'have']
['have', 'been']
['been', 'destroying']
['destroying', 'their']
['their', 'businesses']
['businesses', 'I']
['I', 'will']
['will', 'open']


In [13]:
# Sentiment Analysis

dt_sentiments = []
for b in blob:
    print(b, b.sentiment)
    print("\n")
    dt_sentiments.append(b.sentiment)

#NATOSummit2018 Press Conference in Brussels, Belgium:https://www.pscp.tv/w/bhcrPDFvTlFsTFJub1dwUXd8MXluSk9ZWEFuTUVLUvfjYbF9Qv07R34Ix3JItlgPIZlPHPx1wWtMCbOQ-7M0?t=4s … Sentiment(polarity=0.0, subjectivity=0.0)


....On top of it all, Germany just started paying Russia, the country they want protection from, Billions of Dollars for their Energy needs coming out of a new pipeline from Russia. Not acceptable! All NATO Nations must meet their 2% commitment, and that must ultimately go to 4%! Sentiment(polarity=0.22348484848484848, subjectivity=0.6515151515151515)


Presidents have been trying unsuccessfully for years to get Germany and other rich NATO Nations to pay more toward their protection from Russia. They pay only a fraction of their cost. The U.S. pays tens of Billions of Dollars too much to subsidize Europe, and loses Big on Trade! Sentiment(polarity=0.09285714285714285, subjectivity=0.4321428571428572)


As I head out to a very important NATO meeting, I see that FBI Lover/Agent L

In [14]:
dt_sentiments[x]

Sentiment(polarity=0.129, subjectivity=0.615)
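
With TextBlob's default analyzer, polarity ranges from -1.0 (negative) to 1.0 (positive) and subjectivity from 0.0 (objective) to 1.0 (subjective). To build intuition for how a lexicon-based score can arise, here is a toy sketch that averages per-word scores. It is not TextBlob's implementation, and the lexicon values are invented for illustration.

```python
# Hypothetical mini-lexicon: word -> (polarity, subjectivity).
# These scores are made up for this sketch and are not TextBlob's values.
TOY_LEXICON = {
    "important": (0.4, 1.0),
    "great": (0.8, 0.75),
    "corruption": (-0.6, 0.8),
    "lies": (-0.5, 0.7),
}


def toy_sentiment(text):
    """Average the scores of the words that appear in TOY_LEXICON."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    hits = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    if not hits:
        return (0.0, 0.0)  # no known words: treated as neutral and objective
    polarity = sum(p for p, _ in hits) / len(hits)
    subjectivity = sum(s for _, s in hits) / len(hits)
    return (polarity, subjectivity)


print(toy_sentiment("So much corruption on the other side."))
print(toy_sentiment("Where is the Attorney General?"))
```

In this toy model a text with no scored words comes out as (0.0, 0.0), which is one plausible reading of why the link-only tweet in the output above scores zero on both measures.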

In [15]:
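# Sentiment is a namedtuple, so use its field names (and avoid reusing the name x)
dt_polarity = pd.Series([s.polarity for s in dt_sentiments])
dt_subjectivity = pd.Series([s.subjectivity for s in dt_sentiments])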

In [16]:
dt_polarity.describe()

count    20.000000
mean      0.062774
std       0.307636
min      -0.500000
25%      -0.086250
50%       0.037500
75%       0.155871
max       0.611111
dtype: float64

In [17]:
dt_subjectivity.describe()

count    20.000000
mean      0.474019
std       0.198011
min       0.000000
25%       0.368750
50%       0.483333
75%       0.593750
max       0.850000
dtype: float64

In [29]:
oprah_url = 'https://twitter.com/Oprah?lang=en'
oprah_response = requests.get(oprah_url)

In [30]:
oprah_page = oprah_response.text

In [31]:
o_soup = BeautifulSoup(oprah_page,"lxml")

In [32]:
op_tweets = []

o_tweets = o_soup.find_all(class_="TweetTextSize TweetTextSize--normal js-tweet-text tweet-text")

In [33]:
for t in o_tweets:
    op_tweets.append(t.text)

len(op_tweets)

20

In [34]:
# randrange(n) yields a valid index 0..n-1; randint(1, n) can return n and raise IndexError
y = random.randrange(len(op_tweets))
print(y)

op_tweets[y]

15


'Thanks for being a part of the TREND y’all! #LOVEis'

In [35]:
o_blob = []

for o in op_tweets:
    o_blob.append(TextBlob(o))

In [36]:
len(o_blob)

20

In [37]:
o_blob[y].sentences

[Sentence("Thanks for being a part of the TREND y’all!"), Sentence("#LOVEis")]

In [38]:
for word, tag in o_blob[y].tags:
    print(word, tag)

Thanks NNS
for IN
being VBG
a DT
part NN
of IN
the DT
TREND NNP
y NN
’ NNP
all DT
LOVEis NNP


In [39]:
# 'phrase' avoids shadowing the conventional numpy alias 'np'
for phrase in o_blob[y].noun_phrases:
    print(phrase)

thanks
trend
y ’
loveis


In [41]:
for ngram in o_blob[y].ngrams(3):
    print(ngram)

['Thanks', 'for', 'being']
['for', 'being', 'a']
['being', 'a', 'part']
['a', 'part', 'of']
['part', 'of', 'the']
['of', 'the', 'TREND']
['the', 'TREND', 'y']
['TREND', 'y', '’']
['y', '’', 'all']
['’', 'all', 'LOVEis']


In [42]:
ow_sentiments = []
for b in o_blob:
    print(b, b.sentiment)
    print("\n")
    ow_sentiments.append(b.sentiment)

LoVE IS __happening on @OWNTV! #LOVEis Sentiment(polarity=0.625, subjectivity=0.6)


JusT saw #Whitney. Great job @LisaErspamer and Kevin MacDonald and team. Footage we’ve NEVER seen. Fearless doc.pic.twitter.com/CLmlsl4Ehf Sentiment(polarity=0.8, subjectivity=0.75)


Thanks  @BritishVogue @edwardenninful @mertalas @macpiggott. for the regal experience,felt like an “Empress 4 a day!pic.twitter.com/fjd1pyjNyA Sentiment(polarity=0.2, subjectivity=0.2)


We’re Live on Facebook! Join us now. Sentiment(polarity=0.17045454545454544, subjectivity=0.5)


See ya from my porch in 15 minutes. LIVE FB #TheSunDoesShine Sentiment(polarity=0.13636363636363635, subjectivity=0.5)


Love this book and the author. Join me FBLive from my with your insights and questions 6pm eastern.pic.twitter.com/qBqqW66qiV Sentiment(polarity=0.5, subjectivity=0.6)


I do so LOVE me some  #QueenSugar  Thank u forever @ava Sentiment(polarity=0.5, subjectivity=0.6)


After 30 years of being on death row for a crime he did 

In [43]:
ow_sentiments[y]

Sentiment(polarity=0.25, subjectivity=0.2)

In [44]:
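# Sentiment is a namedtuple, so use its field names (and avoid reusing the name x)
ow_polarity = pd.Series([s.polarity for s in ow_sentiments])
ow_subjectivity = pd.Series([s.subjectivity for s in ow_sentiments])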

In [45]:
ow_polarity.describe()

count    20.000000
mean      0.357881
std       0.276691
min       0.000000
25%       0.192167
50%       0.272727
75%       0.500000
max       1.000000
dtype: float64

In [46]:
ow_subjectivity.describe()

count    20.000000
mean      0.482258
std       0.280180
min       0.000000
25%       0.363636
50%       0.508929
75%       0.600000
max       1.000000
dtype: float64
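
The two `describe()` summaries show a clear gap in mean polarity (about 0.06 versus 0.36) but nearly identical mean subjectivity (about 0.47 versus 0.48). With only 20 tweets per account, this is a snapshot rather than a firm conclusion. As a rough sketch, the same side-by-side comparison can be made with the standard library alone. The values below are the polarity scores visible in the printed output before it was cut off, so the numbers will differ from the full 20-tweet summaries.

```python
from statistics import mean, stdev

# Polarity values copied from the per-tweet output printed earlier
# (only the tweets that are visible before each output is truncated).
trump_polarity = [0.0, 0.22348484848484848, 0.09285714285714285]
oprah_polarity = [0.625, 0.8, 0.2, 0.17045454545454544,
                  0.13636363636363635, 0.5, 0.5]


def summarize(values):
    """Return the count, mean and sample standard deviation of values."""
    return {"count": len(values), "mean": mean(values), "std": stdev(values)}


for name, values in [("Trump", trump_polarity), ("Oprah", oprah_polarity)]:
    stats = summarize(values)
    print(f"{name}: n={stats['count']} "
          f"mean={stats['mean']:.3f} std={stats['std']:.3f}")
```

`statistics.stdev` computes the sample standard deviation, the same convention pandas uses for `std` in `describe()`.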

# Text Summary



In [18]:
driver = webdriver.Chrome("/usr/bin/chromedriver")
driver.get("http://www.bbc.com")


In [19]:
# Open the first article link; click() returns None, so there is nothing to assign
driver.find_element_by_class_name('media__link').click()

In [20]:
txt_url = driver.current_url

In [21]:
driver.close()

In [22]:
bbc_response = requests.get(txt_url)

In [23]:
bbc_page = bbc_response.text

In [24]:
soup = BeautifulSoup(bbc_page,"lxml")

In [25]:
soup.find(class_='story-body__inner').text

'\n\n\n\nMedia playback is unsupported on your device\n\n\n\n\n\n Media captionUS President Donald Trump arrived at Stansted Airport for a two-day working visit\nUS President Donald Trump has arrived in the UK, having said he is "fine" about any protests during his visit.Mr Trump and his wife Melania landed at Stansted on Air Force One at 13:50 BST before a helicopter took them to the US ambassador\'s residence in London.He is due to meet Theresa May, who is seeking a post-Brexit trade deal - days after he said the UK was in "turmoil".Extra security is in place to police a number of protests but Mr Trump said he thought Britons "like me a lot".Speaking at the Nato summit in Brussels before he arrived, Mr Trump said the UK was a "pretty hot spot right now".\n\n\n            /**/\n            (function() {\n                if (window.bbcdotcom && bbcdotcom.adverts && bbcdotcom.adverts.slotAsync) {\n                    bbcdotcom.adverts.slotAsync(\'mpu\', [1,2,3]);\n                }\n   
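
The extracted text above contains inline JavaScript alongside the article paragraphs, which would feed script tokens into the tagger. One way to avoid that is to skip `<script>` and `<style>` content while collecting text. The sketch below does this with the standard library's `html.parser` instead of Beautiful Soup. The class name, the helper and the sample HTML are all made up for illustration.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect text nodes, skipping anything inside <script> or <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def visible_text(html):
    """Return the human-readable text of an HTML fragment."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


# Made-up fragment shaped like the scraped page: paragraphs plus an ad script.
sample_html = (
    '<div class="story-body__inner">'
    "<p>Mr Trump has arrived in the UK.</p>"
    "<script>(function() { adverts.slotAsync('mpu', [1,2,3]); })();</script>"
    "<p>Extra security is in place.</p>"
    "</div>"
)

print(visible_text(sample_html))
```

Joining the chunks with a space also keeps sentences from separate paragraphs from running together, as they do in the output above ("visit.Mr Trump").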

In [26]:
news_blob = TextBlob(soup.find(class_='story-body__inner').text)

In [27]:
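# Collect the lemma of every singular common noun (Penn Treebank tag 'NN')
nouns = []
for word, tag in news_blob.tags:
    if tag == 'NN':
        nouns.append(word.lemmatize())

# nouns keeps duplicates, so the random sample can print the same word more than once
print("This text is about...")
for item in random.sample(nouns, 5):
    word = Word(item)
    print(word.pluralize())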


This text is about...
visits
visits
visits
securities
Images
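
The summary above repeats "visits" three times because the noun list holds one entry per occurrence, so frequent nouns are more likely to be drawn and can be drawn more than once. A possible alternative, sketched here with a different technique than the random sample used above, is to rank nouns by frequency so that each appears once. The noun list is hypothetical and only stands in for the one built from the article.

```python
from collections import Counter

# Hypothetical lemmatized nouns standing in for the list built from the article.
nouns = ["visit", "visit", "visit", "security", "protest", "visit", "deal"]


def top_nouns(words, k):
    """Return the k most frequent words, most frequent first, without repeats."""
    return [word for word, _ in Counter(words).most_common(k)]


print("This text is about...")
for noun in top_nouns(nouns, 3):
    print(noun)
```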


# Language Translation


In [61]:
en_blob = TextBlob(u'Winter is coming')

In [62]:
langlist = ['ar', 'ga', 'ko', 'ja', 'eu', 'zh-TW', 'ka', 'el', 'tr', 'yi', 'gl', 'fr', 'es' ]
for l in langlist:
    print(en_blob.translate(to=l))

الشتاء قادم
Tá an gheimhridh ag teacht
겨울이오고있다.
冬が来る
Negua badator
冬天來了
ზამთარი მოდის
Ερχεται ο χειμώνας
Kış geliyor
ווינטער קומט
O inverno está chegando
L'hiver arrive
Viene el invierno


In [63]:
ko_blob = TextBlob('겨울오고있다')

In [64]:
ko_blob.translate(to='en')

TextBlob("Winter is coming")