__Module 7/8 & Project__

1. [Import](#Import)
1. [Module 7 walkthrough](#Module-7-walkthrough)
1. [Module 8 walkthrough](#Module-8-walkthrough)
1. [Project](#Project)
    1. [Get tweets and clean tweets](#Get-tweets-and-clean-tweets)
        1. [Acquire tweets](#Acquire-tweets)
        1. [Load tweets](#Load-tweets)
        1. [HTML Parser](#HTML-Parser)
        1. [Remove username, URL](#Remove-username-URL)
        1. [Remove extraneous characters](#Remove-extraneous-characters)
        1. [Remove apostrophes](#Remove-apostrophes)
        1. [Remove punctuation](#Remove-punctuation)
        1. [Word pattern formatting](#Word-pattern-formatting)
        1. [Remove hashtags](#Remove-hashtags)
        1. [Export to pickle file](#Export-to-pickle-file)
        1. [Correct incorrectly spelled words](#Correct-incorrectly-spelled-words)
    1. [Polarity analysis](#Polarity-analysis)
        1. [Load tweets from pickle file](#Load-tweets-from-pickle-file)
        1. [Calculate polarity](#Calculate-polarity)
        1. [Evaluate results](#Evaluate-results)
        1. [Conclusion](#Conclusion)

# Import

<a id = 'Import'></a>

In [1]:
import os
import sys
import jsonpickle
import json
import tweepy
import html.parser as HTMLParser
import re

import nltk
nltk.download('sentiwordnet')
from nltk.corpus import sentiwordnet as swn
from nltk.corpus import stopwords

from textblob import TextBlob, Word

modulePath = os.path.abspath(os.path.join('../../..'))
if modulePath not in sys.path:
    sys.path.append(modulePath)
import config # stores the API and access keys for twitter. only on my machine so this will throw an error


[nltk_data] Downloading package sentiwordnet to
[nltk_data]     /Users/petersontylerd/nltk_data...
[nltk_data]   Package sentiwordnet is already up-to-date!


{'/search/tweets': {'limit': 450, 'remaining': 450, 'reset': 1543543514}}

In [None]:
# standard tweepy API setup
auth = tweepy.OAuthHandler(config.apiKey, config.apiSec)
auth.set_access_token(config.accessToken, config.accessSec)

api = tweepy.API(auth)

# Application authentication tweepy setup
# Use application-only authentication for higher Twitter API rate limit
# Twitter API returns a max of 100 tweets per query
# Allows for 450 queries every 15 minutes
# So we can gather 45,000 tweets every 15 minutes

# switching to application authentication
auth = tweepy.AppAuthHandler(config.apiKey, config.apiSec)

# setting up new api wrapper, using authentication only
api = tweepy.API(auth, wait_on_rate_limit = True
                 ,wait_on_rate_limit_notify = True)
 
# view rate limit status
api.rate_limit_status()['resources']['search']


# Module 7 walkthrough

<a id = 'Module-7-walkthrough'></a>

In [2]:
# parse sample tweet
htmlParser = HTMLParser.HTMLParser()

tweet = "@user_@34 Life is great & I like it sooooooooo much. It's whatis life. #life #great#like http://lifeisgreat.com ."
parsedTweet = htmlParser.unescape(tweet)
print(parsedTweet)


@user_@34 Life is great & I like it sooooooooo much. It's whatis life. #life #great#like http://lifeisgreat.com .


  


In [3]:
# remove URL
urlPattern = re.compile('http\S+')
tweet_v1 = re.sub(urlPattern, '', parsedTweet)
print(tweet_v1)


@user_@34 Life is great & I like it sooooooooo much. It's whatis life. #life #great#like  .


In [4]:
# remoe username
usernamePattern = re.compile('@\S+')
tweet_v2 = re.sub(usernamePattern, '', tweet_v1)
print(tweet_v2)


 Life is great & I like it sooooooooo much. It's whatis life. #life #great#like  .


In [5]:
# remove words repetive "o's"
wordPattern = re.compile('s[o]+')
tweet_v3 = re.sub(wordPattern, 'so', tweet_v2)
print(tweet_v3)


 Life is great & I like it so much. It's whatis life. #life #great#like  .


# Module 8 walkthrough

<a id = 'Module-8-walkthrough'></a>

In [6]:
# load NLTK wordnet and sample sentiment score
nltk.download('wordnet')

print('positive score for the word "happy": {0}'.format(list(swn.senti_synsets('happy','a'))[0].pos_score()))
print('negative score for the word "happy": {0}'.format(list(swn.senti_synsets('happy','a'))[0].neg_score()))
print('neutral score for the word "happy": {0}'.format(list(swn.senti_synsets('happy','a'))[0].obj_score()))


[nltk_data] Downloading package wordnet to
[nltk_data]     /Users/petersontylerd/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
positive score for the word "happy": 0.875
negative score for the word "happy": 0.0
neutral score for the word "happy": 0.125


In [7]:
# toeknize sample sentence
nltk.download('punkt')

sentence = 'i am happy'
tokens = nltk.tokenize.word_tokenize(sentence)
print('Tokens: {0}'.format(tokens))
    

[nltk_data] Downloading package punkt to
[nltk_data]     /Users/petersontylerd/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
Tokens: ['i', 'am', 'happy']


In [8]:
# identify part of speech for each word
from nltk.tag import pos_tag
nltk.download('averaged_perceptron_tagger')

pos_tag(tokens)


[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /Users/petersontylerd/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!


[('i', 'NN'), ('am', 'VBP'), ('happy', 'JJ')]

NN - noun
VBP - verb
JJ - adjective

In [9]:
# remove stop words from sample sentence
stop = stopwords.words('english')
sentence = 'i am happy'
newSentence = []
for word in tokens:
    if word not in stop:
        newSentence.append(word)

print('The sentence has been reduced from \'{0}\' \n to \'{1}\''.format(sentence, newSentence))


The sentence has been reduced from 'i am happy' 
 to '['happy']'


# Project

* Try cleaning the tweets that you have extracted in the the previous chapter. Apply the above rules and in addition to that apply the below mentioned rules as well:
    * Remove Punctuations. Puntuations sometimes don't carry any weight. You can remove them. Try writing a regular expression to remove , from sentences. Dont remove question marks "?" or exclamatory marks as they have effect upon any sentence.
    * Remove apostrophes and expand the words. For example in the sentence "It's a great time to code!" the first word It's can be expanded to 'it is'. You can do this either with regular expressions.
    * Create a list of word patterns for word formatting. For example 'gud' should be substitued with 'good'

* Calculate the polarity of a sentence and write a progam to calculate the polarity of all the tweets that you have extracted and preprocessed in the previous questions. You progam should also include the below features:

    * Tweets have hashtags. Remove the hashtags and then find the polarity of each tweet.

    * There might be words that are not present in the sentiwordnet lexicon.
    * The program should handle these cases, by giving a zero score for such words.
    *Depending on the questions,file uploads or screenshots are necessary to show your work.

<a id = 'Project'></a>

## Get tweets and clean tweets

<a id = 'Get-tweets-and-clean-tweets'></a>

### Acquire tweets

<a id = 'Acquire-tweets'></a>

In [None]:
# find up to 500,000 tweets from the last week containing the word election.
# store in JSON file
maxTweets = 500000
tweetCount = 0
with open('trumpTweets.json','w') as f:
    for tweet in tweepy.Cursor(api.search, q = 'trump', tweet_mode = 'extended', lang = 'en').items(maxTweets):
        f.write(jsonpickle.encode(tweet._json, unpicklable = False) + '\n')
        tweetCount += 1
    print('Downloaded {0} tweets'.format(tweetCount))



### Load tweets

<a id = 'Load-tweets'></a>

In [10]:
# load election tweets into memory
data = []
with open('./trumpTweets.json', 'r') as jsonFile:
    for line in jsonFile:
        data.append(json.loads(line))
print('Total number of tweets loaded: {0}'.format(len(data)))


Total number of tweets loaded: 221072


In [35]:
# unpack all tweets in data
tweets = []
for item in data:
    if 'full_text' in item.keys():
        tweet = item['full_text']
        tweets.append(tweet)
print('Total number of tweets extracted from json: {0}'.format(len(tweets)))


Total number of tweets extracted from json: 221072


In [36]:
# print small sample of tweets
tweets[:5]


['RT @pettyasamug: Trump hates pics of his hair circulating on social media #Retweet ♻️\n\n#NotMyPresident https://t.co/4uvEKDaING',
 'RT @maggieNYT: Ted OLSON, who Trump praised in one of his Fla tweets and who Trump tried repeatedly to hire for his own personal legal team…',
 '@realDonaldTrump Stock goes up, credit to trump , goes down? Blame the Democrats. Got it 👌🏾',
 'This. The crisis is here; it can’t be avoided, only mitigated. When I look at Trump’s GOP &amp; their supporters, I see people knowingly and gleefully poisoning my grandchildren and my planet. This is why bipartisanship is BS. I won’t compromise with murderers. https://t.co/mEYs7IszjB',
 'RT @politico: Despite Democrats’ massive House gains — the party’s biggest since 1974, after Richard Nixon’s resignation — redistricting cl…']

In [37]:
# remove retweets
tweets = [x for x in tweets if not x.startswith('RT ')]


### HTML Parser

<a id = 'HTML-Parser'></a>

In [38]:
# remove escapes
import html

for ix, tweet in enumerate(tweets):
    parsedTweet = html.unescape(tweet)
    tweets[ix] = parsedTweet


### Remove username, URL

<a id = 'Remove-username-URL'></a>

In [39]:
# remove URLs and usernames
urlPattern = re.compile(r'(?:\@|https?\://)\S+')
for ix, tweet in enumerate(tweets):
    parsedTweet = re.sub(urlPattern, '', tweet)
    tweets[ix] = parsedTweet


### Remove extraneous characters

Remove extra white space, newlines, tabs, non-unicode characters
and unrendered unicode strings

<a id = 'Remove-extraneous-characters'></a>

In [40]:
# remove unnecessary white space, newlines and tabs
stripPattern = re.compile(r'\s+')
for ix, tweet in enumerate(tweets):
    parsedTweet = re.sub(stripPattern, ' ', tweet).strip()
    tweets[ix] = parsedTweet


In [41]:
# fix fancy single quote that's used as apostrophe in tweets with standard apostrophe
apostropheFix = re.compile(u'\u2019')
for ix, tweet in enumerate(tweets):
    parsedTweet = re.sub(apostropheFix, "'", tweet)
    tweets[ix] = parsedTweet


In [42]:
# remove unicdoe string and emojis
unicodeFix = re.compile('(?![ -~]).')
for ix, tweet in enumerate(tweets):
    parsedTweet = re.sub(unicodeFix, "", tweet)
    tweets[ix] = parsedTweet


### Remove apostrophes

- Remove apostrophes and expand words
    - "It's" becomes "It is", however "Trump's" stays "Trump's"

<a id = 'Remove-apostrophes'></a>

In [43]:
#  reformat contractsion
"""
I am modifying an approach I learned about through this stackoverflow post 
http://stackoverflow.com/questions/19790188/expanding-english-language-contractions-in-python
"""

cList = {
          "ain't": "am not",
          "aren't": "are not",
          "can't've": "cannot have",
          "can't": "cannot",
          "'cause": "because",
          "couldn't've": "could not have",
          "could've": "could have",
          "couldn't": "could not",
          "didn't": "did not",
          "doesn't": "does not",
          "don't": "do not",
          "hadn't've": "had not have",
          "hadn't": "had not",
          "hasn't": "has not",
          "haven't": "have not",
          "he'd've": "he would have",
          "he'd": "he would",
          "he'll've": "he will have",
          "he'll": "he will",
          "he's": "he is",
          "how'd'y": "how do you",
          "how'd": "how did",
          "how'll": "how will",
          "how's": "how is",
          "i'd've": "i would have",
          "i'd": "i would",
          "i'll've": "i will have",
          "i'll": "i will",
          "i'm": "i am",
          "i've": "i have",
          "isn't": "is not",
          "it'd've": "it would have",
          "it'd": "it had",
          "it'll've": "it will have",
          "it'll": "it will",
          "it's": "it is",
          "let's": "let us",
          "ma'am": "madam",
          "mayn't": "may not",
          "might've": "might have",
          "mightn't've": "might not have",
          "mightn't": "might not",
          "must've": "must have",
          "mustn't've": "must not have",
          "mustn't": "must not",
          "needn't've": "need not have",
          "needn't": "need not",
          "o'clock": "of the clock",
          "oughtn't've": "ought not have",
          "oughtn't": "ought not",
          "shan't": "shall not",
          "shan't've": "shall not have",
          "sha'n't": "shall not",
          "she'd've": "she would have",
          "she'd": "she would",
          "she'll've": "she will have",
          "she'll": "she will",
          "she's": "she is",
          "shouldn't've": "should not have",
          "should've": "should have",
          "shouldn't": "should not",
          "so've": "so have",
          "so's": "so is",
          "that'd've": "that would have",
          "that'd": "that would",
          "that's": "that is",
          "there'd've": "there would have",
          "there'd": "there had",
          "there's": "there is",
          "they'd've": "they would have",
          "they'd": "they would",
          "they'll've": "they will have",
          "they'll": "they will",
          "they're": "they are",
          "they've": "they have",
          "to've": "to have",
          "wasn't": "was not",
          "we'd've": "we would have",
          "we'd": "we had",
          "we'll've": "we will have",
          "we'll": "we will",
          "we're": "we are",
          "we've": "we have",
          "weren't": "were not",
          "what'll": "what will",
          "what'll've": "what will have",
          "what're": "what are",
          "what's": "what is",
          "what've": "what have",
          "when's": "when is",
          "when've": "when have",
          "where'd": "where did",
          "where's": "where is",
          "where've": "where have",
          "who'll": "who will",
          "who'll've": "who will have",
          "who's": "who is",
          "who've": "who have",
          "why's": "why is",
          "why've": "why have",
          "will've": "will have",
          "won't": "will not",
          "won't've": "will not have",
          "would've": "would have",
          "wouldn't": "would not",
          "wouldn't've": "would not have",
          "y'all'd've": "you all would have",
          "y'all'd": "you all would",
          "y'all're": "you all are",
          "y'all've": "you all have",
          "y'all": "you all",
          "y'alls": "you alls",
          "you'd've": "you would have",
          "you'd": "you had",
          "you'll've": "you you will have",
          "you'll": "you you will",
          "you're": "you are",
          "you've": "you have"
}

contractionPatterns = re.compile('(%s)' % '|'.join(cList.keys()))

def expandContractions(text, c_re = contractionPatterns):
    def replace(match):
        return cList[match.group(0)]
    return c_re.sub(replace, text.lower())


for ix, tweet in enumerate(tweets):
    parsedTweet = expandContractions(tweet)
    tweets[ix] = parsedTweet


### Remove punctuation
- Remove ellipses (...)
- Remove ','
- Keep '?','!'
- Keep '#' for now

<a id = 'Remove-punctuation'></a>

In [44]:
# replace ellipses with a single space
ellipsesPattern = re.compile(r'\.{3,}')
for ix, tweet in enumerate(tweets):
    parsedTweet = re.sub(ellipsesPattern, ' ', tweet)
    tweets[ix] = parsedTweet


In [45]:
# remove all punctuation except '?', '!', '#',and apostrophes
punctuationPattern = re.compile(r"[^\w\d\s?!#']+")
for ix, tweet in enumerate(tweets):
    parsedTweet = re.sub(punctuationPattern, '', tweet)
    tweets[ix] = parsedTweet


### Word pattern formatting

- Condense extended strings of vowels and consonants down to form correctly spelled word
    - "Gooooooood" becomes "Good"
    - "Realllllly" becomes "Really"

<a id = 'Word-pattern-formatting'></a>

In [46]:
# remove all overly repetitive vowels and consonants
repetitionPattern = re.compile(r'(.)\1+')
for ix, tweet in enumerate(tweets):
    parsedTweet = re.sub(repetitionPattern, r'\1\1', tweet)
    tweets[ix] = parsedTweet


### Remove hashtags

<a id = 'Remove-hashtags'></a>

In [47]:
# remove all hashtags, including # and the associated word.
hashtagPattern = re.compile(r'([#?])(\w+)\b')
for ix, tweet in enumerate(tweets):
    parsedTweet = re.sub(hashtagPattern, '', tweet)
    tweets[ix] = parsedTweet


In [48]:
# remove unnecessary white space, newlines and tabs
stripPattern = re.compile(r'\s+')
for ix, tweet in enumerate(tweets):
    parsedTweet = re.sub(stripPattern, ' ', tweet).strip()
    tweets[ix] = parsedTweet


### Correct incorrectly spelled words

The assignment prompt suggests that we identify specific words that are misspelled and replace these words with the correctly spelled word that was intended. As an example, the prompt says that "gud" should be "good".

Due to the size of this dataset, which includes over 53,000 tweets (not to mention that tweets are not exactly known for accurate spelling), it would be quite difficult to find all incorrectly spelled and also make a definitive guess as to what the word should have been. It's not always clear what a misspelled word should have been.

In fact, the "gud" to "good" example is particulary informative. I used a library called textblob to see how it would correct the word "gud" specifically, and it returned a series of words along with the probability of each word being the correct word. Interestingly, "good" is not among them. The most likely guess is that "gud" should have been "god", and second "gun". "good" is not on the list, as shown below.

I am certainly intrigued by the potential utility in laboriously defining what incorrectly spelled words should have been depending on the context, but such an endeavor would not be practical due to the size of this text data set. A systematic approach is the only feasible solution, but a systematic approach needs to make assumptions that may not be correct, and may actually detract from the users initial message through incorrect "corrections" rather than truly correcting it.


<a id = 'Correct-incorrectly-spelled-words'></a>

In [54]:
# perform spell check on dummy word
w = Word('gud')
w.spellcheck()


[('god', 0.7453798767967146),
 ('gun', 0.1293634496919918),
 ('mud', 0.07392197125256673),
 ('gut', 0.028747433264887063),
 ('gum', 0.012320328542094456),
 ('bud', 0.006160164271047228),
 ('guy', 0.004106776180698152)]

In [25]:
# Review a sample of the cleaned up tweets
tweets[:20]


['stock goes up credit to trump goes down? blame the democrats got it',
 "this the crisis is here it cannot be avoided only mitigated when i look at trump's gop their supporters i see people knowingly and gleefully poisoning my grandchildren and my planet this is why bipartisanship is bs i will not compromise with murderers",
 'suburban white women who do not have a hard on for trump like',
 "i am so weary of how often you lie about this it is early but still not going to work trump's tax cut was supposed to change corporate behavior here's what happened",
 'umm no trump claims he tried to salvage trip to french cemetery for us troops politico',
 'we could not look more closely if we tried every single story msm gives us proves they are united with us against trump',
 "putin will send a putin bear to trump in his new prison surrounding's and will make sure ivan is his daddy i mean cellmate",
 'donald trump gets back to the united states and someone explains what they were saying in eur

### Export to pickle file

<a id = 'Export-to-pickle-file'></a>

In [27]:
import pickle
with open('cleanTrumpTweets.pkl', 'wb') as fp:
    pickle.dump(tweets, fp)

## Polarity analysis

You can simply start here and load the .pkl that was submitted along with the notebook.


<a id = 'Polarity-analysis'></a>

In [1]:
import os
import sys
import jsonpickle
import json
import tweepy
import html.parser as HTMLParser
import re
import pickle
import numpy as np

import nltk
nltk.download('sentiwordnet')
from nltk.corpus import sentiwordnet as swn
from nltk.corpus import stopwords
stop = stopwords.words('english')
from nltk.tag import pos_tag


[nltk_data] Downloading package sentiwordnet to
[nltk_data]     /Users/petersontylerd/nltk_data...
[nltk_data]   Package sentiwordnet is already up-to-date!


### Load tweets from pickle file

<a id = 'Load tweets from pickle file'></a>

In [2]:
# load tweets
with open ('cleanTrumpTweets.pkl', 'rb') as fp:
    cleanTweets = pickle.load(fp)

print('# of clean tweets: {}'.format(len(cleanTweets)))


# of clean tweets: 53284


### Calculate polarity

The following code block calculates a cumulative negative and positive sentiment for each tweet. The data is captured in a dictionary where the key is the tweet text and the value is a list containing two number - the first is the cumulative negative sentiment, and the second is the cumulative positive sentiment. Noun, verbs and adjectives are evaluated. If a word in not in the sentiwordnet lexicon, the word gets a positive/negative score of 0 by default.

<a id = 'Calculate polarity'></a>

In [3]:
# determine sentiment of tweets
sentimentDict = {}
for tweet in cleanTweets:
    sentimentDict[tweet] = [0., 0.]
    tokens = nltk.tokenize.word_tokenize(tweet)
    pos = pos_tag(tokens)    
    for word, p in zip(tokens, pos):
        if word not in stop:
            if p[1] == 'NN':
                wordPOS = 'n'
            elif p[1] == 'VBP':
                wordPOS = 'v'
            elif p[1] == 'JJ':
                wordPOS = 'a'
            try: # exception catch to avoid crashing when no sentiment values available
                sentimentDict[tweet][0] += list(swn.senti_synsets(word, wordPOS))[0].neg_score()
            except:
                pass
            try:
                sentimentDict[tweet][1] += list(swn.senti_synsets(word, wordPOS))[0].pos_score()
            except:
                pass
    

### Evaluate results

<a id = 'Evaluate-results'></a>

In [4]:
# Count total number of postive and negative tweets

import numpy as np
sentimentTally = [0, 0]
for val in sentimentDict.values():
    if np.argmax(val) == 0:
        sentimentTally[0] += 1
    else:
        sentimentTally[1] += 1
print('Negative tweet count: {0}'.format(sentimentTally[0]))
print('Positive tweet count: {0}'.format(sentimentTally[1]))

Negative tweet count: 27320
Positive tweet count: 19131


In [5]:
# identify the most negative tweets
sentimentDictNeg = {}
for k, v in sentimentDict.items():
    sentimentDictNeg[k] = v[0]
negSentSorted = sorted(sentimentDictNeg.items(), key = lambda kv: kv[1])
negSentSorted.reverse()
    
negSentSorted[:10]    


[('silly sad ed always the hater and silent racist! using both hate and racism as wheels to turn the country against trump! lmao but he cannot most people not as stupid as ed him and his brother will doom that poor unborn baby because those guys are just miserable lousy at life!',
  6.25),
 ("gui lt guilt guilt guilt guilt guilt guilt guilt guilt day later dollar short! a con's story! trump lives in a television world of unreality he needs to take up bowling",
  5.5),
 ('trump is repugnant on so many levels and this is among the worst as one of the millions of americans who did not vote for this monstrosity i offer my humble apologies for our nation on behalf of this pathetic disrespectful petty man who is currently occupying the white house',
  5.0),
 ('paul manafort guilty michael cohen guilty michael t flynn guilty rick gates guilty george papadopoulos guilty timothy nolan guilty richard pinedo guilty alex van der zwaan guilty who will be next roger stone ? donald trump jr ? donald 

In [6]:
# identify the most negative tweets
sentimentDictPos = {}
for k, v in sentimentDict.items():
    sentimentDictPos[k] = v[1]
posSentSorted = sorted(sentimentDictPos.items(), key = lambda kv: kv[1])
posSentSorted.reverse()
    
posSentSorted[:10]   


[("trump should control his temper in better way as president he should keep stable mood and be more gentle and decent than a boss whatever his ability and talent are good obama's different his excellence's only seducing people with beautiful words only for corrupt democracy",
  5.625),
 ('first mr trump how about if you make some effort to find your senses? or dignity? or decency? or civility? or honesty? or maturity? or morality? or loyalty? or humility? however let us all not hold our breath waiting for any of that to occur',
  5.125),
 ("trump's nationalism nationalism is the belief that your own country is better than all others it is important not to confuse nationalism with patriotism patriotism is a healthy pride in your country that brings about feelings of loyalty and a desire to help other citizens",
  5.0),
 ('no they are the opposite macron is correct but trump puts himself first ahead of the us patriotism is a healthy pride in your country that brings about feelings of lo

In [7]:
# show first ten negative tweets
counter = 0
for k, v in sentimentDict.items():
    if counter == 10:
        break
    if np.argmax(v) == 0 and v[0] != v[1]: # neg score is highest and no ties
        print('Negative tweet: {} - Sentiment: {}'.format(k, v[0]))
        counter += 1
    

Negative tweet: stock goes up credit to trump goes down? blame the democrats got it - Sentiment: 0.75
Negative tweet: this the crisis is here it cannot be avoided only mitigated when i look at trump's gop their supporters i see people knowingly and gleefully poisoning my grandchildren and my planet this is why bipartisanship is bs i will not compromise with murderers - Sentiment: 0.875
Negative tweet: suburban white women who do not have a hard on for trump like - Sentiment: 1.0
Negative tweet: i am so weary of how often you lie about this it is early but still not going to work trump's tax cut was supposed to change corporate behavior here's what happened - Sentiment: 1.25
Negative tweet: wrong person to use in gif sublimally a hateful person next time use the trump bear - Sentiment: 1.417
Negative tweet: this is not going to make anyone drink trump wine i guess macron is not your buddy anymore did someone get his feefees hurt? - Sentiment: 0.875
Negative tweet: did not trump fire som

In [8]:
# show first ten positive tweets
counter = 0
for k, v in sentimentDict.items():
    if counter == 10:
        break
    if np.argmax(v) == 1 and v[0] != v[1]: # pos score is highest and no ties
        print('Positive tweet: {} - Sentiment: {}'.format(k, v[1]))
        counter += 1
    

Positive tweet: umm no trump claims he tried to salvage trip to french cemetery for us troops politico - Sentiment: 0.125
Positive tweet: putin will send a putin bear to trump in his new prison surrounding's and will make sure ivan is his daddy i mean cellmate - Sentiment: 1.125
Positive tweet: they would not have done this when obama was in office or bush come to that but with trump in they feel he will back them - Sentiment: 0.5
Positive tweet: inside the body of king henry vii full tudor documentary via ah nothing like finding about the health of some famous person of the past imagine if hillary had been elected? do you think henry could keep up wtrump? - Sentiment: 1.375
Positive tweet: president trump took credit for retiring jeff flake he did not account for democrat kyrsten sinema who is about to make history on multiple fronts - Sentiment: 0.875
Positive tweet: mueller seeking more details on nigel farage key russia inquiry target says - Sentiment: 0.25
Positive tweet: oh dear 

<a id = 'Conclusion'></a>

### Conclusion

Not surprisingly, the majority of the tweets were negative. 

I evaluated the positive and negative sentiment determinations both generally and specifically. First I took the specific approach by reviewing the most positive and the most negative tweets. This was accomplished by sorting the sentiment dictionary by the positive/negative value in descending order. The most negative tweets are clearly very negative, whereas the most positive tweets seems to be a mixture of somewhat positive or sarcastic tweets. The latter get mistaken for positive.

In a general sense, I looked at the first handful of positive and negative tweets. The negative tweets were again pretty clearly negative, where the positive sentiment was less clear in the "positive" tweets. 