Create a Python script to perform a sentiment analysis of the Twitter activity of  __BBC, CBS, CNN, Fox, and New York times__.

The first plot will be and/or feature the following:

* Be a scatter plot of sentiments of the last __100__ tweets sent out by each news organization, ranging from -1.0 to 1.0, where a score of 0 expresses a neutral sentiment, -1 the most negative sentiment possible, and +1 the most positive sentiment possible.
* Each plot point will reflect the _compound_ sentiment of a tweet.
* Sort each plot point by its relative timestamp.

The second plot will be a bar plot visualizing the _overall_ sentiments of the last 100 tweets from each organization. For this plot, you will again aggregate the compound sentiments analyzed by VADER.

The tools of the trade you will need for your task as a data analyst include the following: tweepy, pandas, matplotlib, seaborn, textblob, and VADER.

Your final Jupyter notebook must:

* Pull last 100 tweets from each outlet.
* Perform a sentiment analysis with the compound, positive, neutral, and negative scoring for each tweet.
* Pull into a DataFrame the tweet's source acount, its text, its date, and its compound, positive, neutral, and negative sentiment scores.
* Export the data in the DataFrame into a CSV file.
* Save PNG images for each plot.

In [1]:
# Dependencies
import tweepy
import json
import numpy as np
import pandas as pd
from datetime import datetime
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
from config import consumer_key, consumer_secret, access_token, access_token_secret

In [2]:
# Import and Initialize Sentiment Analyzer
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
analyzer = SentimentIntensityAnalyzer()

In [3]:
# Twitter API Keys
consumer_key = consumer_key
consumer_secret = consumer_secret
access_token = access_token
access_token_secret = access_token_secret

In [4]:
# Setup Tweepy API Authentication
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth, parser=tweepy.parsers.JSONParser())

In [5]:
# Target Search Terms
target_users = ("@CNN", "@BBC", "@CBS", "@FOX", "@NYTIMES")

In [7]:
# Variables for holding sentiments
sentiments = []

# Counter
counter = 1

    # Loop through all target users
for user in target_users:

    # Variable for max_id
    oldest_tweet = None
               
    # Run search around each tweet
    public_tweets = api.search(user, count=100, result_type="recent", max_id=oldest_tweet)

    # Loop through all tweets
    for tweet in public_tweets["statuses"]:
        
        new_time = datetime.strptime(tweet["created_at"],"%a %b %d %H:%M:%S %z %Y")
        # Wed Apr 25 19:06:31 +0000 2018   
        # Run Vader Analysis on each tweet
        results = analyzer.polarity_scores(tweet["text"])
        compound = results["compound"]
        pos = results["pos"]
        neu = results["neu"]
        neg = results["neg"]
        text = tweet['text']
        tweets_ago = counter

        # Get Tweet ID, subtract 1, and assign to oldest_tweet
        oldest_tweet = tweet['id'] - 1


        # Add sentiments for each tweet into a list
        sentiments.append({"Date": new_time, 
                       "User": user,
                       "Compound": compound,
                       "Positive": pos,
                       "Negative": neu,
                       "Neutral": neg,
                       "Tweets Ago": counter,
                       "Tweet Text": text})

        # Add to counter 
        counter += 1


# Print the Sentiments
print(sentiments)

    
   



In [13]:
 # Append results to 'results_list'
sentiments_pd = pd.DataFrame.from_dict(sentiments).set_index('User').round(3)
sentiments_pd

Unnamed: 0_level_0,Compound,Date,Negative,Neutral,Positive,Tweet Text,Tweets Ago
User,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
@CNN,-0.262,2018-04-29 02:11:43+00:00,0.915,0.085,0.000,@CNN Nothing to do but shake my head at all th...,1
@CNN,0.000,2018-04-29 02:11:43+00:00,1.000,0.000,0.000,@CNN Why can't these people from Central Ameri...,2
@CNN,0.294,2018-04-29 02:11:42+00:00,0.691,0.110,0.199,"@cnn Oh Thank God, C-span is airing the corres...",3
@CNN,-0.571,2018-04-29 02:11:41+00:00,0.674,0.232,0.094,"@CNN When journalists start reporting like ""re...",4
@CNN,-0.380,2018-04-29 02:11:41+00:00,0.317,0.424,0.260,@KenPettigrew @CNN @realDonaldTrump Awesome! Y...,5
@CNN,-0.511,2018-04-29 02:11:41+00:00,0.377,0.623,0.000,@CNN You’re a dick.,6
@CNN,-0.527,2018-04-29 02:11:40+00:00,0.677,0.226,0.098,"RT @kazweida: Dear @CNN ,\n\nStop sending me n...",7
@CNN,0.440,2018-04-29 02:11:40+00:00,0.627,0.110,0.263,@CNN @CNNOpinion You assert one clear truth in...,8
@CNN,-0.294,2018-04-29 02:11:39+00:00,0.886,0.114,0.000,RT @SparkleSoup45: 💥MEDIA SILENT on Lawsuit Al...,9
@CNN,-0.437,2018-04-29 02:11:38+00:00,0.677,0.213,0.110,@CNN Then stop acting fools and get your act t...,10


In [14]:
average_mood = sentiments_pd.groupby("User")["Compound"].mean()
print(average_mood)

User
@BBC        0.14744
@CBS        0.14549
@CNN       -0.20841
@FOX       -0.04962
@NYTIMES   -0.01960
Name: Compound, dtype: float64


PLOT 1:
Scatter plot of sentiments of the last 100 tweets sent out by each news organization, ranging from -1.0 to 1.0, where a score of 0 expresses a neutral sentiment, -1 the most negative sentiment possible, and +1 the most positive sentiment possible.
Each plot point will reflect the compound sentiment of a tweet.
Sort each plot point by its relative timestamp.

In [15]:
#Export dataframe to csv:
sentiments_pd.to_csv("Twitter_News_Mood.csv", index=False)

In [20]:
#checking np.arange array for use on next cell
x_axis = np.arange(0,len(target_users))
x_axis

array([0, 1, 2, 3, 4])

In [21]:
# set the colors
colors=['b', 'g', 'r', 'y', 'm']

plt.style.use('dark_background')
# Loopthrough target_users to build a scatterplot using 'np.arange' by 
# creating an array of values starting from the the first tweet up to the last one.

for x in np.arange(0,len(target_users)):
    news_mood = sentiments_pd.loc[sentiments_pd["User"]==target_users[x]]
    #Sorting based on the time the tweet was posted. Latest tweet first, 100th tweet last. 
    news_mood = news_mood.sort_values("Tweets Ago")
    
    # Now, create an array of compound scores and associate each with a color 
    #  and make the source (target_users[x]) the same color for the legend. 
    
    plt.scatter(np.arange(len(news_mood["Compound"])), 
            news_mood["Compound"], color = colors[x],
            marker="x",label= target_users[x])

plt.legend(bbox_to_anchor = (1,1), title = 'News Source')        

#Add title, x axis label, y axis label, grid
plt.title(f"Sentiment Analysis of Tweets {datetime.now().strftime('%Y-%m-%d %H:%M')}")
plt.xlabel("Tweets Ago")
plt.ylabel("Tweet Polarity")
plt.grid(True)

plt.savefig("Sentiment Analysis of Media Tweets")
plt.show()  

KeyError: 'User'

The second plot is a bar plot visualizing the overall sentiments of the last 100 tweets from each news source. For this plot, I'm showing the aggregated compound sentiments analyzed by VADER.

In [1]:
# set the colors
colors=['b', 'g', 'r', 'y', 'm']

plt.style.use('ggplot')
#Loop through target_users to build a scatterplot.
# To get different legend colors for each source you need to loop through using 'np.arange' by 
# create an array of values starting from the the first tweet and incrementally 
# going up in a step wise fashion to get the last tweet. 

for i in np.arange(0,len(target_users)):
    news_mood = sentiments_pd.loc[sentiments_pd["User"]==target_users[i]]
    #Sorting based on the time the tweet was posted. Latest tweet first, 100th tweet last. 
    news_mood = news_mood.sort_values("Tweets Ago")
    
    # Now, create an array of compound scores and associate each with a color 
    #  and make the source (target_users[i]) the same color for the legend. 
    # And, add some styling so the color stands out. 
    plt.scatter(np.arange(len(news_mood["Compound"])), 
            news_mood["Compound"], color = colors[i],
            edgecolor="black", linewidths=1, marker="o",
            alpha=0.8,label= target_users[i])
#     plt.scatter(news_mood["Tweets Ago"],news_mood["Compound"], label = user, color = colors)

plt.legend(bbox_to_anchor = (1,1), title = 'News Source')        

#Add title, x axis label, y axis label, grid
plt.title(f"Sentiment Analysis of Tweets {datetime.now().strftime('%Y-%m-%d %H:%M')}")
plt.xlabel("Tweets Ago")
plt.ylabel("Tweet Polarity")
plt.grid(True)

plt.savefig("Sentiment Analysis of Media Tweets")
plt.show()  

NameError: name 'plt' is not defined