## Instructions for this project

In this assignment, you'll create a Python script to perform a sentiment analysis of the Twitter activity of various news outlets, and to present your findings visually.

Your final output should provide a visualized summary of the sentiments expressed in Tweets sent out by the following news organizations:

## BBC, CBS, CNN, Fox, and the New York Times

The first plot will:

1. Be a scatter plot of the sentiments of the last 100 tweets sent out by each news organization, ranging from -1.0 to 1.0, where a score of 0 expresses a neutral sentiment, -1 the most negative sentiment possible, and +1 the most positive sentiment possible.

2. Show one point per tweet, reflecting that tweet's compound sentiment, with the points sorted by relative timestamp.

The second plot will be a bar plot visualizing the overall sentiment of the last 100 tweets from each organization. For this plot, you will again aggregate the compound sentiments analyzed by VADER.

The tools of the trade you will need for your task as a data analyst include the following: 

## Tweepy, Pandas, Matplotlib, and VADER.

Your final Jupyter notebook must:

* Pull the last 100 tweets from each outlet.
* Perform a sentiment analysis with the compound, positive, neutral, and negative scoring for each tweet.
* Pull into a DataFrame the tweet's source account, its text, its date, and its compound, positive, neutral, and negative sentiment scores.
* Export the data in the DataFrame into a CSV file.
* Save PNG images for each plot.

As final considerations:

* You must complete your analysis using a Jupyter notebook.
* You must use the Matplotlib or Pandas plotting libraries.
* Include a written description of three observable trends based on the data.
* Include proper labeling of your plots, including plot titles (with date of analysis) and axes labels.

In [1]:
# Dependencies

import tweepy
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import json
import time

# Import and Initialize Sentiment Analyzer
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
analyzer = SentimentIntensityAnalyzer()

In [2]:
# Get the current working directory
orig_working_directory = os.getcwd()
print(orig_working_directory)

# In this case, we are changing it to three levels up
os.chdir(os.path.join('..','..','..'))

# Now, you can see the new working directory
curr_working_directory = os.getcwd()
os.getcwd()

C:\Users\jsmit\Documents\GW_Bootcamp_JAS\Bootcamp_HW\Assignments\JAZ-Pandas-API\Twitter-API


'C:\\Users\\jsmit\\Documents\\GW_Bootcamp_JAS\\Bootcamp_HW'

In [3]:
from config import (consumer_key, 
                    consumer_secret, 
                    access_token, 
                    access_token_secret)

# Setup Tweepy API Authentication
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
# api = tweepy.API(auth, parser=tweepy.parsers.JSONParser())
api = tweepy.API(auth)

In [4]:
target_terms = ("@BBC", "@CBS", "@FoxNews","@CNN", "@nytimes")

# List to hold results
tweets_list = []
search_term_list = []

In [5]:
# Loop through all target users
for target in target_terms:
    # Iterate through the 100 most recent tweets from this account
    for tweet in tweepy.Cursor(api.user_timeline, target, tweet_mode='extended').items(100):
        tweets_list.append(tweet)
        search_term_list.append(target)

#     for tweet in tweepy.Cursor(api.search, target, tweet_mode='extended').items(100):
#         tweets_list.append(tweet)    
#         search_term_list.append(target)


len(tweets_list)

500

In [32]:
tweets_list[0]._json['full_text']

'There are all sorts of hidden meanings tucked inside brand logos... 👀\nhttps://t.co/4ngKMKX1RG'

In [38]:
#Set up a dictionary for the data we're collecting (Borrowed from solution - Mine was not working - Needed a rope)
Twitter_Collect = {
    "user_list": [],
    "text_list":[],
    "date_list":[],
    "compound_list":[],
    "positive_list":[],
    "negative_list":[],
    "neutral_list":[]
}

# Loop through all tweets
for tweet in tweets_list:

    # Fill data from Tweets into the dictionary & do Vader Analysis (Borrowed from solution, struggled with index reference - Needed a rope)
    Twitter_Collect["user_list"].append(tweet._json["user"]["name"])
    Twitter_Collect["text_list"].append(tweet._json["full_text"])
    Twitter_Collect["date_list"].append(tweet._json["created_at"])

    # Vader Analysis: score the text once and reuse the result for all four fields
    scores = analyzer.polarity_scores(tweet._json["full_text"])
    Twitter_Collect["compound_list"].append(scores["compound"])
    Twitter_Collect["positive_list"].append(scores["pos"])
    Twitter_Collect["negative_list"].append(scores["neg"])
    Twitter_Collect["neutral_list"].append(scores["neu"])
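
The compound score collected above falls between -1 and 1. As a rough sketch, it could be bucketed into coarse labels using the ±0.05 cutoffs that the VADER README describes as a typical convention. The function name `label_compound` and its `threshold` parameter are hypothetical and not part of this notebook.

```python
def label_compound(compound, threshold=0.05):
    """Bucket a VADER compound score into a coarse sentiment label.

    The 0.05 cutoff is a commonly used convention, treated here as an assumption.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# A few compound scores of the kind this notebook produces
sample_scores = [0.4939, -0.0644, 0.0, -0.93]
labels = [label_compound(score) for score in sample_scores]
print(labels)
```

A label column like this is optional for the assignment, but it can make the written trend observations easier to support.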

In [41]:
Twitter_Collect = pd.DataFrame(Twitter_Collect)
Twitter_Collect.tail(10)

| | user_list | text_list | date_list | compound_list | positive_list | negative_list | neutral_list |
|---|---|---|---|---|---|---|---|
| 490 | The New York Times | Lufthansa will receive a bailout worth 9 billi... | Mon May 25 17:40:03 +0000 2020 | 0.4939 | 0.115 | 0.035 | 0.85 |
| 491 | The New York Times | Conventional wisdom has been that an increase ... | Mon May 25 17:30:07 +0000 2020 | 0.7789 | 0.239 | 0.0 | 0.761 |
| 492 | The New York Times | “He kept saying: ‘I’m going to get Covid, and ... | Mon May 25 17:01:05 +0000 2020 | -0.0644 | 0.14 | 0.163 | 0.697 |
| 493 | The New York Times | "The virus has disrupted so much, but it has a... | Mon May 25 16:44:02 +0000 2020 | 0.1154 | 0.086 | 0.075 | 0.839 |
| 494 | The New York Times | Joe Biden, who has been campaigning from his h... | Mon May 25 16:20:14 +0000 2020 | -0.6249 | 0.0 | 0.102 | 0.898 |
| 495 | The New York Times | “The government has failed our old people.”\n\... | Mon May 25 16:01:02 +0000 2020 | -0.93 | 0.0 | 0.261 | 0.739 |
| 496 | The New York Times | A short-lived mission uncovered by U.N. invest... | Mon May 25 15:46:04 +0000 2020 | -0.6808 | 0.046 | 0.187 | 0.767 |
| 497 | The New York Times | In Opinion\n\n"The cultural narrative that bla... | Mon May 25 15:30:07 +0000 2020 | -0.8625 | 0.0 | 0.22 | 0.78 |
| 498 | The New York Times | President Trump threatened to pull the Republi... | Mon May 25 15:15:06 +0000 2020 | -0.4588 | 0.0 | 0.077 | 0.923 |
| 499 | The New York Times | People across the United States took a varied ... | Mon May 25 14:55:48 +0000 2020 | 0.802 | 0.233 | 0.0 | 0.767 |


In [49]:
Twitter_Collect.to_csv("Twitter_Collect.csv")
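
One detail worth knowing about the export above: by default `to_csv` also writes the row index, and reading that file back gives a column labelled `Unnamed: 0`. The sketch below shows the difference on a small stand-in frame (not the real tweet data), writing to an in-memory buffer instead of a file on disk.

```python
import io

import pandas as pd

# Small stand-in frame; the column names mirror the notebook's DataFrame
frame = pd.DataFrame({"user_list": ["BBC", "CNN"], "compound_list": [0.0, 0.7906]})

# Default behaviour: the index is written as an unnamed first column
with_index = io.StringIO()
frame.to_csv(with_index)
with_index.seek(0)
reloaded_with_index = pd.read_csv(with_index)

# index=False leaves the index out of the file
without_index = io.StringIO()
frame.to_csv(without_index, index=False)
without_index.seek(0)
reloaded_without_index = pd.read_csv(without_index)

print(list(reloaded_with_index.columns))
print(list(reloaded_without_index.columns))
```

Whether to keep the index in the CSV is a judgment call; dropping it is only cleaner when the row numbers carry no meaning.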

In [55]:
Twitter_Collect.dtypes

user_list         object
text_list         object
date_list         object
compound_list    float64
positive_list    float64
negative_list    float64
neutral_list     float64
dtype: object

In [59]:
#Convert time from object to a date for future calcs.
Twitter_Collect['date_list'] = pd.to_datetime(Twitter_Collect.date_list)
Twitter_Collect.head(10)
# Twitter_Collect.dtypes

| | user_list | text_list | date_list | compound_list | positive_list | negative_list | neutral_list |
|---|---|---|---|---|---|---|---|
| 0 | BBC | There are all sorts of hidden meanings tucked ... | 2020-05-26 16:01:00+00:00 | 0.0 | 0.0 | 0.0 | 1.0 |
| 1 | BBC | This gin distillery is doing something pretty ... | 2020-05-26 13:01:00+00:00 | 0.7906 | 0.412 | 0.0 | 0.588 |
| 2 | BBC | RT @BBCTheOneShow: Have you got any space rela... | 2020-05-26 12:40:41+00:00 | 0.0 | 0.0 | 0.0 | 1.0 |
| 3 | BBC | From #AnimalCrossing to #Minecraft and #Fortni... | 2020-05-26 12:01:00+00:00 | 0.7003 | 0.228 | 0.044 | 0.728 |
| 4 | BBC | Frontman of one of the biggest bands in the wo... | 2020-05-26 08:01:00+00:00 | 0.4329 | 0.097 | 0.042 | 0.862 |
| 5 | BBC | A long time ago (40 years) in a galaxy far, fa... | 2020-05-25 18:01:00+00:00 | 0.2481 | 0.107 | 0.06 | 0.833 |
| 6 | BBC | #Paddington2 = perfect #BankHoliday film! 🙌❤️... | 2020-05-25 17:01:00+00:00 | 0.6114 | 0.174 | 0.0 | 0.826 |
| 7 | BBC | This is like a computer game... mesmerising sk... | 2020-05-25 16:01:00+00:00 | 0.4199 | 0.202 | 0.0 | 0.798 |
| 8 | BBC | Countries around the world are considering whe... | 2020-05-25 11:01:00+00:00 | 0.1901 | 0.068 | 0.0 | 0.932 |
| 9 | BBC | When The Empire Strikes Back was released 40 y... | 2020-05-25 11:01:00+00:00 | -0.0516 | 0.062 | 0.068 | 0.87 |
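
The conversion above lets pandas infer the date format. A minimal sketch, assuming the `created_at` strings always follow the layout seen in this notebook's output (for example `Mon May 25 17:40:03 +0000 2020`), passes an explicit `format` instead. The constant name `TWITTER_DATE_FORMAT` is made up for illustration.

```python
import pandas as pd

# Two raw date strings copied from the output shown earlier in this notebook
raw_dates = pd.Series([
    "Mon May 25 17:40:03 +0000 2020",
    "Mon May 25 14:55:48 +0000 2020",
])

# Weekday, month name, day, time, UTC offset, year
TWITTER_DATE_FORMAT = "%a %b %d %H:%M:%S %z %Y"

parsed = pd.to_datetime(raw_dates, format=TWITTER_DATE_FORMAT)
gap = parsed.iloc[0] - parsed.iloc[1]  # time between the two tweets
print(parsed)
print(gap)
```

An explicit format fails loudly if a string does not match, which can be easier to debug than a silently mis-parsed date.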


In [61]:
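# Reference point: midnight UTC on Tuesday, 2020-05-26 (the day of analysis)
tuesday = pd.Timestamp('2020-05-26 00:00:00+00:00')

# Age is each tweet's offset from that reference point (negative for earlier days)
Twitter_Collect['Age'] = Twitter_Collect['date_list'] - tuesday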
Twitter_Collect.head(10)

| | user_list | text_list | date_list | compound_list | positive_list | negative_list | neutral_list | Age |
|---|---|---|---|---|---|---|---|---|
| 0 | BBC | There are all sorts of hidden meanings tucked ... | 2020-05-26 16:01:00+00:00 | 0.0 | 0.0 | 0.0 | 1.0 | 16:01:00 |
| 1 | BBC | This gin distillery is doing something pretty ... | 2020-05-26 13:01:00+00:00 | 0.7906 | 0.412 | 0.0 | 0.588 | 13:01:00 |
| 2 | BBC | RT @BBCTheOneShow: Have you got any space rela... | 2020-05-26 12:40:41+00:00 | 0.0 | 0.0 | 0.0 | 1.0 | 12:40:41 |
| 3 | BBC | From #AnimalCrossing to #Minecraft and #Fortni... | 2020-05-26 12:01:00+00:00 | 0.7003 | 0.228 | 0.044 | 0.728 | 12:01:00 |
| 4 | BBC | Frontman of one of the biggest bands in the wo... | 2020-05-26 08:01:00+00:00 | 0.4329 | 0.097 | 0.042 | 0.862 | 08:01:00 |
| 5 | BBC | A long time ago (40 years) in a galaxy far, fa... | 2020-05-25 18:01:00+00:00 | 0.2481 | 0.107 | 0.06 | 0.833 | -1 days +18:01:00 |
| 6 | BBC | #Paddington2 = perfect #BankHoliday film! 🙌❤️... | 2020-05-25 17:01:00+00:00 | 0.6114 | 0.174 | 0.0 | 0.826 | -1 days +17:01:00 |
| 7 | BBC | This is like a computer game... mesmerising sk... | 2020-05-25 16:01:00+00:00 | 0.4199 | 0.202 | 0.0 | 0.798 | -1 days +16:01:00 |
| 8 | BBC | Countries around the world are considering whe... | 2020-05-25 11:01:00+00:00 | 0.1901 | 0.068 | 0.0 | 0.932 | -1 days +11:01:00 |
| 9 | BBC | When The Empire Strikes Back was released 40 y... | 2020-05-25 11:01:00+00:00 | -0.0516 | 0.062 | 0.068 | 0.87 | -1 days +11:01:00 |
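
The assignment asks for the plot points to be sorted by relative timestamp. One possible approach, assuming the `user_list` and `date_list` columns used in this notebook, is to number each outlet's tweets from newest to oldest. The column name `tweets_ago` is invented for this sketch, and the rows below are stand-in data.

```python
import pandas as pd

# Stand-in rows; the real frame would hold 100 tweets per outlet
frame = pd.DataFrame({
    "user_list": ["BBC", "BBC", "BBC", "CNN", "CNN"],
    "date_list": pd.to_datetime([
        "2020-05-26 16:01:00+00:00",
        "2020-05-25 18:01:00+00:00",
        "2020-05-26 13:01:00+00:00",
        "2020-05-26 15:00:00+00:00",
        "2020-05-26 15:30:00+00:00",
    ]),
})

# Newest tweet first within each outlet
frame = frame.sort_values(["user_list", "date_list"], ascending=[True, False])

# cumcount numbers rows 0, 1, 2, ... inside each group, so add 1
frame["tweets_ago"] = frame.groupby("user_list").cumcount() + 1
print(frame)
```

A per-outlet counter like this puts every outlet on the same x-axis scale, which a raw timestamp or the `Age` column would not, since outlets tweet at different rates.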


In [63]:
Twitter_Collect.count()

user_list        500
text_list        500
date_list        500
compound_list    500
positive_list    500
negative_list    500
neutral_list     500
Age              500
dtype: int64
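
With every column fully populated, the overall sentiment per outlet for the bar plot could be summarised as the mean compound score. This is a small sketch on four stand-in rows whose values are copied from the outputs above. Using the mean as the aggregate is an assumption, since the assignment only says to aggregate the compound scores.

```python
import pandas as pd

# Four stand-in rows; the real frame would hold 100 tweets per outlet
frame = pd.DataFrame({
    "user_list": ["BBC", "BBC", "The New York Times", "The New York Times"],
    "compound_list": [0.0, 0.7906, 0.4939, -0.93],
})

# One mean compound score per outlet, indexed by outlet name
mean_compound = frame.groupby("user_list")["compound_list"].mean()
print(mean_compound)
```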

## Need help with the plots. Turning in what I have now. Will continue working and submit an update.
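
For the two plots the assignment calls for, one possible starting point is sketched below with Matplotlib. It runs on stand-in data, and it assumes a hypothetical `tweets_ago` column that counts each outlet's tweets from the newest (1) to the oldest. The file names, the temporary output folder, and the `analysis_date` string are placeholders, not values from this notebook.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Stand-in data; "tweets_ago" is a hypothetical per-outlet counter (1 = newest)
frame = pd.DataFrame({
    "user_list": ["BBC", "BBC", "BBC", "CNN", "CNN", "CNN"],
    "tweets_ago": [1, 2, 3, 1, 2, 3],
    "compound_list": [0.0, 0.7906, 0.7003, -0.4588, 0.802, -0.0644],
})
analysis_date = "05/26/2020"  # placeholder date for the plot titles

# Scatter plot: one series (and colour) per outlet
scatter_fig, scatter_ax = plt.subplots()
for outlet, group in frame.groupby("user_list"):
    scatter_ax.scatter(group["tweets_ago"], group["compound_list"],
                       label=outlet, alpha=0.8, edgecolors="black")
scatter_ax.set_xlim(frame["tweets_ago"].max() + 1, 0)  # oldest tweets on the left
scatter_ax.set_ylim(-1.0, 1.0)
scatter_ax.set_title(f"Sentiment Analysis of Media Tweets ({analysis_date})")
scatter_ax.set_xlabel("Tweets Ago")
scatter_ax.set_ylabel("Tweet Polarity (compound)")
scatter_ax.legend(title="Media Sources")

# Bar plot: mean compound score per outlet
mean_compound = frame.groupby("user_list")["compound_list"].mean()
bar_fig, bar_ax = plt.subplots()
bar_ax.bar(list(mean_compound.index), list(mean_compound.values))
bar_ax.axhline(0, color="black", linewidth=0.8)  # mark the neutral line
bar_ax.set_title(f"Overall Media Sentiment Based on Twitter ({analysis_date})")
bar_ax.set_xlabel("Media Source")
bar_ax.set_ylabel("Mean Compound Score")

# Save both figures as PNG files (a temporary folder is used for this sketch)
output_dir = tempfile.mkdtemp()
scatter_path = os.path.join(output_dir, "sentiment_scatter.png")
bar_path = os.path.join(output_dir, "sentiment_bar.png")
scatter_fig.savefig(scatter_path)
bar_fig.savefig(bar_path)
```

In the notebook itself, the stand-in frame would be replaced by `Twitter_Collect`, and the PNG files would more likely be saved next to the notebook than in a temporary folder.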
