# The Feelings of the Crisis

When you read the news, the headline is normally the hook that keeps you reading. However, a negative title could lead you to skip an article if you don't want to end up in a bad mood. But does a negative title really mean negative content?

In this activity, you are tasked with checking whether a news title with a negative sentiment leads to negative content. You will use VADER sentiment analysis on the news articles that you previously downloaded in _The Voice of the Crisis_ activity.

In [1]:
# Initial imports
import os
from pathlib import Path
import pandas as pd
from newsapi import NewsApiClient
import nltk
from nltk.sentiment.vader import SentimentIntensityAnalyzer

get_ipython().run_line_magic("matplotlib", "inline")




## Instructions

For convenience, download the `vader_lexicon` first so that the VADER sentiment analyzer can be initialized.

In [2]:
# Download/Update the VADER Lexicon
nltk.download("vader_lexicon")

# Initialize the VADER sentiment analyzer
analyzer = SentimentIntensityAnalyzer()


[nltk_data] Downloading package vader_lexicon to
[nltk_data]     C:\Users\ooshy\AppData\Roaming\nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


### Load the News Articles from the CSV File as a DataFrame

Pick the CSV file you created in _The Voice of the Crisis_ activity and load it as a DataFrame. Remember to specify the `encoding='utf-8-sig'` parameter.

In [7]:
# Load news from CSV file

article = pd.read_csv(Path('crisis_news_en_fr.csv'), encoding='utf-8-sig')
article

Unnamed: 0,date,description,language,text,title
0,2020-08-31,What you can learn from a CEO who started a co...,en,"August\r\n31, 2020\r\n7 min read\r\nOpinions e...",12 Leadership Lessons with NerdWallet CEO Tim ...
1,2020-08-25,"Rachel Kyte, the dean of the Fletcher School a...",en,You cant isolate yourself from a pandemic and ...,The Pandemic’s Economic Crisis Calls for a Gre...
2,2020-08-28,It’s been a week bookended with bad financial ...,en,Its been a week bookended with bad financial n...,Maybe It's Time to Retire the Phrase 'Big Oil'
3,2020-08-12,<ul>\n<li>Billionaire investor Carl Icahn made...,en,Brendan McDermid/Reuters\r\nBillionaire invest...,Carl Icahn netted $1.3 billion from betting ag...
4,2020-08-15,As the pandemic wreaks havoc on public transit...,en,"New Orleans, like most American cities, has se...",Public Transit Cuts Felt Deepest in Low-Income...
5,2020-08-19,COVID-19 caught us all off guard. While many b...,en,COVID-19 caught us all off guard. While many b...,Center-out management: The new leadership styl...
6,2020-08-12,ONS confirms recession as GDP falls 20.4% as C...,en,Britains economy has been officially confirmed...,UK economy plunges into deepest slump since re...
7,2020-09-07,Ireland's gross domestic product fell by 6.1% ...,en,By Reuters Staff\r\nStatues depicting the Iris...,Ireland enters shallower recession than most t...
8,2020-08-11,Ms. Harris sometimes struggled to clearly defi...,en,Ms. Harris later put out her own health care p...,"Kamala Harris on the Issues: Race, Policing, H..."
9,2020-08-17,"Even before the pandemic, America’s colleges a...",en,"MIAMI (Reuters) - Even before the pandemic, Am...",Breakingviews - Guest view: U.S. universities ...
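The `Unnamed: 0` column in the output above is the DataFrame index that was written to the CSV file without a name. The following is a minimal sketch of how `index_col=0` restores it as the index on load; the sample data and the file name `sample_news.csv` are made up for illustration and are not part of the activity's files.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical sample data standing in for the crisis news articles
sample = pd.DataFrame(
    {"title": ["Crise économique", "Economic crisis"], "language": ["fr", "en"]}
)

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "sample_news.csv"

    # utf-8-sig writes a byte order mark (BOM) at the start of the file
    sample.to_csv(csv_path, encoding="utf-8-sig")

    # Without index_col, the saved index comes back as an "Unnamed: 0" column
    with_index_column = pd.read_csv(csv_path, encoding="utf-8-sig")

    # With index_col=0, the first CSV column becomes the index again
    restored = pd.read_csv(csv_path, encoding="utf-8-sig", index_col=0)

print(list(with_index_column.columns))
print(list(restored.columns))
```

Reading with the same `utf-8-sig` encoding also strips the BOM, so the first column name and accented characters come back unchanged.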


VADER is a lexicon- and rule-based sentiment model built for English text, so create a new DataFrame that contains only the news in English. You will learn how to score sentiment in multiple languages later.

In [9]:
# Fetch only English news
df_en = article[article['language']=='en']

df_en.head()

Unnamed: 0,date,description,language,text,title
0,2020-08-31,What you can learn from a CEO who started a co...,en,"August\r\n31, 2020\r\n7 min read\r\nOpinions e...",12 Leadership Lessons with NerdWallet CEO Tim ...
1,2020-08-25,"Rachel Kyte, the dean of the Fletcher School a...",en,You cant isolate yourself from a pandemic and ...,The Pandemic’s Economic Crisis Calls for a Gre...
2,2020-08-28,It’s been a week bookended with bad financial ...,en,Its been a week bookended with bad financial n...,Maybe It's Time to Retire the Phrase 'Big Oil'
3,2020-08-12,<ul>\n<li>Billionaire investor Carl Icahn made...,en,Brendan McDermid/Reuters\r\nBillionaire invest...,Carl Icahn netted $1.3 billion from betting ag...
4,2020-08-15,As the pandemic wreaks havoc on public transit...,en,"New Orleans, like most American cities, has se...",Public Transit Cuts Felt Deepest in Low-Income...


### Calculating VADER Sentiment Score for News Titles and Text

As you know, the `compound` score can be used to get a normalized sentiment score. In this section, you have to create a function called `get_sentiment(score)` that returns a normalized sentiment value for the `score` parameter, based on the rules you learned. This function should return `1` for positive sentiment, `-1` for negative sentiment, and `0` for neutral sentiment.

In [10]:
# Sentiment calculation based on compound score
def get_sentiment(score):
    """
    Calculates the sentiment based on the compound score.
    """
    result = 0 # Neutral by default
    if score >= 0.05:  # Positive
        result = 1
    elif score <= -0.05: # Negative
        result = -1
    return result
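The same thresholds can also be applied to a whole column of compound scores at once. The sketch below is one possible vectorized equivalent using `numpy.select`; the name `normalize_compound` and its `threshold` parameter are hypothetical and not part of the activity.

```python
import numpy as np
import pandas as pd


def normalize_compound(scores, threshold=0.05):
    """Map compound scores to 1 (positive), -1 (negative) or 0 (neutral)."""
    scores = pd.Series(scores, dtype=float)
    labels = np.select(
        [scores >= threshold, scores <= -threshold],  # conditions, checked in order
        [1, -1],                                      # label for each condition
        default=0,                                    # neutral otherwise
    )
    return pd.Series(labels, index=scores.index)


# Values at exactly +/-0.05 count as positive/negative, like in get_sentiment()
example = normalize_compound([0.6, 0.05, 0.0, -0.049, -0.05])
print(example.tolist())
```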



Use the VADER sentiment module from `NLTK` to score the sentiment of every news article title and text in English. You should append ten new columns to the English news DataFrame to store the results, as follows.

* Title's compound score
* Title's positive score
* Title's neutral score
* Title's negative score
* Title's normalized score (using the `get_sentiment()` function)
* Text's compound score
* Text's positive score
* Text's neutral score
* Text's negative score
* Text's normalized score (using the `get_sentiment()` function)

In [11]:
# Sentiment Scores Dictionaries

title_sent = {
    'title_compound':[],
    'title_pos':[],
    'title_neu' : [],
    'title_neg':[], 
    'title_sent': [],
}

text_sent = {
    'text_compound':[],
    'text_pos':[],
    'text_neu' : [],
    'text_neg':[], 
    'text_sent': [],
}


In [None]:
"""
To get the sentiment for the text and the title:

The VADER sentiment score for each news article's title and text is calculated within a for-loop.
The loop iterates across the df_en DataFrame using the iterrows() method to build the final results DataFrame.
"""

In [18]:
# Reset the score lists so that re-running this cell does not mix in stale
# values, which can end in "ValueError: arrays must all be same length"
title_sent = {key: [] for key in title_sent}
text_sent = {key: [] for key in text_sent}
scored_index = []  # Index labels of the rows that were scored

for index, row in df_en.iterrows():
    try:
        # Score both fields before appending so a failure skips the whole row
        title_sentiment = analyzer.polarity_scores(row['title'])
        text_sentiment = analyzer.polarity_scores(row['text'])
    except AttributeError:
        # Raised for non-string values such as a missing (NaN) title or text
        continue

    scored_index.append(index)

    # Title sentiment scores
    title_sent['title_compound'].append(title_sentiment['compound'])
    title_sent['title_pos'].append(title_sentiment['pos'])
    title_sent['title_neu'].append(title_sentiment['neu'])
    title_sent['title_neg'].append(title_sentiment['neg'])
    title_sent['title_sent'].append(get_sentiment(title_sentiment['compound']))

    # Text sentiment scores
    text_sent['text_compound'].append(text_sentiment['compound'])
    text_sent['text_pos'].append(text_sentiment['pos'])
    text_sent['text_neu'].append(text_sentiment['neu'])
    text_sent['text_neg'].append(text_sentiment['neg'])
    text_sent['text_sent'].append(get_sentiment(text_sentiment['compound']))

# Attaching sentiment columns to the news DataFrame, aligned on df_en's index
title_sentiment_df = pd.DataFrame(title_sent, index=scored_index)
text_sentiment_df = pd.DataFrame(text_sent, index=scored_index)
news_en_df = df_en.join(title_sentiment_df).join(text_sentiment_df)
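The per-row loop can also be written as a small reusable helper that returns one row of scores per article and keeps the original index, so the joined columns always line up. This is only a sketch under stated assumptions: `score_column`, `toy_scorer`, and `toy_label` are hypothetical names, and `toy_scorer` is a crude stand-in for `analyzer.polarity_scores` so that the example runs without NLTK. It does not reproduce VADER's scores.

```python
import pandas as pd


def score_column(frame, column, scorer, to_label):
    """Return one row of sentiment scores per value in frame[column]."""
    rows = []
    for value in frame[column]:
        # Non-string values (e.g. NaN or None) are scored as an empty string
        scores = scorer(value if isinstance(value, str) else "")
        rows.append({
            f"{column}_compound": scores["compound"],
            f"{column}_pos": scores["pos"],
            f"{column}_neu": scores["neu"],
            f"{column}_neg": scores["neg"],
            f"{column}_sent": to_label(scores["compound"]),
        })
    # Reuse the caller's index so a later join aligns row by row
    return pd.DataFrame(rows, index=frame.index)


def toy_scorer(text):
    """Hypothetical stand-in for analyzer.polarity_scores (NOT real VADER)."""
    lowered = text.lower()
    compound = 0.5 * ("good" in lowered) - 0.5 * ("bad" in lowered)
    return {
        "compound": compound,
        "pos": max(compound, 0.0),
        "neu": 1.0 - abs(compound),
        "neg": max(-compound, 0.0),
    }


def toy_label(score):
    """Same thresholds as the activity's get_sentiment() function."""
    return 1 if score >= 0.05 else -1 if score <= -0.05 else 0


# Made-up articles with a non-default index and one missing text
news = pd.DataFrame(
    {"title": ["Bad week for oil", "Good news"], "text": ["A good recovery", None]},
    index=[3, 7],
)
scored = (
    news.join(score_column(news, "title", toy_scorer, toy_label))
        .join(score_column(news, "text", toy_scorer, toy_label))
)
print(scored[["title_sent", "text_sent"]])
```

In the notebook, the call would presumably pass `analyzer.polarity_scores` and `get_sentiment` in place of the toy functions.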

### Analyzing Sentiment Results

How does the sentiment of the title differ from the sentiment of the text across news articles?

To answer this question, in this section you will create a bar chart contrasting the normalized sentiment of the title and the text of each news article. Use the built-in `plot()` method of the Pandas DataFrame to create a bar chart like the one below. Be aware that your chart might differ from this one because it is made from a different news DataFrame.

Finally, get the descriptive statistics from the English news DataFrame and discuss the analysis results with your partners.
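Besides the bar chart, a contingency table gives a compact view of how often title and text sentiment agree. The sketch below uses made-up normalized scores, so its numbers say nothing about the real articles; the column names `title_sent` and `text_sent` match the ones used in this activity.

```python
import pandas as pd

# Hypothetical normalized scores; real values come from the scored news DataFrame
toy = pd.DataFrame({
    "title_sent": [-1, -1, 0, 1, 1, -1],
    "text_sent": [-1, 1, 0, 1, 1, -1],
})

# Rows: title sentiment, columns: text sentiment, cells: article counts
contingency = pd.crosstab(toy["title_sent"], toy["text_sent"])

# Share of articles whose title and text carry the same label
agreement_rate = (toy["title_sent"] == toy["text_sent"]).mean()

# Among articles with a negative title, share whose text is also negative
negative_titles = toy[toy["title_sent"] == -1]
negative_match_rate = (negative_titles["text_sent"] == -1).mean()

# Descriptive statistics of both normalized columns
summary = toy.describe()

print(contingency)
print(agreement_rate, negative_match_rate)
```

On the real data, the bar chart itself could be drawn with `news_en_df[["title_sent", "text_sent"]].plot(kind="bar")`, which requires matplotlib.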