# Sentiment Analysis of Top Google News Articles for keyword bitcoin
---

This notebook outlines my process for analyzing the sentiment of each article in the list of articles collected for each date, from January 7, 2014 to December 12, 2017, for the keyword **bitcoin**.
> You can view how I scraped these articles in this [notebook](Google News Scraper.ipynb).

For the analysis and computation of sentiment scores, I decided to use Google's [Cloud Natural Language](https://cloud.google.com/natural-language/).

## Introduction

Sentiment analysis attempts to determine the overall attitude (positive or negative) expressed within the text. Sentiment is represented by numerical score and magnitude values.

### Sentiment Analysis Response Fields

A sample analyzeSentiment response to the [Gettysburg Address](https://en.wikipedia.org/wiki/Gettysburg_Address) is shown below:

![alt text](analyzeSentiment JSON response.png "analyzeSentiment JSON response")

These field values are described below:

* **documentSentiment** contains the overall sentiment of the document, which consists of the following fields:
    * **score** of the sentiment ranges between -1.0 (negative) and 1.0 (positive) and corresponds to the overall emotional leaning of the text.
    * **magnitude** indicates the overall strength of emotion (both positive and negative) within the given text, between **0.0** and **+inf**. Unlike **score**, **magnitude** is not normalized; each expression of emotion within the text (both positive and negative) contributes to the text's **magnitude** (so longer text blocks may have greater magnitudes).
* **language** contains the language of the document, either passed in the initial request, or automatically detected if absent.
* **sentences** contains a list of the sentences extracted from the original document. Each entry contains:
    * **sentiment**, the sentence-level sentiment values attached to that sentence, consisting of **score** and **magnitude** values as described above.
    
For the Gettysburg Address, a **score** of **0.2** indicates a document that is slightly positive in emotion, while a **magnitude** of **3.6** indicates a relatively emotional document given its small size (about a paragraph). Note that the first sentence of the Gettysburg Address has a very high positive **score** of **0.8**.
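
To make the field layout concrete, here is a minimal sketch that reads a response shaped like the one described above. The dictionary is hand-built from the values quoted in this section only, so a real response would contain more sentences and fields. The helper name `summarize_response` is hypothetical and not part of the API.

```python
# Hand-built stand-in for an analyzeSentiment JSON response.
# Only the values quoted in the text are filled in; everything else is omitted.
sample_response = {
    "documentSentiment": {"score": 0.2, "magnitude": 3.6},
    "language": "en",
    "sentences": [
        # First sentence of the Gettysburg Address; only its score is quoted above.
        {"sentiment": {"score": 0.8}},
    ],
}


def summarize_response(response):
    """Return (score, magnitude, language, per-sentence scores) from a response dict."""
    doc = response["documentSentiment"]
    sentence_scores = [s["sentiment"]["score"] for s in response.get("sentences", [])]
    return doc["score"], doc["magnitude"], response.get("language"), sentence_scores


score, magnitude, language, sentence_scores = summarize_response(sample_response)
print("Document score:", score)
print("Document magnitude:", magnitude)
print("Language:", language)
print("Sentence scores:", sentence_scores)
```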

### Interpreting sentiment analysis values

The **score** of a document's sentiment indicates the overall emotion of a document. The **magnitude** of a document's sentiment indicates how much emotional content is present within the document, and this value is often proportional to the length of the document.

It is important to note that the Natural Language API indicates differences between positive and negative emotion in a document, but does not identify specific positive and negative emotions. For example, "angry" and "sad" are both considered negative emotions. However, when the Natural Language API analyzes text that is considered "angry", or text that is considered "sad", the response only indicates that the sentiment in the text is negative, not "sad" or "angry".

A document with a neutral score (around **0.0**) may indicate a low-emotion document, or may indicate mixed emotions, with both high positive and negative values which cancel each other out. Generally, you can use **magnitude** values to disambiguate these cases, as truly neutral documents will have a low **magnitude** value, while mixed documents will have higher **magnitude** values.
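
The rule of thumb above can be sketched as a small helper. This is only an illustration: the function name `describe_sentiment` and both cutoff values are my own assumptions, not thresholds recommended by the Natural Language API, and they would need tuning for a real corpus.

```python
def describe_sentiment(score, magnitude, score_cutoff=0.25, magnitude_cutoff=1.0):
    """Label a document as positive, negative, neutral, or mixed.

    score_cutoff and magnitude_cutoff are illustrative assumptions,
    not values taken from the Natural Language API.
    """
    if score >= score_cutoff:
        return "positive"
    if score <= -score_cutoff:
        return "negative"
    # Score is close to 0.0: use magnitude to tell neutral from mixed.
    if magnitude >= magnitude_cutoff:
        return "mixed"
    return "neutral"


print(describe_sentiment(0.0, 0.1))   # low emotion overall
print(describe_sentiment(0.0, 4.0))   # strong emotions that cancel out
print(describe_sentiment(0.8, 3.0))
print(describe_sentiment(-0.6, 2.0))
```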

When comparing documents to each other (especially documents of different length), make sure to use the **magnitude** values to calibrate your **scores**, as they can help you gauge the relevant amount of emotional content.
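
One possible way to do this calibration, sketched below, is to divide **magnitude** by the number of sentences so that long and short documents land on a comparable scale. This is my own assumption about how to calibrate, not something the API prescribes. The function name `magnitude_per_sentence` is hypothetical, and the two example documents use made-up numbers purely for illustration.

```python
def magnitude_per_sentence(magnitude, sentence_count):
    """Average emotional content per sentence (an assumed calibration, not an API feature)."""
    if sentence_count <= 0:
        raise ValueError("sentence_count must be positive")
    return magnitude / sentence_count


# Made-up example values: same score, different lengths and magnitudes.
short_doc = {"score": 0.3, "magnitude": 2.0, "sentence_count": 4}
long_doc = {"score": 0.3, "magnitude": 6.0, "sentence_count": 40}

for name, doc in [("short", short_doc), ("long", long_doc)]:
    density = magnitude_per_sentence(doc["magnitude"], doc["sentence_count"])
    print(name, "score:", doc["score"], "magnitude per sentence:", round(density, 3))
```

Although the longer document has the larger raw magnitude, its emotional content per sentence is lower, which is the kind of difference that raw scores alone would hide.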

## Read the Data

In [None]:
import pandas as pd



In [None]:
# Imports the Google Cloud client library
#from google.cloud import language
#from google.cloud.language import enums
#from google.cloud.language import types

#import os
#os.environ["GOOGLE_APPLICATION_CREDENTIALS"]="My-First-Project-3377f41be4cf.json"

# Instantiates a client
#client = language.LanguageServiceClient()

# The text to analyze
#text = article.text
#document = language.types.Document(content=text,type='PLAIN_TEXT',)

# Detects the sentiment of the text
#sentiment = client.analyze_sentiment(document=document, encoding_type='UTF32',).document_sentiment
#print('Text: {}'.format(text))
#print('Sentiment: {}, {}'.format(sentiment.score, sentiment.magnitude))