## NLP Sentiment Analysis

### Introduction
The purpose of this notebook is to build a proof of concept (POC) for sentiment analysis using an existing NLP library. The target corpus consists of news articles published on the web by The New York Times. The POC aims to provide an example that qualitatively assesses the overall sentiment of each article and shows how well that assessment aligns with human interpretation. A stretch goal is to automatically generate a summary of each article, which a human reader would also evaluate qualitatively.

### Method
For the NLP library, VADER is used, through the `vaderSentiment` Python package.
The target corpus comes from the NYT website and consists of 3 articles, each with a different level of sentiment.
The human reader, or interpreter, is the author of this POC.
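
To make "level of sentiment" easier to compare against a human reading, the compound score that VADER reports (a value between -1 and +1) can be mapped to a coarse label. The sketch below is illustrative only: `label_compound` is a hypothetical helper that is not part of this notebook, and the ±0.05 cutoff is the conventional threshold suggested in the VADER documentation, which may not be the best choice for news text.

```python
def label_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a coarse label.

    The default threshold follows the commonly cited VADER convention;
    it is an assumption here, not something this notebook tuned.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Compound scores copied from the paragraph output later in this notebook
for score in (0.0, -0.4091, -0.4336):
    print(score, label_compound(score))
```

A label like this is only a convenience for side-by-side comparison; the raw compound score carries more information.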

### Code
The code block below loads the library needed for this project.

In [1]:
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

In [5]:
# Create the VADER sentiment analyzer
analyzer = SentimentIntensityAnalyzer()

# Read the whole article, then split it into paragraphs on blank lines
with open('article_1.txt', 'r') as f:
    article_1 = f.read()
article_1_paragraphs = article_1.split('\n\n')
print(article_1_paragraphs)

['The U.S. labor market has been less resilient than was initially believed. On Wednesday, the Labor Department said that the economy had added 818,000 fewer jobs than it had previously reported for the 12 months that ended in March.', 'The number means employers had overstated job growth by about 28 percent per month, especially in industries like hospitality and professional services. The downward revision adds to growing evidence of a weakening job market: The unemployment rate, though still relatively low, ticked up to 4.3 percent last month.', 'This adjusted number is an initial estimate of an annual revision, in which monthly employment figures from the Labor Department are reconciled with more accurate state unemployment reports. This year’s revision was unusually large: Over the previous decade, annual updates added or subtracted around 173,000 jobs, on average.', '“We’ve known that things on net were probably moving gradually in the wrong direction,” said Guy Berger, director 
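
Splitting on the exact string `'\n\n'` works for a cleanly formatted file, but it can yield empty pieces when the text contains extra blank lines, and any such piece would still be counted when paragraphs are averaged. A slightly more defensive split might look like the sketch below. `split_paragraphs` is a hypothetical helper, not part of the notebook, and the sample text is made up for illustration.

```python
import re


def split_paragraphs(text):
    """Split text on blank lines, dropping empty or whitespace-only pieces."""
    # A paragraph break is a newline, optional whitespace, then another newline
    pieces = re.split(r"\n\s*\n", text)
    return [p.strip() for p in pieces if p.strip()]


# Made-up sample with an extra blank line and a whitespace-only line
sample = "First paragraph.\n\n\nSecond paragraph.\n \nThird paragraph.\n"
print(split_paragraphs(sample))
```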

In [6]:
def analyzeArticle1():
    # Score each paragraph of article 1 with VADER, print its compound score,
    # then print the mean compound score across all paragraphs.
    # Note: VADER works best at the sentence level, although it can also
    # score single words or longer passages such as these paragraphs.
    paragraphSentiments = 0.0
    for paragraph in article_1_paragraphs:
        vs = analyzer.polarity_scores(paragraph)
        # Pad short paragraphs with '-' to 15 characters, then append the score
        print("{:-<15} {}".format(paragraph, str(vs['compound'])))
        paragraphSentiments += vs["compound"]
    print("AVERAGE SENTIMENT OF PARAGRAPHS: \t" + str(round(paragraphSentiments / len(article_1_paragraphs), 4)))
    print("\t")
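
Because VADER tends to work best on individual sentences, one possible variation is to score each sentence and average those scores instead of scoring whole paragraphs. The sketch below is an assumption-laden illustration, not the method used in this notebook: `split_sentences`, `average_sentence_score`, and `toy_score` are hypothetical names, and the regex splitter is naive (it will break on abbreviations such as "U.S."), so a proper sentence tokenizer would be preferable in practice. The scorer is passed in as a function so the sketch runs without VADER installed.

```python
import re


def split_sentences(paragraph):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip())
    return [s for s in parts if s]


def average_sentence_score(paragraphs, score_fn):
    """Average score_fn over every sentence in every paragraph."""
    scores = [score_fn(s) for p in paragraphs for s in split_sentences(p)]
    return round(sum(scores) / len(scores), 4) if scores else 0.0


# Stand-in scorer (hypothetical) so the sketch runs on its own.
# In the notebook one would pass something like:
#   lambda s: analyzer.polarity_scores(s)["compound"]
def toy_score(sentence):
    return -0.5 if "weakening" in sentence else 0.0


demo = ["The market is weakening. The rate ticked up."]
print(average_sentence_score(demo, toy_score))
```

Averaging per sentence gives long paragraphs more weight than short ones, which may or may not match how a human reader forms an overall impression.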

In [7]:
analyzeArticle1()

The U.S. labor market has been less resilient than was initially believed. On Wednesday, the Labor Department said that the economy had added 818,000 fewer jobs than it had previously reported for the 12 months that ended in March. 0.0
The number means employers had overstated job growth by about 28 percent per month, especially in industries like hospitality and professional services. The downward revision adds to growing evidence of a weakening job market: The unemployment rate, though still relatively low, ticked up to 4.3 percent last month. -0.4091
This adjusted number is an initial estimate of an annual revision, in which monthly employment figures from the Labor Department are reconciled with more accurate state unemployment reports. This year’s revision was unusually large: Over the previous decade, annual updates added or subtracted around 173,000 jobs, on average. -0.4336
“We’ve known that things on net were probably moving gradually in the wrong direction,” said Guy Berger, 