## Sentiment Analysis
Marissa Salas

This text is from The History of Sir Richard Calmady, a novel that includes sexually explicit material and a "female dandy". Two analyses were performed: one on the paragraph as a whole and one sentence by sentence. At the paragraph level, the result was mostly neutral (neu: 0.872), leaning positive (pos: 0.128). At the sentence level, the results include statements like this:
God of Heaven! compound: 0.69, neg: 0.0, neu: 0.149, pos: 0.851, 
which is scored as mostly positive (85%). But neither the paragraph-level nor the sentence-level scores really reflect the true sentiment of the paragraph as a whole, nor of this example sentence, "God of Heaven!".



In [1]:
# This demo is adapted from https://programminghistorian.org/en/lessons/sentiment-analysis

import nltk
nltk.download('vader_lexicon')
nltk.download('punkt')

[nltk_data] Downloading package vader_lexicon to
[nltk_data]     C:\Users\Streamer\AppData\Roaming\nltk_data...
[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\Streamer\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!


True

In [4]:
from nltk.sentiment.vader import SentimentIntensityAnalyzer
sid = SentimentIntensityAnalyzer()

# the variable 'message_text' now contains the text we will analyze.
message_text = '''The storm has ceased--all is still. The winds are hushed; the church
clock proclaims the hour of one: a hissing sound comes from the throat
of the hideous being, and he raises his long, gaunt arms--the lips move.
He advances. The girl places one small foot from the bed on to the
floor. She is unconsciously dragging the clothing with her. The door of
the room is in that direction--can she reach it? Has she power to
walk?--can she withdraw her eyes from the face of the intruder, and so
break the hideous charm? God of Heaven! is it real, or some dream so
like reality as to nearly overturn the judgment for ever?'''

In [5]:
print(message_text)

# Calling the polarity_scores method on sid and passing in the message_text outputs a dictionary with negative, neutral, positive, and compound scores for the input text
scores = sid.polarity_scores(message_text)
for key in sorted(scores):
    print('{0}: {1}, '.format(key, scores[key]), end='')

The storm has ceased--all is still. The winds are hushed; the church
clock proclaims the hour of one: a hissing sound comes from the throat
of the hideous being, and he raises his long, gaunt arms--the lips move.
He advances. The girl places one small foot from the bed on to the
floor. She is unconsciously dragging the clothing with her. The door of
the room is in that direction--can she reach it? Has she power to
walk?--can she withdraw her eyes from the face of the intruder, and so
break the hideous charm? God of Heaven! is it real, or some dream so
like reality as to nearly overturn the judgment for ever?
compound: 0.9286, neg: 0.0, neu: 0.872, pos: 0.128, 

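Context from the Programming Historian lesson cited above (the -0.3804 score and the e-mail it mentions refer to the lesson's own example, not to the passage analyzed here):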

VADER collects and scores negative, neutral, and positive words and features (and accounts for factors like negation along the way). The “neg”, “neu”, and “pos” values describe the fraction of weighted scores that fall into each category. VADER also sums all weighted scores to calculate a “compound” value normalized between -1 and 1; this value attempts to describe the overall affect of the entire text from strongly negative (-1) to strongly positive (1). In this case, the VADER analysis describes the passage as slightly-to-moderately negative (-0.3804). We can think of this value as estimating the overall impression of an average reader when considering the e-mail as a whole, despite some ambiguity and ambivalence along the way.
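
To make the "compound" calculation concrete, here is a minimal sketch of the arithmetic for the sentence "God of Heaven!" from the passage analyzed in this notebook. It assumes VADER's normalization x / sqrt(x² + 15), a boost of 0.292 per exclamation mark, and lexicon valences of 1.1 for "god" and 2.3 for "heaven". Those valences are assumptions chosen to be consistent with the scores reported in this notebook, so check them against `sid.lexicon` before relying on them. The helper names are hypothetical, and the sketch ignores negation, capitalization, and the other rules VADER applies.

```python
import math

ALPHA = 15             # normalization constant (assumed from VADER's source)
EXCLAIM_BOOST = 0.292  # emphasis per "!" (assumed; VADER counts at most 4)

# Assumed valences, for illustration only; any other word counts as neutral.
ASSUMED_VALENCES = {"god": 1.1, "heaven": 2.3}


def normalize(total, alpha=ALPHA):
    """Squash an unbounded valence sum into the range (-1, 1)."""
    return total / math.sqrt(total * total + alpha)


def sketch_scores(words, exclamations=0):
    """Approximate VADER's scores for a sentence with no negative words."""
    valences = [ASSUMED_VALENCES.get(w.lower(), 0.0) for w in words]
    emphasis = min(exclamations, 4) * EXCLAIM_BOOST

    # Compound: sum the valences, add emphasis to a positive sum, normalize.
    raw = sum(valences)
    if raw > 0:
        raw += emphasis
    compound = normalize(raw)

    # Proportions: a positive word contributes valence + 1, a neutral word 1.
    pos_sum = sum(v + 1 for v in valences if v > 0)
    if pos_sum > 0:
        pos_sum += emphasis
    neu_count = sum(1 for v in valences if v == 0)
    total = pos_sum + neu_count
    return {
        "compound": round(compound, 4),
        "neu": round(neu_count / total, 3),
        "pos": round(pos_sum / total, 3),
    }


scores = sketch_scores(["God", "of", "Heaven"], exclamations=1)
print(scores)  # {'compound': 0.69, 'neu': 0.149, 'pos': 0.851}
```

Under these assumptions the sketch reproduces the 0.69 compound score and the 0.851 positive share reported for this sentence: two words rated as positive plus an exclamation mark, with nothing in the arithmetic that could register the line as a cry of terror.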

In [7]:
# below is the sentiment analysis code rewritten for sentence-level analysis
# note the new module -- nltk.data, which loads the sentence tokenizer
import nltk.data
from nltk.sentiment.vader import SentimentIntensityAnalyzer

# Next, we initialize VADER so we can use it within our Python script
sid = SentimentIntensityAnalyzer()

# We will also load the pre-trained English Punkt sentence tokenizer and give it a short name

tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')

message_text = '''The storm has ceased--all is still. The winds are hushed; the church
clock proclaims the hour of one: a hissing sound comes from the throat
of the hideous being, and he raises his long, gaunt arms--the lips move.
He advances. The girl places one small foot from the bed on to the
floor. She is unconsciously dragging the clothing with her. The door of
the room is in that direction--can she reach it? Has she power to
walk?--can she withdraw her eyes from the face of the intruder, and so
break the hideous charm? God of Heaven! is it real, or some dream so
like reality as to nearly overturn the judgment for ever?'''
# The tokenize method breaks up the paragraph into a list of strings, one per sentence. In this example, note that the hard line breaks stay inside the sentences and that the tokenizer treats "God of Heaven!" as a sentence of its own, splitting it from the question that follows.

sentences = tokenizer.tokenize(message_text)

# We add the additional step of iterating through the list of sentences and calculating and printing polarity scores for each one.

for sentence in sentences:
    print(sentence)
    scores = sid.polarity_scores(sentence)
    for key in sorted(scores):
        print('{0}: {1}, '.format(key, scores[key]), end='')
    print()
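
Because the passage was pasted with hard line breaks, a sentence printed by the loop above can span several lines. One possible cleanup, sketched here with a hypothetical helper name, is to collapse every run of whitespace to a single space before tokenizing. This is an optional variation, not something the original cell does.

```python
def collapse_whitespace(text):
    """Replace each run of spaces, tabs, and newlines with one space."""
    return " ".join(text.split())


sample = "The door of\nthe room is in that direction--can she reach it?"
print(collapse_whitespace(sample))
# The door of the room is in that direction--can she reach it?
```

Passing `collapse_whitespace(message_text)` to `tokenizer.tokenize` would then yield single-line sentences. The results printed below were produced without this step.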


The storm has ceased--all is still.
compound: 0.0, neg: 0.0, neu: 1.0, pos: 0.0, 
The winds are hushed; the church
clock proclaims the hour of one: a hissing sound comes from the throat
of the hideous being, and he raises his long, gaunt arms--the lips move.
compound: 0.0, neg: 0.0, neu: 1.0, pos: 0.0, 
He advances.
compound: 0.0, neg: 0.0, neu: 1.0, pos: 0.0, 
The girl places one small foot from the bed on to the
floor.
compound: 0.0, neg: 0.0, neu: 1.0, pos: 0.0, 
She is unconsciously dragging the clothing with her.
compound: 0.0, neg: 0.0, neu: 1.0, pos: 0.0, 
The door of
the room is in that direction--can she reach it?
compound: 0.0258, neg: 0.0, neu: 0.909, pos: 0.091, 
Has she power to
walk?--can she withdraw her eyes from the face of the intruder, and so
break the hideous charm?
compound: 0.4696, neg: 0.0, neu: 0.867, pos: 0.133, 
God of Heaven!
compound: 0.69, neg: 0.0, neu: 0.149, pos: 0.851, 
is it real, or some dream so
like reality as to nearly overturn the judgment for eve