# Sentiment Analysis Assessment

## Task #1: Perform vector arithmetic on your own words
Write code that evaluates vector arithmetic on your own set of related words. The goal is to come as close to an expected word as possible. Please feel free to share success stories in the Q&A Forum for this section!

In [1]:
!python -m spacy download en_core_web_lg

Collecting en_core_web_lg==2.2.5
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_lg-2.2.5/en_core_web_lg-2.2.5.tar.gz (827.9 MB)
✔ Download and installation successful
You can now load the model via spacy.load('en_core_web_lg')


In [2]:
import spacy
nlp = spacy.load('en_core_web_lg')

In [3]:
word1 = nlp.vocab['tiger'].vector
word2 = nlp.vocab['leopard'].vector
word3 = nlp.vocab['cheetah'].vector

In [4]:
from scipy import spatial

cosine_similarity = lambda x, y: 1 - spatial.distance.cosine(x, y)
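
For reference, `spatial.distance.cosine` returns a cosine *distance*, which is why the lambda above subtracts it from 1. The same similarity can be written directly with NumPy, as in the minimal sketch below. The helper name `cosine_similarity_np` is made up for illustration and is not part of the assessment. The sketch assumes neither vector is all zeros, since that would divide by zero.

```python
import numpy as np

def cosine_similarity_np(x, y):
    """Cosine similarity: dot product divided by the product of the vector norms.

    Hypothetical helper for illustration; assumes neither input is a zero vector.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

same = cosine_similarity_np([1, 0], [1, 0])         # vectors pointing the same way
orthogonal = cosine_similarity_np([1, 0], [0, 1])   # vectors at right angles
opposite = cosine_similarity_np([1, 0], [-1, 0])    # vectors pointing opposite ways
print(same, orthogonal, opposite)
```

Values near 1 mean the two vectors point in almost the same direction, values near 0 mean they are unrelated, and negative values mean they point in opposing directions.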

In [5]:
new_vector = word1 - word2 + word3

In [6]:
computed_similarities = []

for word in nlp.vocab:
    # Keep only lowercase alphabetic tokens that actually have a vector
    if word.has_vector and word.is_lower and word.is_alpha:
        similarity = cosine_similarity(new_vector, word.vector)
        computed_similarities.append((word, similarity))

# Sort from most to least similar
computed_similarities = sorted(computed_similarities, key=lambda item: -item[1])

print([w[0].text for w in computed_similarities[:10]])

['tiger', 'cheetah', 'tigers', 'panther', 'cub', 'rhino', 'lion', 'elephant', 'cubs', 'cheetahs']


### CHALLENGE: Write a function that takes in 3 strings, performs `a - b + c` arithmetic, and returns a top-ten result

In [7]:
def vector_math(a, b, c):
    """Return the ten vocabulary words closest to vector(a) - vector(b) + vector(c)."""
    new_vector = nlp.vocab[a].vector - nlp.vocab[b].vector + nlp.vocab[c].vector
    computed_similarities = []

    for word in nlp.vocab:
        # Keep only lowercase alphabetic tokens that actually have a vector
        if word.has_vector and word.is_lower and word.is_alpha:
            similarity = cosine_similarity(new_vector, word.vector)
            computed_similarities.append((word, similarity))

    # Sort from most to least similar
    computed_similarities = sorted(computed_similarities, key=lambda item: -item[1])

    return [w[0].text for w in computed_similarities[:10]]

In [8]:
vector_math('lion','man','woman')

['lion',
 'lioness',
 'leopard',
 'tiger',
 'lions',
 'elephant',
 'cheetah',
 'panther',
 'woman',
 'giraffe']
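
Both result lists above contain the query words themselves ('tiger' and 'cheetah' in the first, 'lion' and 'woman' in the second), because the combined vector stays close to its inputs. One common refinement is to skip the input words when ranking. The sketch below shows the idea on a tiny, made-up set of 2-D vectors so that it runs without downloading a model. The `toy_vectors` values and the `toy_vector_math` name are hypothetical and were chosen only so the arithmetic is easy to follow.

```python
import numpy as np

# Hypothetical 2-D "embeddings", invented for illustration only
toy_vectors = {
    'king':  np.array([1.0, 1.0]),
    'queen': np.array([1.0, -1.0]),
    'man':   np.array([0.2, 1.0]),
    'woman': np.array([0.2, -1.0]),
    'apple': np.array([-1.0, 0.1]),
}

def toy_vector_math(a, b, c, vectors, top_n=10):
    """Rank words by cosine similarity to a - b + c, skipping the three inputs."""
    target = vectors[a] - vectors[b] + vectors[c]
    scored = []
    for word, vec in vectors.items():
        if word in (a, b, c):
            continue  # leave the query words out of the ranking
        sim = np.dot(target, vec) / (np.linalg.norm(target) * np.linalg.norm(vec))
        scored.append((word, float(sim)))
    # Most similar first
    scored.sort(key=lambda item: item[1], reverse=True)
    return [word for word, _ in scored[:top_n]]

result = toy_vector_math('king', 'man', 'woman', toy_vectors)
print(result)
```

The same `if word in (a, b, c): continue` style of check could be added to the loop in `vector_math`, comparing against `word.text`, if the inputs should not appear in the top ten.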

## Task #2: Perform VADER Sentiment Analysis on your own review
Write code that returns a set of SentimentIntensityAnalyzer polarity scores based on your own written review.

In [9]:
import nltk
nltk.download('vader_lexicon')

[nltk_data] Downloading package vader_lexicon to /root/nltk_data...
[nltk_data]   Package vader_lexicon is already up-to-date!


True

In [10]:
# twython is optional: installing it avoids an NLTK warning about its twitter package
!pip install twython

# Import SentimentIntensityAnalyzer and create an sid object
from nltk.sentiment.vader import SentimentIntensityAnalyzer

sid = SentimentIntensityAnalyzer()



In [11]:
from prettytable import PrettyTable

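def PolarityScores(doc):
    # Score the text passed in as `doc` rather than the global `review`
    scores = sid.polarity_scores(doc)
    # One column per score name (neg, neu, pos, compound), one row of values
    t = PrettyTable(list(scores.keys()))
    t.add_row(list(scores.values()))
    print(t)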

In [12]:
review = 'This movie portrayed real people, and was based on actual events.'

PolarityScores(review)

+-----+-----+-----+----------+
| neg | neu | pos | compound |
+-----+-----+-----+----------+
| 0.0 | 1.0 | 0.0 |   0.0    |
+-----+-----+-----+----------+


### CHALLENGE: Write a function that takes in a review and returns a score of "Positive", "Negative" or "Neutral"

In [13]:
def review_rating(string):
    scores = sid.polarity_scores(string)
    if scores['compound'] == 0:
        return 'Neutral'
    elif scores['compound'] > 0:
        return 'Positive'
    else:
        return 'Negative'

In [14]:
review_rating(review)

'Neutral'
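
The function above treats only an exact compound score of 0 as neutral, so a review with a score such as 0.01 would count as positive. A commonly cited convention for VADER is to use a small band around zero, typically ±0.05. The sketch below applies that rule to a compound score directly, so it can be tried without loading the lexicon. The name `rating_from_compound` and the default `threshold` parameter are assumptions made for illustration.

```python
def rating_from_compound(compound, threshold=0.05):
    """Map a VADER compound score to a label using a neutral band around zero.

    Hypothetical helper: the 0.05 default follows a commonly used convention,
    and can be tuned for a particular dataset.
    """
    if compound >= threshold:
        return 'Positive'
    if compound <= -threshold:
        return 'Negative'
    return 'Neutral'

# Sample compound scores, made up for illustration
labels = [rating_from_compound(score) for score in (0.6, 0.01, 0.0, -0.02, -0.4)]
print(labels)
```

To use it with a review, pass in `sid.polarity_scores(text)['compound']`.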