# Sentiment Analysis Assessment - Solution

## Task #1: Perform vector arithmetic on your own words
Write code that evaluates vector arithmetic on your own set of related words. The goal is to come as close to an expected word as possible. Please feel free to share success stories in the Q&A Forum for this section!

In [1]:
# Import spaCy and load the language library. Remember to use a larger model!

import spacy
nlp = spacy.load('en_core_web_md')

In [2]:
# Choose the words you wish to compare, and obtain their vectors

word1 = nlp('car').vector
word2 = nlp('land').vector
word3 = nlp('air').vector

In [3]:
# Import spatial and define a cosine_similarity function

from scipy import spatial

def cosine_similarity(x, y):
    return 1 - spatial.distance.cosine(x, y)
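
To make the quantity behind `cosine_similarity` concrete, here is a dependency-free sketch of the same calculation. The helper name `cosine_similarity_plain` and the sample vectors are made up for illustration; it assumes non-zero vectors of equal length.

```python
import math

def cosine_similarity_plain(x, y):
    """Cosine of the angle between two equal-length, non-zero vectors (hypothetical helper)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

# Vectors pointing the same way score close to 1, perpendicular ones close to 0,
# and opposite ones close to -1, regardless of their lengths.
same_direction = cosine_similarity_plain([1.0, 2.0], [2.0, 4.0])
orthogonal = cosine_similarity_plain([1.0, 0.0], [0.0, 1.0])
opposite = cosine_similarity_plain([1.0, 0.0], [-1.0, 0.0])
```

Because only direction matters, a long vector and a short vector can still be rated as highly similar.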


In [4]:
# Write an expression for vector arithmetic
# For example: new_vector = word1 - word2 + word3

new_vector = word1 - word2 + word3

In [5]:
# List the top ten closest vectors in the vocabulary to the result of the expression above
computed_similarities = []

for word in nlp.vocab:
    # Ignore words without vectors, mixed-case words, and non-alphabetic tokens:
    if word.has_vector and word.is_lower and word.is_alpha:
        similarity = cosine_similarity(new_vector, word.vector)
        computed_similarities.append((word, similarity))

# Sort by similarity, highest first
computed_similarities = sorted(computed_similarities, key=lambda item: item[1], reverse=True)

print([w[0].text for w in computed_similarities[:10]])

['air', 'car', 'when', 'you', 'space', 'got', 'it', 'i', 'h', 'somethin']
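
The filter-then-rank step can be tried in isolation on a tiny made-up vocabulary. This is only a sketch: the 2-D vectors are invented, `str.islower` and `str.isalpha` stand in for spaCy's `is_lower` and `is_alpha`, and `heapq.nlargest` is swapped in for the full sort since only the top few entries are needed. If the real results look noisy, one possible cause (as far as I understand recent spaCy releases) is that `nlp.vocab` is populated lazily, so the loop may only see words that have already been processed.

```python
import heapq
import math

def cosine(x, y):
    """Cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

# Invented 2-D vectors, for illustration only (not real embeddings).
toy_vocab = {
    "plane": [0.9, 0.1],
    "boat": [0.1, 0.9],
    "Jet": [1.0, 0.0],    # mixed case, so it is filtered out
    "r2d2": [1.0, 0.0],   # not purely alphabetic, so it is filtered out
    "glider": [0.8, 0.3],
}
query = [1.0, 0.0]

# Keep lowercase alphabetic words, then score each against the query vector.
candidates = [
    (word, cosine(query, vec))
    for word, vec in toy_vocab.items()
    if word.islower() and word.isalpha()
]

# Take the two best matches without sorting the whole list.
top_two = heapq.nlargest(2, candidates, key=lambda item: item[1])
top_words = [word for word, _ in top_two]
```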


### CHALLENGE: Write a function that takes in 3 strings, performs a-b+c arithmetic, and returns a top-ten result

In [6]:
def vector_math(a, b, c):
    """Compute a - b + c on word vectors and return the ten closest vocabulary words."""
    new_vector = nlp(a).vector - nlp(b).vector + nlp(c).vector
    computed_similarities = []

    for word in nlp.vocab:
        # Ignore words without vectors, mixed-case words, and non-alphabetic tokens
        if word.has_vector and word.is_lower and word.is_alpha:
            similarity = cosine_similarity(new_vector, word.vector)
            computed_similarities.append((word, similarity))

    computed_similarities = sorted(computed_similarities, key=lambda item: item[1], reverse=True)
    return [w[0].text for w in computed_similarities[:10]]
    

In [7]:
# Test the function on known words:
vector_math('king','man','woman')

['king', 'and', 'that', 'land', 'havin', 'where', 'she', 'they', 'woman', 'somethin']
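
Note that `king` itself tops the list: the result of a - b + c usually stays close to its own inputs. A common variation is to skip the query words when ranking. The sketch below shows that idea on invented 2-D vectors (axis 0 loosely "royalty", axis 1 loosely "gender"); `toy_analogy` and the vectors are hypothetical and not part of the original solution.

```python
import math

def cosine(x, y):
    """Cosine similarity for two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))

# Invented vectors, for illustration only (not real embeddings).
toy_vectors = {
    "king": [0.9, 0.9],
    "queen": [0.9, -0.9],
    "man": [0.1, 0.9],
    "woman": [0.1, -0.9],
    "apple": [-0.5, 0.0],
}

def toy_analogy(a, b, c, vectors, n=3):
    """Rank words by similarity to a - b + c, leaving out a, b and c themselves."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    scored = [
        (word, cosine(target, vec))
        for word, vec in vectors.items()
        if word not in (a, b, c)  # skip the query words
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return [word for word, _ in scored[:n]]

result = toy_analogy("king", "man", "woman", toy_vectors)
```

With real embeddings the outcome is far less tidy, but excluding the inputs often moves the more interesting candidates up the list.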


## Task #2: Perform VADER Sentiment Analysis on your own review
Write code that returns a set of SentimentIntensityAnalyzer polarity scores based on your own written review.

In [8]:
# Import SentimentIntensityAnalyzer and create an sid object

from nltk.sentiment.vader import SentimentIntensityAnalyzer

sid = SentimentIntensityAnalyzer()

In [9]:
# Write a review as one continuous string (multiple sentences are ok)
review = (
    'Okay, listen up fam, this movie is like the ultimate vibe check of the year! '
    'The characters are totally relatable, serving up major goals and keeping it real with their '
    'struggles and triumphs. From the lit soundtrack to the killer plot twists, this movie is '
    'straight-up fire and definitely worth adding to your watchlist ASAP!'
)

# Additional short reviews to test how near-neutral wording is scored

review2 = "It was okay."

review3 = "It was alright."

review4 = "It was mid."

In [10]:
# Obtain the sid scores for your review
sid.polarity_scores(review)

{'neg': 0.134, 'neu': 0.683, 'pos': 0.183, 'compound': 0.3365}

In [11]:
sid.polarity_scores(review2)

{'neg': 0.0, 'neu': 0.513, 'pos': 0.487, 'compound': 0.2263}

In [12]:
sid.polarity_scores(review3)

{'neg': 0.0, 'neu': 0.5, 'pos': 0.5, 'compound': 0.25}

In [13]:
sid.polarity_scores(review4)

{'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}
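
The `compound` values above can be read as a summed word valence squashed into the range -1 to 1. The sketch below assumes the normalization used in VADER's reference code, x / sqrt(x² + alpha) with alpha = 15. The valences 1.0 and 0.9 are back-solved from the outputs above rather than read from the lexicon, and `normalize_valence` is a hypothetical name.

```python
import math

def normalize_valence(total_valence, alpha=15):
    """Squash a summed valence into (-1, 1); alpha=15 is an assumption about VADER's default."""
    return total_valence / math.sqrt(total_valence ** 2 + alpha)

# Assumed summed valences (inferred from the scores above, not looked up):
compound_alright = round(normalize_valence(1.0), 4)  # matches review3's compound
compound_okay = round(normalize_valence(0.9), 4)     # matches review2's compound
compound_none = round(normalize_valence(0.0), 4)     # no scored words, as in review4
```

Under this reading, "mid" scores 0.0 simply because the lexicon assigns it no valence, not because the analyzer judged the review to be balanced.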

### CHALLENGE: Write a function that takes in a review and returns a score of "Positive", "Negative" or "Neutral"

In [14]:
def review_rating(text):
    """Label a review by the sign of its VADER compound score."""
    compound = sid.polarity_scores(text)['compound']

    if compound > 0:
        return "Positive"
    elif compound < 0:
        return "Negative"
    else:
        return "Neutral"
    

In [15]:
# Test the function on your review above:
review_rating(review)

'Positive'

In [16]:
review_rating(review2)

'Positive'

In [17]:
review_rating(review3)

'Positive'

In [18]:
review_rating(review4)

'Neutral'
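
`review_rating` treats any non-zero compound score as Positive or Negative, so a review that barely registers still gets a polar label. A commonly cited convention for VADER is a small neutral band of ±0.05 around zero. The sketch below applies that band to compound scores directly; `label_from_compound` and its `threshold` parameter are hypothetical, and the last two sample scores are invented.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound score to a label, treating scores within ±threshold as Neutral."""
    if compound >= threshold:
        return "Positive"
    if compound <= -threshold:
        return "Negative"
    return "Neutral"

# The first four scores come from the outputs above; 0.03 and -0.4 are made up.
sample_scores = (0.3365, 0.2263, 0.25, 0.0, 0.03, -0.4)
labels = [label_from_compound(score) for score in sample_scores]
```

For the four reviews in this notebook the labels are unchanged; the band only matters for scores very close to zero.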