___

<a href='https://www.instagram.com/lanlearning/'> <img src='../pimages/logosmall.png' width="100" height="100"/></a>
___
<center>Copyright LanLearning 2020</center>





# Welcome Back ~ Day 8! 

## Agenda: 
### 1. NLP: Word2Vec
### 2. Pandas5
### 3. Live Demo: Mariokart Dataset

# Icebreaker: What is your absolute dream job?

# NLP Day 3: Word2Vec

### Representing word relationships as vectors

Now we can give the computer a meaningful understanding of words. 

### Take a look below. 
If we pick a set of meanings, such as **Royalty, Masculinity, Femininity, Poor, Rich, Noble, and Age**, we can assign numbers called **weights** to each word in our vocabulary, one weight per meaning. 

<img src='../pimages/vecs.png' width="700"/>

### What is a weight? 
A weight is a number between **0 and 1** that shows magnitude, or in our case, **strength of connection**:
- 1 is a strong weight
- 0 is a weak weight

For example, **King** is a word which has some meaningful connection to **royalty, masculinity, rich**. 
- A king is **royal**, so the weight for that is near 1.
- A king is always **male**, so the masculinity weight is high. 
- A king may be **corrupt**, so the weight for noble is somewhat low.
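
To make the idea of weights concrete, here is a minimal sketch. The numbers are hypothetical (made up for this example, not taken from a trained model), and cosine similarity is an assumption on my part: the lesson does not name a way to compare words, but it is one common choice.

```python
import math

# Hypothetical weights, invented for illustration only.
# Order of meanings: royalty, masculinity, femininity, rich, noble
word_weights = {
    "king":  [0.95, 0.90, 0.05, 0.85, 0.40],
    "queen": [0.95, 0.05, 0.90, 0.85, 0.45],
    "man":   [0.05, 0.90, 0.05, 0.30, 0.30],
    "woman": [0.05, 0.05, 0.90, 0.30, 0.30],
}

def cosine_similarity(a, b):
    """Return how closely two vectors point in the same direction (1 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

king_queen = cosine_similarity(word_weights["king"], word_weights["queen"])
king_woman = cosine_similarity(word_weights["king"], word_weights["woman"])

# With these toy weights, king is more similar to queen than to woman,
# because king and queen share the royalty and rich weights.
print(round(king_queen, 2), round(king_woman, 2))
```

Each word is just a list of numbers, so comparing two words becomes comparing two lists.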

## Plotting these words:

### Note how many dimensions a word can have: 

<img src='../pimages/n.png' width="700"/>

### How to reduce dimensions to be able to visualize:

<img src='../pimages/dim.png' width="700"/>


## Plotting allows you to see relationships:
<img src='../pimages/wv.png' width="900"/>


### Start combining them using addition and subtraction:
<img src='../pimages/vetcs.png' width="500"/>
<img src='../pimages/add.png' width="500"/>
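
The addition and subtraction above can be sketched with plain Python lists. This is a simplified illustration, assuming hypothetical 3-number vectors and a squared-distance search; a real model has hundreds of dimensions, and libraries such as gensim typically rank candidates by cosine similarity instead.

```python
# Hypothetical weights: [royalty, masculinity, femininity]
vectors = {
    "king":   [0.9, 0.9, 0.1],
    "queen":  [0.9, 0.1, 0.9],
    "man":    [0.1, 0.9, 0.1],
    "woman":  [0.1, 0.1, 0.9],
    "prince": [0.7, 0.9, 0.1],
}

def add(a, b):
    """Add two vectors element by element."""
    return [x + y for x, y in zip(a, b)]

def subtract(a, b):
    """Subtract vector b from vector a element by element."""
    return [x - y for x, y in zip(a, b)]

def closest_word(target, exclude):
    """Return the word whose vector has the smallest squared distance to target."""
    best_word, best_distance = None, float("inf")
    for word, vec in vectors.items():
        if word in exclude:
            continue  # skip the words used to build the target
        distance = sum((x - y) ** 2 for x, y in zip(vec, target))
        if distance < best_distance:
            best_word, best_distance = word, distance
    return best_word

# king - man + woman
result = add(subtract(vectors["king"], vectors["man"]), vectors["woman"])
answer = closest_word(result, exclude={"king", "man", "woman"})
print(answer)  # queen
```

Removing "man" from "king" and adding "woman" leaves high royalty and high femininity, which lands on "queen" in this toy vocabulary.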

In [None]:
#coding word2vec
# the leading "!" runs the command in the shell from inside the notebook
!pip install gensim
!pip install --upgrade gensim

In [None]:
#import gensim

from gensim.test.utils import common_texts, get_tmpfile
from gensim.models import Word2Vec

In [None]:
import gensim

In [None]:
# load pre-trained vectors; the .bin file must already be downloaded into this folder
model = gensim.models.KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True) 

In [None]:
model.most_similar(positive=['king'], negative=['man'])

In [None]:
model.most_similar(positive=['school'], negative=['teacher'])

In [None]:
model.most_similar(positive=['fish'], negative=['shark'])

In [None]:
# first argument is positive, second is negative: lunch - dinner
model.most_similar(positive=['lunch'], negative=['dinner'])

In [None]:
model.doesnt_match("breakfast cereal dinner lunch".split())

In [None]:
model.doesnt_match("paris california oregon ohio".split())

# Day 2 of NLP Basics

<img src='../pimages/sentiment.jpeg' width="400"/>

## How does sentiment analysis work? 

### Sentiment Library:
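Each word in the sentiment library has a sentiment score attached to it. VADER's own lexicon rates words from -4 (most negative) to +4 (most positive), and the final compound score is rescaled to fall between -1 and 1.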

The **VADER** sentiment analysis algorithm matches each word in your text to the sentiment library. 


### Challenges with Sentiment Analysis:

## "Gatorade has an amazing color, but an awful taste." 

There is a lot of mixed sentiment in the phrase above: it contains the word "amazing" but also the word "awful". Computers take words at their literal meaning, so it may be hard at times for them to understand the true intent of a phrase.
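
A toy version of matching words against a sentiment library shows why this phrase is hard. The lexicon and its scores below are hypothetical (they are not VADER's real values), and `match_words` is an illustrative helper, not part of any library.

```python
import re

# Hypothetical mini sentiment library; scores are invented for illustration.
toy_lexicon = {"amazing": 0.8, "good": 0.5, "awful": -0.8, "bad": -0.6}

def match_words(text, lexicon):
    """Return (word, score) pairs for every word in text that is in the lexicon."""
    words = re.findall(r"[a-z']+", text.lower())
    return [(word, lexicon[word]) for word in words if word in lexicon]

phrase = "Gatorade has an amazing color, but an awful taste."
matches = match_words(phrase, toy_lexicon)
total = sum(score for _, score in matches)

print(matches)
print(total)
```

With these made-up scores the positive and negative words cancel out, so a simple sum says nothing about which opinion the writer cares about more.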



### Sentiment Scores
**Positive:** associated with a positive opinion/sentiment

**Negative:** associated with a negative opinion/sentiment

**Neutral:** associated with a neutral opinion/sentiment

**Compound:** the overall sentiment of the statement, combined into a single number from -1 (most negative) to 1 (most positive)
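
As a rough sketch of how a compound score can stay between -1 and 1, the function below squashes an unbounded sum of word scores. To my understanding this mirrors the normalization in VADER's reference implementation (with a constant of 15), but treat both the formula and the constant as assumptions and check the library source before relying on them.

```python
import math

def normalize(total, alpha=15):
    """Squash an unbounded sum of word scores into the range -1 to 1."""
    return total / math.sqrt(total * total + alpha)

# Larger sums move closer to -1 or 1 but never reach them.
for total in (-6.0, 0.0, 2.0, 6.0):
    print(total, round(normalize(total), 3))
```

A sum of zero stays at zero, and strongly positive or negative text approaches the ends of the scale.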



### How to use VADER?

We don't need to know how VADER was created. As long as we know what's going on, we're good to go.

#### The power of models and libraries is that we can use them without knowing the code behind them. It's important to know how they work and what's going on, but there is no need to fully understand the code. 


# Let's use VADER: 

In [None]:
# uncomment and run the line below if you need to install

#!pip install vaderSentiment

# you will NEED to restart jupyter after installing

In [None]:
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyser = SentimentIntensityAnalyzer()

# Documentation: 
https://www.nltk.org/api/nltk.sentiment.html

In [None]:
def sentiment_analyzer_scores(sentence):
    score = analyser.polarity_scores(sentence) #from documentation
    print('SENTENCE EXAMINED:')
    print('')
    print(sentence) 
    print('---------------------------------')
    print('SENTENCE SCORE:')
    print('')
    print(score) #simple print statement
    print('')

In [None]:
# snape_speech is defined in the Day 1 section further down; run that cell first
sentiment_analyzer_scores(snape_speech)

In [None]:
sentiment_analyzer_scores('This is so negative. Such a bad string, hopefully it gets a really horrible sentiment.')

In [None]:
example_string = 'This string is not good. It is not well made.'
sentiment_analyzer_scores(example_string)

In [None]:
example_string2 = 'The good part about Coca cola is that the taste is good.'
sentiment_analyzer_scores(example_string2)

In [None]:
milan_linkedin = "I'm a Data Science student who loves to teach and make use of the tools and technologies I learn about to make meaningful data-enabled discoveries. I enjoy learning about Sports Analytics, Decision-Making, Predictive Modeling, Education. I'm currently looking for opportunities to grow in these areas. In my free time, I enjoy playing basketball, adding visuals to my sports blog, teaching probability and data science, hanging out and having a fun time with friends and family! I love meeting someone new, so feel free to contact me! "

In [None]:
milan_linkedin

In [None]:
sentiment_analyzer_scores(milan_linkedin)

In [None]:
# where can sentiment analysis be useful?

In [None]:
# what are some other challenges with sentiment analysis? 

### Vader Sentiment Analysis is a Model

Most models are used in a similar way in code. 

You will need to import the model and simply use it with your pandas DataFrame. There is no need to know the code behind the model, but understanding how models work is important!

Link: https://medium.com/analytics-vidhya/simplifying-social-media-sentiment-analysis-using-vader-in-python-f9e6ec6fc52f


# NLP Pre-processing from Day1

### Example: Professor Snape's Speech from Harry Potter. 

<img src='../pimages/snape.jpg' width="400"/>

In [None]:
snape_speech = "There will be no foolish wand-waving or silly incantations in this class. As such, I don't expect many of you to appreciate the subtle science and exact art that is potion-making. However, for those select few who possess the predisposition, I can teach you how to bewitch the mind and ensnare the senses. I can tell you how to bottle fame, brew glory, and even put a stopper in death. Then again, maybe some of you have come to Hogwarts in possession of abilities so formidable that you feel confident enough to not pay attention!"
snape_speech

Code usually works with the speech as a series or **list** of words, not as one long piece of text. 

**Let's split the speech into a list of words.**

In [None]:
# want to get the words, series of words, from the text
snape_speech

In [None]:
example = ['There', 'will', 'be', 'no', 'foolish']
example

In [None]:
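import re
# remove everything that is not a letter, digit, underscore, or whitespace
# note: this also removes hyphens and apostrophes ("wand-waving" becomes "wandwaving")
snape_speech = re.sub(r'[^\w\s]','',snape_speech)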

In [None]:
snape_speech

In [None]:
# pre-processing the text, aka turning the text into a list of words! 
list_of_words = snape_speech.split()
list_of_words
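
The two pre-processing steps above (strip punctuation, then split) can be wrapped in one small function. This is a minimal sketch: `split_into_words` is a hypothetical helper name, and the sample sentence is made up from words in the speech.

```python
import re

def split_into_words(text):
    """Remove punctuation, then split on whitespace to get a list of words."""
    no_punctuation = re.sub(r"[^\w\s]", "", text)
    return no_punctuation.split()

sample = "I don't expect foolish wand-waving in this class."
words = split_into_words(sample)
print(words)
# ['I', 'dont', 'expect', 'foolish', 'wandwaving', 'in', 'this', 'class']
```

Notice that "don't" and "wand-waving" lose their punctuation and become "dont" and "wandwaving". If that matters for your analysis, one option is to replace hyphens with spaces before removing the rest of the punctuation.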
