# Sentiment Analysis

## Part 2: Word Embeddings and Neural Network

In this notebook, you will learn a powerful way to represent words numerically and apply it to a simple 2-layer network for classification.

**Outline**:

- Neural Network
- Word Embeddings


**Pipeline**

<img src="resources/pipeline.png" width="800px">

## Neural Network

In this section, we introduce some basics of neural networks and define a simple NN using TensorFlow.

### Logistic Regression

Two common ways to build a model in TensorFlow (Keras) are:

1. Define a new model class by subclassing `tf.keras.Model`. Override `__init__` and `call`.
2. Define a `Sequential` model and add layers one by one.

In [None]:
# import utils and set plt settings
import nlp_proj_utils2 as utils
import matplotlib.pyplot as plt
import numpy as np

%matplotlib inline
%config InlineBackend.figure_format='retina'

In [None]:
# Make and load Moon Dataset
train_x, test_x, train_y, test_y = utils.load_moon()

plt.scatter(
    train_x[:,0],     # first feature as x
    train_x[:,1],     # second feature as y
    c=train_y.T[0],   # label as color
    cmap=plt.cm.Spectral)

In [None]:
# Inspect the type and shape of the data
print('type of train and test', type(train_x))
print('shape of X', train_x.shape)
print('shape of Y', train_y.shape)

Generally speaking, a neural network is a more general form of logistic regression (LR). LR itself can be considered a **1-layer NN** (the input layer doesn't count).

<img src="resources/1-layer-nn.png">

<br>
<center>A 1-layer neural network: Logistic Regression</center>
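
To make the "1-layer NN" view concrete, here is a minimal NumPy sketch of what a single `Dense(1, activation='sigmoid')` layer computes: a weighted sum of the inputs followed by a sigmoid. The weights, bias, and inputs below are hypothetical values chosen only for illustration, not parameters learned by the model in this notebook.

```python
import numpy as np

def sigmoid(z):
    """Squash real values into the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def lr_forward(x, w, b):
    """1-layer NN: one weighted sum per sample, then a sigmoid."""
    return sigmoid(x @ w + b)

# Hypothetical parameters and inputs (illustration only)
w = np.array([[1.0], [-1.0]])   # shape (2, 1): two features, one output unit
b = np.array([0.0])
x = np.array([[0.0, 0.0],
              [2.0, 0.0]])      # two samples with two features each

probs = lr_forward(x, w, b)          # shape (2, 1): probability of class 1
labels = (probs > 0.5).astype(int)   # threshold at 0.5 to get hard labels
```

Because the only non-linearity is applied after a single weighted sum, the decision boundary `x @ w + b = 0` is always a straight line in the feature space.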

In [None]:
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.optimizers import Adam

In [None]:
lr_model = Sequential()
lr_model.add(Dense(1, input_dim=2, activation='sigmoid')) # output layer

lr_model.summary()

In [None]:
lr_model.compile(
    loss='binary_crossentropy',
    optimizer='sgd',
    metrics=['accuracy'], )

In [None]:
lr_history = lr_model.fit(
    train_x, 
    train_y, 
    epochs=500, # intentionally large, for demonstration
    validation_data=(test_x, test_y), )

In [None]:
utils.plot_history(lr_history, ['loss', 'val_loss'])

In [None]:
lr_model.evaluate(test_x, test_y)

In [None]:
utils.plot_decision_boundary(lr_model, test_x, test_y)

### Neural Network

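LR does not work well when the classes are not linearly separable. Its performance depends heavily on the input features, so feature engineering is essential if you are using LR. A neural network with a hidden layer can instead learn a non-linear decision boundary directly from the raw features.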

<img src="resources/2-layer-nn.png">

<br>
<center>A 2-layer neural network: 1 hidden layer + 1 output layer</center>
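
As a small illustration of why a hidden layer helps, the sketch below runs the forward pass of a 2-layer network in NumPy on the XOR problem, which is not linearly separable. The weights are hand-picked, hypothetical values (not trained), chosen so that a ReLU hidden layer followed by a sigmoid output reproduces XOR.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def two_layer_forward(x, W1, b1, W2, b2):
    """Hidden layer (ReLU) followed by an output layer (sigmoid)."""
    h = relu(x @ W1 + b1)
    return sigmoid(h @ W2 + b2)

# Hand-picked, hypothetical weights (illustration only, not trained)
W1 = np.array([[1.0, 1.0],
               [1.0, 1.0]])        # shape (2, 2): 2 inputs -> 2 hidden units
b1 = np.array([0.0, -1.0])
W2 = np.array([[10.0], [-20.0]])   # shape (2, 1): 2 hidden units -> 1 output
b2 = np.array([-5.0])

# The four XOR inputs
x = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

probs = two_layer_forward(x, W1, b1, W2, b2)
pred = (probs > 0.5).astype(int).ravel()   # XOR labels: [0, 1, 1, 0]
```

No single line separates the XOR classes, so a 1-layer model cannot fit them, while two hidden units are already enough here.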

In [None]:
def build_nn_model(input_dim, layers, output_dim):
    # Input layer
    X = Input(shape=(input_dim,))
    
    # Hidden layer(s)
    H = X
    for layer in layers:
        H = Dense(layer, activation='relu')(H)
    
    # Output layer
    activation_func = 'softmax' if output_dim > 1 else 'sigmoid'
    
    Y = Dense(output_dim, activation=activation_func)(H)
    return Model(inputs=X, outputs=Y)
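
The function above uses `softmax` when there is more than one output unit and `sigmoid` otherwise. As a rough sketch of what the softmax activation computes, the NumPy snippet below turns a row of raw scores into probabilities that sum to 1. The `logits` values are hypothetical.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax; subtracting the max keeps exp() numerically stable."""
    z = z - np.max(z, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

# Hypothetical raw scores for one sample and five classes
logits = np.array([[2.0, 1.0, 0.1, -1.0, 0.0]])

probs = softmax(logits)                      # shape (1, 5), each row sums to 1
label = int(np.argmax(probs, axis=-1)[0])    # index of the most likely class
```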

In [None]:
nn_model = build_nn_model(
    input_dim=2,
    layers=[8],
    output_dim=1
)
nn_model.summary()

In [None]:
nn_model.compile(
    loss='binary_crossentropy',
    optimizer=Adam(learning_rate=0.01),
    metrics=['accuracy'],
)

In [None]:
nn_history = nn_model.fit(
    train_x, 
    train_y, 
    epochs=500, 
    validation_data=(test_x, test_y), )

In [None]:
utils.plot_history(nn_history, ['loss', 'val_loss'])

In [None]:
nn_model.evaluate(test_x, test_y)

In [None]:
utils.plot_decision_boundary(nn_model, test_x, test_y)

## Word Embeddings

<img src="resources/word-vector.png" width="800">
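
A key property of word embeddings is that words with similar meanings tend to have vectors pointing in similar directions. The sketch below measures this with cosine similarity. The 3-dimensional vectors are made-up toy values for illustration. Real GloVe vectors, like the 50-dimensional ones used later in this notebook, are learned from text.

```python
import numpy as np

def cosine_similarity(u, v):
    """Cosine of the angle between u and v, in [-1, 1]."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy, hypothetical embeddings (illustration only)
toy_vecs = {
    "happy": np.array([0.9, 0.1, 0.0]),
    "glad":  np.array([0.8, 0.2, 0.1]),
    "ball":  np.array([0.0, 0.1, 0.9]),
}

sim_close = cosine_similarity(toy_vecs["happy"], toy_vecs["glad"])  # near 1
sim_far = cosine_similarity(toy_vecs["happy"], toy_vecs["ball"])    # near 0
```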

### Emoji Classifier

<img src="resources/emoji.png" width="800">

In [None]:
train_x, test_x, train_y, test_y = utils.load_emoji()

In [None]:
# Download and load word embeddings
# This util function returns two dicts: word_to_index and word_to_vec
# For now, we only need the second one
_, word_to_vec_map = utils.load_glove_vecs()

In [None]:
# Print the first 5 samples
for i in range(5):
    print(train_x[i], utils.label_to_emoji(train_y[i]))

In [None]:
# Convert output to one hot vector
train_y_oh = utils.convert_to_one_hot(train_y, 5)
test_y_oh = utils.convert_to_one_hot(test_y, 5)

print(train_y[0], "is converted into one hot", train_y_oh[0])
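
For reference, one-hot encoding can be sketched in a few lines of NumPy. This is an assumption about what a helper like `utils.convert_to_one_hot` does, not its actual implementation. The name `to_one_hot` is hypothetical.

```python
import numpy as np

def to_one_hot(labels, num_classes):
    """Map integer labels to rows of the identity matrix."""
    labels = np.asarray(labels, dtype=int).reshape(-1)
    return np.eye(num_classes)[labels]

one_hot = to_one_hot([0, 3, 4], num_classes=5)
# Each row has a single 1 at the position of its label
```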

In [None]:
avg = utils.sentence_to_avg("I like it", word_to_vec_map)
avg
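
The idea behind averaging is simple: look up the vector of each word in the sentence and take the element-wise mean, which gives one fixed-size vector per sentence regardless of its length. The sketch below shows this under stated assumptions: lowercase whitespace tokenization, unknown words skipped, and a zero vector when no word is known. The real `utils.sentence_to_avg` may differ in all of these details. The name `sentence_to_avg_sketch` and the toy vectors are hypothetical.

```python
import numpy as np

def sentence_to_avg_sketch(sentence, word_to_vec, dim):
    """Average the vectors of the known words in a sentence."""
    words = sentence.lower().split()
    vecs = [word_to_vec[w] for w in words if w in word_to_vec]
    if not vecs:
        # Assumption: fall back to a zero vector if no word is known
        return np.zeros(dim)
    return np.mean(vecs, axis=0)

# Toy 2-dimensional vectors (illustration only)
toy = {
    "i":    np.array([1.0, 0.0]),
    "like": np.array([0.0, 1.0]),
    "it":   np.array([1.0, 1.0]),
}

avg_vec = sentence_to_avg_sketch("I like it", toy, dim=2)
```

Note that averaging discards word order, so "dog bites man" and "man bites dog" map to the same vector.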

In [None]:
train_x = np.array(
    [utils.sentence_to_avg(x, word_to_vec_map) for x in train_x])

test_x = np.array(
    [utils.sentence_to_avg(x, word_to_vec_map) for x in test_x])

In [None]:
emoji_model = build_nn_model(
    input_dim=50, 
    layers=[50], 
    output_dim=5)

emoji_model.compile(
    loss='categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy'],
)

In [None]:
emoji_history = emoji_model.fit(
    train_x, 
    train_y_oh, 
    epochs=500, 
    shuffle=True, 
    validation_data=(test_x, test_y_oh), )

In [None]:
utils.plot_history(emoji_history, ['loss', 'val_loss'])

In [None]:
def pred_emoji(text):
    embedding = np.array([utils.sentence_to_avg(text, word_to_vec_map)]) # get embedding
    pred = emoji_model.predict(embedding) # predict; returns the probability of each class
    label = np.argmax(pred) # choose the one with largest probability as label
    return utils.label_to_emoji(label)

In [None]:
tests = [
    "i love you", 
    "it's horrible", 
    "funny lol", 
    "lets play with a ball", 
    "food is ready", 
    "i don't like it"]

for test in tests:
    print(test, pred_emoji(test))