# Sentiment Analysis P3

In this notebook, we will build a deep (2-layer) LSTM network in Keras on top of a fixed, pre-trained embedding layer.

<img src="resources/pipeline.png" width="800px">

## Still Emoji

In [None]:
# import 
import numpy as np
import nlp_proj_utils3 as utils
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Input, Dropout, LSTM, Activation, Embedding
from tensorflow.keras.preprocessing import sequence

import matplotlib.pyplot as plt
%matplotlib inline
%config InlineBackend.figure_format='retina'


np.random.seed(1)

In [None]:
train_x, test_x, train_y, test_y = utils.load_emoji()

**Load pretrained word embeddings**

Two dictionaries are loaded:

- `word_to_index`: map a word to its index in the vocabulary
    - Example:  `'word' -> 1234`

- `word_to_vec_map`: map a word to its embedding
    - Example: `'word' -> [0.1, 0.2, ..., 0.45]`

A custom embedding layer in Keras cannot take a dictionary as its weights: the pretrained embeddings have to be loaded as one big matrix. The index from `word_to_index` tells us which row of that matrix holds the vector for a given word.

### Word Embeddings & One Hot

In [None]:
word_to_index, word_to_vec_map = utils.load_glove_vecs()

In [None]:
utils.sentences_to_indices?

In [None]:
# Convert word to the index in vocabulary
utils.sentences_to_indices(
    np.array(["i like it", "i hate it"]),  # array of test sentences
    word_to_index, 
    max_len = 5)
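The implementation of `utils.sentences_to_indices` is not shown in this notebook. As a rough idea of what such a helper might do, here is a minimal sketch. The function name `sentences_to_indices_sketch`, the toy vocabulary, the zero padding, and the handling of unknown words are all assumptions made for illustration.

```python
import numpy as np

def sentences_to_indices_sketch(sentences, word_to_index, max_len):
    """Hypothetical stand-in for utils.sentences_to_indices (assumed behaviour)."""
    # Start from all zeros so that short sentences are padded with 0
    indices = np.zeros((len(sentences), max_len), dtype=np.int32)
    for i, sentence in enumerate(sentences):
        words = sentence.lower().split()[:max_len]  # truncate overly long sentences
        for j, word in enumerate(words):
            # Assumption: words missing from the vocabulary are left as 0
            indices[i, j] = word_to_index.get(word, 0)
    return indices

# Made-up vocabulary, only for this example
toy_vocab = {"hate": 1, "i": 2, "it": 3, "like": 4}
toy_indices = sentences_to_indices_sketch(["i like it", "i hate it"], toy_vocab, max_len=5)
print(toy_indices)
```

Each row has exactly `max_len` entries, which is what lets sentences of different lengths be stacked into one array.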

In [None]:
maxlen = max([len(x.split()) for x in train_x])
print('max number of words in a sentence:', maxlen)

In [None]:
# Convert training/testing features into index list
train_x = utils.sentences_to_indices(train_x, word_to_index, maxlen)
test_x = utils.sentences_to_indices(test_x, word_to_index, maxlen)

# Convert training/testing labels into one hot array
train_y = utils.convert_to_one_hot(train_y, C = 5)
test_y = utils.convert_to_one_hot(test_y, C = 5)

In [None]:
# Check to make sure the shape looks good
assert train_x.shape == (132, maxlen)
assert train_y.shape == (132, 5)
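The one-hot conversion of the labels can be pictured with plain NumPy. This is only a sketch of the idea: `one_hot_sketch` is a hypothetical name, and the real `utils.convert_to_one_hot` may be implemented differently.

```python
import numpy as np

def one_hot_sketch(labels, num_classes):
    """Hypothetical stand-in for utils.convert_to_one_hot."""
    labels = np.asarray(labels, dtype=int).reshape(-1)
    # Row k of the identity matrix is the one-hot vector for class k
    return np.eye(num_classes)[labels]

toy_one_hot = one_hot_sketch([0, 3, 4], num_classes=5)
print(toy_one_hot)
```

Every row contains a single 1 in the column of its class, which is the target format expected by a 5-way softmax output.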

### Embedding Layer

We need to build an embedding matrix where each row represents a word vector.

In [None]:
def pretrained_embedding_layer(word_to_index, word_to_vec_map):
    """
    Build and return a Keras Embedding Layer given word_to_vec mapping and word_to_index mapping
    
    Args:
        word_to_index (dict[str->int]): map from a word to its index in vocabulary
        word_to_vec_map (dict[str->np.ndarray]): map from a word to a vector with shape (N,) where N is the length of a word vector (50 in our case)

    Return:
        Keras.layers.Embedding: Embedding layer
    """
    
    # Keras requires input_dim to be larger than the largest word index, hence the + 1
    vocab_len = len(word_to_index) + 1  
    emb_dim = list(word_to_vec_map.values())[0].shape[0]
    
    # Initialize the embedding matrix as a numpy array of zeros of shape (vocab_len, dimensions of word vectors = emb_dim)
    emb_matrix = np.zeros((vocab_len, emb_dim))
    
    # Set each row "index" of the embedding matrix to be the word vector representation of the "index"th word of the vocabulary
    for word, index in word_to_index.items():
        emb_matrix[index, :] = word_to_vec_map[word]

    # Define the Keras embedding layer with the correct input/output sizes and freeze it by setting trainable=False.
    return Embedding(
        input_dim=vocab_len, 
        output_dim=emb_dim, 
        trainable=False,  # Indicating this is a pre-trained embedding 
        weights=[emb_matrix]
    )

For more information on how to define a pre-trained embedding layer in Keras, please refer to [this post](https://blog.keras.io/using-pre-trained-word-embeddings-in-a-keras-model.html).
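Conceptually, a frozen embedding layer is a row lookup into the embedding matrix. The NumPy sketch below illustrates that idea on made-up numbers. It is not how Keras implements the layer internally, and the toy matrix and batch are assumptions for the example.

```python
import numpy as np

# Toy embedding matrix: 4 rows of 3-dimensional vectors (index 0 is left as zeros)
toy_emb_matrix = np.array([
    [0.0, 0.0, 0.0],   # index 0: padding
    [0.1, 0.2, 0.3],   # index 1
    [0.4, 0.5, 0.6],   # index 2
    [0.7, 0.8, 0.9],   # index 3
])

# Two padded sentences of length 4, expressed as word indices
toy_batch = np.array([
    [1, 3, 0, 0],
    [2, 2, 1, 0],
])

# Integer-array indexing picks one matrix row per word index
toy_embedded = toy_emb_matrix[toy_batch]
print(toy_embedded.shape)  # (2, 4, 3)
```

The output has shape `(batch, sentence length, embedding dimension)`, which is the kind of 3-D input the first LSTM layer consumes.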

### Build the Model

<img src="resources/deep_lstm.png" style="width:700px;height:400px;"> <br>
<caption><center> A 2-layer LSTM sequence classifier. </center></caption>

In [None]:
def build_emoji_model(input_dim, word_to_index, word_to_vec_map):
    """
    Build and return the Keras model
    
    Args:
        input_dim (int): length of each input sequence (the maximum number of words in a sentence)
        word_to_vec_map (dict[str->np.ndarray]): map from a word to a vector with shape (N,) where N is the length of a word vector (50 in our case)
        word_to_index (dict[str->int]): map from a word to its index in vocabulary
    
    Returns:
        Keras.models.Model: 2-layer LSTM model
    """
    
    # Input layer
    sentence_indices = Input(shape=(input_dim,), dtype='int32')
    
    # Embedding layer
    embedding_layer = pretrained_embedding_layer(word_to_index, word_to_vec_map)
    embeddings = embedding_layer(sentence_indices)   
    
    # 2-layer LSTM
    X = LSTM(128, return_sequences=True, recurrent_dropout=0.5)(embeddings)  # N->N RNN
    X = Dropout(rate=0.8)(X)
    X = LSTM(128, recurrent_dropout=0.5)(X)  # N -> 1 RNN
    X = Dropout(rate=0.8)(X)
    X = Dense(5, activation='softmax')(X)
    
    # Create and return model
    model = Model(inputs=sentence_indices, outputs=X)
    
    return model
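A note on the dropout settings used above: in Keras, the `rate` argument of `Dropout` is the fraction of inputs that are zeroed during training, so `rate=0.8` keeps only about 20% of the LSTM outputs. The NumPy sketch below illustrates this with inverted dropout on a vector of ones. It is an illustration of the idea, not the Keras implementation, and `dropout_sketch` is a hypothetical helper.

```python
import numpy as np

def dropout_sketch(x, rate, rng):
    """Inverted dropout: zero roughly a `rate` fraction of entries and rescale the rest."""
    keep_mask = rng.random(x.shape) >= rate
    # Survivors are scaled by 1 / (1 - rate) so the expected value stays unchanged
    return x * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
activations = np.ones(10_000)
dropped = dropout_sketch(activations, rate=0.8, rng=rng)

# Fraction of entries that were zeroed; close to the requested rate
zero_fraction = np.mean(dropped == 0)
print(zero_fraction)
```

If the model underfits, lowering `rate` is one of the first things worth trying.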

In [None]:
emoji_model = build_emoji_model(
    maxlen, 
    word_to_index, 
    word_to_vec_map)

emoji_model.summary()

In [None]:
emoji_model.compile(
    loss='categorical_crossentropy', 
    optimizer='adam', 
    metrics=['accuracy'])

In [None]:
history = emoji_model.fit(
    train_x, 
    train_y, 
    epochs = 100,  
    # has to be a tuple, due to a tf bug: https://github.com/tensorflow/tensorflow/issues/39370
    validation_data=(test_x, test_y)  
)

In [None]:
utils.plot_history(history, ['loss', 'val_loss'])

In [None]:
utils.plot_history(history, ['accuracy', 'val_accuracy'])

In [None]:
emoji_model.evaluate(train_x, train_y)
emoji_model.evaluate(test_x, test_y)

### Save and Load Models

In [None]:
# import
import h5py

Two parts need to be saved in order to use the model in production:

1. Neural Network Structure
2. Trained Weights (Matrix)

We will save them separately. This makes it easy to keep multiple versions of the weights side by side and to choose which version goes to production.
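One possible way to organise several weight versions next to a single structure file is sketched below with the standard library only. The `weights_v<N>.h5` naming scheme and the helpers `weights_path_for` and `latest_version` are hypothetical. They are not part of `utils` or Keras.

```python
import os

def weights_path_for(model_root, version):
    """Return the weights file path for a given version (hypothetical naming scheme)."""
    return os.path.join(model_root, f"weights_v{version}.h5")

def latest_version(filenames):
    """Pick the highest version number among names like 'weights_v3.h5'."""
    versions = [
        int(name[len("weights_v"):-len(".h5")])
        for name in filenames
        if name.startswith("weights_v") and name.endswith(".h5")
    ]
    return max(versions) if versions else None

# Made-up directory listing, only for this example
toy_files = ["network.json", "weights_v1.h5", "weights_v2.h5", "weights_v10.h5"]
print(latest_version(toy_files))
print(weights_path_for("resources/emoji_model", 2))
```

Comparing versions as integers rather than strings matters here: as text, `"weights_v10.h5"` would sort before `"weights_v2.h5"`.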

In [None]:
import os

model_root = 'resources/emoji_model'
os.makedirs(model_root, exist_ok=True)

In [None]:
# Save model structure as json
with open(os.path.join(model_root, "network.json"), "w") as fp:
    fp.write(emoji_model.to_json())

# Save model weights
emoji_model.save_weights(os.path.join(model_root, "weights.h5"))

Download and load a pretrained model.

In [None]:
network_path, weights_path = utils.download_best_emoji_model()

In [None]:
from tensorflow.keras.models import model_from_json

# Load model structure
with open(network_path, "r") as fp:
    emoji_model_best = model_from_json(fp.read())

# Load model weights
emoji_model_best.load_weights(weights_path)

In [None]:
emoji_model_best.compile(
    loss='categorical_crossentropy', 
    optimizer='adam', 
    metrics=['accuracy'])

In [None]:
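def predict(text):
    """Print `text` followed by the emoji predicted by the pretrained model."""
    # Convert the sentence into a (1, maxlen) array of word indices
    x = utils.sentences_to_indices(
        np.array([text]), 
        word_to_index, 
        maxlen)
    
    # probs holds one probability per emoji class
    probs = emoji_model_best.predict(x)
    pred = np.argmax(probs)
    
    print(text, utils.label_to_emoji(pred))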

In [None]:
predict('i am not feeling happy')
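The last step of `predict` reduces the softmax output to a single class with `np.argmax`. The sketch below shows that step on made-up probabilities and also ranks the remaining classes, which can help when inspecting uncertain predictions. The numbers are invented for illustration and do not come from the model.

```python
import numpy as np

# Made-up softmax output for one sentence over the 5 classes, shape (1, 5)
toy_probs = np.array([[0.05, 0.10, 0.15, 0.60, 0.10]])

# Index of the most likely class, as computed in predict
top_label = int(np.argmax(toy_probs))

# Class indices ordered from most to least likely
ranked = np.argsort(toy_probs[0])[::-1]
print(top_label, ranked[:2].tolist())
```

Looking at the runner-up class is a quick way to see whether a prediction was a close call.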