# <u>Deep Emoji</u>

The aim of this project is to use a deep Recurrent Neural Network to assign a suitable emoji to an input sentence.

In [17]:
# import the necessary things first
import numpy as np
from emoji_utils import *
import emoji
import os.path

from keras.models import Model
from keras.layers import Dense, Input, Dropout, LSTM, Activation
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence
from keras.initializers import glorot_uniform

## Dataset
The dataset used for this task is small, and only 5 emojis are used as labels, so every sentence is expressed in terms of one of those 5 emojis.

Let us load the dataset for training and testing.

In [2]:
# load the dataset
train_path = 'data/train_emoji.csv'
test_path = 'data/test_emoji.csv'

X_train, Y_train = load_csv(train_path)
X_test, Y_test = load_csv(test_path)

# find the maximum length of input training example
max_len = -1
for example in X_train:
    if len(example.split()) > max_len:
        max_len = len(example.split())
        
print(max_len)

10


For training, we convert the output labels $Y$ from integers to a one-hot encoded representation, so $Y$ goes from shape $(m, 1)$ to $(m, 5)$.

In [3]:
Y_train_ohe = convert_to_OHE(Y_train, C = 5)
Y_test_ohe = convert_to_OHE(Y_test, C = 5)
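
The conversion itself is a small operation. The following is a minimal sketch of one way to do it, not the actual `convert_to_OHE` from `emoji_utils`; the name `one_hot_sketch` and the toy labels are assumptions made for illustration.

```python
import numpy as np

def one_hot_sketch(labels, num_classes):
    """Map integer labels of shape (m,) or (m, 1) to a one-hot matrix of shape (m, num_classes)."""
    labels = np.asarray(labels, dtype=int).reshape(-1)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes)[labels]

# Toy labels (hypothetical), one per sentence
example = one_hot_sketch([0, 3, 4], num_classes=5)
print(example.shape)  # (3, 5)
```

Each row contains a single 1 in the column of its class, which is the target format that a softmax output trained with categorical cross-entropy expects.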

We use GloVe word embeddings to represent each input word, replacing a sparse one-hot encoding of the word with a dense **featurized representation**. Loading the embeddings gives three dictionaries:

- `word_to_index`: dictionary mapping words to their numerical indices. The vocabulary contains 400,000 words.
- `index_to_word`: dictionary mapping numerical indices to their corresponding words in the vocabulary.
- `word_to_vec` (passed to the functions in this notebook as `word_to_vec_map`): dictionary mapping words to their GloVe vector representation.

In [4]:
# load the GloVe word embeddings
word_to_index, index_to_word, word_to_vec = load_glove('data/glove.6B.50d.txt')
print(len(word_to_index))
print(len(index_to_word))

400000
400000
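
For readers without `emoji_utils`, here is a rough sketch of how such a loader could build the three dictionaries. It assumes the usual GloVe text layout (one word per line followed by its vector components, separated by spaces) and assumes that indices start at 1 so that 0 stays free for padding. The function name `load_glove_sketch` and the tiny stand-in file are hypothetical; the real `load_glove` may differ.

```python
import os
import tempfile
import numpy as np

def load_glove_sketch(path):
    """Parse a GloVe-style text file where each line is: word v1 v2 ... vd."""
    word_to_vec = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word_to_vec[parts[0]] = np.array(parts[1:], dtype=np.float64)
    # Assumption: indices start at 1 so that index 0 can be used for padding
    word_to_index = {w: i for i, w in enumerate(sorted(word_to_vec), start=1)}
    index_to_word = {i: w for w, i in word_to_index.items()}
    return word_to_index, index_to_word, word_to_vec

# Tiny stand-in file with made-up 3-dimensional vectors
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False, encoding="utf-8") as tmp:
    tmp.write("food 0.1 0.2 0.3\nball 0.4 0.5 0.6\n")
w2i, i2w, w2v = load_glove_sketch(tmp.name)
os.remove(tmp.name)

print(w2i)  # {'ball': 1, 'food': 2}
```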


# <u>Model Architecture</u>
We use an embedding layer to look up the embedding vector of each word. These vectors are fed to a 2-layer **LSTM** network, with one LSTM stacked on top of the other. Dropout layers provide **regularization**, and the output of the top LSTM layer is passed to a **softmax** layer to produce the output prediction.

To train with mini-batches, all inputs in a batch must have the same length. We therefore use a fixed input length (`max_len`) and pad shorter sentences with $0$.
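
As an illustration of the padding step, here is a minimal sketch that converts sentences to fixed-length index vectors. It is not the notebook's `sentence_to_indices`; the function name, the toy vocabulary, the lowercasing, and the choice to map unknown words to 0 are all assumptions.

```python
import numpy as np

def sentences_to_indices_sketch(sentences, word_to_index, max_len):
    """Return an (m, max_len) int array of word indices, right-padded with 0."""
    indices = np.zeros((len(sentences), max_len), dtype=np.int32)
    for i, sentence in enumerate(sentences):
        words = sentence.lower().split()[:max_len]
        for j, word in enumerate(words):
            # Assumption: unknown words map to 0, the same value used for padding
            indices[i, j] = word_to_index.get(word, 0)
    return indices

toy_vocab = {"i": 1, "love": 2, "food": 3}  # hypothetical vocabulary
batch = sentences_to_indices_sketch(["I love food", "food"], toy_vocab, max_len=5)
print(batch)
# [[1 2 3 0 0]
#  [3 0 0 0 0]]
```

Because every row has the same length, the whole batch can be stored in a single array and passed to the model at once.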

#### Embedding layer
To use an embedding matrix in Keras, we create an Embedding layer and initialize its weights with the 50-dimensional GloVe embedding matrix. The layer is frozen (`trainable = False`), so the embeddings are not trained again. The input to the Embedding layer is a matrix of shape $(\text{batch size}, \text{max input length})$, where each row is a vector of numerical indices, one per word. The output is a tensor of shape $(\text{batch size}, \text{max input length}, \text{dimension of word vectors})$.

1. Convert the input words to their index representation.
2. Add zero padding.
3. Create an Embedding layer.
4. Load the Embedding layer weights from GloVe.
5. Feed the padded indices to the Embedding layer to get the embedding vectors.
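
Conceptually, a frozen embedding layer is a row lookup into the embedding matrix. The NumPy sketch below mimics that lookup to show how the shapes work out; the toy sizes and the made-up vectors are assumptions, and the Keras layer is used in the actual model.

```python
import numpy as np

vocab_len, emb_dim = 4, 3  # toy sizes; the notebook uses 400,001 and 50
emb_matrix = np.zeros((vocab_len, emb_dim))
# Made-up vectors for indices 1..3; row 0 stays zero and is used for padding
emb_matrix[1:] = np.arange(1, 10).reshape(3, 3)

# A batch of 2 padded sentences with max input length 4
sentence_indices = np.array([[1, 2, 3, 0],
                             [3, 0, 0, 0]])

# Integer-array indexing picks one row of emb_matrix per index
embeddings = emb_matrix[sentence_indices]
print(embeddings.shape)  # (2, 4, 3)
```

The result has one embedding vector per word position, which is the 3-D input that the first LSTM layer consumes.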

The input to the model is an array of shape (`m`, `max_len`) and the output is a probability vector of shape (`m`, `number of output classes`).
Each input is a vector of word indices, which is first passed to the Embedding layer. The training targets are one-hot representations of the label classes.

In [5]:
X_train_indices = sentence_to_indices(X_train, word_to_index, max_len)
X_test_indices = sentence_to_indices(X_test, word_to_index, max_len)

In [6]:
# Creates a Keras Embedding layer with weights loaded from pre-trained 50-dimensional GloVe matrix.
def create_embedding_layer(word_to_vec_map, word_to_index):
    # GLoVe embedding dimensions
    emb_dim = word_to_vec_map["apple"].shape[0]         
    # adding 1 to fit Keras embedding
    vocab_len = len(word_to_index) + 1              
    # make a matrix of zeros of required size and load it with weights from glove embeddings
    emb_matrix = np.zeros((vocab_len, emb_dim))
    
    # get the embedding weights for each word
    # (use the function argument, not the global word_to_vec)
    for word, index in word_to_index.items():
        emb_matrix[index, :] = word_to_vec_map[word]
    
    # make a Keras Embedding layer 
    embedding_layer = Embedding(input_dim = vocab_len, output_dim = emb_dim, trainable = False)
    # Build the embedding layer
    embedding_layer.build((None,))
    # use the pretrained weights for the layer
    embedding_layer.set_weights([emb_matrix])
    
    return embedding_layer

In [7]:
# returns the overall model instance
def create_model(input_shape, word_to_vec_map, word_to_index):
    # input to the graph
    sentence_indices = Input(shape = input_shape, dtype = 'int32')
    
    # create the embedding layer with GloVe weights
    embedding_layer = create_embedding_layer(word_to_vec_map, word_to_index)
    # propagate the input through the embedding layer
    embeddings = embedding_layer(sentence_indices)   
    
    # LSTM layer 1
    # the output of each timestep is fed to the next layer
    X = LSTM(units = 128, return_sequences = True)(embeddings)
    X = Dropout(rate = 0.5)(X)
    
    # LSTM layer 2
    # only the output of the final timestep is returned, not the full sequence
    X = LSTM(units = 128)(X)
    X = Dropout(rate = 0.5)(X)
    # pass the RNN output through a dense layer of 5 units (one per emoji class)
    X = Dense(units = 5)(X)
    # apply softmax once to turn the 5 scores into class probabilities
    X = Activation('softmax')(X)
    
    # Keras model instance
    model = Model(inputs=sentence_indices, outputs=X)
    
    return model

In [19]:
model = create_model((max_len,), word_to_vec, word_to_index)
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_3 (InputLayer)         (None, 10)                0         
_________________________________________________________________
embedding_3 (Embedding)      (None, 10, 50)            20000050  
_________________________________________________________________
lstm_5 (LSTM)                (None, 10, 128)           91648     
_________________________________________________________________
dropout_5 (Dropout)          (None, 10, 128)           0         
_________________________________________________________________
lstm_6 (LSTM)                (None, 128)               131584    
_________________________________________________________________
dropout_6 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_3 (Dense)              (None, 5)                 645       
__________
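
The parameter counts in the summary above can be checked by hand. The sketch below uses the standard parameter formulas for LSTM and dense layers; the helper names are made up for this check.

```python
def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights and a bias vector
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # one weight per input-output pair plus one bias per output unit
    return input_dim * units + units

embedding_params = (400000 + 1) * 50  # vocab_len x emb_dim (frozen weights)
lstm_1_params = lstm_params(input_dim=50, units=128)
lstm_2_params = lstm_params(input_dim=128, units=128)
dense_out_params = dense_params(input_dim=128, units=5)

print(embedding_params, lstm_1_params, lstm_2_params, dense_out_params)
# 20000050 91648 131584 645
```

The embedding layer dominates the total, but since it is created with `trainable = False`, only the LSTM and dense weights are updated during training.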

In [20]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

In [22]:
model_path = 'models/weights.h5'

if os.path.exists(model_path):
    model.load_weights(model_path)
    print('Model Weights found')

Model Weights found


In [14]:
# start model training
model.fit(X_train_indices, Y_train_ohe, epochs = 100, batch_size = 32, shuffle=True)
# save the model weights
model.save_weights(model_path)

Epoch 1/100
...
Epoch 100/100


## Model Testing 
Let us evaluate the model on the test data.

In [23]:
# evaluate model
_, acc = model.evaluate(X_test_indices, Y_test_ohe)

print("\nTest Accuracy = " +str(acc*100) + ' %' )


Test Accuracy = 78.5714294229 %
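
The accuracy metric is the fraction of sentences whose most probable class matches the true label. Here is a minimal sketch of that computation on made-up predictions; the arrays are hypothetical and are not the model's actual outputs.

```python
import numpy as np

# Made-up softmax outputs for 4 sentences over 5 classes
pred = np.array([[0.10, 0.10, 0.60, 0.10, 0.10],
                 [0.70, 0.10, 0.10, 0.05, 0.05],
                 [0.10, 0.10, 0.10, 0.50, 0.20],
                 [0.20, 0.50, 0.10, 0.10, 0.10]])
labels = np.array([2, 0, 4, 1])  # hypothetical integer labels

# The predicted class is the index of the largest probability in each row
predicted = np.argmax(pred, axis=1)
accuracy = np.mean(predicted == labels)
print(accuracy * 100)  # 75.0
```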


Let us look at some of the outputs from the test data.

In [24]:
# find predictions
pred = model.predict(X_test_indices)

# indices of the first 15 test examples (or fewer if the test set is smaller)
indices = np.arange(min(15,X_test_indices.shape[0]))

for i in indices:
    emoji_number = np.argmax(pred[i])
    print('Model Prediction: '+ X_test[i] + label_to_emoji(emoji_number).strip())
    print('Expected emoji:'+ label_to_emoji(Y_test[i]) )
    print()

Model Prediction: I want to eat	🍴
Expected emoji:🍴

Model Prediction: he did not answer	😞
Expected emoji:😞

Model Prediction: he got a very nice raise	❤️
Expected emoji:😄

Model Prediction: she got me a nice present	❤️
Expected emoji:😄

Model Prediction: ha ha ha it was so funny	😄
Expected emoji:😄

Model Prediction: he is a good friend	😄
Expected emoji:😄

Model Prediction: I am upset	😞
Expected emoji:😞

Model Prediction: We had such a lovely dinner tonight	😄
Expected emoji:😄

Model Prediction: where is the food	🍴
Expected emoji:🍴

Model Prediction: Stop making this joke ha ha ha	😄
Expected emoji:😄

Model Prediction: where is the ball	⚾
Expected emoji:⚾

Model Prediction: work is hard	😞
Expected emoji:😞

Model Prediction: This girl is messing with me	❤️
Expected emoji:😞

Model Prediction: are you serious😞
Expected emoji:😞

Model Prediction: Let us go play baseball	⚾
Expected emoji:⚾



### Try using the model
Give an input sentence to see how it works.

In [25]:
# takes the user input and shows output
def text_to_emoji():
    text = input()
    x_input = np.array([text])
    # convert to indices
    x_idx = sentence_to_indices(x_input, word_to_index, max_len)
    print(x_input[0] +' '+  label_to_emoji(np.argmax(model.predict(x_idx))))

In [26]:
text_to_emoji()

he likes to play
he likes to play ⚾


### Credits
This project is based on an assignment from the Sequence Models course of the Deep Learning Specialization by Deeplearning.ai on Coursera. <br>https://www.coursera.org/learn/nlp-sequence-models/home/welcome