# Generating Emojis for a Given Sentence Using Deep Learning:
This Jupyter notebook contains code to generate an appropriate emoji for a sentence using deep learning techniques. The notebook has 2 parts:<br>
1) Generating emojis using a simple one-layer neural network.<br>
2) Generating emojis using a recurrent neural network with Long Short-Term Memory (LSTM).<br>
GloVe Vectors: This project uses GloVe vectors from https://nlp.stanford.edu/projects/glove/

In [73]:
# Importing important packages.
import numpy as np
import emoji
import pandas as pd

# Reading the Training And Test Data files:
We will read our training and test sets from CSV files that have sentences in the first column and the corresponding labels in the second column.<br> Labels are integers from 0 to 4.  

In [74]:
# Reading the training set csv file using pandas.
Data = pd.read_csv("Training_set.csv")
# To drop, if any, the columns with Null values.
Data = Data.dropna(axis = 1)

In [75]:
# Converting Pandas Dataframe into numpy array. 
Data1 = Data.values

In [76]:
# The first column of the file has Sentences. Saving them as strings in a numpy array.
X_train = Data1[:,0]
# The second column has labels which are codes for the emoji to be used. Label 0 means heart, label 1 means ball etc.
Y_train = Data1[:,1]


In [77]:
# Reading the test set csv file using pandas.
Data2 = pd.read_csv("Test_set.csv")

In [78]:
# Converting Pandas Dataframe into numpy array.
Data3 = Data2.values

In [79]:
# The first column of the file has Sentences. Saving them as strings in a numpy array.
X_test = Data3[:,0]
# The second column has labels which are codes for the emoji to be used. Label 0 means heart, label 1 means ball etc.
Y_test = Data3[:,1]


In [80]:
# Finding the number of words in the longest training sentence (longest by character count).
maxLen = len(max(X_train, key=len).split(' '))
print(maxLen)

10
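
Note that `max(X_train, key=len)` picks the sentence with the most characters, which is not necessarily the one with the most words. A minimal sketch of the difference, using two made-up sentences that are not from the dataset:

```python
# Hypothetical sentences, chosen so the two notions of "longest" disagree.
sentences = ["congratulations everybody", "i am so so so happy"]

# Word count of the sentence with the most characters.
by_chars = len(max(sentences, key=len).split())

# Largest word count over all sentences.
by_words = max(len(s.split()) for s in sentences)

print(by_chars, by_words)
```

If the padded length is meant to cover every sentence, the second form is the safer choice.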


# Converting Labels to Emoji:

In [81]:
# Creating a dictionary having label : emoji code as key:value pairs.
emoji_dictionary = {"0": "\u2764\uFE0F",    
                    "1": ":baseball:",
                    "2": ":smile:",
                    "3": ":disappointed:",
                    "4": ":fork_and_knife:"}
# Converting label to emoji using emoji package.
def label_to_emoji(label):
    """Function Parameters: label: A label between 0 and 4.
       Return: Function returns the emoji corresponding to a label."""
    # Note: use_aliases is accepted by emoji versions before 2.0; later versions use language='alias' instead.
    return emoji.emojize(emoji_dictionary[str(label)], use_aliases=True)

# Converting Labels to One-Hot Vectors:


In [10]:
# Converting labels to one-hot vectors as the softmax cross-entropy loss function uses this format of labels.
def convert_to_one_hot(Y):
    """Function Parameters: Y : A numpy array/vector of integer labels.
       Return: Function returns a numpy array of shape (m , maxlabel+1) where m is the length of Y.
       maxlabel + 1 columns are needed because labels start at 0: with a max label of 4, each row has 4 zeros and a single one."""
    
    # Initializing one-hot numpy array
    a = np.zeros((Y.shape[0],np.max(Y)+1))
    # Storing 1 at index == label for an example.
    a[np.arange(Y.shape[0]), Y] = 1
    
    return a

In [82]:
# Converting the Y_train labels to one-hot vectors.
# Before using convert_to_one_hot(Y) function, we need to convert Y_train datatype to integers.
Y_train = Y_train.astype(int)
# calling the above function.
Y_one_hot_train = convert_to_one_hot(Y_train)

In [83]:
# Let us check.
print(str(Y_train[5]) + " is converted to one-hot vector " + str(Y_one_hot_train[5]))


0 is converted to one-hot vector [ 1.  0.  0.  0.  0.]


In [84]:
# Converting the Y_test labels to one-hot vectors.
# Before using convert_to_one_hot(Y) function, we need to convert Y_test datatype to integers.
Y_test = Y_test.astype(int)
# Calling the above function.
Y_one_hot_test = convert_to_one_hot(Y_test)


In [85]:
# Let us check.
print(str(Y_test[11]) + " is converted to one-hot vector " + str(Y_one_hot_test[11]))

3 is converted to one-hot vector [ 0.  0.  0.  1.  0.]
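
The same conversion can be sketched by indexing an identity matrix. The function name `one_hot_eye` and the explicit `num_classes` argument are assumptions introduced for this illustration, not part of the notebook:

```python
import numpy as np

def one_hot_eye(Y, num_classes):
    """Return one-hot rows for integer labels Y by indexing an identity matrix."""
    Y = np.asarray(Y, dtype=int)
    return np.eye(num_classes)[Y]

labels = np.array([0, 3, 4])
encoded = one_hot_eye(labels, num_classes=5)
print(encoded)
```

Passing the number of classes explicitly keeps the width fixed at 5 even if a particular split happens not to contain the largest label, whereas `np.max(Y) + 1` depends on the labels present.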


# Reading Word Embeddings from Glove Vectors:

In [15]:
# Function that reads glove vector file and returns words_to_index, index_to_words, word_to_vec_map.
def read_glove_vectors(glove_file):
    """Function Parameters: glove_file: Path to the GloVe vector file.
       Return: Function Returns:
       1) words_to_index: A dictionary containing mapping from words to index.
       2) index_to_words: A dictionary containing mapping from index to words.
       3) word_to_vec_map: A dictionary containing mapping from words to vectors. """
    # Open the text file and create an object f of the file.
    # The encoding is given explicitly because the vocabulary contains non-ASCII words.
    with open(glove_file, 'r', encoding='utf-8') as f:
        # Creating an empty set that will contain words.
        words = set()
        # Creating an empty dictionary that will contain word:vector pairs.
        word_to_vec_map = {}
        # Looping through the file line by line. Every line has a word followed by its vector.
        for line in f:
            # Splitting the line on whitespace into the word followed by its vector values.
            line = line.strip().split()
            # Taking out the word i.e. first element of the line.
            curr_word = line[0]
            # Adding current word to the set.
            words.add(curr_word)
            # Adding word:vector pair to the dictionary.
            word_to_vec_map[curr_word] = np.array(line[1:], dtype=np.float64)
        
        # Creating index to word and word to index dictionaries.  
        # Starting indexes of words from 1.
        i = 1
        # Creating empty dictionaries.
        words_to_index = {}
        index_to_words = {}
        # Looping through all the words created in the set above.
        for w in sorted(words):
            # Saving word:index pair in the dictionary.
            words_to_index[w] = i
            # Saving index:word pair in the dictionary.
            index_to_words[i] = w
            # Incrementing index by 1 for the next word.
            i = i + 1
        
        return words_to_index, index_to_words, word_to_vec_map    

In [16]:
word_to_index, index_to_word, word_to_vec_map = read_glove_vectors('glove.6B.50d.txt')

In [86]:
word = "chilly"
index = 289846
print("the index of", word, "in the vocabulary is", word_to_index[word])
print("the", str(index) + "th word in the vocabulary is", index_to_word[index])

the index of chilly in the vocabulary is 99074
the 289846th word in the vocabulary is potatos
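
To make the file layout concrete, here is a small sketch that parses two made-up lines in the same "word followed by vector values" format from an in-memory buffer. The words, the 3-dimensional vectors, and the variable names are all hypothetical:

```python
import io
import numpy as np

# Two made-up lines in the "word v1 v2 ... vn" layout (3-d vectors for brevity).
fake_glove = io.StringIO("the 0.1 0.2 0.3\ncat 0.4 0.5 0.6\n")

word_to_vec = {}
for line in fake_glove:
    parts = line.strip().split()
    word_to_vec[parts[0]] = np.array(parts[1:], dtype=np.float64)

# Indices start at 1 over the sorted vocabulary, leaving 0 free for padding.
word_to_idx = {w: i for i, w in enumerate(sorted(word_to_vec), start=1)}
idx_to_word = {i: w for w, i in word_to_idx.items()}

print(word_to_idx)
print(word_to_vec["cat"])
```

Because the vocabulary is sorted before indexing, the index of a word depends only on the set of words in the file, not on the order of its lines.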


# Creating Simple Model to Predict Label From Sentence:
This model averages the word vectors of all the words in a sentence and predicts the label for the sentence using a one-layer neural network with softmax activation.

In [87]:
# This function takes a sentence as an input and returns the average of word vectors of the words in that sentence.
def average(sentence , word_to_vec_map):
    """Function Parameters: sentence: A sentence in the form of a string.
       word_to_vec_map: A dictionary containing words to vector mapping.
       Return: Function returns a vector with average values of the embedding vectors."""
    # Splitting Sentence into lowercase words.
    words = [i.lower() for i in sentence.split()]
    
    total = 0
    for i in words:
        total = total + word_to_vec_map[i]
    
    
    average = total / len(words)
    
    return average

In [88]:
# Let us check
average("She is beautiful", word_to_vec_map)

array([ 0.40837067,  0.74165   , -0.78191333, -0.15946333,  0.76338333,
        0.45639133, -0.4019587 , -0.08766333, -0.04882333,  0.20158667,
        0.00542   ,  0.14020673,  0.07584033, -0.11334333,  0.51032333,
        0.136538  ,  0.11707   ,  0.53019667, -0.071368  , -0.18856333,
       -0.25945333,  0.82907667,  0.11644533,  0.30177   ,  0.85965333,
       -1.70628   , -0.84639667,  0.70187   ,  0.25922133, -0.57696467,
        3.07303333, -0.30657667,  0.00867667, -0.33548533,  0.07780833,
        0.0226    ,  0.07458067,  0.54815   , -0.08488   , -0.67778333,
       -0.079656  ,  0.12986   ,  0.0259089 , -0.295027  , -0.09114533,
       -0.10266633, -0.05054967, -1.05132333,  0.198019  ,  0.29884033])
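
The `average` function looks every word up directly in the dictionary, so a word missing from the vocabulary raises a `KeyError`. One possible way to tolerate such words is sketched below; skipping unknown words and returning zeros when nothing is known are assumptions of this sketch, and the 2-dimensional embeddings are invented for illustration:

```python
import numpy as np

def average_known_words(sentence, word_to_vec_map, dim):
    """Average the vectors of in-vocabulary words; return zeros if none are known."""
    vectors = [word_to_vec_map[w] for w in sentence.lower().split()
               if w in word_to_vec_map]
    if not vectors:
        return np.zeros(dim)
    return np.mean(vectors, axis=0)

# Hypothetical 2-d embeddings, for illustration only.
toy_map = {"good": np.array([1.0, 3.0]), "food": np.array([3.0, 1.0])}
print(average_known_words("Good food zzzz", toy_map, dim=2))
```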

In [20]:
def softmax(z):
    """Function Parameter: z: a vector or a scalar
       Return: Function returns softmax of the input vector."""
    # We subtract the max so that the exponentials do not overflow to infinity.
    e_z = np.exp(z - np.max(z))
    
    return e_z / e_z.sum()
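
A quick sketch of why the maximum is subtracted: with large scores the naive formula overflows, while the shifted version gives the same probabilities as for any equally shifted input. The input values are arbitrary and `softmax_stable` simply repeats the function above under a different name:

```python
import numpy as np

def softmax_stable(z):
    """Softmax with the maximum subtracted before exponentiating."""
    e_z = np.exp(z - np.max(z))
    return e_z / e_z.sum()

z = np.array([1000.0, 1001.0, 1002.0])

# The naive form overflows: exp(1000) is inf in float64, and inf / inf is nan.
with np.errstate(over="ignore", invalid="ignore"):
    naive = np.exp(z) / np.exp(z).sum()

stable = softmax_stable(z)
print(naive)
print(stable)
```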

In [21]:
# Training a Simple Neural Network Model. 
def model(X ,Y , word_to_vec_map, learning_rate = 0.01, num_iterations = 400):
    """Function Parameters: X: A numpy array of shape (m,) having sentences
       Y: Numpy vector of labels 
       word_to_vec_map: A dictionary containing word to vector mapping
       learning_rate: for gradient descent
       num_iterations: Number of passes over the training set.
       Return: Function returns the updated parameters W and b"""

    # Number of training examples.
    m = Y.shape[0]
    
    # Number of output nodes.
    n_y = 5
    
    # Number of input nodes i.e. length of a GloVe vector.
    n_h = 50
    
    # Parameter Initialization using Xavier Technique:
    
    W = np.random.randn(n_y , n_h) / np.sqrt(n_h)
    b = np.zeros((n_y , 1))
    
    # Converting Labels to one-hot Vectors.
    Y = Y.astype(int)
    Y_one_hot = convert_to_one_hot(Y)
    
    # Stochastic Gradient Descent:
    for t in range(num_iterations):
        # Looping over the examples one-by-one.
        for i in range(m):
            avg = average(X[i] , word_to_vec_map)
            avg = avg.reshape(50,1)
            # Forward Propagation.
            Z = np.dot(W, avg) + b
            
            A = softmax(Z)
            
            # Cross-entropy cost for the softmax output:
            
            cost = -(np.sum(np.multiply(Y_one_hot[i].reshape(5,1), np.log(A))))
            
            # Computing Gradients:
            
            # Derivative of cost w.r.t Z
            dz = A - Y_one_hot[i].reshape(5,1)
            # Derivative of cost w.r.t W, computed as the outer product of dz and avg.
            dw = np.dot(dz.reshape(n_y,1), avg.reshape(1, n_h))
            # Derivative of cost w.r.t b
            db = dz
            
            # Stochastic Gradient Descent Update:
            
            W = W - learning_rate * dw
            
            b = b - learning_rate * db
            
        # Printing the cost of the last example seen in this epoch, every 100 epochs.
        if t % 100 == 0:
            print("Epoch: " + str(t) + " --- cost = " + str(cost))
    
    return  W, b

In [22]:
# Calling the function to train on Training Set.
W, b = model(X_train, Y_train, word_to_vec_map)

Epoch: 0 --- cost = 1.95204988128
Epoch: 100 --- cost = 0.0797181872601
Epoch: 200 --- cost = 0.0445636924368
Epoch: 300 --- cost = 0.0343226737879
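
The gradients used in the training loop (`dz = A - Y` and `dw` as the outer product of `dz` and the averaged vector) can be sanity-checked numerically. The sketch below compares one entry of the analytic gradient with a central-difference estimate; the random inputs, the chosen label, and the helper `loss` are assumptions made for the check, not part of the notebook:

```python
import numpy as np

def softmax(z):
    e_z = np.exp(z - np.max(z))
    return e_z / e_z.sum()

def loss(W, b, x, y):
    """Cross-entropy of softmax(W x + b) against a one-hot column vector y."""
    return -np.sum(y * np.log(softmax(W @ x + b)))

rng = np.random.default_rng(0)
n_y, n_h = 5, 50
W = rng.standard_normal((n_y, n_h)) / np.sqrt(n_h)
b = np.zeros((n_y, 1))
x = rng.standard_normal((n_h, 1))   # stand-in for an averaged sentence vector
y = np.zeros((n_y, 1))
y[2] = 1.0                          # arbitrary one-hot label

# Analytic gradients, in the same form as the training loop.
dz = softmax(W @ x + b) - y
dW = dz @ x.T

# Central-difference estimate for one entry of W.
eps = 1e-6
W_plus, W_minus = W.copy(), W.copy()
W_plus[1, 7] += eps
W_minus[1, 7] -= eps
numeric = (loss(W_plus, b, x, y) - loss(W_minus, b, x, y)) / (2 * eps)

print(dW[1, 7], numeric)
```

The two printed values should agree to several decimal places if the analytic gradient is right.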


In [24]:
# This function predicts labels for Sentences using only Forward Propagation:
def predict(X , Y , W, b, word_to_vec_map):
    """Function Parameters: X: A numpy array of shape (m,) having sentences
       Y: Numpy vector of labels 
       W,b: Trained Parameters
       word_to_vec_map: A dictionary containing word to vector mapping
       Return: Function returns Label predictions for all examples in X."""
    # Number of examples to be predicted
    m = X.shape[0]
    # Array of Zeros to store prediction labels
    pred = np.zeros((m,1))
    
    # Looping over the training examples
    for i in range(m):
        avg = average(X[i] , word_to_vec_map)
        # Forward Propagation:
        
        Z = np.dot(W, avg.reshape(50,1)) + b
        A = softmax(Z)
        
        # Saving the prediction label in pred vector.
        pred[i] = np.argmax(A)
    
    print("Accuracy = "  + str(np.mean((pred == Y.reshape(Y.shape[0],1)))))
        
    return pred    

In [25]:
# Calling the function to make prediction on training set.
pred = predict(X_train,Y_train, W , b , word_to_vec_map)

Accuracy = 0.977272727273


In [26]:
# Making prediction on unseen data.
X_my_sentences = np.array(["i adore you", "i love you", "funny lol", "lets play with a ball", "food is ready", "not feeling happy"])
Y_my_labels = np.array([[0], [0], [2], [1], [4],[3]])

pred = predict(X_my_sentences, Y_my_labels , W, b, word_to_vec_map)


Accuracy = 0.833333333333


# Generating Emojis using Recurrent Neural Network with Long Short Term Memory (LSTM).

In [27]:
# importing keras package and functionalities.
from keras.models import Model
from keras.layers import Dense, Input, Dropout, LSTM, Activation
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence
from keras.initializers import glorot_uniform

Using TensorFlow backend.


# Converting Sentences to Indices of Words:
To use a sequence model like an LSTM, we first need to convert sentences into word indices, which are then mapped to word embeddings for training.<br> 
<b>Note:</b> We pad the index sequences with zeros so that all sentences have the same length. This is required to train a sequence model on mini-batches, because every example in a batch must have the same shape. 

In [28]:
def sentences_to_indices(X , word_to_index, maxLen):
    """Function Parameters: X: Numpy Array having sentences.
       word_to_index: A dictionary mapping words to index.
       maxLen: Length of Longest Sentence
       Return: X_indices : A numpy array of shape (m, maxLen) holding the word indices of each sentence.
       Index 0 means padded word."""
    
    # Number of training examples.
    m = X.shape[0]
    
    # Initialize a X_indices matrix of shape (m , maxLen).We use maxLen for all sentences in training example to make 
    # them of equal length.
    X_indices = np.zeros((m , maxLen))
    
    # Looping over all the training examples.
    for i in range(m):
        # Splitting a sentence into words.
        # Create a list of words having all words in a sentence in lower case.
        words = [j.lower() for j in X[i].split()]
        
        # Initialize the position counter k to 0: 
        k = 0 
        # Looping over words in a sentence: 
        for w in words:
            # Setting the (i, k)th entry of X_indices to the index of the word:
            X_indices[i][k] = word_to_index[w]
            # To set index of next word, increase k
            k = k + 1 
            
     
    return X_indices            
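
As written, the function fails on a word that is not in `word_to_index` and on a sentence longer than `maxLen`. A hedged variant is sketched below: mapping unknown words to 0 (so they are treated like padding) and dropping words beyond the maximum length are both assumptions of this sketch, and the tiny vocabulary is invented:

```python
import numpy as np

def sentences_to_indices_safe(X, word_to_index, max_len):
    """Map sentences to rows of word indices, padded with 0 up to max_len.

    Unknown words are mapped to 0 and extra words are dropped; both choices
    are assumptions made for this sketch.
    """
    X_indices = np.zeros((len(X), max_len))
    for i, sentence in enumerate(X):
        for k, w in enumerate(sentence.lower().split()[:max_len]):
            X_indices[i, k] = word_to_index.get(w, 0)
    return X_indices

# Hypothetical vocabulary; the real indices come from the GloVe file.
toy_index = {"i": 1, "love": 2, "you": 3}
toy_result = sentences_to_indices_safe(np.array(["I love you", "love xyz"]), toy_index, 4)
print(toy_result)
```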

# Creating Keras Embedding Layer:
Before building the LSTM network, we need to create an Embedding layer in Keras that converts word indices into word vectors.<br>
The layer acts as a lookup table from indices to vectors, so the word vectors are fetched inside the model instead of being looked up by hand for every sentence.

In [29]:
def pretrained_embedding_layer(word_to_vec_map , word_to_index):
    """Function Parameters: word_to_vec_map : A dictionary mapping word to vectors.
       word_to_index: A dictionary mapping word to index.
       Return: Function returns a Keras embedding layer."""
    
    # Length of vocabulary. The + 1 adds a row for index 0 (padding),
    # because indices for words start from 1 and not 0.
    vocab_len = len(word_to_index) + 1 
    # Dimension of embedding vector. We use 50 because our pretrained Glove vec has length 50
    emb_dim = 50
    
    # Initializing an emb_matrix with Zeros. Every row will correspond to vector for that word.
    emb_matrix = np.zeros((vocab_len , emb_dim))
    
    # Looping over each element of word_to_index and saving vectors row-wise in emb_matrix.
    
    for word,index in word_to_index.items():
        
        emb_matrix[index , :] = word_to_vec_map[word]
        
    
    # Defining the Keras Embedding layer. It is non-trainable because we don't want to alter the pretrained embeddings.
    
    embedding_layer = Embedding(vocab_len , emb_dim, trainable = False)
    
    # Before giving weights to embedding layer, we need to build the layer. 
    
    embedding_layer.build((None,))
    
    # Setting the weights for embedding layer.
    
    embedding_layer.set_weights([emb_matrix])
    
    return embedding_layer   
    

In [30]:
# The layer's weights can be indexed as get_weights()[0][word_index][vector_index].
embedding_layer = pretrained_embedding_layer(word_to_vec_map, word_to_index)
print("weights[0][2] =", embedding_layer.get_weights()[0][2][:])

weights[0][2] = [-1.05879998  0.26952001  0.94632     0.056907    0.24439999  0.37810001
  1.32579994 -0.88515002 -0.31154999  0.57618999 -0.056118   -0.62589002
 -0.41668999 -0.58279002  0.66974998  0.11759     0.68662     0.62711
 -0.65701997 -0.078008   -0.52221     0.018973    0.97861999  0.78516001
  0.69097     0.47174999 -1.1171      0.25342     0.34635001 -1.18659997
  0.69871998  0.66864002 -1.27649999  0.92610002 -0.017565   -0.25185001
  1.44840002 -0.75392997 -0.07427    -0.18682     0.69292998 -0.56638002
 -0.39572001 -0.30950999 -0.94393998  0.27484     1.06850004  0.31138
  0.79843003  0.20392001]
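
Conceptually, a frozen embedding layer is a row lookup in its weight matrix. The sketch below shows the same operation in plain NumPy; the matrix values and index sequences are made up, and row 0 is all zeros to play the role of the padding index:

```python
import numpy as np

# Hypothetical embedding matrix: 4 rows (index 0 reserved for padding), 3 dimensions.
emb_matrix = np.array([[0.0, 0.0, 0.0],
                       [0.1, 0.2, 0.3],
                       [0.4, 0.5, 0.6],
                       [0.7, 0.8, 0.9]])

# Two padded "sentences" of length 4, as integer word indices.
indices = np.array([[1, 3, 0, 0],
                    [2, 2, 1, 0]])

# Integer-array indexing returns one row of emb_matrix per index.
embedded = emb_matrix[indices]
print(embedded.shape)
```

The result has one vector per word position, which is the shape the LSTM layers consume: (examples, sentence length, embedding dimension).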


# Building Emoji_Model:


In [31]:
def emoji_lstm_model(input_shape , word_to_vec_map , word_to_index):
    """Function Parameters : input_shape: Shape of the input layer for Keras Model.
       word_to_vec_map : Mapping from words to vectors.
       word_to_index : Mapping from words to index.
       Return : Function returns Keras model. """
    
    # We will input the indices rather than words. Creating the input layer for the Network.
    
    sentence_indices = Input(input_shape, dtype='int32')
    
    # Creating the pre-trained embedding layer by calling the above function.
    
    embedding_layer = pretrained_embedding_layer(word_to_vec_map , word_to_index)
    
    # Propagating Input through Embedding layer to get word embedding vectors.
    
    embeddings = embedding_layer(sentence_indices)
    
    # Creating LSTM layer with 128 dimensional hidden state:
    # return_sequences = True so the hidden state at every time step is passed to the next LSTM layer.
    
    X = LSTM(128 , return_sequences = True)(embeddings)
    
    # Adding Dropout Layer With Probability 0.5:
    
    X = Dropout(0.5)(X)
    
    # Creating Second LSTM Layer with 128 Dimensional Hidden State. 
    # But now, return_sequences = False because we want output at only the last time step and not all time steps.
    
    X = LSTM(128 , return_sequences = False)(X)
    
    # Adding Dropout Layer With Probability 0.5:
    
    X = Dropout(0.5)(X)
    
    # Creating a Dense layer on the final LSTM output, returning a vector of size 5, followed by softmax activation.
    
    X = Dense(5)(X)
    
    X = Activation("softmax")(X)
    
    # Creating Model Instance:
    
    model = Model(inputs = sentence_indices , outputs = X)
    
    return model

In [38]:
# Calling the function to create model.
model = emoji_lstm_model((10,) , word_to_vec_map , word_to_index)

In [39]:
# Printing summary of the model.
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_2 (InputLayer)         (None, 10)                0         
_________________________________________________________________
embedding_3 (Embedding)      (None, 10, 50)            20000050  
_________________________________________________________________
lstm_3 (LSTM)                (None, 10, 128)           91648     
_________________________________________________________________
dropout_3 (Dropout)          (None, 10, 128)           0         
_________________________________________________________________
lstm_4 (LSTM)                (None, 128)               131584    
_________________________________________________________________
dropout_4 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 5)                 645       
__________
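
The parameter counts in the summary can be reproduced by hand. The sketch below assumes the standard LSTM layout of four gates, each with input weights, recurrent weights, and a bias; the vocabulary length of 400,001 is inferred from the embedding row (20000050 / 50) rather than stated in the notebook, and the helper names are hypothetical:

```python
def lstm_params(input_dim, units):
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    """One weight per input-output pair plus one bias per output."""
    return input_dim * units + units

vocab_len, emb_dim = 400001, 50   # inferred: 20000050 / 50 rows, including index 0

print(vocab_len * emb_dim)        # embedding weights
print(lstm_params(emb_dim, 128))  # first LSTM
print(lstm_params(128, 128))      # second LSTM
print(dense_params(128, 5))       # output layer
```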

In [40]:
# Compiling the model:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

In [89]:
X_train_indices = sentences_to_indices(X_train, word_to_index, maxLen)
# We have already created the one-hot labels Y_one_hot_train from Y_train.
X_train_indices.shape


(132, 10)

# Note on Dimensions:
We pass X and Y of shape (m,10) and (m,5) respectively, i.e. each row represents one training example.

In [42]:
# Fitting the model on the training set.
model.fit(X_train_indices, Y_one_hot_train, epochs = 50, batch_size = 32, shuffle=True)

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


<keras.callbacks.History at 0x115a48e48>

# Model Accuracy on the Test Set:

In [90]:
# Converting Sentences to Indices.
X_test_indices = sentences_to_indices(X_test, word_to_index, maxLen)
# Evaluating model on the Test set
loss, acc = model.evaluate(X_test_indices, Y_one_hot_test)

print("Test accuracy = ", acc)

Test accuracy =  0.857142857143


# Identifying Misclassified Examples in the Test Set:

In [92]:
# Finding predictions on the Test Set.
# Predictions are the Output of the Network i.e. Probability Vector.
pred = model.predict(X_test_indices)
for i in range(len(X_test)):
    # Finding the label from probability vector.
    num = np.argmax(pred[i])
    if(num != Y_test[i]):
        print('Expected emoji:'+ label_to_emoji(Y_test[i]) + ' prediction: '+ X_test[i] + label_to_emoji(num).strip())

Expected emoji:😄 prediction: he got a very nice raise❤️
Expected emoji:😄 prediction: she got me a nice present❤️
Expected emoji:😞 prediction: This girl is messing with me❤️
Expected emoji:😄 prediction: you brighten my day❤️
Expected emoji:😞 prediction: she is a bully❤️
Expected emoji:😞 prediction: My life is so boring❤️
Expected emoji:😄 prediction: will you be my valentine❤️
Expected emoji:😄 prediction: What you did was awesome😞
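
To see at a glance which labels get confused with which, the errors can be tallied in a confusion matrix. This is a minimal NumPy sketch with made-up labels; in the notebook the inputs would be `Y_test` and the argmax of each row of `pred`, and the function name is an assumption:

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """counts[t, p] = number of examples with true label t predicted as p."""
    counts = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(counts, (y_true, y_pred), 1)
    return counts

# Made-up labels for illustration only.
y_true = np.array([2, 2, 3, 2, 3, 0])
y_pred = np.array([0, 0, 0, 2, 3, 0])

cm = confusion_matrix(y_true, y_pred, num_classes=5)
print(cm)
```

Off-diagonal entries count the mistakes; a large value in column 0, for example, would indicate that other labels are often predicted as the heart emoji.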


# Generating Emoji for a New Sentence:

In [93]:
text = input()

hi


In [94]:
# Converting user input sentence to numpy array. 
x_test = np.array([text])
# Converting words to indices. 
X_test_indices = sentences_to_indices(x_test, word_to_index, maxLen)
print(x_test[0] +' '+  label_to_emoji(np.argmax(model.predict(X_test_indices))))

hi 😄
