## 2.3 Building the Emojifier-V2

Let's now build the Emojifier-V2 model. You will do so using the embedding layer you have built, feeding its output to an LSTM network.

<img src="images/emojifier-v2.png" style="width:700px;height:400px;"> <br>
<caption><center> **Figure 3**: Emojifier-v2. A 2-layer LSTM sequence classifier. </center></caption>


**Exercise:** Implement `Emojify_V2()`, which builds a Keras graph of the architecture shown in Figure 3. The model takes as input an array of sentence indices of shape (`m`, `max_len`), where `input_shape` is `(max_len,)`. It should output a softmax probability vector of shape (`m`, `C = 5`). You may need `Input(shape = ..., dtype = '...')`, [LSTM()](https://keras.io/layers/recurrent/#lstm), [Dropout()](https://keras.io/layers/core/#dropout), [Dense()](https://keras.io/layers/core/#dense), and [Activation()](https://keras.io/activations/).

In [6]:
import numpy as np
from keras.models import Model
from keras.layers import Dense, Input, Dropout, LSTM, Activation, Embedding
from keras.preprocessing import sequence
from keras.initializers import glorot_uniform
from emo_utils import *

In [7]:
# Load the pretrained GloVe word embeddings
word_to_index, index_to_word, word_to_vec_map = read_glove_vecs('data/glove.6B.50d.txt')

In [13]:
print(word_to_index['bottle'])

81641


In [14]:
print(index_to_word[81641])

bottle


In [15]:
print(word_to_vec_map['bottle'])

[ 0.20635   0.2606   -0.094723 -0.73396   0.72598   0.5099   -0.39352
 -0.45703   0.49335   1.3791    0.10285   0.14997   0.41506  -0.19039
  1.0527    0.16514  -0.16717   0.8092   -0.97394  -1.753     0.34632
 -0.053064  0.33046  -0.021036 -0.78655  -1.0088   -0.30341   1.6766
  0.90808  -0.39309   1.2131    0.21588  -0.87778   1.3756    0.57432
  0.35111   0.39926   0.33184   1.2035   -0.21218   1.2316    0.58557
 -0.40531   0.37376   0.16584   0.56948  -0.13898  -0.29062   0.56082
 -0.94112 ]


In [16]:
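# Convert sentences into arrays of word indices
def sentences_to_indices(X, word_to_index, maxLen):
    """
    Converts an array of sentences (strings) into an array of word indices.

    Arguments:
    X -- array of sentences (strings), of shape (m,)
    word_to_index -- dictionary mapping each word to its index
    maxLen -- maximum number of words in a sentence; shorter sentences are zero-padded

    Returns:
    X_indices -- array of word indices, of shape (m, maxLen)
    """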
    m = X.shape[0] # number of training examples
    X_indices = np.zeros((m, maxLen))
    for i in range(m):
        words = X[i].lower().split()
        for j in range(len(words)):
            w = words[j]
            X_indices[i, j] = word_to_index[w]
    return X_indices

In [20]:
X1 = np.array(["funny lol", "lets play football"])
print(X1.shape)
X1_indices = sentences_to_indices(X1, word_to_index, 10)
print(X1_indices)

(2,)
[[155345. 225122.      0.      0.      0.      0.      0.      0.      0.
       0.]
 [220930. 286375. 151266.      0.      0.      0.      0.      0.      0.
       0.]]
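As written, `sentences_to_indices` raises a `KeyError` for any word missing from `word_to_index` and an `IndexError` for any sentence longer than `maxLen`. The sketch below shows one possible way to guard against both. The function name `sentences_to_indices_safe`, the `unk_index` parameter, and the toy vocabulary are assumptions made for illustration; they are not part of the notebook or of `emo_utils`. Note that mapping unknown words to index 0 makes them indistinguishable from padding, which may or may not be acceptable for a given task.

```python
import numpy as np

def sentences_to_indices_safe(X, word_to_index, max_len, unk_index=0):
    """Hypothetical variant of sentences_to_indices (not from the notebook).

    Words missing from word_to_index map to unk_index, and sentences longer
    than max_len are truncated instead of raising an IndexError.
    """
    m = X.shape[0]
    X_indices = np.zeros((m, max_len))
    for i in range(m):
        words = X[i].lower().split()[:max_len]  # truncate long sentences
        for j, w in enumerate(words):
            X_indices[i, j] = word_to_index.get(w, unk_index)
    return X_indices

# Toy vocabulary for illustration only; real indices come from read_glove_vecs.
toy_word_to_index = {"funny": 1, "lol": 2, "lets": 3, "play": 4}
toy_X = np.array(["funny lol", "lets play football now"])
toy_indices = sentences_to_indices_safe(toy_X, toy_word_to_index, max_len=3)
print(toy_indices)
```

In the second toy sentence, "football" is not in the toy vocabulary and falls back to `unk_index`, while "now" is dropped by the truncation.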


In [21]:
print(word_to_vec_map['a'].shape)

(50,)


In [24]:
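# Build the embedding layer in Keras
def pretrained_embedding_layer(word_to_vec_map, word_to_index):
    """
    Creates a Keras Embedding() layer and loads the pretrained GloVe vectors into it.

    The layer's weights have shape (vocab_size + 1, emb_dim); the extra row
    leaves room for index 0, which sentences_to_indices uses for padding.

    Returns:
    embedding_layer -- pretrained Keras Embedding layer instance
    """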
    vocab_len = len(word_to_index) + 1
    emb_dim = word_to_vec_map['a'].shape[0]
    
    emb_matrix = np.zeros((vocab_len, emb_dim))
    
    # Set row i of the embedding matrix to the vector of the word with index i
    for w, i in word_to_index.items():
        emb_matrix[i, :] = word_to_vec_map[w]
        
    # Define the Keras embedding layer with the correct input/output sizes and make it non-trainable (trainable=False).
    embedding_layer = Embedding(vocab_len, emb_dim, trainable=False)
    
    # Build the embedding layer, it is required before setting the weights of the embedding layer.
    embedding_layer.build((None,))
    
    # set the weights of the embedding layer to the embedding matrix.
    embedding_layer.set_weights([emb_matrix])
    
    return embedding_layer

In [25]:
embedding_layer = pretrained_embedding_layer(word_to_vec_map, word_to_index)

In [28]:
print(embedding_layer.get_weights()[0][81641])

[ 0.20635   0.2606   -0.094723 -0.73396   0.72598   0.5099   -0.39352
 -0.45703   0.49335   1.3791    0.10285   0.14997   0.41506  -0.19039
  1.0527    0.16514  -0.16717   0.8092   -0.97394  -1.753     0.34632
 -0.053064  0.33046  -0.021036 -0.78655  -1.0088   -0.30341   1.6766
  0.90808  -0.39309   1.2131    0.21588  -0.87778   1.3756    0.57432
  0.35111   0.39926   0.33184   1.2035   -0.21218   1.2316    0.58557
 -0.40531   0.37376   0.16584   0.56948  -0.13898  -0.29062   0.56082
 -0.94112 ]
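The check above shows that row 81641 of the layer's weights is the vector for "bottle". Conceptually, an embedding lookup is just row indexing into that weight matrix. The following NumPy-only sketch illustrates the idea with a toy 3-word vocabulary and 4-dimensional vectors; these toy values are assumptions for illustration, not GloVe data, and the sketch does not use Keras at all.

```python
import numpy as np

# Toy stand-ins (assumptions for illustration): a 3-word vocabulary with
# 4-dimensional vectors instead of the GloVe vocabulary and 50 dimensions.
toy_word_to_index = {"a": 1, "bottle": 2, "play": 3}
toy_word_to_vec_map = {
    "a": np.array([0.1, 0.2, 0.3, 0.4]),
    "bottle": np.array([1.0, 0.0, -1.0, 0.5]),
    "play": np.array([0.0, 2.0, 0.0, -2.0]),
}

vocab_len = len(toy_word_to_index) + 1  # +1 leaves room for padding index 0
emb_dim = toy_word_to_vec_map["a"].shape[0]
emb_matrix = np.zeros((vocab_len, emb_dim))
for w, i in toy_word_to_index.items():
    emb_matrix[i, :] = toy_word_to_vec_map[w]

# A batch of one padded sentence: "play a bottle" followed by one padding index.
indices = np.array([[3, 1, 2, 0]])
embeddings = emb_matrix[indices]  # integer-array indexing selects rows

print(embeddings.shape)  # (batch, max_len, emb_dim)
```

Because no toy word uses index 0, the padding position maps to an all-zero vector.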


## Model

In [47]:
def Emojify_V2(input_shape, word_to_vec_map, word_to_index):
    """
    Function creating the Emojify-v2 model's graph.
    
    Arguments:
    input_shape -- shape of the input, usually (max_len,)
    word_to_vec_map -- dictionary mapping every word in a vocabulary into its 50-dimensional vector representation
    word_to_index -- dictionary mapping from words to their indices in the vocabulary (400,001 words)

    Returns:
    model -- a model instance in Keras
    """
    # input tensor
    sentence_indices = Input(shape=input_shape, dtype=np.int32)
    
    # Create the embedding layer
    embedding_layer = pretrained_embedding_layer(word_to_vec_map, word_to_index)
    
    # Propagate sentence_indices through the embedding_layer, you get back the embeddings
    embeddings = embedding_layer(sentence_indices)
    
    # Propagate the embeddings through an LSTM layer with a 128-dimensional hidden state
    # Be careful, the returned output should be a batch of sequences.
    X = LSTM(128, return_sequences=True)(embeddings)
    
    # Add dropout with a probability of 0.5
    X = Dropout(0.5)(X)
    
    # Propagate X through a second LSTM layer with a 128-dimensional hidden state
    # Be careful, the returned output should be a single hidden state, not a batch of sequences.
    X = LSTM(128, return_sequences=False)(X)
    
    # Add dropout with a probability of 0.5
    X = Dropout(0.5)(X)
    
    # Propagate X through a Dense layer with softmax activation to get back a batch of 5-dimensional vectors.
    X = Dense(5, activation='softmax')(X)
    
    # A separate Activation('softmax') layer is not needed here,
    # because the Dense layer above already applies softmax.
    
    # Create Model instance which converts sentence_indices into X.
    model = Model(sentence_indices, X)
    
    return model
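The two LSTM layers differ only in `return_sequences`: the first passes a hidden state for every time step to the next layer, while the second passes only the final one. The sketch below illustrates the resulting shapes with a plain tanh RNN written in NumPy. It is deliberately not an LSTM and not Keras code; the function `simple_rnn_sketch`, its random weights, and the input batch are assumptions chosen only to make the shape difference concrete.

```python
import numpy as np

def simple_rnn_sketch(x, units, return_sequences, seed=0):
    """Plain tanh RNN (not an LSTM), used only to illustrate output shapes."""
    rng = np.random.default_rng(seed)
    m, T, d = x.shape
    Wx = rng.normal(scale=0.1, size=(d, units))      # input-to-hidden weights
    Wh = rng.normal(scale=0.1, size=(units, units))  # hidden-to-hidden weights
    h = np.zeros((m, units))
    states = []
    for t in range(T):
        h = np.tanh(x[:, t, :] @ Wx + h @ Wh)
        states.append(h)
    if return_sequences:
        return np.stack(states, axis=1)  # one hidden state per time step
    return h                             # only the last hidden state

# Hypothetical batch: 2 sentences, 10 time steps, 50-dimensional embeddings.
x = np.random.default_rng(1).normal(size=(2, 10, 50))
seq = simple_rnn_sketch(x, 128, return_sequences=True)
last = simple_rnn_sketch(x, 128, return_sequences=False)
print(seq.shape, last.shape)
```

With the same weights, the last time step of the full sequence equals the single returned state, which is why only the final layer before the classifier drops the time axis.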
    

In [50]:
# Load the training and test data
X_train, Y_train = read_csv('data/train_emoji.csv')
X_test, Y_test = read_csv('data/test.csv')
Y_oh_test = convert_to_one_hot(Y_test, C = 5)

In [56]:
# Number of words in the longest training sentence
maxLen = max(len(s.split()) for s in X_train)
X_train_indices = sentences_to_indices(X_train, word_to_index, maxLen)
Y_oh_train = convert_to_one_hot(Y_train, C = 5)
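`convert_to_one_hot` comes from `emo_utils`, whose source is not shown here. A minimal sketch of what such a helper presumably does, assuming integer class labels in the range `0..C-1`, is given below; the name `one_hot_sketch` and the toy labels are hypothetical.

```python
import numpy as np

def one_hot_sketch(Y, C):
    """Assumed behaviour of convert_to_one_hot: row i is the one-hot code of Y[i]."""
    return np.eye(C)[np.asarray(Y).reshape(-1)]

toy_Y = np.array([0, 3, 4])
toy_Y_oh = one_hot_sketch(toy_Y, C=5)
print(toy_Y_oh)
```

Taking `np.argmax` along axis 1 of the result recovers the original labels.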

In [53]:
model = Emojify_V2((maxLen,), word_to_vec_map, word_to_index)

In [54]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_4 (InputLayer)         (None, 10)                0         
_________________________________________________________________
embedding_5 (Embedding)      (None, 10, 50)            20000050  
_________________________________________________________________
lstm_4 (LSTM)                (None, 10, 128)           91648     
_________________________________________________________________
dropout_3 (Dropout)          (None, 10, 128)           0         
_________________________________________________________________
lstm_5 (LSTM)                (None, 128)               131584    
_________________________________________________________________
dropout_4 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_2 (Dense)              (None, 5)                 645       
Total para
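The per-layer parameter counts in the summary can be reproduced by hand. The sketch below assumes the standard LSTM formulation with bias terms and no peephole connections, and a vocabulary of 400,001 rows as stated in the `Emojify_V2` docstring. The helper names are introduced here for illustration only.

```python
def embedding_params(vocab_len, emb_dim):
    # One emb_dim-dimensional row per vocabulary index.
    return vocab_len * emb_dim

def lstm_params(input_dim, units):
    # Four weight sets (input, forget, and output gates plus the cell candidate),
    # each with an input kernel, a recurrent kernel, and a bias vector.
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit.
    return input_dim * units + units

counts = {
    "embedding": embedding_params(400001, 50),
    "lstm_1": lstm_params(50, 128),
    "lstm_2": lstm_params(128, 128),
    "dense": dense_params(128, 5),
}
for name, n in counts.items():
    print(name, n)

# The embedding layer is frozen (trainable=False), so only the rest is trained.
trainable = sum(n for name, n in counts.items() if name != "embedding")
print("trainable", trainable)
```

Nearly all of the model's parameters sit in the frozen embedding matrix, so training only has to fit the two LSTM layers and the final Dense layer.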

In [59]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
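The `categorical_crossentropy` loss compares each one-hot label with the predicted softmax vector. A minimal pure-Python sketch of the underlying formula follows; it is an illustration of the math, not Keras' implementation, and the function name, the `eps` clipping value, and the toy inputs are assumptions.

```python
import math

def categorical_crossentropy_sketch(y_true, y_pred, eps=1e-7):
    """Mean over examples of -sum_c y_true[c] * log(y_pred[c])."""
    total = 0.0
    for t_row, p_row in zip(y_true, y_pred):
        # Clip probabilities away from zero so log() stays finite.
        total += -sum(t * math.log(max(p, eps)) for t, p in zip(t_row, p_row))
    return total / len(y_true)

# One toy example whose true class is index 2.
toy_true = [[0, 0, 1, 0, 0]]
toy_pred = [[0.1, 0.1, 0.6, 0.1, 0.1]]
toy_loss = categorical_crossentropy_sketch(toy_true, toy_pred)
print(toy_loss)
```

With one-hot labels, only the predicted probability of the true class contributes, so the loss for this example reduces to `-log(0.6)`.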

In [69]:
model.fit(X_train_indices, Y_oh_train, epochs=50, batch_size=32, shuffle=True)

Epoch 1/50
...
Epoch 50/50


<keras.callbacks.History at 0x2f022198>

In [70]:
X_test_indices = sentences_to_indices(X_test, word_to_index, maxLen = maxLen)
Y_test_oh = convert_to_one_hot(Y_test, C = 5)
loss, acc = model.evaluate(X_test_indices, Y_test_oh)
print()
print("Test accuracy = ", acc)


Test accuracy =  0.9464285629136222
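The accuracy metric is the fraction of examples whose highest-probability class matches the label. The sketch below computes it by hand on made-up probabilities and labels, and also lists the indices of misclassified examples, which is a common next step when inspecting errors. All values here are toy assumptions, not outputs of the trained model.

```python
import numpy as np

# Toy softmax outputs for three examples and five classes (made-up values).
toy_probs = np.array([
    [0.10, 0.70, 0.10, 0.05, 0.05],
    [0.60, 0.10, 0.10, 0.10, 0.10],
    [0.20, 0.20, 0.20, 0.30, 0.10],
])
toy_labels = np.array([1, 0, 4])

pred_labels = np.argmax(toy_probs, axis=1)             # most likely class per row
accuracy = np.mean(pred_labels == toy_labels)          # fraction of correct predictions
mislabelled = np.flatnonzero(pred_labels != toy_labels)  # indices of wrong predictions

print(pred_labels, accuracy, mislabelled)
```

Note the `axis=1` argument: for a batch of predictions, `np.argmax` without an axis would return a single index into the flattened array.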


In [85]:
prediction = model.predict(sentences_to_indices(np.array(['i would like to have some noodle']), word_to_index, maxLen))
print(label_to_emoji(np.argmax(prediction)))

🍴


In [86]:
print(prediction.shape)

(1, 5)
