# LSTM Model to Predict Change in Sentiments

Here, I use an LSTM model to predict the change in the sentiment of an incoming utterance given the previous sentiment. This model takes as inputs word sequences and the sentiment of the previous utterance. This model is able to take word order into account. Similarly with the previous models, I used a pre-trained word embedding to represent words in 50-dimensional GloVe embeddings. For my implementation, I used Keras with Tensorflow as the backend. 

## Mini-batch Training
Messages have different lengths and all the input sequences must have the same length to train the model with mini-batches. According to DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset (https://arxiv.org/pdf/1710.03957.pdf), the average number of tokens per utterance is 14.9. Therefore, I set the maximum length of the incoming sequences to 15 tokens. Any messages shorter than 15 were padded with zeros and the ones longer than 15 were right truncated. The batch size is 32 and I trained the model for 10 epochs.

## Overview of the Model
Here is the sentiment change prediction model that I used
![alt text](lstm_model.jpg "Model")

In [2]:
from utility import *

from keras.models import Model
from keras.layers import Dense, Input, Dropout, LSTM, Activation, Concatenate
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence
from keras.initializers import glorot_uniform

np.random.seed(1)

  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
  return f(*args, **kwds)


In [3]:
# 50-dimensional GloVe embeddings
word_to_index, index_to_word, word_to_vec_map = read_glove_vecs('glove.6B/glove.6B.50d.txt')

In [4]:
def sentences_to_indices(samples, word_to_index, max_len):
    '''
    this function converts an array of sentences (strings) into an array of indices corresponding 
    to words in the sentences.
    The output shape should be such that it can be given to Embedding(). 
    '''
    m = len(samples)                                   # number of training examples
    X_indices = np.zeros((m, max_len))
    
    for i, sentence in enumerate(samples):            # loop over training examples
        
        sentence = re.sub(r'[^\w\s]', ' ', sentence.strip())  
        words = [i.lower() for i in sentence.strip().split()]
        j = 0
        
        for w in words:
            if w in word_to_index:
                X_indices[i, j] = word_to_index[w]
            else:
                X_indices[i, j] = word_to_index['unk']
            j += 1
            if j == max_len: break
                
    return X_indices

In [5]:
def create_features(samples_list):
    '''
    this function creates the inputs and output of the model
    '''
    X, Y, aux_X = [], [], []
    for sample in samples_list:            
        X.append(sample['utterance'])
        Y.append(sample['current_emotion'] - sample['prev_emotion'])
        aux_X.append(sample['prev_emotion'])
    
    return X, aux_X, Y

In [6]:
def pretrained_embedding_layer(word_to_vec_map, word_to_index):
    '''
    this function creates an embedding layer using glove.6B
    '''
    vocab_len = len(word_to_index) + 1                  # adding 1 to fit Keras embedding (requirement)
    emb_dim = word_to_vec_map["cucumber"].shape[0]      # define dimensionality of your GloVe word vectors (= 50)
    emb_matrix = np.zeros((vocab_len, emb_dim))
    
    for word, index in word_to_index.items():
        emb_matrix[index, :] = word_to_vec_map[word]

    embedding_layer = Embedding(vocab_len, emb_dim, trainable=False)
    embedding_layer.build((None,))
    embedding_layer.set_weights([emb_matrix])
    
    return embedding_layer

In [16]:
from keras import backend as K

def coeff_determination(y_true, y_pred):
    SS_res =  K.sum(K.square( y_true-y_pred )) 
    SS_tot = K.sum(K.square( y_true - K.mean(y_true) ) ) 
    return ( 1 - SS_res/(SS_tot + K.epsilon()) )

In [7]:
def sentiment_model(input_shape, word_to_vec_map, word_to_index):
    '''
    this function creates the rnn model
    '''
    sentence_indices = Input(input_shape, dtype='int32')
    aux_input = Input((1, ), dtype='float32')
    
    embedding_layer = pretrained_embedding_layer(word_to_vec_map, word_to_index)
    embeddings = embedding_layer(sentence_indices)   
    
    X = LSTM(128, return_sequences=True)(embeddings)
    X = Dropout(0.5)(X)
    X = LSTM(128, return_sequences=False)(X)
    X = Dropout(0.5)(X)
    X = Dense(1)(X)
    
    merged = Concatenate()([aux_input, X])
    merged = Dense(1)(merged)
    output = Activation('tanh')(merged)
    model = Model(inputs=[aux_input, sentence_indices], outputs=output)
        
    return model

In [8]:
# load train, test and validation sets
train_conversations = load_conversations(category='train')
find_sentiments(train_conversations)

test_conversations = load_conversations(category='test')
find_sentiments(test_conversations)

validation_conversations = load_conversations(category='validation')
find_sentiments(validation_conversations)

In [9]:
# any messages longer than max length will be cut 
# and messages shorter than max length will be padded with 0
maxLen = 15

In [10]:
# create training samples
train_samples = create_samples(train_conversations)
x_train, aux_x_train, y_train = create_features(train_samples)
print("Number of training samples: ", len(x_train), len(aux_x_train), len(y_train))

Number of training samples:  76052 76052 76052


In [11]:
# change the shape of auxilary input: previous emotion
aux_x_train = np.array(aux_x_train) 
aux_x_train.reshape(-1, 1)
# print(aux_x_train.shape)

array([[-0.2   ],
       [-0.35  ],
       [-0.3125],
       ...,
       [ 0.    ],
       [ 0.    ],
       [ 0.    ]])

In [12]:
# create the sentiment model and show the model summary
model = sentiment_model((maxLen,), word_to_vec_map, word_to_index)
model.summary()

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            (None, 15)           0                                            
__________________________________________________________________________________________________
embedding_1 (Embedding)         (None, 15, 50)       20000050    input_1[0][0]                    
__________________________________________________________________________________________________
lstm_1 (LSTM)                   (None, 15, 128)      91648       embedding_1[0][0]                
__________________________________________________________________________________________________
dropout_1 (Dropout)             (None, 15, 128)      0           lstm_1[0][0]                     
__________________________________________________________________________________________________
lstm_2 (LS

In [17]:
# compile and fit the model
model.compile(loss='mse', optimizer='adam', metrics=['mse', coeff_determination])
x_train_indices = sentences_to_indices(x_train, word_to_index, maxLen)
model.fit([aux_x_train, x_train_indices], y_train, epochs=10, batch_size=32, shuffle=True)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x1a2f77f4a8>

In [20]:
# create the test samples
test_samples = create_samples(test_conversations)
x_test, aux_x_test, y_test = create_features(test_samples)

# reshape the auxilary test input
aux_x_test = np.array(aux_x_test) 
aux_x_test.reshape(-1, 1)

# evaluate the model with the test set
x_test_indices = sentences_to_indices(x_test, word_to_index, max_len=maxLen)
loss, mse, r2 = model.evaluate([aux_x_test, x_test_indices], y_test)

print("Test mean square error = ", mse)
print("Test r-squared = ", r2)

Test mean square error =  0.017323209428800498
Test mean square error =  0.8795359571538977


In [22]:
# create the test samples
validation_samples = create_samples(validation_conversations)
x_validation, aux_x_validation, y_validation = create_features(validation_samples)

# reshape the auxilary test input
aux_x_validation = np.array(aux_x_validation) 
aux_x_validation.reshape(-1, 1)

# evaluate the model with the test set
x_validation_indices = sentences_to_indices(x_validation, word_to_index, max_len=maxLen)
loss, ms, r2 = model.evaluate([aux_x_validation, x_validation_indices], y_validation)

print("Validation mean square error = ", mse)
print("Validation r-squared: ", r2)

Validation mean square error =  0.017323209428800498
Validation r-squared:  0.8747538389554539
