# Report on Movie Review Classification
GitHub link: https://github.com/soobino/adv_machine_learning

*   QMSS 5074
*   Advanced Machine Learning
*   Spring 2023

## Set Up

**Discuss the dataset in general terms and describe why building a predictive model using this data might be practically useful.  Who could benefit from a model like this? Explain.**

In [2]:
#install aimodelshare library
! pip install aimodelshare==0.0.189

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting aimodelshare==0.0.189
  Downloading aimodelshare-0.0.189-py3-none-any.whl (967 kB)
Collecting scikit-learn==1.2.1
  Downloading scikit_learn-1.2.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.6 MB)
Collecting tensorflow==2.9.2
  Downloading tensorflow-2.9.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (511.8 MB)
Collecting Pympler==0.9
  Downloading Pympler-0.9.tar.gz (178 kB)

In [1]:
# Get competition data
from aimodelshare import download_data
download_data('public.ecr.aws/y2e2a1d6/sst2_competition_data-repository:latest') 


Data downloaded successfully.


In [2]:
# Set up X_train, X_test, and y_train_labels objects
import pandas as pd
import warnings
warnings.simplefilter(action='ignore', category=Warning)

# read_csv's squeeze=True is deprecated; squeeze the single column into a Series explicitly
X_train=pd.read_csv("sst2_competition_data/X_train.csv").squeeze("columns")
X_test=pd.read_csv("sst2_competition_data/X_test.csv").squeeze("columns")

y_train_labels=pd.read_csv("sst2_competition_data/y_train_labels.csv").squeeze("columns")

y_train = pd.get_dummies(y_train_labels)
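`pd.get_dummies` turns the label column into one indicator column per class, and the submission cells later map each argmax index back to a label through `y_train.columns`. The round trip can be sketched in plain Python. The label strings `"Positive"` and `"Negative"` and the helper names `one_hot` and `decode` are assumptions made for illustration; the actual values in `y_train_labels.csv` are not shown in this report.

```python
def one_hot(labels):
    """Return (sorted class names, one indicator row per label)."""
    classes = sorted(set(labels))
    rows = [[1 if label == c else 0 for c in classes] for label in labels]
    return classes, rows

def decode(scores, classes):
    """Map each row of class scores back to the label with the highest score."""
    return [classes[max(range(len(row)), key=row.__getitem__)] for row in scores]

# Hypothetical labels, used only to illustrate the encoding
labels = ["Positive", "Negative", "Positive"]
classes, encoded = one_hot(labels)

# Two made-up rows of softmax-like scores, decoded the same way as
# prediction_labels is built from argmax(axis=1) and y_train.columns
predicted = decode([[0.2, 0.8], [0.9, 0.1]], classes)
```

Because the column order fixes which output unit means which class, the same order has to be used for encoding and decoding.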

In [3]:
from tensorflow import keras
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.utils import pad_sequences
import numpy as np

max_words =20000
maxlen=100
embedding_dim = 100 

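# NOTE: num_words=maxlen (100) keeps only the most frequent words with index below 100;
# max_words (20000) was probably the intended vocabulary cap.
tokenizer = Tokenizer(num_words=maxlen)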
tokenizer.fit_on_texts(X_train)

def preprocessor(data, maxlen=maxlen, max_words=max_words):

    sequences = tokenizer.texts_to_sequences(data)

    word_index = tokenizer.word_index
    X = pad_sequences(sequences, maxlen=maxlen)

    return X
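The preprocessor relies on the default behaviour of `pad_sequences`, which pads short sequences with zeros at the front and, for sequences longer than `maxlen`, drops ids from the front. The sketch below reimplements that default in plain Python to show the effect. It is not the Keras implementation, and `pad_pre` is a hypothetical helper name.

```python
def pad_pre(sequences, maxlen, value=0):
    """Left-pad short sequences and keep only the last `maxlen` ids of long ones."""
    padded = []
    for seq in sequences:
        kept = seq[-maxlen:]  # truncating='pre': drop ids from the front
        padded.append([value] * (maxlen - len(kept)) + kept)  # padding='pre'
    return padded

# One short and one long toy sequence of word ids
toy = pad_pre([[5, 3], [7, 1, 4, 9, 2, 8]], maxlen=4)
```

With `maxlen=100`, every review becomes a row of exactly 100 integers, which is the input shape the Embedding layers below expect.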

In [4]:
import aimodelshare as ai
ai.export_preprocessor(preprocessor,"") 

Your preprocessor is now saved to 'preprocessor.zip'


In [6]:
#Set credentials using modelshare.org username/password

from aimodelshare.aws import set_credentials
    
apiurl="https://rlxjxnoql9.execute-api.us-east-1.amazonaws.com/prod/m" #This is the unique rest api that powers this specific Playground

set_credentials(apiurl=apiurl)

mycompetition= ai.Competition(apiurl)

AI Modelshare Username:··········
AI Modelshare Password:··········
AI Model Share login credentials set successfully.


## Prediction Models Part 1


**1. Use an Embedding layer and LSTM layers**

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding, LSTM, Flatten

model_1 = Sequential()
model_1.add(Embedding(max_words, embedding_dim, input_length=maxlen))
model_1.add(LSTM(64, return_sequences=True, dropout=0.2))
model_1.add(LSTM(32, dropout=0.2))
model_1.add(Flatten())
model_1.add(Dense(2, activation='softmax'))

model_1.summary()

Model: "sequential_8"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_8 (Embedding)     (None, 100, 100)          2000000   
                                                                 
 lstm_8 (LSTM)               (None, 100, 64)           42240     
                                                                 
 lstm_9 (LSTM)               (None, 32)                12416     
                                                                 
 flatten_5 (Flatten)         (None, 32)                0         
                                                                 
 dense_11 (Dense)            (None, 2)                 66        
                                                                 
Total params: 2,054,722
Trainable params: 2,054,722
Non-trainable params: 0
_________________________________________________________________
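The parameter counts in this summary can be checked by hand. The sketch below uses the standard formulas for Embedding, LSTM, and Dense layers; the helper names are hypothetical, and the layer sizes are taken from the `model_1` definition above.

```python
def embedding_params(vocab_size, dim):
    # one vector of length `dim` per vocabulary entry
    return vocab_size * dim

def lstm_params(input_dim, units):
    # four gates, each with input weights, recurrent weights, and a bias
    return 4 * (units * (input_dim + units) + units)

def dense_params(input_dim, units):
    # weight matrix plus one bias per unit
    return input_dim * units + units

counts = [
    embedding_params(20000, 100),  # Embedding(max_words, embedding_dim)
    lstm_params(100, 64),          # LSTM(64) on 100-dimensional embeddings
    lstm_params(64, 32),           # LSTM(32) on the 64-dimensional sequence
    dense_params(32, 2),           # Dense(2) output layer
]
total = sum(counts)
```

The four values match the `Param #` column, and almost all of the trainable parameters sit in the Embedding layer.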


In [None]:
model_1.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
history = model_1.fit(preprocessor(X_train), y_train,
                    epochs=5,
                    batch_size=26,
                    validation_split=0.2)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [None]:
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(model_1, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

with open("model_1.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

In [None]:
prediction_column_index=model_1.predict(preprocessor(X_test)).argmax(axis=1)
prediction_labels = [y_train.columns[i] for i in prediction_column_index]
mycompetition.submit_model(model_filepath = "model_1.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): 
Provide any useful notes about your model (optional): 

Your model has been submitted as model version 176

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:2763



**2. Use an Embedding layer and Conv1d layers**

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.layers import SimpleRNN, LSTM,Embedding

model_2 = Sequential()
model_2.add(layers.Embedding(max_words, embedding_dim, input_length=maxlen))
model_2.add(layers.Conv1D(64, 7, activation='relu')) 
model_2.add(layers.Conv1D(32, 7, activation='relu')) 
model_2.add(layers.Conv1D(16, 7, activation='relu')) 
model_2.add(layers.MaxPooling1D(5))
model_2.add(layers.GlobalMaxPooling1D())
model_2.add(layers.Dense(2, activation='softmax'))

model_2.summary()

Model: "sequential_9"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_9 (Embedding)     (None, 100, 100)          2000000   
                                                                 
 conv1d_9 (Conv1D)           (None, 94, 64)            44864     
                                                                 
 conv1d_10 (Conv1D)          (None, 88, 32)            14368     
                                                                 
 conv1d_11 (Conv1D)          (None, 82, 16)            3600      
                                                                 
 max_pooling1d_3 (MaxPooling  (None, 16, 16)           0         
 1D)                                                             
                                                                 
 global_max_pooling1d_3 (Glo  (None, 16)               0         
 balMaxPooling1D)                                     
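The output shapes and parameter counts listed for the convolutional stack follow from the kernel size of 7 and the default `'valid'` padding. The sketch below reproduces them under those assumptions; the helper names are hypothetical.

```python
def conv1d_output_length(length, kernel_size):
    # 'valid' padding with stride 1
    return length - kernel_size + 1

def conv1d_params(in_channels, filters, kernel_size):
    # one kernel of shape (kernel_size, in_channels) per filter, plus a bias
    return kernel_size * in_channels * filters + filters

def maxpool_output_length(length, pool_size):
    # 'valid' padding with stride equal to pool_size
    return length // pool_size

length, channels = 100, 100  # maxlen time steps, embedding_dim channels
shapes, params = [], []
for filters in (64, 32, 16):
    params.append(conv1d_params(channels, filters, 7))
    length, channels = conv1d_output_length(length, 7), filters
    shapes.append((length, channels))

pooled = maxpool_output_length(length, 5)
```

Each convolution shortens the sequence by 6 steps, and `MaxPooling1D(5)` then reduces the 82 remaining steps to 16.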

In [None]:
model_2.compile(optimizer=RMSprop(learning_rate=1e-4),  # `lr` is a deprecated alias
              loss='binary_crossentropy',
              metrics=['acc'])
history = model_2.fit(preprocessor(X_train), y_train,
                    epochs=5,
                    batch_size=26,
                    validation_split=0.2)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [None]:
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(model_2, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

with open("model_2.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

In [None]:
prediction_column_index=model_2.predict(preprocessor(X_test)).argmax(axis=1)
prediction_labels = [y_train.columns[i] for i in prediction_column_index]
mycompetition.submit_model(model_filepath = "model_2.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): 
Provide any useful notes about your model (optional): 

Your model has been submitted as model version 177

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:2763



**3. Use transfer learning with glove embeddings**

In [None]:
# Download Glove embedding matrix weights
! wget http://nlp.stanford.edu/data/wordvecs/glove.6B.zip

--2023-04-15 19:45:34--  http://nlp.stanford.edu/data/wordvecs/glove.6B.zip
Resolving nlp.stanford.edu (nlp.stanford.edu)... 171.64.67.140
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://nlp.stanford.edu/data/wordvecs/glove.6B.zip [following]
--2023-04-15 19:45:34--  https://nlp.stanford.edu/data/wordvecs/glove.6B.zip
Connecting to nlp.stanford.edu (nlp.stanford.edu)|171.64.67.140|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://downloads.cs.stanford.edu/nlp/data/wordvecs/glove.6B.zip [following]
--2023-04-15 19:45:34--  https://downloads.cs.stanford.edu/nlp/data/wordvecs/glove.6B.zip
Resolving downloads.cs.stanford.edu (downloads.cs.stanford.edu)... 171.64.64.22
Connecting to downloads.cs.stanford.edu (downloads.cs.stanford.edu)|171.64.64.22|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 862182753 (822M) [app

In [None]:
! unzip glove.6B.zip 

Archive:  glove.6B.zip
replace glove.6B.100d.txt? [y]es, [n]o, [A]ll, [N]one, [r]ename: N


In [None]:
import os

glove_dir = os.getcwd()

embeddings_index = {}
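# Parse each GloVe line into a word and its vector; the context manager closes the file.
with open(os.path.join(glove_dir, 'glove.6B.100d.txt'), encoding='utf-8') as f:
    for line in f:
        values = line.split()
        word = values[0]
        coefs = np.asarray(values[1:], dtype='float32')
        embeddings_index[word] = coefs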

print('Found %s word vectors.' % len(embeddings_index))

Found 400001 word vectors.


In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers
from tensorflow.keras.optimizers import RMSprop
from tensorflow.keras.layers import SimpleRNN, LSTM,Embedding

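model_3 = Sequential()  # `tf` is not imported until a later cell; use the imported Sequential
# The first Embedding argument is the vocabulary size; maxlen (100) equals the tokenizer's num_words here.
model_3.add(layers.Embedding(maxlen, embedding_dim, input_length=maxlen))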
model_3.add(layers.Flatten())
model_3.add(layers.Dense(64, activation='relu'))
model_3.add(layers.Dense(32, activation='relu'))
model_3.add(layers.Dense(16, activation='relu'))
model_3.add(layers.Dense(2, activation='softmax'))
model_3.summary()

Model: "sequential_12"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_12 (Embedding)    (None, 100, 100)          10000     
                                                                 
 flatten_8 (Flatten)         (None, 10000)             0         
                                                                 
 dense_21 (Dense)            (None, 64)                640064    
                                                                 
 dense_22 (Dense)            (None, 32)                2080      
                                                                 
 dense_23 (Dense)            (None, 16)                528       
                                                                 
 dense_24 (Dense)            (None, 2)                 34        
                                                                 
Total params: 652,706
Trainable params: 652,706
Non-t

In [None]:
import tensorflow as tf

word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(word_index))

# One row per kept word index, one column per embedding dimension (both are 100 here)
embedding_matrix = np.zeros((maxlen, embedding_dim))
for word, i in word_index.items():
    embedding_vector = embeddings_index.get(word)
    if i < maxlen:
        if embedding_vector is not None:
            embedding_matrix[i] = embedding_vector

Found 13835 unique tokens.
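The loop above copies a pretrained vector into row `i` for every word whose index is below the row limit and which has a GloVe entry; all other rows stay at zero. A minimal plain-Python sketch of the same logic follows, using a toy vocabulary and 2-dimensional vectors that are invented for illustration. `build_embedding_matrix` is a hypothetical helper, and the hit counter is an addition that the original cell does not compute.

```python
def build_embedding_matrix(word_index, vectors, num_rows, dim):
    """Row i holds the pretrained vector for the word with index i, else zeros."""
    matrix = [[0.0] * dim for _ in range(num_rows)]
    hits = 0
    for word, i in word_index.items():
        vector = vectors.get(word)
        if i < num_rows and vector is not None:
            matrix[i] = list(vector)
            hits += 1
    return matrix, hits

# Toy data: "zzzz" has no pretrained vector, "rare" falls outside the row limit
toy_index = {"good": 1, "bad": 2, "zzzz": 3, "rare": 4}
toy_vectors = {"good": [0.1, 0.2], "bad": [-0.1, -0.2], "rare": [0.5, 0.5]}
matrix, hits = build_embedding_matrix(toy_index, toy_vectors, num_rows=4, dim=2)
```

With a row limit of 100, as in the cell above, only the most frequent words receive a GloVe vector, even though the tokenizer found 13835 unique tokens.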


In [None]:
model_3.layers[0].set_weights([embedding_matrix])
model_3.layers[0].trainable = False

model_3.compile(optimizer=RMSprop(learning_rate=1e-4),  # `lr` is a deprecated alias
              loss='binary_crossentropy',
              metrics=['acc'])

history = model_3.fit(preprocessor(X_train), y_train,
                    epochs=5,
                    batch_size=26,
                    validation_split=0.2)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [None]:
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(model_3, framework='keras',
                          transfer_learning=True,
                          deep_learning=True)

with open("model_3.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

In [None]:
prediction_column_index=model_3.predict(preprocessor(X_test)).argmax(axis=1)
prediction_labels = [y_train.columns[i] for i in prediction_column_index]
mycompetition.submit_model(model_filepath = "model_3.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): 
Provide any useful notes about your model (optional): 

Your model has been submitted as model version 178

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:2763


**Discuss which models performed better and point out relevant hyper-parameter values for successful models.
Submit your best three models to the leader board for the SST Model Share competition.**

I experimented with three models.

All of them were fitted with `epochs=5`, `batch_size=26`, and `validation_split=0.2`.
The last layer was always a softmax layer that classifies reviews into two categories: negative and positive sentiment.

1. Embedding layer and LSTM layers
* 2 LSTM layers
* Accuracy of 0.6808 at epoch 5/5

2. Embedding layer and Conv1D layers
* 3 Conv1D layers
* Accuracy of 0.6149 at epoch 5/5

3. Transfer learning with GloVe embeddings
* Accuracy of 0.6637 at epoch 5/5

Accuracy was highest with the LSTM layers.

## Prediction Models Part 2


**After you submit your first three models, describe your best model with your team via your team slack channel. Fit and submit up to three more models after learning from your team.** 

In [7]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding, LSTM, Flatten

model_4 = Sequential()
model_4.add(Embedding(max_words, embedding_dim, input_length=maxlen))
model_4.add(LSTM(128, return_sequences=True, dropout=0.2))
model_4.add(LSTM(128, dropout=0.2))
model_4.add(Flatten())
model_4.add(Dense(2, activation='softmax'))

model_4.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 100, 100)          2000000   
                                                                 
 lstm (LSTM)                 (None, 100, 128)          117248    
                                                                 
 lstm_1 (LSTM)               (None, 128)               131584    
                                                                 
 flatten (Flatten)           (None, 128)               0         
                                                                 
 dense (Dense)               (None, 2)                 258       
                                                                 
Total params: 2,249,090
Trainable params: 2,249,090
Non-trainable params: 0
_________________________________________________________________


In [8]:
model_4.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
history = model_4.fit(preprocessor(X_train), y_train,
                    epochs=10,
                    batch_size=10,
                    validation_split=0.2)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [9]:
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(model_4, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

with open("model_4.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

In [10]:
prediction_column_index=model_4.predict(preprocessor(X_test)).argmax(axis=1)
prediction_labels = [y_train.columns[i] for i in prediction_column_index]
mycompetition.submit_model(model_filepath = "model_4.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): 
Provide any useful notes about your model (optional): 

Your model has been submitted as model version 353

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:2763


In [11]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding, LSTM, Flatten

model_5 = Sequential()
model_5.add(Embedding(max_words, embedding_dim, input_length=maxlen))
model_5.add(LSTM(64, return_sequences=True, dropout=0.2))
model_5.add(LSTM(64, dropout=0.2))
model_5.add(Flatten())
model_5.add(Dense(2, activation='softmax'))

model_5.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_1 (Embedding)     (None, 100, 100)          2000000   
                                                                 
 lstm_2 (LSTM)               (None, 100, 64)           42240     
                                                                 
 lstm_3 (LSTM)               (None, 64)                33024     
                                                                 
 flatten_1 (Flatten)         (None, 64)                0         
                                                                 
 dense_1 (Dense)             (None, 2)                 130       
                                                                 
Total params: 2,075,394
Trainable params: 2,075,394
Non-trainable params: 0
_________________________________________________________________


In [12]:
model_5.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
history = model_5.fit(preprocessor(X_train), y_train,
                    epochs=10,
                    batch_size=10,
                    validation_split=0.2)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [13]:
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(model_5, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

with open("model_5.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

In [14]:
prediction_column_index=model_5.predict(preprocessor(X_test)).argmax(axis=1)
prediction_labels = [y_train.columns[i] for i in prediction_column_index]
mycompetition.submit_model(model_filepath = "model_5.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): 
Provide any useful notes about your model (optional): 

Your model has been submitted as model version 359

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:2763


In [15]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding, LSTM, Flatten

model_6 = Sequential()
model_6.add(Embedding(max_words, embedding_dim, input_length=maxlen))
model_6.add(LSTM(64, return_sequences=True, dropout=0.2))
model_6.add(LSTM(64, dropout=0.2))
model_6.add(Flatten())
model_6.add(Dense(2, activation='softmax'))

model_6.summary()

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_2 (Embedding)     (None, 100, 100)          2000000   
                                                                 
 lstm_4 (LSTM)               (None, 100, 64)           42240     
                                                                 
 lstm_5 (LSTM)               (None, 64)                33024     
                                                                 
 flatten_2 (Flatten)         (None, 64)                0         
                                                                 
 dense_2 (Dense)             (None, 2)                 130       
                                                                 
Total params: 2,075,394
Trainable params: 2,075,394
Non-trainable params: 0
_________________________________________________________________


In [16]:
model_6.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
history = model_6.fit(preprocessor(X_train), y_train,
                    epochs=10,
                    batch_size=10,
                    validation_split=0.4)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [17]:
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(model_6, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

with open("model_6.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

In [18]:
prediction_column_index=model_6.predict(preprocessor(X_test)).argmax(axis=1)
prediction_labels = [y_train.columns[i] for i in prediction_column_index]
mycompetition.submit_model(model_filepath = "model_6.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): 
Provide any useful notes about your model (optional): 

Your model has been submitted as model version 363

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:2763


In [19]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Embedding, LSTM, Flatten

model_7 = Sequential()
model_7.add(Embedding(max_words, embedding_dim, input_length=maxlen))
model_7.add(LSTM(128, return_sequences=True, dropout=0.2))
model_7.add(LSTM(128, dropout=0.2))
model_7.add(Flatten())
model_7.add(Dense(2, activation='softmax'))

model_7.summary()

Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_3 (Embedding)     (None, 100, 100)          2000000   
                                                                 
 lstm_6 (LSTM)               (None, 100, 128)          117248    
                                                                 
 lstm_7 (LSTM)               (None, 128)               131584    
                                                                 
 flatten_3 (Flatten)         (None, 128)               0         
                                                                 
 dense_3 (Dense)             (None, 2)                 258       
                                                                 
Total params: 2,249,090
Trainable params: 2,249,090
Non-trainable params: 0
_________________________________________________________________


In [20]:
model_7.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
history = model_7.fit(preprocessor(X_train), y_train,
                    epochs=10,
                    batch_size=10,
                    validation_split=0.4)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [21]:
from aimodelshare.aimsonnx import model_to_onnx

onnx_model = model_to_onnx(model_7, framework='keras',
                          transfer_learning=False,
                          deep_learning=True)

with open("model_7.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

In [22]:
prediction_column_index=model_7.predict(preprocessor(X_test)).argmax(axis=1)
prediction_labels = [y_train.columns[i] for i in prediction_column_index]
mycompetition.submit_model(model_filepath = "model_7.onnx",
                                 preprocessor_filepath="preprocessor.zip",
                                 prediction_submission=prediction_labels)

Insert search tags to help users find your model (optional): 
Provide any useful notes about your model (optional): 

Your model has been submitted as model version 375

To submit code used to create this model or to view current leaderboard navigate to Model Playground: 

 https://www.modelshare.org/detail/model:2763


**Discuss which models you tried and which models performed better and point out relevant hyper-parameter values for successful models.**

My teammates also saw higher performance from models that use LSTM layers. They trained for more epochs, so I tried 10 epochs to see how performance changed. Models 4 and 5, which differ in the number of LSTM units (128 vs. 64), ended up with the same accuracy of 0.7103. This suggests that the number of epochs may matter more for model performance than the number of units. For Model 6, I used a larger validation split of 0.4 (vs. 0.2), with everything else equal to Model 5. This change gave a markedly higher accuracy of 0.8090 at epoch 10/10. Even at epoch 1/10, this model had an accuracy of 0.7926, which is higher than the previous models. Lastly, I tried Model 7 with a 0.4 validation split and more LSTM units (128). The accuracy did not improve, which indicates that the number of units does not affect performance much in the LSTM models.
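When comparing runs with different `validation_split` values, it helps to recall how Keras picks the validation rows: according to its documentation, it holds out the last fraction of the samples, before any shuffling. The sketch below illustrates that behaviour for a hypothetical dataset of 1000 rows. The size, the helper name `split_indices`, and the exact rounding are assumptions.

```python
import math

def split_indices(num_samples, validation_split):
    """Return (train indices, validation indices) for a tail hold-out split."""
    split_at = int(math.floor(num_samples * (1.0 - validation_split)))
    return range(0, split_at), range(split_at, num_samples)

# Hypothetical dataset of 1000 rows, split as in Models 4-5 and Models 6-7
train_20, val_20 = split_indices(1000, 0.2)
train_40, val_40 = split_indices(1000, 0.4)
```

A 0.4 split therefore trains on fewer rows and evaluates on a different, larger tail of the data than a 0.2 split, so accuracies from the two settings are measured on different samples.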