<a href="https://colab.research.google.com/github/tejas-shanthraj/srh-da3-deep-learning/blob/main/10_Sentiment_Analysis_RNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Setup the Environment

In [1]:
import tensorflow.keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import InputLayer, Dense, SimpleRNN, Activation, Dropout, Conv1D
from tensorflow.keras.layers import Embedding, Flatten, LSTM, GRU
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.callbacks import EarlyStopping

import pandas as pd
import numpy as np
import spacy
from sklearn.metrics import classification_report

In [14]:
# Fix Colab bug: https://github.com/googlecolab/colabtools/issues/3409
import locale
locale.getpreferredencoding = lambda do_setlocale: "UTF-8"

## Exploratory Data Analysis

In [2]:
data = pd.read_csv("https://storage.googleapis.com/adsa-data/sentiment-analysis/tweeter.csv", header=None, encoding='latin-1')
data.head()

Unnamed: 0,0,1,2,3,4,5
0,0,1467810369,Mon Apr 06 22:19:45 PDT 2009,NO_QUERY,_TheSpecialOne_,"@switchfoot http://twitpic.com/2y1zl - Awww, t..."
1,0,1467810672,Mon Apr 06 22:19:49 PDT 2009,NO_QUERY,scotthamilton,is upset that he can't update his Facebook by ...
2,0,1467810917,Mon Apr 06 22:19:53 PDT 2009,NO_QUERY,mattycus,@Kenichan I dived many times for the ball. Man...
3,0,1467811184,Mon Apr 06 22:19:57 PDT 2009,NO_QUERY,ElleCTF,my whole body feels itchy and like its on fire
4,0,1467811193,Mon Apr 06 22:19:57 PDT 2009,NO_QUERY,Karoli,"@nationwideclass no, it's not behaving at all...."


In [3]:
# Check for missing values
data.isnull().any()

Unnamed: 0,0
0,False
1,False
2,False
3,False
4,False
5,False


## Preparing Data

We only care about the tweet text and tweet sentiment information, which stored in the 5th column and 0th column in the dataset. In the sentiment column, 0 represents negative, and 1 represents positive.

We organize the data as data_X contains all the tweet text, data_y contains the labels.

The following code will convert the tweet text data_X to sequence format that will be feed into RNNs

In [4]:
data_X = data[5]
print(data_X)

0        @switchfoot http://twitpic.com/2y1zl - Awww, t...
1        is upset that he can't update his Facebook by ...
2        @Kenichan I dived many times for the ball. Man...
3          my whole body feels itchy and like its on fire 
4        @nationwideclass no, it's not behaving at all....
                               ...                        
19995    Just woke up. Having no school is the best fee...
19996    TheWDB.com - Very cool to hear old Walt interv...
19997    Are you ready for your MoJo Makeover? Ask me f...
19998    Happy 38th Birthday to my boo of alll time!!! ...
19999    happy #charitytuesday @theNSPCC @SparksCharity...
Name: 5, Length: 20000, dtype: object


#### Label:
*   0 -> NEGATIVE
*   2 -> NEUTRAL
*   4 -> POSITIVE

In [5]:
data_y = pd.get_dummies(data[0]).to_numpy()
print(data_y)

[[ True False]
 [ True False]
 [ True False]
 ...
 [False  True]
 [False  True]
 [False  True]]


Splitting Data for Training

In [6]:
# TODO: Split data into train and valid sets
from sklearn.model_selection import train_test_split
train_X, valid_X, train_y, valid_y = train_test_split(data_X, data_y, test_size=0.2, random_state=42)

## Tokenization

In [7]:
MAX_VOCAB = 18000
MAX_LEN = 150
EMBED_SIZE = 200

In [8]:
# TODO: Tokenize inputs
tokenizer = Tokenizer(num_words=MAX_VOCAB)
tokenizer.fit_on_texts(train_X)

train_X = tokenizer.texts_to_sequences(train_X)
valid_X = tokenizer.texts_to_sequences(valid_X)

word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(word_index))

Found 26000 unique tokens.


In [9]:
# TODO: Text padding
train_X = pad_sequences(train_X, maxlen = MAX_LEN, padding="post")
valid_X = pad_sequences(valid_X, maxlen = MAX_LEN, padding="post")

train_X.shape

(16000, 150)

In [10]:
train_X

array([[ 554,  238,   12, ...,    0,    0,    0],
       [  16,  313, 1589, ...,    0,    0,    0],
       [  87,  303,    5, ...,    0,    0,    0],
       ...,
       [   5, 1052,  239, ...,    0,    0,    0],
       [ 814,   31,   13, ...,    0,    0,    0],
       [   6,    1,  827, ...,    0,    0,    0]], dtype=int32)

## Preparing Word Embeddings using the GloVe Model

In [11]:
!pip install -U gensim



In [12]:
import gensim.downloader as api

# Load the twitter embeddings model. This model is trained on 2 billion tweets, which contains 27 billion tokens, 1.2 million vocabs.
# might take a while
glove_model = api.load("glove-twitter-200")



In [13]:
# calcultaete number of words
nb_words = len(word_index) + 1
print('All words: ', nb_words)

# obtain the word embedding matrix
embedding_matrix = np.zeros((nb_words, EMBED_SIZE))
for word, i in word_index.items():
    if word in glove_model:
        embedding_matrix[i] = glove_model[word]

print('Null word embeddings: %d' % np.sum(np.sum(embedding_matrix, axis=1) == 0))

All words:  26001
Null word embeddings: 10327


**Explanation of the steps performed till now**

Tweets: Is upset that he can't update his Facebook..

Expected Input to RNN model -
Is - Embeddings [200] (32)

upset - Embeddings [200] (450)

that - Embeddings [200] (43)

he - Embeddings [200] (56)

1. Vocabulary of all tweets: 30257 unique tokens
2. Unique token IDs: ID (1, 2, 3, 4... for all the 30257 tokens)
3. Tweets represented as the sequence of IDs [32 450 43 56 ...]

Padding:
"Commonly in RNN's, we take the final output or hidden state and use this to make a prediction (or do whatever task we are trying to do).
If we send a bunch of 0's to the RNN before taking the final output (i.e. 'post' padding as you describe), then the hidden state of the network at the final word in the sentence would likely get 'flushed out' to some extent by all the zero inputs that come after this word.
So intuitively, this might be why pre-padding is more popular/effective." - [link](https://stackoverflow.com/questions/46298793/how-does-choosing-between-pre-and-post-zero-padding-of-sequences-impact-results)

Padding for RNNs - [Link](https://datascience.stackexchange.com/questions/49168/padding-sequences-for-neural-sequence-models-rnns)

[Paper](https://arxiv.org/abs/1903.07288)





## Training and Evaluation


Train and evaluate the SimpleRNN, LSTM, and GRU networks on our prepared dataset.

We are using the pre-trained word embeddings from the glove.twitter.27B.200d.txt data. Using the pre-trained word embeddings as weights for the Embedding layer leads to better results and faster convergence.

We set each models to run 20 epochs, but we also set EarlyStopping rules to prevent overfitting. The results of the SimpleRNN, LSTM, GRU models can be seen below.

In [14]:
model_rnn = Sequential()
model_rnn.add(Embedding(nb_words, EMBED_SIZE, weights=[embedding_matrix], input_length=MAX_LEN, trainable = False))

# TODO: Add a SimpleRNN layer
model_rnn.add(SimpleRNN(128, activation="tanh", return_sequences=False))

model_rnn.add(Dense(2, activation='softmax'))
model_rnn.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

model_rnn.fit(train_X, train_y, epochs=20, batch_size=120,
          validation_data=(valid_X, valid_y), callbacks=EarlyStopping(monitor='val_accuracy', mode='max',patience=3))

predictions_rnn = model_rnn.predict(valid_X)
predictions_rnn = predictions_rnn.argmax(axis=1)
print(classification_report(valid_y.argmax(axis=1), predictions_rnn))



Epoch 1/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m7s[0m 36ms/step - accuracy: 0.4967 - loss: 0.6982 - val_accuracy: 0.4875 - val_loss: 0.7107
Epoch 2/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 21ms/step - accuracy: 0.5002 - loss: 0.7043 - val_accuracy: 0.4915 - val_loss: 0.7037
Epoch 3/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 21ms/step - accuracy: 0.5019 - loss: 0.7028 - val_accuracy: 0.4902 - val_loss: 0.6950
Epoch 4/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 23ms/step - accuracy: 0.5129 - loss: 0.6940 - val_accuracy: 0.5142 - val_loss: 0.7048
Epoch 5/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m11s[0m 68ms/step - accuracy: 0.5148 - loss: 0.6961 - val_accuracy: 0.5160 - val_loss: 0.7046
Epoch 6/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m9s[0m 58ms/step - accuracy: 0.5189 - loss: 0.6952 - val_accuracy: 0.5105 - val_loss: 0.6948
Epoch 7/20
[1m134/13

## LSTM and GRUs

In [15]:
# TODO: Train a LSTM model by replacing the SimpleRNN layer with a LSTM layer
model_lstm = Sequential()
model_lstm.add(Embedding(nb_words, EMBED_SIZE, weights=[embedding_matrix], input_length=MAX_LEN, trainable = False))

# TODO: Add a LSTM layer
model_lstm.add(LSTM(128, activation="tanh", return_sequences=False))

model_lstm.add(Dense(2, activation='softmax'))
model_lstm.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

model_lstm.fit(train_X, train_y, epochs=20, batch_size=120,
          validation_data=(valid_X, valid_y), callbacks=EarlyStopping(monitor='val_accuracy', mode='max',patience=3))

predictions_lstm = model_lstm.predict(valid_X)

# TODO: Print a classification report for the model
predictions_lstm = predictions_lstm.argmax(axis=1)
print(classification_report(valid_y.argmax(axis=1), predictions_lstm))

Epoch 1/20




[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m7s[0m 18ms/step - accuracy: 0.4965 - loss: 0.6933 - val_accuracy: 0.4955 - val_loss: 0.6932
Epoch 2/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 16ms/step - accuracy: 0.4989 - loss: 0.6932 - val_accuracy: 0.4955 - val_loss: 0.6932
Epoch 3/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 17ms/step - accuracy: 0.5034 - loss: 0.6931 - val_accuracy: 0.5045 - val_loss: 0.6931
Epoch 4/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 20ms/step - accuracy: 0.4954 - loss: 0.6932 - val_accuracy: 0.5045 - val_loss: 0.6931
Epoch 5/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 22ms/step - accuracy: 0.4955 - loss: 0.6932 - val_accuracy: 0.5045 - val_loss: 0.6931
Epoch 6/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 17ms/step - accuracy: 0.5010 - loss: 0.6932 - val_accuracy: 0.4955 - val_loss: 0.6932
[1m125/125[0m [32m━━━━━━━━━━━━

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


In [16]:
# TODO: Train a GRU model by replacing the SimpleRNN layer with a GRU layer
model_gru = Sequential()
model_gru.add(Embedding(nb_words, EMBED_SIZE, weights=[embedding_matrix], input_length=MAX_LEN, trainable = False))

# TODO: Add a GRU layer
model_gru.add(GRU(128, activation="tanh", return_sequences=False))

model_gru.add(Dense(2, activation='softmax'))
model_gru.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

model_gru.fit(train_X, train_y, epochs=20, batch_size=120,
          validation_data=(valid_X, valid_y), callbacks=EarlyStopping(monitor='val_accuracy', mode='max',patience=3))

predictions_gru = model_gru.predict(valid_X)

# TODO: Print a classification report for the model
predictions_gru = predictions_gru.argmax(axis=1)
print(classification_report(valid_y.argmax(axis=1), predictions_gru))

Epoch 1/20




[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 19ms/step - accuracy: 0.5040 - loss: 0.6935 - val_accuracy: 0.4955 - val_loss: 0.6934
Epoch 2/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 13ms/step - accuracy: 0.5029 - loss: 0.6932 - val_accuracy: 0.5045 - val_loss: 0.6931
Epoch 3/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 12ms/step - accuracy: 0.4991 - loss: 0.6933 - val_accuracy: 0.5045 - val_loss: 0.6931
Epoch 4/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 13ms/step - accuracy: 0.4986 - loss: 0.6932 - val_accuracy: 0.5045 - val_loss: 0.6931
Epoch 5/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 13ms/step - accuracy: 0.4993 - loss: 0.6933 - val_accuracy: 0.4955 - val_loss: 0.6933
[1m125/125[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 3ms/step
              precision    recall  f1-score   support

           0       0.00      0.00      0.00      2018
           1

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


## Evaluation

In [17]:
import time

def predict(model, text):
    start_at = time.time()
    # Tokenize text
    x_test = pad_sequences(tokenizer.texts_to_sequences([text]), maxlen=MAX_LEN)
    # Predict
    score = model.predict([x_test])[0]

    return {"NEGATIVE": score[0], "POSITIVE": score[1],
       "elapsed_time": time.time()-start_at}

In [18]:
# TODO: Try few sentences to check the models
predict(model_lstm, "I feel not so good today")

[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 121ms/step


{'NEGATIVE': 0.46710822,
 'POSITIVE': 0.53289175,
 'elapsed_time': 0.19802594184875488}

In [19]:
# TODO: Try few sentences to check the models
predict(model_rnn, "I feel not so good today")

[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 537ms/step


{'NEGATIVE': 0.53772575,
 'POSITIVE': 0.46227422,
 'elapsed_time': 0.5841615200042725}

In [20]:
# TODO: Try few sentences to check the models
predict(model_gru, "I feel not so good today")

[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 158ms/step


{'NEGATIVE': 0.41808435,
 'POSITIVE': 0.5819157,
 'elapsed_time': 0.2411189079284668}

## Pre-trained Word Embeddings

Try training the RNNs with word embeddings but without the pre-trained weight and compare the results with the pre-trained model.


In [21]:
# TODO: Remove embedding_matrix and set trainable=TRUE
model_rnn1 = Sequential()
model_rnn1.add(Embedding(nb_words, EMBED_SIZE, input_length=MAX_LEN, trainable = True))

# TODO: Add a SimpleRNN layer
model_rnn1.add(SimpleRNN(128, activation="tanh", return_sequences=False))

model_rnn1.add(Dense(2, activation='softmax'))
model_rnn1.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

model_rnn1.fit(train_X, train_y, epochs=20, batch_size=120,
          validation_data=(valid_X, valid_y), callbacks=EarlyStopping(monitor='val_accuracy', mode='max',patience=3))

predictions_rnn1 = model_rnn1.predict(valid_X)
predictions_rnn1 = predictions_rnn1.argmax(axis=1)
print(classification_report(valid_y.argmax(axis=1), predictions_rnn1))

Epoch 1/20




[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m10s[0m 53ms/step - accuracy: 0.5109 - loss: 0.6969 - val_accuracy: 0.4955 - val_loss: 0.7026
Epoch 2/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 28ms/step - accuracy: 0.5040 - loss: 0.6965 - val_accuracy: 0.4955 - val_loss: 0.6939
Epoch 3/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 32ms/step - accuracy: 0.5024 - loss: 0.7019 - val_accuracy: 0.4922 - val_loss: 0.6935
Epoch 4/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m6s[0m 37ms/step - accuracy: 0.4994 - loss: 0.7009 - val_accuracy: 0.5080 - val_loss: 0.6963
Epoch 5/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 29ms/step - accuracy: 0.5008 - loss: 0.6975 - val_accuracy: 0.5080 - val_loss: 0.7085
Epoch 6/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 30ms/step - accuracy: 0.5024 - loss: 0.6984 - val_accuracy: 0.5077 - val_loss: 0.6951
Epoch 7/20
[1m134/134[0m [32m



[1m125/125[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 7ms/step
              precision    recall  f1-score   support

           0       0.49      0.11      0.17      2018
           1       0.49      0.89      0.63      1982

    accuracy                           0.49      4000
   macro avg       0.49      0.50      0.40      4000
weighted avg       0.49      0.49      0.40      4000



In [22]:
# TODO: Train a LSTM model by replacing the SimpleRNN layer with a LSTM layer
model_lstm1 = Sequential()
model_lstm1.add(Embedding(nb_words, EMBED_SIZE, input_length=MAX_LEN, trainable = True))

# TODO: Add a LSTM layer
model_lstm1.add(LSTM(128, activation="tanh", return_sequences=False))

model_lstm1.add(Dense(2, activation='softmax'))
model_lstm1.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

model_lstm1.fit(train_X, train_y, epochs=20, batch_size=120,
          validation_data=(valid_X, valid_y), callbacks=EarlyStopping(monitor='val_accuracy', mode='max',patience=3))

predictions_lstm1 = model_lstm1.predict(valid_X)

# TODO: Print a classification report for the model
predictions_lstm1 = predictions_lstm1.argmax(axis=1)
print(classification_report(valid_y.argmax(axis=1), predictions_lstm1))

Epoch 1/20




[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 20ms/step - accuracy: 0.5002 - loss: 0.6941 - val_accuracy: 0.4955 - val_loss: 0.6932
Epoch 2/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 19ms/step - accuracy: 0.4916 - loss: 0.6932 - val_accuracy: 0.5045 - val_loss: 0.6931
Epoch 3/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 20ms/step - accuracy: 0.4895 - loss: 0.6933 - val_accuracy: 0.5045 - val_loss: 0.6931
Epoch 4/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 18ms/step - accuracy: 0.4990 - loss: 0.6932 - val_accuracy: 0.5045 - val_loss: 0.6931
Epoch 5/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 18ms/step - accuracy: 0.5025 - loss: 0.6931 - val_accuracy: 0.4955 - val_loss: 0.6933
[1m125/125[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 3ms/step
              precision    recall  f1-score   support

           0       0.00      0.00      0.00      2018
           1

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


In [23]:
# TODO: Train a GRU model by replacing the SimpleRNN layer with a GRU layer
model_gru1 = Sequential()
model_gru1.add(Embedding(nb_words, EMBED_SIZE, input_length=MAX_LEN, trainable = True))

# TODO: Add a GRU layer
model_gru1.add(GRU(128, activation="tanh", return_sequences=False))

model_gru1.add(Dense(2, activation='softmax'))
model_gru1.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

model_gru1.fit(train_X, train_y, epochs=20, batch_size=120,
          validation_data=(valid_X, valid_y), callbacks=EarlyStopping(monitor='val_accuracy', mode='max',patience=3))

predictions_gru1 = model_gru1.predict(valid_X)

# TODO: Print a classification report for the model
predictions_gru1 = predictions_gru1.argmax(axis=1)
print(classification_report(valid_y.argmax(axis=1), predictions_gru1))

Epoch 1/20




[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 19ms/step - accuracy: 0.4969 - loss: 0.6944 - val_accuracy: 0.5045 - val_loss: 0.6932
Epoch 2/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 17ms/step - accuracy: 0.4931 - loss: 0.6936 - val_accuracy: 0.5045 - val_loss: 0.6942
Epoch 3/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 16ms/step - accuracy: 0.5021 - loss: 0.6934 - val_accuracy: 0.4955 - val_loss: 0.6934
Epoch 4/20
[1m134/134[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 16ms/step - accuracy: 0.4985 - loss: 0.6934 - val_accuracy: 0.4955 - val_loss: 0.6939
[1m125/125[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 3ms/step
              precision    recall  f1-score   support

           0       0.00      0.00      0.00      2018
           1       0.50      1.00      0.66      1982

    accuracy                           0.50      4000
   macro avg       0.25      0.50      0.33      4000
weighted avg

  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


In [24]:
# TODO: Try few sentences to check the models
predict(model_lstm1, "I feel not so good today")

[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 99ms/step


{'NEGATIVE': 0.4946124,
 'POSITIVE': 0.50538766,
 'elapsed_time': 0.13913488388061523}

In [25]:
# TODO: Try few sentences to check the models
predict(model_rnn1, "I feel not so good today")

[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 425ms/step


{'NEGATIVE': 0.510273,
 'POSITIVE': 0.48972705,
 'elapsed_time': 0.4824330806732178}

In [26]:
# TODO: Try few sentences to check the models
predict(model_gru1, "I feel not so good today")

[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 129ms/step


{'NEGATIVE': 0.48564678,
 'POSITIVE': 0.5143532,
 'elapsed_time': 0.1878962516784668}