In [6]:
from tensorflow import keras
from keras.utils import pad_sequences
from keras.models import Sequential
from keras.datasets import imdb
from keras import initializers
from keras.layers import Embedding, LSTM, Dense

#### The main objective of this project is to implement an LSTM deep learning model for text sentiment analysis. The model should be able to detect whether a text has a positive or negative connotation.

In [15]:
rnn_hidden_dim = 5
word_embedding_dim = 50
max_features = 10000  # Vocabulary size: keep only the 10,000 most frequent words
input_length = 30  # Each review is padded or truncated to 30 tokens
batch_size = 32  # Number of reviews per gradient update

In [16]:
## Load the data. Each review is already encoded as a sequence of integer word indices
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
print(len(x_train), 'train sequences')
print(len(x_test), 'test sequences')

25000 train sequences
25000 test sequences


#### The dataset is the same one we used in our RNN activity: the IMDB movie review sentiment classification dataset from the keras.datasets package. It contains 25,000 training reviews and 25,000 test reviews, each labeled as positive or negative.
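
#### To make the integer encoding concrete, here is a minimal sketch of how an encoded review could be mapped back to words. It assumes the default `imdb.load_data` settings, where ids 0, 1, and 2 are reserved for padding, start-of-sequence, and out-of-vocabulary tokens and every real word index is shifted by 3. `decode_review` and `toy_word_index` are illustrative names, not part of Keras.

```python
def decode_review(encoded, word_index, index_from=3):
    """Map a list of integer word ids back to a readable string."""
    # Reserved ids under the default load_data settings (an assumption of this sketch)
    id_to_word = {0: "<pad>", 1: "<start>", 2: "<oov>"}
    for word, idx in word_index.items():
        id_to_word[idx + index_from] = word
    # Any id that is not in the lookup table is shown as out-of-vocabulary
    return " ".join(id_to_word.get(i, "<oov>") for i in encoded)

# Toy vocabulary standing in for the real word index
toy_word_index = {"the": 1, "movie": 2, "great": 3}
decoded = decode_review([1, 4, 5, 2, 6], toy_word_index)
print(decoded)
```

#### With the real data, the same function could be called as `decode_review(x_train[0], imdb.get_word_index())`.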

In [17]:
# Pad or truncate every review to input_length tokens (maxlen was never defined)
x_train = pad_sequences(x_train, maxlen=input_length)
x_test = pad_sequences(x_test, maxlen=input_length)
print('x_train shape:', x_train.shape)
print('x_test shape:', x_test.shape)

x_train shape: (25000, 30)
x_test shape: (25000, 30)
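
#### It helps to know what the padding step does to each review. By default, `pad_sequences` pads on the left with zeros and truncates from the front, so with a length of 30 only the last 30 tokens of a long review are kept. The pure-Python function below is a sketch of that default behavior for illustration (`pad_pre` is a made-up name, not the Keras implementation).

```python
def pad_pre(sequences, maxlen, value=0):
    """Left-pad short sequences and keep only the last `maxlen` items of long ones."""
    padded = []
    for seq in sequences:
        tail = list(seq)[-maxlen:]  # like truncating='pre': drop the earliest tokens
        padded.append([value] * (maxlen - len(tail)) + tail)  # like padding='pre'
    return padded

example = pad_pre([[5, 6], [1, 2, 3, 4, 5]], maxlen=3)
print(example)  # [[0, 5, 6], [3, 4, 5]]
```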


In [19]:
model_rnn = Sequential()

# Embedding layer: Maps each integer (word) in the sequence to a 50-dimensional vector
model_rnn.add(Embedding(max_features, word_embedding_dim, input_length=input_length))

# LSTM layer: recurrent layer with 5 hidden units (ReLU replaces the default tanh activation)
model_rnn.add(LSTM(rnn_hidden_dim,
                   kernel_initializer=initializers.RandomNormal(stddev=0.001),
                   recurrent_initializer=initializers.Identity(gain=1.0),
                   activation='relu'))

# Fully connected output layer: Single neuron with sigmoid activation (for binary classification)
model_rnn.add(Dense(1, activation='sigmoid'))

# Model summary
model_rnn.summary()

Model: "sequential_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_2 (Embedding)     (None, 30, 50)            500000    
                                                                 
 lstm_2 (LSTM)               (None, 5)                 1120      
                                                                 
 dense_2 (Dense)             (None, 1)                 6         
                                                                 
Total params: 501,126
Trainable params: 501,126
Non-trainable params: 0
_________________________________________________________________
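
#### The parameter counts in the summary can be checked by hand. The sketch below assumes the standard LSTM layout of four weight blocks (input, forget, and output gates plus the cell candidate), each with an input kernel, a recurrent kernel, and a bias. The helper names are only for illustration.

```python
def embedding_params(vocab_size, embed_dim):
    # One embed_dim-sized vector per vocabulary entry
    return vocab_size * embed_dim

def lstm_params(input_dim, units):
    # Four weight blocks, each with input kernel, recurrent kernel, and bias
    return 4 * (input_dim * units + units * units + units)

def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit
    return input_dim * units + units

counts = [embedding_params(10000, 50), lstm_params(50, 5), dense_params(5, 1)]
print(counts, sum(counts))  # [500000, 1120, 6] 501126
```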


#### Earlier in this course we implemented the same task with a vanilla (simple) RNN. An LSTM is usually the better choice: its gated cell state lets it retain information across longer spans of text, which often improves accuracy, and it mitigates the vanishing gradient problem that vanilla RNNs suffer from on long sequences.
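
#### The vanishing gradient problem can be seen in a toy example. The sketch below assumes a one-unit vanilla RNN, h_t = tanh(w * h_{t-1} + x_t), with a made-up recurrent weight of 0.5, and multiplies the per-step derivatives together the way backpropagation through time does. It is not the model trained in this notebook.

```python
import math

def gradient_through_time(w, inputs, h0=0.0):
    """Return |d h_T / d h_0| for the scalar RNN h_t = tanh(w * h_{t-1} + x_t)."""
    h, grad = h0, 1.0
    for x in inputs:
        h = math.tanh(w * h + x)
        grad *= w * (1.0 - h * h)  # chain rule factor d h_t / d h_{t-1}
    return abs(grad)

short_grad = gradient_through_time(0.5, [0.1] * 5)   # 5 time steps
long_grad = gradient_through_time(0.5, [0.1] * 30)   # 30 time steps, like our reviews
print(short_grad, long_grad)
```

#### Each step multiplies the gradient by a factor of at most 0.5 here, so after 30 steps almost no signal from the first token reaches the weight update.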

In [20]:
from keras.optimizers import RMSprop
# Use RMSprop optimizer with a small learning rate
rmsprop = RMSprop(learning_rate=0.0001)

# Compile the model
model_rnn.compile(loss='binary_crossentropy',
                  optimizer=rmsprop,
                  metrics=['accuracy'])

# Train the model
model_rnn.fit(x_train, y_train,
              batch_size=batch_size,
              epochs=10,
              validation_data=(x_test, y_test))

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x7f3d5b5c3a60>

In [21]:
score, acc = model_rnn.evaluate(x_test, y_test,
                            batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)

Test score: 0.5452677011489868
Test accuracy: 0.7766000032424927
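
#### The "Test score" above is the binary cross-entropy loss, and the accuracy comes from thresholding the sigmoid output. The sketch below shows how both numbers could be computed from predicted probabilities, assuming a 0.5 threshold. The labels and probabilities are made-up toy values, not outputs of the trained model.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean negative log-likelihood of the true labels."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of examples whose thresholded prediction matches the label."""
    hits = sum(int(p > threshold) == y for y, p in zip(y_true, y_prob))
    return hits / len(y_true)

toy_labels = [1, 0, 1, 0]
toy_probs = [0.9, 0.2, 0.4, 0.6]
toy_loss = binary_crossentropy(toy_labels, toy_probs)
toy_acc = accuracy(toy_labels, toy_probs)
print(toy_loss, toy_acc)
```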


#### One downside of the LSTM is training time: the training cell took at least 15 minutes to run, partly because the dataset is fairly large. Another downside is that, although LSTMs can retain information over longer spans than vanilla RNNs, they process tokens one at a time, so they become inefficient and extremely slow when the texts get really long. CNNs, in addition to image recognition, can also be used for text classification.
