## Exercise: Using LSTMs to Classify the 20 Newsgroups Data Set
The 20 Newsgroups data set is a well-known text classification benchmark. The goal is to predict which newsgroup a particular post came from. The 20 possible groups are:

`comp.graphics
comp.os.ms-windows.misc
comp.sys.ibm.pc.hardware
comp.sys.mac.hardware
comp.windows.x
rec.autos
rec.motorcycles
rec.sport.baseball
rec.sport.hockey
sci.crypt
sci.electronics
sci.med
sci.space
misc.forsale
talk.politics.misc
talk.politics.guns
talk.politics.mideast
talk.religion.misc
alt.atheism
soc.religion.christian`

As you can see, some pairs of groups may be quite similar while others are very different.

The data is given as a designated training set of size 11314 and test set of size 7532.  The 20 categories are represented in roughly equal proportions, so the baseline accuracy is around 5%.
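The 5% figure is simply 1/20 for balanced classes. As a minimal sketch of how a majority-class baseline could be checked from integer labels (the helper name `majority_baseline` and the toy labels are illustrative assumptions, not part of the original notebook):

```python
from collections import Counter

def majority_baseline(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)

# Toy, perfectly balanced labels for 20 classes (hypothetical data).
toy_labels = [i % 20 for i in range(2000)]
print(majority_baseline(toy_labels))  # 0.05
```

Once the data is loaded, the same helper could be applied to `newsgroups_train.target` to see how close the real class balance is to uniform.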


To begin, review the code below.  This will walk you through the basics of loading in the 20 newsgroups data, loading in the GloVe data, building the word embedding matrix, and building the LSTM model.

After we build the first LSTM model, it will be your turn to build one and play with the parameters.

In [2]:
import numpy as np

from keras.preprocessing import sequence
from keras.models import Sequential
from keras.layers import Dense, Embedding
from keras.layers import LSTM

import keras
import tensorflow as tf
from sklearn.datasets import fetch_20newsgroups

from tensorflow.keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences

In [3]:
max_features = 20000
seq_length = 30  # How long to make our word sequences
batch_size = 32



In [4]:
# Download the 20 newsgroups data - there is already a designated "train" and "test" set

newsgroups_train = fetch_20newsgroups(subset='train')
newsgroups_test = fetch_20newsgroups(subset='test')



In [5]:
len(newsgroups_train.data), len(newsgroups_test.data)

(11314, 7532)

In [6]:
tokenizer = Tokenizer(num_words=max_features)
tokenizer.fit_on_texts(newsgroups_train.data)

In [7]:
sequences_train = tokenizer.texts_to_sequences(newsgroups_train.data)
sequences_test = tokenizer.texts_to_sequences(newsgroups_test.data)

In [8]:
word_index = tokenizer.word_index
print('Found %s unique tokens.' % len(word_index))


Found 134142 unique tokens.
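Note that `word_index` holds every token seen during fitting, while `num_words=max_features` only limits which indices survive in `texts_to_sequences`. The following is a simplified pure-Python sketch of that idea, not the Keras implementation: the helper names are hypothetical, and the real `Tokenizer` also filters punctuation.

```python
from collections import Counter

def build_word_index(texts):
    """Rank words by frequency; the most frequent word gets index 1."""
    counts = Counter(word for text in texts for word in text.lower().split())
    ranked = [word for word, _ in counts.most_common()]
    return {word: rank + 1 for rank, word in enumerate(ranked)}

def to_sequence(text, word_index, num_words):
    """Keep only words whose index is below num_words."""
    return [word_index[w] for w in text.lower().split()
            if w in word_index and word_index[w] < num_words]

toy_texts = ["the cat sat", "the cat ran", "the dog"]
toy_index = build_word_index(toy_texts)
print(len(toy_index))                             # 5 unique tokens
print(to_sequence("the dog sat", toy_index, 3))   # [1]
```

In the toy example all five words are indexed, but with `num_words=3` only indices 1 and 2 are ever emitted.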


In [9]:
x_train = pad_sequences(sequences_train, maxlen=seq_length)
x_test = pad_sequences(sequences_test, maxlen=seq_length)



In [10]:
x_train

array([[ 2908,   198,     3, ...,    35,    58,  7860],
       [  351,   138,   533, ...,   118,   441,    15],
       [    9,    33,     4, ...,   187,    84, 17015],
       ...,
       [   10,     1,  1787, ...,   349,   383,    31],
       [  115,   362,    67, ...,  7772,   486,   492],
       [ 4485, 13919,  1031, ...,   200,    38,  3826]], dtype=int32)
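By default, `pad_sequences` pads and truncates at the front (`padding='pre'`, `truncating='pre'`), so each row above holds the last `seq_length` tokens of a post. A small NumPy sketch that mimics this default behavior (the helper `pad_pre` is hypothetical and only for illustration):

```python
import numpy as np

def pad_pre(seqs, maxlen):
    """Left-pad with zeros and keep only the last `maxlen` tokens of each sequence."""
    out = np.zeros((len(seqs), maxlen), dtype="int32")
    for row, seq in enumerate(seqs):
        kept = seq[-maxlen:]              # long sequences lose their beginning
        if kept:
            out[row, -len(kept):] = kept  # short sequences get zeros on the left
    return out

padded = pad_pre([[1, 2, 3, 4, 5], [6, 7], []], maxlen=3)
print(padded)
```

With `maxlen=3`, the first toy sequence keeps `[3, 4, 5]`, the second becomes `[0, 6, 7]`, and the empty one is all zeros.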

In [11]:
y_train = keras.utils.to_categorical(np.asarray(newsgroups_train.target))
y_test = keras.utils.to_categorical(np.asarray(newsgroups_test.target))

In [61]:
y_train

array([[0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.],
       ...,
       [0., 0., 0., ..., 0., 0., 0.],
       [0., 1., 0., ..., 0., 0., 0.],
       [0., 0., 0., ..., 0., 0., 0.]])
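`to_categorical` turns each integer label into a one-hot row of length 20. A minimal NumPy sketch of the same transformation, assuming labels are integers in `0..num_classes-1` (the helper `one_hot` is illustrative, not the Keras function):

```python
import numpy as np

def one_hot(labels, num_classes):
    """Select one row of the identity matrix per label."""
    return np.eye(num_classes)[np.asarray(labels)]

toy_targets = [1, 0, 2]
encoded = one_hot(toy_targets, num_classes=3)
print(encoded)
print(encoded.argmax(axis=1))  # recovers the original integer labels
```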

We will be using the GloVe pre-trained word vectors.  If you haven't already, please download them using this link
(NOTE: this will start downloading an 822MB file):

http://nlp.stanford.edu/data/glove.6B.zip

Then unzip the file and fill in your local path to the file in the code cell below.

We will use the file `glove.6B.100d.txt`.

In [13]:
embeddings_index = {}
# Each line of the GloVe file holds a word followed by its vector components
with open('/Users/valendunn/Downloads/glove.6B/glove.6B.100d.txt', encoding='utf-8') as f:
    for line in f:
        values = line.split()
        word = values[0]
        coefs = np.asarray(values[1:], dtype='float32')
        embeddings_index[word] = coefs

print('Found %s word vectors.' % len(embeddings_index))

Found 400000 word vectors.
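To make the file format concrete, here is a small sketch that parses the same "word followed by floats" layout from an in-memory string. The two 3-dimensional vectors are made-up values for illustration only; the real file has 100 components per word.

```python
import io
import numpy as np

def load_vectors(lines):
    """Parse 'word v1 v2 ...' lines into a dict of float32 arrays."""
    index = {}
    for line in lines:
        word, *coefs = line.split()
        index[word] = np.asarray(coefs, dtype="float32")
    return index

# Hypothetical miniature "GloVe" file with invented numbers.
fake_file = io.StringIO("dog 0.1 0.2 0.3\ncat 0.4 0.5 0.6\n")
toy_vectors = load_vectors(fake_file)
print(sorted(toy_vectors), toy_vectors["dog"].shape)
```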


Let's look at a single word embedding.

In [14]:
dog_vec = embeddings_index['dog']
dog_vec

array([ 0.30817  ,  0.30938  ,  0.52803  , -0.92543  , -0.73671  ,
        0.63475  ,  0.44197  ,  0.10262  , -0.09142  , -0.56607  ,
       -0.5327   ,  0.2013   ,  0.7704   , -0.13983  ,  0.13727  ,
        1.1128   ,  0.89301  , -0.17869  , -0.0019722,  0.57289  ,
        0.59479  ,  0.50428  , -0.28991  , -1.3491   ,  0.42756  ,
        1.2748   , -1.1613   , -0.41084  ,  0.042804 ,  0.54866  ,
        0.18897  ,  0.3759   ,  0.58035  ,  0.66975  ,  0.81156  ,
        0.93864  , -0.51005  , -0.070079 ,  0.82819  , -0.35346  ,
        0.21086  , -0.24412  , -0.16554  , -0.78358  , -0.48482  ,
        0.38968  , -0.86356  , -0.016391 ,  0.31984  , -0.49246  ,
       -0.069363 ,  0.018869 , -0.098286 ,  1.3126   , -0.12116  ,
       -1.2399   , -0.091429 ,  0.35294  ,  0.64645  ,  0.089642 ,
        0.70294  ,  1.1244   ,  0.38639  ,  0.52084  ,  0.98787  ,
        0.79952  , -0.34625  ,  0.14095  ,  0.80167  ,  0.20987  ,
       -0.86007  , -0.15308  ,  0.074523 ,  0.40816  ,  0.0192

In [15]:
## This creates a matrix where the $i$th row gives the word embedding for the word represented by integer $i$.
## Essentially, these will be the "weights" for the Embedding Layer
## Rather than learning the weights, we will use these ones and "freeze" the layer

embedding_matrix = np.zeros((len(word_index) + 1, 100))
for word, i in word_index.items():
    embedding_vector = embeddings_index.get(word)
    if embedding_vector is not None:
        # words not found in embedding index will be all-zeros.
        embedding_matrix[i] = embedding_vector

In [16]:
embedding_matrix.shape

(134143, 100)
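Rows for words that are missing from GloVe stay all-zero, so it can be useful to check how much of the vocabulary is actually covered. A rough sketch, assuming dictionaries shaped like `word_index` and `embeddings_index` above (the helper `embedding_coverage` and the toy data are hypothetical):

```python
def embedding_coverage(word_index, embeddings_index, max_features=None):
    """Fraction of (optionally top-ranked) words that have a pretrained vector."""
    words = [w for w, i in word_index.items()
             if max_features is None or i < max_features]
    found = sum(1 for w in words if w in embeddings_index)
    return found / len(words)

# Toy data: 'xyzzy' has no pretrained vector.
toy_word_index = {"the": 1, "dog": 2, "xyzzy": 3, "cat": 4}
toy_embeddings = {"the": [0.0], "dog": [0.0], "cat": [0.0]}
print(embedding_coverage(toy_word_index, toy_embeddings))                  # 0.75
print(embedding_coverage(toy_word_index, toy_embeddings, max_features=3))  # 1.0
```

Passing `max_features` restricts the check to the indices that can actually appear in the padded sequences.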

## LSTM Layer
`keras.layers.LSTM(units, activation='tanh', recurrent_activation='sigmoid', use_bias=True, kernel_initializer='glorot_uniform', recurrent_initializer='orthogonal', bias_initializer='zeros', unit_forget_bias=True, kernel_regularizer=None, recurrent_regularizer=None, bias_regularizer=None, activity_regularizer=None, kernel_constraint=None, recurrent_constraint=None, bias_constraint=None, dropout=0.0, recurrent_dropout=0.0)`

- Similar in structure to the `SimpleRNN` layer
- `units` defines the dimension of the recurrent state
- `recurrent_...` refers to the recurrent state aspects of the LSTM
- `kernel_...` refers to the transformations done on the input



In [45]:
word_dimension = 100  # This is the dimension of the words we are using from GloVe
model = Sequential()
model.add(Embedding(len(word_index) + 1,
                            word_dimension,  
                            weights=[embedding_matrix],  # We set the weights to be the word vectors from GloVe
                            input_length=seq_length,
                            trainable=False))  # By setting trainable to False, we "freeze" the word embeddings.
model.add(LSTM(30, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(20, activation='softmax'))

model.summary()



In [46]:
rmsprop = keras.optimizers.RMSprop(learning_rate = .002)

model.compile(loss='categorical_crossentropy',
              optimizer=rmsprop,
              metrics=['accuracy'])

In [47]:

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=20,
          validation_data=(x_test, y_test))

Epoch 1/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 5s 8ms/step - accuracy: 0.1018 - loss: 2.9000 - val_accuracy: 0.1977 - val_loss: 2.5365
Epoch 2/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - accuracy: 0.2315 - loss: 2.4611 - val_accuracy: 0.3032 - val_loss: 2.2277
Epoch 3/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 3s 7ms/step - accuracy: 0.3252 - loss: 2.1950 - val_accuracy: 0.3633 - val_loss: 2.0526
Epoch 4/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 3s 7ms/step - accuracy: 0.3720 - loss: 2.0394 - val_accuracy: 0.3955 - val_loss: 1.9606
Epoch 5/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 3s 7ms/step - accuracy: 0.4080 - loss: 1.9249 - val_accuracy: 0.4173 - val_loss: 1.8989
Epoch 6/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 3s 7ms/step - accuracy: 0.4335 - loss: 1.8511 - val_accuracy: 0.4279 - val_loss: 1.8671
Epoch 7/20
354/354 ...

<keras.src.callbacks.history.History at 0x37a02f3b0>

In [48]:
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=20,
          validation_data=(x_test, y_test))

Epoch 1/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - accuracy: 0.5565 - loss: 1.4510 - val_accuracy: 0.4746 - val_loss: 1.7476
Epoch 2/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 3s 7ms/step - accuracy: 0.5625 - loss: 1.4315 - val_accuracy: 0.4770 - val_loss: 1.7477
Epoch 3/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 3s 7ms/step - accuracy: 0.5607 - loss: 1.4326 - val_accuracy: 0.4842 - val_loss: 1.7399
Epoch 4/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - accuracy: 0.5611 - loss: 1.4122 - val_accuracy: 0.4858 - val_loss: 1.7379
Epoch 5/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - accuracy: 0.5763 - loss: 1.3852 - val_accuracy: 0.4866 - val_loss: 1.7393
Epoch 6/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 3s 7ms/step - accuracy: 0.5635 - loss: 1.4093 - val_accuracy: 0.4823 - val_loss: 1.7450
Epoch 7/20
354/354 ...

<keras.src.callbacks.history.History at 0x35f210e30>

In [25]:
score, acc = model.evaluate(x_test, y_test,
                            batch_size=batch_size)
print('Test score:', score)
print('Test accuracy:', acc)

236/236 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.4866 - loss: 1.7903
Test score: 1.7922390699386597
Test accuracy: 0.48805099725723267


## Exercise
### Your Turn
- Build a neural network with a SimpleRNN instead of an LSTM (with other dimensions and parameters the same). How does the performance compare?
- Use the LSTM above without the pretrained word vectors (randomly initialize the weights and have them be learned during the training process).  How does the performance compare?
- Try different sequence lengths, and dimensions for the hidden state of the LSTM.  Can you improve the model?
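The last bullet amounts to a small grid search over two settings. One possible way to enumerate the combinations before training a model for each is sketched below; the specific values are illustrative choices, not a recommendation from the original exercise.

```python
from itertools import product

seq_lengths = [30, 60]        # illustrative values for maxlen in pad_sequences
hidden_units = [30, 60, 100]  # illustrative values for the LSTM `units` argument

configs = [{"seq_length": s, "units": u}
           for s, u in product(seq_lengths, hidden_units)]

for cfg in configs:
    print(cfg)
```

Each configuration would then need its own padded inputs and a freshly built and compiled model, so that results are not carried over between runs.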


1. The SimpleRNN is much worse: its accuracy levels out in the high 20% range.
2. With the same number of parameters, the LSTM without the pretrained word vectors fails to get higher than the low 30s.
3. Increasing the dimension of the hidden state (see `model_4` below) raised the training accuracy to 82% with a test accuracy of 53%, compared with 58% training and 49% test accuracy for the standard model, at roughly twice the training time. Doubling the sequence length, which again roughly doubled the training time, gave a training accuracy of 87% with a test accuracy of 63%.
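The gap between training and test accuracy is a quick indicator of overfitting. A short sketch that computes it from the figures reported in the answers above (the dictionary labels are just descriptive names chosen here):

```python
# (training accuracy, test accuracy) as reported in the answers above
runs = {
    "standard LSTM": (0.58, 0.49),
    "larger hidden state": (0.82, 0.53),
    "longer sequences": (0.87, 0.63),
}

gaps = {name: round(train - test, 2) for name, (train, test) in runs.items()}
for name, gap in gaps.items():
    print(f"{name}: train/test gap = {gap:.2f}")
```

The larger models gain test accuracy but also show a much wider gap, which suggests they memorize more of the training set.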

In [26]:
from keras.layers import SimpleRNN

In [29]:
# Please provide your code here
model_2 = Sequential([
    Embedding(len(word_index) + 1,
              word_dimension,
              weights=[embedding_matrix],
              input_length=seq_length,
              trainable=False),
    SimpleRNN(30, dropout=0.2, recurrent_dropout=0.2),
    Dense(20, activation='softmax')
])
model_2.summary()



In [31]:
rmsprop = keras.optimizers.RMSprop(learning_rate = .002)

model_2.compile(loss='categorical_crossentropy',
              optimizer=rmsprop,
              metrics=['accuracy'])

In [32]:

model_2.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=20,
          validation_data=(x_test, y_test))

Epoch 1/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 3s 5ms/step - accuracy: 0.0792 - loss: 3.0513 - val_accuracy: 0.1340 - val_loss: 2.8584
Epoch 2/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.1595 - loss: 2.8111 - val_accuracy: 0.1683 - val_loss: 2.7620
Epoch 3/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.1854 - loss: 2.7147 - val_accuracy: 0.1868 - val_loss: 2.6957
Epoch 4/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.1944 - loss: 2.6782 - val_accuracy: 0.2041 - val_loss: 2.6558
Epoch 5/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.2173 - loss: 2.6260 - val_accuracy: 0.2063 - val_loss: 2.6438
Epoch 6/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.2151 - loss: 2.6039 - val_accuracy: 0.2076 - val_loss: 2.6351
Epoch 7/20
354/354 ...

<keras.src.callbacks.history.History at 0x3618e8ce0>

In [34]:
model_2.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=20,
          validation_data=(x_test, y_test))

Epoch 1/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.2546 - loss: 2.5015 - val_accuracy: 0.2392 - val_loss: 2.5733
Epoch 2/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.2628 - loss: 2.4877 - val_accuracy: 0.2436 - val_loss: 2.5389
Epoch 3/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.2632 - loss: 2.4702 - val_accuracy: 0.2432 - val_loss: 2.5324
Epoch 4/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.2671 - loss: 2.4818 - val_accuracy: 0.2468 - val_loss: 2.5367
Epoch 5/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.2702 - loss: 2.4621 - val_accuracy: 0.2504 - val_loss: 2.5428
Epoch 6/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 1s 4ms/step - accuracy: 0.2627 - loss: 2.4682 - val_accuracy: 0.2580 - val_loss: 2.5511
Epoch 7/20
354/354 ...

<keras.src.callbacks.history.History at 0x35d656b40>

In [35]:
model.summary()

In [36]:
model_2.summary()

In [37]:
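model_3 = Sequential([
    # No `weights` argument: the embedding starts from random values.
    # With trainable=False those random embeddings stay frozen;
    # set trainable=True to learn them during training, as the exercise describes.
    Embedding(len(word_index) + 1, word_dimension, input_length=seq_length, trainable=False),
    LSTM(30, dropout=0.2, recurrent_dropout=0.2),
    Dense(20, activation='softmax')
])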
model_3.summary()



In [38]:
rmsprop = keras.optimizers.RMSprop(learning_rate = .002)

model_3.compile(loss='categorical_crossentropy',
              optimizer=rmsprop,
              metrics=['accuracy'])

In [39]:
model_3.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=20,
          validation_data=(x_test, y_test))

Epoch 1/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 4s 7ms/step - accuracy: 0.0635 - loss: 2.9911 - val_accuracy: 0.0785 - val_loss: 2.9777
Epoch 2/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - accuracy: 0.0996 - loss: 2.9565 - val_accuracy: 0.1078 - val_loss: 2.9270
Epoch 3/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 3s 7ms/step - accuracy: 0.1110 - loss: 2.9131 - val_accuracy: 0.1147 - val_loss: 2.8988
Epoch 4/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - accuracy: 0.1210 - loss: 2.8727 - val_accuracy: 0.1257 - val_loss: 2.8687
Epoch 5/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - accuracy: 0.1395 - loss: 2.8419 - val_accuracy: 0.1423 - val_loss: 2.8503
Epoch 6/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 3s 8ms/step - accuracy: 0.1459 - loss: 2.8197 - val_accuracy: 0.1435 - val_loss: 2.8371
Epoch 7/20
354/354 ...

<keras.src.callbacks.history.History at 0x35d6573b0>

In [40]:
model_3.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=20,
          validation_data=(x_test, y_test))

Epoch 1/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - accuracy: 0.2505 - loss: 2.5281 - val_accuracy: 0.2087 - val_loss: 2.6777
Epoch 2/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - accuracy: 0.2537 - loss: 2.5128 - val_accuracy: 0.2175 - val_loss: 2.6622
Epoch 3/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - accuracy: 0.2585 - loss: 2.5065 - val_accuracy: 0.2233 - val_loss: 2.6492
Epoch 4/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - accuracy: 0.2602 - loss: 2.4968 - val_accuracy: 0.2152 - val_loss: 2.6538
Epoch 5/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - accuracy: 0.2666 - loss: 2.4752 - val_accuracy: 0.2191 - val_loss: 2.6406
Epoch 6/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - accuracy: 0.2814 - loss: 2.4432 - val_accuracy: 0.2252 - val_loss: 2.6402
Epoch 7/20
354/354 ...

<keras.src.callbacks.history.History at 0x35f258ef0>

In [41]:
model_3.summary()

In [57]:
model_4 = Sequential([
    Embedding(len(word_index) + 1,
              word_dimension,
              weights=[embedding_matrix],
              input_length=seq_length,
              trainable=False),
    LSTM(100, dropout=0.2, recurrent_dropout=0.2),
    Dense(20, activation='softmax')
])
model_4.summary()

In [58]:
rmsprop = keras.optimizers.RMSprop(learning_rate = .002)

model_4.compile(loss='categorical_crossentropy',
              optimizer=rmsprop,
              metrics=['accuracy'])

In [59]:
model_4.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=20,
          validation_data=(x_test, y_test))

Epoch 1/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 8s 17ms/step - accuracy: 0.1413 - loss: 2.7675 - val_accuracy: 0.2925 - val_loss: 2.2595
Epoch 2/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 5s 15ms/step - accuracy: 0.3127 - loss: 2.2169 - val_accuracy: 0.3704 - val_loss: 2.0095
Epoch 3/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 5s 15ms/step - accuracy: 0.4024 - loss: 1.9344 - val_accuracy: 0.4214 - val_loss: 1.8677
Epoch 4/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 5s 14ms/step - accuracy: 0.4605 - loss: 1.7441 - val_accuracy: 0.4454 - val_loss: 1.7793
Epoch 5/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 5s 15ms/step - accuracy: 0.5106 - loss: 1.5934 - val_accuracy: 0.4643 - val_loss: 1.7379
Epoch 6/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 5s 14ms/step - accuracy: 0.5426 - loss: 1.4878 - val_accuracy: 0.4721 - val_loss: 1.7113
Epoch 7/20
354/354 ...

<keras.src.callbacks.history.History at 0x3874016d0>

In [60]:
model_4.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=20,
          validation_data=(x_test, y_test))

Epoch 1/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 5s 14ms/step - accuracy: 0.7565 - loss: 0.7790 - val_accuracy: 0.5347 - val_loss: 1.7747
Epoch 2/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 5s 14ms/step - accuracy: 0.7578 - loss: 0.7686 - val_accuracy: 0.5320 - val_loss: 1.7908
Epoch 3/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 6s 16ms/step - accuracy: 0.7701 - loss: 0.7411 - val_accuracy: 0.5360 - val_loss: 1.7990
Epoch 4/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 5s 15ms/step - accuracy: 0.7627 - loss: 0.7341 - val_accuracy: 0.5344 - val_loss: 1.7995
Epoch 5/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 5s 15ms/step - accuracy: 0.7685 - loss: 0.7529 - val_accuracy: 0.5297 - val_loss: 1.8396
Epoch 6/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 5s 14ms/step - accuracy: 0.7741 - loss: 0.7188 - val_accuracy: 0.5336 - val_loss: 1.8301
Epoch 7/20
354/354 ...

<keras.src.callbacks.history.History at 0x389abbd10>

In [73]:
x_train_1 = pad_sequences(sequences_train, maxlen=60)
x_test_1 = pad_sequences(sequences_test, maxlen=60)

In [74]:
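model_5 = Sequential([
    # Note: input_length is left at seq_length (30) here, although this model
    # is later also fit on the length-60 sequences x_train_1 / x_test_1.
    Embedding(len(word_index) + 1, word_dimension, weights=[embedding_matrix], input_length=seq_length, trainable=False),
    LSTM(100, dropout=0.2, recurrent_dropout=0.2),
    Dense(20, activation='softmax')
])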
model_5.summary()



In [75]:
rmsprop = keras.optimizers.RMSprop(learning_rate = .002)

model_5.compile(loss='categorical_crossentropy',
              optimizer=rmsprop,
              metrics=['accuracy'])

In [76]:
model_5.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=20,
          validation_data=(x_test, y_test))

Epoch 1/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 7s 16ms/step - accuracy: 0.1435 - loss: 2.7926 - val_accuracy: 0.2881 - val_loss: 2.2876
Epoch 2/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 5s 15ms/step - accuracy: 0.3192 - loss: 2.2089 - val_accuracy: 0.3802 - val_loss: 2.0224
Epoch 3/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 5s 14ms/step - accuracy: 0.4156 - loss: 1.9128 - val_accuracy: 0.4229 - val_loss: 1.8540
Epoch 4/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 5s 14ms/step - accuracy: 0.4680 - loss: 1.7307 - val_accuracy: 0.4387 - val_loss: 1.8144
Epoch 5/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 5s 14ms/step - accuracy: 0.5118 - loss: 1.5875 - val_accuracy: 0.4652 - val_loss: 1.7372
Epoch 6/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 5s 14ms/step - accuracy: 0.5444 - loss: 1.4827 - val_accuracy: 0.4823 - val_loss: 1.6938
Epoch 7/20
354/354 ...

<keras.src.callbacks.history.History at 0x3919ff3b0>

In [77]:
model_5.fit(x_train_1, y_train,
          batch_size=batch_size,
          epochs=20,
          validation_data=(x_test_1, y_test))

Epoch 1/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 10s 27ms/step - accuracy: 0.7606 - loss: 0.7444 - val_accuracy: 0.6191 - val_loss: 1.3532
Epoch 2/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 10s 29ms/step - accuracy: 0.7853 - loss: 0.6683 - val_accuracy: 0.6195 - val_loss: 1.3802
Epoch 3/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 10s 28ms/step - accuracy: 0.7930 - loss: 0.6451 - val_accuracy: 0.6313 - val_loss: 1.3573
Epoch 4/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 10s 28ms/step - accuracy: 0.8085 - loss: 0.6087 - val_accuracy: 0.6276 - val_loss: 1.3637
Epoch 5/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 10s 27ms/step - accuracy: 0.8186 - loss: 0.5661 - val_accuracy: 0.6305 - val_loss: 1.3788
Epoch 6/20
354/354 ━━━━━━━━━━━━━━━━━━━━ 10s 28ms/step - accuracy: 0.8215 - loss: 0.5592 - val_accuracy: 0.6310 - val_loss: 1.3692
Epoch 7/20
...

<keras.src.callbacks.history.History at 0x3997073b0>

In [78]:
model_4.summary()

In [79]:
model_5.summary()