# Multiclass Text Prediction Using LSTM Neural Networks

Neural networks can be used for text prediction and natural language processing. Keras includes a dataset of 11,228 newswires from Reuters, labeled over 46 topics. Predicting the topic depends not just on which words appear, but also on the sequence in which they are presented. That makes the dataset a natural test case for LSTM neural networks.

I built three LSTM models using the Reuters dataset. The first model, with 100 LSTM neurons, reported 97.83% accuracy. For the second model, I increased the number of neurons in the LSTM to 200. This increased the training time of the network, but the network still converged to the same accuracy level. For the third iteration, I therefore went back to 100 neurons and reduced the batch size from 64 to 12.

In [1]:
from tensorflow import keras
from tensorflow.keras.datasets import reuters
from tensorflow.keras.preprocessing.text import Tokenizer

In [2]:
# Keep only the 10,000 most frequent words
num_of_words = 10000

(x_train, y_train), (x_test, y_test) = reuters.load_data(num_words=num_of_words, test_split=0.2)
word_index = reuters.get_word_index(path="reuters_word_index.json")

print('# of Training Samples: {}'.format(len(x_train)))
print('# of Test Samples: {}'.format(len(x_test)))


num_classes = max(y_train) + 1
print('# of Classes: {}'.format(num_classes))

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/reuters.npz
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/reuters_word_index.json
# of Training Samples: 8982
# of Test Samples: 2246
# of Classes: 46
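
To sanity-check the loaded data, it can help to turn a sequence of integers back into words. The following is a minimal sketch, assuming the default `index_from=3` offset used by `reuters.load_data` (indices 0, 1 and 2 are reserved for padding, start-of-sequence and out-of-vocabulary markers). The helper name `decode_newswire` and the toy vocabulary are hypothetical; in the notebook you would pass `word_index` and an entry of `x_train` instead, before `x_train` is converted to a matrix.

```python
def decode_newswire(sequence, word_index, index_from=3):
    """Map a list of integer token ids back to a space-separated string."""
    # Invert the word -> index mapping, applying the reserved-index offset
    index_to_word = {index + index_from: word for word, index in word_index.items()}
    # Reserved ids and ids missing from the mapping are rendered as "?"
    return " ".join(index_to_word.get(token, "?") for token in sequence)


# Toy vocabulary standing in for reuters.get_word_index() (hypothetical values)
toy_word_index = {"the": 1, "of": 2, "oil": 3}
decoded = decode_newswire([1, 4, 6, 5, 4], toy_word_index)
print(decoded)  # ? the oil of the
```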


In [3]:
# Convert each sequence of word indices to a 10,000-dimensional binary (multi-hot) vector using Keras's built-in Tokenizer
tokenizer = Tokenizer(num_words=num_of_words)
x_train = tokenizer.sequences_to_matrix(x_train, mode='binary')
x_test = tokenizer.sequences_to_matrix(x_test, mode='binary')

y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

print(x_train[0])
print(len(x_train[0]))

print(y_train[0])
print(len(y_train[0]))

[0. 1. 1. ... 0. 0. 0.]
10000
[0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
46
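
The `binary` mode produces one row per newswire with a 1 in every column whose word index occurs in it, so word order and repetition are not retained. A minimal pure-Python sketch of that idea (not the Keras implementation; `to_binary_vector` is a hypothetical name) shows two sequences with the same words in a different order mapping to the same vector:

```python
def to_binary_vector(sequence, num_words):
    """Return a multi-hot list: position i is 1.0 if token id i occurs in the sequence."""
    vector = [0.0] * num_words
    for token in sequence:
        # Ids outside the vocabulary are skipped in this sketch
        if 0 <= token < num_words:
            vector[token] = 1.0
    return vector


first = to_binary_vector([1, 4, 2, 4], num_words=6)
second = to_binary_vector([4, 2, 1], num_words=6)
print(first)            # [0.0, 1.0, 1.0, 0.0, 1.0, 0.0]
print(first == second)  # True
```

Because the ordering is gone at this point, a sequence model downstream receives no word-order information from this representation.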


In [4]:
# Truncate each input row to 400 entries (pad_sequences drops entries from the front by default)
max_review_length = 400
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_review_length)
x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_review_length)
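
By default `pad_sequences` pads and truncates at the front (`padding='pre'`, `truncating='pre'`), so a row longer than `maxlen` keeps its last `maxlen` entries. A minimal pure-Python sketch of that default behaviour, with `pad_pre` as a hypothetical stand-in for the Keras function:

```python
def pad_pre(sequence, maxlen, value=0):
    """Mimic the default 'pre' padding and 'pre' truncation on a single sequence."""
    # Keep only the last maxlen entries when the sequence is too long
    truncated = sequence[-maxlen:]
    # Left-pad with the fill value when the sequence is too short
    return [value] * (maxlen - len(truncated)) + truncated


print(pad_pre([5, 6, 7], maxlen=5))           # [0, 0, 5, 6, 7]
print(pad_pre([1, 2, 3, 4, 5, 6], maxlen=4))  # [3, 4, 5, 6]
```

Applied to the 10,000-column matrix from the previous cell, this means each row is cut down to its final 400 columns.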

In [5]:
# Construct our model: embedding -> LSTM -> dense output layer
embedding_vector_length = 46
model = keras.models.Sequential()
# Map each of the 10,000 token ids to a 46-dimensional vector
model.add(keras.layers.Embedding(num_of_words, embedding_vector_length, input_length=max_review_length))
# A single LSTM layer with 100 units
model.add(keras.layers.LSTM(100))
# One sigmoid output per topic, trained with binary cross-entropy
model.add(keras.layers.Dense(46, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=3, batch_size=64)

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding (Embedding)        (None, 400, 46)           460000    
_________________________________________________________________
lstm (LSTM)                  (None, 100)               58800     
_________________________________________________________________
dense (Dense)                (None, 46)                4646      
Total params: 523,446
Trainable params: 523,446
Non-trainable params: 0
_________________________________________________________________
None
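
The parameter counts in the summary can be reproduced by hand. Below is a sketch of the arithmetic, assuming the standard LSTM layout of four gates, each with input weights, recurrent weights and a bias; the helper names are hypothetical.

```python
def embedding_params(vocab_size, embed_dim):
    # One embed_dim-sized vector per vocabulary entry
    return vocab_size * embed_dim


def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights and a bias vector
    return 4 * ((input_dim + units + 1) * units)


def dense_params(input_dim, units):
    # Weight matrix plus one bias per output unit
    return input_dim * units + units


counts = [embedding_params(10000, 46), lstm_params(46, 100), dense_params(100, 46)]
print(counts, sum(counts))  # [460000, 58800, 4646] 523446
```

The LSTM term grows roughly quadratically with the number of units, which is why doubling the units from 100 to 200 more than triples that layer's parameters (`lstm_params(46, 200)` gives 197,600).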

Train on 8982 samples, validate on 2246 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3


<tensorflow.python.keras.callbacks.History at 0x1c347d03358>

In [6]:
# Evaluate model
scores = model.evaluate(x_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

print('Test loss:', scores[0])


Accuracy: 97.83%
Test loss: 0.07202055565316233
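
One way to read the accuracy figure: with 46 sigmoid outputs and a `binary_crossentropy` loss, the Keras `accuracy` metric is typically resolved to a per-output binary accuracy rather than a per-newswire one. As a point of comparison, the sketch below computes the score a trivial predictor would get under that per-output metric if it output 0 for every topic. It is an illustration, not a measurement of the trained model, and the function name is hypothetical.

```python
def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of individual outputs, across all samples, on the correct side of the threshold."""
    matches = 0
    total = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            matches += int((p > threshold) == (t == 1))
            total += 1
    return matches / total


num_classes = 46
# Three one-hot labels for topics 3, 4 and 16 (arbitrary illustrative choices)
y_true = [[1 if i == topic else 0 for i in range(num_classes)] for topic in (3, 4, 16)]
# A trivial predictor that outputs 0 for every topic
y_pred = [[0.0] * num_classes for _ in y_true]

print("%.2f%%" % (binary_accuracy(y_true, y_pred) * 100))  # 97.83%
```

A per-newswire metric, such as comparing the arg-max of each prediction with the true topic, would separate such a predictor from a model that has learned the topics.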


# LSTM 2/3: Tweaking Parameters

For the first tweak of the LSTM, I increased the number of neurons in the LSTM layer from 100 to 200.

In [7]:
# Construct our model: same architecture, but with a wider LSTM layer
embedding_vector_length = 46
model = keras.models.Sequential()
model.add(keras.layers.Embedding(num_of_words, embedding_vector_length, input_length=max_review_length))
# Doubled from 100 to 200 units
model.add(keras.layers.LSTM(200))
model.add(keras.layers.Dense(46, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=3, batch_size=64)

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_1 (Embedding)      (None, 400, 46)           460000    
_________________________________________________________________
lstm_1 (LSTM)                (None, 200)               197600    
_________________________________________________________________
dense_1 (Dense)              (None, 46)                9246      
Total params: 666,846
Trainable params: 666,846
Non-trainable params: 0
_________________________________________________________________
None

Train on 8982 samples, validate on 2246 samples
Epoch 1/3
Epoch 2/3
Epoch 3/3


<tensorflow.python.keras.callbacks.History at 0x1c3eeba2ef0>

In [8]:
# Evaluate model
scores = model.evaluate(x_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

print('Test loss:', scores[0])

Accuracy: 97.83%
Test loss: 0.07244043153704859


# LSTM 3/3: Tweaking Batch Size

In [9]:
# Construct our model: back to 100 LSTM units
embedding_vector_length = 46
model = keras.models.Sequential()
model.add(keras.layers.Embedding(num_of_words, embedding_vector_length, input_length=max_review_length))
model.add(keras.layers.LSTM(100))
model.add(keras.layers.Dense(46, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
# Smaller batches (12 instead of 64) and one fewer epoch
model.fit(x_train, y_train, validation_data=(x_test, y_test), epochs=2, batch_size=12)

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_2 (Embedding)      (None, 400, 46)           460000    
_________________________________________________________________
lstm_2 (LSTM)                (None, 100)               58800     
_________________________________________________________________
dense_2 (Dense)              (None, 46)                4646      
Total params: 523,446
Trainable params: 523,446
Non-trainable params: 0
_________________________________________________________________
None

Train on 8982 samples, validate on 2246 samples
Epoch 1/2
Epoch 2/2


<tensorflow.python.keras.callbacks.History at 0x1c3f78e1f98>

In [10]:
# Evaluate model
scores = model.evaluate(x_test, y_test, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))

print('Test loss:', scores[0])

Accuracy: 97.83%
Test loss: 0.07255657746366168
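
Reducing the batch size changes how many weight updates happen per epoch. Below is a small sketch of the arithmetic for the 8,982 training samples used here, assuming the final partial batch is kept (the Keras default for array inputs); `steps_per_epoch` is a hypothetical helper.

```python
import math


def steps_per_epoch(num_samples, batch_size):
    # The last, smaller batch still counts as one step
    return math.ceil(num_samples / batch_size)


for batch_size, epochs in [(64, 3), (12, 2)]:
    steps = steps_per_epoch(8982, batch_size)
    print(batch_size, steps, steps * epochs)
# 64 141 423
# 12 749 1498
```

So the third run performs roughly 3.5 times as many gradient updates in total, despite training for one fewer epoch.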
