# Sentiment analysis of text

## IMDB dataset

This is a dataset of 25,000 movie reviews from IMDB, tagged by sentiment (positive/negative). 

The reviews have been preprocessed, and each review is encoded as a list of word indices (integers).

For convenience, the words are indexed by their overall frequency in the dataset, so for example, the integer "3" encodes the third most frequent word in the data. 

This allows quick filtering operations, such as "consider only the 10,000 most frequent words, but discard the 20 most frequent words".

By convention, "0" does not stand for a specific word; it is reserved for the padding token.
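
The filtering described above can be sketched on a toy review. This is only an illustration in plain Python: the review below is made up, and `filter_by_rank` is a hypothetical helper, not a Keras function. In Keras, the same effect is requested through the `num_words` and `skip_top` arguments of `imdb.load_data`, which replace filtered-out words with an out-of-vocabulary index (2 by default).

```python
# Toy review encoded as frequency ranks (made-up numbers, not real data).
encoded_review = [1, 14, 22, 16, 43, 530, 973, 12000, 5, 25]

OOV = 2  # index that stands in for every filtered-out word


def filter_by_rank(review, num_words, skip_top, oov=OOV):
    """Keep indices in [skip_top, num_words); replace all others with `oov`."""
    return [i if skip_top <= i < num_words else oov for i in review]


# "Consider only the 10,000 most frequent words, but discard the 20 most frequent."
filtered = filter_by_rank(encoded_review, num_words=10000, skip_top=20)
print(filtered)  # [2, 2, 22, 2, 43, 530, 973, 2, 2, 25]
```

Because the indices are frequency ranks, the filter is a simple range check and needs no word counts.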

In [None]:
from keras.datasets import imdb
import matplotlib.pyplot as plt
import numpy as np

## Data loading
To load the data, we again use the built-in loading function, `imdb.load_data`, and keep only the `vocabulary_size` most frequent words.

In [None]:
vocabulary_size = 5000
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=vocabulary_size)

Let's display the first training record.

Each review is stored as a list of word indices.

In [None]:
print(x_train[0])

Let's see what the review looks like as words rather than numbers.

First we need to download the word index (the dictionary).

In [None]:
word_idx = imdb.get_word_index()

The downloaded dictionary maps words to indices, so the index is the value, not the key.

To look up words by index, we invert the mapping so that indices become keys and words become values.

In [None]:
word_idx = {i: word for word, i in word_idx.items()}

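Display the review as text. By default, `imdb.load_data` shifts every word index by 3, because indices 0 to 2 are reserved for padding, start of sequence, and out-of-vocabulary words. We therefore subtract 3 before the lookup.

In [None]:
# Reserved indices (0-2) have no dictionary entry after the shift; show them as "?"
print([word_idx.get(i - 3, "?") for i in x_train[0]])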
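
To make the index offset concrete, here is a minimal sketch with a hypothetical five-word dictionary in the same word-to-rank format as `imdb.get_word_index()`. The words, the ranks, and the `<pad>`/`<start>`/`<oov>` labels are assumptions chosen for illustration. The offset of 3 matches the default `index_from` of `imdb.load_data`.

```python
# Hypothetical miniature word index: word -> frequency rank (starting at 1).
toy_word_index = {"the": 1, "and": 2, "film": 3, "was": 4, "brilliant": 5}

INDEX_FROM = 3  # default offset applied to real word indices
SPECIAL = {0: "<pad>", 1: "<start>", 2: "<oov>"}  # reserved indices; labels are ours

# Invert the mapping and shift it by the offset in one step.
index_to_word = {rank + INDEX_FROM: word for word, rank in toy_word_index.items()}
index_to_word.update(SPECIAL)


def decode(encoded):
    """Turn a list of shifted indices back into a readable string."""
    return " ".join(index_to_word.get(i, "<oov>") for i in encoded)


print(decode([1, 4, 6, 7, 8]))  # <start> the film was brilliant
```

Without the shift, index 4 would be read as rank 4 ("was") instead of rank 1 ("the"), and the decoded text would be scrambled.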

The first review has 218 words.

In [None]:
len(x_train[0])

Let's find out how long the reviews are.

In [None]:
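# x_train and x_test are NumPy object arrays, so "+" would join reviews pairwise;
# concatenate the two arrays instead
all_reviews = np.concatenate((x_train, x_test))
print("Maximum review length:", max(len(r) for r in all_reviews))
print("Minimum review length:", min(len(r) for r in all_reviews))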

Now that we know what the input data looks like, let's look at the output.

The label is either negative (0) or positive (1).

In [None]:
print(np.unique(y_train))

# Data preparation
TensorFlow's Keras API provides utilities for working with sequences.

In [None]:
from tensorflow.keras.preprocessing import sequence

We keep at most 400 words of each review. With its default settings, `pad_sequences` truncates longer reviews from the beginning (so the last 400 words are kept) and pads shorter reviews at the front with the number 0.

In [None]:
max_words = 400
 
x_train = sequence.pad_sequences(x_train, maxlen=max_words)
x_test = sequence.pad_sequences(x_test, maxlen=max_words)
 
x_valid, y_valid = x_train[:64], y_train[:64]
x_train_, y_train_ = x_train[64:], y_train[64:]
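
The padding step can be illustrated without Keras. The sketch below mimics the default behaviour of `pad_sequences` (`padding='pre'`, `truncating='pre'`) on a single sequence. `pad_pre` is a hypothetical helper written for this illustration and assumes `maxlen` is at least 1.

```python
def pad_pre(seq, maxlen, value=0):
    """Mimic 'pre' truncating and 'pre' padding for one sequence."""
    seq = list(seq)[-maxlen:]                   # keep the last `maxlen` items
    return [value] * (maxlen - len(seq)) + seq  # fill the front with `value`


short_seq = [5, 8, 13]
long_seq = list(range(1, 11))  # 1..10

print(pad_pre(short_seq, 6))  # [0, 0, 0, 5, 8, 13]
print(pad_pre(long_seq, 6))   # [5, 6, 7, 8, 9, 10]
```

Padding at the front keeps the real words next to the end of the sequence, which is the part a recurrent layer sees last.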

Let's check the length of the first review, which was originally 218 words long.

In [None]:
print(len(x_train[0]))

Let's take a look at the first review after padding.

In [None]:
x_train[0]

# Simple RNN model
For the neural network, we again use SimpleRNN.

In [None]:
from keras.layers import SimpleRNN, Dense, Embedding
from keras.models import Sequential

We create a sequential model

In [None]:
RNN_model = Sequential(name="Simple_RNN")

The first layer is Embedding. It maps discrete values (e.g. numeric word IDs) to dense vectors (embeddings).

It is typically used when working with text. Given a vocabulary of `vocab_size` words, each word is represented by a number (its index in the vocabulary).

The embedding converts this number into a fixed-length vector of size `output_dim`.

So instead of a one-hot encoding, each word is represented by a much more compact vector whose values are learned during training.

We need to choose the size of the embedding. In our case, we set it to 32.

In [None]:
embd_len = 32
RNN_model.add(Embedding(vocabulary_size, embd_len))
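
Conceptually, an embedding layer is a lookup table with one row per vocabulary index. The following sketch uses tiny, randomly initialized values to show the lookup and to check that it gives the same result as multiplying a one-hot vector by the table. The sizes and helper names are assumptions for illustration; in the real layer the rows are trained weights.

```python
import random

random.seed(0)
vocab_size, output_dim = 6, 3  # tiny sizes, for illustration only

# One row of `output_dim` weights per vocabulary index.
embedding_matrix = [[random.uniform(-0.05, 0.05) for _ in range(output_dim)]
                    for _ in range(vocab_size)]


def embed(sequence):
    """Look up one row of the table for every word index."""
    return [embedding_matrix[i] for i in sequence]


def one_hot_times_matrix(index):
    """The same result computed the expensive way, via a one-hot vector."""
    one_hot = [1.0 if j == index else 0.0 for j in range(vocab_size)]
    return [sum(one_hot[j] * embedding_matrix[j][k] for j in range(vocab_size))
            for k in range(output_dim)]


vectors = embed([0, 4, 4, 2])
print(len(vectors), len(vectors[0]))  # 4 3
```

The same index always yields the same vector, so a padded review of 400 indices becomes a 400 x 32 matrix in the model above.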

Next comes the SimpleRNN layer.

In [None]:
RNN_model.add(SimpleRNN(128,
                        activation='tanh',
                        return_sequences=False))
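
What a simple recurrent layer computes can be sketched in a few lines: at every time step the new state is `tanh(W x_t + U h_{t-1} + b)`, and with `return_sequences=False` only the final state is passed on. The weights and inputs below are made up, and the weight layout (one row per unit) is a choice made for readability, not the layout Keras uses internally.

```python
import math


def simple_rnn_last_state(inputs, W, U, b):
    """Apply h_t = tanh(W x_t + U h_{t-1} + b) over a sequence; return the final h."""
    units = len(b)
    h = [0.0] * units  # initial state
    for x in inputs:
        h = [math.tanh(sum(W[u][i] * x[i] for i in range(len(x)))
                       + sum(U[u][j] * h[j] for j in range(units))
                       + b[u])
             for u in range(units)]
    return h


# Toy setup: 2 units, 2-dimensional inputs, 3 time steps (made-up numbers).
W = [[0.5, -0.3], [0.1, 0.8]]
U = [[0.2, 0.0], [-0.4, 0.3]]
b = [0.0, 0.1]
sequence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

last_state = simple_rnn_last_state(sequence, W, U, b)
print(last_state)
```

The `tanh` keeps every component of the state between -1 and 1, regardless of the sequence length.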

The last layer is the output Dense layer with a sigmoid activation, which returns a number between 0 and 1.

In [None]:
RNN_model.add(Dense(1, activation='sigmoid'))

Print a summary of the network structure.

In [None]:
RNN_model.summary()
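
The parameter counts reported in the summary can be checked by hand. The calculation below is a sketch based on the standard layer definitions (one weight matrix for the inputs, one for the previous state, one bias vector); the layer sizes are the ones used in this notebook.

```python
vocabulary_size, embd_len, units = 5000, 32, 128

# Embedding: one vector of length embd_len per vocabulary index.
embedding_params = vocabulary_size * embd_len

# SimpleRNN: input weights + recurrent weights + bias, for each unit.
rnn_params = units * (embd_len + units + 1)

# Dense(1): one weight per RNN unit plus one bias.
dense_params = units * 1 + 1

total_params = embedding_params + rnn_params + dense_params
print(embedding_params, rnn_params, dense_params, total_params)
# 160000 20608 129 180737
```

Most of the parameters sit in the embedding table, not in the recurrent layer.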

This is a two-class classification model, so we use the `binary_crossentropy` loss function.

In [None]:
RNN_model.compile(
    loss="binary_crossentropy",
    optimizer='adam',
    metrics=['accuracy']
)
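
For intuition, binary cross-entropy can be computed by hand on a few predictions. This is a plain-Python sketch with made-up labels and predictions; the clipping constant `eps` is an assumption added here to avoid `log(0)`.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean of -(y*log(p) + (1-y)*log(1-p)) over all samples."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # keep p strictly inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


labels = [1, 0, 1, 0]
confident = [0.9, 0.1, 0.8, 0.2]  # mostly right and sure of it
unsure = [0.5, 0.5, 0.5, 0.5]     # always undecided
wrong = [0.1, 0.9, 0.2, 0.8]      # confidently wrong

for name, preds in [("confident", confident), ("unsure", unsure), ("wrong", wrong)]:
    print(name, round(binary_crossentropy(labels, preds), 4))
```

The loss is small for confident correct predictions, equals ln 2 (about 0.693) for predictions of 0.5, and grows quickly for confident mistakes.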

Now we train the model.

In [None]:
rnn_history = RNN_model.fit(x_train_, y_train_,
                        batch_size=64,
                        epochs=5,
                        verbose=1,
                        validation_data=(x_valid, y_valid))

Save the trained model.

In [None]:
RNN_model.save('rnn_simple.keras')

Model evaluation on the test set
* the first number is the value of the cost/loss function
* the second number is the accuracy

In [None]:
RNN_model.evaluate(x_test, y_test)
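
The accuracy value can be read as the fraction of reviews whose thresholded prediction matches the label. A minimal sketch, assuming a decision threshold of 0.5 and made-up predictions:

```python
def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of samples where (prediction > threshold) equals the label."""
    hits = sum(int(p > threshold) == y for y, p in zip(y_true, y_pred))
    return hits / len(y_true)


labels = [1, 0, 1, 0]
predictions = [0.9, 0.2, 0.4, 0.6]  # sigmoid outputs, made up for illustration

print(binary_accuracy(labels, predictions))  # 0.5
```

Unlike the loss, accuracy ignores how confident each prediction is, so the two numbers can move in different directions.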

Training history

In [None]:
fig1 = plt.figure()
plt.plot(rnn_history.history['loss'], label='Train Loss')
plt.plot(rnn_history.history['accuracy'], label='Train Accuracy')
plt.legend(loc="right")
plt.title('Loss, accuracy')
plt.ylabel('Loss, accuracy')
plt.xlabel('Epoch')
plt.show()

# GRU model
The model will be very similar, but we will replace the SimpleRNN part with GRU.

In [None]:
from keras.layers import GRU
gru_model = Sequential(name="GRU_Model")
gru_model.add(Embedding(vocabulary_size,
                        embd_len))
gru_model.add(GRU(128,
                  activation='tanh',
                  return_sequences=False))
gru_model.add(Dense(1, activation='sigmoid'))
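
To show what changes inside the recurrent cell, here is a single-unit GRU step with scalar weights. It is a simplified sketch of the usual textbook formulation, not the Keras implementation; the parameter names in the dictionary are hypothetical. The update gate `z` decides how much of the previous state is kept, and the reset gate `r` decides how much of it feeds the candidate state.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x, h_prev, p):
    """One GRU step for a single unit with scalar input and state."""
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])  # reset gate
    h_cand = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])
    return z * h_prev + (1.0 - z) * h_cand  # blend old state and candidate


names = ["wz", "uz", "bz", "wr", "ur", "br", "wh", "uh", "bh"]
zero_params = dict.fromkeys(names, 0.0)

# With all-zero weights both gates are 0.5 and the candidate is 0,
# so the state is simply halved.
print(gru_step(1.0, 0.8, zero_params))  # 0.4

# A large update-gate bias pushes z towards 1: the old state is preserved.
keep_params = dict(zero_params, bz=20.0)
print(gru_step(1.0, 0.8, keep_params))
```

This ability to carry a state across many steps almost unchanged is what a plain SimpleRNN lacks.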

Viewing the network structure

In [None]:
gru_model.summary()

Compiling and training the GRU network

In [None]:
gru_model.compile(
    loss="binary_crossentropy",
    optimizer='adam',
    metrics=['accuracy']
)

In [None]:
gru_history = gru_model.fit(x_train_, y_train_,
                         batch_size=64,
                         epochs=10,
                         verbose=1,
                         validation_data=(x_valid, y_valid))

Saving the trained model

In [None]:
gru_model.save('rnn_gru.keras')

Model validation

In [None]:
gru_model.evaluate(x_test, y_test)

View learning history

In [None]:
fig2 = plt.figure()                
plt.plot(gru_history.history['loss'], label='Train Loss')
plt.plot(gru_history.history['accuracy'], label='Train Accuracy')
plt.plot(gru_history.history['val_loss'], label='Validation Loss')
plt.plot(gru_history.history['val_accuracy'], label='Validation Accuracy')
plt.legend(loc="right")
plt.title('Loss, accuracy')
plt.ylabel('Loss, accuracy')
plt.xlabel('Epoch')
plt.show()

# LSTM model
Let's try the LSTM model. Again, only the recurrent part of the network is replaced.

In [None]:
from keras.layers import LSTM

In [None]:
lstm_model = Sequential(name="LSTM_Model")
lstm_model.add(Embedding(vocabulary_size,
                         embd_len))
lstm_model.add(LSTM(128,
                    activation='tanh',  # same activation as the other models; relu in an LSTM can make training unstable
                    return_sequences=False))
lstm_model.add(Dense(1, activation='sigmoid'))

Viewing the network structure

In [None]:
lstm_model.summary()
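
The summaries of the three recurrent layers differ mainly in their parameter counts. The sketch below reproduces them from the standard formulas: an LSTM holds four weight sets (three gates plus the candidate), a GRU three. The second GRU figure assumes Keras' default `reset_after=True` variant, which keeps two bias vectors per weight set; treat that as an assumption to verify against your own `summary()` output.

```python
embd_len, units = 32, 128

# One "weight set": input weights + recurrent weights + one bias per unit.
one_set = units * (embd_len + units + 1)

simple_rnn_params = one_set        # a single weight set
lstm_params = 4 * one_set          # 3 gates + candidate
gru_params_classic = 3 * one_set   # 2 gates + candidate, one bias each

# Assumed Keras default (reset_after=True): two bias vectors per weight set.
gru_params_keras = 3 * (units * (embd_len + units) + 2 * units)

print(simple_rnn_params, gru_params_classic, gru_params_keras, lstm_params)
# 20608 61824 62208 82432
```

More parameters mean longer training per epoch, which is worth keeping in mind when comparing the models.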

Neural network training

In [None]:
lstm_model.compile(
    loss="binary_crossentropy",
    optimizer='adam',
    metrics=['accuracy']
)

In [None]:
ltsm_history = lstm_model.fit(x_train_, y_train_,
                          batch_size=64,
                          epochs=5,
                          verbose=1,
                          validation_data=(x_valid, y_valid))

Saving the trained model

In [None]:
lstm_model.save('rnn_ltsm.keras')

Model validation

In [None]:
lstm_model.evaluate(x_test, y_test)

View learning history

In [None]:
fig3 = plt.figure()                
plt.plot(ltsm_history.history['loss'], label='Train Loss')
plt.plot(ltsm_history.history['accuracy'], label='Train Accuracy')
plt.plot(ltsm_history.history['val_loss'], label='Validation Loss')
plt.plot(ltsm_history.history['val_accuracy'], label='Validation Accuracy')
plt.legend(loc="right")
plt.title('Loss, accuracy')
plt.ylabel('Loss, accuracy')
plt.xlabel('Epoch')
plt.show()

# Bi-directional LSTM Model
Finally, we try the bidirectional LSTM model.

In [None]:
from keras.layers import Bidirectional

In [None]:
bi_lstm_model = Sequential(name="Bidirectional_LSTM")
bi_lstm_model.add(Embedding(vocabulary_size,
                            embd_len))
bi_lstm_model.add(Bidirectional(LSTM(128,
                                     activation='tanh',
                                     return_sequences=False)))
bi_lstm_model.add(Dense(1, activation='sigmoid'))
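
The idea behind the `Bidirectional` wrapper can be sketched with a stand-in for the recurrent layer: the sequence is processed once from start to end and once from end to start, and the two final states are concatenated. The toy recurrence below is an assumption for illustration. In Keras the two directions have separate trained weights, and concatenation is the default `merge_mode`.

```python
def toy_recurrent_layer(sequence):
    """Stand-in for an LSTM: final state of the recurrence h = 0.5*h + x."""
    h = 0.0
    for x in sequence:
        h = 0.5 * h + x
    return h


def bidirectional(sequence, layer):
    """Run `layer` forwards and backwards and concatenate both results."""
    forward = layer(sequence)
    backward = layer(sequence[::-1])
    return [forward, backward]


print(bidirectional([1.0, 2.0, 3.0], toy_recurrent_layer))  # [4.25, 2.75]
```

With 128 units per direction, the model above passes 256 features to the final Dense layer, which therefore has 257 parameters.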

Listing the network structure

In [None]:
bi_lstm_model.summary()

Network training

In [None]:
bi_lstm_model.compile(
  loss="binary_crossentropy",
  optimizer='adam',
  metrics=['accuracy']
)

In [None]:
bi_lstm_history = bi_lstm_model.fit(x_train_, y_train_,
                             batch_size=64,
                             epochs=5,
                             validation_data=(x_valid, y_valid))  # validate on the held-out split, not on the test set

Saving the trained model

In [None]:
bi_lstm_model.save('rnn_bi_ltsm.keras')

Model validation

In [None]:
bi_lstm_model.evaluate(x_test, y_test)

View learning history

In [None]:
fig4 = plt.figure()                
plt.plot(bi_lstm_history.history['loss'], label='Train Loss')
plt.plot(bi_lstm_history.history['accuracy'], label='Train Accuracy')
plt.plot(bi_lstm_history.history['val_loss'], label='Validation Loss')
plt.plot(bi_lstm_history.history['val_accuracy'], label='Validation Accuracy')
plt.legend(loc="right")
plt.title('Loss, accuracy')
plt.ylabel('Loss, accuracy')
plt.xlabel('Epoch')
plt.show()