The IMDb Movie Reviews dataset contains 50,000 reviews labeled as positive or negative (binary classification). The dataset is balanced, meaning that it contains equal numbers of positive and negative reviews. Only polarized reviews (either very negative or very positive) are included.

- We want to build a model that is capable of telling us whether a review is positive or negative. The reviews are text, so they need to be converted to "vectors", as described in detail in lesson 3.
- These vectors must all have the same length: some review texts are shorter and some are longer, but the input of the model must have a fixed size. We can use Keras "pad_sequences" to ensure that all sequences in a list have the same length. Thus, we need to decide how many words to keep from each movie review (maxlen).
- We are training a model in which the order of the data matters. LSTM (long short-term memory) is a type of recurrent neural network designed for sequential data such as text reviews. We use the Keras API to build our LSTM model.
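
The padding step described above can be illustrated without Keras. The following is a minimal pure-Python sketch, assuming the default `pad_sequences` settings (`padding="pre"`, `truncating="pre"`, `value=0`). The helper name `pad_pre` is hypothetical and only mimics that behavior on plain lists.

```python
def pad_pre(sequences, maxlen, value=0):
    """Mimic the default behavior of Keras pad_sequences on lists of ints.

    Hypothetical helper for illustration only (assumes maxlen >= 1):
    short sequences are padded at the front, long sequences keep
    their last `maxlen` items.
    """
    padded = []
    for seq in sequences:
        kept = seq[-maxlen:]  # drop items from the front if too long
        padded.append([value] * (maxlen - len(kept)) + kept)
    return padded


reviews = [[5, 8], [3, 1, 4, 1, 5, 9]]
print(pad_pre(reviews, maxlen=4))
# [[0, 0, 5, 8], [4, 1, 5, 9]]
```

With these defaults a long review keeps its last `maxlen` tokens; passing `truncating="post"` to `pad_sequences` would keep the first ones instead.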

In [11]:
# Here we import all the libraries we need
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

max_features = 20000  # We only consider the top 20k words
maxlen = 200  # Each review is cut or padded to 200 words (by default, pad_sequences keeps the last 200).

In [12]:
# Keras has its own built-in datasets; one of them is the IMDb dataset, which can be easily loaded.
# For the convenience of users, the IMDb data is already vectorized (each word is replaced by an integer index), so the review texts are not directly visible.
(x_train, y_train), (x_val, y_val) = keras.datasets.imdb.load_data(
    num_words=max_features)
print(len(x_train), "Training sequences")
print(len(x_val), "Validation sequences")


25000 Training sequences
25000 Validation sequences
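
Although the reviews are stored as integers, the text can be recovered approximately. The sketch below assumes the default `load_data` encoding, in which indices 0, 1 and 2 are reserved for padding, start-of-sequence and out-of-vocabulary tokens, so every word index is shifted by 3. The function `decode_review` and the tiny `toy_word_index` are hypothetical; in the notebook, the real mapping would come from `keras.datasets.imdb.get_word_index()`.

```python
def decode_review(encoded, word_index, index_from=3):
    """Turn a list of integer tokens back into a readable string.

    `word_index` maps word -> rank (1 = most frequent word), in the format
    returned by keras.datasets.imdb.get_word_index().
    Hypothetical helper for illustration only.
    """
    reverse_index = {rank + index_from: word for word, rank in word_index.items()}
    reverse_index.update({0: "<pad>", 1: "<start>", 2: "<unk>"})
    return " ".join(reverse_index.get(token, "<unk>") for token in encoded)


# Toy vocabulary standing in for the real IMDb word index (an assumption).
toy_word_index = {"the": 1, "movie": 2, "great": 3}
print(decode_review([1, 4, 5, 2, 6], toy_word_index))
# <start> the movie <unk> great
```

In the notebook, the same function could be called as `decode_review(x_train[0], keras.datasets.imdb.get_word_index())` before the padding step.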


In [13]:
# Using keras.preprocessing.sequence.pad_sequences, we make sure that all the vectors have the same length
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=maxlen)
x_val = keras.preprocessing.sequence.pad_sequences(x_val, maxlen=maxlen)

In [14]:
# Now we build our model using Keras. First we define the input: variable-length sequences of integers.
inputs = keras.Input(shape=(None,), dtype="int32")
# Word embeddings provide a dense representation of words and their relative meanings.
# Keras provides an Embedding layer that works on integer inputs (our input is now integer-encoded).
# The Embedding layer is initialized with random weights, which are learned during training.
# Here, each integer is embedded in a 128-dimensional vector.
# See: https://keras.io/api/layers/core_layers/embedding/#embedding
x = layers.Embedding(max_features, 128)(inputs)
# Add two LSTM layers, each wrapped in Bidirectional so that the sequence is processed in both directions
x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)
x = layers.Bidirectional(layers.LSTM(64))(x)
# This is the output layer
outputs = layers.Dense(1, activation="sigmoid")(x)
# We build the model
model = keras.Model(inputs, outputs)
# summary of the model
model.summary()

Model: "model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_2 (InputLayer)        [(None, None)]            0         
                                                                 
 embedding_1 (Embedding)     (None, None, 128)         2560000   
                                                                 
 bidirectional_2 (Bidirectio  (None, None, 128)        98816     
 nal)                                                            
                                                                 
 bidirectional_3 (Bidirectio  (None, 128)              98816     
 nal)                                                            
                                                                 
 dense_1 (Dense)             (None, 1)                 129       
                                                                 
Total params: 2,757,761
Trainable params: 2,757,761
Non-trainable params: 0
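
The parameter counts in the summary can be checked by hand. The sketch below assumes the standard LSTM parameterization (four gates, each with input weights, recurrent weights and a bias vector) and that `Bidirectional` holds two independent copies of the wrapped layer. The helper `lstm_params` is hypothetical and not part of Keras.

```python
vocab_size, embed_dim, lstm_units = 20000, 128, 64


def lstm_params(input_dim, units):
    # 4 gates, each with input weights, recurrent weights and a bias vector
    return 4 * (units * (input_dim + units) + units)


embedding = vocab_size * embed_dim                       # one 128-dim vector per word
bi_lstm_1 = 2 * lstm_params(embed_dim, lstm_units)       # forward + backward copies
bi_lstm_2 = 2 * lstm_params(2 * lstm_units, lstm_units)  # input is the 128-dim concatenation
dense = 2 * lstm_units * 1 + 1                           # 128 weights + 1 bias

total = embedding + bi_lstm_1 + bi_lstm_2 + dense
print(embedding, bi_lstm_1, bi_lstm_2, dense, total)
# 2560000 98816 98816 129 2757761
```

Both bidirectional layers have the same count because the first receives 128-dimensional embeddings and the second receives the 128-dimensional concatenation of two 64-unit outputs.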

In [None]:
# Compile the model using the Adam optimizer; the loss function is chosen to be "binary_crossentropy"
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
# Fit the model
model.fit(x_train, y_train, batch_size=32, epochs=2, validation_data=(x_val, y_val))

Epoch 1/2
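
To make the loss and accuracy reported during training more concrete, here is a small pure-Python sketch of how binary cross-entropy and accuracy can be computed from sigmoid outputs. The labels and probabilities are made-up illustration values, not results from this model, and the 0.5 decision threshold and the clipping constant `eps` are assumptions of this sketch.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over a batch (pure-Python illustration)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


labels = [1, 0, 1]       # made-up true labels (1 = positive review)
probs = [0.9, 0.2, 0.6]  # made-up sigmoid outputs

loss = binary_crossentropy(labels, probs)
predicted = [int(p > 0.5) for p in probs]  # threshold the probabilities
accuracy = sum(int(a == b) for a, b in zip(predicted, labels)) / len(labels)
print(round(loss, 4), predicted, accuracy)
# 0.2798 [1, 0, 1] 1.0
```

Confident correct predictions contribute little to the loss, while the less certain prediction of 0.6 contributes the most, even though all three examples count as correct for accuracy.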