# Bidirectional LSTM on IMDB

**Author:** [fchollet](https://twitter.com/fchollet)<br>
**Date created:** 2020/05/03<br>
**Last modified:** 2020/05/03<br>
**Description:** Train a 2-layer bidirectional LSTM on the IMDB movie review sentiment classification dataset.

## Setup

In [1]:
import numpy as np
import keras
from keras import layers

max_features = 20000  # Only consider the top 20k words
maxlen = 200  # Pad/truncate each review to 200 tokens (pad_sequences keeps the last 200 by default)

## Build the model

In [2]:
# Input for variable-length sequences of integers
inputs = keras.Input(shape=(None,), dtype="int32")
# Embed each integer in a 128-dimensional vector
x = layers.Embedding(max_features, 128)(inputs)
# Add 2 bidirectional LSTMs
x = layers.Bidirectional(layers.LSTM(64, return_sequences=True))(x)
x = layers.Bidirectional(layers.LSTM(64))(x)
# Add a classifier
outputs = layers.Dense(1, activation="sigmoid")(x)
model = keras.Model(inputs, outputs)
model.summary()
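
As a sanity check on what `model.summary()` reports, the parameter counts can be worked out by hand. The sketch below assumes the Keras defaults (LSTM layers with bias terms, and `Bidirectional` concatenating the forward and backward outputs); the helper name `lstm_params` is made up for this illustration.

```python
def lstm_params(input_dim, units):
    # An LSTM has 4 gates; each gate has an input kernel, a recurrent kernel and a bias.
    return 4 * (units * (input_dim + units) + units)


max_features, embed_dim, units = 20000, 128, 64

embedding = max_features * embed_dim            # one 128-d vector per vocabulary index
bilstm_1 = 2 * lstm_params(embed_dim, units)    # forward + backward LSTM over the embeddings
bilstm_2 = 2 * lstm_params(2 * units, units)    # input is the concatenated 2 * 64 = 128-d output
dense = 2 * units * 1 + 1                       # 128 weights + 1 bias

total = embedding + bilstm_1 + bilstm_2 + dense
print(embedding, bilstm_1, bilstm_2, dense, total)
```

Under these assumptions the embedding table accounts for most of the weights (2,560,000 of 2,757,761), so lowering `max_features` shrinks the model far more than changing the LSTM width does.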

In [19]:
# # Input for variable-length sequences of integers
# inputs = keras.Input(shape=(None,), dtype="int32")
# # Embed each integer in a 128-dimensional vector
# x = layers.Embedding(max_features, 128)(inputs)
# # GRU model
# x = layers.GRU(64, return_sequences=True)(x)
# x = layers.GRU(64)(x)
# # Add a classifier
# outputs = layers.Dense(1, activation="sigmoid")(x)
# model = keras.Model(inputs, outputs)
# model.summary()

## Load the IMDB movie review sentiment data

In [3]:
(x_train, y_train), (x_val, y_val) = keras.datasets.imdb.load_data(
    num_words=max_features
)
print(len(x_train), "Training sequences")
print(len(x_val), "Validation sequences")
# Use pad_sequences to standardize sequence length: reviews longer than 200 tokens
# are truncated and shorter ones are zero-padded (both at the front, the default).
x_train = keras.utils.pad_sequences(x_train, maxlen=maxlen)
x_val = keras.utils.pad_sequences(x_val, maxlen=maxlen)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz
[1m17464789/17464789[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 0us/step
25000 Training sequences
25000 Validation sequences
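
To make the padding step concrete, here is a pure-Python sketch of what `pad_sequences` does to a single sequence with its default `padding="pre"` and `truncating="pre"` settings. The function `pad_pre` is a hypothetical stand-in written for this illustration, not part of Keras, and it assumes `maxlen >= 1`.

```python
def pad_pre(seq, maxlen, value=0):
    """Mimic the default pad_sequences behaviour on one sequence (assumes maxlen >= 1)."""
    seq = list(seq)[-maxlen:]                    # truncating="pre": drop tokens from the front
    return [value] * (maxlen - len(seq)) + seq   # padding="pre": fill with zeros on the left


print(pad_pre([5, 6, 7], maxlen=5))           # short sequence: zeros are added in front
print(pad_pre([1, 2, 3, 4, 5, 6], maxlen=4))  # long sequence: only the last 4 tokens survive
```

With these defaults a long review keeps its final tokens rather than its opening ones; passing `truncating="post"` to `pad_sequences` would keep the beginning instead.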


## Train and evaluate the model

You can use the trained model hosted on [Hugging Face Hub](https://huggingface.co/keras-io/bidirectional-lstm-imdb)
and try the demo on [Hugging Face Spaces](https://huggingface.co/spaces/keras-io/bidirectional_lstm_imdb).

In [4]:
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=32, epochs=2, validation_data=(x_val, y_val))

Epoch 1/2
782/782 ━━━━━━━━━━━━━━━━━━━━ 40s 43ms/step - accuracy: 0.7536 - loss: 0.4760 - val_accuracy: 0.8643 - val_loss: 0.3226
Epoch 2/2
782/782 ━━━━━━━━━━━━━━━━━━━━ 32s 41ms/step - accuracy: 0.9193 - loss: 0.2183 - val_accuracy: 0.8366 - val_loss: 0.4344


<keras.src.callbacks.history.History at 0x78bc3b70ef30>
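
Two numbers in the training log can be reproduced by hand: the `782` steps per epoch and the scale of the loss. The sketch below is only an illustration. The `binary_crossentropy` function is a scalar re-implementation written for this note (the clipping constant `eps` is an assumption), not the Keras loss itself.

```python
import math

num_train, batch_size = 25000, 32
steps_per_epoch = math.ceil(num_train / batch_size)        # the final batch is smaller
last_batch = num_train - (steps_per_epoch - 1) * batch_size
print(steps_per_epoch, last_batch)


def binary_crossentropy(y_true, p, eps=1e-7):
    """Per-example loss for a label in {0, 1} and a predicted probability p."""
    p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))


# A confident correct prediction is cheap; a confident wrong one is expensive.
print(round(binary_crossentropy(1, 0.9), 4))
print(round(binary_crossentropy(1, 0.1), 4))
```

Because confident mistakes are penalised so heavily, `val_loss` can rise (0.3226 to 0.4344 in the log above) even when validation accuracy falls only slightly, which is a common sign that the second epoch is starting to overfit.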

In [7]:
import re
import keras  # reuse the same Keras package that built the model above

INDEX_FROM = 3  # 0:pad, 1:start, 2:OOV
word_index = keras.datasets.imdb.get_word_index()

def _clean(text):
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s']", " ", text)
    text = re.sub(r"\s+", " ", text).strip()
    return text

def encode_text(text, max_features=20000):
    text = _clean(text)
    tokens = text.split()
    seq = [1]  # start token as in Keras IMDB
    for w in tokens:
        idx = word_index.get(w)
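        # load_data() applies the num_words cutoff after the shift, so compare the
        # shifted index; otherwise indices up to max_features + 2 reach the Embedding.
        if idx is not None and idx + INDEX_FROM < max_features:
            seq.append(idx + INDEX_FROM)  # shift by 3 to match load_data() encoding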
        else:
            seq.append(2)  # OOV
    return seq

def predict_review(review_text, model, maxlen=200):
    seq = encode_text(review_text)
    x = keras.utils.pad_sequences([seq], maxlen=maxlen)
    p = float(model.predict(x, verbose=0)[0][0])
    label = "Positive" if p >= 0.5 else "Negative"
    return p, label
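
The index shift is easier to see on a toy vocabulary. The sketch below uses a made-up four-word `toy_word_index` (the real one comes from `get_word_index()`), and the helpers `encode_toy` and `decode_toy` are hypothetical names for this illustration. It compares the shifted index against the vocabulary size, which is how `load_data(num_words=...)` applies its cutoff.

```python
INDEX_FROM = 3  # 0: padding, 1: start, 2: out-of-vocabulary
toy_word_index = {"the": 1, "movie": 2, "great": 3, "was": 4}  # hypothetical frequency ranks


def encode_toy(text, num_words):
    seq = [1]  # start token
    for w in text.lower().split():
        idx = toy_word_index.get(w)
        if idx is not None and idx + INDEX_FROM < num_words:
            seq.append(idx + INDEX_FROM)  # known word inside the vocabulary cutoff
        else:
            seq.append(2)                 # unknown or too-rare word
    return seq


def decode_toy(seq):
    inverse = {i + INDEX_FROM: w for w, i in toy_word_index.items()}
    specials = {0: "<pad>", 1: "<start>", 2: "<oov>"}
    return " ".join(specials[i] if i in specials else inverse[i] for i in seq)


encoded = encode_toy("The movie was great", num_words=7)
print(encoded)
print(decode_toy(encoded))
```

With `num_words=7`, the word "was" (rank 4, shifted index 7) falls outside the cutoff and becomes the OOV token, while the three more frequent words keep their shifted indices.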




In [8]:
# Examples
print(predict_review("I loved this film. Great acting and screenplay!", model))
print(predict_review("Terrible movie. I want my time back.", model))

(0.8249394297599792, 'Positive')
(0.0758194848895073, 'Negative')


In [10]:
model.save("imdb_bilstm.h5")




In [11]:
import os
print(os.path.exists("imdb_bilstm.h5"))


True


In [12]:
import gradio as gr

def predict_for_gradio(text):
    p, _ = predict_review(text, model, maxlen=200)
    return {"Positive": p, "Negative": 1.0 - p}

with gr.Blocks() as demo:
    gr.Markdown("### 🎬 IMDB Sentiment — Try it inline")
    inp = gr.Textbox(lines=4, label="Review")
    out = gr.Label(num_top_classes=2, label="Sentiment")
    btn = gr.Button("Predict")
    btn.click(predict_for_gradio, inp, out)
demo  # displays inline in many notebook setups; use demo.launch() for a local server


Gradio Blocks instance: 1 backend functions
-------------------------------------------
fn_index=0
 inputs:
 |-<gradio.components.textbox.Textbox object at 0x78bba5621f40>
 outputs:
 |-<gradio.components.label.Label object at 0x78bba56d9280>

In [14]:
prob, label = predict_review("I really hated this movie!", model)
print(f"Predicted sentiment: {label} (prob={prob:.3f})")


Predicted sentiment: Negative (prob=0.230)


In [15]:
demo.launch(inline=True)


It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://c358f727875eb5648c.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


