# Text classification with an RNN

This notebook trains a recurrent neural network (RNN) to classify text, using the IMDB large movie review dataset for sentiment analysis.

_source_: https://www.tensorflow.org/text/tutorials/text_classification_rnn

## Design of the Model
![Bidirectional RNN model](https://www.tensorflow.org/text/tutorials/images/bidirectional.png)

In [1]:
import numpy as np

import matplotlib.pyplot as plt

import tensorflow as tf
import tensorflow_datasets as tfds

In [2]:
tfds.disable_progress_bar()

## Helper functions

In [3]:
def plot_graphs(history, metric):
    """Plot the training and validation curves of `metric` from a Keras History."""
    plt.plot(history.history[metric])
    plt.plot(history.history['val_' + metric], '')
    plt.xlabel('Epochs')
    plt.ylabel(metric)
    plt.legend([metric, 'val_' + metric])

## Setup input pipeline

In [4]:
dataset, info = tfds.load('imdb_reviews', with_info=True, as_supervised=True)
train_dataset, test_dataset = dataset['train'], dataset['test']


In [5]:
train_dataset.element_spec

(TensorSpec(shape=(), dtype=tf.string, name=None),
 TensorSpec(shape=(), dtype=tf.int64, name=None))

for example, label in train_dataset.take(1):
    print("text: ", example.numpy())
    print("label: ", label.numpy())

### Shuffle the training data and create batches of (text, label) pairs

In [6]:
BUFFER_SIZE = 10000
BATCH_SIZE = 64

In [7]:
train_dataset = train_dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
test_dataset = test_dataset.batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
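
`shuffle(BUFFER_SIZE)` does not shuffle the whole dataset at once: it keeps a buffer of `BUFFER_SIZE` elements and repeatedly emits a random element from that buffer, so a buffer smaller than the dataset gives only an approximate shuffle. The pure-Python sketch below illustrates that idea together with batching. The names `buffered_shuffle` and `batched` are hypothetical helpers written for this illustration; they are not part of `tf.data`, and the real implementation differs in its details.

```python
import random

def buffered_shuffle(items, buffer_size, seed=None):
    """Yield items in approximately random order using a fixed-size buffer."""
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Fill the buffer before emitting anything.
            buffer.append(item)
            continue
        # Emit a random buffered element and put the new item in its slot.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: drain what is left in random order.
    rng.shuffle(buffer)
    yield from buffer

def batched(items, batch_size):
    """Group items into lists of batch_size; the last batch may be smaller."""
    batch = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch

shuffled = list(buffered_shuffle(range(10), buffer_size=4, seed=0))
batches = list(batched(shuffled, batch_size=3))
print(batches)
```

With `buffer_size=1` the sketch returns the input order unchanged, which is why the buffer should be large relative to the dataset when a thorough shuffle matters.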

In [8]:
for example, label in train_dataset.take(1):
    print("texts: ", example.numpy()[:3])
    print("")
    print("labels: ", label.numpy()[:3])

texts:  [b'this movie is ok if you like mindless action ,corny acting, and a very small plot !The special effects are decent considering this movie is from the director of"Event Horizon". The costumes are like something from a mad max movie .None of the soldiers talk ,so others tell them what to do.It eventually end up as a big shoot\'em up movie with explosions all the place. I personally liked Russell better in "tango and cash""escape from la/new york" and "executive decision". It must see this movie leave your mind at the door for a no brainer action science fiction movie!!'
 b"Stefan is an x-con that five years ago got married to Marie. Their marriage has been stable until Stefan past catch up with them and he's offered to do a courier job. Stefan's job is a heroin delivery from Germany to Sweden which should go easily.<br /><br />In Germany Stefan meet Elli, a girl from Bosnia that has been sold to a stripclub owner. Stefan dislikes what he sees and decide to help Elli out of her 

## Create the text encoder

In [9]:
VOCAB_SIZE = 1000

encoder = tf.keras.layers.TextVectorization(max_tokens=VOCAB_SIZE)
encoder.adapt(train_dataset.map(lambda text, label: text))

In [10]:
vocab = np.array(encoder.get_vocabulary())
vocab[:20]

array(['', '[UNK]', 'the', 'and', 'a', 'of', 'to', 'is', 'in', 'it', 'i',
       'this', 'that', 'br', 'was', 'as', 'for', 'with', 'movie', 'but'],
      dtype='<U14')
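
The first two entries are reserved: index 0 (`''`) is the padding token and index 1 (`'[UNK]'`) stands for any word outside the vocabulary; the remaining tokens are ordered by frequency. As a rough illustration only, the sketch below mimics this in plain Python. The functions `standardize`, `build_vocab` and `encode` are hypothetical names introduced here, not TensorFlow APIs, and the standardization (lowercase, strip ASCII punctuation, split on whitespace) is a simplified stand-in for the layer's default behaviour. The alphabetical tie-break is an assumption made so the output is deterministic.

```python
import re
import string
from collections import Counter

def standardize(text):
    """Lowercase the text and remove ASCII punctuation (simplified)."""
    return re.sub(f"[{re.escape(string.punctuation)}]", "", text.lower())

def build_vocab(texts, max_tokens):
    """Return ['', '[UNK]'] followed by the most frequent tokens."""
    counts = Counter(tok for t in texts for tok in standardize(t).split())
    # Most frequent first; ties broken alphabetically (an assumption).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ["", "[UNK]"] + [tok for tok, _ in ranked[: max_tokens - 2]]

def encode(text, vocab):
    """Map each token to its vocabulary index, or 1 ([UNK]) if unknown."""
    index = {tok: i for i, tok in enumerate(vocab)}
    return [index.get(tok, 1) for tok in standardize(text).split()]

toy_texts = [
    "The movie was great!",
    "The movie was bad.",
    "Great acting, the plot was thin.",
]
toy_vocab = build_vocab(toy_texts, max_tokens=6)
print(toy_vocab)                                  # ['', '[UNK]', 'the', 'was', 'great', 'movie']
print(encode("The plot was great", toy_vocab))    # [2, 1, 3, 4]
```

Stripping punctuation also explains why `'br'` appears in the real vocabulary above: the reviews contain `<br />` tags, which reduce to the token `br`.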

In [11]:
encoded_example = encoder(example)[:3].numpy()
encoded_example

array([[ 11,  18,   7, ...,   0,   0,   0],
       [  1,   7,  34, ...,   0,   0,   0],
       [  2, 579,   5, ...,   0,   0,   0]])
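
The trailing zeros are padding: unless a fixed `output_sequence_length` is set, each batch is padded with index 0 up to the length of its longest sequence. A minimal NumPy sketch of that padding step follows. `pad_batch` is a hypothetical helper for illustration, and the token ids are made-up toy values rather than output from the encoder.

```python
import numpy as np

def pad_batch(sequences, pad_value=0):
    """Right-pad variable-length integer sequences to a rectangular array."""
    max_len = max(len(seq) for seq in sequences)
    batch = np.full((len(sequences), max_len), pad_value, dtype=np.int64)
    for row, seq in enumerate(sequences):
        batch[row, :len(seq)] = seq
    return batch

toy_sequences = [[11, 18, 7], [1, 7, 34, 2, 5], [2, 579]]
padded = pad_batch(toy_sequences)
print(padded)

# Non-zero positions mark real tokens; zeros mark padding.
mask = padded != 0
print(mask.sum(axis=1))  # number of real tokens in each row
```

Because index 0 is reserved for padding, a boolean mask like this is enough to tell real tokens from filler.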

In [12]:
for n in range(3):
    print("Original: ", example[n].numpy())
    print("--------------------------------")
    print("Round-trip: ", " ".join(vocab[encoded_example[n]]))
    print()

Original:  b'this movie is ok if you like mindless action ,corny acting, and a very small plot !The special effects are decent considering this movie is from the director of"Event Horizon". The costumes are like something from a mad max movie .None of the soldiers talk ,so others tell them what to do.It eventually end up as a big shoot\'em up movie with explosions all the place. I personally liked Russell better in "tango and cash""escape from la/new york" and "executive decision". It must see this movie leave your mind at the door for a no brainer action science fiction movie!!'
--------------------------------
Round-trip:  this movie is ok if you like [UNK] action [UNK] acting and a very small plot the special effects are decent [UNK] this movie is from the director [UNK] [UNK] the [UNK] are like something from a [UNK] [UNK] movie none of the [UNK] talk so others tell them what to [UNK] eventually end up as a big [UNK] up movie with [UNK] all the place i [UNK] liked [UNK] better in [

## Create the model