<a href="https://colab.research.google.com/github/rahmanidashti/TF2Practices/blob/main/04_TF2Practice_Recurrent_Neural_Network.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Recurrent Neural Network

In [8]:
import tensorflow as tf

## Data
*Large Movie Review Dataset*. A dataset for binary sentiment classification containing 25,000 highly polar movie reviews for training and 25,000 for testing. The reviews are preprocessed, and each one is encoded as a sequence of integer word indices. Words are indexed by their overall frequency in the dataset, so a smaller index means a more frequent word. For example, in the word index returned by `get_word_index`, the integer 2 is the second most frequent word. Note that `tf.keras.datasets.imdb.load_data` shifts these indices by 3 by default (`index_from=3`) and reserves 0 for padding, 1 for the start of a sequence, and 2 for out-of-vocabulary words. More information is available at https://ai.stanford.edu/~amaas/data/sentiment/

In [9]:
def load_data():

  ''' The 'num_words' argument keeps only the 'num_words' most frequent words
  (this caps the size of the vocabulary in the text data).
  Rarer words are replaced by the out-of-vocabulary index.
  '''

  (train_data, train_labels), (test_data, test_labels) = tf.keras.datasets.imdb.load_data(num_words=10000)

  '''
  pad_sequences brings every sequence to the same length. With maxlen=100, shorter
  sequences are padded with a value (default 0) and longer ones are truncated to 100 tokens.
  Both padding and truncating default to 'pre', so the padding values are added at the
  beginning of a sequence and extra tokens are dropped from the beginning.
  '''

  train_data = tf.keras.preprocessing.sequence.pad_sequences(train_data, maxlen=100)
  test_data = tf.keras.preprocessing.sequence.pad_sequences(test_data, maxlen=100)

  return train_data, train_labels, test_data, test_labels
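
To make the padding and truncation behaviour concrete, here is a minimal pure-Python sketch that mimics what `pad_sequences` does to a single sequence with its default `padding='pre'` and `truncating='pre'` settings. The helper `pad_pre` and the sample sequences are hypothetical and exist only for illustration; they are not part of Keras.

```python
def pad_pre(seq, maxlen, value=0):
    """Mimic the pad_sequences defaults for one sequence (illustrative helper, not a Keras API)."""
    if maxlen <= 0:
        raise ValueError("maxlen must be positive")
    # 'pre' truncation: keep only the last `maxlen` tokens
    kept = seq[-maxlen:]
    # 'pre' padding: fill the front with `value` until the length is `maxlen`
    return [value] * (maxlen - len(kept)) + kept


short_review = [11, 12, 13]
long_review = [21, 22, 23, 24, 25, 26]

padded_short = pad_pre(short_review, maxlen=5)
padded_long = pad_pre(long_review, maxlen=5)
print(padded_short)  # [0, 0, 11, 12, 13]
print(padded_long)   # [22, 23, 24, 25, 26]
```

With `maxlen=100`, as used in `load_data`, a long review keeps only its last 100 tokens, so the opening of the review is discarded.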

## Convert Int to Text

The code below retrieves the dictionary that maps word indices back to the original words so that we can read them. Credit: [How to build a neural network with Keras using the IMDB dataset](https://builtin.com/data-science/how-build-neural-network-keras)

In [10]:
# See an actual review in words
# Map integers back to words using the word index that Keras ships with the dataset (no need to build it ourselves)

word_index = tf.keras.datasets.imdb.get_word_index()
train_data, _, _, _ = load_data()

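# Invert the mapping: integer rank -> word
reverse_word_index = {value: key for (key, value) in word_index.items()}

# Indices in the loaded data are offset by 3 (0 = padding, 1 = start, 2 = unknown),
# so subtract 3 before the lookup; reserved indices decode to '?'
decoded_review = ' '.join(
    reverse_word_index.get(i - 3, '?') for i in train_data[123])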

print(decoded_review)



? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? beautiful and touching movie rich colors great settings good acting and one of the most charming movies i have seen in a while i never saw such an interesting setting when i was in china my wife liked it so much she asked me to ? on and rate it so other would enjoy too
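
The leading `?` characters in the output above are padding zeros, which have no entry in the word index. The following sketch reproduces the decoding step on a toy vocabulary so that it runs without downloading the dataset. The words, ranks, and the encoded sequence are made up for illustration; only the offset of 3 and the meaning of the reserved indices follow the defaults of `tf.keras.datasets.imdb.load_data`.

```python
# Toy word index in the same format as get_word_index(): word -> frequency rank.
# The words and ranks are invented for this example.
toy_word_index = {"the": 1, "movie": 2, "great": 3}

# Invert it: rank -> word
toy_reverse_index = {rank: word for word, rank in toy_word_index.items()}

# Encoded reviews shift every rank by 3; 0, 1 and 2 are reserved
# (padding, start of sequence, out-of-vocabulary).
encoded = [0, 0, 1, 4, 5, 2, 6]

decoded = " ".join(toy_reverse_index.get(i - 3, "?") for i in encoded)
print(decoded)  # ? ? ? the movie ? great
```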


## Model
The input layer is an embedding layer that maps each word (represented by an integer index) to a dense vector. Its `input_dim` must therefore equal `num_words`, which is set in the `load_data` function and limits the vocabulary to the top `num_words` most frequent words. One of the tasks of the model is to learn these embeddings so that they represent the words well for sentiment classification.
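
Conceptually, an embedding layer is a lookup table with one row per word index. The NumPy sketch below illustrates that idea with small, made-up sizes. It is not how Keras implements the layer internally, and the random table only stands in for weights that would be learned during training.

```python
import numpy as np

vocab_size = 10    # plays the role of input_dim (10000 in this notebook)
embedding_dim = 4  # plays the role of output_dim (128 in this notebook)

rng = np.random.default_rng(seed=0)
# One row per word index; random values stand in for learned weights
embedding_table = rng.normal(size=(vocab_size, embedding_dim))

# A batch of 2 reviews, each 3 tokens long
word_ids = np.array([[0, 5, 2],
                     [7, 7, 1]])

# The lookup is plain integer indexing into the table
embedded = embedding_table[word_ids]
print(embedded.shape)  # (2, 3, 4)
```

Every occurrence of the same word index receives the same vector, and the output gains one trailing axis of size `embedding_dim`, which matches the `(None, None, 128)` shape reported for the embedding layer by `model.summary()`.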

In [11]:
def define_model():
  model = tf.keras.models.Sequential()

  '''
  The first argument (10000) is the vocabulary size, which matches num_words in load_data,
  and the second argument (128) is the size of each embedding vector.
  '''

  model.add(tf.keras.layers.Embedding(input_dim=10000, output_dim=128))

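  '''
  A Long Short-Term Memory network (LSTM) is a type of recurrent neural network
  (RNN) that was developed to mitigate the vanishing gradients problem. This
  problem, which is caused by the chaining of gradients during error backpropagation,
  means that the most upstream layers (for an RNN, the earliest time steps) learn very slowly.
  The Bidirectional wrapper runs one LSTM forward and one backward over the sequence.
  The first layer sets return_sequences=True so the second receives one output per time step.
  '''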

  model.add(tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32, return_sequences=True)))
  model.add(tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)))

  model.add(tf.keras.layers.Dense(1, activation='sigmoid'))

  return model
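
The vanishing-gradient remark in `define_model` can be illustrated with a toy calculation. This is a sketch under a strong simplifying assumption: a scalar linear recurrence `h_t = w * h_{t-1}`, for which the gradient of the last state with respect to the first is `w` raised to the number of steps. Real RNNs multiply Jacobian matrices rather than scalars, and the function name `chained_gradient` is made up for this example.

```python
def chained_gradient(w, steps):
    """Gradient of h_T with respect to h_0 for the toy recurrence h_t = w * h_{t-1}."""
    grad = 1.0
    for _ in range(steps):
        grad *= w  # chain rule: one factor of w per time step
    return grad


# 100 steps matches the maxlen used when padding the reviews
grad_small = chained_gradient(0.9, steps=100)
grad_one = chained_gradient(1.0, steps=100)
print(f"w=0.9: {grad_small:.2e}")
print(f"w=1.0: {grad_one:.2e}")
```

A factor only slightly below 1 already shrinks the gradient by several orders of magnitude over 100 steps. The additive cell-state update of an LSTM is designed to keep this path closer to the `w = 1.0` case.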

## Train and Test

In [12]:
train_data, train_labels, test_data, test_labels = load_data()

model = define_model()

opt = tf.keras.optimizers.Adam(learning_rate=0.0001)

model.compile(optimizer=opt,
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(train_data, train_labels, batch_size=128, epochs=5)

test_loss, test_acc = model.evaluate(test_data, test_labels)



Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [13]:
print("Test accuracy: %.2f" % (test_acc * 100))

Test accuracy: 83.63


In [14]:
model.summary()

Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
embedding_1 (Embedding)      (None, None, 128)         1280000   
_________________________________________________________________
bidirectional_2 (Bidirection (None, None, 64)          41216     
_________________________________________________________________
bidirectional_3 (Bidirection (None, 128)               66048     
_________________________________________________________________
dense_1 (Dense)              (None, 1)                 129       
Total params: 1,387,393
Trainable params: 1,387,393
Non-trainable params: 0
_________________________________________________________________
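
The parameter counts in the summary can be checked by hand. The sketch below assumes the standard LSTM layout of four gates, each with input weights, recurrent weights, and a bias. The helper name `lstm_params` is hypothetical.

```python
def lstm_params(input_dim, units):
    """Weights of one LSTM: 4 gates, each with input, recurrent and bias weights."""
    return 4 * units * (input_dim + units + 1)


embedding = 10000 * 128               # one 128-d vector per word index
bi_lstm_1 = 2 * lstm_params(128, 32)  # forward + backward LSTM(32) on 128-d embeddings
bi_lstm_2 = 2 * lstm_params(64, 64)   # input is 2 * 32 = 64 concatenated features
dense = 128 * 1 + 1                   # 2 * 64 = 128 inputs plus one bias
total = embedding + bi_lstm_1 + bi_lstm_2 + dense

print(embedding, bi_lstm_1, bi_lstm_2, dense, total)
# 1280000 41216 66048 129 1387393
```

The embedding table accounts for more than 90% of the trainable parameters, so the vocabulary size and embedding dimension dominate the model size.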


## More Study
[tf.keras.datasets.imdb.load_data](https://www.tensorflow.org/api_docs/python/tf/keras/datasets/imdb/load_data)

[How to Use Word Embedding Layers for Deep Learning with Keras](https://machinelearningmastery.com/use-word-embedding-layers-deep-learning-keras/)

[How does Keras 'Embedding' layer work?](https://stats.stackexchange.com/questions/270546/how-does-keras-embedding-layer-work)

[Bidirectional LSTMs with TensorFlow 2.0 and Keras](https://www.machinecurve.com/index.php/2021/01/11/bidirectional-lstms-with-tensorflow-and-keras/)