# **Hands-on RNN with TensorFlow**
This notebook will guide you through building and training a basic Recurrent Neural Network (RNN) model using TensorFlow. We'll explore the concepts learned in the lecture and put them into practice by performing sentiment analysis on a Natural Language Processing (NLP) dataset.

## **Prerequisites:**

* Basic understanding of Python programming
* Familiarity with Machine Learning concepts
* Introduction to Deep Learning (activation functions, optimizers)

## **Learning Objectives:**

* Implement a basic RNN model in TensorFlow
* Train the model on an NLP dataset
* Evaluate the model's performance
* Make predictions using the trained model
* Visualize the results

## Loading and Preprocessing the IMDB Review Dataset

In this step, we'll load and preprocess the IMDB movie review dataset for our RNN sentiment analysis task. Sentiment analysis aims to automatically determine the overall opinion or feeling expressed in a piece of text. Here, we're specifically trying to classify reviews as either positive (expressing a good opinion about the movie) or negative (expressing a bad opinion).

The `imdb.load_data` function provides a convenient way to access the IMDB dataset, which includes reviews labeled as positive or negative. We'll use this data to train our RNN model to identify sentiment patterns in text.

**Key Steps:**

1. **Data Loading:** We load the IMDB dataset using `imdb.load_data()`, keeping only the 10,000 most frequent words (`num_words=10000`) to reduce vocabulary size and training complexity. Less frequent words are replaced by an out-of-vocabulary index.

2. **Data Splitting:** `imdb.load_data()` returns the data already split into a training set and a testing set. The training set will be used to train the model, and the testing set will be used to evaluate its performance on unseen reviews.

3. **Text Preprocessing:**
    - **Text to Sequences:** The reviews returned by `imdb.load_data()` are already encoded as sequences of integer word indices, so the model can process them numerically. We only need `text_to_word_sequence` later, to split a new raw-text review into words before encoding it ourselves.
    - **Word Indexing:** Each word in the vocabulary is mapped to a unique integer index by a dictionary (`word_index`), retrieved with `imdb.get_word_index()`. Note that `imdb.load_data()` shifts these indices by 3 by default, reserving 0 for padding, 1 for the start of a review, and 2 for out-of-vocabulary words.
    - **Padding:** Sequences can have varying lengths. We use `pad_sequences` to give all sequences a uniform length (100 in this case). By default, it pads shorter sequences with zeros at the beginning and truncates longer ones from the beginning, keeping the last 100 indices. A uniform length lets the reviews be batched into a single array for the RNN.

By following these steps, we prepare the IMDB review data for training our RNN model to perform sentiment analysis.
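To make the padding step concrete, here is a minimal pure-Python sketch of what `pad_sequences` does with its default settings (`padding="pre"`, `truncating="pre"`). The helper name `pad_pre` is our own and is not part of Keras; the real function also handles batches and returns a NumPy array.

```python
def pad_pre(sequence, maxlen, value=0):
    """Pad or truncate one sequence at the front (illustrative helper)."""
    # Keep only the last `maxlen` items, like truncating="pre".
    trimmed = sequence[-maxlen:]
    # Fill the front with `value` until the length is `maxlen`, like padding="pre".
    return [value] * (maxlen - len(trimmed)) + trimmed


print(pad_pre([5, 8, 2], maxlen=5))              # [0, 0, 5, 8, 2]
print(pad_pre([1, 2, 3, 4, 5, 6, 7], maxlen=5))  # [3, 4, 5, 6, 7]
```

Padding at the front means the real words sit next to the final time step, which is the one whose hidden state the `SimpleRNN` layer passes on.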



## **1. Import Libraries**

In [1]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.preprocessing.text import text_to_word_sequence

# Import libraries for data visualization
import matplotlib.pyplot as plt

We'll import necessary libraries:

* `tensorflow`: The core library for building and training deep learning models.
* `keras`: A high-level API built on top of TensorFlow for easier model development.
* `imdb`: The Keras loader for the IMDB movie review dataset used in this exercise.
* `pad_sequences`: A function for padding text sequences to a uniform length (important for RNNs).
* `text_to_word_sequence`: A function that splits text into a list of lowercase words (useful for text preprocessing).
* `matplotlib.pyplot`: For data visualization after training.

## **2. Load and Preprocess Data**

In [2]:
(train_data, train_labels), (test_data, test_labels) = imdb.load_data(num_words=10000)

# Get the dictionary that maps each word to its integer index
word_index = imdb.get_word_index()

# Pad or truncate every review to exactly 100 word indices
train_data = pad_sequences(train_data, maxlen=100)
test_data = pad_sequences(test_data, maxlen=100)

This code performs the following actions:

* Loads the IMDB movie review dataset, keeping only the 10,000 most frequent words.
* Retrieves the dictionary that maps each word in the full vocabulary to its integer index.
* Unpacks the pre-split training and testing sets.
* Pads or truncates each sequence of word indices to the same length (100 in this case), so the reviews can be batched into a single array.


### **Explanation:**

   - The `imdb.load_data()` function loads the IMDB movie review dataset with only the top 10,000 most frequently occurring words (`num_words=10000`).
   - The dataset is divided into training and testing sets, with data as sequences of word indices and labels as sentiment classifications (positive or negative).
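   - `imdb.get_word_index()` retrieves the dictionary that maps words to their integer indices. This dictionary was created when the IMDB dataset was initially preprocessed. Note that `imdb.load_data()` shifts every index by 3 by default (0 is reserved for padding, 1 for the start of a review, and 2 for out-of-vocabulary words), so the index of a word in the loaded data is `word_index[word] + 3`.
   - `pad_sequences()` transforms the lists of word indices (which may have different lengths) into a 2D integer array with a fixed length per review (`maxlen=100`).
   - This ensures all input sequences have the same length, which is needed to batch them together. By default, sequences shorter than 100 are padded with zeros at the beginning, and longer ones are truncated from the beginning, keeping the last 100 indices.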
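A quick way to sanity-check the loaded data is to turn a sequence of indices back into words. The sketch below assumes the default `imdb.load_data()` settings, where stored indices are shifted by 3 and 0, 1, and 2 are reserved. The `toy_word_index` dictionary and the `<pad>`, `<start>`, and `<unk>` display names are made up for illustration; with the real data you would pass the `word_index` from the cell above.

```python
INDEX_OFFSET = 3  # default shift applied by imdb.load_data() (assumed here)
SPECIAL_TOKENS = {0: "<pad>", 1: "<start>", 2: "<unk>"}  # display names are our own


def decode_review(encoded, word_index):
    """Map a list of integer indices back to a readable string."""
    # Invert word -> index into (index + offset) -> word.
    reverse_index = {index + INDEX_OFFSET: word for word, index in word_index.items()}
    reverse_index.update(SPECIAL_TOKENS)
    return " ".join(reverse_index.get(i, "<unk>") for i in encoded)


# Toy vocabulary for illustration only; not the real IMDB word index.
toy_word_index = {"the": 1, "movie": 2, "great": 3}
print(decode_review([0, 0, 1, 4, 5, 2, 6], toy_word_index))
# <pad> <pad> <start> the movie <unk> great
```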

## **3. Build the RNN Model**

In [3]:
# Define the RNN model architecture
model = keras.Sequential([
  keras.layers.Embedding(input_dim=10000, output_dim=128, input_length=100),
  keras.layers.SimpleRNN(units=64),
  keras.layers.Dense(1, activation='sigmoid')
])

# Compile the model with optimizer, loss function, and metrics
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Print a summary of the model architecture
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 100, 128)          1280000   
                                                                 
 simple_rnn (SimpleRNN)      (None, 64)                12352     
                                                                 
 dense (Dense)               (None, 1)                 65        
                                                                 
Total params: 1292417 (4.93 MB)
Trainable params: 1292417 (4.93 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


### **Explanation:**

Here, we build the RNN model step-by-step:

* **Embedding layer:** This layer maps each integer word index in the sequence to a dense vector (128 dimensions in this case). The vectors are learned during training, which lets them capture semantic relationships between words.
* **SimpleRNN layer:** This is the core recurrent layer with 64 hidden units. It processes the sequence of word vectors one by one, considering the information from previous steps.
* **Dense layer:** This final layer takes the output from the RNN and compresses it into a single unit with a sigmoid activation function. Since we're predicting positive or negative sentiment (1 or 0), sigmoid is a good choice.
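The recurrence inside the SimpleRNN layer can be sketched in a few lines of plain Python. This is an illustration rather than the Keras implementation: it assumes a `tanh` activation and an all-zero initial state (the Keras defaults), and the function name and tiny weight matrices are made up for the example.

```python
import math


def simple_rnn_forward(inputs, W_x, W_h, b):
    """Return the final hidden state after reading `inputs` one step at a time.

    inputs: list of input vectors (one per time step)
    W_x:    input-to-hidden weights, shape [input_dim][units]
    W_h:    hidden-to-hidden weights, shape [units][units]
    b:      bias vector, length units
    """
    units = len(b)
    h = [0.0] * units  # initial hidden state
    for x in inputs:   # one time step per word vector
        h_new = []
        for j in range(units):
            total = b[j]
            total += sum(x[i] * W_x[i][j] for i in range(len(x)))  # current input
            total += sum(h[k] * W_h[k][j] for k in range(units))   # previous state
            h_new.append(math.tanh(total))
        h = h_new
    return h


# Two time steps, 2-dimensional inputs, 1 hidden unit (illustrative weights).
final_state = simple_rnn_forward(
    inputs=[[1.0, 0.0], [0.0, 1.0]],
    W_x=[[0.5], [-0.5]],
    W_h=[[0.8]],
    b=[0.0],
)
print(final_state)
```

Only the final hidden state is returned, which mirrors how `SimpleRNN(units=64)` hands a single 64-dimensional vector to the Dense layer.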

We then compile the model by specifying:

* **Loss function:** 'binary_crossentropy' is suitable for binary classification tasks like sentiment analysis.
* **Optimizer:** 'adam' is a popular optimizer that efficiently adjusts model weights during training.
* **Metrics:** We track 'accuracy' to measure how well the model predicts sentiment correctly.

Finally, `model.summary()` provides a detailed overview of the model architecture, including the number of parameters and layers.
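The parameter counts in the summary can be reproduced by hand, which is a useful check that the architecture is what we intended. The arithmetic below uses the layer sizes from the model definition; the variable names are our own.

```python
vocab_size, embedding_dim, rnn_units = 10000, 128, 64

# Embedding: one 128-dimensional vector per vocabulary entry.
embedding_params = vocab_size * embedding_dim

# SimpleRNN: input weights + recurrent weights + bias, for each of the 64 units.
rnn_params = rnn_units * (embedding_dim + rnn_units + 1)

# Dense: one weight per RNN unit plus a single bias.
dense_params = rnn_units * 1 + 1

total_params = embedding_params + rnn_params + dense_params
print(embedding_params, rnn_params, dense_params, total_params)
# 1280000 12352 65 1292417
```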

## **4. Train the Model**

In [4]:
# Train the model on the training data
history = model.fit(
    train_data, train_labels,
    epochs=5,
    validation_data=(test_data, test_labels),
    batch_size=32
)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


### **Explanation:**

* `model.fit` takes the training data, the labels, the number of training epochs (full passes over the training set), and validation data for monitoring performance during training.
* During training, the model learns to map sequences of words (reviews) to sentiment labels (positive or negative).

* We've introduced the `batch_size` parameter to the `model.fit` function. During training, a deep learning model uses batches of data—subgroups of the entire training set—for efficient learning. A batch size of 32 is a common choice. This means that in each iteration (within an epoch), 32 training samples will be used to update the model's weights.
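To see what a batch size of 32 means in practice, the sketch below splits a list of sample indices into batches and counts the weight updates per epoch. It assumes the standard IMDB training set of 25,000 reviews; `iter_batches` is an illustrative helper, not a Keras function.

```python
import math


def iter_batches(samples, batch_size):
    """Yield consecutive slices of `samples` with at most `batch_size` items."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


num_train = 25000  # size of the IMDB training set (assumed here)
batches = list(iter_batches(list(range(num_train)), batch_size=32))

print(len(batches))                # 782
print(len(batches[-1]))            # 8
print(math.ceil(num_train / 32))   # 782
```

The last batch is smaller because 25,000 is not a multiple of 32.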

### **How training works under the hood:**

1. The training data is divided into batches.
2. The model processes a single batch of data and predicts the labels for that batch.
3. The model's loss (how far off the predictions are) is calculated.
4. Gradients are computed to determine how to adjust the model's weights to reduce the loss.
5. The optimizer uses the gradients to update the model's weights.
6. Steps 2-5 are repeated for all batches in one epoch, and then again for multiple epochs.
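Step 3 can be made concrete with a small sketch of binary cross-entropy, the loss named in `model.compile`. This is a plain-Python illustration of the formula, not the Keras implementation; the clipping constant `eps` is an assumption added to avoid `log(0)`.

```python
import math


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over a batch of labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # keep p strictly inside (0, 1)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


# Confident, correct predictions give a small loss.
print(binary_crossentropy([1, 0], [0.9, 0.1]))
# Confident, wrong predictions give a large loss.
print(binary_crossentropy([1, 0], [0.1, 0.9]))
```

The optimizer in steps 4 and 5 adjusts the weights in the direction that lowers this value.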

### **Key points**

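* **Validation data:** Validation data shows how well the model generalizes to unseen data, which helps detect overfitting (where the model performs very well on the training data but poorly on held-out data). Note that this notebook passes the test set as `validation_data`; if you use the validation results to tune hyperparameters, hold out a separate part of the training data instead (for example with the `validation_split` argument of `model.fit`), so that the test set stays untouched.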
* **Experiment with:** Feel free to experiment with different `batch_size` values and numbers of `epochs` to see how they affect model performance.
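When experimenting with the number of epochs, it helps to look at where the validation loss stops improving. The sketch below is a small helper of our own that reads a dictionary shaped like `history.history` (a list of per-epoch values under keys such as `loss` and `val_loss`). The numbers in `example_history` are invented for illustration and are not results from this notebook.

```python
def best_epoch(history_dict, metric="val_loss"):
    """Return (1-based epoch, value) where `metric` is lowest."""
    values = history_dict[metric]
    best_index = min(range(len(values)), key=values.__getitem__)
    return best_index + 1, values[best_index]


# Made-up values: training loss keeps falling while validation loss turns upward.
example_history = {
    "loss": [0.60, 0.42, 0.30, 0.21, 0.15],
    "val_loss": [0.55, 0.48, 0.52, 0.61, 0.70],
}
epoch, value = best_epoch(example_history)
print(epoch, value)  # 2 0.48
```

With the trained model you would call `best_epoch(history.history)`. A validation loss that rises while the training loss keeps falling is a typical sign of overfitting.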

## **5. Evaluate the Model**

In [5]:
# Evaluate Performance on Test Data
test_loss, test_accuracy = model.evaluate(test_data, test_labels)

print('Test Loss:', test_loss)
print('Test Accuracy:', test_accuracy)

Test Loss: 0.8543428778648376
Test Accuracy: 0.7736799716949463


### **Explanation:**

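`model.evaluate` calculates the loss and accuracy on the held-out testing set, which estimates how well the model generalizes to reviews it was not trained on. Because the same test set was also passed as `validation_data` in step 4, these values should match the validation metrics of the final epoch.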
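The accuracy figure is the fraction of reviews whose predicted label matches the true label, where a probability above 0.5 counts as positive. A minimal sketch of that calculation, using made-up labels and probabilities rather than real model output:

```python
def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction equals the true label."""
    predictions = [1 if p > threshold else 0 for p in y_prob]
    correct = sum(int(pred == y) for pred, y in zip(predictions, y_true))
    return correct / len(y_true)


# Illustrative values only: two of the four predictions are correct.
print(binary_accuracy([1, 0, 1, 0], [0.9, 0.2, 0.4, 0.7]))  # 0.5
```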

## **6. Make Predictions**

In [6]:
# Generate predictions on new data
new_review = "This movie was fantastic! Great acting and an engaging plot."  # Example
new_review = text_to_word_sequence(new_review)
# imdb.load_data() shifts every word index by 3 (0 = padding, 1 = start, 2 = unknown),
# so apply the same offset and drop words outside the 10,000-word vocabulary
encoded_review = [word_index[word] + 3 for word in new_review
                  if word in word_index and word_index[word] + 3 < 10000]
encoded_review = pad_sequences([encoded_review], maxlen=100)
prediction = model.predict(encoded_review)

# Interpret the prediction
if prediction[0][0] > 0.5:
    print("Positive Sentiment")
else:
    print("Negative Sentiment")

Positive Sentiment


**Predicting Sentiment on a New Review**

This code takes a new movie review as input and uses the trained RNN model to predict its sentiment (positive or negative). Here's a step-by-step breakdown:

1. **Preprocess the Review:**
   - **`new_review = text_to_word_sequence(new_review)`:** Lowercases the review, strips punctuation, and splits it into a list of individual words.
   - **`encoded_review = [word_index[word] + 3 for word in new_review if ...]`:** Converts the words to the integer indices the model was trained on. The `+ 3` matches the offset that `imdb.load_data()` applies by default, and the condition filters out words that are missing from `word_index` or fall outside the 10,000-word vocabulary (indices of 10,000 or more would be out of range for the Embedding layer).
   - **`encoded_review = pad_sequences([encoded_review], maxlen=100)`:** Pads or truncates the encoded review to a length of 100 to match the model's input shape.

2. **Generate Prediction:**
   - **`prediction = model.predict(encoded_review)`:** Feeds the preprocessed review to the trained RNN model, generating a probability between 0 and 1 (higher probability indicates positive sentiment).

3. **Interpret the Prediction:**
   - **`if prediction[0][0] > 0.5: ... else: ...`:** Checks the probability to determine the sentiment. If it's greater than 0.5, the review is considered positive; otherwise, it's considered negative.
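The cell above drops words the model does not know. An alternative that mirrors the default `imdb.load_data()` encoding more closely is to keep such words as the out-of-vocabulary index and to prepend the start index. The sketch below shows this variant with a made-up `toy_word_index`; the helper name `encode_review` is our own, and the constants assume the loader's default settings.

```python
START, UNKNOWN, INDEX_OFFSET = 1, 2, 3  # imdb.load_data() defaults (assumed here)


def encode_review(words, word_index, num_words=10000):
    """Encode a list of words the way the training data is assumed to be encoded."""
    encoded = [START]
    for word in words:
        index = word_index.get(word)
        if index is not None and index + INDEX_OFFSET < num_words:
            encoded.append(index + INDEX_OFFSET)
        else:
            encoded.append(UNKNOWN)  # unseen or too-rare word
    return encoded


# Toy vocabulary for illustration only; not the real IMDB word index.
toy_word_index = {"this": 1, "movie": 2, "was": 3, "fantastic": 4}
print(encode_review(["this", "movie", "was", "superb"], toy_word_index))
# [1, 4, 5, 6, 2]
```

With the real data you would pass the output of `text_to_word_sequence` and the notebook's `word_index`, then pad the result with `pad_sequences` as before.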


## **Conclusion**

This code demonstrates how you can use a trained RNN model to make predictions on unseen textual data. Remember that the quality of your predictions will depend on the size and representativeness of your training dataset, as well as the model's architecture.

## **Extending Your Knowledge**

* Explore more complex RNN architectures like LSTM and GRU cells: https://keras.io/api/layers/recurrent_layers/
* Learn about pre-trained word embeddings to improve your model: https://blog.keras.io/using-pre-trained-word-embeddings-in-a-keras-model.html
* Experiment with different hyperparameters and regularization techniques: https://www.tensorflow.org/tutorials/keras/overfit_and_underfit