# Understanding RNNs and LSTMs for Sentiment Analysis - Template

## What is a Recurrent Neural Network (RNN)?
RNNs are neural networks designed for sequential data. Unlike feedforward networks, they maintain a hidden state, an internal memory that carries information from earlier inputs forward to later ones.

![RNN Basic Structure](https://www.mdpi.com/information/information-15-00517/article_deploy/html/images/information-15-00517-g001-550.jpg)

### How RNNs Process Text:
1. Words are converted to numeric vectors (embeddings)
2. Each word is processed sequentially
3. Hidden state is updated at each step
4. Final output depends on all previous inputs
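
The four steps above can be sketched in plain NumPy. This is only a minimal illustration, not how Keras implements its layers: the toy sizes, the random weights, and the name `rnn_forward` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

vocab_size, embed_dim, hidden_dim = 10, 4, 3  # toy sizes (assumed)

# Step 1: an embedding table maps each word ID to a dense vector.
embedding = rng.normal(size=(vocab_size, embed_dim))

# Randomly initialised weights of a simple (Elman-style) RNN cell.
W_xh = rng.normal(size=(embed_dim, hidden_dim)) * 0.1
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1
b_h = np.zeros(hidden_dim)


def rnn_forward(word_ids):
    """Return the hidden state after every time step."""
    h = np.zeros(hidden_dim)  # initial hidden state (the "memory")
    states = []
    for word_id in word_ids:  # Step 2: one word at a time
        x = embedding[word_id]
        h = np.tanh(x @ W_xh + h @ W_hh + b_h)  # Step 3: update hidden state
        states.append(h)
    return np.stack(states)


states = rnn_forward([3, 1, 7])
final_state = states[-1]  # Step 4: depends on every word seen so far
print(states.shape)  # (3, 3)
```

Because each hidden state feeds into the next, changing the first word ID changes `final_state` even though the later words stay the same.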

## Long Short-Term Memory (LSTM)
LSTMs are a variant of the RNN designed to mitigate the "vanishing gradient" problem, which lets them learn long-term dependencies that plain RNNs tend to lose.

![LSTM Cell Structure](https://miro.medium.com/v2/resize:fit:1156/1*laH0_xXEkFE0lKJu54gkFQ.png)

![LSTM](https://databasecamp.de/wp-content/uploads/lstm-architecture-1024x709.png)

### Key Components of LSTM:
1. **Forget Gate**: Decides what information to discard
2. **Input Gate**: Decides what new information to store
3. **Cell State**: Long-term memory component
4. **Output Gate**: Decides what parts of cell state to output
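
To make the roles of these components concrete, here is a rough NumPy sketch of a single LSTM time step. The function name `lstm_step`, the toy sizes, and the order in which the four transforms are packed into the weight matrices are assumptions for illustration; Keras stores its weights in its own layout.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (input_dim, 4 * units), U: (units, 4 * units), b: (4 * units,)
    hold the parameters of the four transforms side by side.
    """
    units = h_prev.shape[0]
    z = x @ W + h_prev @ U + b
    f = sigmoid(z[0 * units:1 * units])  # forget gate: what to discard
    i = sigmoid(z[1 * units:2 * units])  # input gate: what to store
    g = np.tanh(z[2 * units:3 * units])  # candidate values to store
    o = sigmoid(z[3 * units:4 * units])  # output gate: what to expose
    c = f * c_prev + i * g               # cell state: long-term memory
    h = o * np.tanh(c)                   # hidden state / output
    return h, c


rng = np.random.default_rng(0)
input_dim, units = 4, 3  # toy sizes (assumed)
W = rng.normal(size=(input_dim, 4 * units)) * 0.1
U = rng.normal(size=(units, 4 * units)) * 0.1
b = np.zeros(4 * units)

h, c = np.zeros(units), np.zeros(units)
for x in rng.normal(size=(5, input_dim)):  # a sequence of 5 input vectors
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape, c.shape)  # (3,) (3,)
```

The cell state is updated additively (`f * c_prev + i * g`), so when the forget gate stays close to 1 and the input gate close to 0, old information passes through almost unchanged.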

## Our Task: Sentiment Analysis
We'll use an LSTM network to analyze movie reviews and classify them as positive or negative.

![Sentiment Analysis Process](https://miro.medium.com/max/700/1*SICYykT7ybua1gVJDNlajw.png)

### Process Flow:
1. Text → Numbers (Embedding)
2. LSTM processes word sequence
3. Dense layers interpret LSTM output
4. Final prediction (0 = Negative, 1 = Positive)

Follow the TODOs in each section to complete the implementation!

# Sentiment Analysis using RNN (LSTM) - Template

This notebook provides a template for building a Recurrent Neural Network (LSTM) for sentiment analysis using the IMDB movie reviews dataset.

### Learning Objectives:
1. Understand RNN/LSTM architecture
2. Learn text preprocessing for deep learning
3. Build and train an LSTM model
4. Evaluate model performance
5. Make predictions on new text

### Dataset:
We'll use the IMDB dataset, which contains 50,000 movie reviews labeled as positive (1) or negative (0), split evenly into 25,000 training and 25,000 test reviews.

### Instructions:
Follow the TODOs in each section to complete the implementation.

## 1. Import Required Libraries and Setup

TODO: Import the necessary libraries:
- TensorFlow and Keras
- NumPy
- Matplotlib

### Text Preprocessing Steps

Before we can use text data in our LSTM, we need to:

1. **Convert Words to Numbers**
   ```python
   # Example (illustrative IDs): each number represents a word
   text = "Great movie!"
   encoded = [143, 256, 1]
   ```

2. **Make Sequences Same Length**
   ```python
   # Example: pad with zeros up to a fixed length of 5
   padded = [143, 256, 1, 0, 0]
   ```

3. **Create Word Embeddings**
   ```python
   # Example: word ID 143 is mapped to a dense vector
   embedding_143 = [0.2, -0.5, 0.1]  # ...one value per embedding dimension
   ```

![Text Processing](https://raw.githubusercontent.com/dair-ai/ml-visuals/master/images/text-preprocessing.png)
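
The three preprocessing steps can be tried end to end on a toy example. The vocabulary, the helper names `encode` and `pad`, and the sizes below are hypothetical and chosen only for illustration; in the notebook itself, Keras utilities handle these steps.

```python
import numpy as np

# Hypothetical toy vocabulary; 0 is reserved for padding.
word_to_id = {"great": 1, "movie": 2, "terrible": 3}
max_length = 5      # assumed target length
embedding_dim = 4   # assumed embedding size


def encode(text):
    """Step 1: map each known word to its integer ID."""
    words = text.lower().replace("!", "").split()
    return [word_to_id[w] for w in words if w in word_to_id]


def pad(ids, length):
    """Step 2: truncate or right-pad with zeros to a fixed length."""
    return ids[:length] + [0] * max(0, length - len(ids))


# Step 3: an embedding is a lookup table with one row per word ID.
# Keras learns this table during training; here it is random.
rng = np.random.default_rng(0)
embedding_table = rng.normal(size=(len(word_to_id) + 1, embedding_dim))

ids = pad(encode("Great movie!"), max_length)
vectors = embedding_table[ids]
print(ids)            # [1, 2, 0, 0, 0]
print(vectors.shape)  # (5, 4)
```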

In [None]:
# TODO: Import required libraries
# Hint: You need tensorflow, numpy, and matplotlib

# Print TensorFlow version

## 2. Load and Preprocess the IMDB Dataset

TODO:
1. Set vocabulary size and maximum sequence length
2. Load the IMDB dataset
3. Pad sequences to ensure uniform length

In [None]:
# TODO: Set parameters
# vocab_size = ...  # Only keep top 10k words
# max_length = ...  # Maximum length of each review
# trunc_type = ...  # Where to truncate
# padding_type = ... # Where to add padding

# TODO: Load the IMDB dataset
# Hint: Use imdb.load_data(num_words=vocab_size)

# TODO: Pad the sequences
# Hint: Use pad_sequences()

# Print shapes of training and testing sets
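
As a rough sketch of the padding step, the snippet below runs `pad_sequences` on two made-up sequences instead of the real dataset, so it works without a download. The parameter values are assumptions, and `max_length` is kept tiny so the effect is visible. On older TensorFlow releases the same function lives under `tf.keras.preprocessing.sequence`.

```python
import tensorflow as tf

vocab_size = 10000      # keep only the 10,000 most frequent words
max_length = 6          # tiny value for the demo; real reviews need more
trunc_type = "post"     # cut tokens from the end of long reviews
padding_type = "post"   # add zeros at the end of short reviews

# In the notebook the sequences would come from something like
# tf.keras.datasets.imdb.load_data(num_words=vocab_size), which downloads
# the dataset. Two made-up sequences stand in for it here.
toy_sequences = [[1, 14, 22, 16], [1, 13, 28, 9, 4, 7, 12, 5]]

padded = tf.keras.utils.pad_sequences(
    toy_sequences, maxlen=max_length, padding=padding_type, truncating=trunc_type
)
print(padded.shape)  # (2, 6)
print(padded)
```

The short sequence gains two trailing zeros and the long one loses its last two tokens, so every row ends up with exactly `max_length` entries.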

## 3. Build the LSTM Model

TODO: Create a Sequential model with:
1. Embedding layer
2. One or more LSTM layers
3. Dense layers for classification

### LSTM Model Architecture

You'll build a model with these layers:

1. **Embedding Layer**
   ```python
   Embedding(vocab_size, embedding_dim)  # input_length=max_length is deprecated in Keras 3 and can be omitted
   ```
   ![Word Embedding](https://arize.com/wp-content/uploads/2022/06/blog-king-queen-embeddings.jpg)

2. **LSTM Layers**
   ```python
   LSTM(units=64, return_sequences=True)  # First LSTM
   LSTM(units=32)                         # Second LSTM
   ```
   ![Stacked LSTM](https://www.researchgate.net/publication/328819708/figure/fig4/AS:845838377046027@1578674992640/Simple-LSTM-Vs-Stacked-LSTM.png)

3. **Dense Layers**
   ```python
   ```python
   Dense(16, activation='relu')     # Hidden layer
   Dense(1, activation='sigmoid')   # Output layer
   ```
   ![Dense Layers](https://i.sstatic.net/dpp2W.png)

Complete the TODOs to implement this architecture!

In [None]:
# TODO: Set embedding dimension
# embedding_dim = ...

# TODO: Build the model
# model = Sequential([
#     # Add Embedding layer
#     # Add LSTM layer(s)
#     # Add Dense layer(s)
# ])

# TODO: Compile the model
# Hint: Use binary_crossentropy loss and adam optimizer

# Print model summary
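
One possible way to assemble the layers described above is sketched here. The values of `max_length` and `embedding_dim` are assumptions, and the explicit `tf.keras.Input` layer is a choice made for this example so the model is built before `summary()` is called.

```python
import tensorflow as tf

layers = tf.keras.layers

vocab_size = 10000   # assumed, matching the "top 10k words" comment
max_length = 200     # assumed review length after padding
embedding_dim = 32   # assumed embedding size

model = tf.keras.Sequential([
    tf.keras.Input(shape=(max_length,)),
    layers.Embedding(vocab_size, embedding_dim),
    layers.LSTM(64, return_sequences=True),  # passes every time step on
    layers.LSTM(32),                         # keeps only the last step
    layers.Dense(16, activation="relu"),
    layers.Dense(1, activation="sigmoid"),   # probability of "positive"
])

model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])
model.summary()
```

The first LSTM needs `return_sequences=True` because the second LSTM expects a full sequence as input rather than a single vector.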

## 4. Train the Model

TODO:
1. Set up early stopping callback
2. Train the model with validation split

In [None]:
# TODO: Define callbacks
# Hint: Use EarlyStopping

# TODO: Train the model
# Hint: Use model.fit with validation_split
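
The snippet below sketches how early stopping and a validation split fit together. To stay self-contained it trains a deliberately tiny model on random stand-in data, so the metrics it produces are meaningless; the model size, `patience=2`, and the other hyperparameters are assumptions for illustration.

```python
import numpy as np
import tensorflow as tf

layers = tf.keras.layers

# Random stand-in data so the snippet runs without downloading IMDB.
rng = np.random.default_rng(0)
x_train = rng.integers(1, 1000, size=(200, 20))
y_train = rng.integers(0, 2, size=(200,))

model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    layers.Embedding(1000, 8),
    layers.LSTM(8),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(loss="binary_crossentropy", optimizer="adam", metrics=["accuracy"])

# Stop when validation loss has not improved for 2 epochs in a row and
# roll back to the best weights seen so far.
early_stopping = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=2, restore_best_weights=True
)

history = model.fit(
    x_train, y_train,
    epochs=5,
    batch_size=32,
    validation_split=0.2,  # the last 20% of the data is held out
    callbacks=[early_stopping],
    verbose=0,
)
print(sorted(history.history))
```

`history.history` is a dictionary of per-epoch values, which is what the plotting step in the next section consumes.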

## 5. Evaluate the Model

TODO:
1. Plot training history
2. Evaluate model on test set

In [None]:
# TODO: Plot training history
# Hint: Use plt.plot() for accuracy and loss

# TODO: Evaluate on test set
# Hint: Use model.evaluate()
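
A small helper along these lines can draw the accuracy and loss curves. The name `plot_history` is hypothetical, and the numbers in `dummy_history` are placeholders that only exercise the function; they are not real training results.

```python
import matplotlib.pyplot as plt


def plot_history(history_dict):
    """Plot training vs. validation accuracy and loss side by side."""
    fig, (ax_acc, ax_loss) = plt.subplots(1, 2, figsize=(10, 4))
    for ax, metric in ((ax_acc, "accuracy"), (ax_loss, "loss")):
        ax.plot(history_dict[metric], label=f"train {metric}")
        ax.plot(history_dict[f"val_{metric}"], label=f"validation {metric}")
        ax.set_xlabel("epoch")
        ax.set_ylabel(metric)
        ax.legend()
    fig.tight_layout()
    return fig


# Placeholder numbers purely to exercise the function; not real results.
dummy_history = {
    "accuracy": [0.6, 0.7, 0.8],
    "val_accuracy": [0.6, 0.65, 0.7],
    "loss": [0.7, 0.6, 0.5],
    "val_loss": [0.7, 0.65, 0.66],
}
fig = plot_history(dummy_history)
```

In the notebook you would pass `history.history` from `model.fit` instead of the placeholder dictionary. For the test set, `model.evaluate(x_test, y_test)` returns the loss followed by any metrics given to `compile`.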

## 6. Make Predictions

TODO:
1. Create a function to encode new text
2. Test the model with sample reviews

In [None]:
# TODO: Get the word index
# Hint: Use imdb.get_word_index()

# TODO: Create a function to encode text
# def encode_text(text):
#     # Your code here
#     pass

# TODO: Test with sample reviews
# sample_reviews = [
#     "This movie was fantastic! I really loved it.",
#     "I hated this movie. It was terrible."
# ]

# TODO: Make predictions
# Hint: Use model.predict()
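
One way `encode_text` could look is sketched below. It assumes the default `imdb.load_data()` conventions, where word ranks are shifted by 3 and IDs 0, 1, and 2 stand for padding, sequence start, and out-of-vocabulary words. The small `word_index` dictionary is a made-up stand-in for the mapping returned by `imdb.get_word_index()`, and `max_length` is kept tiny for the demo.

```python
import re

vocab_size = 10000   # assumed, as in the loading step
max_length = 8       # tiny value for the demo
INDEX_FROM = 3       # load_data() shifts word ranks by 3 by default
START_ID, OOV_ID, PAD_ID = 1, 2, 0

# Made-up stand-in for imdb.get_word_index(); the ranks are illustrative.
word_index = {"this": 11, "movie": 17, "was": 13, "fantastic": 774}


def encode_text(text):
    """Turn raw text into one padded row of word IDs."""
    words = re.findall(r"[a-z']+", text.lower())
    ids = [START_ID]
    for word in words:
        rank = word_index.get(word)
        word_id = rank + INDEX_FROM if rank is not None else OOV_ID
        ids.append(word_id if word_id < vocab_size else OOV_ID)
    ids = ids[:max_length]                           # truncate long inputs
    return ids + [PAD_ID] * (max_length - len(ids))  # post-pad short ones


encoded = encode_text("This movie was fantastic! I really loved it.")
print(encoded)  # [1, 14, 20, 16, 777, 2, 2, 2]
```

The padding and truncation side used here should match whatever was chosen when the training data was padded. A batch built from such rows, for example `np.array([encoded])`, can then be passed to `model.predict()`, which returns a probability between 0 and 1 for each review.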