# 09. Sequence Models: RNN, LSTM, GRU
Sequence models are designed for sequential data such as text, audio, or time series, where the order of the elements carries meaning. In Natural Language Processing (NLP) they are used for tasks such as language modeling, machine translation, and speech recognition. This notebook covers three common recurrent architectures: Recurrent Neural Networks (RNNs), Long Short-Term Memory (LSTM) networks, and Gated Recurrent Units (GRUs).

### What You'll Learn:
- Why sequence models for NLP
- RNN limitations
- LSTM architecture
- GRU architecture
- Practical implementation

## Why Sequence Models?

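**Problems with traditional ML** (for example, bag-of-words features fed to a classifier):
- No memory of previous inputs
- Can't handle variable-length sequences without squeezing them into fixed-size features
- Loses word order and other temporal relationships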

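**Solution: Recurrent Neural Networks (RNNs)**
- Keep a hidden state that acts as a memory of earlier inputs
- Process a sequence one element at a time, reusing the same weights at every step
- A natural fit for text, where meaning depends on word order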
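
The idea of a hidden state can be sketched in a few lines of plain Python. This is a minimal illustration, not how Keras implements it: the network has a single hidden unit, and the weights `w_x`, `w_h`, and `b` are arbitrary values chosen for the example.

```python
import math

def rnn_forward(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Run a one-unit RNN over a sequence and return every hidden state."""
    h = 0.0  # initial state: nothing seen yet
    states = []
    for x in inputs:
        # The same weights are reused at every step; h carries the past forward
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states

short = rnn_forward([1.0, 0.0])
long = rnn_forward([1.0, 0.0, 0.0, 1.0, -1.0])
reordered = rnn_forward([0.0, 1.0])

print(len(short), len(long))       # one hidden state per time step
print(short[-1] != reordered[-1])  # same values, different order, different state
```

The same function handles sequences of any length, and swapping the order of the inputs changes the final state, which is exactly the information a bag-of-words model throws away.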

## RNN Problem: Vanishing Gradient

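- During backpropagation through time, the gradient is multiplied by a similar factor at every time step, so over long sequences it can shrink toward zero
- Early time steps then receive almost no learning signal, so the model struggles to learn long-term dependencies
- Solution: gated architectures such as LSTM and GRU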
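
To make the shrinking gradient concrete, the sketch below tracks how strongly the hidden state at step t depends on the initial state in a one-unit tanh RNN. The weights `w_h` and `w_x` and the constant input sequence are assumptions chosen for illustration; real networks use weight matrices, but the repeated multiplication works the same way.

```python
import math

def gradient_through_time(inputs, w_h=0.9, w_x=0.5):
    """Return |d h_t / d h_0| after each step of a one-unit tanh RNN."""
    h = 0.0
    grad = 1.0
    magnitudes = []
    for x in inputs:
        h = math.tanh(w_h * h + w_x * x)
        # Chain rule: d h_t / d h_{t-1} = w_h * (1 - h_t^2)
        grad *= w_h * (1.0 - h * h)
        magnitudes.append(abs(grad))
    return magnitudes

mags = gradient_through_time([1.0] * 50)
for step in (1, 10, 50):
    print(f"step {step:2d}: {mags[step - 1]:.2e}")
```

Each step multiplies the gradient by a factor below 1, so the printed magnitudes fall off exponentially with the number of steps.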

## LSTM (Long Short-Term Memory)

**Key Components**:
- **Input Gate**: What new info to store
- **Forget Gate**: What old info to forget
- **Output Gate**: What to output
- **Cell State**: Long-term memory

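Advantage: the cell state is updated additively under gate control, so information and gradients can survive across many time steps. This lets the model learn dependencies between distant positions in a sequence.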
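
The four components above can be sketched as a single time step of a one-unit LSTM. This is a simplified illustration using scalars instead of weight matrices. The `params` values are hypothetical, picked so that the forget gate stays almost fully open and the input gate almost fully shut.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One time step of a one-unit LSTM. p maps each gate to (w_x, w_h, b)."""
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much new information to store
    g = math.tanh(pre("candidate"))   # the new information itself
    o = sigmoid(pre("output"))        # how much of the cell state to expose

    c = f * c_prev + i * g            # cell state: long-term memory
    h = o * math.tanh(c)              # hidden state: this step's output
    return h, c

# Hypothetical weights: forget gate biased open, input gate biased shut
params = {
    "forget": (0.0, 0.0, 6.0),
    "input": (0.0, 0.0, -6.0),
    "candidate": (1.0, 0.0, 0.0),
    "output": (0.0, 0.0, 6.0),
}

h, c = 0.0, 1.0  # start with a value already stored in the cell
for x in [0.3, -0.7, 0.5] * 10:  # 30 time steps
    h, c = lstm_step(x, h, c, params)
print(round(c, 2))
```

With these gate settings the cell still holds most of its starting value after 30 steps. In a trained LSTM the gates are learned, so the network decides for itself when to keep, overwrite, or expose the stored value.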

```python
import tensorflow as tf
from tensorflow.keras import Input
from tensorflow.keras.layers import LSTM, Dense, Embedding, Dropout
from tensorflow.keras.models import Sequential

print('='*60)
print('LSTM MODEL FOR SENTIMENT CLASSIFICATION')
print('='*60)

# Create LSTM model
# An explicit Input layer builds the model so that summary() can report shapes.
# It replaces the Embedding input_length argument, which newer Keras versions deprecate.
model = Sequential([
    Input(shape=(100,)),                      # 100 token IDs per example
    Embedding(1000, 64),                      # 1000-word vocabulary -> 64-dim vectors
    LSTM(128, return_sequences=True),         # Outputs the full sequence for the next LSTM
    Dropout(0.2),                             # Dropout for regularization
    LSTM(64),                                 # Outputs only the final hidden state
    Dense(32, activation='relu'),             # Dense layer
    Dense(1, activation='sigmoid')            # Output layer: probability of the positive class
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

print('\nModel Architecture:')
model.summary()

print('\n✓ LSTM model created')
print('\nModel Explanation:')
print('- Embedding: Converts word IDs to vectors')
print('- LSTM layer 1: Learns long-term dependencies')
print('- LSTM layer 2: Refines learned patterns')
print('- Dense layers: Classification')
print('- Output: Probability (0=negative, 1=positive)')
```
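
The LSTM model above expects each example as 100 integer word IDs in the range 0 to 999, matching its `Embedding(1000, 64)` layer. The notebook does not show how text gets into that form, so here is one possible sketch in plain Python. The helper names `build_vocab` and `encode`, the reserved IDs, and the two sample sentences are all assumptions made for illustration.

```python
from collections import Counter

PAD_ID, UNK_ID = 0, 1  # hypothetical convention for this sketch

def build_vocab(texts, vocab_size=1000):
    """Map the most frequent words to IDs 2..vocab_size-1."""
    counts = Counter(word for text in texts for word in text.lower().split())
    most_common = counts.most_common(vocab_size - 2)  # reserve IDs 0 and 1
    return {word: idx for idx, (word, _) in enumerate(most_common, start=2)}

def encode(text, vocab, max_len=100):
    """Turn text into exactly max_len integer IDs."""
    ids = [vocab.get(word, UNK_ID) for word in text.lower().split()]
    ids = ids[:max_len]                           # truncate long texts
    return ids + [PAD_ID] * (max_len - len(ids))  # pad short ones at the end

texts = ["great movie loved it", "terrible plot and terrible acting"]
vocab = build_vocab(texts)
batch = [encode(t, vocab) for t in texts]

print(len(batch), len(batch[0]))  # 2 100
print(batch[0][:6])
```

Every row has the same length and every ID stays below the vocabulary size, which is what the `Embedding` layer requires. A real pipeline would more likely use a library tokenizer, and the choice of padding at the end rather than the start is arbitrary here.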


## GRU (Gated Recurrent Unit)

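**Similar to LSTM, but**:
- Simpler architecture: two gates (update and reset) and no separate cell state, so fewer parameters
- Usually faster to train
- Often similar performance
- A good alternative to LSTM when training cost or model size matters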
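
The "fewer parameters" claim can be checked with a little arithmetic for the layer sizes used in this notebook (64-dimensional embeddings feeding 128 units, then 64 units). The helper functions below are hypothetical. The GRU formula assumes the Keras default `reset_after=True`, which keeps two bias vectors per weight set, and other implementations may count slightly differently.

```python
def lstm_params(input_dim, units):
    # 4 weight sets (3 gates + candidate), each with input weights,
    # recurrent weights, and one bias vector
    return 4 * (units * (input_dim + units) + units)

def gru_params(input_dim, units):
    # 3 weight sets (2 gates + candidate); assumes reset_after=True,
    # which keeps two bias vectors per set
    return 3 * (units * (input_dim + units) + 2 * units)

# Recurrent layers only: 64-dim embeddings -> 128 units -> 64 units
lstm_total = lstm_params(64, 128) + lstm_params(128, 64)
gru_total = gru_params(64, 128) + gru_params(128, 64)

print(lstm_total)                        # 148224
print(gru_total)                         # 111744
print(round(gru_total / lstm_total, 2))  # 0.75
```

Under these assumptions the GRU stack needs about three quarters of the recurrent parameters of the LSTM stack, because it has three weight sets per layer instead of four.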

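```python
from tensorflow.keras import Input
from tensorflow.keras.layers import GRU

# Create GRU model: same layout as the LSTM model, with GRU layers swapped in
# (Sequential, Embedding, Dropout, and Dense are imported in the previous cell)
model_gru = Sequential([
    Input(shape=(100,)),                      # 100 token IDs per example
    Embedding(1000, 64),
    GRU(128, return_sequences=True),          # Outputs the full sequence for the next GRU
    Dropout(0.2),
    GRU(64),                                  # Outputs only the final hidden state
    Dense(32, activation='relu'),
    Dense(1, activation='sigmoid')
])

model_gru.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

print('GRU Model Created')
print('\nComparison:')
# Rule of thumb only: neither architecture is consistently better, so compare both on your data
print('LSTM: More parameters, slower, better for complex patterns')
print('GRU: Fewer parameters, faster, simpler alternative')
```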

