# Recurrent Neural Network (RNN) Implementation

This notebook demonstrates an end-to-end implementation of a Recurrent Neural Network that classifies text sequences by sentiment.

## Dataset
We'll use the IMDB movie reviews dataset (25,000 training and 25,000 test reviews, each labeled positive or negative) for sentiment analysis.

## Topics Covered:
1. Sequential Data Processing
2. RNN Architecture
3. Text Tokenization and Padding
4. Sentiment Analysis
5. Model Evaluation

## 1. Import Libraries

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense, Embedding, Dropout
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.callbacks import EarlyStopping

print(f"TensorFlow Version: {tf.__version__}")
print(f"Keras Version: {keras.__version__}")

## 2. Load and Explore Data

In [None]:
# Load IMDB dataset
vocab_size = 10000
max_length = 200

(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=vocab_size)

print(f"Training sequences: {len(X_train)}")
print(f"Test sequences: {len(X_test)}")
print(f"\nSample review (encoded): {X_train[0][:20]}...")
print(f"Label: {y_train[0]} (1=positive, 0=negative)")
print(f"\nReview lengths (first 10): {[len(x) for x in X_train[:10]]}")

## 3. Visualize Data Distribution

In [None]:
# Class distribution
plt.figure(figsize=(12, 5))

plt.subplot(1, 2, 1)
unique, counts = np.unique(y_train, return_counts=True)
plt.bar(['Negative', 'Positive'], counts)
plt.title('Sentiment Distribution')
plt.ylabel('Count')
plt.grid(axis='y', alpha=0.3)

# Sequence length distribution
plt.subplot(1, 2, 2)
seq_lengths = [len(x) for x in X_train]
plt.hist(seq_lengths, bins=50, edgecolor='black')
plt.xlabel('Sequence Length')
plt.ylabel('Frequency')
plt.title('Review Length Distribution')
plt.axvline(max_length, color='red', linestyle='--', label=f'Max Length: {max_length}')
plt.legend()
plt.grid(alpha=0.3)

plt.tight_layout()
plt.show()
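
The histogram shows how many reviews the red `max_length` line cuts off. A small helper can put a number on that. The sketch below is illustrative only: `coverage` is a hypothetical helper and `toy_lengths` is made-up data, not taken from the IMDB dataset.

```python
def coverage(lengths, max_length):
    """Fraction of sequences that fit within max_length without truncation."""
    return sum(1 for n in lengths if n <= max_length) / len(lengths)

# Made-up review lengths, for illustration only
toy_lengths = [50, 120, 180, 200, 240, 310, 95, 600]

for cutoff in (100, 200, 400):
    print(cutoff, coverage(toy_lengths, cutoff))
```

In the notebook, `coverage(seq_lengths, max_length)` would report the share of training reviews that are kept whole at the chosen cutoff.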

## 4. Data Preprocessing - Padding Sequences

In [None]:
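# Pad sequences to the same length.
# padding='pre' places the zeros before the tokens, so the RNN's final hidden
# state is computed from real words rather than from trailing padding.
# truncating='post' keeps the first max_length tokens of longer reviews.
X_train = pad_sequences(X_train, maxlen=max_length, padding='pre', truncating='post')
X_test = pad_sequences(X_test, maxlen=max_length, padding='pre', truncating='post')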

print(f"Training data shape: {X_train.shape}")
print(f"Test data shape: {X_test.shape}")
print(f"\nSample padded sequence:\n{X_train[0]}")
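
To make the `padding` and `truncating` arguments concrete, the sketch below reimplements the core behaviour for a single sequence in plain Python. `pad_sequence` is a hypothetical helper written for illustration, not part of Keras, and it assumes a padding value of 0.

```python
def pad_sequence(seq, maxlen, padding="pre", truncating="pre", value=0):
    """Illustrative re-implementation of padding/truncation for one sequence."""
    if len(seq) > maxlen:
        # 'pre' drops tokens from the start, 'post' drops them from the end.
        seq = seq[-maxlen:] if truncating == "pre" else seq[:maxlen]
    pad = [value] * (maxlen - len(seq))
    # 'pre' puts the padding before the tokens, 'post' puts it after.
    return pad + list(seq) if padding == "pre" else list(seq) + pad

short = [5, 6, 7]
long_seq = [1, 2, 3, 4, 5, 6]

print(pad_sequence(short, 5, padding="post"))       # [5, 6, 7, 0, 0]
print(pad_sequence(short, 5, padding="pre"))        # [0, 0, 5, 6, 7]
print(pad_sequence(long_seq, 4, truncating="post")) # [1, 2, 3, 4]
print(pad_sequence(long_seq, 4, truncating="pre"))  # [3, 4, 5, 6]
```

With `padding="post"` a recurrent layer reads the zeros last, after the real tokens; with `padding="pre"` the real tokens come last.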

## 5. Build RNN Architecture

In [None]:
# Initialize the RNN
embedding_dim = 128

model = Sequential([
    # Embedding layer
    Embedding(input_dim=vocab_size, output_dim=embedding_dim, input_length=max_length),
    
    # RNN layers
    SimpleRNN(128, return_sequences=True),
    Dropout(0.3),
    
    SimpleRNN(64),
    Dropout(0.3),
    
    # Dense layers
    Dense(32, activation='relu'),
    Dropout(0.3),
    
    # Output layer
    Dense(1, activation='sigmoid')
])

# Display model architecture
model.summary()
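
The parameter counts reported by `model.summary()` can be checked by hand. The sketch below recomputes them from the layer sizes used above, assuming the standard layouts: one vector per token for the embedding, input kernel plus recurrent kernel plus bias for `SimpleRNN`, and weights plus bias for `Dense`. The helper names are hypothetical.

```python
vocab_size, embedding_dim = 10000, 128

def embedding_params(vocab, dim):
    # One dim-sized vector per token id
    return vocab * dim

def simple_rnn_params(input_dim, units):
    # Input kernel + recurrent kernel + bias
    return input_dim * units + units * units + units

def dense_params(input_dim, units):
    # Weights + bias
    return input_dim * units + units

layers = {
    "embedding": embedding_params(vocab_size, embedding_dim),
    "simple_rnn_128": simple_rnn_params(embedding_dim, 128),
    "simple_rnn_64": simple_rnn_params(128, 64),
    "dense_32": dense_params(64, 32),
    "dense_1": dense_params(32, 1),
}

for name, count in layers.items():
    print(f"{name:15s} {count:>10,}")
total = sum(layers.values())
print(f"{'total':15s} {total:>10,}")   # 1,327,361
```

Dropout layers add no parameters. Almost all of the weights sit in the embedding table, not in the recurrent layers.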

## 6. Compile the Model

In [None]:
# Compile the model
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

print("Model Compiled Successfully")
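
As a rough illustration of what `binary_crossentropy` measures, the sketch below computes the mean binary cross-entropy by hand. The function is a hypothetical stand-in, the clipping constant `eps` is an assumption chosen for illustration, and the labels and probabilities are made up.

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; eps keeps log() away from 0."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Made-up labels and predicted probabilities of the positive class
labels = [1, 0, 1, 0]
confident_right = [0.9, 0.1, 0.8, 0.2]
confident_wrong = [0.1, 0.9, 0.2, 0.8]

print(binary_crossentropy(labels, confident_right))
print(binary_crossentropy(labels, confident_wrong))
```

The loss grows sharply when the model assigns high probability to the wrong class, which is why it pairs naturally with a sigmoid output.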

## 7. Train the Model

In [None]:
# Early stopping callback
early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

# Train the model
history = model.fit(X_train, y_train,
                    validation_split=0.2,
                    epochs=10,
                    batch_size=128,
                    callbacks=[early_stop],
                    verbose=1)

print("\nTraining Complete!")
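
The `patience=3` rule can be traced on paper. The sketch below is a simplified stand-in for the callback's bookkeeping, assuming any strict decrease in validation loss counts as an improvement. `early_stopping_trace` is a hypothetical helper, and the loss values are made up.

```python
def early_stopping_trace(val_losses, patience=3):
    """Return (stopped_epoch, best_epoch), both 0-indexed.

    Training stops once `patience` consecutive epochs pass without a new
    minimum; best_epoch is the epoch whose weights would be restored.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Ran out of epochs before patience was exhausted
    return len(val_losses) - 1, best_epoch

# Made-up validation losses, one per epoch
hypothetical_losses = [0.60, 0.52, 0.48, 0.50, 0.49, 0.51, 0.47]
print(early_stopping_trace(hypothetical_losses))   # (5, 2)
```

In this trace the run stops at epoch 5 and never reaches the lower loss at epoch 6, which is the trade-off a small `patience` makes.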

## 8. Visualize Training History

In [None]:
# Plot training & validation metrics
plt.figure(figsize=(14, 5))

plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Training Accuracy', marker='o')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy', marker='s')
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend()
plt.grid(True)

plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Training Loss', marker='o')
plt.plot(history.history['val_loss'], label='Validation Loss', marker='s')
plt.title('Model Loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend()
plt.grid(True)

plt.tight_layout()
plt.show()

## 9. Evaluate the Model

In [None]:
# Make predictions
y_pred_prob = model.predict(X_test)
y_pred = (y_pred_prob > 0.5).astype(int).flatten()

# Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)
print(f"Test Accuracy: {accuracy:.4f}")

# Confusion Matrix
cm = confusion_matrix(y_test, y_pred)
plt.figure(figsize=(8, 6))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', 
            xticklabels=['Negative', 'Positive'],
            yticklabels=['Negative', 'Positive'])
plt.title('Confusion Matrix')
plt.ylabel('Actual')
plt.xlabel('Predicted')
plt.show()

# Classification Report
print("\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=['Negative', 'Positive']))
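
The numbers in the classification report follow directly from the four cells of the confusion matrix. The sketch below derives them for the positive class on a made-up set of labels. `binary_metrics` is a hypothetical helper, not a scikit-learn function.

```python
def binary_metrics(y_true, y_pred):
    """Precision, recall, F1 (positive class) and accuracy from raw labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "confusion": [[tn, fp], [fn, tp]],   # rows = actual, columns = predicted
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "accuracy": (tp + tn) / len(pairs),
    }

# Made-up labels and predictions
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 0, 1, 0, 1, 0, 1, 0]
result = binary_metrics(toy_true, toy_pred)
print(result)
```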

## 10. Decode and Display Sample Predictions

In [None]:
# Get word index
word_index = imdb.get_word_index()
reverse_word_index = {value: key for key, value in word_index.items()}

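def decode_review(encoded_review):
    # Indices 0-2 are reserved (padding, start, unknown), so word ranks in
    # word_index are offset by 3 in the encoded reviews.
    return ' '.join([reverse_word_index.get(i - 3, '?') for i in encoded_review if i >= 3])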

# Show some predictions
print("Sample Predictions:\n")
for i in range(5):
    review = decode_review(X_test[i])
    actual = 'Positive' if y_test[i] == 1 else 'Negative'
    predicted = 'Positive' if y_pred[i] == 1 else 'Negative'
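    prob_positive = y_pred_prob[i][0]
    # Confidence in the predicted class: p for Positive, 1 - p for Negative
    confidence = prob_positive if y_pred[i] == 1 else 1 - prob_positive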
    
    print(f"Review {i+1}:")
    print(f"Text: {review[:200]}...")
    print(f"Actual: {actual}, Predicted: {predicted}, Confidence: {confidence:.4f}")
    print("-" * 80)
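
The `i - 3` offset in `decode_review` comes from the defaults of `imdb.load_data`: index 0 is padding, 1 marks the start of a review, 2 stands for out-of-vocabulary words, and real word ranks are shifted by 3. The sketch below replays that scheme on a toy vocabulary. `toy_word_index`, `encode`, and `decode` are hypothetical and exist only for illustration.

```python
PAD, START, OOV, INDEX_FROM = 0, 1, 2, 3

# Toy vocabulary: word -> frequency rank (1 = most frequent)
toy_word_index = {"the": 1, "movie": 2, "was": 3, "great": 4}
toy_reverse = {rank: word for word, rank in toy_word_index.items()}

def encode(words):
    ids = [START]
    for w in words:
        rank = toy_word_index.get(w)
        ids.append(rank + INDEX_FROM if rank is not None else OOV)
    return ids

def decode(ids):
    out = []
    for i in ids:
        if i in (PAD, START):
            continue                      # skip padding and the start marker
        out.append("<unk>" if i == OOV else toy_reverse.get(i - INDEX_FROM, "?"))
    return " ".join(out)

encoded = encode("the movie was truly great".split())
print(encoded)                        # [1, 4, 5, 6, 2, 7]
print(decode(encoded + [PAD, PAD]))   # the movie was <unk> great
```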

## 11. Save the Model

In [None]:
# Save the model in HDF5 format (recent Keras versions recommend the native
# '.keras' extension instead)
model.save('rnn_sentiment_model.h5')
print("Model saved as 'rnn_sentiment_model.h5'")

## Summary

### Key Takeaways:
1. **RNN Architecture**: Processes sequences with temporal dependencies
2. **Embedding Layer**: Converts words to dense vectors
3. **Sequential Processing**: Maintains hidden state across time steps
4. **Sentiment Analysis**: Binary classification of text
5. **Padding**: Uniform sequence lengths for batch processing
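
The embedding lookup and the hidden-state update from the takeaways above can be written out in a few lines of NumPy. This is a minimal sketch with toy sizes and random weights, assuming the usual tanh recurrence `h_t = tanh(x_t W_x + h_{t-1} W_h + b)`. The names `E`, `W_x`, `W_h`, and `simple_rnn_forward` are hypothetical, and the code is not how Keras implements the layer internally.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, embed_dim, units = 10, 4, 3      # toy sizes, not the notebook's

E = rng.normal(scale=0.1, size=(vocab, embed_dim))    # embedding table
W_x = rng.normal(scale=0.5, size=(embed_dim, units))  # input kernel
W_h = rng.normal(scale=0.5, size=(units, units))      # recurrent kernel
b = np.zeros(units)

def simple_rnn_forward(token_ids):
    """Return the hidden state after every time step, shape (T, units)."""
    h = np.zeros(units)
    states = []
    for token in token_ids:
        x_t = E[token]                         # embedding lookup: id -> dense vector
        h = np.tanh(x_t @ W_x + h @ W_h + b)   # same weights reused at every step
        states.append(h)
    return np.stack(states)

states = simple_rnn_forward([3, 1, 4, 1, 5])
print(states.shape)   # (5, 3)
```

Token `1` appears twice in the input but produces two different hidden states, because each state also depends on everything read before it.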

### When to Use RNN:
- Sentiment analysis
- Text classification
- Time series prediction
- Speech recognition
- Machine translation (basic)

### Advantages:
- Handles variable-length sequences
- Maintains temporal information
- Shares parameters across time steps
- Good for short sequences

### Limitations:
- Vanishing gradient problem
- Difficulty learning long-term dependencies
- Slower to train than feedforward networks, because time steps must be processed one after another
- LSTM/GRU often perform better, especially on long sequences
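
The vanishing gradient problem can be seen in a one-unit toy model. The sketch below assumes the scalar recurrence `h_t = tanh(w * h_{t-1})` with no input term and applies the chain rule step by step. `gradient_through_time` is a hypothetical helper, and the values of `w` and `h0` are arbitrary.

```python
import math

def gradient_through_time(w, h0, steps):
    """d h_T / d h_0 for the scalar recurrence h_t = tanh(w * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(steps):
        h = math.tanh(w * h)
        # Chain rule: d tanh(w * h_prev) / d h_prev = w * (1 - tanh(w * h_prev)^2)
        grad *= w * (1.0 - h * h)
    return grad

for steps in (1, 10, 50, 200):
    print(steps, gradient_through_time(w=0.9, h0=0.5, steps=steps))
```

Each step multiplies the gradient by a factor smaller than `|w|`, so with `|w| < 1` the influence of early inputs decays at least geometrically with sequence length.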

### SimpleRNN vs LSTM:
- **SimpleRNN**: Simpler and cheaper per step, but struggles with long sequences
- **LSTM**: More parameters and computation, but its gates help it learn long-term dependencies
- Use SimpleRNN for simple/short sequences
- Use LSTM for complex/long sequences
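
The complexity gap can be quantified by counting weights. The sketch below assumes the standard LSTM formulation (input, forget, and output gates plus a candidate cell), in which each of the four blocks has the same shape as one simple recurrent layer. The helper names are hypothetical.

```python
def simple_rnn_params(input_dim, units):
    # Input kernel + recurrent kernel + bias
    return units * (input_dim + units + 1)

def lstm_params(input_dim, units):
    # Four blocks (three gates + candidate), each shaped like a simple RNN layer
    return 4 * units * (input_dim + units + 1)

input_dim, units = 128, 128
print(simple_rnn_params(input_dim, units))  # 32896
print(lstm_params(input_dim, units))        # 131584
```

For the same number of units, an LSTM layer carries four times the weights of a simple recurrent layer, which accounts for much of its extra training cost.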