# Model 12: RNN (Recurrent Neural Network) – Step-by-Step
Train a **vanilla RNN** on a real dataset: **IMDB movie review sentiment** (positive/negative).

You will learn:
1) What an RNN is (sequence + memory)
2) How to prepare text as padded sequences
3) How to build a `SimpleRNN` model
4) How to train and evaluate
5) Why vanilla RNNs struggle with long-range memory (motivation for LSTM/GRU)
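
To make "sequence + memory" concrete, here is a minimal NumPy sketch of the recurrence a vanilla RNN applies at every timestep: the new hidden state is a `tanh` of the current input mixed with the previous hidden state. The function name `simple_rnn_forward`, the toy sizes, and the random weights are assumptions for illustration, not part of the notebook's model.

```python
import numpy as np

def simple_rnn_forward(x_seq, W_x, W_h, b):
    """Run a vanilla RNN over one sequence and return the final hidden state.

    x_seq: (timesteps, input_dim), W_x: (input_dim, units),
    W_h: (units, units), b: (units,)
    """
    h = np.zeros(W_h.shape[0])                 # memory starts empty
    for x_t in x_seq:                          # one step per token
        h = np.tanh(x_t @ W_x + h @ W_h + b)   # mix new input with old memory
    return h

rng = np.random.default_rng(0)
timesteps, input_dim, units = 5, 3, 4          # toy sizes (assumed)
x_seq = rng.normal(size=(timesteps, input_dim))
W_x = rng.normal(size=(input_dim, units)) * 0.5
W_h = rng.normal(size=(units, units)) * 0.5
b = np.zeros(units)

h_final = simple_rnn_forward(x_seq, W_x, W_h, b)
print(h_final.shape)  # one vector of size `units` summarizes the whole sequence
```

Because each step feeds the previous state back in, the final vector depends on the order of the inputs, which is what lets the model treat text as a sequence and not as a bag of words.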


In [None]:
# If TensorFlow is missing, uncomment:
# !pip -q install tensorflow

import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

print('TensorFlow version:', tf.__version__)

## 1) Load the IMDB dataset

In [None]:
num_words = 10000  # keep ids below 10,000 (the most frequent words); rarer words become the OOV id
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.imdb.load_data(num_words=num_words)

print('Train samples:', len(x_train))
print('Test samples :', len(x_test))
print('Example review length (tokens):', len(x_train[0]))
print('Label (0=neg, 1=pos):', y_train[0])

## 2) Pad sequences

In [None]:
maxlen = 200  # every review is padded or truncated to exactly 200 tokens
x_train = tf.keras.preprocessing.sequence.pad_sequences(x_train, maxlen=maxlen)
x_test  = tf.keras.preprocessing.sequence.pad_sequences(x_test, maxlen=maxlen)

print('Train shape:', x_train.shape)
print('Test shape :', x_test.shape)
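
With its default arguments, `pad_sequences` pads and truncates at the front of each sequence (`padding='pre'`, `truncating='pre'`), so the end of a review is always kept. The pure-Python helper below, `pad_pre`, is a hypothetical stand-in written only to show that behaviour on tiny inputs.

```python
def pad_pre(seq, maxlen, value=0):
    """Left-pad with `value`, or keep only the last `maxlen` tokens."""
    seq = list(seq)[-maxlen:]                    # truncate from the front
    return [value] * (maxlen - len(seq)) + seq   # pad at the front

short_review = pad_pre([5, 6, 7], maxlen=5)
long_review = pad_pre([1, 2, 3, 4, 5, 6], maxlen=4)

print(short_review)  # [0, 0, 5, 6, 7]: zeros added on the left
print(long_review)   # [3, 4, 5, 6]: earliest tokens dropped
```

Pre-padding suits a vanilla RNN because the real tokens sit next to the final timestep, the one whose hidden state feeds the classifier.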

## 3) Build a vanilla RNN model

In [None]:
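model = models.Sequential([
    # Declaring the input shape lets summary() show output shapes and parameter counts.
    # (Embedding's `input_length` argument is deprecated in recent Keras releases.)
    layers.Input(shape=(maxlen,)),
    layers.Embedding(input_dim=num_words, output_dim=64),  # token id -> 64-dim vector
    layers.SimpleRNN(64),                                  # returns the final hidden state only
    layers.Dense(1, activation='sigmoid')                  # probability that the review is positive
])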

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.summary()
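
The parameter counts in the summary can be checked by hand. This is a minimal sketch, assuming the layer sizes used above (vocabulary of 10,000, embedding width 64, 64 recurrent units); the helper name `simple_rnn_model_params` is made up for illustration.

```python
def simple_rnn_model_params(vocab_size, embed_dim, rnn_units):
    """Count trainable parameters for Embedding -> SimpleRNN -> Dense(1)."""
    embedding = vocab_size * embed_dim              # one vector per token id
    rnn = rnn_units * (embed_dim + rnn_units + 1)   # input weights + recurrent weights + bias
    dense = rnn_units + 1                           # one weight per unit + bias
    return embedding, rnn, dense

embedding, rnn, dense = simple_rnn_model_params(10000, 64, 64)
total = embedding + rnn + dense
print(embedding, rnn, dense, total)  # 640000 8256 65 648321
```

Almost all of the parameters sit in the embedding table. The recurrent layer itself is small because the same weights are reused at every timestep.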

## 4) Train

In [None]:
history = model.fit(
    x_train, y_train,
    epochs=3,
    batch_size=128,
    validation_split=0.2,
    verbose=1
)
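
To see what `validation_split=0.2` and `batch_size=128` mean in sample counts, here is a small arithmetic sketch. It assumes the standard IMDB training split of 25,000 reviews. Keras takes the validation samples from the end of the arrays before shuffling. The helper `fit_split` is hypothetical, and Keras's exact rounding may differ for sizes that do not divide evenly.

```python
import math

def fit_split(n_samples, validation_split, batch_size):
    """Approximate how fit() divides the data and how many batches one epoch takes."""
    n_val = round(n_samples * validation_split)   # held out from the end of the arrays
    n_train = n_samples - n_val
    steps_per_epoch = math.ceil(n_train / batch_size)
    return n_train, n_val, steps_per_epoch

n_train, n_val, steps = fit_split(25000, validation_split=0.2, batch_size=128)
print(n_train, n_val, steps)  # 20000 5000 157
```

The step count is the number the progress bar shows for each epoch.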

## 5) Evaluate

In [None]:
loss, acc = model.evaluate(x_test, y_test, verbose=0)
print('Test accuracy:', acc)
print('Test loss:', loss)
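
The two numbers reported by `evaluate` can be reproduced from predicted probabilities. The sketch below computes binary cross-entropy and accuracy in NumPy on four made-up predictions. The helper `binary_metrics`, the clipping constant, and the `>= 0.5` threshold are assumptions chosen for illustration, and Keras's internal handling may differ in small details.

```python
import numpy as np

def binary_metrics(y_true, y_prob, eps=1e-7):
    """Return (binary cross-entropy, accuracy) for 0/1 labels and probabilities."""
    y_true = np.asarray(y_true, dtype=float)
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1 - eps)  # avoid log(0)
    bce = -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    accuracy = np.mean((y_prob >= 0.5) == (y_true == 1))
    return float(bce), float(accuracy)

toy_labels = [1, 0, 1, 0]
toy_probs = [0.9, 0.2, 0.4, 0.6]   # made-up model outputs
toy_loss, toy_acc = binary_metrics(toy_labels, toy_probs)
print(round(toy_loss, 4), toy_acc)  # 0.5403 0.5
```

Accuracy only counts which side of 0.5 a prediction falls on, while the loss also penalizes confident mistakes, so the two can move in different directions between epochs.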

## 6) Plot training curves

In [None]:
import matplotlib.pyplot as plt

plt.figure()
plt.plot(history.history['accuracy'], label='train acc')
plt.plot(history.history['val_accuracy'], label='val acc')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.title('RNN Accuracy')
plt.legend()
plt.show()

plt.figure()
plt.plot(history.history['loss'], label='train loss')
plt.plot(history.history['val_loss'], label='val loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.title('RNN Loss')
plt.legend()
plt.show()

## 7) Try your own text (simple helper)

In [None]:
word_index = tf.keras.datasets.imdb.get_word_index()

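def encode_review(text: str, maxlen=200):
    # Mirror imdb.load_data defaults: 0 = padding, 1 = start, 2 = out-of-vocabulary.
    encoded = [1]
    for w in text.lower().split():
        idx = word_index.get(w)
        if idx is None:
            continue  # word does not appear in the IMDB word index
        idx += 3  # IMDB shifts indices by 3
        # Ids at or above num_words fall outside the Embedding table; map them to OOV.
        encoded.append(idx if idx < num_words else 2)
    return tf.keras.preprocessing.sequence.pad_sequences([encoded], maxlen=maxlen)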

def predict_sentiment(text: str):
    x = encode_review(text, maxlen=maxlen)
    prob = float(model.predict(x, verbose=0)[0][0])
    label = 'positive' if prob >= 0.5 else 'negative'
    return label, prob

examples = [
    'this movie was fantastic with great acting and a wonderful story',
    'boring film terrible plot and i fell asleep',
]

for t in examples:
    label, prob = predict_sentiment(t)
    print('Text:', t)
    print('Prediction:', label, '| probability:', prob)
    print('-'*80)
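
A quick way to sanity-check an encoder is to decode the ids back into words. The sketch below builds the reverse mapping using the same shift of 3 and the reserved ids 0 (padding), 1 (start), and 2 (out-of-vocabulary). The names `make_decoder` and `toy_rank` are hypothetical, and the four-word vocabulary stands in for the dictionary returned by `get_word_index()`.

```python
INDEX_FROM = 3
SPECIAL = {0: '<pad>', 1: '<start>', 2: '<oov>'}

def make_decoder(word_rank):
    """Build a function that turns a list of token ids back into text."""
    id_to_word = {rank + INDEX_FROM: w for w, rank in word_rank.items()}
    id_to_word.update(SPECIAL)

    def decode(ids):
        # Skip padding so only the real tokens are shown.
        return ' '.join(id_to_word.get(int(i), '<oov>') for i in ids if int(i) != 0)

    return decode

toy_rank = {'the': 1, 'movie': 2, 'was': 3, 'great': 4}  # toy stand-in vocabulary
decode = make_decoder(toy_rank)
decoded = decode([0, 0, 1, 4, 5, 6, 7])
print(decoded)  # <start> the movie was great
```

Passing the real word index in place of `toy_rank` should let you read back any padded row of `x_train` the same way.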

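## 8) Why vanilla RNNs often struggle
- **Vanishing/exploding gradients:** backpropagation through time multiplies the gradient by the recurrent weights once per timestep, so over 200 steps it tends to shrink toward zero or blow up.
- **Weak long-term memory:** as a result, words far from the end of a long review have little influence on what the model learns.
- **Gated cells:** LSTM and GRU mitigate this with gates that control what is kept, updated, and forgotten (next model).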
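
The gradient problem can be seen numerically with a simplified model. The sketch below assumes a linear recurrence (no `tanh`) whose recurrent matrix is a scaled orthogonal matrix, so the gradient norm after `T` steps is exactly `scale ** T`. The function name, sizes, and scales are illustrative choices. In a real `SimpleRNN` the `tanh` derivative is at most 1, which tends to push gradients further toward vanishing.

```python
import numpy as np

def gradient_norm_through_time(scale, timesteps, units=8, seed=0):
    """Norm of a unit gradient after backpropagating `timesteps` steps."""
    rng = np.random.default_rng(seed)
    q, _ = np.linalg.qr(rng.normal(size=(units, units)))  # random orthogonal matrix
    W_h = scale * q                                        # every singular value equals `scale`
    grad = np.ones(units) / np.sqrt(units)                 # unit-norm gradient at the last step
    for _ in range(timesteps):
        grad = W_h.T @ grad                                # one step back through h_t = W_h h_{t-1}
    return float(np.linalg.norm(grad))

norms = {s: gradient_norm_through_time(s, timesteps=200) for s in (0.9, 1.0, 1.1)}
for scale, norm in norms.items():
    print(f'scale={scale}: gradient norm after 200 steps = {norm:.3e}')
```

With 200 timesteps (the `maxlen` used here), a scale of 0.9 leaves almost no gradient for the earliest tokens, while a scale of 1.1 inflates it by many orders of magnitude. Only a scale of exactly 1 stays stable, which ordinary training does not guarantee.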
