## Recurrent Neural Networks (RNNs)

This notebook provides an introduction to Recurrent Neural Networks (RNNs), covering their basics, advanced variants like LSTM and GRU, and applications in time-series prediction.

---

### Table of Contents
1. **Introduction to RNNs**
2. **Basics of RNNs**
3. **Long Short-Term Memory (LSTM)**
4. **Gated Recurrent Units (GRU)**
5. **Applications in Time-Series Prediction**
6. **Example: Time-Series Prediction with LSTM**

---

## 1. Introduction to RNNs

Recurrent Neural Networks (RNNs) are a class of neural networks designed for sequential data. Unlike feedforward networks, RNNs have connections that form directed cycles, allowing them to maintain a 'memory' of previous inputs. This makes them suitable for tasks such as time-series prediction, natural language processing, and speech recognition.

![RNN](https://miro.medium.com/v2/resize:fit:1400/1*WMnFSJHzOloFlJHU6fVN-g.gif)

---

## 2. Basics of RNNs

### Mathematical Explanation
An RNN processes a sequence of inputs $ x_1, x_2, \dots, x_T $ and produces a sequence of hidden states $ h_1, h_2, \dots, h_T $. The hidden state at time step $ t $ is computed as:

$$
h_t = \sigma(W_h h_{t-1} + W_x x_t + b_h)
$$

Where:
- $ h_t $ is the hidden state at time $ t $.
- $ x_t $ is the input at time $ t $.
- $ W_h $ is the weight matrix for the hidden state.
- $ W_x $ is the weight matrix for the input.
- $ b_h $ is the bias term.
- $ \sigma $ is the activation function (e.g., tanh or ReLU).

![RNN Unrolled](https://blog.peddy.ai/assets/2019-05-26-Recurrent-Neural-Networks/rnn_rnn_unrolled.png)

### Example: Simple RNN
```python
import numpy as np

# Define parameters
input_size = 3
hidden_size = 2
sequence_length = 4

# Initialize weights and biases
W_h = np.random.randn(hidden_size, hidden_size)
W_x = np.random.randn(hidden_size, input_size)
b_h = np.random.randn(hidden_size, 1)

# Input sequence
X = [np.random.randn(input_size, 1) for _ in range(sequence_length)]

# Initialize hidden state
h_prev = np.zeros((hidden_size, 1))

# RNN forward pass
hidden_states = []
for x in X:
    h = np.tanh(np.dot(W_h, h_prev) + np.dot(W_x, x) + b_h)
    hidden_states.append(h)
    h_prev = h

print("Hidden States:\n", hidden_states)
```
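The forward pass above keeps `W_h` and `W_x` as separate matrices, while the LSTM and GRU equations in later sections apply a single matrix to the concatenation $ [h_{t-1}, x_t] $. The two forms are equivalent. Here is a minimal sketch that checks this numerically, assuming column-vector states as in the example above. The names `W`, `hx`, and the seeded generator `rng` are introduced here for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 3, 2

W_h = rng.standard_normal((hidden_size, hidden_size))
W_x = rng.standard_normal((hidden_size, input_size))
b_h = rng.standard_normal((hidden_size, 1))
h_prev = rng.standard_normal((hidden_size, 1))
x_t = rng.standard_normal((input_size, 1))

# Separate-matrix form, as in the RNN equation
h_separate = np.tanh(W_h @ h_prev + W_x @ x_t + b_h)

# Concatenated form: W = [W_h | W_x] applied to the stacked column [h_prev; x_t]
W = np.hstack([W_h, W_x])      # shape (hidden_size, hidden_size + input_size)
hx = np.vstack([h_prev, x_t])  # shape (hidden_size + input_size, 1)
h_concat = np.tanh(W @ hx + b_h)

print(np.allclose(h_separate, h_concat))  # True
```

The order of the blocks in `W` must match the order of the stacked vector, here hidden state first and input second.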

---

## 3. Long Short-Term Memory (LSTM)

LSTMs are a special kind of RNN capable of learning long-term dependencies. They mitigate the vanishing gradient problem, in which gradients shrink as they are propagated back through many time steps, by introducing a cell state and gates that regulate the flow of information.
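To see why a plain RNN struggles with long-term dependencies, the sketch below tracks the gradient of $ h_t $ with respect to $ h_0 $ in the tanh RNN from Section 2. It is an illustration under stated assumptions, not a general proof: `W_h` is rescaled so that its largest singular value is 0.9 (an arbitrary choice made here), which guarantees that every per-step Jacobian has norm below 1.

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_size, input_size, T = 8, 3, 50

W_h = rng.standard_normal((hidden_size, hidden_size))
W_h *= 0.9 / np.linalg.norm(W_h, 2)  # assumption: largest singular value set to 0.9
W_x = rng.standard_normal((hidden_size, input_size))
b_h = np.zeros((hidden_size, 1))

h = np.zeros((hidden_size, 1))
grad = np.eye(hidden_size)  # d h_0 / d h_0 is the identity
norms = []
for t in range(T):
    x = rng.standard_normal((input_size, 1))
    h = np.tanh(W_h @ h + W_x @ x + b_h)
    # Jacobian d h_t / d h_{t-1} = diag(1 - h_t^2) @ W_h (row scaling via broadcasting)
    J = (1.0 - h ** 2) * W_h
    grad = J @ grad  # chain rule: d h_t / d h_0
    norms.append(np.linalg.norm(grad, 2))

for step in (1, 10, 25, 50):
    print(f"step {step:2d}: ||d h_t / d h_0|| = {norms[step - 1]:.3e}")
```

Because each factor has spectral norm at most 0.9, the product is bounded by $ 0.9^t $, so the influence of early inputs on late hidden states decays at least geometrically in this setting.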

![LSTM](https://colah.github.io/posts/2015-08-Understanding-LSTMs/img/LSTM3-chain.png)

### Mathematical Explanation
An LSTM unit consists of:
- **Forget Gate ($ f_t $):** Decides what information to discard from the cell state.
- **Input Gate ($ i_t $):** Decides what new information to store in the cell state.
- **Output Gate ($ o_t $):** Decides what to output based on the cell state.

The equations for an LSTM unit are:

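$$
f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)
$$

$$
i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)
$$

$$
\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)
$$

$$
C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t
$$

$$
o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)
$$

$$
h_t = o_t \odot \tanh(C_t)
$$

Here $ \sigma $ is the logistic sigmoid, $ [h_{t-1}, x_t] $ is the concatenation of the previous hidden state and the current input, and $ \odot $ denotes element-wise multiplication.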


Reference: https://colah.github.io/posts/2015-08-Understanding-LSTMs/

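The LSTM equations above can be sketched directly in NumPy. This is a minimal, unoptimized illustration of a single time step, not how Keras implements the layer. The helper names `sigmoid` and `lstm_step`, the `params` dictionary, and the random initialization are assumptions made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, params):
    """One LSTM time step following the equations above (hypothetical helper)."""
    hx = np.vstack([h_prev, x_t])                          # [h_{t-1}, x_t]
    f_t = sigmoid(params["W_f"] @ hx + params["b_f"])      # forget gate
    i_t = sigmoid(params["W_i"] @ hx + params["b_i"])      # input gate
    C_tilde = np.tanh(params["W_C"] @ hx + params["b_C"])  # candidate cell state
    C_t = f_t * C_prev + i_t * C_tilde                     # element-wise update
    o_t = sigmoid(params["W_o"] @ hx + params["b_o"])      # output gate
    h_t = o_t * np.tanh(C_t)                               # new hidden state
    return h_t, C_t

rng = np.random.default_rng(0)
input_size, hidden_size, sequence_length = 3, 2, 4

# One weight matrix and bias per gate/candidate; sizes follow the concatenation
params = {}
for name in ("f", "i", "C", "o"):
    params[f"W_{name}"] = rng.standard_normal((hidden_size, hidden_size + input_size))
    params[f"b_{name}"] = np.zeros((hidden_size, 1))

h = np.zeros((hidden_size, 1))
C = np.zeros((hidden_size, 1))
for _ in range(sequence_length):
    x = rng.standard_normal((input_size, 1))
    h, C = lstm_step(x, h, C, params)

print("h shape:", h.shape, "C shape:", C.shape)
```

Note that the cell state `C` is updated additively, while the hidden state `h` is a gated, squashed view of it, so every entry of `h` stays strictly between -1 and 1.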
### Example: LSTM in TensorFlow/Keras
```python
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

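# Define LSTM model.
# sequence_length and input_size are reused from the Simple RNN example above.
# activation='relu' replaces the tanh in the equations; the Keras default is 'tanh'.
model = Sequential([
    LSTM(50, activation='relu', input_shape=(sequence_length, input_size)),
    Dense(1)
])

model.compile(optimizer='adam', loss='mse')
model.summary()  # prints the summary itself; wrapping it in print() also prints "None"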
```

---

## 4. Gated Recurrent Units (GRU)

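GRUs are a variation of LSTMs with a simplified architecture. They merge the cell state and hidden state into a single state and use only update and reset gates to control the flow of information, so they have fewer parameters per unit than an LSTM and are typically cheaper to compute.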
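One way to make the efficiency comparison concrete is to count parameters. The sketch below assumes the formulation used in this notebook, where each gate or candidate has one weight matrix over $ [h_{t-1}, x_t] $ and one bias vector: four such sets for an LSTM and three for a GRU. The function names are hypothetical, and framework implementations may add extra bias terms, so the counts a library reports can differ slightly.

```python
def params_per_gate(hidden_size, input_size):
    # One weight matrix of shape (hidden, hidden + input) plus one bias of length hidden
    return hidden_size * (hidden_size + input_size) + hidden_size

def lstm_param_count(hidden_size, input_size):
    return 4 * params_per_gate(hidden_size, input_size)  # forget, input, candidate, output

def gru_param_count(hidden_size, input_size):
    return 3 * params_per_gate(hidden_size, input_size)  # update, reset, candidate

hidden_size, input_size = 50, 1
print(lstm_param_count(hidden_size, input_size))  # 10400
print(gru_param_count(hidden_size, input_size))   # 7800
```

Under these assumptions a GRU layer has three quarters of the recurrent parameters of an LSTM layer of the same width.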

### Mathematical Explanation
A GRU unit consists of:
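- **Update Gate ($ z_t $):** Decides how much of the previous hidden state to carry over and how much to replace with the new candidate state.
- **Reset Gate ($ r_t $):** Decides how much of the previous hidden state to use when computing the candidate state.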

The equations for a GRU unit are:

$$
z_t = \sigma(W_z \cdot [h_{t-1}, x_t] + b_z)
$$

$$
r_t = \sigma(W_r \cdot [h_{t-1}, x_t] + b_r)
$$

$$
\tilde{h}_t = \tanh(W_h \cdot [r_t \odot h_{t-1}, x_t] + b_h)
$$

$$
h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t
$$

Here $ \sigma $ is the logistic sigmoid and $ \odot $ denotes element-wise multiplication.
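As with the LSTM, the GRU equations translate almost line for line into NumPy. The sketch below is a minimal illustration of one time step under the convention used in this notebook, where $ z_t $ weights the candidate state. The helper names `sigmoid` and `gru_step`, the `params` dictionary, and the random initialization are assumptions made for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x_t, h_prev, params):
    """One GRU time step following the equations above (hypothetical helper)."""
    hx = np.vstack([h_prev, x_t])                          # [h_{t-1}, x_t]
    z_t = sigmoid(params["W_z"] @ hx + params["b_z"])      # update gate
    r_t = sigmoid(params["W_r"] @ hx + params["b_r"])      # reset gate
    rhx = np.vstack([r_t * h_prev, x_t])                   # [r_t * h_{t-1}, x_t]
    h_tilde = np.tanh(params["W_h"] @ rhx + params["b_h"]) # candidate state
    return (1.0 - z_t) * h_prev + z_t * h_tilde            # blend old state and candidate

rng = np.random.default_rng(0)
input_size, hidden_size, sequence_length = 3, 2, 4

params = {}
for name in ("z", "r", "h"):
    params[f"W_{name}"] = rng.standard_normal((hidden_size, hidden_size + input_size))
    params[f"b_{name}"] = np.zeros((hidden_size, 1))

h = np.zeros((hidden_size, 1))
for _ in range(sequence_length):
    x = rng.standard_normal((input_size, 1))
    h = gru_step(x, h, params)

print("h shape:", h.shape)
```

Unlike the LSTM, there is no separate cell state: the single hidden state is both the memory and the output.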

### Example: GRU in TensorFlow/Keras
```python
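from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GRU, Dense

# Define GRU model (same input shape as the LSTM example)
model = Sequential([
    GRU(50, activation='relu', input_shape=(sequence_length, input_size)),
    Dense(1)
])

model.compile(optimizer='adam', loss='mse')
model.summary()  # prints the summary itself; wrapping it in print() also prints "None"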
```

---

## 5. Applications in Time-Series Prediction

RNNs, LSTMs, and GRUs are widely used in time-series prediction tasks such as:
- Stock price forecasting
- Weather prediction
- Energy consumption forecasting
- Anomaly detection

---

## 6. Example: Time-Series Prediction with LSTM

Let's build an LSTM model to predict the next value in a synthetic time-series dataset.

```python
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

# Generate synthetic time-series data
def generate_time_series(n_steps):
    time = np.arange(0, n_steps)
    data = np.sin(0.1 * time) + np.random.normal(0, 0.1, n_steps)
    return data

n_steps = 1000
data = generate_time_series(n_steps)

# Prepare dataset
def prepare_dataset(data, sequence_length):
    X, y = [], []
    for i in range(len(data) - sequence_length):
        X.append(data[i:i+sequence_length])
        y.append(data[i+sequence_length])
    return np.array(X), np.array(y)

sequence_length = 20
X, y = prepare_dataset(data, sequence_length)

# Reshape input to be [samples, time steps, features]
X = X.reshape((X.shape[0], X.shape[1], 1))

# Split into training and testing sets
split = int(0.8 * len(X))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

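# Define LSTM model (activation='relu' overrides the Keras default, 'tanh')
model = Sequential([
    LSTM(50, activation='relu', input_shape=(sequence_length, 1)),
    Dense(1)
])

model.compile(optimizer='adam', loss='mse')

# Train the model. The held-out test split doubles as validation data here,
# so the validation loss is not an independent estimate of test error.
history = model.fit(X_train, y_train, epochs=20, validation_data=(X_test, y_test))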

# Plot training and validation loss
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()

# Make predictions
y_pred = model.predict(X_test)

# Plot predictions vs actual values
plt.plot(y_test, label='Actual')
plt.plot(y_pred, label='Predicted')
plt.xlabel('Time Steps')
plt.ylabel('Value')
plt.legend()
plt.show()
```
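When reading the loss and prediction plots, it can help to compare against trivial references. The sketch below assumes the same synthetic sine data, windowing, and 80/20 split as the example above, but uses its own seeded generator, so the numbers will not match a particular training run. It computes the MSE of a persistence forecast (predict the last observed value) and the MSE of the noise-free sine, which approximates the best achievable error given noise with standard deviation 0.1. The names `y_persist`, `mse_persist`, and `mse_floor` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, sequence_length = 1000, 20

time = np.arange(n_steps)
clean = np.sin(0.1 * time)                    # noise-free signal
data = clean + rng.normal(0, 0.1, n_steps)    # same recipe as generate_time_series

# Same sliding-window construction as prepare_dataset
X = np.array([data[i:i + sequence_length] for i in range(n_steps - sequence_length)])
y = data[sequence_length:]

split = int(0.8 * len(X))
X_test, y_test = X[split:], y[split:]

# Persistence baseline: the forecast is the last value in each input window
y_persist = X_test[:, -1]
mse_persist = np.mean((y_test - y_persist) ** 2)

# Approximate noise floor: error of the true underlying sine at the target steps
y_clean = clean[sequence_length:][split:]
mse_floor = np.mean((y_test - y_clean) ** 2)

print(f"Persistence MSE: {mse_persist:.4f}")
print(f"Noise-floor MSE: {mse_floor:.4f}")
```

A trained model is doing useful work if its test MSE falls below the persistence value; it cannot be expected to go meaningfully below the noise floor on this data.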

---

### Summary
This notebook introduced the basics of RNNs, LSTMs, and GRUs, along with their applications in time-series prediction. We also implemented an LSTM model for time-series forecasting using TensorFlow/Keras.

---

### Next Steps
- Experiment with different RNN architectures and hyperparameters.
- Explore advanced topics like attention mechanisms and transformer models.
- Apply RNNs to real-world time-series datasets.

---