# Gated Recurrent Units (GRUs)

# Introduction to Gated Recurrent Units (GRUs)

## What Are GRUs?

* Simplified variant of Long Short-Term Memory (LSTM) networks
* Designed to capture long-term dependencies at lower computational cost by using fewer parameters

## Key Features of GRUs

### Simpler Architecture

* GRUs have two gates (**update** and **reset**) compared to LSTMs' three gates (**input, forget, output**)

### Efficiency

* Fewer parameters make GRUs faster to train and can make them less prone to overfitting on smaller datasets

### Retains Performance

* Often comparable to LSTMs at capturing sequential dependencies

## GRU Cell Structure: Update and Reset Gates

* **Update Gate**
    * Determines how much of the previous hidden state to retain and how much to update with new information
* **Reset Gate**
    * Controls how much of the previous hidden state is used when computing the candidate state from the new input
* **Hidden State Update**
    * Blends the previous hidden state with the candidate state (which depends on the reset gate) in proportions set by the update gate
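
The gate descriptions above can be sketched as a single GRU time step in NumPy. This is a minimal illustration, not the Keras implementation: the helper `gru_step`, the parameter layout, and the toy sizes are all hypothetical, and it assumes the convention in which the update gate `z` weights the previous hidden state (some references swap the roles of `z` and `1 - z`).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, params):
    """One GRU time step for a single example (hypothetical helper)."""
    W_z, U_z, b_z = params["z"]  # update gate weights
    W_r, U_r, b_r = params["r"]  # reset gate weights
    W_h, U_h, b_h = params["h"]  # candidate state weights

    # Update gate: how much of the previous hidden state to keep.
    z = sigmoid(x_t @ W_z + h_prev @ U_z + b_z)
    # Reset gate: how much of the previous state feeds the candidate.
    r = sigmoid(x_t @ W_r + h_prev @ U_r + b_r)
    # Candidate state built from the new input and the reset-scaled past.
    h_cand = np.tanh(x_t @ W_h + (r * h_prev) @ U_h + b_h)
    # Assumed convention: z weights the old state, (1 - z) the candidate.
    return z * h_prev + (1.0 - z) * h_cand

# Toy sizes and random weights, chosen only for illustration.
rng = np.random.default_rng(0)
input_dim, units = 4, 3
params = {
    name: (rng.normal(size=(input_dim, units)),
           rng.normal(size=(units, units)),
           np.zeros(units))
    for name in ("z", "r", "h")
}

h = np.zeros(units)
for x_t in rng.normal(size=(5, input_dim)):  # a sequence of 5 time steps
    h = gru_step(x_t, h, params)
print(h.shape)  # (3,)
```

Because the new state is a gated blend of the old state and a `tanh` output, every component of `h` stays within (-1, 1).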


# When to Use GRUs vs. LSTMs

| Feature | LSTM | GRU |
| :--- | :--- | :--- |
| **Gates** | Input, Forget, Output | Update, Reset |
| **Parameters** | More | Fewer |
| **Performance** | Often stronger on complex, longer sequences | Often comparable, particularly on shorter sequences |
| **Training Speed** | Slower due to complexity | Faster due to simpler structure |
| **Use Cases** | NLP, speech recognition | Time-series data, small datasets |
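
To make the "More" versus "Fewer" parameters row concrete, the sketch below counts trainable weights for a single recurrent layer. The helper functions are hypothetical, and the GRU formula assumes the Keras default `reset_after=True`, which keeps two bias vectors per gate. The sizes (128-dimensional input, 128 units) match the models built later in this notebook.

```python
def simple_rnn_params(input_dim, units):
    # One weight block: input kernel, recurrent kernel, bias.
    return units * (input_dim + units) + units

def lstm_params(input_dim, units):
    # Four blocks: input, forget, and output gates plus the cell candidate.
    return 4 * (units * (input_dim + units) + units)

def gru_params(input_dim, units, reset_after=True):
    # Three blocks: update gate, reset gate, candidate state.
    # Assumption: reset_after=True stores separate input and recurrent biases.
    biases = 2 * units if reset_after else units
    return 3 * (units * (input_dim + units) + biases)

input_dim, units = 128, 128
counts = {
    "SimpleRNN": simple_rnn_params(input_dim, units),
    "LSTM": lstm_params(input_dim, units),
    "GRU": gru_params(input_dim, units),
}
for name, n in counts.items():
    print(f"{name}: {n:,}")
# SimpleRNN: 32,896
# LSTM: 131,584
# GRU: 99,072
```

Under these assumptions the GRU layer has roughly three quarters of the LSTM layer's parameters.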

**Objective**
- Build a GRU-based model for the IMDB Movie Reviews dataset and compare its performance with the LSTM model

```python
import tensorflow as tf
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SimpleRNN, LSTM, GRU, Dense

vocab_size = 10000  # keep only the 10,000 most frequent words
max_len = 200       # pad or truncate every review to 200 tokens

# Load the IMDB dataset, restricting the data to the top 'vocab_size' words.
(X_train, y_train), (X_test, y_test) = imdb.load_data(num_words=vocab_size)

# Pad short reviews with trailing zeros; longer reviews are truncated from the
# front (the pad_sequences default). No masking is applied in the models, so
# the recurrent layers also process the padded zeros.
X_train = pad_sequences(X_train, maxlen=max_len, padding='post')
X_test = pad_sequences(X_test, maxlen=max_len, padding='post')

print(f"Training Data Shape: {X_train.shape}")
print(f"Test Data Shape: {X_test.shape}")
```
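
To illustrate what the padding step does to an individual review, here is a small pure-Python sketch. The helper `pad_post` is hypothetical and only mimics the behaviour assumed here for `pad_sequences(..., padding='post')` with its default front truncation; it is not the library code.

```python
def pad_post(seq, maxlen, value=0):
    """Hypothetical stand-in for pad_sequences(padding='post') on one sequence."""
    seq = list(seq)
    if len(seq) > maxlen:
        # Default truncation drops tokens from the front, keeping the last maxlen.
        seq = seq[len(seq) - maxlen:]
    # Post-padding appends the fill value until the sequence reaches maxlen.
    return seq + [value] * (maxlen - len(seq))

print(pad_post([5, 8, 3], maxlen=5))              # [5, 8, 3, 0, 0]
print(pad_post([1, 2, 3, 4, 5, 6, 7], maxlen=5))  # [3, 4, 5, 6, 7]
```

With post-padding, short reviews end in a run of zeros, so a recurrent layer without masking reads those zeros last. Pre-padding or `mask_zero=True` on the embedding are common alternatives worth trying.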

```python
# Baseline: a single SimpleRNN layer on top of a learned embedding.
rnn_model = Sequential([
    tf.keras.Input(shape=(max_len,)),  # replaces the deprecated input_length argument
    Embedding(input_dim=vocab_size, output_dim=128),
    SimpleRNN(128, activation='tanh', return_sequences=False),
    Dense(1, activation='sigmoid')  # probability that the review is positive
])
rnn_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
rnn_model.summary()

# Hold out 20% of the training data for validation.
rnn_history = rnn_model.fit(X_train, y_train, epochs=5, batch_size=64, validation_split=0.2)

rnn_loss, rnn_accuracy = rnn_model.evaluate(X_test, y_test)
print(f"RNN test Loss: {rnn_loss}, Accuracy: {rnn_accuracy}")
```

```python
# Same architecture with the recurrent layer swapped for an LSTM.
lstm_model = Sequential([
    tf.keras.Input(shape=(max_len,)),  # replaces the deprecated input_length argument
    Embedding(input_dim=vocab_size, output_dim=128),
    LSTM(128, activation='tanh', return_sequences=False),
    Dense(1, activation='sigmoid')
])
lstm_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
lstm_model.summary()

lstm_history = lstm_model.fit(X_train, y_train, epochs=5, batch_size=64, validation_split=0.2)

lstm_loss, lstm_accuracy = lstm_model.evaluate(X_test, y_test)
print(f"LSTM test Loss: {lstm_loss}, Accuracy: {lstm_accuracy}")
```

```python
# Same architecture with the recurrent layer swapped for a GRU.
gru_model = Sequential([
    tf.keras.Input(shape=(max_len,)),  # replaces the deprecated input_length argument
    Embedding(input_dim=vocab_size, output_dim=128),
    GRU(128, activation='tanh', return_sequences=False),
    Dense(1, activation='sigmoid')
])
gru_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
gru_model.summary()

gru_history = gru_model.fit(X_train, y_train, epochs=5, batch_size=64, validation_split=0.2)

gru_loss, gru_accuracy = gru_model.evaluate(X_test, y_test)
print(f"GRU test Loss: {gru_loss}, Accuracy: {gru_accuracy}")
```

```
Training Data Shape: (25000, 200)
Test Data Shape: (25000, 200)
```

```
Epoch 1/5
313/313 ━━━━━━━━━━━━━━━━━━━━ 56s 172ms/step - accuracy: 0.5056 - loss: 0.6945 - val_accuracy: 0.5432 - val_loss: 0.6759
Epoch 2/5
313/313 ━━━━━━━━━━━━━━━━━━━━ 53s 171ms/step - accuracy: 0.5862 - loss: 0.6602 - val_accuracy: 0.5104 - val_loss: 0.6895
Epoch 3/5
313/313 ━━━━━━━━━━━━━━━━━━━━ 83s 173ms/step - accuracy: 0.5790 - loss: 0.6661 - val_accuracy: 0.5388 - val_loss: 0.6766
Epoch 4/5
313/313 ━━━━━━━━━━━━━━━━━━━━ 53s 169ms/step - accuracy: 0.6005 - loss: 0.6295 - val_accuracy: 0.5450 - val_loss: 0.6757
Epoch 5/5
313/313 ━━━━━━━━━━━━━━━━━━━━ 82s 170ms/step - accuracy: 0.6186 - loss: 0.6047 - val_accuracy: 0.5838 - val_loss: 0.6643
782/782 ━━━━━━━━━━━━━━━━━━━━ 23s 30ms/step - accuracy: 0.5757 - loss: 0.6694
RNN test Loss: 0.6707848310470581, Accuracy: 0.5705999732017517
```

```
Epoch 1/5
313/313 ━━━━━━━━━━━━━━━━━━━━ 187s 588ms/step - accuracy: 0.5181 - loss: 0.6876 - val_accuracy: 0.6142 - val_loss: 0.6136
Epoch 2/5
313/313 ━━━━━━━━━━━━━━━━━━━━ 184s 587ms/step - accuracy: 0.6537 - loss: 0.5660 - val_accuracy: 0.6172 - val_loss: 0.6235
Epoch 3/5
313/313 ━━━━━━━━━━━━━━━━━━━━ 200s 640ms/step - accuracy: 0.7274 - loss: 0.5363 - val_accuracy: 0.6756 - val_loss: 0.6116
Epoch 4/5
313/313 ━━━━━━━━━━━━━━━━━━━━ 188s 601ms/step - accuracy: 0.7753 - loss: 0.4895 - val_accuracy: 0.7584 - val_loss: 0.5448
Epoch 5/5
313/313 ━━━━━━━━━━━━━━━━━━━━ 199s 590ms/step - accuracy: 0.8376 - loss: 0.4002 - val_accuracy: 0.7800 - val_loss: 0.4630
782/782 ━━━━━━━━━━━━━━━━━━━━ 93s 119ms/step - accuracy: 0.7885 - loss: 0.4510
LSTM test Loss: 0.4574774205684662, Accuracy: 0.784359991550445
```

```
Epoch 1/5
313/313 ━━━━━━━━━━━━━━━━━━━━ 183s 573ms/step - accuracy: 0.5325 - loss: 0.6923 - val_accuracy: 0.5652 - val_loss: 0.6664
Epoch 2/5
313/313 ━━━━━━━━━━━━━━━━━━━━ 176s 563ms/step - accuracy: 0.7293 - loss: 0.5291 - val_accuracy: 0.8744 - val_loss: 0.3092
Epoch 3/5
313/313 ━━━━━━━━━━━━━━━━━━━━ 177s 564ms/step - accuracy: 0.9201 - loss: 0.2183 - val_accuracy: 0.8892 - val_loss: 0.2855
Epoch 4/5
313/313 ━━━━━━━━━━━━━━━━━━━━ 177s 565ms/step - accuracy: 0.9594 - loss: 0.1238 - val_accuracy: 0.8806 - val_loss: 0.3119
Epoch 5/5
313/313 ━━━━━━━━━━━━━━━━━━━━ 177s 566ms/step - accuracy: 0.9789 - loss: 0.0719 - val_accuracy: 0.8762 - val_loss: 0.4100
782/782 ━━━━━━━━━━━━━━━━━━━━ 56s 71ms/step - accuracy: 0.8664 - loss: 0.4479
GRU test Loss: 0.45048046112060547, Accuracy: 0.8660799860954285
```
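
One way to read the three logs side by side is to find, for each model, the epoch with the lowest validation loss. The sketch below does this with values copied by hand from the logs above, with test accuracies rounded to four decimals. The helper `best_epoch` is hypothetical and only there for illustration.

```python
# Validation loss per epoch, copied from the training logs above.
val_loss = {
    "RNN":  [0.6759, 0.6895, 0.6766, 0.6757, 0.6643],
    "LSTM": [0.6136, 0.6235, 0.6116, 0.5448, 0.4630],
    "GRU":  [0.6664, 0.3092, 0.2855, 0.3119, 0.4100],
}
# Final test accuracy, rounded from the evaluate() output above.
test_accuracy = {"RNN": 0.5706, "LSTM": 0.7844, "GRU": 0.8661}

def best_epoch(losses):
    """Return the 1-based epoch with the lowest validation loss."""
    return min(range(len(losses)), key=losses.__getitem__) + 1

for name, losses in val_loss.items():
    print(f"{name}: lowest val_loss {min(losses):.4f} at epoch {best_epoch(losses)}, "
          f"test accuracy {test_accuracy[name]:.4f}")
```

In this run the GRU reaches its lowest validation loss at epoch 3 while its training accuracy keeps rising, which suggests it starts to overfit afterwards. A callback such as `tf.keras.callbacks.EarlyStopping` with `restore_best_weights=True` is one common way to stop near that point. These figures come from a single run, so the ranking between models should be treated as indicative rather than conclusive.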