
# **IMPORTING THE NECESSARY LIBRARIES**

In [None]:
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# **LOAD THE DATASET**

In [None]:
# Load the IMDB dataset
max_features = 10000  # Number of words to consider as features
max_len = 200  # Pad or truncate every review to this many words

(x_train, y_train), (x_test, y_test) = keras.datasets.imdb.load_data(num_words=max_features)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz
17464789/17464789 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step


The cell above loads the IMDB dataset with keras.datasets.imdb.load_data(num_words=max_features). The dataset consists of movie reviews already encoded as lists of integer word indices, and the reviews vary in length, so they must be padded or truncated with pad_sequences(x_train, maxlen=max_len) before training. Common problems at this stage are module import errors, shape mismatches, and memory constraints. If memory is tight, reducing num_words (for example, from 10000 to 5000) shrinks the vocabulary and therefore the embedding table in the model, while lowering max_len shrinks the padded arrays.
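To make the effect of num_words concrete, here is a minimal pure-Python sketch of the vocabulary capping that load_data applies with its default settings. The helper name cap_vocabulary and the sample review are hypothetical; the out-of-vocabulary index of 2 is assumed to be the Keras default (oov_char=2).

```python
# Hypothetical helper that mimics how load_data caps the vocabulary.
# Assumption: the default out-of-vocabulary index is 2 (oov_char=2).
OOV_CHAR = 2


def cap_vocabulary(sequence, num_words, oov_char=OOV_CHAR):
    """Replace every word index >= num_words with the OOV index."""
    return [w if w < num_words else oov_char for w in sequence]


# Made-up review: indices 10000 and 25000 fall outside a 10000-word vocabulary.
review = [1, 14, 22, 9999, 10000, 25000]
capped = cap_vocabulary(review, num_words=10000)
print(capped)  # [1, 14, 22, 9999, 2, 2]
```

After capping, every index is below num_words, which is what allows the Embedding layer to be created with max_features rows.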

In [None]:
# Pad Sequence to a fixed length
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_len)
x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_len)

The cell above pads or truncates every review to max_len tokens so that all inputs have the same shape. If keras.preprocessing.sequence.pad_sequences raises an AttributeError, import pad_sequences from tensorflow.keras.preprocessing.sequence instead; recent Keras versions also expose it as keras.utils.pad_sequences. If you hit a shape mismatch, check that x_train and y_train contain the same number of samples. load_data already returns the labels as NumPy arrays, so np.array(y_train) is only needed if the labels were converted to a list along the way.
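The padding step can be illustrated without Keras. The sketch below assumes the pad_sequences defaults (padding='pre', truncating='pre', value=0): short sequences get zeros on the left, and long sequences keep only their last maxlen items. The function name pad_pre is hypothetical.

```python
def pad_pre(sequence, maxlen, value=0):
    """Left-pad with `value`, or keep only the last `maxlen` items.

    Mirrors the assumed pad_sequences defaults: padding='pre', truncating='pre'.
    """
    if len(sequence) >= maxlen:
        return sequence[-maxlen:]
    return [value] * (maxlen - len(sequence)) + sequence


short = pad_pre([5, 6, 7], maxlen=5)
long_ = pad_pre([1, 2, 3, 4, 5, 6, 7], maxlen=5)
print(short)  # [0, 0, 5, 6, 7]
print(long_)  # [3, 4, 5, 6, 7]
```

Under these defaults a 300-word review keeps its final 200 words rather than its first 200.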

# **BUILD THE MODEL**

In [None]:
# Build the GRU model
model = keras.Sequential([
    layers.Embedding(max_features, 128),
    layers.GRU(128, dropout=0.2, recurrent_dropout=0.2),
    layers.Dense(1, activation='sigmoid')
])

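The model stacks an Embedding layer, a GRU layer, and a single sigmoid unit, which suits binary classification. The name layers must be imported (from tensorflow.keras import layers), otherwise building the model raises a NameError. The Embedding layer accepts sequences of any length, so max_len does not appear in the model definition; every input index must, however, be smaller than max_features. A non-zero recurrent_dropout prevents TensorFlow 2 from using the fast cuDNN GRU kernel on GPUs, so removing it usually speeds up training considerably. The standalone CuDNNGRU layer belongs to the TensorFlow 1.x API; in TensorFlow 2 the regular GRU layer selects the cuDNN kernel automatically when its requirements are met. The next cell compiles the model with binary_crossentropy loss and the adam optimizer, a common default for this task.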
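As a rough check on the model's size, the trainable parameter count can be worked out by hand. The sketch below assumes the Keras default reset_after=True for the GRU layer, which adds two bias vectors per gate; the variable names are illustrative only.

```python
max_features = 10000  # vocabulary size (rows of the embedding table)
embed_dim = 128       # embedding width, which is also the GRU input size
units = 128           # GRU hidden size

# Embedding: one vector of length embed_dim per word index.
embedding_params = max_features * embed_dim

# GRU: 3 gates, each with an input kernel, a recurrent kernel,
# and (assuming reset_after=True) two bias vectors.
gru_params = 3 * (embed_dim * units + units * units + 2 * units)

# Dense(1): one weight per GRU unit plus one bias.
dense_params = units * 1 + 1

total_params = embedding_params + gru_params + dense_params
print(embedding_params, gru_params, dense_params, total_params)
# 1280000 99072 129 1379201
```

The embedding table accounts for most of the parameters, which is why lowering max_features is the most effective way to shrink this model.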

In [None]:
# Compile the model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
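The binary_crossentropy loss named in the compile step averages -(y*log(p) + (1-y)*log(1-p)) over the samples, where y is the true label and p the predicted probability. A minimal pure-Python sketch follows; the clipping constant eps and the sample values are assumptions chosen for illustration.

```python
import math


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy over paired labels and probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


loss = binary_crossentropy([1, 0, 1], [0.9, 0.2, 0.6])
print(f"{loss:.4f}")  # 0.2798
```

A model that always predicts 0.5 scores log(2), about 0.693, so a loss below that value indicates the model has learned something.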

# **TRAIN THE MODEL**

In [None]:
# Train the model
batch_size = 32
epochs = 5

model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          validation_data=(x_test, y_test))

Epoch 1/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 479s 607ms/step - accuracy: 0.6947 - loss: 0.5626 - val_accuracy: 0.8414 - val_loss: 0.3617
Epoch 2/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 465s 561ms/step - accuracy: 0.8714 - loss: 0.3113 - val_accuracy: 0.8855 - val_loss: 0.2782
Epoch 3/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 476s 605ms/step - accuracy: 0.9373 - loss: 0.1710 - val_accuracy: 0.8808 - val_loss: 0.2963
Epoch 4/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 503s 606ms/step - accuracy: 0.9659 - loss: 0.1002 - val_accuracy: 0.8701 - val_loss: 0.3798
Epoch 5/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 473s 604ms/step - accuracy: 0.9803 - loss: 0.0611 - val_accuracy: 0.8615 - val_loss: 0.4509


<keras.src.callbacks.history.History at 0x7813c41d5ad0>

Training runs once x_train and x_test have been padded to shape (num_samples, max_len) and the labels are NumPy arrays. If you encounter a shape mismatch or data type error, check that x_train and y_train (and likewise x_test and y_test) have the same number of samples, and that validation_data is passed as the tuple (x_test, y_test). In the log above, validation loss is lowest at epoch 2 (0.2782) and rises in every later epoch while training accuracy keeps climbing to 0.98, which indicates overfitting; stopping after two or three epochs would likely generalize better. The test set also doubles as the validation set here, so any tuning based on these numbers, such as choosing the number of epochs, would make the final test score optimistic.
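The validation metrics in the training log can be inspected directly to see where the model generalizes best. This sketch copies the val_loss and val_accuracy values from the five logged epochs and picks the epoch with the lowest validation loss; the variable names are illustrative.

```python
# Values copied from the training log (epochs 1 to 5).
val_loss = [0.3617, 0.2782, 0.2963, 0.3798, 0.4509]
val_accuracy = [0.8414, 0.8855, 0.8808, 0.8701, 0.8615]

# Index of the smallest validation loss, converted to a 1-based epoch number.
best_index = min(range(len(val_loss)), key=val_loss.__getitem__)
best_epoch = best_index + 1

print(best_epoch, val_loss[best_index], val_accuracy[best_index])
# 2 0.2782 0.8855
```

In Keras, one way to automate this choice is the keras.callbacks.EarlyStopping callback with monitor='val_loss' and restore_best_weights=True, ideally monitoring a validation split held out from the training data rather than the test set.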

In [None]:
# Evaluate the model
loss, accuracy = model.evaluate(x_test, y_test, batch_size=batch_size)
print(f'Test loss: {loss:.4f}')
print(f'Test accuracy: {accuracy:.4f}')

782/782 ━━━━━━━━━━━━━━━━━━━━ 47s 60ms/step - accuracy: 0.8622 - loss: 0.4588
Test loss: 0.4430
Test accuracy: 0.8630
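
For a model with a single sigmoid output, the accuracy figure is the fraction of reviews whose predicted probability lands on the correct side of a threshold. The sketch below assumes a threshold of 0.5 and uses made-up probabilities; binary_accuracy here is a hypothetical pure-Python helper, not the Keras function.

```python
def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    preds = [1 if p > threshold else 0 for p in y_prob]
    correct = sum(int(pred == y) for pred, y in zip(preds, y_true))
    return correct / len(y_true)


# Made-up labels and probabilities: the third review is misclassified.
acc = binary_accuracy([1, 0, 1, 0], [0.91, 0.35, 0.42, 0.08])
print(acc)  # 0.75
```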
