<a href="https://colab.research.google.com/github/amara929/amara929/blob/main/ISTM.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 1.0 Import Libraries

In [None]:
import tensorflow as tf
from tensorflow import keras
import numpy as np

This cell imports TensorFlow and its bundled Keras API, along with NumPy, a library for numerical computing in Python. import tensorflow as tf gives access to TensorFlow's functionality, while from tensorflow import keras exposes the Keras modules that simplify building and training neural networks. import numpy as np brings in NumPy, which handles arrays and numerical operations and is often used when preprocessing data for machine learning models. This setup is common when developing deep learning models for tasks such as image recognition, natural language processing, and time series forecasting.

# 2.0 Load the Built-in IMDB Dataset

In [None]:
# Load the built-in IMDB dataset
imdb=keras.datasets.imdb

The code imdb = keras.datasets.imdb imports the IMDB dataset module from Keras, which contains a preprocessed collection of movie reviews labeled as positive or negative for sentiment analysis. This dataset is often used for natural language processing (NLP) tasks, particularly binary classification problems. The dataset includes a predefined vocabulary where words are indexed based on their frequency in the dataset. By assigning it to the variable imdb, the code allows access to methods such as imdb.load_data(), which loads the dataset as tokenized sequences of integers, ready for use in training deep learning models like recurrent neural networks (RNNs) or long short-term memory (LSTM) networks.
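To make the integer encoding concrete, here is a minimal sketch of how a sequence of indices maps back to words. The toy word_index dictionary is hypothetical (the real mapping would come from imdb.get_word_index()), and the offset of 3 assumes the load_data() defaults, where indices 0, 1, and 2 are reserved for padding, start-of-sequence, and out-of-vocabulary markers.

```python
# Hypothetical toy vocabulary: word -> frequency rank (1 = most frequent).
# The real mapping would come from imdb.get_word_index().
word_index = {"the": 1, "movie": 2, "was": 3, "great": 4}

# Assumption: load_data() defaults shift every rank by 3 to make room for
# the reserved indices 0 (padding), 1 (start), and 2 (out-of-vocabulary).
INDEX_OFFSET = 3
index_to_word = {rank + INDEX_OFFSET: word for word, rank in word_index.items()}
index_to_word.update({0: "<PAD>", 1: "<START>", 2: "<OOV>"})


def decode_review(encoded):
    """Map a list of integer indices back to a readable string."""
    return " ".join(index_to_word.get(i, "<OOV>") for i in encoded)


encoded_review = [1, 4, 5, 6, 7]
decoded = decode_review(encoded_review)
print(decoded)
```

Decoding a few reviews this way is a quick sanity check that the data was loaded with the expected settings.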

**2.1 Set the vocabulary size and sequence length**

In [None]:
# Set the vocabulary size and maximum sequence length
vocab_size=10000
max_length=250

The code vocab_size = 10000 and max_length = 250 sets two key parameters for processing text data in a deep learning model.

vocab_size = 10000 limits the vocabulary to the 10,000 most frequent words. Rarer words are not dropped from the reviews; each one is replaced by a single out-of-vocabulary index, which reduces computational complexity and memory usage.

max_length = 250 sets the sequence length for each movie review to 250 words. Longer reviews are truncated and shorter ones are padded, so every input to the network has the same size.


These settings are essential for preparing text data before feeding it into models like LSTMs, GRUs, or Transformer-based architectures.

# 3.0 Load the Dataset

In [None]:
# Load the dataset
(x_train,y_train),(x_test,y_test)=imdb.load_data(num_words=vocab_size)

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz
17464789/17464789 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step


The code (x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=vocab_size) loads the IMDB movie reviews dataset and splits it into training and testing sets.

x_train and x_test contain the tokenized movie reviews, where each review is represented as a sequence of integers corresponding to words in a predefined vocabulary.

y_train and y_test contain the sentiment labels for the reviews, where 0 represents a negative review and 1 represents a positive review.

The argument num_words=vocab_size keeps only the 10,000 most frequent words. Any rarer word is not removed from the review but replaced by the out-of-vocabulary index (2 by default).


This step prepares the dataset for further preprocessing, such as padding or embedding, before training a deep learning model for sentiment analysis.
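The effect of num_words can be sketched in plain Python. This is an illustration of the behaviour, not the Keras implementation; cap_vocabulary and the sample indices are hypothetical, and the out-of-vocabulary index of 2 assumes the load_data() default.

```python
# Assumption: 2 is the default out-of-vocabulary index used by imdb.load_data().
OOV_INDEX = 2


def cap_vocabulary(sequence, num_words):
    """Replace every index >= num_words with the out-of-vocabulary index."""
    return [i if i < num_words else OOV_INDEX for i in sequence]


# Hypothetical encoded review containing two rare words (15000 and 10000).
raw_review = [1, 14, 22, 15000, 43, 9999, 10000]
capped = cap_vocabulary(raw_review, num_words=10000)
print(capped)
```

Note that the review keeps its length: rare words are mapped to a shared placeholder rather than deleted.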

# 4.0 Keras Preprocessing

In [None]:
# Pad the sequences to have the same length
x_train=keras.preprocessing.sequence.pad_sequences(x_train,maxlen=max_length)
x_test=keras.preprocessing.sequence.pad_sequences(x_test,maxlen=max_length)

The code pads or truncates the IMDB movie reviews to ensure that all sequences have the same length, making them suitable for deep learning models that require fixed-size inputs.

keras.preprocessing.sequence.pad_sequences() is a Keras utility that adjusts the length of each sequence.

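x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_length) processes the training data (x_train) by truncating longer reviews and padding shorter ones with zeros to a fixed length of max_length = 250. By default, both padding and truncation happen at the start of the sequence, so a long review keeps its last 250 words.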

Similarly, x_test = keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_length) applies the same transformation to the test data (x_test).


Padding ensures that all sequences have the same size, which is essential for batch processing in deep learning models, especially for recurrent neural networks (RNNs), LSTMs, and GRUs.
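The following NumPy sketch mimics what pad_sequences does under its default settings (padding and truncating at the start). The helper pad_pre and the toy reviews are hypothetical and only illustrate the mechanics.

```python
import numpy as np


def pad_pre(sequences, maxlen, value=0):
    """Left-pad short sequences and keep the last `maxlen` items of long ones."""
    padded = np.full((len(sequences), maxlen), value, dtype=np.int32)
    for row, seq in enumerate(sequences):
        trimmed = seq[-maxlen:]  # drop the beginning of long sequences
        if trimmed:
            padded[row, -len(trimmed):] = trimmed  # right-align the content
    return padded


# Two hypothetical reviews: one shorter and one longer than maxlen.
toy_reviews = [[5, 6, 7], [1, 2, 3, 4, 5, 6]]
padded_reviews = pad_pre(toy_reviews, maxlen=4)
print(padded_reviews)
```

The short review gains a leading zero, while the long one loses its first two indices, so both rows end up with exactly four entries.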

**4.1 Build the LSTM Model**

In [None]:
# Build the LSTM model
model=keras.Sequential([
    keras.layers.Embedding(vocab_size,32),
    keras.layers.LSTM(32),
    keras.layers.Dense(1,activation='sigmoid')
  ])

The given code defines a Long Short-Term Memory (LSTM) neural network using Keras' Sequential API for sentiment analysis on the IMDB dataset. Here's a breakdown of each layer:

1. keras.Sequential([...]): This creates a sequential model where layers are stacked one after another.


2. keras.layers.Embedding(vocab_size, 32):

The Embedding layer converts word indices (integers) into dense vectors of fixed size (32 in this case).

vocab_size = 10,000 defines the total number of words in the vocabulary.

For each review, the output is a sequence of 250 vectors, one 32-dimensional embedding per word position.



3. keras.layers.LSTM(32):

This adds an LSTM (Long Short-Term Memory) layer with 32 units to capture sequential dependencies in text data.

LSTMs are well-suited for handling sequential data like text because they can retain important information over long sequences.



4. keras.layers.Dense(1, activation='sigmoid'):

A fully connected Dense layer with 1 neuron is added for binary classification (positive or negative sentiment).

The sigmoid activation function outputs a probability score between 0 (negative sentiment) and 1 (positive sentiment).




This model learns to analyze the sentiment of movie reviews using word embeddings and LSTM-based sequence learning.
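As a rough check on the size of this architecture, the trainable parameters of each layer can be counted by hand. The sketch below assumes the standard LSTM formulation with four gates and one bias vector per gate; calling model.summary() on the built model is the way to confirm the actual figures.

```python
vocab_size = 10000
embedding_dim = 32
lstm_units = 32

# Embedding: one 32-dimensional vector per vocabulary index.
embedding_params = vocab_size * embedding_dim

# LSTM: four gates, each with input weights, recurrent weights, and a bias.
lstm_params = 4 * (lstm_units * (embedding_dim + lstm_units) + lstm_units)

# Dense: one weight per LSTM unit plus a single bias.
dense_params = lstm_units * 1 + 1

total_params = embedding_params + lstm_params + dense_params
print(embedding_params, lstm_params, dense_params, total_params)
```

Under these assumptions almost all parameters sit in the embedding table, which is why vocab_size has such a large effect on model size.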

**4.2 Compile the model**

In [None]:
# Compile the model
model.compile(optimizer='adam',
               loss='binary_crossentropy',
               metrics=['accuracy'])



This code compiles the LSTM model by specifying the optimizer, loss function, and evaluation metric before training.

1. optimizer='adam':

Uses the Adam (Adaptive Moment Estimation) optimizer, which is an efficient and widely used optimization algorithm for training deep learning models.

It combines the benefits of momentum and adaptive learning rates for faster convergence.



2. loss='binary_crossentropy':

Since this is a binary classification problem (positive vs. negative sentiment), binary cross-entropy is used as the loss function.

It calculates how well the model’s predicted probabilities match the actual labels (0 or 1).

The loss function helps the model adjust its weights during training to minimize classification errors.



3. metrics=['accuracy']:

The accuracy metric is used to track model performance during training and evaluation.

It measures the percentage of correctly predicted labels.




This compilation step prepares the model for training, ensuring it learns effectively from the IMDB dataset.
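To show what the loss and the metric measure, here is a small NumPy sketch of binary cross-entropy and accuracy. The labels and predicted probabilities are made up for illustration, and the clipping constant eps is an assumption added to avoid taking log(0).

```python
import numpy as np


def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean negative log-likelihood of the true labels."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)  # guard against log(0)
    losses = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(losses))


def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of predictions that match the label after thresholding."""
    y_pred = (y_prob > threshold).astype(int)
    return float(np.mean(y_pred == y_true))


# Hypothetical labels and sigmoid outputs for four reviews.
y_true = np.array([1, 0, 1, 0])
y_prob = np.array([0.9, 0.2, 0.4, 0.6])

loss = binary_crossentropy(y_true, y_prob)
acc = accuracy(y_true, y_prob)
print(loss, acc)
```

Confident wrong predictions are penalised more heavily by the loss than by accuracy, which only records whether each thresholded prediction is right.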

**4.3 Rebuild the LSTM model with an explicit input shape**

In [None]:
# Build the LSTM model
model=keras.Sequential([
    keras.layers.Embedding(vocab_size,32),
    keras.layers.LSTM(32, input_shape=(x_train.shape[1], 32)), # Specify input_shape for LSTM
    keras.layers.Dense(1,activation='sigmoid')
  ])



The code rebuilds the same LSTM-based network with Keras' Sequential API. The first layer, keras.layers.Embedding(vocab_size, 32), converts word indices into 32-dimensional dense vectors, enabling the model to learn word relationships. The keras.layers.LSTM(32, input_shape=(x_train.shape[1], 32)) layer processes the embedded sequence with 32 LSTM units. The argument input_shape=(x_train.shape[1], 32) states the shape the LSTM receives for each review, (250, 32), that is, 250 time steps of 32 features. It is redundant here, because Keras infers this shape from the Embedding layer's output, and recent Keras versions warn against passing input_shape to a layer inside a Sequential model. Finally, keras.layers.Dense(1, activation='sigmoid') is a fully connected output layer with a single neuron whose sigmoid activation produces a probability between 0 (negative sentiment) and 1 (positive sentiment). Because this cell creates a new model object, the model must be compiled again before training.
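The shape that reaches the LSTM can be traced with NumPy alone. In this sketch the embedding matrix and the batch of reviews are random stand-ins, not the trained weights or the real data; the point is only that an embedding lookup is row indexing into a weight matrix.

```python
import numpy as np

vocab_size, embedding_dim, max_length = 10000, 32, 250
rng = np.random.default_rng(0)

# Stand-in for the Embedding layer's weight matrix (random, not trained).
embedding_matrix = rng.normal(size=(vocab_size, embedding_dim))

# Stand-in for a batch of two padded reviews of word indices.
batch = rng.integers(0, vocab_size, size=(2, max_length))

# An embedding lookup is row indexing into the weight matrix.
embedded = embedding_matrix[batch]
print(embedded.shape)  # (batch, time steps, features)
```

Each review therefore enters the LSTM as 250 time steps of 32 features, regardless of whether input_shape is passed explicitly.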

**4.4 Compile the model**

In [None]:
# Compile the model
model.compile(optimizer='adam',
               loss='binary_crossentropy', # Loss for two-class (0/1) labels
               metrics=['accuracy'])





1. optimizer='adam':

Uses the Adam optimizer, which is widely adopted in deep learning. Adam combines momentum with per-parameter adaptive learning rates, which often leads to faster convergence than plain stochastic gradient descent.



2. loss='binary_crossentropy':

Since the model is performing binary classification (positive or negative sentiment), the loss function used is binary cross-entropy. This function measures the difference between the true labels (0 or 1) and the predicted probabilities, guiding the model to minimize classification errors during training.



3. metrics=['accuracy']:

The model tracks accuracy as its evaluation metric. Accuracy measures the proportion of correctly predicted labels (either positive or negative) over the total predictions, helping to monitor the model's performance during training and evaluation.




This compilation step ensures the model is set up with the necessary configurations to begin learning from the data.
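For readers curious about what "momentum and adaptive learning rates" means in practice, the sketch below implements one update of the textbook Adam rule in NumPy. It is not the Keras implementation; the hyperparameter values are the ones I understand to be the Keras defaults, and the parameters and gradients are invented.

```python
import numpy as np


def adam_step(param, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """Apply one textbook Adam update and return the new state."""
    m = beta1 * m + (1 - beta1) * grad        # momentum: running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v


# Hypothetical parameters and gradients of very different magnitudes.
param = np.array([1.0, -2.0])
grad = np.array([0.5, -4.0])
m = np.zeros_like(param)
v = np.zeros_like(param)

new_param, m, v = adam_step(param, grad, m, v, t=1)
print(new_param)
```

On the first step both parameters move by roughly the learning rate even though their gradients differ by a factor of eight, which illustrates the per-parameter scaling.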

**4.5 Train the model**

In [None]:

# Train the model
history = model.fit(x_train,y_train,
                  epochs=10,
                  batch_size=32,
                  validation_split=0.2)

Epoch 1/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 11s 12ms/step - accuracy: 0.7034 - loss: 0.5376 - val_accuracy: 0.8324 - val_loss: 0.3732
Epoch 2/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 8s 10ms/step - accuracy: 0.8993 - loss: 0.2641 - val_accuracy: 0.8780 - val_loss: 0.3201
Epoch 3/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 10s 11ms/step - accuracy: 0.9287 - loss: 0.1928 - val_accuracy: 0.8610 - val_loss: 0.3622
Epoch 4/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 11s 11ms/step - accuracy: 0.9496 - loss: 0.1445 - val_accuracy: 0.8664 - val_loss: 0.3390
Epoch 5/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 8s 12ms/step - accuracy: 0.9416 - loss: 0.1525 - val_accuracy: 0.8706 - val_loss: 0.3925
Epoch 6/10
625/625 ━━━━━━━━━━━━━━━━━━━━ 10s 12ms/step - accuracy: 0.9721 - loss: 0.0848 - val_accuracy: 0.8712 - val_loss: 0.4295
Epoch 7/10
625



1. model.fit(): This method is used to train a machine learning model. The model learns from the data passed to it (x_train for input data and y_train for target labels).


2. x_train: These are the training inputs, here the padded integer sequences that encode the movie reviews.


3. y_train: These are the corresponding labels or target values for the training data.


4. epochs=10: This means the model will iterate over the entire training dataset 10 times. Each complete pass through the training dataset is called an epoch.


5. batch_size=32: This defines how many samples are processed before the model's internal parameters are updated. In this case, the model will process 32 samples at a time before making a change to its weights.


6. validation_split=0.2: This specifies that 20% of the data should be set aside for validation, which is used to evaluate the model’s performance after each epoch. Keras takes these samples from the end of x_train and y_train, before any shuffling, and trains on the remaining 80%.


7. history: This variable stores the output of the training process, which includes training and validation metrics like loss and accuracy over each epoch. This can be used for plotting graphs and analyzing the model’s learning performance.



In short, the code is training a model using the provided training data, running for 10 epochs, with a batch size of 32, and reserving 20% of the data for validation during training.
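The numbers in the training log can be reproduced with a little arithmetic. The sketch below assumes the IMDB training set holds 25,000 reviews (a figure not stated in this notebook) and copies the val_loss values from the six complete epochs shown in the log.

```python
import math

n_samples = 25000        # assumed size of the IMDB training set
validation_split = 0.2
batch_size = 32

n_val = round(n_samples * validation_split)
n_train = n_samples - n_val
steps_per_epoch = math.ceil(n_train / batch_size)
print(n_train, n_val, steps_per_epoch)

# val_loss values copied from the first six epochs of the log.
val_loss = [0.3732, 0.3201, 0.3622, 0.3390, 0.3925, 0.4295]

# Epoch numbers are 1-based, list positions are 0-based.
best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__) + 1
print(best_epoch)
```

The step count matches the 625 shown per epoch in the log. Validation loss is lowest at epoch 2 and rises afterwards while training accuracy keeps climbing, a pattern that usually points to overfitting.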

**4.6 Evaluate the model**

In [None]:
# Evaluate the model
test_loss,test_acc=model.evaluate(x_test,y_test)
print('Test accuracy:',test_acc)

782/782 ━━━━━━━━━━━━━━━━━━━━ 4s 5ms/step - accuracy: 0.8447 - loss: 0.5740
Test accuracy: 0.8463199734687805




1. model.evaluate(x_test, y_test):
This function is used to evaluate the performance of the trained model on the test data (x_test, y_test).

x_test: The features of the test dataset (input data).

y_test: The true labels of the test dataset (expected output).


The function returns two values:

test_loss: The loss value (typically the error) between the predicted outputs and the true labels on the test data.

test_acc: The accuracy of the model, which is a measure of how many predictions are correct out of the total predictions.



2. print('Test accuracy:', test_acc):
This line prints the accuracy (test_acc) of the model on the test dataset, giving an indication of how well the model generalizes to unseen data.



In short, the code evaluates the model’s performance on the test set and prints the accuracy of the model.
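The evaluation output can be unpacked in the same way. This sketch assumes the IMDB test set holds 25,000 reviews and that model.evaluate() used its default batch size of 32; the accuracy value is the one printed by the cell.

```python
import math

n_test = 25000                  # assumed size of the IMDB test set
batch_size = 32                 # assumed default batch size of model.evaluate()
test_acc = 0.8463199734687805   # value printed by the evaluation cell

n_batches = math.ceil(n_test / batch_size)
n_correct = round(test_acc * n_test)
n_wrong = n_test - n_correct
print(n_batches, n_correct, n_wrong)
```

Under these assumptions the batch count agrees with the 782 steps in the progress bar, and the accuracy corresponds to a few thousand misclassified reviews out of 25,000.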