<a href="https://colab.research.google.com/github/fleursomnium/Alzheimers-CNN/blob/main/CNN_MRI_Scans.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'  # hide INFO and WARNING logs (e.g. CUDA messages); set before importing TensorFlow
import tensorflow as tf

## Importing the data to be used in this CNN

In [None]:
# importing the data
import kagglehub

# download latest version
path = kagglehub.dataset_download("uraninjo/augmented-alzheimer-mri-dataset")

print("Path to dataset files:", path)

Path to dataset files: /home/x-jsanchez3/.cache/kagglehub/datasets/uraninjo/augmented-alzheimer-mri-dataset/versions/1


## Splitting Data to Training and Testing set

In [None]:
# set the base path to the downloaded dataset path
base_path = path  # this path comes from the API

train_dir = os.path.join(base_path, 'OriginalDataset')
test_dir = os.path.join(base_path, 'AugmentedAlzheimerDataset')

# printing the directories to verify
print("Train Directory:", train_dir)
print("Test Directory:", test_dir)

Train Directory: /home/x-jsanchez3/.cache/kagglehub/datasets/uraninjo/augmented-alzheimer-mri-dataset/versions/1/OriginalDataset
Test Directory: /home/x-jsanchez3/.cache/kagglehub/datasets/uraninjo/augmented-alzheimer-mri-dataset/versions/1/AugmentedAlzheimerDataset


In [None]:
# viewing the data provided
print("Train directory contents:", os.listdir(train_dir))
print("Test directory contents:", os.listdir(test_dir))

Train directory contents: ['NonDemented', 'MildDemented', 'VeryMildDemented', 'ModerateDemented']
Test directory contents: ['NonDemented', 'VeryMildDemented', 'ModerateDemented', 'MildDemented']


## Pre-Processing the data (distribution of the data)

In [None]:
# count the files inside each class directory
train_files = sum([len(files) for _, _, files in os.walk(train_dir)])
test_files = sum([len(files) for _, _, files in os.walk(test_dir)])

print(f"Train files count: {train_files}")
print(f"Test files count: {test_files}")

Train files count: 6400
Test files count: 33984
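
The heading above mentions the distribution of the data, while the cell reports only total file counts. A minimal sketch of a per-class count follows, assuming the one-subfolder-per-class layout listed earlier. `count_images_per_class` and `class_fractions` are hypothetical helpers, not part of the original notebook.

```python
import os
from collections import Counter

def count_images_per_class(root_dir):
    """Count files in each immediate subdirectory (one subdirectory per class)."""
    counts = Counter()
    for class_name in sorted(os.listdir(root_dir)):
        class_dir = os.path.join(root_dir, class_name)
        if os.path.isdir(class_dir):
            counts[class_name] = len(
                [f for f in os.listdir(class_dir)
                 if os.path.isfile(os.path.join(class_dir, f))]
            )
    return counts

def class_fractions(counts):
    """Convert raw counts into fractions of the whole dataset."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}
```

Calling `class_fractions(count_images_per_class(train_dir))` would show whether any of the four classes is under-represented, which matters when reading plain accuracy later on.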


## Pipeline to parallelize the data across CPUs (potentially speeding up the process)

In [None]:
# loading datasets using categorical labels
train_dataset = tf.keras.preprocessing.image_dataset_from_directory(
    train_dir,
    image_size=(128, 128),          # resize all images to 128x128 pixels
    batch_size=32,                  # load images in batches of 32 for efficient processing
    label_mode='categorical',       # assign categorical labels to each image (one-hot encoded)
    shuffle=True,                   # shuffle the dataset to promote model generalization
)

test_dataset = tf.keras.preprocessing.image_dataset_from_directory(
    test_dir,
    image_size=(128, 128),
    batch_size=32,
    label_mode='categorical',
    shuffle=False,                  # do not shuffle to maintain consistent evaluation order
)

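# shard() keeps every num_shards-th batch; with a fixed shard_id this process sees only 1/4 of the batches
# (in multi-worker training each worker would pass its own index so the shards together cover the full dataset)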
num_shards = 4
shard_id = 0
train_dataset = train_dataset.shard(num_shards=num_shards, index=shard_id)
test_dataset = test_dataset.shard(num_shards=num_shards, index=shard_id)

# data Augmentation and Preprocessing for training data only
def augment_image(image, label):
    image = tf.image.random_flip_left_right(image)
    image = tf.image.random_flip_up_down(image)
    image = tf.image.random_brightness(image, max_delta=0.1)
    image = tf.image.random_contrast(image, lower=0.9, upper=1.1)
    return image, label

# normalization function for both training and testing data
def normalize_image(image, label):
    image = tf.cast(image, tf.float32) / 255.0
    return image, label

# image_dataset_from_directory already resizes to 128x128 and batches by 32,
# so no extra tf.image.resize or .batch(32) step is needed here

# training data: normalize, cache the normalized batches, then augment after the cache
# so the random transforms are re-drawn every epoch instead of being frozen in the cache
# (brightness/contrast jitter can push a few values slightly outside [0, 1])
train_dataset = (
    train_dataset
    .map(normalize_image, num_parallel_calls=tf.data.AUTOTUNE)
    .cache()
    .shuffle(1000)
    .map(augment_image, num_parallel_calls=tf.data.AUTOTUNE)
    .prefetch(tf.data.AUTOTUNE)
)

# test data: normalization only, no augmentation
test_dataset = (
    test_dataset
    .map(normalize_image, num_parallel_calls=tf.data.AUTOTUNE)
    .cache()
    .prefetch(tf.data.AUTOTUNE)
)


# suppress TensorFlow log output: '3' hides INFO, WARNING, and ERROR messages ('2' would still show errors)
# the variable is best set before importing tensorflow, as in the first cell
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

Found 6400 files belonging to 4 classes.
Found 33984 files belonging to 4 classes.


## **Data Loading and Preprocessing**
In this section, we prepare the training and testing datasets using TensorFlow's image_dataset_from_directory function, which loads images from their respective directories, resizes them to 128x128 pixels, and applies categorical labels. The training dataset is shuffled to promote model generalization, while the testing dataset is kept in a fixed order for consistent evaluation.

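Each dataset is then sharded into 4 parts. `shard(num_shards=4, index=0)` keeps every fourth batch, so with the fixed `shard_id = 0` used here each epoch sees only one quarter of the data (50 of the 200 training batches, which matches the `50/50` step counter in the training logs). Sharding is intended for multi-worker training, where each worker passes its own index and the workers together cover the full dataset; in a single process it reduces the amount of data rather than parallelizing the loading. Data augmentation (random flips plus brightness and contrast adjustments) is applied only to the training dataset to improve generalization. Both datasets are normalized, scaling pixel values to the range [0, 1].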

For further efficiency, we cache, shuffle, and prefetch the data. This caching minimizes redundant data loading, and prefetching enables the pipeline to asynchronously load batches during model training, reducing idle time. The combination of these preprocessing steps helps ensure that our data pipeline is optimized for both performance and generalization potential.
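
To make the effect of sharding concrete, here is a small pure-Python sketch that mimics the selection rule of `Dataset.shard` (keep element `i` when `i % num_shards == index`). The `shard` function below is a stand-in written for illustration, not TensorFlow code; the file count and batch size are the ones reported above.

```python
import math

def shard(elements, num_shards, index):
    """Keep element i when i % num_shards == index (the rule Dataset.shard follows)."""
    return [e for i, e in enumerate(elements) if i % num_shards == index]

num_files = 6400    # "Found 6400 files" for the training directory
batch_size = 32
batches = list(range(math.ceil(num_files / batch_size)))  # 200 batch indices

# A single process with index=0 keeps only one quarter of the batches
shard_0 = shard(batches, num_shards=4, index=0)
print(len(batches), len(shard_0))  # 200 50

# Four workers with indices 0..3 would hold disjoint shards that cover every batch
all_shards = [shard(batches, 4, i) for i in range(4)]
covered = sorted(b for s in all_shards for b in s)
```

The 50 batches left in shard 0 correspond to the `50/50` steps per epoch shown in the training output.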

In [None]:
# print(f'train dataset shard size: {sum(1 for _ in train_dataset)}')
# print(f'test dataset shard size: {sum(1 for _ in test_dataset)}')

## **Basic Sequential CNN**

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, BatchNormalization
from tensorflow.keras import regularizers

# define the CNN architecture using the add method
model = Sequential()

# first convolutional layer with L2 regularization
model.add(Conv2D(32, (3, 3), activation='relu', kernel_regularizer=regularizers.l2(0.001), input_shape=(128, 128, 3)))
model.add(BatchNormalization())
model.add(MaxPooling2D((2, 2)))

# second convolutional layer with L2 regularization
model.add(Conv2D(64, (3, 3), activation = 'relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(BatchNormalization())
model.add(MaxPooling2D((2, 2)))

# third convolutional layer with L2 regularization
model.add(Conv2D(128, (3, 3), activation = 'relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(BatchNormalization())
model.add(MaxPooling2D((2, 2)))

# flatten and fully connected layers
model.add(Flatten())
model.add(Dense(128, activation = 'relu', kernel_regularizer=regularizers.l2(0.001)))
model.add(BatchNormalization())
model.add(Dropout(0.5))

# output layer with softmax activation for multi-class classification
model.add(Dense(4, activation='softmax'))

# compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# display model architecture
model.summary()

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)
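
As a cross-check on what `model.summary()` reports, the feature-map sizes and the larger parameter counts can be worked out by hand. The sketch below assumes the defaults used above (3x3 kernels, stride 1, `'valid'` padding, 2x2 pooling) and leaves out the BatchNormalization parameters; the helper names are illustrative only.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool

def conv_params(in_ch, out_ch, kernel=3):
    """Kernel weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch

size, channels = 128, 3
shapes, params = [], []
for filters in (32, 64, 128):
    params.append(conv_params(channels, filters))
    size = pool_out(conv_out(size))
    channels = filters
    shapes.append((size, size, channels))

flat = size * size * channels        # length of the Flatten output
dense_params = flat * 128 + 128      # Dense(128)
output_params = 128 * 4 + 4          # Dense(4)

print(shapes)   # [(63, 63, 32), (30, 30, 64), (14, 14, 128)]
print(params, flat, dense_params, output_params)
```

Almost all of the weights sit in the first dense layer (about 3.2 million against roughly 93 thousand in the three convolutions), which is one reason a model like this can memorize a small training shard quickly.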


In [None]:
# train the model
history = model.fit(
    train_dataset,
    epochs=10,  # Adjust epochs as needed
    validation_data=test_dataset
)

Epoch 1/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 41s 777ms/step - accuracy: 0.4114 - loss: 2.0692 - val_accuracy: 0.2746 - val_loss: 3.2210
Epoch 2/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 36s 726ms/step - accuracy: 0.5910 - loss: 1.5076 - val_accuracy: 0.2932 - val_loss: 8.7210
Epoch 3/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 35s 704ms/step - accuracy: 0.6518 - loss: 1.2607 - val_accuracy: 0.2820 - val_loss: 10.7169
Epoch 4/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 35s 706ms/step - accuracy: 0.7963 - loss: 0.9523 - val_accuracy: 0.2652 - val_loss: 4.2618
Epoch 5/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 37s 743ms/step - accuracy: 0.8835 - loss: 0.7814 - val_accuracy: 0.2188 - val_loss: 3.7354
Epoch 6/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 40s 801ms/step - accuracy: 0.8982 - loss: 0.7262 - val_accuracy: 0.2632 - val_loss: 3.6499
Epoch 7/10
50/50

# **CNN Architecture with Batch Normalization for Enhanced Training Stability**

### (a) building the CNN architecture

In [None]:
from tensorflow.keras import layers, models
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, BatchNormalization, ReLU
from tensorflow.keras.optimizers import Adam

# initialize the model
model_bnorm = models.Sequential()

# first convolutional layer + BatchNormalization
model_bnorm.add(Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 3)))  # ReLU is applied inside Conv2D; BatchNormalization then normalizes the activated output
model_bnorm.add(BatchNormalization())
model_bnorm.add(MaxPooling2D((2, 2)))

# second convolutional layer + BatchNormalization
model_bnorm.add(Conv2D(64, (3, 3), activation='relu'))
model_bnorm.add(BatchNormalization())
model_bnorm.add(MaxPooling2D((2, 2)))

# third convolutional layer + BatchNormalization
model_bnorm.add(Conv2D(128, (3, 3), activation='relu'))
model_bnorm.add(BatchNormalization())
model_bnorm.add(MaxPooling2D((2, 2)))

# flatten and fully connected layers
model_bnorm.add(Flatten())
model_bnorm.add(Dense(128, activation='relu'))
model_bnorm.add(BatchNormalization())
model_bnorm.add(Dropout(0.5))  # Dropout for regularization

# output layer with 4 classes
model_bnorm.add(Dense(4, activation='softmax'))

# compile the model
model_bnorm.compile(
    optimizer=Adam(),
    loss='categorical_crossentropy',  # Categorical crossentropy for multi-class classification
    metrics=['accuracy']
)

# display model architecture
model_bnorm.summary()
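
For readers who want to see what a BatchNormalization layer computes during training, here is a minimal pure-Python sketch for a single feature over one batch. It assumes the usual formula (subtract the batch mean, divide by the square root of variance plus epsilon, then scale by `gamma` and shift by `beta`) with Keras's default `epsilon=1e-3`. The toy activations are made up, and at inference time the layer uses moving averages rather than batch statistics.

```python
import math

def batch_norm(values, gamma=1.0, beta=0.0, epsilon=1e-3):
    """Normalize one feature over a batch, then scale by gamma and shift by beta."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + epsilon) + beta for v in values]

activations = [0.5, 2.0, 3.5, 10.0]   # hypothetical activations for one channel
normalized = batch_norm(activations)

norm_mean = sum(normalized) / len(normalized)
norm_var = sum((v - norm_mean) ** 2 for v in normalized) / len(normalized)
print(round(norm_mean, 6), round(norm_var, 3))
```

After normalization the batch has a mean of approximately zero and a variance just under one, whatever the scale of the incoming activations.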

### (b) Training the network

In [None]:
# train the model
history = model_bnorm.fit(
    train_dataset,  # Training data
    epochs=10,  # Adjust the number of epochs as needed
    validation_data=test_dataset,  # Optional: Validation data
    verbose=2  # one summary line per epoch (verbose=1 shows a live progress bar instead)
)

# optional: You can also save the model after training
# model_bnorm.save('model_with_bn.h5')  # Save the model to a file

# **Regularization**

## Applying L2 regularization to the kernels

In [None]:
from tensorflow.keras import layers, models, regularizers
from tensorflow.keras.optimizers import Adam

# initialize the model
model_with_bn = models.Sequential()

# first convolutional layer with L2 regularization + BatchNormalization
model_with_bn.add(layers.Conv2D(32, (3, 3), activation='relu',
                                kernel_regularizer=regularizers.l2(0.001),
                                input_shape=(128, 128, 3)))
model_with_bn.add(layers.BatchNormalization())
model_with_bn.add(layers.MaxPooling2D((2, 2)))

# second convolutional layer with L2 regularization + BatchNormalization
model_with_bn.add(layers.Conv2D(64, (3, 3), activation='relu',
                                kernel_regularizer=regularizers.l2(0.001)))
model_with_bn.add(layers.BatchNormalization())
model_with_bn.add(layers.MaxPooling2D((2, 2)))

# third convolutional layer with L2 regularization + BatchNormalization
model_with_bn.add(layers.Conv2D(128, (3, 3), activation='relu',
                                kernel_regularizer=regularizers.l2(0.001)))
model_with_bn.add(layers.BatchNormalization())
model_with_bn.add(layers.MaxPooling2D((2, 2)))

# flatten and fully connected layers with L2 regularization
model_with_bn.add(layers.Flatten())
model_with_bn.add(layers.Dense(128, activation='relu',
                                kernel_regularizer=regularizers.l2(0.001)))
model_with_bn.add(layers.BatchNormalization())
model_with_bn.add(layers.Dropout(0.5))  # Dropout for regularization

# output layer
model_with_bn.add(layers.Dense(4, activation='softmax'))

# compile the model
model_with_bn.compile(
    optimizer=Adam(),
    loss='categorical_crossentropy',  # Categorical crossentropy for multi-class classification
    metrics=['accuracy']
)

# display model architecture
model_with_bn.summary()

In [None]:
# compile the model
model_with_bn.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# display the model architecture
model_with_bn.summary()
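
The effect of `kernel_regularizer=regularizers.l2(0.001)` on the reported loss can be illustrated with a small pure-Python sketch. It assumes the standard definition of the penalty, `l2 * sum(w ** 2)` per regularized kernel, added to the cross-entropy; the weights and the cross-entropy value below are toy numbers chosen for illustration.

```python
def l2_penalty(weights, l2=0.001):
    """L2 penalty for one kernel: l2 times the sum of squared weights."""
    return l2 * sum(w * w for w in weights)

def total_loss(cross_entropy, layer_weights, l2=0.001):
    """Training loss = data loss + the penalties of all regularized kernels."""
    return cross_entropy + sum(l2_penalty(w, l2) for w in layer_weights)

kernels = [[0.5, -1.0, 2.0], [3.0]]   # hypothetical flattened kernels of two layers
loss = total_loss(1.2, kernels)
print(round(loss, 5))  # 1.21425
```

Because the penalty is part of the `loss` column in the training logs, that number is expected to sit above the pure cross-entropy, while `accuracy` is unaffected by it.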

## Training the BN model

In [None]:
# train the model
history = model_with_bn.fit(
    train_dataset,                 # training dataset
    epochs=10,                     # number of epochs to train
    validation_data=test_dataset,  # validation dataset
)

Epoch 1/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 56s 1s/step - accuracy: 0.4294 - loss: 1.9060 - val_accuracy: 0.2838 - val_loss: 2.4031
Epoch 2/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 54s 1s/step - accuracy: 0.6003 - loss: 1.3822 - val_accuracy: 0.3040 - val_loss: 3.0146
Epoch 3/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 84s 1s/step - accuracy: 0.7238 - loss: 1.1111 - val_accuracy: 0.2878 - val_loss: 3.6929
Epoch 4/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 78s 1s/step - accuracy: 0.8268 - loss: 0.9093 - val_accuracy: 0.2989 - val_loss: 4.4162
Epoch 5/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 82s 1s/step - accuracy: 0.8742 - loss: 0.8239 - val_accuracy: 0.3127 - val_loss: 4.6352
Epoch 6/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 53s 1s/step - accuracy: 0.9582 - loss: 0.6880 - val_accuracy: 0.3345 - val_loss: 5.1319
Epoch 7/10
50/50 ━━━━━━━━━━

## Incorporating Learning Rate Reduction to Improve Model Performance

In [None]:
from tensorflow.keras.callbacks import ReduceLROnPlateau

# reducing the Learning Rate
reduce_lr = ReduceLROnPlateau(monitor='val_loss',
                              factor=0.2,
                              patience=3,
                              min_lr=1e-6)

history = model_with_bn.fit(
    train_dataset,
    epochs=10,
    validation_data=test_dataset,
    callbacks=[reduce_lr]
)

Epoch 1/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 62s 1s/step - accuracy: 0.3708 - loss: 2.1566 - val_accuracy: 0.3180 - val_loss: 2.2988 - learning_rate: 0.0010
Epoch 2/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 54s 1s/step - accuracy: 0.6175 - loss: 1.3548 - val_accuracy: 0.2828 - val_loss: 3.3903 - learning_rate: 0.0010
Epoch 3/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 68s 1s/step - accuracy: 0.7723 - loss: 1.0457 - val_accuracy: 0.2820 - val_loss: 4.4756 - learning_rate: 0.0010
Epoch 4/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 60s 1s/step - accuracy: 0.8416 - loss: 0.8988 - val_accuracy: 0.2874 - val_loss: 4.7121 - learning_rate: 0.0010
Epoch 5/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 56s 1s/step - accuracy: 0.9168 - loss: 0.7762 - val_accuracy: 0.2820 - val_loss: 6.0167 - learning_rate: 2.0000e-04
Epoch 6/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 80s

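The validation scores for the model above are poor. Training accuracy climbs to about 0.92 by epoch 5, while validation accuracy stays near 0.28 (close to the 0.25 chance level for four classes) and `val_loss` rises from 2.30 to 6.02 instead of falling. Lowering the learning rate to 2e-4 at epoch 5 does not reverse this, which is consistent with overfitting, or with a mismatch between the training images (`OriginalDataset`) and the validation images (`AugmentedAlzheimerDataset`).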
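
The learning-rate drop visible in the log can be reproduced with a simplified pure-Python version of the plateau rule, assuming `factor=0.2` and `patience=3` as configured above. This is a sketch of the idea, not the Keras implementation (it ignores `min_delta` and `cooldown`); the `val_loss` values are copied from the log.

```python
def simulate_reduce_lr(val_losses, lr=1e-3, factor=0.2, patience=3, min_lr=1e-6):
    """Return the learning rate in effect after each epoch (simplified plateau rule)."""
    best = float("inf")
    wait = 0
    schedule = []
    for loss in val_losses:
        if loss < best:
            best = loss      # improvement: remember it and reset the counter
            wait = 0
        else:
            wait += 1        # no improvement this epoch
            if wait >= patience:
                lr = max(lr * factor, min_lr)
                wait = 0
        schedule.append(lr)
    return schedule

logged_val_loss = [2.2988, 3.3903, 4.4756, 4.7121, 6.0167]
schedule = simulate_reduce_lr(logged_val_loss)
print(schedule)
```

The best `val_loss` is the very first one, so three epochs without improvement have passed by the end of epoch 4 and the rate is cut from 1e-3 to 2e-4, which is the value the log reports for epoch 5.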

# **Regularization and Optimization Techniques**

## Model 1: L2 Regularization + Dropout + BatchNormalization

In [None]:
from tensorflow.keras import layers, models, regularizers
from tensorflow.keras.optimizers import Adam

# Initialize the model
model_1 = models.Sequential()

# First convolutional layer with L2 regularization + BatchNormalization
model_1.add(layers.Conv2D(32, (3, 3), activation='relu',
                          kernel_regularizer=regularizers.l2(0.001),
                          input_shape=(128, 128, 3)))
model_1.add(layers.BatchNormalization())
model_1.add(layers.MaxPooling2D((2, 2)))

# Second convolutional layer with L2 regularization + BatchNormalization
model_1.add(layers.Conv2D(64, (3, 3), activation='relu',
                          kernel_regularizer=regularizers.l2(0.001)))
model_1.add(layers.BatchNormalization())
model_1.add(layers.MaxPooling2D((2, 2)))

# Third convolutional layer with L2 regularization + BatchNormalization
model_1.add(layers.Conv2D(128, (3, 3), activation='relu',
                          kernel_regularizer=regularizers.l2(0.001)))
model_1.add(layers.BatchNormalization())
model_1.add(layers.MaxPooling2D((2, 2)))

# Flatten and fully connected layers with L2 regularization
model_1.add(layers.Flatten())
model_1.add(layers.Dense(128, activation='relu',
                         kernel_regularizer=regularizers.l2(0.001)))
model_1.add(layers.BatchNormalization())


# Dropout layer for regularization
model_1.add(layers.Dropout(0.5))

# Output layer for classification
model_1.add(layers.Dense(4, activation='softmax'))

# Compile the model
model_1.compile(
    optimizer=Adam(),
    loss='categorical_crossentropy',  # Categorical crossentropy for multi-class classification
    metrics=['accuracy']
)

# Display model architecture
model_1.summary()

In [None]:
history = model_1.fit(
    train_dataset,                  # Your training dataset
    epochs=10,                       # Number of epochs to train
    validation_data=test_dataset,     # Your validation dataset
    callbacks=[reduce_lr]            # Optional: Use the learning rate reduction callback
)

NameError: name 'train_dataset' is not defined

## Model 2: L2 Regularization + Early Stopping

In [None]:
from tensorflow.keras.callbacks import EarlyStopping

# EarlyStopping callback to monitor validation loss
early_stopping = EarlyStopping(monitor='val_loss', patience=3)

model_2 = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(128, 128, 3)),

    # Convolutional layers with L2 regularization
    tf.keras.layers.Conv2D(32, (3, 3), activation=None,
                           kernel_regularizer=regularizers.l2(0.001)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.ReLU(),
    tf.keras.layers.MaxPooling2D((2, 2)),

    tf.keras.layers.Conv2D(64, (3, 3), activation=None,
                           kernel_regularizer=regularizers.l2(0.001)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.ReLU(),
    tf.keras.layers.MaxPooling2D((2, 2)),

    tf.keras.layers.Conv2D(128, (3, 3), activation=None,
                           kernel_regularizer=regularizers.l2(0.001)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.ReLU(),
    tf.keras.layers.MaxPooling2D((2, 2)),

    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation=None,
                          kernel_regularizer=regularizers.l2(0.001)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.ReLU(),

    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(4, activation='softmax')
])

# Compile the model
model_2.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Fit the model with early stopping
history_2 = model_2.fit(
    train_dataset,
    epochs=10,
    validation_data=test_dataset,
    callbacks=[early_stopping]
)

# Display model architecture
model_2.summary()

Epoch 1/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 56s 1s/step - accuracy: 0.4385 - loss: 1.7971 - val_accuracy: 0.2632 - val_loss: 2.6970
Epoch 2/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 94s 1s/step - accuracy: 0.5818 - loss: 1.3663 - val_accuracy: 0.2633 - val_loss: 3.8487
Epoch 3/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 64s 1s/step - accuracy: 0.7478 - loss: 1.0687 - val_accuracy: 0.2820 - val_loss: 5.4459
Epoch 4/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 77s 1s/step - accuracy: 0.8374 - loss: 0.8906 - val_accuracy: 0.2841 - val_loss: 5.7390


KeyboardInterrupt: 
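
How `EarlyStopping(monitor='val_loss', patience=3)` reacts to a validation curve like the one logged above can be sketched in a few lines of pure Python. This is a simplified illustration of the patience rule, not the Keras implementation (it ignores `min_delta` and `start_from_epoch`); the `val_loss` values are copied from the log.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training would stop, or None if it never triggers."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best, wait = loss, 0   # improvement resets the patience counter
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

logged_val_loss = [2.6970, 3.8487, 5.4459, 5.7390]
stop_epoch = early_stopping_epoch(logged_val_loss)
best_epoch = logged_val_loss.index(min(logged_val_loss)) + 1
print(stop_epoch, best_epoch)  # 4 1
```

Under this rule the callback would trigger after epoch 4, and the lowest validation loss belongs to epoch 1. Passing `restore_best_weights=True`, as the later cell does, is what makes the model go back to those epoch-1 weights instead of keeping the last ones.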

In [None]:
from tensorflow.keras import regularizers
from tensorflow.keras import layers
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau

# Define model with L2 regularization, Dropout, and BatchNormalization
model_1 = tf.keras.models.Sequential([
    tf.keras.layers.Input(shape=(128, 128, 3)),

    # First convolutional layer with L2 regularization
    tf.keras.layers.Conv2D(32, (3, 3), activation=None,
                           kernel_regularizer=regularizers.l2(0.001)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.ReLU(),
    tf.keras.layers.MaxPooling2D((2, 2)),

    # Second convolutional layer with L2 regularization
    tf.keras.layers.Conv2D(64, (3, 3), activation=None,
                           kernel_regularizer=regularizers.l2(0.001)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.ReLU(),
    tf.keras.layers.MaxPooling2D((2, 2)),

    # Third convolutional layer with L2 regularization
    tf.keras.layers.Conv2D(128, (3, 3), activation=None,
                           kernel_regularizer=regularizers.l2(0.001)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.ReLU(),
    tf.keras.layers.MaxPooling2D((2, 2)),

    tf.keras.layers.Flatten(),
    # Dense layer with L2 regularization
    tf.keras.layers.Dense(128, activation=None,
                          kernel_regularizer=regularizers.l2(0.001)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.ReLU(),

    # Dropout layer
    tf.keras.layers.Dropout(0.5),

    # Output layer
    tf.keras.layers.Dense(4, activation='softmax')
])

# Compile the model
model_1.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Print model summary
model_1.summary()

# Early stopping to avoid overfitting
early_stopping = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

# Reduce the learning rate if validation loss doesn't improve
reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.2, patience=3, min_lr=1e-6)

# Train the model with callbacks
history = model_1.fit(
    train_dataset,                    # Your training dataset
    epochs=10,                         # Number of epochs to train
    validation_data=test_dataset,       # Your validation dataset
    callbacks=[early_stopping, reduce_lr]  # Add both callbacks for better optimization
)

Epoch 1/10
30/50 ━━━━━━━━━━━━━━━━━━━━ 13s 699ms/step - accuracy: 0.4171 - loss: 2.0399

# **Transfer Learning**

In [None]:
import numpy as np
from tensorflow.keras.applications import VGG16
from sklearn.svm import SVC
from sklearn.metrics import classification_report, accuracy_score

# load the pre-trained VGG16 model (without the top layers for feature extraction)
vgg16_base = VGG16(weights='imagenet', include_top=False, input_shape=(128, 128, 3))

# function to extract features from the dataset using VGG16
def extract_features(dataset, model):
    features = []
    labels = []
    for image_batch, label_batch in dataset:
        feature_batch = model.predict(image_batch)  # Extract features for the batch
        features.append(feature_batch)
        labels.append(label_batch)

    # convert lists to numpy arrays
    features = np.vstack(features)
    labels = np.vstack(labels)

    # flatten the features for SVC (make them 2D)
    features = features.reshape(features.shape[0], -1)

    return features, np.argmax(labels, axis=1)  # Return the flattened features and the class labels

# extract features for training and testing datasets
train_features, train_labels = extract_features(train_dataset, vgg16_base)
test_features, test_labels = extract_features(test_dataset, vgg16_base)

# train SVC model
svc_model = SVC(kernel='linear', class_weight='balanced', decision_function_shape='ovr')  # OvR is default
svc_model.fit(train_features, train_labels)

# make predictions
svc_predictions = svc_model.predict(test_features)

# evaluate the SVC model
svc_accuracy = accuracy_score(test_labels, svc_predictions)
print("SVC Accuracy:", svc_accuracy)
print("SVC Classification Report:")
print(classification_report(test_labels, svc_predictions))

[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 5s/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 4s/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 4s/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 4s/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 4s/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 4s/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 4s/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 4s/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 4s/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 4s/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 4s/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 4s/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 4s/step
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 4s/step
[1m1/1[0m [32m━━━

2024-11-07 11:09:07.158031: I tensorflow/core/framework/local_rendezvous.cc:405] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence


1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 4s 4s/step
1/1 ━━━
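
The SVC above is created with `class_weight='balanced'`. As a rough illustration of what that option does, the sketch below computes weights with the heuristic scikit-learn documents for it, `n_samples / (n_classes * count_of_class)`. The labels are a made-up imbalanced example, and `balanced_class_weights` is a hypothetical helper rather than a scikit-learn function.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * number of samples in that class)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * n) for c, n in counts.items()}

toy_labels = [0] * 6 + [1] * 3 + [2] * 1   # hypothetical imbalanced class indices
weights = balanced_class_weights(toy_labels)
print({c: round(w, 3) for c, w in weights.items()})  # {0: 0.556, 1: 1.111, 2: 3.333}
```

Rare classes receive larger weights, so errors on them cost more during fitting.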

## ResNet50

In [None]:
from tensorflow.keras.applications import ResNet50
from tensorflow.keras import layers, models

# Load the pre-trained ResNet50 model without the top classification layer
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(128, 128, 3))

# Freeze the layers of the pre-trained model
base_model.trainable = False

# Build the custom model
model_transfer = models.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(4, activation='softmax')  # Output layer for 4 classes
])

# Compile the model
model_transfer.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Display model architecture
model_transfer.summary()

# Train the model on your dataset
history_transfer = model_transfer.fit(
    train_dataset,
    epochs=10,
    validation_data=test_dataset
)

Epoch 1/10
50/50 ━━━━━━━━━━━━━━━━━━━━ 0s 1s/step - accuracy: 0.3797 - loss: 1.4651

### Fine-tuning
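The goal is to fine-tune the deeper layers of ResNet50 by unfreezing some layers and continuing training. Note that the cell below still keeps the whole ResNet50 base frozen and only trains a new classification head on top of it.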

In [None]:
import numpy as np
from sklearn.metrics import classification_report, f1_score
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.models import Sequential

# Build model using ResNet50 base; the input size matches the 128x128 images produced by the data pipeline
resnet50_base = ResNet50(weights='imagenet', include_top=False, input_shape=(128, 128, 3))
resnet50_base.trainable = False  # Freeze ResNet50 layers

resnet50_model = Sequential([
    resnet50_base,
    Flatten(),
    Dense(256, activation='relu', kernel_regularizer='l2'),
    Dropout(0.5),
    Dense(4, activation='softmax')  # 4 classes, as reported by image_dataset_from_directory
])

# Compile and train ResNet50 model on the datasets defined in the preprocessing cell
resnet50_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
resnet50_history = resnet50_model.fit(train_dataset, validation_data=test_dataset, epochs=10)

# True class indices in dataset order (test_dataset is not shuffled, so the order matches predict())
test_labels = np.concatenate([np.argmax(y.numpy(), axis=1) for _, y in test_dataset])

# F1 score for ResNet50
resnet50_predictions = np.argmax(resnet50_model.predict(test_dataset), axis=1)
resnet50_f1 = f1_score(test_labels, resnet50_predictions, average='weighted')
print("ResNet50 F1 Score:", resnet50_f1)
print("ResNet50 Classification Report:")
print(classification_report(test_labels, resnet50_predictions))
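
The F1 call above uses `average='weighted'`. As a reminder of what that averaging means, here is a small pure-Python sketch: per-class F1 scores are combined with weights proportional to each class's share of the true labels. The labels are toy values, and the helpers are illustrative stand-ins rather than the scikit-learn implementation.

```python
from collections import Counter

def f1_per_class(y_true, y_pred, label):
    """F1 for one class, written as 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its support in y_true."""
    support = Counter(y_true)
    total = len(y_true)
    return sum(f1_per_class(y_true, y_pred, c) * n / total for c, n in support.items())

y_true = [0, 0, 0, 1, 1, 2]   # hypothetical true class indices
y_pred = [0, 0, 1, 1, 1, 2]   # hypothetical predictions
score = weighted_f1(y_true, y_pred)
print(round(score, 4))  # 0.8333
```

With imbalanced classes the weighted average is dominated by the largest class, so the per-class rows of the classification report are worth reading alongside it.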