<a href="https://colab.research.google.com/github/spindouken/holbertonschool-machine_learning/blob/master/unsupervised_learning/hyperparameter_tuning/MRIalz.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Alzheimer's Disease Classification using MRI Images and Bayesian Hyperparameter Optimization

## Overview

This project trains a convolutional neural network to classify brain MRI scans into four Alzheimer's disease classes and tunes its hyperparameters with Bayesian optimization.

## Technical Approach

- **Data**: https://www.kaggle.com/datasets/sachinkumar413/alzheimer-mri-dataset/
- **Model**: Convolutional Neural Network (CNN) built by transfer learning on a pre-trained DenseNet169 base.
- **Hyperparameter Tuning**: Bayesian Optimization using GPyOpt.
- **Evaluation Metrics**: Accuracy, AUC-ROC, and F1-Score.

## Sections

1. **Imports**: Setting up the environment by importing necessary libraries.
2. **Data Loading**: Function to load and preprocess the MRI images.
3. **Model Creation**: Function to initialize and compile the CNN model.
4. **Hyperparameter Space Definition**: Outlining the hyperparameters to be optimized.
5. **Custom Metrics**: Including F1 Score as a custom evaluation metric.
6. **Bayesian Optimization**: Objective function for the GPyOpt Bayesian optimizer.
7. **Main Function**: Orchestrating the Bayesian optimization process.
8. **Results Visualization**: Function to plot and save the optimization results.
9. **Final Model Training**: Using the best hyperparameters to train the final model.

## Usage

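Run the cells in sequence to go through the data loading, model creation, hyperparameter tuning, and final model training steps. One exception: run the Results Visualization cell before the Main Function cell, because `main()` calls `save_and_plot`.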

## Contributors

Mason Counts

## Last Updated

10/17/23



Install GPyOpt in the Colab environment, then run the imports used by the later cells.

In [3]:
pip install GPyOpt



In [15]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import GPyOpt
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Flatten, Dense, Dropout
from tensorflow.keras.applications import DenseNet169
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
from tensorflow.keras.metrics import AUC
from tensorflow.keras import backend as K

# === DATA LOADING FUNCTION ===
# The load_data function is responsible for loading MRI images from a directory
# It uses ImageDataGenerator to split the data into training (80%) and validation (20%) subsets
# The data (images) have already been preprocessed, so no augmentation or rescaling is configured here

In [5]:
def load_data(data_dir, batch_size=32):
    """
    load the Alzheimer MRI dataset from a specified directory

    Args:
        data_dir (str): directory where the dataset is stored
        batch_size (int): batch size for the data generator
            will default to 32 if not specified when calling the function

    Returns:
        trainingGenerator, validationGenerator: data generators for training and validation sets
    """
    datagen = ImageDataGenerator(validation_split=0.2)  # 20% data for validation

    trainingGenerator = datagen.flow_from_directory(
        data_dir,
        target_size=(128, 128),
        batch_size=batch_size,
        class_mode='categorical',
        subset='training'
    )

    validationGenerator = datagen.flow_from_directory(
        data_dir,
        target_size=(128, 128),
        batch_size=batch_size,
        class_mode='categorical',
        subset='validation'
    )

    return trainingGenerator, validationGenerator


# === MODEL CREATION FUNCTION ===
# The create_model function initializes a Convolutional Neural Network (CNN) using a pre-trained DenseNet169 as the base model.
# It adds a 'top model' for classification and returns the compiled model.

In [19]:
from tensorflow.keras import Model
from tensorflow.keras.layers import GlobalAveragePooling2D

# Use transfer learning to build on a pre-trained CNN
#   model for Alzheimer's classification
def create_model(input_shape, num_classes):
    """
    Create the custom CNN model for Alzheimer's classification

    Parameters:
        input_shape (tuple): Shape of the input images.
        num_classes (int): Number of classes in the dataset.
    """
    # pre-trained DenseNet169 as the base model
    base_model = DenseNet169(
        include_top=False, weights="imagenet", input_shape=input_shape
    )

    # freeze every layer except the last 17, so only the end of the base network is fine-tuned
    for layer in base_model.layers[:-17]:
        layer.trainable = False

    # get the output tensor of the base model
    base_model_output = base_model.output

    x = GlobalAveragePooling2D()(base_model_output)
    x = Dense(512, activation="relu")(x)
    x = Dropout(0.4)(x)
    x = Dense(256, activation="relu")(x)
    x = Dropout(0.4)(x)
    x = Dense(num_classes, activation="softmax")(x)

    # compile Model
    model = Model(inputs=base_model.input, outputs=x)
    model.compile(
        optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"]
    )

    return model
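
The slice `base_model.layers[:-17]` selects every layer except the final 17. A minimal sketch of that freezing pattern follows, using hypothetical stand-in objects rather than real Keras layers; the helper name `freeze_all_but_last` and the count of 20 layers are assumptions for illustration only.

```python
from types import SimpleNamespace


def freeze_all_but_last(layers, n_trainable):
    """Mark every layer except the last n_trainable as non-trainable."""
    for layer in layers[:-n_trainable]:
        layer.trainable = False
    return layers


# hypothetical stand-ins for Keras layers (20 layers, all trainable at first)
layers = [SimpleNamespace(name=f"layer_{i}", trainable=True) for i in range(20)]
freeze_all_but_last(layers, 17)

trainable = [layer.name for layer in layers if layer.trainable]
print(len(trainable), "layers remain trainable")
```

Note that the slice `[:-0]` is empty, so this pattern freezes nothing when `n_trainable` is 0.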


# === HYPERPARAMETER SPACE DEFINITION ===
# The define_hyperparameter_space function outlines the hyperparameters to be optimized.
# It returns a list of dictionaries, each specifying the name, type, and domain of a hyperparameter.

In [7]:
#!/usr/bin/env python3
import GPyOpt

def define_hyperparameter_space():
    """
    Define the hyperparameter space for Bayesian optimization.

    Returns:
        domain (list): List of dictionaries specifying the hyperparameter space.
    """
    domain = [
        {'name': 'learning_rate', 'type': 'continuous', 'domain': (1e-4, 1e-2)},
        {'name': 'dense_units', 'type': 'discrete', 'domain': (128, 256, 512)},
        {'name': 'dropout_rate', 'type': 'continuous', 'domain': (0.3, 0.7)},
        {'name': 'l2_weight', 'type': 'continuous', 'domain': (1e-5, 1e-3)},
        {'name': 'batch_size', 'type': 'discrete', 'domain': (16, 32, 64)}
    ]
    return domain
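
The objective function later indexes `params[0]` and casts the discrete entries with `int(...)`, which suggests the optimizer hands over one point as a single-row 2-D array of floats. A small sketch of turning such a row into a named dictionary is shown here; `unpack_params` is a hypothetical helper, and a nested list stands in for the NumPy array.

```python
def unpack_params(x, domain):
    """Map one row of a 2-D parameter array onto the names in `domain`."""
    row = x[0]  # one point in the search space, stored as a single row
    named = {}
    for spec, value in zip(domain, row):
        # discrete parameters arrive as floats, so cast them back to int
        if spec['type'] == 'discrete':
            named[spec['name']] = int(value)
        else:
            named[spec['name']] = float(value)
    return named


domain = [
    {'name': 'learning_rate', 'type': 'continuous', 'domain': (1e-4, 1e-2)},
    {'name': 'dense_units', 'type': 'discrete', 'domain': (128, 256, 512)},
    {'name': 'dropout_rate', 'type': 'continuous', 'domain': (0.3, 0.7)},
    {'name': 'l2_weight', 'type': 'continuous', 'domain': (1e-5, 1e-3)},
    {'name': 'batch_size', 'type': 'discrete', 'domain': (16, 32, 64)},
]

# nested list standing in for the array the optimizer would pass
x = [[0.00869, 128.0, 0.644, 0.00076, 16.0]]
params = unpack_params(x, domain)
print(params)
```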

# === F1 SCORE FUNCTION ===
# Custom metric function for computing the F1 Score.
# F1 Score is the harmonic mean of precision and recall. Here it is computed over every entry of the one-hot labels in a batch, with predictions rounded at 0.5.

In [8]:
def f1_score(y_true, y_pred):
    """Batch-wise F1 score from one-hot labels and predictions rounded at 0.5."""
    true_positives = K.sum(K.round(K.clip(y_true * y_pred, 0, 1)))
    possible_positives = K.sum(K.round(K.clip(y_true, 0, 1)))
    predicted_positives = K.sum(K.round(K.clip(y_pred, 0, 1)))
    precision = true_positives / (predicted_positives + K.epsilon())
    recall = true_positives / (possible_positives + K.epsilon())
    f1_val = 2 * (precision * recall) / (precision + recall + K.epsilon())
    return f1_val
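
To see what `f1_score` computes, here is a plain-Python sketch of the same arithmetic on a hand-made batch. The labels and probabilities are made up for illustration, `f1_plain` is a hypothetical name, and `EPS` stands in for `K.epsilon()` (assumed to be its default of 1e-7). The clipping step is omitted because the inputs already lie in [0, 1].

```python
EPS = 1e-7  # stands in for K.epsilon()


def f1_plain(y_true, y_pred):
    """Same arithmetic as the Keras metric, on nested lists."""
    pairs = [(t, p) for row_t, row_p in zip(y_true, y_pred)
             for t, p in zip(row_t, row_p)]
    true_positives = sum(round(t * p) for t, p in pairs)
    possible_positives = sum(round(t) for t, _ in pairs)
    predicted_positives = sum(round(p) for _, p in pairs)
    precision = true_positives / (predicted_positives + EPS)
    recall = true_positives / (possible_positives + EPS)
    return 2 * (precision * recall) / (precision + recall + EPS)


# three samples, four classes; the third sample has no probability above 0.5
y_true = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]
y_pred = [[0.7, 0.1, 0.1, 0.1], [0.2, 0.6, 0.1, 0.1], [0.4, 0.3, 0.2, 0.1]]

score = f1_plain(y_true, y_pred)
print(round(score, 3))
```

In this batch two of the three samples count as true positives, and the third contributes no predicted positive at all because none of its probabilities rounds to 1. Precision is therefore 2/2 and recall is 2/3, giving an F1 of about 0.8.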



# === BAYESIAN OPTIMIZATION FUNCTION ===
# The BayesianOptimization function serves as the objective function for GPyOpt.
# It takes in hyperparameters, trains a model, evaluates it on the validation set,
# and returns the validation F1 score from the last epoch.

In [20]:
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping
from tensorflow.keras.metrics import AUC
from tensorflow.keras import backend as K

iterationCount = 0
bestValidationLoss = float('inf')
bestF1Score = -1.0
bestHyperparameters = None
bestModel = None


def get_best_model():
    global bestModel
    return bestModel

def get_best_hyperparameters():
    global bestHyperparameters
    return bestHyperparameters

def BayesianOptimization(params):
    """
    Objective function for GPyOpt: creates a model with create_model, trains it
        on the training data with the proposed hyperparameters,
        and returns the satisficing metric (validation F1 score from the last epoch).

    Parameters:
        params (numpy.ndarray): 2-D array with a single row holding one point in the
            hyperparameter space, in the order given by define_hyperparameter_space.

    The best model seen so far is kept in memory and only saved after optimization is complete
    """
    global bestValidationLoss, bestHyperparameters, bestModel, bestF1Score, iterationCount

    # extract hyperparameters from params
    params = params[0]
    learning_rate = params[0]
    dense_units = int(params[1])
    dropout_rate = params[2]
    l2_weight = params[3]
    batch_size = int(params[4])

    hyperparameters = f"lr={learning_rate:.5f}, du={dense_units}, dr={dropout_rate:.3f}, l2={l2_weight:.5f}, bs={batch_size}"

    iterationCount += 1
    print(f"Iteration: {iterationCount}")
    print(f"Optimizing with: {hyperparameters}")

    # load training data
    trainingGenerator, validationGenerator = load_data('/content/drive/MyDrive/MRIalz/Dataset/', batch_size=batch_size)

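    # create model with create_model
    # note: create_model does not take dense_units, dropout_rate, or l2_weight,
    #   so only learning_rate and batch_size affect training here
    model = create_model((128, 128, 3), 4)  # 128x128 RGB inputs and 4 classes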

    # compile model with new hyperparameters
    model.compile(optimizer=Adam(learning_rate=learning_rate),
                  loss='categorical_crossentropy',
                  metrics=['accuracy', AUC(name='auc'), f1_score])

    # early_stopping stops training if the validation F1 score has not improved for `patience` epochs
    #   (F1 should increase, so mode is 'max'; with epochs=1 it never triggers)
    early_stopping = EarlyStopping(monitor='val_f1_score', patience=2, verbose=1, mode='max')

    # the batch size is set by the generators, so it is not passed to fit
    history = model.fit(trainingGenerator, epochs=1, validation_data=validationGenerator,
                        callbacks=[early_stopping])

    # read the validation loss and F1 score from the last epoch
    currentValidationLoss = history.history['val_loss'][-1]
    currentF1Score = history.history['val_f1_score'][-1]

    # check if the current model performs better than the best model stored in memory
    if currentF1Score > bestF1Score:
        bestF1Score = currentF1Score
        bestHyperparameters = hyperparameters
        bestModel = model

    # update best validation loss to be printed during training
    if currentValidationLoss < bestValidationLoss:
        bestValidationLoss = currentValidationLoss

    print(f"Validation loss with these parameters: {currentValidationLoss}")
    print("AUC-ROC:", history.history['val_auc'][-1])
    print("F1-Score:", currentF1Score)
    print(f"Current best validation loss: {bestValidationLoss}")
    print(f"Current best F1 score: {bestF1Score}, with hyperparameters: {bestHyperparameters}")

    return currentF1Score
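
GPyOpt treats the objective as a quantity to minimize unless it is told otherwise, while an F1 score is better when it is higher. The toy sketch below uses made-up scores and a hypothetical `minimize_over` helper (no GPyOpt) to show why the sign matters: minimizing the raw score selects the worst candidate, and minimizing its negation selects the best.

```python
def minimize_over(candidates, objective):
    """Toy stand-in for an optimizer that always minimizes its objective."""
    return min(candidates, key=objective)


# hypothetical validation F1 scores for three learning rates
f1_by_lr = {1e-4: 0.62, 1e-3: 0.81, 1e-2: 0.47}

# minimizing the raw F1 score picks the worst learning rate
worst = minimize_over(f1_by_lr, lambda lr: f1_by_lr[lr])

# minimizing the negated F1 score picks the best learning rate
best = minimize_over(f1_by_lr, lambda lr: -f1_by_lr[lr])

print("raw objective selects:", worst)
print("negated objective selects:", best)
```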


# === MAIN FUNCTION ===
# Orchestrates the Bayesian optimization process.
# It initializes the optimizer, runs it, and saves the best hyperparameters found during optimization.

In [21]:
import os
import GPyOpt
import pickle
import datetime
from tensorflow.keras.layers  import GlobalAveragePooling2D


def main():
    """
    Run Bayesian optimization to tune hyperparameters using GPyOpt
        and save the best hyperparameters (in best_params.pkl) to be used in training final_model.py
        .pkl file will come with timestamp to account for multiple bayesian optimization runs

    Main function utilizes the following functions to perform Bayesian optimization:
        define_hyperparameter_space.py
        BayesianOptimization.py
        load_and_preprocess.py
        save_and_plot.py

    Note: bayesian optimization is actually performed in BayesianOptimization.py
    """
    # create a directory for best models if it doesn't exist
    os.makedirs('/content/drive/MyDrive/MRIalz/best_models', exist_ok=True)

    print("Starting Bayesian optimization...")

    # use function (from main folder)
    #   which defined the hyperparameter space for Bayesian optimization
    domainExpansion = define_hyperparameter_space()

    # initialize bayesian optimization
    # add initial_design_numdata=0 to avoid random initialization
    #   and speed up optimization (for bug testing)
    optimizer = GPyOpt.methods.BayesianOptimization(
        f=BayesianOptimization,
        domain=domainExpansion,
        acquisition_type="EI",  # expected improvement
        exact_feval=True,
        maximize=True,  # the objective returns an F1 score, so higher is better
    )

    # specify max run count for optimization
    optimizer.run_optimization(max_iter=1)

    best_model = get_best_model()
    bestHyperparameters = get_best_hyperparameters()

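    if best_model is not None:
        # save into the Drive folder created above, with a filename free of spaces and commas
        safe_name = bestHyperparameters.replace(", ", "_")
        best_model.save(f"/content/drive/MyDrive/MRIalz/best_models/bestModel_{safe_name}.h5")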

    print(
        "Bayesian optimization completed. Next step: Use the best hyperparameters to train your final model."
    )

    timestamp = datetime.datetime.now().strftime("%m-%d-%y-%H:%M")
    filename = f"best_params_{timestamp}.pkl"
    # retrieve best parameters from optimizer and save them to be used in final_model.py
    best_params = optimizer.x_opt
    with open(filename, "wb") as f:
        pickle.dump(best_params, f)

    # save and plot the results of the optimization
    #   this will save the convergence plot as 'convergence.png'
    #   and the optimization evaluations as 'bayes_opt_MRIalz.txt'
    save_and_plot(optimizer)
    print(
        f"Best hyperparameters saved to {filename}. The convergence plot was saved to convergence.png."
    )

if __name__ == "__main__":
    main()


Starting Bayesian optimization...
Iteration: 1
Optimizing with: lr=0.00869, du=128, dr=0.644, l2=0.00076, bs=16
Found 5121 images belonging to 4 classes.
Found 1279 images belonging to 4 classes.
  2/321 [..............................] - ETA: 32:01 - loss: 11.3741 - accuracy: 0.3125 - auc: 0.5356 - f1_score: 0.2187     

KeyboardInterrupt: ignored
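
The `321` in the progress bar above is the number of batches per epoch. A quick check of that arithmetic, assuming the generator yields a final partial batch (`steps_per_epoch` is a hypothetical helper, not a Keras function):

```python
import math


def steps_per_epoch(num_images, batch_size):
    """Number of batches needed to cover every image once."""
    return math.ceil(num_images / batch_size)


# image counts and batch size taken from the log above
train_steps = steps_per_epoch(5121, 16)
val_steps = steps_per_epoch(1279, 16)
print(train_steps, val_steps)
```

Since batch size is one of the tuned hyperparameters, the time per optimization iteration changes with it: a batch size of 64 would need `steps_per_epoch(5121, 64)` batches instead.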

# === PLOT AND SAVE RESULTS ===
# The save_and_plot function visualizes the convergence of the Bayesian optimization process.
# It also saves this and other evaluations into a text file.

In [None]:
import matplotlib.pyplot as plt
import GPyOpt

def save_and_plot(optimizer):
    """
    Save evaluations and plot convergence.
    """
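    # plot convergence and save it; passing filename lets GPyOpt write the figure itself
    #   instead of saving after the figure has already been shown
    optimizer.plot_convergence(filename='/content/drive/MyDrive/MRIalz/convergence.png')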

    # save optimization evaluations to a text file
    with open('/content/drive/MyDrive/MRIalz/bayes_opt_MRIalz.txt', 'w') as f:
        f.write(str(optimizer.get_evaluations()))
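
A convergence plot typically tracks the best objective value found up to each iteration. The sketch below computes that running best from a list of evaluations; the values are made up, and `best_so_far` is a hypothetical helper rather than part of GPyOpt.

```python
def best_so_far(values, maximize=False):
    """Return the running best of `values`, one entry per iteration."""
    pick = max if maximize else min
    running = []
    best = None
    for value in values:
        best = value if best is None else pick(best, value)
        running.append(best)
    return running


# hypothetical objective values from five optimization iterations
losses = [0.9, 0.7, 0.8, 0.5, 0.6]
print(best_so_far(losses))  # [0.9, 0.7, 0.7, 0.5, 0.5]
```

A curve that stays flat for many iterations suggests the optimizer has stopped finding better points in the search space.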


# === FINAL MODEL TRAINING ===
# The train_final_model function utilizes the best hyperparameters that were found during bayesian optimization to train the final model
# It saves the best 'checkpoint' of the model training based on the validation loss

In [None]:
import pickle
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping


def train_final_model(best_params):
    """
    Train the final model using the best hyperparameters.

    Parameters:
        best_params (array-like): Best hyperparameter values from Bayesian optimization,
            in the order given by define_hyperparameter_space.
    """
    print("Training final model with best hyperparameters...")
    # Extract best hyperparameters
    # (dense_units, dropout_rate, and l2_weight are read but not used by create_model)
    learning_rate = best_params[0]
    dense_units = int(best_params[1])
    dropout_rate = best_params[2]
    l2_weight = best_params[3]
    batch_size = int(best_params[4])

    # Load data with the tuned batch size
    train_generator, val_generator = load_data('/content/drive/MyDrive/MRIalz/Dataset/', batch_size=batch_size)

    # Create and compile model
    model = create_model((128, 128, 3), 4)
    model.compile(optimizer=Adam(learning_rate=learning_rate), loss='categorical_crossentropy', metrics=['accuracy'])

    # Checkpoint to save the best model
    checkpoint = ModelCheckpoint("/content/drive/MyDrive/MRIalz/final_model.h5", monitor='val_loss', verbose=1, save_best_only=True, mode='min')

    # Train the model (the batch size is set by the generators, so it is not passed to fit)
    model.fit(train_generator, epochs=10, validation_data=val_generator, callbacks=[checkpoint])

    print("Final model trained and saved as 'final_model.h5'.")

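# main() writes best_params_<timestamp>.pkl to the working directory;
#   copy or rename the run you want to this path before loading it
with open("/content/drive/MyDrive/MRIalz/best_params.pkl", "rb") as f: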
    best_params = pickle.load(f)

train_final_model(best_params)
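
Because each optimization run writes its parameters to a timestamped file, it can be convenient to pick up the newest one automatically. The sketch below is one way to do that, sorting by file modification time; `latest_params_file` is a hypothetical helper, and the directory you pass should be wherever your `best_params_*.pkl` files actually live.

```python
import glob
import os


def latest_params_file(directory, pattern="best_params_*.pkl"):
    """Return the most recently modified file matching pattern, or None."""
    matches = glob.glob(os.path.join(directory, pattern))
    if not matches:
        return None
    return max(matches, key=os.path.getmtime)


# look in the current directory; prints None if no matching file exists
newest = latest_params_file(".")
print(newest)
```

Modification time is used rather than the filename because the `%m-%d-%y-%H:%M` timestamp does not sort chronologically as text across years.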


# === Print Best_params ===
# Print the best parameters from bayesian optimization
# The parameters will be printed with their names according to the hyperparameters defined in define_hyperparameter_space function

In [None]:
with open('/content/drive/MyDrive/MRIalz/best_params.pkl', 'rb') as f:
    # Load the data from the file
    best_params = pickle.load(f)

    # Get the hyperparameter names
    hyperparameter_space = define_hyperparameter_space()
    hyperparameter_names = [param['name'] for param in hyperparameter_space]

    # Map names to best_params and print
    named_best_params = dict(zip(hyperparameter_names, best_params))

    print("Best Parameters:")
    for name, value in named_best_params.items():
        print(f"{name}: {value}")
