<a href="https://colab.research.google.com/github/g-e-mm/Alzheimers-predictor-cnn/blob/main/Capstone_Project_2_Gem.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

---
# **Alzheimer's Disease Predictor using CNN**
---
By Gem Barnaba,<br> as part of the PGP Data Analytics with ML course from IMARTICUS Learning,<br> for Capstone Project 2

**APPROACH TO THE PROJECT**<br><br>
**1.** [**Project Objectives and Dataset Info**](#Section1)<br>
**2.** [**Loading Datasets and Libraries**](#Section2)<br>
**3.** [**Data Preprocessing**](#Section3)<br>
**4.** [**Model Building**](#Section4)<br>
  - **4.1** [**Declaring the Model and Layers**](#Section401)
  - **4.2** [**Compiling the Model**](#Section402)
  - **4.3** [**Fitting the Model**](#Section403)

**5.** [**Prediction and Model Evaluation**](#Section5)<br>
**6.** [**Deployment using Gradio**](#Section6)<br>
**7.** [**Footnote**](#Section7)<br>

<a name = Section1></a>
# **1. Project Objectives and Dataset Info**
---

**Context:**<br>


---


- **Alzheimer's Disease (AD)**: AD is the most common form of dementia, causing progressive brain cell damage, memory loss, and cognitive decline. It's expected to affect 152 million people by 2050, with annual treatment costs of 1 trillion USD. Early diagnosis can slow its progression, but AD datasets are limited and imbalanced, often leading to misclassification of early symptoms as "No Alzheimer's."

- **GAN-based Solutions**: Conventional methods for addressing class imbalance are not optimal. GANs, particularly WGAN-GP (Wasserstein GAN with gradient penalty), can generate new MRI images, improving classifier performance on unseen data. This helps mitigate bias and class imbalance in AD datasets.

- **Improved Performance**: Synthetic MRIs generated using WGAN-GP improved minority class accuracy by 91.4%, with only a 1% drop for the majority class. The balanced dataset achieved 99% accuracy, outperforming traditional methods like SMOTE and image augmentation.

- **High-Quality Synthetic MRIs**: The synthetic MRIs showed strong metrics, with an FID score of 0.13, SSIM of 0.97, PSNR of 32 dB, and Sharpness Difference of 0.04, indicating they closely match the quality of original MRIs.
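
Of the metrics quoted above, PSNR is the simplest to reproduce by hand. The sketch below is illustrative only and is not the evaluation code behind the reported figures. It assumes 8-bit images (peak value 255); the helper name `psnr` and the toy images are hypothetical.

```python
import numpy as np

def psnr(reference, candidate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    reference = np.asarray(reference, dtype=np.float64)
    candidate = np.asarray(candidate, dtype=np.float64)
    mse = np.mean((reference - candidate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_value ** 2 / mse)

# Toy 128x128 "images": a flat gray image and a copy that is off by one everywhere
real = np.full((128, 128), 100, dtype=np.uint8)
fake = real + 1
print(round(psnr(real, fake), 2))
```

A higher PSNR means the candidate image is closer to the reference.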

**Dataset Description:**

---

- **Dataset Composition**: The dataset includes a mix of real and synthetic axial MRIs, created to address the class imbalance in the original Kaggle Alzheimer's dataset. The dataset features four categories: "No Impairment" (100 patients), "Very Mild Impairment" (70 patients), "Mild Impairment" (28 patients), and "Moderate Impairment" (2 patients). Each patient's brain was sliced into 32 horizontal axial MRIs.

- **MRI Acquisition**: The MRI images were captured using a 1.5 Tesla MRI scanner with a T1-weighted sequence. The images have a 128x128 pixel resolution in “.jpg” format, and all have been pre-processed to remove the skull.

- **Synthetic MRI Considerations**: The synthetic MRIs were not verified by a radiologist, so the dataset may not fully represent real-world patient symptoms. However, there are no privacy concerns since the synthetic MRIs do not resemble actual patients.
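
The patient counts above translate directly into slice counts, which makes the imbalance easy to see. A small sketch, assuming the patient numbers describe the real (non-synthetic) MRIs only; the combined dataset on disk also contains synthetic images, so its counts will differ.

```python
SLICES_PER_PATIENT = 32  # each patient's brain was sliced into 32 axial MRIs

patients_per_class = {
    "No Impairment": 100,
    "Very Mild Impairment": 70,
    "Mild Impairment": 28,
    "Moderate Impairment": 2,
}

# Number of real slices each class contributes
slices_per_class = {name: n * SLICES_PER_PATIENT
                    for name, n in patients_per_class.items()}
total_slices = sum(slices_per_class.values())

for name, count in slices_per_class.items():
    print(f"{name}: {count} slices ({count / total_slices:.1%})")
```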

<a name = Section2></a>
# **2. Loading Datasets and Libraries**
---

In [None]:
import os
import random
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import tensorflow as tf
from sklearn.metrics import confusion_matrix, classification_report, accuracy_score
# Import Keras consistently through tensorflow.keras rather than mixing it with standalone keras
from tensorflow import keras
from tensorflow.keras.preprocessing import image
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.metrics import Precision, Recall
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Activation, Flatten, Dense, Dropout, BatchNormalization, LSTM, ConvLSTM2D
from tensorflow.keras.losses import categorical_crossentropy


In [None]:
!pip install gradio --quiet

In [None]:
import gradio as gr

In [None]:
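# Seed TensorFlow, Python and NumPy (the image generators shuffle and augment via NumPy)
tf.random.set_seed(5638)
random.seed(5638)
np.random.seed(5638)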

In [None]:
!unzip AD_Prediction.zip

In [None]:
def load_image_data(train_dir, test_dir, target_size=(128, 128), batch_size=32):
  """Loads image data from the specified directories and returns training and testing generators.

  Args:
    train_dir: The directory containing the training data.
    test_dir: The directory containing the testing data.
    target_size: The desired image size.
    batch_size: The desired batch size.

  Returns:
    A tuple (train_generator, test_generator) of directory iterators that yield (images, labels) batches.
  """

  train_datagen = ImageDataGenerator(rescale=1./255,
                                    shear_range=0.2,
                                    zoom_range=0.2,
                                    horizontal_flip=True)

  test_datagen = ImageDataGenerator(rescale=1./255)

  train_generator = train_datagen.flow_from_directory(
      train_dir,
      target_size=target_size,
      batch_size=batch_size,
      class_mode='categorical')

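  # Keep test images in directory order so predictions line up with test_generator.classes
  test_generator = test_datagen.flow_from_directory(
      test_dir,
      target_size=target_size,
      batch_size=batch_size,
      class_mode='categorical',
      shuffle=False)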

  return train_generator, test_generator

train_dir = '/content/Combined Dataset/train'
test_dir = '/content/Combined Dataset/test'
train_df, test_df = load_image_data(train_dir, test_dir)

<a name = Section3></a>
# **3. Data Preprocessing**
---

In [None]:
labels = {value: key for key, value in train_df.class_indices.items()}

print("Labels in the train and test datasets\n")

for key, value in labels.items():
    print(f'{key} : {value}')
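
Since the project is motivated by class imbalance, it can help to count how many images each class contributes. A minimal sketch follows, using stand-in values so it runs on its own; the helper `class_distribution` and the folder names are hypothetical. In the notebook the inputs would be `train_df.classes` and `train_df.class_indices`.

```python
from collections import Counter

def class_distribution(class_ids, class_indices):
    """Map integer class ids to names and count how often each occurs."""
    id_to_name = {idx: name for name, idx in class_indices.items()}
    counts = Counter(int(i) for i in class_ids)
    return {id_to_name[idx]: counts.get(idx, 0) for idx in sorted(id_to_name)}

# Stand-in values for illustration; real folder names may differ
example_class_indices = {"Mild Impairment": 0, "Moderate Impairment": 1,
                         "No Impairment": 2, "Very Mild Impairment": 3}
example_class_ids = [0, 2, 2, 3, 1, 2, 3, 0]
print(class_distribution(example_class_ids, example_class_indices))
```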

In [None]:
import matplotlib.pyplot as plt

def display_images(generator, num_images=16):
  """Displays up to 16 images from one batch of the generator along with their labels.

  Args:
    generator: The directory iterator yielding (images, one-hot labels) batches.
    num_images: The number of images to display (capped at 16 by the 4x4 grid).
  """

  # Get one batch of images and one-hot labels from the iterator
  batch_x, batch_y = next(generator)

  plt.figure(figsize=(16, 8))
  for i in range(min(num_images, len(batch_x), 16)):
      img = batch_x[i]
      # Index of the true class (position of the 1 in the one-hot label)
      label_index = batch_y[i].argmax()

      plt.subplot(4, 4, i + 1)
      plt.imshow(img)
      plt.title(f"Label: {label_index}") # Displays the true class index
      plt.axis('off')
  plt.tight_layout()
  plt.show()

display_images(train_df)

<a name = Section4></a>
# **4. Model Building**
---

<a name = Section401></a>
# **4.1. Declaring the Model and Layers**

In [None]:
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(4, activation='softmax')) # 4 output classes

In [None]:
model.summary()
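
The shapes and parameter counts in the summary can be checked by hand. The sketch below assumes the Keras defaults used by the model above: `Conv2D` with 'valid' padding and stride 1, and `MaxPooling2D` with a stride equal to the pool size. The helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output width/height of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output width/height of non-overlapping max pooling (floor division)."""
    return size // pool

size, channels, total_params = 128, 3, 0
for filters in (32, 64, 128):
    # 3x3 kernel weights per input channel, plus one bias per filter
    total_params += (3 * 3 * channels + 1) * filters
    size = pool_out(conv_out(size))
    channels = filters

flat = size * size * channels       # length of the Flatten output
total_params += (flat + 1) * 128    # Dense(128): weights + biases
total_params += (128 + 1) * 4       # Dense(4): weights + biases

print(size, flat, total_params)     # 14 25088 3305156
```

Almost all of the parameters sit in the first dense layer, which is why the flattened feature map size matters so much for model size.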

<a name = Section402></a>
# **4.2. Compiling the Model**

In [None]:
# Adam optimizer with categorical cross-entropy for the four one-hot encoded classes
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy', Precision(), Recall()])
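
To make the loss concrete, categorical cross-entropy for one-hot labels is the negative log of the probability assigned to the true class, averaged over samples. The NumPy sketch below is a simplified illustration, not Keras's implementation; `cross_entropy` and the example values are hypothetical.

```python
import numpy as np

def cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean categorical cross-entropy for one-hot labels and predicted probabilities."""
    y_true = np.asarray(y_true, dtype=np.float64)
    # Clip to avoid log(0) for probabilities that are exactly zero
    y_prob = np.clip(np.asarray(y_prob, dtype=np.float64), eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(y_prob), axis=1)))

y_true = [[0, 1, 0, 0]]                # true class is index 1
good_guess = [[0.1, 0.7, 0.1, 0.1]]    # most mass on the true class
bad_guess = [[0.7, 0.1, 0.1, 0.1]]     # most mass on a wrong class

print(round(cross_entropy(y_true, good_guess), 4))
print(round(cross_entropy(y_true, bad_guess), 4))
```

The loss grows quickly as the probability on the true class shrinks, which is what pushes the softmax output toward the correct label during training.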

<a name = Section403></a>
# **4.3. Fitting the Model**

In [None]:
# The batch size is set by the generators (batch_size=32 in load_image_data), so it is not passed to fit()
hist = model.fit(train_df, epochs=10, validation_data=test_df)
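
The object returned by `fit` keeps per-epoch metrics in `hist.history`, a plain dictionary of lists. A small sketch for finding the best epoch follows, using made-up numbers so it runs on its own. The helper `best_epoch` is hypothetical, and the key name `val_accuracy` is an assumption that should be checked against `hist.history.keys()`.

```python
def best_epoch(history, metric="val_accuracy"):
    """Return (1-based epoch, value) at which `metric` is highest."""
    values = history[metric]
    index = max(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]

# Made-up numbers for illustration only; in the notebook pass hist.history
example_history = {
    "accuracy": [0.55, 0.71, 0.80, 0.86],
    "val_accuracy": [0.60, 0.74, 0.78, 0.77],
}

epoch, value = best_epoch(example_history)
print(f"best val_accuracy {value:.2f} at epoch {epoch}")
```

A validation score that peaks before the last epoch while training accuracy keeps rising is a common sign of overfitting.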

<a name = Section5></a>
# **5. Prediction and Model Evaluation**
---

In [None]:
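# test_df.classes is in directory order, so the predictions only line up with it
# if the test generator was created with shuffle=False
y_pred = model.predict(test_df)
y_pred_classes = np.argmax(y_pred, axis=1)
y_true = test_df.classes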

In [None]:
print("Accuracy:", accuracy_score(y_true, y_pred_classes))
print("Classification Report:\n", classification_report(y_true, y_pred_classes, target_names=[labels[i] for i in sorted(labels)]))
print("Confusion Matrix:\n", confusion_matrix(y_true, y_pred_classes))
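
To show how the confusion matrix relates to the per-class numbers in the report, here is a NumPy sketch on toy labels. It follows the same convention as scikit-learn (rows are true classes, columns are predicted classes). The helper names and toy data are hypothetical and are not results from this model.

```python
import numpy as np

def confusion_counts(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((num_classes, num_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        matrix[t, p] += 1
    return matrix

def per_class_recall(matrix):
    """Fraction of each true class that was predicted correctly."""
    support = matrix.sum(axis=1)
    return np.divide(np.diag(matrix), support,
                     out=np.zeros(len(matrix)), where=support > 0)

# Toy labels for four classes, for illustration only
toy_true = [0, 0, 1, 1, 2, 2, 3, 3]
toy_pred = [0, 1, 1, 1, 2, 0, 3, 3]

cm = confusion_counts(toy_true, toy_pred, num_classes=4)
print(cm)
print(per_class_recall(cm))
```

For an imbalanced medical dataset, per-class recall is usually more informative than overall accuracy, because a missed minority class barely moves the accuracy figure.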

<a name = Section6></a>
# **6. Deployment using Gradio**
---

In [None]:
def predict_image(image):
  """Predicts the Alzheimer's disease case based on an input image.

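  Args:
    image: The input image as an RGB NumPy array (Gradio's default for gr.Image()).

  Returns:
    The predicted case (e.g., "No Impairment", "Mild Impairment").
  """
  # Resize to the model's input size and rescale to [0, 1], as was done for the training data
  img = tf.image.resize(image, (128, 128)).numpy() / 255.0
  img = np.expand_dims(img, axis=0)  # add the batch dimension
  prediction = model.predict(img)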
  predicted_class_index = np.argmax(prediction)
  predicted_class_label = labels[predicted_class_index]

  return predicted_class_label

In [None]:
# gradio was already installed and imported as gr in Section 2
iface = gr.Interface(
    fn=predict_image,
    inputs=gr.Image(),
    outputs="text",
    title="Alzheimer's Disease Predictor",
    description="Upload an MRI image to predict the case of Alzheimer's disease."
)

iface.launch()
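
The interface above returns a single label as text. If the full class probabilities are of interest, Gradio's `gr.Label` output component can display a dictionary that maps class names to confidences. A minimal sketch of building that dictionary follows; `to_label_scores`, the label names, and the probabilities are hypothetical stand-ins and not part of the original notebook.

```python
def to_label_scores(probabilities, id_to_name):
    """Turn one softmax row into a {class name: probability} dict."""
    return {id_to_name[i]: float(p) for i, p in enumerate(probabilities)}

# Stand-in label names and softmax output for illustration
example_labels = {0: "Mild Impairment", 1: "Moderate Impairment",
                  2: "No Impairment", 3: "Very Mild Impairment"}
example_probs = [0.05, 0.02, 0.83, 0.10]

scores = to_label_scores(example_probs, example_labels)
print(max(scores, key=scores.get))
```

In the notebook, `predict_image` could return `to_label_scores(prediction[0], labels)`, with `outputs=gr.Label(num_top_classes=4)` in place of `outputs="text"`.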

<a name = Section7></a>
# **7. Footnote**
---