Objective:

The objective of this project is to develop a deep learning model capable of classifying medical images into different categories using a transfer learning approach with a pretrained MobileNetV2 model.
Description of Model:

This project uses the MobileNetV2 architecture, a lightweight convolutional neural network pretrained on the ImageNet dataset. The final fully connected (FC) classification layer is replaced with one that matches the number of classes in the MedMNIST dataset used here (PathMNIST). Through transfer learning, the model reuses features learned on a general image classification task (ImageNet) and adapts them to medical image classification. This allows the model to classify medical images effectively with a relatively small dataset.
Description of Code:

Libraries Used:
TensorFlow (Keras)
MedMNIST
PyTorch and torchvision (transforms and DataLoader in the medmnist loading cell)
NumPy
Matplotlib
Requests
Dataset Handling:
Dataset: The PathMNIST subset of MedMNIST is loaded using the medmnist library.
Preprocessing: Images are resized to 224x224 (to match the MobileNetV2 input size), normalized, and converted to tensors.
Splits: The dataset comes with predefined training, validation, and test splits, and a PyTorch DataLoader is used to create batches for each.
Model Definition:
A pretrained MobileNetV2 model is loaded from TensorFlow Keras Applications.
The fully connected (FC) classification layer of MobileNetV2 is replaced with a new FC layer that outputs one prediction per class in the dataset (9 classes for PathMNIST).
The base layers of MobileNetV2 are frozen to prevent weight updates during training, and only the new layers are trained.
Training Loop:
The model is trained for 10 epochs using the Adam optimizer.
Training and validation accuracy and loss are tracked during the process.
The best-performing model on the validation set is saved for further evaluation.
Visualization:
A loss curve is plotted to visualize the training and validation loss during the training process.
Training and validation accuracy are printed at each epoch.
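
The model definition and best-model saving described above can be sketched in Keras roughly as follows. This is a minimal sketch under stated assumptions, not the notebook's exact code: the helper name `build_classifier`, the `Rescaling` preprocessing step, the dropout rate, and the checkpoint file name `best_model.keras` are all chosen here for illustration.

```python
import tensorflow as tf
from tensorflow.keras import layers, models


def build_classifier(num_classes=9, input_shape=(224, 224, 3), weights="imagenet"):
    """Frozen MobileNetV2 base with a new trainable head (hypothetical helper)."""
    base = tf.keras.applications.MobileNetV2(
        weights=weights, include_top=False, input_shape=input_shape
    )
    base.trainable = False  # keep the pretrained features fixed

    inputs = tf.keras.Input(shape=input_shape)
    # Assumption: inputs are already scaled to [0, 1]; MobileNetV2 expects [-1, 1].
    x = layers.Rescaling(scale=2.0, offset=-1.0)(inputs)
    x = base(x, training=False)  # run batch-norm layers in inference mode
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.3)(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = models.Model(inputs, outputs)
    model.compile(
        optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"]
    )
    return model


# Keep only the model from the epoch with the highest validation accuracy.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    "best_model.keras", monitor="val_accuracy", save_best_only=True
)
```

Passing `callbacks=[checkpoint]` to `model.fit` together with validation data would then write the best epoch to disk.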
Performance Evaluation:

Training & Validation Metrics:
Accuracy and loss are printed for both training and validation sets across epochs.
Best Model: The model with the highest validation accuracy is saved and considered the best-performing model.
Visualization:
Loss curve: A plot is generated to track the loss over epochs and visualize how the training progresses.
Expected Performance Metrics:
Accuracy: Validation accuracy is expected to rise over the first epochs and then level off as the new classification layers converge.
Loss: The training loss should decrease over epochs. A validation loss that rises while the training loss keeps falling indicates overfitting.
Performance Variation: Actual accuracy and loss may vary depending on the dataset quality, the number of samples, and the complexity of the task.
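
The tracking and plotting described above can be sketched as follows. It assumes `history` is the dictionary found in the `history` attribute of the object returned by Keras `model.fit`, with the keys `loss`, `val_loss`, and `val_accuracy` (the names Keras uses when the model is compiled with `metrics=['accuracy']` and validation data is supplied). The helper names `best_epoch` and `plot_loss` are hypothetical.

```python
import matplotlib.pyplot as plt


def best_epoch(history):
    """Return (1-based epoch number, value) of the highest validation accuracy."""
    val_acc = history["val_accuracy"]
    idx = max(range(len(val_acc)), key=lambda i: val_acc[i])
    return idx + 1, val_acc[idx]


def plot_loss(history):
    """Plot training and validation loss per epoch and return the figure."""
    epochs = range(1, len(history["loss"]) + 1)
    fig, ax = plt.subplots()
    ax.plot(epochs, history["loss"], label="Training loss")
    ax.plot(epochs, history["val_loss"], label="Validation loss")
    ax.set_xlabel("Epoch")
    ax.set_ylabel("Loss")
    ax.legend()
    return fig
```

In a notebook, `plot_loss(model.fit(...).history)` would display the curve, and `best_epoch` identifies which epoch a saved best model corresponds to.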
My Comments:

The modular structure of the model and training pipeline makes it easy to experiment with different architectures, optimizers, or datasets. The transfer learning approach helps improve model performance even with a relatively small medical image dataset by reusing pretrained features.
For further improvements, you might consider:
Data Augmentation: Adding techniques like rotation, zoom, and flipping to improve the model's generalization.
Experimenting with other Pretrained Models: Trying deeper models such as ResNet50 or InceptionV3 for potentially better results.
Hyperparameter Tuning: Experimenting with learning rates, dropout rates, and batch sizes to optimize the model.
Ensemble Models: Combining the predictions of several models to improve overall performance.
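
Two of the ideas above, flipping/rotation augmentation and ensembling, can be sketched with NumPy alone. This is an illustrative sketch, not part of the notebook: the function names are hypothetical, the augmentation assumes square images in (N, H, W, C) layout and only uses flips and 90-degree rotations (zoom is not covered), and the ensemble assumes each model outputs class probabilities of shape (n_samples, n_classes).

```python
import numpy as np


def augment_batch(images, rng):
    """Randomly flip and rotate each square image in an (N, H, W, C) batch."""
    out = np.empty_like(images)
    for i, img in enumerate(images):
        if rng.random() < 0.5:
            img = img[:, ::-1, :]  # horizontal flip
        k = int(rng.integers(0, 4))  # number of 90-degree rotations
        out[i] = np.rot90(img, k=k, axes=(0, 1))
    return out


def ensemble_predict(prob_list):
    """Average per-model class probabilities (soft voting).

    Returns the averaged probabilities and the predicted class indices.
    """
    probs = np.stack(prob_list, axis=0)  # (n_models, n_samples, n_classes)
    mean_probs = probs.mean(axis=0)
    return mean_probs, mean_probs.argmax(axis=1)
```

Augmentation would be applied to training batches only. The ensemble function takes the outputs of `model.predict` from several independently trained models.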
Model Limitations:

Dataset Variability: The model might struggle if the dataset has high variability in terms of image quality or class imbalance.
Limited Dataset Size: If the dataset is too small, the model might overfit. In such cases, more data or augmentation techniques should be considered.
Frozen Layers: The MobileNetV2 base is frozen and only the newly added classification layers are trained. Unfreezing the upper layers of the base model and fine-tuning them with a low learning rate can be explored for better performance.
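
One common way to soften the class-imbalance issue mentioned above is to weight each class inversely to its frequency. The sketch below is an assumption-laden illustration rather than part of the notebook: `balanced_class_weights` is a hypothetical helper, and it expects integer labels (before any one-hot encoding), for example of shape (N, 1) as stored in the MedMNIST `.npz` files.

```python
import numpy as np


def balanced_class_weights(labels, num_classes):
    """Weight each class by n_samples / (num_classes * class_count).

    Classes with no samples are left out of the result.
    """
    labels = np.asarray(labels).reshape(-1).astype(np.int64)
    counts = np.bincount(labels, minlength=num_classes)
    return {
        c: float(len(labels) / (num_classes * counts[c]))
        for c in range(num_classes)
        if counts[c] > 0
    }
```

The resulting dictionary has the form that the `class_weight` argument of Keras `model.fit` accepts, so rarer classes contribute more to the loss.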

In [3]:
%pip install medmnist tensorflow

Collecting medmnist
  Downloading medmnist-3.0.2-py3-none-any.whl.metadata (14 kB)
Downloading medmnist-3.0.2-py3-none-any.whl (25 kB)
Installing collected packages: medmnist
Successfully installed medmnist-3.0.2

Note: you may need to restart the kernel to use updated packages.


In [10]:
import medmnist
from medmnist import INFO
from torchvision import transforms
from torch.utils.data import DataLoader

# Specify the dataset
data_flag = 'pathmnist'
download = True

# Get dataset info
info = INFO[data_flag]
DataClass = getattr(medmnist, info['python_class'])

# Define transformations
data_transforms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[.5], std=[.5])
])

# Load datasets
train_dataset = DataClass(split='train', transform=data_transforms, download=download)
val_dataset = DataClass(split='val', transform=data_transforms, download=download)
test_dataset = DataClass(split='test', transform=data_transforms, download=download)

# Create data loaders
train_loader = DataLoader(dataset=train_dataset, batch_size=32, shuffle=True)
val_loader = DataLoader(dataset=val_dataset, batch_size=32, shuffle=False)
test_loader = DataLoader(dataset=test_dataset, batch_size=32, shuffle=False)


Using downloaded and verified file: /Users/nehasharma/.medmnist/pathmnist.npz


In [7]:
# Print the info dictionary to check its structure
print(info)

{'python_class': 'PathMNIST', 'description': 'The PathMNIST is based on a prior study for predicting survival from colorectal cancer histology slides, providing a dataset (NCT-CRC-HE-100K) of 100,000 non-overlapping image patches from hematoxylin & eosin stained histological images, and a test dataset (CRC-VAL-HE-7K) of 7,180 image patches from a different clinical center. The dataset is comprised of 9 types of tissues, resulting in a multi-class classification task. We resize the source images of 3×224×224 into 3×28×28, and split NCT-CRC-HE-100K into training and validation set with a ratio of 9:1. The CRC-VAL-HE-7K is treated as the test set.', 'url': 'https://zenodo.org/records/10519652/files/pathmnist.npz?download=1', 'MD5': 'a8b06965200029087d5bd730944a56c1', 'url_64': 'https://zenodo.org/records/10519652/files/pathmnist_64.npz?download=1', 'MD5_64': '55aa9c1e0525abe5a6b9d8343a507616', 'url_128': 'https://zenodo.org/records/10519652/files/pathmnist_128.npz?download=1', 'MD5_128': 

In [None]:
import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np
import requests

# Step 1: Download the dataset
url = 'https://zenodo.org/records/10519652/files/pathmnist_224.npz?download=1'
file_name = 'pathmnist_224.npz'

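# Download the file in chunks so the large archive is never held in memory at once
with requests.get(url, stream=True, timeout=60) as response:
    response.raise_for_status()  # Stop early on HTTP errors instead of saving an error page
    with open(file_name, 'wb') as f:
        for chunk in response.iter_content(chunk_size=1024 * 1024):
            f.write(chunk)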

# Step 2: Load the dataset
dataset = np.load(file_name)

# Extract images and labels
train_images = dataset['train_images']
train_labels = dataset['train_labels']
val_images = dataset['val_images']
val_labels = dataset['val_labels']
test_images = dataset['test_images']
test_labels = dataset['test_labels']

# Normalize images to the range [0, 1]
train_images = train_images.astype(np.float32) / 255.0
val_images = val_images.astype(np.float32) / 255.0
test_images = test_images.astype(np.float32) / 255.0

# One-hot encode labels
train_labels = tf.keras.utils.to_categorical(train_labels, 9)
val_labels = tf.keras.utils.to_categorical(val_labels, 9)
test_labels = tf.keras.utils.to_categorical(test_labels, 9)

# Load the pre-trained base model (MobileNetV2 with ImageNet weights, without its classification head)
base_model = tf.keras.applications.MobileNetV2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Freeze the base model
base_model.trainable = False

# Build the model
model = models.Sequential([
    layers.Rescaling(scale=2.0, offset=-1.0),  # Map [0, 1] inputs to the [-1, 1] range MobileNetV2 expects
    base_model,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.3),
    layers.Dense(9, activation='softmax')  # Output layer with 9 classes
])

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
model.fit(train_images, train_labels, epochs=10, batch_size=32, validation_data=(val_images, val_labels))

# Evaluate the model
test_loss, test_accuracy = model.evaluate(test_images, test_labels)
print(f'Test Accuracy: {test_accuracy * 100:.2f}%')
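
The cell above casts each full image array to float32 in one step. With roughly 90,000 training images at 224x224x3 (going by the 9:1 split of 100,000 patches in the dataset description), that amounts to tens of gigabytes. A possible alternative, sketched here with a hypothetical `batch_generator` helper, is to keep the arrays as uint8 and normalize and one-hot encode one batch at a time.

```python
import numpy as np


def batch_generator(images, labels, num_classes, batch_size=32, shuffle=True, seed=0):
    """Yield (float32 images in [0, 1], one-hot labels) batches forever."""
    rng = np.random.default_rng(seed)
    n = len(images)
    eye = np.eye(num_classes, dtype=np.float32)  # row c is the one-hot vector for class c
    while True:
        order = rng.permutation(n) if shuffle else np.arange(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            x = images[idx].astype(np.float32) / 255.0  # normalize this batch only
            y = eye[labels[idx].reshape(-1)]
            yield x, y
```

Because the generator never stops, it would be passed to `model.fit` together with `steps_per_epoch` (the number of batches per pass over the data), using the original integer labels rather than the one-hot arrays created earlier.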
